All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC] Using DC in amdgpu for upcoming GPU
@ 2016-12-08  2:02 Harry Wentland
  2016-12-08  9:59 ` Daniel Vetter
                   ` (3 more replies)
  0 siblings, 4 replies; 66+ messages in thread
From: Harry Wentland @ 2016-12-08  2:02 UTC (permalink / raw)
  To: dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Dave Airlie
  Cc: Grodzovsky, Andrey, Cyr, Aric, Bridgman, John, Lazare, Jordan,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Deucher, Alexander,
	Cheng, Tony

We propose to use the Display Core (DC) driver for display support on
AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In 
order to avoid a flag day the plan is to only support uGPU initially and 
transition to older ASICs gradually.

The DC component has received extensive testing within AMD for DCE8, 10, 
and 11 GPUs and is being prepared for uGPU. Support should be better 
than amdgpu's current display support.

  * All of our QA effort is focused on DC
  * All of our CQE effort is focused on DC
  * All of our OEM preloads and custom engagements use DC
  * DC behavior mirrors what we do for other OSes

The new asic utilizes a completely re-designed atom interface, so we 
cannot easily leverage much of the existing atom-based code.

We've introduced DC to the community earlier in 2016 and received a fair 
amount of feedback. Some of what we've addressed so far are:

  * Self-contain ASIC specific code. We did a bunch of work to pull
    common sequences into dc/dce and leave ASIC specific code in
    separate folders.
  * Started to expose AUX and I2C through generic kernel/drm
    functionality and are mostly using that. Some of that code is still
    needlessly convoluted. This cleanup is in progress.
  * Integrated Dave and Jerome’s work on removing abstraction in bios
    parser.
  * Retire adapter service and asic capability
  * Remove some abstraction in GPIO

Since a lot of our code is shared with pre- and post-silicon validation 
suites changes need to be done gradually to prevent breakages due to a 
major flag day.  This, coupled with adding support for new asics and 
lots of new feature introductions means progress has not been as quick 
as we would have liked. We have made a lot of progress none the less.

The remaining concerns that were brought up during the last review that 
we are working on addressing:

  * Continue to cleanup and reduce the abstractions in DC where it
    makes sense.
  * Removing duplicate code in I2C and AUX as we transition to using the
    DRM core interfaces.  We can't fully transition until we've helped
    fill in the gaps in the drm core that we need for certain features.
  * Making sure Atomic API support is correct.  Some of the semantics of
    the Atomic API were not particularly clear when we started this,
    however, that is improving a lot as the core drm documentation
    improves.  Getting this code upstream and in the hands of more
    atomic users will further help us identify and rectify any gaps we
    have.

Unfortunately we cannot expose code for uGPU yet. However refactor / 
cleanup work on DC is public.  We're currently transitioning to a public 
patch review. You can follow our progress on the amd-gfx mailing list. 
We value community feedback on our work.

As an appendix I've included a brief overview of the how the code 
currently works to make understanding and reviewing the code easier.

Prior discussions on DC:

  * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html
  * 
https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html

Current version of DC:

  * 
https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7

Once Alex pulls in the latest patches:

  * 
https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7

Best Regards,
Harry


************************************************
*** Appendix: A Day in the Life of a Modeset ***
************************************************

Below is a high-level overview of a modeset with dc. Some of this might 
be a little out-of-date since it's based on my XDC presentation but it 
should be more-or-less the same.

amdgpu_dm_atomic_commit()
{
   /* setup atomic state */
   drm_atomic_helper_prepare_planes(dev, state);
   drm_atomic_helper_swap_state(dev, state);
   drm_atomic_helper_update_legacy_modeset_state(dev, state);

   /* create or remove targets */

   /********************************************************************
    * *** Call into DC to commit targets with list of all known targets
    ********************************************************************/
   /* DC is optimized not to do anything if 'targets' didn't change. */
   dc_commit_targets(dm->dc, commit_targets, commit_targets_count)
   {
     /******************************************************************
      * *** Build context (function also used for validation)
      ******************************************************************/
     result = core_dc->res_pool->funcs->validate_with_context(
                                core_dc,set,target_count,context);

     /******************************************************************
      * *** Apply safe power state
      ******************************************************************/
     pplib_apply_safe_state(core_dc);

     /****************************************************************
      * *** Apply the context to HW (program HW)
      ****************************************************************/
     result = core_dc->hwss.apply_ctx_to_hw(core_dc,context)
     {
       /* reset pipes that need reprogramming */
       /* disable pipe power gating */
       /* set safe watermarks */

       /* for all pipes with an attached stream */
         /************************************************************
          * *** Programming all per-pipe contexts
          ************************************************************/
         status = apply_single_controller_ctx_to_hw(...)
         {
           pipe_ctx->tg->funcs->set_blank(...);
           pipe_ctx->clock_source->funcs->program_pix_clk(...);
           pipe_ctx->tg->funcs->program_timing(...);
           pipe_ctx->mi->funcs->allocate_mem_input(...);
           pipe_ctx->tg->funcs->enable_crtc(...);
           bios_parser_crtc_source_select(...);

           pipe_ctx->opp->funcs->opp_set_dyn_expansion(...);
           pipe_ctx->opp->funcs->opp_program_fmt(...);

           stream->sink->link->link_enc->funcs->setup(...);
           pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...);
           pipe_ctx->tg->funcs->set_blank_color(...);

           core_link_enable_stream(pipe_ctx);
           unblank_stream(pipe_ctx,

           program_scaler(dc, pipe_ctx);
         }
       /* program audio for all pipes */
       /* update watermarks */
     }

     program_timing_sync(core_dc, context);
     /* for all targets */
       target_enable_memory_requests(...);

     /* Update ASIC power states */
     pplib_apply_display_requirements(...);

     /* update surface or page flip */
   }
}


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-08  2:02 [RFC] Using DC in amdgpu for upcoming GPU Harry Wentland
@ 2016-12-08  9:59 ` Daniel Vetter
       [not found]   ` <20161208095952.hnbfs4b3nac7faap-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
  2016-12-11 20:28 ` Daniel Vetter
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 66+ messages in thread
From: Daniel Vetter @ 2016-12-08  9:59 UTC (permalink / raw)
  To: Harry Wentland
  Cc: Grodzovsky, Andrey, amd-gfx, dri-devel, Deucher, Alexander, Cheng, Tony

Hi Harry,

On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
> We propose to use the Display Core (DC) driver for display support on
> AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to
> avoid a flag day the plan is to only support uGPU initially and transition
> to older ASICs gradually.
> 
> The DC component has received extensive testing within AMD for DCE8, 10, and
> 11 GPUs and is being prepared for uGPU. Support should be better than
> amdgpu's current display support.
> 
>  * All of our QA effort is focused on DC
>  * All of our CQE effort is focused on DC
>  * All of our OEM preloads and custom engagements use DC
>  * DC behavior mirrors what we do for other OSes
> 
> The new asic utilizes a completely re-designed atom interface, so we cannot
> easily leverage much of the existing atom-based code.
> 
> We've introduced DC to the community earlier in 2016 and received a fair
> amount of feedback. Some of what we've addressed so far are:
> 
>  * Self-contain ASIC specific code. We did a bunch of work to pull
>    common sequences into dc/dce and leave ASIC specific code in
>    separate folders.
>  * Started to expose AUX and I2C through generic kernel/drm
>    functionality and are mostly using that. Some of that code is still
>    needlessly convoluted. This cleanup is in progress.
>  * Integrated Dave and Jerome’s work on removing abstraction in bios
>    parser.
>  * Retire adapter service and asic capability
>  * Remove some abstraction in GPIO
> 
> Since a lot of our code is shared with pre- and post-silicon validation
> suites changes need to be done gradually to prevent breakages due to a major
> flag day.  This, coupled with adding support for new asics and lots of new
> feature introductions means progress has not been as quick as we would have
> liked. We have made a lot of progress none the less.
> 
> The remaining concerns that were brought up during the last review that we
> are working on addressing:
> 
>  * Continue to cleanup and reduce the abstractions in DC where it
>    makes sense.
>  * Removing duplicate code in I2C and AUX as we transition to using the
>    DRM core interfaces.  We can't fully transition until we've helped
>    fill in the gaps in the drm core that we need for certain features.
>  * Making sure Atomic API support is correct.  Some of the semantics of
>    the Atomic API were not particularly clear when we started this,
>    however, that is improving a lot as the core drm documentation
>    improves.  Getting this code upstream and in the hands of more
>    atomic users will further help us identify and rectify any gaps we
>    have.
> 
> Unfortunately we cannot expose code for uGPU yet. However refactor / cleanup
> work on DC is public.  We're currently transitioning to a public patch
> review. You can follow our progress on the amd-gfx mailing list. We value
> community feedback on our work.
> 
> As an appendix I've included a brief overview of the how the code currently
> works to make understanding and reviewing the code easier.
> 
> Prior discussions on DC:
> 
>  * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html
>  *
> https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html
> 
> Current version of DC:
> 
>  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
> 
> Once Alex pulls in the latest patches:
> 
>  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
> 
> Best Regards,
> Harry
> 
> 
> ************************************************
> *** Appendix: A Day in the Life of a Modeset ***
> ************************************************
> 
> Below is a high-level overview of a modeset with dc. Some of this might be a
> little out-of-date since it's based on my XDC presentation but it should be
> more-or-less the same.
> 
> amdgpu_dm_atomic_commit()
> {
>   /* setup atomic state */
>   drm_atomic_helper_prepare_planes(dev, state);
>   drm_atomic_helper_swap_state(dev, state);
>   drm_atomic_helper_update_legacy_modeset_state(dev, state);
> 
>   /* create or remove targets */
> 
>   /********************************************************************
>    * *** Call into DC to commit targets with list of all known targets
>    ********************************************************************/
>   /* DC is optimized not to do anything if 'targets' didn't change. */
>   dc_commit_targets(dm->dc, commit_targets, commit_targets_count)
>   {
>     /******************************************************************
>      * *** Build context (function also used for validation)
>      ******************************************************************/
>     result = core_dc->res_pool->funcs->validate_with_context(
>                                core_dc,set,target_count,context);

I can't dig into details of DC, so this is not a 100% assessment, but if
you call a function called "validate" in atomic_commit, you're very, very
likely breaking atomic. _All_ validation must happen in ->atomic_check,
if that's not the case TEST_ONLY mode is broken. And atomic userspace is
relying on that working.

The only thing that you're allowed to return from ->atomic_commit is
out-of-memory, hw-on-fire and similar unforeseen and catastrophic issues.
Kerneldoc expklains this.

Now the reason I bring this up (and we've discussed it at length in
private) is that DC still suffers from a massive abstraction midlayer. A
lot of the back-end stuff (dp aux, i2c, abstractions for allocation,
timers, irq, ...) have been cleaned up, but the midlayer is still there.
And I understand why you have it, and why it's there - without some OS
abstraction your grand plan of a unified driver across everything doesn't
work out so well.

But in a way the backend stuff isn't such a big deal. It's annoying since
lots of code, and bugfixes have to be duplicated and all that, but it's
fairly easy to fix case-by-case, and as long as AMD folks stick around
(which I fully expect) not a maintainance issue. It makes it harder for
others to contribute, but then since it's mostly the leaf it's generally
easy to just improve the part you want to change (as an outsider). And if
you want to improve shared code the only downside is that you can't also
improve amd, but that's not so much a problem for non-amd folks ;-)

The problem otoh with the abstraction layer between drm core and the amd
driver is that you can't ignore if you want to refactor shared code. And
because it's an entire world of its own, it's much harder to understand
what the driver is doing (without reading it all). Some examples of what I
mean:

- All other drm drivers subclass drm objects (by embedding them) into the
  corresponding hw part that most closely matches the drm object's
  semantics. That means even when you have 0 clue about how a given piece
  of hw works, you have a reasonable chance of understanding code. If it's
  all your own stuff you always have to keep in minde the special amd
  naming conventions. That gets old real fast if you trying to figure out
  what 20+ (or are we at 30 already?) drivers are doing.

- This is even more true for atomic. Atomic has a pretty complicated
  check/commmit transactional model for updating display state. It's a
  standardized interface, and it's extensible, and we want generic
  userspace to be able to run on any driver. Fairly often we realize that
  semantics of existing or newly proposed properties and state isn't
  well-defined enough, and then we need to go&read all the drivers and
  figure out how to fix up the mess. DC has it's entirely separate state
  structures which again don't subclass the atomic core structures (afaik
  at least). Again the same problems apply that you can't find things, and
  that figuring out the exact semantics and spotting differences in
  behaviour is almost impossible.

- The trouble isn't just in reading code and understanding it correctly,
  it's also in finding it. If you have your own completely different world
  then just finding the right code is hard - cscope and grep fail to work.

- Another issue is that very often we unify semantics in drivers by adding
  some new helpers that at least dtrt for most of the drivers. If you have
  your own world then the impendance mismatch will make sure that amd
  drivers will have slightly different semantics, and I think that's not
  good for the ecosystem and kms - people want to run a lot more than just
  a boot splash with generic kms userspace, stuff like xf86-video-$vendor
  is going out of favour heavily.

Note that all this isn't about amd walking away and leaving an
unmaintainable mess behind. Like I've said I don't think this is a big
risk. The trouble is that having your own world makes it harder for
everyone else to understand the amd driver, and understanding all drivers
is very often step 1 in some big refactoring or feature addition effort.
Because starting to refactor without understanding the problem generally
doesn't work ;_) And you can't make this step 1 easier for others by
promising to always maintain DC and update it to all the core changes,
because that's only step 2.

In all the DC discussions we've had thus far I haven't seen anyone address
this issue. And this isn't just an issue in drm, it's pretty much
established across all linux subsystems with the "no midlayer or OS
abstraction layers in drivers" rule. There's some real solid reasons why
such a HAl is extremely unpopular with upstream. And I haven't yet seen
any good reason why amd needs to be different, thus far it looks like a
textbook case, and there's been lots of vendors in lots of subsystems who
tried to push their HAL.

Thanks, Daniel

> 
>     /******************************************************************
>      * *** Apply safe power state
>      ******************************************************************/
>     pplib_apply_safe_state(core_dc);
> 
>     /****************************************************************
>      * *** Apply the context to HW (program HW)
>      ****************************************************************/
>     result = core_dc->hwss.apply_ctx_to_hw(core_dc,context)
>     {
>       /* reset pipes that need reprogramming */
>       /* disable pipe power gating */
>       /* set safe watermarks */
> 
>       /* for all pipes with an attached stream */
>         /************************************************************
>          * *** Programming all per-pipe contexts
>          ************************************************************/
>         status = apply_single_controller_ctx_to_hw(...)
>         {
>           pipe_ctx->tg->funcs->set_blank(...);
>           pipe_ctx->clock_source->funcs->program_pix_clk(...);
>           pipe_ctx->tg->funcs->program_timing(...);
>           pipe_ctx->mi->funcs->allocate_mem_input(...);
>           pipe_ctx->tg->funcs->enable_crtc(...);
>           bios_parser_crtc_source_select(...);
> 
>           pipe_ctx->opp->funcs->opp_set_dyn_expansion(...);
>           pipe_ctx->opp->funcs->opp_program_fmt(...);
> 
>           stream->sink->link->link_enc->funcs->setup(...);
>           pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...);
>           pipe_ctx->tg->funcs->set_blank_color(...);
> 
>           core_link_enable_stream(pipe_ctx);
>           unblank_stream(pipe_ctx,
> 
>           program_scaler(dc, pipe_ctx);
>         }
>       /* program audio for all pipes */
>       /* update watermarks */
>     }
> 
>     program_timing_sync(core_dc, context);
>     /* for all targets */
>       target_enable_memory_requests(...);
> 
>     /* Update ASIC power states */
>     pplib_apply_display_requirements(...);
> 
>     /* update surface or page flip */
>   }
> }
> 
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]   ` <20161208095952.hnbfs4b3nac7faap-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
@ 2016-12-08 14:33     ` Harry Wentland
  2016-12-08 15:34       ` Daniel Vetter
  2016-12-08 20:07     ` Dave Airlie
  1 sibling, 1 reply; 66+ messages in thread
From: Harry Wentland @ 2016-12-08 14:33 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Grodzovsky, Andrey, Dave Airlie,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Deucher, Alexander,
	Cheng, Tony

Hi Daniel,

just a quick clarification in-line about "validation" inside atomic_commit.

On 2016-12-08 04:59 AM, Daniel Vetter wrote:
> Hi Harry,
>
> On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
>> We propose to use the Display Core (DC) driver for display support on
>> AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to
>> avoid a flag day the plan is to only support uGPU initially and transition
>> to older ASICs gradually.
>>
>> The DC component has received extensive testing within AMD for DCE8, 10, and
>> 11 GPUs and is being prepared for uGPU. Support should be better than
>> amdgpu's current display support.
>>
>>  * All of our QA effort is focused on DC
>>  * All of our CQE effort is focused on DC
>>  * All of our OEM preloads and custom engagements use DC
>>  * DC behavior mirrors what we do for other OSes
>>
>> The new asic utilizes a completely re-designed atom interface, so we cannot
>> easily leverage much of the existing atom-based code.
>>
>> We've introduced DC to the community earlier in 2016 and received a fair
>> amount of feedback. Some of what we've addressed so far are:
>>
>>  * Self-contain ASIC specific code. We did a bunch of work to pull
>>    common sequences into dc/dce and leave ASIC specific code in
>>    separate folders.
>>  * Started to expose AUX and I2C through generic kernel/drm
>>    functionality and are mostly using that. Some of that code is still
>>    needlessly convoluted. This cleanup is in progress.
>>  * Integrated Dave and Jerome’s work on removing abstraction in bios
>>    parser.
>>  * Retire adapter service and asic capability
>>  * Remove some abstraction in GPIO
>>
>> Since a lot of our code is shared with pre- and post-silicon validation
>> suites changes need to be done gradually to prevent breakages due to a major
>> flag day.  This, coupled with adding support for new asics and lots of new
>> feature introductions means progress has not been as quick as we would have
>> liked. We have made a lot of progress none the less.
>>
>> The remaining concerns that were brought up during the last review that we
>> are working on addressing:
>>
>>  * Continue to cleanup and reduce the abstractions in DC where it
>>    makes sense.
>>  * Removing duplicate code in I2C and AUX as we transition to using the
>>    DRM core interfaces.  We can't fully transition until we've helped
>>    fill in the gaps in the drm core that we need for certain features.
>>  * Making sure Atomic API support is correct.  Some of the semantics of
>>    the Atomic API were not particularly clear when we started this,
>>    however, that is improving a lot as the core drm documentation
>>    improves.  Getting this code upstream and in the hands of more
>>    atomic users will further help us identify and rectify any gaps we
>>    have.
>>
>> Unfortunately we cannot expose code for uGPU yet. However refactor / cleanup
>> work on DC is public.  We're currently transitioning to a public patch
>> review. You can follow our progress on the amd-gfx mailing list. We value
>> community feedback on our work.
>>
>> As an appendix I've included a brief overview of the how the code currently
>> works to make understanding and reviewing the code easier.
>>
>> Prior discussions on DC:
>>
>>  * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html
>>  *
>> https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html
>>
>> Current version of DC:
>>
>>  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>>
>> Once Alex pulls in the latest patches:
>>
>>  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>>
>> Best Regards,
>> Harry
>>
>>
>> ************************************************
>> *** Appendix: A Day in the Life of a Modeset ***
>> ************************************************
>>
>> Below is a high-level overview of a modeset with dc. Some of this might be a
>> little out-of-date since it's based on my XDC presentation but it should be
>> more-or-less the same.
>>
>> amdgpu_dm_atomic_commit()
>> {
>>   /* setup atomic state */
>>   drm_atomic_helper_prepare_planes(dev, state);
>>   drm_atomic_helper_swap_state(dev, state);
>>   drm_atomic_helper_update_legacy_modeset_state(dev, state);
>>
>>   /* create or remove targets */
>>
>>   /********************************************************************
>>    * *** Call into DC to commit targets with list of all known targets
>>    ********************************************************************/
>>   /* DC is optimized not to do anything if 'targets' didn't change. */
>>   dc_commit_targets(dm->dc, commit_targets, commit_targets_count)
>>   {
>>     /******************************************************************
>>      * *** Build context (function also used for validation)
>>      ******************************************************************/
>>     result = core_dc->res_pool->funcs->validate_with_context(
>>                                core_dc,set,target_count,context);
>
> I can't dig into details of DC, so this is not a 100% assessment, but if
> you call a function called "validate" in atomic_commit, you're very, very
> likely breaking atomic. _All_ validation must happen in ->atomic_check,
> if that's not the case TEST_ONLY mode is broken. And atomic userspace is
> relying on that working.
>

This function is not really named correctly. What it does is it builds a 
context and validates at the same time. In commit we simply care that it 
builds the context. Validate should never fail here (since this was 
already validated in atomic_check).

We call the same function at atomic_check

amdgpu_dm_atomic_check ->
	dc_validate_resources ->
		core_dc->res_pool->funcs->validate_with_context


As for the rest, I hear you and appreciate your feedback. Let me get 
back to you on that later.

Thanks,
Harry


> The only thing that you're allowed to return from ->atomic_commit is
> out-of-memory, hw-on-fire and similar unforeseen and catastrophic issues.
> Kerneldoc expklains this.
>
> Now the reason I bring this up (and we've discussed it at length in
> private) is that DC still suffers from a massive abstraction midlayer. A
> lot of the back-end stuff (dp aux, i2c, abstractions for allocation,
> timers, irq, ...) have been cleaned up, but the midlayer is still there.
> And I understand why you have it, and why it's there - without some OS
> abstraction your grand plan of a unified driver across everything doesn't
> work out so well.
>
> But in a way the backend stuff isn't such a big deal. It's annoying since
> lots of code, and bugfixes have to be duplicated and all that, but it's
> fairly easy to fix case-by-case, and as long as AMD folks stick around
> (which I fully expect) not a maintainance issue. It makes it harder for
> others to contribute, but then since it's mostly the leaf it's generally
> easy to just improve the part you want to change (as an outsider). And if
> you want to improve shared code the only downside is that you can't also
> improve amd, but that's not so much a problem for non-amd folks ;-)
>
> The problem otoh with the abstraction layer between drm core and the amd
> driver is that you can't ignore if you want to refactor shared code. And
> because it's an entire world of its own, it's much harder to understand
> what the driver is doing (without reading it all). Some examples of what I
> mean:
>
> - All other drm drivers subclass drm objects (by embedding them) into the
>   corresponding hw part that most closely matches the drm object's
>   semantics. That means even when you have 0 clue about how a given piece
>   of hw works, you have a reasonable chance of understanding code. If it's
>   all your own stuff you always have to keep in minde the special amd
>   naming conventions. That gets old real fast if you trying to figure out
>   what 20+ (or are we at 30 already?) drivers are doing.
>
> - This is even more true for atomic. Atomic has a pretty complicated
>   check/commmit transactional model for updating display state. It's a
>   standardized interface, and it's extensible, and we want generic
>   userspace to be able to run on any driver. Fairly often we realize that
>   semantics of existing or newly proposed properties and state isn't
>   well-defined enough, and then we need to go&read all the drivers and
>   figure out how to fix up the mess. DC has it's entirely separate state
>   structures which again don't subclass the atomic core structures (afaik
>   at least). Again the same problems apply that you can't find things, and
>   that figuring out the exact semantics and spotting differences in
>   behaviour is almost impossible.
>
> - The trouble isn't just in reading code and understanding it correctly,
>   it's also in finding it. If you have your own completely different world
>   then just finding the right code is hard - cscope and grep fail to work.
>
> - Another issue is that very often we unify semantics in drivers by adding
>   some new helpers that at least dtrt for most of the drivers. If you have
>   your own world then the impendance mismatch will make sure that amd
>   drivers will have slightly different semantics, and I think that's not
>   good for the ecosystem and kms - people want to run a lot more than just
>   a boot splash with generic kms userspace, stuff like xf86-video-$vendor
>   is going out of favour heavily.
>
> Note that all this isn't about amd walking away and leaving an
> unmaintainable mess behind. Like I've said I don't think this is a big
> risk. The trouble is that having your own world makes it harder for
> everyone else to understand the amd driver, and understanding all drivers
> is very often step 1 in some big refactoring or feature addition effort.
> Because starting to refactor without understanding the problem generally
> doesn't work ;_) And you can't make this step 1 easier for others by
> promising to always maintain DC and update it to all the core changes,
> because that's only step 2.
>
> In all the DC discussions we've had thus far I haven't seen anyone address
> this issue. And this isn't just an issue in drm, it's pretty much
> established across all linux subsystems with the "no midlayer or OS
> abstraction layers in drivers" rule. There's some real solid reasons why
> such a HAl is extremely unpopular with upstream. And I haven't yet seen
> any good reason why amd needs to be different, thus far it looks like a
> textbook case, and there's been lots of vendors in lots of subsystems who
> tried to push their HAL.
>
> Thanks, Daniel
>
>>
>>     /******************************************************************
>>      * *** Apply safe power state
>>      ******************************************************************/
>>     pplib_apply_safe_state(core_dc);
>>
>>     /****************************************************************
>>      * *** Apply the context to HW (program HW)
>>      ****************************************************************/
>>     result = core_dc->hwss.apply_ctx_to_hw(core_dc,context)
>>     {
>>       /* reset pipes that need reprogramming */
>>       /* disable pipe power gating */
>>       /* set safe watermarks */
>>
>>       /* for all pipes with an attached stream */
>>         /************************************************************
>>          * *** Programming all per-pipe contexts
>>          ************************************************************/
>>         status = apply_single_controller_ctx_to_hw(...)
>>         {
>>           pipe_ctx->tg->funcs->set_blank(...);
>>           pipe_ctx->clock_source->funcs->program_pix_clk(...);
>>           pipe_ctx->tg->funcs->program_timing(...);
>>           pipe_ctx->mi->funcs->allocate_mem_input(...);
>>           pipe_ctx->tg->funcs->enable_crtc(...);
>>           bios_parser_crtc_source_select(...);
>>
>>           pipe_ctx->opp->funcs->opp_set_dyn_expansion(...);
>>           pipe_ctx->opp->funcs->opp_program_fmt(...);
>>
>>           stream->sink->link->link_enc->funcs->setup(...);
>>           pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...);
>>           pipe_ctx->tg->funcs->set_blank_color(...);
>>
>>           core_link_enable_stream(pipe_ctx);
>>           unblank_stream(pipe_ctx,
>>
>>           program_scaler(dc, pipe_ctx);
>>         }
>>       /* program audio for all pipes */
>>       /* update watermarks */
>>     }
>>
>>     program_timing_sync(core_dc, context);
>>     /* for all targets */
>>       target_enable_memory_requests(...);
>>
>>     /* Update ASIC power states */
>>     pplib_apply_display_requirements(...);
>>
>>     /* update surface or page flip */
>>   }
>> }
>>
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-08 14:33     ` Harry Wentland
@ 2016-12-08 15:34       ` Daniel Vetter
       [not found]         ` <20161208153417.yrpbhmot5gfv37lo-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
  0 siblings, 1 reply; 66+ messages in thread
From: Daniel Vetter @ 2016-12-08 15:34 UTC (permalink / raw)
  To: Harry Wentland
  Cc: Grodzovsky, Andrey, Cheng, Tony, amd-gfx, dri-devel, Deucher, Alexander

On Thu, Dec 08, 2016 at 09:33:25AM -0500, Harry Wentland wrote:
> Hi Daniel,
> 
> just a quick clarification in-line about "validation" inside atomic_commit.
> 
> On 2016-12-08 04:59 AM, Daniel Vetter wrote:
> > Hi Harry,
> > 
> > On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
> > > We propose to use the Display Core (DC) driver for display support on
> > > AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to
> > > avoid a flag day the plan is to only support uGPU initially and transition
> > > to older ASICs gradually.
> > > 
> > > The DC component has received extensive testing within AMD for DCE8, 10, and
> > > 11 GPUs and is being prepared for uGPU. Support should be better than
> > > amdgpu's current display support.
> > > 
> > >  * All of our QA effort is focused on DC
> > >  * All of our CQE effort is focused on DC
> > >  * All of our OEM preloads and custom engagements use DC
> > >  * DC behavior mirrors what we do for other OSes
> > > 
> > > The new asic utilizes a completely re-designed atom interface, so we cannot
> > > easily leverage much of the existing atom-based code.
> > > 
> > > We've introduced DC to the community earlier in 2016 and received a fair
> > > amount of feedback. Some of what we've addressed so far are:
> > > 
> > >  * Self-contain ASIC specific code. We did a bunch of work to pull
> > >    common sequences into dc/dce and leave ASIC specific code in
> > >    separate folders.
> > >  * Started to expose AUX and I2C through generic kernel/drm
> > >    functionality and are mostly using that. Some of that code is still
> > >    needlessly convoluted. This cleanup is in progress.
> > >  * Integrated Dave and Jerome’s work on removing abstraction in bios
> > >    parser.
> > >  * Retire adapter service and asic capability
> > >  * Remove some abstraction in GPIO
> > > 
> > > Since a lot of our code is shared with pre- and post-silicon validation
> > > suites changes need to be done gradually to prevent breakages due to a major
> > > flag day.  This, coupled with adding support for new asics and lots of new
> > > feature introductions means progress has not been as quick as we would have
> > > liked. We have made a lot of progress none the less.
> > > 
> > > The remaining concerns that were brought up during the last review that we
> > > are working on addressing:
> > > 
> > >  * Continue to cleanup and reduce the abstractions in DC where it
> > >    makes sense.
> > >  * Removing duplicate code in I2C and AUX as we transition to using the
> > >    DRM core interfaces.  We can't fully transition until we've helped
> > >    fill in the gaps in the drm core that we need for certain features.
> > >  * Making sure Atomic API support is correct.  Some of the semantics of
> > >    the Atomic API were not particularly clear when we started this,
> > >    however, that is improving a lot as the core drm documentation
> > >    improves.  Getting this code upstream and in the hands of more
> > >    atomic users will further help us identify and rectify any gaps we
> > >    have.
> > > 
> > > Unfortunately we cannot expose code for uGPU yet. However refactor / cleanup
> > > work on DC is public.  We're currently transitioning to a public patch
> > > review. You can follow our progress on the amd-gfx mailing list. We value
> > > community feedback on our work.
> > > 
> > > As an appendix I've included a brief overview of the how the code currently
> > > works to make understanding and reviewing the code easier.
> > > 
> > > Prior discussions on DC:
> > > 
> > >  * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html
> > >  *
> > > https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html
> > > 
> > > Current version of DC:
> > > 
> > >  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
> > > 
> > > Once Alex pulls in the latest patches:
> > > 
> > >  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
> > > 
> > > Best Regards,
> > > Harry
> > > 
> > > 
> > > ************************************************
> > > *** Appendix: A Day in the Life of a Modeset ***
> > > ************************************************
> > > 
> > > Below is a high-level overview of a modeset with dc. Some of this might be a
> > > little out-of-date since it's based on my XDC presentation but it should be
> > > more-or-less the same.
> > > 
> > > amdgpu_dm_atomic_commit()
> > > {
> > >   /* setup atomic state */
> > >   drm_atomic_helper_prepare_planes(dev, state);
> > >   drm_atomic_helper_swap_state(dev, state);
> > >   drm_atomic_helper_update_legacy_modeset_state(dev, state);
> > > 
> > >   /* create or remove targets */
> > > 
> > >   /********************************************************************
> > >    * *** Call into DC to commit targets with list of all known targets
> > >    ********************************************************************/
> > >   /* DC is optimized not to do anything if 'targets' didn't change. */
> > >   dc_commit_targets(dm->dc, commit_targets, commit_targets_count)
> > >   {
> > >     /******************************************************************
> > >      * *** Build context (function also used for validation)
> > >      ******************************************************************/
> > >     result = core_dc->res_pool->funcs->validate_with_context(
> > >                                core_dc,set,target_count,context);
> > 
> > I can't dig into details of DC, so this is not a 100% assessment, but if
> > you call a function called "validate" in atomic_commit, you're very, very
> > likely breaking atomic. _All_ validation must happen in ->atomic_check,
> > if that's not the case TEST_ONLY mode is broken. And atomic userspace is
> > relying on that working.
> > 
> 
> This function is not really named correctly. What it does is it builds a
> context and validates at the same time. In commit we simply care that it
> builds the context. Validate should never fail here (since this was already
> validated in atomic_check).
> 
> We call the same function at atomic_check
> 
> amdgpu_dm_atomic_check ->
> 	dc_validate_resources ->
> 		core_dc->res_pool->funcs->validate_with_context

Ah right, iirc you told me this the last time around too ;-) I guess a
great example for what I mean with rolling your own world: Existing atomic
drivers put their derived/computed/validated check into their subclasses
state structures, which means they don't need to be re-computed in
atomic_check. It also makes sure that the validation code/state
computation code between check and commit doesn't get out of sync.

> As for the rest, I hear you and appreciate your feedback. Let me get back to
> you on that later.

Just an added note on that: I do think that there's some driver teams
who've managed to pull a shared codebase between validation and upstream
linux (iirc some of the intel wireless drivers work like that). But it
requires careful aligning of everything, and with something fast-moving
like drm it might become real painful and not really worth it. So not
outright rejecting DC (and the code sharing you want to achieve with it)
as an idea here.
-Daniel

> 
> Thanks,
> Harry
> 
> 
> > The only thing that you're allowed to return from ->atomic_commit is
> > out-of-memory, hw-on-fire and similar unforeseen and catastrophic issues.
> > Kerneldoc expklains this.
> > 
> > Now the reason I bring this up (and we've discussed it at length in
> > private) is that DC still suffers from a massive abstraction midlayer. A
> > lot of the back-end stuff (dp aux, i2c, abstractions for allocation,
> > timers, irq, ...) have been cleaned up, but the midlayer is still there.
> > And I understand why you have it, and why it's there - without some OS
> > abstraction your grand plan of a unified driver across everything doesn't
> > work out so well.
> > 
> > But in a way the backend stuff isn't such a big deal. It's annoying since
> > lots of code, and bugfixes have to be duplicated and all that, but it's
> > fairly easy to fix case-by-case, and as long as AMD folks stick around
> > (which I fully expect) not a maintainance issue. It makes it harder for
> > others to contribute, but then since it's mostly the leaf it's generally
> > easy to just improve the part you want to change (as an outsider). And if
> > you want to improve shared code the only downside is that you can't also
> > improve amd, but that's not so much a problem for non-amd folks ;-)
> > 
> > The problem otoh with the abstraction layer between drm core and the amd
> > driver is that you can't ignore if you want to refactor shared code. And
> > because it's an entire world of its own, it's much harder to understand
> > what the driver is doing (without reading it all). Some examples of what I
> > mean:
> > 
> > - All other drm drivers subclass drm objects (by embedding them) into the
> >   corresponding hw part that most closely matches the drm object's
> >   semantics. That means even when you have 0 clue about how a given piece
> >   of hw works, you have a reasonable chance of understanding code. If it's
> >   all your own stuff you always have to keep in minde the special amd
> >   naming conventions. That gets old real fast if you trying to figure out
> >   what 20+ (or are we at 30 already?) drivers are doing.
> > 
> > - This is even more true for atomic. Atomic has a pretty complicated
> >   check/commmit transactional model for updating display state. It's a
> >   standardized interface, and it's extensible, and we want generic
> >   userspace to be able to run on any driver. Fairly often we realize that
> >   semantics of existing or newly proposed properties and state isn't
> >   well-defined enough, and then we need to go&read all the drivers and
> >   figure out how to fix up the mess. DC has it's entirely separate state
> >   structures which again don't subclass the atomic core structures (afaik
> >   at least). Again the same problems apply that you can't find things, and
> >   that figuring out the exact semantics and spotting differences in
> >   behaviour is almost impossible.
> > 
> > - The trouble isn't just in reading code and understanding it correctly,
> >   it's also in finding it. If you have your own completely different world
> >   then just finding the right code is hard - cscope and grep fail to work.
> > 
> > - Another issue is that very often we unify semantics in drivers by adding
> >   some new helpers that at least dtrt for most of the drivers. If you have
> >   your own world then the impendance mismatch will make sure that amd
> >   drivers will have slightly different semantics, and I think that's not
> >   good for the ecosystem and kms - people want to run a lot more than just
> >   a boot splash with generic kms userspace, stuff like xf86-video-$vendor
> >   is going out of favour heavily.
> > 
> > Note that all this isn't about amd walking away and leaving an
> > unmaintainable mess behind. Like I've said I don't think this is a big
> > risk. The trouble is that having your own world makes it harder for
> > everyone else to understand the amd driver, and understanding all drivers
> > is very often step 1 in some big refactoring or feature addition effort.
> > Because starting to refactor without understanding the problem generally
> > doesn't work ;_) And you can't make this step 1 easier for others by
> > promising to always maintain DC and update it to all the core changes,
> > because that's only step 2.
> > 
> > In all the DC discussions we've had thus far I haven't seen anyone address
> > this issue. And this isn't just an issue in drm, it's pretty much
> > established across all linux subsystems with the "no midlayer or OS
> > abstraction layers in drivers" rule. There's some real solid reasons why
> > such a HAl is extremely unpopular with upstream. And I haven't yet seen
> > any good reason why amd needs to be different, thus far it looks like a
> > textbook case, and there's been lots of vendors in lots of subsystems who
> > tried to push their HAL.
> > 
> > Thanks, Daniel
> > 
> > > 
> > >     /******************************************************************
> > >      * *** Apply safe power state
> > >      ******************************************************************/
> > >     pplib_apply_safe_state(core_dc);
> > > 
> > >     /****************************************************************
> > >      * *** Apply the context to HW (program HW)
> > >      ****************************************************************/
> > >     result = core_dc->hwss.apply_ctx_to_hw(core_dc,context)
> > >     {
> > >       /* reset pipes that need reprogramming */
> > >       /* disable pipe power gating */
> > >       /* set safe watermarks */
> > > 
> > >       /* for all pipes with an attached stream */
> > >         /************************************************************
> > >          * *** Programming all per-pipe contexts
> > >          ************************************************************/
> > >         status = apply_single_controller_ctx_to_hw(...)
> > >         {
> > >           pipe_ctx->tg->funcs->set_blank(...);
> > >           pipe_ctx->clock_source->funcs->program_pix_clk(...);
> > >           pipe_ctx->tg->funcs->program_timing(...);
> > >           pipe_ctx->mi->funcs->allocate_mem_input(...);
> > >           pipe_ctx->tg->funcs->enable_crtc(...);
> > >           bios_parser_crtc_source_select(...);
> > > 
> > >           pipe_ctx->opp->funcs->opp_set_dyn_expansion(...);
> > >           pipe_ctx->opp->funcs->opp_program_fmt(...);
> > > 
> > >           stream->sink->link->link_enc->funcs->setup(...);
> > >           pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...);
> > >           pipe_ctx->tg->funcs->set_blank_color(...);
> > > 
> > >           core_link_enable_stream(pipe_ctx);
> > >           unblank_stream(pipe_ctx,
> > > 
> > >           program_scaler(dc, pipe_ctx);
> > >         }
> > >       /* program audio for all pipes */
> > >       /* update watermarks */
> > >     }
> > > 
> > >     program_timing_sync(core_dc, context);
> > >     /* for all targets */
> > >       target_enable_memory_requests(...);
> > > 
> > >     /* Update ASIC power states */
> > >     pplib_apply_display_requirements(...);
> > > 
> > >     /* update surface or page flip */
> > >   }
> > > }
> > > 
> > > 
> > > _______________________________________________
> > > dri-devel mailing list
> > > dri-devel@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]         ` <20161208153417.yrpbhmot5gfv37lo-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
@ 2016-12-08 15:41           ` Christian König
  2016-12-08 15:46             ` Daniel Vetter
  2016-12-08 20:24             ` Matthew Macy
  2016-12-08 17:40           ` Alex Deucher
  1 sibling, 2 replies; 66+ messages in thread
From: Christian König @ 2016-12-08 15:41 UTC (permalink / raw)
  To: Daniel Vetter, Harry Wentland
  Cc: Deucher, Alexander, Grodzovsky, Andrey, Cheng, Tony,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 08.12.2016 um 16:34 schrieb Daniel Vetter:
> On Thu, Dec 08, 2016 at 09:33:25AM -0500, Harry Wentland wrote:
>> Hi Daniel,
>>
>> just a quick clarification in-line about "validation" inside atomic_commit.
>>
>> On 2016-12-08 04:59 AM, Daniel Vetter wrote:
>>> Hi Harry,
>>>
>>> On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
>>>> We propose to use the Display Core (DC) driver for display support on
>>>> AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to
>>>> avoid a flag day the plan is to only support uGPU initially and transition
>>>> to older ASICs gradually.
>>>>
>>>> The DC component has received extensive testing within AMD for DCE8, 10, and
>>>> 11 GPUs and is being prepared for uGPU. Support should be better than
>>>> amdgpu's current display support.
>>>>
>>>>   * All of our QA effort is focused on DC
>>>>   * All of our CQE effort is focused on DC
>>>>   * All of our OEM preloads and custom engagements use DC
>>>>   * DC behavior mirrors what we do for other OSes
>>>>
>>>> The new asic utilizes a completely re-designed atom interface, so we cannot
>>>> easily leverage much of the existing atom-based code.
>>>>
>>>> We've introduced DC to the community earlier in 2016 and received a fair
>>>> amount of feedback. Some of what we've addressed so far are:
>>>>
>>>>   * Self-contain ASIC specific code. We did a bunch of work to pull
>>>>     common sequences into dc/dce and leave ASIC specific code in
>>>>     separate folders.
>>>>   * Started to expose AUX and I2C through generic kernel/drm
>>>>     functionality and are mostly using that. Some of that code is still
>>>>     needlessly convoluted. This cleanup is in progress.
>>>>   * Integrated Dave and Jerome’s work on removing abstraction in bios
>>>>     parser.
>>>>   * Retire adapter service and asic capability
>>>>   * Remove some abstraction in GPIO
>>>>
>>>> Since a lot of our code is shared with pre- and post-silicon validation
>>>> suites changes need to be done gradually to prevent breakages due to a major
>>>> flag day.  This, coupled with adding support for new asics and lots of new
>>>> feature introductions means progress has not been as quick as we would have
>>>> liked. We have made a lot of progress none the less.
>>>>
>>>> The remaining concerns that were brought up during the last review that we
>>>> are working on addressing:
>>>>
>>>>   * Continue to cleanup and reduce the abstractions in DC where it
>>>>     makes sense.
>>>>   * Removing duplicate code in I2C and AUX as we transition to using the
>>>>     DRM core interfaces.  We can't fully transition until we've helped
>>>>     fill in the gaps in the drm core that we need for certain features.
>>>>   * Making sure Atomic API support is correct.  Some of the semantics of
>>>>     the Atomic API were not particularly clear when we started this,
>>>>     however, that is improving a lot as the core drm documentation
>>>>     improves.  Getting this code upstream and in the hands of more
>>>>     atomic users will further help us identify and rectify any gaps we
>>>>     have.
>>>>
>>>> Unfortunately we cannot expose code for uGPU yet. However refactor / cleanup
>>>> work on DC is public.  We're currently transitioning to a public patch
>>>> review. You can follow our progress on the amd-gfx mailing list. We value
>>>> community feedback on our work.
>>>>
>>>> As an appendix I've included a brief overview of the how the code currently
>>>> works to make understanding and reviewing the code easier.
>>>>
>>>> Prior discussions on DC:
>>>>
>>>>   * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html
>>>>   *
>>>> https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html
>>>>
>>>> Current version of DC:
>>>>
>>>>   * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>>>>
>>>> Once Alex pulls in the latest patches:
>>>>
>>>>   * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>>>>
>>>> Best Regards,
>>>> Harry
>>>>
>>>>
>>>> ************************************************
>>>> *** Appendix: A Day in the Life of a Modeset ***
>>>> ************************************************
>>>>
>>>> Below is a high-level overview of a modeset with dc. Some of this might be a
>>>> little out-of-date since it's based on my XDC presentation but it should be
>>>> more-or-less the same.
>>>>
>>>> amdgpu_dm_atomic_commit()
>>>> {
>>>>    /* setup atomic state */
>>>>    drm_atomic_helper_prepare_planes(dev, state);
>>>>    drm_atomic_helper_swap_state(dev, state);
>>>>    drm_atomic_helper_update_legacy_modeset_state(dev, state);
>>>>
>>>>    /* create or remove targets */
>>>>
>>>>    /********************************************************************
>>>>     * *** Call into DC to commit targets with list of all known targets
>>>>     ********************************************************************/
>>>>    /* DC is optimized not to do anything if 'targets' didn't change. */
>>>>    dc_commit_targets(dm->dc, commit_targets, commit_targets_count)
>>>>    {
>>>>      /******************************************************************
>>>>       * *** Build context (function also used for validation)
>>>>       ******************************************************************/
>>>>      result = core_dc->res_pool->funcs->validate_with_context(
>>>>                                 core_dc,set,target_count,context);
>>> I can't dig into details of DC, so this is not a 100% assessment, but if
>>> you call a function called "validate" in atomic_commit, you're very, very
>>> likely breaking atomic. _All_ validation must happen in ->atomic_check,
>>> if that's not the case TEST_ONLY mode is broken. And atomic userspace is
>>> relying on that working.
>>>
>> This function is not really named correctly. What it does is it builds a
>> context and validates at the same time. In commit we simply care that it
>> builds the context. Validate should never fail here (since this was already
>> validated in atomic_check).
>>
>> We call the same function at atomic_check
>>
>> amdgpu_dm_atomic_check ->
>> 	dc_validate_resources ->
>> 		core_dc->res_pool->funcs->validate_with_context
> Ah right, iirc you told me this the last time around too ;-) I guess a
> great example for what I mean with rolling your own world: Existing atomic
> drivers put their derived/computed/validated check into their subclasses
> state structures, which means they don't need to be re-computed in
> atomic_check. It also makes sure that the validation code/state
> computation code between check and commit doesn't get out of sync.
>
>> As for the rest, I hear you and appreciate your feedback. Let me get back to
>> you on that later.
> Just an added note on that: I do think that there's some driver teams
> who've managed to pull a shared codebase between validation and upstream
> linux (iirc some of the intel wireless drivers work like that). But it
> requires careful aligning of everything, and with something fast-moving
> like drm it might become real painful and not really worth it. So not
> outright rejecting DC (and the code sharing you want to achieve with it)
> as an idea here.

I used to have examples of such a things for other network drivers as 
well, but right now I can't find them of hand. Leave me a note if you 
need more info on existing things.

A good idea might as well be to take a look at drivers shared between 
Linux and BSD as well, cause both code bases are usually public 
available and you can see what changes during porting and what stays the 
same.

Regards,
Christian.

> -Daniel
>
>> Thanks,
>> Harry
>>
>>
>>> The only thing that you're allowed to return from ->atomic_commit is
>>> out-of-memory, hw-on-fire and similar unforeseen and catastrophic issues.
>>> Kerneldoc expklains this.
>>>
>>> Now the reason I bring this up (and we've discussed it at length in
>>> private) is that DC still suffers from a massive abstraction midlayer. A
>>> lot of the back-end stuff (dp aux, i2c, abstractions for allocation,
>>> timers, irq, ...) have been cleaned up, but the midlayer is still there.
>>> And I understand why you have it, and why it's there - without some OS
>>> abstraction your grand plan of a unified driver across everything doesn't
>>> work out so well.
>>>
>>> But in a way the backend stuff isn't such a big deal. It's annoying since
>>> lots of code, and bugfixes have to be duplicated and all that, but it's
>>> fairly easy to fix case-by-case, and as long as AMD folks stick around
>>> (which I fully expect) not a maintainance issue. It makes it harder for
>>> others to contribute, but then since it's mostly the leaf it's generally
>>> easy to just improve the part you want to change (as an outsider). And if
>>> you want to improve shared code the only downside is that you can't also
>>> improve amd, but that's not so much a problem for non-amd folks ;-)
>>>
>>> The problem otoh with the abstraction layer between drm core and the amd
>>> driver is that you can't ignore if you want to refactor shared code. And
>>> because it's an entire world of its own, it's much harder to understand
>>> what the driver is doing (without reading it all). Some examples of what I
>>> mean:
>>>
>>> - All other drm drivers subclass drm objects (by embedding them) into the
>>>    corresponding hw part that most closely matches the drm object's
>>>    semantics. That means even when you have 0 clue about how a given piece
>>>    of hw works, you have a reasonable chance of understanding code. If it's
>>>    all your own stuff you always have to keep in minde the special amd
>>>    naming conventions. That gets old real fast if you trying to figure out
>>>    what 20+ (or are we at 30 already?) drivers are doing.
>>>
>>> - This is even more true for atomic. Atomic has a pretty complicated
>>>    check/commmit transactional model for updating display state. It's a
>>>    standardized interface, and it's extensible, and we want generic
>>>    userspace to be able to run on any driver. Fairly often we realize that
>>>    semantics of existing or newly proposed properties and state isn't
>>>    well-defined enough, and then we need to go&read all the drivers and
>>>    figure out how to fix up the mess. DC has it's entirely separate state
>>>    structures which again don't subclass the atomic core structures (afaik
>>>    at least). Again the same problems apply that you can't find things, and
>>>    that figuring out the exact semantics and spotting differences in
>>>    behaviour is almost impossible.
>>>
>>> - The trouble isn't just in reading code and understanding it correctly,
>>>    it's also in finding it. If you have your own completely different world
>>>    then just finding the right code is hard - cscope and grep fail to work.
>>>
>>> - Another issue is that very often we unify semantics in drivers by adding
>>>    some new helpers that at least dtrt for most of the drivers. If you have
>>>    your own world then the impendance mismatch will make sure that amd
>>>    drivers will have slightly different semantics, and I think that's not
>>>    good for the ecosystem and kms - people want to run a lot more than just
>>>    a boot splash with generic kms userspace, stuff like xf86-video-$vendor
>>>    is going out of favour heavily.
>>>
>>> Note that all this isn't about amd walking away and leaving an
>>> unmaintainable mess behind. Like I've said I don't think this is a big
>>> risk. The trouble is that having your own world makes it harder for
>>> everyone else to understand the amd driver, and understanding all drivers
>>> is very often step 1 in some big refactoring or feature addition effort.
>>> Because starting to refactor without understanding the problem generally
>>> doesn't work ;_) And you can't make this step 1 easier for others by
>>> promising to always maintain DC and update it to all the core changes,
>>> because that's only step 2.
>>>
>>> In all the DC discussions we've had thus far I haven't seen anyone address
>>> this issue. And this isn't just an issue in drm, it's pretty much
>>> established across all linux subsystems with the "no midlayer or OS
>>> abstraction layers in drivers" rule. There's some real solid reasons why
>>> such a HAl is extremely unpopular with upstream. And I haven't yet seen
>>> any good reason why amd needs to be different, thus far it looks like a
>>> textbook case, and there's been lots of vendors in lots of subsystems who
>>> tried to push their HAL.
>>>
>>> Thanks, Daniel
>>>
>>>>      /******************************************************************
>>>>       * *** Apply safe power state
>>>>       ******************************************************************/
>>>>      pplib_apply_safe_state(core_dc);
>>>>
>>>>      /****************************************************************
>>>>       * *** Apply the context to HW (program HW)
>>>>       ****************************************************************/
>>>>      result = core_dc->hwss.apply_ctx_to_hw(core_dc,context)
>>>>      {
>>>>        /* reset pipes that need reprogramming */
>>>>        /* disable pipe power gating */
>>>>        /* set safe watermarks */
>>>>
>>>>        /* for all pipes with an attached stream */
>>>>          /************************************************************
>>>>           * *** Programming all per-pipe contexts
>>>>           ************************************************************/
>>>>          status = apply_single_controller_ctx_to_hw(...)
>>>>          {
>>>>            pipe_ctx->tg->funcs->set_blank(...);
>>>>            pipe_ctx->clock_source->funcs->program_pix_clk(...);
>>>>            pipe_ctx->tg->funcs->program_timing(...);
>>>>            pipe_ctx->mi->funcs->allocate_mem_input(...);
>>>>            pipe_ctx->tg->funcs->enable_crtc(...);
>>>>            bios_parser_crtc_source_select(...);
>>>>
>>>>            pipe_ctx->opp->funcs->opp_set_dyn_expansion(...);
>>>>            pipe_ctx->opp->funcs->opp_program_fmt(...);
>>>>
>>>>            stream->sink->link->link_enc->funcs->setup(...);
>>>>            pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...);
>>>>            pipe_ctx->tg->funcs->set_blank_color(...);
>>>>
>>>>            core_link_enable_stream(pipe_ctx);
>>>>            unblank_stream(pipe_ctx,
>>>>
>>>>            program_scaler(dc, pipe_ctx);
>>>>          }
>>>>        /* program audio for all pipes */
>>>>        /* update watermarks */
>>>>      }
>>>>
>>>>      program_timing_sync(core_dc, context);
>>>>      /* for all targets */
>>>>        target_enable_memory_requests(...);
>>>>
>>>>      /* Update ASIC power states */
>>>>      pplib_apply_display_requirements(...);
>>>>
>>>>      /* update surface or page flip */
>>>>    }
>>>> }
>>>>
>>>>
>>>> _______________________________________________
>>>> dri-devel mailing list
>>>> dri-devel@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-08 15:41           ` Christian König
@ 2016-12-08 15:46             ` Daniel Vetter
  2016-12-08 20:24             ` Matthew Macy
  1 sibling, 0 replies; 66+ messages in thread
From: Daniel Vetter @ 2016-12-08 15:46 UTC (permalink / raw)
  To: Christian König
  Cc: Grodzovsky, Andrey, dri-devel, amd-gfx, Deucher, Alexander, Cheng, Tony

On Thu, Dec 08, 2016 at 04:41:52PM +0100, Christian König wrote:
> Am 08.12.2016 um 16:34 schrieb Daniel Vetter:
> > On Thu, Dec 08, 2016 at 09:33:25AM -0500, Harry Wentland wrote:
> > > Hi Daniel,
> > > 
> > > just a quick clarification in-line about "validation" inside atomic_commit.
> > > 
> > > On 2016-12-08 04:59 AM, Daniel Vetter wrote:
> > > > Hi Harry,
> > > > 
> > > > On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
> > > > > We propose to use the Display Core (DC) driver for display support on
> > > > > AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to
> > > > > avoid a flag day the plan is to only support uGPU initially and transition
> > > > > to older ASICs gradually.
> > > > > 
> > > > > The DC component has received extensive testing within AMD for DCE8, 10, and
> > > > > 11 GPUs and is being prepared for uGPU. Support should be better than
> > > > > amdgpu's current display support.
> > > > > 
> > > > >   * All of our QA effort is focused on DC
> > > > >   * All of our CQE effort is focused on DC
> > > > >   * All of our OEM preloads and custom engagements use DC
> > > > >   * DC behavior mirrors what we do for other OSes
> > > > > 
> > > > > The new asic utilizes a completely re-designed atom interface, so we cannot
> > > > > easily leverage much of the existing atom-based code.
> > > > > 
> > > > > We've introduced DC to the community earlier in 2016 and received a fair
> > > > > amount of feedback. Some of what we've addressed so far are:
> > > > > 
> > > > >   * Self-contain ASIC specific code. We did a bunch of work to pull
> > > > >     common sequences into dc/dce and leave ASIC specific code in
> > > > >     separate folders.
> > > > >   * Started to expose AUX and I2C through generic kernel/drm
> > > > >     functionality and are mostly using that. Some of that code is still
> > > > >     needlessly convoluted. This cleanup is in progress.
> > > > >   * Integrated Dave and Jerome’s work on removing abstraction in bios
> > > > >     parser.
> > > > >   * Retire adapter service and asic capability
> > > > >   * Remove some abstraction in GPIO
> > > > > 
> > > > > Since a lot of our code is shared with pre- and post-silicon validation
> > > > > suites changes need to be done gradually to prevent breakages due to a major
> > > > > flag day.  This, coupled with adding support for new asics and lots of new
> > > > > feature introductions means progress has not been as quick as we would have
> > > > > liked. We have made a lot of progress none the less.
> > > > > 
> > > > > The remaining concerns that were brought up during the last review that we
> > > > > are working on addressing:
> > > > > 
> > > > >   * Continue to cleanup and reduce the abstractions in DC where it
> > > > >     makes sense.
> > > > >   * Removing duplicate code in I2C and AUX as we transition to using the
> > > > >     DRM core interfaces.  We can't fully transition until we've helped
> > > > >     fill in the gaps in the drm core that we need for certain features.
> > > > >   * Making sure Atomic API support is correct.  Some of the semantics of
> > > > >     the Atomic API were not particularly clear when we started this,
> > > > >     however, that is improving a lot as the core drm documentation
> > > > >     improves.  Getting this code upstream and in the hands of more
> > > > >     atomic users will further help us identify and rectify any gaps we
> > > > >     have.
> > > > > 
> > > > > Unfortunately we cannot expose code for uGPU yet. However refactor / cleanup
> > > > > work on DC is public.  We're currently transitioning to a public patch
> > > > > review. You can follow our progress on the amd-gfx mailing list. We value
> > > > > community feedback on our work.
> > > > > 
> > > > > As an appendix I've included a brief overview of the how the code currently
> > > > > works to make understanding and reviewing the code easier.
> > > > > 
> > > > > Prior discussions on DC:
> > > > > 
> > > > >   * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html
> > > > >   *
> > > > > https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html
> > > > > 
> > > > > Current version of DC:
> > > > > 
> > > > >   * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
> > > > > 
> > > > > Once Alex pulls in the latest patches:
> > > > > 
> > > > >   * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
> > > > > 
> > > > > Best Regards,
> > > > > Harry
> > > > > 
> > > > > 
> > > > > ************************************************
> > > > > *** Appendix: A Day in the Life of a Modeset ***
> > > > > ************************************************
> > > > > 
> > > > > Below is a high-level overview of a modeset with dc. Some of this might be a
> > > > > little out-of-date since it's based on my XDC presentation but it should be
> > > > > more-or-less the same.
> > > > > 
> > > > > amdgpu_dm_atomic_commit()
> > > > > {
> > > > >    /* setup atomic state */
> > > > >    drm_atomic_helper_prepare_planes(dev, state);
> > > > >    drm_atomic_helper_swap_state(dev, state);
> > > > >    drm_atomic_helper_update_legacy_modeset_state(dev, state);
> > > > > 
> > > > >    /* create or remove targets */
> > > > > 
> > > > >    /********************************************************************
> > > > >     * *** Call into DC to commit targets with list of all known targets
> > > > >     ********************************************************************/
> > > > >    /* DC is optimized not to do anything if 'targets' didn't change. */
> > > > >    dc_commit_targets(dm->dc, commit_targets, commit_targets_count)
> > > > >    {
> > > > >      /******************************************************************
> > > > >       * *** Build context (function also used for validation)
> > > > >       ******************************************************************/
> > > > >      result = core_dc->res_pool->funcs->validate_with_context(
> > > > >                                 core_dc,set,target_count,context);
> > > > I can't dig into details of DC, so this is not a 100% assessment, but if
> > > > you call a function called "validate" in atomic_commit, you're very, very
> > > > likely breaking atomic. _All_ validation must happen in ->atomic_check,
> > > > if that's not the case TEST_ONLY mode is broken. And atomic userspace is
> > > > relying on that working.
> > > > 
> > > This function is not really named correctly. What it does is it builds a
> > > context and validates at the same time. In commit we simply care that it
> > > builds the context. Validate should never fail here (since this was already
> > > validated in atomic_check).
> > > 
> > > We call the same function at atomic_check
> > > 
> > > amdgpu_dm_atomic_check ->
> > > 	dc_validate_resources ->
> > > 		core_dc->res_pool->funcs->validate_with_context
> > Ah right, iirc you told me this the last time around too ;-) I guess a
> > great example for what I mean with rolling your own world: Existing atomic
> > drivers put their derived/computed/validated check into their subclasses
> > state structures, which means they don't need to be re-computed in
> > atomic_check. It also makes sure that the validation code/state
> > computation code between check and commit doesn't get out of sync.
> > 
> > > As for the rest, I hear you and appreciate your feedback. Let me get back to
> > > you on that later.
> > Just an added note on that: I do think that there's some driver teams
> > who've managed to pull a shared codebase between validation and upstream
> > linux (iirc some of the intel wireless drivers work like that). But it
> > requires careful aligning of everything, and with something fast-moving
> > like drm it might become real painful and not really worth it. So not
> > outright rejecting DC (and the code sharing you want to achieve with it)
> > as an idea here.
> 
> I used to have examples of such a things for other network drivers as well,
> but right now I can't find them of hand. Leave me a note if you need more
> info on existing things.
> 
> A good idea might as well be to take a look at drivers shared between Linux
> and BSD as well, cause both code bases are usually public available and you
> can see what changes during porting and what stays the same.

bsd and linux might not be a good example anymore, at least in the gfx
space - upstream linux has so massively outpaced bsd kernels that they
stopped porting and switched over to implement a shim in the bsd drm
subsystem to fully emulate the linux interfaces. I think on the networking
and storage side things are a bit better aligned still, and not quite
moving as fast, to make a more native approach on each OS feasible.
-Daniel

> 
> Regards,
> Christian.
> 
> > -Daniel
> > 
> > > Thanks,
> > > Harry
> > > 
> > > 
> > > > The only thing that you're allowed to return from ->atomic_commit is
> > > > out-of-memory, hw-on-fire and similar unforeseen and catastrophic issues.
> > > > Kerneldoc expklains this.
> > > > 
> > > > Now the reason I bring this up (and we've discussed it at length in
> > > > private) is that DC still suffers from a massive abstraction midlayer. A
> > > > lot of the back-end stuff (dp aux, i2c, abstractions for allocation,
> > > > timers, irq, ...) have been cleaned up, but the midlayer is still there.
> > > > And I understand why you have it, and why it's there - without some OS
> > > > abstraction your grand plan of a unified driver across everything doesn't
> > > > work out so well.
> > > > 
> > > > But in a way the backend stuff isn't such a big deal. It's annoying since
> > > > lots of code, and bugfixes have to be duplicated and all that, but it's
> > > > fairly easy to fix case-by-case, and as long as AMD folks stick around
> > > > (which I fully expect) not a maintainance issue. It makes it harder for
> > > > others to contribute, but then since it's mostly the leaf it's generally
> > > > easy to just improve the part you want to change (as an outsider). And if
> > > > you want to improve shared code the only downside is that you can't also
> > > > improve amd, but that's not so much a problem for non-amd folks ;-)
> > > > 
> > > > The problem otoh with the abstraction layer between drm core and the amd
> > > > driver is that you can't ignore if you want to refactor shared code. And
> > > > because it's an entire world of its own, it's much harder to understand
> > > > what the driver is doing (without reading it all). Some examples of what I
> > > > mean:
> > > > 
> > > > - All other drm drivers subclass drm objects (by embedding them) into the
> > > >    corresponding hw part that most closely matches the drm object's
> > > >    semantics. That means even when you have 0 clue about how a given piece
> > > >    of hw works, you have a reasonable chance of understanding code. If it's
> > > >    all your own stuff you always have to keep in minde the special amd
> > > >    naming conventions. That gets old real fast if you trying to figure out
> > > >    what 20+ (or are we at 30 already?) drivers are doing.
> > > > 
> > > > - This is even more true for atomic. Atomic has a pretty complicated
> > > >    check/commmit transactional model for updating display state. It's a
> > > >    standardized interface, and it's extensible, and we want generic
> > > >    userspace to be able to run on any driver. Fairly often we realize that
> > > >    semantics of existing or newly proposed properties and state isn't
> > > >    well-defined enough, and then we need to go&read all the drivers and
> > > >    figure out how to fix up the mess. DC has it's entirely separate state
> > > >    structures which again don't subclass the atomic core structures (afaik
> > > >    at least). Again the same problems apply that you can't find things, and
> > > >    that figuring out the exact semantics and spotting differences in
> > > >    behaviour is almost impossible.
> > > > 
> > > > - The trouble isn't just in reading code and understanding it correctly,
> > > >    it's also in finding it. If you have your own completely different world
> > > >    then just finding the right code is hard - cscope and grep fail to work.
> > > > 
> > > > - Another issue is that very often we unify semantics in drivers by adding
> > > >    some new helpers that at least dtrt for most of the drivers. If you have
> > > >    your own world then the impendance mismatch will make sure that amd
> > > >    drivers will have slightly different semantics, and I think that's not
> > > >    good for the ecosystem and kms - people want to run a lot more than just
> > > >    a boot splash with generic kms userspace, stuff like xf86-video-$vendor
> > > >    is going out of favour heavily.
> > > > 
> > > > Note that all this isn't about amd walking away and leaving an
> > > > unmaintainable mess behind. Like I've said I don't think this is a big
> > > > risk. The trouble is that having your own world makes it harder for
> > > > everyone else to understand the amd driver, and understanding all drivers
> > > > is very often step 1 in some big refactoring or feature addition effort.
> > > > Because starting to refactor without understanding the problem generally
> > > > doesn't work ;_) And you can't make this step 1 easier for others by
> > > > promising to always maintain DC and update it to all the core changes,
> > > > because that's only step 2.
> > > > 
> > > > In all the DC discussions we've had thus far I haven't seen anyone address
> > > > this issue. And this isn't just an issue in drm, it's pretty much
> > > > established across all linux subsystems with the "no midlayer or OS
> > > > abstraction layers in drivers" rule. There's some real solid reasons why
> > > > such a HAl is extremely unpopular with upstream. And I haven't yet seen
> > > > any good reason why amd needs to be different, thus far it looks like a
> > > > textbook case, and there's been lots of vendors in lots of subsystems who
> > > > tried to push their HAL.
> > > > 
> > > > Thanks, Daniel
> > > > 
> > > > >      /******************************************************************
> > > > >       * *** Apply safe power state
> > > > >       ******************************************************************/
> > > > >      pplib_apply_safe_state(core_dc);
> > > > > 
> > > > >      /****************************************************************
> > > > >       * *** Apply the context to HW (program HW)
> > > > >       ****************************************************************/
> > > > >      result = core_dc->hwss.apply_ctx_to_hw(core_dc,context)
> > > > >      {
> > > > >        /* reset pipes that need reprogramming */
> > > > >        /* disable pipe power gating */
> > > > >        /* set safe watermarks */
> > > > > 
> > > > >        /* for all pipes with an attached stream */
> > > > >          /************************************************************
> > > > >           * *** Programming all per-pipe contexts
> > > > >           ************************************************************/
> > > > >          status = apply_single_controller_ctx_to_hw(...)
> > > > >          {
> > > > >            pipe_ctx->tg->funcs->set_blank(...);
> > > > >            pipe_ctx->clock_source->funcs->program_pix_clk(...);
> > > > >            pipe_ctx->tg->funcs->program_timing(...);
> > > > >            pipe_ctx->mi->funcs->allocate_mem_input(...);
> > > > >            pipe_ctx->tg->funcs->enable_crtc(...);
> > > > >            bios_parser_crtc_source_select(...);
> > > > > 
> > > > >            pipe_ctx->opp->funcs->opp_set_dyn_expansion(...);
> > > > >            pipe_ctx->opp->funcs->opp_program_fmt(...);
> > > > > 
> > > > >            stream->sink->link->link_enc->funcs->setup(...);
> > > > >            pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...);
> > > > >            pipe_ctx->tg->funcs->set_blank_color(...);
> > > > > 
> > > > >            core_link_enable_stream(pipe_ctx);
> > > > >            unblank_stream(pipe_ctx,
> > > > > 
> > > > >            program_scaler(dc, pipe_ctx);
> > > > >          }
> > > > >        /* program audio for all pipes */
> > > > >        /* update watermarks */
> > > > >      }
> > > > > 
> > > > >      program_timing_sync(core_dc, context);
> > > > >      /* for all targets */
> > > > >        target_enable_memory_requests(...);
> > > > > 
> > > > >      /* Update ASIC power states */
> > > > >      pplib_apply_display_requirements(...);
> > > > > 
> > > > >      /* update surface or page flip */
> > > > >    }
> > > > > }
> > > > > 
> > > > > 
> > > > > _______________________________________________
> > > > > dri-devel mailing list
> > > > > dri-devel@lists.freedesktop.org
> > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]         ` <20161208153417.yrpbhmot5gfv37lo-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
  2016-12-08 15:41           ` Christian König
@ 2016-12-08 17:40           ` Alex Deucher
  1 sibling, 0 replies; 66+ messages in thread
From: Alex Deucher @ 2016-12-08 17:40 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Grodzovsky, Andrey, Harry Wentland, Maling list - DRI developers,
	amd-gfx list, Deucher, Alexander, Cheng, Tony

On Thu, Dec 8, 2016 at 10:34 AM, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Thu, Dec 08, 2016 at 09:33:25AM -0500, Harry Wentland wrote:
>> Hi Daniel,
>>
>> just a quick clarification in-line about "validation" inside atomic_commit.
>>
>> On 2016-12-08 04:59 AM, Daniel Vetter wrote:
>> > Hi Harry,
>> >
>> > On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
>> > > We propose to use the Display Core (DC) driver for display support on
>> > > AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to
>> > > avoid a flag day the plan is to only support uGPU initially and transition
>> > > to older ASICs gradually.
>> > >
>> > > The DC component has received extensive testing within AMD for DCE8, 10, and
>> > > 11 GPUs and is being prepared for uGPU. Support should be better than
>> > > amdgpu's current display support.
>> > >
>> > >  * All of our QA effort is focused on DC
>> > >  * All of our CQE effort is focused on DC
>> > >  * All of our OEM preloads and custom engagements use DC
>> > >  * DC behavior mirrors what we do for other OSes
>> > >
>> > > The new asic utilizes a completely re-designed atom interface, so we cannot
>> > > easily leverage much of the existing atom-based code.
>> > >
>> > > We've introduced DC to the community earlier in 2016 and received a fair
>> > > amount of feedback. Some of what we've addressed so far are:
>> > >
>> > >  * Self-contain ASIC specific code. We did a bunch of work to pull
>> > >    common sequences into dc/dce and leave ASIC specific code in
>> > >    separate folders.
>> > >  * Started to expose AUX and I2C through generic kernel/drm
>> > >    functionality and are mostly using that. Some of that code is still
>> > >    needlessly convoluted. This cleanup is in progress.
>> > >  * Integrated Dave and Jerome’s work on removing abstraction in bios
>> > >    parser.
>> > >  * Retire adapter service and asic capability
>> > >  * Remove some abstraction in GPIO
>> > >
>> > > Since a lot of our code is shared with pre- and post-silicon validation
>> > > suites changes need to be done gradually to prevent breakages due to a major
>> > > flag day.  This, coupled with adding support for new asics and lots of new
>> > > feature introductions means progress has not been as quick as we would have
>> > > liked. We have made a lot of progress none the less.
>> > >
>> > > The remaining concerns that were brought up during the last review that we
>> > > are working on addressing:
>> > >
>> > >  * Continue to cleanup and reduce the abstractions in DC where it
>> > >    makes sense.
>> > >  * Removing duplicate code in I2C and AUX as we transition to using the
>> > >    DRM core interfaces.  We can't fully transition until we've helped
>> > >    fill in the gaps in the drm core that we need for certain features.
>> > >  * Making sure Atomic API support is correct.  Some of the semantics of
>> > >    the Atomic API were not particularly clear when we started this,
>> > >    however, that is improving a lot as the core drm documentation
>> > >    improves.  Getting this code upstream and in the hands of more
>> > >    atomic users will further help us identify and rectify any gaps we
>> > >    have.
>> > >
>> > > Unfortunately we cannot expose code for uGPU yet. However refactor / cleanup
>> > > work on DC is public.  We're currently transitioning to a public patch
>> > > review. You can follow our progress on the amd-gfx mailing list. We value
>> > > community feedback on our work.
>> > >
>> > > As an appendix I've included a brief overview of the how the code currently
>> > > works to make understanding and reviewing the code easier.
>> > >
>> > > Prior discussions on DC:
>> > >
>> > >  * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html
>> > >  *
>> > > https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html
>> > >
>> > > Current version of DC:
>> > >
>> > >  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>> > >
>> > > Once Alex pulls in the latest patches:
>> > >
>> > >  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>> > >
>> > > Best Regards,
>> > > Harry
>> > >
>> > >
>> > > ************************************************
>> > > *** Appendix: A Day in the Life of a Modeset ***
>> > > ************************************************
>> > >
>> > > Below is a high-level overview of a modeset with dc. Some of this might be a
>> > > little out-of-date since it's based on my XDC presentation but it should be
>> > > more-or-less the same.
>> > >
>> > > amdgpu_dm_atomic_commit()
>> > > {
>> > >   /* setup atomic state */
>> > >   drm_atomic_helper_prepare_planes(dev, state);
>> > >   drm_atomic_helper_swap_state(dev, state);
>> > >   drm_atomic_helper_update_legacy_modeset_state(dev, state);
>> > >
>> > >   /* create or remove targets */
>> > >
>> > >   /********************************************************************
>> > >    * *** Call into DC to commit targets with list of all known targets
>> > >    ********************************************************************/
>> > >   /* DC is optimized not to do anything if 'targets' didn't change. */
>> > >   dc_commit_targets(dm->dc, commit_targets, commit_targets_count)
>> > >   {
>> > >     /******************************************************************
>> > >      * *** Build context (function also used for validation)
>> > >      ******************************************************************/
>> > >     result = core_dc->res_pool->funcs->validate_with_context(
>> > >                                core_dc,set,target_count,context);
>> >
>> > I can't dig into details of DC, so this is not a 100% assessment, but if
>> > you call a function called "validate" in atomic_commit, you're very, very
>> > likely breaking atomic. _All_ validation must happen in ->atomic_check,
>> > if that's not the case TEST_ONLY mode is broken. And atomic userspace is
>> > relying on that working.
>> >
>>
>> This function is not really named correctly. What it does is it builds a
>> context and validates at the same time. In commit we simply care that it
>> builds the context. Validate should never fail here (since this was already
>> validated in atomic_check).
>>
>> We call the same function at atomic_check
>>
>> amdgpu_dm_atomic_check ->
>>       dc_validate_resources ->
>>               core_dc->res_pool->funcs->validate_with_context
>
> Ah right, iirc you told me this the last time around too ;-) I guess a
> great example for what I mean with rolling your own world: Existing atomic
> drivers put their derived/computed/validated check into their subclasses
> state structures, which means they don't need to be re-computed in
> atomic_check. It also makes sure that the validation code/state
> computation code between check and commit doesn't get out of sync.
>
>> As for the rest, I hear you and appreciate your feedback. Let me get back to
>> you on that later.
>
> Just an added note on that: I do think that there's some driver teams
> who've managed to pull a shared codebase between validation and upstream
> linux (iirc some of the intel wireless drivers work like that). But it
> requires careful aligning of everything, and with something fast-moving
> like drm it might become real painful and not really worth it. So not
> outright rejecting DC (and the code sharing you want to achieve with it)
> as an idea here.

I think we have to make it work.  We don't have the resources to have
separate validation and Linux core teams.  It's not just the coding.
Much of our validation and compliance testing on Linux leverages this
as well.   From our perspective, I think the pain is probably worth it
at this point.  Display is starting to eclipse other blocks as far as
complexity. Not even just the complexity of lighting up complex
topologies.  The really tough stuff is that display is basically a
real-time service and the hw is designed with very little margin for
error with respect to timing and bandwidth.  That's where much of the
value comes from sharing resources with validation teams.  For us,
that makes the potential pain of dealing with fast moving drm worth
it.  This is not to say that we won't adopt more use of drm
infrastructure, we are working on it within the bounds of our resource
constraints.

Alex
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]   ` <20161208095952.hnbfs4b3nac7faap-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
  2016-12-08 14:33     ` Harry Wentland
@ 2016-12-08 20:07     ` Dave Airlie
       [not found]       ` <CAPM=9tw=OLirgVU1RVxfPZ1PV64qtjOPTJ2q540=9VJhF4o2RQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 66+ messages in thread
From: Dave Airlie @ 2016-12-08 20:07 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Grodzovsky, Andrey, Harry Wentland, amd-gfx mailing list,
	dri-devel, Deucher, Alexander, Cheng, Tony

> I can't dig into details of DC, so this is not a 100% assessment, but if
> you call a function called "validate" in atomic_commit, you're very, very
> likely breaking atomic. _All_ validation must happen in ->atomic_check,
> if that's not the case TEST_ONLY mode is broken. And atomic userspace is
> relying on that working.
>
> The only thing that you're allowed to return from ->atomic_commit is
> out-of-memory, hw-on-fire and similar unforeseen and catastrophic issues.
> Kerneldoc expklains this.
>
> Now the reason I bring this up (and we've discussed it at length in
> private) is that DC still suffers from a massive abstraction midlayer. A
> lot of the back-end stuff (dp aux, i2c, abstractions for allocation,
> timers, irq, ...) have been cleaned up, but the midlayer is still there.
> And I understand why you have it, and why it's there - without some OS
> abstraction your grand plan of a unified driver across everything doesn't
> work out so well.
>
> But in a way the backend stuff isn't such a big deal. It's annoying since
> lots of code, and bugfixes have to be duplicated and all that, but it's
> fairly easy to fix case-by-case, and as long as AMD folks stick around
> (which I fully expect) not a maintainance issue. It makes it harder for
> others to contribute, but then since it's mostly the leaf it's generally
> easy to just improve the part you want to change (as an outsider). And if
> you want to improve shared code the only downside is that you can't also
> improve amd, but that's not so much a problem for non-amd folks ;-)
>
> The problem otoh with the abstraction layer between drm core and the amd
> driver is that you can't ignore if you want to refactor shared code. And
> because it's an entire world of its own, it's much harder to understand
> what the driver is doing (without reading it all). Some examples of what I
> mean:
>
> - All other drm drivers subclass drm objects (by embedding them) into the
>   corresponding hw part that most closely matches the drm object's
>   semantics. That means even when you have 0 clue about how a given piece
>   of hw works, you have a reasonable chance of understanding code. If it's
>   all your own stuff you always have to keep in minde the special amd
>   naming conventions. That gets old real fast if you trying to figure out
>   what 20+ (or are we at 30 already?) drivers are doing.
>
> - This is even more true for atomic. Atomic has a pretty complicated
>   check/commmit transactional model for updating display state. It's a
>   standardized interface, and it's extensible, and we want generic
>   userspace to be able to run on any driver. Fairly often we realize that
>   semantics of existing or newly proposed properties and state isn't
>   well-defined enough, and then we need to go&read all the drivers and
>   figure out how to fix up the mess. DC has it's entirely separate state
>   structures which again don't subclass the atomic core structures (afaik
>   at least). Again the same problems apply that you can't find things, and
>   that figuring out the exact semantics and spotting differences in
>   behaviour is almost impossible.
>
> - The trouble isn't just in reading code and understanding it correctly,
>   it's also in finding it. If you have your own completely different world
>   then just finding the right code is hard - cscope and grep fail to work.
>
> - Another issue is that very often we unify semantics in drivers by adding
>   some new helpers that at least dtrt for most of the drivers. If you have
>   your own world then the impendance mismatch will make sure that amd
>   drivers will have slightly different semantics, and I think that's not
>   good for the ecosystem and kms - people want to run a lot more than just
>   a boot splash with generic kms userspace, stuff like xf86-video-$vendor
>   is going out of favour heavily.
>
> Note that all this isn't about amd walking away and leaving an
> unmaintainable mess behind. Like I've said I don't think this is a big
> risk. The trouble is that having your own world makes it harder for
> everyone else to understand the amd driver, and understanding all drivers
> is very often step 1 in some big refactoring or feature addition effort.
> Because starting to refactor without understanding the problem generally
> doesn't work ;_) And you can't make this step 1 easier for others by
> promising to always maintain DC and update it to all the core changes,
> because that's only step 2.
>
> In all the DC discussions we've had thus far I haven't seen anyone address
> this issue. And this isn't just an issue in drm, it's pretty much
> established across all linux subsystems with the "no midlayer or OS
> abstraction layers in drivers" rule. There's some real solid reasons why
> such a HAl is extremely unpopular with upstream. And I haven't yet seen
> any good reason why amd needs to be different, thus far it looks like a
> textbook case, and there's been lots of vendors in lots of subsystems who
> tried to push their HAL.

Daniel has said this all very nicely, I'm going to try and be a bit more direct,
because apparently I've possibly been too subtle up until now.

No HALs. We don't do HALs in the kernel. We might do midlayers sometimes
we try not to do midlayers. In the DRM we don't do either unless the maintainers
are asleep. They might be worth the effort for AMD, however for the Linux kernel
they don't provide a benefit and make maintaining the code a lot harder. I've
maintained this code base for over 10 years now and I'd like to think
I've only merged
 something for semi-political reasons once (initial exynos was still
more Linuxy than DC),
and that thing took a lot of time to cleanup, I really don't feel like
saying yes again.

Given the choice between maintaining Linus' trust that I won't merge
100,000 lines
of abstracted HAL code and merging 100,000 lines of abstracted HAL code
I'll give you one guess where my loyalties lie. The reason the
toplevel maintainer (me)
doesn't work for Intel or AMD or any vendors, is that I can say NO
when your maintainers
can't or won't say it.

I've only got one true power as a maintainer, and that is to say No.
The other option
is I personally sit down and rewrite all the code in an acceptable
manner, and merge that
instead. But I've discovered I probably don't scale to that level, so
again it leaves me
with just the one actual power.

AMD can't threaten not to support new GPUs in upstream kernels without
merging this,
that is totally something you can do, and here's the thing Linux will
survive, we'll piss off
a bunch of people, but the Linux kernel will just keep on rolling
forward, maybe at some
point someone will get pissed about lacking upstream support for your
HW and go write
support and submit it, maybe they won't. The kernel is bigger than any
of us and has
standards about what is acceptable. Read up on the whole mac80211
problems we had
years ago, where every wireless vendor wrote their own 80211 layer
inside their driver,
there was a lot of time spent creating a central 80211 before any of
those drivers were
suitable for merge, well we've spent our time creating a central
modesetting infrastructure,
bypassing it is taking a driver in totally the wrong direction.

I've also wondered if the DC code is ready for being part of the
kernel anyways, what
happens if I merge this, and some external contributor rewrites 50% of
it and removes a
bunch of stuff that the kernel doesn't need. By any kernel standards
I'll merge that sort
of change over your heads if Alex doesn't, it might mean you have to
rewrite a chunk
of your internal validation code, or some other interactions, but
those won't be reasons
to block the changes from my POV. I'd like some serious introspection
on your team's
part on how you got into this situation and how even if I was feeling
like merging this
(which I'm not) how you'd actually deal with being part of the Linux
kernel and not hiding
in nicely framed orgchart silo behind a HAL. I honestly don't think
the code is Linux worthy
code, and I also really dislike having to spend my Friday morning
being negative about it,
but hey at least I can have a shower now.

No.

Dave.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-08 15:41           ` Christian König
  2016-12-08 15:46             ` Daniel Vetter
@ 2016-12-08 20:24             ` Matthew Macy
  1 sibling, 0 replies; 66+ messages in thread
From: Matthew Macy @ 2016-12-08 20:24 UTC (permalink / raw)
  To: "Christian König"; +Cc: Deucher, Alexander, amd-gfx, dri-devel

putation code between check and commit doesn't get out of sync.
 > >
 > >> As for the rest, I hear you and appreciate your feedback. Let me get back to
 > >> you on that later.
 > > Just an added note on that: I do think that there's some driver teams
 > > who've managed to pull a shared codebase between validation and upstream
 > > linux (iirc some of the intel wireless drivers work like that). But it
 > > requires careful aligning of everything, and with something fast-moving
 > > like drm it might become real painful and not really worth it. So not
 > > outright rejecting DC (and the code sharing you want to achieve with it)
 > > as an idea here.
 > 
 > I used to have examples of such a things for other network drivers as 
 > well, but right now I can't find them of hand. Leave me a note if you 
 > need more info on existing things.
 > 
 > A good idea might as well be to take a look at drivers shared between 
 > Linux and BSD as well, cause both code bases are usually public 
 > available and you can see what changes during porting and what stays the 
 > same.






Although their core drivers are tightly the coupled with a given OS, the Chelsio 10GigE and Intel ethernet drivers in general have large amounts of platform agnostic code coupled with a fairly minimal OS abstraction layer. I don't know how analogous to DAL/DC this is. However, I will say that the Chelsio driver was an order of magnitude easier to port to FreeBSD and the end result much better than Solarflare's which felt obliged to not have any separation of concerns.

-M 



_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]       ` <CAPM=9tw=OLirgVU1RVxfPZ1PV64qtjOPTJ2q540=9VJhF4o2RQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-12-08 23:29         ` Dave Airlie
       [not found]           ` <CAPM=9tzqaSR3dUBV9RUmo-kQZ8VmNP=rdgiHwOBii=7A2X0Dew-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2016-12-09 17:56           ` Cheng, Tony
  2016-12-09 17:32         ` Deucher, Alexander
  1 sibling, 2 replies; 66+ messages in thread
From: Dave Airlie @ 2016-12-08 23:29 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Grodzovsky, Andrey, Harry Wentland, amd-gfx mailing list,
	dri-devel, Deucher, Alexander, Cheng, Tony

>
> No.
>

I'd also like to apologise for the formatting, gmail great for typing,
crap for editing.

So I've thought about it a bit more and Daniel mentioned something
useful I thought I should add.

Merging this code as well as maintaining a trust relationship with
Linus, also maintains a trust relationship with the Linux graphics
community and other drm contributors. There have been countless
requests from various companies and contributors to merge unsavoury
things over the years and we've denied them. They've all had the same
reasons behind why they couldn't do what we want and why we were
wrong, but lots of people have shown up who do get what we are at and
have joined the community and contributed drivers that conform to the
standards. Turning around now and saying well AMD ignored our
directions, so we'll give them a free pass even though we've denied
you all the same thing over time.

If I'd given in and merged every vendor coded driver as-is we'd never
have progressed to having atomic modesetting, there would have been
too many vendor HALs and abstractions that would have blocked forward
progression. Merging one HAL or abstraction is going to cause pain,
but setting a precedent to merge more would be just downright stupid
maintainership.

Here's the thing, we want AMD to join the graphics community not hang
out inside the company in silos. We need to enable FreeSync on Linux,
go ask the community how would be best to do it, don't shove it inside
the driver hidden in a special ioctl. Got some new HDMI features that
are secret, talk to other ppl in the same position and work out a plan
for moving forward. At the moment there is no engaging with the Linux
stack because you aren't really using it, as long as you hide behind
the abstraction there won't be much engagement, and neither side
benefits, so why should we merge the code if nobody benefits?

The platform problem/Windows mindset is scary and makes a lot of
decisions for you, open source doesn't have those restrictions, and I
don't accept drivers that try and push those development model
problems into our codebase.

Dave.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* RE: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]           ` <CAPM=9tzqaSR3dUBV9RUmo-kQZ8VmNP=rdgiHwOBii=7A2X0Dew-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-12-09 17:26             ` Cheng, Tony
  2016-12-09 19:59               ` Daniel Vetter
  0 siblings, 1 reply; 66+ messages in thread
From: Cheng, Tony @ 2016-12-09 17:26 UTC (permalink / raw)
  To: Dave Airlie, Daniel Vetter
  Cc: Deucher, Alexander, Grodzovsky, Andrey, Wentland, Harry,
	amd-gfx mailing list, dri-devel

> Merging this code as well as maintaining a trust relationship with 
> Linus, also maintains a trust relationship with the Linux graphics 
> community and other drm contributors. There have been countless 
> requests from various companies and contributors to merge unsavoury 
> things over the years and we've denied them. They've all had the same 
> reasons behind why they couldn't do what we want and why we were 
> wrong, but lots of people have shown up who do get what we are at and 
> have joined the community and contributed drivers that conform to the standards.
> Turning around now and saying well AMD ignored our directions, so 
> we'll give them a free pass even though we've denied you all the same 
> thing over time.

I'd like to say that I acknowledge the good and hard work maintainers are doing.  You nor the community is wrong to say no. I understand where the no comes from.  If somebody wants to throw 100k lines into DAL I would say no as well.

> If I'd given in and merged every vendor coded driver as-is we'd never 
> have progressed to having atomic modesetting, there would have been  
> too many vendor HALs and abstractions that would have blocked forward 
> progression. Merging one HAL or abstraction is going to cause  pain, 
> but setting a precedent to merge more would be just downright stupid 
> maintainership.

> Here's the thing, we want AMD to join the graphics community not hang 
> out inside the company in silos. We need to enable FreeSync on Linux, 
> go ask the community how would be best to do it, don't shove it inside 
> the driver hidden in a special ioctl. Got some new HDMI features that 
> are secret, talk to other ppl in the same position and work out a plan 
> for moving forward. At the moment there is no engaging with the Linux 
> stack because you aren't really using it, as long as you hide behind 
> the abstraction there won't be much engagement, and neither side 
> benefits, so why should we merge the code if nobody benefits?


> The platform problem/Windows mindset is scary and makes a lot of 
> decisions for you, open source doesn't have those restrictions, and I 
> don't accept drivers that try and push those development model 
> problems into our codebase.

I would like to share how platform problem/Windows mindset look from our side.  We are dealing with ever more complex hardware with the push to reduce power while driving more pixels through.  It is the power reduction that is causing us driver developers most of the pain.  Display is a high bandwidth real time memory fetch sub system which is always on, even when the system is idle.  When the system is idle, pretty much all of power consumption comes from display.  Can we use existing DRM infrastructure?  Definitely yes, if we talk about modes up to 300Mpix/s and leaving a lot of voltage and clock margin on the table.  How hard is it to set up a timing while bypass most of the pixel processing pipeline to light up a display?  How about adding all the power optimization such as burst read to fill display cache and keep DRAM in self-refresh as much as possible?  How about powering off some of the cache or pixel processing pipeline if we are not using them?  We need to manage and maximize valuable resources like cache (cache == silicon area == $$) and clock (== power) and optimize memory request patterns at different memory clock speeds, while DPM is going, in real time on the system.  This is why there is so much code to program registers, track our states, and manages resources, and it's getting more complex as HW would prefer SW program the same value into 5 different registers in different sub blocks to save a few cross tile wires on silicon and do complex calculations to find the magical optimal settings (the hated bandwidth_cals.c).  There are a lot of registers need to be programmed to correct values in the right situation if we enable all these power/performance optimizations.

It's really not a problem of windows mindset, rather is what is the bring up platform when silicon is in the lab with HW designer support.  Today no surprise we do that almost exclusively on windows.  Display team is working hard to change that to have linux in the mix while we have the attention from HW designers.  We have a recent effort to try to enable all power features on Stoney (current gen low power APU) to match idle power on windows after Stoney shipped.  Linux driver guys working hard on it for 4+ month and still having hard time getting over the hurdle without support from HW designers because designers are tied up with the next generation silicon currently in the lab and the rest of them already moved onto next next generation.  To me I would rather have everything built on top of DC, including HW diagnostic test suites.  Even if I have to build DC on top of DRM mode setting I would prefer that over trying to do another bring up without HW support.  After all as driver developer refactoring and changing code is more fun than digging through documents/email and experimenting with different combination of settings in register and countless of reboots to try get pass some random hang.  

FYI, just dce_mem_input.c programs over 50 distinct register fields, and DC for current generation ASIC doesn't yet support all features and power optimizations.  This doesn't even include more complex programming model in future generation with HW IP getting more modular.  We are already making progress with bring up with shared DC code for next gen ASIC in the lab. DC HW programming / resource management / power optimization will be fully validated on all platforms including Linux and that will benefit the Linux driver running on AMD HW, especially in battery life.

Just in case you are wondering Polaris windows driver isn't using DC and was on a "windows architecture" code base.  We understand that from community point of view you are not getting much feature / power benefit yet because CI/VI/CZ/Polaris Linux driver with DC is only used in Linux and we don’t have the man power to make it fully optimized yet.  Next gen will be performance and power optimized at launch.  I acknowledge that we don't have full feature on Linux yet and we still need to work with community to amend DRM to enable FreeSync, HDR, next gen resolution and other display feature just made available in Crimson ReLive.  However it's not realistic to engage with community early on in these efforts, as up to 1 month prior to release we were still experimenting with different solutions to make the feature better and we wouldn't have known what we end up building half year ago.  And of course marketing wouldn't let us leak these features before Crimson launch.

I would like to work with the community and I think we have shown that we welcome, appreciate and take feedback seriously.  There is plenty of work done in DC addressing some of the easier to fix problems while we have next gen ASIC in the lab as top priority.  We are already down to 66k lines of code from 93k through refactoring and remove numerous abstractions.  We can't just tear apart the "mid layer" or "HAL" over night.  Plenty of work need to be done to understand if/how we can fit resource optimization complexity into existing DRM framework.  If you look at DC structure closely, we created them to plug into DRM structures (ie.  dc_surface == FB/plane, dc_stream ~= CRTC, dc_link+dc_sink = encoder + connector), but we need a resource layer to decide how to realize the given "state" with our HW.  The problem is not getting simpler as on top of multi-plane combine, shared encoders and clock resources,  compression is starting to get into display domain.  By the way, existing DRM structure do fit nicely for HW of 4 generations ago, and with current windows driver we do have concept of crtc, encoders, connector. However over the years complexity has grown and resource management is becoming a problem, which led us to design of putting in a resource management layer.  We might not be supporting full range of what atomic can do and our semantics may be different at this stage of development, but saying dc_validate breaks atomic only tells me you haven't take a close look at our DC code.  For us all validation runs same topology/resource algorithm in check and commit.  It's not optimal yet as we will end up doing this algorithm twice today on a commit but we do intend to fix it over time.  I welcome any concrete suggestions on using existing framework to solve the resource/topology management issue.  It's not too late to change DC now but after couple year after more OS and ASICs are built on top of DC it will be very difficult to change. 

> Now the reason I bring this up (and we've discussed it at length in
> private) is that DC still suffers from a massive abstraction midlayer. 
> A lot of the back-end stuff (dp aux, i2c, abstractions for allocation, 
> timers, irq, ...) have been cleaned up, but the midlayer is still there.
> And I understand why you have it, and why it's there - without some OS 
> abstraction your grand plan of a unified driver across everything 
> doesn't work out so well.
>
> But in a way the backend stuff isn't such a big deal. It's annoying 
> since lots of code, and bugfixes have to be duplicated and all that, 
> but it's fairly easy to fix case-by-case, and as long as AMD folks 
> stick around (which I fully expect) not a maintainance issue. It makes 
> it harder for others to contribute, but then since it's mostly the 
> leaf it's generally easy to just improve the part you want to change 
> (as an outsider). And if you want to improve shared code the only 
> downside is that you can't also improve amd, but that's not so much a 
> problem for non-amd folks ;-)

Unfortunately duplicating bug fixes is not trivial and if code base diverge some of the fixes will be different.  Surprisingly if you track where we spend our time, < 20% is writing code.  Probably 50% is trying to figure out which register need a different value programmed in those situations. The other 30% is trying to make sure the change doesn’t break other stuff in different scenarios.  If power and performance optimizations remains off in Linux then I would agree with your assessment.  

> I've only got one true power as a maintainer, and that is to say No.

We AMD driver developer only got 2 true power over community, and that is having access to internal documentation and HW designers.  Not pulling Linux into the mix while silicon is still in the lab means we lose half of our power (HW designer support).   

> I've also wondered if the DC code is ready for being part of the kernel 
> anyways, what happens if I merge this, and some external 
> contributor rewrites 50% of it and removes a bunch of stuff that the 
> kernel doesn't need. By any kernel standards I'll merge that sort of 
> change over your heads if Alex doesn't, it might mean you have to 
> rewrite a chunk of your internal validation code, or some other 
> interactions, but those won't be reasons to block the changes from 
> my POV. I'd like some serious introspection on your team's part on 
> how you got into this situation and how even if I was feeling like 
> merging this (which I'm not) how you'd actually deal with being part 
> of the Linux kernel and not hiding in nicely framed orgchart silo 
> behind a HAL. 

We have come a long way compare to how we used to be windows centric, and I am sure there is plenty of work remaining for us to be ready to be part of the kernel.  If community has clever and clean solution that doesn’t break our ASICs we’ll take it internally with open arms.  We merged Dave and Jerome’s clean up on removing abstractions and we had lots of patches following Dave and Jerome’s lead in different area.  

Again this is not about orgchart.  It’s about what’s validated when samples are in the lab.

God I miss the day when everything is plugged into the wall and dual link DVI was cutting edge.  At least most of our problem can be solved by diffing register dump between good and bad case.  

Tony
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* RE: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]       ` <CAPM=9tw=OLirgVU1RVxfPZ1PV64qtjOPTJ2q540=9VJhF4o2RQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2016-12-08 23:29         ` Dave Airlie
@ 2016-12-09 17:32         ` Deucher, Alexander
       [not found]           ` <MWHPR12MB169473F270C372CE90D3A254F7870-Gy0DoCVfaSW4WA4dJ5YXGAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  2016-12-09 20:31           ` Daniel Vetter
  1 sibling, 2 replies; 66+ messages in thread
From: Deucher, Alexander @ 2016-12-09 17:32 UTC (permalink / raw)
  To: 'Dave Airlie', Daniel Vetter
  Cc: Grodzovsky, Andrey, Cheng, Tony, Wentland, Harry,
	amd-gfx mailing list, dri-devel

> -----Original Message-----
> From: Dave Airlie [mailto:airlied@gmail.com]
> Sent: Thursday, December 08, 2016 3:07 PM
> To: Daniel Vetter
> Cc: Wentland, Harry; dri-devel; Grodzovsky, Andrey; amd-gfx mailing list;
> Deucher, Alexander; Cheng, Tony
> Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU
> 
> > I can't dig into details of DC, so this is not a 100% assessment, but if
> > you call a function called "validate" in atomic_commit, you're very, very
> > likely breaking atomic. _All_ validation must happen in ->atomic_check,
> > if that's not the case TEST_ONLY mode is broken. And atomic userspace is
> > relying on that working.
> >
> > The only thing that you're allowed to return from ->atomic_commit is
> > out-of-memory, hw-on-fire and similar unforeseen and catastrophic issues.
> > Kerneldoc expklains this.
> >
> > Now the reason I bring this up (and we've discussed it at length in
> > private) is that DC still suffers from a massive abstraction midlayer. A
> > lot of the back-end stuff (dp aux, i2c, abstractions for allocation,
> > timers, irq, ...) have been cleaned up, but the midlayer is still there.
> > And I understand why you have it, and why it's there - without some OS
> > abstraction your grand plan of a unified driver across everything doesn't
> > work out so well.
> >
> > But in a way the backend stuff isn't such a big deal. It's annoying since
> > lots of code, and bugfixes have to be duplicated and all that, but it's
> > fairly easy to fix case-by-case, and as long as AMD folks stick around
> > (which I fully expect) not a maintainance issue. It makes it harder for
> > others to contribute, but then since it's mostly the leaf it's generally
> > easy to just improve the part you want to change (as an outsider). And if
> > you want to improve shared code the only downside is that you can't also
> > improve amd, but that's not so much a problem for non-amd folks ;-)
> >
> > The problem otoh with the abstraction layer between drm core and the
> amd
> > driver is that you can't ignore if you want to refactor shared code. And
> > because it's an entire world of its own, it's much harder to understand
> > what the driver is doing (without reading it all). Some examples of what I
> > mean:
> >
> > - All other drm drivers subclass drm objects (by embedding them) into the
> >   corresponding hw part that most closely matches the drm object's
> >   semantics. That means even when you have 0 clue about how a given
> piece
> >   of hw works, you have a reasonable chance of understanding code. If it's
> >   all your own stuff you always have to keep in minde the special amd
> >   naming conventions. That gets old real fast if you trying to figure out
> >   what 20+ (or are we at 30 already?) drivers are doing.
> >
> > - This is even more true for atomic. Atomic has a pretty complicated
> >   check/commmit transactional model for updating display state. It's a
> >   standardized interface, and it's extensible, and we want generic
> >   userspace to be able to run on any driver. Fairly often we realize that
> >   semantics of existing or newly proposed properties and state isn't
> >   well-defined enough, and then we need to go&read all the drivers and
> >   figure out how to fix up the mess. DC has it's entirely separate state
> >   structures which again don't subclass the atomic core structures (afaik
> >   at least). Again the same problems apply that you can't find things, and
> >   that figuring out the exact semantics and spotting differences in
> >   behaviour is almost impossible.
> >
> > - The trouble isn't just in reading code and understanding it correctly,
> >   it's also in finding it. If you have your own completely different world
> >   then just finding the right code is hard - cscope and grep fail to work.
> >
> > - Another issue is that very often we unify semantics in drivers by adding
> >   some new helpers that at least dtrt for most of the drivers. If you have
> >   your own world then the impendance mismatch will make sure that amd
> >   drivers will have slightly different semantics, and I think that's not
> >   good for the ecosystem and kms - people want to run a lot more than just
> >   a boot splash with generic kms userspace, stuff like xf86-video-$vendor
> >   is going out of favour heavily.
> >
> > Note that all this isn't about amd walking away and leaving an
> > unmaintainable mess behind. Like I've said I don't think this is a big
> > risk. The trouble is that having your own world makes it harder for
> > everyone else to understand the amd driver, and understanding all drivers
> > is very often step 1 in some big refactoring or feature addition effort.
> > Because starting to refactor without understanding the problem generally
> > doesn't work ;_) And you can't make this step 1 easier for others by
> > promising to always maintain DC and update it to all the core changes,
> > because that's only step 2.
> >
> > In all the DC discussions we've had thus far I haven't seen anyone address
> > this issue. And this isn't just an issue in drm, it's pretty much
> > established across all linux subsystems with the "no midlayer or OS
> > abstraction layers in drivers" rule. There's some real solid reasons why
> > such a HAl is extremely unpopular with upstream. And I haven't yet seen
> > any good reason why amd needs to be different, thus far it looks like a
> > textbook case, and there's been lots of vendors in lots of subsystems who
> > tried to push their HAL.
> 
> Daniel has said this all very nicely, I'm going to try and be a bit more direct,
> because apparently I've possibly been too subtle up until now.
> 
> No HALs. We don't do HALs in the kernel. We might do midlayers sometimes
> we try not to do midlayers. In the DRM we don't do either unless the
> maintainers
> are asleep. They might be worth the effort for AMD, however for the Linux
> kernel
> they don't provide a benefit and make maintaining the code a lot harder. I've
> maintained this code base for over 10 years now and I'd like to think
> I've only merged
>  something for semi-political reasons once (initial exynos was still
> more Linuxy than DC),
> and that thing took a lot of time to cleanup, I really don't feel like
> saying yes again.
> 
> Given the choice between maintaining Linus' trust that I won't merge
> 100,000 lines
> of abstracted HAL code and merging 100,000 lines of abstracted HAL code
> I'll give you one guess where my loyalties lie. The reason the
> toplevel maintainer (me)
> doesn't work for Intel or AMD or any vendors, is that I can say NO
> when your maintainers
> can't or won't say it.
> 
> I've only got one true power as a maintainer, and that is to say No.
> The other option
> is I personally sit down and rewrite all the code in an acceptable
> manner, and merge that
> instead. But I've discovered I probably don't scale to that level, so
> again it leaves me
> with just the one actual power.
> 
> AMD can't threaten not to support new GPUs in upstream kernels without
> merging this,
> that is totally something you can do, and here's the thing Linux will
> survive, we'll piss off
> a bunch of people, but the Linux kernel will just keep on rolling
> forward, maybe at some
> point someone will get pissed about lacking upstream support for your
> HW and go write
> support and submit it, maybe they won't. The kernel is bigger than any
> of us and has
> standards about what is acceptable. Read up on the whole mac80211
> problems we had
> years ago, where every wireless vendor wrote their own 80211 layer
> inside their driver,
> there was a lot of time spent creating a central 80211 before any of
> those drivers were
> suitable for merge, well we've spent our time creating a central
> modesetting infrastructure,
> bypassing it is taking a driver in totally the wrong direction.
> 
> I've also wondered if the DC code is ready for being part of the
> kernel anyways, what
> happens if I merge this, and some external contributor rewrites 50% of
> it and removes a
> bunch of stuff that the kernel doesn't need. By any kernel standards
> I'll merge that sort
> of change over your heads if Alex doesn't, it might mean you have to
> rewrite a chunk
> of your internal validation code, or some other interactions, but
> those won't be reasons
> to block the changes from my POV. I'd like some serious introspection
> on your team's
> part on how you got into this situation and how even if I was feeling
> like merging this
> (which I'm not) how you'd actually deal with being part of the Linux
> kernel and not hiding
> in nicely framed orgchart silo behind a HAL. I honestly don't think
> the code is Linux worthy
> code, and I also really dislike having to spend my Friday morning
> being negative about it,
> but hey at least I can have a shower now.
> 
> No.

Hi Dave,

I think this is part of the reason a lot of people get fed up with working upstream in Linux.  I can respect your technical points and if you kept it to that, I'd be fine with it and we could have a technical discussion starting there.  But attacking us or our corporate culture is not cool.  I think perhaps you have been in the RH silo for too long.  Our corporate culture is not like RH's.  Like it or not, we have historically been a windows centric company.  We have a few small Linux team that has been engaged with the community for a long time, but the rest of the company has not.  We are working to improve it, but we can only do so many things at one time.  GPU cycles are fast.  There's only so much time in the day; we'd like to make our code perfect, but we also want to get it out to customers while the hw is still relevant.  We are finally at a point where our AMD Linux drivers are almost feature complete compared to windows and we have support upstream well before hw launch and we get shit on for trying to do the right thing.  It doesn't exactly make us want to continue contributing.  That's the problem with Linux.  Unless you are part time hacker who is part of the "in" crowd can spend all of his days tinkering with making the code perfect, a vendor with massive resources who can just through more people at it, or a throw it over the wall and forget it vendor (hey, my code can just live in staging), there's no room for you.

You love to tell the exynos story about how crappy the code was and then after it was cleaned up how glorious it was. Except the vendor didn't do that.  Another vendor paid another vendor to do it.  We don't happen to have the resources to pay someone else to do that for us.  Moreover, doing so would negate all of the advantages to bringing up the code along with the hw team in the lab when the asics come back from the fab.  Additionally, the original argument against the exynos code was that it was just thrown over the wall and largely ignored by the vendor once it was upstream.  We've been consistently involved in upstream (heck, I've been at AMD almost 10 years now maintaining our drivers).  You talk about trust.  I think there's something to cutting a trusted partner some slack as they work to further improve their support vs. taking a hard line because you got burned once by a throw it over the wall vendor who was not engaged.  Even if you want to take a hard line, let's discuss it on technical merits, not mud-slinging.

I realize you care about code quality and style, but do you care about stable functionality?  Would you really merge a bunch of huge cleanups that would potentially break tons of stuff in subtle ways because coding style is that important?  I'm done with that myself.  I've merged too many half-baked cleanups and new features in the past and ended up spending way more time fixing them than I would have otherwise for relatively little gain.  The hw is just too complicated these days.  At some point people what support for the hw they have and they want it to work.  If code trumps all, then why do we have staging?  

I understand forward progress on APIs, but frankly from my perspective, atomic has been a disaster for stability of both atomic and pre-atomic code.  Every kernel cycle manages to break several drivers.  What happened to figuring out how to do in right in a couple of drivers and then moving that to the core.  We seem to have lost that in favor of starting in the core first.  I feel like we constantly refactor the core to deal with that or that quirk or requirement of someone's hardware and then deal with tons of fallout.  Is all we care about android?  I constantly hear the argument, if we don't do all of this android will do their own thing and then that will be the end.  Right now we are all suffering and android barely even using this yet.  If Linux will carry on without AMD contributing maybe Linux will carry on ok without bending over backwards for android.  Are you basically telling us that you'd rather we water down our driver and limit the features and capabilities and stability we can support so that others can refactor our code constantly for hazy goals to support some supposed glorious future that never seems to come?  What about right now?  Maybe we could try and support some features right now.  Maybe we'll finally see Linux on the desktop.

Alex

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* RE: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-08 23:29         ` Dave Airlie
       [not found]           ` <CAPM=9tzqaSR3dUBV9RUmo-kQZ8VmNP=rdgiHwOBii=7A2X0Dew-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-12-09 17:56           ` Cheng, Tony
  1 sibling, 0 replies; 66+ messages in thread
From: Cheng, Tony @ 2016-12-09 17:56 UTC (permalink / raw)
  To: Dave Airlie, Daniel Vetter
  Cc: Deucher, Alexander, Grodzovsky, Andrey, Wentland, Harry,
	amd-gfx mailing list, dri-devel

> Merging this code as well as maintaining a trust relationship with 
> Linus, also maintains a trust relationship with the Linux graphics 
> community and other drm contributors. There have been countless 
> requests from various companies and contributors to merge unsavoury 
> things over the years and we've denied them. They've all had the same 
> reasons behind why they couldn't do what we want and why we were 
> wrong, but lots of people have shown up who do get what we are at and 
> have joined the community and contributed drivers that conform to the standards.
> Turning around now and saying well AMD ignored our directions, so 
> we'll give them a free pass even though we've denied you all the same 
> thing over time.

I'd like to say that I acknowledge the good and hard work maintainers are doing.  You nor the community is wrong to say no. I understand where the no comes from.  If somebody wants to throw 100k lines into DAL I would say no as well.

> If I'd given in and merged every vendor coded driver as-is we'd never 
> have progressed to having atomic modesetting, there would have been 
> too many vendor HALs and abstractions that would have blocked forward 
> progression. Merging one HAL or abstraction is going to cause  pain, 
> but setting a precedent to merge more would be just downright stupid 
> maintainership.

> Here's the thing, we want AMD to join the graphics community not hang 
> out inside the company in silos. We need to enable FreeSync on Linux, 
> go ask the community how would be best to do it, don't shove it inside 
> the driver hidden in a special ioctl. Got some new HDMI features that 
> are secret, talk to other ppl in the same position and work out a plan 
> for moving forward. At the moment there is no engaging with the Linux 
> stack because you aren't really using it, as long as you hide behind 
> the abstraction there won't be much engagement, and neither side 
> benefits, so why should we merge the code if nobody benefits?


> The platform problem/Windows mindset is scary and makes a lot of 
> decisions for you, open source doesn't have those restrictions, and I 
> don't accept drivers that try and push those development model 
> problems into our codebase.

I would like to share how platform problem/Windows mindset look from our side.  We are dealing with ever more complex hardware with the push to reduce power while driving more pixels through.  It is the power reduction that is causing us driver developers most of the pain.  Display is a high bandwidth real time memory fetch sub system which is always on, even when the system is idle.  When the system is idle, pretty much all of power consumption comes from display.  Can we use existing DRM infrastructure?  Definitely yes, if we talk about modes up to 300Mpix/s and leaving a lot of voltage and clock margin on the table.  How hard is it to set up a timing while bypass most of the pixel processing pipeline to light up a display?  How about adding all the power optimization such as burst read to fill display cache and keep DRAM in self-refresh as much as possible?  How about powering off some of the cache or pixel processing pipeline if we are not using them?  We need to manage and maximize valuable resources like cache (cache == silicon area == $$) and clock (== power) and optimize memory request patterns at different memory clock speeds, while DPM is going, in real time on the system.  This is why there is so much code to program registers, track our states, and manages resources, and it's getting more complex as HW would prefer SW program the same value into 5 different registers in different sub blocks to save a few cross tile wires on silicon and do complex calculations to find the magical optimal settings (the hated bandwidth_cals.c).  There are a lot of registers need to be programmed to correct values in the right situation if we enable all these power/performance optimizations.

It's really not a problem of windows mindset, rather is what is the bring up platform when silicon is in the lab with HW designer support.  Today no surprise we do that almost exclusively on windows.  Display team is working hard to change that to have linux in the mix while we have the attention from HW designers.  We have a recent effort to try to enable all power features on Stoney (current gen low power APU) to match idle power on windows after Stoney shipped.  Linux driver guys working hard on it for 4+ month and still having hard time getting over the hurdle without support from HW designers because designers are tied up with the next generation silicon currently in the lab and the rest of them already moved onto next next generation.  To me I would rather have everything built on top of DC, including HW diagnostic test suites.  Even if I have to build DC on top of DRM mode setting I would prefer that over trying to do another bring up without HW support.  After all as driver developer refactoring and changing code is more fun than digging through documents/email and experimenting with different combination of settings in register and countless of reboots to try get pass some random hang.  

FYI, just dce_mem_input.c programs over 50 distinct register fields, and DC for current generation ASIC doesn't yet support all features and power optimizations.  This doesn't even include more complex programming model in future generation with HW IP getting more modular.  We are already making progress with bring up with shared DC code for next gen ASIC in the lab. DC HW programming / resource management / power optimization will be fully validated on all platforms including Linux and that will benefit the Linux driver running on AMD HW, especially in battery life.

Just in case you are wondering Polaris windows driver isn't using DC and was on a "windows architecture" code base.  We understand that from community point of view you are not getting much feature / power benefit yet because CI/VI/CZ/Polaris Linux driver with DC is only used in Linux and we don’t have the man power to make it fully optimized yet.  Next gen will be performance and power optimized at launch.  I acknowledge that we don't have full feature on Linux yet and we still need to work with community to amend DRM to enable FreeSync, HDR, next gen resolution and other display feature just made available in Crimson ReLive.  However it's not realistic to engage with community early on in these efforts, as up to 1 month prior to release we were still experimenting with different solutions to make the feature better and we wouldn't have known what we end up building half year ago.  And of course marketing wouldn't let us leak these features before Crimson launch.

I would like to work with the community and I think we have shown that we welcome, appreciate and take feedback seriously.  There is plenty of work done in DC addressing some of the easier to fix problems while we have next gen ASIC in the lab as top priority.  We are already down to 66k lines of code from 93k through refactoring and remove numerous abstractions.  We can't just tear apart the "mid layer" or "HAL" over night.  Plenty of work need to be done to understand if/how we can fit resource optimization complexity into existing DRM framework.  If you look at DC structure closely, we created them to plug into DRM structures (ie.  dc_surface == FB/plane, dc_stream ~= CRTC, dc_link+dc_sink = encoder + connector), but we need a resource layer to decide how to realize the given "state" with our HW.  The problem is not getting simpler as on top of multi-plane combine, shared encoders and clock resources,  compression is starting to get into display domain.  By the way, existing DRM structure do fit nicely for HW of 4 generations ago, and with current windows driver we do have concept of crtc, encoders, connector. However over the years complexity has grown and resource management is becoming a problem, which led us to design of putting in a resource management layer.  We might not be supporting full range of what atomic can do and our semantics may be different at this stage of development, but saying dc_validate breaks atomic only tells me you haven't take a close look at our DC code.  For us all validation runs same topology/resource algorithm in check and commit.  It's not optimal yet as we will end up doing this algorithm twice today on a commit but we do intend to fix it over time.  I welcome any concrete suggestions on using existing framework to solve the resource/topology management issue.  It's not too late to change DC now but after couple year after more OS and ASICs are built on top of DC it will be very difficult to change. 

> Now the reason I bring this up (and we've discussed it at length in
> private) is that DC still suffers from a massive abstraction midlayer. 
> A lot of the back-end stuff (dp aux, i2c, abstractions for allocation, 
> timers, irq, ...) have been cleaned up, but the midlayer is still there.
> And I understand why you have it, and why it's there - without some OS 
> abstraction your grand plan of a unified driver across everything 
> doesn't work out so well.
>
> But in a way the backend stuff isn't such a big deal. It's annoying 
> since lots of code, and bugfixes have to be duplicated and all that, 
> but it's fairly easy to fix case-by-case, and as long as AMD folks 
> stick around (which I fully expect) not a maintainance issue. It makes 
> it harder for others to contribute, but then since it's mostly the 
> leaf it's generally easy to just improve the part you want to change 
> (as an outsider). And if you want to improve shared code the only 
> downside is that you can't also improve amd, but that's not so much a 
> problem for non-amd folks ;-)

Unfortunately duplicating bug fixes is not trivial and if code base diverge some of the fixes will be different.  Surprisingly if you track where we spend our time, < 20% is writing code.  Probably 50% is trying to figure out which register need a different value programmed in those situations. The other 30% is trying to make sure the change doesn’t break other stuff in different scenarios.  If power and performance optimizations remains off in Linux then I would agree with your assessment.  

> I've only got one true power as a maintainer, and that is to say No.

We AMD driver developer only got 2 true power over community, and that is having access to internal documentation and HW designers.  Not pulling Linux into the mix while silicon is still in the lab means we lose half of our power (HW designer support).   

> I've also wondered if the DC code is ready for being part of the 
> kernel anyways, what happens if I merge this, and some external 
> contributor rewrites 50% of it and removes a bunch of stuff that the 
> kernel doesn't need. By any kernel standards I'll merge that sort of 
> change over your heads if Alex doesn't, it might mean you have to 
> rewrite a chunk of your internal validation code, or some other 
> interactions, but those won't be reasons to block the changes from my 
> POV. I'd like some serious introspection on your team's part on how 
> you got into this situation and how even if I was feeling like merging 
> this (which I'm not) how you'd actually deal with being part of the 
> Linux kernel and not hiding in nicely framed orgchart silo behind a 
> HAL.

We have come a long way compare to how we used to be windows centric, and I am sure there is plenty of work remaining for us to be ready to be part of the kernel.  If community has clever and clean solution that doesn’t break our ASICs we’ll take it internally with open arms.  We merged Dave and Jerome’s clean up on removing abstractions and we had lots of patches following Dave and Jerome’s lead in different area.  

Again this is not about orgchart.  It’s about what’s validated when samples are in the lab.

God I miss the day when everything is plugged into the wall and dual link DVI was cutting edge.  At least most of our problem can be solved by diffing register dump between good and bad case.  

Tony
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-09 17:26             ` Cheng, Tony
@ 2016-12-09 19:59               ` Daniel Vetter
       [not found]                 ` <CAKMK7uGDUBHZKNEZTdOi2_66vKZmCsc+ViM0UyTdRPfnYa-Zww-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 66+ messages in thread
From: Daniel Vetter @ 2016-12-09 19:59 UTC (permalink / raw)
  To: Cheng, Tony
  Cc: Grodzovsky, Andrey, amd-gfx mailing list, dri-devel, Deucher, Alexander

I guess things went a bit sideways by me and Dave only talking about
the midlayer, so let me first state that the DC stuff has massively
improved through replacing all the backend services that reimplemented
Linux helper libraries with their native equivalent. That's some
serious work, and it shows that AMD is committed to doing the right
thing.

I absolutely didn't want to belittle all that effort by only raising
what I see is the one holdover left.

On Fri, Dec 9, 2016 at 6:26 PM, Cheng, Tony <Tony.Cheng@amd.com> wrote:
>> Merging this code as well as maintaining a trust relationship with
>> Linus, also maintains a trust relationship with the Linux graphics
>> community and other drm contributors. There have been countless
>> requests from various companies and contributors to merge unsavoury
>> things over the years and we've denied them. They've all had the same
>> reasons behind why they couldn't do what we want and why we were
>> wrong, but lots of people have shown up who do get what we are at and
>> have joined the community and contributed drivers that conform to the standards.
>> Turning around now and saying well AMD ignored our directions, so
>> we'll give them a free pass even though we've denied you all the same
>> thing over time.
>
> I'd like to say that I acknowledge the good and hard work maintainers are doing.  You nor the community is wrong to say no. I understand where the no comes from.  If somebody wants to throw 100k lines into DAL I would say no as well.
>
>> If I'd given in and merged every vendor coded driver as-is we'd never
>> have progressed to having atomic modesetting, there would have been
>> too many vendor HALs and abstractions that would have blocked forward
>> progression. Merging one HAL or abstraction is going to cause  pain,
>> but setting a precedent to merge more would be just downright stupid
>> maintainership.
>
>> Here's the thing, we want AMD to join the graphics community not hang
>> out inside the company in silos. We need to enable FreeSync on Linux,
>> go ask the community how would be best to do it, don't shove it inside
>> the driver hidden in a special ioctl. Got some new HDMI features that
>> are secret, talk to other ppl in the same position and work out a plan
>> for moving forward. At the moment there is no engaging with the Linux
>> stack because you aren't really using it, as long as you hide behind
>> the abstraction there won't be much engagement, and neither side
>> benefits, so why should we merge the code if nobody benefits?
>
>
>> The platform problem/Windows mindset is scary and makes a lot of
>> decisions for you, open source doesn't have those restrictions, and I
>> don't accept drivers that try and push those development model
>> problems into our codebase.
>
> I would like to share how platform problem/Windows mindset look from our side.  We are dealing with ever more complex hardware with the push to reduce power while driving more pixels through.  It is the power reduction that is causing us driver developers most of the pain.  Display is a high bandwidth real time memory fetch sub system which is always on, even when the system is idle.  When the system is idle, pretty much all of power consumption comes from display.  Can we use existing DRM infrastructure?  Definitely yes, if we talk about modes up to 300Mpix/s and leaving a lot of voltage and clock margin on the table.  How hard is it to set up a timing while bypass most of the pixel processing pipeline to light up a display?  How about adding all the power optimization such as burst read to fill display cache and keep DRAM in self-refresh as much as possible?  How about powering off some of the cache or pixel processing pipeline if we are not using them?  We need to manage and maximize valuable resources like cache (cache == silicon area == $$) and clock (== power) and optimize memory request patterns at different memory clock speeds, while DPM is going, in real time on the system.  This is why there is so much code to program registers, track our states, and manages resources, and it's getting more complex as HW would prefer SW program the same value into 5 different registers in different sub blocks to save a few cross tile wires on silicon and do complex calculations to find the magical optimal settings (the hated bandwidth_cals.c).  There are a lot of registers need to be programmed to correct values in the right situation if we enable all these power/performance optimizations.
>
> It's really not a problem of windows mindset, rather is what is the bring up platform when silicon is in the lab with HW designer support.  Today no surprise we do that almost exclusively on windows.  Display team is working hard to change that to have linux in the mix while we have the attention from HW designers.  We have a recent effort to try to enable all power features on Stoney (current gen low power APU) to match idle power on windows after Stoney shipped.  Linux driver guys working hard on it for 4+ month and still having hard time getting over the hurdle without support from HW designers because designers are tied up with the next generation silicon currently in the lab and the rest of them already moved onto next next generation.  To me I would rather have everything built on top of DC, including HW diagnostic test suites.  Even if I have to build DC on top of DRM mode setting I would prefer that over trying to do another bring up without HW support.  After all as driver developer refactoring and changing code is more fun than digging through documents/email and experimenting with different combination of settings in register and countless of reboots to try get pass some random hang.
>
> FYI, just dce_mem_input.c programs over 50 distinct register fields, and DC for current generation ASIC doesn't yet support all features and power optimizations.  This doesn't even include more complex programming model in future generation with HW IP getting more modular.  We are already making progress with bring up with shared DC code for next gen ASIC in the lab. DC HW programming / resource management / power optimization will be fully validated on all platforms including Linux and that will benefit the Linux driver running on AMD HW, especially in battery life.
>
> Just in case you are wondering Polaris windows driver isn't using DC and was on a "windows architecture" code base.  We understand that from community point of view you are not getting much feature / power benefit yet because CI/VI/CZ/Polaris Linux driver with DC is only used in Linux and we don’t have the man power to make it fully optimized yet.  Next gen will be performance and power optimized at launch.  I acknowledge that we don't have full feature on Linux yet and we still need to work with community to amend DRM to enable FreeSync, HDR, next gen resolution and other display feature just made available in Crimson ReLive.  However it's not realistic to engage with community early on in these efforts, as up to 1 month prior to release we were still experimenting with different solutions to make the feature better and we wouldn't have known what we end up building half year ago.  And of course marketing wouldn't let us leak these features before Crimson launch.

This is something you need to fix, or it'll stay completely painful
forever. It's hard work and takes years, but here at Intel we pulled
it off. We can upstream everything from a _very_ early stage (can't
tell you how early). And we have full marketing approval for that. If
you watch the i915 commit stream you can see how our code is chasing
updates from the hw engineers debugging things.

> I would like to work with the community and I think we have shown that we welcome, appreciate and take feedback seriously.  There is plenty of work done in DC addressing some of the easier to fix problems while we have next gen ASIC in the lab as top priority.  We are already down to 66k lines of code from 93k through refactoring and remove numerous abstractions.  We can't just tear apart the "mid layer" or "HAL" over night.  Plenty of work need to be done to understand if/how we can fit resource optimization complexity into existing DRM framework.  If you look at DC structure closely, we created them to plug into DRM structures (ie.  dc_surface == FB/plane, dc_stream ~= CRTC, dc_link+dc_sink = encoder + connector), but we need a resource layer to decide how to realize the given "state" with our HW.  The problem is not getting simpler as on top of multi-plane combine, shared encoders and clock resources,  compression is starting to get into display domain.  By the way, existing DRM structure do fit nicely for HW of 4 generations ago, and with current windows driver we do have concept of crtc, encoders, connector. However over the years complexity has grown and resource management is becoming a problem, which led us to design of putting in a resource management layer.  We might not be supporting full range of what atomic can do and our semantics may be different at this stage of development, but saying dc_validate breaks atomic only tells me you haven't take a close look at our DC code.  For us all validation runs same topology/resource algorithm in check and commit.  It's not optimal yet as we will end up doing this algorithm twice today on a commit but we do intend to fix it over time.  I welcome any concrete suggestions on using existing framework to solve the resource/topology management issue.  It's not too late to change DC now but after couple year after more OS and ASICs are built on top of DC it will be very difficult to change.

I guess I assumed too much that midlayer is a known thing. No one's
asking AMD to throw all the platform DC code away. No ones asking you
to rewrite the bandwidth calculations, clock tuning and all the code
that requires tons of work at power on to get right. Asking for that
would be beyond silly.

The disagreement is purely about how all that code interfaces with the
DRM subsystem, and how exactly it implements the userspace ABI. As
you've noticed the DRM objects don't really fit well for todays
hardware any more, but because it's Linux userspace ABI we can't ever
change those (well at least not easily) and will be stuck for another
few years or maybe even decades with them. Which means _every_ driver
has to deal with an impendence mismatch. The question now is how you
deal with that impendence mismatch.

The industry practice has been to insert an abstraction layer to
isolate your own code as much as possible. The linux best practice is
essentially an inversion of control where you write a bit of
linux-specific glue which drives the bits and pieces that are part of
your backend much more directly. And the reason why the abstraction
layer isn't popular in linux is that it makes cross-vendor
collaboration much more painful, and unecessarily so. Me misreading
your atomic code is a pretty good example - of course you understand
it and can see that I missed things, it's your codebase. But as
someone who reads drm drivers all day long stumbling over a driver
where things work completely differently means I'm just lost, and it's
much harder to understand things. And upstream does optimize for
cross-vendor collaboration.

But none of that means you need to throw away your entire backend. It
only means that the interface should try to be understandle to people
who don't look at dal/dc all day long. So if you say above that e.g.
dc_surface ~ drm_plane, then the expectation is that dc_surface embeds
(subclassing in OO speak) drm_plane. Yes there will be some mismatches
and there's code patterns and support in atomic to handle them, but it
makes it much easier to understand vendor code for outsides.

And this also doesn't mean that your backend code needs to deal with
drm_planes all the time, it will still deal with dc_surface. The only
thing that kinda changes is that if you want to keep cross-vendor
support you might need some abstraction so that your shared code
doesn't heavily depend upon the drm_plane layout.

Similar for resource optimization and state handling in atomic:
There's a very clear pattern that all DRM drivers follow, which is
massively extensible (you're not the only ones support shared
resources, and not the only vendor where the simple
plane->crtc->encoder pipeline has about 10 different components and IP
blocks in reality). And by following that pattern (and again you can
store whatever you want in your own private dc_surface_state) it makes
it really easy for others to quickly check a few things in your
driver, and I wouldn't have made the mistake of not realizing that you
do validate the state in atomic_check.

And personally I don't believe that designing things this way round
will result in more unshared code between different platforms. The
intel i915 is being reused (not on windows, but on a bunch of other
more fringe OS), and the people doing that don't seem to terribly
struggle with it.

>> Now the reason I bring this up (and we've discussed it at length in
>> private) is that DC still suffers from a massive abstraction midlayer.
>> A lot of the back-end stuff (dp aux, i2c, abstractions for allocation,
>> timers, irq, ...) have been cleaned up, but the midlayer is still there.
>> And I understand why you have it, and why it's there - without some OS
>> abstraction your grand plan of a unified driver across everything
>> doesn't work out so well.
>>
>> But in a way the backend stuff isn't such a big deal. It's annoying
>> since lots of code, and bugfixes have to be duplicated and all that,
>> but it's fairly easy to fix case-by-case, and as long as AMD folks
>> stick around (which I fully expect) not a maintainance issue. It makes
>> it harder for others to contribute, but then since it's mostly the
>> leaf it's generally easy to just improve the part you want to change
>> (as an outsider). And if you want to improve shared code the only
>> downside is that you can't also improve amd, but that's not so much a
>> problem for non-amd folks ;-)
>
> Unfortunately duplicating bug fixes is not trivial and if code base diverge some of the fixes will be different.  Surprisingly if you track where we spend our time, < 20% is writing code.  Probably 50% is trying to figure out which register need a different value programmed in those situations. The other 30% is trying to make sure the change doesn’t break other stuff in different scenarios.  If power and performance optimizations remains off in Linux then I would agree with your assessment.
>
>> I've only got one true power as a maintainer, and that is to say No.
>
> We AMD driver developer only got 2 true power over community, and that is having access to internal documentation and HW designers.  Not pulling Linux into the mix while silicon is still in the lab means we lose half of our power (HW designer support).
>
>> I've also wondered if the DC code is ready for being part of the kernel
>> anyways, what happens if I merge this, and some external
>> contributor rewrites 50% of it and removes a bunch of stuff that the
>> kernel doesn't need. By any kernel standards I'll merge that sort of
>> change over your heads if Alex doesn't, it might mean you have to
>> rewrite a chunk of your internal validation code, or some other
>> interactions, but those won't be reasons to block the changes from
>> my POV. I'd like some serious introspection on your team's part on
>> how you got into this situation and how even if I was feeling like
>> merging this (which I'm not) how you'd actually deal with being part
>> of the Linux kernel and not hiding in nicely framed orgchart silo
>> behind a HAL.
>
> We have come a long way compare to how we used to be windows centric, and I am sure there is plenty of work remaining for us to be ready to be part of the kernel.  If community has clever and clean solution that doesn’t break our ASICs we’ll take it internally with open arms.  We merged Dave and Jerome’s clean up on removing abstractions and we had lots of patches following Dave and Jerome’s lead in different area.
>
> Again this is not about orgchart.  It’s about what’s validated when samples are in the lab.
>
> God I miss the day when everything is plugged into the wall and dual link DVI was cutting edge.  At least most of our problem can be solved by diffing register dump between good and bad case.

Yeah, nothing different to what we suffer/experience here at Intel ;-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]           ` <MWHPR12MB169473F270C372CE90D3A254F7870-Gy0DoCVfaSW4WA4dJ5YXGAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2016-12-09 20:30             ` Dave Airlie
       [not found]               ` <CAPM=9tw4U6Ps1KgTpn-Sq2esfqkmDCPvpoRXnJB-X6pwjbBmTw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 66+ messages in thread
From: Dave Airlie @ 2016-12-09 20:30 UTC (permalink / raw)
  To: Deucher, Alexander
  Cc: Grodzovsky, Andrey, Wentland, Harry, amd-gfx mailing list,
	dri-devel, Daniel Vetter, Cheng, Tony

> I think this is part of the reason a lot of people get fed up with working upstream in Linux.  I can respect your technical points and if you kept it to that, I'd be fine with it and we could have a technical discussion starting there.  But attacking us or our corporate culture is not cool.  I think perhaps you have been in the RH silo for too long.  Our corporate culture is not like RH's.  Like it or not, we have historically been a windows centric company.  We have a few small Linux team that has been engaged with the community for a long time, but the rest of the company has not.  We are working to improve it, but we can only do so many things at one time.  GPU cycles are fast.  There's only so much time in the day; we'd like to make our code perfect, but we also want to get it out to customers while the hw is still relevant.  We are finally at a point where our AMD Linux drivers are almost feature complete compared to windows and we have support upstream well before hw launch and we get shit on for trying to do the right thing.  It doesn't exactly make us want to continue contributing.  That's the problem with Linux.  Unless you are part time hacker who is part of the "in" crowd can spend all of his days tinkering with making the code perfect, a vendor with massive resources who can just through more people at it, or a throw it over the wall and forget it vendor (hey, my code can just live in staging), there's no room for you.

I don't think that's fair, AMD as a company has a number of
experienced Linux kernel developers, who are well aware of the
upstream kernel development process and views. I should not be put in
a position where I have to say no, that is frankly the position you
are in as a maintainer, you work for AMD but you answer to the kernel
development process out here. AMD is travelling a well travelled road
here, Intel/Daniel have lots of times I've had to deal with the same
problems, eventually Intel learn that what Daniel says matters and
people are a lot happier. I brought up the AMD culture because either
one of two things have happened here, a) you've lost sight of what
upstream kernel code looks like, or b) people in AMD aren't listening
to you, and if its the latter case then it is a direct result of the
AMD culture, and so far I'm not willing to believe it's the former
(except maybe CGS - still on the wall whether that was a good idea or
a floodgate warning).

From what I understood this DAL code was a rewrite from scratch, with
upstreamability as a possible goal, it isn't directly taken from
Windows or fglrx. This goal was not achieved, why do I have to live
with the result. AMD could have done better, they have so many people
experienced in how this thing should go down.

> You love to tell the exynos story about how crappy the code was and then after it was cleaned up how glorious it was. Except the vendor didn't do that.  Another vendor paid another vendor to do it.  We don't happen to have the resources to pay someone else to do that for us.  Moreover, doing so would negate all of the advantages to bringing up the code along with the hw team in the lab when the asics come back from the fab.  Additionally, the original argument against the exynos code was that it was just thrown over the wall and largely ignored by the vendor once it was upstream.  We've been consistently involved in upstream (heck, I've been at AMD almost 10 years now maintaining our drivers).  You talk about trust.  I think there's something to cutting a trusted partner some slack as they work to further improve their support vs. taking a hard line because you got burned once by a throw it over the wall vendor who was not engaged.  Even if you want to take a hard line, let's discuss it on technical merits, not mud-slinging.

Here's the thing, what happens if a vendor pays another vendor to
clean up DAL after I merge it, how do you handle it? Being part of the
upstream kernel isn't about hiding in the corner, if you want to gain
the benefits of upstream development you need to participate in
upstream development. If you want to do what AMD seems to be only in a
position to do, and have upstream development as an after thought then
you of course are going to run into lots of problems.

>
> I realize you care about code quality and style, but do you care about stable functionality?  Would you really merge a bunch of huge cleanups that would potentially break tons of stuff in subtle ways because coding style is that important?  I'm done with that myself.  I've merged too many half-baked cleanups and new features in the past and ended up spending way more time fixing them than I would have otherwise for relatively little gain.  The hw is just too complicated these days.  At some point people what support for the hw they have and they want it to work.  If code trumps all, then why do we have staging?

Code doesn't trump all, I'd have merged DAL if it did. Maintainability
trumps all. The kernel will be around for a long time more, I'd like
it to still be something we can make changes to as expectations
change.

> I understand forward progress on APIs, but frankly from my perspective, atomic has been a disaster for stability of both atomic and pre-atomic code.  Every kernel cycle manages to break several drivers.  What happened to figuring out how to do in right in a couple of drivers and then moving that to the core.  We seem to have lost that in favor of starting in the core first.  I feel like we constantly refactor the core to deal with that or that quirk or requirement of someone's hardware and then deal with tons of fallout.  Is all we care about android?  I constantly hear the argument, if we don't do all of this android will do their own thing and then that will be the end.  Right now we are all suffering and android barely even using this yet.  If Linux will carry on without AMD contributing maybe Linux will carry on ok without bending over backwards for android.  Are you basically telling us that you'd rather we water down our driver and limit the features and capabilities and stability we can support so that others can refactor our code constantly for hazy goals to support some supposed glorious future that never seems to come?  What about right now?  Maybe we could try and support some features right now.  Maybe we'll finally see Linux on the desktop.
>

All of this comes from the development model you have ended up at. Do
you have upstream CI? Upstream keeps breaking things, how do you find
out? I've seen spstarr bisect a bunch of AMD regressions in the past 6
months (not due to atomic), where are the QA/CI teams validating that,
why aren't they bisecting the upstream kernel, instead of people in
the community on irc. AMD has been operating in throw it over the wall
at upstream for a while, I've tried to help motivate changing that and
slowly we get there with things like the external mailing list, and I
realise these things take time, but if upstream isn't something that
people really care about at AMD enough to continuously validate and
get involved in defining new APIs like atomic, you are in no position
to come back when upstream refuses to participate in merging 60-90k of
vendor produced code with lots of bits of functionality that shouldn't
be in there.

I'm unloading a lot of stuff here, and really I understand it's not
your fault, but I've stated I've only got one power left when people
let code like DAL/DC get to me, I'm not going to be tell you how to
rewrite it, because you already know, you've always known, now we just
need the right people to listen to you.

Dave.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-09 17:32         ` Deucher, Alexander
       [not found]           ` <MWHPR12MB169473F270C372CE90D3A254F7870-Gy0DoCVfaSW4WA4dJ5YXGAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2016-12-09 20:31           ` Daniel Vetter
  1 sibling, 0 replies; 66+ messages in thread
From: Daniel Vetter @ 2016-12-09 20:31 UTC (permalink / raw)
  To: Deucher, Alexander
  Cc: Grodzovsky, Andrey, Cheng, Tony, amd-gfx mailing list, dri-devel

Hi Alex,

I'll leave the other bits out, just replying to atomic/android comments.

On Fri, Dec 9, 2016 at 6:32 PM, Deucher, Alexander
<Alexander.Deucher@amd.com> wrote:
> I understand forward progress on APIs, but frankly from my perspective, atomic has been a disaster for stability of both atomic and pre-atomic code.  Every kernel cycle manages to break several drivers.  What happened to figuring out how to do in right in a couple of drivers and then moving that to the core.  We seem to have lost that in favor of starting in the core first.  I feel like we constantly refactor the core to deal with that or that quirk or requirement of someone's hardware and then deal with tons of fallout.  Is all we care about android?  I constantly hear the argument, if we don't do all of this android will do their own thing and then that will be the end.  Right now we are all suffering and android barely even using this yet.  If Linux will carry on without AMD contributing maybe Linux will carry on ok without bending over backwards for android.  Are you basically telling us that you'd rather we water down our driver and limit the features and capabilities and stability we can support so that others can refactor our code constantly for hazy goals to support some supposed glorious future that never seems to come?  What about right now?  Maybe we could try and support some features right now.  Maybe we'll finally see Linux on the desktop.

Before atomic landed we've had 3 proof-of-concept drivers. Before I've
added the the nonblocking helpers we've had about 5-10 drivers doing
it all wrong in different ways (and yes the rework highlighted that in
a few cases rather brutally). We know have about 20 atomic drivers
(and counting), and pretty much all the refactoring, helper
extractions and reworks _are_ motivated by a bunch of drivers
hand-rolling a given pattern. So I think we're doing things roughly
right, it's just a bit hard.

And no Android isn't everything we care about, we want atomic also for
CrOS (which is pretty much the only linux desktop thing shipping in
quantities), and we want it for the traditional linux desktop
(weston/wayland/mutter). And we want it for embedded/entertainment
systems. Atomic is pretty much the answer to "KMS is outdated and
doesn't match modern hw anymore". E.g. on i915 we want atomic (and
related work) to be able to support render compression.

And of course I'd like to invite everyone who wants something else
with DRM to also bring that in, e.g. over the past few months we've
merged the simple kms helpers for super-dumb displays to be able to be
better at the fbdev game than fbdev itself. Not something I care about
personally, but it's still great because more users and usecases.

And the same applies of course to AMD. But what I'm seeing (and you're
not the only one complaining, Michel has raised this on irc a few
times too) is that you're not in the driver seat, and AMD folks don't
really have any say in where DRM overall heads towards. As an outsider
looking in I think that's because AMD is largely absorbed with itself,
doesn't have people who can just do random things because they see the
long-term benefits, and is occopied absorbing new teams that don't yet
design and develop with an upstream first approach. Personally I'm not
really happy about that, because I'd like more of AMD's perspective in
infrastructure work. But I don't think that's because upstream and
maintainers reject your stuff, I'm trying as hard as possible to drag
you folks in all the time, and tons of people get stuff merged with
even smaller teams than you have.I think it's simply because core work
seems not to be a top priority (yet). I can't fix that.

You're other criticism is that all these changes break shit, and agree
there's been a bit much of that. But otoh if we can't change e.g. fb
refcounting anymore because it would break drivers, or try to
deprecate old interfaces to get rid of the plenty of rootholes in
there, then upstream is dead and why should we bother with having a
standardized, cross-vendor modeset interface. And I'm trying to fix
this mess, by emphasising CI, building up a cross-vendor validation
suite in igt, inviting folks to participate in drm-misc to make sure
core stuff is working for everyone and moves in the right direction.
And again lots of people pick up on that offer, and we have multiple
people and vendors now e.g. looking into igt and starting to
contribute. But again AMD is left out, and I don't think that can be
blamed on the community.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]                 ` <CAKMK7uGDUBHZKNEZTdOi2_66vKZmCsc+ViM0UyTdRPfnYa-Zww-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-12-09 20:34                   ` Dave Airlie
  2016-12-09 20:38                     ` Daniel Vetter
                                       ` (2 more replies)
  0 siblings, 3 replies; 66+ messages in thread
From: Dave Airlie @ 2016-12-09 20:34 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Grodzovsky, Andrey, Cheng, Tony, amd-gfx mailing list, dri-devel,
	Deucher, Alexander, Wentland, Harry

On 10 December 2016 at 05:59, Daniel Vetter <daniel@ffwll.ch> wrote:
> I guess things went a bit sideways by me and Dave only talking about
> the midlayer, so let me first state that the DC stuff has massively
> improved through replacing all the backend services that reimplemented
> Linux helper libraries with their native equivalent. That's some
> serious work, and it shows that AMD is committed to doing the right
> thing.
>
> I absolutely didn't want to belittle all that effort by only raising
> what I see is the one holdover left.

I see myself and Daniel have kinda fallen into good-cop, bad-cop mode.

I agree with everything Daniel had said in here, and come next week I might
try and write something more constructive up, but believe me Daniel is totally
right! It's Saturday morning, I've got a weekend to deal with and I'm going to
try and avoid thinking too much about this.

I actually love bandwidth_calcs.c I'd like to merge it even before DAL, yes
it's ugly code, and it's horrible but it's a single piece of hw team magic, and
we can hide that. It's the sw abstraction magic that is my issue.

Dave.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-09 20:34                   ` Dave Airlie
@ 2016-12-09 20:38                     ` Daniel Vetter
  2016-12-10  0:29                     ` Matthew Macy
  2016-12-11 12:34                     ` Daniel Vetter
  2 siblings, 0 replies; 66+ messages in thread
From: Daniel Vetter @ 2016-12-09 20:38 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Grodzovsky, Andrey, Cheng, Tony, amd-gfx mailing list, dri-devel,
	Deucher, Alexander

On Fri, Dec 9, 2016 at 9:34 PM, Dave Airlie <airlied@gmail.com> wrote:
> I actually love bandwidth_calcs.c I'd like to merge it even before DAL, yes
> it's ugly code, and it's horrible but it's a single piece of hw team magic, and
> we can hide that. It's the sw abstraction magic that is my issue.

If anyone wants an example, look at the original vlv pll compuatation
code. A lot smaller but about 8 levels of indent, one function with no
structure, local variables i, j, k, l, m, o ... with no explanation,
but it was the Word of God (akak hw engineers) and that's why we
merged it. Later on we had to rewrite it because in the conversion
from the excel formula to C hw engineers forgot that u32 truncates
differently than the floating point excel uses ;-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-09 20:34                   ` Dave Airlie
  2016-12-09 20:38                     ` Daniel Vetter
@ 2016-12-10  0:29                     ` Matthew Macy
  2016-12-11 12:34                     ` Daniel Vetter
  2 siblings, 0 replies; 66+ messages in thread
From: Matthew Macy @ 2016-12-10  0:29 UTC (permalink / raw)
  To: Dave Airlie; +Cc: Deucher, Alexander, dri-devel, amd-gfx mailing list


 ---- On Fri, 09 Dec 2016 12:34:17 -0800 Dave Airlie <airlied@gmail.com> wrote ---- 
 > On 10 December 2016 at 05:59, Daniel Vetter <daniel@ffwll.ch> wrote:
 > > I guess things went a bit sideways by me and Dave only talking about
 > > the midlayer, so let me first state that the DC stuff has massively
 > > improved through replacing all the backend services that reimplemented
 > > Linux helper libraries with their native equivalent. That's some
 > > serious work, and it shows that AMD is committed to doing the right
 > > thing.
 > >
 > > I absolutely didn't want to belittle all that effort by only raising
 > > what I see is the one holdover left.
 > 
 > I see myself and Daniel have kinda fallen into good-cop, bad-cop mode.
 > 
 > I agree with everything Daniel had said in here, and come next week I might
 > try and write something more constructive up, but believe me Daniel is totally
 > right! It's Saturday morning, I've got a weekend to deal with and I'm going to
 > try and avoid thinking too much about this.
 > 
 > I actually love bandwidth_calcs.c I'd like to merge it even before DAL, yes
 > it's ugly code, and it's horrible but it's a single piece of hw team magic, and
 > we can hide that. It's the sw abstraction magic that is my issue.
 > 
 > Dave.
 

David - 
 I recognize that the maintainer role you play is critical to the success of Linux. 
You need to honor your responsibilities as well as maintain 
your rapport with Linus. In FreeBSD committers are largely siloed and 
no one is designated to facilitate the import of outside contributions. As a 
consequence, vendor driver developers are given commit bits and 
not infrequently commit near unmaintainable garbage. Academic 
committers commit half baked code to meet a publishing deadline - which 
they subsequently abandon. And much work by non-committers never makes it 
in. Frequently, the self-appointed gatekeepers in the community will block work 
with little visible discussion or negotiation about how to meet their demands 
(ENOTIME being the typical response). When I talk to people outside the 
community about the ways in which FreeBSD most notably fell short of Linux, 
the one point that resonates the most is the lack of clear path or transparency 
in upstreaming contributions. 

As maintainer your responsibility is first and foremost to the long term health of 
Linux - not being popular with contributors. 


That said, as a prospective AMD shareholder I have a few observations to make 
about what they should do. 

First of all, by any measures AMD graphics profit margins are razor thin. Even 
when their products have been clearly superior to Nvidia's consumers have, 
as a group, held off and paid more for Nvidia's. See the following youtube video 
if you're curious as to just how poorly they have fared in mindshare: http://bit.ly/1J7020P 
I have no doubt that they lack the resources to support Linux at the same level 
of Windows without large amounts of code sharing. I was under the impression 
that their ROC compute stack would be near ready for mainline this summer. It's 
now clear that, at best, it won't happen any sooner than next summer. 

As a downstream consumer of Alex's code on Linux and FreeBSD I *hope* that 
AMD will do whatever it takes to put their codebase on par with Windows. There are 
only two makers of high end GPUs and one one of them is opaque and closed source. 

However, as a prospective shareholder who is under the impression that almost none 
of their income comes from Linux users - if they need to have a fully native 
Linux driver I think they have 3 real choices: 

a) Dumb down the driver. It just needs to push pixels. Admit that Nvidia has won the mindshare 
for anything like high end graphics on Linux. Just be good enough to run X and basic mesa demos. 

b) Go back to a closed source driver. Although the DRM layer churns rapidly, the underlying 
KPIs that it uses change very slowly. To the limited extent they need to it's not that hard to 
decouple from the underlying kernel. Nvidia's seen little to no blowback for only providing 
binary support for a narrow set of Linux kernels. 

In fact - the reason I run Linux is because of the Linux only binary CUDA stack. Blobs can 
have some nice lock-in benefits for Linux. 

c) a+b - write a "good enough" driver for open source and keep a closed driver for selected 
large consumers. 

AMD's responsibility is first and foremost to its shareholders. If doing right by Linux is in 
conflict with that the choice is clear. It is those of us dependent on it being open source that lose 
the most. 

I think the net consequence of this will be to reinforce the dominant position of Nvidia and  the marginal relevance of open source graphics outside of embedded (Intel's support for Linux is great, but it really is not in the same league as Nvidia or AMD).

-M

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]               ` <CAPM=9tw4U6Ps1KgTpn-Sq2esfqkmDCPvpoRXnJB-X6pwjbBmTw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-12-11  0:36                 ` Alex Deucher
  0 siblings, 0 replies; 66+ messages in thread
From: Alex Deucher @ 2016-12-11  0:36 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Grodzovsky, Andrey, Cheng, Tony, dri-devel, amd-gfx mailing list,
	Daniel Vetter, Deucher, Alexander, Wentland, Harry

On Fri, Dec 9, 2016 at 3:30 PM, Dave Airlie <airlied@gmail.com> wrote:
>> I think this is part of the reason a lot of people get fed up with working upstream in Linux.  I can respect your technical points and if you kept it to that, I'd be fine with it and we could have a technical discussion starting there.  But attacking us or our corporate culture is not cool.  I think perhaps you have been in the RH silo for too long.  Our corporate culture is not like RH's.  Like it or not, we have historically been a windows centric company.  We have a few small Linux team that has been engaged with the community for a long time, but the rest of the company has not.  We are working to improve it, but we can only do so many things at one time.  GPU cycles are fast.  There's only so much time in the day; we'd like to make our code perfect, but we also want to get it out to customers while the hw is still relevant.  We are finally at a point where our AMD Linux drivers are almost feature complete compared to windows and we have support upstream well before hw launch and we get shit on for trying to do the right thing.  It doesn't exactly make us want to continue contributing.  That's the problem with Linux.  Unless you are part time hacker who is part of the "in" crowd can spend all of his days tinkering with making the code perfect, a vendor with massive resources who can just through more people at it, or a throw it over the wall and forget it vendor (hey, my code can just live in staging), there's no room for you.
>
> I don't think that's fair, AMD as a company has a number of
> experienced Linux kernel developers, who are well aware of the
> upstream kernel development process and views. I should not be put in
> a position where I have to say no, that is frankly the position you
> are in as a maintainer, you work for AMD but you answer to the kernel
> development process out here. AMD is travelling a well travelled road
> here, Intel/Daniel have lots of times I've had to deal with the same
> problems, eventually Intel learn that what Daniel says matters and
> people are a lot happier. I brought up the AMD culture because either
> one of two things have happened here, a) you've lost sight of what
> upstream kernel code looks like, or b) people in AMD aren't listening
> to you, and if its the latter case then it is a direct result of the
> AMD culture, and so far I'm not willing to believe it's the former
> (except maybe CGS - still on the wall whether that was a good idea or
> a floodgate warning).
>
> From what I understood this DAL code was a rewrite from scratch, with
> upstreamability as a possible goal, it isn't directly taken from
> Windows or fglrx. This goal was not achieved, why do I have to live
> with the result. AMD could have done better, they have so many people
> experienced in how this thing should go down.

I think I over-reated a bit with this email.  What I really wanted to
say was that this was an RFC, basically saying this is how far we've
come, this is what we still need to do, and here's what we'd like to
do.  This was not a request to merge now or an ultimatum.  I
understand the requirements of upstream, I just didn't expect such a
visceral response from that original email and it put me on the
defensive.  I take our driver quality seriously and the idea of having
arbitrary large patches applied to "clean up" our code without our say
or validation didn't sit well with me.

Alex
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-09 20:34                   ` Dave Airlie
  2016-12-09 20:38                     ` Daniel Vetter
  2016-12-10  0:29                     ` Matthew Macy
@ 2016-12-11 12:34                     ` Daniel Vetter
  2 siblings, 0 replies; 66+ messages in thread
From: Daniel Vetter @ 2016-12-11 12:34 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Grodzovsky, Andrey, Cheng, Tony, amd-gfx mailing list, dri-devel,
	Deucher, Alexander

On Fri, Dec 9, 2016 at 9:34 PM, Dave Airlie airlied@gmail.com> wrote:
> On 10 December 2016 at 05:59, Daniel Vetter <daniel@ffwll.ch> wrote:
>> I guess things went a bit sideways by me and Dave only talking about
>> the midlayer, so let me first state that the DC stuff has massively
>> improved through replacing all the backend services that reimplemented
>> Linux helper libraries with their native equivalent. That's some
>> serious work, and it shows that AMD is committed to doing the right
>> thing.
>>
>> I absolutely didn't want to belittle all that effort by only raising
>> what I see is the one holdover left.
>
> I see myself and Daniel have kinda fallen into good-cop, bad-cop mode.
>
> I agree with everything Daniel had said in here, and come next week I might
> try and write something more constructive up, but believe me Daniel is totally
> right! It's Saturday morning, I've got a weekend to deal with and I'm going to
> try and avoid thinking too much about this.

Yeah I'm pondering what a reasonable action plan for dc from an atomic
pov is too. One issue we have is that right now the atomic docs are a
bit lacking for large-scale/design issues. But I'm working on this
(hopefully happens soonish, we need it for intel projects too), both
pulling the original atomic design stuff from my blog into docs and
beat into shape. And also how to handle state and atomic_check/commit
for when you want a state model that goes massively beyond what's
there with just drm_plane/crtc/connector_state (like e.g. i915 has).

But instead of me typing this up in this thread here and then getting
lost again (hopefully amdgpu/dc is not the last full-featured driver
we'll get ...) I think it's better if I type this up for the drm docs
and ask Harry/Tony&co for review feedback.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-08  2:02 [RFC] Using DC in amdgpu for upcoming GPU Harry Wentland
  2016-12-08  9:59 ` Daniel Vetter
@ 2016-12-11 20:28 ` Daniel Vetter
       [not found]   ` <20161211202827.cif3jnbuouay6xyz-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
       [not found] ` <55d5e664-25f7-70e0-f2f5-9c9daf3efdf6-5C7GfCeVMHo@public.gmane.org>
  2016-12-12  7:22 ` Daniel Vetter
  3 siblings, 1 reply; 66+ messages in thread
From: Daniel Vetter @ 2016-12-11 20:28 UTC (permalink / raw)
  To: Harry Wentland
  Cc: Grodzovsky, Andrey, amd-gfx, dri-devel, Deucher, Alexander, Cheng, Tony

On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
> We propose to use the Display Core (DC) driver for display support on
> AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to
> avoid a flag day the plan is to only support uGPU initially and transition
> to older ASICs gradually.

Bridgeman brought it up a few times that this here was the question - it's
kinda missing a question mark, hard to figure this out ;-). I'd say for
upstream it doesn't really matter, but imo having both atomic and
non-atomic paths in one driver is one world of hurt and I strongly
recommend against it, at least if feasible. All drivers that switched
switched in one go, the only exception was i915 (it took much longer than
we ever feared, causing lots of pain) and nouveau (which only converted
nv50+, but pre/post-nv50 have always been two almost completely separate
worlds anyway).

> The DC component has received extensive testing within AMD for DCE8, 10, and
> 11 GPUs and is being prepared for uGPU. Support should be better than
> amdgpu's current display support.
> 
>  * All of our QA effort is focused on DC
>  * All of our CQE effort is focused on DC
>  * All of our OEM preloads and custom engagements use DC
>  * DC behavior mirrors what we do for other OSes
> 
> The new asic utilizes a completely re-designed atom interface, so we cannot
> easily leverage much of the existing atom-based code.
> 
> We've introduced DC to the community earlier in 2016 and received a fair
> amount of feedback. Some of what we've addressed so far are:
> 
>  * Self-contain ASIC specific code. We did a bunch of work to pull
>    common sequences into dc/dce and leave ASIC specific code in
>    separate folders.
>  * Started to expose AUX and I2C through generic kernel/drm
>    functionality and are mostly using that. Some of that code is still
>    needlessly convoluted. This cleanup is in progress.
>  * Integrated Dave and Jerome’s work on removing abstraction in bios
>    parser.
>  * Retire adapter service and asic capability
>  * Remove some abstraction in GPIO
> 
> Since a lot of our code is shared with pre- and post-silicon validation
> suites changes need to be done gradually to prevent breakages due to a major
> flag day.  This, coupled with adding support for new asics and lots of new
> feature introductions means progress has not been as quick as we would have
> liked. We have made a lot of progress none the less.
> 
> The remaining concerns that were brought up during the last review that we
> are working on addressing:
> 
>  * Continue to cleanup and reduce the abstractions in DC where it
>    makes sense.
>  * Removing duplicate code in I2C and AUX as we transition to using the
>    DRM core interfaces.  We can't fully transition until we've helped
>    fill in the gaps in the drm core that we need for certain features.
>  * Making sure Atomic API support is correct.  Some of the semantics of
>    the Atomic API were not particularly clear when we started this,
>    however, that is improving a lot as the core drm documentation
>    improves.  Getting this code upstream and in the hands of more
>    atomic users will further help us identify and rectify any gaps we
>    have.

Ok so I guess Dave is typing some more general comments about
demidlayering, let me type some guidelines about atomic. Hopefully this
all materializes itself a bit better into improved upstream docs, but meh.

Step 0: Prep

So atomic is transactional, but it's not validate + rollback or commit,
but duplicate state, validate and then either throw away or commit.
There's a few big reasons for this: a) partial atomic updates - if you
duplicate it's much easier to check that you have all the right locks b)
kfree() is much easier to check for correctness than a rollback code and
c) atomic_check functions are much easier to audit for invalid changes to
persistent state.

Trouble is that this seems a bit unusual compared to all other approaches,
and ime (from the drawn-out i915 conversion) you really don't want to mix
things up. Ofc for private state you can roll back (e.g. vc4 does that for
the drm_mm allocator thing for scanout slots or whatever it is), but it's
trivial easy to accidentally check the wrong state or mix them up or
something else bad.

Long story short, I think step 0 for DC is to split state from objects,
i.e. for each dc_surface/foo/bar you need a dc_surface/foo/bar_state. And
all the back-end functions need to take both the object and the state
explicitly.

This is a bit a pain to do, but should be pretty much just mechanical. And
imo not all of it needs to happen before DC lands in upstream, but see
above imo that half-converted state is postively horrible. This should
also not harm cross-os reuse at all, you can still store things together
on os where that makes sense.

Guidelines for amdgpu atomic structures

drm atomic stores everything in state structs on plane/connector/crtc.
This includes any property extensions or anything else really, the entire
userspace abi is built on top of this. Non-trivial drivers are supposed to
subclass these to store their own stuff, so e.g.

amdgpu_plane_state {
	struct drm_plane_state base;

	/* amdgpu glue state and stuff that's linux-specific, e.g.
	 * property values and similar things. Note that there's strong
	 * push towards standardizing properties and stroing them in the
	 * drm_*_state structs. */

	struct dc_surface_state surface_state;

	/* other dc states that fit to a plane */
};

Yes not everything will fit 1:1 in one of these, but to get started I
strongly recommend to make them fit (maybe with reduced feature sets to
start out). Stuff that is shared between e.g. planes, but always on the
same crtc can be put into amdgpu_crtc_state, e.g. if you have scalers that
are assignable to a plane.

Of course atomic also supports truly global resources, for that you need
to subclass drm_atomic_state. Currently msm and i915 do that, and probably
best to read those structures as examples until I've typed the docs. But I
expect that especially for planes a few dc_*_state structs will stay in
amdgpu_*_state.

Guidelines for atomic_check

Please use the helpers as much as makes sense, and put at least the basic
steps that from drm_*_state into the respective dc_*_state functional
block into the helper callbacks for that object. I think basic validation
of individal bits (as much as possible, e.g. if you just don't support
e.g. scaling or rotation with certain pixel formats) should happen in
there too. That way when we e.g. want to check how drivers corrently
validate a given set of properties to be able to more strictly define the
semantics, that code is easy to find.

Also I expect that this won't result in code duplication with other OS,
you need code to map from drm to dc anyway, might as well check&reject the
stuff that dc can't even represent right there.

The other reason is that the helpers are good guidelines for some of the
semantics, e.g. it's mandatory that drm_crtc_needs_modeset gives the right
answer after atomic_check. If it doesn't, then you're driver doesn't
follow atomic. If you completely roll your own this becomes much harder to
assure.

Of course extend it all however you want, e.g. by adding all the global
optimization and resource assignment stuff after initial per-object
checking has been done using the helper infrastructure.

Guidelines for atomic_commit

Use the new nonblcoking helpers. Everyone who didn't got it wrong. Also,
your atomic_commit should pretty much match the helper one, except for a
custom swap_state to handle all your globally shared specia dc_*_state
objects. Everything hw specific should be in atomic_commit_tail.

Wrt the hw commit itself, for the modeset step just roll your own. That's
the entire point of atomic, and atm both i915 and nouveau exploit this
fully. Besides a bit of glue there shouldn't be much need for
linux-specific code here - what you need is something to fish the right
dc_*_state objects and give it your main sequencer functions. What you
should make sure though is that only ever do a modeset when that was
signalled, i.e. please use drm_crtc_needs_modeset to control that part.
Feel free to wrap up in a dc_*_needs_modeset for better abstraction if
that's needed.

I do strongly suggest however that you implement the plane commit using
the helpers. There's really only a few ways to implement this in the hw,
and it should work everywhere.

Misc guidelines

Use the suspend/resume helpers. If your atomic can't do that, it's not
terribly good. Also, if DC can't make those fit, it's probably still too
much midlayer and its own world than helper library.

Use all the legacy helpers, again your atomic should be able to pull it
off. One exception is async plane flips (both primary and cursors), that's
atm still unsolved. Probably best to keep the old code around for just
that case (but redirect to the compat helpers for everything), see e.g.
how vc4 implements cursors.

Most imporant of all

Ask questions on #dri-devel. amdgpu atomic is the only nontrivial atomic
driver for which I don't remember a single discussion about some detail,
at least not with any of the DAL folks. Michel&Alex asked some questions
sometimes, but that indirection is bonghits and the defeats the point of
upstream: Direct cross-vendor collaboration to get shit done. Please make
it happen.

Oh and I pretty much assume Harry&Tony are volunteered to review atomic
docs ;-)

Cheers, Daniel



> 
> Unfortunately we cannot expose code for uGPU yet. However refactor / cleanup
> work on DC is public.  We're currently transitioning to a public patch
> review. You can follow our progress on the amd-gfx mailing list. We value
> community feedback on our work.
> 
> As an appendix I've included a brief overview of the how the code currently
> works to make understanding and reviewing the code easier.
> 
> Prior discussions on DC:
> 
>  * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html
>  *
> https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html
> 
> Current version of DC:
> 
>  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
> 
> Once Alex pulls in the latest patches:
> 
>  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
> 
> Best Regards,
> Harry
> 
> 
> ************************************************
> *** Appendix: A Day in the Life of a Modeset ***
> ************************************************
> 
> Below is a high-level overview of a modeset with dc. Some of this might be a
> little out-of-date since it's based on my XDC presentation but it should be
> more-or-less the same.
> 
> amdgpu_dm_atomic_commit()
> {
>   /* setup atomic state */
>   drm_atomic_helper_prepare_planes(dev, state);
>   drm_atomic_helper_swap_state(dev, state);
>   drm_atomic_helper_update_legacy_modeset_state(dev, state);
> 
>   /* create or remove targets */
> 
>   /********************************************************************
>    * *** Call into DC to commit targets with list of all known targets
>    ********************************************************************/
>   /* DC is optimized not to do anything if 'targets' didn't change. */
>   dc_commit_targets(dm->dc, commit_targets, commit_targets_count)
>   {
>     /******************************************************************
>      * *** Build context (function also used for validation)
>      ******************************************************************/
>     result = core_dc->res_pool->funcs->validate_with_context(
>                                core_dc,set,target_count,context);
> 
>     /******************************************************************
>      * *** Apply safe power state
>      ******************************************************************/
>     pplib_apply_safe_state(core_dc);
> 
>     /****************************************************************
>      * *** Apply the context to HW (program HW)
>      ****************************************************************/
>     result = core_dc->hwss.apply_ctx_to_hw(core_dc,context)
>     {
>       /* reset pipes that need reprogramming */
>       /* disable pipe power gating */
>       /* set safe watermarks */
> 
>       /* for all pipes with an attached stream */
>         /************************************************************
>          * *** Programming all per-pipe contexts
>          ************************************************************/
>         status = apply_single_controller_ctx_to_hw(...)
>         {
>           pipe_ctx->tg->funcs->set_blank(...);
>           pipe_ctx->clock_source->funcs->program_pix_clk(...);
>           pipe_ctx->tg->funcs->program_timing(...);
>           pipe_ctx->mi->funcs->allocate_mem_input(...);
>           pipe_ctx->tg->funcs->enable_crtc(...);
>           bios_parser_crtc_source_select(...);
> 
>           pipe_ctx->opp->funcs->opp_set_dyn_expansion(...);
>           pipe_ctx->opp->funcs->opp_program_fmt(...);
> 
>           stream->sink->link->link_enc->funcs->setup(...);
>           pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...);
>           pipe_ctx->tg->funcs->set_blank_color(...);
> 
>           core_link_enable_stream(pipe_ctx);
>           unblank_stream(pipe_ctx,
> 
>           program_scaler(dc, pipe_ctx);
>         }
>       /* program audio for all pipes */
>       /* update watermarks */
>     }
> 
>     program_timing_sync(core_dc, context);
>     /* for all targets */
>       target_enable_memory_requests(...);
> 
>     /* Update ASIC power states */
>     pplib_apply_display_requirements(...);
> 
>     /* update surface or page flip */
>   }
> }
> 
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found] ` <55d5e664-25f7-70e0-f2f5-9c9daf3efdf6-5C7GfCeVMHo@public.gmane.org>
@ 2016-12-12  2:57   ` Dave Airlie
  2016-12-12  7:09     ` Daniel Vetter
                       ` (2 more replies)
  0 siblings, 3 replies; 66+ messages in thread
From: Dave Airlie @ 2016-12-12  2:57 UTC (permalink / raw)
  To: Harry Wentland
  Cc: Grodzovsky, Andrey, Cyr, Aric, Bridgman, John, Lazare, Jordan,
	amd-gfx mailing list, dri-devel, Deucher, Alexander, Cheng, Tony

On 8 December 2016 at 12:02, Harry Wentland <harry.wentland@amd.com> wrote:
> We propose to use the Display Core (DC) driver for display support on
> AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to
> avoid a flag day the plan is to only support uGPU initially and transition
> to older ASICs gradually.

[FAQ: from past few days]

1) Hey you replied to Daniel, you never addressed the points of the RFC!
I've read it being said that I hadn't addressed the RFC, and you know
I've realised I actually had, because the RFC is great but it
presupposes the codebase as designed can get upstream eventually, and
I don't think it can. The code is too littered with midlayering and
other problems, that actually addressing the individual points of the
RFC would be missing the main point I'm trying to make.

This code needs rewriting, not cleaning, not polishing, it needs to be
split into its constituent parts, and reintegrated in a form more
Linux process friendly.

I feel that if I reply to the individual points Harry has raised in
this RFC, that it means the code would then be suitable for merging,
which it still won't, and I don't want people wasting another 6
months.

If DC was ready for the next-gen GPU it would be ready for the current
GPU, it's not the specific ASIC code that is the problem, it's the
huge midlayer sitting in the middle.

2) We really need to share all of this code between OSes, why does
Linux not want it?

Sharing code is a laudable goal and I appreciate the resourcing
constraints that led us to the point at which we find ourselves, but
the way forward involves finding resources to upstream this code,
dedicated people (even one person) who can spend time on a day by day
basis talking to people in the open and working upstream, improving
other pieces of the drm as they go, reading atomic patches and
reviewing them, and can incrementally build the DC experience on top
of the Linux kernel infrastructure. Then having the corresponding
changes in the DC codebase happen internally to correspond to how the
kernel code ends up looking. Lots of this code overlaps with stuff the
drm already does, lots of is stuff the drm should be doing, so patches
to the drm should be sent instead.

3) Then how do we upstream it?
Resource(s) need(s) to start concentrating at splitting this thing up
and using portions of it in the upstream kernel. We don't land fully
formed code in the kernel if we can avoid it. Because you can't review
the ideas and structure as easy as when someone builds up code in
chunks and actually develops in the Linux kernel. This has always
produced better more maintainable code. Maybe the result will end up
improving the AMD codebase as well.

4) Why can't we put this in staging?
People have also mentioned staging, Daniel has called it a dead end,
I'd have considered staging for this code base, and I still might.
However staging has rules, and the main one is code in staging needs a
TODO list, and agreed criteria for exiting staging, I don't think we'd
be able to get an agreement on what the TODO list should contain and
how we'd ever get all things on it done. If this code ended up in
staging, it would most likely require someone dedicated to recreating
it in the mainline driver in an incremental fashion, and I don't see
that resource being available.

5) Why is a midlayer bad?
I'm not going to go into specifics on the DC midlayer, but we abhor
midlayers for a fair few reasons. The main reason I find causes the
most issues is locking. When you have breaks in code flow between
multiple layers, but having layers calling back into previous layers
it becomes near impossible to track who owns the locking and what the
current locking state is.

Consider
    drma -> dca -> dcb -> drmb
    drmc -> dcc  -> dcb -> drmb

We have two codes paths that go back into drmb, now maybe drma has a
lock taken, but drmc doesn't, but we've no indication when we hit drmb
of what the context pre entering the DC layer is. This causes all
kinds of problems. The main requirement is the driver maintains the
execution flow as much as possible. The only callback behaviour should
be from an irq or workqueue type situations where you've handed
execution flow to the hardware to do something and it is getting back
to you. The pattern we use to get our of this sort of hole is helper
libraries, we structure code as much as possible as leaf nodes that
don't call back into the parents if we can avoid it (we don't always
succeed).

So the above might becomes
   drma-> dca_helper
           -> dcb_helper
           -> drmb.

In this case the code flow is controlled by drma, dca/dcb might be
modifying data or setting hw state but when we get to drmb it's easy
to see what data is needs and what locking.

DAL/DC goes against this in so many ways, and when I look at the code
I'm never sure where to even start pulling the thread to unravel it.

Some questions I have for AMD engineers that also I'd want to see
addressed before any consideration of merging would happen!

How do you plan on dealing with people rewriting or removing code
upstream that is redundant in the kernel, but required for internal
stuff?
How are you going to deal with new Linux things that overlap
incompatibly with your internally developed stuff?
If the code is upstream will it be tested in the kernel by some QA
group, or will there be some CI infrastructure used to maintain and to
watch for Linux code that breaks assumptions in the DC code?
Can you show me you understand that upstream code is no longer 100% in
your control and things can happen to it that you might not expect and
you need to deal with it?

Dave.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]     ` <CAPM=9tx+j9-3fZNY=peLjdsVqyLS6i3V-sV3XrnYsK2YuhWRBA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-12-12  3:21       ` Bridgman, John
  2016-12-12  3:23         ` Bridgman, John
  2016-12-13  1:49       ` Harry Wentland
  1 sibling, 1 reply; 66+ messages in thread
From: Bridgman, John @ 2016-12-12  3:21 UTC (permalink / raw)
  To: Dave Airlie, Wentland, Harry
  Cc: Grodzovsky, Andrey, Cyr, Aric, Lazare, Jordan,
	amd-gfx mailing list, dri-devel, Deucher, Alexander, Cheng, Tony


[-- Attachment #1.1: Type: text/plain, Size: 7084 bytes --]

Thanks Dave. Apologies in advance for top posting but I'm stuck on a mail client that makes a big mess when I try...


>If DC was ready for the next-gen GPU it would be ready for the current
>GPU, it's not the specific ASIC code that is the problem, it's the
>huge midlayer sitting in the middle.


We realize that (a) we are getting into the high-risk-of-breakage part of the rework and (b) no matter how much we change the code structure there's a good chance that a month after it goes upstream one of us is going to find that more structural changes are required.


I was kinda thinking that if we are doing high-risk activities (risk of subtle breakage not obvious regression, and/or risk of making structural changes that turn out to be a bad idea even though we all thought they were correct last week) there's an argument for doing it in code which only supports cards that people can't buy yet.

________________________________
From: Dave Airlie <airlied-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Sent: December 11, 2016 9:57 PM
To: Wentland, Harry
Cc: dri-devel; amd-gfx mailing list; Bridgman, John; Deucher, Alexander; Lazare, Jordan; Cheng, Tony; Cyr, Aric; Grodzovsky, Andrey
Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU

On 8 December 2016 at 12:02, Harry Wentland <harry.wentland-5C7GfCeVMHo@public.gmane.org> wrote:
> We propose to use the Display Core (DC) driver for display support on
> AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to
> avoid a flag day the plan is to only support uGPU initially and transition
> to older ASICs gradually.

[FAQ: from past few days]

1) Hey you replied to Daniel, you never addressed the points of the RFC!
I've read it being said that I hadn't addressed the RFC, and you know
I've realised I actually had, because the RFC is great but it
presupposes the codebase as designed can get upstream eventually, and
I don't think it can. The code is too littered with midlayering and
other problems, that actually addressing the individual points of the
RFC would be missing the main point I'm trying to make.

This code needs rewriting, not cleaning, not polishing, it needs to be
split into its constituent parts, and reintegrated in a form more
Linux process friendly.

I feel that if I reply to the individual points Harry has raised in
this RFC, that it means the code would then be suitable for merging,
which it still won't, and I don't want people wasting another 6
months.

If DC was ready for the next-gen GPU it would be ready for the current
GPU, it's not the specific ASIC code that is the problem, it's the
huge midlayer sitting in the middle.

2) We really need to share all of this code between OSes, why does
Linux not want it?

Sharing code is a laudable goal and I appreciate the resourcing
constraints that led us to the point at which we find ourselves, but
the way forward involves finding resources to upstream this code,
dedicated people (even one person) who can spend time on a day by day
basis talking to people in the open and working upstream, improving
other pieces of the drm as they go, reading atomic patches and
reviewing them, and can incrementally build the DC experience on top
of the Linux kernel infrastructure. Then having the corresponding
changes in the DC codebase happen internally to correspond to how the
kernel code ends up looking. Lots of this code overlaps with stuff the
drm already does, lots of is stuff the drm should be doing, so patches
to the drm should be sent instead.

3) Then how do we upstream it?
Resource(s) need(s) to start concentrating at splitting this thing up
and using portions of it in the upstream kernel. We don't land fully
formed code in the kernel if we can avoid it. Because you can't review
the ideas and structure as easy as when someone builds up code in
chunks and actually develops in the Linux kernel. This has always
produced better more maintainable code. Maybe the result will end up
improving the AMD codebase as well.

4) Why can't we put this in staging?
People have also mentioned staging, Daniel has called it a dead end,
I'd have considered staging for this code base, and I still might.
However staging has rules, and the main one is code in staging needs a
TODO list, and agreed criteria for exiting staging, I don't think we'd
be able to get an agreement on what the TODO list should contain and
how we'd ever get all things on it done. If this code ended up in
staging, it would most likely require someone dedicated to recreating
it in the mainline driver in an incremental fashion, and I don't see
that resource being available.

5) Why is a midlayer bad?
I'm not going to go into specifics on the DC midlayer, but we abhor
midlayers for a fair few reasons. The main reason I find causes the
most issues is locking. When you have breaks in code flow between
multiple layers, but having layers calling back into previous layers
it becomes near impossible to track who owns the locking and what the
current locking state is.

Consider
    drma -> dca -> dcb -> drmb
    drmc -> dcc  -> dcb -> drmb

We have two codes paths that go back into drmb, now maybe drma has a
lock taken, but drmc doesn't, but we've no indication when we hit drmb
of what the context pre entering the DC layer is. This causes all
kinds of problems. The main requirement is the driver maintains the
execution flow as much as possible. The only callback behaviour should
be from an irq or workqueue type situations where you've handed
execution flow to the hardware to do something and it is getting back
to you. The pattern we use to get our of this sort of hole is helper
libraries, we structure code as much as possible as leaf nodes that
don't call back into the parents if we can avoid it (we don't always
succeed).

So the above might becomes
   drma-> dca_helper
           -> dcb_helper
           -> drmb.

In this case the code flow is controlled by drma, dca/dcb might be
modifying data or setting hw state but when we get to drmb it's easy
to see what data is needs and what locking.

DAL/DC goes against this in so many ways, and when I look at the code
I'm never sure where to even start pulling the thread to unravel it.

Some questions I have for AMD engineers that also I'd want to see
addressed before any consideration of merging would happen!

How do you plan on dealing with people rewriting or removing code
upstream that is redundant in the kernel, but required for internal
stuff?
How are you going to deal with new Linux things that overlap
incompatibly with your internally developed stuff?
If the code is upstream will it be tested in the kernel by some QA
group, or will there be some CI infrastructure used to maintain and to
watch for Linux code that breaks assumptions in the DC code?
Can you show me you understand that upstream code is no longer 100% in
your control and things can happen to it that you might not expect and
you need to deal with it?

Dave.

[-- Attachment #1.2: Type: text/html, Size: 8759 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-12  3:21       ` Bridgman, John
@ 2016-12-12  3:23         ` Bridgman, John
       [not found]           ` <BN6PR12MB13484A1D247707C399180266E8980-/b2+HYfkarQX0pEhCR5T8QdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  0 siblings, 1 reply; 66+ messages in thread
From: Bridgman, John @ 2016-12-12  3:23 UTC (permalink / raw)
  To: Dave Airlie, Wentland, Harry
  Cc: Grodzovsky, Andrey, amd-gfx mailing list, dri-devel, Deucher,
	Alexander, Cheng, Tony


[-- Attachment #1.1: Type: text/plain, Size: 7435 bytes --]

couple of typo fixes re: top posting and "only supports" -> "is only used for"


________________________________
From: Bridgman, John
Sent: December 11, 2016 10:21 PM
To: Dave Airlie; Wentland, Harry
Cc: dri-devel; amd-gfx mailing list; Deucher, Alexander; Lazare, Jordan; Cheng, Tony; Cyr, Aric; Grodzovsky, Andrey
Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU


Thanks Dave. Apologies in advance for top posting but I'm stuck on a mail client that makes a big mess when I try anything else...


>If DC was ready for the next-gen GPU it would be ready for the current
>GPU, it's not the specific ASIC code that is the problem, it's the
>huge midlayer sitting in the middle.


We realize that (a) we are getting into the high-risk-of-breakage part of the rework and (b) no matter how much we change the code structure there's a good chance that a month after it goes upstream one of us is going to find that more structural changes are required.


I was kinda thinking that if we are doing high-risk activities (risk of subtle breakage not obvious regression, and/or risk of making structural changes that turn out to be a bad idea even though we all thought they were correct last week) there's an argument for doing it in code which is only used for cards that people can't buy yet.

________________________________
From: Dave Airlie <airlied@gmail.com>
Sent: December 11, 2016 9:57 PM
To: Wentland, Harry
Cc: dri-devel; amd-gfx mailing list; Bridgman, John; Deucher, Alexander; Lazare, Jordan; Cheng, Tony; Cyr, Aric; Grodzovsky, Andrey
Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU

On 8 December 2016 at 12:02, Harry Wentland <harry.wentland@amd.com> wrote:
> We propose to use the Display Core (DC) driver for display support on
> AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to
> avoid a flag day the plan is to only support uGPU initially and transition
> to older ASICs gradually.

[FAQ: from past few days]

1) Hey you replied to Daniel, you never addressed the points of the RFC!
I've read it being said that I hadn't addressed the RFC, and you know
I've realised I actually had, because the RFC is great but it
presupposes the codebase as designed can get upstream eventually, and
I don't think it can. The code is too littered with midlayering and
other problems, that actually addressing the individual points of the
RFC would be missing the main point I'm trying to make.

This code needs rewriting, not cleaning, not polishing, it needs to be
split into its constituent parts, and reintegrated in a form more
Linux process friendly.

I feel that if I reply to the individual points Harry has raised in
this RFC, that it means the code would then be suitable for merging,
which it still won't, and I don't want people wasting another 6
months.

If DC was ready for the next-gen GPU it would be ready for the current
GPU, it's not the specific ASIC code that is the problem, it's the
huge midlayer sitting in the middle.

2) We really need to share all of this code between OSes, why does
Linux not want it?

Sharing code is a laudable goal and I appreciate the resourcing
constraints that led us to the point at which we find ourselves, but
the way forward involves finding resources to upstream this code,
dedicated people (even one person) who can spend time on a day by day
basis talking to people in the open and working upstream, improving
other pieces of the drm as they go, reading atomic patches and
reviewing them, and can incrementally build the DC experience on top
of the Linux kernel infrastructure. Then having the corresponding
changes in the DC codebase happen internally to correspond to how the
kernel code ends up looking. Lots of this code overlaps with stuff the
drm already does, lots of is stuff the drm should be doing, so patches
to the drm should be sent instead.

3) Then how do we upstream it?
Resource(s) need(s) to start concentrating at splitting this thing up
and using portions of it in the upstream kernel. We don't land fully
formed code in the kernel if we can avoid it. Because you can't review
the ideas and structure as easy as when someone builds up code in
chunks and actually develops in the Linux kernel. This has always
produced better more maintainable code. Maybe the result will end up
improving the AMD codebase as well.

4) Why can't we put this in staging?
People have also mentioned staging, Daniel has called it a dead end,
I'd have considered staging for this code base, and I still might.
However staging has rules, and the main one is code in staging needs a
TODO list, and agreed criteria for exiting staging, I don't think we'd
be able to get an agreement on what the TODO list should contain and
how we'd ever get all things on it done. If this code ended up in
staging, it would most likely require someone dedicated to recreating
it in the mainline driver in an incremental fashion, and I don't see
that resource being available.

5) Why is a midlayer bad?
I'm not going to go into specifics on the DC midlayer, but we abhor
midlayers for a fair few reasons. The main reason I find causes the
most issues is locking. When you have breaks in code flow between
multiple layers, but having layers calling back into previous layers
it becomes near impossible to track who owns the locking and what the
current locking state is.

Consider
    drma -> dca -> dcb -> drmb
    drmc -> dcc  -> dcb -> drmb

We have two codes paths that go back into drmb, now maybe drma has a
lock taken, but drmc doesn't, but we've no indication when we hit drmb
of what the context pre entering the DC layer is. This causes all
kinds of problems. The main requirement is the driver maintains the
execution flow as much as possible. The only callback behaviour should
be from an irq or workqueue type situations where you've handed
execution flow to the hardware to do something and it is getting back
to you. The pattern we use to get our of this sort of hole is helper
libraries, we structure code as much as possible as leaf nodes that
don't call back into the parents if we can avoid it (we don't always
succeed).

So the above might becomes
   drma-> dca_helper
           -> dcb_helper
           -> drmb.

In this case the code flow is controlled by drma, dca/dcb might be
modifying data or setting hw state but when we get to drmb it's easy
to see what data is needs and what locking.

DAL/DC goes against this in so many ways, and when I look at the code
I'm never sure where to even start pulling the thread to unravel it.

Some questions I have for AMD engineers that also I'd want to see
addressed before any consideration of merging would happen!

How do you plan on dealing with people rewriting or removing code
upstream that is redundant in the kernel, but required for internal
stuff?
How are you going to deal with new Linux things that overlap
incompatibly with your internally developed stuff?
If the code is upstream will it be tested in the kernel by some QA
group, or will there be some CI infrastructure used to maintain and to
watch for Linux code that breaks assumptions in the DC code?
Can you show me you understand that upstream code is no longer 100% in
your control and things can happen to it that you might not expect and
you need to deal with it?

Dave.

[-- Attachment #1.2: Type: text/html, Size: 9555 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]           ` <BN6PR12MB13484A1D247707C399180266E8980-/b2+HYfkarQX0pEhCR5T8QdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2016-12-12  3:43             ` Bridgman, John
  2016-12-12  4:05               ` Dave Airlie
  0 siblings, 1 reply; 66+ messages in thread
From: Bridgman, John @ 2016-12-12  3:43 UTC (permalink / raw)
  To: Dave Airlie, Wentland, Harry
  Cc: Grodzovsky, Andrey, Cyr, Aric, Lazare, Jordan,
	amd-gfx mailing list, dri-devel, Deucher, Alexander, Cheng, Tony


[-- Attachment #1.1: Type: text/plain, Size: 8159 bytes --]

v3 with typo fixes and additional comments/questions..


________________________________
From: Bridgman, John
Sent: December 11, 2016 10:21 PM
To: Dave Airlie; Wentland, Harry
Cc: dri-devel; amd-gfx mailing list; Deucher, Alexander; Lazare, Jordan; Cheng, Tony; Cyr, Aric; Grodzovsky, Andrey
Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU


Thanks Dave. Apologies in advance for top posting but I'm stuck on a mail client that makes a big mess when I try anything else...


>This code needs rewriting, not cleaning, not polishing, it needs to be
>split into its constituent parts, and reintegrated in a form more
>Linux process friendly.


Can we say "restructuring" just for consistency with Daniel's message (the HW-dependent bits don't need to be rewritten but the way they are used/called needs to change) ?


>I feel that if I reply to the individual points Harry has raised in
>this RFC, that it means the code would then be suitable for merging,
>which it still won't, and I don't want people wasting another 6
>months.


That's fair. There was an implicit "when it's suitable" assumption in the RFC, but we'll make that explicit in the future.


>If DC was ready for the next-gen GPU it would be ready for the current
>GPU, it's not the specific ASIC code that is the problem, it's the
>huge midlayer sitting in the middle.


We realize that (a) we are getting into the high-risk-of-breakage part of the rework and (b) no matter how much we change the code structure there's a good chance that a month after it goes upstream one of us is going to find that more structural changes are required.


I was kinda thinking that if we are doing high-risk activities (risk of subtle breakage not obvious regression, and/or risk of making structural changes that turn out to be a bad idea even though we all thought they were correct last week) there's an argument for doing it in code which is only used for cards that people can't buy yet.

________________________________
From: Dave Airlie <airlied-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Sent: December 11, 2016 9:57 PM
To: Wentland, Harry
Cc: dri-devel; amd-gfx mailing list; Bridgman, John; Deucher, Alexander; Lazare, Jordan; Cheng, Tony; Cyr, Aric; Grodzovsky, Andrey
Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU

On 8 December 2016 at 12:02, Harry Wentland <harry.wentland-5C7GfCeVMHo@public.gmane.org> wrote:
> We propose to use the Display Core (DC) driver for display support on
> AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to
> avoid a flag day the plan is to only support uGPU initially and transition
> to older ASICs gradually.

[FAQ: from past few days]

1) Hey you replied to Daniel, you never addressed the points of the RFC!
I've read it being said that I hadn't addressed the RFC, and you know
I've realised I actually had, because the RFC is great but it
presupposes the codebase as designed can get upstream eventually, and
I don't think it can. The code is too littered with midlayering and
other problems, that actually addressing the individual points of the
RFC would be missing the main point I'm trying to make.

This code needs rewriting, not cleaning, not polishing, it needs to be
split into its constituent parts, and reintegrated in a form more
Linux process friendly.

I feel that if I reply to the individual points Harry has raised in
this RFC, that it means the code would then be suitable for merging,
which it still won't, and I don't want people wasting another 6
months.

If DC was ready for the next-gen GPU it would be ready for the current
GPU, it's not the specific ASIC code that is the problem, it's the
huge midlayer sitting in the middle.

2) We really need to share all of this code between OSes, why does
Linux not want it?

Sharing code is a laudable goal and I appreciate the resourcing
constraints that led us to the point at which we find ourselves, but
the way forward involves finding resources to upstream this code,
dedicated people (even one person) who can spend time on a day by day
basis talking to people in the open and working upstream, improving
other pieces of the drm as they go, reading atomic patches and
reviewing them, and can incrementally build the DC experience on top
of the Linux kernel infrastructure. Then having the corresponding
changes in the DC codebase happen internally to correspond to how the
kernel code ends up looking. Lots of this code overlaps with stuff the
drm already does, lots of is stuff the drm should be doing, so patches
to the drm should be sent instead.

3) Then how do we upstream it?
Resource(s) need(s) to start concentrating at splitting this thing up
and using portions of it in the upstream kernel. We don't land fully
formed code in the kernel if we can avoid it. Because you can't review
the ideas and structure as easy as when someone builds up code in
chunks and actually develops in the Linux kernel. This has always
produced better more maintainable code. Maybe the result will end up
improving the AMD codebase as well.

4) Why can't we put this in staging?
People have also mentioned staging, Daniel has called it a dead end,
I'd have considered staging for this code base, and I still might.
However staging has rules, and the main one is code in staging needs a
TODO list, and agreed criteria for exiting staging, I don't think we'd
be able to get an agreement on what the TODO list should contain and
how we'd ever get all things on it done. If this code ended up in
staging, it would most likely require someone dedicated to recreating
it in the mainline driver in an incremental fashion, and I don't see
that resource being available.

5) Why is a midlayer bad?
I'm not going to go into specifics on the DC midlayer, but we abhor
midlayers for a fair few reasons. The main reason I find causes the
most issues is locking. When you have breaks in code flow between
multiple layers, but having layers calling back into previous layers
it becomes near impossible to track who owns the locking and what the
current locking state is.

Consider
    drma -> dca -> dcb -> drmb
    drmc -> dcc  -> dcb -> drmb

We have two codes paths that go back into drmb, now maybe drma has a
lock taken, but drmc doesn't, but we've no indication when we hit drmb
of what the context pre entering the DC layer is. This causes all
kinds of problems. The main requirement is the driver maintains the
execution flow as much as possible. The only callback behaviour should
be from an irq or workqueue type situations where you've handed
execution flow to the hardware to do something and it is getting back
to you. The pattern we use to get our of this sort of hole is helper
libraries, we structure code as much as possible as leaf nodes that
don't call back into the parents if we can avoid it (we don't always
succeed).

So the above might becomes
   drma-> dca_helper
           -> dcb_helper
           -> drmb.

In this case the code flow is controlled by drma, dca/dcb might be
modifying data or setting hw state but when we get to drmb it's easy
to see what data is needs and what locking.

DAL/DC goes against this in so many ways, and when I look at the code
I'm never sure where to even start pulling the thread to unravel it.

Some questions I have for AMD engineers that also I'd want to see
addressed before any consideration of merging would happen!

How do you plan on dealing with people rewriting or removing code
upstream that is redundant in the kernel, but required for internal
stuff?
How are you going to deal with new Linux things that overlap
incompatibly with your internally developed stuff?
If the code is upstream will it be tested in the kernel by some QA
group, or will there be some CI infrastructure used to maintain and to
watch for Linux code that breaks assumptions in the DC code?
Can you show me you understand that upstream code is no longer 100% in
your control and things can happen to it that you might not expect and
you need to deal with it?

Dave.

[-- Attachment #1.2: Type: text/html, Size: 10728 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-12  3:43             ` Bridgman, John
@ 2016-12-12  4:05               ` Dave Airlie
  0 siblings, 0 replies; 66+ messages in thread
From: Dave Airlie @ 2016-12-12  4:05 UTC (permalink / raw)
  To: Bridgman, John
  Cc: Grodzovsky, Andrey, amd-gfx mailing list, dri-devel, Deucher,
	Alexander, Cheng, Tony

>
>>This code needs rewriting, not cleaning, not polishing, it needs to be
>>split into its constituent parts, and reintegrated in a form more
>>Linux process friendly.
>
>
> Can we say "restructuring" just for consistency with Daniel's message (the
> HW-dependent bits don't need to be rewritten but the way they are
> used/called needs to change) ?

Yes I think there is a lot of the code that could be reused with little change,

it's just all the pieces tying it together needs restructure.

Dave.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-12  2:57   ` Dave Airlie
@ 2016-12-12  7:09     ` Daniel Vetter
       [not found]     ` <CAPM=9tx+j9-3fZNY=peLjdsVqyLS6i3V-sV3XrnYsK2YuhWRBA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2016-12-13  2:52     ` Cheng, Tony
  2 siblings, 0 replies; 66+ messages in thread
From: Daniel Vetter @ 2016-12-12  7:09 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Grodzovsky, Andrey, dri-devel, amd-gfx mailing list, Deucher,
	Alexander, Cheng, Tony

On Mon, Dec 12, 2016 at 12:57:40PM +1000, Dave Airlie wrote:
> 4) Why can't we put this in staging?
> People have also mentioned staging, Daniel has called it a dead end,
> I'd have considered staging for this code base, and I still might.
> However staging has rules, and the main one is code in staging needs a
> TODO list, and agreed criteria for exiting staging, I don't think we'd
> be able to get an agreement on what the TODO list should contain and
> how we'd ever get all things on it done. If this code ended up in
> staging, it would most likely require someone dedicated to recreating
> it in the mainline driver in an incremental fashion, and I don't see
> that resource being available.

So it's not just that I think the staging experience for drivers isn't
good (e.g. imx, gma500), there's also the trouble that it's a separate
tree and the coordination becomes a pain. That was very ugly around all
the sync_file stuff imo, and for next time around we ever do that we
should just put it into drm first and clean up second. We could do staging
like with nouveau, but that's imo not really any different from just
merging if we only slap a Kconfig depends upon the entire pile. So just
don't see the benefit.

I think stagin is good for checkpatch cleanup, but we already agreed that
we're ok with ugly code if it's the stuff debugged by hw engineers. And
for anything else like real refactoring I of big pieces of code I just
don't see how staging makes sense. Maybe if it's a completely new
subsystem, but the point here is that we want DC to integrate tighter with
drm and be able to share code.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-08  2:02 [RFC] Using DC in amdgpu for upcoming GPU Harry Wentland
                   ` (2 preceding siblings ...)
       [not found] ` <55d5e664-25f7-70e0-f2f5-9c9daf3efdf6-5C7GfCeVMHo@public.gmane.org>
@ 2016-12-12  7:22 ` Daniel Vetter
       [not found]   ` <20161212072243.ah6sy3q57z4gimka-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
  3 siblings, 1 reply; 66+ messages in thread
From: Daniel Vetter @ 2016-12-12  7:22 UTC (permalink / raw)
  To: Harry Wentland
  Cc: Grodzovsky, Andrey, amd-gfx, dri-devel, Deucher, Alexander, Cheng, Tony

On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
> Current version of DC:
> 
>  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
> 
> Once Alex pulls in the latest patches:
> 
>  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7

One more: That 4.7 here is going to be unbelievable amounts of pain for
you. Yes it's a totally sensible idea to just freeze your baseline kernel
because then linux looks a lot more like Windows where the driver abi is
frozen. But it makes following upstream entirely impossible, because
rebasing is always a pain and hence postponed. Which means you can't just
use the latest stuff in upstream drm, which means collaboration with
others and sharing bugfixes in core is a lot more pain, which then means
you do more than necessary in your own code and results in HALs like DAL,
perpetuating the entire mess.

So I think you don't just need to demidlayer DAL/DC, you also need to
demidlayer your development process. In our experience here at Intel that
needs continuous integration testing (in drm-tip), because even 1 month of
not resyncing with drm-next is sometimes way too long. See e.g. the
controlD regression we just had. And DAL is stuck on a 1 year old kernel,
so pretty much only of historical significance and otherwise dead code.

And then for any stuff which isn't upstream yet (like your internal
enabling, or DAL here, or our own internal enabling) you need continuous
rebasing&re-validation. When we started doing this years ago it was still
manually, but we still rebased like every few days to keep the pain down
and adjust continuously to upstream evolution. But then going to a
continous rebase bot that sends you mail when something goes wrong was
again a massive improvement.

I guess in the end Conway's law that your software architecture
necessarily reflects how you organize your teams applies again. Fix your
process and it'll become glaringly obvious to everyone involved that
DC-the-design as-is is entirely unworkeable and how it needs to be fixed.

From my own experience over the past few years: Doing that is a fun
journey ;-)

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]   ` <20161212072243.ah6sy3q57z4gimka-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
@ 2016-12-12  7:54     ` Bridgman, John
       [not found]       ` <BN6PR12MB13484DA35697DBD0CA815CFFE8980-/b2+HYfkarQX0pEhCR5T8QdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  2016-12-13  2:05     ` Harry Wentland
  1 sibling, 1 reply; 66+ messages in thread
From: Bridgman, John @ 2016-12-12  7:54 UTC (permalink / raw)
  To: Daniel Vetter, Wentland, Harry
  Cc: Deucher, Alexander, Grodzovsky, Andrey, Cheng, Tony,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


[-- Attachment #1.1: Type: text/plain, Size: 3796 bytes --]

Yep, good point. We have tended to stay a bit behind bleeding edge because our primary tasks so far have been:


1. Support enterprise distros (with old kernels) via the hybrid driver (AMDGPU-PRO), where the closer to upstream we get the more of a gap we have to paper over with KCL code


2. Push architecturally simple code (new GPU support) upstream, where being closer to upstream makes the up-streaming task simpler but not by that much


So 4.7 isn't as bad a compromise as it might seem.


That said, in the case of DAL/DC it's a different story as you say... architecturally complex code needing to be woven into a fast-moving subsystem of the kernel. So for DAL/DC anything other than upstream is going to be a big pain.


OK, need to think that through.


Thanks !

________________________________
From: dri-devel <dri-devel-bounces-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org> on behalf of Daniel Vetter <daniel-/w4YWyX8dFk@public.gmane.org>
Sent: December 12, 2016 2:22 AM
To: Wentland, Harry
Cc: Grodzovsky, Andrey; amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org; dri-devel-PD4FTy7X32mptlylMvRsHA@public.gmane.orgdesktop.org; Deucher, Alexander; Cheng, Tony
Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU

On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
> Current version of DC:
>
>  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>
> Once Alex pulls in the latest patches:
>
>  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7

One more: That 4.7 here is going to be unbelievable amounts of pain for
you. Yes it's a totally sensible idea to just freeze your baseline kernel
because then linux looks a lot more like Windows where the driver abi is
frozen. But it makes following upstream entirely impossible, because
rebasing is always a pain and hence postponed. Which means you can't just
use the latest stuff in upstream drm, which means collaboration with
others and sharing bugfixes in core is a lot more pain, which then means
you do more than necessary in your own code and results in HALs like DAL,
perpetuating the entire mess.

So I think you don't just need to demidlayer DAL/DC, you also need to
demidlayer your development process. In our experience here at Intel that
needs continuous integration testing (in drm-tip), because even 1 month of
not resyncing with drm-next is sometimes way too long. See e.g. the
controlD regression we just had. And DAL is stuck on a 1 year old kernel,
so pretty much only of historical significance and otherwise dead code.

And then for any stuff which isn't upstream yet (like your internal
enabling, or DAL here, or our own internal enabling) you need continuous
rebasing&re-validation. When we started doing this years ago it was still
manually, but we still rebased like every few days to keep the pain down
and adjust continuously to upstream evolution. But then going to a
continous rebase bot that sends you mail when something goes wrong was
again a massive improvement.

I guess in the end Conway's law that your software architecture
necessarily reflects how you organize your teams applies again. Fix your
process and it'll become glaringly obvious to everyone involved that
DC-the-design as-is is entirely unworkeable and how it needs to be fixed.

>From my own experience over the past few years: Doing that is a fun
journey ;-)

Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[-- Attachment #1.2: Type: text/html, Size: 5225 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]       ` <BN6PR12MB13484DA35697DBD0CA815CFFE8980-/b2+HYfkarQX0pEhCR5T8QdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2016-12-12  9:27         ` Daniel Vetter
       [not found]           ` <20161212092727.6jgsgzlrdsha6zsl-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
  2016-12-12 15:28           ` Deucher, Alexander
  0 siblings, 2 replies; 66+ messages in thread
From: Daniel Vetter @ 2016-12-12  9:27 UTC (permalink / raw)
  To: Bridgman, John
  Cc: Grodzovsky, Andrey, Cheng, Tony,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Daniel Vetter, Deucher,
	Alexander, Wentland, Harry

On Mon, Dec 12, 2016 at 07:54:54AM +0000, Bridgman, John wrote:
> Yep, good point. We have tended to stay a bit behind bleeding edge because our primary tasks so far have been:
> 
> 
> 1. Support enterprise distros (with old kernels) via the hybrid driver
> (AMDGPU-PRO), where the closer to upstream we get the more of a gap we
> have to paper over with KCL code

Hm, I thought resonable enterprise distros roll their drm core forward to
the very latest upstream fairly often, so it shouldn't be too bad? Fixing
this completely requires that you upstream your pre-production hw support
early enough that by the time it ships its the backport is already in a
realeased enterprise distro upgrade. But then adding bugfixes on top
should be doable.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]           ` <20161212092727.6jgsgzlrdsha6zsl-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
@ 2016-12-12  9:29             ` Daniel Vetter
  0 siblings, 0 replies; 66+ messages in thread
From: Daniel Vetter @ 2016-12-12  9:29 UTC (permalink / raw)
  To: Bridgman, John
  Cc: Grodzovsky, Andrey, Cheng, Tony,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Daniel Vetter, Deucher,
	Alexander, Wentland, Harry

On Mon, Dec 12, 2016 at 10:27:27AM +0100, Daniel Vetter wrote:
> On Mon, Dec 12, 2016 at 07:54:54AM +0000, Bridgman, John wrote:
> > Yep, good point. We have tended to stay a bit behind bleeding edge because our primary tasks so far have been:
> > 
> > 
> > 1. Support enterprise distros (with old kernels) via the hybrid driver
> > (AMDGPU-PRO), where the closer to upstream we get the more of a gap we
> > have to paper over with KCL code
> 
> Hm, I thought resonable enterprise distros roll their drm core forward to
> the very latest upstream fairly often, so it shouldn't be too bad? Fixing
> this completely requires that you upstream your pre-production hw support
> early enough that by the time it ships its the backport is already in a
> realeased enterprise distro upgrade. But then adding bugfixes on top
> should be doable.

Or just put an entire statically linked copy of the corresponding drm core
into your dkms. A bit horrible, but iirc it's been done before.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* RE: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-12  9:27         ` Daniel Vetter
       [not found]           ` <20161212092727.6jgsgzlrdsha6zsl-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
@ 2016-12-12 15:28           ` Deucher, Alexander
       [not found]             ` <MWHPR12MB1694EE6082AE9315EF5E6C68F7980-Gy0DoCVfaSW4WA4dJ5YXGAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  1 sibling, 1 reply; 66+ messages in thread
From: Deucher, Alexander @ 2016-12-12 15:28 UTC (permalink / raw)
  To: 'Daniel Vetter', Bridgman, John
  Cc: Grodzovsky, Andrey, Cheng, Tony, amd-gfx, dri-devel

> -----Original Message-----
> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf
> Of Daniel Vetter
> Sent: Monday, December 12, 2016 4:27 AM
> To: Bridgman, John
> Cc: Grodzovsky, Andrey; Cheng, Tony; dri-devel@lists.freedesktop.org; amd-
> gfx@lists.freedesktop.org; Daniel Vetter; Deucher, Alexander; Wentland,
> Harry
> Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU
> 
> On Mon, Dec 12, 2016 at 07:54:54AM +0000, Bridgman, John wrote:
> > Yep, good point. We have tended to stay a bit behind bleeding edge
> because our primary tasks so far have been:
> >
> >
> > 1. Support enterprise distros (with old kernels) via the hybrid driver
> > (AMDGPU-PRO), where the closer to upstream we get the more of a gap
> we
> > have to paper over with KCL code
> 
> Hm, I thought resonable enterprise distros roll their drm core forward to
> the very latest upstream fairly often, so it shouldn't be too bad? Fixing
> this completely requires that you upstream your pre-production hw support
> early enough that by the time it ships its the backport is already in a
> realeased enterprise distro upgrade. But then adding bugfixes on top
> should be doable.

The issue is we need DAL/DC for enterprise distros and OEM preloads and, for workstation customers, we need some additional patches that aren't upstream yet because they we don’t have an open source user for them yet.  This gets much easier once we get OCL and VK open sourced.  As for new asic support, unfortunately, they do not often align well with enterprise distros at least for dGPUs (APUs are usually easier since the cycles are longer, dGPUs cycles are very fast).  The other problem with dGPUs is that we often can't release support for new hw or feature too much earlier than launch due to the very competitive dGPU environment in gaming and workstation.

Alex

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]             ` <MWHPR12MB1694EE6082AE9315EF5E6C68F7980-Gy0DoCVfaSW4WA4dJ5YXGAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2016-12-12 16:06               ` Luke A. Guest
  2016-12-12 16:17               ` Luke A. Guest
  1 sibling, 0 replies; 66+ messages in thread
From: Luke A. Guest @ 2016-12-12 16:06 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



On 12/12/16 15:28, Deucher, Alexander wrote:
>> -----Original Message-----
>> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf
>> Of Daniel Vetter
>> Sent: Monday, December 12, 2016 4:27 AM
>> To: Bridgman, John
>> Cc: Grodzovsky, Andrey; Cheng, Tony; dri-devel@lists.freedesktop.org; amd-
>> gfx@lists.freedesktop.org; Daniel Vetter; Deucher, Alexander; Wentland,
>> Harry
>> Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU
>>
>> On Mon, Dec 12, 2016 at 07:54:54AM +0000, Bridgman, John wrote:
>>> Yep, good point. We have tended to stay a bit behind bleeding edge
>> because our primary tasks so far have been:
>>>
>>> 1. Support enterprise distros (with old kernels) via the hybrid driver
>>> (AMDGPU-PRO), where the closer to upstream we get the more of a gap
>> we
>>> have to paper over with KCL code
>> Hm, I thought resonable enterprise distros roll their drm core forward to
>> the very latest upstream fairly often, so it shouldn't be too bad? Fixing
>> this completely requires that you upstream your pre-production hw support
>> early enough that by the time it ships its the backport is already in a
>> realeased enterprise distro upgrade. But then adding bugfixes on top
>> should be doable.
> The issue is we need DAL/DC for enterprise distros and OEM preloads and, for workstation customers, we need some additional patches that aren't upstream yet because they we don’t have an open source user for them yet.  This gets much easier once we get OCL and VK open sourced.  As for new asic support, unfortunately, they do not often align well with enterprise distros at least for dGPUs (APUs are usually easier since the cycles are longer, dGPUs cycles are very fast).  The other problem with dGPUs is that we often can't release support for new hw or feature too much earlier than launch due to the very competitive dGPU environment in gaming and workstation.
>

What Daniel said is something I've said to you before, especially
regarding libdrm. You keep mentioning these patches you need, but tbh,
there's no reason why these patches cannot be in patchwork so people can
use them. I've asked for this for months and the response was,
"shouldn't be a problem, but I won't get to it this week," months later,
still not there.

Please just get your stuff public so the people who aren't on enterprise
and ancient OSes can upgrade their systems. This would enable me to test
amdgpu-pro and latest Mesa/LLVM alongside each other for Gentoo without
having to replace a source built libdrm with your ancient one.

Luke.


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]             ` <MWHPR12MB1694EE6082AE9315EF5E6C68F7980-Gy0DoCVfaSW4WA4dJ5YXGAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  2016-12-12 16:06               ` Luke A. Guest
@ 2016-12-12 16:17               ` Luke A. Guest
       [not found]                 ` <584ECD8B.8000509-z/KZkw/0wg5BDgjK7y7TUQ@public.gmane.org>
  1 sibling, 1 reply; 66+ messages in thread
From: Luke A. Guest @ 2016-12-12 16:17 UTC (permalink / raw)
  To: Deucher, Alexander, 'Daniel Vetter', Bridgman, John
  Cc: Grodzovsky, Andrey, Cheng, Tony,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 12/12/16 15:28, Deucher, Alexander wrote:
>> -----Original Message-----
>> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf
>> Of Daniel Vetter
>> Sent: Monday, December 12, 2016 4:27 AM
>> To: Bridgman, John
>> Cc: Grodzovsky, Andrey; Cheng, Tony; dri-devel@lists.freedesktop.org; amd-
>> gfx@lists.freedesktop.org; Daniel Vetter; Deucher, Alexander; Wentland,
>> Harry
>> Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU
>>
>> On Mon, Dec 12, 2016 at 07:54:54AM +0000, Bridgman, John wrote:
>>> Yep, good point. We have tended to stay a bit behind bleeding edge
>> because our primary tasks so far have been:
>>>
>>> 1. Support enterprise distros (with old kernels) via the hybrid driver
>>> (AMDGPU-PRO), where the closer to upstream we get the more of a gap
>> we
>>> have to paper over with KCL code
>> Hm, I thought resonable enterprise distros roll their drm core forward to
>> the very latest upstream fairly often, so it shouldn't be too bad? Fixing
>> this completely requires that you upstream your pre-production hw support
>> early enough that by the time it ships its the backport is already in a
>> realeased enterprise distro upgrade. But then adding bugfixes on top
>> should be doable.
> The issue is we need DAL/DC for enterprise distros and OEM preloads and, for workstation customers, we need some additional patches that aren't upstream yet because they we don’t have an open source user for them yet.  This gets much easier once we get OCL and VK open sourced.  As for new asic support, unfortunately, they do not often align well with enterprise distros at least for dGPUs (APUs are usually easier since the cycles are longer, dGPUs cycles are very fast).  The other problem with dGPUs is that we often can't release support for new hw or feature too much earlier than launch due to the very competitive dGPU environment in gaming and workstation.
>
>
Apologies for spamming, but I didn't send this to all.

What Daniel said is something I've said to you before, especially
regarding libdrm. You keep mentioning these patches you need, but tbh,
there's no reason why these patches cannot be in patchwork so people can
use them. I've asked for this for months and the response was,
"shouldn't be a problem, but I won't get to it this week," months later,
still not there.

Please just get your stuff public so the people who aren't on enterprise
and ancient OSes can upgrade their systems. This would enable me to test
amdgpu-pro and latest Mesa/LLVM alongside each other for Gentoo without
having to replace a source built libdrm with your ancient one.

Luke.


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* RE: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]                 ` <584ECD8B.8000509-z/KZkw/0wg5BDgjK7y7TUQ@public.gmane.org>
@ 2016-12-12 16:44                   ` Deucher, Alexander
  0 siblings, 0 replies; 66+ messages in thread
From: Deucher, Alexander @ 2016-12-12 16:44 UTC (permalink / raw)
  To: 'Luke A. Guest', 'Daniel Vetter', Bridgman, John
  Cc: Grodzovsky, Andrey, Cheng, Tony,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

> -----Original Message-----
> From: Luke A. Guest [mailto:laguest@archeia.com]
> Sent: Monday, December 12, 2016 11:17 AM
> To: Deucher, Alexander; 'Daniel Vetter'; Bridgman, John
> Cc: Grodzovsky, Andrey; Cheng, Tony; amd-gfx@lists.freedesktop.org; dri-
> devel@lists.freedesktop.org
> Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU
> 
> On 12/12/16 15:28, Deucher, Alexander wrote:
> >> -----Original Message-----
> >> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On
> Behalf
> >> Of Daniel Vetter
> >> Sent: Monday, December 12, 2016 4:27 AM
> >> To: Bridgman, John
> >> Cc: Grodzovsky, Andrey; Cheng, Tony; dri-devel@lists.freedesktop.org;
> amd-
> >> gfx@lists.freedesktop.org; Daniel Vetter; Deucher, Alexander; Wentland,
> >> Harry
> >> Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU
> >>
> >> On Mon, Dec 12, 2016 at 07:54:54AM +0000, Bridgman, John wrote:
> >>> Yep, good point. We have tended to stay a bit behind bleeding edge
> >> because our primary tasks so far have been:
> >>>
> >>> 1. Support enterprise distros (with old kernels) via the hybrid driver
> >>> (AMDGPU-PRO), where the closer to upstream we get the more of a
> gap
> >> we
> >>> have to paper over with KCL code
> >> Hm, I thought resonable enterprise distros roll their drm core forward to
> >> the very latest upstream fairly often, so it shouldn't be too bad? Fixing
> >> this completely requires that you upstream your pre-production hw
> support
> >> early enough that by the time it ships its the backport is already in a
> >> realeased enterprise distro upgrade. But then adding bugfixes on top
> >> should be doable.
> > The issue is we need DAL/DC for enterprise distros and OEM preloads and,
> for workstation customers, we need some additional patches that aren't
> upstream yet because they we don’t have an open source user for them yet.
> This gets much easier once we get OCL and VK open sourced.  As for new asic
> support, unfortunately, they do not often align well with enterprise distros at
> least for dGPUs (APUs are usually easier since the cycles are longer, dGPUs
> cycles are very fast).  The other problem with dGPUs is that we often can't
> release support for new hw or feature too much earlier than launch due to
> the very competitive dGPU environment in gaming and workstation.
> >
> >
> Apologies for spamming, but I didn't send this to all.
> 
> What Daniel said is something I've said to you before, especially
> regarding libdrm. You keep mentioning these patches you need, but tbh,
> there's no reason why these patches cannot be in patchwork so people can
> use them. I've asked for this for months and the response was,
> "shouldn't be a problem, but I won't get to it this week," months later,
> still not there.

The kernel side is public.  The dkms packages have the full source tree.  As I said before, we plan to make this all public, but just haven't had the time (as this thread shows, we've got a lot of other higher priority things on our plate).  Even when we do, it doesn’t change the fact that the patches can't go upstream at the moment so it doesn't fix the situation Daniel was talking about anyway.  Distro's generally don't take code that is not upstream yet.  While we only validate the dkms packages on the enterprise distros, they should be adaptable to other kernels.

Alex

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]     ` <CAPM=9tx+j9-3fZNY=peLjdsVqyLS6i3V-sV3XrnYsK2YuhWRBA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2016-12-12  3:21       ` Bridgman, John
@ 2016-12-13  1:49       ` Harry Wentland
       [not found]         ` <634f5374-027a-6ec9-41a5-64351c4f7eac-5C7GfCeVMHo@public.gmane.org>
  1 sibling, 1 reply; 66+ messages in thread
From: Harry Wentland @ 2016-12-13  1:49 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Grodzovsky, Andrey, Cyr, Aric, Bridgman, John, Lazare, Jordan,
	amd-gfx mailing list, dri-devel, Deucher, Alexander, Cheng, Tony

Hi Dave,

Apologies for waking you up with the RFC on a Friday morning. I'll try 
to time big stuff better next time.

A couple of thoughts below after having some discussions internally. I 
think Tony might add to some of them or provide his own.

On 2016-12-11 09:57 PM, Dave Airlie wrote:
> On 8 December 2016 at 12:02, Harry Wentland <harry.wentland@amd.com> wrote:
>> We propose to use the Display Core (DC) driver for display support on
>> AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to
>> avoid a flag day the plan is to only support uGPU initially and transition
>> to older ASICs gradually.
>
> [FAQ: from past few days]
>
> 1) Hey you replied to Daniel, you never addressed the points of the RFC!
> I've read it being said that I hadn't addressed the RFC, and you know
> I've realised I actually had, because the RFC is great but it
> presupposes the codebase as designed can get upstream eventually, and
> I don't think it can. The code is too littered with midlayering and
> other problems, that actually addressing the individual points of the
> RFC would be missing the main point I'm trying to make.
>
> This code needs rewriting, not cleaning, not polishing, it needs to be
> split into its constituent parts, and reintegrated in a form more
> Linux process friendly.
>
> I feel that if I reply to the individual points Harry has raised in
> this RFC, that it means the code would then be suitable for merging,
> which it still won't, and I don't want people wasting another 6
> months.
>
> If DC was ready for the next-gen GPU it would be ready for the current
> GPU, it's not the specific ASIC code that is the problem, it's the
> huge midlayer sitting in the middle.
>
> 2) We really need to share all of this code between OSes, why does
> Linux not want it?
>
> Sharing code is a laudable goal and I appreciate the resourcing
> constraints that led us to the point at which we find ourselves, but
> the way forward involves finding resources to upstream this code,
> dedicated people (even one person) who can spend time on a day by day
> basis talking to people in the open and working upstream, improving
> other pieces of the drm as they go, reading atomic patches and
> reviewing them, and can incrementally build the DC experience on top
> of the Linux kernel infrastructure. Then having the corresponding
> changes in the DC codebase happen internally to correspond to how the
> kernel code ends up looking. Lots of this code overlaps with stuff the
> drm already does, lots of is stuff the drm should be doing, so patches
> to the drm should be sent instead.
>

Personally I'm with you on this and hope to get us there. I'm 
learning... we're learning. I agree that changes on atomic, removing 
abstractions, etc. should happen on dri-devel.

When it comes to brand-new technologies (MST, Freesync), though, we're 
often the first which means that we're spending a considerable amount of 
time to get things right, working with HW teams, receiver vendors and 
other partners internal and external to AMD. By the time we do get it 
right it's time to hit the market. This gives us fairly little leeway to 
work with the community on patches that won't land in distros for 
another half a year. We're definitely hoping to improve some of this but 
it's not easy and in some case impossible ahead of time (though 
definitely possibly after initial release).


> 3) Then how do we upstream it?
> Resource(s) need(s) to start concentrating at splitting this thing up
> and using portions of it in the upstream kernel. We don't land fully
> formed code in the kernel if we can avoid it. Because you can't review
> the ideas and structure as easy as when someone builds up code in
> chunks and actually develops in the Linux kernel. This has always
> produced better more maintainable code. Maybe the result will end up
> improving the AMD codebase as well.
>
> 4) Why can't we put this in staging?
> People have also mentioned staging, Daniel has called it a dead end,
> I'd have considered staging for this code base, and I still might.
> However staging has rules, and the main one is code in staging needs a
> TODO list, and agreed criteria for exiting staging, I don't think we'd
> be able to get an agreement on what the TODO list should contain and
> how we'd ever get all things on it done. If this code ended up in
> staging, it would most likely require someone dedicated to recreating
> it in the mainline driver in an incremental fashion, and I don't see
> that resource being available.
>

I don't think we really want staging. If it helps us get into DRM, sure, 
but if it's more of a pain, as suggested, then probably no.

> 5) Why is a midlayer bad?
> I'm not going to go into specifics on the DC midlayer, but we abhor
> midlayers for a fair few reasons. The main reason I find causes the
> most issues is locking. When you have breaks in code flow between
> multiple layers, but having layers calling back into previous layers
> it becomes near impossible to track who owns the locking and what the
> current locking state is.
>

There's a conscious design decision to have absolutely no locking in DC. 
This is one of the reasons. Locking is really OS dependent behavior 
which has no place in DC.

> Consider
>     drma -> dca -> dcb -> drmb
>     drmc -> dcc  -> dcb -> drmb
>
> We have two codes paths that go back into drmb, now maybe drma has a
> lock taken, but drmc doesn't, but we've no indication when we hit drmb
> of what the context pre entering the DC layer is. This causes all
> kinds of problems. The main requirement is the driver maintains the
> execution flow as much as possible. The only callback behaviour should
> be from an irq or workqueue type situations where you've handed
> execution flow to the hardware to do something and it is getting back
> to you. The pattern we use to get our of this sort of hole is helper
> libraries, we structure code as much as possible as leaf nodes that
> don't call back into the parents if we can avoid it (we don't always
> succeed).
>

Is that the reason for using ww_mutex in atomic?

> So the above might becomes
>    drma-> dca_helper
>            -> dcb_helper
>            -> drmb.
>
> In this case the code flow is controlled by drma, dca/dcb might be
> modifying data or setting hw state but when we get to drmb it's easy
> to see what data is needs and what locking.
>

This actually looks pretty close to

drm_atomic_commit
	-> amdgpu_dm_atomic_commit
	-> dc_commit_targets > dce110_apply_ctx_to_hw
	-> apply_single_controller_ctx_to_hw
	-> core_link_enable_stream > allocate_mst_payload
	-> dm_helpers_dp_mst_write_payload_allocation_table
		-> drm_dp_update_payload_part1

though the latter is a bit more complex.

> DAL/DC goes against this in so many ways, and when I look at the code
> I'm never sure where to even start pulling the thread to unravel it.
>

There's a lot of code there but that doesn't mean it's needlessly 
complex. We're definitely open for suggestions on how to simplify this, 
ideally without breaking existing functionality.

> Some questions I have for AMD engineers that also I'd want to see
> addressed before any consideration of merging would happen!
>
> How do you plan on dealing with people rewriting or removing code
> upstream that is redundant in the kernel, but required for internal
> stuff?

There's already a bunch of stuff in our internal trees that never make 
it into open-source trees, for various reasons. We guard those with an 
#ifdef and strip them when preparing code for open source. It shouldn't 
be a big deal to deal with code removed upstream in similar ways.

Rewritten code would have to be looked at on a case by case basis. DC 
code is fully validated in many different configurations and is used for 
ASIC bringup when we can sit next to HW guys to work out complex issues. 
Modifying the code in a way that can't be shared would mean that all 
this validation is lost. Some of the bugs we're talking about are 
non-trivial and will show up only if HW is programmed in a certain way 
(e.g. Linux code leaves out some power-saving feature, causing HW to 
hang in weird scenarios).

> How are you going to deal with new Linux things that overlap
> incompatibly with your internally developed stuff?

Do you have examples?

If we're talking about stuff like MST, atomic, FreeSync, HDR... we're 
generally the first to the game and would love to be working with the 
community to push those out.

> If the code is upstream will it be tested in the kernel by some QA
> group, or will there be some CI infrastructure used to maintain and to
> watch for Linux code that breaks assumptions in the DC code?

I think Alex is working on getting our internal tree onto a rolling tip 
of drm-next (or nearly there). Once we got this we'll be switching our 
existing builds and manual (no automated yet) testing onto. We're 
currently building daily and with each DC commit and doing testing of a 
basic feature matrix at least every second day.

> Can you show me you understand that upstream code is no longer 100% in
> your control and things can happen to it that you might not expect and
> you need to deal with it?
>

I think this is the big question. I would love to let other AMDers chime 
in on this.

Harry

> Dave.
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]   ` <20161212072243.ah6sy3q57z4gimka-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
  2016-12-12  7:54     ` Bridgman, John
@ 2016-12-13  2:05     ` Harry Wentland
       [not found]       ` <2032d12b-f675-eb25-33bf-3aa0fcd20cb3-5C7GfCeVMHo@public.gmane.org>
  1 sibling, 1 reply; 66+ messages in thread
From: Harry Wentland @ 2016-12-13  2:05 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Grodzovsky, Andrey, Dave Airlie,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Deucher, Alexander,
	Cheng, Tony


On 2016-12-12 02:22 AM, Daniel Vetter wrote:
> On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
>> Current version of DC:
>>
>>  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>>
>> Once Alex pulls in the latest patches:
>>
>>  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>
> One more: That 4.7 here is going to be unbelievable amounts of pain for
> you. Yes it's a totally sensible idea to just freeze your baseline kernel
> because then linux looks a lot more like Windows where the driver abi is
> frozen. But it makes following upstream entirely impossible, because
> rebasing is always a pain and hence postponed. Which means you can't just
> use the latest stuff in upstream drm, which means collaboration with
> others and sharing bugfixes in core is a lot more pain, which then means
> you do more than necessary in your own code and results in HALs like DAL,
> perpetuating the entire mess.
>
> So I think you don't just need to demidlayer DAL/DC, you also need to
> demidlayer your development process. In our experience here at Intel that
> needs continuous integration testing (in drm-tip), because even 1 month of
> not resyncing with drm-next is sometimes way too long. See e.g. the
> controlD regression we just had. And DAL is stuck on a 1 year old kernel,
> so pretty much only of historical significance and otherwise dead code.
>
> And then for any stuff which isn't upstream yet (like your internal
> enabling, or DAL here, or our own internal enabling) you need continuous
> rebasing&re-validation. When we started doing this years ago it was still
> manually, but we still rebased like every few days to keep the pain down
> and adjust continuously to upstream evolution. But then going to a
> continous rebase bot that sends you mail when something goes wrong was
> again a massive improvement.
>

I think we've seen that pain already but haven't quite realized how much 
of it is due to a mismatch in kernel trees. We're trying to move onto a 
tree following drm-next much more closely. I'd love to help automate 
some of that (time permitting). Would the drm-misc scripts be of any use 
with that? I only had a very cursory glance at those.

> I guess in the end Conway's law that your software architecture
> necessarily reflects how you organize your teams applies again. Fix your
> process and it'll become glaringly obvious to everyone involved that
> DC-the-design as-is is entirely unworkeable and how it needs to be fixed.
>
> From my own experience over the past few years: Doing that is a fun
> journey ;-)
>

Absolutely. We're only at the start of this but have learned a lot from 
the community (maybe others in the DC team disagree with me somewhat).

Not sure if I fully agree that this means that DC-the-design-as-is will 
become apparent as unworkable... There are definitely pieces to be 
cleaned here and lessons learned from the DRM community but on the other 
hand we feel there are some good reasons behind our approach that we'd 
like to share with the community (some of which I'm learning myself).

Harry

> Cheers, Daniel
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]   ` <20161211202827.cif3jnbuouay6xyz-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
@ 2016-12-13  2:33     ` Harry Wentland
       [not found]       ` <b64d0072-4909-c680-2f09-adae9f856642-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 66+ messages in thread
From: Harry Wentland @ 2016-12-13  2:33 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Grodzovsky, Andrey, Dave Airlie,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Deucher, Alexander,
	Cheng, Tony

On 2016-12-11 03:28 PM, Daniel Vetter wrote:
> On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
>> We propose to use the Display Core (DC) driver for display support on
>> AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to
>> avoid a flag day the plan is to only support uGPU initially and transition
>> to older ASICs gradually.
>
> Bridgeman brought it up a few times that this here was the question - it's
> kinda missing a question mark, hard to figure this out ;-). I'd say for

My bad for the missing question mark (imprecise phrasing). On the other 
hand letting this blow over a bit helped get us on the map a bit more 
and allows us to argue the challenges (and benefits) of open source. :)

> upstream it doesn't really matter, but imo having both atomic and
> non-atomic paths in one driver is one world of hurt and I strongly
> recommend against it, at least if feasible. All drivers that switched
> switched in one go, the only exception was i915 (it took much longer than
> we ever feared, causing lots of pain) and nouveau (which only converted
> nv50+, but pre/post-nv50 have always been two almost completely separate
> worlds anyway).
>

You mention the two probably most complex DRM drivers didn't switch in a 
single go...  I imagine amdgpu/DC falls into the same category.

I think one of the problems is making a sudden change with a fully 
validated driver without breaking existing use cases and customers. We 
really should've started DC development in public and probably would do 
that if we had to start anew.

>> The DC component has received extensive testing within AMD for DCE8, 10, and
>> 11 GPUs and is being prepared for uGPU. Support should be better than
>> amdgpu's current display support.
>>
>>  * All of our QA effort is focused on DC
>>  * All of our CQE effort is focused on DC
>>  * All of our OEM preloads and custom engagements use DC
>>  * DC behavior mirrors what we do for other OSes
>>
>> The new asic utilizes a completely re-designed atom interface, so we cannot
>> easily leverage much of the existing atom-based code.
>>
>> We've introduced DC to the community earlier in 2016 and received a fair
>> amount of feedback. Some of what we've addressed so far are:
>>
>>  * Self-contain ASIC specific code. We did a bunch of work to pull
>>    common sequences into dc/dce and leave ASIC specific code in
>>    separate folders.
>>  * Started to expose AUX and I2C through generic kernel/drm
>>    functionality and are mostly using that. Some of that code is still
>>    needlessly convoluted. This cleanup is in progress.
>>  * Integrated Dave and Jerome’s work on removing abstraction in bios
>>    parser.
>>  * Retire adapter service and asic capability
>>  * Remove some abstraction in GPIO
>>
>> Since a lot of our code is shared with pre- and post-silicon validation
>> suites changes need to be done gradually to prevent breakages due to a major
>> flag day.  This, coupled with adding support for new asics and lots of new
>> feature introductions means progress has not been as quick as we would have
>> liked. We have made a lot of progress none the less.
>>
>> The remaining concerns that were brought up during the last review that we
>> are working on addressing:
>>
>>  * Continue to cleanup and reduce the abstractions in DC where it
>>    makes sense.
>>  * Removing duplicate code in I2C and AUX as we transition to using the
>>    DRM core interfaces.  We can't fully transition until we've helped
>>    fill in the gaps in the drm core that we need for certain features.
>>  * Making sure Atomic API support is correct.  Some of the semantics of
>>    the Atomic API were not particularly clear when we started this,
>>    however, that is improving a lot as the core drm documentation
>>    improves.  Getting this code upstream and in the hands of more
>>    atomic users will further help us identify and rectify any gaps we
>>    have.
>
> Ok so I guess Dave is typing some more general comments about
> demidlayering, let me type some guidelines about atomic. Hopefully this
> all materializes itself a bit better into improved upstream docs, but meh.
>

Excellent writeup. Let us know when/if you want our review for upstream 
docs.

We'll have to really take some time to go over our atomic 
implementation. A couple small comments below with regard to DC.

> Step 0: Prep
>
> So atomic is transactional, but it's not validate + rollback or commit,
> but duplicate state, validate and then either throw away or commit.
> There's a few big reasons for this: a) partial atomic updates - if you
> duplicate it's much easier to check that you have all the right locks b)
> kfree() is much easier to check for correctness than a rollback code and
> c) atomic_check functions are much easier to audit for invalid changes to
> persistent state.
>

There isn't really any rollback. I believe even in our other drivers 
we've abandoned the rollback approach years ago because it doesn't 
really work on modern HW. Any rollback cases you might find in DC should 
really only be for catastrophic errors (read: something went horribly 
wrong... read: congratulations, you just found a bug).

> Trouble is that this seems a bit unusual compared to all other approaches,
> and ime (from the drawn-out i915 conversion) you really don't want to mix
> things up. Ofc for private state you can roll back (e.g. vc4 does that for
> the drm_mm allocator thing for scanout slots or whatever it is), but it's
> trivial easy to accidentally check the wrong state or mix them up or
> something else bad.
>
> Long story short, I think step 0 for DC is to split state from objects,
> i.e. for each dc_surface/foo/bar you need a dc_surface/foo/bar_state. And
> all the back-end functions need to take both the object and the state
> explicitly.
>
> This is a bit a pain to do, but should be pretty much just mechanical. And
> imo not all of it needs to happen before DC lands in upstream, but see
> above imo that half-converted state is postively horrible. This should
> also not harm cross-os reuse at all, you can still store things together
> on os where that makes sense.
>
> Guidelines for amdgpu atomic structures
>
> drm atomic stores everything in state structs on plane/connector/crtc.
> This includes any property extensions or anything else really, the entire
> userspace abi is built on top of this. Non-trivial drivers are supposed to
> subclass these to store their own stuff, so e.g.
>
> amdgpu_plane_state {
> 	struct drm_plane_state base;
>
> 	/* amdgpu glue state and stuff that's linux-specific, e.g.
> 	 * property values and similar things. Note that there's strong
> 	 * push towards standardizing properties and stroing them in the
> 	 * drm_*_state structs. */
>
> 	struct dc_surface_state surface_state;
>
> 	/* other dc states that fit to a plane */
> };
>
> Yes not everything will fit 1:1 in one of these, but to get started I
> strongly recommend to make them fit (maybe with reduced feature sets to
> start out). Stuff that is shared between e.g. planes, but always on the
> same crtc can be put into amdgpu_crtc_state, e.g. if you have scalers that
> are assignable to a plane.
>
> Of course atomic also supports truly global resources, for that you need
> to subclass drm_atomic_state. Currently msm and i915 do that, and probably
> best to read those structures as examples until I've typed the docs. But I
> expect that especially for planes a few dc_*_state structs will stay in
> amdgpu_*_state.
>
> Guidelines for atomic_check
>
> Please use the helpers as much as makes sense, and put at least the basic
> steps that from drm_*_state into the respective dc_*_state functional
> block into the helper callbacks for that object. I think basic validation
> of individal bits (as much as possible, e.g. if you just don't support
> e.g. scaling or rotation with certain pixel formats) should happen in
> there too. That way when we e.g. want to check how drivers corrently
> validate a given set of properties to be able to more strictly define the
> semantics, that code is easy to find.
>
> Also I expect that this won't result in code duplication with other OS,
> you need code to map from drm to dc anyway, might as well check&reject the
> stuff that dc can't even represent right there.
>
> The other reason is that the helpers are good guidelines for some of the
> semantics, e.g. it's mandatory that drm_crtc_needs_modeset gives the right
> answer after atomic_check. If it doesn't, then you're driver doesn't
> follow atomic. If you completely roll your own this becomes much harder to
> assure.
>

Interesting point. Not sure if we've checked that. Is there some sort of 
automated test for this that we can use to check?

> Of course extend it all however you want, e.g. by adding all the global
> optimization and resource assignment stuff after initial per-object
> checking has been done using the helper infrastructure.
>
> Guidelines for atomic_commit
>
> Use the new nonblcoking helpers. Everyone who didn't got it wrong. Also,

I believe we're not using those and didn't start with those which might 
explain (along with lack of discussion on dri-devel) why atomic 
currently looks the way it does in DC. This is definitely one of the 
bigger issues we'd want to clean up and where you wouldn't find much 
pushback, other than us trying to find time to do it.

> your atomic_commit should pretty much match the helper one, except for a
> custom swap_state to handle all your globally shared specia dc_*_state
> objects. Everything hw specific should be in atomic_commit_tail.
>
> Wrt the hw commit itself, for the modeset step just roll your own. That's
> the entire point of atomic, and atm both i915 and nouveau exploit this
> fully. Besides a bit of glue there shouldn't be much need for
> linux-specific code here - what you need is something to fish the right
> dc_*_state objects and give it your main sequencer functions. What you
> should make sure though is that only ever do a modeset when that was
> signalled, i.e. please use drm_crtc_needs_modeset to control that part.
> Feel free to wrap up in a dc_*_needs_modeset for better abstraction if
> that's needed.
>
> I do strongly suggest however that you implement the plane commit using
> the helpers. There's really only a few ways to implement this in the hw,
> and it should work everywhere.
>
> Misc guidelines
>
> Use the suspend/resume helpers. If your atomic can't do that, it's not
> terribly good. Also, if DC can't make those fit, it's probably still too
> much midlayer and its own world than helper library.
>

Do they handle swapping DP displays while the system is asleep? If not 
we'll probably need to add that. The other case where we have some 
special handling has to do with headless (sleep or resume, don't remember).

> Use all the legacy helpers, again your atomic should be able to pull it
> off. One exception is async plane flips (both primary and cursors), that's
> atm still unsolved. Probably best to keep the old code around for just
> that case (but redirect to the compat helpers for everything), see e.g.
> how vc4 implements cursors.
>

Good old flip. There probably isn't much shareable code between OSes 
here. It seems like every OS rolls there own thing, regarding flips. We 
still seem to be revisiting flips regularly, especially with FreeSync 
(adaptive sync) in the mix now. Good to know that this is still a bit of 
an open topic.

> Most imporant of all
>
> Ask questions on #dri-devel. amdgpu atomic is the only nontrivial atomic
> driver for which I don't remember a single discussion about some detail,
> at least not with any of the DAL folks. Michel&Alex asked some questions
> sometimes, but that indirection is bonghits and the defeats the point of
> upstream: Direct cross-vendor collaboration to get shit done. Please make
> it happen.
>

Please keep asking us to get on dri-devel with questions. I need to get 
into the habit again of leaving the IRC channel open. I think most of us 
are still a bit scared of it or don't know how to deal with some of the 
information overload (IRC and mailing list). It's some of my job to 
change that all the while I'm learning this myself. :)

Thanks for all your effort trying to get people involved.

> Oh and I pretty much assume Harry&Tony are volunteered to review atomic
> docs ;-)
>

Sure.

Cheers,
Harry

> Cheers, Daniel
>
>
>
>>
>> Unfortunately we cannot expose code for uGPU yet. However refactor / cleanup
>> work on DC is public.  We're currently transitioning to a public patch
>> review. You can follow our progress on the amd-gfx mailing list. We value
>> community feedback on our work.
>>
>> As an appendix I've included a brief overview of the how the code currently
>> works to make understanding and reviewing the code easier.
>>
>> Prior discussions on DC:
>>
>>  * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html
>>  *
>> https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html
>>
>> Current version of DC:
>>
>>  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>>
>> Once Alex pulls in the latest patches:
>>
>>  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>>
>> Best Regards,
>> Harry
>>
>>
>> ************************************************
>> *** Appendix: A Day in the Life of a Modeset ***
>> ************************************************
>>
>> Below is a high-level overview of a modeset with dc. Some of this might be a
>> little out-of-date since it's based on my XDC presentation but it should be
>> more-or-less the same.
>>
>> amdgpu_dm_atomic_commit()
>> {
>>   /* setup atomic state */
>>   drm_atomic_helper_prepare_planes(dev, state);
>>   drm_atomic_helper_swap_state(dev, state);
>>   drm_atomic_helper_update_legacy_modeset_state(dev, state);
>>
>>   /* create or remove targets */
>>
>>   /********************************************************************
>>    * *** Call into DC to commit targets with list of all known targets
>>    ********************************************************************/
>>   /* DC is optimized not to do anything if 'targets' didn't change. */
>>   dc_commit_targets(dm->dc, commit_targets, commit_targets_count)
>>   {
>>     /******************************************************************
>>      * *** Build context (function also used for validation)
>>      ******************************************************************/
>>     result = core_dc->res_pool->funcs->validate_with_context(
>>                                core_dc,set,target_count,context);
>>
>>     /******************************************************************
>>      * *** Apply safe power state
>>      ******************************************************************/
>>     pplib_apply_safe_state(core_dc);
>>
>>     /****************************************************************
>>      * *** Apply the context to HW (program HW)
>>      ****************************************************************/
>>     result = core_dc->hwss.apply_ctx_to_hw(core_dc,context)
>>     {
>>       /* reset pipes that need reprogramming */
>>       /* disable pipe power gating */
>>       /* set safe watermarks */
>>
>>       /* for all pipes with an attached stream */
>>         /************************************************************
>>          * *** Programming all per-pipe contexts
>>          ************************************************************/
>>         status = apply_single_controller_ctx_to_hw(...)
>>         {
>>           pipe_ctx->tg->funcs->set_blank(...);
>>           pipe_ctx->clock_source->funcs->program_pix_clk(...);
>>           pipe_ctx->tg->funcs->program_timing(...);
>>           pipe_ctx->mi->funcs->allocate_mem_input(...);
>>           pipe_ctx->tg->funcs->enable_crtc(...);
>>           bios_parser_crtc_source_select(...);
>>
>>           pipe_ctx->opp->funcs->opp_set_dyn_expansion(...);
>>           pipe_ctx->opp->funcs->opp_program_fmt(...);
>>
>>           stream->sink->link->link_enc->funcs->setup(...);
>>           pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...);
>>           pipe_ctx->tg->funcs->set_blank_color(...);
>>
>>           core_link_enable_stream(pipe_ctx);
>>           unblank_stream(pipe_ctx,
>>
>>           program_scaler(dc, pipe_ctx);
>>         }
>>       /* program audio for all pipes */
>>       /* update watermarks */
>>     }
>>
>>     program_timing_sync(core_dc, context);
>>     /* for all targets */
>>       target_enable_memory_requests(...);
>>
>>     /* Update ASIC power states */
>>     pplib_apply_display_requirements(...);
>>
>>     /* update surface or page flip */
>>   }
>> }
>>
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-12  2:57   ` Dave Airlie
  2016-12-12  7:09     ` Daniel Vetter
       [not found]     ` <CAPM=9tx+j9-3fZNY=peLjdsVqyLS6i3V-sV3XrnYsK2YuhWRBA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-12-13  2:52     ` Cheng, Tony
       [not found]       ` <5a1f2762-f1e0-05f1-3c16-173cb1f46571-5C7GfCeVMHo@public.gmane.org>
  2016-12-13  9:40       ` Lukas Wunner
  2 siblings, 2 replies; 66+ messages in thread
From: Cheng, Tony @ 2016-12-13  2:52 UTC (permalink / raw)
  To: Dave Airlie, Harry Wentland
  Cc: Grodzovsky, Andrey, amd-gfx mailing list, dri-devel, Deucher, Alexander


[-- Attachment #1.1: Type: text/plain, Size: 17244 bytes --]



On 12/11/2016 9:57 PM, Dave Airlie wrote:
> On 8 December 2016 at 12:02, Harry Wentland<harry.wentland@amd.com>  wrote:
>> We propose to use the Display Core (DC) driver for display support on
>> AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to
>> avoid a flag day the plan is to only support uGPU initially and transition
>> to older ASICs gradually.
> [FAQ: from past few days]
>
> 1) Hey you replied to Daniel, you never addressed the points of the RFC!
> I've read it being said that I hadn't addressed the RFC, and you know
> I've realised I actually had, because the RFC is great but it
> presupposes the codebase as designed can get upstream eventually, and
> I don't think it can. The code is too littered with midlayering and
> other problems, that actually addressing the individual points of the
> RFC would be missing the main point I'm trying to make.
>
> This code needs rewriting, not cleaning, not polishing, it needs to be
> split into its constituent parts, and reintegrated in a form more
> Linux process friendly.
>
> I feel that if I reply to the individual points Harry has raised in
> this RFC, that it means the code would then be suitable for merging,
> which it still won't, and I don't want people wasting another 6
> months.
>
> If DC was ready for the next-gen GPU it would be ready for the current
> GPU, it's not the specific ASIC code that is the problem, it's the
> huge midlayer sitting in the middle.

We would love to upstream DC for all supported asic!  We made enough 
change to make Sea Island work but it's really not validate to the 
extend we validate Polaris on linux and no where close to what we do for 
2017 ASICs.  With DC the display hardware programming, resource 
optimization, power management and interaction with rest of system will 
be fully validated across multiple OSs.  Therefore we have high 
confidence that the quality is going to better than what we have 
upstreammed today.

I don't have a baseline to say if DC is in good enough quality for older 
generation compare to upstream.  For example we don't have HW generate 
bandwidth_calc for DCE 8/10 (Sea/Vocanic island family) but our code is 
structured in a way that we assume bandwidth_calc is there.  None of us 
feel like go untangle the formulas in windows driver at this point to 
create our own version of bandwidth_calc. It sort of work with HW 
default values but some mode / config is likely to underflows.  If 
community is okay with uncertain quality, sure we would love to upstream 
everything to reduce our maintaince overhead.  You do get audio with DC 
on DCE8 though.

> 2) We really need to share all of this code between OSes, why does
> Linux not want it?
>
> Sharing code is a laudable goal and I appreciate the resourcing
> constraints that led us to the point at which we find ourselves, but
> the way forward involves finding resources to upstream this code,
> dedicated people (even one person) who can spend time on a day by day
> basis talking to people in the open and working upstream, improving
> other pieces of the drm as they go, reading atomic patches and
> reviewing them, and can incrementally build the DC experience on top
> of the Linux kernel infrastructure. Then having the corresponding
> changes in the DC codebase happen internally to correspond to how the
> kernel code ends up looking. Lots of this code overlaps with stuff the
> drm already does, lots of is stuff the drm should be doing, so patches
> to the drm should be sent instead.

Maybe let me share what we are doing and see if we can come up with 
something to make DC work for both upstream and our internal need. We 
are sharing code not just on Linux and we will do our best to make our 
code upstream friendly.  Last year we focussed on having enough code to 
prove that our DAL rewrite works and get more people contributing to 
it.  We rush a bit as a result we had a few legacy component we port 
from Windows driver and we know it's bloat that needed to go.

We designed DC so HW can contribute bandwidth_calc magic and psuedo code 
to program the HW blocks.  The HW blocks on the bottom of DC.JPG in 
models our HW blocks and the programming sequence are provided by HW 
engineers.  If a piece of HW need a bit toggled 7 times during power up 
I rather have HW engineer put that in their psedo code rather than me 
trying to find that sequence in some document.  Afterall they did 
simulate the HW with the toggle sequence.  I guess these are back-end 
code Daniel talked about.  Can we agree that DRM core is not interested 
in how things are done in that layer and we can upstream these as it?

The next is dce_hwseq.c to program the HW blocks in correct sequence.  
Some HW block can be programmed in any sequence, but some requires 
strict sequence to be followed.  For example Display CLK and PHY CLK 
need to be up before we enable timing generator.  I would like these 
sequence to remain in DC as it's really not DRM's business to know how 
to program the HW.  In a way you can consider hwseq as a helper to 
commit state to HW.

Above hwseq is the dce*_resource.c.  It's job is to come up with the HW 
state required to realize given config.  For example we would use the 
exact same HW resources with same optimization setting to drive any same 
given config.  If 4 x 4k@60 is supported with resource setting A on HW 
diagnositc suite during bring up setting B on Linux then we have a 
problem.  It know which HW block work with which block and their 
capability and limitations.  I hope you are not asking this stuff to 
move up to core because in reality we should probably hide this in some 
FW, as HW expose the register to config them differently that doesn't 
mean all combination of HW usage is validated.   To me resource is more 
of a helper to put together functional pipeline and does not make any 
decision that any OS might be interested in.

These yellow boxes in DC.JPG are really specific to each generation of 
HW and changes frequently.  These are things that HW has consider hiding 
it in FW before.  Can we agree on those code (under /dc/dce*) can stay?

DAL3.JPG shows how we put this all together.  The core part is design to 
behave like helper, except we try to limit the entry point and opted for 
caller to build desire state we want DC to commit to. It didn't make 
sense for us to expose hundred of function (our windows dal interface 
did) and require caller to call these helpers in correct sequence.  
Caller builds absolute state it want to get to and DC will make it 
happen with the HW available.

> 3) Then how do we upstream it?
> Resource(s) need(s) to start concentrating at splitting this thing up
> and using portions of it in the upstream kernel. We don't land fully
> formed code in the kernel if we can avoid it. Because you can't review
> the ideas and structure as easy as when someone builds up code in
> chunks and actually develops in the Linux kernel. This has always
> produced better more maintainable code. Maybe the result will end up
> improving the AMD codebase as well.
Is this about demonstration how basic functionality work and add more 
features with series of patches to make review eaiser?  If so I don't 
think we are staff to do this kind of rewrite.  For example it make no 
sense to hooking up bandwidth_calc to calculate HW magic if we don't 
have mem_input to program the memory settings.  We need portion of 
hw_seq to ensure these blocks are programming in correct sequence.  We 
will need to feed bandwidth_calc it's required inputs, which is 
basically the whole system state tracked in validate_context today, 
which means we basically need big bulk of resource.c.  This effort might 
have benefit in reviewing the code, but we will end up with pretty much 
similar if not the same as what we already have.

Or is the objection that we have the white boxes in DC.JPG instead of 
using DRM objects?  We can probably workout something to have the white 
boxes derive from DRM objects and extend atomic state with our 
validate_context where dce*_resource.c stores the constructed pipelines.

> 4) Why can't we put this in staging?
> People have also mentioned staging, Daniel has called it a dead end,
> I'd have considered staging for this code base, and I still might.
> However staging has rules, and the main one is code in staging needs a
> TODO list, and agreed criteria for exiting staging, I don't think we'd
> be able to get an agreement on what the TODO list should contain and
> how we'd ever get all things on it done. If this code ended up in
> staging, it would most likely require someone dedicated to recreating
> it in the mainline driver in an incremental fashion, and I don't see
> that resource being available.
>
> 5) Why is a midlayer bad?
> I'm not going to go into specifics on the DC midlayer, but we abhor
> midlayers for a fair few reasons. The main reason I find causes the
> most issues is locking. When you have breaks in code flow between
> multiple layers, but having layers calling back into previous layers
> it becomes near impossible to track who owns the locking and what the
> current locking state is.
>
> Consider
>      drma -> dca -> dcb -> drmb
>      drmc -> dcc  -> dcb -> drmb
>
> We have two codes paths that go back into drmb, now maybe drma has a
> lock taken, but drmc doesn't, but we've no indication when we hit drmb
> of what the context pre entering the DC layer is. This causes all
> kinds of problems. The main requirement is the driver maintains the
> execution flow as much as possible. The only callback behaviour should
> be from an irq or workqueue type situations where you've handed
> execution flow to the hardware to do something and it is getting back
> to you. The pattern we use to get our of this sort of hole is helper
> libraries, we structure code as much as possible as leaf nodes that
> don't call back into the parents if we can avoid it (we don't always
> succeed).
Okay.  by the way DC does behave like a helper for most part.  There is 
no locking in DC.  We work enough with different OS to know they all 
have different synchronization primatives and interrupt handling and 
have DC lock anything is just shooting ourself in the foot.  We do have 
function with lock in their function name in DC but those are HW 
register lock to ensure that the HW register update atomically. ie have 
50 register write latch in HW at next vsync to ensure HW state change on 
vsync boundary.

> So the above might becomes
>     drma-> dca_helper
>             -> dcb_helper
>             -> drmb.
>
> In this case the code flow is controlled by drma, dca/dcb might be
> modifying data or setting hw state but when we get to drmb it's easy
> to see what data is needs and what locking.
>
> DAL/DC goes against this in so many ways, and when I look at the code
> I'm never sure where to even start pulling the thread to unravel it.
I don't know where we go against it.  In the case we do callback to DRM 
for MST case we have

amdgpu_dm_atomic_commit (implement atomic_commit)
dc_commit_targets (commit helper)
dce110_apply_ctx_to_hw (hw_seq)
core_link_enable_stream (part of MST enable sequence)
allocate_mst_payload (helper for above func in same file)
dm_helpers_dp_mst_write_payload_allocation_table (glue code to call DRM)
drm_dp_mst_allocate_vcpi (DRM)

As you see even in this case we are only 6 level deep before we callback 
to DRM, and 2 of those functions are in same file as helper func of the 
bigger sequence.

Can you clarify the distinction between what you would call a mid layer 
vs helper.  We consulted Alex a lot and we know about this inversion of 
control pattern and we are trying our best to do it. Is it the way 
functions are named and files folder structure?  Would it help if we 
flatten amdgpu_dm_atomic_commit and dc_commit_targets?  Even if we do I 
would imagine we want some helper in commit rather a giant 1000 line 
function.  Is there any concern that we put dc_commit_targets under /dc 
folder as we want other platform to run exact same helper?  Or this is 
about the state dc_commit_targets is too big?  or the state is stored 
validate_context rather than drm_atomic_state?

I don't think it make sense for DRM to get into how we decide to use our 
HW blocks.  For example any refactor done in core should not result in 
us using different pipeline to drive the same config.  We would like to 
have control over how our HW pipeline is constructed.

> Some questions I have for AMD engineers that also I'd want to see
> addressed before any consideration of merging would happen!
>
> How do you plan on dealing with people rewriting or removing code
> upstream that is redundant in the kernel, but required for internal
> stuff?

Honestly I don't know what these are.  Like you and Jerome remove func 
ptr abstraction (I know it was bad, that was one of the component we 
ported from windows) and we need to keep it as function pointer so we 
can still run our code on FPGA before we see first silicon?  I don't 
think if we nak the function ptr removal will be a problem for 
community.  The rest is valued and we took with open arm.

Or this is more like we have code duplication after DRM added some 
functionality we can use?  I would imaging its more of moving what we 
got working in our code to DRM core if we are upstreamed and we have no 
problem accomodate for that as the code moved out to DRM core can be 
included in other platforms.  We don't have any private ioctl today and 
we don't plan to have any outside of using DRM object properties.

> How are you going to deal with new Linux things that overlap
> incompatibly with your internally developed stuff?
I really don't know what those new linux things can be that could cause 
us problem.  If anything the new things will be probably come from us if 
we are upstreammed.

atomic: we had that on windows 8 years ago for windows vista, yes 
sematic/abstraction is different but concept is the same. We could have 
easily settled with DRM semantics or DRM could easily take some form of 
our pattern.

DP MST:  AMD was the first source certified and we work closely with the 
first branch certified. I was a part of that team and we had a very 
solid implementation.  If we were upstreamed I don't see you would want 
to reinvent the wheel and not try to massage what we have into shape for 
DRM core for other driver to reuse.

drm_plane: windows multi-plane overlay and Andriod HW composer? We had 
that working 2 years ago.  If you are upstreammed and you are first you 
usually have a say in how it should go down don't you?

The new thing coming are Free Sync HDR, 8k@60 with DP DSC etc.  I would 
imaging we would beat all other vendor to the first open source solution 
if we leverage effort from our extended display team.

> If the code is upstream will it be tested in the kernel by some QA
> group, or will there be some CI infrastructure used to maintain and to
> watch for Linux code that breaks assumptions in the DC code?
We have tester that runs set of display test every other day on linux.  
We don't run on DRM_Next tree yet and Alex is working out a plan to 
allow us use DRM_Next as our development branch.  Upstream is not likely 
to be tested by QA though.

DC does not assume anything.  DC require full state given in 
dc_commit_targets / dc_commit_surfaces_to_target.  we do whatever is 
specified in the data structure.  dc_commit_surfaces_to_target can be 
considered as a helper function to change plane without visual side 
effect on vsync bondary.  dc_commit_targets can be considered as a 
helper function to light up a display with black screen.  DRM core has 
full control if you want to light up to black screen as soon as monitor 
is plugged in or you want to light up after someone does a mode set.  
Hotplug interrupt goes to amdgpu_dm, and it will take the require lock 
in DRM object because calling DC to detect.
> Can you show me you understand that upstream code is no longer 100% in
> your control and things can happen to it that you might not expect and
> you need to deal with it?
I think so, other than we haven't been spanning the mailing list. We 
already dealing with we don't control 100% our code to some extend.  We 
don't control bandwidth_calc.  Trust me we are not keeping up with the 
updates that HW is doing with it for next gen hw.  Everytime we pull 
there is a new term they added and we have to find a way to feed that 
input.  We had to clean up linux style for them everytime we pull.  Our 
HW diagnostic suite has different set of requirements and they 
frequently contribute to our code.  We took you and Jerome's patch.  If 
it's validated we want that code.

At end of the day I think the architecture is really about what's HW and 
what's DRM core.  Like I said all the yellow boxes has been proposed to 
running on firmware but we decide to keep them in driver as it's easier 
to debug on x86 than uC.  I can tell you that our HW guys were happy 
when I decide to open source bandwidth_calc but we did it anyways.  I 
feel like because we are opening up the complexity and inner working of 
our HW, we are somehow getting penalized for being open.
>
> Dave.
Tony

[-- Attachment #1.2: Type: text/html, Size: 20118 bytes --]

[-- Attachment #2: DC.JPG --]
[-- Type: image/jpeg, Size: 147686 bytes --]

[-- Attachment #3: DAL3.JPG --]
[-- Type: image/jpeg, Size: 117556 bytes --]

[-- Attachment #4: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]       ` <b64d0072-4909-c680-2f09-adae9f856642-5C7GfCeVMHo@public.gmane.org>
@ 2016-12-13  4:10         ` Cheng, Tony
  2016-12-13  7:50           ` Daniel Vetter
       [not found]           ` <3219f6f2-080e-0c77-45bb-cf59aa5b5858-5C7GfCeVMHo@public.gmane.org>
  2016-12-13  7:31         ` Daniel Vetter
  2016-12-13 10:09         ` Ernst Sjöstrand
  2 siblings, 2 replies; 66+ messages in thread
From: Cheng, Tony @ 2016-12-13  4:10 UTC (permalink / raw)
  To: Harry Wentland, Daniel Vetter
  Cc: Deucher, Alexander, Grodzovsky, Andrey, Dave Airlie,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

[-- Attachment #1: Type: text/plain, Size: 22899 bytes --]

Thanks for the write up for the guide.  We can definitely re-do atomic 
according to guideline provided as I am not satified with how our code 
look today.  To me it seems more like we need to shuffle stuff around 
and rename a few things than rewrite much of anything.

I hope to get an answer on the reply to Dave's question regarding to if 
there is anything else.  If we can keep most of the stuff under /dc as 
the "back end" helper and do most of the change under /amdgpu_dm then it 
isn't that difficult as we don't need to go deal with the fall out on 
other platforms.  Again it's not just windows.  We are fully aware that 
it's hard to find the common abstraction between all different OS so we 
try our best to have DC behave more like a helper than abstraction layer 
anyways.  In our design states and policies are domain of Display 
Managers (DM) and because of linux we also say anything DRM can do 
that's also domain of DM.  We don't put anything in DC that we don't 
feel comfortable if HW decide to hide it in FW.


On 12/12/2016 9:33 PM, Harry Wentland wrote:
> On 2016-12-11 03:28 PM, Daniel Vetter wrote:
>> On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
>>> We propose to use the Display Core (DC) driver for display support on
>>> AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In 
>>> order to
>>> avoid a flag day the plan is to only support uGPU initially and 
>>> transition
>>> to older ASICs gradually.
>>
>> Bridgeman brought it up a few times that this here was the question - 
>> it's
>> kinda missing a question mark, hard to figure this out ;-). I'd say for
>
> My bad for the missing question mark (imprecise phrasing). On the 
> other hand letting this blow over a bit helped get us on the map a bit 
> more and allows us to argue the challenges (and benefits) of open 
> source. :)
>
>> upstream it doesn't really matter, but imo having both atomic and
>> non-atomic paths in one driver is one world of hurt and I strongly
>> recommend against it, at least if feasible. All drivers that switched
>> switched in one go, the only exception was i915 (it took much longer 
>> than
>> we ever feared, causing lots of pain) and nouveau (which only converted
>> nv50+, but pre/post-nv50 have always been two almost completely separate
>> worlds anyway).
>>
>
trust me we would like to upstream everything.  Just we didn't invest 
enough in DC code in the previous generation so the quality might not be 
there.

> You mention the two probably most complex DRM drivers didn't switch in 
> a single go...  I imagine amdgpu/DC falls into the same category.
>
> I think one of the problems is making a sudden change with a fully 
> validated driver without breaking existing use cases and customers. We 
> really should've started DC development in public and probably would 
> do that if we had to start anew.
>
>>> The DC component has received extensive testing within AMD for DCE8, 
>>> 10, and
>>> 11 GPUs and is being prepared for uGPU. Support should be better than
>>> amdgpu's current display support.
>>>
>>>  * All of our QA effort is focused on DC
>>>  * All of our CQE effort is focused on DC
>>>  * All of our OEM preloads and custom engagements use DC
>>>  * DC behavior mirrors what we do for other OSes
>>>
>>> The new asic utilizes a completely re-designed atom interface, so we 
>>> cannot
>>> easily leverage much of the existing atom-based code.
>>>
>>> We've introduced DC to the community earlier in 2016 and received a 
>>> fair
>>> amount of feedback. Some of what we've addressed so far are:
>>>
>>>  * Self-contain ASIC specific code. We did a bunch of work to pull
>>>    common sequences into dc/dce and leave ASIC specific code in
>>>    separate folders.
>>>  * Started to expose AUX and I2C through generic kernel/drm
>>>    functionality and are mostly using that. Some of that code is still
>>>    needlessly convoluted. This cleanup is in progress.
>>>  * Integrated Dave and Jerome’s work on removing abstraction in bios
>>>    parser.
>>>  * Retire adapter service and asic capability
>>>  * Remove some abstraction in GPIO
>>>
>>> Since a lot of our code is shared with pre- and post-silicon validation
>>> suites changes need to be done gradually to prevent breakages due to 
>>> a major
>>> flag day.  This, coupled with adding support for new asics and lots 
>>> of new
>>> feature introductions means progress has not been as quick as we 
>>> would have
>>> liked. We have made a lot of progress none the less.
>>>
>>> The remaining concerns that were brought up during the last review 
>>> that we
>>> are working on addressing:
>>>
>>>  * Continue to cleanup and reduce the abstractions in DC where it
>>>    makes sense.
>>>  * Removing duplicate code in I2C and AUX as we transition to using the
>>>    DRM core interfaces.  We can't fully transition until we've helped
>>>    fill in the gaps in the drm core that we need for certain features.
>>>  * Making sure Atomic API support is correct.  Some of the semantics of
>>>    the Atomic API were not particularly clear when we started this,
>>>    however, that is improving a lot as the core drm documentation
>>>    improves.  Getting this code upstream and in the hands of more
>>>    atomic users will further help us identify and rectify any gaps we
>>>    have.
>>
>> Ok so I guess Dave is typing some more general comments about
>> demidlayering, let me type some guidelines about atomic. Hopefully this
>> all materializes itself a bit better into improved upstream docs, but 
>> meh.
>>
>
> Excellent writeup. Let us know when/if you want our review for 
> upstream docs.
>
> We'll have to really take some time to go over our atomic 
> implementation. A couple small comments below with regard to DC.
>
>> Step 0: Prep
>>
>> So atomic is transactional, but it's not validate + rollback or commit,
>> but duplicate state, validate and then either throw away or commit.
>> There's a few big reasons for this: a) partial atomic updates - if you
>> duplicate it's much easier to check that you have all the right locks b)
>> kfree() is much easier to check for correctness than a rollback code and
>> c) atomic_check functions are much easier to audit for invalid 
>> changes to
>> persistent state.
>>
>
> There isn't really any rollback. I believe even in our other drivers 
> we've abandoned the rollback approach years ago because it doesn't 
> really work on modern HW. Any rollback cases you might find in DC 
> should really only be for catastrophic errors (read: something went 
> horribly wrong... read: congratulations, you just found a bug).
>
There is no rollback.  We moved to "atomic" for Windows Vista in the 
previous DAL 8 years ago.  Windows only care about VidPnSource (frame 
buffer) and VidPnTarget (display output) and leave the rest up to driver 
but we had to behave atomic as Window obsolutely "check" every possible 
config with the famous EnumConfunctionalModality DDI.

>> Trouble is that this seems a bit unusual compared to all other 
>> approaches,
>> and ime (from the drawn-out i915 conversion) you really don't want to 
>> mix
>> things up. Ofc for private state you can roll back (e.g. vc4 does 
>> that for
>> the drm_mm allocator thing for scanout slots or whatever it is), but 
>> it's
>> trivial easy to accidentally check the wrong state or mix them up or
>> something else bad.
>>
>> Long story short, I think step 0 for DC is to split state from objects,
>> i.e. for each dc_surface/foo/bar you need a dc_surface/foo/bar_state. 
>> And
>> all the back-end functions need to take both the object and the state
>> explicitly.
>>
>> This is a bit a pain to do, but should be pretty much just 
>> mechanical. And
>> imo not all of it needs to happen before DC lands in upstream, but see
>> above imo that half-converted state is postively horrible. This should
>> also not harm cross-os reuse at all, you can still store things together
>> on os where that makes sense.
>>
>> Guidelines for amdgpu atomic structures
>>
>> drm atomic stores everything in state structs on plane/connector/crtc.
>> This includes any property extensions or anything else really, the 
>> entire
>> userspace abi is built on top of this. Non-trivial drivers are 
>> supposed to
>> subclass these to store their own stuff, so e.g.
>>
>> amdgpu_plane_state {
>>     struct drm_plane_state base;
>>
>>     /* amdgpu glue state and stuff that's linux-specific, e.g.
>>      * property values and similar things. Note that there's strong
>>      * push towards standardizing properties and stroing them in the
>>      * drm_*_state structs. */
>>
>>     struct dc_surface_state surface_state;
>>
>>     /* other dc states that fit to a plane */
>> };
>>
Is there any requirement where the header and code that deal with 
dc_surface_state has to be?  Can we keep it under /dc while 
amdgpu_plane_state exist under /amdgpu_dm?
>> Yes not everything will fit 1:1 in one of these, but to get started I
>> strongly recommend to make them fit (maybe with reduced feature sets to
>> start out). Stuff that is shared between e.g. planes, but always on the
>> same crtc can be put into amdgpu_crtc_state, e.g. if you have scalers 
>> that
>> are assignable to a plane.
>>
>> Of course atomic also supports truly global resources, for that you need
>> to subclass drm_atomic_state. Currently msm and i915 do that, and 
>> probably
>> best to read those structures as examples until I've typed the docs. 
>> But I
>> expect that especially for planes a few dc_*_state structs will stay in
>> amdgpu_*_state.
>>
We need to treat most of resource that don't map well as global. One 
example is pixel pll.  We have 6 display pipes but only 2 or 3 plls in 
CI/VI, as a result we are limited in number of HDMI or DVI we can drive 
at the same time.  Also the pixel pll can be used to drive DP as well, 
so there is another layer of HW specific but we can't really contain it 
in crtc or encoder by itself.  Doing this resource allocation require 
knowlege of the whole system, and knowning which pixel pll is already 
used, and what can we support with remaining pll.

Another ask is lets say we are driving 2 displays, we would always want 
instance 0 and instance 1 of scaler, timing generator etc getting used.  
We want to avoid possiblity of due to different user mode commit 
sequence we end up with driving the 2 display with 0 and 2nd instance of 
HW.  Not only this configuration isn't really validated in the lab, we 
will be less effective in power gating as instance 0 and 1 are one the 
same tile.  instead of having 2/3 of processing pipeline silicon power 
gated we can only power gate 1/3. And if we power gate wrong the you 
will have 1 of the 2 display not lighting up.

Having HW resource used the same way on all platform under any sequence 
/ circumstance is important for us, as power optimization/measure is 
done for given platform + display config mostly on only 1 OS by the HW team.

>> Guidelines for atomic_check
>>
>> Please use the helpers as much as makes sense, and put at least the 
>> basic
>> steps that from drm_*_state into the respective dc_*_state functional
>> block into the helper callbacks for that object. I think basic 
>> validation
>> of individal bits (as much as possible, e.g. if you just don't support
>> e.g. scaling or rotation with certain pixel formats) should happen in
>> there too. That way when we e.g. want to check how drivers corrently
>> validate a given set of properties to be able to more strictly define 
>> the
>> semantics, that code is easy to find.
>>
>> Also I expect that this won't result in code duplication with other OS,
>> you need code to map from drm to dc anyway, might as well 
>> check&reject the
>> stuff that dc can't even represent right there.
>>
>> The other reason is that the helpers are good guidelines for some of the
>> semantics, e.g. it's mandatory that drm_crtc_needs_modeset gives the 
>> right
>> answer after atomic_check. If it doesn't, then you're driver doesn't
>> follow atomic. If you completely roll your own this becomes much 
>> harder to
>> assure.
>>
it doesn't today and we have equilvant check in dc in our hw_seq. We 
will look into how to make it work.  Our "atomic" operate on always 
knowing the current state (core_dc.current_ctx) and finding out the 
delta between the desired future state computed in our dc_validate.  One 
thing we were stuggling with it seems DRM is building up incremental 
state, ie. if something isn't mentioned in atomic_commit then you don't 
touch it.  We operate in a mode where if something isn't mentioned in 
dc_commit_target we disable those output.  this method allow us to 
always know current and future state, as future state is built up by 
caller (amdgpu), and we are able to transition into the future state on 
vsync boundary if required. It seems to me that drm_*_state require us 
to compartimentize states.  It won't be as trivial to fill the input for 
bandwidth_calc as that beast need everything as everything end up goes 
through the same memory controller.  our validate_context is 
specifically design to make it easy to generate input parameter for 
bandwidth_calc.  Per pipe validate like pixel format, scaling is not a 
problem.

>
> Interesting point. Not sure if we've checked that. Is there some sort 
> of automated test for this that we can use to check?
>
>> Of course extend it all however you want, e.g. by adding all the global
>> optimization and resource assignment stuff after initial per-object
>> checking has been done using the helper infrastructure.
>>
>> Guidelines for atomic_commit
>>
>> Use the new nonblcoking helpers. Everyone who didn't got it wrong. Also,
>
> I believe we're not using those and didn't start with those which 
> might explain (along with lack of discussion on dri-devel) why atomic 
> currently looks the way it does in DC. This is definitely one of the 
> bigger issues we'd want to clean up and where you wouldn't find much 
> pushback, other than us trying to find time to do it.
>
>> your atomic_commit should pretty much match the helper one, except for a
>> custom swap_state to handle all your globally shared specia dc_*_state
>> objects. Everything hw specific should be in atomic_commit_tail.
>>
>> Wrt the hw commit itself, for the modeset step just roll your own. 
>> That's
>> the entire point of atomic, and atm both i915 and nouveau exploit this
>> fully. Besides a bit of glue there shouldn't be much need for
>> linux-specific code here - what you need is something to fish the right
>> dc_*_state objects and give it your main sequencer functions. What you
>> should make sure though is that only ever do a modeset when that was
>> signalled, i.e. please use drm_crtc_needs_modeset to control that part.
>> Feel free to wrap up in a dc_*_needs_modeset for better abstraction if
>> that's needed.
>>
Using state properly will solve our double resource 
assignment/validation problem during commit.  Thanks for the guidance on 
how to do this.

now the question is can we have a helper function to house the main 
sequence and put it in /dc?
>> I do strongly suggest however that you implement the plane commit using
>> the helpers. There's really only a few ways to implement this in the hw,
>> and it should work everywhere.
>>
Maybe from SW perspective I'll look at the intel code to understand 
this.  In terms of HW I would have to say I disagree with that.  The 
multi-plane blend stuff even in our HW we have gone through 1 minor 
revision and 1 major change.  Also the same HW is build to handle stereo 
3D, multi-plane blending, pipe spliting and more.  The pipeline / 
blending stuff tend to change in HW because HW need to constently design 
to meet the timing requirement of ever increasing pixel rate to keep us 
competitive. When HW can't meet timing they employ the split trick and 
have 2 copy of the same HW to be able to push through that many number 
of pixel.   If we are Intel and on latest process node then we probably 
won't have this problem.  I beat our 2018 HW will change again 
especially things are moving toward 64bpp FP16 pixel format by default 
for HDR.
>> Misc guidelines
>>
>> Use the suspend/resume helpers. If your atomic can't do that, it's not
>> terribly good. Also, if DC can't make those fit, it's probably still too
>> much midlayer and its own world than helper library.
>>
>
> Do they handle swapping DP displays while the system is asleep? If not 
> we'll probably need to add that. The other case where we have some 
> special handling has to do with headless (sleep or resume, don't 
> remember).
>
>> Use all the legacy helpers, again your atomic should be able to pull it
>> off. One exception is async plane flips (both primary and cursors), 
>> that's
>> atm still unsolved. Probably best to keep the old code around for just
>> that case (but redirect to the compat helpers for everything), see e.g.
>> how vc4 implements cursors.
>>
>
> Good old flip. There probably isn't much shareable code between OSes 
> here. It seems like every OS rolls there own thing, regarding flips. 
> We still seem to be revisiting flips regularly, especially with 
> FreeSync (adaptive sync) in the mix now. Good to know that this is 
> still a bit of an open topic.
>
>> Most imporant of all
>>
>> Ask questions on #dri-devel. amdgpu atomic is the only nontrivial atomic
>> driver for which I don't remember a single discussion about some detail,
>> at least not with any of the DAL folks. Michel&Alex asked some questions
>> sometimes, but that indirection is bonghits and the defeats the point of
>> upstream: Direct cross-vendor collaboration to get shit done. Please 
>> make
>> it happen.
>>
>
> Please keep asking us to get on dri-devel with questions. I need to 
> get into the habit again of leaving the IRC channel open. I think most 
> of us are still a bit scared of it or don't know how to deal with some 
> of the information overload (IRC and mailing list). It's some of my 
> job to change that all the while I'm learning this myself. :)
>
> Thanks for all your effort trying to get people involved.
>
>> Oh and I pretty much assume Harry&Tony are volunteered to review atomic
>> docs ;-)
>>
>
> Sure.
>
> Cheers,
> Harry
>
>> Cheers, Daniel
>>
>>
>>
>>>
>>> Unfortunately we cannot expose code for uGPU yet. However refactor / 
>>> cleanup
>>> work on DC is public.  We're currently transitioning to a public patch
>>> review. You can follow our progress on the amd-gfx mailing list. We 
>>> value
>>> community feedback on our work.
>>>
>>> As an appendix I've included a brief overview of the how the code 
>>> currently
>>> works to make understanding and reviewing the code easier.
>>>
>>> Prior discussions on DC:
>>>
>>>  * 
>>> https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html
>>>  *
>>> https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html 
>>>
>>>
>>> Current version of DC:
>>>
>>>  * 
>>> https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>>>
>>> Once Alex pulls in the latest patches:
>>>
>>>  * 
>>> https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>>>
>>> Best Regards,
>>> Harry
>>>
>>>
>>> ************************************************
>>> *** Appendix: A Day in the Life of a Modeset ***
>>> ************************************************
>>>
>>> Below is a high-level overview of a modeset with dc. Some of this 
>>> might be a
>>> little out-of-date since it's based on my XDC presentation but it 
>>> should be
>>> more-or-less the same.
>>>
>>> amdgpu_dm_atomic_commit()
>>> {
>>>   /* setup atomic state */
>>>   drm_atomic_helper_prepare_planes(dev, state);
>>>   drm_atomic_helper_swap_state(dev, state);
>>>   drm_atomic_helper_update_legacy_modeset_state(dev, state);
>>>
>>>   /* create or remove targets */
>>>
>>> /********************************************************************
>>>    * *** Call into DC to commit targets with list of all known targets
>>> ********************************************************************/
>>>   /* DC is optimized not to do anything if 'targets' didn't change. */
>>>   dc_commit_targets(dm->dc, commit_targets, commit_targets_count)
>>>   {
>>> /******************************************************************
>>>      * *** Build context (function also used for validation)
>>> ******************************************************************/
>>>     result = core_dc->res_pool->funcs->validate_with_context(
>>> core_dc,set,target_count,context);
>>>
>>> /******************************************************************
>>>      * *** Apply safe power state
>>> ******************************************************************/
>>>     pplib_apply_safe_state(core_dc);
>>>
>>> /****************************************************************
>>>      * *** Apply the context to HW (program HW)
>>> ****************************************************************/
>>>     result = core_dc->hwss.apply_ctx_to_hw(core_dc,context)
>>>     {
>>>       /* reset pipes that need reprogramming */
>>>       /* disable pipe power gating */
>>>       /* set safe watermarks */
>>>
>>>       /* for all pipes with an attached stream */
>>> /************************************************************
>>>          * *** Programming all per-pipe contexts
>>> ************************************************************/
>>>         status = apply_single_controller_ctx_to_hw(...)
>>>         {
>>>           pipe_ctx->tg->funcs->set_blank(...);
>>> pipe_ctx->clock_source->funcs->program_pix_clk(...);
>>>           pipe_ctx->tg->funcs->program_timing(...);
>>> pipe_ctx->mi->funcs->allocate_mem_input(...);
>>>           pipe_ctx->tg->funcs->enable_crtc(...);
>>>           bios_parser_crtc_source_select(...);
>>>
>>> pipe_ctx->opp->funcs->opp_set_dyn_expansion(...);
>>>           pipe_ctx->opp->funcs->opp_program_fmt(...);
>>>
>>> stream->sink->link->link_enc->funcs->setup(...);
>>> pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...);
>>>           pipe_ctx->tg->funcs->set_blank_color(...);
>>>
>>>           core_link_enable_stream(pipe_ctx);
>>>           unblank_stream(pipe_ctx,
>>>
>>>           program_scaler(dc, pipe_ctx);
>>>         }
>>>       /* program audio for all pipes */
>>>       /* update watermarks */
>>>     }
>>>
>>>     program_timing_sync(core_dc, context);
>>>     /* for all targets */
>>>       target_enable_memory_requests(...);
>>>
>>>     /* Update ASIC power states */
>>>     pplib_apply_display_requirements(...);
>>>
>>>     /* update surface or page flip */
>>>   }
>>> }
>>>
>>>
>>> _______________________________________________
>>> dri-devel mailing list
>>> dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>


[-- Attachment #2: DAL3.JPG --]
[-- Type: image/jpeg, Size: 117556 bytes --]

[-- Attachment #3: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]       ` <5a1f2762-f1e0-05f1-3c16-173cb1f46571-5C7GfCeVMHo@public.gmane.org>
@ 2016-12-13  7:09         ` Dave Airlie
  0 siblings, 0 replies; 66+ messages in thread
From: Dave Airlie @ 2016-12-13  7:09 UTC (permalink / raw)
  To: Cheng, Tony
  Cc: Grodzovsky, Andrey, Cyr, Aric, Bridgman, John, Lazare, Jordan,
	amd-gfx mailing list, dri-devel, Deucher, Alexander,
	Harry Wentland

> We would love to upstream DC for all supported asic!  We made enough change
> to make Sea Island work but it's really not validate to the extend we
> validate Polaris on linux and no where close to what we do for 2017 ASICs.
> With DC the display hardware programming, resource optimization, power
> management and interaction with rest of system will be fully validated
> across multiple OSs.  Therefore we have high confidence that the quality is
> going to better than what we have upstreammed today.
>
> I don't have a baseline to say if DC is in good enough quality for older
> generation compare to upstream.  For example we don't have HW generate
> bandwidth_calc for DCE 8/10 (Sea/Vocanic island family) but our code is
> structured in a way that we assume bandwidth_calc is there.  None of us feel
> like go untangle the formulas in windows driver at this point to create our
> own version of bandwidth_calc.  It sort of work with HW default values but
> some mode / config is likely to underflows.  If community is okay with
> uncertain quality, sure we would love to upstream everything to reduce our
> maintaince overhead.  You do get audio with DC on DCE8 though.

If we get any of this upstream, we should get all of the hw supported with it.

If it regresses we just need someone to debug why.

> Maybe let me share what we are doing and see if we can come up with
> something to make DC work for both upstream and our internal need.  We are
> sharing code not just on Linux and we will do our best to make our code
> upstream friendly.  Last year we focussed on having enough code to prove
> that our DAL rewrite works and get more people contributing to it.  We rush
> a bit as a result we had a few legacy component we port from Windows driver
> and we know it's bloat that needed to go.
>
> We designed DC so HW can contribute bandwidth_calc magic and psuedo code to
> program the HW blocks.  The HW blocks on the bottom of DC.JPG in models our
> HW blocks and the programming sequence are provided by HW engineers.  If a
> piece of HW need a bit toggled 7 times during power up I rather have HW
> engineer put that in their psedo code rather than me trying to find that
> sequence in some document.  Afterall they did simulate the HW with the
> toggle sequence.  I guess these are back-end code Daniel talked about.  Can
> we agree that DRM core is not interested in how things are done in that
> layer and we can upstream these as it?
>
> The next is dce_hwseq.c to program the HW blocks in correct sequence.  Some
> HW block can be programmed in any sequence, but some requires strict
> sequence to be followed.  For example Display CLK and PHY CLK need to be up
> before we enable timing generator.  I would like these sequence to remain in
> DC as it's really not DRM's business to know how to program the HW.  In a
> way you can consider hwseq as a helper to commit state to HW.
>
> Above hwseq is the dce*_resource.c.  It's job is to come up with the HW
> state required to realize given config.  For example we would use the exact
> same HW resources with same optimization setting to drive any same given
> config.  If 4 x 4k@60 is supported with resource setting A on HW diagnositc
> suite during bring up setting B on Linux then we have a problem.  It know
> which HW block work with which block and their capability and limitations.
> I hope you are not asking this stuff to move up to core because in reality
> we should probably hide this in some FW, as HW expose the register to config
> them differently that doesn't mean all combination of HW usage is validated.
> To me resource is more of a helper to put together functional pipeline and
> does not make any decision that any OS might be interested in.
>
> These yellow boxes in DC.JPG are really specific to each generation of HW
> and changes frequently.  These are things that HW has consider hiding it in
> FW before.  Can we agree on those code (under /dc/dce*) can stay?

I think most of these things are fine to be part of the solution we end up at,
but I can't say for certain they won't require interface changes. I think the
most useful code is probably the stuff in the dce subdirectories.

>
> Is this about demonstration how basic functionality work and add more
> features with series of patches to make review eaiser?  If so I don't think
> we are staff to do this kind of rewrite.  For example it make no sense to
> hooking up bandwidth_calc to calculate HW magic if we don't have mem_input
> to program the memory settings.  We need portion of hw_seq to ensure these
> blocks are programming in correct sequence.  We will need to feed
> bandwidth_calc it's required inputs, which is basically the whole system
> state tracked in validate_context today, which means we basically need big
> bulk of resource.c.  This effort might have benefit in reviewing the code,
> but we will end up with pretty much similar if not the same as what we
> already have.

This is something people always say, I'm betting you won't end up there at all,
it's not just review, it's incremental development model, so that when things
go wrong we can pinpoint why and where a lot easier. Just merging this all in
one fell swoop is going to just mean a lot of pain in the end. I understand you
aren't resourced for this sort of development on this codebase, but it's going
to be an impasse to try and merge this all at once even if was clean code.

> Or is the objection that we have the white boxes in DC.JPG instead of using
> DRM objects?  We can probably workout something to have the white boxes
> derive from DRM objects and extend atomic state with our validate_context
> where dce*_resource.c stores the constructed pipelines.

I think Daniel explained quite well how things should look in terms of
subclassing.

>
> 5) Why is a midlayer bad?
> I'm not going to go into specifics on the DC midlayer, but we abhor
> midlayers for a fair few reasons. The main reason I find causes the
> most issues is locking. When you have breaks in code flow between
> multiple layers, but having layers calling back into previous layers
> it becomes near impossible to track who owns the locking and what the
> current locking state is.
>
> Consider
>     drma -> dca -> dcb -> drmb
>     drmc -> dcc  -> dcb -> drmb
>
> We have two codes paths that go back into drmb, now maybe drma has a
> lock taken, but drmc doesn't, but we've no indication when we hit drmb
> of what the context pre entering the DC layer is. This causes all
> kinds of problems. The main requirement is the driver maintains the
> execution flow as much as possible. The only callback behaviour should
> be from an irq or workqueue type situations where you've handed
> execution flow to the hardware to do something and it is getting back
> to you. The pattern we use to get our of this sort of hole is helper
> libraries, we structure code as much as possible as leaf nodes that
> don't call back into the parents if we can avoid it (we don't always
> succeed).
>
> Okay.  by the way DC does behave like a helper for most part.  There is no
> locking in DC.  We work enough with different OS to know they all have
> different synchronization primatives and interrupt handling and have DC lock
> anything is just shooting ourself in the foot.  We do have function with
> lock in their function name in DC but those are HW register lock to ensure
> that the HW register update atomically. ie have 50 register write latch in
> HW at next vsync to ensure HW state change on vsync boundary.
>
> So the above might becomes
>    drma-> dca_helper
>            -> dcb_helper
>            -> drmb.
>
> In this case the code flow is controlled by drma, dca/dcb might be
> modifying data or setting hw state but when we get to drmb it's easy
> to see what data is needs and what locking.
>
> DAL/DC goes against this in so many ways, and when I look at the code
> I'm never sure where to even start pulling the thread to unravel it.
>
> I don't know where we go against it.  In the case we do callback to DRM for
> MST case we have
>
> amdgpu_dm_atomic_commit (implement atomic_commit)
> dc_commit_targets (commit helper)
> dce110_apply_ctx_to_hw (hw_seq)
> core_link_enable_stream (part of MST enable sequence)
> allocate_mst_payload (helper for above func in same file)
> dm_helpers_dp_mst_write_payload_allocation_table (glue code to call DRM)
> drm_dp_mst_allocate_vcpi (DRM)
>
> As you see even in this case we are only 6 level deep before we callback to
> DRM, and 2 of those functions are in same file as helper func of the bigger
> sequence.
>
> Can you clarify the distinction between what you would call a mid layer vs
> helper.  We consulted Alex a lot and we know about this inversion of control
> pattern and we are trying our best to do it.  Is it the way functions are
> named and files folder structure?  Would it help if we flatten
> amdgpu_dm_atomic_commit and dc_commit_targets?  Even if we do I would
> imagine we want some helper in commit rather a giant 1000 line function.  Is
> there any concern that we put dc_commit_targets under /dc folder as we want
> other platform to run exact same helper?  Or this is about the state
> dc_commit_targets is too big?  or the state is stored validate_context
> rather than drm_atomic_state?

Well one area I hit today while looking, is trace the path for a dpcd
read or write.

An internal one in the dc layer goes

core_link_dpcd_read

>
> I don't think it make sense for DRM to get into how we decide to use our HW
> blocks.  For example any refactor done in core should not result in us using
> different pipeline to drive the same config.  We would like to have control
> over how our HW pipeline is constructed.
>
> Some questions I have for AMD engineers that also I'd want to see
> addressed before any consideration of merging would happen!
>
> How do you plan on dealing with people rewriting or removing code
> upstream that is redundant in the kernel, but required for internal
> stuff?
>
>
> Honestly I don't know what these are.  Like you and Jerome remove func ptr
> abstraction (I know it was bad, that was one of the component we ported from
> windows) and we need to keep it as function pointer so we can still run our
> code on FPGA before we see first silicon?  I don't think if we nak the
> function ptr removal will be a problem for community.  The rest is valued
> and we took with open arm.
>
> Or this is more like we have code duplication after DRM added some
> functionality we can use?  I would imaging its more of moving what we got
> working in our code to DRM core if we are upstreamed and we have no problem
> accomodate for that as the code moved out to DRM core can be included in
> other platforms.  We don't have any private ioctl today and we don't plan to
> have any outside of using DRM object properties.
>
>
> How are you going to deal with new Linux things that overlap
> incompatibly with your internally developed stuff?
>
> I really don't know what those new linux things can be that could cause us
> problem.  If anything the new things will be probably come from us if we are
> upstreammed.
>
> atomic: we had that on windows 8 years ago for windows vista, yes
> sematic/abstraction is different but concept is the same. We could have
> easily settled with DRM semantics or DRM could easily take some form of our
> pattern.
>
> DP MST:  AMD was the first source certified and we work closely with the
> first branch certified. I was a part of that team and we had a very solid
> implementation.  If we were upstreamed I don't see you would want to
> reinvent the wheel and not try to massage what we have into shape for DRM
> core for other driver to reuse.
>
> drm_plane: windows multi-plane overlay and Andriod HW composer? We had that
> working 2 years ago.  If you are upstreammed and you are first you usually
> have a say in how it should go down don't you?
>
> The new thing coming are Free Sync HDR, 8k@60 with DP DSC etc.  I would
> imaging we would beat all other vendor to the first open source solution if
> we leverage effort from our extended display team.
>
> If the code is upstream will it be tested in the kernel by some QA
> group, or will there be some CI infrastructure used to maintain and to
> watch for Linux code that breaks assumptions in the DC code?
>
> We have tester that runs set of display test every other day on linux.  We
> don't run on DRM_Next tree yet and Alex is working out a plan to allow us
> use DRM_Next as our development branch.  Upstream is not likely to be tested
> by QA though.
>
> DC does not assume anything.  DC require full state given in
> dc_commit_targets / dc_commit_surfaces_to_target.  we do whatever is
> specified in the data structure.  dc_commit_surfaces_to_target can be
> considered as a helper function to change plane without visual side effect
> on vsync bondary.  dc_commit_targets can be considered as a helper function
> to light up a display with black screen.  DRM core has full control if you
> want to light up to black screen as soon as monitor is plugged in or you
> want to light up after someone does a mode set.  Hotplug interrupt goes to
> amdgpu_dm, and it will take the require lock in DRM object because calling
> DC to detect.
>
> Can you show me you understand that upstream code is no longer 100% in
> your control and things can happen to it that you might not expect and
> you need to deal with it?
>
> I think so, other than we haven't been spanning the mailing list.  We
> already dealing with we don't control 100% our code to some extend.  We
> don't control bandwidth_calc.  Trust me we are not keeping up with the
> updates that HW is doing with it for next gen hw.  Everytime we pull there
> is a new term they added and we have to find a way to feed that input.  We
> had to clean up linux style for them everytime we pull.  Our HW diagnostic
> suite has different set of requirements and they frequently contribute to
> our code.  We took you and Jerome's patch.  If it's validated we want that
> code.
>
> At end of the day I think the architecture is really about what's HW and
> what's DRM core.  Like I said all the yellow boxes has been proposed to
> running on firmware but we decide to keep them in driver as it's easier to
> debug on x86 than uC.  I can tell you that our HW guys were happy when I
> decide to open source bandwidth_calc but we did it anyways.  I feel like
> because we are opening up the complexity and inner working of our HW, we are
> somehow getting penalized for being open.
>
> Dave.
>
> Tony
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]           ` <3219f6f2-080e-0c77-45bb-cf59aa5b5858-5C7GfCeVMHo@public.gmane.org>
@ 2016-12-13  7:30             ` Dave Airlie
  2016-12-13  9:14               ` Cheng, Tony
  2016-12-13 14:59             ` Rob Clark
  1 sibling, 1 reply; 66+ messages in thread
From: Dave Airlie @ 2016-12-13  7:30 UTC (permalink / raw)
  To: Cheng, Tony
  Cc: Grodzovsky, Andrey, amd-gfx mailing list, dri-devel,
	Daniel Vetter, Deucher, Alexander, Harry Wentland

(hit send too early)
> We would love to upstream DC for all supported asic!  We made enough change
> to make Sea Island work but it's really not validate to the extend we
> validate Polaris on linux and no where close to what we do for 2017 ASICs.
> With DC the display hardware programming, resource optimization, power
> management and interaction with rest of system will be fully validated
> across multiple OSs.  Therefore we have high confidence that the quality is
> going to better than what we have upstreammed today.
>
> I don't have a baseline to say if DC is in good enough quality for older
> generation compare to upstream.  For example we don't have HW generate
> bandwidth_calc for DCE 8/10 (Sea/Vocanic island family) but our code is
> structured in a way that we assume bandwidth_calc is there.  None of us feel
> like go untangle the formulas in windows driver at this point to create our
> own version of bandwidth_calc.  It sort of work with HW default values but
> some mode / config is likely to underflows.  If community is okay with
> uncertain quality, sure we would love to upstream everything to reduce our
> maintaince overhead.  You do get audio with DC on DCE8 though.

If we get any of this upstream, we should get all of the hw supported with it.

If it regresses we just need someone to debug why.

> Maybe let me share what we are doing and see if we can come up with
> something to make DC work for both upstream and our internal need.  We are
> sharing code not just on Linux and we will do our best to make our code
> upstream friendly.  Last year we focussed on having enough code to prove
> that our DAL rewrite works and get more people contributing to it.  We rush
> a bit as a result we had a few legacy component we port from Windows driver
> and we know it's bloat that needed to go.
>
> We designed DC so HW can contribute bandwidth_calc magic and psuedo code to
> program the HW blocks.  The HW blocks on the bottom of DC.JPG in models our
> HW blocks and the programming sequence are provided by HW engineers.  If a
> piece of HW need a bit toggled 7 times during power up I rather have HW
> engineer put that in their psedo code rather than me trying to find that
> sequence in some document.  Afterall they did simulate the HW with the
> toggle sequence.  I guess these are back-end code Daniel talked about.  Can
> we agree that DRM core is not interested in how things are done in that
> layer and we can upstream these as it?
>
> The next is dce_hwseq.c to program the HW blocks in correct sequence.  Some
> HW block can be programmed in any sequence, but some requires strict
> sequence to be followed.  For example Display CLK and PHY CLK need to be up
> before we enable timing generator.  I would like these sequence to remain in
> DC as it's really not DRM's business to know how to program the HW.  In a
> way you can consider hwseq as a helper to commit state to HW.
>
> Above hwseq is the dce*_resource.c.  It's job is to come up with the HW
> state required to realize given config.  For example we would use the exact
> same HW resources with same optimization setting to drive any same given
> config.  If 4 x 4k@60 is supported with resource setting A on HW diagnositc
> suite during bring up setting B on Linux then we have a problem.  It know
> which HW block work with which block and their capability and limitations.
> I hope you are not asking this stuff to move up to core because in reality
> we should probably hide this in some FW, as HW expose the register to config
> them differently that doesn't mean all combination of HW usage is validated.
> To me resource is more of a helper to put together functional pipeline and
> does not make any decision that any OS might be interested in.
>
> These yellow boxes in DC.JPG are really specific to each generation of HW
> and changes frequently.  These are things that HW has consider hiding it in
> FW before.  Can we agree on those code (under /dc/dce*) can stay?

I think most of these things are fine to be part of the solution we end up at,
but I can't say for certain they won't require interface changes. I think the
most useful code is probably the stuff in the dce subdirectories.

>
> Is this about demonstration how basic functionality work and add more
> features with series of patches to make review eaiser?  If so I don't think
> we are staff to do this kind of rewrite.  For example it make no sense to
> hooking up bandwidth_calc to calculate HW magic if we don't have mem_input
> to program the memory settings.  We need portion of hw_seq to ensure these
> blocks are programming in correct sequence.  We will need to feed
> bandwidth_calc it's required inputs, which is basically the whole system
> state tracked in validate_context today, which means we basically need big
> bulk of resource.c.  This effort might have benefit in reviewing the code,
> but we will end up with pretty much similar if not the same as what we
> already have.

This is something people always say, I'm betting you won't end up there at all,
it's not just review, it's incremental development model, so that when things
go wrong we can pinpoint why and where a lot easier. Just merging this all in
one fell swoop is going to just mean a lot of pain in the end. I understand you
aren't resourced for this sort of development on this codebase, but it's going
to be an impasse to try and merge this all at once even if was clean code.

> Or is the objection that we have the white boxes in DC.JPG instead of using
> DRM objects?  We can probably workout something to have the white boxes
> derive from DRM objects and extend atomic state with our validate_context
> where dce*_resource.c stores the constructed pipelines.

I think Daniel explained quite well how things should look in terms of
subclassing.

>
> 5) Why is a midlayer bad?
> I'm not going to go into specifics on the DC midlayer, but we abhor
> midlayers for a fair few reasons. The main reason I find causes the
> most issues is locking. When you have breaks in code flow between
> multiple layers, but having layers calling back into previous layers
> it becomes near impossible to track who owns the locking and what the
> current locking state is.
>
> Consider
>     drma -> dca -> dcb -> drmb
>     drmc -> dcc  -> dcb -> drmb
>
> We have two codes paths that go back into drmb, now maybe drma has a
> lock taken, but drmc doesn't, but we've no indication when we hit drmb
> of what the context pre entering the DC layer is. This causes all
> kinds of problems. The main requirement is the driver maintains the
> execution flow as much as possible. The only callback behaviour should
> be from an irq or workqueue type situations where you've handed
> execution flow to the hardware to do something and it is getting back
> to you. The pattern we use to get our of this sort of hole is helper
> libraries, we structure code as much as possible as leaf nodes that
> don't call back into the parents if we can avoid it (we don't always
> succeed).
>
> Okay.  by the way DC does behave like a helper for most part.  There is no
> locking in DC.  We work enough with different OS to know they all have
> different synchronization primatives and interrupt handling and have DC lock
> anything is just shooting ourself in the foot.  We do have function with
> lock in their function name in DC but those are HW register lock to ensure
> that the HW register update atomically. ie have 50 register write latch in
> HW at next vsync to ensure HW state change on vsync boundary.
>
> So the above might becomes
>    drma-> dca_helper
>            -> dcb_helper
>            -> drmb.
>
> In this case the code flow is controlled by drma, dca/dcb might be
> modifying data or setting hw state but when we get to drmb it's easy
> to see what data is needs and what locking.
>
> DAL/DC goes against this in so many ways, and when I look at the code
> I'm never sure where to even start pulling the thread to unravel it.
>
> I don't know where we go against it.  In the case we do callback to DRM for
> MST case we have
>
> amdgpu_dm_atomic_commit (implement atomic_commit)
> dc_commit_targets (commit helper)
> dce110_apply_ctx_to_hw (hw_seq)
> core_link_enable_stream (part of MST enable sequence)
> allocate_mst_payload (helper for above func in same file)
> dm_helpers_dp_mst_write_payload_allocation_table (glue code to call DRM)
> drm_dp_mst_allocate_vcpi (DRM)
>
> As you see even in this case we are only 6 level deep before we callback to
> DRM, and 2 of those functions are in same file as helper func of the bigger
> sequence.
>
> Can you clarify the distinction between what you would call a mid layer vs
> helper.  We consulted Alex a lot and we know about this inversion of control
> pattern and we are trying our best to do it.  Is it the way functions are
> named and files folder structure?  Would it help if we flatten
> amdgpu_dm_atomic_commit and dc_commit_targets?  Even if we do I would
> imagine we want some helper in commit rather a giant 1000 line function.  Is
> there any concern that we put dc_commit_targets under /dc folder as we want
> other platform to run exact same helper?  Or this is about the state
> dc_commit_targets is too big?  or the state is stored validate_context
> rather than drm_atomic_state?

Well one area I hit today while looking, is trace the path for a dpcd
read or write.

An internal one in the dc layer goes

core_link_dpcd_read (core_link)
dm_helpers_dp_read_dpcd(context, dc_link)
  search connector list for the appropriate connector
  drm_dp_dpcd_read

Note the connector list searching, this is a case of where you have called
back into the toplevel driver without the info necessary because core_link
and dc_link are too far abstracted from the drm connector.
(get_connector_for_link is a bad idea)

Then we get back around through the aux stuff and end up at:
dc_read_dpcd which passes connector->dc_link->link_index down
   this look up the dc_link again in core_dc->links[index]
dal_ddc_service_read_dpcd_data(link->ddc)
which calls into the i2caux path.

This is not helper functions or anything close, this is layering hell.

> I don't think it make sense for DRM to get into how we decide to use our HW
> blocks.  For example any refactor done in core should not result in us using
> different pipeline to drive the same config.  We would like to have control
> over how our HW pipeline is constructed.

I don't think the DRM wants to get involved at that level, but it would be good
if we could collapse the mountains of functions and layers so that you can
clearly see how a modeset happens all the way down to the hw in a linear
fashion.

>
> How do you plan on dealing with people rewriting or removing code
> upstream that is redundant in the kernel, but required for internal
> stuff?
>
>
> Honestly I don't know what these are.  Like you and Jerome remove func ptr
> abstraction (I know it was bad, that was one of the component we ported from
> windows) and we need to keep it as function pointer so we can still run our
> code on FPGA before we see first silicon?  I don't think if we nak the
> function ptr removal will be a problem for community.  The rest is valued
> and we took with open arm.
>
> Or this is more like we have code duplication after DRM added some
> functionality we can use?  I would imaging its more of moving what we got
> working in our code to DRM core if we are upstreamed and we have no problem
> accomodate for that as the code moved out to DRM core can be included in
> other platforms.  We don't have any private ioctl today and we don't plan to
> have any outside of using DRM object properties.

I've just sent some patches to remove a bunch of dpcd defines, that is just
one small example.

> I really don't know what those new linux things can be that could cause us
> problem.  If anything the new things will be probably come from us if we are
> upstreammed.

But until then there will be competing development upstream, and you might
want to merge things.

>
> DP MST:  AMD was the first source certified and we work closely with the
> first branch certified. I was a part of that team and we had a very solid
> implementation.  If we were upstreamed I don't see you would want to
> reinvent the wheel and not try to massage what we have into shape for DRM
> core for other driver to reuse.

Definitely, I hate writing MST code, and it would have been good if someone else
had gotten to it first.

So I think after looking more at it, my major issue is with DC, the
core stuff, not the hw
touching stuff, but the layering stuff, dc and core infrastructure in
a lot of places
calls into the the DM layer and back into itself. It's a bit of an
tangle to pull any
one thread of it and try to unravel it.

There also seems to be a fair lot of headers of questionable value, I've found
the same set of defines (or pretty close ones) in a few headers,
conversion functions
between different layer definitions etc. There are redundant header
files, unused structs, structs
of questionable value or structs that should be merged.

Stuff is hidden between dc and core structs, but it isn't always obvious why
stuff is in dc_link vs core_link. Ideally we'd lose some of that layering.

Also things like loggers and fixed function calculators, and vector
code probably need to be bumped
up a layer or two or made sure to be completely generic, and put
outside the DC code, if code is
in amd/display dir it should be display code.

I'm going to be happily ignoring most of this until early next year at
this point (I might jump in/out a few times)
but I think Daniel and Alex have a pretty good handle on where this
code should be going to get upstream, I think we should
all be listening to them as much as possible.

Dave.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]       ` <b64d0072-4909-c680-2f09-adae9f856642-5C7GfCeVMHo@public.gmane.org>
  2016-12-13  4:10         ` Cheng, Tony
@ 2016-12-13  7:31         ` Daniel Vetter
  2016-12-13 10:09         ` Ernst Sjöstrand
  2 siblings, 0 replies; 66+ messages in thread
From: Daniel Vetter @ 2016-12-13  7:31 UTC (permalink / raw)
  To: Harry Wentland
  Cc: Grodzovsky, Andrey, Cheng, Tony,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Daniel Vetter,
	Deucher, Alexander, Dave Airlie

On Mon, Dec 12, 2016 at 09:33:52PM -0500, Harry Wentland wrote:
> On 2016-12-11 03:28 PM, Daniel Vetter wrote:
> > On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
> > > We propose to use the Display Core (DC) driver for display support on
> > > AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to
> > > avoid a flag day the plan is to only support uGPU initially and transition
> > > to older ASICs gradually.
> > 
> > Bridgeman brought it up a few times that this here was the question - it's
> > kinda missing a question mark, hard to figure this out ;-). I'd say for
> 
> My bad for the missing question mark (imprecise phrasing). On the other hand
> letting this blow over a bit helped get us on the map a bit more and allows
> us to argue the challenges (and benefits) of open source. :)
> 
> > upstream it doesn't really matter, but imo having both atomic and
> > non-atomic paths in one driver is one world of hurt and I strongly
> > recommend against it, at least if feasible. All drivers that switched
> > switched in one go, the only exception was i915 (it took much longer than
> > we ever feared, causing lots of pain) and nouveau (which only converted
> > nv50+, but pre/post-nv50 have always been two almost completely separate
> > worlds anyway).
> > 
> 
> You mention the two probably most complex DRM drivers didn't switch in a
> single go...  I imagine amdgpu/DC falls into the same category.
> 
> I think one of the problems is making a sudden change with a fully validated
> driver without breaking existing use cases and customers. We really
> should've started DC development in public and probably would do that if we
> had to start anew.
> 
> > > The DC component has received extensive testing within AMD for DCE8, 10, and
> > > 11 GPUs and is being prepared for uGPU. Support should be better than
> > > amdgpu's current display support.
> > > 
> > >  * All of our QA effort is focused on DC
> > >  * All of our CQE effort is focused on DC
> > >  * All of our OEM preloads and custom engagements use DC
> > >  * DC behavior mirrors what we do for other OSes
> > > 
> > > The new asic utilizes a completely re-designed atom interface, so we cannot
> > > easily leverage much of the existing atom-based code.
> > > 
> > > We've introduced DC to the community earlier in 2016 and received a fair
> > > amount of feedback. Some of what we've addressed so far are:
> > > 
> > >  * Self-contain ASIC specific code. We did a bunch of work to pull
> > >    common sequences into dc/dce and leave ASIC specific code in
> > >    separate folders.
> > >  * Started to expose AUX and I2C through generic kernel/drm
> > >    functionality and are mostly using that. Some of that code is still
> > >    needlessly convoluted. This cleanup is in progress.
> > >  * Integrated Dave and Jerome’s work on removing abstraction in bios
> > >    parser.
> > >  * Retire adapter service and asic capability
> > >  * Remove some abstraction in GPIO
> > > 
> > > Since a lot of our code is shared with pre- and post-silicon validation
> > > suites changes need to be done gradually to prevent breakages due to a major
> > > flag day.  This, coupled with adding support for new asics and lots of new
> > > feature introductions means progress has not been as quick as we would have
> > > liked. We have made a lot of progress none the less.
> > > 
> > > The remaining concerns that were brought up during the last review that we
> > > are working on addressing:
> > > 
> > >  * Continue to cleanup and reduce the abstractions in DC where it
> > >    makes sense.
> > >  * Removing duplicate code in I2C and AUX as we transition to using the
> > >    DRM core interfaces.  We can't fully transition until we've helped
> > >    fill in the gaps in the drm core that we need for certain features.
> > >  * Making sure Atomic API support is correct.  Some of the semantics of
> > >    the Atomic API were not particularly clear when we started this,
> > >    however, that is improving a lot as the core drm documentation
> > >    improves.  Getting this code upstream and in the hands of more
> > >    atomic users will further help us identify and rectify any gaps we
> > >    have.
> > 
> > Ok so I guess Dave is typing some more general comments about
> > demidlayering, let me type some guidelines about atomic. Hopefully this
> > all materializes itself a bit better into improved upstream docs, but meh.
> > 
> 
> Excellent writeup. Let us know when/if you want our review for upstream
> docs.
> 
> We'll have to really take some time to go over our atomic implementation. A
> couple small comments below with regard to DC.
> 
> > Step 0: Prep
> > 
> > So atomic is transactional, but it's not validate + rollback or commit,
> > but duplicate state, validate and then either throw away or commit.
> > There's a few big reasons for this: a) partial atomic updates - if you
> > duplicate it's much easier to check that you have all the right locks b)
> > kfree() is much easier to check for correctness than a rollback code and
> > c) atomic_check functions are much easier to audit for invalid changes to
> > persistent state.
> > 
> 
> There isn't really any rollback. I believe even in our other drivers we've
> abandoned the rollback approach years ago because it doesn't really work on
> modern HW. Any rollback cases you might find in DC should really only be for
> catastrophic errors (read: something went horribly wrong... read:
> congratulations, you just found a bug).

I meant rollback in software. Rollback in hw isn't a good idea, and
atomi's point is to avoid these.

> > Trouble is that this seems a bit unusual compared to all other approaches,
> > and ime (from the drawn-out i915 conversion) you really don't want to mix
> > things up. Ofc for private state you can roll back (e.g. vc4 does that for
> > the drm_mm allocator thing for scanout slots or whatever it is), but it's
> > trivial easy to accidentally check the wrong state or mix them up or
> > something else bad.
> > 
> > Long story short, I think step 0 for DC is to split state from objects,
> > i.e. for each dc_surface/foo/bar you need a dc_surface/foo/bar_state. And
> > all the back-end functions need to take both the object and the state
> > explicitly.
> > 
> > This is a bit a pain to do, but should be pretty much just mechanical. And
> > imo not all of it needs to happen before DC lands in upstream, but see
> > above imo that half-converted state is postively horrible. This should
> > also not harm cross-os reuse at all, you can still store things together
> > on os where that makes sense.
> > 
> > Guidelines for amdgpu atomic structures
> > 
> > drm atomic stores everything in state structs on plane/connector/crtc.
> > This includes any property extensions or anything else really, the entire
> > userspace abi is built on top of this. Non-trivial drivers are supposed to
> > subclass these to store their own stuff, so e.g.
> > 
> > amdgpu_plane_state {
> > 	struct drm_plane_state base;
> > 
> > 	/* amdgpu glue state and stuff that's linux-specific, e.g.
> > 	 * property values and similar things. Note that there's strong
> > 	 * push towards standardizing properties and stroing them in the
> > 	 * drm_*_state structs. */
> > 
> > 	struct dc_surface_state surface_state;
> > 
> > 	/* other dc states that fit to a plane */
> > };
> > 
> > Yes not everything will fit 1:1 in one of these, but to get started I
> > strongly recommend to make them fit (maybe with reduced feature sets to
> > start out). Stuff that is shared between e.g. planes, but always on the
> > same crtc can be put into amdgpu_crtc_state, e.g. if you have scalers that
> > are assignable to a plane.
> > 
> > Of course atomic also supports truly global resources, for that you need
> > to subclass drm_atomic_state. Currently msm and i915 do that, and probably
> > best to read those structures as examples until I've typed the docs. But I
> > expect that especially for planes a few dc_*_state structs will stay in
> > amdgpu_*_state.
> > 
> > Guidelines for atomic_check
> > 
> > Please use the helpers as much as makes sense, and put at least the basic
> > steps that from drm_*_state into the respective dc_*_state functional
> > block into the helper callbacks for that object. I think basic validation
> > of individal bits (as much as possible, e.g. if you just don't support
> > e.g. scaling or rotation with certain pixel formats) should happen in
> > there too. That way when we e.g. want to check how drivers corrently
> > validate a given set of properties to be able to more strictly define the
> > semantics, that code is easy to find.
> > 
> > Also I expect that this won't result in code duplication with other OS,
> > you need code to map from drm to dc anyway, might as well check&reject the
> > stuff that dc can't even represent right there.
> > 
> > The other reason is that the helpers are good guidelines for some of the
> > semantics, e.g. it's mandatory that drm_crtc_needs_modeset gives the right
> > answer after atomic_check. If it doesn't, then you're driver doesn't
> > follow atomic. If you completely roll your own this becomes much harder to
> > assure.
> > 
> 
> Interesting point. Not sure if we've checked that. Is there some sort of
> automated test for this that we can use to check?

We're typing them up in igt - generic testcase is pretty simple:
Semi-randomly change stuff, ask with TEST_ONLY whether it would modeset,
then commit and watch the vblank counter: If there's a gap, there was a
modeset. If it doesn't match the TEST_ONLY answer, complain.

> > Of course extend it all however you want, e.g. by adding all the global
> > optimization and resource assignment stuff after initial per-object
> > checking has been done using the helper infrastructure.
> > 
> > Guidelines for atomic_commit
> > 
> > Use the new nonblcoking helpers. Everyone who didn't got it wrong. Also,
> 
> I believe we're not using those and didn't start with those which might
> explain (along with lack of discussion on dri-devel) why atomic currently
> looks the way it does in DC. This is definitely one of the bigger issues
> we'd want to clean up and where you wouldn't find much pushback, other than
> us trying to find time to do it.

Yeah, back when DC was developed the recommendation was still "roll your
own".

> > your atomic_commit should pretty much match the helper one, except for a
> > custom swap_state to handle all your globally shared specia dc_*_state
> > objects. Everything hw specific should be in atomic_commit_tail.
> > 
> > Wrt the hw commit itself, for the modeset step just roll your own. That's
> > the entire point of atomic, and atm both i915 and nouveau exploit this
> > fully. Besides a bit of glue there shouldn't be much need for
> > linux-specific code here - what you need is something to fish the right
> > dc_*_state objects and give it your main sequencer functions. What you
> > should make sure though is that only ever do a modeset when that was
> > signalled, i.e. please use drm_crtc_needs_modeset to control that part.
> > Feel free to wrap up in a dc_*_needs_modeset for better abstraction if
> > that's needed.
> > 
> > I do strongly suggest however that you implement the plane commit using
> > the helpers. There's really only a few ways to implement this in the hw,
> > and it should work everywhere.
> > 
> > Misc guidelines
> > 
> > Use the suspend/resume helpers. If your atomic can't do that, it's not
> > terribly good. Also, if DC can't make those fit, it's probably still too
> > much midlayer and its own world than helper library.
> > 
> 
> Do they handle swapping DP displays while the system is asleep? If not we'll
> probably need to add that. The other case where we have some special
> handling has to do with headless (sleep or resume, don't remember).

Atm we do a dumb restore, and since the link will fail to train, then just
light it up with a default mode. You kinda have to do that, because
disabling a pipe behind userspace's back is not a nice thing to do. Then
we also send out the usual uevent (or should, there's some broken versions
out there) so that userspace sees the reconfiguration and can adjust the
desired config.

Same with MST, although that was only fixed recently: Before we had the
connector refcounting we just force-unplugged everything and caused some
surprises with userspace. -intel learned to cope, but with the
proliferation of kms native compositors I don't think assuming that your
userspace will always cope if the kernel yanks screens randomly is a good
one.

This approach also will tie into the new link_status flag, to indicate
that something is wrong with an output.

> > Use all the legacy helpers, again your atomic should be able to pull it
> > off. One exception is async plane flips (both primary and cursors), that's
> > atm still unsolved. Probably best to keep the old code around for just
> > that case (but redirect to the compat helpers for everything), see e.g.
> > how vc4 implements cursors.
> > 
> 
> Good old flip. There probably isn't much shareable code between OSes here.
> It seems like every OS rolls there own thing, regarding flips. We still seem
> to be revisiting flips regularly, especially with FreeSync (adaptive sync)
> in the mix now. Good to know that this is still a bit of an open topic.

Yeah, so freesync and atomic is also not solved. But since that's just a
variable vblank, but the flip itself is still synced to it, it shouldn't
be a problem to wire this up for atomic.
> 
> > Most imporant of all
> > 
> > Ask questions on #dri-devel. amdgpu atomic is the only nontrivial atomic
> > driver for which I don't remember a single discussion about some detail,
> > at least not with any of the DAL folks. Michel&Alex asked some questions
> > sometimes, but that indirection is bonghits and the defeats the point of
> > upstream: Direct cross-vendor collaboration to get shit done. Please make
> > it happen.
> > 
> 
> Please keep asking us to get on dri-devel with questions. I need to get into
> the habit again of leaving the IRC channel open. I think most of us are
> still a bit scared of it or don't know how to deal with some of the
> information overload (IRC and mailing list). It's some of my job to change
> that all the while I'm learning this myself. :)

Also just discuss design issues that interact with the core/helpers there,
even amongst yourself. Since sometimes not understand it is a problem with
a bug in the core ;-)

> Thanks for all your effort trying to get people involved.
> 
> > Oh and I pretty much assume Harry&Tony are volunteered to review atomic
> > docs ;-)
> > 
> 
> Sure.

Thanks, Daniel

> 
> Cheers,
> Harry
> 
> > Cheers, Daniel
> > 
> > 
> > 
> > > 
> > > Unfortunately we cannot expose code for uGPU yet. However refactor / cleanup
> > > work on DC is public.  We're currently transitioning to a public patch
> > > review. You can follow our progress on the amd-gfx mailing list. We value
> > > community feedback on our work.
> > > 
> > > As an appendix I've included a brief overview of the how the code currently
> > > works to make understanding and reviewing the code easier.
> > > 
> > > Prior discussions on DC:
> > > 
> > >  * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html
> > >  *
> > > https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html
> > > 
> > > Current version of DC:
> > > 
> > >  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
> > > 
> > > Once Alex pulls in the latest patches:
> > > 
> > >  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
> > > 
> > > Best Regards,
> > > Harry
> > > 
> > > 
> > > ************************************************
> > > *** Appendix: A Day in the Life of a Modeset ***
> > > ************************************************
> > > 
> > > Below is a high-level overview of a modeset with dc. Some of this might be a
> > > little out-of-date since it's based on my XDC presentation but it should be
> > > more-or-less the same.
> > > 
> > > amdgpu_dm_atomic_commit()
> > > {
> > >   /* setup atomic state */
> > >   drm_atomic_helper_prepare_planes(dev, state);
> > >   drm_atomic_helper_swap_state(dev, state);
> > >   drm_atomic_helper_update_legacy_modeset_state(dev, state);
> > > 
> > >   /* create or remove targets */
> > > 
> > >   /********************************************************************
> > >    * *** Call into DC to commit targets with list of all known targets
> > >    ********************************************************************/
> > >   /* DC is optimized not to do anything if 'targets' didn't change. */
> > >   dc_commit_targets(dm->dc, commit_targets, commit_targets_count)
> > >   {
> > >     /******************************************************************
> > >      * *** Build context (function also used for validation)
> > >      ******************************************************************/
> > >     result = core_dc->res_pool->funcs->validate_with_context(
> > >                                core_dc,set,target_count,context);
> > > 
> > >     /******************************************************************
> > >      * *** Apply safe power state
> > >      ******************************************************************/
> > >     pplib_apply_safe_state(core_dc);
> > > 
> > >     /****************************************************************
> > >      * *** Apply the context to HW (program HW)
> > >      ****************************************************************/
> > >     result = core_dc->hwss.apply_ctx_to_hw(core_dc,context)
> > >     {
> > >       /* reset pipes that need reprogramming */
> > >       /* disable pipe power gating */
> > >       /* set safe watermarks */
> > > 
> > >       /* for all pipes with an attached stream */
> > >         /************************************************************
> > >          * *** Programming all per-pipe contexts
> > >          ************************************************************/
> > >         status = apply_single_controller_ctx_to_hw(...)
> > >         {
> > >           pipe_ctx->tg->funcs->set_blank(...);
> > >           pipe_ctx->clock_source->funcs->program_pix_clk(...);
> > >           pipe_ctx->tg->funcs->program_timing(...);
> > >           pipe_ctx->mi->funcs->allocate_mem_input(...);
> > >           pipe_ctx->tg->funcs->enable_crtc(...);
> > >           bios_parser_crtc_source_select(...);
> > > 
> > >           pipe_ctx->opp->funcs->opp_set_dyn_expansion(...);
> > >           pipe_ctx->opp->funcs->opp_program_fmt(...);
> > > 
> > >           stream->sink->link->link_enc->funcs->setup(...);
> > >           pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...);
> > >           pipe_ctx->tg->funcs->set_blank_color(...);
> > > 
> > >           core_link_enable_stream(pipe_ctx);
> > >           unblank_stream(pipe_ctx,
> > > 
> > >           program_scaler(dc, pipe_ctx);
> > >         }
> > >       /* program audio for all pipes */
> > >       /* update watermarks */
> > >     }
> > > 
> > >     program_timing_sync(core_dc, context);
> > >     /* for all targets */
> > >       target_enable_memory_requests(...);
> > > 
> > >     /* Update ASIC power states */
> > >     pplib_apply_display_requirements(...);
> > > 
> > >     /* update surface or page flip */
> > >   }
> > > }
> > > 
> > > 
> > > _______________________________________________
> > > dri-devel mailing list
> > > dri-devel@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-13  4:10         ` Cheng, Tony
@ 2016-12-13  7:50           ` Daniel Vetter
       [not found]           ` <3219f6f2-080e-0c77-45bb-cf59aa5b5858-5C7GfCeVMHo@public.gmane.org>
  1 sibling, 0 replies; 66+ messages in thread
From: Daniel Vetter @ 2016-12-13  7:50 UTC (permalink / raw)
  To: Cheng, Tony; +Cc: Grodzovsky, Andrey, amd-gfx, dri-devel, Deucher, Alexander

On Mon, Dec 12, 2016 at 11:10:30PM -0500, Cheng, Tony wrote:
> Thanks for the write up for the guide.  We can definitely re-do atomic
> according to guideline provided as I am not satified with how our code look
> today.  To me it seems more like we need to shuffle stuff around and rename
> a few things than rewrite much of anything.
> 
> I hope to get an answer on the reply to Dave's question regarding to if
> there is anything else.  If we can keep most of the stuff under /dc as the
> "back end" helper and do most of the change under /amdgpu_dm then it isn't
> that difficult as we don't need to go deal with the fall out on other
> platforms.  Again it's not just windows.  We are fully aware that it's hard
> to find the common abstraction between all different OS so we try our best
> to have DC behave more like a helper than abstraction layer anyways.  In our
> design states and policies are domain of Display Managers (DM) and because
> of linux we also say anything DRM can do that's also domain of DM.  We don't
> put anything in DC that we don't feel comfortable if HW decide to hide it in
> FW.
> 
> 
> On 12/12/2016 9:33 PM, Harry Wentland wrote:
> > On 2016-12-11 03:28 PM, Daniel Vetter wrote:
> > > On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
> > > > We propose to use the Display Core (DC) driver for display support on
> > > > AMD's upcoming GPU (referred to by uGPU in the rest of the doc).
> > > > In order to
> > > > avoid a flag day the plan is to only support uGPU initially and
> > > > transition
> > > > to older ASICs gradually.
> > > 
> > > Bridgeman brought it up a few times that this here was the question
> > > - it's
> > > kinda missing a question mark, hard to figure this out ;-). I'd say for
> > 
> > My bad for the missing question mark (imprecise phrasing). On the other
> > hand letting this blow over a bit helped get us on the map a bit more
> > and allows us to argue the challenges (and benefits) of open source. :)
> > 
> > > upstream it doesn't really matter, but imo having both atomic and
> > > non-atomic paths in one driver is one world of hurt and I strongly
> > > recommend against it, at least if feasible. All drivers that switched
> > > switched in one go, the only exception was i915 (it took much longer
> > > than
> > > we ever feared, causing lots of pain) and nouveau (which only converted
> > > nv50+, but pre/post-nv50 have always been two almost completely separate
> > > worlds anyway).
> > > 
> > 
> trust me we would like to upstream everything.  Just we didn't invest enough
> in DC code in the previous generation so the quality might not be there.
> 
> > You mention the two probably most complex DRM drivers didn't switch in a
> > single go...  I imagine amdgpu/DC falls into the same category.
> > 
> > I think one of the problems is making a sudden change with a fully
> > validated driver without breaking existing use cases and customers. We
> > really should've started DC development in public and probably would do
> > that if we had to start anew.
> > 
> > > > The DC component has received extensive testing within AMD for
> > > > DCE8, 10, and
> > > > 11 GPUs and is being prepared for uGPU. Support should be better than
> > > > amdgpu's current display support.
> > > > 
> > > >  * All of our QA effort is focused on DC
> > > >  * All of our CQE effort is focused on DC
> > > >  * All of our OEM preloads and custom engagements use DC
> > > >  * DC behavior mirrors what we do for other OSes
> > > > 
> > > > The new asic utilizes a completely re-designed atom interface,
> > > > so we cannot
> > > > easily leverage much of the existing atom-based code.
> > > > 
> > > > We've introduced DC to the community earlier in 2016 and
> > > > received a fair
> > > > amount of feedback. Some of what we've addressed so far are:
> > > > 
> > > >  * Self-contain ASIC specific code. We did a bunch of work to pull
> > > >    common sequences into dc/dce and leave ASIC specific code in
> > > >    separate folders.
> > > >  * Started to expose AUX and I2C through generic kernel/drm
> > > >    functionality and are mostly using that. Some of that code is still
> > > >    needlessly convoluted. This cleanup is in progress.
> > > >  * Integrated Dave and Jerome’s work on removing abstraction in bios
> > > >    parser.
> > > >  * Retire adapter service and asic capability
> > > >  * Remove some abstraction in GPIO
> > > > 
> > > > Since a lot of our code is shared with pre- and post-silicon validation
> > > > suites changes need to be done gradually to prevent breakages
> > > > due to a major
> > > > flag day.  This, coupled with adding support for new asics and
> > > > lots of new
> > > > feature introductions means progress has not been as quick as we
> > > > would have
> > > > liked. We have made a lot of progress none the less.
> > > > 
> > > > The remaining concerns that were brought up during the last
> > > > review that we
> > > > are working on addressing:
> > > > 
> > > >  * Continue to cleanup and reduce the abstractions in DC where it
> > > >    makes sense.
> > > >  * Removing duplicate code in I2C and AUX as we transition to using the
> > > >    DRM core interfaces.  We can't fully transition until we've helped
> > > >    fill in the gaps in the drm core that we need for certain features.
> > > >  * Making sure Atomic API support is correct.  Some of the semantics of
> > > >    the Atomic API were not particularly clear when we started this,
> > > >    however, that is improving a lot as the core drm documentation
> > > >    improves.  Getting this code upstream and in the hands of more
> > > >    atomic users will further help us identify and rectify any gaps we
> > > >    have.
> > > 
> > > Ok so I guess Dave is typing some more general comments about
> > > demidlayering, let me type some guidelines about atomic. Hopefully this
> > > all materializes itself a bit better into improved upstream docs,
> > > but meh.
> > > 
> > 
> > Excellent writeup. Let us know when/if you want our review for upstream
> > docs.
> > 
> > We'll have to really take some time to go over our atomic
> > implementation. A couple small comments below with regard to DC.
> > 
> > > Step 0: Prep
> > > 
> > > So atomic is transactional, but it's not validate + rollback or commit,
> > > but duplicate state, validate and then either throw away or commit.
> > > There's a few big reasons for this: a) partial atomic updates - if you
> > > duplicate it's much easier to check that you have all the right locks b)
> > > kfree() is much easier to check for correctness than a rollback code and
> > > c) atomic_check functions are much easier to audit for invalid
> > > changes to
> > > persistent state.
> > > 
> > 
> > There isn't really any rollback. I believe even in our other drivers
> > we've abandoned the rollback approach years ago because it doesn't
> > really work on modern HW. Any rollback cases you might find in DC should
> > really only be for catastrophic errors (read: something went horribly
> > wrong... read: congratulations, you just found a bug).
> > 
> There is no rollback.  We moved to "atomic" for Windows Vista in the
> previous DAL 8 years ago.  Windows only care about VidPnSource (frame
> buffer) and VidPnTarget (display output) and leave the rest up to driver but
> we had to behave atomic as Window obsolutely "check" every possible config
> with the famous EnumConfunctionalModality DDI.
> 
> > > Trouble is that this seems a bit unusual compared to all other
> > > approaches,
> > > and ime (from the drawn-out i915 conversion) you really don't want
> > > to mix
> > > things up. Ofc for private state you can roll back (e.g. vc4 does
> > > that for
> > > the drm_mm allocator thing for scanout slots or whatever it is), but
> > > it's
> > > trivial easy to accidentally check the wrong state or mix them up or
> > > something else bad.
> > > 
> > > Long story short, I think step 0 for DC is to split state from objects,
> > > i.e. for each dc_surface/foo/bar you need a
> > > dc_surface/foo/bar_state. And
> > > all the back-end functions need to take both the object and the state
> > > explicitly.
> > > 
> > > This is a bit a pain to do, but should be pretty much just
> > > mechanical. And
> > > imo not all of it needs to happen before DC lands in upstream, but see
> > > above imo that half-converted state is postively horrible. This should
> > > also not harm cross-os reuse at all, you can still store things together
> > > on os where that makes sense.
> > > 
> > > Guidelines for amdgpu atomic structures
> > > 
> > > drm atomic stores everything in state structs on plane/connector/crtc.
> > > This includes any property extensions or anything else really, the
> > > entire
> > > userspace abi is built on top of this. Non-trivial drivers are
> > > supposed to
> > > subclass these to store their own stuff, so e.g.
> > > 
> > > amdgpu_plane_state {
> > >     struct drm_plane_state base;
> > > 
> > >     /* amdgpu glue state and stuff that's linux-specific, e.g.
> > >      * property values and similar things. Note that there's strong
> > >      * push towards standardizing properties and stroing them in the
> > >      * drm_*_state structs. */
> > > 
> > >     struct dc_surface_state surface_state;
> > > 
> > >     /* other dc states that fit to a plane */
> > > };
> > > 
> Is there any requirement where the header and code that deal with
> dc_surface_state has to be?  Can we keep it under /dc while
> amdgpu_plane_state exist under /amdgpu_dm?

None. And my proposal here with having dc_*_state structures for the dc
block, and fairly separate amdgpu_*_state blocks to bind it into drm is
exaclty to facilitate this split, so dc_* stuff would still entirely live
in dc/ (and hopefully shared with everyone else), while amdgpu would be
the linux glue.
1
> > > Yes not everything will fit 1:1 in one of these, but to get started I
> > > strongly recommend to make them fit (maybe with reduced feature sets to
> > > start out). Stuff that is shared between e.g. planes, but always on the
> > > same crtc can be put into amdgpu_crtc_state, e.g. if you have
> > > scalers that
> > > are assignable to a plane.
> > > 
> > > Of course atomic also supports truly global resources, for that you need
> > > to subclass drm_atomic_state. Currently msm and i915 do that, and
> > > probably
> > > best to read those structures as examples until I've typed the docs.
> > > But I
> > > expect that especially for planes a few dc_*_state structs will stay in
> > > amdgpu_*_state.
> > > 
> We need to treat most of resource that don't map well as global. One example
> is pixel pll.  We have 6 display pipes but only 2 or 3 plls in CI/VI, as a
> result we are limited in number of HDMI or DVI we can drive at the same
> time.  Also the pixel pll can be used to drive DP as well, so there is
> another layer of HW specific but we can't really contain it in crtc or
> encoder by itself.  Doing this resource allocation require knowlege of the
> whole system, and knowning which pixel pll is already used, and what can we
> support with remaining pll.

Same on i915. Other stuff we currently treat as global are the overall
clocks&bandwidth/latency needs, plus all the fetch fifo settings and
latencies (because they depend in complicated ways on everything else).
But e.g. the fetch latency and bw needed for each plane is computed in the
plane check code.

Scalers otoh are per-crtc on intel.

> Another ask is lets say we are driving 2 displays, we would always want
> instance 0 and instance 1 of scaler, timing generator etc getting used.  We
> want to avoid possiblity of due to different user mode commit sequence we
> end up with driving the 2 display with 0 and 2nd instance of HW.  Not only
> this configuration isn't really validated in the lab, we will be less
> effective in power gating as instance 0 and 1 are one the same tile.
> instead of having 2/3 of processing pipeline silicon power gated we can only
> power gate 1/3. And if we power gate wrong the you will have 1 of the 2
> display not lighting up.

Just implement some bias in which shared resoures your prefer for which
crtc. Also note that with atomic you can always add more drm objeccts to
the commit. So if you've put yourself into a corner (for power
optimizatino reasons), but then userspace wants to light up more displays,
and you want to reassing the resources for the already enabled outputs
(e.g. when not all clocks are the same and only some support really high
clocks).

We do that on intel when we need to change the display core clock, since
that means recomputing everything (and you can't change the display core
clock while a display is on).

> Having HW resource used the same way on all platform under any sequence /
> circumstance is important for us, as power optimization/measure is done for
> given platform + display config mostly on only 1 OS by the HW team.

Yeah, should all be possible with atomic. And I think with some work it
should be possible to keep that selection logic for shared resources in
the shared code, even with this redesign.

> > > Guidelines for atomic_check
> > > 
> > > Please use the helpers as much as makes sense, and put at least the
> > > basic
> > > steps that from drm_*_state into the respective dc_*_state functional
> > > block into the helper callbacks for that object. I think basic
> > > validation
> > > of individal bits (as much as possible, e.g. if you just don't support
> > > e.g. scaling or rotation with certain pixel formats) should happen in
> > > there too. That way when we e.g. want to check how drivers corrently
> > > validate a given set of properties to be able to more strictly
> > > define the
> > > semantics, that code is easy to find.
> > > 
> > > Also I expect that this won't result in code duplication with other OS,
> > > you need code to map from drm to dc anyway, might as well
> > > check&reject the
> > > stuff that dc can't even represent right there.
> > > 
> > > The other reason is that the helpers are good guidelines for some of the
> > > semantics, e.g. it's mandatory that drm_crtc_needs_modeset gives the
> > > right
> > > answer after atomic_check. If it doesn't, then you're driver doesn't
> > > follow atomic. If you completely roll your own this becomes much
> > > harder to
> > > assure.
> > > 
> it doesn't today and we have equilvant check in dc in our hw_seq. We will
> look into how to make it work.  Our "atomic" operate on always knowing the
> current state (core_dc.current_ctx) and finding out the delta between the
> desired future state computed in our dc_validate.  One thing we were
> stuggling with it seems DRM is building up incremental state, ie. if
> something isn't mentioned in atomic_commit then you don't touch it.  We
> operate in a mode where if something isn't mentioned in dc_commit_target we
> disable those output.  this method allow us to always know current and
> future state, as future state is built up by caller (amdgpu), and we are
> able to transition into the future state on vsync boundary if required. It
> seems to me that drm_*_state require us to compartimentize states.  It won't
> be as trivial to fill the input for bandwidth_calc as that beast need
> everything as everything end up goes through the same memory controller.
> our validate_context is specifically design to make it easy to generate
> input parameter for bandwidth_calc.  Per pipe validate like pixel format,
> scaling is not a problem.

Hm, that needs to be fixed. Atom also gives you both old and new state,
but only for state objects which are changed. You can just go around and
add everything (see example above), but you should _only_ do that when
necessary. One design goal of atomic is that when you do a modeset on an
2nd display then page-flip should continue to work (assuming the hw can do
it) on the 1st display without any stalls. We have fine-grained locking
for this, but if you always need all the state then you defeat that point.

Of course this is a bit tricky if you have lots of complicated shared
state. The way we solve this is by pushing copies of relevant data from
planes/crtc down to the shared resources. This way you end up with a
read-only copy, and as long as those derived values don't change the
independtly running pageflip loop won't need to stall for your modeset.
And the modeset code can still look at the data, without grabbing the full
update lock.

This is probably going to be a bit of a rework, so for starters would make
sense to only aim to have parallel flips (without any modesets). That
still means you need to be careful with grabbing global states.

> > Interesting point. Not sure if we've checked that. Is there some sort of
> > automated test for this that we can use to check?
> > 
> > > Of course extend it all however you want, e.g. by adding all the global
> > > optimization and resource assignment stuff after initial per-object
> > > checking has been done using the helper infrastructure.
> > > 
> > > Guidelines for atomic_commit
> > > 
> > > Use the new nonblcoking helpers. Everyone who didn't got it wrong. Also,
> > 
> > I believe we're not using those and didn't start with those which might
> > explain (along with lack of discussion on dri-devel) why atomic
> > currently looks the way it does in DC. This is definitely one of the
> > bigger issues we'd want to clean up and where you wouldn't find much
> > pushback, other than us trying to find time to do it.
> > 
> > > your atomic_commit should pretty much match the helper one, except for a
> > > custom swap_state to handle all your globally shared specia dc_*_state
> > > objects. Everything hw specific should be in atomic_commit_tail.
> > > 
> > > Wrt the hw commit itself, for the modeset step just roll your own.
> > > That's
> > > the entire point of atomic, and atm both i915 and nouveau exploit this
> > > fully. Besides a bit of glue there shouldn't be much need for
> > > linux-specific code here - what you need is something to fish the right
> > > dc_*_state objects and give it your main sequencer functions. What you
> > > should make sure though is that only ever do a modeset when that was
> > > signalled, i.e. please use drm_crtc_needs_modeset to control that part.
> > > Feel free to wrap up in a dc_*_needs_modeset for better abstraction if
> > > that's needed.
> > > 
> Using state properly will solve our double resource assignment/validation
> problem during commit.  Thanks for the guidance on how to do this.
> 
> now the question is can we have a helper function to house the main sequence
> and put it in /dc?

Yeah, that's my proposal. You probably need some helper functions and
iterator macros on the overall dc_state (or amdgpu_state, whatever you
call it) so that your helper function can walk all the state objects
correctly. And only those which are part of the state (see above for why
this is important), but sharing that overall commit logic should be
possible.

> > > I do strongly suggest however that you implement the plane commit using
> > > the helpers. There's really only a few ways to implement this in the hw,
> > > and it should work everywhere.
> > > 
> Maybe from SW perspective I'll look at the intel code to understand this.
> In terms of HW I would have to say I disagree with that.  The multi-plane
> blend stuff even in our HW we have gone through 1 minor revision and 1 major
> change.  Also the same HW is build to handle stereo 3D, multi-plane
> blending, pipe spliting and more.  The pipeline / blending stuff tend to
> change in HW because HW need to constently design to meet the timing
> requirement of ever increasing pixel rate to keep us competitive. When HW
> can't meet timing they employ the split trick and have 2 copy of the same HW
> to be able to push through that many number of pixel.   If we are Intel and
> on latest process node then we probably won't have this problem.  I beat our
> 2018 HW will change again especially things are moving toward 64bpp FP16
> pixel format by default for HDR.

None of this matters for atomic multi-plane commit. There's about 3 ways
to do that:
- GO bit that you set to signal to the hw the new state that it should
  commit on the next vblank.
- vblank inhibit bit (works like GO inverted, but doesn't auto-clear).
- vblank evasion in software.

I haven't seen anything else in 20+ atomic drivers. None of this has
anything to do with the features your planes and blending engine support.
And the above is somewhat interesting to know because it matters for how
you send out the completion event for the atomic commit correctly, which
again is part of the uabi contract. Hence why I think it makes sense to
consider these strongly, even though they're fairly deeply nested - they
are again a part of the glue that binds DC into the linux world.

Cheers, Daniel


> > > Misc guidelines
> > > 
> > > Use the suspend/resume helpers. If your atomic can't do that, it's not
> > > terribly good. Also, if DC can't make those fit, it's probably still too
> > > much midlayer and its own world than helper library.
> > > 
> > 
> > Do they handle swapping DP displays while the system is asleep? If not
> > we'll probably need to add that. The other case where we have some
> > special handling has to do with headless (sleep or resume, don't
> > remember).
> > 
> > > Use all the legacy helpers, again your atomic should be able to pull it
> > > off. One exception is async plane flips (both primary and cursors),
> > > that's
> > > atm still unsolved. Probably best to keep the old code around for just
> > > that case (but redirect to the compat helpers for everything), see e.g.
> > > how vc4 implements cursors.
> > > 
> > 
> > Good old flip. There probably isn't much shareable code between OSes
> > here. It seems like every OS rolls there own thing, regarding flips. We
> > still seem to be revisiting flips regularly, especially with FreeSync
> > (adaptive sync) in the mix now. Good to know that this is still a bit of
> > an open topic.
> > 
> > > Most imporant of all
> > > 
> > > Ask questions on #dri-devel. amdgpu atomic is the only nontrivial atomic
> > > driver for which I don't remember a single discussion about some detail,
> > > at least not with any of the DAL folks. Michel&Alex asked some questions
> > > sometimes, but that indirection is bonghits and the defeats the point of
> > > upstream: Direct cross-vendor collaboration to get shit done. Please
> > > make
> > > it happen.
> > > 
> > 
> > Please keep asking us to get on dri-devel with questions. I need to get
> > into the habit again of leaving the IRC channel open. I think most of us
> > are still a bit scared of it or don't know how to deal with some of the
> > information overload (IRC and mailing list). It's some of my job to
> > change that all the while I'm learning this myself. :)
> > 
> > Thanks for all your effort trying to get people involved.
> > 
> > > Oh and I pretty much assume Harry&Tony are volunteered to review atomic
> > > docs ;-)
> > > 
> > 
> > Sure.
> > 
> > Cheers,
> > Harry
> > 
> > > Cheers, Daniel
> > > 
> > > 
> > > 
> > > > 
> > > > Unfortunately we cannot expose code for uGPU yet. However
> > > > refactor / cleanup
> > > > work on DC is public.  We're currently transitioning to a public patch
> > > > review. You can follow our progress on the amd-gfx mailing list.
> > > > We value
> > > > community feedback on our work.
> > > > 
> > > > As an appendix I've included a brief overview of the how the
> > > > code currently
> > > > works to make understanding and reviewing the code easier.
> > > > 
> > > > Prior discussions on DC:
> > > > 
> > > >  *
> > > > https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html
> > > >  *
> > > > https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html
> > > > 
> > > > 
> > > > Current version of DC:
> > > > 
> > > >  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
> > > > 
> > > > Once Alex pulls in the latest patches:
> > > > 
> > > >  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
> > > > 
> > > > Best Regards,
> > > > Harry
> > > > 
> > > > 
> > > > ************************************************
> > > > *** Appendix: A Day in the Life of a Modeset ***
> > > > ************************************************
> > > > 
> > > > Below is a high-level overview of a modeset with dc. Some of
> > > > this might be a
> > > > little out-of-date since it's based on my XDC presentation but
> > > > it should be
> > > > more-or-less the same.
> > > > 
> > > > amdgpu_dm_atomic_commit()
> > > > {
> > > >   /* setup atomic state */
> > > >   drm_atomic_helper_prepare_planes(dev, state);
> > > >   drm_atomic_helper_swap_state(dev, state);
> > > >   drm_atomic_helper_update_legacy_modeset_state(dev, state);
> > > > 
> > > >   /* create or remove targets */
> > > > 
> > > > /********************************************************************
> > > >    * *** Call into DC to commit targets with list of all known targets
> > > > ********************************************************************/
> > > >   /* DC is optimized not to do anything if 'targets' didn't change. */
> > > >   dc_commit_targets(dm->dc, commit_targets, commit_targets_count)
> > > >   {
> > > > /******************************************************************
> > > >      * *** Build context (function also used for validation)
> > > > ******************************************************************/
> > > >     result = core_dc->res_pool->funcs->validate_with_context(
> > > > core_dc,set,target_count,context);
> > > > 
> > > > /******************************************************************
> > > >      * *** Apply safe power state
> > > > ******************************************************************/
> > > >     pplib_apply_safe_state(core_dc);
> > > > 
> > > > /****************************************************************
> > > >      * *** Apply the context to HW (program HW)
> > > > ****************************************************************/
> > > >     result = core_dc->hwss.apply_ctx_to_hw(core_dc,context)
> > > >     {
> > > >       /* reset pipes that need reprogramming */
> > > >       /* disable pipe power gating */
> > > >       /* set safe watermarks */
> > > > 
> > > >       /* for all pipes with an attached stream */
> > > > /************************************************************
> > > >          * *** Programming all per-pipe contexts
> > > > ************************************************************/
> > > >         status = apply_single_controller_ctx_to_hw(...)
> > > >         {
> > > >           pipe_ctx->tg->funcs->set_blank(...);
> > > > pipe_ctx->clock_source->funcs->program_pix_clk(...);
> > > >           pipe_ctx->tg->funcs->program_timing(...);
> > > > pipe_ctx->mi->funcs->allocate_mem_input(...);
> > > >           pipe_ctx->tg->funcs->enable_crtc(...);
> > > >           bios_parser_crtc_source_select(...);
> > > > 
> > > > pipe_ctx->opp->funcs->opp_set_dyn_expansion(...);
> > > >           pipe_ctx->opp->funcs->opp_program_fmt(...);
> > > > 
> > > > stream->sink->link->link_enc->funcs->setup(...);
> > > > pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...);
> > > >           pipe_ctx->tg->funcs->set_blank_color(...);
> > > > 
> > > >           core_link_enable_stream(pipe_ctx);
> > > >           unblank_stream(pipe_ctx,
> > > > 
> > > >           program_scaler(dc, pipe_ctx);
> > > >         }
> > > >       /* program audio for all pipes */
> > > >       /* update watermarks */
> > > >     }
> > > > 
> > > >     program_timing_sync(core_dc, context);
> > > >     /* for all targets */
> > > >       target_enable_memory_requests(...);
> > > > 
> > > >     /* Update ASIC power states */
> > > >     pplib_apply_display_requirements(...);
> > > > 
> > > >     /* update surface or page flip */
> > > >   }
> > > > }
> > > > 
> > > > 
> > > > _______________________________________________
> > > > dri-devel mailing list
> > > > dri-devel@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > 
> 



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]       ` <2032d12b-f675-eb25-33bf-3aa0fcd20cb3-5C7GfCeVMHo@public.gmane.org>
@ 2016-12-13  8:33         ` Daniel Vetter
  0 siblings, 0 replies; 66+ messages in thread
From: Daniel Vetter @ 2016-12-13  8:33 UTC (permalink / raw)
  To: Harry Wentland
  Cc: Grodzovsky, Andrey, Cheng, Tony,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Daniel Vetter,
	Deucher, Alexander, Dave Airlie

On Mon, Dec 12, 2016 at 09:05:15PM -0500, Harry Wentland wrote:
> 
> On 2016-12-12 02:22 AM, Daniel Vetter wrote:
> > On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
> > > Current version of DC:
> > > 
> > >  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
> > > 
> > > Once Alex pulls in the latest patches:
> > > 
> > >  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
> > 
> > One more: That 4.7 here is going to be unbelievable amounts of pain for
> > you. Yes it's a totally sensible idea to just freeze your baseline kernel
> > because then linux looks a lot more like Windows where the driver abi is
> > frozen. But it makes following upstream entirely impossible, because
> > rebasing is always a pain and hence postponed. Which means you can't just
> > use the latest stuff in upstream drm, which means collaboration with
> > others and sharing bugfixes in core is a lot more pain, which then means
> > you do more than necessary in your own code and results in HALs like DAL,
> > perpetuating the entire mess.
> > 
> > So I think you don't just need to demidlayer DAL/DC, you also need to
> > demidlayer your development process. In our experience here at Intel that
> > needs continuous integration testing (in drm-tip), because even 1 month of
> > not resyncing with drm-next is sometimes way too long. See e.g. the
> > controlD regression we just had. And DAL is stuck on a 1 year old kernel,
> > so pretty much only of historical significance and otherwise dead code.
> > 
> > And then for any stuff which isn't upstream yet (like your internal
> > enabling, or DAL here, or our own internal enabling) you need continuous
> > rebasing&re-validation. When we started doing this years ago it was still
> > manually, but we still rebased like every few days to keep the pain down
> > and adjust continuously to upstream evolution. But then going to a
> > continous rebase bot that sends you mail when something goes wrong was
> > again a massive improvement.
> > 
> 
> I think we've seen that pain already but haven't quite realized how much of
> it is due to a mismatch in kernel trees. We're trying to move onto a tree
> following drm-next much more closely. I'd love to help automate some of that
> (time permitting). Would the drm-misc scripts be of any use with that? I
> only had a very cursory glance at those.

I've offered to Alex that we could include the amd trees (only stuff ready
for pull request) into drm-tip for continuous integration testing at
least. Would mean that Alex needs to use dim when updating those branches,
and you CI needs to test drm-tip (and do that everytime it changes, i.e.
really continuously).

For continues rebasing there's no ready-made thing public, but I highly
recommend you use one of the patch pile tools. In intel we have a glue of
quilt + tracking quilt state with git, implemented in the qf script in
maintainer-tools. That one has a lot more sharp edges than dim, but it
gets the job done. And the combination of git track + raw patch file for
seding is very powerful for rebasing.

Long term I'm hopefully that git series will become the new shiny, since
Josh Tripplet really understands the use-cases of having long-term
rebasing trees which are collaboratively maintainer. It's a lot nicer than
qf, but can't yet do everything we need (and likely what you'll need to be
able to rebase DC without going crazy).

> > I guess in the end Conway's law that your software architecture
> > necessarily reflects how you organize your teams applies again. Fix your
> > process and it'll become glaringly obvious to everyone involved that
> > DC-the-design as-is is entirely unworkeable and how it needs to be fixed.
> > 
> > From my own experience over the past few years: Doing that is a fun
> > journey ;-)
> > 
> 
> Absolutely. We're only at the start of this but have learned a lot from the
> community (maybe others in the DC team disagree with me somewhat).
> 
> Not sure if I fully agree that this means that DC-the-design-as-is will
> become apparent as unworkable... There are definitely pieces to be cleaned
> here and lessons learned from the DRM community but on the other hand we
> feel there are some good reasons behind our approach that we'd like to share
> with the community (some of which I'm learning myself).

Tony asking what the difference between a midlayer and a helper library is
is imo a good indicator that there's still learning to do in the team ;-)

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-13  7:30             ` Dave Airlie
@ 2016-12-13  9:14               ` Cheng, Tony
  0 siblings, 0 replies; 66+ messages in thread
From: Cheng, Tony @ 2016-12-13  9:14 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Grodzovsky, Andrey, amd-gfx mailing list, dri-devel, Deucher, Alexander



On 12/13/2016 2:30 AM, Dave Airlie wrote:
> (hit send too early)
>> We would love to upstream DC for all supported asic!  We made enough change
>> to make Sea Island work but it's really not validate to the extend we
>> validate Polaris on linux and no where close to what we do for 2017 ASICs.
>> With DC the display hardware programming, resource optimization, power
>> management and interaction with rest of system will be fully validated
>> across multiple OSs.  Therefore we have high confidence that the quality is
>> going to better than what we have upstreammed today.
>>
>> I don't have a baseline to say if DC is in good enough quality for older
>> generation compare to upstream.  For example we don't have HW generate
>> bandwidth_calc for DCE 8/10 (Sea/Vocanic island family) but our code is
>> structured in a way that we assume bandwidth_calc is there.  None of us feel
>> like go untangle the formulas in windows driver at this point to create our
>> own version of bandwidth_calc.  It sort of work with HW default values but
>> some mode / config is likely to underflows.  If community is okay with
>> uncertain quality, sure we would love to upstream everything to reduce our
>> maintaince overhead.  You do get audio with DC on DCE8 though.
> If we get any of this upstream, we should get all of the hw supported with it.
>
> If it regresses we just need someone to debug why.
great will do.
>
>> Maybe let me share what we are doing and see if we can come up with
>> something to make DC work for both upstream and our internal need.  We are
>> sharing code not just on Linux and we will do our best to make our code
>> upstream friendly.  Last year we focussed on having enough code to prove
>> that our DAL rewrite works and get more people contributing to it.  We rush
>> a bit as a result we had a few legacy component we port from Windows driver
>> and we know it's bloat that needed to go.
>>
>> We designed DC so HW can contribute bandwidth_calc magic and psuedo code to
>> program the HW blocks.  The HW blocks on the bottom of DC.JPG in models our
>> HW blocks and the programming sequence are provided by HW engineers.  If a
>> piece of HW need a bit toggled 7 times during power up I rather have HW
>> engineer put that in their psedo code rather than me trying to find that
>> sequence in some document.  Afterall they did simulate the HW with the
>> toggle sequence.  I guess these are back-end code Daniel talked about.  Can
>> we agree that DRM core is not interested in how things are done in that
>> layer and we can upstream these as it?
>>
>> The next is dce_hwseq.c to program the HW blocks in correct sequence.  Some
>> HW block can be programmed in any sequence, but some requires strict
>> sequence to be followed.  For example Display CLK and PHY CLK need to be up
>> before we enable timing generator.  I would like these sequence to remain in
>> DC as it's really not DRM's business to know how to program the HW.  In a
>> way you can consider hwseq as a helper to commit state to HW.
>>
>> Above hwseq is the dce*_resource.c.  It's job is to come up with the HW
>> state required to realize given config.  For example we would use the exact
>> same HW resources with same optimization setting to drive any same given
>> config.  If 4 x 4k@60 is supported with resource setting A on HW diagnositc
>> suite during bring up setting B on Linux then we have a problem.  It know
>> which HW block work with which block and their capability and limitations.
>> I hope you are not asking this stuff to move up to core because in reality
>> we should probably hide this in some FW, as HW expose the register to config
>> them differently that doesn't mean all combination of HW usage is validated.
>> To me resource is more of a helper to put together functional pipeline and
>> does not make any decision that any OS might be interested in.
>>
>> These yellow boxes in DC.JPG are really specific to each generation of HW
>> and changes frequently.  These are things that HW has consider hiding it in
>> FW before.  Can we agree on those code (under /dc/dce*) can stay?
> I think most of these things are fine to be part of the solution we end up at,
> but I can't say for certain they won't require interface changes. I think the
> most useful code is probably the stuff in the dce subdirectories.
okay as long as we can agree on this piece stay I am sure we can make it 
work.
>
>> Is this about demonstration how basic functionality work and add more
>> features with series of patches to make review eaiser?  If so I don't think
>> we are staff to do this kind of rewrite.  For example it make no sense to
>> hooking up bandwidth_calc to calculate HW magic if we don't have mem_input
>> to program the memory settings.  We need portion of hw_seq to ensure these
>> blocks are programming in correct sequence.  We will need to feed
>> bandwidth_calc it's required inputs, which is basically the whole system
>> state tracked in validate_context today, which means we basically need big
>> bulk of resource.c.  This effort might have benefit in reviewing the code,
>> but we will end up with pretty much similar if not the same as what we
>> already have.
> This is something people always say, I'm betting you won't end up there at all,
> it's not just review, it's incremental development model, so that when things
> go wrong we can pinpoint why and where a lot easier. Just merging this all in
> one fell swoop is going to just mean a lot of pain in the end. I understand you
> aren't resourced for this sort of development on this codebase, but it's going
> to be an impasse to try and merge this all at once even if was clean code.
how is it going to work then?  we can merge hardware programming code 
(the hw objects under /dc/dce) without anyone call it?
>
>> Or is the objection that we have the white boxes in DC.JPG instead of using
>> DRM objects?  We can probably workout something to have the white boxes
>> derive from DRM objects and extend atomic state with our validate_context
>> where dce*_resource.c stores the constructed pipelines.
> I think Daniel explained quite well how things should look in terms of
> subclassing.
okay we will look into how to do it.  this definitely won't happen over 
night as we need to get clear on what to do first and look at how other 
driver does it.  As per Harry's RFC (last still to do item), we plan to 
work on atomic anyways this expanded the scope of that a bit.
>
>> 5) Why is a midlayer bad?
>> I'm not going to go into specifics on the DC midlayer, but we abhor
>> midlayers for a fair few reasons. The main reason I find causes the
>> most issues is locking. When you have breaks in code flow between
>> multiple layers, but having layers calling back into previous layers
>> it becomes near impossible to track who owns the locking and what the
>> current locking state is.
>>
>> Consider
>>      drma -> dca -> dcb -> drmb
>>      drmc -> dcc  -> dcb -> drmb
>>
>> We have two codes paths that go back into drmb, now maybe drma has a
>> lock taken, but drmc doesn't, but we've no indication when we hit drmb
>> of what the context pre entering the DC layer is. This causes all
>> kinds of problems. The main requirement is the driver maintains the
>> execution flow as much as possible. The only callback behaviour should
>> be from an irq or workqueue type situations where you've handed
>> execution flow to the hardware to do something and it is getting back
>> to you. The pattern we use to get our of this sort of hole is helper
>> libraries, we structure code as much as possible as leaf nodes that
>> don't call back into the parents if we can avoid it (we don't always
>> succeed).
>>
>> Okay.  by the way DC does behave like a helper for most part.  There is no
>> locking in DC.  We work enough with different OS to know they all have
>> different synchronization primatives and interrupt handling and have DC lock
>> anything is just shooting ourself in the foot.  We do have function with
>> lock in their function name in DC but those are HW register lock to ensure
>> that the HW register update atomically. ie have 50 register write latch in
>> HW at next vsync to ensure HW state change on vsync boundary.
>>
>> So the above might becomes
>>     drma-> dca_helper
>>             -> dcb_helper
>>             -> drmb.
>>
>> In this case the code flow is controlled by drma, dca/dcb might be
>> modifying data or setting hw state but when we get to drmb it's easy
>> to see what data is needs and what locking.
>>
>> DAL/DC goes against this in so many ways, and when I look at the code
>> I'm never sure where to even start pulling the thread to unravel it.
>>
>> I don't know where we go against it.  In the case we do callback to DRM for
>> MST case we have
>>
>> amdgpu_dm_atomic_commit (implement atomic_commit)
>> dc_commit_targets (commit helper)
>> dce110_apply_ctx_to_hw (hw_seq)
>> core_link_enable_stream (part of MST enable sequence)
>> allocate_mst_payload (helper for above func in same file)
>> dm_helpers_dp_mst_write_payload_allocation_table (glue code to call DRM)
>> drm_dp_mst_allocate_vcpi (DRM)
>>
>> As you see even in this case we are only 6 level deep before we callback to
>> DRM, and 2 of those functions are in same file as helper func of the bigger
>> sequence.
>>
>> Can you clarify the distinction between what you would call a mid layer vs
>> helper.  We consulted Alex a lot and we know about this inversion of control
>> pattern and we are trying our best to do it.  Is it the way functions are
>> named and files folder structure?  Would it help if we flatten
>> amdgpu_dm_atomic_commit and dc_commit_targets?  Even if we do I would
>> imagine we want some helper in commit rather a giant 1000 line function.  Is
>> there any concern that we put dc_commit_targets under /dc folder as we want
>> other platform to run exact same helper?  Or this is about the state
>> dc_commit_targets is too big?  or the state is stored validate_context
>> rather than drm_atomic_state?
> Well one area I hit today while looking, is trace the path for a dpcd
> read or write.
>
> An internal one in the dc layer goes
>
> core_link_dpcd_read (core_link)
> dm_helpers_dp_read_dpcd(context, dc_link)
>    search connector list for the appropriate connector
>    drm_dp_dpcd_read
>
> Note the connector list searching, this is a case of where you have called
> back into the toplevel driver without the info necessary because core_link
> and dc_link are too far abstracted from the drm connector.
> (get_connector_for_link is a bad idea)
>
> Then we get back around through the aux stuff and end up at:
> dc_read_dpcd which passes connector->dc_link->link_index down
>     this look up the dc_link again in core_dc->links[index]
> dal_ddc_service_read_dpcd_data(link->ddc)
> which calls into the i2caux path.
>
> This is not helper functions or anything close, this is layering hell.
As per Harry's RFC (2nd still todo item), we are working on switching 
fully to use I2c / aux code provided by DRM.  we just haven't got there 
yet.  I would think this part would look similar to how MST part look 
once we cleaned it up.  By the way anything with dal_* in the name are 
stuff we ported to get us up an running quickly.  dal_* has no business 
in dc and will be removed.
>> I don't think it make sense for DRM to get into how we decide to use our HW
>> blocks.  For example any refactor done in core should not result in us using
>> different pipeline to drive the same config.  We would like to have control
>> over how our HW pipeline is constructed.
> I don't think the DRM wants to get involved at that level, but it would be good
> if we could collapse the mountains of functions and layers so that you can
> clearly see how a modeset happens all the way down to the hw in a linear
> fashion.
there is really not that many layers.  if you look at MST example we 
will hit registers in 5 level.

amdgpu_dm_atomic_commit
dc_commit_targets
dce110_apply_ctx_to_hw
core_link_enable_stream
allocate_mst_payload (same as above drm callback exmaple)
dce110_stream_encoder_set_mst_bandwidth
REG_SET

>
>> How do you plan on dealing with people rewriting or removing code
>> upstream that is redundant in the kernel, but required for internal
>> stuff?
>>
>>
>> Honestly I don't know what these are.  Like you and Jerome remove func ptr
>> abstraction (I know it was bad, that was one of the component we ported from
>> windows) and we need to keep it as function pointer so we can still run our
>> code on FPGA before we see first silicon?  I don't think if we nak the
>> function ptr removal will be a problem for community.  The rest is valued
>> and we took with open arm.
>>
>> Or this is more like we have code duplication after DRM added some
>> functionality we can use?  I would imaging its more of moving what we got
>> working in our code to DRM core if we are upstreamed and we have no problem
>> accomodate for that as the code moved out to DRM core can be included in
>> other platforms.  We don't have any private ioctl today and we don't plan to
>> have any outside of using DRM object properties.
> I've just sent some patches to remove a bunch of dpcd defines, that is just
> one small example.
All of them are great for us to merge except patch 1/8. dc: remove dc 
hub.  As you might have guess that function is for ASIC currently in the 
lab.  Maybe we should have sanitized it with #ifdef and not have it 
visiable upstream in the first place.
>
>> I really don't know what those new linux things can be that could cause us
>> problem.  If anything the new things will be probably come from us if we are
>> upstreammed.
> But until then there will be competing development upstream, and you might
> want to merge things.
Maybe.  Somehow my gut feel is either we will either have those new 
things in demo-able shape before competing development start like 
FreeSync or it's something everybody care about and need SW ecosystem 
support like HDR.  In HDR case we are more than happen to participate up 
front.
>
>> DP MST:  AMD was the first source certified and we work closely with the
>> first branch certified. I was a part of that team and we had a very solid
>> implementation.  If we were upstreamed I don't see you would want to
>> reinvent the wheel and not try to massage what we have into shape for DRM
>> core for other driver to reuse.
> Definitely, I hate writing MST code, and it would have been good if someone else
> had gotten to it first.
>
> So I think after looking more at it, my major issue is with DC, the
> core stuff,
let me clarify you mean stuff under /dc/core?  I think there is a path 
for us to have those subclass DRM objects.
> not the hw
> touching stuff, but the layering stuff, dc and core infrastructure in
> a lot of places
> calls into the the DM layer and back into itself. It's a bit of an
> tangle to pull any
> one thread of it and try to unravel it.
I think we only have dpcd/i2c left, which we said we will fix.
>
> There also seems to be a fair lot of headers of questionable value, I've found
> the same set of defines (or pretty close ones) in a few headers,

redundant header will go.  just we need to spend time to go through 
them.  most of them are leftover from dal port.
> conversion functions
> between different layer definitions etc. There are redundant header
> files, unused structs, structs
> of questionable value or structs that should be merged.
>
> Stuff is hidden between dc and core structs, but it isn't always obvious why
> stuff is in dc_link vs core_link. Ideally we'd lose some of that layering.
okay, we will probably end up subclassing for DRM object anyways.
>
> Also things like loggers and fixed function calculators, and vector
> code probably need to be bumped
> up a layer or two or made sure to be completely generic, and put
> outside the DC code, if code is
> in amd/display dir it should be display code.

stuff under dc/basic are quick ports to get us going and you probably 
already notice we don't use some of them.   By the way is there any 
problem using float to do our bandwidth_calc?  we can safe/restore 
context on x86.  bandwidth_calc will run a lot faster and we can make it 
somewhat readible and get rid of fixpt31_32.c.
>
> I'm going to be happily ignoring most of this until early next year at
> this point (I might jump in/out a few times)
> but I think Daniel and Alex have a pretty good handle on where this
> code should be going to get upstream, I think we should
> all be listening to them as much as possible.
>
> Dave.

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-13  2:52     ` Cheng, Tony
       [not found]       ` <5a1f2762-f1e0-05f1-3c16-173cb1f46571-5C7GfCeVMHo@public.gmane.org>
@ 2016-12-13  9:40       ` Lukas Wunner
       [not found]         ` <20161213094035.GA10916-JFq808J9C/izQB+pC5nmwQ@public.gmane.org>
  1 sibling, 1 reply; 66+ messages in thread
From: Lukas Wunner @ 2016-12-13  9:40 UTC (permalink / raw)
  To: Cheng, Tony
  Cc: Grodzovsky, Andrey, dri-devel, amd-gfx mailing list, Deucher, Alexander

On Mon, Dec 12, 2016 at 09:52:08PM -0500, Cheng, Tony wrote:
> With DC the display hardware programming, resource optimization, power
> management and interaction with rest of system will be fully validated
> across multiple OSs.

Do I understand DAL3.jpg correctly that the macOS driver builds on top
of DAL Core?  I'm asking because the graphics drivers shipping with
macOS as well as on Apple's EFI Firmware Volume are closed source.

If the Linux community contributes to DC, I guess those contributions
can generally be assumed to be GPLv2 licensed.  Yet a future version
of the macOS driver would incorporate those contributions in the same
binary as their closed source OS-specific portion.

I don't quite see how that would be legal but maybe I'm missing
something.

Presumably the situation with the Windows driver is the same.

I guess you could maintain a separate branch sans community contributions
which would serve as a basis for closed source drivers, but not sure if
that is feasible given your resource constraints.

Thanks,

Lukas
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]       ` <b64d0072-4909-c680-2f09-adae9f856642-5C7GfCeVMHo@public.gmane.org>
  2016-12-13  4:10         ` Cheng, Tony
  2016-12-13  7:31         ` Daniel Vetter
@ 2016-12-13 10:09         ` Ernst Sjöstrand
  2 siblings, 0 replies; 66+ messages in thread
From: Ernst Sjöstrand @ 2016-12-13 10:09 UTC (permalink / raw)
  To: Harry Wentland
  Cc: Grodzovsky, Andrey, Cheng, Tony,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, amd-gfx mailing list,
	Daniel Vetter, Deucher, Alexander, Dave Airlie


[-- Attachment #1.1: Type: text/plain, Size: 563 bytes --]

2016-12-13 3:33 GMT+01:00 Harry Wentland <harry.wentland-5C7GfCeVMHo@public.gmane.org>:

Please keep asking us to get on dri-devel with questions. I need to get
> into the habit again of leaving the IRC channel open. I think most of us
> are still a bit scared of it or don't know how to deal with some of the
> information overload (IRC and mailing list). It's some of my job to change
> that all the while I'm learning this myself. :)
>

https://www.irccloud.com/ is pretty nice if you're not the
keep-irssi-running-in-screen-on-a-server type.

Regards
//Ernst

[-- Attachment #1.2: Type: text/html, Size: 1023 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]         ` <634f5374-027a-6ec9-41a5-64351c4f7eac-5C7GfCeVMHo@public.gmane.org>
@ 2016-12-13 12:22           ` Daniel Stone
  2016-12-13 12:59             ` Daniel Vetter
       [not found]             ` <CAPj87rNrwsfAR75138WDQPbti_BmS_D-NxESZ075obcjO3T04g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 2 replies; 66+ messages in thread
From: Daniel Stone @ 2016-12-13 12:22 UTC (permalink / raw)
  To: Harry Wentland
  Cc: Grodzovsky, Andrey, Dave Airlie, dri-devel, amd-gfx mailing list,
	Deucher, Alexander, Cheng, Tony

Hi Harry,
I've been loathe to jump in here, not least because both cop roles
seem to be taken, but ...

On 13 December 2016 at 01:49, Harry Wentland <harry.wentland@amd.com> wrote:
> On 2016-12-11 09:57 PM, Dave Airlie wrote:
>> On 8 December 2016 at 12:02, Harry Wentland <harry.wentland@amd.com> wrote:
>> Sharing code is a laudable goal and I appreciate the resourcing
>> constraints that led us to the point at which we find ourselves, but
>> the way forward involves finding resources to upstream this code,
>> dedicated people (even one person) who can spend time on a day by day
>> basis talking to people in the open and working upstream, improving
>> other pieces of the drm as they go, reading atomic patches and
>> reviewing them, and can incrementally build the DC experience on top
>> of the Linux kernel infrastructure. Then having the corresponding
>> changes in the DC codebase happen internally to correspond to how the
>> kernel code ends up looking. Lots of this code overlaps with stuff the
>> drm already does, lots of is stuff the drm should be doing, so patches
>> to the drm should be sent instead.
>
> Personally I'm with you on this and hope to get us there. I'm learning...
> we're learning. I agree that changes on atomic, removing abstractions, etc.
> should happen on dri-devel.
>
> When it comes to brand-new technologies (MST, Freesync), though, we're often
> the first which means that we're spending a considerable amount of time to
> get things right, working with HW teams, receiver vendors and other partners
> internal and external to AMD. By the time we do get it right it's time to
> hit the market. This gives us fairly little leeway to work with the
> community on patches that won't land in distros for another half a year.
> We're definitely hoping to improve some of this but it's not easy and in
> some case impossible ahead of time (though definitely possibly after initial
> release).

Speaking with my Wayland hat on, I think these need to be very
carefully considered. Both MST and FreeSync have _significant_ UABI
implications, which may not be immediately obvious when working with a
single implementation. Having them working and validated with a
vertical stack of amdgpu-DC/KMS + xf86-video-amdgpu +
Mesa-amdgpu/AMDGPU-Pro is one thing, but looking outside the X11 world
we now have Weston, Mutter and KWin all directly driving KMS, plus
whatever Mir/Unity ends up doing (presumably the same), and that's
just on the desktop. Beyond the desktop, there's also CrOS/Freon and
Android/HWC. For better or worse, outside of Xorg and HWC, we no
longer have a vendor-provided userspace component driving KMS.

It was also easy to get away with loose semantics before with X11
imposing little to no structure on rendering, but we now have the twin
requirements of an atomic and timing-precise ABI - see Mario Kleiner's
unending quest for accuracy - and also a vendor-independent ABI. So a
good part of the (not insignificant) pain incurred in the atomic
transition for drivers, was in fact making those drivers conform to
the expectations of the KMS UABI contract, which just happened to not
have been tripped over previously.

Speaking with my Collabora hat on now: we did do a substantial amount
of demidlayering on the Exynos driver, including an atomic conversion,
on Google's behalf. The original Exynos driver happened to work with
the Tizen stack, but ChromeOS exposed a huge amount of subtle
behaviour differences between that and other drivers when using Freon.
We'd also hit the same issues when attempting to use Weston on Exynos
in embedded devices for OEMs we worked with, so took on the project to
remove the midlayer and have as much as possible driven from generic
code.

How the hardware is programmed is of course ultimately up to you, and
in this regard AMD will be very different from Intel is very different
from Nouveau is very different from Rockchip. But especially for new
features like FreeSync, I think we need to be very conscious of
walking the line between getting those features in early, and setting
unworkable UABI in stone. It would be unfortunate if later on down the
line, you had to choose between breaking older xf86-video-amdgpu
userspace which depended on specific behaviours of the amdgpu kernel
driver, or breaking the expectations of generic userspace such as
Weston/Mutter/etc.

One good way to make sure you don't get into that position, is to have
core KMS code driving as much of the machinery as possible, with a
very clear separation of concerns between actual hardware programming,
versus things which may be visible to userspace. This I think is
DanielV's point expressed at much greater length. ;)

I should be clear though that this isn't unique to AMD, nor a problem
of your creation. For example, I'm currently looking at a flip-timing
issue in Rockchip - a fairly small, recent, atomic-native, and
generally exemplary driver - which I'm pretty sure is going to be
resolved by deleting more driver code and using more of the helpers!
Probably one of the reasons why KMS has been lagging behind in
capability for a while (as Alex noted), is that even the basic ABI was
utterly incoherent between drivers. The magnitude of the sea change
that's taken place in KMS lately isn't always obvious to the outside
world: the actual atomic modesetting API is just the cherry on top,
rather than the most drastic change, which is the coherent
driver-independent core machinery.

Cheers,
Daniel
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-13 12:22           ` Daniel Stone
@ 2016-12-13 12:59             ` Daniel Vetter
       [not found]               ` <20161213125953.zczaojxp37yg6a6f-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
       [not found]             ` <CAPj87rNrwsfAR75138WDQPbti_BmS_D-NxESZ075obcjO3T04g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 66+ messages in thread
From: Daniel Vetter @ 2016-12-13 12:59 UTC (permalink / raw)
  To: Daniel Stone
  Cc: Grodzovsky, Andrey, amd-gfx mailing list, dri-devel, Deucher,
	Alexander, Cheng, Tony

On Tue, Dec 13, 2016 at 12:22:59PM +0000, Daniel Stone wrote:
> Hi Harry,
> I've been loathe to jump in here, not least because both cop roles
> seem to be taken, but ...
> 
> On 13 December 2016 at 01:49, Harry Wentland <harry.wentland@amd.com> wrote:
> > On 2016-12-11 09:57 PM, Dave Airlie wrote:
> >> On 8 December 2016 at 12:02, Harry Wentland <harry.wentland@amd.com> wrote:
> >> Sharing code is a laudable goal and I appreciate the resourcing
> >> constraints that led us to the point at which we find ourselves, but
> >> the way forward involves finding resources to upstream this code,
> >> dedicated people (even one person) who can spend time on a day by day
> >> basis talking to people in the open and working upstream, improving
> >> other pieces of the drm as they go, reading atomic patches and
> >> reviewing them, and can incrementally build the DC experience on top
> >> of the Linux kernel infrastructure. Then having the corresponding
> >> changes in the DC codebase happen internally to correspond to how the
> >> kernel code ends up looking. Lots of this code overlaps with stuff the
> >> drm already does, lots of is stuff the drm should be doing, so patches
> >> to the drm should be sent instead.
> >
> > Personally I'm with you on this and hope to get us there. I'm learning...
> > we're learning. I agree that changes on atomic, removing abstractions, etc.
> > should happen on dri-devel.
> >
> > When it comes to brand-new technologies (MST, Freesync), though, we're often
> > the first which means that we're spending a considerable amount of time to
> > get things right, working with HW teams, receiver vendors and other partners
> > internal and external to AMD. By the time we do get it right it's time to
> > hit the market. This gives us fairly little leeway to work with the
> > community on patches that won't land in distros for another half a year.
> > We're definitely hoping to improve some of this but it's not easy and in
> > some case impossible ahead of time (though definitely possibly after initial
> > release).
> 
> Speaking with my Wayland hat on, I think these need to be very
> carefully considered. Both MST and FreeSync have _significant_ UABI
> implications, which may not be immediately obvious when working with a
> single implementation. Having them working and validated with a
> vertical stack of amdgpu-DC/KMS + xf86-video-amdgpu +
> Mesa-amdgpu/AMDGPU-Pro is one thing, but looking outside the X11 world
> we now have Weston, Mutter and KWin all directly driving KMS, plus
> whatever Mir/Unity ends up doing (presumably the same), and that's
> just on the desktop. Beyond the desktop, there's also CrOS/Freon and
> Android/HWC. For better or worse, outside of Xorg and HWC, we no
> longer have a vendor-provided userspace component driving KMS.
> 
> It was also easy to get away with loose semantics before with X11
> imposing little to no structure on rendering, but we now have the twin
> requirements of an atomic and timing-precise ABI - see Mario Kleiner's
> unending quest for accuracy - and also a vendor-independent ABI. So a
> good part of the (not insignificant) pain incurred in the atomic
> transition for drivers, was in fact making those drivers conform to
> the expectations of the KMS UABI contract, which just happened to not
> have been tripped over previously.
> 
> Speaking with my Collabora hat on now: we did do a substantial amount
> of demidlayering on the Exynos driver, including an atomic conversion,
> on Google's behalf. The original Exynos driver happened to work with
> the Tizen stack, but ChromeOS exposed a huge amount of subtle
> behaviour differences between that and other drivers when using Freon.
> We'd also hit the same issues when attempting to use Weston on Exynos
> in embedded devices for OEMs we worked with, so took on the project to
> remove the midlayer and have as much as possible driven from generic
> code.
> 
> How the hardware is programmed is of course ultimately up to you, and
> in this regard AMD will be very different from Intel is very different
> from Nouveau is very different from Rockchip. But especially for new
> features like FreeSync, I think we need to be very conscious of
> walking the line between getting those features in early, and setting
> unworkable UABI in stone. It would be unfortunate if later on down the
> line, you had to choose between breaking older xf86-video-amdgpu
> userspace which depended on specific behaviours of the amdgpu kernel
> driver, or breaking the expectations of generic userspace such as
> Weston/Mutter/etc.
> 
> One good way to make sure you don't get into that position, is to have
> core KMS code driving as much of the machinery as possible, with a
> very clear separation of concerns between actual hardware programming,
> versus things which may be visible to userspace. This I think is
> DanielV's point expressed at much greater length. ;)
> 
> I should be clear though that this isn't unique to AMD, nor a problem
> of your creation. For example, I'm currently looking at a flip-timing
> issue in Rockchip - a fairly small, recent, atomic-native, and
> generally exemplary driver - which I'm pretty sure is going to be
> resolved by deleting more driver code and using more of the helpers!
> Probably one of the reasons why KMS has been lagging behind in
> capability for a while (as Alex noted), is that even the basic ABI was
> utterly incoherent between drivers. The magnitude of the sea change
> that's taken place in KMS lately isn't always obvious to the outside
> world: the actual atomic modesetting API is just the cherry on top,
> rather than the most drastic change, which is the coherent
> driver-independent core machinery.

+1 on everything Daniel said here. And I'm a bit worried that AMD is not
realizing what's going on here, given that Michel called the plan that
most everything will switch over to a generic kms userspace a "pipe
dream". It's happening, and in a few years I expect the only amd-specific
userspace left and still shipping will be amdgpu-pro for
enterprise/workstation customers.

In the end AMD missing that seems just another case of designing something
pretty inhouse and entirely missing to synchronize with the community and
what's going on outside of AMD.

And for freesync specifically I agree with Daniel that enabling this only
in -amdgpu gives us a very high chance of ending up with something that
doesn't work elsewhere. Or is at least badly underspecified, and then
tears and blodshed ensues when someone else enables things. At intel we've
already stopped enabling kms features only in -intel, and instead using
weston, -modesetting or drm_hwcomposer as userspace demonstration vehicles
for new stuff. And I'll be pushing everyone else into that direction, too.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]           ` <3219f6f2-080e-0c77-45bb-cf59aa5b5858-5C7GfCeVMHo@public.gmane.org>
  2016-12-13  7:30             ` Dave Airlie
@ 2016-12-13 14:59             ` Rob Clark
  1 sibling, 0 replies; 66+ messages in thread
From: Rob Clark @ 2016-12-13 14:59 UTC (permalink / raw)
  To: Cheng, Tony
  Cc: Grodzovsky, Andrey, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Daniel Vetter, Deucher,
	Alexander, Harry Wentland

On Mon, Dec 12, 2016 at 11:10 PM, Cheng, Tony <tony.cheng@amd.com> wrote:
> We need to treat most of resource that don't map well as global. One example
> is pixel pll.  We have 6 display pipes but only 2 or 3 plls in CI/VI, as a
> result we are limited in number of HDMI or DVI we can drive at the same
> time.  Also the pixel pll can be used to drive DP as well, so there is
> another layer of HW specific but we can't really contain it in crtc or
> encoder by itself.  Doing this resource allocation require knowlege of the
> whole system, and knowning which pixel pll is already used, and what can we
> support with remaining pll.
>
> Another ask is lets say we are driving 2 displays, we would always want
> instance 0 and instance 1 of scaler, timing generator etc getting used.  We
> want to avoid possiblity of due to different user mode commit sequence we
> end up with driving the 2 display with 0 and 2nd instance of HW.  Not only
> this configuration isn't really validated in the lab, we will be less
> effective in power gating as instance 0 and 1 are one the same tile.
> instead of having 2/3 of processing pipeline silicon power gated we can only
> power gate 1/3. And if we power gate wrong the you will have 1 of the 2
> display not lighting up.

Note that as of 4.10, drm/msm/mdp5 is dynamically assigning hwpipes to
planes tracked as part of the driver's global atomic state.  (And for
future hw we will need to dynamically assign layermixers to crtc's).
I'm also using global state for allocating SMP (basically fifo)
blocks.  And drm/i915 is also using global atomic state for shared
resources.

Dynamic assignment of hw resources to kms objects is not a problem,
and the locking model in atomic allows for this.  (I introduced one
new global modeset_lock to protect the global state, so only multiple
parallel updates which both touch shared state will serialize)

BR,
-R
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]         ` <20161213094035.GA10916-JFq808J9C/izQB+pC5nmwQ@public.gmane.org>
@ 2016-12-13 15:03           ` Cheng, Tony
  2016-12-13 15:09             ` Deucher, Alexander
                               ` (2 more replies)
  2016-12-13 16:14           ` Bridgman, John
  1 sibling, 3 replies; 66+ messages in thread
From: Cheng, Tony @ 2016-12-13 15:03 UTC (permalink / raw)
  To: Lukas Wunner, John
  Cc: Grodzovsky, Andrey, Dave Airlie, dri-devel, amd-gfx mailing list,
	Deucher, Alexander, Harry Wentland


[-- Attachment #1.1: Type: text/plain, Size: 2098 bytes --]

Only DM that’s open source is amdgpu_dm.the rest will remain closed 
source.I remember we had discussion around legal issues with our grand 
plan of unifying everything, and I remember maybe it was John who 
assured us that it's okay.John can you chime in how it would work with 
GPLv2 licsense?


On 12/13/2016 4:40 AM, Lukas Wunner wrote:
> On Mon, Dec 12, 2016 at 09:52:08PM -0500, Cheng, Tony wrote:
>> With DC the display hardware programming, resource optimization, power
>> management and interaction with rest of system will be fully validated
>> across multiple OSs.
> Do I understand DAL3.jpg correctly that the macOS driver builds on top
> of DAL Core?  I'm asking because the graphics drivers shipping with
> macOS as well as on Apple's EFI Firmware Volume are closed source.
macOS currently ship with their own driver.  I can't really comment on 
what macOS do without getting into trouble.
> If the Linux community contributes to DC, I guess those contributions
> can generally be assumed to be GPLv2 licensed.  Yet a future version
> of the macOS driver would incorporate those contributions in the same
> binary as their closed source OS-specific portion.
I am struggling with that these comminty contributions to DC would be.

Us AMD developer has access to HW docs and designer and we are still 
spending 50% of our time figuring out why our HW doesn't work right. I 
can't image community doing much of this heavy lifting.
>
> I don't quite see how that would be legal but maybe I'm missing
> something.
>
> Presumably the situation with the Windows driver is the same.
>
> I guess you could maintain a separate branch sans community contributions
> which would serve as a basis for closed source drivers, but not sure if
> that is feasible given your resource constraints.
Dave sent us series of patch to show how it would look like if someone 
were to change DC.  These changes are more removing code that DRM 
already has and deleting/clean up stuff.  I guess we can nak all changes 
and "rewrite" our own version of clean up patch community want to see?
> Thanks,
>
> Lukas


[-- Attachment #1.2: Type: text/html, Size: 41251 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* RE: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-13 15:03           ` Cheng, Tony
@ 2016-12-13 15:09             ` Deucher, Alexander
  2016-12-13 15:57             ` Lukas Wunner
  2016-12-14  9:57             ` Jani Nikula
  2 siblings, 0 replies; 66+ messages in thread
From: Deucher, Alexander @ 2016-12-13 15:09 UTC (permalink / raw)
  To: Cheng, Tony, Lukas Wunner, Bridgman, John
  Cc: dri-devel, amd-gfx mailing list, Grodzovsky, Andrey


[-- Attachment #1.1: Type: text/plain, Size: 2569 bytes --]

Our driver code and most of the drm is MIT/X11 licensed.  Lot of other non GPL OSes (e.g., the BSDs) already import Linux drm drivers and core code.

Alex

From: Cheng, Tony
Sent: Tuesday, December 13, 2016 10:04 AM
To: Lukas Wunner; Bridgman, John
Cc: Dave Airlie; Wentland, Harry; Grodzovsky, Andrey; amd-gfx mailing list; dri-devel; Deucher, Alexander
Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU


Only DM that's open source is amdgpu_dm.  the rest will remain closed source.  I remember we had discussion around legal issues with our grand plan of unifying everything, and I remember maybe it was John who assured us that it's okay.  John can you chime in how it would work with GPLv2 licsense?

On 12/13/2016 4:40 AM, Lukas Wunner wrote:

On Mon, Dec 12, 2016 at 09:52:08PM -0500, Cheng, Tony wrote:

With DC the display hardware programming, resource optimization, power

management and interaction with rest of system will be fully validated

across multiple OSs.



Do I understand DAL3.jpg correctly that the macOS driver builds on top

of DAL Core?  I'm asking because the graphics drivers shipping with

macOS as well as on Apple's EFI Firmware Volume are closed source.
macOS currently ship with their own driver.  I can't really comment on what macOS do without getting into trouble.




If the Linux community contributes to DC, I guess those contributions

can generally be assumed to be GPLv2 licensed.  Yet a future version

of the macOS driver would incorporate those contributions in the same

binary as their closed source OS-specific portion.
I am struggling with that these comminty contributions to DC would be.

Us AMD developer has access to HW docs and designer and we are still spending 50% of our time figuring out why our HW doesn't work right. I can't image community doing much of this heavy lifting.






I don't quite see how that would be legal but maybe I'm missing

something.



Presumably the situation with the Windows driver is the same.



I guess you could maintain a separate branch sans community contributions

which would serve as a basis for closed source drivers, but not sure if

that is feasible given your resource constraints.
Dave sent us series of patch to show how it would look like if someone were to change DC.  These changes are more removing code that DRM already has and deleting/clean up stuff.  I guess we can nak all changes and "rewrite" our own version of clean up patch community want to see?



Thanks,



Lukas


[-- Attachment #1.2: Type: text/html, Size: 7720 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-13 15:03           ` Cheng, Tony
  2016-12-13 15:09             ` Deucher, Alexander
@ 2016-12-13 15:57             ` Lukas Wunner
  2016-12-14  9:57             ` Jani Nikula
  2 siblings, 0 replies; 66+ messages in thread
From: Lukas Wunner @ 2016-12-13 15:57 UTC (permalink / raw)
  To: Cheng, Tony
  Cc: Grodzovsky, Andrey, dri-devel, amd-gfx mailing list, Deucher, Alexander

On Tue, Dec 13, 2016 at 10:03:58AM -0500, Cheng, Tony wrote:
> On 12/13/2016 4:40 AM, Lukas Wunner wrote:
> > If the Linux community contributes to DC, I guess those contributions
> > can generally be assumed to be GPLv2 licensed.  Yet a future version
> > of the macOS driver would incorporate those contributions in the same
> > binary as their closed source OS-specific portion.
> 
> I am struggling with that these comminty contributions to DC would be.
> 
> Us AMD developer has access to HW docs and designer and we are still
> spending 50% of our time figuring out why our HW doesn't work right.
> I can't image community doing much of this heavy lifting.

True, but past experience with radeon/amdgpu is that the community has
use cases that AMD developers don't specifically cater to, e.g. due to
lack of the required hardware or resource constraints.

E.g. Mario Kleiner has contributed lots of patches for proper vsync
handling which are needed for his neuroscience software.  I've
contributed DDC switching support for MacBook Pros to radeon.
Your driver becomes more useful, you get more customers, everyone wins.


> > Do I understand DAL3.jpg correctly that the macOS driver builds on top
> > of DAL Core?  I'm asking because the graphics drivers shipping with
> > macOS as well as on Apple's EFI Firmware Volume are closed source.
> 
> macOS currently ship with their own driver.  I can't really comment on what
> macOS do without getting into trouble.

The Intel Israel folks working on Thunderbolt are similarly between
the rock that is the community's expectation of openness and the hard
place that is Apple's secrecy.  So I sympathize with your situation,
kudos for trying to do the right thing.


> I guess we can nak all changes and "rewrite" our
> own version of clean up patch community want to see?

I don't think that would be workable honestly.

One way out of this conundrum might be to use a permissive license such
as BSD for DC.  Then whenever you merge a community patch, in addition
to informing the contributor thereof, send them a boilerplate one-liner
that community contributions are assumed to be under the same license
and if the contributor disagrees they should send a short notice to
have their contribution removed.

But IANAL.

Best regards,

Lukas
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]         ` <20161213094035.GA10916-JFq808J9C/izQB+pC5nmwQ@public.gmane.org>
  2016-12-13 15:03           ` Cheng, Tony
@ 2016-12-13 16:14           ` Bridgman, John
  1 sibling, 0 replies; 66+ messages in thread
From: Bridgman, John @ 2016-12-13 16:14 UTC (permalink / raw)
  To: Lukas Wunner, Cheng, Tony
  Cc: Deucher, Alexander, Grodzovsky, Andrey, amd-gfx mailing list, dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 2679 bytes --]

>>If the Linux community contributes to DC, I guess those contributions
can generally be assumed to be GPLv2 licensed.  Yet a future version
of the macOS driver would incorporate those contributions in the same
binary as their closed source OS-specific portion.


My understanding of the "general rule" was that contributions are normally assumed to be made under the "local license", ie GPLv2 for kernel changes in general, but the appropriate lower-level license when made to a specific subsystem with a more permissive license (eg the X11 license aka MIT aka "GPL plus additional rights" license we use for almost all of the graphics subsystem. If DC is not X11 licensed today it should be (but I'm pretty sure it already is).


We need to keep the graphics subsystem permissively licensed in general to allow uptake by other free OS projects such as *BSD, not just closed source.


Either way, driver-level maintainers are going to have to make sure that contributions have clear licensing.


Thanks,

John

________________________________
From: dri-devel <dri-devel-bounces-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org> on behalf of Lukas Wunner <lukas-JFq808J9C/izQB+pC5nmwQ@public.gmane.org>
Sent: December 13, 2016 4:40 AM
To: Cheng, Tony
Cc: Grodzovsky, Andrey; dri-devel; amd-gfx mailing list; Deucher, Alexander
Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU

On Mon, Dec 12, 2016 at 09:52:08PM -0500, Cheng, Tony wrote:
> With DC the display hardware programming, resource optimization, power
> management and interaction with rest of system will be fully validated
> across multiple OSs.

Do I understand DAL3.jpg correctly that the macOS driver builds on top
of DAL Core?  I'm asking because the graphics drivers shipping with
macOS as well as on Apple's EFI Firmware Volume are closed source.

If the Linux community contributes to DC, I guess those contributions
can generally be assumed to be GPLv2 licensed.  Yet a future version
of the macOS driver would incorporate those contributions in the same
binary as their closed source OS-specific portion.

I don't quite see how that would be legal but maybe I'm missing
something.

Presumably the situation with the Windows driver is the same.

I guess you could maintain a separate branch sans community contributions
which would serve as a basis for closed source drivers, but not sure if
that is feasible given your resource constraints.

Thanks,

Lukas
_______________________________________________
dri-devel mailing list
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[-- Attachment #1.2: Type: text/html, Size: 3828 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]               ` <20161213125953.zczaojxp37yg6a6f-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
@ 2016-12-14  1:50                 ` Michel Dänzer
       [not found]                   ` <afa3fdb6-1bb4-976e-d14f-b04ab8243819-otUistvHUpPR7s880joybQ@public.gmane.org>
  0 siblings, 1 reply; 66+ messages in thread
From: Michel Dänzer @ 2016-12-14  1:50 UTC (permalink / raw)
  To: Daniel Vetter, Daniel Stone
  Cc: Grodzovsky, Andrey, Harry Wentland, dri-devel,
	amd-gfx mailing list, Deucher, Alexander, Cheng, Tony

On 13/12/16 09:59 PM, Daniel Vetter wrote:
> On Tue, Dec 13, 2016 at 12:22:59PM +0000, Daniel Stone wrote:
>> Hi Harry,
>> I've been loathe to jump in here, not least because both cop roles
>> seem to be taken, but ...
>>
>> On 13 December 2016 at 01:49, Harry Wentland <harry.wentland@amd.com> wrote:
>>> On 2016-12-11 09:57 PM, Dave Airlie wrote:
>>>> On 8 December 2016 at 12:02, Harry Wentland <harry.wentland@amd.com> wrote:
>>>> Sharing code is a laudable goal and I appreciate the resourcing
>>>> constraints that led us to the point at which we find ourselves, but
>>>> the way forward involves finding resources to upstream this code,
>>>> dedicated people (even one person) who can spend time on a day by day
>>>> basis talking to people in the open and working upstream, improving
>>>> other pieces of the drm as they go, reading atomic patches and
>>>> reviewing them, and can incrementally build the DC experience on top
>>>> of the Linux kernel infrastructure. Then having the corresponding
>>>> changes in the DC codebase happen internally to correspond to how the
>>>> kernel code ends up looking. Lots of this code overlaps with stuff the
>>>> drm already does, lots of is stuff the drm should be doing, so patches
>>>> to the drm should be sent instead.
>>>
>>> Personally I'm with you on this and hope to get us there. I'm learning...
>>> we're learning. I agree that changes on atomic, removing abstractions, etc.
>>> should happen on dri-devel.
>>>
>>> When it comes to brand-new technologies (MST, Freesync), though, we're often
>>> the first which means that we're spending a considerable amount of time to
>>> get things right, working with HW teams, receiver vendors and other partners
>>> internal and external to AMD. By the time we do get it right it's time to
>>> hit the market. This gives us fairly little leeway to work with the
>>> community on patches that won't land in distros for another half a year.
>>> We're definitely hoping to improve some of this but it's not easy and in
>>> some case impossible ahead of time (though definitely possibly after initial
>>> release).
>>
>> Speaking with my Wayland hat on, I think these need to be very
>> carefully considered. Both MST and FreeSync have _significant_ UABI
>> implications, which may not be immediately obvious when working with a
>> single implementation. Having them working and validated with a
>> vertical stack of amdgpu-DC/KMS + xf86-video-amdgpu +
>> Mesa-amdgpu/AMDGPU-Pro is one thing, but looking outside the X11 world
>> we now have Weston, Mutter and KWin all directly driving KMS, plus
>> whatever Mir/Unity ends up doing (presumably the same), and that's
>> just on the desktop. Beyond the desktop, there's also CrOS/Freon and
>> Android/HWC. For better or worse, outside of Xorg and HWC, we no
>> longer have a vendor-provided userspace component driving KMS.
>>
>> It was also easy to get away with loose semantics before with X11
>> imposing little to no structure on rendering, but we now have the twin
>> requirements of an atomic and timing-precise ABI - see Mario Kleiner's
>> unending quest for accuracy - and also a vendor-independent ABI. So a
>> good part of the (not insignificant) pain incurred in the atomic
>> transition for drivers, was in fact making those drivers conform to
>> the expectations of the KMS UABI contract, which just happened to not
>> have been tripped over previously.
>>
>> Speaking with my Collabora hat on now: we did do a substantial amount
>> of demidlayering on the Exynos driver, including an atomic conversion,
>> on Google's behalf. The original Exynos driver happened to work with
>> the Tizen stack, but ChromeOS exposed a huge amount of subtle
>> behaviour differences between that and other drivers when using Freon.
>> We'd also hit the same issues when attempting to use Weston on Exynos
>> in embedded devices for OEMs we worked with, so took on the project to
>> remove the midlayer and have as much as possible driven from generic
>> code.
>>
>> How the hardware is programmed is of course ultimately up to you, and
>> in this regard AMD will be very different from Intel is very different
>> from Nouveau is very different from Rockchip. But especially for new
>> features like FreeSync, I think we need to be very conscious of
>> walking the line between getting those features in early, and setting
>> unworkable UABI in stone. It would be unfortunate if later on down the
>> line, you had to choose between breaking older xf86-video-amdgpu
>> userspace which depended on specific behaviours of the amdgpu kernel
>> driver, or breaking the expectations of generic userspace such as
>> Weston/Mutter/etc.
>>
>> One good way to make sure you don't get into that position, is to have
>> core KMS code driving as much of the machinery as possible, with a
>> very clear separation of concerns between actual hardware programming,
>> versus things which may be visible to userspace. This I think is
>> DanielV's point expressed at much greater length. ;)
>>
>> I should be clear though that this isn't unique to AMD, nor a problem
>> of your creation. For example, I'm currently looking at a flip-timing
>> issue in Rockchip - a fairly small, recent, atomic-native, and
>> generally exemplary driver - which I'm pretty sure is going to be
>> resolved by deleting more driver code and using more of the helpers!
>> Probably one of the reasons why KMS has been lagging behind in
>> capability for a while (as Alex noted), is that even the basic ABI was
>> utterly incoherent between drivers. The magnitude of the sea change
>> that's taken place in KMS lately isn't always obvious to the outside
>> world: the actual atomic modesetting API is just the cherry on top,
>> rather than the most drastic change, which is the coherent
>> driver-independent core machinery.
> 
> +1 on everything Daniel said here. And I'm a bit worried that AMD is not
> realizing what's going on here, given that Michel called the plan that
> most everything will switch over to a generic kms userspace a "pipe
> dream". It's happening, and in a few years I expect the only amd-specific
> userspace left and still shipping will be amdgpu-pro for
> enterprise/workstation customers.

The pipe dream is replacing our Xorg drivers with -modesetting. I fully
agree with you Daniels when it comes to non-Xorg userspace.


> In the end AMD missing that seems just another case of designing something
> pretty inhouse and entirely missing to synchronize with the community and
> what's going on outside of AMD.
> 
> And for freesync specifically I agree with Daniel that enabling this only
> in -amdgpu gives us a very high chance of ending up with something that
> doesn't work elsewhere. Or is at least badly underspecified, and then
> tears and blodshed ensues when someone else enables things.

Right, I think I clearly stated before both internally and externally
that the current amdgpu-pro FreeSync support isn't suitable for upstream
(not even for xf86-video-amdgpu).


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-13 15:03           ` Cheng, Tony
  2016-12-13 15:09             ` Deucher, Alexander
  2016-12-13 15:57             ` Lukas Wunner
@ 2016-12-14  9:57             ` Jani Nikula
  2016-12-14 17:23               ` Cheng, Tony
  2 siblings, 1 reply; 66+ messages in thread
From: Jani Nikula @ 2016-12-14  9:57 UTC (permalink / raw)
  To: Cheng, Tony, Lukas Wunner, John
  Cc: Deucher, Alexander, Grodzovsky, Andrey, amd-gfx mailing list, dri-devel

On Tue, 13 Dec 2016, "Cheng, Tony" <tony.cheng@amd.com> wrote:
> I am struggling with that these comminty contributions to DC would be.
>
> Us AMD developer has access to HW docs and designer and we are still 
> spending 50% of our time figuring out why our HW doesn't work right. I 
> can't image community doing much of this heavy lifting.

I can sympathize with that view, and certainly most of the heavy lifting
would come from you, same as with us and i915. However, when you put
together your hardware, an open source driver, and smart people, they
*will* scratch their itches, whether they're bugs you're not fixing or
features you're missing. Please don't underestimate and patronize them,
it's going to rub people the wrong way.

> Dave sent us series of patch to show how it would look like if someone 
> were to change DC.  These changes are more removing code that DRM 
> already has and deleting/clean up stuff.  I guess we can nak all changes 
> and "rewrite" our own version of clean up patch community want to see?

Please have a look at, say,

$ git shortlog -sne --since @{1year} -- drivers/gpu/drm/amd | grep -v amd\.com

Do you really want to actively discourage all of them from contributing?
I think this would be detrimental to not only your driver, but the whole
drm community. It feels like you'd like to have your code upstream, but
still retain ownership as if it was in your internal repo. You can't
have your cake and eat it too.


BR,
Jani.


-- 
Jani Nikula, Intel Open Source Technology Center
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]                   ` <afa3fdb6-1bb4-976e-d14f-b04ab8243819-otUistvHUpPR7s880joybQ@public.gmane.org>
@ 2016-12-14 15:46                     ` Harry Wentland
  0 siblings, 0 replies; 66+ messages in thread
From: Harry Wentland @ 2016-12-14 15:46 UTC (permalink / raw)
  To: Michel Dänzer, Daniel Vetter, Daniel Stone
  Cc: Deucher, Alexander, Grodzovsky, Andrey, Cheng, Tony, dri-devel,
	amd-gfx mailing list

On 2016-12-13 08:50 PM, Michel Dänzer wrote:
> On 13/12/16 09:59 PM, Daniel Vetter wrote:
>> On Tue, Dec 13, 2016 at 12:22:59PM +0000, Daniel Stone wrote:
>>> Hi Harry,
>>> I've been loathe to jump in here, not least because both cop roles
>>> seem to be taken, but ...
>>>
>>> On 13 December 2016 at 01:49, Harry Wentland <harry.wentland@amd.com> wrote:
>>>> On 2016-12-11 09:57 PM, Dave Airlie wrote:
>>>>> On 8 December 2016 at 12:02, Harry Wentland <harry.wentland@amd.com> wrote:
>>>>> Sharing code is a laudable goal and I appreciate the resourcing
>>>>> constraints that led us to the point at which we find ourselves, but
>>>>> the way forward involves finding resources to upstream this code,
>>>>> dedicated people (even one person) who can spend time on a day by day
>>>>> basis talking to people in the open and working upstream, improving
>>>>> other pieces of the drm as they go, reading atomic patches and
>>>>> reviewing them, and can incrementally build the DC experience on top
>>>>> of the Linux kernel infrastructure. Then having the corresponding
>>>>> changes in the DC codebase happen internally to correspond to how the
>>>>> kernel code ends up looking. Lots of this code overlaps with stuff the
>>>>> drm already does, lots of is stuff the drm should be doing, so patches
>>>>> to the drm should be sent instead.
>>>>
>>>> Personally I'm with you on this and hope to get us there. I'm learning...
>>>> we're learning. I agree that changes on atomic, removing abstractions, etc.
>>>> should happen on dri-devel.
>>>>
>>>> When it comes to brand-new technologies (MST, Freesync), though, we're often
>>>> the first which means that we're spending a considerable amount of time to
>>>> get things right, working with HW teams, receiver vendors and other partners
>>>> internal and external to AMD. By the time we do get it right it's time to
>>>> hit the market. This gives us fairly little leeway to work with the
>>>> community on patches that won't land in distros for another half a year.
>>>> We're definitely hoping to improve some of this but it's not easy and in
>>>> some case impossible ahead of time (though definitely possibly after initial
>>>> release).
>>>
>>> Speaking with my Wayland hat on, I think these need to be very
>>> carefully considered. Both MST and FreeSync have _significant_ UABI
>>> implications, which may not be immediately obvious when working with a
>>> single implementation. Having them working and validated with a
>>> vertical stack of amdgpu-DC/KMS + xf86-video-amdgpu +
>>> Mesa-amdgpu/AMDGPU-Pro is one thing, but looking outside the X11 world
>>> we now have Weston, Mutter and KWin all directly driving KMS, plus
>>> whatever Mir/Unity ends up doing (presumably the same), and that's
>>> just on the desktop. Beyond the desktop, there's also CrOS/Freon and
>>> Android/HWC. For better or worse, outside of Xorg and HWC, we no
>>> longer have a vendor-provided userspace component driving KMS.
>>>
>>> It was also easy to get away with loose semantics before with X11
>>> imposing little to no structure on rendering, but we now have the twin
>>> requirements of an atomic and timing-precise ABI - see Mario Kleiner's
>>> unending quest for accuracy - and also a vendor-independent ABI. So a
>>> good part of the (not insignificant) pain incurred in the atomic
>>> transition for drivers, was in fact making those drivers conform to
>>> the expectations of the KMS UABI contract, which just happened to not
>>> have been tripped over previously.
>>>
>>> Speaking with my Collabora hat on now: we did do a substantial amount
>>> of demidlayering on the Exynos driver, including an atomic conversion,
>>> on Google's behalf. The original Exynos driver happened to work with
>>> the Tizen stack, but ChromeOS exposed a huge amount of subtle
>>> behaviour differences between that and other drivers when using Freon.
>>> We'd also hit the same issues when attempting to use Weston on Exynos
>>> in embedded devices for OEMs we worked with, so took on the project to
>>> remove the midlayer and have as much as possible driven from generic
>>> code.
>>>
>>> How the hardware is programmed is of course ultimately up to you, and
>>> in this regard AMD will be very different from Intel is very different
>>> from Nouveau is very different from Rockchip. But especially for new
>>> features like FreeSync, I think we need to be very conscious of
>>> walking the line between getting those features in early, and setting
>>> unworkable UABI in stone. It would be unfortunate if later on down the
>>> line, you had to choose between breaking older xf86-video-amdgpu
>>> userspace which depended on specific behaviours of the amdgpu kernel
>>> driver, or breaking the expectations of generic userspace such as
>>> Weston/Mutter/etc.
>>>
>>> One good way to make sure you don't get into that position, is to have
>>> core KMS code driving as much of the machinery as possible, with a
>>> very clear separation of concerns between actual hardware programming,
>>> versus things which may be visible to userspace. This I think is
>>> DanielV's point expressed at much greater length. ;)
>>>
>>> I should be clear though that this isn't unique to AMD, nor a problem
>>> of your creation. For example, I'm currently looking at a flip-timing
>>> issue in Rockchip - a fairly small, recent, atomic-native, and
>>> generally exemplary driver - which I'm pretty sure is going to be
>>> resolved by deleting more driver code and using more of the helpers!
>>> Probably one of the reasons why KMS has been lagging behind in
>>> capability for a while (as Alex noted), is that even the basic ABI was
>>> utterly incoherent between drivers. The magnitude of the sea change
>>> that's taken place in KMS lately isn't always obvious to the outside
>>> world: the actual atomic modesetting API is just the cherry on top,
>>> rather than the most drastic change, which is the coherent
>>> driver-independent core machinery.
>>
>> +1 on everything Daniel said here. And I'm a bit worried that AMD is not
>> realizing what's going on here, given that Michel called the plan that
>> most everything will switch over to a generic kms userspace a "pipe
>> dream". It's happening, and in a few years I expect the only amd-specific
>> userspace left and still shipping will be amdgpu-pro for
>> enterprise/workstation customers.
>
> The pipe dream is replacing our Xorg drivers with -modesetting. I fully
> agree with you Daniels when it comes to non-Xorg userspace.
>
>
>> In the end AMD missing that seems just another case of designing something
>> pretty inhouse and entirely missing to synchronize with the community and
>> what's going on outside of AMD.
>>
>> And for freesync specifically I agree with Daniel that enabling this only
>> in -amdgpu gives us a very high chance of ending up with something that
>> doesn't work elsewhere. Or is at least badly underspecified, and then
>> tears and blodshed ensues when someone else enables things.
>
> Right, I think I clearly stated before both internally and externally
> that the current amdgpu-pro FreeSync support isn't suitable for upstream
> (not even for xf86-video-amdgpu).
>
>

Thanks, DanielS, DanielV, and Michel for the insight.

Michel is actually one of the strongest voices at AMD against any ABI 
stuff that's not well thought-out and might get us in trouble down the road.

Harry
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]             ` <CAPj87rNrwsfAR75138WDQPbti_BmS_D-NxESZ075obcjO3T04g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-12-14 16:35               ` Alex Deucher
  0 siblings, 0 replies; 66+ messages in thread
From: Alex Deucher @ 2016-12-14 16:35 UTC (permalink / raw)
  To: Daniel Stone
  Cc: Grodzovsky, Andrey, Harry Wentland, amd-gfx mailing list, Cheng,
	Tony, dri-devel, Deucher, Alexander, Dave Airlie

On Tue, Dec 13, 2016 at 7:22 AM, Daniel Stone <daniel@fooishbar.org> wrote:
> Hi Harry,
> I've been loathe to jump in here, not least because both cop roles
> seem to be taken, but ...
>
> On 13 December 2016 at 01:49, Harry Wentland <harry.wentland@amd.com> wrote:
>> On 2016-12-11 09:57 PM, Dave Airlie wrote:
>>> On 8 December 2016 at 12:02, Harry Wentland <harry.wentland@amd.com> wrote:
>>> Sharing code is a laudable goal and I appreciate the resourcing
>>> constraints that led us to the point at which we find ourselves, but
>>> the way forward involves finding resources to upstream this code,
>>> dedicated people (even one person) who can spend time on a day by day
>>> basis talking to people in the open and working upstream, improving
>>> other pieces of the drm as they go, reading atomic patches and
>>> reviewing them, and can incrementally build the DC experience on top
>>> of the Linux kernel infrastructure. Then having the corresponding
>>> changes in the DC codebase happen internally to correspond to how the
>>> kernel code ends up looking. Lots of this code overlaps with stuff the
>>> drm already does, lots of is stuff the drm should be doing, so patches
>>> to the drm should be sent instead.
>>
>> Personally I'm with you on this and hope to get us there. I'm learning...
>> we're learning. I agree that changes on atomic, removing abstractions, etc.
>> should happen on dri-devel.
>>
>> When it comes to brand-new technologies (MST, Freesync), though, we're often
>> the first which means that we're spending a considerable amount of time to
>> get things right, working with HW teams, receiver vendors and other partners
>> internal and external to AMD. By the time we do get it right it's time to
>> hit the market. This gives us fairly little leeway to work with the
>> community on patches that won't land in distros for another half a year.
>> We're definitely hoping to improve some of this but it's not easy and in
>> some case impossible ahead of time (though definitely possibly after initial
>> release).
>
> Speaking with my Wayland hat on, I think these need to be very
> carefully considered. Both MST and FreeSync have _significant_ UABI
> implications, which may not be immediately obvious when working with a
> single implementation. Having them working and validated with a
> vertical stack of amdgpu-DC/KMS + xf86-video-amdgpu +
> Mesa-amdgpu/AMDGPU-Pro is one thing, but looking outside the X11 world
> we now have Weston, Mutter and KWin all directly driving KMS, plus
> whatever Mir/Unity ends up doing (presumably the same), and that's
> just on the desktop. Beyond the desktop, there's also CrOS/Freon and
> Android/HWC. For better or worse, outside of Xorg and HWC, we no
> longer have a vendor-provided userspace component driving KMS.
>
> It was also easy to get away with loose semantics before with X11
> imposing little to no structure on rendering, but we now have the twin
> requirements of an atomic and timing-precise ABI - see Mario Kleiner's
> unending quest for accuracy - and also a vendor-independent ABI. So a
> good part of the (not insignificant) pain incurred in the atomic
> transition for drivers, was in fact making those drivers conform to
> the expectations of the KMS UABI contract, which just happened to not
> have been tripped over previously.
>
> Speaking with my Collabora hat on now: we did do a substantial amount
> of demidlayering on the Exynos driver, including an atomic conversion,
> on Google's behalf. The original Exynos driver happened to work with
> the Tizen stack, but ChromeOS exposed a huge amount of subtle
> behaviour differences between that and other drivers when using Freon.
> We'd also hit the same issues when attempting to use Weston on Exynos
> in embedded devices for OEMs we worked with, so took on the project to
> remove the midlayer and have as much as possible driven from generic
> code.
>
> How the hardware is programmed is of course ultimately up to you, and
> in this regard AMD will be very different from Intel is very different
> from Nouveau is very different from Rockchip. But especially for new
> features like FreeSync, I think we need to be very conscious of
> walking the line between getting those features in early, and setting
> unworkable UABI in stone. It would be unfortunate if later on down the
> line, you had to choose between breaking older xf86-video-amdgpu
> userspace which depended on specific behaviours of the amdgpu kernel
> driver, or breaking the expectations of generic userspace such as
> Weston/Mutter/etc.

For clarity, as Michel said, the freesync stuff we have in the pro
driver is not indented for upstream in either the kernel or the
userspace.  It's a short term solution for short term deliverables.
That said, I think it's also useful to have something developers in
the community can test and play with to get a better understanding of
what use cases make sense when designing and validating the upstream
solution.

Alex
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-14  9:57             ` Jani Nikula
@ 2016-12-14 17:23               ` Cheng, Tony
       [not found]                 ` <d68102d4-b99c-cc60-4eb2-9c6295af130f-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 66+ messages in thread
From: Cheng, Tony @ 2016-12-14 17:23 UTC (permalink / raw)
  To: Jani Nikula, Lukas Wunner, John
  Cc: Deucher, Alexander, Grodzovsky, Andrey, amd-gfx mailing list, dri-devel



On 12/14/2016 4:57 AM, Jani Nikula wrote:
> On Tue, 13 Dec 2016, "Cheng, Tony" <tony.cheng@amd.com> wrote:
>> I am struggling with that these comminty contributions to DC would be.
>>
>> Us AMD developer has access to HW docs and designer and we are still
>> spending 50% of our time figuring out why our HW doesn't work right. I
>> can't image community doing much of this heavy lifting.
> I can sympathize with that view, and certainly most of the heavy lifting
> would come from you, same as with us and i915. However, when you put
> together your hardware, an open source driver, and smart people, they
> *will* scratch their itches, whether they're bugs you're not fixing or
> features you're missing. Please don't underestimate and patronize them,
> it's going to rub people the wrong way.
I aplogize if my statement offended any one in the community.  I'll say 
more about bugs below.
>> Dave sent us series of patch to show how it would look like if someone
>> were to change DC.  These changes are more removing code that DRM
>> already has and deleting/clean up stuff.  I guess we can nak all changes
>> and "rewrite" our own version of clean up patch community want to see?
> Please have a look at, say,
>
> $ git shortlog -sne --since @{1year} -- drivers/gpu/drm/amd | grep -v amd\.com
>
> Do you really want to actively discourage all of them from contributing?
> I think this would be detrimental to not only your driver, but the whole
> drm community. It feels like you'd like to have your code upstream, but
> still retain ownership as if it was in your internal repo. You can't
> have your cake and eat it too.
That's none "dal" path.  It's just Alex plus a handful of guys trying to 
figure out what register writes is needed base on windows driver.  You 
knwo who has been contributing to that code path from AMD and we know 
it's a relatively small group of people.  Alex and team does great job 
at being good citizen on linux world and provide support. But in terms 
of HW programming and fully expolit our HW that's pretty much the best 
they can do with the resource constraint.  Of course the quality is not 
as good as we would like thus we needed all the help we can get from 
community.  We just don't have the man power to make it great.

We are proposing to get on a path where we can fully leverage the coding 
and validation resources from rest of AMD Display teams (SW, HW, tuning, 
validation, QA etc).  Our goal is to provide a driver to linux community 
that's feature rich and high quality.  My goal is community finds 0 bug 
in our code because we should've seen and fixed those bug in our 
validation pass before we release the GPUs. We do have a good size team 
around validation, just today that validation covers 0% of upstream 
source code.   Alex and I are trying to find a path to get these goodies 
on the upstream driver without 2x size of our teams.  We know 2x our 
team size is not an option.

I just want to say I understand were community is coming from.  Like I 
said in my first respond to Dave that I would've say no if someone want 
to throw 100k lines of code into project (DAL) I have to maintain 
without knowning what's there and the benefit we are getting.  We 
already made a lot of change and design choice in our code base to play 
well with community and absorbing the effort to restructure code on 
other platforms as result of these modification.  We are going to 
continue making more modifications to make our code linux worthy base on 
the good feedback we have gotten so far.

DAL3/DC is a new project we started a little over years ago and still 
early enough stage to make changes.  Like how community is pushing back 
on our code, after 1 or 2 future generation of GPU built on top of DC, 
the AMD teams on rest of platforms will start pushing back on changes in 
DC.  We need find find the balance of what's HW and what's core and how 
to draw the line so community doesn't make much modification in what we 
(both AMD and community) deem "hardware backend code".  We need to have 
the linux coding style and design principals baked in DC code so when 
our internal teams contribute to DC the code is written in a form linux 
community can accept.  All of this need to happen soon or we miss this 
critical inflection point and it's going to be anther 6-8 years before 
we get another crack at re-archiecture project to try getting the rest 
of extended AMD display teams behind our upstream driver.
>
> BR,
> Jani.
>
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]                 ` <d68102d4-b99c-cc60-4eb2-9c6295af130f-5C7GfCeVMHo@public.gmane.org>
@ 2016-12-14 18:01                   ` Alex Deucher
       [not found]                     ` <CADnq5_Nha9502S=DOJDNepNv9CBV88=0R6N+tpBuO+U+s1eUQA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 66+ messages in thread
From: Alex Deucher @ 2016-12-14 18:01 UTC (permalink / raw)
  To: Cheng, Tony
  Cc: Grodzovsky, Andrey, John, dri-devel, amd-gfx mailing list,
	Lukas Wunner, Jani Nikula, Deucher, Alexander

On Wed, Dec 14, 2016 at 12:23 PM, Cheng, Tony <tony.cheng@amd.com> wrote:
>
>
> On 12/14/2016 4:57 AM, Jani Nikula wrote:
>>
>> On Tue, 13 Dec 2016, "Cheng, Tony" <tony.cheng@amd.com> wrote:
>>>
>>> I am struggling with that these comminty contributions to DC would be.
>>>
>>> Us AMD developer has access to HW docs and designer and we are still
>>> spending 50% of our time figuring out why our HW doesn't work right. I
>>> can't image community doing much of this heavy lifting.
>>
>> I can sympathize with that view, and certainly most of the heavy lifting
>> would come from you, same as with us and i915. However, when you put
>> together your hardware, an open source driver, and smart people, they
>> *will* scratch their itches, whether they're bugs you're not fixing or
>> features you're missing. Please don't underestimate and patronize them,
>> it's going to rub people the wrong way.
>
> I aplogize if my statement offended any one in the community.  I'll say more
> about bugs below.
>>>
>>> Dave sent us series of patch to show how it would look like if someone
>>> were to change DC.  These changes are more removing code that DRM
>>> already has and deleting/clean up stuff.  I guess we can nak all changes
>>> and "rewrite" our own version of clean up patch community want to see?
>>
>> Please have a look at, say,
>>
>> $ git shortlog -sne --since @{1year} -- drivers/gpu/drm/amd | grep -v
>> amd\.com
>>
>> Do you really want to actively discourage all of them from contributing?
>> I think this would be detrimental to not only your driver, but the whole
>> drm community. It feels like you'd like to have your code upstream, but
>> still retain ownership as if it was in your internal repo. You can't
>> have your cake and eat it too.
>
> That's none "dal" path.  It's just Alex plus a handful of guys trying to
> figure out what register writes is needed base on windows driver.  You knwo
> who has been contributing to that code path from AMD and we know it's a
> relatively small group of people.  Alex and team does great job at being
> good citizen on linux world and provide support. But in terms of HW
> programming and fully expolit our HW that's pretty much the best they can do
> with the resource constraint.  Of course the quality is not as good as we
> would like thus we needed all the help we can get from community.  We just
> don't have the man power to make it great.
>
> We are proposing to get on a path where we can fully leverage the coding and
> validation resources from rest of AMD Display teams (SW, HW, tuning,
> validation, QA etc).  Our goal is to provide a driver to linux community
> that's feature rich and high quality.  My goal is community finds 0 bug in
> our code because we should've seen and fixed those bug in our validation
> pass before we release the GPUs. We do have a good size team around
> validation, just today that validation covers 0% of upstream source code.
> Alex and I are trying to find a path to get these goodies on the upstream
> driver without 2x size of our teams.  We know 2x our team size is not an
> option.
>
> I just want to say I understand were community is coming from.  Like I said
> in my first respond to Dave that I would've say no if someone want to throw
> 100k lines of code into project (DAL) I have to maintain without knowning
> what's there and the benefit we are getting.  We already made a lot of
> change and design choice in our code base to play well with community and
> absorbing the effort to restructure code on other platforms as result of
> these modification.  We are going to continue making more modifications to
> make our code linux worthy base on the good feedback we have gotten so far.
>
> DAL3/DC is a new project we started a little over years ago and still early
> enough stage to make changes.  Like how community is pushing back on our
> code, after 1 or 2 future generation of GPU built on top of DC, the AMD
> teams on rest of platforms will start pushing back on changes in DC.  We
> need find find the balance of what's HW and what's core and how to draw the
> line so community doesn't make much modification in what we (both AMD and
> community) deem "hardware backend code".  We need to have the linux coding
> style and design principals baked in DC code so when our internal teams
> contribute to DC the code is written in a form linux community can accept.
> All of this need to happen soon or we miss this critical inflection point
> and it's going to be anther 6-8 years before we get another crack at
> re-archiecture project to try getting the rest of extended AMD display teams
> behind our upstream driver.

I think the point is that there are changes that make sense and
changes that don't If they make sense, we'll definitely take them.
Removing dead code or duplicate defines makes sense.  Rearranging a
propgramming sequence so all registers that start with CRTC_ get
programmed at the same time just because it looks logical does not
make sense.

Alex
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* RE: [RFC] Using DC in amdgpu for upcoming GPU
       [not found]                     ` <CADnq5_Nha9502S=DOJDNepNv9CBV88=0R6N+tpBuO+U+s1eUQA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-12-14 18:16                       ` Cheng, Tony
  0 siblings, 0 replies; 66+ messages in thread
From: Cheng, Tony @ 2016-12-14 18:16 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Grodzovsky, Andrey, Bridgman, John, dri-devel,
	amd-gfx mailing list, Lukas Wunner, Jani Nikula, Deucher,
	Alexander

Thanks Alex my reply was a little off topic :)

-----Original Message-----
From: Alex Deucher [mailto:alexdeucher@gmail.com] 
Sent: Wednesday, December 14, 2016 1:02 PM
To: Cheng, Tony <Tony.Cheng@amd.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>; Lukas Wunner <lukas@wunner.de>; Bridgman, John <John.Bridgman@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com>; Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>; amd-gfx mailing list <amd-gfx@lists.freedesktop.org>; dri-devel <dri-devel@lists.freedesktop.org>
Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU

On Wed, Dec 14, 2016 at 12:23 PM, Cheng, Tony <tony.cheng@amd.com> wrote:
>
>
> On 12/14/2016 4:57 AM, Jani Nikula wrote:
>>
>> On Tue, 13 Dec 2016, "Cheng, Tony" <tony.cheng@amd.com> wrote:
>>>
>>> I am struggling with that these comminty contributions to DC would be.
>>>
>>> Us AMD developer has access to HW docs and designer and we are still 
>>> spending 50% of our time figuring out why our HW doesn't work right. 
>>> I can't image community doing much of this heavy lifting.
>>
>> I can sympathize with that view, and certainly most of the heavy 
>> lifting would come from you, same as with us and i915. However, when 
>> you put together your hardware, an open source driver, and smart 
>> people, they
>> *will* scratch their itches, whether they're bugs you're not fixing 
>> or features you're missing. Please don't underestimate and patronize 
>> them, it's going to rub people the wrong way.
>
> I aplogize if my statement offended any one in the community.  I'll 
> say more about bugs below.
>>>
>>> Dave sent us series of patch to show how it would look like if 
>>> someone were to change DC.  These changes are more removing code 
>>> that DRM already has and deleting/clean up stuff.  I guess we can 
>>> nak all changes and "rewrite" our own version of clean up patch community want to see?
>>
>> Please have a look at, say,
>>
>> $ git shortlog -sne --since @{1year} -- drivers/gpu/drm/amd | grep -v 
>> amd\.com
>>
>> Do you really want to actively discourage all of them from contributing?
>> I think this would be detrimental to not only your driver, but the 
>> whole drm community. It feels like you'd like to have your code 
>> upstream, but still retain ownership as if it was in your internal 
>> repo. You can't have your cake and eat it too.
>
> That's none "dal" path.  It's just Alex plus a handful of guys trying 
> to figure out what register writes is needed base on windows driver.  
> You knwo who has been contributing to that code path from AMD and we 
> know it's a relatively small group of people.  Alex and team does 
> great job at being good citizen on linux world and provide support. 
> But in terms of HW programming and fully expolit our HW that's pretty 
> much the best they can do with the resource constraint.  Of course the 
> quality is not as good as we would like thus we needed all the help we 
> can get from community.  We just don't have the man power to make it great.
>
> We are proposing to get on a path where we can fully leverage the 
> coding and validation resources from rest of AMD Display teams (SW, 
> HW, tuning, validation, QA etc).  Our goal is to provide a driver to 
> linux community that's feature rich and high quality.  My goal is 
> community finds 0 bug in our code because we should've seen and fixed 
> those bug in our validation pass before we release the GPUs. We do 
> have a good size team around validation, just today that validation covers 0% of upstream source code.
> Alex and I are trying to find a path to get these goodies on the 
> upstream driver without 2x size of our teams.  We know 2x our team 
> size is not an option.
>
> I just want to say I understand were community is coming from.  Like I 
> said in my first respond to Dave that I would've say no if someone 
> want to throw 100k lines of code into project (DAL) I have to maintain 
> without knowning what's there and the benefit we are getting.  We 
> already made a lot of change and design choice in our code base to 
> play well with community and absorbing the effort to restructure code 
> on other platforms as result of these modification.  We are going to 
> continue making more modifications to make our code linux worthy base on the good feedback we have gotten so far.
>
> DAL3/DC is a new project we started a little over years ago and still 
> early enough stage to make changes.  Like how community is pushing 
> back on our code, after 1 or 2 future generation of GPU built on top 
> of DC, the AMD teams on rest of platforms will start pushing back on 
> changes in DC.  We need find find the balance of what's HW and what's 
> core and how to draw the line so community doesn't make much 
> modification in what we (both AMD and
> community) deem "hardware backend code".  We need to have the linux 
> coding style and design principals baked in DC code so when our 
> internal teams contribute to DC the code is written in a form linux community can accept.
> All of this need to happen soon or we miss this critical inflection 
> point and it's going to be anther 6-8 years before we get another 
> crack at re-archiecture project to try getting the rest of extended 
> AMD display teams behind our upstream driver.

I think the point is that there are changes that make sense and changes that don't If they make sense, we'll definitely take them.
Removing dead code or duplicate defines makes sense.  Rearranging a propgramming sequence so all registers that start with CRTC_ get programmed at the same time just because it looks logical does not make sense.

Alex
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* [RFC] Using DC in amdgpu for upcoming GPU
@ 2016-12-15 15:48 Kevin Brace
  0 siblings, 0 replies; 66+ messages in thread
From: Kevin Brace @ 2016-12-15 15:48 UTC (permalink / raw)
  To: dri-devel

Hi,

I have been reading the ongoing discussion about what to do about AMD DC (Display Core) with great interest since I have started to put more time into developing OpenChrome DRM for VIA Technologies Chrome IGP.
I particularly enjoyed reading what Tony Cheng wrote about what is going on inside AMD Radeon GPUs.
As a graphics stack developer, I suppose I am still someone somewhat above a beginner level, and Chrome IGP might be considered garbage graphics to some (I do not really care what people say or think about it.), but since my background is really digital hardware design (self taught) rather than graphics device driver development, I will like to add my 2 cents (U.S.D.) to the discussion.
I also consider myself an amateur semiconductor industry historian, and in particular, I have been a close watcher of Intel's business / hiring practice for many years. 
For some, what I am writing may not make sense or even offend some (my guess will be the people who work at Intel), but I will not pull any punches, and if you do not like what I write, let me know. (That does not mean I will necessarily take back my comment even if it offended you. I typically stand behind what I say, unless it is obvious that I am wrong.)
    While my understanding of DRM is still quite primitive, my simplistic understanding of why AMD is pushing DC is due to the following factors.

1) AMD is understaffed due to its precarious financial condition it is in right now (i.e., < $1 billion CoH and losing 7,000 employees since Year 2008 or so)
2) The complexity of the next generation ASIC is only getting worse due to the continuing process scaling = more transistors one has to use (i.e., TSMC 28 nm to GF 14 nm to probably Samsung / TSMC 10 nm or GF 7 nm)
3) Based on 1 and 2, unless the design productively can be improved, AMD will be late to market, and this can be the possible end to AMD as a corporation
4) Hence, in order to meet TtM and improve engineer productivity, AMD needs to reuse the existing pre-silicon / post-silicon bring up test code and share the code with the Windows side of the device driver developers
5) In addition, power is already the biggest design challenge, and very precise power management is crucial to the performance of the chip (i.e., it's not all about the laptop anymore, and desktop "monster" graphics cards also need power management for performance reasons, in order to manage heat generation)
6) AMD Radeon is really running an RTOS (Real Time Operating System) inside the GPU card, and they want to put the code to handle initialization / power management closer to the GPU rather than from the slower response x86 (or any other general purpose) microprocessor


Since I will probably need to obtain "favors" down the road when I try to get OpenChrome DRM mainlined, I probably should not go into what I think of how Intel works on their graphics device driver stack (I do not mean to make this personal, but Intel is the "other" open source camp in the OSS x86 graphics world, so I find it a fair game to discuss the approach Intel takes from semiconductor industry perspective. I am probably going to overly generalize what is going on, so if you wanted to correct me, let me know.), but based on my understanding of how Intel works, Intel probably has more staffing resources than AMD when it comes to graphics device driver stack development. (and on the x86 microprocessor development side)
Based on my understanding of where Intel stands financially, I feel like Intel is standing on very thin ice due to the following factors, and I will predict that they will eventually adopt AMD DC like design concept. (i.e., use of a HAL)
Here is my logic.

1) PC (desktop and laptop) x86 processors are not selling very well, and my understanding is that since Year 2012 peak, x86 processor shipment is down 30% as of Year 2016 (I will say around $200 ASP)
2) Intel's margins are being propped up by the unnaturally high data center marketshare (99% for x86 data center microprocessors) and very high data center x86 processor ASP (Average Selling Price) of $600 (Up from $500 a few years ago due to AMD screwing up the Bulldozer microarchitecture. More on this later.)
3) Intel did a significant layoff in April 2016 where they targeted older (read "expensive"), experienced engineers
4) Like Cisco Systems (notorious for their annual summer time 5,000 layoff), Intel then turns around and goes in a hiring spree hiring from many graduate programs of U.S. second and third tier universities, bringing down the overall experience level of the engineering departments
5) While AMD is financially in a desperate shape, it will likely have one last chance in Zen microarchitecture to get back into the game (Zen will be the last chance for AMD, IMO.)
6) Since AMD is now fabless due to divestiture of the fabs in Year 2009 (GLOBALFOUNDRIES), it no longer has the financial burden of having to pay for the fab, whereas Intel "had to" delay 10 nm process deployment to 2H'17 due to weak demand of 14 nm process products and low utilization of 14 nm process (Low utilization delays the amortization of 14 nm process. Intel historically amortized the given process technology in 2 years. 14 nm is starting to look like 2.5 to 3 years due to yield issues they encountered in 2014.)
7) Inevitably, the magic of market competition will drag down Intel ASP (both PC and data center) since Zen microarchitecture is a rather straight forward x86 microarchitectural implementation (i.e., not too far apart from Skylake), hence, their low 60% gross margin will be under pressure from AMD starting in Year 2017.
8) Intel overpaid for Altera (a struggling FPGA vendor where the CEO probably felt like he had to sell the corporation in order to cover up the Stratix 10 FPGA development screw up of missing the tape out target date by 1.5 years) by $8 billion, and the next generation process technology is getting ever more expensive (10 nm, 7 nm, 5 nm, etc.)
9) In order to "please" Wall Street, Intel management will possibly do further destructive layoffs every year, and if I were to guess, will likely layoff another 25,000 to 30,000 people over the next 3 to 4 years
10) Intel has already lost the experienced engineers over the past layoffs, replacing them with far less experienced engineers hired relatively recently from mostly second and third tier U.S. universities
11) Now, with 25,000 to 30,000 layoff, the management will force the software engineering side to reorganize, and Intel will be "forced" to come up with ways to reuse their graphics stack code (i.e., sharing more code between Windows and Linux)
12) Hence, maybe a few years from now, Intel people will have to do something similar to AMD DC, in order to improve their design productivity since they no longer can throw people at the problem (Their tendency to overhire new college graduates since they are cheaper, and this allowed them to throw people at the problem relatively cheaply until recently. High x86 ASP also allowed them to do this as well, and they got too used to this for too long. They will not be able to do this in the future. In the meantime, their organizational experience level is coming down due to hiring too many NCGs and laying off too many experienced people at the same time.)


I am sure there are people who are not happy reading this, but this is my harsh, honest assessment of what Intel is going through right now, and what will happen in the future.
I am sure I will be effectively blacklisted from working at Intel for writing what I just wrote (That's okay since I am not interested in working at Intel.), but I came to this conclusion based on various people who used to work at Intel told me and observing the hiring practice of Intel for a number of years.
In particular, one person who worked on Intel 740 project (i.e., the long forgotten discrete AGP graphics chip from 1998) on the technical side has told me that Intel is really terrible at IP (Intellectual Property) core reuse, and Intel frequently redesigns too many portions of their ASICs all the time.
Based on that, I am not too surprised to hear that Intel does Windows and Linux graphics device driver stack development separately. (That's what I read.)
In other words, Intel is bloated from a staffing point of view. (I do not necessarily like people to lose jobs, but compared to AMD and NVIDIA, Intel is really bloated. The same person who worked on the Intel 740 project told me that Intel employee productivity is much lower than their competitors like AMD and NVIDIA on a per employee basis, and they have not been able to fix this for years.)
Despite the constant layoffs, Intel's employee count has not really gone down for the past few years (it is staying around 100,000 for the past 4 years), but eventually Intel will have to get rid of people in absolute numbers.
Intel also heavily relies on its "shadow" workforce of interns (from local universities, especially the foreign master's degree students desperate to pay off part of their high out of state tuition) and contractors / consultants, so their "real" employee count is probably closer to 115,000 or 120,000.
I get Intel related contractor / consultant position "unsolicited" e-mails from recruiters possibly located 12 time zones away from where I reside (please do not call me a racist for pointing this out since I find this so weird as a U.S. citizen) almost every weekday (M-F), and I am always surprised at the type of work Intel wants contractors to work on.
Many of the positions they want people to work are highly specialized stuff (I saw a graphics device driver contract position recently.), and they have been like this for several years already.
I no longer bother with Intel anymore based on this since they appear to not want to commit to proper employment of highly technical people.
Going back to the graphics world, my take is, Intel will have to get used to doing the same with far fewer people, and they will need to change their corporate culture of throwing people at the problem very soon since their x86 ASP will be crashing down fairly soon, and AMD will likely never repeat the Bulldozer microarchitecture screw up again. (Intel got lucky when former IBM PowerPC architects AMD hired around Year 2005 screwed up the Bulldozer. Speed Demon design is a disaster in a power constrained post-90 nm process node. They tried to compensate for Bulldozer's low IPC with high clock frequency. Intel learned a painful lesson about power with NetBurst microarchitecture between Year 2003 to 2005. Also, then AMD management seem to have really believed in the many-core concept too seriously. AMD had to live with the messed up Bulldozer for 10+ years with disastrous financial results.)
    I do understand that what I am writing isn't terribly technical in nature (it is more like corporate strategy stuff business / marketing side people worry about), but I feel like what AMD is doing is quite logical. (i.e., using higher abstraction level for initialization / power management, and code reuse)
Sorry for the off topic assessment of Intel (i.e., hiring practice stuff, x86 stuff), and based on the subsequent messages, it appears that DC can be rearchitected to satisfy Linux kernel developers, but overall, I feel like there is a lack of appreciation for the concept of design reuse in this case even though in ASIC / FPGA design world, this is very normal. (It has been like this since the mid-'90s when ASIC engineers had to start doing this regularly.)
AMD side people appeared to have been trying to apply this concept to the device driver side as well.
Considering AMD's meager staffing resources (currently approximately 9,000; less than 1/10 of Intel although Intel owns many fabs and product lines, so the actual developer staffing disadvantage is probably more like 1:3 to 1:5 ratio), I am not too surprised to read that it is trying to improve their productivity where they can, and combining some portions of Windows and Linux code makes sense.
I would imagine that NVIDIA is going something like this already. (but closed source)
Again, I will almost bet that Intel will adopt AMD DC like concept in the next few years.
Let me know if I was right in a few years.

Regards,

Kevin Brace
The OpenChrome Project maintainer / developer
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-09 16:32 Jan Ziak
@ 2016-12-13  7:31 ` Michel Dänzer
  0 siblings, 0 replies; 66+ messages in thread
From: Michel Dänzer @ 2016-12-13  7:31 UTC (permalink / raw)
  To: Jan Ziak; +Cc: dri-devel

On 10/12/16 01:32 AM, Jan Ziak wrote:
> Hello Dave
> 
> Let's cool down the discussion a bit and try to work out a solution.
> 
> To summarize the facts, your decision implies that the probability of
> merging DAL/DC into the mainline Linux kernel the next year (2017) has
> become extremely low.
> 
> In essence, the strategy you are implicitly proposing is to move away
> from a software architecture which looks like this:
> 
>   APPLICATION
>   USERSPACE DRIVERS (OPENGL, XSERVER)
>   ----
>   HAL/DC IN AMDGPU.KO (FREESYNC, etc)
>   LINUX KERNEL SERVICES
>   HARDWARE
> 
> towards a software architecture looking like this:
> 
>   APPLICATION
>   USERSPACE DRIVERS (OPENGL, XSERVER)
>   USERSPACE HAL/DC IMPLEMENTATION (FREESYNC, etc)
>   ----
>   AMDGPU.KO
>   LINUX KERNEL SERVICES
>   HARDWARE

You misunderstood what Dave wrote.

The whole discussion is mostly about the DC related code in the amdgpu
driver and its interaction with core DRM/kernel code, i.e. mostly about
code under drivers/gpu/drm/. It doesn't affect anything outside of that,
certainly not how things are divided up between kernel and userspace.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [RFC] Using DC in amdgpu for upcoming GPU
@ 2016-12-09 16:32 Jan Ziak
  2016-12-13  7:31 ` Michel Dänzer
  0 siblings, 1 reply; 66+ messages in thread
From: Jan Ziak @ 2016-12-09 16:32 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 2603 bytes --]

Hello Dave

Let's cool down the discussion a bit and try to work out a solution.

To summarize the facts, your decision implies that the probability of
merging DAL/DC into the mainline Linux kernel the next year (2017) has
become extremely low.

In essence, the strategy you are implicitly proposing is to move away from
a software architecture which looks like this:

  APPLICATION
  USERSPACE DRIVERS (OPENGL, XSERVER)
  ----
  HAL/DC IN AMDGPU.KO (FREESYNC, etc)
  LINUX KERNEL SERVICES
  HARDWARE

towards a software architecture looking like this:

  APPLICATION
  USERSPACE DRIVERS (OPENGL, XSERVER)
  USERSPACE HAL/DC IMPLEMENTATION (FREESYNC, etc)
  ----
  AMDGPU.KO
  LINUX KERNEL SERVICES
  HARDWARE

For the future of Linux the latter basically means that the Linux kernel
won't be initializing display resolution (modesetting) when the machine is
booting. The initial modesetting will be performed by a user-space
executable launched by openrc/systemd/etc as soon as possible. Launching
the userspace modesetting executable will be among the first actions of
openrc/systemd/etc.

Note that during the 90-ties Linux-based systems _already_ had the xserver
responsible for modesetting. Linux gradually moved away from the 90-ties
software architecture towards an in-kernel modesetting architecture.

A citation from https://en.wikipedia.org/wiki/X.Org_Server is in order
here: "In ancient times, the mode-setting was done by some x-server
graphics device drivers specific to some video controller/graphics card. To
this mode-setting functionality, additional support for 2D acceleration was
added when such became available with various GPUs. The mode-setting
functionality was moved into the DRM and is being exposed through an DRM
mode-setting interface, the new approach being called "kernel mode-setting"
(KMS)."

The underlying simple hard fact behind the transition from 90-ties
user-modesetting to kernel-modesetting is that most Linux users prefer to
see a single display mode initialization which persists from machine boot
to machine shutdown. In the near future, the combination of the following
four factors:

1. General availability of 144Hz displays
2. Up/down-scaling of fullscreen OpenGL/Vulkan apps (virtual display
resolution)
3. Per-frame monitor refresh rate adjustment (freesync, g-sync)
4. Competition of innovations

will render non-native monitor resolutions and non-native physical
framerates completely obsolete. (3) is a transient phenomenon which will be
later superseded by further developments in the field of (1) towards
emergence of virtual refresh rates.

Jan

[-- Attachment #1.2: Type: text/html, Size: 3202 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 66+ messages in thread

end of thread, other threads:[~2016-12-15 15:48 UTC | newest]

Thread overview: 66+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-08  2:02 [RFC] Using DC in amdgpu for upcoming GPU Harry Wentland
2016-12-08  9:59 ` Daniel Vetter
     [not found]   ` <20161208095952.hnbfs4b3nac7faap-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
2016-12-08 14:33     ` Harry Wentland
2016-12-08 15:34       ` Daniel Vetter
     [not found]         ` <20161208153417.yrpbhmot5gfv37lo-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
2016-12-08 15:41           ` Christian König
2016-12-08 15:46             ` Daniel Vetter
2016-12-08 20:24             ` Matthew Macy
2016-12-08 17:40           ` Alex Deucher
2016-12-08 20:07     ` Dave Airlie
     [not found]       ` <CAPM=9tw=OLirgVU1RVxfPZ1PV64qtjOPTJ2q540=9VJhF4o2RQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-12-08 23:29         ` Dave Airlie
     [not found]           ` <CAPM=9tzqaSR3dUBV9RUmo-kQZ8VmNP=rdgiHwOBii=7A2X0Dew-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-12-09 17:26             ` Cheng, Tony
2016-12-09 19:59               ` Daniel Vetter
     [not found]                 ` <CAKMK7uGDUBHZKNEZTdOi2_66vKZmCsc+ViM0UyTdRPfnYa-Zww-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-12-09 20:34                   ` Dave Airlie
2016-12-09 20:38                     ` Daniel Vetter
2016-12-10  0:29                     ` Matthew Macy
2016-12-11 12:34                     ` Daniel Vetter
2016-12-09 17:56           ` Cheng, Tony
2016-12-09 17:32         ` Deucher, Alexander
     [not found]           ` <MWHPR12MB169473F270C372CE90D3A254F7870-Gy0DoCVfaSW4WA4dJ5YXGAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2016-12-09 20:30             ` Dave Airlie
     [not found]               ` <CAPM=9tw4U6Ps1KgTpn-Sq2esfqkmDCPvpoRXnJB-X6pwjbBmTw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-12-11  0:36                 ` Alex Deucher
2016-12-09 20:31           ` Daniel Vetter
2016-12-11 20:28 ` Daniel Vetter
     [not found]   ` <20161211202827.cif3jnbuouay6xyz-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
2016-12-13  2:33     ` Harry Wentland
     [not found]       ` <b64d0072-4909-c680-2f09-adae9f856642-5C7GfCeVMHo@public.gmane.org>
2016-12-13  4:10         ` Cheng, Tony
2016-12-13  7:50           ` Daniel Vetter
     [not found]           ` <3219f6f2-080e-0c77-45bb-cf59aa5b5858-5C7GfCeVMHo@public.gmane.org>
2016-12-13  7:30             ` Dave Airlie
2016-12-13  9:14               ` Cheng, Tony
2016-12-13 14:59             ` Rob Clark
2016-12-13  7:31         ` Daniel Vetter
2016-12-13 10:09         ` Ernst Sjöstrand
     [not found] ` <55d5e664-25f7-70e0-f2f5-9c9daf3efdf6-5C7GfCeVMHo@public.gmane.org>
2016-12-12  2:57   ` Dave Airlie
2016-12-12  7:09     ` Daniel Vetter
     [not found]     ` <CAPM=9tx+j9-3fZNY=peLjdsVqyLS6i3V-sV3XrnYsK2YuhWRBA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-12-12  3:21       ` Bridgman, John
2016-12-12  3:23         ` Bridgman, John
     [not found]           ` <BN6PR12MB13484A1D247707C399180266E8980-/b2+HYfkarQX0pEhCR5T8QdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2016-12-12  3:43             ` Bridgman, John
2016-12-12  4:05               ` Dave Airlie
2016-12-13  1:49       ` Harry Wentland
     [not found]         ` <634f5374-027a-6ec9-41a5-64351c4f7eac-5C7GfCeVMHo@public.gmane.org>
2016-12-13 12:22           ` Daniel Stone
2016-12-13 12:59             ` Daniel Vetter
     [not found]               ` <20161213125953.zczaojxp37yg6a6f-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
2016-12-14  1:50                 ` Michel Dänzer
     [not found]                   ` <afa3fdb6-1bb4-976e-d14f-b04ab8243819-otUistvHUpPR7s880joybQ@public.gmane.org>
2016-12-14 15:46                     ` Harry Wentland
     [not found]             ` <CAPj87rNrwsfAR75138WDQPbti_BmS_D-NxESZ075obcjO3T04g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-12-14 16:35               ` Alex Deucher
2016-12-13  2:52     ` Cheng, Tony
     [not found]       ` <5a1f2762-f1e0-05f1-3c16-173cb1f46571-5C7GfCeVMHo@public.gmane.org>
2016-12-13  7:09         ` Dave Airlie
2016-12-13  9:40       ` Lukas Wunner
     [not found]         ` <20161213094035.GA10916-JFq808J9C/izQB+pC5nmwQ@public.gmane.org>
2016-12-13 15:03           ` Cheng, Tony
2016-12-13 15:09             ` Deucher, Alexander
2016-12-13 15:57             ` Lukas Wunner
2016-12-14  9:57             ` Jani Nikula
2016-12-14 17:23               ` Cheng, Tony
     [not found]                 ` <d68102d4-b99c-cc60-4eb2-9c6295af130f-5C7GfCeVMHo@public.gmane.org>
2016-12-14 18:01                   ` Alex Deucher
     [not found]                     ` <CADnq5_Nha9502S=DOJDNepNv9CBV88=0R6N+tpBuO+U+s1eUQA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-12-14 18:16                       ` Cheng, Tony
2016-12-13 16:14           ` Bridgman, John
2016-12-12  7:22 ` Daniel Vetter
     [not found]   ` <20161212072243.ah6sy3q57z4gimka-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
2016-12-12  7:54     ` Bridgman, John
     [not found]       ` <BN6PR12MB13484DA35697DBD0CA815CFFE8980-/b2+HYfkarQX0pEhCR5T8QdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2016-12-12  9:27         ` Daniel Vetter
     [not found]           ` <20161212092727.6jgsgzlrdsha6zsl-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
2016-12-12  9:29             ` Daniel Vetter
2016-12-12 15:28           ` Deucher, Alexander
     [not found]             ` <MWHPR12MB1694EE6082AE9315EF5E6C68F7980-Gy0DoCVfaSW4WA4dJ5YXGAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2016-12-12 16:06               ` Luke A. Guest
2016-12-12 16:17               ` Luke A. Guest
     [not found]                 ` <584ECD8B.8000509-z/KZkw/0wg5BDgjK7y7TUQ@public.gmane.org>
2016-12-12 16:44                   ` Deucher, Alexander
2016-12-13  2:05     ` Harry Wentland
     [not found]       ` <2032d12b-f675-eb25-33bf-3aa0fcd20cb3-5C7GfCeVMHo@public.gmane.org>
2016-12-13  8:33         ` Daniel Vetter
2016-12-09 16:32 Jan Ziak
2016-12-13  7:31 ` Michel Dänzer
2016-12-15 15:48 Kevin Brace

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.