All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Sharma, Shashank" <shashank.sharma@amd.com>
To: Satyajit Sahu <satyajit.sahu@amd.com>, leo.liu@amd.com
Cc: Alexander.Deucher@amd.com, Christian.Koenig@amd.com,
	amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu/vcn:enable priority queues for encoder
Date: Tue, 17 Aug 2021 11:56:21 +0530	[thread overview]
Message-ID: <0214eee6-b1c7-d9d2-f938-f0d89d4070cf@amd.com> (raw)
In-Reply-To: <20210810070954.4480-1-satyajit.sahu@amd.com>

Hi Satyajit,

On 8/10/2021 12:39 PM, Satyajit Sahu wrote:
> VCN and VCE support multiple queues with different priority.
> Use differnt encoder queue based on the priority set by UMD.
> 
> Signed-off-by: Satyajit Sahu <satyajit.sahu@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c   | 35 +++++++++++++++++++++--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c   | 14 +++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h   |  9 ++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c   | 14 +++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h   |  8 ++++++
>   drivers/gpu/drm/amd/amdgpu/vce_v2_0.c     |  4 ++-
>   drivers/gpu/drm/amd/amdgpu/vce_v3_0.c     |  4 ++-
>   drivers/gpu/drm/amd/amdgpu/vce_v4_0.c     |  4 ++-
>   drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c     |  4 ++-
>   drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c     |  4 ++-
>   drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c     |  4 ++-
>   drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c     |  5 ++--
>   13 files changed, 99 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> index e7a010b7ca1f..b0ae2b45c7c3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> @@ -73,15 +73,44 @@ static enum gfx_pipe_priority amdgpu_ctx_sched_prio_to_compute_prio(enum drm_sch
>   	}
>   }
>   
> +static enum gfx_pipe_priority amdgpu_ctx_sched_prio_to_vcn_prio(enum drm_sched_priority prio)
> +{
> +	switch (prio) {
> +	case DRM_SCHED_PRIORITY_HIGH:
> +		return AMDGPU_VCN_ENC_PRIO_HIGH;
> +	case DRM_SCHED_PRIORITY_KERNEL:
> +		return AMDGPU_VCN_ENC_PRIO_VERY_HIGH;
> +	default:
> +		return AMDGPU_VCN_ENC_PRIO_NORMAL;
> +	}
> +}
> +
> +static enum gfx_pipe_priority amdgpu_ctx_sched_prio_to_vce_prio(enum drm_sched_priority prio)
> +{
> +	switch (prio) {
> +	case DRM_SCHED_PRIORITY_HIGH:
> +		return AMDGPU_VCE_ENC_PRIO_HIGH;
> +	case DRM_SCHED_PRIORITY_KERNEL:
> +		return AMDGPU_VCE_ENC_PRIO_VERY_HIGH;
> +	default:
> +		return AMDGPU_VCE_ENC_PRIO_NORMAL;
> +	}
> +}
> +

As it is also mentioned by Nirmoy, we can't use kernel priority for 
VCE/VCN ENC work due to the delay/stale punishment.

But adding DRM_SCHED_PRIORITY_VERY_HIGH also might not be a good way as 
other drivers might not be very interested in that (or are they ?)

I can think of two ways to go ahead from here:

Mrthod 1:
=========
- Introduce DRM_SCHED_PRIORITY_VERY_HIGH in the DRM scheduler layer, and 
demo a usecase of the same in our high priority VCN queues (in fact this 
should work for high priority gfx queues as well)

- Check the feedback from the community

If there are not many consumer or there is not much interest around:

Mehod 2:
========
- Add a AMDGPU specific priority flag while submitting the work
- Use this flag to keep the DRM scheduler level: DRM_SCHED_PRIORITY_HIGH 
but the AMD scheduler level: AMDGPU_VCE_ENC_PRIO_VERY_HIGH
- Pass this flag parameter to 
amdgpu_ctx_sched_prio_to_vcn_prio/amdgpu_ctx_sched_prio_to_vce_prio 
functions and set the proper scheduler level which will allow AMDGPU 
driver to schedule it in the correct ring

>   static unsigned int amdgpu_ctx_prio_sched_to_hw(struct amdgpu_device *adev,
>   						 enum drm_sched_priority prio,
>   						 u32 hw_ip)
>   {
>   	unsigned int hw_prio;
>   
> -	hw_prio = (hw_ip == AMDGPU_HW_IP_COMPUTE) ?
> -			amdgpu_ctx_sched_prio_to_compute_prio(prio) :
> -			AMDGPU_RING_PRIO_DEFAULT;
> +	if (hw_ip == AMDGPU_HW_IP_COMPUTE)
> +		hw_prio = amdgpu_ctx_sched_prio_to_compute_prio(prio);
> +	else if (hw_ip == AMDGPU_HW_IP_VCN_ENC)
> +		hw_prio = amdgpu_ctx_sched_prio_to_vcn_prio(prio);
> +	else if (hw_ip == AMDGPU_HW_IP_VCE)
> +		hw_prio = amdgpu_ctx_sched_prio_to_vce_prio(prio);
> +	else
> +		hw_prio = AMDGPU_RING_PRIO_DEFAULT;

Move to switch(case) probably for the better appearance ?

>   	hw_ip = array_index_nospec(hw_ip, AMDGPU_HW_IP_NUM);
>   	if (adev->gpu_sched[hw_ip][hw_prio].num_scheds == 0)
>   		hw_prio = AMDGPU_RING_PRIO_DEFAULT;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
> index b7d861ed5284..4087acb6b8bb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
> @@ -37,7 +37,7 @@ int amdgpu_to_sched_priority(int amdgpu_priority,
>   {
>   	switch (amdgpu_priority) {
>   	case AMDGPU_CTX_PRIORITY_VERY_HIGH:
> -		*prio = DRM_SCHED_PRIORITY_HIGH;
> +		*prio = DRM_SCHED_PRIORITY_KERNEL;
as mentioned above
>   		break;
>   	case AMDGPU_CTX_PRIORITY_HIGH:
>   		*prio = DRM_SCHED_PRIORITY_HIGH;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> index 1ae7f824adc7..9d59ca31bc22 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> @@ -1168,3 +1168,17 @@ int amdgpu_vce_ring_test_ib(struct amdgpu_ring *ring, long timeout)
>   	amdgpu_bo_free_kernel(&bo, NULL, NULL);
>   	return r;
>   }
> +
> +enum vce_enc_ring_priority get_vce_ring_prio(int index)
> +{
> +	switch(index) {
> +	case 0:
> +		return AMDGPU_VCE_ENC_PRIO_NORMAL;
> +	case 1:
> +		return AMDGPU_VCE_ENC_PRIO_HIGH;
> +	case 2:
> +		return AMDGPU_VCE_ENC_PRIO_VERY_HIGH;
Please use AMDGPU_VCE_ENC_PRIO_NORMAL/HIGH/VERY_HIGH instead of 0/1/2 
for better readability.
> +	default:
> +		return AMDGPU_VCE_ENC_PRIO_NORMAL;
> +	}
> +}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
> index d6d83a3ec803..f5265f385890 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
> @@ -32,6 +32,13 @@
>   
>   #define AMDGPU_VCE_FW_53_45	((53 << 24) | (45 << 16))
>   
> +enum vce_enc_ring_priority {
> +	AMDGPU_VCE_ENC_PRIO_NORMAL = 1,
> +	AMDGPU_VCE_ENC_PRIO_HIGH,
> +	AMDGPU_VCE_ENC_PRIO_VERY_HIGH,
> +	AMDGPU_VCE_ENC_PRIO_MAX
> +};
> +
>   struct amdgpu_vce {
>   	struct amdgpu_bo	*vcpu_bo;
>   	uint64_t		gpu_addr;
> @@ -72,4 +79,6 @@ void amdgpu_vce_ring_end_use(struct amdgpu_ring *ring);
>   unsigned amdgpu_vce_ring_get_emit_ib_size(struct amdgpu_ring *ring);
>   unsigned amdgpu_vce_ring_get_dma_frame_size(struct amdgpu_ring *ring);
>   
> +enum vce_enc_ring_priority get_vce_ring_prio(int index);
> +
>   #endif
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> index 6780df0fb265..ca6cc07a726b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> @@ -951,3 +951,17 @@ int amdgpu_vcn_enc_ring_test_ib(struct amdgpu_ring *ring, long timeout)
>   
>   	return r;
>   }
> +
> +enum vcn_enc_ring_priority get_vcn_enc_ring_prio(int index)
> +{
> +	switch(index) {
> +	case 0:
> +		return AMDGPU_VCN_ENC_PRIO_NORMAL;
> +	case 1:
> +		return AMDGPU_VCN_ENC_PRIO_HIGH;
> +	case 2:
> +		return AMDGPU_VCN_ENC_PRIO_VERY_HIGH;
Same here
> +	default:
> +		return AMDGPU_VCN_ENC_PRIO_NORMAL;
> +	}
> +}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
> index d74c62b49795..bf14ee54f774 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
> @@ -290,6 +290,13 @@ enum vcn_ring_type {
>   	VCN_UNIFIED_RING,
>   };
>   
> +enum vcn_enc_ring_priority {
> +	AMDGPU_VCN_ENC_PRIO_NORMAL = 1,
> +	AMDGPU_VCN_ENC_PRIO_HIGH,
> +	AMDGPU_VCN_ENC_PRIO_VERY_HIGH,
> +	AMDGPU_VCN_ENC_PRIO_MAX
> +};
> +
>   int amdgpu_vcn_sw_init(struct amdgpu_device *adev);
>   int amdgpu_vcn_sw_fini(struct amdgpu_device *adev);
>   int amdgpu_vcn_suspend(struct amdgpu_device *adev);
> @@ -307,5 +314,6 @@ int amdgpu_vcn_dec_sw_ring_test_ib(struct amdgpu_ring *ring, long timeout);
>   
>   int amdgpu_vcn_enc_ring_test_ring(struct amdgpu_ring *ring);
>   int amdgpu_vcn_enc_ring_test_ib(struct amdgpu_ring *ring, long timeout);
> +enum vcn_enc_ring_priority get_vcn_enc_ring_prio(int index);
>   
>   #endif
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
> index c7d28c169be5..2b6b7f1a77b9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
> @@ -431,10 +431,12 @@ static int vce_v2_0_sw_init(void *handle)
>   		return r;
>   
>   	for (i = 0; i < adev->vce.num_rings; i++) {
> +		unsigned int hw_prio = get_vce_ring_prio(i);
> +
>   		ring = &adev->vce.ring[i];
>   		sprintf(ring->name, "vce%d", i);
>   		r = amdgpu_ring_init(adev, ring, 512, &adev->vce.irq, 0,
> -				     AMDGPU_RING_PRIO_DEFAULT, NULL);
> +				     hw_prio, NULL);
>   		if (r)
>   			return r;
>   	}
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> index 3b82fb289ef6..5ce182a837f3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> @@ -440,10 +440,12 @@ static int vce_v3_0_sw_init(void *handle)
>   		return r;
>   
>   	for (i = 0; i < adev->vce.num_rings; i++) {
> +		unsigned int hw_prio = get_vce_ring_prio(i);
> +
>   		ring = &adev->vce.ring[i];
>   		sprintf(ring->name, "vce%d", i);
>   		r = amdgpu_ring_init(adev, ring, 512, &adev->vce.irq, 0,
> -				     AMDGPU_RING_PRIO_DEFAULT, NULL);
> +				     hw_prio, NULL);
>   		if (r)
>   			return r;
>   	}
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> index 90910d19db12..c085defaabfe 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> @@ -463,6 +463,8 @@ static int vce_v4_0_sw_init(void *handle)
>   	}
>   
>   	for (i = 0; i < adev->vce.num_rings; i++) {
> +		unsigned int hw_prio = get_vce_ring_prio(i);
> +
>   		ring = &adev->vce.ring[i];
>   		sprintf(ring->name, "vce%d", i);
>   		if (amdgpu_sriov_vf(adev)) {
> @@ -478,7 +480,7 @@ static int vce_v4_0_sw_init(void *handle)
>   				ring->doorbell_index = adev->doorbell_index.uvd_vce.vce_ring2_3 * 2 + 1;
>   		}
>   		r = amdgpu_ring_init(adev, ring, 512, &adev->vce.irq, 0,
> -				     AMDGPU_RING_PRIO_DEFAULT, NULL);
> +				     hw_prio, NULL);
>   		if (r)
>   			return r;
>   	}
> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
> index 121ee9f2b8d1..f471c65d78bc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
> @@ -145,10 +145,12 @@ static int vcn_v1_0_sw_init(void *handle)
>   		SOC15_REG_OFFSET(UVD, 0, mmUVD_NO_OP);
>   
>   	for (i = 0; i < adev->vcn.num_enc_rings; ++i) {
> +		unsigned int hw_prio = get_vcn_enc_ring_prio(i);
> +
>   		ring = &adev->vcn.inst->ring_enc[i];
>   		sprintf(ring->name, "vcn_enc%d", i);
>   		r = amdgpu_ring_init(adev, ring, 512, &adev->vcn.inst->irq, 0,
> -				     AMDGPU_RING_PRIO_DEFAULT, NULL);
> +				     hw_prio, NULL);
>   		if (r)
>   			return r;
>   	}
> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
> index f4686e918e0d..06978d5a93c7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
> @@ -159,6 +159,8 @@ static int vcn_v2_0_sw_init(void *handle)
>   	adev->vcn.inst->external.nop = SOC15_REG_OFFSET(UVD, 0, mmUVD_NO_OP);
>   
>   	for (i = 0; i < adev->vcn.num_enc_rings; ++i) {
> +		unsigned int hw_prio = get_vcn_enc_ring_prio(i);
> +
>   		ring = &adev->vcn.inst->ring_enc[i];
>   		ring->use_doorbell = true;
>   		if (!amdgpu_sriov_vf(adev))
> @@ -167,7 +169,7 @@ static int vcn_v2_0_sw_init(void *handle)
>   			ring->doorbell_index = (adev->doorbell_index.vcn.vcn_ring0_1 << 1) + 1 + i;
>   		sprintf(ring->name, "vcn_enc%d", i);
>   		r = amdgpu_ring_init(adev, ring, 512, &adev->vcn.inst->irq, 0,
> -				     AMDGPU_RING_PRIO_DEFAULT, NULL);
> +				     hw_prio, NULL);
>   		if (r)
>   			return r;
>   	}
> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
> index e0c0c3734432..fba453ca74fa 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
> @@ -194,6 +194,8 @@ static int vcn_v2_5_sw_init(void *handle)
>   			return r;
>   
>   		for (i = 0; i < adev->vcn.num_enc_rings; ++i) {
> +			unsigned int hw_prio = get_vcn_enc_ring_prio(i);
> +
>   			ring = &adev->vcn.inst[j].ring_enc[i];
>   			ring->use_doorbell = true;
>   
> @@ -203,7 +205,7 @@ static int vcn_v2_5_sw_init(void *handle)
>   			sprintf(ring->name, "vcn_enc_%d.%d", j, i);
>   			r = amdgpu_ring_init(adev, ring, 512,
>   					     &adev->vcn.inst[j].irq, 0,
> -					     AMDGPU_RING_PRIO_DEFAULT, NULL);
> +					     hw_prio, NULL);
>   			if (r)
>   				return r;
>   		}
> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
> index 3d18aab88b4e..f5baf7321b0d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
> @@ -224,6 +224,8 @@ static int vcn_v3_0_sw_init(void *handle)
>   			return r;
>   
>   		for (j = 0; j < adev->vcn.num_enc_rings; ++j) {
> +			unsigned int hw_prio = get_vcn_enc_ring_prio(j);
> +
>   			/* VCN ENC TRAP */
>   			r = amdgpu_irq_add_id(adev, amdgpu_ih_clientid_vcns[i],
>   				j + VCN_2_0__SRCID__UVD_ENC_GENERAL_PURPOSE, &adev->vcn.inst[i].irq);
> @@ -239,8 +241,7 @@ static int vcn_v3_0_sw_init(void *handle)
>   			}
>   			sprintf(ring->name, "vcn_enc_%d.%d", i, j);
>   			r = amdgpu_ring_init(adev, ring, 512, &adev->vcn.inst[i].irq, 0,
> -					     AMDGPU_RING_PRIO_DEFAULT,
> -					     &adev->vcn.inst[i].sched_score);
> +					     hw_prio, &adev->vcn.inst[i].sched_score);
>   			if (r)
>   				return r;
>   		}
> 
I think this patch can be split into 2,
  patch 1: introduce the priority core logic and functions
  patch 2: consume priority logic across various vcn generations sw_init 
functions (vcn_v1_sw_init, vcn_v2_sw_init, vcn_v2_5_sw_init and so on)

- Shashank


      parent reply	other threads:[~2021-08-17  6:26 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-10  7:09 [PATCH] drm/amdgpu/vcn:enable priority queues for encoder Satyajit Sahu
2021-08-10  8:31 ` Das, Nirmoy
2021-08-10  9:38   ` Sahu, Satyajit
2021-08-10 12:42     ` Das, Nirmoy
2021-08-11  3:51       ` Sahu, Satyajit
2021-08-17  6:26 ` Sharma, Shashank [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0214eee6-b1c7-d9d2-f938-f0d89d4070cf@amd.com \
    --to=shashank.sharma@amd.com \
    --cc=Alexander.Deucher@amd.com \
    --cc=Christian.Koenig@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=leo.liu@amd.com \
    --cc=satyajit.sahu@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.