dri-devel.lists.freedesktop.org archive mirror
* [PATCH libdrm] drm/amdgpu: add new low overhead command submission API. (v2)
@ 2017-07-18  3:52 Dave Airlie
  2017-07-18 13:59 ` Christian König
  2017-08-10 13:56 ` Emil Velikov
  0 siblings, 2 replies; 5+ messages in thread
From: Dave Airlie @ 2017-07-18  3:52 UTC (permalink / raw)
  To: dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

From: Dave Airlie <airlied@redhat.com>

This just sends chunks to the kernel API for a single command
stream.

This should provide a more future-proof and extensible API
for command submission.

v2: use amdgpu_bo_list_handle, add two helper functions to
access bo and context internals.

Signed-off-by: Dave Airlie <airlied@redhat.com>
---
 amdgpu/amdgpu.h    | 30 ++++++++++++++++++++++++++++++
 amdgpu/amdgpu_cs.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 77 insertions(+)

diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
index 183f974..238b1aa 100644
--- a/amdgpu/amdgpu.h
+++ b/amdgpu/amdgpu.h
@@ -1382,6 +1382,36 @@ int amdgpu_cs_import_syncobj(amdgpu_device_handle dev,
 			     int shared_fd,
 			     uint32_t *syncobj);
 
+/**
+ *  Submit raw command submission to kernel
+ *
+ * \param   dev	       - \c [in] device handle
+ * \param   context    - \c [in] context handle for context id
+ * \param   bo_list_handle - \c [in] request bo list handle (NULL for none)
+ * \param   num_chunks - \c [in] number of CS chunks to submit
+ * \param   chunks     - \c [in] array of CS chunks
+ * \param   seq_no     - \c [out] output sequence number for submission.
+ *
+ * \return   0 on success\n
+ *          <0 - Negative POSIX Error code
+ *
+ */
+struct drm_amdgpu_cs_chunk;
+struct drm_amdgpu_cs_chunk_dep;
+struct drm_amdgpu_cs_chunk_data;
+
+int amdgpu_cs_submit_raw(amdgpu_device_handle dev,
+			 amdgpu_context_handle context,
+			 amdgpu_bo_list_handle bo_list_handle,
+			 int num_chunks,
+			 struct drm_amdgpu_cs_chunk *chunks,
+			 uint64_t *seq_no);
+
+void amdgpu_cs_chunk_fence_to_dep(struct amdgpu_cs_fence *fence,
+				  struct drm_amdgpu_cs_chunk_dep *dep);
+void amdgpu_cs_chunk_fence_info_to_data(struct amdgpu_cs_fence_info *fence_info,
+					struct drm_amdgpu_cs_chunk_data *data);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c
index 722fd75..dfba875 100644
--- a/amdgpu/amdgpu_cs.c
+++ b/amdgpu/amdgpu_cs.c
@@ -634,3 +634,50 @@ int amdgpu_cs_import_syncobj(amdgpu_device_handle dev,
 
 	return drmSyncobjFDToHandle(dev->fd, shared_fd, handle);
 }
+
+int amdgpu_cs_submit_raw(amdgpu_device_handle dev,
+			 amdgpu_context_handle context,
+			 amdgpu_bo_list_handle bo_list_handle,
+			 int num_chunks,
+			 struct drm_amdgpu_cs_chunk *chunks,
+			 uint64_t *seq_no)
+{
+	union drm_amdgpu_cs cs = {0};
+	uint64_t *chunk_array;
+	int i, r;
+	if (num_chunks == 0)
+		return -EINVAL;
+
+	chunk_array = alloca(sizeof(uint64_t) * num_chunks);
+	for (i = 0; i < num_chunks; i++)
+		chunk_array[i] = (uint64_t)(uintptr_t)&chunks[i];
+	cs.in.chunks = (uint64_t)(uintptr_t)chunk_array;
+	cs.in.ctx_id = context->id;
+	cs.in.bo_list_handle = bo_list_handle ? bo_list_handle->handle : 0;
+	cs.in.num_chunks = num_chunks;
+	r = drmCommandWriteRead(dev->fd, DRM_AMDGPU_CS,
+				&cs, sizeof(cs));
+	if (r)
+		return r;
+
+	if (seq_no)
+		*seq_no = cs.out.handle;
+	return 0;
+}
+
+void amdgpu_cs_chunk_fence_info_to_data(struct amdgpu_cs_fence_info *fence_info,
+					struct drm_amdgpu_cs_chunk_data *data)
+{
+	data->fence_data.handle = fence_info->handle->handle;
+	data->fence_data.offset = fence_info->offset * sizeof(uint64_t);
+}
+
+void amdgpu_cs_chunk_fence_to_dep(struct amdgpu_cs_fence *fence,
+				  struct drm_amdgpu_cs_chunk_dep *dep)
+{
+	dep->ip_type = fence->ip_type;
+	dep->ip_instance = fence->ip_instance;
+	dep->ring = fence->ring;
+	dep->ctx_id = fence->context->id;
+	dep->handle = fence->fence;
+}
-- 
2.9.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
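For readers outside libdrm, the core of the new API is the pointer-packing loop in amdgpu_cs_submit_raw(): the kernel expects a u64 array where each entry holds the user-space address of one chunk. A minimal, self-contained sketch of that step follows; the struct here only mirrors the layout of struct drm_amdgpu_cs_chunk from amdgpu_drm.h so the example compiles without kernel headers, and the helper name is illustrative, not part of the patch.

```c
#include <stdint.h>
#include <stddef.h>

/* Stand-in mirroring struct drm_amdgpu_cs_chunk (amdgpu_drm.h),
 * redefined locally only so this sketch is self-contained. */
struct cs_chunk {
	uint32_t chunk_id;   /* e.g. AMDGPU_CHUNK_ID_IB */
	uint32_t length_dw;  /* payload length in dwords */
	uint64_t chunk_data; /* user pointer, cast to u64 */
};

/* Pack an array of chunk pointers the way the CS ioctl expects:
 * each u64 entry is the user-space address of one chunk.
 * Returns the number of entries written, 0 on bad input. */
static int pack_chunk_array(struct cs_chunk *chunks, int num_chunks,
			    uint64_t *out)
{
	int i;

	if (num_chunks <= 0 || !chunks || !out)
		return 0;
	for (i = 0; i < num_chunks; i++)
		out[i] = (uint64_t)(uintptr_t)&chunks[i];
	return num_chunks;
}
```

In the patch itself this packed array is alloca()'d and handed to the DRM_AMDGPU_CS ioctl through cs.in.chunks.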


* Re: [PATCH libdrm] drm/amdgpu: add new low overhead command submission API. (v2)
  2017-07-18  3:52 [PATCH libdrm] drm/amdgpu: add new low overhead command submission API. (v2) Dave Airlie
@ 2017-07-18 13:59 ` Christian König
  2017-08-10 13:56 ` Emil Velikov
  1 sibling, 0 replies; 5+ messages in thread
From: Christian König @ 2017-07-18 13:59 UTC (permalink / raw)
  To: Dave Airlie, dri-devel, amd-gfx

On 18.07.2017 05:52, Dave Airlie wrote:
> From: Dave Airlie <airlied@redhat.com>
>
> This just sends chunks to the kernel API for a single command
> stream.
>
> This should provide a more future-proof and extensible API
> for command submission.
>
> v2: use amdgpu_bo_list_handle, add two helper functions to
> access bo and context internals.
>
> Signed-off-by: Dave Airlie <airlied@redhat.com>

Reviewed-by: Christian König <christian.koenig@amd.com>





* Re: [PATCH libdrm] drm/amdgpu: add new low overhead command submission API. (v2)
  2017-07-18  3:52 [PATCH libdrm] drm/amdgpu: add new low overhead command submission API. (v2) Dave Airlie
  2017-07-18 13:59 ` Christian König
@ 2017-08-10 13:56 ` Emil Velikov
       [not found]   ` <CACvgo50nHR-L_ExwdvT48t9dHXrs50CGOmcR4j1SzFBVP6XHiA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 5+ messages in thread
From: Emil Velikov @ 2017-08-10 13:56 UTC (permalink / raw)
  To: Dave Airlie; +Cc: amd-gfx mailing list, ML dri-devel

Hi Dave,

On 18 July 2017 at 04:52, Dave Airlie <airlied@gmail.com> wrote:

> +int amdgpu_cs_submit_raw(amdgpu_device_handle dev,
> +                        amdgpu_context_handle context,
> +                        amdgpu_bo_list_handle bo_list_handle,
> +                        int num_chunks,
> +                        struct drm_amdgpu_cs_chunk *chunks,
> +                        uint64_t *seq_no)
> +{
> +       union drm_amdgpu_cs cs = {0};
> +       uint64_t *chunk_array;
> +       int i, r;
> +       if (num_chunks == 0)
> +               return -EINVAL;
> +
> +       chunk_array = alloca(sizeof(uint64_t) * num_chunks);
Out of curiosity:
Does malloc/free produce noticeable overhead that led you to alloca?

num_chunks is signed - should we bail on negative values, or can we make
it unsigned?

Thanks
Emil


* Re: [PATCH libdrm] drm/amdgpu: add new low overhead command submission API. (v2)
       [not found]   ` <CACvgo50nHR-L_ExwdvT48t9dHXrs50CGOmcR4j1SzFBVP6XHiA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-08-10 19:43     ` Dave Airlie
       [not found]       ` <CAPM=9tx0HMQ8oqzm4o1g0JLBjyCmWX4jyaHWYAPnS0fJzM9u6g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Airlie @ 2017-08-10 19:43 UTC (permalink / raw)
  To: Emil Velikov; +Cc: amd-gfx mailing list, ML dri-devel

On 10 August 2017 at 23:56, Emil Velikov <emil.l.velikov@gmail.com> wrote:
> Hi Dave,
>
> On 18 July 2017 at 04:52, Dave Airlie <airlied@gmail.com> wrote:
>
>> +int amdgpu_cs_submit_raw(amdgpu_device_handle dev,
>> +                        amdgpu_context_handle context,
>> +                        amdgpu_bo_list_handle bo_list_handle,
>> +                        int num_chunks,
>> +                        struct drm_amdgpu_cs_chunk *chunks,
>> +                        uint64_t *seq_no)
>> +{
>> +       union drm_amdgpu_cs cs = {0};
>> +       uint64_t *chunk_array;
>> +       int i, r;
>> +       if (num_chunks == 0)
>> +               return -EINVAL;
>> +
>> +       chunk_array = alloca(sizeof(uint64_t) * num_chunks);
> Out of curiosity:
> Does malloc/free produce noticeable overhead that led you to alloca?

The preexisting code for these ioctls used alloca so I just followed
the existing pattern.

I doubt we'd notice malloc/free, but we also shouldn't be sending more
than 5-10 chunks even in the future, so stack allocation should be fine.

> num_chunks is signed - should we bail on negative values, can we make
> it unsigned?

Possibly; I don't see random users of this API appearing, though, but yeah,
we could change the check to <= 0.

Dave.
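The alloca-vs-malloc trade-off Dave describes can also be split down the middle: stack storage for the handful of chunks seen in practice, with a heap fallback for anything larger. A hedged sketch of that pattern (the helper name and the threshold are illustrative, not libdrm API):

```c
#include <stdint.h>
#include <stdlib.h>
#include <stddef.h>

/* Plenty for the 5-10 chunks expected per submission. */
#define SMALL_CHUNKS 16

/* Return a u64 array with room for num_chunks entries.
 * Uses the caller-provided stack buffer when the count is small;
 * otherwise falls back to malloc. *heap is set to 1 when the
 * caller must free() the result, 0 otherwise. */
static uint64_t *chunk_array_alloc(int num_chunks,
				   uint64_t stack_buf[SMALL_CHUNKS],
				   int *heap)
{
	if (num_chunks <= 0)
		return NULL;
	if (num_chunks <= SMALL_CHUNKS) {
		*heap = 0;
		return stack_buf;
	}
	*heap = 1;
	return malloc(sizeof(uint64_t) * (size_t)num_chunks);
}
```

This keeps the common path allocation-free while removing the unbounded stack growth that makes alloca risky for caller-controlled counts.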


* Re: [PATCH libdrm] drm/amdgpu: add new low overhead command submission API. (v2)
       [not found]       ` <CAPM=9tx0HMQ8oqzm4o1g0JLBjyCmWX4jyaHWYAPnS0fJzM9u6g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-08-11  7:41         ` Christian König
  0 siblings, 0 replies; 5+ messages in thread
From: Christian König @ 2017-08-11  7:41 UTC (permalink / raw)
  To: Dave Airlie, Emil Velikov; +Cc: ML dri-devel, amd-gfx mailing list

On 10.08.2017 21:43, Dave Airlie wrote:
> On 10 August 2017 at 23:56, Emil Velikov <emil.l.velikov@gmail.com> wrote:
>> Hi Dave,
>>
>> On 18 July 2017 at 04:52, Dave Airlie <airlied@gmail.com> wrote:
>>
>>> +int amdgpu_cs_submit_raw(amdgpu_device_handle dev,
>>> +                        amdgpu_context_handle context,
>>> +                        amdgpu_bo_list_handle bo_list_handle,
>>> +                        int num_chunks,
>>> +                        struct drm_amdgpu_cs_chunk *chunks,
>>> +                        uint64_t *seq_no)
>>> +{
>>> +       union drm_amdgpu_cs cs = {0};
>>> +       uint64_t *chunk_array;
>>> +       int i, r;
>>> +       if (num_chunks == 0)
>>> +               return -EINVAL;
>>> +
>>> +       chunk_array = alloca(sizeof(uint64_t) * num_chunks);
>> Out of curiosity:
>> Does malloc/free produce noticeable overhead that led you to alloca?
> The preexisting code for these ioctls used alloca so I just followed
> the existing pattern.
>
> I doubt we'd notice malloc/free, but we also shouldn't be sending more
> than 5-10 chunks even in the future, so stack allocation should be fine.
>
>> num_chunks is signed - should we bail on negative values, can we make
>> it unsigned?
> Possibly; I don't see random users of this API appearing, though, but yeah,
> we could change the check to <= 0.

I would rather make the variable unsigned, but yeah, it's not so much of
an issue.

Christian.

>
> Dave.


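Christian's preference, taking num_chunks as unsigned and rejecting bad values up front, can be sketched as a small validation helper. This is a hypothetical illustration, not libdrm API, and the upper bound is purely defensive, not a limit the kernel enforces at this value:

```c
#include <errno.h>

/* Arbitrary defensive cap; real submissions use single-digit counts. */
#define MAX_CHUNKS 1024U

/* With an unsigned parameter, a negative int passed by a careless
 * caller converts to a huge value, so a single upper-bound check
 * covers both "negative" and "absurdly large".
 * Returns 0 when the count is usable, -EINVAL otherwise. */
static int validate_num_chunks(unsigned int num_chunks)
{
	if (num_chunks == 0)           /* zero chunks: nothing to submit */
		return -EINVAL;
	if (num_chunks > MAX_CHUNKS)   /* catches (unsigned)-1 etc. */
		return -EINVAL;
	return 0;
}
```

With this shape the alloca (or any allocation) in the submit path can never see a negative or wrapped size.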

