dri-devel.lists.freedesktop.org archive mirror
* About upstreaming ArmChina NPU driver
@ 2024-03-28  7:46 Dejia Shang
  2024-03-28 10:32 ` Sudeep Holla
  2024-04-03  6:25 ` Oded Gabbay
  0 siblings, 2 replies; 5+ messages in thread
From: Dejia Shang @ 2024-03-28  7:46 UTC (permalink / raw)
  To: ogabbay, airlied, daniel; +Cc: linux-kernel, dri-devel, linux-arm-kernel

Dear Kernel Maintainers,

I am a driver developer and would like to upstream the ArmChina Zhouyi NPU driver ("Zhouyi" is the brand) to the accel subsystem.

The driver is already open sourced (both UMD and KMD) and anyone can find the code at https://github.com/Arm-China/Compass_NPU_Driver.git.

This driver is responsible for scheduling AI inference tasks to the NPU cores (V1/V2/V3). Specifically, a simplified end-to-end flow is:

        1. A TFLite/ONNX model is transformed to an executable binary file in ELF format by the NN graph compiler (designed by ArmChina)
        2. An application loads the executable binary file to UMD and provides the input data.
        3. UMD parses the binary and sends ioctls to KMD (open device, do memory allocation/mmap/free, submit the job descriptor).
        4. KMD dispatches the job to NPU h/w, handles interrupts and updates the execution status.
        5. UMD polls the status of the pre-scheduled job.
        6. The application gets the output results.
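
Steps 3 to 5 of the flow above can be pictured with a toy model. This is a minimal sketch in Python, not code from the Zhouyi driver: `ToyKmd`, `Job`, `JobStatus` and all method names are hypothetical, the "NPU" is a single simulated core, and the ioctl and interrupt paths are ordinary method calls.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum, auto


class JobStatus(Enum):
    PENDING = auto()  # queued in the KMD, not yet on hardware
    RUNNING = auto()  # dispatched to the (simulated) NPU core
    DONE = auto()     # completion interrupt has been handled


@dataclass
class Job:
    job_id: int
    descriptor: bytes  # opaque job descriptor handed over by the UMD
    status: JobStatus = JobStatus.PENDING


class ToyKmd:
    """Hypothetical stand-in for the kernel-mode driver with one core."""

    def __init__(self):
        self._next_id = 0
        self._jobs = {}
        self._queue = deque()
        self._running = None

    def submit(self, descriptor: bytes) -> int:
        """Step 3: the UMD submits a job descriptor (an ioctl in a real driver)."""
        job = Job(self._next_id, descriptor)
        self._next_id += 1
        self._jobs[job.job_id] = job
        self._queue.append(job)
        self._dispatch()
        return job.job_id

    def _dispatch(self):
        """Step 4: hand the oldest queued job to the core if it is idle."""
        if self._running is None and self._queue:
            self._running = self._queue.popleft()
            self._running.status = JobStatus.RUNNING

    def on_interrupt(self):
        """Step 4: completion interrupt; mark the job done, start the next one."""
        if self._running is not None:
            self._running.status = JobStatus.DONE
            self._running = None
            self._dispatch()

    def poll(self, job_id: int) -> JobStatus:
        """Step 5: the UMD polls the status of a previously submitted job."""
        return self._jobs[job_id].status


kmd = ToyKmd()
first = kmd.submit(b"descriptor-0")
second = kmd.submit(b"descriptor-1")
print(kmd.poll(first).name, kmd.poll(second).name)  # RUNNING PENDING
kmd.on_interrupt()  # simulated completion of the first job
print(kmd.poll(first).name, kmd.poll(second).name)  # DONE RUNNING
```

The second job stays PENDING until the first one completes, which is the "deferred job" behaviour a driver-side scheduler has to provide.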

So...for the upstreaming,

Q1: do you think our NPU driver is suitable for accel? If the answer is yes, which tree & branch should the patches be based on?

Q2: in thread https://lore.kernel.org/lkml/ec547d33-214f-4952-aa33-c271e9edad63@kernel.org/ showing a similar case, Oded mentioned that:

        "If we would have upstreamed a new driver, the expectation would have been that we would use some drm mechanisms.", and
        "the minimal requirement is to use GEM/BOs for memory management operations".

I guess those requirements also apply to the Zhouyi NPU KMD? Currently, the memory management (MM) in KMD is based on the dma-mapping APIs; it handles both reserved CMA region(s) and SMMU-mapped buffers, and supports the dma-buf framework. Maybe I should replace the implementation with DRM APIs.

Q3: if you have looked at the KMD code, do you think I should make any other major changes before submitting the first patch series? Thank you!

Thanks for your time and look forward to your reply~ 😊

Best Regards,
Dejia
IMPORTANT NOTICE: The contents of this email and any attachments may be privileged and confidential. If you are not the intended recipient, please delete the email immediately. It is strictly prohibited to disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you. ©Arm Technology (China) Co., Ltd copyright and reserve all rights. 重要提示:本邮件(包括任何附件)可能含有专供明确的个人或目的使用的机密信息,并受法律保护。如果您并非该收件人,请立即删除此邮件。严禁通过任何渠道,以任何目的,向任何人披露、储存或复制邮件信息或者据此采取任何行动。感谢您的配合。 ©安谋科技(中国)有限公司 版权所有并保留一切权利。


* Re: About upstreaming ArmChina NPU driver
  2024-03-28  7:46 About upstreaming ArmChina NPU driver Dejia Shang
@ 2024-03-28 10:32 ` Sudeep Holla
  2024-04-03  9:42   ` Dejia Shang
  2024-04-03  6:25 ` Oded Gabbay
  1 sibling, 1 reply; 5+ messages in thread
From: Sudeep Holla @ 2024-03-28 10:32 UTC (permalink / raw)
  To: Dejia Shang
  Cc: ogabbay, Sudeep Holla, airlied, daniel, linux-kernel, dri-devel,
	linux-arm-kernel

On Thu, Mar 28, 2024 at 07:46:01AM +0000, Dejia Shang wrote:
> IMPORTANT NOTICE: The contents of this email and any attachments may be privileged and confidential. If you are not the intended recipient, please delete the email immediately. It is strictly prohibited to disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you. ©Arm Technology (China) Co., Ltd copyright and reserve all rights. 重要提示:本邮件(包括任何附件)可能含有专供明确的个人或目的使用的机密信息,并受法律保护。如果您并非该收件人,请立即删除此邮件。严禁通过任何渠道,以任何目的,向任何人披露、储存或复制邮件信息或者据此采取任何行动。感谢您的配合。 ©安谋科技(中国)有限公司 版权所有并保留一切权利。

You need to get this fixed, otherwise people will delete this email
as you have suggested and/or refrain from responding to this email.

Please talk to your local IT and get a setup without this disclaimer for
all mailing list activities.

--
Regards,
Sudeep


* Re: About upstreaming ArmChina NPU driver
  2024-03-28  7:46 About upstreaming ArmChina NPU driver Dejia Shang
  2024-03-28 10:32 ` Sudeep Holla
@ 2024-04-03  6:25 ` Oded Gabbay
  2024-04-03 10:09   ` Dejia Shang
  1 sibling, 1 reply; 5+ messages in thread
From: Oded Gabbay @ 2024-04-03  6:25 UTC (permalink / raw)
  To: Dejia Shang
  Cc: ogabbay, airlied, daniel, linux-kernel, dri-devel, linux-arm-kernel

On Thu, Mar 28, 2024 at 10:01 AM Dejia Shang <Dejia.Shang@armchina.com> wrote:
>
> Dear Kernel Maintainers,
>
> I am a driver developer and would like to upstream the ArmChina Zhouyi NPU driver ("Zhouyi" is the brand) to the accel subsystem.
>
> The driver is already open sourced (both UMD and KMD) and anyone can find the code at https://github.com/Arm-China/Compass_NPU_Driver.git.
>
> This driver is responsible for scheduling AI inference tasks to the NPU cores (V1/V2/V3). Specifically, a simplified end-to-end flow is:
>
>         1. A TFLite/ONNX model is transformed to an executable binary file in ELF format by the NN graph compiler (designed by ArmChina)
>         2. An application loads the executable binary file to UMD and provides the input data.
>         3. UMD parses the binary and sends ioctls to KMD (open device, do memory allocation/mmap/free, submit the job descriptor).
>         4. KMD dispatches the job to NPU h/w, handles interrupts and updates the execution status.
>         5. UMD polls the status of the pre-scheduled job.
>         6. The application gets the output results.
>
> So...for the upstreaming,
>
> Q1: do you think our NPU driver is suitable for accel? If the answer is yes, which tree & branch should the patches be based on?
Hi Dejia,
Yes, it definitely sounds like a good fit for the accel subsystem.
Please base your patches on the "drm-misc-next" branch in the drm-misc repo:
https://anongit.freedesktop.org/git/drm/drm-misc.git

>
> Q2: in thread https://lore.kernel.org/lkml/ec547d33-214f-4952-aa33-c271e9edad63@kernel.org/ showing a similar case, Oded mentioned that:
>
>         "If we would have upstreamed a new driver, the expectation would have been that we would use some drm mechanisms.", and
>         "the minimal requirement is to use GEM/BOs for memory management operations".
>
> I guess those requirements also apply to the Zhouyi NPU KMD? Currently, the memory management (MM) in KMD is based on the dma-mapping APIs; it handles both reserved CMA region(s) and SMMU-mapped buffers, and supports the dma-buf framework. Maybe I should replace the implementation with DRM APIs.
Yes, those requirements definitely apply here.
>
> Q3: if you have looked at the KMD code, do you think I should make any other major changes before submitting the first patch series? Thank you!
I took a quick glance. In general, it seems to be ok, but I noticed
two things related to the integration with drm/accel:

1. You use a scheduler for the job submission, which provides the
ability to defer jobs. In that case, I suggest checking whether you can
use drm_sched instead of your own implementation. No point in
re-inventing the wheel.
2. You provide several memory zones for allocation of memory. I would
suggest here to look at using ttm as the memory manager instead of
re-implementing your own.
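
As a rough illustration of point 2: one service a memory manager such as TTM provides is trying an ordered list of candidate placements for a buffer instead of hard-coding one zone per allocation. The sketch below is a toy Python model of that idea only. It is not the TTM API; `Zone`, `alloc_with_fallback`, the zone names and the sizes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    """Hypothetical memory zone with a fixed capacity in bytes."""
    name: str
    capacity: int
    used: int = 0

    def try_alloc(self, size: int) -> bool:
        """Reserve `size` bytes if they fit; report whether that succeeded."""
        if self.used + size > self.capacity:
            return False
        self.used += size
        return True


def alloc_with_fallback(zones, preference, size):
    """Try each zone in `preference` order; return the name of the zone used.

    Returns None when no listed zone has room, so the caller can fail
    the allocation cleanly.
    """
    for name in preference:
        if zones[name].try_alloc(size):
            return name
    return None


# Assumed setup: a small reserved region and a larger SMMU-backed pool.
zones = {
    "reserved": Zone("reserved", capacity=4096),
    "smmu": Zone("smmu", capacity=1 << 20),
}
order = ["reserved", "smmu"]

a = alloc_with_fallback(zones, order, 3000)  # fits in the reserved region
b = alloc_with_fallback(zones, order, 3000)  # reserved is full, falls back
print(a, b)  # reserved smmu
```

Keeping the preference list separate from the zones lets each buffer type state its own placement order without duplicating the fallback loop.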

And please remove the IMPORTANT NOTICE at the end of your emails. I
would have to refrain from answering further emails if that notice
remains.

Thanks,
Oded

>
> Thanks for your time and look forward to your reply~ 😊
>
> Best Regards,
> Dejia


* RE: About upstreaming ArmChina NPU driver
  2024-03-28 10:32 ` Sudeep Holla
@ 2024-04-03  9:42   ` Dejia Shang
  0 siblings, 0 replies; 5+ messages in thread
From: Dejia Shang @ 2024-04-03  9:42 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: linux-kernel, dri-devel, linux-arm-kernel, Toby Huang, Chengkun Sun


> -----Original Message-----
> From: Sudeep Holla <sudeep.holla@arm.com>
> Sent: March 28, 2024 18:32
> To: Dejia Shang <Dejia.Shang@armchina.com>
> Cc: ogabbay@kernel.org; Sudeep Holla <sudeep.holla@arm.com>;
> airlied@redhat.com; daniel@ffwll.ch; linux-kernel@vger.kernel.org;
> dri-devel@lists.freedesktop.org; linux-arm-kernel@lists.infradead.org
> Subject: Re: About upstreaming ArmChina NPU driver
> 
> On Thu, Mar 28, 2024 at 07:46:01AM +0000, Dejia Shang wrote:
> > IMPORTANT NOTICE: The contents of this email and any attachments may
> > be privileged and confidential. If you are not the intended recipient,
> > please delete the email immediately. It is strictly prohibited to
> > disclose the contents to any other person, use it for any purpose, or
> > store or copy the information in any medium. Thank you. ©Arm
> > Technology (China) Co., Ltd copyright and reserve all rights.
> > 重要提示:本邮件(包括任何附件)可能含有专供明确的个人或目的使用的机密信息,并受法律保护。如果您并非该收件人,请立即删除此邮件。严禁通过任何渠道,以任何目的,向任何人披露、储存或复制邮件信息或者据此采取任何行动。感谢您的配合。 ©安谋科技(中国)有限公司 版权所有并保留一切权利。
> 
> You need to get this fixed, otherwise people will delete this email as you have
> suggested and/or refrain from responding to this email.
> 
> Please talk to your local IT and get a setup without this disclaimer for all
> mailing list activities.

Now fixed. I had not noticed it because the mail server appends that disclaimer automatically. Thanks for the reminder!

Best Regards,
Dejia

> 
> --
> Regards,
> Sudeep


* RE: About upstreaming ArmChina NPU driver
  2024-04-03  6:25 ` Oded Gabbay
@ 2024-04-03 10:09   ` Dejia Shang
  0 siblings, 0 replies; 5+ messages in thread
From: Dejia Shang @ 2024-04-03 10:09 UTC (permalink / raw)
  To: Oded Gabbay
  Cc: ogabbay, airlied, daniel, linux-kernel, dri-devel,
	linux-arm-kernel, Toby Huang, Chengkun Sun


> -----Original Message-----
> From: Oded Gabbay <oded.gabbay@gmail.com>
> Sent: April 3, 2024 14:26
> To: Dejia Shang <Dejia.Shang@armchina.com>
> Cc: ogabbay@kernel.org; airlied@redhat.com; daniel@ffwll.ch;
> linux-kernel@vger.kernel.org; dri-devel@lists.freedesktop.org;
> linux-arm-kernel@lists.infradead.org
> Subject: Re: About upstreaming ArmChina NPU driver
> 
> On Thu, Mar 28, 2024 at 10:01 AM Dejia Shang <Dejia.Shang@armchina.com> wrote:
> >
> > Dear Kernel Maintainers,
> >
> > I am a driver developer and would like to upstream the ArmChina Zhouyi NPU driver ("Zhouyi" is the brand) to the accel subsystem.
> >
> > The driver is already open sourced (both UMD and KMD) and anyone can find the code at https://github.com/Arm-China/Compass_NPU_Driver.git.
> >
> > This driver is responsible for scheduling AI inference tasks to the NPU cores (V1/V2/V3). Specifically, a simplified end-to-end flow is:
> >
> >         1. A TFLite/ONNX model is transformed to an executable binary file in ELF format by the NN graph compiler (designed by ArmChina)
> >         2. An application loads the executable binary file to UMD and provides the input data.
> >         3. UMD parses the binary and sends ioctls to KMD (open device, do memory allocation/mmap/free, submit the job descriptor).
> >         4. KMD dispatches the job to NPU h/w, handles interrupts and updates the execution status.
> >         5. UMD polls the status of the pre-scheduled job.
> >         6. The application gets the output results.
> >
> > So...for the upstreaming,
> >
> > Q1: do you think our NPU driver is suitable for accel? If the answer is yes, which tree & branch should the patches be based on?
> Hi Dejia,
> Yes, it definitely sounds like a good fit for the accel subsystem.
> Please base your patches on the "drm-misc-next" branch in the drm-misc repo:
> https://anongit.freedesktop.org/git/drm/drm-misc.git
> 

Hi Oded,
Got it.

> >
> > Q2: in thread https://lore.kernel.org/lkml/ec547d33-214f-4952-aa33-c271e9edad63@kernel.org/ showing a similar case, Oded mentioned that:
> >
> >         "If we would have upstreamed a new driver, the expectation would have been that we would use some drm mechanisms.", and
> >         "the minimal requirement is to use GEM/BOs for memory management operations".
> >
> > I guess those requirements also apply to the Zhouyi NPU KMD? Currently, the memory management (MM) in KMD is based on the dma-mapping APIs; it handles both reserved CMA region(s) and SMMU-mapped buffers, and supports the dma-buf framework. Maybe I should replace the implementation with DRM APIs.
> Yes, those requirements definitely apply here.
> >
> > Q3: if you have looked at the KMD code, do you think I should make any other major changes before submitting the first patch series? Thank you!
> I took a quick glance. In general, it seems to be ok, but I noticed two things
> related to the integration with drm/accel:
> 
> 1. You use a scheduler for the job submission, which provides the ability to
> defer jobs. In that case, I suggest checking whether you can use drm_sched
> instead of your own implementation. No point in re-inventing the wheel.
> 2. You provide several memory zones for allocation of memory. I would
> suggest here to look at using ttm as the memory manager instead of
> re-implementing your own.

Thanks for your time! I will try to refactor the code as suggested and then send the first patch series.

> 
> And please remove the IMPORTANT NOTICE at the end of your emails. I
> would have to refrain from answering further emails if that notice remains.

Now fixed. I had not noticed it because the mail server appends the notice automatically. Sorry for the inconvenience.

Best Regards,
Dejia

> 
> Thanks,
> Oded
> 
> >
> > Thanks for your time and look forward to your reply~ 😊
> >
> > Best Regards,
> > Dejia

