From mboxrd@z Thu Jan 1 00:00:00 1970 From: Yiwei Zhang Subject: Proposal to report GPU private memory allocations with sysfs nodes [plain text version] Date: Fri, 25 Oct 2019 11:35:32 -0700 Message-ID: Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1396659780==" Return-path: Received: from mail-lj1-x233.google.com (mail-lj1-x233.google.com [IPv6:2a00:1450:4864:20::233]) by gabe.freedesktop.org (Postfix) with ESMTPS id E9F526EB41 for ; Fri, 25 Oct 2019 18:35:45 +0000 (UTC) Received: by mail-lj1-x233.google.com with SMTP id y3so3813580ljj.6 for ; Fri, 25 Oct 2019 11:35:45 -0700 (PDT) List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" To: dri-devel@lists.freedesktop.org Cc: Alistair Delva , Prahlad Kilambi , Sean Paul , kraxel@redhat.com, Chris Forbes , kernel-team@android.com List-Id: dri-devel@lists.freedesktop.org --===============1396659780== Content-Type: multipart/alternative; boundary="000000000000ec4fd50595c06a50" --000000000000ec4fd50595c06a50 Content-Type: text/plain; charset="UTF-8"

Hi folks,

This is the plain-text version of my previous email, in case that one was flagged as spam.

--- Background ---
On downstream Android, vendors have been reporting GPU private memory allocations through debugfs nodes, each in its own format. However, debugfs nodes are being deprecated in the next Android release.

--- Proposal ---
We are taking this opportunity to have all vendors migrate their existing debugfs nodes to a standardized sysfs node structure. The platform can then use it for memory profiling, system health coverage, field metrics, local shell dumps, an in-app API, etc. This proposal is better served upstream, since all GPU vendors could then standardize on one GPU memory structure that clients can rely on, reducing fragmentation across Android and Linux.
--- Detailed design ---
The sysfs node structure looks like this:
/sys/devices/<ro.gfx.sysfs.0>/<pid>/<type_name>
e.g. "/sys/devices/mali0/gpu_mem/606/gl_buffer", where gl_buffer is a node holding comma-separated size values: "4096,81920,...,4096".

For the top-level root, vendors can choose their own name and publish it as the value of the ro.gfx.sysfs.0 property.
(1) For systems with multiple GPU drivers, ro.gfx.sysfs.1 and ro.gfx.sysfs.2 can be used for the 2nd and 3rd KMDs.
(2) The ro.gfx.sysfs.<channel> property may also include a sub-directory, for example "kgsl/gpu_mem" or "mali0/gpu_mem", if the root name under /sys/devices/ already exists and is used for other purposes.

For the 2nd level, "pid", there are usually just a couple of entries per snapshot, since we only take snapshots of the active processes.

For the 3rd level, "type_name", the name is one of the GPU memory object types in lower case, and the value is a comma-separated sequence of size values, one for each allocation of that type.

We would especially like comments on this part. For the GPU memory object types, we have defined 9 types for Android:
(1) UNKNOWN // not accounted for in any other category
(2) SHADER // shader binaries
(3) COMMAND // allocations which have a lifetime similar to a VkCommandBuffer
(4) VULKAN // backing for VkDeviceMemory
(5) GL_TEXTURE // GL Texture and RenderBuffer
(6) GL_BUFFER // GL Buffer
(7) QUERY // backing for queries
(8) DESCRIPTOR // allocations which have a lifetime similar to a VkDescriptorSet
(9) TRANSIENT // miscellaneous transient allocations that the driver needs

We are wondering whether this type enumeration makes sense to upstream as well, or whether each side should keep its own set of types. On the Android side, we will only read the nodes named after the types we defined in the sysfs node structure.

Looking forward to any concerns/comments/suggestions!
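To make the layout concrete, here is a minimal sketch of how a userspace client might walk one root and parse the per-type nodes. It is only an illustration under stated assumptions: the helper names (parse_sizes, read_gpu_mem, build_fake_tree) are hypothetical, the sample values are made up, and a temporary directory stands in for /sys/devices/<ro.gfx.sysfs.0> so that it runs without any driver support.

```python
import os
import tempfile

# The nine type names from the proposal, in lower case as they would
# appear as node names under each pid directory.
GPU_MEM_TYPES = (
    "unknown", "shader", "command", "vulkan", "gl_texture",
    "gl_buffer", "query", "descriptor", "transient",
)


def parse_sizes(text):
    """Parse a comma separated list of size values into integers."""
    text = text.strip()
    return [int(field) for field in text.split(",")] if text else []


def read_gpu_mem(root):
    """Return {pid: {type_name: [size, ...]}} for one root directory."""
    snapshot = {}
    for pid in os.listdir(root):
        pid_dir = os.path.join(root, pid)
        if not (pid.isdigit() and os.path.isdir(pid_dir)):
            continue
        per_type = {}
        for type_name in GPU_MEM_TYPES:
            node = os.path.join(pid_dir, type_name)
            try:
                with open(node) as f:
                    per_type[type_name] = parse_sizes(f.read())
            except OSError:
                # Node not present, or it went away while being read.
                continue
        snapshot[int(pid)] = per_type
    return snapshot


def build_fake_tree(root):
    """Create a stand-in tree (hypothetical data) for trying the reader."""
    pid_dir = os.path.join(root, "606")
    os.makedirs(pid_dir)
    with open(os.path.join(pid_dir, "gl_buffer"), "w") as f:
        f.write("4096,81920,4096\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as fake_root:
        build_fake_tree(fake_root)
        for pid, per_type in read_gpu_mem(fake_root).items():
            for type_name, sizes in per_type.items():
                print(pid, type_name, len(sizes), sum(sizes))
```

On a real device the root would instead be built from the ro.gfx.sysfs.0 value. The sketch skips nodes it cannot open rather than failing, on the assumption that a process may exit between listing the directory and reading its nodes.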
Best regards, Yiwei --000000000000ec4fd50595c06a50 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
--000000000000ec4fd50595c06a50-- --===============1396659780== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KZHJpLWRldmVs IG1haWxpbmcgbGlzdApkcmktZGV2ZWxAbGlzdHMuZnJlZWRlc2t0b3Aub3JnCmh0dHBzOi8vbGlz dHMuZnJlZWRlc2t0b3Aub3JnL21haWxtYW4vbGlzdGluZm8vZHJpLWRldmVs --===============1396659780==--