On Mon, Oct 28, 2019 at 8:26 AM Jerome Glisse <jglisse@redhat.com> wrote:

On Fri, Oct 25, 2019 at 11:35:32AM -0700, Yiwei Zhang wrote:
> Hi folks,
>
> This is the plain text version of the previous email in case that was
> considered as spam.
>
> --- Background ---
> On the downstream Android, vendors used to report GPU private memory
> allocations with debugfs nodes in their own formats. However, debugfs nodes
> are getting deprecated in the next Android release.

Maybe explain why it is useful first ?

Memory is precious on Android mobile platforms. Apps using a large amount of
memory, such as games, tend to maintain a table for the memory on different
devices with different prediction models. Private GPU memory allocations are
currently only partially visible to the apps and to the platform as well.

By having the data, the platform can do:
(1) GPU memory profiling as part of the huge Android profiler in progress.
(2) Android system health team can enrich the performance test coverage.
(3) We can collect field metrics to detect any regression on the GPU private
memory allocations in the production population.
(4) Shell user can easily dump the allocations in a uniform way across vendors.
(5) Platform can feed the data to the apps so that apps can do memory
allocations in a more predictable way.

>
> --- Proposal ---
> We are taking the chance to unify all the vendors to migrate their existing
> debugfs nodes into a standardized sysfs node structure. Then the platform
> is able to do a bunch of useful things: memory profiling, system health
> coverage, field metrics, local shell dump, in-app api, etc. This proposal
> is better served upstream as all GPU vendors can standardize a gpu memory
> structure and reduce fragmentation across Android and Linux that clients
> can rely on.
>
> --- Detailed design ---
> The sysfs node structure looks like below:
> /sys/devices/<ro.gfx.sysfs.0>/<pid>/<type_name>
> e.g. "/sys/devices/mali0/gpu_mem/606/gl_buffer" and the gl_buffer is a node
> having the comma separated size values: "4096,81920,...,4096".
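
To make the node format above concrete, here is a minimal Python sketch of how a userspace consumer might build a node path and parse its contents. The helper names (node_path, parse_sizes) are hypothetical, and treating an empty node as "no allocations" is an assumption. The proposal does not state the unit of the size values, so they are kept as plain integers.

```python
from pathlib import PurePosixPath
from typing import List


def node_path(root: str, pid: int, type_name: str) -> str:
    """Build /sys/devices/<root>/<pid>/<type_name>.

    `root` stands for the value of ro.gfx.sysfs.<channel> and may include
    a sub-directory such as "mali0/gpu_mem".
    """
    return str(PurePosixPath("/sys/devices") / root / str(pid) / type_name)


def parse_sizes(text: str) -> List[int]:
    """Parse the comma separated size values read from one node."""
    text = text.strip()
    if not text:
        # Assumption: an empty node means no allocations of this type.
        return []
    return [int(field) for field in text.split(",")]
```

For example, node_path("mali0/gpu_mem", 606, "gl_buffer") gives the path used in the example above, and parse_sizes turns the node's text into a list of integers that a profiler could sum or histogram.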

How does the kernel know what API the allocation is used for? With the
open source drivers you never specify which API is creating a GEM object
(OpenGL, Vulkan, ...) nor for what purpose (transient, shader, ...).

Oh, is this a hard requirement for the open source drivers to not bookkeep any
data from userland? I think the API is just some additional metadata passed down.

> For the top level root, vendors can choose their own names based on the
> value of ro.gfx.sysfs.0 the vendors set. (1) For the multiple gpu driver
> cases, we can use ro.gfx.sysfs.1, ro.gfx.sysfs.2 for the 2nd and 3rd KMDs.
> (2) It's also allowed to put some sub-dir for example "kgsl/gpu_mem" or
> "mali0/gpu_mem" in the ro.gfx.sysfs.<channel> property if the root name
> under /sys/devices/ is already created and used for other purposes.

On one side you want to standardize, on the other you want to give
complete freedom on the top level naming scheme. I would rather see a
consistent naming scheme (i.e. something more constrained, with little
room for interpretation by individual drivers).

Thanks for commenting on this. We definitely need some suggestions on the root
directory. In the multi-GPU case on desktop, is there some existing consumer to
query "some data" from all the GPUs? How does the tool find all GPUs and
differentiate between them? Is this already standardized?

> For the 2nd level "pid", there are usually just a couple of them per
> snapshot, since we only take snapshots for the active ones.

I do not understand this: you can have any number of applications with
GPU objects, and thus there is no bound on the number of PIDs. Please
consider desktop too; I do not know what kind of limitations Android
imposes.

We are only interested in tracking *active* GPU private allocations. So yes, any
application currently holding an active GPU context will probably have a node
here. Since we want to do profiling for specific apps, the data has to be
per-application. I don't get your concerns here. If it's about the tracking
overhead, it's rare to see tons of applications doing private GPU allocations
at the same time. Could you help elaborate a bit?

> For the 3rd level "type_name", the type name will be one of the GPU memory
> object types in lower case, and the value will be a comma separated
> sequence of size values for all the allocations under that specific type.
>
> We especially would like some comments on this part. For the GPU memory
> object types, we defined 9 different types for Android:
> (1) UNKNOWN // not accounted for in any other category
> (2) SHADER // shader binaries
> (3) COMMAND // allocations which have a lifetime similar to a
> VkCommandBuffer
> (4) VULKAN // backing for VkDeviceMemory
> (5) GL_TEXTURE // GL Texture and RenderBuffer
> (6) GL_BUFFER // GL Buffer
> (7) QUERY // backing for query
> (8) DESCRIPTOR // allocations which have a lifetime similar to a
> VkDescriptorSet
> (9) TRANSIENT // random transient things that the driver needs
>
> We are wondering if those type enumerations make sense to the upstream side
> as well, or maybe we just deal with our own different type sets. Cuz on the
> Android side, we'll just read those nodes named after the types we defined
> in the sysfs node structure.
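
As a rough illustration of the "read those nodes named after the types" step, here is a Python sketch that totals the allocations for one pid directory. It assumes each node is named after the lower-case type, as described above. The helper name summarize_pid and the choice to skip types whose node is missing are assumptions, not part of the proposal.

```python
import os
from typing import Dict

# Lower-case node names for the nine proposed GPU memory object types.
GPU_MEM_TYPES = (
    "unknown", "shader", "command", "vulkan", "gl_texture",
    "gl_buffer", "query", "descriptor", "transient",
)


def summarize_pid(pid_dir: str) -> Dict[str, int]:
    """Return the summed size per type for one <pid> directory."""
    totals: Dict[str, int] = {}
    for type_name in GPU_MEM_TYPES:
        path = os.path.join(pid_dir, type_name)
        try:
            with open(path) as node:
                text = node.read().strip()
        except OSError:
            # Assumption: a missing or unreadable node is simply skipped.
            continue
        # An empty node is treated as zero allocations of this type.
        totals[type_name] = sum(int(v) for v in text.split(",")) if text else 0
    return totals
```

A consumer would call this once per pid directory found under the root given by ro.gfx.sysfs.<channel>.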

See my above point of open source driver and kernel being unaware
of the allocation purpose and use.

Cheers,
Jérôme

Many thanks for the reply!

Yiwei