Hi Kenny,

Thanks for the info. Do you mind forwarding the existing discussion to me or
having me cc'ed in that thread?

Best,
Yiwei

On Wed, Oct 30, 2019 at 10:23 PM Kenny Ho wrote:
> Hi Yiwei,
>
> I am not sure if you are aware, but there is an ongoing RFC on adding drm
> support in cgroup for the purpose of resource tracking. One of the
> resources is GPU memory. It's not exactly the same as what you are
> proposing (it doesn't track API usage, but it tracks the type of GPU
> memory from the kmd perspective), but perhaps it would be of interest to
> you. There is no consensus on it at this point.
>
> (Sorry for being late to the discussion. I only noticed this thread
> when one of the emails got lucky and escaped the spam folder.)
>
> Regards,
> Kenny
>
> On Wed, Oct 30, 2019 at 4:14 AM Yiwei Zhang wrote:
> >
> > Hi Jerome and all folks,
> >
> > In addition to my last reply, I just want to get some more information
> > regarding this on the upstream side.
> >
> > 1. Do you think this (standardizing a way to report GPU private
> > allocations) is going to be a useful thing upstream as well? It grants
> > a lot of benefits for Android, but I'd like to get an idea for the
> > non-Android world.
> >
> > 2. There might be some worries that the upstream kernel driver has no
> > idea regarding the API. However, to achieve good fidelity around memory
> > reporting, we'd have to pass down certain metadata which is known only
> > by the userland. Consider this use case: on the upstream side, freedreno
> > for example, some memory buffer object (BO) during its own lifecycle
> > could represent totally different things, and the kmd is not aware of
> > that. When we'd like to take memory snapshots at a certain granularity,
> > we have to know what that buffer represents so that the snapshot can be
> > meaningful and useful.
> >
> > If we just keep this Android specific, I'd worry that some day upstream
> > standardizes a way to report this and Android vendors then have to take
> > extra effort to migrate over. This is one of the main reasons we'd like
> > to do this on the upstream side.
> >
> > Timeline wise, Android has explicit deadlines for the next release and
> > we have to push hard towards those. Any prompt responses are very much
> > appreciated!
> >
> > Best regards,
> > Yiwei
> >
> > On Mon, Oct 28, 2019 at 11:33 AM Yiwei Zhang wrote:
> >>
> >> On Mon, Oct 28, 2019 at 8:26 AM Jerome Glisse wrote:
> >>>
> >>> On Fri, Oct 25, 2019 at 11:35:32AM -0700, Yiwei Zhang wrote:
> >>> > Hi folks,
> >>> >
> >>> > This is the plain text version of the previous email in case that
> >>> > was considered as spam.
> >>> >
> >>> > --- Background ---
> >>> > On downstream Android, vendors used to report GPU private memory
> >>> > allocations with debugfs nodes in their own formats. However,
> >>> > debugfs nodes are getting deprecated in the next Android release.
> >>>
> >>> Maybe explain why it is useful first?
> >>
> >>
> >> Memory is precious on Android mobile platforms. Apps that use a large
> >> amount of memory, such as games, tend to maintain a table of the memory
> >> on different devices with different prediction models. Private GPU
> >> memory allocations are currently semi-blind to the apps and to the
> >> platform as well.
> >>
> >> By having the data, the platform can do:
> >> (1) GPU memory profiling as part of the huge Android profiler in
> >> progress.
> >> (2) Android system health team can enrich the performance test
> >> coverage.
> >> (3) We can collect field metrics to detect any regression on the GPU
> >> private memory allocations in the production population.
> >> (4) Shell users can easily dump the allocations in a uniform way across
> >> vendors.
> >> (5) Platform can feed the data to the apps so that apps can do memory
> >> allocations in a more predictable way.
> >>
> >>> >
> >>> > --- Proposal ---
> >>> > We are taking the chance to unify all the vendors to migrate their
> >>> > existing debugfs nodes into a standardized sysfs node structure.
> >>> > Then the platform is able to do a bunch of useful things: memory
> >>> > profiling, system health coverage, field metrics, local shell dump,
> >>> > in-app api, etc. This proposal is better served upstream, as all GPU
> >>> > vendors can standardize on a gpu memory structure that clients can
> >>> > rely on and reduce fragmentation across Android and Linux.
> >>> >
> >>> > --- Detailed design ---
> >>> > The sysfs node structure looks like below:
> >>> > /sys/devices/<root>/<pid>/<type_name>
> >>> > e.g. "/sys/devices/mali0/gpu_mem/606/gl_buffer", where the gl_buffer
> >>> > node holds the comma separated size values: "4096,81920,...,4096".
> >>>
> >>> How does the kernel know what API the allocation is used for? With the
> >>> open source driver you never specify what API is creating a gem object
> >>> (opengl, vulkan, ...) nor what purpose (transient, shader, ...).
> >>
> >>
> >> Oh, is it a hard requirement for the open source drivers to not
> >> bookkeep any data from userland? I think the API is just some
> >> additional metadata passed down.
> >>
> >>> > For the top level root, vendors can choose their own names based on
> >>> > the value of ro.gfx.sysfs.0 the vendors set. (1) For the multiple
> >>> > gpu driver cases, we can use ro.gfx.sysfs.1, ro.gfx.sysfs.2 for the
> >>> > 2nd and 3rd KMDs. (2) It's also allowed to put a sub-dir, for
> >>> > example "kgsl/gpu_mem" or "mali0/gpu_mem", in the ro.gfx.sysfs.<index>
> >>> > property if the root name under /sys/devices/ is already created and
> >>> > used for other purposes.
> >>>
> >>> On one side you want to standardize, on the other you want to give
> >>> complete freedom on the top level naming scheme. I would rather see a
> >>> consistent naming scheme (i.e. something more restrained, with little
> >>> place for interpretation by individual drivers).
> >>
> >>
> >> Thanks for commenting on this. We definitely need some suggestions on
> >> the root directory. In the multi-gpu case on desktop, is there some
> >> existing consumer that queries "some data" from all the GPUs? How does
> >> the tool find all GPUs and differentiate between them? Is this already
> >> standardized?
> >>
> >>> > For the 2nd level "pid", there are usually just a couple of them per
> >>> > snapshot, since we only take snapshots for the active ones.
> >>>
> >>> ? Do not understand here, you can have any number of applications with
> >>> GPU objects? And thus there is no bound on the number of PIDs. Please
> >>> consider desktop too, i do not know what kind of limitation android
> >>> imposes.
> >>
> >>
> >> We are only interested in tracking *active* GPU private allocations. So
> >> yes, any application currently holding an active GPU context will
> >> probably have a node here. Since we want to do profiling for specific
> >> apps, the data has to be per application. I don't get your concerns
> >> here. If it's about the tracking overhead, it's rare to see tons of
> >> applications doing private gpu allocations at the same time.
> >> Could you help elaborate a bit?
> >>
> >>> > For the 3rd level "type_name", the type name will be one of the GPU
> >>> > memory object types in lower case, and the value will be a comma
> >>> > separated sequence of size values for all the allocations under that
> >>> > specific type.
> >>> >
> >>> > We especially would like some comments on this part. For the GPU
> >>> > memory object types, we defined 9 different types for Android:
> >>> > (1) UNKNOWN // not accounted for in any other category
> >>> > (2) SHADER // shader binaries
> >>> > (3) COMMAND // allocations which have a lifetime similar to a
> >>> > VkCommandBuffer
> >>> > (4) VULKAN // backing for VkDeviceMemory
> >>> > (5) GL_TEXTURE // GL Texture and RenderBuffer
> >>> > (6) GL_BUFFER // GL Buffer
> >>> > (7) QUERY // backing for query
> >>> > (8) DESCRIPTOR // allocations which have a lifetime similar to a
> >>> > VkDescriptorSet
> >>> > (9) TRANSIENT // random transient things that the driver needs
> >>> >
> >>> > We are wondering if those type enumerations make sense to the
> >>> > upstream side as well, or maybe we should just deal with our own
> >>> > type set, because on the Android side we'll just read those nodes
> >>> > named after the types we defined in the sysfs node structure.
> >>>
> >>> See my above point of the open source driver and kernel being unaware
> >>> of the allocation purpose and use.
> >>>
> >>> Cheers,
> >>> Jérôme
> >>>
> >>
> >> Many thanks for the reply!
> >> Yiwei
> >
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
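
For illustration, the nine memory object types proposed in the thread map
naturally onto a C enum. The sketch below only mirrors the list from the mail;
the identifiers are hypothetical and not part of any existing kernel UAPI.

/* Hypothetical enumeration of the nine proposed Android GPU memory
 * object types; names and comments mirror the proposal above. */
enum gpu_mem_type {
	GPU_MEM_UNKNOWN,	/* not accounted for in any other category */
	GPU_MEM_SHADER,		/* shader binaries */
	GPU_MEM_COMMAND,	/* lifetime similar to a VkCommandBuffer */
	GPU_MEM_VULKAN,		/* backing for VkDeviceMemory */
	GPU_MEM_GL_TEXTURE,	/* GL Texture and RenderBuffer */
	GPU_MEM_GL_BUFFER,	/* GL Buffer */
	GPU_MEM_QUERY,		/* backing for query */
	GPU_MEM_DESCRIPTOR,	/* lifetime similar to a VkDescriptorSet */
	GPU_MEM_TRANSIENT,	/* random transient things the driver needs */
};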
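
And a minimal consumer-side sketch of how a shell tool or the platform might
read one of the proposed per-pid, per-type nodes, assuming the comma separated
size format described in the thread. The node path is the illustrative
mali0/gpu_mem example from the mail; none of this is an existing interface.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Sum the comma separated allocation sizes in one proposed sysfs node,
 * e.g. "4096,81920,...,4096" -> total bytes for that memory type. */
static unsigned long long sum_node(const char *path)
{
	char buf[4096];
	unsigned long long total = 0;
	FILE *f = fopen(path, "r");

	if (!f)
		return 0;
	if (fgets(buf, sizeof(buf), f)) {
		for (char *tok = strtok(buf, ","); tok; tok = strtok(NULL, ","))
			total += strtoull(tok, NULL, 10);
	}
	fclose(f);
	return total;
}

int main(void)
{
	/* Illustrative path only: root "mali0/gpu_mem", pid 606, type
	 * "gl_buffer", as in the example from the proposal. */
	printf("gl_buffer bytes: %llu\n",
	       sum_node("/sys/devices/mali0/gpu_mem/606/gl_buffer"));
	return 0;
}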