From: "T.J. Mercier" <tjmercier@google.com>
To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	"Tejun Heo" <tj@kernel.org>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Zefan Li" <lizefan.x@bytedance.com>,
	"Dave Airlie" <airlied@redhat.com>,
	"Daniel Vetter" <daniel.vetter@ffwll.ch>,
	"Rob Clark" <robdclark@chromium.org>,
	"Stéphane Marchesin" <marcheu@chromium.org>,
	Kenny.Ho@amd.com, "Christian König" <christian.koenig@amd.com>,
	"Brian Welty" <brian.welty@intel.com>,
	"Tvrtko Ursulin" <tvrtko.ursulin@intel.com>,
	"Eero Tamminen" <eero.t.tamminen@intel.com>
Subject: Re: [RFC v5 00/17] DRM cgroup controller with scheduling control and memory stats
Date: Thu, 20 Jul 2023 10:22:03 -0700	[thread overview]
Message-ID: <CABdmKX0M2z0H74D7Pj1qt5HZgG1LhBKU4YDqgTUaOk8UvXb28A@mail.gmail.com> (raw)
In-Reply-To: <95de5c1e-f03b-8fb7-b5ef-59ac7ca82f31@linux.intel.com>

On Thu, Jul 20, 2023 at 3:55 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> Hi,
>
> On 19/07/2023 21:31, T.J. Mercier wrote:
> > On Wed, Jul 12, 2023 at 4:47 AM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> >>
> >>    drm.memory.stat
> >>          A nested file containing cumulative memory statistics for the whole
> >>          sub-hierarchy, broken down into separate GPUs and separate memory
> >>          regions supported by the latter.
> >>
> >>          For example::
> >>
> >>            $ cat drm.memory.stat
> >>            card0 region=system total=12898304 shared=0 active=0 resident=12111872 purgeable=167936
> >>            card0 region=stolen-system total=0 shared=0 active=0 resident=0 purgeable=0
> >>
> >>          Card designation corresponds to the DRM device names and multiple line
> >>          entries can be present per card.
> >>
> >>          Memory region names should be expected to be driver specific with the
> >>          exception of 'system' which is standardised and applicable for GPUs
> >>          which can operate on system memory buffers.
> >>
> >>          Sub-keys 'resident' and 'purgeable' are optional.
> >>
> >>          Per category region usage is reported in bytes.
> >>
> >>   * Feedback from people interested in drm.active_us and drm.memory.stat is
> >>     required to understand the use cases and the usefulness of the individual fields.
> >>
> >>     Memory stats were easy to add to my series, since I was already
> >>     working on the fdinfo memory stats patches, but the question is how
> >>     useful they are.
> >>
> > Hi Tvrtko,
> >
> > I think this style of driver-defined categories for reporting of
> > memory could potentially allow us to eliminate the GPU memory tracking
> > tracepoint used on Android (gpu_mem_total). This would involve reading
> > drm.memory.stat at the root cgroup (I see it's currently disabled on
>
> I can make it available under root too; I don't think there is any
> technical reason not to have it. In fact, now that I look at it again,
> memory.stat is present on root, so that would align with my general
> guideline to keep the two as similar as possible.
>
> > the root), which means traversing the whole cgroup tree under the
> > cgroup lock to generate the values on-demand. This would be done
> > rarely, but I still wonder what the cost of that would turn out to be.
>
> Yeah that's ugly. I could eliminate cgroup_lock by being a bit smarter.
> Just didn't think it worth it for the RFC.
>
> Basically, to account memory stats for any sub-tree I need the equivalent
> of one struct drm_memory_stats per DRM device present in the hierarchy. So
> I could pre-allocate a few and restart if I run out of spares, or
> something. They are really small, so pre-allocating a good number, based
> on past state or something, should work well enough. Or even the total
> number of DRM devices in the system, as a pessimistic but safe option for
> most reasonable deployments.
>
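
For illustration only, a rough C sketch of the pre-allocate-and-retry walk
described above. It is not from the series; css_to_drmcs(), drmcs_accumulate()
and last_seen_devices are assumed names used purely to show the shape of it:

struct drmcs_dev_slot {
	struct drm_device *dev;
	struct drm_memory_stats stats;
};

static struct drmcs_dev_slot *
drmcs_sum_subtree(struct cgroup_subsys_state *root, unsigned int *nr_out)
{
	unsigned int nr = READ_ONCE(last_seen_devices) + 4; /* spare slots */
	struct drmcs_dev_slot *slots;
	struct cgroup_subsys_state *pos;
	int ret = 0;

retry:
	slots = kcalloc(nr, sizeof(*slots), GFP_KERNEL);
	if (!slots)
		return ERR_PTR(-ENOMEM);

	rcu_read_lock();
	css_for_each_descendant_pre(pos, root) {
		/* Fold this cgroup's per-device stats into the slot array. */
		ret = drmcs_accumulate(css_to_drmcs(pos), slots, nr);
		if (ret == -ENOSPC)
			break; /* more DRM devices showed up than slots */
	}
	rcu_read_unlock();

	if (ret == -ENOSPC) {
		kfree(slots);
		nr *= 2;
		goto retry;
	}

	*nr_out = nr;
	return slots;
}

The retry keeps the walk under just RCU, so cgroup_lock is not needed, at the
cost of occasionally redoing the traversal.
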
> > The drm_memory_stats categories in the output don't seem like a big
> > value-add for this use-case, but no real objection to them being
>
> You mean the fact there are different categories is not a value add for
> your use case because you would only use one?
>
Exactly, I guess that'd be just "private" (or pick another one) for
the driver-defined "regions" where
shared/private/resident/purgeable/active aren't really applicable.
That doesn't seem like a big problem to me since you already need an
understanding of what a driver-defined region means. It's just adding
a requirement to understand what fields are used, and a driver can
document that in the same place as the region itself. That does mean
performing arithmetic on values from different drivers might not make
sense. But this is just my perspective from trying to fit the
gpu_mem_total tracepoint here. I think we could probably change the
way drivers that use it report memory to fit closer into the
drm_memory_stats categories.
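
For reference, a minimal userspace sketch of what consuming drm.memory.stat to
approximate gpu_mem_total might look like. It assumes the nested keyed format
from the example above, that the file gets exposed on the root cgroup (which
this RFC does not do yet), and the usual /sys/fs/cgroup mount point:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <inttypes.h>

int main(void)
{
	/* Assumed location; adjust for the actual cgroup2 mount point. */
	FILE *f = fopen("/sys/fs/cgroup/drm.memory.stat", "r");
	char line[512];
	uint64_t sum = 0;

	if (!f) {
		perror("drm.memory.stat");
		return 1;
	}

	/* Each line looks like "cardN region=<name> total=<bytes> ...". */
	while (fgets(line, sizeof(line), f)) {
		char *key = strstr(line, " total=");
		if (key)
			sum += strtoull(key + strlen(" total="), NULL, 10);
	}
	fclose(f);

	printf("gpu_mem_total-like estimate: %" PRIu64 " bytes\n", sum);
	return 0;
}

Summing only the total= key deliberately ignores the per-category breakdown,
which is all this particular use case needs.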

> The idea was to align 1:1 with DRM memory stats fdinfo and somewhat
> emulate how memory.stat also offers a breakdown.
>
> > there. I know it's called the DRM cgroup controller, but it'd be nice
> > if there were a way to make the mem tracking part work for any driver
> > that wishes to participate as many of our devices don't use a DRM
> > driver. But making that work doesn't look like it would fit very
>
> Ah that would be a challenge indeed to which I don't have any answers
> right now.
>
> Hm, if you have a DRM device somewhere in the chain, memory stats would
> still show up. Like if you had a dma-buf producer which is not a DRM
> driver, but then that buffer was imported by a DRM driver, it would show
> up in a cgroup. Or vice versa. But if there aren't any in the whole
> chain then it would not.
>
Creating a dummy DRM driver underneath an existing driver as an
adaptation layer also came to mind, but yeah... probably not. :)

By the way I still want to try to add tracking for dma-bufs backed by
system memory to memcg, but I'm trying to get memcg v2 up and running
for us first. I don't think that should conflict with the tracking
here.

> > cleanly into this controller, so I'll just shut up now.
>
> Not at all, good feedback!
>
> Regards,
>
> Tvrtko

Thread overview: 156+ messages (cross-posted duplicate entries collapsed)

2023-07-12 11:45 [RFC v5 00/17] DRM cgroup controller with scheduling control and memory stats Tvrtko Ursulin
2023-07-12 11:45 ` [PATCH 01/17] drm/i915: Add ability for tracking buffer objects per client Tvrtko Ursulin
2023-07-12 11:45 ` [PATCH 02/17] drm/i915: Record which client owns a VM Tvrtko Ursulin
2023-07-12 11:45 ` [PATCH 03/17] drm/i915: Track page table backing store usage Tvrtko Ursulin
2023-07-12 11:45 ` [PATCH 04/17] drm/i915: Account ring buffer and context state storage Tvrtko Ursulin
2023-07-12 11:45 ` [PATCH 05/17] drm/i915: Implement fdinfo memory stats printing Tvrtko Ursulin
2023-07-12 11:45 ` [PATCH 06/17] drm: Update file owner during use Tvrtko Ursulin
2023-07-12 11:45 ` [PATCH 07/17] cgroup: Add the DRM cgroup controller Tvrtko Ursulin
2023-07-12 11:45 ` [PATCH 08/17] drm/cgroup: Track DRM clients per cgroup Tvrtko Ursulin
2023-07-21 22:14   ` Tejun Heo
2023-07-12 11:45 ` [PATCH 09/17] drm/cgroup: Add ability to query drm cgroup GPU time Tvrtko Ursulin
2023-07-12 11:45 ` [PATCH 10/17] drm/cgroup: Add over budget signalling callback Tvrtko Ursulin
2023-07-12 11:45 ` [PATCH 11/17] drm/cgroup: Only track clients which are providing drm_cgroup_ops Tvrtko Ursulin
2023-07-12 11:46 ` [PATCH 12/17] cgroup/drm: Introduce weight based drm cgroup control Tvrtko Ursulin
2023-07-21 22:17   ` Tejun Heo
2023-07-25 13:46     ` Tvrtko Ursulin
2023-07-12 11:46 ` [PATCH 13/17] drm/i915: Wire up with drm controller GPU time query Tvrtko Ursulin
2023-07-12 11:46 ` [PATCH 14/17] drm/i915: Implement cgroup controller over budget throttling Tvrtko Ursulin
2023-07-12 11:46 ` [PATCH 15/17] cgroup/drm: Expose GPU utilisation Tvrtko Ursulin
2023-07-21 22:19   ` Tejun Heo
2023-07-21 22:20     ` Tejun Heo
2023-07-25 14:08       ` Tvrtko Ursulin
2023-07-25 21:44         ` Tejun Heo
2023-07-12 11:46 ` [PATCH 16/17] cgroup/drm: Expose memory stats Tvrtko Ursulin
2023-07-21 22:21   ` Tejun Heo
2023-07-26 10:14     ` Maarten Lankhorst
2023-07-26 11:41       ` Tvrtko Ursulin
2023-07-27 11:54         ` Maarten Lankhorst
2023-07-27 17:08           ` Tvrtko Ursulin
2023-07-28 14:15             ` Tvrtko Ursulin
2023-07-26 19:44       ` Tejun Heo
2023-07-27 13:42         ` Maarten Lankhorst
2023-07-27 16:43           ` Tvrtko Ursulin
2023-07-26 16:44     ` Tvrtko Ursulin
2023-07-26 19:49       ` Tejun Heo
2023-07-12 11:46 ` [PATCH 17/17] drm/i915: Wire up to the drm cgroup memory stats Tvrtko Ursulin
2023-07-12 14:46 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for DRM cgroup controller with scheduling control and memory stats Patchwork
2023-07-19 20:31 ` [RFC v5 00/17] DRM cgroup controller with scheduling control and memory stats T.J. Mercier
2023-07-20 10:55   ` Tvrtko Ursulin
2023-07-20 17:22     ` T.J. Mercier [this message]
