bpf.vger.kernel.org archive mirror
* Per cgroup accounting of context switches
From: Gautam Kulkarni @ 2019-09-07  4:30 UTC
  To: bpf

Hi,

We are evaluating eBPF as a means to account voluntary and
non-voluntary context switches against cgroups. Currently, this
information is only present in the task_struct for an individual
process and not in the cgroup data structure.

With this context, I am looking for recommendations on the following
possible approaches:

1. Use the existing tracepoint (trace_sched_switch) as it exists here
with BPF_PROG_TYPE_TRACEPOINT:
https://github.com/torvalds/linux/blob/master/kernel/sched/core.c#L3877
However, based on the trace format, the kernel does not expose
prev->nivcsw and prev->nvcsw. Due to this, I feel like this approach
may not be feasible. Is my understanding correct?

2. Attach a kprobe to __schedule() and use BPF_PROG_TYPE_KPROBE
This will allow us access to the prev pointer. From the prev
(task_struct), we can access the cgroup and use an eBPF map to
accumulate per cgroup counts of context switches.

3. Implement a kernel module that attaches a kprobe to __schedule()
and implement the map in the kprobe handler.

4. Modify the kernel to have context switch information in task_group.
Would this be something that would make sense to the community?

I would highly appreciate any feedback on this.

Regards,
Gautam


* Re: Per cgroup accounting of context switches
From: Yonghong Song @ 2019-09-07  6:38 UTC
  To: kulkarni, bpf



On 9/6/19 9:30 PM, Gautam Kulkarni wrote:
> Hi,
> 
> We are evaluating eBPF as a means to account voluntary and
> non-voluntary context switches against cgroups. Currently, this
> information is only present in the task_struct for an individual
> process and not in the cgroup data structure.
> 
> With this context, I am looking for recommendations on the following
> possible approaches:
> 
> 1. Use the existing tracepoint (trace_sched_switch) as it exists here
> with BPF_PROG_TYPE_TRACEPOINT:
> https://github.com/torvalds/linux/blob/master/kernel/sched/core.c#L3877
> However, based on the trace format, the kernel does not expose
> prev->nivcsw and prev->nvcsw. Due to this, I feel like this approach
> may not be feasible. Is my understanding correct?

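You can use a raw tracepoint instead: load a program of type
BPF_PROG_TYPE_RAW_TRACEPOINT and attach it with the
BPF_RAW_TRACEPOINT_OPEN command of bpf(2). The `prev` argument
will then be available to the bpf program.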
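
For illustration only, here is a minimal userspace sketch of how a raw
tracepoint attachment might look when done directly through bpf(2),
without libbpf. The helper names (sys_bpf, load_noop_raw_tp_prog,
raw_tracepoint_open) are hypothetical, and the loaded program is a
placeholder that only returns 0. A real program would read `prev` from
ctx->args[1] (assuming the sched_switch argument order of kernels from
this period: preempt, prev, next) and update a per-cgroup map.

```c
#define _GNU_SOURCE
#include <linux/bpf.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper around bpf(2); glibc does not provide one. */
long sys_bpf(int cmd, union bpf_attr *attr)
{
	return syscall(__NR_bpf, cmd, attr, sizeof(*attr));
}

/*
 * Load a placeholder raw tracepoint program ("r0 = 0; exit").
 * Assumption: a real accounting program would take its place, receive
 * struct bpf_raw_tracepoint_args as context, and read prev from args[1].
 * Returns a program fd, or -1 with errno set.
 */
int load_noop_raw_tp_prog(void)
{
	struct bpf_insn insns[] = {
		{ .code = BPF_ALU64 | BPF_MOV | BPF_K,
		  .dst_reg = BPF_REG_0, .imm = 0 },
		{ .code = BPF_JMP | BPF_EXIT },
	};
	union bpf_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.prog_type = BPF_PROG_TYPE_RAW_TRACEPOINT;
	attr.insns = (uint64_t)(uintptr_t)insns;
	attr.insn_cnt = sizeof(insns) / sizeof(insns[0]);
	attr.license = (uint64_t)(uintptr_t)"GPL";

	return (int)sys_bpf(BPF_PROG_LOAD, &attr);
}

/*
 * Attach an already loaded program to a raw tracepoint by name,
 * e.g. "sched_switch". Returns an fd that keeps the attachment alive
 * until it is closed, or -1 with errno set.
 */
int raw_tracepoint_open(const char *tp_name, int prog_fd)
{
	union bpf_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.raw_tracepoint.name = (uint64_t)(uintptr_t)tp_name;
	attr.raw_tracepoint.prog_fd = prog_fd;

	return (int)sys_bpf(BPF_RAW_TRACEPOINT_OPEN, &attr);
}
```

A caller would pass the fd from load_noop_raw_tp_prog() to
raw_tracepoint_open("sched_switch", fd). Both calls normally need
elevated privileges and return -1 otherwise.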

> 
> 2. Attach a kprobe to __schedule() and use BPF_PROG_TYPE_KPROBE
> This will allow us access to the prev pointer. From the prev
> (task_struct), we can access the cgroup and use an eBPF map to
> accumulate per cgroup counts of context switches.
> 
> 3. Implement a kernel module that attaches a kprobe to __schedule()
> and implement the map in the kprobe handler.
> 
> 4. Modify the kernel to have context switch information in task_group.
> Would this be something that would make sense to the community?
> 
> I would highly appreciate any feedback on this.
> 
> Regards,
> Gautam
> 
