From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Jiri Olsa <jolsa@redhat.com>, Roman Gushchin <guro@fb.com>
Cc: Tejun Heo <tj@kernel.org>, Li Zefan <lizefan@huawei.com>,
	Daniel Mack <daniel@zonque.org>,
	"open list:CONTROL GROUP (CGROUP)" <cgroups@vger.kernel.org>,
	bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	"David S. Miller" <davem@davemloft.net>,
	Pavel Hrdina <phrdina@redhat.com>
Subject: Re: [RFC] cgroup gets release after long time
Date: Thu, 16 May 2019 08:12:13 -0700
Message-ID: <CAADnVQ+3rBfhJ=L=ZECUNss8Vdsu5snacT8-SqSwnjGHbUna+g@mail.gmail.com>
In-Reply-To: <20190516103915.GB27421@krava>

On Thu, May 16, 2019 at 3:39 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> hi,
> Pavel reported an issue with bpf programs (attached to cgroup)
> not being released when the cgroup is removed; they are
> still visible in the 'bpftool prog' list afterwards.

right. the workaround that systemd and others use today is
to detach the bpf prog before rmdir of the cgroup.
Roman has patches to do this automatically.

> It seems like this is not bpf specific, because I was able
> to cut the bpf code from his example and still see delayed
> release of cgroup.
>
> It happens only on cgroup2 fs (booted with systemd.unified_cgroup_hierarchy=1
> kernel command line option), please check the attached program
> below and following scenario:
>
> TERM 1
> # gcc -o test test.c
>
>                         TERM 2
>                         # cd /sys/kernel/debug/tracing
>                         # echo 1 > events/cgroup/cgroup_release/enable
>
> TERM 1 -> create and remove group1
> # ./test group1
> qemu-system-x86_64: terminating on signal 15 from pid 1775 (./test)
>
>                         TERM 2
>                         # cat trace_pipe
>                         <nothing>
>
> TERM 1 -> create and remove group2
> # ./test group2
> qemu-system-x86_64: terminating on signal 15 from pid 1783 (./test)
>
>                         TERM 2  - group1 being released
>                         # cat trace_pipe
>                         kworker/22:2-1135  [022] ....  2947.375526: cgroup_release: root=0 id=78 level=1 path=/group1
>
> TERM 1 -> create and remove group3
> # ./test group3
> qemu-system-x86_64: terminating on signal 15 from pid 1798 (./test)
>
>                         TERM 2 - group2 being released
>                         # cat trace_pipe
>                         kworker/22:2-1135  [022] ....  2947.375526: cgroup_release: root=0 id=78 level=1 path=/group1
>                         kworker/22:0-1787  [022] ....  2961.501261: cgroup_release: root=0 id=78 level=1 path=/group2
>
>
> Looks like the previous cgroup release is triggered by creating
> another cgroup.  If I don't do anything the cgroup is released
> (tracepoint shows) in about 90 seconds.
>
> The cgroup_release tracepoint is triggered in css_release_work_fn,
> the same function where the cgroup_bpf_put is called, hence the
> delay in releasing of the bpf programs.
>
> Is this expected or somehow configurable? It's confusing seeing
> all the bpf programs from removed cgroups being around. In Pavel's
> setup it's about 100 of them.
>
> Note, I could reproduce this only with qemu-kvm being run in child
> process in the example below.
>
> thoughts? thanks,
> jirka
>
>
> ---
> #include <fcntl.h>
> #include <signal.h>
> #include <stdio.h>
> #include <string.h>
> #include <sys/stat.h>
> #include <sys/types.h>
> #include <unistd.h>
>
> #define CGROUP_PATH "/sys/fs/cgroup"
>
> int
> main(int argc, char **argv)
> {
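>         pid_t pid = -1;
>         char path[1024];
>         int rc;
>
>         /* the cgroup name is taken from argv[1] below */
>         if (argc < 2) {
>                 fprintf(stderr, "usage: %s <cgroup-name>\n", argv[0]);
>                 return 1;
>         }
>
>         pid = fork();
>         if (pid < 0) {
>                 perror("fork");
>                 return 1;
>         }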
>
>         if (pid == 0) {
>                 execl("/usr/bin/qemu-kvm",
>                       "/usr/bin/qemu-kvm",
>                       "-display", "none",
>                       NULL);
>                 fprintf(stderr, "failed to start qemu process\n");
>                 _exit(-1);
>         } else {
>                 int filefd = -1;
>                 char proc[1024];
>
>                 snprintf(path, 1024, "%s/%s", CGROUP_PATH, argv[1]);
>
>                 sleep(1);
>
>                 if (mkdir(path, 0755) < 0) {
>                         fprintf(stderr, "failed to create cgroup '%s'\n", path);
>                         return -1;
>                 }
>
>                 snprintf(proc, 1024, "%s/cgroup.procs", path);
>
>                 filefd = open(proc, O_WRONLY|O_TRUNC);
>                 if (filefd >= 0) {      /* 0 is a valid descriptor */
>                         dprintf(filefd, "%d", pid);
>                         close(filefd);
>                 }
>
>                 sleep(1);
>         }
>
>         if (pid > 0)
>                 kill(pid, SIGTERM);
>         do {
>                 rc = rmdir(path);
>         } while (rc != 0);
>
>         return 0;
> }
