All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bruce Ashfield <bruce.ashfield@gmail.com>
To: Roman Gushchin <guro@fb.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"tj@kernel.org" <tj@kernel.org>,
	Richard Purdie <richard.purdie@linuxfoundation.org>,
	"oleg@redhat.com" <oleg@redhat.com>
Subject: Re: ptrace/strace and freezer oddities and v5.2+ kernels
Date: Wed, 2 Oct 2019 17:59:36 -0400	[thread overview]
Message-ID: <CADkTA4PmGBR7YdOXvi6sEDJ+uztuB7x2G95TCcW2u_iqjwhUNQ@mail.gmail.com> (raw)
In-Reply-To: <20191002181914.GA7617@castle.DHCP.thefacebook.com>

On Wed, Oct 2, 2019 at 2:19 PM Roman Gushchin <guro@fb.com> wrote:
>
> On Wed, Oct 02, 2019 at 12:18:54AM -0400, Bruce Ashfield wrote:
> > On Tue, Oct 1, 2019 at 10:01 PM Roman Gushchin <guro@fb.com> wrote:
> > >
> > > On Tue, Oct 01, 2019 at 12:14:18PM -0400, Bruce Ashfield wrote:
> > > > Hi all,
> > > >
> > >
> > > Hi Bruce!
> > >
> > > > The Yocto project has an upcoming release this fall, and I've been trying to
> > > > sort through some issues that are happening with kernel 5.2+ .. although
> > > > there is a specific yocto kernel, I'm testing and seeing this with
> > > > normal / vanilla
> > > > mainline kernels as well.
> > > >
> > > > I'm running into an issue that is *very* similar to the one discussed in the
> > > > [REGRESSION] ptrace broken from "cgroup: cgroup v2 freezer" (76f969e)
> > > > thread from this past may: https://lkml.org/lkml/2019/5/12/272
> > > >
> > > > I can confirm that I have the proposed fix for the initial regression report in
> > > > my build (05b2892637 [signal: unconditionally leave the frozen state
> > > > in ptrace_stop()]),
> > > > but yet I'm still seeing 3 or 4 minute runtimes on a test that used to take 3 or
> > > > 4 seconds.
> > >
> > > So, the problem is that you're experiencing a severe performance regression
> > > in some test, right?
> >
> > Hi Roman,
> >
> > Correct. In particular, running some of the tests that ship with strace itself.
> > The performance change is so drastic, that it definitely makes you wonder
> > "What have I done wrong? Since everyone must be seeing this" .. and I
> > always blame myself first.
> >
> > >
> > > >
> > > > This isn't my normal area of kernel hacking, so I've so far come up empty
> > > > at either fixing it myself, or figuring out a viable workaround. (well, I can
> > > > "fix" it by remove the cgroup_enter_frozen() call in ptrace_stop ...
> > > > but obviously,
> > > > that is just me trying to figure out what could be causing the issue).
> > > >
> > > > As part of the release, we run tests that come with various applications. The
> > > > ptrace test that is causing us issues can be boiled down to this:
> > > >
> > > > $ cd /usr/lib/strace/ptest/tests
> > > > $ time ../strace -o log -qq -esignal=none -e/clock ./printpath-umovestr>ttt
> > > >
> > > > (I can provide as many details as needed, but I wanted to keep this initial
> > > > email relatively short).
> > > >
> > > > I'll continue to debug and attempt to fix this myself, but I grabbed the
> > > > email list from the regression report in May to see if anyone has any ideas
> > > > or angles that I haven't covered in my search for a fix.
> > >
> > > I'm definitely happy to help, but it's a bit hard to say anything from what
> > > you've provided. I'm not aware of any open issues with the freezer except
> > > some spurious cgroup frozen<->not frozen transitions which can happen in some
> > > cases. If you'll describe how can I reproduce the issue, and I'll try to take
> > > a look asap.
> >
> > That would be great.
> >
> > I'll attempt to remove all of the build system specifics out of this
> > (and Richard Purdie
> > on the cc' of this can probably help provide more details / setup info as well).
> >
> > We are running the built-in tests of strace. So here's a cut and paste of what I
> > did to get the tests available (ignore/skip what is common sense or isn't needed
> > in your test rig).
> >
> > % git clone https://github.com/strace/strace.git
> > % cd strace
> > % ./bootstrap
> > # the --enable flag isn't strictly required, but may break on some
> > build machines
> > % ./configure --enable-mpers=no
> > % make
> > % make check-TESTS
> >
> > That last step will not only build the tests, but run them all .. so
> > ^c the run once
> > it starts, since it is a lot of noise (we carry a patch to strace that
> > allows us to build
> > the tests without running them).
> >
> > % cd tests
> > % time strace -o log -qq -esignal=none -e/clock ./printpath-umovestr > fff
> > real    0m2.566s
> > user    0m0.284s
> > sys     0m2.519
> >
> > On pre-cgroup2 freezer kernels, you see a run time similar to what I have above.
> > On the newer kernels we are testing, it is taking 3 or 4 minutes to
> > run the test.
> >
> > I hope that is simple enough to setup and try. Since I've been seeing
> > this on both
> > mainline kernels and the yocto reference kernels, I don't think it is
> > something that
> > I'm carrying in the distro/reference kernel that is causing this (but
> > again, I always
> > blame myself first). If you don't see that same run time, then that
> > does point the finger
> > back at what we are doing and I'll have to apologize for chewing up some of your
> > time.
>
> Thank you for the detailed description!
> I'll try to reproduce the issue and will be back
> by the end of the week.

Thanks again!

While discussing the issue with a few yocto folks today, it came up that
someone wasn't seeing the same behaviour on the opensuse v5.2 kernel
(but I've yet to figure out exactly where to find that tree) .. but when I do,
I'll try and confirm that and will look for patches or config differences that
could explain the results.

I did confirm that 5.3 shows the same thing today, and I'll do a 5.4-rc1 test
tomorrow.

We are also primarily reproducing the issue on qemux86-64, so I'm also
going to try and rule out qemu (but the same version of qemu with just
the kernel changing shows the issue).

I just wanted to add those data points in case that strace test doesn't
reproduce the problem for you. I'll follow up if I learn anything more during
my debug on Thursday.

Bruce

>
> Thank you!
>
> Roman



-- 
- Thou shalt not follow the NULL pointer, for chaos and madness await
thee at its end
- "Use the force Harry" - Gandalf, Star Trek II

  reply	other threads:[~2019-10-02 21:59 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-01 16:14 ptrace/strace and freezer oddities and v5.2+ kernels Bruce Ashfield
2019-10-02  2:01 ` Roman Gushchin
2019-10-02  4:18   ` Bruce Ashfield
2019-10-02 18:19     ` Roman Gushchin
2019-10-02 21:59       ` Bruce Ashfield [this message]
2019-10-04  0:09         ` Roman Gushchin
2019-10-07 12:54           ` Bruce Ashfield
2019-10-07 20:11             ` Bruce Ashfield
2019-10-07 23:27               ` Roman Gushchin
2019-10-08 12:13                 ` Bruce Ashfield
2019-10-08 12:36                   ` Oleg Nesterov
2019-10-08 12:58                     ` Bruce Ashfield
2019-10-08 17:27                     ` Roman Gushchin
2019-10-09 15:02 ` [PATCH] cgroup: freezer: call cgroup_enter_frozen() with preemption disabled in ptrace_stop() Oleg Nesterov
2019-10-09 15:21   ` Roman Gushchin
2019-10-11 16:52   ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CADkTA4PmGBR7YdOXvi6sEDJ+uztuB7x2G95TCcW2u_iqjwhUNQ@mail.gmail.com \
    --to=bruce.ashfield@gmail.com \
    --cc=guro@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=richard.purdie@linuxfoundation.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.