linux-kernel.vger.kernel.org archive mirror
From: Stephane Eranian <eranian@google.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>, Jiri Olsa <jolsa@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	tonyj@suse.com, nelson.dsouza@intel.com
Subject: Re: [PATCH 1/8] perf/x86/intel: Fix memory corruption
Date: Tue, 19 Mar 2019 10:52:01 -0700
Message-ID: <CABPqkBQapZFCyJNRjP-GUjfOy62=rJU=Fv=cYPxPTnVFBkqtVA@mail.gmail.com>
In-Reply-To: <20190319110549.GC5996@hirez.programming.kicks-ass.net>

On Tue, Mar 19, 2019 at 4:05 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Mon, Mar 18, 2019 at 11:29:25PM -0700, Stephane Eranian wrote:
>
> > > --- a/arch/x86/events/intel/core.c
> > > +++ b/arch/x86/events/intel/core.c
> > > @@ -3410,7 +3410,7 @@ tfa_get_event_constraints(struct cpu_hw_
> > >         /*
> > >          * Without TFA we must not use PMC3.
> > >          */
> > > -       if (!allow_tsx_force_abort && test_bit(3, c->idxmsk)) {
> > > +       if (!allow_tsx_force_abort && test_bit(3, c->idxmsk) && idx >= 0) {
> > >                 c = dyn_constraint(cpuc, c, idx);
> > >                 c->idxmsk64 &= ~(1ULL << 3);
> > >                 c->weight--;
> > >
> > >
>
> > I was not cc'd on the patch that added  allow_tsx_force_abort, so I
>
> Yeah, that never was public :-( I didn't particularly like that, but
> that's the way it is.
>
> > will give some comments here.
>
> > If I understand the goal of the control parameter it is to turn on/off
> > the TFA workaround and thus determine whether or not PMC3 is
> > available. I don't know why you would need to make this a runtime
> > tunable.
>
> Not quite; the control on its own doesn't directly write the MSR. And
> even when the work-around is allowed, we'll not set the MSR unless there
> is also demand for PMC3.
>
Trying to understand this better. When the workaround is enabled
(tfa=0), you lose PMC3 and transactions operate normally. When it is
disabled (tfa=1), all transactions are aborted and PMC3 is available.
If you are saying that when a PMU event requests PMC3 you need PMC3
available, so you set the MSR to tfa=1 and force all transactions to
abort, then you are modifying the execution of the workload while you
are monitoring it, assuming it uses TSX. You want the lowest overhead
and no modifications to how the workload operates; otherwise, how
representative is the data you are collecting? I understand that there
is no impact on apps not using TSX, well, except on context switch
where you have to toggle that MSR. But for workloads using TSX, there
is potentially an impact.
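
To make sure we are talking about the same thing, here is a minimal
user-space sketch of how I read the constraint check in the hunk quoted
above. It is not the kernel code: struct constraint_model and
tfa_filter() are hypothetical stand-ins for struct event_constraint and
tfa_get_event_constraints(), and the dyn_constraint() copy and the
idx >= 0 check are left out. It only models the knob, not the MSR write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct event_constraint. */
struct constraint_model {
	uint64_t idxmsk64;	/* bitmask of counters the event may use */
	int weight;		/* number of bits set in idxmsk64 */
};

/*
 * Model of the quoted check: when force-abort is not allowed, PMC3
 * (bit 3) is removed from the set of counters the event may use.
 * The constraint is passed and returned by value, so the caller's
 * copy is never modified.
 */
static struct constraint_model
tfa_filter(struct constraint_model c, bool allow_tsx_force_abort)
{
	if (!allow_tsx_force_abort && (c.idxmsk64 & (1ULL << 3))) {
		/* Bit 3 is set, so at least one counter must be counted. */
		assert(c.weight > 0);
		c.idxmsk64 &= ~(1ULL << 3);
		c.weight--;
	}
	return c;
}
```

With a constraint that allows counters 0-3 (mask 0xf, weight 4), calling
tfa_filter() with allow_tsx_force_abort false gives mask 0x7 and weight
3; with it true, the constraint comes back unchanged.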

> It is a runtime tunable because boot parameters suck.
>
> > That seems a bit dodgy. But given the code you have here right now, we
> > have to deal with it. A sysadmin could flip the control at any time,
> > including when PMC3 is already in use by some events. I do not see
> > the code that schedules out all the events on all CPUs once PMC3
> > becomes unavailable. You cannot just rely on the next context-switch
> > or timer tick for multiplexing.
>
> Yeah, meh. You're admin, you can 'fix' it. In practise I don't expect
> most people to care about the knob, and the few people that do, should
> be able to make it work.

I don't understand how this can work reliably. You have a knob to
toggle that MSR. Then you have another one inside perf_events, and then
the sysadmin has to make sure nobody (incl. the NMI watchdog) is using
the PMU when this all happens. How can this be a practical solution?
Am I missing something here?

