* SSB numbers
@ 2018-05-09 12:39 Thomas Gleixner
From: Thomas Gleixner @ 2018-05-09 12:39 UTC (permalink / raw)
  To: speck

[-- Attachment #1: Type: text/plain, Size: 3545 bytes --]

Hi!

I did some tests to evaluate the impact of RDS toggling. The test is pretty
trivial. It starts two threads which do:

	clock_gettime(CLOCK_MONOTONIC, &tstart);
	for (i = 0; i < data->loops; i++) {
		clock_gettime(CLOCK_MONOTONIC, &lstart);
		for (j = 0; j < data->workloops; j++)
			data->count++;
		clock_gettime(CLOCK_MONOTONIC, &lend);
		data->worktime += delta(lend, lstart);
		sched_yield();
	}
	clock_gettime(CLOCK_MONOTONIC, &tend);
	data->tottime = delta(tend, tstart);

Compiled with -O0.

Both threads are pinned on the same CPU so the sched_yield() will make them
play ping pong. Nothing else runs on the CPU and the system is otherwise
idle.

data->workloops is used to control the time spent in the thread
'working'.
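For completeness, here is a minimal sketch of the harness around that loop. delta() and the CPU pinning are my reconstruction of what the test needs, not the exact test code:

```c
/* Reconstruction of the test harness helpers; not the original code. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdint.h>
#include <time.h>

/* Nanoseconds between two CLOCK_MONOTONIC timestamps */
static uint64_t delta(struct timespec end, struct timespec start)
{
	return (end.tv_sec - start.tv_sec) * 1000000000ULL
		+ end.tv_nsec - start.tv_nsec;
}

/* Pin the calling thread to one CPU so both threads ping pong there */
static void pin_to_cpu(int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}
```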

The full test does:

    for (work = 1; work < MAXWORK; work <<= 1) {
        do_tests(work, NORDS, NORDS);
        do_tests(work, NORDS, RDS);
        do_tests(work, RDS, RDS);
    }

So it doubles the work loop count on each round and invokes the ping pong
test three times:

     #1 Both threads have RDS off

     #2 One thread has RDS on, one has RDS off, i.e. RDS is toggled on every
     	context switch

     #3 Both threads have RDS on, i.e. no RDS toggling on context switch
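Setting the per thread RDS/NORDS state can be done via the speculation
control prctl. The sketch below uses the constant names and values as they
later landed in mainline; the exact interface discussed here may differ:

```c
/* Per-thread SSB control via the speculation prctl.
 * Constants as merged in mainline; reconstruction, not the test code.
 */
#include <sys/prctl.h>

#ifndef PR_SET_SPECULATION_CTRL
#define PR_SET_SPECULATION_CTRL	53
#define PR_SPEC_STORE_BYPASS	0
#define PR_SPEC_ENABLE		(1UL << 1)	/* speculation on, RDS clear */
#define PR_SPEC_DISABLE		(1UL << 2)	/* speculation off, RDS set */
#endif

/* on != 0: set RDS (mitigated), on == 0: clear RDS */
static int set_rds(int on)
{
	return prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS,
		     on ? PR_SPEC_DISABLE : PR_SPEC_ENABLE, 0, 0);
}
```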

The attached pngs show the effect of #2 and #3 normalized to #1 (100%) for
the various work values.

The lines represent:

   Yellow (CPUtime 1 SSB) : data->worktime total for run #2
   Red    (Testtime 1 SSB): data->tottime total for run #2

   Green  (CPUtime 2 SSB) : data->worktime total for run #3
   Blue   (Testtime 2 SSB): data->tottime total for run #3

x-axis is the computation time which is spent for one 'work loop'.

The SKL client (skl.png) shows an interesting effect with the very small
work loops: it first raises the total test time massively. The SKL-X server
(skl-x.png) behaves more like one would expect. The AMD EPYC server box has
a completely different and also partially odd behaviour (epyc.png).

But both SKL machines have common properties:

    The total overhead for this 'workload' is 45% !?!?

    The amortization for the MSR toggle is around 1us worth of work
    (slightly lower on the SKL-X).

So the idea of throttling the MSR toggling for full ticks, which was
proposed vs. the BPF mitigation, is not a good one: it would burden every
'innocent' thread with the massive overhead for a long time, and if traffic
is continuous it will simply never toggle back.

But OTOH Dave's concern about fast toggling the MSR bit for BPF in the
context of socket filtering has to be considered as well.

A potential solution is to do the following:

  On the first 'set RDS' operation from softirq context the SSB magic
  forces the RDS bit so subsequent 'clear RDS' invocations will be ignored.

  At the end of softirq processing a call to the SSB magic is done, which
  checks the 'force RDS' bit and, if set, clears it and also clears RDS.

  It's not going to be perfect, but it might be a worthwhile exercise to
  try. I'm still looking for a simple BPF test case which could be used for
  that. Any folks here who are less BPF agnostic than me?

  I'll do a simple experiment first with hardcoded invocations in the
  softirq rx path to see whether it works at all and that might give me
  some rough numbers to see whether this is a feasible approach.
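The force logic above can be modelled as a trivial state machine. This is a
userspace sketch of the proposal; all names are mine, not actual kernel
symbols:

```c
/* Userspace model of the proposed softirq force-RDS logic.
 * Illustrative names only, not kernel code.
 */
#include <stdbool.h>

static bool rds_on;	/* current state of the RDS bit */
static bool rds_forced;	/* latched for the softirq section */

/* First 'set RDS' from softirq context: set the bit and latch the force flag */
static void ssb_set_rds_softirq(void)
{
	rds_on = true;
	rds_forced = true;
}

/* 'clear RDS' is ignored while the force flag is latched */
static void ssb_clear_rds(void)
{
	if (!rds_forced)
		rds_on = false;
}

/* End of softirq processing: drop the force flag and clear RDS */
static void ssb_softirq_exit(void)
{
	if (rds_forced) {
		rds_forced = false;
		rds_on = false;
	}
}
```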

I also did some trivial kernel compile measurements with a launcher which
forces RDS for the build via prctl.

	       SKL	SKL-X	EPYC
Mitigated      +8%	+3%	+2%

So with my really stupid test case I obviously hit the worst point on
Intel.
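The launcher essentially boils down to the following (again a
reconstruction with the mainline prctl constants, not the actual tool; the
RDS state set via PR_SPEC_DISABLE is inherited across exec):

```c
/* Sketch of a launcher forcing RDS for a workload, e.g. a kernel build.
 * Reconstruction; constants as merged in mainline.
 */
#include <sys/prctl.h>
#include <unistd.h>

#ifndef PR_SET_SPECULATION_CTRL
#define PR_SET_SPECULATION_CTRL	53
#define PR_SPEC_STORE_BYPASS	0
#define PR_SPEC_DISABLE		(1UL << 2)
#endif

/* Force RDS for this task, then exec the workload.
 * Only returns (with -1) on failure.
 */
static int run_mitigated(char **argv)
{
	if (prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS,
		  PR_SPEC_DISABLE, 0, 0))
		return -1;
	execvp(argv[0], argv);
	return -1;
}
```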

Thanks,

	tglx

[-- Attachment #2: Type: image/png, Size: 46696 bytes --]

[-- Attachment #3: Type: image/png, Size: 39833 bytes --]

[-- Attachment #4: Type: image/png, Size: 36348 bytes --]
