On 03/27/2015 12:34 PM, Ingo Molnar wrote:
>
> * Brian Gerst wrote:
>
>>> Btw., there's a neat trick we could do: in the HLT, MWAIT and
>>> ACPI-idle code we could attempt to set up RCX to match RIP, to
>>> trigger this optimization in the common 'irq interrupted the idle
>>> task' case?
>>
>> sysret only returns to CPL3.
>
> Indeed, an IRET ought to be pretty cheap for same-ring interrupt
> returns in any case.

Unfortunately, it is not. Try attached program.

On this CPU, 1 ns ~= 3 cycles.

$ ./timing_test64 callret
10000 loops in 0.00008s = 7.87 nsec/loop for callret
100000 loops in 0.00076s = 7.56 nsec/loop for callret
1000000 loops in 0.00548s = 5.48 nsec/loop for callret
10000000 loops in 0.02882s = 2.88 nsec/loop for callret
100000000 loops in 0.18334s = 1.83 nsec/loop for callret
200000000 loops in 0.36051s = 1.80 nsec/loop for callret
400000000 loops in 0.71632s = 1.79 nsec/loop for callret

Near call + near ret = 5 cycles

$ ./timing_test64 lret
10000 loops in 0.00034s = 33.95 nsec/loop for lret
100000 loops in 0.00328s = 32.83 nsec/loop for lret
1000000 loops in 0.04541s = 45.41 nsec/loop for lret
10000000 loops in 0.32130s = 32.13 nsec/loop for lret
20000000 loops in 0.64191s = 32.10 nsec/loop for lret

push my_cs + push next_label + far ret = ~90 cycles

$ ./timing_test64 iret
10000 loops in 0.00344s = 343.90 nsec/loop for iret
100000 loops in 0.01890s = 188.97 nsec/loop for iret
1000000 loops in 0.08228s = 82.28 nsec/loop for iret
10000000 loops in 0.77910s = 77.91 nsec/loop for iret

This is the "same-ring interrupt return". ~230 cycles! :(