From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753599AbbETKXG (ORCPT <rfc822;w@1wt.eu>);
	Wed, 20 May 2015 06:23:06 -0400
Received: from mail-wg0-f41.google.com ([74.125.82.41]:35878 "EHLO
	mail-wg0-f41.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752817AbbETKXE (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Wed, 20 May 2015 06:23:04 -0400
Date: Wed, 20 May 2015 12:22:58 +0200
From: Ingo Molnar <mingo@kernel.org>
To: Borislav Petkov <bp@suse.de>
Cc: Huang Rui <ray.huang@amd.com>, Len Brown <lenb@kernel.org>,
        "Rafael J. Wysocki" <rjw@rjwysocki.net>,
        Thomas Gleixner <tglx@linutronix.de>, x86@kernel.org,
        linux-kernel@vger.kernel.org, Fengguang Wu <fengguang.wu@intel.com>,
        Aaron Lu <aaron.lu@intel.com>, Tony Li <tony.li@amd.com>
Subject: Re: [RFC PATCH 2/4] x86, mwaitt: introduce mwaitx idle with a
 configurable timer
Message-ID: <20150520102258.GA21245@gmail.com>
References: <1432022472-2224-1-git-send-email-ray.huang@amd.com>
 <1432022472-2224-3-git-send-email-ray.huang@amd.com>
 <20150519113121.GD4819@pd.tnic>
 <20150520085520.GA8566@gmail.com>
 <20150520091213.GA3645@pd.tnic>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150520091213.GA3645@pd.tnic>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org


* Borislav Petkov <bp@suse.de> wrote:

> On Wed, May 20, 2015 at 10:55:20AM +0200, Ingo Molnar wrote:
> > Does it use it to decide how 'deep' a sleep it will go into, i.e. 
> > larger timeouts cause longer entry and exit latencies?
> 
> That's what the HLT thing does. Cores go into C1 and then at some 
> point (hysteresis, etc) the whole core complex enters C1E.

Well, HLT does not get any hint from the OS about how long the idling 
is expected to last.

So I don't think it's the same thing.

> The MWAIT* should be used for only shorter sleeps as it remains in 
> C1. IMHO, of course.
> 
> But the problem there is another: what happens if the timeout fires, 
> you wake up and see that you can remain idle? Do HLT? Do another 
> MWAITX round?

Another MWAITX round - we've got no crystal ball, so the hint might be 
wrong if an external event occurs that we did not anticipate.

As long as it's only a statistical optimization, that's OK: i.e. if 
the hardware uses the timeout solely to decide how deep to sleep.
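
To make the "another MWAITX round" idea concrete, here is a rough 
userspace simulation in C. It is only a sketch: timed_wait() is a 
hypothetical stand-in for an MWAITX-like timed wait, the tick values 
are made up, and nothing here reflects what the hardware actually 
does. It assumes the pessimistic case where an expired timeout does 
return control to the idle loop, and counts the rounds needed until 
the real event shows up.

```c
#include <stdio.h>

/* Hypothetical wake reasons, for illustration only. */
enum wake_reason { WAKE_TIMEOUT, WAKE_EVENT };

/*
 * Simulated stand-in for an MWAITX-like timed wait (an assumption, not
 * the real instruction). The wait starts at *now; an external event is
 * scheduled at event_at. *now is advanced to the time of the wakeup.
 */
static enum wake_reason timed_wait(unsigned long *now,
				   unsigned long timeout,
				   unsigned long event_at)
{
	if (event_at <= *now + timeout) {
		/* The event arrives before the timeout: exit right away. */
		if (event_at > *now)
			*now = event_at;
		return WAKE_EVENT;
	}

	/* Nothing happened: the timeout expires. */
	*now += timeout;
	return WAKE_TIMEOUT;
}

/*
 * Idle until the event arrives, simply doing another round whenever
 * the timeout fires first. Returns the number of rounds; timeout must
 * be non-zero, otherwise the loop never makes progress.
 */
static unsigned int idle_until_event(unsigned long timeout,
				     unsigned long event_at,
				     unsigned long *woke_at)
{
	unsigned long now = 0;
	unsigned int rounds = 0;

	for (;;) {
		rounds++;
		if (timed_wait(&now, timeout, event_at) == WAKE_EVENT)
			break;
	}

	*woke_at = now;
	return rounds;
}
```

With timeout=100 and event_at=350, idle_until_event() returns 4, i.e. 
three rounds ended by the timeout before the event arrived. If the 
timeout really does wake the CPU, those are the extra wakeups in 
question; if it is only a depth hint, they never happen.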

> This means you have an additional unnecessary wakeup which costs.

I don't think MWAITX will wake up by itself. (If it does, then it's 
essentially a timer in disguise and needs a whole different approach!)

> > I suppose it's also the case that if an interrupt arrives _before_ 
> > the expected timeout then MWAITX will try to exit immediately, it 
> > won't wait until the timeout, right?
> 
> I'd assume so - I mean, it must, right.
> 
> BUT!, in talking to Andy about it last night on IRC, he pointed out 
> that when using acpi_idle, we never come to calling x86_idle() and 
> from looking quickly at cpuidle_idle_call(), that still might be the 
> case as we go to use_default only when there's an error with the 
> cpuidle driver or so.

Yes, we don't normally see these idle handlers: ACPI takes over on 
most systems.

> So Rui, before you go and do more work on it, you should probably 
> analyze what cpuidle exactly does (if you haven't done so yet). And 
> on AMD we do use acpi_idle - at least on my F15h box that is the 
> case:
> 
> $ grep . /sys/devices/system/cpu/cpuidle/current_*
> /sys/devices/system/cpu/cpuidle/current_driver:acpi_idle
> /sys/devices/system/cpu/cpuidle/current_governor_ro:menu

Yes.
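
For anyone who wants to do the same check from a program rather than 
with grep, a minimal C sketch follows. The sysfs path is the one from 
the output quoted above; the helper names read_first_line() and 
cpuidle_driver_is() are made up for this example and are not kernel 
or libc APIs.

```c
#include <stdio.h>
#include <string.h>

/* Path as it appears in the grep output quoted above. */
#define CPUIDLE_DRIVER_PATH \
	"/sys/devices/system/cpu/cpuidle/current_driver"

/*
 * Hypothetical helper: read the first line of 'path' into 'buf' and
 * strip the trailing newline. Returns 0 on success, -1 on failure.
 */
static int read_first_line(const char *path, char *buf, size_t len)
{
	FILE *f;

	if (len == 0)
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/*
 * Hypothetical helper: returns 1 if the driver named in 'path' equals
 * 'name', 0 if it is something else, -1 if the file cannot be read.
 */
static int cpuidle_driver_is(const char *path, const char *name)
{
	char buf[64];

	if (read_first_line(path, buf, sizeof(buf)) < 0)
		return -1;

	return strcmp(buf, name) == 0;
}
```

On a box like the F15h one above, 
cpuidle_driver_is(CPUIDLE_DRIVER_PATH, "acpi_idle") would be expected 
to return 1; a return of -1 just means the file could not be read, 
for example when cpuidle is not available.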

The question would be: on systems that provide ACPI idle but also have 
MWAITX support, which one behaves better on the hardware side?

Thanks,

	Ingo