All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Tian, Kevin" <kevin.tian@intel.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad@darnok.org>,
	"jeremy@goop.org" <jeremy@goop.org>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	"Ian.Campbell@citrix.com" <Ian.Campbell@citrix.com>,
	"mike.mcclurg@citrix.com" <mike.mcclurg@citrix.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"stefan.bader@canonical.com" <stefan.bader@canonical.com>,
	"rjw@sisk.pl" <rjw@sisk.pl>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
	"liang.tang@oracle.com" <liang.tang@oracle.com>,
	"Yu, Ke" <ke.yu@intel.com>,
	"konrad@kernel.org" <konrad@kernel.org>,
	"lenb@kernel.org" <lenb@kernel.org>,
	"Nakajima, Jun" <jun.nakajima@intel.com>
Subject: RE: [Xen-devel] [PATCH 3/8] ACPI: processor: add __acpi_processor_[un]register_driver helpers.
Date: Tue, 17 Jan 2012 03:03:30 +0000	[thread overview]
Message-ID: <AADFC41AFE54684AB9EE6CBC0274A5D10FCD37A9@SHSMSX102.ccr.corp.intel.com> (raw)
In-Reply-To: <20120113222451.GA6922@phenom.dumpdata.com>

> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> Sent: Saturday, January 14, 2012 6:25 AM
> 
> > > > sure. VCPU!=PCPU requirement is orthogonal to the basic part for gearing
> > > > ACPI information to Xen.
> 
> .. snip..
> > > >
> > > > >  1). For new distros (Ubuntu, Fedora), the default is all VCPUs.
> > > >
> > > > good to know that.
> > > >
> > > > >      Enterprising users might use dom0_max_vcpus to limit the VCPU
> > > count,
> > > > >      but most won't.
> > > > >      Which mean we can concentrate on bringing the _Pxx/_Cxx
> parsing
> > > > >      up to the hypervisor. Which is really neccessary on any chipset
> > > > >      which has the notion of TurboBoost (otherwise the Xen scheduler
> > > > >      won't pick this up and won't engage this mode in certain
> > > > >      workloads).
> 
> .. snip..
> 
> > yes, this is a big question. First, I'd like to give my sincere thanks to you and
> > Liang for help push this part to Linux upstream. You've done a great job. :-)
> 
> Thanks!
> > Unfortunately I can't afford the time in the short term, due to extremely busy
> > tasks in other projects, at least in the whole Q1. Really sorry about that. :/
> 
> Bummer.
> >
> > I would very appreciate your help if you could continue lending some time
> here,
> > since you've done plenty of works on the cleanup. The majority of the tricky
> > part is related to VCPU/PCPU handling. If putting that part aside, the
> remaining
> > logic should be much simpler.
> 
> I was trying to figure out how difficult it would be to just bring Pxx states to
> the Xen hypervisor using the existing ACPI interfaces. And while it did not pass
> all the _Pxx states (seems that all the _PCT, _PSS, _PSD, _PPC flags need to
> be enabled in the hypercall to make this work), it demonstrates what I had in
> mind.
> 
> 
> #include <linux/device.h>
> #include <linux/kernel.h>
> #include <linux/module.h>
> #include <linux/init.h>
> #include <linux/types.h>
> #include <acpi/acpi_bus.h>
> #include <acpi/acpi_drivers.h>
> #include <acpi/processor.h>
> #include <linux/cpumask.h>
> 
> #include <xen/interface/platform.h>
> #include <asm/xen/hypercall.h>
> 
> #define DRV_NAME	"ACPI_PXX"
> #define DRV_CLASS	"ACPI_PXX_CLASS"
> MODULE_AUTHOR("Konrad Rzeszutek Wilk");
> MODULE_DESCRIPTION("ACPI Processor Driver to send data to Xen
> hypervisor");
> MODULE_LICENSE("GPL");
> 
> static int parse_acpi_cxx(struct acpi_processor *_pr)
> {
> 	struct acpi_processor_cx *cx;
> 	int i;
> 
> 	for (i = 1; i <= _pr->power.count; i++) {
> 		cx = &_pr->power.states[i];
> 		if (!cx->valid)
> 			continue;
> 		pr_info("%s: %d %d %d 0x%x\n", __func__,
> 			cx->type, cx->latency, cx->power, (u32)cx->address);
> 	}
> 	/* TODO: Under Xen, the C-states information is not present.
>  	 * Figure out why. */

it's possible related to this long thread:

http://lists.xen.org/archives/html/xen-devel/2011-08/msg00511.html

IOW, Xen doesn't export mwait capability to dom0, which impacts _PDC setting.
Final solution is to have a para-virtualized PDC call for that.

> 	return 0;
> }
> static int push_pxx_to_hypervisor(struct acpi_processor *_pr)
> {
> 	int ret = -EINVAL;
> 	struct xen_platform_op op = {
> 		.cmd			= XENPF_set_processor_pminfo,
> 		.interface_version	= XENPF_INTERFACE_VERSION,
> 		.u.set_pminfo.id	= _pr->acpi_id,
> 		.u.set_pminfo.type	= XEN_PM_PX,
> 	};
> 	struct xen_processor_performance *xen_perf;
> 	struct xen_processor_px *xen_states, *xen_px = NULL;
> 	struct acpi_processor_px *px;
> 	unsigned i;
> 
> 	xen_perf = &op.u.set_pminfo.perf;
> 	xen_perf->flags = XEN_PX_PSS;
> 
> 	xen_perf->state_count = _pr->performance->state_count;
> 	xen_states = kzalloc(xen_perf->state_count *
> 			     sizeof(struct xen_processor_px), GFP_KERNEL);
> 	if (!xen_states)
> 		return -ENOMEM;
> 
> 	for (i = 0; i < _pr->performance->state_count; i++) {
> 		xen_px = &(xen_states[i]);
> 		px = &(_pr->performance->states[i]);
> 
> 		xen_px->core_frequency = px->core_frequency;
> 		xen_px->power = px->power;
> 		xen_px->transition_latency = px->transition_latency;
> 		xen_px->bus_master_latency = px->bus_master_latency;
> 		xen_px->control = px->control;
> 		xen_px->status = px->status;
> 	}
> 	set_xen_guest_handle(xen_perf->states, xen_states);
> 	ret = HYPERVISOR_dom0_op(&op);
> 	kfree(xen_states);
> 	return ret;
> }
> static int parse_acpi_pxx(struct acpi_processor *_pr)
> {
> 	struct acpi_processor_px *px;
> 	int i;
> 
> 	for (i = 0; i < _pr->performance->state_count;i++) {
> 		px = &(_pr->performance->states[i]);
> 		pr_info("%s: [%d]: %d, %d, %d, %d, %d, %d\n", __func__,
> 			i, (u32)px->core_frequency, (u32)px->power,
> 			(u32)px->transition_latency,
> 			(u32)px->bus_master_latency,
> 			(u32)px->control, (u32)px->status);
> 	}
> 	if (xen_initial_domain())
> 		return push_pxx_to_hypervisor(_pr);
> 	return 0;
> }
> static int parse_acpi_data(void)
> {
> 	int cpu;
> 	int err = -ENODEV;
> 	struct acpi_processor *_pr;
> 	struct cpuinfo_x86 *c = &cpu_data(0);
> 
> 	/* TODO: Under AMD, the information is populated
> 	 * using the powernow-k8 driver which does an MSR_PSTATE_CUR_LIMIT
> 	 * MSR which returns the wrong value so the population of 'processors'
> 	 * has bogus data. So only run this under Intel for right now. */
> 	if (!cpu_has(c, X86_FEATURE_EST))
> 		return -ENODEV;
> 	for_each_possible_cpu(cpu) {
> 		_pr = per_cpu(processors, cpu);
> 		if (!_pr)
> 			continue;
> 
> 		if (_pr->flags.power)
> 			(void)parse_acpi_cxx(_pr);
> 
> 		if (_pr->performance->states)
> 			err = parse_acpi_pxx(_pr);
> 		if (err)
> 			break;
> 	}
> 	return -ENODEV; /* force it to unload */
> }
> static int __init acpi_pxx_init(void)
> {
> 	return parse_acpi_data();
> }
> static void __exit acpi_pxx_exit(void)
> {
> }
> module_init(acpi_pxx_init);
> module_exit(acpi_pxx_exit);

the prerequisites for this module to work correctly, is that dom0 has the right
configurations to have all necessary Cx/Px information ready before this 
module is loaded. That may mean enabling full CONFIG_CPU_IDLE and CONFIG_CPUFREQ,
which in current form may add some negative impact, e.g. dom0 will try to control
Px/Cx to conflict with Xen. So some tweaks may be required in that part.

given our purpose now, is to come up a cleaner approach which tolerate some
assumptions (e.g. #VCPU of dom0 == #PCPU), there's another option following this
trend (perhaps compensate your idea). We can register a Xen-cpuidle and 
xen-cpufreq driver to current Linux cpuidle and cpufreq framework, which plays 
mainly two roles:
	- a dummy driver to prevent dom0 touching actual Px/Cx
	- parse ACPI Cx/Px information to Xen, in a similar way you did above

there may have some other trickiness, but the majority code will be self-contained.

Thanks
Kevin

WARNING: multiple messages have this Message-ID (diff)
From: "Tian, Kevin" <kevin.tian@intel.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	liang tang <liang.tang@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad@darnok.org>,
	"jeremy@goop.org" <jeremy@goop.org>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	"Ian.Campbell@citrix.com" <Ian.Campbell@citrix.com>,
	"mike.mcclurg@citrix.com" <mike.mcclurg@citrix.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"stefan.bader@canonical.com" <stefan.bader@canonical.com>,
	"rjw@sisk.pl" <rjw@sisk.pl>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
	"liang.tang@oracle.com" <liang.tang@oracle.com>,
	"Yu, Ke" <ke.yu@intel.com>,
	"konrad@kernel.org" <konrad@kernel.org>,
	"lenb@kernel.org" <lenb@kernel.org>,
	"Nakajima, Jun" <jun.nakajima@intel.com>
Subject: RE: [Xen-devel] [PATCH 3/8] ACPI: processor: add __acpi_processor_[un]register_driver helpers.
Date: Tue, 17 Jan 2012 03:03:30 +0000	[thread overview]
Message-ID: <AADFC41AFE54684AB9EE6CBC0274A5D10FCD37A9@SHSMSX102.ccr.corp.intel.com> (raw)
In-Reply-To: <20120113222451.GA6922@phenom.dumpdata.com>

> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> Sent: Saturday, January 14, 2012 6:25 AM
> 
> > > > sure. VCPU!=PCPU requirement is orthogonal to the basic part for gearing
> > > > ACPI information to Xen.
> 
> .. snip..
> > > >
> > > > >  1). For new distros (Ubuntu, Fedora), the default is all VCPUs.
> > > >
> > > > good to know that.
> > > >
> > > > >      Enterprising users might use dom0_max_vcpus to limit the VCPU
> > > count,
> > > > >      but most won't.
> > > > >      Which mean we can concentrate on bringing the _Pxx/_Cxx
> parsing
> > > > >      up to the hypervisor. Which is really neccessary on any chipset
> > > > >      which has the notion of TurboBoost (otherwise the Xen scheduler
> > > > >      won't pick this up and won't engage this mode in certain
> > > > >      workloads).
> 
> .. snip..
> 
> > yes, this is a big question. First, I'd like to give my sincere thanks to you and
> > Liang for help push this part to Linux upstream. You've done a great job. :-)
> 
> Thanks!
> > Unfortunately I can't afford the time in the short term, due to extremely busy
> > tasks in other projects, at least in the whole Q1. Really sorry about that. :/
> 
> Bummer.
> >
> > I would very appreciate your help if you could continue lending some time
> here,
> > since you've done plenty of works on the cleanup. The majority of the tricky
> > part is related to VCPU/PCPU handling. If putting that part aside, the
> remaining
> > logic should be much simpler.
> 
> I was trying to figure out how difficult it would be to just bring Pxx states to
> the Xen hypervisor using the existing ACPI interfaces. And while it did not pass
> all the _Pxx states (seems that all the _PCT, _PSS, _PSD, _PPC flags need to
> be enabled in the hypercall to make this work), it demonstrates what I had in
> mind.
> 
> 
> #include <linux/device.h>
> #include <linux/kernel.h>
> #include <linux/module.h>
> #include <linux/init.h>
> #include <linux/types.h>
> #include <acpi/acpi_bus.h>
> #include <acpi/acpi_drivers.h>
> #include <acpi/processor.h>
> #include <linux/cpumask.h>
> 
> #include <xen/interface/platform.h>
> #include <asm/xen/hypercall.h>
> 
> #define DRV_NAME	"ACPI_PXX"
> #define DRV_CLASS	"ACPI_PXX_CLASS"
> MODULE_AUTHOR("Konrad Rzeszutek Wilk");
> MODULE_DESCRIPTION("ACPI Processor Driver to send data to Xen
> hypervisor");
> MODULE_LICENSE("GPL");
> 
> static int parse_acpi_cxx(struct acpi_processor *_pr)
> {
> 	struct acpi_processor_cx *cx;
> 	int i;
> 
> 	for (i = 1; i <= _pr->power.count; i++) {
> 		cx = &_pr->power.states[i];
> 		if (!cx->valid)
> 			continue;
> 		pr_info("%s: %d %d %d 0x%x\n", __func__,
> 			cx->type, cx->latency, cx->power, (u32)cx->address);
> 	}
> 	/* TODO: Under Xen, the C-states information is not present.
>  	 * Figure out why. */

it's possible related to this long thread:

http://lists.xen.org/archives/html/xen-devel/2011-08/msg00511.html

IOW, Xen doesn't export mwait capability to dom0, which impacts _PDC setting.
Final solution is to have a para-virtualized PDC call for that.

> 	return 0;
> }
> static int push_pxx_to_hypervisor(struct acpi_processor *_pr)
> {
> 	int ret = -EINVAL;
> 	struct xen_platform_op op = {
> 		.cmd			= XENPF_set_processor_pminfo,
> 		.interface_version	= XENPF_INTERFACE_VERSION,
> 		.u.set_pminfo.id	= _pr->acpi_id,
> 		.u.set_pminfo.type	= XEN_PM_PX,
> 	};
> 	struct xen_processor_performance *xen_perf;
> 	struct xen_processor_px *xen_states, *xen_px = NULL;
> 	struct acpi_processor_px *px;
> 	unsigned i;
> 
> 	xen_perf = &op.u.set_pminfo.perf;
> 	xen_perf->flags = XEN_PX_PSS;
> 
> 	xen_perf->state_count = _pr->performance->state_count;
> 	xen_states = kzalloc(xen_perf->state_count *
> 			     sizeof(struct xen_processor_px), GFP_KERNEL);
> 	if (!xen_states)
> 		return -ENOMEM;
> 
> 	for (i = 0; i < _pr->performance->state_count; i++) {
> 		xen_px = &(xen_states[i]);
> 		px = &(_pr->performance->states[i]);
> 
> 		xen_px->core_frequency = px->core_frequency;
> 		xen_px->power = px->power;
> 		xen_px->transition_latency = px->transition_latency;
> 		xen_px->bus_master_latency = px->bus_master_latency;
> 		xen_px->control = px->control;
> 		xen_px->status = px->status;
> 	}
> 	set_xen_guest_handle(xen_perf->states, xen_states);
> 	ret = HYPERVISOR_dom0_op(&op);
> 	kfree(xen_states);
> 	return ret;
> }
> static int parse_acpi_pxx(struct acpi_processor *_pr)
> {
> 	struct acpi_processor_px *px;
> 	int i;
> 
> 	for (i = 0; i < _pr->performance->state_count;i++) {
> 		px = &(_pr->performance->states[i]);
> 		pr_info("%s: [%d]: %d, %d, %d, %d, %d, %d\n", __func__,
> 			i, (u32)px->core_frequency, (u32)px->power,
> 			(u32)px->transition_latency,
> 			(u32)px->bus_master_latency,
> 			(u32)px->control, (u32)px->status);
> 	}
> 	if (xen_initial_domain())
> 		return push_pxx_to_hypervisor(_pr);
> 	return 0;
> }
> static int parse_acpi_data(void)
> {
> 	int cpu;
> 	int err = -ENODEV;
> 	struct acpi_processor *_pr;
> 	struct cpuinfo_x86 *c = &cpu_data(0);
> 
> 	/* TODO: Under AMD, the information is populated
> 	 * using the powernow-k8 driver which does an MSR_PSTATE_CUR_LIMIT
> 	 * MSR which returns the wrong value so the population of 'processors'
> 	 * has bogus data. So only run this under Intel for right now. */
> 	if (!cpu_has(c, X86_FEATURE_EST))
> 		return -ENODEV;
> 	for_each_possible_cpu(cpu) {
> 		_pr = per_cpu(processors, cpu);
> 		if (!_pr)
> 			continue;
> 
> 		if (_pr->flags.power)
> 			(void)parse_acpi_cxx(_pr);
> 
> 		if (_pr->performance->states)
> 			err = parse_acpi_pxx(_pr);
> 		if (err)
> 			break;
> 	}
> 	return -ENODEV; /* force it to unload */
> }
> static int __init acpi_pxx_init(void)
> {
> 	return parse_acpi_data();
> }
> static void __exit acpi_pxx_exit(void)
> {
> }
> module_init(acpi_pxx_init);
> module_exit(acpi_pxx_exit);

the prerequisites for this module to work correctly, is that dom0 has the right
configurations to have all necessary Cx/Px information ready before this 
module is loaded. That may mean enabling full CONFIG_CPU_IDLE and CONFIG_CPUFREQ,
which in current form may add some negative impact, e.g. dom0 will try to control
Px/Cx to conflict with Xen. So some tweaks may be required in that part.

given our purpose now, is to come up a cleaner approach which tolerate some
assumptions (e.g. #VCPU of dom0 == #PCPU), there's another option following this
trend (perhaps compensate your idea). We can register a Xen-cpuidle and 
xen-cpufreq driver to current Linux cpuidle and cpufreq framework, which plays 
mainly two roles:
	- a dummy driver to prevent dom0 touching actual Px/Cx
	- parse ACPI Cx/Px information to Xen, in a similar way you did above

there may have some other trickiness, but the majority code will be self-contained.

Thanks
Kevin

  reply	other threads:[~2012-01-17  3:03 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-11-30 17:20 [RFC PATCH] Exporting ACPI Pxx/Cxx states to other kernel subsystems (v1) Konrad Rzeszutek Wilk
2011-11-30 17:20 ` [PATCH 1/8] ACPI: processor: export necessary interfaces Konrad Rzeszutek Wilk
2011-12-16 21:33   ` Konrad Rzeszutek Wilk
2011-12-19  5:43     ` Tian, Kevin
2011-12-19  5:43       ` Tian, Kevin
2011-12-19 14:17       ` [Xen-devel] " Konrad Rzeszutek Wilk
2011-11-30 17:20 ` [PATCH 2/8] ACPI: processor: cache acpi_power_register in cx structure Konrad Rzeszutek Wilk
2011-11-30 17:20 ` [PATCH 3/8] ACPI: processor: add __acpi_processor_[un]register_driver helpers Konrad Rzeszutek Wilk
2011-11-30 17:20   ` Konrad Rzeszutek Wilk
2011-12-16 22:03   ` Konrad Rzeszutek Wilk
2011-12-19  5:48     ` Tian, Kevin
2011-12-19 14:26       ` [Xen-devel] " Konrad Rzeszutek Wilk
2011-12-20  2:29         ` Tian, Kevin
2011-12-20 15:31           ` Konrad Rzeszutek Wilk
2011-12-21  0:35             ` Tian, Kevin
2011-12-23  3:01               ` Konrad Rzeszutek Wilk
2011-12-26  1:31                 ` Tian, Kevin
2012-01-03 20:59                   ` Konrad Rzeszutek Wilk
2012-01-06  1:07                     ` Tian, Kevin
2012-01-13 22:24                       ` Konrad Rzeszutek Wilk
2012-01-13 22:24                         ` Konrad Rzeszutek Wilk
2012-01-17  3:03                         ` Tian, Kevin [this message]
2012-01-17  3:03                           ` Tian, Kevin
2012-01-17 17:13                           ` Konrad Rzeszutek Wilk
2012-01-17 18:19                             ` Konrad Rzeszutek Wilk
2012-01-23 16:53                               ` Konrad Rzeszutek Wilk
2011-11-30 17:21 ` [PATCH 4/8] ACPI: processor: Don't setup cpu idle driver and handler when we do not want them Konrad Rzeszutek Wilk
2011-12-16 21:36   ` Konrad Rzeszutek Wilk
2011-12-19 10:33     ` liang tang
2011-12-19 14:26       ` [Xen-devel] " Konrad Rzeszutek Wilk
2011-12-20  6:25         ` liang tang
2011-11-30 17:21 ` [PATCH 5/8] ACPI: add processor driver for Xen virtual CPUs Konrad Rzeszutek Wilk
2011-11-30 17:21   ` Konrad Rzeszutek Wilk
2011-12-01  9:24   ` [Xen-devel] " Jan Beulich
2011-12-01  9:24     ` Jan Beulich
2011-12-12 17:29     ` Konrad Rzeszutek Wilk
2011-12-12 17:29       ` Konrad Rzeszutek Wilk
2011-12-13  7:45       ` Jan Beulich
2011-12-13  7:45         ` Jan Beulich
2011-12-13  9:26         ` liang tang
2011-12-16 22:21           ` Konrad Rzeszutek Wilk
2012-02-10 17:18   ` Konrad Rzeszutek Wilk
2011-11-30 17:21 ` [PATCH 6/8] ACPI: processor: override the interface of register acpi processor handler for Xen vcpu Konrad Rzeszutek Wilk
2011-11-30 17:21 ` [PATCH 7/8] ACPI: xen processor: add PM notification interfaces Konrad Rzeszutek Wilk
2011-11-30 17:21 ` [PATCH 8/8] ACPI: xen processor: set ignore_ppc to handle PPC event for Xen vcpu Konrad Rzeszutek Wilk

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=AADFC41AFE54684AB9EE6CBC0274A5D10FCD37A9@SHSMSX102.ccr.corp.intel.com \
    --to=kevin.tian@intel.com \
    --cc=Ian.Campbell@citrix.com \
    --cc=jeremy@goop.org \
    --cc=jun.nakajima@intel.com \
    --cc=ke.yu@intel.com \
    --cc=konrad.wilk@oracle.com \
    --cc=konrad@darnok.org \
    --cc=konrad@kernel.org \
    --cc=lenb@kernel.org \
    --cc=liang.tang@oracle.com \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mike.mcclurg@citrix.com \
    --cc=rjw@sisk.pl \
    --cc=stefan.bader@canonical.com \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.