All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexey Klimov <alexey.klimov@arm.com>
To: Prashanth Prakash <pprakash@codeaurora.org>
Cc: rjw@rjwysocki.net, linux-acpi@vger.kernel.org,
	linaro-acpi@lists.linaro.org, ashwin.chaugule@linaro.org,
	timur@codeaurora.org
Subject: Re: [PATCH V3 1/4] ACPI / CPPC: Optimize PCC Read Write operations
Date: Mon, 15 Feb 2016 16:37:20 +0000	[thread overview]
Message-ID: <20160215163719.GA22453@arm.com> (raw)
In-Reply-To: <1455134762-31400-2-git-send-email-pprakash@codeaurora.org>

Hi Prashanth and Ashwin,

On Wed, Feb 10, 2016 at 01:05:59PM -0700, Prashanth Prakash wrote:
> From: Ashwin Chaugule <ashwin.chaugule@linaro.org>
> 
> Previously the send_pcc_cmd() code checked if the
> PCC operation had completed before returning from
> the function. This check was performed regardless
> of the PCC op type (i.e. Read/Write). Knowing
> the type of cmd can be used to optimize the check
> and avoid needless waiting. e.g. with Write ops,
> the actual Writing is done before calling send_pcc_cmd().
> And the subsequent Writes will check if the channel is
> free at the entry of send_pcc_cmd() anyway.
> 
> However, for Read cmds, we need to wait for the cmd
> completion bit to be flipped, since the actual Read
> ops follow after returning from the send_pcc_cmd(). So,
> only do the looping check at the end for Read ops.
> 
> Also, instead of using udelay() calls, use ktime as a
> means to check for deadlines. The current deadline
> in which the Remote should flip the cmd completion bit
> is defined as N * Nominal latency. Where N is arbitrary
> and large enough to work on slow emulators and Nominal
> latency comes from the ACPI table (PCCT). This helps
> in working around the CONFIG_HZ effects on udelay()
> and also avoids needing different ACPI tables for Silicon
> and Emulation platforms.
> 
> Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
> Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
> ---
>  drivers/acpi/cppc_acpi.c | 99 ++++++++++++++++++++++++++++++++++--------------
>  1 file changed, 71 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
> index 6730f96..6ae327e 100644
> --- a/drivers/acpi/cppc_acpi.c
> +++ b/drivers/acpi/cppc_acpi.c
> @@ -39,6 +39,7 @@
>  
>  #include <linux/cpufreq.h>
>  #include <linux/delay.h>
> +#include <linux/ktime.h>
>  
>  #include <acpi/cppc_acpi.h>
>  /*
> @@ -63,25 +64,53 @@ static struct mbox_chan *pcc_channel;
>  static void __iomem *pcc_comm_addr;
>  static u64 comm_base_addr;
>  static int pcc_subspace_idx = -1;
> -static u16 pcc_cmd_delay;
>  static bool pcc_channel_acquired;
> +static ktime_t deadline;
>  
>  /*
>   * Arbitrary Retries in case the remote processor is slow to respond
> - * to PCC commands.
> + * to PCC commands. Keeping it high enough to cover emulators where
> + * the processors run painfully slow.
>   */
>  #define NUM_RETRIES 500
>  
> +static int check_pcc_chan(void)
> +{
> +	int ret = -EIO;
> +	struct acpi_pcct_shared_memory __iomem *generic_comm_base = pcc_comm_addr;
> +	ktime_t next_deadline = ktime_add(ktime_get(), deadline);
> +
> +	/* Retry in case the remote processor was too slow to catch up. */
> +	while (!ktime_after(ktime_get(), next_deadline)) {
> +		if (readw_relaxed(&generic_comm_base->status) & PCC_CMD_COMPLETE) {

What do you think about polling for error too?
Firmware might set error bit and kernel will spin here for full cycle (maybe even
more than one time).


> +			ret = 0;
> +			break;
> +		}
> +		/*
> +		 * Reducing the bus traffic in case this loop takes longer than
> +		 * a few retries.
> +		 */
> +		udelay(3);
> +	}
> +
> +	return ret;
> +}
> +
>  static int send_pcc_cmd(u16 cmd)
>  {
> -	int retries, result = -EIO;
> -	struct acpi_pcct_hw_reduced *pcct_ss = pcc_channel->con_priv;
> +	int ret = -EIO;
>  	struct acpi_pcct_shared_memory *generic_comm_base =
>  		(struct acpi_pcct_shared_memory *) pcc_comm_addr;
> -	u32 cmd_latency = pcct_ss->latency;
>  
> -	/* Min time OS should wait before sending next command. */
> -	udelay(pcc_cmd_delay);

Ouch.
This delay is Minimum Request Turnaround Time and specification is strict about this:
The minimum amount of time that OSPM must wait after the
completion of a command before issuing the next command, in microseconds.

And you're going to completely remove delay that described as "must wait" regardless if it's
write or read command. In my view completion of a command is when CMD complete bit is set to 1.
This field in table is not optional.

Even worse, what if platform report latency = 0 (or very close to zero) and turnaround time as very large number?

Well, my other comments here are minor things but this one looks like main issue of this patch.



> +	/*
> +	 * For CMD_WRITE we know for a fact the caller should have checked
> +	 * the channel before writing to PCC space
> +	 */
> +	if (cmd == CMD_READ) {
> +		ret = check_pcc_chan();
> +		if (ret)
> +			return ret;
> +	}
>  
>  	/* Write to the shared comm region. */
>  	writew(cmd, &generic_comm_base->command);
> @@ -90,26 +119,25 @@ static int send_pcc_cmd(u16 cmd)
>  	writew(0, &generic_comm_base->status);
>  
>  	/* Ring doorbell */
> -	result = mbox_send_message(pcc_channel, &cmd);
> -	if (result < 0) {
> +	ret = mbox_send_message(pcc_channel, &cmd);
> +	if (ret < 0) {
>  		pr_err("Err sending PCC mbox message. cmd:%d, ret:%d\n",
> -				cmd, result);
> -		return result;
> +				cmd, ret);
> +		return ret;
>  	}
>  
> -	/* Wait for a nominal time to let platform process command. */
> -	udelay(cmd_latency);
> -
> -	/* Retry in case the remote processor was too slow to catch up. */
> -	for (retries = NUM_RETRIES; retries > 0; retries--) {
> -		if (readw_relaxed(&generic_comm_base->status) & PCC_CMD_COMPLETE) {
> -			result = 0;
> -			break;
> -		}
> -	}
> +	/*
> +	 * For READs we need to ensure the cmd completed to ensure
> +	 * the ensuing read()s can proceed. For WRITEs we dont care
> +	 * because the actual write()s are done before coming here
> +	 * and the next READ or WRITE will check if the channel
> +	 * is busy/free at the entry of this call.
> +	 */
> +	if (cmd == CMD_READ)
> +		ret = check_pcc_chan();
>  
> -	mbox_client_txdone(pcc_channel, result);
> -	return result;
> +	mbox_client_txdone(pcc_channel, ret);
> +	return ret;
>  }
>  
>  static void cppc_chan_tx_done(struct mbox_client *cl, void *msg, int ret)
> @@ -306,6 +334,7 @@ static int register_pcc_channel(int pcc_subspace_idx)
>  {
>  	struct acpi_pcct_hw_reduced *cppc_ss;
>  	unsigned int len;
> +	u64 usecs_lat;
>  
>  	if (pcc_subspace_idx >= 0) {
>  		pcc_channel = pcc_mbox_request_channel(&cppc_mbox_cl,
> @@ -335,7 +364,14 @@ static int register_pcc_channel(int pcc_subspace_idx)
>  		 */
>  		comm_base_addr = cppc_ss->base_address;
>  		len = cppc_ss->length;
> -		pcc_cmd_delay = cppc_ss->min_turnaround_time;
> +
> +		/*
> +		 * cppc_ss->latency is just a Nominal value. In reality
> +		 * the remote processor could be much slower to reply.
> +		 * So add an arbitrary amount of wait on top of Nominal.

Specification requires platform to report accurate values.
If remote party requires arbitrary amount on top that means someone screwed up
values in firmware.
That's not only slowness of remote processor but broken/inaccurate values given by firmware.
I think if you want this comment you need to blame firmware too.


> +		 */
> +		usecs_lat = NUM_RETRIES * cppc_ss->latency;
> +		deadline = ns_to_ktime(usecs_lat * NSEC_PER_USEC);
>  
>  		pcc_comm_addr = acpi_os_ioremap(comm_base_addr, len);
>  		if (!pcc_comm_addr) {
> @@ -604,7 +640,7 @@ int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps)
>  			(ref_perf->cpc_entry.reg.space_id == ACPI_ADR_SPACE_PLATFORM_COMM) ||
>  			(nom_perf->cpc_entry.reg.space_id == ACPI_ADR_SPACE_PLATFORM_COMM)) {
>  		/* Ring doorbell once to update PCC subspace */
> -		if (send_pcc_cmd(CMD_READ)) {
> +		if (send_pcc_cmd(CMD_READ) < 0) {

Somewhere in previous email I asked you to point me to specs with words about init values before
issuing first PCC command. Thanks for that.
However I still think that comment should be added about init value of status register.
Please add it unless you have something to come up with. I think such comment will be useful.


>  			ret = -EIO;
>  			goto out_err;
>  		}
> @@ -662,7 +698,7 @@ int cppc_get_perf_ctrs(int cpunum, struct cppc_perf_fb_ctrs *perf_fb_ctrs)
>  	if ((delivered_reg->cpc_entry.reg.space_id == ACPI_ADR_SPACE_PLATFORM_COMM) ||
>  			(reference_reg->cpc_entry.reg.space_id == ACPI_ADR_SPACE_PLATFORM_COMM)) {
>  		/* Ring doorbell once to update PCC subspace */
> -		if (send_pcc_cmd(CMD_READ)) {
> +		if (send_pcc_cmd(CMD_READ) < 0) {
>  			ret = -EIO;
>  			goto out_err;
>  		}
> @@ -713,6 +749,13 @@ int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls)
>  
>  	spin_lock(&pcc_lock);
>  
> +	/* If this is PCC reg, check if channel is free before writing */
> +	if (desired_reg->cpc_entry.reg.space_id == ACPI_ADR_SPACE_PLATFORM_COMM) {
> +		ret = check_pcc_chan();
> +		if (ret)
> +			goto busy_channel;
> +	}
> +
>  	/*
>  	 * Skip writing MIN/MAX until Linux knows how to come up with
>  	 * useful values.
> @@ -722,10 +765,10 @@ int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls)
>  	/* Is this a PCC reg ?*/
>  	if (desired_reg->cpc_entry.reg.space_id == ACPI_ADR_SPACE_PLATFORM_COMM) {
>  		/* Ring doorbell so Remote can get our perf request. */
> -		if (send_pcc_cmd(CMD_WRITE))
> +		if (send_pcc_cmd(CMD_WRITE) < 0)
>  			ret = -EIO;
>  	}
> -
> +busy_channel:
>  	spin_unlock(&pcc_lock);
>  
>  	return ret;

Best regards,
Alexey.
 

  parent reply	other threads:[~2016-02-15 16:37 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-10 20:05 [PATCH V3 0/4] acpi: cppc optimization patches Prashanth Prakash
2016-02-10 20:05 ` [PATCH V3 1/4] ACPI / CPPC: Optimize PCC Read Write operations Prashanth Prakash
2016-02-10 20:42   ` Timur Tabi
2016-02-10 21:15     ` Ashwin Chaugule
2016-02-10 21:57       ` Timur Tabi
2016-02-10 22:17         ` Ashwin Chaugule
2016-02-15 16:37   ` Alexey Klimov [this message]
2016-02-16 18:47     ` Ashwin Chaugule
2016-02-16 19:10       ` Rafael J. Wysocki
2016-02-16 19:33         ` Ashwin Chaugule
2016-02-16 19:39           ` Rafael J. Wysocki
2016-02-29 17:39       ` Alexey Klimov
2016-02-29 19:20         ` Prakash, Prashanth
2016-02-10 20:06 ` [PATCH V3 2/4] acpi: cppc: optimized cpc_read and cpc_write Prashanth Prakash
2016-02-10 20:06 ` [PATCH V3 3/4] mailbox: pcc: optimized pcc_send_data Prashanth Prakash
2016-02-10 20:06 ` [PATCH V3 4/4] acpi: cppc: replace writeX/readX to PCC with relaxed version Prashanth Prakash

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160215163719.GA22453@arm.com \
    --to=alexey.klimov@arm.com \
    --cc=ashwin.chaugule@linaro.org \
    --cc=linaro-acpi@lists.linaro.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=pprakash@codeaurora.org \
    --cc=rjw@rjwysocki.net \
    --cc=timur@codeaurora.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.