* [PATCH v3] can: c_can: Speed up rx_poll function
@ 2013-11-01 9:36 Markus Pargmann
2013-11-03 19:20 ` Wolfgang Grandegger
0 siblings, 1 reply; 3+ messages in thread
From: Markus Pargmann @ 2013-11-01 9:36 UTC (permalink / raw)
To: Marc Kleine-Budde, Wolfgang Grandegger
Cc: Joe Perches, linux-can, netdev, linux-kernel, kernel, Markus Pargmann
This patch speeds up the rx_poll function by reducing the number of
register reads.
Replace the 32bit register read by a 16bit register read. Currently
the 32bit register read is implemented by using 2 16bit reads. This is
inefficient as we only use the lower 16bit in rx_poll.
The for loop reads the pending interrupts in every iteration. This
leads up to 16 reads of pending interrupts. The patch introduces a new
outer loop to read the pending interrupts as long as 'quota' is above 0.
This reduces the total number of reads.
The third change is to replace the for-loop by a ffs loop.
Tested on AM335x. I removed all 'static' and 'inline' from c_can.c to
see the timings for all functions. I used the function tracer with
trace_stats.
125kbit:
Function Hit Time Avg s^2
-------- --- ---- --- ---
c_can_do_rx_poll 63960 10168178 us 158.977 us 1493056 us
With patch:
c_can_do_rx_poll 63941 3764057 us 58.867 us 776162.2 us
1Mbit:
Function Hit Time Avg s^2
-------- --- ---- --- ---
c_can_do_rx_poll 69489 30049498 us 432.435 us 9271851 us
With patch:
c_can_do_rx_poll 207109 24322185 us 117.436 us 171469047 us
Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
---
Notes:
Changes in v3:
- Update commit message (measurements and ffs)
Changes in v2:
- Small changes, find_next_bit -> ffs and other
drivers/net/can/c_can/c_can.c | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/drivers/net/can/c_can/c_can.c b/drivers/net/can/c_can/c_can.c
index a668cd4..428681e 100644
--- a/drivers/net/can/c_can/c_can.c
+++ b/drivers/net/can/c_can/c_can.c
@@ -798,17 +798,19 @@ static int c_can_do_rx_poll(struct net_device *dev, int quota)
u32 num_rx_pkts = 0;
unsigned int msg_obj, msg_ctrl_save;
struct c_can_priv *priv = netdev_priv(dev);
- u32 val = c_can_read_reg32(priv, C_CAN_INTPND1_REG);
+ u16 val;
+
+ /*
+ * It is faster to read only one 16bit register. This is only possible
+ * for a maximum number of 16 objects.
+ */
+ BUILD_BUG_ON_MSG(C_CAN_MSG_OBJ_RX_LAST > 16,
+ "Implementation does not support more message objects than 16");
+
+ while (quota > 0 && (val = priv->read_reg(priv, C_CAN_INTPND1_REG))) {
+ while ((msg_obj = ffs(val)) && quota > 0) {
+ val &= ~BIT(msg_obj - 1);
- for (msg_obj = C_CAN_MSG_OBJ_RX_FIRST;
- msg_obj <= C_CAN_MSG_OBJ_RX_LAST && quota > 0;
- val = c_can_read_reg32(priv, C_CAN_INTPND1_REG),
- msg_obj++) {
- /*
- * as interrupt pending register's bit n-1 corresponds to
- * message object n, we need to handle the same properly.
- */
- if (val & (1 << (msg_obj - 1))) {
c_can_object_get(dev, 0, msg_obj, IF_COMM_ALL &
~IF_COMM_TXRQST);
msg_ctrl_save = priv->read_reg(priv,
--
1.8.4.rc3
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH v3] can: c_can: Speed up rx_poll function
2013-11-01 9:36 [PATCH v3] can: c_can: Speed up rx_poll function Markus Pargmann
@ 2013-11-03 19:20 ` Wolfgang Grandegger
2013-11-03 21:05 ` Marc Kleine-Budde
0 siblings, 1 reply; 3+ messages in thread
From: Wolfgang Grandegger @ 2013-11-03 19:20 UTC (permalink / raw)
To: Markus Pargmann, Marc Kleine-Budde
Cc: Joe Perches, linux-can, netdev, linux-kernel, kernel
On 11/01/2013 10:36 AM, Markus Pargmann wrote:
> This patch speeds up the rx_poll function by reducing the number of
> register reads.
>
> Replace the 32bit register read by a 16bit register read. Currently
> the 32bit register read is implemented by using 2 16bit reads. This is
> inefficient as we only use the lower 16bit in rx_poll.
>
> The for loop reads the pending interrupts in every iteration. This
> leads up to 16 reads of pending interrupts. The patch introduces a new
> outer loop to read the pending interrupts as long as 'quota' is above 0.
> This reduces the total number of reads.
>
> The third change is to replace the for-loop by a ffs loop.
>
> Tested on AM335x. I removed all 'static' and 'inline' from c_can.c to
> see the timings for all functions. I used the function tracer with
> trace_stats.
>
> 125kbit:
> Function Hit Time Avg s^2
> -------- --- ---- --- ---
> c_can_do_rx_poll 63960 10168178 us 158.977 us 1493056 us
> With patch:
> c_can_do_rx_poll 63941 3764057 us 58.867 us 776162.2 us
>
> 1Mbit:
> Function Hit Time Avg s^2
> -------- --- ---- --- ---
> c_can_do_rx_poll 69489 30049498 us 432.435 us 9271851 us
> With patch:
> c_can_do_rx_poll 207109 24322185 us 117.436 us 171469047 us
>
> Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
> ---
>
> Notes:
> Changes in v3:
> - Update commit message (measurements and ffs)
>
> Changes in v2:
> - Small changes, find_next_bit -> ffs and other
>
> drivers/net/can/c_can/c_can.c | 22 ++++++++++++----------
> 1 file changed, 12 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/net/can/c_can/c_can.c b/drivers/net/can/c_can/c_can.c
> index a668cd4..428681e 100644
> --- a/drivers/net/can/c_can/c_can.c
> +++ b/drivers/net/can/c_can/c_can.c
> @@ -798,17 +798,19 @@ static int c_can_do_rx_poll(struct net_device *dev, int quota)
> u32 num_rx_pkts = 0;
> unsigned int msg_obj, msg_ctrl_save;
> struct c_can_priv *priv = netdev_priv(dev);
> - u32 val = c_can_read_reg32(priv, C_CAN_INTPND1_REG);
> + u16 val;
> +
> + /*
> + * It is faster to read only one 16bit register. This is only possible
> + * for a maximum number of 16 objects.
> + */
> + BUILD_BUG_ON_MSG(C_CAN_MSG_OBJ_RX_LAST > 16,
> + "Implementation does not support more message objects than 16");
> +
> + while (quota > 0 && (val = priv->read_reg(priv, C_CAN_INTPND1_REG))) {
> + while ((msg_obj = ffs(val)) && quota > 0) {
> + val &= ~BIT(msg_obj - 1);
IIRC, we should avoid assignment in if/while statements.
Wolfgang.
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH v3] can: c_can: Speed up rx_poll function
2013-11-03 19:20 ` Wolfgang Grandegger
@ 2013-11-03 21:05 ` Marc Kleine-Budde
0 siblings, 0 replies; 3+ messages in thread
From: Marc Kleine-Budde @ 2013-11-03 21:05 UTC (permalink / raw)
To: Wolfgang Grandegger
Cc: Markus Pargmann, Joe Perches, linux-can, netdev, linux-kernel, kernel
[-- Attachment #1: Type: text/plain, Size: 3397 bytes --]
On 11/03/2013 08:20 PM, Wolfgang Grandegger wrote:
> On 11/01/2013 10:36 AM, Markus Pargmann wrote:
>> This patch speeds up the rx_poll function by reducing the number of
>> register reads.
>>
>> Replace the 32bit register read by a 16bit register read. Currently
>> the 32bit register read is implemented by using 2 16bit reads. This is
>> inefficient as we only use the lower 16bit in rx_poll.
>>
>> The for loop reads the pending interrupts in every iteration. This
>> leads up to 16 reads of pending interrupts. The patch introduces a new
>> outer loop to read the pending interrupts as long as 'quota' is above 0.
>> This reduces the total number of reads.
>>
>> The third change is to replace the for-loop by a ffs loop.
>>
>> Tested on AM335x. I removed all 'static' and 'inline' from c_can.c to
>> see the timings for all functions. I used the function tracer with
>> trace_stats.
>>
>> 125kbit:
>> Function Hit Time Avg s^2
>> -------- --- ---- --- ---
>> c_can_do_rx_poll 63960 10168178 us 158.977 us 1493056 us
>> With patch:
>> c_can_do_rx_poll 63941 3764057 us 58.867 us 776162.2 us
>>
>> 1Mbit:
>> Function Hit Time Avg s^2
>> -------- --- ---- --- ---
>> c_can_do_rx_poll 69489 30049498 us 432.435 us 9271851 us
>> With patch:
>> c_can_do_rx_poll 207109 24322185 us 117.436 us 171469047 us
>>
>> Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
>> ---
>>
>> Notes:
>> Changes in v3:
>> - Update commit message (measurements and ffs)
>>
>> Changes in v2:
>> - Small changes, find_next_bit -> ffs and other
>>
>> drivers/net/can/c_can/c_can.c | 22 ++++++++++++----------
>> 1 file changed, 12 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/net/can/c_can/c_can.c b/drivers/net/can/c_can/c_can.c
>> index a668cd4..428681e 100644
>> --- a/drivers/net/can/c_can/c_can.c
>> +++ b/drivers/net/can/c_can/c_can.c
>> @@ -798,17 +798,19 @@ static int c_can_do_rx_poll(struct net_device *dev, int quota)
>> u32 num_rx_pkts = 0;
>> unsigned int msg_obj, msg_ctrl_save;
>> struct c_can_priv *priv = netdev_priv(dev);
>> - u32 val = c_can_read_reg32(priv, C_CAN_INTPND1_REG);
>> + u16 val;
>> +
>> + /*
>> + * It is faster to read only one 16bit register. This is only possible
>> + * for a maximum number of 16 objects.
>> + */
>> + BUILD_BUG_ON_MSG(C_CAN_MSG_OBJ_RX_LAST > 16,
>> + "Implementation does not support more message objects than 16");
>> +
>> + while (quota > 0 && (val = priv->read_reg(priv, C_CAN_INTPND1_REG))) {
>> + while ((msg_obj = ffs(val)) && quota > 0) {
>> + val &= ~BIT(msg_obj - 1);
>
> IIRC, we should avoid assignment in if/while statements.
Yes, but the code looks IMHO better this way.
Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Industrial Linux Solutions | Phone: +49-231-2826-924 |
Vertretung West/Dortmund | Fax: +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de |
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 259 bytes --]
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2013-11-03 21:05 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-01 9:36 [PATCH v3] can: c_can: Speed up rx_poll function Markus Pargmann
2013-11-03 19:20 ` Wolfgang Grandegger
2013-11-03 21:05 ` Marc Kleine-Budde
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).