linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] can: c_can: Speed up rx_poll function
@ 2013-10-28 16:59 Markus Pargmann
  2013-10-28 17:12 ` Joe Perches
  0 siblings, 1 reply; 3+ messages in thread
From: Markus Pargmann @ 2013-10-28 16:59 UTC (permalink / raw)
  To: Marc Kleine-Budde, Wolfgang Grandegger
  Cc: linux-can, netdev, linux-kernel, kernel, Markus Pargmann

This patch speeds up the rx_poll function by reducing the number of
register reads.

Replace the 32bit register read by a 16bit register read. Currently
the 32bit register read is implemented by using 2 16bit reads. This is
inefficient as we only use the lower 16bit in rx_poll.

The for loops reads the pending interrupts in every iteration. This
leads up to 16 reads of pending interrupts. The patch introduces a new
outer loop to read the pending interrupts as long as 'quota' is above 0.
This reduces the total number of reads.

The third change is to replace the for-loop by a find_next_bit loop.

Tested on AM335x. I removed all 'static' and 'inline' from c_can.c to
see the timings for all functions. I used the function tracer with
trace_stats.

125kbit:
  Function                               Hit    Time            Avg             s^2
  --------                               ---    ----            ---             ---
  c_can_do_rx_poll                     63960    10168178 us     158.977 us      1493056 us
With patch:
  c_can_do_rx_poll                     63939    4268457 us     66.758 us       818790.9 us

1Mbit:
  Function                               Hit    Time            Avg             s^2
  --------                               ---    ----            ---             ---
  c_can_do_rx_poll                     69489    30049498 us     432.435 us      9271851 us
With patch:
  c_can_do_rx_poll                    103034    24220362 us     235.071 us      6016656 us

Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
---
 drivers/net/can/c_can/c_can.c | 30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/drivers/net/can/c_can/c_can.c b/drivers/net/can/c_can/c_can.c
index a668cd4..31af033 100644
--- a/drivers/net/can/c_can/c_can.c
+++ b/drivers/net/can/c_can/c_can.c
@@ -259,6 +259,12 @@ static u32 c_can_read_reg32(struct c_can_priv *priv, enum reg index)
 	return val;
 }
 
+u16 c_can_read_reg16(struct c_can_priv *priv, enum reg index)
+{
+	u16 val = priv->read_reg(priv, index);
+	return val;
+}
+
 static void c_can_enable_all_interrupts(struct c_can_priv *priv,
 						int enable)
 {
@@ -798,17 +804,21 @@ static int c_can_do_rx_poll(struct net_device *dev, int quota)
 	u32 num_rx_pkts = 0;
 	unsigned int msg_obj, msg_ctrl_save;
 	struct c_can_priv *priv = netdev_priv(dev);
-	u32 val = c_can_read_reg32(priv, C_CAN_INTPND1_REG);
+	unsigned long val = c_can_read_reg16(priv, C_CAN_INTPND1_REG);
+
+	/*
+	 * It is faster to read only one 16bit register. This is only possible
+	 * for a maximum number of 16 objects.
+	 */
+	BUILD_BUG_ON_MSG(C_CAN_MSG_OBJ_RX_LAST > 16,
+			"Implementation does not support more message objects than 16");
+
+	while (quota > 0 && (val = c_can_read_reg16(priv, C_CAN_INTPND1_REG))) {
+		msg_obj = 0;
+		while ((msg_obj = find_next_bit(&val, 16, msg_obj)) < 16 &&
+				quota > 0) {
+			++msg_obj;
 
-	for (msg_obj = C_CAN_MSG_OBJ_RX_FIRST;
-			msg_obj <= C_CAN_MSG_OBJ_RX_LAST && quota > 0;
-			val = c_can_read_reg32(priv, C_CAN_INTPND1_REG),
-			msg_obj++) {
-		/*
-		 * as interrupt pending register's bit n-1 corresponds to
-		 * message object n, we need to handle the same properly.
-		 */
-		if (val & (1 << (msg_obj - 1))) {
 			c_can_object_get(dev, 0, msg_obj, IF_COMM_ALL &
 					~IF_COMM_TXRQST);
 			msg_ctrl_save = priv->read_reg(priv,
-- 
1.8.4.rc3


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] can: c_can: Speed up rx_poll function
  2013-10-28 16:59 [PATCH] can: c_can: Speed up rx_poll function Markus Pargmann
@ 2013-10-28 17:12 ` Joe Perches
  2013-10-28 18:28   ` Markus Pargmann
  0 siblings, 1 reply; 3+ messages in thread
From: Joe Perches @ 2013-10-28 17:12 UTC (permalink / raw)
  To: Markus Pargmann
  Cc: Marc Kleine-Budde, Wolfgang Grandegger, linux-can, netdev,
	linux-kernel, kernel

On Mon, 2013-10-28 at 17:59 +0100, Markus Pargmann wrote:
> This patch speeds up the rx_poll function by reducing the number of
> register reads.

trivial notes:

> diff --git a/drivers/net/can/c_can/c_can.c b/drivers/net/can/c_can/c_can.c
[]
> @@ -259,6 +259,12 @@ static u32 c_can_read_reg32(struct c_can_priv *priv, enum reg index)
> +u16 c_can_read_reg16(struct c_can_priv *priv, enum reg index)
> +{
> +	u16 val = priv->read_reg(priv, index);
> +	return val;
> +}

This function doesn't seem useful at all.
It's not exported and it's not static.

Why not use an in-place priv->read_reg(priv, index)?

> +
>  static void c_can_enable_all_interrupts(struct c_can_priv *priv,
>  						int enable)
>  {
> @@ -798,17 +804,21 @@ static int c_can_do_rx_poll(struct net_device *dev, int quota)
>  	u32 num_rx_pkts = 0;
>  	unsigned int msg_obj, msg_ctrl_save;
>  	struct c_can_priv *priv = netdev_priv(dev);
> -	u32 val = c_can_read_reg32(priv, C_CAN_INTPND1_REG);
> +	unsigned long val = c_can_read_reg16(priv, C_CAN_INTPND1_REG);

Probably better as a u16 as detailed below.

> +	/*
> +	 * It is faster to read only one 16bit register. This is only possible
> +	 * for a maximum number of 16 objects.
> +	 */
> +	BUILD_BUG_ON_MSG(C_CAN_MSG_OBJ_RX_LAST > 16,
> +			"Implementation does not support more message objects than 16");
> +
> +	while (quota > 0 && (val = c_can_read_reg16(priv, C_CAN_INTPND1_REG))) {
> +		msg_obj = 0;
> +		while ((msg_obj = find_next_bit(&val, 16, msg_obj)) < 16 &&

Using ffs instead of find_next_bit would be more standard
and probably faster too.



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] can: c_can: Speed up rx_poll function
  2013-10-28 17:12 ` Joe Perches
@ 2013-10-28 18:28   ` Markus Pargmann
  0 siblings, 0 replies; 3+ messages in thread
From: Markus Pargmann @ 2013-10-28 18:28 UTC (permalink / raw)
  To: Joe Perches
  Cc: Marc Kleine-Budde, Wolfgang Grandegger, linux-can, netdev,
	linux-kernel, kernel

On Mon, Oct 28, 2013 at 10:12:16AM -0700, Joe Perches wrote:
> On Mon, 2013-10-28 at 17:59 +0100, Markus Pargmann wrote:
> > This patch speeds up the rx_poll function by reducing the number of
> > register reads.
> 
> trivial notes:
> 
> > diff --git a/drivers/net/can/c_can/c_can.c b/drivers/net/can/c_can/c_can.c
> []
> > @@ -259,6 +259,12 @@ static u32 c_can_read_reg32(struct c_can_priv *priv, enum reg index)
> > +u16 c_can_read_reg16(struct c_can_priv *priv, enum reg index)
> > +{
> > +	u16 val = priv->read_reg(priv, index);
> > +	return val;
> > +}
> 
> This function doesn't seem useful at all.
> It's not exported and it's not static.

Oh right, should be static.

> 
> Why not use an in-place priv->read_reg(priv, index)?

No reason for that. Especially with a u16 as mentioned below, it is
clear enough that we have a 16bit value, so I will change it.

> 
> > +
> >  static void c_can_enable_all_interrupts(struct c_can_priv *priv,
> >  						int enable)
> >  {
> > @@ -798,17 +804,21 @@ static int c_can_do_rx_poll(struct net_device *dev, int quota)
> >  	u32 num_rx_pkts = 0;
> >  	unsigned int msg_obj, msg_ctrl_save;
> >  	struct c_can_priv *priv = netdev_priv(dev);
> > -	u32 val = c_can_read_reg32(priv, C_CAN_INTPND1_REG);
> > +	unsigned long val = c_can_read_reg16(priv, C_CAN_INTPND1_REG);
> 
> Probably better as a u16 as detailed below.
> 
> > +	/*
> > +	 * It is faster to read only one 16bit register. This is only possible
> > +	 * for a maximum number of 16 objects.
> > +	 */
> > +	BUILD_BUG_ON_MSG(C_CAN_MSG_OBJ_RX_LAST > 16,
> > +			"Implementation does not support more message objects than 16");
> > +
> > +	while (quota > 0 && (val = c_can_read_reg16(priv, C_CAN_INTPND1_REG))) {
> > +		msg_obj = 0;
> > +		while ((msg_obj = find_next_bit(&val, 16, msg_obj)) < 16 &&
> 
> Using ffs instead of find_next_bit would be more standard
> and probably faster too.

I wasn't aware of that. I will change it

Thanks,

Markus Pargmann

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2013-10-28 18:28 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-10-28 16:59 [PATCH] can: c_can: Speed up rx_poll function Markus Pargmann
2013-10-28 17:12 ` Joe Perches
2013-10-28 18:28   ` Markus Pargmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).