linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* RE: My vote against eepro* removal
@ 2006-01-23 11:01 kus Kusche Klaus
  2006-01-23 20:23 ` Jesse Brandeburg
  0 siblings, 1 reply; 25+ messages in thread
From: kus Kusche Klaus @ 2006-01-23 11:01 UTC (permalink / raw)
  To: John Ronciak, Lee Revell
  Cc: Evgeniy Polyakov, Adrian Bunk, linux-kernel, john.ronciak,
	ganesh.venkatesan, jesse.brandeburg, netdev, Steven Rostedt

From: John Ronciak
> Can we try a couple of things? 1) just comment out all the check for
> link code in the e100 driver and give that a try and 2) just comment
> out the update stats call and see if that works.  These seem to be the
> differences and we need to know which one is causing the problem.

First of all, I am still unable to get any traces of this in the
latency tracer. Moreover, as I told before, removing parts of the 
watchdog usually made my eth0 nonfunctional (which is bad - this
is an embedded system with ssh access).

Hence, I explicitely instrumented the watchdog function with tsc.
Output of the timings is done by a background thread, so the
timings should not increase the runtime of the watchdog.

Here are my results:

If the watchdog doesn't get interrupted, preempted, or whatever,
it spends 340 us in its body:
* 303 us in the mii code
*  36 us in the following code up to e100_adjust_adaptive_ifs
*   1 us in the remaining code (I think my chip doesn't need any
of those chip-specific fixups)

The 303 us in the mii code are divided in the following way:
* 101 us in mii_ethtool_gset
* 135 us in the whole if
*  67 us in mii_check_link

This is with the udelay(2) instead of udelay(20) hack applied.
With udelay(20), the mii times are 128 + 170 + 85 us,
i.e. 383 us instead of 303 us, or >= 420 us for the whole watchdog.

As the RTC runs with 8192 Hz during my tests, the watchdog is hit
by 2-3 interrupts, which adds another 75 - 110 us to its total
execution time, i.e. the time it blocks other rtprio 1 threads.

-- 
Klaus Kusche                 (Software Development - Control Systems)
KEBA AG             Gewerbepark Urfahr, A-4041 Linz, Austria (Europe)
Tel: +43 / 732 / 7090-3120                 Fax: +43 / 732 / 7090-6301
E-Mail: kus@keba.com                                WWW: www.keba.com
 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: My vote against eepro* removal
  2006-01-23 11:01 My vote against eepro* removal kus Kusche Klaus
@ 2006-01-23 20:23 ` Jesse Brandeburg
  0 siblings, 0 replies; 25+ messages in thread
From: Jesse Brandeburg @ 2006-01-23 20:23 UTC (permalink / raw)
  To: kus Kusche Klaus
  Cc: Lee Revell, Evgeniy Polyakov, Adrian Bunk, linux-kernel, Ronciak,
	John, Brandeburg, Jesse, netdev, Steven Rostedt

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3497 bytes --]

On Mon, 23 Jan 2006, kus Kusche Klaus wrote:
> From: John Ronciak
> > Can we try a couple of things? 1) just comment out all the check for
> > link code in the e100 driver and give that a try and 2) just comment
> > out the update stats call and see if that works.  These seem to be the
> > differences and we need to know which one is causing the problem.
> 
> First of all, I am still unable to get any traces of this in the
> latency tracer. Moreover, as I told before, removing parts of the
> watchdog usually made my eth0 nonfunctional (which is bad - this
> is an embedded system with ssh access).
> 
> Hence, I explicitely instrumented the watchdog function with tsc.
> Output of the timings is done by a background thread, so the
> timings should not increase the runtime of the watchdog.
> 
> Here are my results:
> 
> If the watchdog doesn't get interrupted, preempted, or whatever,
> it spends 340 us in its body:
> * 303 us in the mii code
> *  36 us in the following code up to e100_adjust_adaptive_ifs
> *   1 us in the remaining code (I think my chip doesn't need any
> of those chip-specific fixups)
> 
> The 303 us in the mii code are divided in the following way:
> * 101 us in mii_ethtool_gset
> * 135 us in the whole if
> *  67 us in mii_check_link
> 
> This is with the udelay(2) instead of udelay(20) hack applied.
> With udelay(20), the mii times are 128 + 170 + 85 us,
> i.e. 383 us instead of 303 us, or >= 420 us for the whole watchdog.
> 
> As the RTC runs with 8192 Hz during my tests, the watchdog is hit
> by 2-3 interrupts, which adds another 75 - 110 us to its total
> execution time, i.e. the time it blocks other rtprio 1 threads.

Thank you very much for that detailed analysis!  okay, so calls to mii.c 
take too long, but those depend on mmio_read in e100 to do the work, so 
this patch attempts to minimize the latency.

This patch is against linus-2.6.git, I compile and ssh/ping tested it. 
Would you be willing to send your instrumentation patches?  I could then 
test any fixes easier.

e100: attempt a shorter delay for mdio reads

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>

Simply reorder our write/read sequence for mdio reads to minimize latency
as well as delay a shorter interval for each loop.
---

  drivers/net/e100.c |   12 +++++++-----
  1 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/net/e100.c b/drivers/net/e100.c
--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -891,23 +891,25 @@ static u16 mdio_ctrl(struct nic *nic, u3
  	 * procedure it should be done under lock.
  	 */
  	spin_lock_irqsave(&nic->mdio_lock, flags);
-	for (i = 100; i; --i) {
+	for (i = 1000; i; --i) {
  		if (readl(&nic->csr->mdi_ctrl) & mdi_ready)
  			break;
-		udelay(20);
+		udelay(2);
  	}
  	if (unlikely(!i)) {
-		printk("e100.mdio_ctrl(%s) won't go Ready\n",
+		DPRINTK(PROBE, ERR, "e100.mdio_ctrl(%s) won't go Ready\n",
  			nic->netdev->name );
  		spin_unlock_irqrestore(&nic->mdio_lock, flags);
  		return 0;		/* No way to indicate timeout error */
  	}
  	writel((reg << 16) | (addr << 21) | dir | data, &nic->csr->mdi_ctrl);

-	for (i = 0; i < 100; i++) {
-		udelay(20);
+	/* to avoid latency, read to flush the write, then delay, and only
+	 * delay 2us per loop, manual says read should complete in < 64us */
+	for (i = 0; i < 1000; i++) {
  		if ((data_out = readl(&nic->csr->mdi_ctrl)) & mdi_ready)
  			break;
+		udelay(2);
  	}
  	spin_unlock_irqrestore(&nic->mdio_lock, flags);
  	DPRINTK(HW, DEBUG,

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: My vote against eepro* removal
@ 2006-01-24  7:38 kus Kusche Klaus
  0 siblings, 0 replies; 25+ messages in thread
From: kus Kusche Klaus @ 2006-01-24  7:38 UTC (permalink / raw)
  To: Jesse Brandeburg
  Cc: Lee Revell, Evgeniy Polyakov, Adrian Bunk, linux-kernel, Ronciak,
	John, netdev, Steven Rostedt

From: Jesse Brandeburg
> On Mon, 23 Jan 2006, kus Kusche Klaus wrote:
> > Here are my results:
> > 
> > If the watchdog doesn't get interrupted, preempted, or whatever,
> > it spends 340 us in its body:
> > * 303 us in the mii code
> > *  36 us in the following code up to e100_adjust_adaptive_ifs
> > *   1 us in the remaining code (I think my chip doesn't need any
> > of those chip-specific fixups)
> > 
> > The 303 us in the mii code are divided in the following way:
> > * 101 us in mii_ethtool_gset
> > * 135 us in the whole if
> > *  67 us in mii_check_link
> > 
> > This is with the udelay(2) instead of udelay(20) hack applied.
> > With udelay(20), the mii times are 128 + 170 + 85 us,
> > i.e. 383 us instead of 303 us, or >= 420 us for the whole watchdog.
> 
> Thank you very much for that detailed analysis!  okay, so 
> calls to mii.c 
> take too long, but those depend on mmio_read in e100 to do 
> the work, so 
> this patch attempts to minimize the latency.
> 
> This patch is against linus-2.6.git, I compile and ssh/ping 
> tested it. 
> Would you be willing to send your instrumentation patches?  I 
> could then 
> test any fixes easier.

No deep magic behind my instrumentation:

A few global variables and some rdtscl in the watchdog:

unsigned long my_tsc_1 = 0;
unsigned long my_tsc_2 = 0;
unsigned long my_tsc_3 = 0;
unsigned long my_tsc_4 = 0;
EXPORT_SYMBOL(my_tsc_1);
EXPORT_SYMBOL(my_tsc_2);
EXPORT_SYMBOL(my_tsc_3);
EXPORT_SYMBOL(my_tsc_4);

static void e100_watchdog(unsigned long data)
{
	struct nic *nic = (struct nic *)data;
	struct ethtool_cmd cmd;

	DPRINTK(TIMER, DEBUG, "right now = %ld\n", jiffies);

	/* mii library handles link maintenance tasks */
	rdtscl(my_tsc_1);

	mii_ethtool_gset(&nic->mii, &cmd);

	rdtscl(my_tsc_2);
	if(mii_link_ok(&nic->mii) && !netif_carrier_ok(nic->netdev)) {
		DPRINTK(LINK, INFO, "link up, %sMbps, %s-duplex\n",
			cmd.speed == SPEED_100 ? "100" : "10",
			cmd.duplex == DUPLEX_FULL ? "full" : "half");
	} else if(!mii_link_ok(&nic->mii) && netif_carrier_ok(nic->netdev)) {
		DPRINTK(LINK, INFO, "link down\n");
	}

	rdtscl(my_tsc_3);
	mii_check_link(&nic->mii);
	rdtscl(my_tsc_4);

	/* Software generated interrupt to recover from (rare) Rx
	* allocation failure.
...

And a small module which prints the timings periodically 
when loaded:

/* Example module, built after LDD book release 3 */

#include <linux/init.h>
#include <linux/module.h>
#include <linux/version.h>
#include <linux/errno.h>
#include <linux/timer.h>

MODULE_LICENSE("GPL");

/* Output interval, in jiffies */
#define INTERVAL 2111
/* Output scaling: TSC ==> microseconds */
#define SCALE(x) ((x)/500)

extern unsigned long my_tsc_1;
extern unsigned long my_tsc_2;
extern unsigned long my_tsc_3;
extern unsigned long my_tsc_4;

static struct timer_list my_timer;

static void timer_func(unsigned long dummy)
{
  printk(KERN_NOTICE "my_timer: diff = %lu / %lu / %lu\n",
         SCALE(my_tsc_2 - my_tsc_1),
         SCALE(my_tsc_3 - my_tsc_2),
         SCALE(my_tsc_4 - my_tsc_3));

  my_timer.expires += INTERVAL;
  add_timer(&my_timer);
}

static int __init mymod_init(void)
{
  init_timer(&my_timer);
  my_timer.function = timer_func;
  my_timer.expires = jiffies + INTERVAL;
  add_timer(&my_timer);
  
  printk(KERN_NOTICE "Started mymod...\n");
  return 0;
}

static void __exit mymod_exit(void)
{
  del_timer_sync(&my_timer);
  printk(KERN_NOTICE "Finished mymod...\n");
}

module_init(mymod_init);
module_exit(mymod_exit);


> 
> e100: attempt a shorter delay for mdio reads
> 
> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
> 
> Simply reorder our write/read sequence for mdio reads to 
> minimize latency
> as well as delay a shorter interval for each loop.
>
> diff --git a/drivers/net/e100.c b/drivers/net/e100.c
> --- a/drivers/net/e100.c
> +++ b/drivers/net/e100.c
> @@ -891,23 +891,25 @@ static u16 mdio_ctrl(struct nic *nic, u3
>   	 * procedure it should be done under lock.
>   	 */
>   	spin_lock_irqsave(&nic->mdio_lock, flags);
> -	for (i = 100; i; --i) {
> +	for (i = 1000; i; --i) {
>   		if (readl(&nic->csr->mdi_ctrl) & mdi_ready)
>   			break;
> -		udelay(20);
> +		udelay(2);
>   	}
>   	if (unlikely(!i)) {
> -		printk("e100.mdio_ctrl(%s) won't go Ready\n",
> +		DPRINTK(PROBE, ERR, "e100.mdio_ctrl(%s) won't 
> go Ready\n",
>   			nic->netdev->name );
>   		spin_unlock_irqrestore(&nic->mdio_lock, flags);
>   		return 0;		/* No way to indicate 
> timeout error */
>   	}

The piece of code above is not yet present 
in my version of e100.
(I'm still at 2.6.15, there is no -rt patch for 2.6.16 yet)

>   	writel((reg << 16) | (addr << 21) | dir | data, 
> &nic->csr->mdi_ctrl);
> 
> -	for (i = 0; i < 100; i++) {
> -		udelay(20);
> +	/* to avoid latency, read to flush the write, then 
> delay, and only
> +	 * delay 2us per loop, manual says read should complete 
> in < 64us */
> +	for (i = 0; i < 1000; i++) {
>   		if ((data_out = readl(&nic->csr->mdi_ctrl)) & mdi_ready)
>   			break;
> +		udelay(2);
>   	}

Exchanging the if and the udelay made things slightly worse:
It runs with 103 / 136 / 68 instead of 101 / 135 / 67 us.


-- 
Klaus Kusche                 (Software Development - Control Systems)
KEBA AG             Gewerbepark Urfahr, A-4041 Linz, Austria (Europe)
Tel: +43 / 732 / 7090-3120                 Fax: +43 / 732 / 7090-6301
E-Mail: kus@keba.com                                WWW: www.keba.com
 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: My vote against eepro* removal
  2006-01-21  2:01         ` John Ronciak
@ 2006-01-21  3:56           ` Lee Revell
  0 siblings, 0 replies; 25+ messages in thread
From: Lee Revell @ 2006-01-21  3:56 UTC (permalink / raw)
  To: John Ronciak
  Cc: Evgeniy Polyakov, kus Kusche Klaus, Adrian Bunk, linux-kernel,
	john.ronciak, ganesh.venkatesan, jesse.brandeburg, netdev,
	Steven Rostedt

On Fri, 2006-01-20 at 18:01 -0800, John Ronciak wrote:
> There is a timer routine in the eepro100 driver which does the check
> for link as well as a check for on of the hang conditions (with
> work-around).  It does the check for link in a different way than
> e100.  e100 uses mii call where eepro100 does it manually.  Another
> difference is that eepro100 doesn't get stats unless called by the
> system.  It's not in the timer routine at all.
> 
> Can we try a couple of things? 1) just comment out all the check for
> link code in the e100 driver and give that a try and 2) just comment
> out the update stats call and see if that works.  These seem to be the
> differences and we need to know which one is causing the problem.

Heh, FWIW, Microsoft found this exact same bug with 2 different chipsets
in their latency testing (see section 4.4):

http://research.microsoft.com/~mbj/papers/tr-98-29.html

Lee


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: My vote against eepro* removal
  2006-01-21  1:30       ` Lee Revell
@ 2006-01-21  2:01         ` John Ronciak
  2006-01-21  3:56           ` Lee Revell
  0 siblings, 1 reply; 25+ messages in thread
From: John Ronciak @ 2006-01-21  2:01 UTC (permalink / raw)
  To: Lee Revell
  Cc: Evgeniy Polyakov, kus Kusche Klaus, Adrian Bunk, linux-kernel,
	john.ronciak, ganesh.venkatesan, jesse.brandeburg, netdev,
	Steven Rostedt

On 1/20/06, Lee Revell <rlrevell@joe-job.com> wrote:
>
> Why don't these cause excessive scheduling delays in eepro100 then?
> Can't we just copy the eepro100 behavior?
Reports still float around from time to time about hangs with the
eepro100 which go away when the e100 driver is used.  We don't
maintain the eepro100 driver so I can't tell you much about it other
than the reports we get sometimes.

There is a timer routine in the eepro100 driver which does the check
for link as well as a check for on of the hang conditions (with
work-around).  It does the check for link in a different way than
e100.  e100 uses mii call where eepro100 does it manually.  Another
difference is that eepro100 doesn't get stats unless called by the
system.  It's not in the timer routine at all.

Can we try a couple of things? 1) just comment out all the check for
link code in the e100 driver and give that a try and 2) just comment
out the update stats call and see if that works.  These seem to be the
differences and we need to know which one is causing the problem.


--
Cheers,
John

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: My vote against eepro* removal
  2006-01-21  1:19     ` John Ronciak
@ 2006-01-21  1:30       ` Lee Revell
  2006-01-21  2:01         ` John Ronciak
  0 siblings, 1 reply; 25+ messages in thread
From: Lee Revell @ 2006-01-21  1:30 UTC (permalink / raw)
  To: John Ronciak
  Cc: Evgeniy Polyakov, kus Kusche Klaus, Adrian Bunk, linux-kernel,
	john.ronciak, ganesh.venkatesan, jesse.brandeburg, netdev,
	Steven Rostedt

On Fri, 2006-01-20 at 17:19 -0800, John Ronciak wrote:
> On 1/20/06, Lee Revell <rlrevell@joe-job.com> wrote:
> > Seems like the important question is, why does e100 need a watchdog if
> > eepro100 works fine without one?  Isn't the point of a watchdog in this
> > context to work around other bugs in the driver (or the hardware)?
> There are a number of things that the watchdog in e100 does.  It
> checks link (up, down), reads the hardware stats, adjusts the adaptive
> IFS and checks to 3 known hang conditions based on certain types of
> the hardware.  You might be able to get around without doing the
> work-arounds (as long as you don't' see hangs happening with the
> hardware being used) but the checking of the link and the stats are
> probably needed.

Why don't these cause excessive scheduling delays in eepro100 then?
Can't we just copy the eepro100 behavior?

Lee


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: My vote against eepro* removal
  2006-01-21  0:40   ` Lee Revell
@ 2006-01-21  1:19     ` John Ronciak
  2006-01-21  1:30       ` Lee Revell
  0 siblings, 1 reply; 25+ messages in thread
From: John Ronciak @ 2006-01-21  1:19 UTC (permalink / raw)
  To: Lee Revell
  Cc: Evgeniy Polyakov, kus Kusche Klaus, Adrian Bunk, linux-kernel,
	john.ronciak, ganesh.venkatesan, jesse.brandeburg, netdev,
	Steven Rostedt

On 1/20/06, Lee Revell <rlrevell@joe-job.com> wrote:
> Seems like the important question is, why does e100 need a watchdog if
> eepro100 works fine without one?  Isn't the point of a watchdog in this
> context to work around other bugs in the driver (or the hardware)?
There are a number of things that the watchdog in e100 does.  It
checks link (up, down), reads the hardware stats, adjusts the adaptive
IFS and checks to 3 known hang conditions based on certain types of
the hardware.  You might be able to get around without doing the
work-arounds (as long as you don't' see hangs happening with the
hardware being used) but the checking of the link and the stats are
probably needed.


--
Cheers,
John

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: My vote against eepro* removal
  2006-01-20 10:19 kus Kusche Klaus
  2006-01-20 11:02 ` Evgeniy Polyakov
@ 2006-01-21  0:45 ` Lee Revell
  1 sibling, 0 replies; 25+ messages in thread
From: Lee Revell @ 2006-01-21  0:45 UTC (permalink / raw)
  To: kus Kusche Klaus
  Cc: Evgeniy Polyakov, John Ronciak, Adrian Bunk, linux-kernel,
	john.ronciak, ganesh.venkatesan, jesse.brandeburg, netdev,
	Steven Rostedt

On Fri, 2006-01-20 at 11:19 +0100, kus Kusche Klaus wrote:
> For a non-full preemption kernel, your patch moves the 500 us 
> piece of code from kernel to thread context, so it really 
> improves things. But is 500 us something to worry about in a
> non-full preemption kernel? 

Yes, absolutely.  Once exit_mmap (a latency regression which was
introduced in 2.6.14) and rt_run_flush/rt_garbage_collect (which have
always been problematic) are fixed, 500usecs will stick out like a sore
thumb even on a regular PREEMPT kernel.

Also, you should be able to capture this latency in /proc/latency trace
by configuring an -rt kernel with PREEMPT_DESKTOP and hard/softirq
preemption disabled.

Lee


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: My vote against eepro* removal
  2006-01-20  9:55 ` Evgeniy Polyakov
@ 2006-01-21  0:40   ` Lee Revell
  2006-01-21  1:19     ` John Ronciak
  0 siblings, 1 reply; 25+ messages in thread
From: Lee Revell @ 2006-01-21  0:40 UTC (permalink / raw)
  To: Evgeniy Polyakov
  Cc: kus Kusche Klaus, John Ronciak, Adrian Bunk, linux-kernel,
	john.ronciak, ganesh.venkatesan, jesse.brandeburg, netdev,
	Steven Rostedt

On Fri, 2006-01-20 at 12:55 +0300, Evgeniy Polyakov wrote:
> > Analysis of e100:
> > * If I comment out the whole body of e100_watchdog except for the
> >   timer re-registration, the delays are gone (so it is really the
> >   body of e100_watchdog). However, this makes eth0 non-functional.
> > * Commenting out parts of it, I found out that most of the time
> >   goes into its first half: The code from mii_ethtool_gset to
> >   mii_check_link (including) makes the big difference, as far as
> >   I can tell especially mii_ethtool_gset.
> 
> Each MDIO read can take upto 2 msecs (!) and at least 20 usecs in
> e100,
> and this runs in timer handler.
> Concider attaching (only compile tested) patch which moves e100
> watchdog
> into workqueue. 

Seems like the important question is, why does e100 need a watchdog if
eepro100 works fine without one?  Isn't the point of a watchdog in this
context to work around other bugs in the driver (or the hardware)?

Lee




^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: My vote against eepro* removal
@ 2006-01-20 11:27 kus Kusche Klaus
  0 siblings, 0 replies; 25+ messages in thread
From: kus Kusche Klaus @ 2006-01-20 11:27 UTC (permalink / raw)
  To: Evgeniy Polyakov
  Cc: John Ronciak, Adrian Bunk, Lee Revell, linux-kernel,
	john.ronciak, ganesh.venkatesan, jesse.brandeburg, netdev,
	Steven Rostedt

From: Evgeniy Polyakov
> Just a hack:
> 
> --- drivers/net/e100.c.1	2006-01-20 13:39:19.000000000 +0300
> +++ drivers/net/e100.c	2006-01-20 14:15:40.000000000 +0300
> @@ -879,8 +879,8 @@
>  
>  	writel((reg << 16) | (addr << 21) | dir | data, 
> &nic->csr->mdi_ctrl);
>  
> -	for(i = 0; i < 100; i++) {
> -		udelay(20);
> +	for(i = 0; i < 1000; i++) {
> +		udelay(2);
>  		if((data_out = readl(&nic->csr->mdi_ctrl)) & mdi_ready)
>  			break;
>  	}

My test environment and software is not precise enough for small 
improvements, but I'd say this results in a 10-15 % improvement
(i.e. something like 50 us shorter delay) on the average.

To be sure, one would have to take and print tsc timestamps directly
in the watchdog code, but printk's mess up my timings.

-- 
Klaus Kusche                 (Software Development - Control Systems)
KEBA AG             Gewerbepark Urfahr, A-4041 Linz, Austria (Europe)
Tel: +43 / 732 / 7090-3120                 Fax: +43 / 732 / 7090-6301
E-Mail: kus@keba.com                                WWW: www.keba.com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: My vote against eepro* removal
  2006-01-20 10:51 kus Kusche Klaus
@ 2006-01-20 11:05 ` Evgeniy Polyakov
  0 siblings, 0 replies; 25+ messages in thread
From: Evgeniy Polyakov @ 2006-01-20 11:05 UTC (permalink / raw)
  To: kus Kusche Klaus
  Cc: John Ronciak, Adrian Bunk, Lee Revell, linux-kernel,
	john.ronciak, ganesh.venkatesan, jesse.brandeburg, netdev,
	Steven Rostedt

On Fri, Jan 20, 2006 at 11:51:23AM +0100, kus Kusche Klaus (kus@keba.com) wrote:
> From: Evgeniy Polyakov [mailto:johnpol@2ka.mipt.ru] 
> > Each MDIO read can take upto 2 msecs (!) and at least 20 
> > usecs in e100,
> > and this runs in timer handler.
> > Concider attaching (only compile tested) patch which moves 
> > e100 watchdog
> > into workqueue.
> 
> Tested the patch. Works and has the expected effects:
> 
> Fully preemptible kernel: 
> No change: 500 us delay at rtprio 1, no delay at higher rtprio.
> (you just moved the 500 us piece of code from one rtprio 1 kernel 
> thread to another rtprio 1 kernel thread).
> 
> Kernel with desktop preemption:
> Originally: Threads at any rtprio suffered from 500 us delay.
> With your patch: Only rtprio 1 threads suffer from 500 us delay,
> no delay at higher rtprio.

Just a hack:

--- drivers/net/e100.c.1	2006-01-20 13:39:19.000000000 +0300
+++ drivers/net/e100.c	2006-01-20 14:15:40.000000000 +0300
@@ -879,8 +879,8 @@
 
 	writel((reg << 16) | (addr << 21) | dir | data, &nic->csr->mdi_ctrl);
 
-	for(i = 0; i < 100; i++) {
-		udelay(20);
+	for(i = 0; i < 1000; i++) {
+		udelay(2);
 		if((data_out = readl(&nic->csr->mdi_ctrl)) & mdi_ready)
 			break;
 	}


> -- 
> Klaus Kusche                 (Software Development - Control Systems)
> KEBA AG             Gewerbepark Urfahr, A-4041 Linz, Austria (Europe)
> Tel: +43 / 732 / 7090-3120                 Fax: +43 / 732 / 7090-6301
> E-Mail: kus@keba.com                                WWW: www.keba.com
>  

-- 
	Evgeniy Polyakov

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: My vote against eepro* removal
  2006-01-20 10:19 kus Kusche Klaus
@ 2006-01-20 11:02 ` Evgeniy Polyakov
  2006-01-21  0:45 ` Lee Revell
  1 sibling, 0 replies; 25+ messages in thread
From: Evgeniy Polyakov @ 2006-01-20 11:02 UTC (permalink / raw)
  To: kus Kusche Klaus
  Cc: John Ronciak, Adrian Bunk, Lee Revell, linux-kernel,
	john.ronciak, ganesh.venkatesan, jesse.brandeburg, netdev,
	Steven Rostedt

On Fri, Jan 20, 2006 at 11:19:20AM +0100, kus Kusche Klaus (kus@keba.com) wrote:
> From: Evgeniy Polyakov
> > Each MDIO read can take upto 2 msecs (!) and at least 20 
> > usecs in e100,
> > and this runs in timer handler.
> > Concider attaching (only compile tested) patch which moves 
> > e100 watchdog
> > into workqueue.
> 
> Hmmm, I don't think moving it around is worth the trouble
> (nevertheless, I will test later if I find time).
> 
> For a full preemption kernel, both timer code and workqueue code
> are executed in a thread of their own. If I know that there is a
> 500 us piece of code in either of them, I have to adjust the prio
> of the corresponding thread (and all others) accordingly anyway.
> 
> For a non-full preemption kernel, your patch moves the 500 us 
> piece of code from kernel to thread context, so it really 
> improves things. But is 500 us something to worry about in a
> non-full preemption kernel?

In the worst case it will be milliseconds, which is not very nice
timeout for timer handler...

P.S. above patch works without problems with my e100 card.

> -- 
> Klaus Kusche                 (Software Development - Control Systems)
> KEBA AG             Gewerbepark Urfahr, A-4041 Linz, Austria (Europe)
> Tel: +43 / 732 / 7090-3120                 Fax: +43 / 732 / 7090-6301
> E-Mail: kus@keba.com                                WWW: www.keba.com

-- 
	Evgeniy Polyakov

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: My vote against eepro* removal
@ 2006-01-20 10:51 kus Kusche Klaus
  2006-01-20 11:05 ` Evgeniy Polyakov
  0 siblings, 1 reply; 25+ messages in thread
From: kus Kusche Klaus @ 2006-01-20 10:51 UTC (permalink / raw)
  To: Evgeniy Polyakov
  Cc: John Ronciak, Adrian Bunk, Lee Revell, linux-kernel,
	john.ronciak, ganesh.venkatesan, jesse.brandeburg, netdev,
	Steven Rostedt

From: Evgeniy Polyakov [mailto:johnpol@2ka.mipt.ru] 
> Each MDIO read can take upto 2 msecs (!) and at least 20 
> usecs in e100,
> and this runs in timer handler.
> Concider attaching (only compile tested) patch which moves 
> e100 watchdog
> into workqueue.

Tested the patch. Works and has the expected effects:

Fully preemptible kernel: 
No change: 500 us delay at rtprio 1, no delay at higher rtprio.
(you just moved the 500 us piece of code from one rtprio 1 kernel 
thread to another rtprio 1 kernel thread).

Kernel with desktop preemption:
Originally: Threads at any rtprio suffered from 500 us delay.
With your patch: Only rtprio 1 threads suffer from 500 us delay,
no delay at higher rtprio.

-- 
Klaus Kusche                 (Software Development - Control Systems)
KEBA AG             Gewerbepark Urfahr, A-4041 Linz, Austria (Europe)
Tel: +43 / 732 / 7090-3120                 Fax: +43 / 732 / 7090-6301
E-Mail: kus@keba.com                                WWW: www.keba.com
 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: My vote against eepro* removal
@ 2006-01-20 10:19 kus Kusche Klaus
  2006-01-20 11:02 ` Evgeniy Polyakov
  2006-01-21  0:45 ` Lee Revell
  0 siblings, 2 replies; 25+ messages in thread
From: kus Kusche Klaus @ 2006-01-20 10:19 UTC (permalink / raw)
  To: Evgeniy Polyakov
  Cc: John Ronciak, Adrian Bunk, Lee Revell, linux-kernel,
	john.ronciak, ganesh.venkatesan, jesse.brandeburg, netdev,
	Steven Rostedt

From: Evgeniy Polyakov
> Each MDIO read can take upto 2 msecs (!) and at least 20 
> usecs in e100,
> and this runs in timer handler.
> Concider attaching (only compile tested) patch which moves 
> e100 watchdog
> into workqueue.

Hmmm, I don't think moving it around is worth the trouble
(nevertheless, I will test later if I find time).

For a full preemption kernel, both timer code and workqueue code
are executed in a thread of their own. If I know that there is a
500 us piece of code in either of them, I have to adjust the prio
of the corresponding thread (and all others) accordingly anyway.

For a non-full preemption kernel, your patch moves the 500 us 
piece of code from kernel to thread context, so it really 
improves things. But is 500 us something to worry about in a
non-full preemption kernel?

-- 
Klaus Kusche                 (Software Development - Control Systems)
KEBA AG             Gewerbepark Urfahr, A-4041 Linz, Austria (Europe)
Tel: +43 / 732 / 7090-3120                 Fax: +43 / 732 / 7090-6301
E-Mail: kus@keba.com                                WWW: www.keba.com
 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: My vote against eepro* removal
  2006-01-20  9:37 kus Kusche Klaus
@ 2006-01-20  9:55 ` Evgeniy Polyakov
  2006-01-21  0:40   ` Lee Revell
  0 siblings, 1 reply; 25+ messages in thread
From: Evgeniy Polyakov @ 2006-01-20  9:55 UTC (permalink / raw)
  To: kus Kusche Klaus
  Cc: John Ronciak, Adrian Bunk, Lee Revell, linux-kernel,
	john.ronciak, ganesh.venkatesan, jesse.brandeburg, netdev,
	Steven Rostedt

On Fri, Jan 20, 2006 at 10:37:43AM +0100, kus Kusche Klaus (kus@keba.com) wrote:
> > From: John Ronciak
> > During the watchdog the e100 driver reads all of the status registers
> > from the actual hardware.  There are 26 (worst case) register reads. 
> > There is also a spin lock for another check in the watchdog.  It would
> > still surprise me that all of this would take 500 usec.  If you are
> > seeing this delay, you can comment out the scheduling of the watchdog
> > to see if this goes away.  We'll need to narrow down exactly what in
> > the watchdog is causing the delay
> 
> Retested it.

...

> Effect of the e100 driver:
> * There is no measurable effect when my test program is running
>   at prio 2 - 99.
> * For prio 1, I get an interval of 500-650 us every 2 seconds,
>   which indicates a scheduling latency of 380-530 us.
> Hence, some piece of code is running for ~500 us at rt prio 1.
> 
> Analysis of e100:
> * If I comment out the whole body of e100_watchdog except for the
>   timer re-registration, the delays are gone (so it is really the
>   body of e100_watchdog). However, this makes eth0 non-functional.
> * Commenting out parts of it, I found out that most of the time
>   goes into its first half: The code from mii_ethtool_gset to
>   mii_check_link (including) makes the big difference, as far as
>   I can tell especially mii_ethtool_gset.

Each MDIO read can take upto 2 msecs (!) and at least 20 usecs in e100,
and this runs in timer handler.
Concider attaching (only compile tested) patch which moves e100 watchdog
into workqueue.

Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>

diff --git a/drivers/net/e100.c b/drivers/net/e100.c
index 22cd045..e884578 100644
--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -549,7 +549,7 @@ struct nic {
 	enum phy phy;
 	struct params params;
 	struct net_device_stats net_stats;
-	struct timer_list watchdog;
+	struct work_struct watchdog;
 	struct timer_list blink_timer;
 	struct mii_if_info mii;
 	struct work_struct tx_timeout_task;
@@ -1495,7 +1495,7 @@ static void e100_adjust_adaptive_ifs(str
 	}
 }
 
-static void e100_watchdog(unsigned long data)
+static void e100_watchdog(void *data)
 {
 	struct nic *nic = (struct nic *)data;
 	struct ethtool_cmd cmd;
@@ -1539,7 +1539,7 @@ static void e100_watchdog(unsigned long 
 	else
 		nic->flags &= ~ich_10h_workaround;
 
-	mod_timer(&nic->watchdog, jiffies + E100_WATCHDOG_PERIOD);
+	schedule_delayed_work(&nic->watchdog, E100_WATCHDOG_PERIOD);
 }
 
 static inline void e100_xmit_prepare(struct nic *nic, struct cb *cb,
@@ -2004,7 +2004,7 @@ static int e100_up(struct nic *nic)
 		goto err_clean_cbs;
 	e100_set_multicast_list(nic->netdev);
 	e100_start_receiver(nic, NULL);
-	mod_timer(&nic->watchdog, jiffies);
+	schedule_work(&nic->watchdog);
 	if((err = request_irq(nic->pdev->irq, e100_intr, SA_SHIRQ,
 		nic->netdev->name, nic->netdev)))
 		goto err_no_irq;
@@ -2016,7 +2016,7 @@ static int e100_up(struct nic *nic)
 	return 0;
 
 err_no_irq:
-	del_timer_sync(&nic->watchdog);
+	cancel_rearming_delayed_work(&nic->watchdog);
 err_clean_cbs:
 	e100_clean_cbs(nic);
 err_rx_clean_list:
@@ -2031,7 +2031,7 @@ static void e100_down(struct nic *nic)
 	netif_stop_queue(nic->netdev);
 	e100_hw_reset(nic);
 	free_irq(nic->pdev->irq, nic->netdev);
-	del_timer_sync(&nic->watchdog);
+	flush_scheduled_work();
 	netif_carrier_off(nic->netdev);
 	e100_clean_cbs(nic);
 	e100_rx_clean_list(nic);
@@ -2570,9 +2570,7 @@ static int __devinit e100_probe(struct p
 
 	pci_set_master(pdev);
 
-	init_timer(&nic->watchdog);
-	nic->watchdog.function = e100_watchdog;
-	nic->watchdog.data = (unsigned long)nic;
+	INIT_WORK(&nic->watchdog, e100_watchdog, nic);
 	init_timer(&nic->blink_timer);
 	nic->blink_timer.function = e100_blink_led;
 	nic->blink_timer.data = (unsigned long)nic;

> -- 
> Klaus Kusche                 (Software Development - Control Systems)
> KEBA AG             Gewerbepark Urfahr, A-4041 Linz, Austria (Europe)
> Tel: +43 / 732 / 7090-3120                 Fax: +43 / 732 / 7090-6301
> E-Mail: kus@keba.com                                WWW: www.keba.com

-- 
	Evgeniy Polyakov

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* RE: My vote against eepro* removal
@ 2006-01-20  9:37 kus Kusche Klaus
  2006-01-20  9:55 ` Evgeniy Polyakov
  0 siblings, 1 reply; 25+ messages in thread
From: kus Kusche Klaus @ 2006-01-20  9:37 UTC (permalink / raw)
  To: John Ronciak, Adrian Bunk
  Cc: kus Kusche Klaus, Lee Revell, linux-kernel, john.ronciak,
	ganesh.venkatesan, jesse.brandeburg, netdev, Steven Rostedt

> From: John Ronciak
> During the watchdog the e100 driver reads all of the status registers
> from the actual hardware.  There are 26 (worst case) register reads. 
> There is also a spin lock for another check in the watchdog.  It would
> still surprise me that all of this would take 500 usec.  If you are
> seeing this delay, you can comment out the scheduling of the watchdog
> to see if this goes away.  We'll need to narrow down exactly what in
> the watchdog is causing the delay

Retested it.

Software:
2.6.15.1-rt6, no additional patches, configured for full preemption.

Hardware:
500 MHz Pentium 3
According to lspci: 82443+82371 Chipset (440 BX + PIIX4)
According to lspci: 82559ER (09) e100
100 Mbit full duplex

Measurement software:
* I start the rtc at 8192 Hz.
* I read from /dev/rtc in a usermode realtime process.
* After each read, I read the tsc and calculate the interval.
* It should be 122 us most of the time, and indeed it is.
  At rtprio 2-99, practically all measurements are < 160-175 us
  on an otherwise idle machine (160 for 99, 175 for 2).
  At rtprio 1, there are some intervals of up to 300 us on an 
  idle machine.
* The program records the intervals in internal arrays and prints
  them at the end, it doesn't perform any I/O while it is running.

Basically, my software measures how long it takes for a realtime
process at prio n to get executed after it became ready:
The actual interval minus 122 is the time the process was waiting
for the CPU (didn't get scheduled).
This means that there was continuous CPU activity for at least
that long, either at a higher or the same rt prio, on in
non-thread kernel code.
(the rtc threaded interrupt handler is running above all others)

Effect of the e100 driver:
* There is no measurable effect when my test program is running
  at prio 2 - 99.
* For prio 1, I get an interval of 500-650 us every 2 seconds,
  which indicates a scheduling latency of 380-530 us.
Hence, some piece of code is running for ~500 us at rt prio 1.

Analysis of e100:
* If I comment out the whole body of e100_watchdog except for the
  timer re-registration, the delays are gone (so it is really the
  body of e100_watchdog). However, this makes eth0 non-functional.
* Commenting out parts of it, I found out that most of the time
  goes into its first half: The code from mii_ethtool_gset to
  mii_check_link (including) makes the big difference, as far as
  I can tell especially mii_ethtool_gset.

Effect of desktop preemption:
With desktop preemption insted of full preemption, the 400-500 us
scheduling delay hits at any rt priority, even at 99, i.e. the
CPU is occupied by non-thread kernel code.
Again, commenting out e100_watchdog brings the delays back to
well below 400 us.

Latency traces:
None.
The latency tracer does not record this scheduling latency.
My max. traces are around 35 us with full preemption and
around 70 us with desktop preemption, and are not related to e100.

-- 
Klaus Kusche                 (Software Development - Control Systems)
KEBA AG             Gewerbepark Urfahr, A-4041 Linz, Austria (Europe)
Tel: +43 / 732 / 7090-3120                 Fax: +43 / 732 / 7090-6301
E-Mail: kus@keba.com                                WWW: www.keba.com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: My vote against eepro* removal
  2006-01-19 19:09 ` Steven Rostedt
@ 2006-01-19 22:57   ` Steven Rostedt
  0 siblings, 0 replies; 25+ messages in thread
From: Steven Rostedt @ 2006-01-19 22:57 UTC (permalink / raw)
  To: kus Kusche Klaus; +Cc: Lee Revell, linux-kernel

On Thu, 2006-01-19 at 14:09 -0500, Steven Rostedt wrote:
> On Thu, 2006-01-19 at 11:26 +0100, kus Kusche Klaus wrote:
> > > From: Lee Revell
> > > On Thu, 2006-01-19 at 08:19 +0100, kus Kusche Klaus wrote:
> > > > Last time I tested (around 2.6.12), eepro100 worked much better 
> > > > in -rt kernels w.r.t. latencies than e100:
> 
> Try the latest -rt kernel with e100 to see if it still is a delay.  You
> can also run in PREEMPT_DESKTOP so that the interrupt handlers are not
> threads and see if that shows up in the latency.

I just booted up 2.6.15-rt6 (PREEMPT_DESKTOP, regular soft and hard
irqs) on a 366 MHz UP machine with init=/bin/bash.  Loaded the e100
driver, setup the network. Then started to ping it from another box.  I
had a 80 usec latency, and that wasn't even from the network card.

So, e100 should not be a problem.  I did see the interrupts go off every
2 seconds too.

Check to see if you still get the latencies with e100 and the latest
kernel.

As Lee already said.  You notice something fishy __PLEASE__ report it.
Arjan's response was that this shows that we should only have one driver
for a certain task, otherwise people wont report a problem with one, if
the other satisfies their needs. And thus, the problem remains.

-- Steve


^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: My vote against eepro* removal
  2006-01-19 10:26 kus Kusche Klaus
  2006-01-19 16:20 ` Adrian Bunk
@ 2006-01-19 19:09 ` Steven Rostedt
  2006-01-19 22:57   ` Steven Rostedt
  1 sibling, 1 reply; 25+ messages in thread
From: Steven Rostedt @ 2006-01-19 19:09 UTC (permalink / raw)
  To: kus Kusche Klaus; +Cc: linux-kernel, Lee Revell

On Thu, 2006-01-19 at 11:26 +0100, kus Kusche Klaus wrote:
> > From: Lee Revell
> > On Thu, 2006-01-19 at 08:19 +0100, kus Kusche Klaus wrote:
> > > Last time I tested (around 2.6.12), eepro100 worked much better 
> > > in -rt kernels w.r.t. latencies than e100:

Try the latest -rt kernel with e100 to see if it still is a delay.  You
can also run in PREEMPT_DESKTOP so that the interrupt handlers are not
threads and see if that shows up in the latency.

-- Steve



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: My vote against eepro* removal
  2006-01-19 16:20 ` Adrian Bunk
@ 2006-01-19 17:16   ` John Ronciak
  0 siblings, 0 replies; 25+ messages in thread
From: John Ronciak @ 2006-01-19 17:16 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: kus Kusche Klaus, Lee Revell, linux-kernel, john.ronciak,
	ganesh.venkatesan, jesse.brandeburg, netdev

During the watchdog the e100 driver reads all of the status registers
from the actual hardware.  There are 26 (worst case) register reads. 
There is also a spin lock for another check in the watchdog.  It would
still surprise me that all of this would take 500 usec.  If you are
seeing this delay, you can comment out the scheduling of the watchdog
to see if this goes away.  We'll need to narrow down exactly what in
the watchdog is causing the delay

Thanks.

--
Cheers,
John

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: My vote against eepro* removal
  2006-01-19 10:26 kus Kusche Klaus
@ 2006-01-19 16:20 ` Adrian Bunk
  2006-01-19 17:16   ` John Ronciak
  2006-01-19 19:09 ` Steven Rostedt
  1 sibling, 1 reply; 25+ messages in thread
From: Adrian Bunk @ 2006-01-19 16:20 UTC (permalink / raw)
  To: kus Kusche Klaus
  Cc: Lee Revell, linux-kernel, john.ronciak, ganesh.venkatesan,
	jesse.brandeburg, netdev

On Thu, Jan 19, 2006 at 11:26:51AM +0100, kus Kusche Klaus wrote:
> > From: Lee Revell
> > On Thu, 2006-01-19 at 08:19 +0100, kus Kusche Klaus wrote:
> > > Last time I tested (around 2.6.12), eepro100 worked much better 
> > > in -rt kernels w.r.t. latencies than e100:
> > > 
> > > e100 caused a periodic latency of about 500 microseconds
> > > exactly every 2 seconds, no matter what the load on the interface
> > > was (i.e. even on an idle interface).
> > > 
> > > eepro100 did not show any latencies that long, it worked much
> > > smoother w.r.t. latencies.
> > > 
> > > Of course I would prefer to have e100 fixed over keeping eepro100
> > > around forever, but the last time I checked, it still wasn't fixed.
> > 
> > Please provide latency traces to illustrate the problematic code path.
> 
> It's not a "latency": As far as I can tell, interrupts or preemption
> are not disabled, the latency tracer doesn't show anything.
> 
> I just noticed that low-pri rt processes did not get scheduled for
> about 500 microseconds when e100 was active (even if the net was
> idle), and that there were no such breaks with eepro100.
> 
> I didn't analyze it in detail at that time, I believed that the e100
> interrupt handler thread was running every 2 seconds for 500 
> microseconds, because the interrupt count of eth0 incremented every
> 2 seconds, exactly when my rt processes paused.
> 
> This would be bad: That irq thread is at rt prio 47 on my system, 
> above many importent things.
> 
> However, I checked more closely now, and found out that only a small
> portion of the 500 microseconds is spent in the irq thread. Most of
> it is spent in the timer thread, at rt prio 1, so the whole thing
> is a much smaller problem than I originally believed.
> 
> Must be the function e100_watchdog.
>...

Is this with 2.6.12 or 2.6.16-rc1?

If it's the former, please check whether the problem is still presnt in 
the latter.

If it's the latter, I'm sure the e100 developers (Cc'ed) are interested 
in your problem.

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: My vote against eepro* removal
@ 2006-01-19 10:26 kus Kusche Klaus
  2006-01-19 16:20 ` Adrian Bunk
  2006-01-19 19:09 ` Steven Rostedt
  0 siblings, 2 replies; 25+ messages in thread
From: kus Kusche Klaus @ 2006-01-19 10:26 UTC (permalink / raw)
  To: Lee Revell; +Cc: linux-kernel

> From: Lee Revell
> On Thu, 2006-01-19 at 08:19 +0100, kus Kusche Klaus wrote:
> > Last time I tested (around 2.6.12), eepro100 worked much better 
> > in -rt kernels w.r.t. latencies than e100:
> > 
> > e100 caused a periodic latency of about 500 microseconds
> > exactly every 2 seconds, no matter what the load on the interface
> > was (i.e. even on an idle interface).
> > 
> > eepro100 did not show any latencies that long, it worked much
> > smoother w.r.t. latencies.
> > 
> > Of course I would prefer to have e100 fixed over keeping eepro100
> > around forever, but the last time I checked, it still wasn't fixed.
> 
> Please provide latency traces to illustrate the problematic code path.

It's not a "latency": As far as I can tell, interrupts or preemption
are not disabled, the latency tracer doesn't show anything.

I just noticed that low-pri rt processes did not get scheduled for
about 500 microseconds when e100 was active (even if the net was
idle), and that there were no such breaks with eepro100.

I didn't analyze it in detail at that time, I believed that the e100
interrupt handler thread was running every 2 seconds for 500 
microseconds, because the interrupt count of eth0 incremented every
2 seconds, exactly when my rt processes paused.

This would be bad: That irq thread is at rt prio 47 on my system, 
above many importent things.

However, I checked more closely now, and found out that only a small
portion of the 500 microseconds is spent in the irq thread. Most of
it is spent in the timer thread, at rt prio 1, so the whole thing
is a much smaller problem than I originally believed.

Must be the function e100_watchdog.

> It sounds like you have known about this issue for a while, were you
> waiting for it to fix itself?

See my other reply: I didn't notice that eepro100 is to be removed,
and as long as eepro100 was there, it was no problem for me.

-- 
Klaus Kusche                 (Software Development - Control Systems)
KEBA AG             Gewerbepark Urfahr, A-4041 Linz, Austria (Europe)
Tel: +43 / 732 / 7090-3120                 Fax: +43 / 732 / 7090-6301
E-Mail: kus@keba.com                                WWW: www.keba.com
 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: My vote against eepro* removal
@ 2006-01-19 10:08 kus Kusche Klaus
  0 siblings, 0 replies; 25+ messages in thread
From: kus Kusche Klaus @ 2006-01-19 10:08 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: linux-kernel

> From: Arjan van de Ven
> On Thu, 2006-01-19 at 08:19 +0100, kus Kusche Klaus wrote:
> > Last time I tested (around 2.6.12), eepro100 worked much better 
> > in -rt kernels w.r.t. latencies than e100:
> 
> no offence but this is EXACTLY the reason why having 2 drivers for the
> same hardware is bad. People (in general) will switch to the 
> 2nd driver
> if they hit some thing that is suboptimal, rather than 
> reporting or even
> fixing it. The result of that is that you end up with 2 drivers, each
> serving a portion of the users but both suboptimal in non-overlapping
> ways. Having one driver that's good enough for both groups is clearly
> superior to that....

You describe exactly what happened: I had a problem with e100, I tried
eepro100, I was happy with eepro100 (I didn't notice it was scheduled
for removal), I didn't care about e100 any more...

I also agree that things should not happen that way, but it was the
easy way.

-- 
Klaus Kusche                 (Software Development - Control Systems)
KEBA AG             Gewerbepark Urfahr, A-4041 Linz, Austria (Europe)
Tel: +43 / 732 / 7090-3120                 Fax: +43 / 732 / 7090-6301
E-Mail: kus@keba.com                                WWW: www.keba.com
 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: My vote against eepro* removal
  2006-01-19  7:19 kus Kusche Klaus
  2006-01-19  7:24 ` Lee Revell
@ 2006-01-19  7:42 ` Arjan van de Ven
  1 sibling, 0 replies; 25+ messages in thread
From: Arjan van de Ven @ 2006-01-19  7:42 UTC (permalink / raw)
  To: kus Kusche Klaus; +Cc: linux-kernel

On Thu, 2006-01-19 at 08:19 +0100, kus Kusche Klaus wrote:
> Last time I tested (around 2.6.12), eepro100 worked much better 
> in -rt kernels w.r.t. latencies than e100:

no offence but this is EXACTLY the reason why having 2 drivers for the
same hardware is bad. People (in general) will switch to the 2nd driver
if they hit some thing that is suboptimal, rather than reporting or even
fixing it. The result of that is that you end up with 2 drivers, each
serving a portion of the users but both suboptimal in non-overlapping
ways. Having one driver that's good enough for both groups is clearly
superior to that....



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: My vote against eepro* removal
  2006-01-19  7:19 kus Kusche Klaus
@ 2006-01-19  7:24 ` Lee Revell
  2006-01-19  7:42 ` Arjan van de Ven
  1 sibling, 0 replies; 25+ messages in thread
From: Lee Revell @ 2006-01-19  7:24 UTC (permalink / raw)
  To: kus Kusche Klaus; +Cc: linux-kernel

On Thu, 2006-01-19 at 08:19 +0100, kus Kusche Klaus wrote:
> Last time I tested (around 2.6.12), eepro100 worked much better 
> in -rt kernels w.r.t. latencies than e100:
> 
> e100 caused a periodic latency of about 500 microseconds
> exactly every 2 seconds, no matter what the load on the interface
> was (i.e. even on an idle interface).
> 
> eepro100 did not show any latencies that long, it worked much
> smoother w.r.t. latencies.
> 
> Of course I would prefer to have e100 fixed over keeping eepro100
> around forever, but the last time I checked, it still wasn't fixed.

Please provide latency traces to illustrate the problematic code path.

It sounds like you have known about this issue for a while, were you
waiting for it to fix itself?

Lee


^ permalink raw reply	[flat|nested] 25+ messages in thread

* My vote against eepro* removal
@ 2006-01-19  7:19 kus Kusche Klaus
  2006-01-19  7:24 ` Lee Revell
  2006-01-19  7:42 ` Arjan van de Ven
  0 siblings, 2 replies; 25+ messages in thread
From: kus Kusche Klaus @ 2006-01-19  7:19 UTC (permalink / raw)
  To: linux-kernel

Last time I tested (around 2.6.12), eepro100 worked much better 
in -rt kernels w.r.t. latencies than e100:

e100 caused a periodic latency of about 500 microseconds
exactly every 2 seconds, no matter what the load on the interface
was (i.e. even on an idle interface).

eepro100 did not show any latencies that long, it worked much
smoother w.r.t. latencies.

Of course I would prefer to have e100 fixed over keeping eepro100
around forever, but the last time I checked, it still wasn't fixed.

Klaus Kusche
Entwicklung Software - Steuerung
Software Development - Control

KEBA AG
A-4041 Linz
Gewerbepark Urfahr
Tel +43 / 732 / 7090-3120
Fax +43 / 732 / 7090-6301
E-Mail: kus@keba.com
www.keba.com



^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2006-01-24  7:38 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-01-23 11:01 My vote against eepro* removal kus Kusche Klaus
2006-01-23 20:23 ` Jesse Brandeburg
  -- strict thread matches above, loose matches on Subject: below --
2006-01-24  7:38 kus Kusche Klaus
2006-01-20 11:27 kus Kusche Klaus
2006-01-20 10:51 kus Kusche Klaus
2006-01-20 11:05 ` Evgeniy Polyakov
2006-01-20 10:19 kus Kusche Klaus
2006-01-20 11:02 ` Evgeniy Polyakov
2006-01-21  0:45 ` Lee Revell
2006-01-20  9:37 kus Kusche Klaus
2006-01-20  9:55 ` Evgeniy Polyakov
2006-01-21  0:40   ` Lee Revell
2006-01-21  1:19     ` John Ronciak
2006-01-21  1:30       ` Lee Revell
2006-01-21  2:01         ` John Ronciak
2006-01-21  3:56           ` Lee Revell
2006-01-19 10:26 kus Kusche Klaus
2006-01-19 16:20 ` Adrian Bunk
2006-01-19 17:16   ` John Ronciak
2006-01-19 19:09 ` Steven Rostedt
2006-01-19 22:57   ` Steven Rostedt
2006-01-19 10:08 kus Kusche Klaus
2006-01-19  7:19 kus Kusche Klaus
2006-01-19  7:24 ` Lee Revell
2006-01-19  7:42 ` Arjan van de Ven

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).