* Tweak Latency on Intel ATOM
@ 2010-02-09  7:41 Max Miller
  2010-02-10 22:38 ` Clark Williams
  0 siblings, 1 reply; 10+ messages in thread
From: Max Miller @ 2010-02-09  7:41 UTC (permalink / raw)
  To: linux-rt-users

Hello,

I am using the PREEMPT-RT patch on Linux 2.6.29.6. It runs on an MSI965GSE
industrial board with an Intel ATOM CPU (N270, 1.6 GHz) and an i945GSE Northbridge.

I get about 45µs maximum and 13µs average latency when Hyperthreading is
disabled. With Hyperthreading enabled, the maximum latency increases to about
100µs. I measured the latency with cyclictest.

What can I do to get a better maximum latency? Can I do something in the kernel
configuration, or are there kernel boot options? Or is it simply impossible
to get better results with this CPU?

Thanks in advance,
Max Miller 

--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Tweak Latency on Intel ATOM
  2010-02-09  7:41 Tweak Latency on Intel ATOM Max Miller
@ 2010-02-10 22:38 ` Clark Williams
  2010-02-11  7:54   ` Max Müller
  0 siblings, 1 reply; 10+ messages in thread
From: Clark Williams @ 2010-02-10 22:38 UTC (permalink / raw)
  To: Max Miller; +Cc: linux-rt-users

[-- Attachment #1: Type: text/plain, Size: 926 bytes --]

On Tue, 9 Feb 2010 07:41:58 +0000 (UTC)
Max Miller <mxmr@gmx.net> wrote:

> Hello,
> 
> I am using the PREEMPT-RT patch on Linux 2.6.29.6. It runs on an MSI965GSE
> industrial board with an Intel ATOM CPU (N270, 1.6 GHz) and an i945GSE Northbridge.
> 
> I get about 45µs maximum and 13µs average latency when Hyperthreading is
> disabled. With Hyperthreading enabled, the maximum latency increases to about
> 100µs. I measured the latency with cyclictest.
> 
> What can I do to get a better maximum latency? Can I do something in the kernel
> configuration, or are there kernel boot options? Or is it simply impossible
> to get better results with this CPU?
> 
> Thanks in advance,
> Max Miller 
> 
>

Make sure you turn off any power management settings in the BIOS and
turn off the irqbalance and cpuspeed services on the Linux side.
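A quick way to double-check both items from the shell (a sketch; the service names are Red Hat/Fedora-era assumptions, and the cpufreq sysfs path may simply be absent when the BIOS has frequency scaling disabled):

```shell
# Report whether the irqbalance and cpuspeed daemons are running, and
# which cpufreq governor (if any) is active on cpu0.
for svc in irqbalance cpuspeed; do
    if pgrep -x "$svc" > /dev/null 2>&1; then
        echo "$svc: running (consider 'service $svc stop')"
    else
        echo "$svc: not running"
    fi
done

gov=/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
if [ -r "$gov" ]; then
    echo "cpufreq governor: $(cat "$gov")"
else
    echo "cpufreq governor: not exposed (frequency scaling likely off)"
fi
```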

What cyclictest command are you using to measure latency?

Clark

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Tweak Latency on Intel ATOM
  2010-02-10 22:38 ` Clark Williams
@ 2010-02-11  7:54   ` Max Müller
  2010-02-11 15:34     ` Clark Williams
  2010-02-12 20:54     ` mapping of PCI memory to user space not working with uio.c ? Armin Steinhoff
  0 siblings, 2 replies; 10+ messages in thread
From: Max Müller @ 2010-02-11  7:54 UTC (permalink / raw)
  To: Clark Williams; +Cc: linux-rt-users

Clark Williams schrieb:
> On Tue, 9 Feb 2010 07:41:58 +0000 (UTC)
> Max Miller <mxmr@gmx.net> wrote:
>
>   
>> Hello,
>>
>> I am using the PREEMPT-RT patch on Linux 2.6.29.6. It runs on an MSI965GSE
>> industrial board with an Intel ATOM CPU (N270, 1.6 GHz) and an i945GSE Northbridge.
>>
>> I get about 45µs maximum and 13µs average latency when Hyperthreading is
>> disabled. With Hyperthreading enabled, the maximum latency increases to about
>> 100µs. I measured the latency with cyclictest.
>>
>> What can I do to get a better maximum latency? Can I do something in the kernel
>> configuration, or are there kernel boot options? Or is it simply impossible
>> to get better results with this CPU?
>>
>> Thanks in advance,
>> Max Miller 
>>
>>
>>     
>
> Make sure you turn off any power management settings in the BIOS and
> turn off the irqbalance and cpuspeed services on the Linux side.
>
> What cyclictest command are you using to measure latency?
>
> Clark
>   

I run cyclictest as follows:

cyclictest -n -t3 -p99

To generate additional system load I run (one to several instances of):

while true; do echo "blah" > /dev/null; done &

Then I watch the maximum latency of the thread with the highest priority.
Sometimes I add the parameter '-h' to generate a histogram. In this
histogram I can see that most latencies are under 20µs; only about 5 ppm
are worse than 30µs.
Am I doing this correctly?


The only power-saving setting in the BIOS is "Intel SpeedStep", which I
disabled.


I will test with the irqbalance and cpuspeed services disabled and will
report back later.


What would be an adequate maximum latency on this system?






^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Tweak Latency on Intel ATOM
  2010-02-11  7:54   ` Max Müller
@ 2010-02-11 15:34     ` Clark Williams
  2010-02-15  9:32       ` Max Müller
  2010-02-12 20:54     ` mapping of PCI memory to user space not working with uio.c ? Armin Steinhoff
  1 sibling, 1 reply; 10+ messages in thread
From: Clark Williams @ 2010-02-11 15:34 UTC (permalink / raw)
  To: Max Müller; +Cc: linux-rt-users

[-- Attachment #1: Type: text/plain, Size: 3489 bytes --]

On Thu, 11 Feb 2010 08:54:29 +0100
Max Müller <mxmr@gmx.net> wrote:

> Clark Williams schrieb:
> > On Tue, 9 Feb 2010 07:41:58 +0000 (UTC)
> > Max Miller <mxmr@gmx.net> wrote:
> >
> >   
> >> Hello,
> >>
> >> I am using the PREEMPT-RT patch on Linux 2.6.29.6. It runs on an MSI965GSE
> >> industrial board with an Intel ATOM CPU (N270, 1.6 GHz) and an i945GSE Northbridge.
> >>
> >> I get about 45µs maximum and 13µs average latency when Hyperthreading is
> >> disabled. With Hyperthreading enabled, the maximum latency increases to about
> >> 100µs. I measured the latency with cyclictest.
> >>
> >> What can I do to get a better maximum latency? Can I do something in the kernel
> >> configuration, or are there kernel boot options? Or is it simply impossible
> >> to get better results with this CPU?
> >>
> >> Thanks in advance,
> >> Max Miller 
> >>
> >>
> >>     
> >
> > Make sure you turn off any power management settings in the BIOS and
> > turn off the irqbalance and cpuspeed services on the Linux side.
> >
> > What cyclictest command are you using to measure latency?
> >
> > Clark
> >   
> 
> I run cyclictest as follows:
> 
> cyclictest -n -t3 -p99

You might want to try the new cyclictest option --smp (which is really
the combination of the options -t, -a, and -n), and I'd back the priority
down to -p95 just to stay out of the way of the watchdog and migration
threads. In general, when I run it on a multi-core box I use:

	$ cyclictest --smp -m -p95 -d0

Lately on AMD boxes I use --numa, which makes calls into libnuma to
allocate memory on local nodes for the measurement threads.

If you want to get fancy and look at the history of the run, you can
use the -h <n> option to keep a histogram with <n> buckets (1 bucket ==
1 microsecond).
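As a sketch of what that histogram output is good for (the bucket counts below are made-up illustrative numbers, not measurements from this thread): each line is "<bucket_us>: <count>", and a few lines of awk give the fraction of samples above a chosen threshold.

```shell
# Made-up sample of cyclictest histogram output, one bucket per line.
cat > hist.txt <<'EOF'
000010: 600000
000015: 300000
000020: 99995
000031: 3
000045: 2
EOF

# Count total samples and samples above a 30 us threshold, report in ppm.
awk -F': *' '{ total += $2; if ($1 + 0 > 30) over += $2 }
             END { printf "total=%d over30=%d ppm=%.1f\n",
                   total, over, over / total * 1e6 }' hist.txt
```

With these sample counts the script prints `total=1000000 over30=5 ppm=5.0`, i.e. the "5 ppm worse than 30µs" style of figure quoted earlier in the thread.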

> 
> To generate additional system load I run (one to several instances of):
> 
> while true; do echo "blah" > /dev/null; done &
> 
> Then I watch the maximum latency of the thread with the highest priority.
> Sometimes I add the parameter '-h' to generate a histogram. In this
> histogram I can see that most latencies are under 20µs; only about 5 ppm
> are worse than 30µs.
> Am I doing this correctly?

You're seeing some nice numbers there (any max latency under 100us is
pretty good). 

I have a python program I've been developing named 'rteval' which kicks
off a kernel compile and a scheduler benchmark called 'hackbench', then
runs cyclictest with the histogram option. After the run it generates a
report on how well cyclictest did with the loads in place. If you're
interested, you can get rteval from my kernel.org git repo:

$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/clrkwllms/rteval.git

It's not 100% complete, but it's getting there. 

> 
> 
> The only power-saving setting in the BIOS is "Intel SpeedStep", which I
> disabled.
> 
> 
> I will test with the irqbalance and cpuspeed services disabled and will
> report back later.
> 
> 
> What would be an adequate maximum latency on this system?
> 
> 

I'd say you're doing pretty good keeping under 50us. You might want to
try it under a heavier load than the shell script you've been running.
If you don't want to fool with rteval, try kicking off a kernel compile
in another window like this:

$ while true; do make -j4 clean bzImage modules; done

and then run cyclictest. A kernel compile with parallel jobs (-j) is a
good overall load of computation and I/O.
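When no kernel tree is at hand, a bounded stand-in for the load loops above can be scripted as follows (an assumption on my part: any sustained CPU plus pipe-I/O churn serves as a usable background load for a cyclictest run; worker count and duration are arbitrary):

```shell
# Spawn a few busy workers that echo into /dev/null, run them for about
# two seconds, then reap them all before reporting completion.
nworkers=2
for i in $(seq "$nworkers"); do
    (
        end=$(( $(date +%s) + 2 ))
        while [ "$(date +%s)" -lt "$end" ]; do
            echo "blah" > /dev/null
        done
    ) &
done
wait
echo "load finished"
```

Unlike the open-ended `while true` loop, this variant terminates on its own, which is convenient when scripting repeated measurement runs.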


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* mapping of PCI memory to  user space not working with uio.c ?
  2010-02-11  7:54   ` Max Müller
  2010-02-11 15:34     ` Clark Williams
@ 2010-02-12 20:54     ` Armin Steinhoff
  1 sibling, 0 replies; 10+ messages in thread
From: Armin Steinhoff @ 2010-02-12 20:54 UTC (permalink / raw)
  To: linux-rt-users

[-- Attachment #1: Type: text/plain, Size: 507 bytes --]


Hello,

I need some help.

I'm writing a user space driver for a PCI board.
The kernel part is working well ... all the info in sysfs is correct.

The mappings of the memory regions behind the BARs "work" ... but the
returned addresses are not valid.

Is it possible that the remap_pfn_range call used in uio.c does not work
with PCI memory?

I have attached the user-space mapping procedure and the kernel part.

The mapping of BAR[2] (it is kernel memory) is valid ...


Best Regards

--Armin

[-- Attachment #2: uio_ems.c --]
[-- Type: text/x-csrc, Size: 4119 bytes --]

/*
 * UIO driver for the CAN L2 PCI board
 *
 * (C) Armin Steinhoff <as@steinhoff-automation.com>
 * (C) 2007 Hans J. Koch <hjk@linutronix.de>
 * Original code (C) 2005 Benedikt Spranger <b.spranger@linutronix.de>
 *
 * Licensed under GPL version 2 only.
 *
 */

#include <linux/device.h>
#include <linux/module.h>
#include <linux/pci.h>
#include <linux/uio_driver.h>

#include <asm/io.h>

#define DEBUG 1

#define PCI_VENDORID 0x110A
#define PCI_DEVICEID 0x2104
#define INT_QUEUE_SIZE 64

static unsigned char IntIx, * IntQ;
static	void __iomem *ISR; 
static	void __iomem *ICR; 

static irqreturn_t CAN_handler(int irq, struct uio_info *dev_info)
{
	/* check the PITA ICR: is this our interrupt? */
	if (readl(ICR) & 0x02) {
		IntQ[IntIx] = readb(ISR);
		IntIx = (IntIx + 1) & 0xF;	/* modulo 16 */

		writel(0x02, ICR);	/* confirm the interrupt */
		return IRQ_HANDLED;
	}

	return IRQ_NONE;
}

static int __devinit ems_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
{
	struct uio_info *info;
	int err;
	
	info = kzalloc(sizeof(struct uio_info), GFP_KERNEL);
	if (!info)
		return -ENOMEM;

	err = pci_enable_device(dev);
	if (err) {
		dev_err(&dev->dev, "%s: pci_enable_device failed: %d\n", __func__, err);
		kfree(info);
		return err;
	}

	if (pci_request_regions(dev, "uio_ems"))
		goto out_disable;

	info->mem[0].addr = pci_resource_start(dev, 0);
	if (!info->mem[0].addr)
		goto out_release;	
	info->mem[0].size = pci_resource_len(dev, 0);
	info->mem[0].memtype = UIO_MEM_PHYS;
	
	info->mem[0].internal_addr = ioremap(info->mem[0].addr,info->mem[0].size);
	if (!info->mem[0].internal_addr)
		goto out_release;
	
	/* disable the interrupt at PITA level: clear bit 17 */
	writel(readl(info->mem[0].internal_addr) & ~0x20000,
	       info->mem[0].internal_addr);

	info->mem[1].addr = pci_resource_start(dev, 1);
	if (!info->mem[1].addr)
		goto out_unmap0;
	info->mem[1].size = pci_resource_len(dev, 1);
	info->mem[1].memtype = UIO_MEM_PHYS;

	info->mem[1].internal_addr = ioremap(info->mem[1].addr, info->mem[1].size);
	if (!info->mem[1].internal_addr)
		goto out_unmap0;

	/* interrupt queue */
	info->mem[2].addr = (unsigned long)kmalloc(INT_QUEUE_SIZE, GFP_KERNEL);
	if (!info->mem[2].addr)
		goto out_unmap1;
	IntQ = (unsigned char *)info->mem[2].addr;

	memset(IntQ, 0x00, INT_QUEUE_SIZE);
	IntIx = 0;

	info->mem[2].memtype = UIO_MEM_LOGICAL;
	info->mem[2].size = INT_QUEUE_SIZE;

	ISR = info->mem[1].internal_addr + 12 + 0x400;	/* interrupt status, channel 1 */
	ICR = info->mem[0].internal_addr;
	writel(0x02, ICR);	/* confirm any pending interrupt */

	info->name = "uio_ems";
	info->version = "0.0.1";
	info->irq = dev->irq;
	info->irq_flags |= IRQF_SHARED;
	info->handler = CAN_handler;

	if (uio_register_device(&dev->dev, info))
		goto out_kfree;

	pci_set_drvdata(dev, info);
	return 0;

out_kfree:
	kfree((void *)info->mem[2].addr);
out_unmap1:
	iounmap(info->mem[1].internal_addr);
out_unmap0:
	iounmap(info->mem[0].internal_addr);
out_release:
	pci_release_regions(dev);
out_disable:
	pci_disable_device(dev);
	kfree(info);
	printk(KERN_ERR "CAN_PCI: probe failed, -ENODEV\n");
	return -ENODEV;
}

static void ems_pci_remove(struct pci_dev *dev)
{
	struct uio_info *info = pci_get_drvdata(dev);

	uio_unregister_device(info);
	pci_release_regions(dev);
	pci_disable_device(dev);
	pci_set_drvdata(dev, NULL);
	iounmap(info->mem[0].internal_addr);
	iounmap(info->mem[1].internal_addr);
	kfree((void *)info->mem[2].addr);
	kfree (info);
}

static struct pci_device_id ems_pci_ids[] __devinitdata = {
	{
		PCI_VENDORID, PCI_DEVICEID, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0
	},
	{ 0, }
};

static struct pci_driver ems_pci_driver = {
	.name = "uio_ems",
	.id_table = ems_pci_ids,
	.probe    = ems_pci_probe,
	.remove   = ems_pci_remove,
};

static int __init ems_init_module(void)
{
	int ret;

	ret = pci_register_driver(&ems_pci_driver);

	return ret;
}

static void __exit ems_exit_module(void)
{
	pci_unregister_driver(&ems_pci_driver);
}

module_init(ems_init_module);
module_exit(ems_exit_module);

MODULE_DEVICE_TABLE(pci, ems_pci_ids);
MODULE_LICENSE("GPL v2");
MODULE_AUTHOR("A. Steinhoff");

[-- Attachment #3: user_space_mmap.c --]
[-- Type: text/x-csrc; name="user_space_mmap.c", Size: 1146 bytes --]

int do_mappings(void)
{
	int size_fd;
	int uio_size;

	size_fd = open(UIO_SIZE0, O_RDONLY);
	if (size_fd < 0 || uio_fd < 0) {
		fprintf(stderr, "Can't open UIO file 0...\n");
		return -1;
	}
	read(size_fd, uio_size_buf, sizeof(uio_size_buf));
	uio_size = (int)strtol(uio_size_buf, NULL, 0);
	/* mmap offset N * pagesize selects UIO map N */
	BAR[0] = (BYTE *)mmap(NULL, uio_size, PROT_READ | PROT_WRITE, MAP_SHARED, uio_fd, 0);
	if (BAR[0] == MAP_FAILED)
		perror("BAR0");
	close(size_fd);

	size_fd = open(UIO_SIZE1, O_RDONLY);
	if (size_fd < 0) {
		fprintf(stderr, "Can't open UIO file 1...\n");
		return -1;
	}
	read(size_fd, uio_size_buf, sizeof(uio_size_buf));
	uio_size = (int)strtol(uio_size_buf, NULL, 0);
	BAR[1] = (BYTE *)mmap(NULL, uio_size, PROT_READ | PROT_WRITE, MAP_SHARED, uio_fd, getpagesize());
	if (BAR[1] == MAP_FAILED)
		perror("BAR1");
	close(size_fd);

	size_fd = open(UIO_SIZE2, O_RDONLY);
	if (size_fd < 0) {
		fprintf(stderr, "Can't open UIO file 2...\n");
		return -1;
	}
	read(size_fd, uio_size_buf, sizeof(uio_size_buf));
	uio_size = (int)strtol(uio_size_buf, NULL, 0);
	BAR[2] = (BYTE *)mmap(NULL, uio_size, PROT_READ | PROT_WRITE, MAP_SHARED, uio_fd, 2 * getpagesize());
	if (BAR[2] == MAP_FAILED)
		perror("BAR2");
	close(size_fd);
	return 0;
}
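For cross-checking those mappings, the addresses and sizes the kernel side exported can be read back from sysfs before calling mmap(); a sketch (assuming the device registered as uio0; adjust the name), where maps/mapN corresponds to info->mem[N] in the driver and the mmap() offset N * pagesize selects map N:

```shell
# List each UIO map's physical address and size, or say so if no
# such device exists on this machine.
dev=/sys/class/uio/uio0
if [ -d "$dev/maps" ]; then
    for m in "$dev"/maps/map*; do
        printf '%s: addr=%s size=%s\n' "${m##*/}" \
            "$(cat "$m/addr")" "$(cat "$m/size")"
    done
else
    echo "no UIO device at $dev"
fi
```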

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Tweak Latency on Intel ATOM
  2010-02-11 15:34     ` Clark Williams
@ 2010-02-15  9:32       ` Max Müller
  2010-02-15 14:38         ` Clark Williams
  0 siblings, 1 reply; 10+ messages in thread
From: Max Müller @ 2010-02-15  9:32 UTC (permalink / raw)
  To: Clark Williams; +Cc: linux-rt-users

Clark Williams schrieb:
> On Thu, 11 Feb 2010 08:54:29 +0100
> Max Müller <mxmr@gmx.net> wrote:
>
>   
>> Clark Williams schrieb:
>>     
>>> On Tue, 9 Feb 2010 07:41:58 +0000 (UTC)
>>> Max Miller <mxmr@gmx.net> wrote:
>>>
>>>   
>>>       
>>>> Hello,
>>>>
>>>> I am using the PREEMPT-RT patch on Linux 2.6.29.6. It runs on an MSI965GSE
>>>> industrial board with an Intel ATOM CPU (N270, 1.6 GHz) and an i945GSE Northbridge.
>>>>
>>>> I get about 45µs maximum and 13µs average latency when Hyperthreading is
>>>> disabled. With Hyperthreading enabled, the maximum latency increases to about
>>>> 100µs. I measured the latency with cyclictest.
>>>>
>>>> What can I do to get a better maximum latency? Can I do something in the kernel
>>>> configuration, or are there kernel boot options? Or is it simply impossible
>>>> to get better results with this CPU?
>>>>
>>>> Thanks in advance,
>>>> Max Miller 
>>>>
>>>>
>>>>     
>>>>         
>>> Make sure you turn off any power management settings in the BIOS and
>>> turn off the irqbalance and cpuspeed services on the Linux side.
>>>
>>> What cyclictest command are you using to measure latency?
>>>
>>> Clark
>>>   
>>>       
>> I run cyclictest as follows:
>>
>> cyclictest -n -t3 -p99
>>     
>
> You might want to try the new cyclictest option --smp (which is really
> the combination of the options -t, -a, and -n), and I'd back the priority
> down to -p95 just to stay out of the way of the watchdog and migration
> threads. In general, when I run it on a multi-core box I use:
>
> 	$ cyclictest --smp -m -p95 -d0
>
> Lately on AMD boxes I use --numa, which makes calls into libnuma to
> allocate memory on local nodes for the measurement threads.
>
> If you want to get fancy and look at the history of the run, you can
> use the -h <n> option to keep a histogram with <n> buckets (1 bucket ==
> 1 microsecond).
>
>   
>> To generate additional system load I run (one to several instances of):
>>
>> while true; do echo "blah" > /dev/null; done &
>>
>> Then I watch the maximum latency of the thread with the highest priority.
>> Sometimes I add the parameter '-h' to generate a histogram. In this
>> histogram I can see that most latencies are under 20µs; only about 5 ppm
>> are worse than 30µs.
>> Am I doing this correctly?
>>     
>
> You're seeing some nice numbers there (any max latency under 100us is
> pretty good). 
>
> I have a python program I've been developing named 'rteval' which kicks
> off a kernel compile and a scheduler benchmark called 'hackbench', then
> runs cyclictest with the histogram option. After the run it generates a
> report on how well cyclictest did with the loads in place. If you're
> interested, you can get rteval from my kernel.org git repo:
>
> $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/clrkwllms/rteval.git
>
> It's not 100% complete, but it's getting there. 
>
>   
>> The only power-saving setting in the BIOS is "Intel SpeedStep", which I
>> disabled.
>>
>>
>> I will test with the irqbalance and cpuspeed services disabled and will
>> report back later.
>>
>>
>> What would be an adequate maximum latency on this system?
>>
>>
>>     
>
> I'd say you're doing pretty good keeping under 50us. You might want to
> try it under a heavier load than the shell script you've been running.
> If you don't want to fool with rteval, try kicking off a kernel compile
> in another window like this:
>
> $ while true; do make -j4 clean bzImage modules; done
>
> and then run cyclictest. A kernel compile with parallel jobs (-j) is a
> good overall load of computation and I/O.
>
>   

I have now tested, as you suggested, with the irqbalance and cpuspeed
services disabled. I hope I disabled irqbalance correctly; I used the
kernel boot parameter acpi_no_irqbalance. Is this correct? Unfortunately
the results were nearly the same as before.

For measuring latency I now did the following:
- compile a kernel (for high system load)
- run cyclictest -n -m -t3 -p94 (to have some high-priority threads
running)
- run cyclictest -n -m -h80 -p95 -l6000000 (for the latency measurement)

I will also test your Python program in the next few days.

I now see about 50µs worst-case latency and about 15µs average latency.

Greetings,
Max

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Tweak Latency on Intel ATOM
  2010-02-15  9:32       ` Max Müller
@ 2010-02-15 14:38         ` Clark Williams
  2010-02-16  6:50           ` Max Müller
  0 siblings, 1 reply; 10+ messages in thread
From: Clark Williams @ 2010-02-15 14:38 UTC (permalink / raw)
  To: Max Müller; +Cc: linux-rt-users

[-- Attachment #1: Type: text/plain, Size: 1049 bytes --]

On Mon, 15 Feb 2010 10:32:54 +0100
Max Müller <mxmr@gmx.net> wrote:
 
> >
> > I'd say you're doing pretty good keeping under 50us. You might want to
> > try it under a heavier load than the shell script you've been running.
> > If you don't want to fool with rteval, try kicking off a kernel compile
> > in another window like this:
> >
> > $ while true; do make -j4 clean bzImage modules; done
> >
> > and then run cyclictest. A kernel compile with parallel jobs (-j) is a
> > good overall load of computation and I/O.
> >
> >   
> 
> I have now tested, as you suggested, with the irqbalance and cpuspeed
> services disabled. I hope I disabled irqbalance correctly; I used the
> kernel boot parameter acpi_no_irqbalance. Is this correct? Unfortunately
> the results were nearly the same as before.

I don't think you're going to get much better results on the Atom. I
have an MSI Nettop box with the dual-core version and I saw about the
same results as you.

What sort of scheduling deadlines are you trying to meet? 

Clark

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Tweak Latency on Intel ATOM
  2010-02-15 14:38         ` Clark Williams
@ 2010-02-16  6:50           ` Max Müller
  2010-02-16 12:18             ` Luis Claudio R. Goncalves
  0 siblings, 1 reply; 10+ messages in thread
From: Max Müller @ 2010-02-16  6:50 UTC (permalink / raw)
  To: Clark Williams; +Cc: linux-rt-users

Clark Williams schrieb:
> On Mon, 15 Feb 2010 10:32:54 +0100
> Max Müller <mxmr@gmx.net> wrote:
>  
>   
>>> I'd say you're doing pretty good keeping under 50us. You might want to
>>> try it under a heavier load than the shell script you've been running.
>>> If you don't want to fool with rteval, try kicking off a kernel compile
>>> in another window like this:
>>>
>>> $ while true; do make -j4 clean bzImage modules; done
>>>
>>> and then run cyclictest. A kernel compile with parallel jobs (-j) is a
>>> good overall load of computation and I/O.
>>>
>>>   
>>>       
>> I have now tested, as you suggested, with the irqbalance and cpuspeed
>> services disabled. I hope I disabled irqbalance correctly; I used the
>> kernel boot parameter acpi_no_irqbalance. Is this correct? Unfortunately
>> the results were nearly the same as before.
>>     
>
> I don't think you're going to get much better results on the Atom. I
> have an MSI Nettop box with the dual-core version and I saw about the
> same results as you.
>
> What sort of scheduling deadlines are you trying to meet? 
>
> Clark
>   
The shorter the better :-)
I can also live with these results, but I wanted to make sure I am getting
the best out of this hardware.

Are you running both cores on the MSI box (maybe even with Hyperthreading
enabled) with these results?

In the meantime I also wondered whether SMIs (System Management Mode) could
have a bad influence. I wrote a little userspace program which disables the
global SMI bit of the ICH7 southbridge, but that brought no better results
either.
After that I was advised (thanks to Luis Claudio!) to check the latency with
the kernel module hwlat_detector. The result of this module was 0; I
interpret this to mean that no SMI is causing the latency on my ATOM
system.
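That hwlat_detector check can be scripted roughly like this (a sketch only: the module and its debugfs file names follow the -rt hwlat_detector patch of that era, may differ on other kernels, and reading them needs root):

```shell
# Sample hardware-induced gaps for a while and report the worst one
# observed; 0 suggests no SMI-sized stalls. Falls back gracefully when
# the module is not available on this kernel.
hwlat=/sys/kernel/debug/hwlat_detector
modprobe hwlat_detector 2>/dev/null || true
if [ -d "$hwlat" ]; then
    echo 1 > "$hwlat/enable"
    sleep 5
    echo "max gap (us): $(cat "$hwlat/max")"
    echo 0 > "$hwlat/enable"
else
    echo "hwlat_detector not available on this kernel"
fi
```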

Regards,
Max



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Tweak Latency on Intel ATOM
  2010-02-16  6:50           ` Max Müller
@ 2010-02-16 12:18             ` Luis Claudio R. Goncalves
  2010-02-17  6:26               ` Max Müller
  0 siblings, 1 reply; 10+ messages in thread
From: Luis Claudio R. Goncalves @ 2010-02-16 12:18 UTC (permalink / raw)
  To: Max Müller; +Cc: Clark Williams, linux-rt-users

On Tue, Feb 16, 2010 at 07:50:02AM +0100, Max Müller wrote:
| Clark Williams schrieb:
| >On Mon, 15 Feb 2010 10:32:54 +0100
| >Max Müller <mxmr@gmx.net> wrote:
| >>>I'd say you're doing pretty good keeping under 50us. You might want to
| >>>try it under a heavier load than the shell script you've been running.
| >>>If you don't want to fool with rteval, try kicking off a kernel compile
| >>>in another window like this:
| >>>
| >>>$ while true; do make -j4 clean bzImage modules; done
| >>>
| >>>and then run cyclictest. A kernel compile with parallel jobs (-j) is a
| >>>good overall load of computation and I/O.
| >>>
| >>I have now tested, as you suggested, with the irqbalance and
| >>cpuspeed services disabled. I hope I disabled irqbalance correctly;
| >>I used the kernel boot parameter acpi_no_irqbalance. Is this
| >>correct? Unfortunately the results were nearly the same as before.
| >
| >I don't think you're going to get much better results on the Atom. I
| >have an MSI Nettop box with the dual-core version and I saw about the
| >same results as you.
| >
| >What sort of scheduling deadlines are you trying to meet?
| >
| >Clark
| The shorter the better :-)
| I can also live with these results, but I wanted to make sure I am
| getting the best out of this hardware.
| 
| Are you running both cores on the MSI box (maybe even with
| Hyperthreading enabled) with these results?

Oops, I had overlooked that "more than one core" detail. In that case you
have two cyclictest threads (from different invocations) clashing at
priority FIFO:94. That could well create the latencies you see.

I would also suggest running the test without Hyperthreading (adjusting the
number of cyclictest threads), just to see whether the latencies are still
there in that case.
 
Luis

| In the meantime I also wondered whether SMIs (System Management Mode)
| could have a bad influence. I wrote a little userspace program which
| disables the global SMI bit of the ICH7 southbridge, but that brought
| no better results either.
| After that I was advised (thanks to Luis Claudio!) to check the latency
| with the kernel module hwlat_detector. The result of this module was 0;
| I interpret this to mean that no SMI is causing the latency on my
| ATOM system.
| 
| Regards,
| Max

-- 
[ Luis Claudio R. Goncalves             Red Hat  -  Realtime Team ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9  2696 7203 D980 A448 C8F8 ]


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Tweak Latency on Intel ATOM
  2010-02-16 12:18             ` Luis Claudio R. Goncalves
@ 2010-02-17  6:26               ` Max Müller
  0 siblings, 0 replies; 10+ messages in thread
From: Max Müller @ 2010-02-17  6:26 UTC (permalink / raw)
  To: Luis Claudio R. Goncalves; +Cc: Clark Williams, linux-rt-users

Luis Claudio R. Goncalves schrieb:
> On Tue, Feb 16, 2010 at 07:50:02AM +0100, Max Müller wrote:
> | Clark Williams schrieb:
> | >On Mon, 15 Feb 2010 10:32:54 +0100
> | >Max Müller <mxmr@gmx.net> wrote:
> | >>>I'd say you're doing pretty good keeping under 50us. You might want to
> | >>>try it under a heavier load than the shell script you've been running.
> | >>>If you don't want to fool with rteval, try kicking off a kernel compile
> | >>>in another window like this:
> | >>>
> | >>>$ while true; do make -j4 clean bzImage modules; done
> | >>>
> | >>>and then run cyclictest. A kernel compile with parallel jobs (-j) is a
> | >>>good overall load of computation and I/O.
> | >>>
> | >>I have now tested, as you suggested, with the irqbalance and
> | >>cpuspeed services disabled. I hope I disabled irqbalance correctly;
> | >>I used the kernel boot parameter acpi_no_irqbalance. Is this
> | >>correct? Unfortunately the results were nearly the same as before.
> | >
> | >I don't think you're going to get much better results on the Atom. I
> | >have an MSI Nettop box with the dual-core version and I saw about the
> | >same results as you.
> | >
> | >What sort of scheduling deadlines are you trying to meet?
> | >
> | >Clark
> | The shorter the better :-)
> | I can also live with these results, but I wanted to make sure I am
> | getting the best out of this hardware.
> | 
> | Are you running both cores on the MSI box (maybe even with
> | Hyperthreading enabled) with these results?
>
> Oops, I had overlooked that "more than one core" detail. In that case you
> have two cyclictest threads (from different invocations) clashing at
> priority FIFO:94. That could well create the latencies you see.
>
> I would also suggest running the test without Hyperthreading (adjusting the
> number of cyclictest threads), just to see whether the latencies are still
> there in that case.
>  
> Luis
>
> | In the meantime I also wondered whether SMIs (System Management Mode)
> | could have a bad influence. I wrote a little userspace program which
> | disables the global SMI bit of the ICH7 southbridge, but that brought
> | no better results either.
> | After that I was advised (thanks to Luis Claudio!) to check the latency
> | with the kernel module hwlat_detector. The result of this module was 0;
> | I interpret this to mean that no SMI is causing the latency on my
> | ATOM system.
> | 
> | Regards,
> | Max
>
>   
I have only the single-core ATOM; Clark has the dual-core box.
Hyperthreading must stay disabled, otherwise the maximum latency rises from
about 50µs (Hyperthreading disabled) to about 100µs (Hyperthreading
enabled).

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2010-02-17  6:26 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-02-09  7:41 Tweak Latency on Intel ATOM Max Miller
2010-02-10 22:38 ` Clark Williams
2010-02-11  7:54   ` Max Müller
2010-02-11 15:34     ` Clark Williams
2010-02-15  9:32       ` Max Müller
2010-02-15 14:38         ` Clark Williams
2010-02-16  6:50           ` Max Müller
2010-02-16 12:18             ` Luis Claudio R. Goncalves
2010-02-17  6:26               ` Max Müller
2010-02-12 20:54     ` mapping of PCI memory to user space not working with uio.c ? Armin Steinhoff
