* pipe and af/unix latency differences between aa and jam on smp
@ 2002-07-09  0:59 rwhron
  2002-07-09  1:11 ` J.A. Magallon
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: rwhron @ 2002-07-09  0:59 UTC (permalink / raw)
  To: andrea, jamagallon; +Cc: linux-kernel, lse-tech

The -jam patchset is interesting because it starts out
with the entire -aa patchset and adds a few things.

Sometimes small differences in LMbench between -jam and -aa are 
just CPU bounces on SMP.  The difference for pipe and af/unix latency
only appears on SMP too, but it is very consistent.  (My k6/2
has small differences between -aa and -jam for pipe and af/unix
latency).

You will know better what could make the difference:

These are the averages:

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
kernel              Pipe    AF/Unix
-----------------  -------  -------
2.4.19-pre10-aa4    33.941   70.216
2.4.19-pre10-jam2    7.877   16.699


These are the individual runs:

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
OS                              Pipe   AF/Unix
-----------------------------  -------  ------
Linux 2.4.19-pre10-aa4         33.999   73.024 
Linux 2.4.19-pre10-aa4         35.829   73.261 
Linux 2.4.19-pre10-aa4         16.710   74.830 
Linux 2.4.19-pre10-aa4         37.221   66.354 
Linux 2.4.19-pre10-aa4         36.259   68.433 
Linux 2.4.19-pre10-aa4         36.429   68.215 
Linux 2.4.19-pre10-aa4         35.379   77.147 
Linux 2.4.19-pre10-aa4         29.300   73.641 
Linux 2.4.19-pre10-aa4         35.798   64.875 
Linux 2.4.19-pre10-aa4         35.691   75.433 
Linux 2.4.19-pre10-aa4         35.372   73.398 
Linux 2.4.19-pre10-aa4         33.516   69.183 
Linux 2.4.19-pre10-aa4         34.986   69.254 
Linux 2.4.19-pre10-aa4         33.743   69.893 
Linux 2.4.19-pre10-aa4         32.679   71.900 
Linux 2.4.19-pre10-aa4         34.131   71.812 
Linux 2.4.19-pre10-aa4         33.444   72.454 
Linux 2.4.19-pre10-aa4         36.531   71.956 
Linux 2.4.19-pre10-aa4         37.838   69.731 
Linux 2.4.19-pre10-aa4         34.359   71.522 
Linux 2.4.19-pre10-aa4         33.286   71.609 
Linux 2.4.19-pre10-aa4         32.361   43.533 
Linux 2.4.19-pre10-aa4         31.716   74.131 
Linux 2.4.19-pre10-aa4         35.218   72.001 
Linux 2.4.19-pre10-aa4         36.709   67.795 

Linux 2.4.19-pre10-jam2        7.9977   14.495 
Linux 2.4.19-pre10-jam2        7.8406   14.044 
Linux 2.4.19-pre10-jam2        7.7899   14.006 
Linux 2.4.19-pre10-jam2        7.8584   13.819 
Linux 2.4.19-pre10-jam2        7.8379   14.453 
Linux 2.4.19-pre10-jam2        7.8781   14.156 
Linux 2.4.19-pre10-jam2        7.8881   14.238 
Linux 2.4.19-pre10-jam2        7.9833   14.168 
Linux 2.4.19-pre10-jam2        7.7772   78.765 
Linux 2.4.19-pre10-jam2        8.0816   13.703 
Linux 2.4.19-pre10-jam2        7.8605   14.042 
Linux 2.4.19-pre10-jam2        7.7982   13.883 
Linux 2.4.19-pre10-jam2        7.6362   14.286 
Linux 2.4.19-pre10-jam2        7.7480   13.989 
Linux 2.4.19-pre10-jam2        7.9262   13.947 
Linux 2.4.19-pre10-jam2        8.0904   14.014 
Linux 2.4.19-pre10-jam2        7.8480   14.310 
Linux 2.4.19-pre10-jam2        7.7982   14.171 
Linux 2.4.19-pre10-jam2        7.9776   14.234 
Linux 2.4.19-pre10-jam2        7.7931   14.125 
Linux 2.4.19-pre10-jam2        7.8553   14.110 
Linux 2.4.19-pre10-jam2        7.7294   14.285 
Linux 2.4.19-pre10-jam2        8.3361   14.131 
Linux 2.4.19-pre10-jam2        7.7797   14.039 
Linux 2.4.19-pre10-jam2        7.8265   14.043 
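As a rough cross-check of the averages, the jam2 AF/Unix column can be summarized with both the mean and the median. This is only a sketch: the values are copied from the individual runs listed above, and the variable names are illustrative. The single outlying run (78.765) pulls the mean up to about 16.7, while the median stays near 14.1.

```python
from statistics import mean, median

# AF/Unix latencies in microseconds for 2.4.19-pre10-jam2,
# copied from the individual runs listed above (25 runs).
jam2_af_unix = [
    14.495, 14.044, 14.006, 13.819, 14.453,
    14.156, 14.238, 14.168, 78.765, 13.703,
    14.042, 13.883, 14.286, 13.989, 13.947,
    14.014, 14.310, 14.171, 14.234, 14.125,
    14.110, 14.285, 14.131, 14.039, 14.043,
]

jam2_mean = mean(jam2_af_unix)      # sensitive to the one 78.765 run
jam2_median = median(jam2_af_unix)  # robust against a single outlier

print(f"runs   = {len(jam2_af_unix)}")
print(f"mean   = {jam2_mean:.3f} us")
print(f"median = {jam2_median:.3f} us")
```

The same approach applies to the aa4 columns, where one pipe run (16.710) and one AF/Unix run (43.533) sit well below the rest.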

For pipe and af/unix bandwidth, the difference appears to just be a
CPU bounce here and there.

jam patchsets are at:
http://giga.cps.unizar.es/~magallon/linux/

--
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html



* Re: pipe and af/unix latency differences between aa and jam on smp
  2002-07-09  0:59 pipe and af/unix latency differences between aa and jam on smp rwhron
@ 2002-07-09  1:11 ` J.A. Magallon
  2002-07-09  1:15   ` J.A. Magallon
  2002-07-09  1:25 ` J.A. Magallon
  2002-07-11 20:20 ` Bill Davidsen
  2 siblings, 1 reply; 11+ messages in thread
From: J.A. Magallon @ 2002-07-09  1:11 UTC (permalink / raw)
  To: rwhron; +Cc: andrea, jamagallon, linux-kernel, lse-tech


On 2002.07.09 rwhron@earthlink.net wrote:
>The -jam patchset is interesting because it starts out
>with the entire -aa patchset and adds a few things.
>
>Sometimes small differences in LMbench between -jam and -aa are 
>just CPU bounces on SMP.  The difference for pipe and af/unix latency
>only appears on SMP too, but it is very consistent.  (My k6/2
>has small differences between -aa and -jam for pipe and af/unix
>latency).
>
>You will know better what could make the difference:
>
>This is the averages:
>
>*Local* Communication latencies in microseconds - smaller is better
>-------------------------------------------------------------------
>kernel              Pipe    AF/Unix
>-----------------  -------  -------
>2.4.19-pre10-aa4    33.941   70.216
>2.4.19-pre10-jam2    7.877   16.699
>

Candidates in pre10-jam2 could be:

11-irqbalance-B1.bz2
12-smptimers-A0.bz2
13-irqrate-A1.bz2

excluding anything that has nothing to do with pipes or latency.

Could you try the latest -rc1-aa2? It also includes irqbalance, so that would
be one variable less in the equation.
I dropped smptimers and irqrate because they did not mix very well with
bproc and the O(1) scheduler, but I can try to add them again.

I have an rc1-jam2 ready, but the only important change wrt SMP could be the
P3/P4-specific memory barrier implementation, and your box is an AMD.

??

-- 
J.A. Magallon             \   Software is like sex: It's better when it's free
mailto:jamagallon@able.es  \                    -- Linus Torvalds, FSF T-shirt
Linux werewolf 2.4.19-rc1-jam2, Mandrake Linux 8.3 (Cooker) for i586
gcc (GCC) 3.1.1 (Mandrake Linux 8.3 3.1.1-0.7mdk)


* Re: pipe and af/unix latency differences between aa and jam on smp
  2002-07-09  1:11 ` J.A. Magallon
@ 2002-07-09  1:15   ` J.A. Magallon
  2002-07-09 10:19     ` Zwane Mwaikambo
  0 siblings, 1 reply; 11+ messages in thread
From: J.A. Magallon @ 2002-07-09  1:15 UTC (permalink / raw)
  To: J.A. Magallon; +Cc: rwhron, linux-kernel


On 2002.07.09 J.A. Magallon wrote:
>
>
>I have a rc1-jam2 ready, but the only important change wrt SMP could be the
>mem-barrier specific implementation for P3/P4, and your box is an AMD.
>

Oops, I just remembered: your tests are done on a Quad Xeon?

-- 
J.A. Magallon             \   Software is like sex: It's better when it's free
mailto:jamagallon@able.es  \                    -- Linus Torvalds, FSF T-shirt
Linux werewolf 2.4.19-rc1-jam2, Mandrake Linux 8.3 (Cooker) for i586
gcc (GCC) 3.1.1 (Mandrake Linux 8.3 3.1.1-0.7mdk)


* Re: pipe and af/unix latency differences between aa and jam on smp
  2002-07-09  0:59 pipe and af/unix latency differences between aa and jam on smp rwhron
  2002-07-09  1:11 ` J.A. Magallon
@ 2002-07-09  1:25 ` J.A. Magallon
  2002-07-11 20:20 ` Bill Davidsen
  2 siblings, 0 replies; 11+ messages in thread
From: J.A. Magallon @ 2002-07-09  1:25 UTC (permalink / raw)
  To: rwhron; +Cc: andrea, jamagallon, linux-kernel, lse-tech


On 2002.07.09 rwhron@earthlink.net wrote:
>The -jam patchset is interesting because it starts out
>with the entire -aa patchset and adds a few things.
>
>Sometimes small differences in LMbench between -jam and -aa are 
>just CPU bounces on SMP.  The difference for pipe and af/unix latency
>only appears on SMP too, but it is very consistent.  (My k6/2
>has small differences between -aa and -jam for pipe and af/unix
>latency).
>
>You will know better what could make the difference:
>
>This is the averages:
>
>*Local* Communication latencies in microseconds - smaller is better
>-------------------------------------------------------------------
>kernel              Pipe    AF/Unix
>-----------------  -------  -------
>2.4.19-pre10-aa4    33.941   70.216
>2.4.19-pre10-jam2    7.877   16.699
>

I took a look at your numbers:

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
kernel                          Pipe    AF/Unix    UDP    RPC/UDP    TCP    RPC/TCP  TCPconn
-----------------------------  -------  -------  -------  -------  -------  -------  -------
2.4.19-pre7-jam6                29.513   42.369  58.6165  60.7792  50.2572  82.4976   87.321
2.4.19-pre8-jam2                 7.697   15.274  59.6730  60.8190   55.276  82.1297   89.416
2.4.19-pre8-jam2-nowuos          7.739   14.929  57.9326  60.5497  55.9745  81.8908   90.370

(last line says that wake-up-sync is not responsible...)

The main changes between the first two were irqbalance and ide6->ide10.


-- 
J.A. Magallon             \   Software is like sex: It's better when it's free
mailto:jamagallon@able.es  \                    -- Linus Torvalds, FSF T-shirt
Linux werewolf 2.4.19-rc1-jam2, Mandrake Linux 8.3 (Cooker) for i586
gcc (GCC) 3.1.1 (Mandrake Linux 8.3 3.1.1-0.7mdk)


* Re: pipe and af/unix latency differences between aa and jam on smp
  2002-07-09  1:15   ` J.A. Magallon
@ 2002-07-09 10:19     ` Zwane Mwaikambo
  0 siblings, 0 replies; 11+ messages in thread
From: Zwane Mwaikambo @ 2002-07-09 10:19 UTC (permalink / raw)
  To: J.A. Magallon; +Cc: rwhron, linux-kernel

On Tue, 9 Jul 2002, J.A. Magallon wrote:

> Oops, I remembered your tests are done on a Quad Xeon ?

Out of interest, is that a P4/Xeon?

Cheers,
	Zwane Mwaikambo
-- 
function.linuxpower.ca



* Re: pipe and af/unix latency differences between aa and jam on smp
  2002-07-09  0:59 pipe and af/unix latency differences between aa and jam on smp rwhron
  2002-07-09  1:11 ` J.A. Magallon
  2002-07-09  1:25 ` J.A. Magallon
@ 2002-07-11 20:20 ` Bill Davidsen
  2 siblings, 0 replies; 11+ messages in thread
From: Bill Davidsen @ 2002-07-11 20:20 UTC (permalink / raw)
  To: rwhron; +Cc: Linux Kernel Mailing List, lse-tech

On Mon, 8 Jul 2002 rwhron@earthlink.net wrote:

> Sometimes small differences in LMbench between -jam and -aa are 
> just CPU bounces on SMP.  The difference for pipe and af/unix latency
> only appears on SMP too, but it is very consistent.  (My k6/2
> has small differences between -aa and -jam for pipe and af/unix
> latency).
> 
> You will know better what could make the difference:
> 
> This is the averages:
> 
> *Local* Communication latencies in microseconds - smaller is better
> -------------------------------------------------------------------
> kernel              Pipe    AF/Unix
> -----------------  -------  -------
> 2.4.19-pre10-aa4    33.941   70.216
> 2.4.19-pre10-jam2    7.877   16.699

Small differences? The only thing I would call small is the latency of the
jam kernel!

If (a) this is a real value which results in ~5x latency reduction in
non-benchmark applications, and (b) doesn't have some resulting penalty
(there are some free lunches in Linux), then it would be desirable.

I have an IPC test which measures the time for a datum to move from process A
to process B and back, using various methods. I'll try to build these
kernels and test them on my next free day. I'd love to test the latency of SysV
message queues as well, since these turn out to be good solutions to some
types of N:M client-server problems.
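A round-trip test of the kind described here can be sketched as follows. This is not the test mentioned above, only a minimal illustration under stated assumptions: the function names and the iteration count are hypothetical, one byte is bounced between a parent and a forked child, and the Python interpreter overhead means the results are not comparable with the LMbench figures in this thread.

```python
import os
import socket
import time


def round_trip_us(parent_r, parent_w, child_r, child_w, iterations=2000):
    """Bounce one byte parent -> child -> parent and return the mean
    round-trip time in microseconds (hypothetical helper)."""
    pid = os.fork()
    if pid == 0:
        # Child: echo every byte back, then exit without running parent code.
        try:
            for _ in range(iterations):
                os.write(child_w, os.read(child_r, 1))
        finally:
            os._exit(0)
    start = time.perf_counter()
    for _ in range(iterations):
        os.write(parent_w, b"x")
        os.read(parent_r, 1)
    elapsed = time.perf_counter() - start
    os.waitpid(pid, 0)
    return elapsed / iterations * 1e6


def pipe_latency_us(iterations=2000):
    to_child_r, to_child_w = os.pipe()    # parent -> child
    to_parent_r, to_parent_w = os.pipe()  # child -> parent
    try:
        return round_trip_us(to_parent_r, to_child_w,
                             to_child_r, to_parent_w, iterations)
    finally:
        for fd in (to_child_r, to_child_w, to_parent_r, to_parent_w):
            os.close(fd)


def unix_socket_latency_us(iterations=2000):
    parent_sock, child_sock = socket.socketpair(socket.AF_UNIX,
                                                socket.SOCK_STREAM)
    try:
        p, c = parent_sock.fileno(), child_sock.fileno()
        return round_trip_us(p, p, c, c, iterations)
    finally:
        parent_sock.close()
        child_sock.close()


if __name__ == "__main__":
    print(f"pipe    round trip: {pipe_latency_us():.1f} us")
    print(f"AF_UNIX round trip: {unix_socket_latency_us():.1f} us")
```

Running it several times on each kernel and comparing medians would reduce the effect of occasional outlier runs.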

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.



* Re: pipe and af/unix latency differences between aa and jam on smp
@ 2002-07-12  1:27 rwhron
  0 siblings, 0 replies; 11+ messages in thread
From: rwhron @ 2002-07-12  1:27 UTC (permalink / raw)
  To: andrea; +Cc: jamagallon, linux-kernel, lse-tech

> 2.4.19-pre10-jam2 is composed by plain 2.4.19pre10aa2 + a number
> of patches

Yes, that makes narrowing it down to a single patch straightforward.

> BTW, in your new set of benchmarks rc1aa1 still seems to be compiled in
> the unfair way that explains the slower I/O results, right? 

Yes.  2.4.19rc1aa1 did not have CONFIG_2GB or CONFIG_HIGHIO set, so
that was unfair.  2.4.19-pre10-jam[23] had 2GB and HIGHIO.  
2.4.19rc1aa2 is benching with 2GB and HIGHIO now.

> I don't have time to do benchmarks on this myself right now, but if
> somebody could try to apply the patches in jam2 with a binary search
> (I'd first suggest to backout irqrate, smptimers and irqbalance and see
> if it's still fast as I expect), that would be really interesting.

Thanks for picking out the most suspect patches.  In going through 
the patchlogs on the 4 different jam samples, I see:

irqrate and smptimers are in pre7jam6 (high latency) and pre8jam2
(low latency), so they may not be the key patch for pipe/unix 
latency.

irqbalance is in pre8jam2 and pre10jam2, which both had low latency.
irqbalance is not in pre7jam6 and pre10jam3 which had higher latency.
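The presence/absence reasoning above can be written as a small set computation. This is a sketch only: it is restricted to the three suspect patches named in this thread, and pre10-jam3 is left out because the thread only states that it lacks irqbalance, not which of the other two patches it carries.

```python
# Suspect patches per kernel, limited to the three patches named in the thread.
# pre10-jam3 (slow) is omitted: only its lack of irqbalance is stated.
fast_kernels = {
    "pre8-jam2": {"irqbalance", "irqrate", "smptimers"},
    "pre10-jam2": {"irqbalance", "irqrate", "smptimers"},
}
slow_kernels = {
    "pre7-jam6": {"irqrate", "smptimers"},
}

# A patch stays a candidate if every fast kernel has it and no slow kernel does.
in_every_fast = set.intersection(*fast_kernels.values())
in_any_slow = set.union(*slow_kernels.values())
candidates = in_every_fast - in_any_slow

print(sorted(candidates))  # ['irqbalance']
```

This only shows which patch correlates with the low latency across the samples; it does not capture other differences such as compiler flags or config options.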

After 2.4.19rc1aa2 completes, I'll run the latency tests on pre10-jam2 
and back out patches until the difference appears.  Can't take but a
few pleasant hours, and the weekend is coming.  :)

-- 
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html



* Re: pipe and af/unix latency differences between aa and jam on smp
  2002-07-11  9:02 rwhron
@ 2002-07-11  9:21 ` Andrea Arcangeli
  0 siblings, 0 replies; 11+ messages in thread
From: Andrea Arcangeli @ 2002-07-11  9:21 UTC (permalink / raw)
  To: rwhron; +Cc: jamagallon, linux-kernel, lse-tech

On Thu, Jul 11, 2002 at 05:02:14AM -0400, rwhron@earthlink.net wrote:
> > both pipe and afunix should not generate any irq load (other than
> > the IPI with the reschedule_task wakeups at least, but they're only
> > dependent on the scheduler
> 
> there are some scheduler bits in irqbalance for cpu affinity.
> irqbalance is in the two jam patchsets with low latency, and not
> in the patchsets with higher latency.  

I don't see those scheduler bits, it only exports the idle task info so
we know if a cpu is idle from irq.

Anyway, 2.4.19-pre10-jam2 is composed of plain 2.4.19pre10aa2 + a number
of patches (including irqbalance, irqrate, smptimers; btw smptimers
reintroduces a crashing deadlock bug exploitable from userspace that I
pushed into 2.4 mainline recently). So the difference has to be in the
patches added in pre10jam2, because pre10aa2 is slow and jam2 is fast.
From looking at the patches alone, it's not clear what can make the difference.

BTW, in your new set of benchmarks rc1aa1 still seems to be compiled in
the unfair way that explains the slower I/O results, right? I don't mind
of course, I just want to be sure.

I don't have time to do benchmarks on this myself right now, but if
somebody could try to apply the patches in jam2 with a binary search
(I'd first suggest backing out irqrate, smptimers and irqbalance to see
if it's still fast, as I expect), that would be really interesting.
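The binary search suggested here can be sketched as below. The assumptions are loud ones: the patches apply in a fixed order, a single patch is responsible, and `is_fast` is a hypothetical callback that would build and benchmark a kernel with the given prefix of patches. In the demo it is simulated, and the patch names other than the three mentioned in the thread are placeholders.

```python
def first_patch_that_makes_it_fast(patches, is_fast):
    """Return the patch whose addition first makes the kernel fast.

    Assumes is_fast([]) is False (plain -aa base is slow) and
    is_fast(patches) is True (the full jam patchset is fast).
    """
    lo, hi = 0, len(patches)  # invariant: prefix[:lo] slow, prefix[:hi] fast
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_fast(patches[:mid]):
            hi = mid
        else:
            lo = mid
    return patches[hi - 1]


if __name__ == "__main__":
    # Placeholder series; only the three irq/timer names come from the thread.
    series = [
        "01-hypothetical-a",
        "02-hypothetical-b",
        "11-irqbalance-B1",
        "12-smptimers-A0",
        "13-irqrate-A1",
        "20-hypothetical-c",
    ]
    # Simulated benchmark: pretend one (arbitrarily chosen) patch is the cause.
    pretend_cause = "02-hypothetical-b"
    builds = []

    def simulated_is_fast(applied):
        builds.append(len(applied))
        return pretend_cause in applied

    found = first_patch_that_makes_it_fast(series, simulated_is_fast)
    print(found, "found after", len(builds), "builds")
```

With N patches this needs about log2(N) kernel builds instead of N, which is why it is attractive when each build and benchmark run takes a while.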

Thanks,

Andrea


* Re: pipe and af/unix latency differences between aa and jam on smp
@ 2002-07-11  9:02 rwhron
  2002-07-11  9:21 ` Andrea Arcangeli
  0 siblings, 1 reply; 11+ messages in thread
From: rwhron @ 2002-07-11  9:02 UTC (permalink / raw)
  To: andrea, jamagallon; +Cc: linux-kernel, lse-tech

> both pipe and afunix should not generate any irq load (other than
> the IPI with the reschedule_task wakeups at least, but they're only
> dependent on the scheduler

there are some scheduler bits in irqbalance for cpu affinity.
irqbalance is in the two jam patchsets with low latency, and not
in the patchsets with higher latency.  

-- 
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html



* Re: pipe and af/unix latency differences between aa and jam on smp
  2002-07-09 14:05 rwhron
@ 2002-07-09 14:53 ` Andrea Arcangeli
  0 siblings, 0 replies; 11+ messages in thread
From: Andrea Arcangeli @ 2002-07-09 14:53 UTC (permalink / raw)
  To: rwhron; +Cc: zwane, jamagallon, linux-kernel, lse-tech

On Tue, Jul 09, 2002 at 10:05:58AM -0400, rwhron@earthlink.net wrote:
> > *Local* Communication latencies in microseconds - smaller is better
> 
> > kernel                          Pipe    AF/Unix
> > -----------------------------  -------  -------
> > 2.4.19-pre7-jam6                29.513   42.369
> > 2.4.19-pre8-jam2                 7.697   15.274
> > 2.4.19-pre8-jam2-nowuos          7.739   14.929
> 
> > (last line says that wake-up-sync is not responsible...)
> 
> > Main changes between first two were irqbalance and ide6->ide10.
> 
> The system is scsi only.  pre7-jam6 and pre8-jam2 .config's were 
> identical.
> 
> > Could you try latest -rc1-aa2 ? It includes also irqbalance,
> 
> Based on Andrea's diff logs, irqbalance appeared in 2.4.19pre10aa3.
> There are small differences between the pre10-jam2 and aa irqbalance
> patches.  One new datapoint with pre10-jam3:
> 
> *Local* Communication latencies in microseconds - smaller is better
> -------------------------------------------------------------------
> kernel                          Pipe    AF/Unix
> -----------------------------  -------  -------
> 2.4.19-pre10-jam2                7.877   16.699
> 2.4.19-pre10-jam3               33.133   66.825
> 2.4.19-pre10-aa2                34.208   62.732
> 2.4.19-pre10-aa4                33.941   70.216
> 2.4.19-rc1-aa1-1g-nio           34.989   52.704

Now if this was AF_INET via ethernet I could imagine irqbalance
making a difference (or even irqrate, although irqrate should make no
difference until your hardware hits the limit of irqs it can handle).

But both pipe and afunix should not generate any irq load (other than
the IPI with the reschedule_task wakeups at least, but they're only
dependent on the scheduler; IPI delivery isn't influenced by the
irqrate/irqbalance patches). It's all transmission in software internal
to the kernel, with no hardware events and so no irqs, so I would be very
surprised if irqbalance or irqrate could make any difference. I
would look elsewhere first at least.  No idea why you're looking at those
irq related patches for this workload.

At first glance I would say either it's a compiler issue that generates
some very inefficient code one way or the other (seems very unlikely, but
cache effects can be quite huge in tight loops where a very small part
of the kernel is exercised), or it has something to do with the scheduler or
similar core non-irq related areas.

> 
> A config difference between pre10-jam2 and pre10-jam3 is:
> CONFIG_X86_SFENCE=y	# pre10-jam2
> pre10-jam2 was compiled with -Os and pre10-jam3 with -O2.
> 
> > Out of interest, is that a P4/Xeon?
> 
> Quad P3/Xeon 700 MHz with 1MB cache.
> 
> -- 
> Randy Hron
> http://home.earthlink.net/~rwhron/kernel/bigbox.html


Andrea


* Re: pipe and af/unix latency differences between aa and jam on smp
@ 2002-07-09 14:05 rwhron
  2002-07-09 14:53 ` Andrea Arcangeli
  0 siblings, 1 reply; 11+ messages in thread
From: rwhron @ 2002-07-09 14:05 UTC (permalink / raw)
  To: zwane, jamagallon; +Cc: andrea, linux-kernel, lse-tech

> *Local* Communication latencies in microseconds - smaller is better

> kernel                          Pipe    AF/Unix
> -----------------------------  -------  -------
> 2.4.19-pre7-jam6                29.513   42.369
> 2.4.19-pre8-jam2                 7.697   15.274
> 2.4.19-pre8-jam2-nowuos          7.739   14.929

> (last line says that wake-up-sync is not responsible...)

> Main changes between first two were irqbalance and ide6->ide10.

The system is scsi only.  pre7-jam6 and pre8-jam2 .config's were 
identical.

> Could you try latest -rc1-aa2 ? It includes also irqbalance,

Based on Andrea's diff logs, irqbalance appeared in 2.4.19pre10aa3.
There are small differences between the pre10-jam2 and aa irqbalance
patches.  One new datapoint with pre10-jam3:

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
kernel                          Pipe    AF/Unix
-----------------------------  -------  -------
2.4.19-pre10-jam2                7.877   16.699
2.4.19-pre10-jam3               33.133   66.825
2.4.19-pre10-aa2                34.208   62.732
2.4.19-pre10-aa4                33.941   70.216
2.4.19-rc1-aa1-1g-nio           34.989   52.704

A config difference between pre10-jam2 and pre10-jam3 is:
CONFIG_X86_SFENCE=y	# pre10-jam2
pre10-jam2 was compiled with -Os and pre10-jam3 with -O2.

> Out of interest, is that a P4/Xeon?

Quad P3/Xeon 700 MHz with 1MB cache.

-- 
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html



