linux-kernel.vger.kernel.org archive mirror
* Linux 2.4 Scalability, Samba, and Netbench
@ 2001-05-09 16:29 Andrew M. Theurer
  2001-05-09 16:56 ` Mike Kravetz
  2001-05-10  1:23 ` Kenichi Okuyama
  0 siblings, 2 replies; 13+ messages in thread
From: Andrew M. Theurer @ 2001-05-09 16:29 UTC (permalink / raw)
  To: lse-tech, linux-kernel, samba-technical

Hello,

I am evaluating Linux 2.4 SMP scalability, using Netbench(r) as a
workload with Samba, and I wanted to get some feedback on results so
far.  I would appreciate comments and any suggestions for improving
scalability on this workload.

The environment consists of an Intel Profusion based SMP with 8 x 700
MHz Xeon, 1 MB L2, 14+ GB RAM, ServeRAID, 8 Intel ethernet cards (IBM
Netfinity 8500R).  There are 16 500 MHz PII, 128 MB clients running
Windows NT.  I tested uniprocessor, 2-way, and 4-way SMP
configurations.  Future plans include testing 8-way performance when
more test clients are available.  Netbench(r) 7.0.1 was used with the
enterprise disk suite test.  The test was modified to use 2 engines per
client, and the range of test clients was changed from 1-60 to 8-16
(for 2P & 4P) and 4-12 (for uniprocessor).

My initial results for Linux 2.4.0, ext2 are as follows:

	# Eng	[UP]	[2P]	[4P]	(Mbps)
	08	149	-	-
	12	199	-	-
	16	227	236	260
	20	193	272	317
	24	223	283	369
	28	-	285	396
	32	-	285	405

Same test, but with IRQ to processor affinity for 2P & 4P on the 8
ethernet cards:
	# Eng	[2P]	[4P]	(Mbps)
	16	231	259
	20	278	297
	24	293	320
	28	297	365
	32	299	399*
	*Still investigating; we saw some CPU idle time
	 on the 4P/32-engine run, but not in the same test
	 configuration without IRQ affinity.
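
The post does not say how the IRQ affinity was set.  On 2.4 kernels one
common approach is to write a hexadecimal CPU bitmask to
/proc/irq/<irq>/smp_affinity.  The sketch below assumes that interface
and spreads a list of NIC interrupts round-robin over the CPUs.  The
function names and the IRQ numbers are placeholders for illustration,
not the ones from the test machine.

```python
def affinity_plan(irqs, ncpus):
    """Map each IRQ to (proc path, hex CPU mask), round-robin over CPUs."""
    plan = {}
    for i, irq in enumerate(irqs):
        cpu = i % ncpus
        mask = 1 << cpu                      # one bit per CPU
        plan[irq] = (f"/proc/irq/{irq}/smp_affinity", f"{mask:x}")
    return plan


def apply_plan(plan):
    """Write the masks; needs root and a kernel exposing smp_affinity."""
    for path, mask in plan.values():
        with open(path, "w") as f:
            f.write(mask)


if __name__ == "__main__":
    # Hypothetical IRQ numbers for the 8 ethernet cards.
    nic_irqs = [16, 17, 18, 19, 20, 21, 22, 23]
    for irq, (path, mask) in affinity_plan(nic_irqs, ncpus=4).items():
        print(f"echo {mask} > {path}")
```

With 8 cards and 4 CPUs this assigns two cards to each CPU; whether that
is the best split for this workload is exactly the kind of thing the
4P/32 idle-time observation above would need to answer.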

And for Linux 2.4.3 with reiserfs:
	# Eng	[UP]	[2P]	[4P]	(Mbps)
	08	130	-	-
	12	190	-	-
	16	203	210	231
	20	190	235	279
	24	200	249	319
	28	-	239	360
	32	-	251	335

Same, but with IRQ affinity for 2P & 4P on the 8 ethernet cards:
	# Eng	[2P]	[4P]	(Mbps)
	16	224	236
	20	220	308
	24	252	331
	28	269	375
	32	267	382

 --All results in Mbps, using Netbench(r) 7.0.1 and Samba 2.0.7
 --Netbench(r) is available at http://www.netbench.com
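
To make the scaling easier to read, the throughput figures can be turned
into speedup and per-CPU efficiency relative to the uniprocessor run.
The sketch below is only an illustration: it uses the 2.4.0/ext2 numbers
without IRQ affinity, at the engine counts where all three
configurations were measured (16, 20, 24).  The names RESULTS and
scaling are made up for the example.

```python
# Throughput in Mbps from the Linux 2.4.0 / ext2 table (no IRQ affinity),
# restricted to engine counts measured on UP, 2P and 4P alike.
RESULTS = {
    16: {1: 227, 2: 236, 4: 260},
    20: {1: 193, 2: 272, 4: 317},
    24: {1: 223, 2: 283, 4: 369},
}


def scaling(row):
    """Return {cpus: (speedup, efficiency)} relative to the 1-CPU figure."""
    base = row[1]
    return {
        cpus: (mbps / base, mbps / base / cpus)
        for cpus, mbps in row.items()
    }


if __name__ == "__main__":
    for engines, row in sorted(RESULTS.items()):
        for cpus, (speedup, eff) in sorted(scaling(row).items()):
            print(f"{engines:2d} engines, {cpus}P: "
                  f"speedup {speedup:.2f}, efficiency {eff:.0%}")
```

At 24 engines this works out to roughly 1.27x for 2P and 1.65x for 4P.
Note that the UP figure dips at 20 engines (193 vs. 227 at 16), so
ratios taken at that point flatter the SMP runs somewhat.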

I would like to help improve SMP scalability on this workload.  If you
have questions or comments about the above results, or if you are
conducting similar tests, please send email to
lse-tech@lists.sourceforge.net.  I have some ideas on my next steps,
but would like to discuss first.

Regards,

Andrew Theurer


* Re: Linux 2.4 Scalability, Samba, and Netbench
  2001-05-09 16:29 Linux 2.4 Scalability, Samba, and Netbench Andrew M. Theurer
@ 2001-05-09 16:56 ` Mike Kravetz
  2001-05-09 17:30   ` Andrew M. Theurer
  2001-05-10  1:23 ` Kenichi Okuyama
  1 sibling, 1 reply; 13+ messages in thread
From: Mike Kravetz @ 2001-05-09 16:56 UTC (permalink / raw)
  To: Andrew M. Theurer; +Cc: lse-tech, linux-kernel, samba-technical

On Wed, May 09, 2001 at 11:29:22AM -0500, Andrew M. Theurer wrote:
> 
> I am evaluating Linux 2.4 SMP scalability, using Netbench(r) as a
> workload with Samba, and I wanted to get some feedback on results so
> far.

Do you have any kernel profile or lock contention data?

-- 
Mike Kravetz                                 mkravetz@sequent.com
IBM Linux Technology Center


* Re: Linux 2.4 Scalability, Samba, and Netbench
  2001-05-09 16:56 ` Mike Kravetz
@ 2001-05-09 17:30   ` Andrew M. Theurer
  2001-05-09 17:35     ` [Lse-tech] " Christoph Hellwig
                       ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Andrew M. Theurer @ 2001-05-09 17:30 UTC (permalink / raw)
  To: Mike Kravetz; +Cc: lse-tech, linux-kernel, samba-technical

I do have kernprof ACG and lockmeter for a 4P run.  We saw no
significant problems with lockmeter.  csum_partial_copy_generic was the
highest % in profile, at 4.34%.  I'll see if we can get some space on
http://lse.sourceforge.net to post the test data.

Andrew Theurer

Mike Kravetz wrote:
> 
> On Wed, May 09, 2001 at 11:29:22AM -0500, Andrew M. Theurer wrote:
> >
> > I am evaluating Linux 2.4 SMP scalability, using Netbench(r) as a
> > workload with Samba, and I wanted to get some feedback on results so
> > far.
> 
> Do you have any kernel profile or lock contention data?
> 
> --
> Mike Kravetz                                 mkravetz@sequent.com
> IBM Linux Technology Center


* Re: [Lse-tech] Re: Linux 2.4 Scalability, Samba, and Netbench
  2001-05-09 17:30   ` Andrew M. Theurer
@ 2001-05-09 17:35     ` Christoph Hellwig
  2001-05-09 17:39     ` Alan Cox
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2001-05-09 17:35 UTC (permalink / raw)
  To: Andrew M. Theurer; +Cc: Mike Kravetz, lse-tech, linux-kernel, samba-technical

On Wed, May 09, 2001 at 12:30:35PM -0500, Andrew M. Theurer wrote:
> I do have kernprof ACG and lockmeter for a 4P run.  We saw no
> significant problems with lockmeter.  csum_partial_copy_generic was the
> highest % in profile, at 4.34%.  I'll see if we can get some space on
> http://lse.sourceforge.net to post the test data.

Maybe you should try Kernel 2.4.4 (with Zerocopy TCP/IP) and Anton's
sendfile for samba patch.  A copy of the latter was posted to lkml - see
http://www.uwsg.indiana.edu/hypermail/linux/kernel/0101.3/0484.html,
even if that may be unusable due to HTML crappiness.

	Christoph

-- 
Of course it doesn't work. We've performed a software upgrade.


* Re: Linux 2.4 Scalability, Samba, and Netbench
  2001-05-09 17:30   ` Andrew M. Theurer
  2001-05-09 17:35     ` [Lse-tech] " Christoph Hellwig
@ 2001-05-09 17:39     ` Alan Cox
  2001-05-09 17:43       ` Andrew M. Theurer
  2001-05-09 23:35       ` Chris Evans
  2001-05-10  4:51     ` [Lse-tech] " Maneesh Soni
  2001-05-10  8:40     ` Dipankar Sarma
  3 siblings, 2 replies; 13+ messages in thread
From: Alan Cox @ 2001-05-09 17:39 UTC (permalink / raw)
  To: Andrew M. Theurer; +Cc: Mike Kravetz, lse-tech, linux-kernel, samba-technical

> significant problems with lockmeter.  csum_partial_copy_generic was the
> highest % in profile, at 4.34%.  I'll see if we can get some space on

Are you using Anton's optimisations to samba to use sendfile?

Alan



* Re: Linux 2.4 Scalability, Samba, and Netbench
  2001-05-09 17:39     ` Alan Cox
@ 2001-05-09 17:43       ` Andrew M. Theurer
  2001-05-09 23:35       ` Chris Evans
  1 sibling, 0 replies; 13+ messages in thread
From: Andrew M. Theurer @ 2001-05-09 17:43 UTC (permalink / raw)
  To: Alan Cox; +Cc: Mike Kravetz, lse-tech, linux-kernel, samba-technical

Alan Cox wrote:
>
> > significant problems with lockmeter.  csum_partial_copy_generic was the
> > highest % in profile, at 4.34%.  I'll see if we can get some space on
> 
> Are you using Anton's optimisations to samba to use sendfile?
> 
> Alan

Not yet.  As I understand it, we need a supported nic to take advantage
of the sendfile/zero copy patch.  Once we have the HW, we will use it.
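
For readers unfamiliar with the difference being discussed, here is a
minimal sketch contrasting the classic read-then-send path with
sendfile().  It is not Anton's Samba patch; the function names are
invented for the example, and it only shows the system-call pattern.
Whether the transmit is truly zero-copy still depends on the kernel and
on NIC/driver support, which is the hardware dependency mentioned above.

```python
import os
import socket


def send_with_copy(sock, path, bufsize=65536):
    """Classic path: read() into user space, then send() back to the kernel."""
    sent = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(bufsize)
            if not chunk:
                return sent
            sock.sendall(chunk)
            sent += len(chunk)


def send_with_sendfile(sock, path):
    """sendfile() path: the kernel feeds file data to the socket itself."""
    sent = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        while sent < size:
            n = os.sendfile(sock.fileno(), f.fileno(), sent, size - sent)
            if n == 0:          # file shrank underneath us
                break
            sent += n
    return sent
```

Both functions return the number of bytes sent on a connected, blocking
socket.  The second avoids pulling the file data through a user-space
buffer, which is the copy that shows up in profiles as copy-and-checksum
work on the send side.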

Thanks,

Andrew Theurer


* Re: Linux 2.4 Scalability, Samba, and Netbench
  2001-05-09 17:39     ` Alan Cox
  2001-05-09 17:43       ` Andrew M. Theurer
@ 2001-05-09 23:35       ` Chris Evans
  1 sibling, 0 replies; 13+ messages in thread
From: Chris Evans @ 2001-05-09 23:35 UTC (permalink / raw)
  To: Alan Cox
  Cc: Andrew M. Theurer, Mike Kravetz, lse-tech, linux-kernel, samba-technical


On Wed, 9 May 2001, Alan Cox wrote:

> > significant problems with lockmeter.  csum_partial_copy_generic was the
> > highest % in profile, at 4.34%.  I'll see if we can get some space on
>
> Are you using Anton's optimisations to samba to use sendfile?

And you might like to try 2.4.4 (I saw 2.4.0 and 2.4.3 mentioned). 2.4.4
has the zerocopy TCP stuff (or was it 2.4.3 :)

Also, if the load is not disk limited, you might like to try Mingo's
pagecache/timers scalability patches, etc.

Cheers
Chris



* Re: Linux 2.4 Scalability, Samba, and Netbench
  2001-05-09 16:29 Linux 2.4 Scalability, Samba, and Netbench Andrew M. Theurer
  2001-05-09 16:56 ` Mike Kravetz
@ 2001-05-10  1:23 ` Kenichi Okuyama
  1 sibling, 0 replies; 13+ messages in thread
From: Kenichi Okuyama @ 2001-05-10  1:23 UTC (permalink / raw)
  To: atheurer; +Cc: lse-tech, linux-kernel, samba-technical

>>>>> "AMT" == Andrew M Theurer <atheurer@austin.ibm.com> writes:
AMT> I would like to help improve SMP scalability on this workload.  If you
AMT> have questions or comments about the above results, or if you are
AMT> conducting similar tests, please send email to
AMT> lse-tech@lists.sourceforge.net.  I have some ideas on my next steps,
AMT> but would like to discuss first.


Did you check the vmstat results for each benchmark run?

Most of the problems are caused by the kernel.  If you look at the
vmstat output, more than 80% of the CPU time is spent in the kernel.
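
One way to check this claim against a captured vmstat log is to average
the "sy" (system) CPU column.  The sketch below assumes the usual vmstat
layout, where a line of column names containing "us" and "sy" precedes
the numeric samples; the function name is made up for the example.

```python
def kernel_cpu_share(vmstat_text):
    """Average the 'sy' (system) CPU column over all vmstat samples."""
    header = None
    sy_values = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "sy" in fields and "us" in fields:
            header = fields                  # column-name line
            continue
        if header and len(fields) == len(header) and fields[0].isdigit():
            sy_values.append(int(fields[header.index("sy")]))
    if not sy_values:
        raise ValueError("no vmstat samples found")
    return sum(sy_values) / len(sy_values)
```

Keep in mind that the first sample vmstat prints is an average since
boot, so it is usually worth dropping that line before feeding the log
to a function like this.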

It's true that the heavy kernel overhead is due to Samba, which
generates lots and lots of requests to the kernel (not only disk I/O;
it also requires a lot of signal handling, etc.).

So, there are really two things we need to do.

1) Make Linux more scalable.
   (This sometimes looks like tuning, but it's really a bug
     fix.  So don't ask the performance team to tune.  Let them FIX.)
 
2) Make Samba work with fewer signals.
   This means: don't call useless system calls, use shared memory
   more effectively, and divide the Samba source into an OS-dependent
   part and an OS-independent part, so that you can tune for a specific
   OS and still support a wide range of platforms, etc.
---- 
Kenichi Okuyama.


* Re: [Lse-tech] Re: Linux 2.4 Scalability, Samba, and Netbench
  2001-05-09 17:30   ` Andrew M. Theurer
  2001-05-09 17:35     ` [Lse-tech] " Christoph Hellwig
  2001-05-09 17:39     ` Alan Cox
@ 2001-05-10  4:51     ` Maneesh Soni
  2001-05-10  8:40     ` Dipankar Sarma
  3 siblings, 0 replies; 13+ messages in thread
From: Maneesh Soni @ 2001-05-10  4:51 UTC (permalink / raw)
  To: Andrew M. Theurer; +Cc: lse-tech, linux-kernel, samba-technical

On Wed, May 09, 2001 at 12:30:35PM -0500, Andrew M. Theurer wrote:
> I do have kernprof ACG and lockmeter for a 4P run.  We saw no
> significant problems with lockmeter.  csum_partial_copy_generic was the
> highest % in profile, at 4.34%.  I'll see if we can get some space on
> http://lse.sourceforge.net to post the test data.
> 
> Andrew Theurer
> 
> Mike Kravetz wrote:
> > 
> > On Wed, May 09, 2001 at 11:29:22AM -0500, Andrew M. Theurer wrote:
> > >
> > > I am evaluating Linux 2.4 SMP scalability, using Netbench(r) as a
> > > workload with Samba, and I wanted to get some feedback on results so
> > > far.
> > 
> > Do you have any kernel profile or lock contention data?
> > 
> > --
> > Mike Kravetz                                 mkravetz@sequent.com
> > IBM Linux Technology Center

Hello Andrew,

If in the kernprof data you find "fget" as one of the high rankers (say in top
10) then can you try the scalable FD management patch which uses 
read-copy-update mechanism for protecting files_struct. 

As of now there are working patches available for read-copy-update mechanism 
and FD management at "http://lse.sourceforge.net/locking/rclock.html" as 
rclock-2.4.2-01.patch and files_struct_rcu-2.4.2-03.patch but we are working on 
simpler interfaces. Also let me know if you need the patches for a different 
2.4 kernel version.

Maneesh

-- 
Maneesh Soni
IBM Linux Technology Center,
IBM India Software Lab, Bangalore.
email: smaneesh@sequent.com
http://lse.sourceforge.net/locking/rclock.html


* Re: [Lse-tech] Re: Linux 2.4 Scalability, Samba, and Netbench
  2001-05-09 17:30   ` Andrew M. Theurer
                       ` (2 preceding siblings ...)
  2001-05-10  4:51     ` [Lse-tech] " Maneesh Soni
@ 2001-05-10  8:40     ` Dipankar Sarma
  2001-05-11 15:20       ` David Collier-Brown
  3 siblings, 1 reply; 13+ messages in thread
From: Dipankar Sarma @ 2001-05-10  8:40 UTC (permalink / raw)
  To: Andrew M. Theurer; +Cc: Mike Kravetz, lse-tech, linux-kernel, samba-technical

Hello Andrew,

You would need to contact one of the administrators of the LSE project for this.
You would need a developer id for uploading. You can get all the information 
from http://sourceforge.net/projects/lse/.

I think it will be very helpful to have the results, including lockmeter
and kernprof data, available on lse.sourceforge.net.

Thanks
Dipankar
-- 
Dipankar Sarma  <dipankar@sequent.com> Project: http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.

On Wed, May 09, 2001 at 12:30:35PM -0500, Andrew M. Theurer wrote:
> I do have kernprof ACG and lockmeter for a 4P run.  We saw no
> significant problems with lockmeter.  csum_partial_copy_generic was the
> highest % in profile, at 4.34%.  I'll see if we can get some space on
> http://lse.sourceforge.net to post the test data.
> 
> Andrew Theurer
> 
> Mike Kravetz wrote:
> > 
> > On Wed, May 09, 2001 at 11:29:22AM -0500, Andrew M. Theurer wrote:
> > >
> > > I am evaluating Linux 2.4 SMP scalability, using Netbench(r) as a
> > > workload with Samba, and I wanted to get some feedback on results so
> > > far.
> > 
> > Do you have any kernel profile or lock contention data?
> > 
> > --
> > Mike Kravetz                                 mkravetz@sequent.com
> > IBM Linux Technology Center
> 
> _______________________________________________
> Lse-tech mailing list
> Lse-tech@lists.sourceforge.net
> http://lists.sourceforge.net/lists/listinfo/lse-tech



* Re: [Lse-tech] Re: Linux 2.4 Scalability, Samba, and Netbench
  2001-05-10  8:40     ` Dipankar Sarma
@ 2001-05-11 15:20       ` David Collier-Brown
  0 siblings, 0 replies; 13+ messages in thread
From: David Collier-Brown @ 2001-05-11 15:20 UTC (permalink / raw)
  Cc: Andrew M. Theurer, Mike Kravetz, lse-tech, linux-kernel, samba-technical

On Wed, May 09, 2001 at 11:29:22AM -0500, Andrew M. Theurer wrote:
> I am evaluating Linux 2.4 SMP scalability, using Netbench(r) as a
> workload with Samba, and I wanted to get some feedback on results so
> far.

	Also consider using Andrew Tridgell's 
	dbench/tbench/smbtorture suite in this process: it
	is mathematically comparable to NetBench, runs on
	smaller numbers of load-generating machines, and
	can give better breakdowns into the disk component,
	the network component and the on-server component
	of the available performance.

	I also have some results from SPARC Linux: send me email.

--dave
-- 
David Collier-Brown,           | Always do right. This will gratify 
Performance & Engineering Team | some people and astonish the rest.
Americas Customer Engineering  |                      -- Mark Twain
(905) 415-2849                 | davecb@canada.sun.com


* Re: Linux 2.4 Scalability, Samba, and Netbench
  2001-05-09 23:34 Bruce Allan
@ 2001-05-10 13:38 ` Andrew M. Theurer
  0 siblings, 0 replies; 13+ messages in thread
From: Andrew M. Theurer @ 2001-05-10 13:38 UTC (permalink / raw)
  To: Bruce Allan; +Cc: lse-tech, linux-kernel, samba-technical

Bruce Allan wrote:
> 
> Andrew Theurer wrote:
> > I do have kernprof ACG and lockmeter for a 4P run.  We saw no
> > significant problems with lockmeter.  csum_partial_copy_generic was the
> > highest % in profile, at 4.34%.  I'll see if we can get some space on
> > http://lse.sourceforge.net to post the test data.
> 
> The Netfinity system that you are using has two different supported GigE
> adapters.  I assume you are using one of these types - Netfinity Gigabit
> Ethernet Adapter (19K4401) and the Netfinity Gigabit Ethernet SX Server
> Adapter (06P3701); using the acenic.c and e1000.c drivers, respectively.
> From what I understand after initial perusal of the two drivers, the former
> has receive checksumming support on the adapter itself while the latter,
> the one you are using, does not support hardware checksumming (at least, it
> is not enabled by the driver).

Bruce,

According to Intel's driver for Pro/1000, it supports checksum on Rx via
module option "XsumRX=1".  I have not tried this yet because we are
waiting on our Gbps switch hardware.

> Are you able to re-run your tests with GigE adapters that support
> checksumming on the hardware instead of doing it in the kernel?  If not, I
> will be running similar tests in a very similar configuration (with the
> 19K4401 adapters) in the near future and can share results if you'd like.

Yes, hopefully we will be running the new setup (64 clients, many Gbps
adapters) in about 2-3 weeks.  At that point I'd like to get some
results for 8-way as well.  It would definitely be a good idea to
compare results.

Andrew Theurer


* Re: Linux 2.4 Scalability, Samba, and Netbench
@ 2001-05-09 23:34 Bruce Allan
  2001-05-10 13:38 ` Andrew M. Theurer
  0 siblings, 1 reply; 13+ messages in thread
From: Bruce Allan @ 2001-05-09 23:34 UTC (permalink / raw)
  To: lse-tech; +Cc: linux-kernel, samba-technical


Andrew Theurer wrote:
> I do have kernprof ACG and lockmeter for a 4P run.  We saw no
> significant problems with lockmeter.  csum_partial_copy_generic was the
> highest % in profile, at 4.34%.  I'll see if we can get some space on
> http://lse.sourceforge.net to post the test data.

The Netfinity system that you are using has two different supported GigE
adapters.  I assume you are using one of these types - Netfinity Gigabit
Ethernet Adapter (19K4401) and the Netfinity Gigabit Ethernet SX Server
Adapter (06P3701); using the acenic.c and e1000.c drivers, respectively.
From what I understand after initial perusal of the two drivers, the former
has receive checksumming support on the adapter itself while the latter,
the one you are using, does not support hardware checksumming (at least, it
is not enabled by the driver).
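
For context on what "doing it in the kernel" costs: without hardware
offload, every byte of payload has to be summed by the CPU.  The sketch
below shows the 16-bit ones' complement Internet checksum as described
in RFC 1071.  It illustrates the arithmetic only; the kernel's
csum_partial_copy_generic is hand-tuned code that also copies the data
while summing, and the function name here is made up for the example.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

A receiver can verify a packet by running the same sum over the data
with the checksum appended; a result of zero means the data is intact.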

Are you able to re-run your tests with GigE adapters that support
checksumming on the hardware instead of doing it in the kernel?  If not, I
will be running similar tests in a very similar configuration (with the
19K4401 adapters) in the near future and can share results if you'd like.

Bruce Allan/Beaverton/IBM
IBM Linux Technology Center - OS Gold
503-578-4187   T/L 775-4187
bruce.allan@us.ibm.com




