From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Patrick Turley"
Date: Tue, 06 Jan 2004 18:03:21 +0000
Subject: Re: [LARTC] Bandwidth Control Tolerances
Message-Id: <005101c3d47f$61d1f270$6401a8c0@pturley>
List-Id:
References: <002801c3d3eb$4634da30$6401a8c0@pturley>
In-Reply-To: <002801c3d3eb$4634da30$6401a8c0@pturley>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: lartc@vger.kernel.org

This is, of course, very valuable feedback. Unfortunately, given the
responses I've had so far, I see that I didn't make it clear what I'm
really looking for.

I believe that my colleague's test methodology is flawed. I believe that
you cannot generate reliable bandwidth measurements by ftp'ing files and
measuring the time it takes. I believe this because I have seen iperf
generate very stable measurements with a variability of only plus or
minus 1 kbit/s.

I think it is inappropriate to spend time examining the quality of the
underlying bandwidth algorithms when the problem is really with the
measurement technique. If I don't prove that his tests are flawed, then
I *will* be asked to do a bunch of investigation that I think will only
waste my time.

The ideal response would be something like this:

Patrick,

Timing an FTP transfer is well known to be a poor choice for measuring
bandwidth for the following reasons:

1) Even under ideal conditions, you will be guaranteed to under-measure
the bandwidth because you are not accounting for FTP protocol overhead,
which is considerable.

2) FTP does a lot of related work that isn't directly helpful in simply
moving the bits, which is why your colleague is seeing such variability
in his measurements.

3) The resolution of the Linux clock depends on the underlying hardware
but, at user level, is one half second at best. This can introduce
substantial error and variability in such a simple measurement.

4) Iperf is specifically designed to measure bandwidth without protocol
overhead or any other operations that introduce undue error or
variability, and accounts for clock skew. This is why you're seeing such
stable measurements and your colleague is not.

You don't measure voltage with a light bulb - you use a voltmeter. Don't
measure bandwidth with FTP, use a bandwidth meter.

These are my assertions. If you can authoritatively agree or disagree
with any of these claims, please say so.

Also, if any of you have measured HTB accuracy and can point me to a web
site, that would be ideal. Yes, I've visited the HTB home page, but I
haven't found what I'm looking for there.

Please see below for my additional responses to Martin's e-mail.

> : I have measured the performance of HTB with iperf and found it to be
> : very close to expected (i.e., within 5%). I have a colleague who is
> : measuring the performance by ftp'ing large files and recording the time
> : required to make the transfer. He is seeing an average throughput that
> : is nearly 10% away from the theoretical, with occasional excursions to
> : nearly 30%.
>
> How have you defined your PSCHED_CLOCK_SOURCE? See this URL:
>
> http://www.docum.org/stef.coene/qos/faq/cache/40.html

Thank you for the reference. Yes, I found this and read it. We are using
stock RedHat kernels and are unwilling to recompile. I will try to figure
out how the RedHat kernel is configured. Can you give me an easy way to
discover this? Something in /proc perhaps?

> : My colleague is now questioning the quality of the traffic control
> : algorithms and wondering two things:
>
> Let's be careful with the baby and the bathwater. The algorithms have
> been vetted. The implementation may not be ideal, but implementations
> always suffer from compromises, right?

I agree entirely. I am convinced that the bandwidth algorithms have been
closely examined by many people and work quite well.
However, because my colleague believes his tests are accurate, he has
concluded otherwise. If you can help me prove that the problem is the
measurement technique, I would be very grateful.

> : 1) What tolerance can we guarantee and advertise?
>
> Measure the deviations from your specified bandwidth after changing your
> setting for PSCHED_CLOCK_SOURCE. Advertise your measured tolerance
> accordingly.

Yes - that's fine, if you think you are measuring correctly. But that's
the problem at hand, isn't it?

> : 2) Can the tolerance be improved, since the values he has measured are
> : unacceptable?
>
> I don't know--see the above link, and check how your kernel was compiled.
> Others on this list may have further suggestions for you.
>
> Good luck,
>
> -Martin

As always, Martin. Thanks very much for your help.

_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://lartc.org/
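
One way to sanity-check the timing assertion (number 3) with plain
arithmetic is sketched below. It is only an illustration under stated
assumptions: the 1 Mbit/s shaped rate and the file sizes are hypothetical,
the 0.5 s timer resolution is simply the figure asserted in the message
(not a verified property of any kernel), and the function names are made
up for this example. It ignores FTP protocol overhead entirely and only
asks how far a timed-transfer measurement can drift if the elapsed time
is off by one timer tick in either direction.

```python
def throughput_error_bounds(file_bytes, true_rate_bps, timer_resolution_s):
    """Return (low, high) measured rates in bit/s when the elapsed time
    is wrong by up to one timer tick in either direction."""
    bits = file_bytes * 8
    true_elapsed = bits / true_rate_bps
    # Elapsed time over-estimated by one tick: rate is under-measured.
    low = bits / (true_elapsed + timer_resolution_s)
    # Elapsed time under-estimated by one tick: rate is over-measured.
    # Guard against a zero or negative duration for very short transfers.
    shortest = max(true_elapsed - timer_resolution_s, 1e-9)
    high = bits / shortest
    return low, high


def percent_error(measured_bps, true_bps):
    """Signed error of a measured rate relative to the true rate."""
    return 100.0 * (measured_bps - true_bps) / true_bps


if __name__ == "__main__":
    rate = 1_000_000       # hypothetical shaped rate: 1 Mbit/s
    resolution = 0.5       # seconds, the figure asserted in the message
    for size in (100_000, 1_000_000, 10_000_000):  # hypothetical sizes
        low, high = throughput_error_bounds(size, rate, resolution)
        print(f"{size:>10} bytes: "
              f"{percent_error(low, rate):+.2f}% to "
              f"{percent_error(high, rate):+.2f}%")
```

Under these assumptions the relative error is roughly the timer
resolution divided by the transfer duration, so it shrinks as the
transfer gets longer. Comparing the printed bounds with the observed
10% to 30% deviations gives a quick indication of how much of the
discrepancy timing resolution alone could plausibly explain for a given
file size.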