* [LARTC] Bandwidth Control Tolerances
From: Patrick Turley @ 2004-01-06  0:23 UTC
  To: lartc

I have measured the performance of HTB with iperf and found it to be very
close to expected (i.e., within 5%). I have a colleague who is measuring the
performance by ftp'ing large files and recording the time required to make
the transfer. He is seeing an average throughput that is nearly 10% away
from the theoretical, with occasional excursions to nearly 30%.
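
(To make the comparison concrete, here is the arithmetic I have in mind; the 8 Mbit/s rate, the file size, and the transfer time below are made-up numbers for illustration only, not our actual configuration.)

    # Illustration only: an FTP-style "file size / wall-clock time" figure
    # compared against a configured HTB rate.  All numbers are hypothetical.
    configured_rate_bps = 8000000.0            # assumed HTB rate, bits per second
    file_size_bytes     = 100 * 1024 * 1024    # size of the transferred file
    elapsed_seconds     = 109.0                # wall-clock time of the FTP transfer

    measured_bps = file_size_bytes * 8 / elapsed_seconds
    deviation    = (configured_rate_bps - measured_bps) / configured_rate_bps
    print("measured:  %.2f Mbit/s" % (measured_bps / 1e6))
    print("deviation: %.1f%%" % (100.0 * deviation))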

My colleague is now questioning the quality of the traffic control
algorithms and wondering two things:

1) What tolerance can we guarantee and advertise?

2) Can the tolerance be improved, since the values he has measured are
unacceptable?

I believe that my colleague's measurements are unusable. It would help me
greatly if anyone knowledgeable on these points could respond, whether
agreeing or disagreeing with me; either answer would be helpful.


* Re: [LARTC] Bandwidth Control Tolerances
From: Martin A. Brown @ 2004-01-06 17:20 UTC
  To: lartc

Hello Patrick,

Please excuse the suggestion if you have already considered the issue from
Stef's FAQ that I point to below.

 : I have measured the performance of HTB with iperf and found it to be
 : very close to expected (i.e., within 5%). I have a colleague who is
 : measuring the performance by ftp'ing large files and recording the time
 : required to make the transfer. He is seeing an average throughput that
 : is nearly 10% away from the theoretical, with occasional excursions to
 : nearly 30%.

How have you defined your PSCHED_CLOCK_SOURCE?  See this URL:

  http://www.docum.org/stef.coene/qos/faq/cache/40.html
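
If you still have the source tree your kernel was built from, one rough way
to check is to look for the define in the packet scheduler header.  This is
only a sketch -- the /usr/src/linux path is an assumption, and the exact
form of the define varies between kernel versions:

    # Sketch: report the PSCHED_CLOCK_SOURCE define from a kernel source
    # tree.  The path below is an assumption; adjust it for your system.
    import re

    header = "/usr/src/linux/include/net/pkt_sched.h"
    for line in open(header):
        if re.match(r"\s*#\s*define\s+PSCHED_CLOCK_SOURCE\b", line):
            print(line.strip())   # e.g. "#define PSCHED_CLOCK_SOURCE PSCHED_JIFFIES"
            break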

 : My colleague is now questioning the quality of the traffic control
 : algorithms and wondering two things:

Let's be careful with the baby and the bathwater.  The algorithms have
been vetted.  The implementation may not be ideal, but implementations
always suffer from compromises, right?

 : 1) What tolerance can we guarantee and advertise?

Measure the deviations from your specified bandwidth after changing your
setting for PSCHED_CLOCK_SOURCE.  Advertise your measured tolerance
accordingly.
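
Something along these lines, with your own measurements in place of the
made-up samples below, would turn a series of runs into a tolerance figure:

    # Sketch: turn repeated rate measurements into a tolerance figure.
    # The target rate and the samples are placeholders, not real data.
    target_kbps  = 1000.0
    samples_kbps = [988.0, 995.0, 1002.0, 979.0, 991.0]

    deviations = [abs(s - target_kbps) / target_kbps for s in samples_kbps]
    print("mean deviation:  %.1f%%" % (100.0 * sum(deviations) / len(deviations)))
    print("worst deviation: %.1f%%" % (100.0 * max(deviations)))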

 : 2) Can the tolerance be improved, since the values he has measured are
 : unacceptable?

I don't know--see the above link, and check how your kernel was compiled.
Others on this list may have further suggestions for you.

Good luck,

-Martin

-- 
Martin A. Brown --- SecurePipe, Inc. --- mabrown@securepipe.com

_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://lartc.org/

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [LARTC] Bandwidth Control Tolerances
From: Patrick Turley @ 2004-01-06 18:02 UTC
  To: lartc

This is, of course, very valuable feedback. Unfortunately, given the
responses I've had so far, I see that I didn't make it clear what I'm really
looking for.

I believe that my colleague's test methodology is flawed. I believe that you
cannot generate reliable bandwidth measurements by ftp'ing files and
measuring the time it takes. I believe this because I have seen iperf
generate very stable measurements with a variability of only plus or minus 1
kbit/s. I think it is inappropriate to spend time examining the quality of
the underlying bandwidth algorithms when the problem is really with the
measurement technique. If I don't prove that his tests are flawed, then I
*will* be asked to do a bunch of investigation that I think will only waste
my time. The ideal response would be something like this:



Patrick,

Timing an FTP transfer is well known to be a poor choice for measuring
bandwidth for the following reasons:

1) Even under ideal conditions, you are guaranteed to under-measure the
bandwidth because you are not accounting for FTP protocol overhead, which is
considerable (a rough sketch of the overhead arithmetic follows this list).

2) FTP does a lot of related work that isn't directly helpful in simply
moving the bits, which is why your colleague is seeing such variability in
his measurements.

3) The resolution of the Linux clock depends on the underlying hardware but,
at user level, is one half second at best. This can introduce substantial
error and variability in such a simple measurement.

4) Iperf is specifically designed to measure bandwidth without protocol
overhead or any other operations that introduce undue error or variability,
and accounts for clock skew. This is why you're seeing such stable
measurements and your colleague is not. You don't measure voltage with a
light bulb - you use a voltmeter. Don't measure bandwidth with FTP, use a
bandwidth meter.
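
To put a rough number on point (1): with a standard 1500-byte Ethernet MTU,
the TCP and IP headers alone consume about 2.7% of every packet, before
counting FTP's separate control connection, handshakes, and acknowledgements.
The figures below assume plain 20-byte IP and TCP headers with no options:

    # Back-of-the-envelope header overhead for a bulk TCP transfer.
    # Assumes a 1500-byte MTU and 20-byte IP + 20-byte TCP headers.
    mtu            = 1500
    ip_tcp_headers = 20 + 20
    payload        = mtu - ip_tcp_headers      # 1460 bytes of file data per packet

    overhead = float(ip_tcp_headers) / mtu
    print("payload per packet: %d bytes" % payload)
    print("header overhead:    %.1f%%" % (100.0 * overhead))   # about 2.7%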



These are my assertions. If you can authoritatively agree or disagree with
any of these claims, please say so. Also, if any of you have measured HTB
accuracy and can point me to a web site, that would be ideal. Yes, I've
visited the HTB home page, but I haven't found what I'm looking for there.

Please see below for my additional responses to Martin's e-mail.

>  : I have measured the performance of HTB with iperf and found it to be
>  : very close to expected (i.e., within 5%). I have a colleague who is
>  : measuring the performance by ftp'ing large files and recording the time
>  : required to make the transfer. He is seeing an average throughput that
>  : is nearly 10% away from the theoretical, with occasional excursions to
>  : nearly 30%.
>
> How have you defined your PSCHED_CLOCK_SOURCE?  See this URL:
>
>   http://www.docum.org/stef.coene/qos/faq/cache/40.html

Thank you for the reference. Yes, I found this and read it. We are using
stock RedHat kernels and are unwilling to recompile. I will try to figure
out how the RedHat kernel is configured. Can you give me an easy way to
discover this? Something in /proc perhaps?

>  : My colleague is now questioning the quality of the traffic control
>  : algorithms and wondering two things:
>
> Let's be careful with the baby and the bathwater.  The algorithms have
> been vetted.  The implementation may not be ideal, but implementations
> always suffer from compromises, right?

I agree entirely. I am convinced that the bandwidth algorithms have been
closely examined by many people and work quite well. However, because my
colleague believes his tests are accurate, he has concluded otherwise. If
you can help me prove that the problem is the measurement technique, I would
be very grateful.

>  : 1) What tolerance can we guarantee and advertise?
>
> Measure the deviations from your specified bandwidth after changing your
> setting for PSCHED_CLOCK_SOURCE.  Advertise your measured tolerance
> accordingly.

Yes - that's fine, if you think you are measuring correctly. But that's the
problem at hand, isn't it?

>  : 2) Can the tolerance be improved, since the values he has measured are
>  : unacceptable?
>
> I don't know--see the above link, and check how your kernel was compiled.
> Others on this list may have further suggestions for you.
>
> Good luck,
>
> -Martin

As always, Martin. Thanks very much for your help.



* Re: [LARTC] Bandwidth Control Tolerances
From: Stef Coene @ 2004-01-07 17:36 UTC
  To: lartc

On Tuesday 06 January 2004 19:02, Patrick Turley wrote:
> This is, of course, very valuable feedback. Unfortunately, given the
> responses I've had so far, I see that I didn't make it clear what I'm
> really looking for.
I also did some htb tests.  I created scripts that use iptables or tc
counters to log the number of bytes, and most of the time I had bursty
results.  But I think this burstiness is more related to the test setup:
collisions, retransmits, CPU/disk, ...
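
A minimal sketch of that kind of logging loop is below.  The eth0 device,
the 1:10 class id, and the "Sent ... bytes" parsing are assumptions; the
exact tc output format depends on your iproute2 version:

    # Sketch: log the byte counter of one HTB class every 500 ms and print
    # the resulting rate.  Device, class id and parsing are assumptions.
    import re, subprocess, time

    def sent_bytes(dev="eth0", classid="1:10"):
        out = subprocess.check_output(["tc", "-s", "class", "show", "dev", dev])
        for block in out.decode().split("class "):
            if block.startswith("htb " + classid + " "):
                m = re.search(r"Sent (\d+) bytes", block)
                if m:
                    return int(m.group(1))
        return None

    prev = sent_bytes()
    while True:
        time.sleep(0.5)
        cur = sent_bytes()
        if prev is not None and cur is not None:
            print("%.1f kbit/s" % ((cur - prev) * 8 / 0.5 / 1000.0))
        prev = cur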

I did some burst tests and recorded the rate every 500 ms:
http://docum.org/stef.coene/qos/tests/htb/burst/

I used ethloop.  Ethloop can be used to simulate an htb qdisc on the lo
device (it can be found on the htb homepage).  So there is no network
involved; it only records bytes as they would be sent if this were a real
device, so this is the perfect situation for htb.
And....  as you can see on the graphs, the rate is very stable and almost
perfectly accurate.  So the bursts or rate deviations are not htb-related.

Stef

-- 
stef.coene@docum.org
 "Using Linux as bandwidth manager"
     http://www.docum.org/
     #lartc @ irc.openprojects.net

