* Re: [Cerowrt-devel] Correctly calculating overheads on unknown connections
@ 2014-09-21 18:35 Sebastian Moeller
From: Sebastian Moeller @ 2014-09-21 18:35 UTC
  To: lartc

[-- Attachment #1: Type: text/plain, Size: 19 bytes --]

Hi Dave, hi Andy,


[-- Attachment #2: ping_sweeper6.sh --]
[-- Type: application/octet-stream, Size: 3813 bytes --]

#! /bin/bash
# TODO use seq or bash to generate a list of the requested sizes (to allow for non-equidistantly spaced sizes)

# just an identifier for the ping log
TECH=ADSL
# finding a proper target IP is somewhat of an art, just traceroute a remote site
# and find the nearest host reliably responding to pings showing the smallest variation of ping times
# for this I typically run "traceroute 8.8.8.8", and then select the first host on the ISP side (typically after
# the first large RTT increment) and test its response by "ping -c 10 -s 16 NNN.NNN.NNN.NNN", if this host does not respond
# I pick the next host along the route to 8.8.8.8. I assume the closer the host the less disturbed by other traffic the
# response will be.


if [ ! $# == 1 ]; then
    echo "To run measurements supply the TARGET IP address as first argument to this script (${0})."
    echo "Use traceroute 8.8.8.8 to get a list of increasingly distant hosts, pick the first host out of your network (ideally the DSLAM)."
    echo "Test whether the selected host responds to ping: 'ping -s16 -c 1 target.IP.address.quad' : this needs to actually return non zero RTTs."
    echo "If the host does not reply to the pings take the next host from the traceroute (moving closer to 8.8.8.8), repeat until you find a replying host."
    echo "Once the main script is started have a quick look at the logfile, to see whether the RTTs stay close to the initial test RTT."
    echo "If the RTTs have increased a lot, the PINGPERIOD might be too short, and the host might have put us on a slow path; either increase PINGPERIOD or try the next host..."
    echo ""
    echo "Here is the traceroute (might take a while):"
    echo ""
    traceroute 8.8.8.8
    
    exit 0
else
    TARGET=${1}		# Replace by an appropriate host
fi


DATESTR=`date +%Y%m%d_%H%M%S`	# to allow multiple sequential records
LOG=ping_sweep_${TECH}_${DATESTR}.txt

MAX_PREIP_OVERHEAD_SIZE=44	# as far as I can tell 44 bytes is the maximum pre IP header overhead for an ATM based carrier
IP4_HEADER_SIZE=20		# 20 bytes
IDEAL_MTU=1500			#  what the MTU should look like

# by default non-root ping will only send one packet per second, so work around that by calling ping independently for each packet
# empirically figure out the shortest period still giving the standard ping time (to avoid being slow-pathed by our host)
# at 100 packets/s of 116 + 28 + 40 we would need 4 ATM cells = 192byte * 100/s = 150kbit/s
# at 100 packets/s of 16 + 28 + 40 we would need 2 ATM cells = 96byte * 100/s = 75kbit/s
# on average we need (150 + 75) * 0.5 = 112.5 Kbit/s, increase the ping period if uplink < 112.5 Kbit/s
PINGPERIOD=0.01		# increase if uplink slower than roughly 200Kbit/s
PINGSPERSIZE=10000	# the higher the link rate the more samples we need to reliably detect the increasingly smaller ATM quantisation steps. Can be reduced for slower links

# Start, needed to find the per packet overhead dependent on the ATM encapsulation
# to reliably show ATM quantization one would like to see at least two steps, so cover a range > 2 ATM cells (so > 96 bytes)
SWEEPMINSIZE=16		# 64bit systems seem to require 16 bytes of payload to include a timestamp...
SWEEPMAXSIZE=116
SWEEPMAXSIZE=216
    

n_SWEEPS=`expr ${SWEEPMAXSIZE} - ${SWEEPMINSIZE}`


i_sweep=0
i_size=0

while [ ${i_sweep} -lt ${PINGSPERSIZE} ]
do
    (( i_sweep++ ))
    echo "Current iteration: ${i_sweep}"
    # now loop from sweepmin to sweepmax
    i_size=${SWEEPMINSIZE}
    while [ ${i_size} -le ${SWEEPMAXSIZE} ]
    do
	echo "${i_sweep}. repetition of ping size ${i_size}"
	ping -c 1 -s ${i_size} ${TARGET} >> ${LOG} &
	(( i_size++ ))
	# we need a sleep binary that allows non integer times (GNU sleep is fine as is sleep of macosx 10.8.4)
	sleep ${PINGPERIOD}
    done
done

#tail -f ${LOG}

echo "Done... ($0)
"

[-- Attachment #3: Type: text/plain, Size: 2488 bytes --]


On Sep 20, 2014, at 19:55 , Dave Taht <dave.taht@gmail.com> wrote:

> We'd had a very long thread on cerowrt-devel and in the end sebastian
> (I think) had developed some scripts to exaustively (it took hours)
> derive the right encapsulation frame size on a link. I can't find the
> relevant link right now, ccing that list…

	I am certainly not the first to have looked at ATM encapsulation effects on DSL links; e.g. Jesper Dangaard Brouer wrote a thesis about this topic (see http://www.adsl-optimizer.dk), and together with Russell Stuart (http://ace-host.stuart.id.au/russell/files/tc/tc-atm/) I believe he taught the Linux kernel how to account for encapsulation. What you need to tell the kernel is whether or not you have ATM encapsulation (ATM is weird in that each IP packet gets chopped into cells carrying 48 bytes of payload, with the last partially full cell padded) and the per-packet overhead on your link. You can get this information from your ISP and/or from the DSL modem's information page, but neither is guaranteed to be available or useful. So I set out to empirically deduce this information from measurements on my own link. I naively started out using ICMP echo requests as probes (as I could easily generate probe packets of different sizes with the Linux/Mac OS X ping binary); as it turned out, this works well enough, at least for relatively slow ADSL links. So ping_sweeper6.sh (attached) is the program I use (on an otherwise idle link, typically overnight) to collect ~1000 repetitions of time-stamped ping packets spanning two (potential) ATM cells. I then use tc_stab_parameter_guide.m (a MATLAB/Octave program) to read in the output of the ping_sweeper script and process the data. In short, if the link runs ATM encapsulation the plot of the data needs to look like a stair with a 48 byte step width; if it is just smoothly increasing, the carrier is not ATM. For ATM links, and only for ATM links, the script also tries to figure out the per-packet overhead, which has always worked well for me.
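
The cell arithmetic described above can be sketched in a few lines of shell. This is only an illustration under stated assumptions: the function names atm_cells and atm_wire_bytes are hypothetical and not part of the attached scripts, and the 40 byte overhead is just an example value.

```shell
#!/bin/sh
# Sketch only: ATM cell arithmetic for a given per-packet overhead.
# atm_cells and atm_wire_bytes are hypothetical helper names.

ATM_PAYLOAD=48   # payload bytes carried by one ATM cell
ATM_CELL=53      # bytes of one ATM cell on the wire (5 byte header + 48 byte payload)

# atm_cells IP_PACKET_BYTES OVERHEAD_BYTES -> number of cells needed
atm_cells() {
    total=$(( $1 + $2 ))
    # round up: the last, partially filled cell is padded to a full cell
    echo $(( (total + ATM_PAYLOAD - 1) / ATM_PAYLOAD ))
}

# atm_wire_bytes IP_PACKET_BYTES OVERHEAD_BYTES -> bytes actually occupying the link
atm_wire_bytes() {
    echo $(( $(atm_cells "$1" "$2") * ATM_CELL ))
}

# A 16 byte ping payload is a 44 byte IP packet (16 + 8 ICMP + 20 IPv4).
echo "cells for a 44 byte packet with 40 bytes overhead: $(atm_cells 44 40)"
echo "bytes on the wire: $(atm_wire_bytes 44 40)"
```

Because of the rounding up, packet sizes that differ by a few bytes can cost the same number of cells, which is what produces the stair shape in the RTT plot.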
(My home link recently got a silent upgrade in which the encapsulation overhead changed from 40 bytes to 44 bytes (probably due to the introduction of VLAN tags). This caused some disturbances in the link capacity measurements I was running at the time, so I ran my code again and, lo and behold, the overhead had increased; after taking the real overhead into account the disturbances went away. But I guess I digress ;) )
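
To illustrate why a 4 byte change in overhead shows up in a ping sweep, here is a minimal sketch that finds the ping payload size at which one more ATM cell is needed. It assumes IPv4 without options and plain ICMP echo; the helper names cells_for_payload and first_step_payload are hypothetical and not part of the attached scripts.

```shell
#!/bin/sh
# Sketch only: locate the first "stair step" of a ping sweep for a given overhead.
# cells_for_payload and first_step_payload are hypothetical helper names.

ATM_PAYLOAD=48         # payload bytes carried by one ATM cell
ICMP_IPV4_HEADERS=28   # 8 byte ICMP header + 20 byte IPv4 header (no options assumed)

# cells_for_payload PING_PAYLOAD OVERHEAD -> cells needed for that ping
cells_for_payload() {
    echo $(( ($1 + ICMP_IPV4_HEADERS + $2 + ATM_PAYLOAD - 1) / ATM_PAYLOAD ))
}

# first_step_payload START OVERHEAD -> smallest ping payload >= START
# that needs one more cell than START does
first_step_payload() {
    size=$1
    base=$(cells_for_payload "$1" "$2")
    while [ "$(cells_for_payload "$size" "$2")" -eq "$base" ]; do
        size=$(( size + 1 ))
    done
    echo "$size"
}

for overhead in 40 44; do
    echo "overhead ${overhead}: first step at ping -s $(first_step_payload 16 "$overhead")"
done
```

With 40 bytes of overhead the first step falls at a payload of 29 bytes, with 44 bytes at 25 bytes, so the whole stair moves by the 4 byte difference while the 48 byte step width stays the same.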

[-- Attachment #4: tc_stab_parameter_guide_05.m --]
[-- Type: application/octet-stream, Size: 40025 bytes --]

function [ output_args ] = tc_stab_parameter_guide_05( sweep_fqn, up_Kbit, down_Kbit )
%TC_STAB_PARAMETER_GUIDE Estimate tc-stab parameters from a ping sweep log
%   try to read in the result from a ping sweep run
%	sweep_fqn (optional): the log file of the ping sweep against the first hop after
%		the DSL link
%	up_Kbit (optional): the uplink rate in Kilobits per second
%	down_Kbit (optional): the downlink rate in Kilobits per second
%
% TODO:
%	find whether the carrier is ATM quantized (via FFT?)
%		test whether best stair fits better than a simple linear regression
%		line?
%	if yes:
%		what is the RTT step (try to deduce the combined up and down rates from this)
%	estimate the best MTU for the estimated protocol stack (how to test this?)
%		1) estimate the largest MTU that avoids fragmentation (default 1500 - 28 should be largest without fragmentation)
%		2) estimate the largest MTU that does not have padding in the last
%		   ATM cell, for this pick the MTU that no partial ATM cell remains
%	test geometric mean, and delogged mean of log(RTTs) (to deskew the long tailed distribution)
%	include the potential PACKET sizes for VLAN tagged packets as well?
%
%
% DONE:
%	Allow for holes in the ping data (missing sizes)
%	make sure all sizes are filled (use NaN for empty ones?)
%	maybe require to give the nominal up and down rates, to estimate the
%		RTT stepsize
%	try to figure out the overhead for each packet
%	network rates traditionally are in 10^3 magnitudes instead of RAM-like
%		2^10 magnitudes, take into account for ATM quantum calculation
%	implement robust mean (mean between certain quantiles), does that make
%		sense with RTT distribution?
%
%Thoughts:
%	ask about IPv4 or IPv6 (what about tunnelling?)
%	the sweep should be taken directly connected to the modem to reduce
%		non-ATM routing delays

if ~(isoctave)
	dbstop if error;
	timestamps.(mfilename).start = tic;
else
	tic();
end
disp(['Starting: ', mfilename]);

output_args = [];

% control options
show_mean = 1;		% the means are noisier than the medians
show_robust_mean = 1;		% the means are noisier than the medians
show_median = 1;	% the median seems the way to go
show_min = 1;		% the min should be the best measure, but in the ATM test sweep it is too variable
show_max = 0;		% only useful for debugging
show_sem = 0;		% give some estimate of the variance
show_ci = 1;		% show the confidence interval of the mean, if the mean is shown
ci_alpha = 0.05;	% alpha for confidence interval calculation
use_measure = 'median';	% median, or robust_mean
use_processed_results = 1;
max_samples_per_size = [];
% max_samples_per_size = 1000;	% if not empty only use maximally that many samples per size


% if not specified we try to estimate the per cell RTT from the data
default_up_Kbit = [];
default_down_KBit = [];

if (nargin == 0)
	sweep_fqn = '';
% 	sweep_fqn = fullfile(pwd, 'ping_sweep_ATM.txt');	% was Bridged, LLC/SNAP RFC-1483/2684 connection (overhead 32 bytes - 14 = 18)
% 	sweep_fqn = fullfile(pwd, 'ping_sweep_ATM_20130610_234707.txt');	% telekom PPPOE, LLC, overhead 40!
% 	sweep_fqn = fullfile(pwd, 'ping_sweep_ATM_20130618_233008.txt');	% telekom PPPOE
% 	sweep_fqn = fullfile(pwd, 'ping_sweep_ATM_20130620_234659.txt');	% telekom PPPOE
% 	sweep_fqn = fullfile(pwd, 'ping_sweep_ATM_20130618-20.txt');	% telekom PPPOE
	%  	sweep_fqn = fullfile(pwd, 'ping_sweep_CABLE_20120426_230227.txt');
	%  	sweep_fqn = fullfile(pwd, 'ping_sweep_CABLE_20120801_001235.txt');
	if isempty(sweep_fqn)
		[sweep_name, sweep_dir] = uigetfile('ping*.txt');
		sweep_fqn = fullfile(sweep_dir, sweep_name);
	end
	up_Kbit = default_up_Kbit;
	down_Kbit = default_down_KBit;
end
if (nargin == 1)
	up_Kbit = default_up_Kbit;
	down_Kbit = default_down_KBit;
end
if (nargin == 2)
	down_Kbit = default_down_KBit;
end

%ATM
quantum.byte = 48;	% ATM cells are always 53 bytes, 48 thereof payload
quantum.bit = quantum.byte * 8;
ATM_cell.byte = 53;
ATM_cell.bit = ATM_cell.byte * 8;


% known packet size offsets in bytes
offsets.IPv4 = 20;		% assume no IPv4 options are used, IPv6 would be 40bytes?
offsets.IPv6 = 40;		% not used yet...
offsets.ICMP = 8;		% ICMP header
offsets.ethernet = 14;	% ethernet header
offset.ATM.max_encapsulation_bytes = 44; % see http://ace-host.stuart.id.au/russell/files/tc/tc-atm/, but note that due to VLAN tags we can reach 48 worst case...
MTU = 1500;	% the nominal MTU to the ping host should be 1500, but might be lower if using a VPN
max_MTU_for_overhead_determination = 1280;
% fragmentation will cause an additional, relatively large increase in RTT (not necessarily registered to the ATM cells)
% that will confuse the ATM quantisation offset detector, so exclude all
% ping sizes that are potentially affected by fragmentation
max_ping_size_without_fragmentation = MTU + offsets.ethernet - offsets.IPv4 - offset.ATM.max_encapsulation_bytes; 
% unknown offsets is what we need to figure out to feed tc-stab...


[sweep_dir, sweep_name] = fileparts(sweep_fqn);
cur_parsed_data_mat = [sweep_fqn(1:end-4), '.mat'];

if (use_processed_results && ~isempty(dir(cur_parsed_data_mat)))
	disp(['Loading processed ping data from ', cur_parsed_data_mat]);
	load(cur_parsed_data_mat, 'ping');
else
	% read in the result from a ping sweep
	disp(['Processing ping data from ', sweep_fqn]);
	ping = parse_ping_output(sweep_fqn);
	if isempty(ping)
		disp('No useable ping data found, exiting...');
		return
	end
	save(cur_parsed_data_mat, 'ping');
end




% analyze the data
min_ping_size = min(ping.data(:, ping.cols.size)) - offsets.ICMP;
disp(['Minimum size of ping payload used: ', num2str(min_ping_size), ' bytes.']);
known_overhead = offsets.IPv4;	% ping reports the ICMP header already included in size
ping.data(:, ping.cols.size) = ping.data(:, ping.cols.size) + known_overhead;	% we know we used IPv4 so add the 20 bytes already, so that size are relative to the start of the IP header
size_list = unique(ping.data(:, ping.cols.size));	% this is the number of different sizes, but there might be holes/missing sizes
max_pingsize = max(size_list);

% packets larger than the pMTU will get fragmented, resulting in an extra-large step (roughly 2 to 3 times larger than usual) somewhere in the data
% which will confuse the simplistic stair finder, so limit the search space
% to <= 1280, the min MTU for IPv6, hoping that this should work
% everywhere...
if (size_list(end) > max_MTU_for_overhead_determination)
	disp(['Restricting the ATM quantization search space to <= ', num2str(max_MTU_for_overhead_determination), ' bytes.']);
	tmp_idx = find(size_list <= max_MTU_for_overhead_determination);
	if (isempty(tmp_idx))
		disp(['No data with size <= ', num2str(max_MTU_for_overhead_determination), ' bytes found; ATM quantization can not be determined....']);
		return
	end
	measured_size_list = size_list;
	size_list = measured_size_list(tmp_idx);
	measured_max_pingsize = max_pingsize;
	max_pingsize = max(size_list);
end


per_size.header = {'size', 'mean', 'robust_mean', 'median', 'min', 'max', 'std', 'n', 'sem', 'ci'};
per_size.cols = get_column_name_indices(per_size.header);
per_size.data = zeros([max_pingsize, length(per_size.header)]) / 0;	% NaNs
per_size.data(:, per_size.cols.size) = (1:1:max_pingsize);

if ~isempty(max_samples_per_size)
	disp(['Analysing only the first ', num2str(max_samples_per_size), ' samples.']);
end

for i_size = 1 : length(size_list)
	cur_size = size_list(i_size);
	cur_size_idx = find(ping.data(:, ping.cols.size) == cur_size);
	if ~isempty(max_samples_per_size)
		n_selected_samples = min([length(cur_size_idx), max_samples_per_size]);
		cur_size_idx = cur_size_idx(1:n_selected_samples);
		%disp(['Analysing only the first ', num2str(max_samples_per_size), ' samples of ', num2str(length(cur_size_idx))]);
	end	
	per_size.data(cur_size, per_size.cols.mean) = mean(ping.data(cur_size_idx, ping.cols.time));
	% robust mean, aka mean of the values between the 10% and 90% quantiles
	per_size.data(cur_size, per_size.cols.robust_mean) = robust_mean(ping.data(cur_size_idx, ping.cols.time), 0.1, 0.9);	% take the mean while excluding extreme values
	
	per_size.data(cur_size, per_size.cols.median) = median(ping.data(cur_size_idx, ping.cols.time));
	per_size.data(cur_size, per_size.cols.min) = min(ping.data(cur_size_idx, ping.cols.time));
	per_size.data(cur_size, per_size.cols.max) = max(ping.data(cur_size_idx, ping.cols.time));
	per_size.data(cur_size, per_size.cols.std) = std(ping.data(cur_size_idx, ping.cols.time), 0);
	per_size.data(cur_size, per_size.cols.n) = length(cur_size_idx);
	per_size.data(cur_size, per_size.cols.sem) = per_size.data(cur_size, per_size.cols.std) / sqrt(length(cur_size_idx));
	per_size.data(cur_size, per_size.cols.ci) = calc_cihw(per_size.data(cur_size, per_size.cols.std), per_size.data(cur_size, per_size.cols.n), ci_alpha);
end

clear ping	% with large data sets 32bit matlab will run into memory issues...


figure('Name', sweep_name);
hold on;
legend_str = {};
if (show_mean)
	% means
	legend_str{end + 1} = 'mean';
	plot(per_size.data(:, per_size.cols.size), per_size.data(:, per_size.cols.mean), 'Color', [0 1 0 ]);
	if (show_robust_mean)
		legend_str{end + 1} = 'robust mean';
		plot(per_size.data(:, per_size.cols.size), per_size.data(:, per_size.cols.robust_mean), 'Color', [0 0.75 0 ]);
	end
	if  (show_sem)
		legend_str{end + 1} = '+sem';
		legend_str{end + 1} = '-sem';
		plot(per_size.data(:, per_size.cols.size), per_size.data(:, per_size.cols.mean) - per_size.data(:, per_size.cols.sem), 'Color', [0 0.66 0]);
		plot(per_size.data(:, per_size.cols.size), per_size.data(:, per_size.cols.mean) + per_size.data(:, per_size.cols.sem), 'Color', [0 0.66 0]);
	end
	if  (show_ci)
		legend_str{end + 1} = '+ci';
		legend_str{end + 1} = '-ci';
		plot(per_size.data(:, per_size.cols.size), per_size.data(:, per_size.cols.mean) - per_size.data(:, per_size.cols.ci), 'Color', [0 0.37 0]);
		plot(per_size.data(:, per_size.cols.size), per_size.data(:, per_size.cols.mean) + per_size.data(:, per_size.cols.ci), 'Color', [0 0.37 0]);
	end
	
end
if(show_median)
	% median +- standard error of the mean, confidence interval would be
	% better
	legend_str{end + 1} = 'median';
	plot(per_size.data(:, per_size.cols.size), per_size.data(:, per_size.cols.median), 'Color', [1 0 0]);
	if (show_sem)
		legend_str{end + 1} = '+sem';
		legend_str{end + 1} = '-sem';
		plot(per_size.data(:, per_size.cols.size), per_size.data(:, per_size.cols.median) - per_size.data(:, per_size.cols.sem), 'Color', [0.66 0 0]);
		plot(per_size.data(:, per_size.cols.size), per_size.data(:, per_size.cols.median) + per_size.data(:, per_size.cols.sem), 'Color', [0.66 0 0]);
	end
end
if (show_min)
	% minimum, should be cleanest, but for the test data set looks quite sad...
	legend_str{end + 1} = 'min';
	plot(per_size.data(:, per_size.cols.size), per_size.data(:, per_size.cols.min), 'Color', [0 0 1]);
end
if (show_max)
	% maximum, only useful for debugging
	legend_str{end + 1} = 'max';
	plot(per_size.data(:, per_size.cols.size), per_size.data(:, per_size.cols.max), 'Color', [0 0 0.66]);
end

title(['If this plot shows a (noisy) step function with a stepping ~', num2str(quantum.byte), ' bytes then the data carrier is quantised, make sure to use tc-stab']);
xlabel('Approximate packet size [bytes]');
ylabel('ICMP round trip times (ping RTT) [ms]');
legend(legend_str, 'Location', 'NorthWest');
hold off;

% potentially clean up the data, by interpolating values with large sem
% from the neighbours or replacing those with NaNs?

% if the size of the ping packet exceeds the MTU the ping packet gets
% fragmented; the step over this ping size will cause an RTT increase >> one
% RTT_quantum, so exclude all sizes potentially affected by this from the
% search space (for now assume that the route to the ping host actually can carry 1500 byte MTUs...)
measured_pingsize_idx = find(~isnan(per_size.data(:, per_size.cols.(use_measure))));
tmp_idx = find(measured_pingsize_idx <= max_ping_size_without_fragmentation);
last_non_fragmented_pingsize = measured_pingsize_idx(tmp_idx(end));
ping_sizes_for_linear_fit = measured_pingsize_idx(tmp_idx);

% fit a line to the data, to estimate the RTT per byte
[p, S] = polyfit(per_size.data(ping_sizes_for_linear_fit, per_size.cols.size), per_size.data(ping_sizes_for_linear_fit, per_size.cols.(use_measure)), 1);
RTT_per_byte = p(end - 1);
fitted_line = polyval(p, per_size.data(ping_sizes_for_linear_fit, per_size.cols.size), S);
input_data = per_size.data(ping_sizes_for_linear_fit, per_size.cols.(use_measure));
% estimate the goodness of the linear fit the same way as for the stair
% function
linear_cumulative_difference = sum(abs(input_data - fitted_line));

% figure
% hold on
% plot(per_size.data(ping_sizes_for_linear_fit, per_size.cols.size), per_size.data(ping_sizes_for_linear_fit, per_size.cols.(use_measure)), 'Color', [0 1 0]);
% plot(per_size.data(ping_sizes_for_linear_fit, per_size.cols.size), fitted_line, 'Color', [1 0 0]);
% hold off
% based on the linear fit we can estimate the average RTT per ATM cell
estimated_RTT_quantum_ms = RTT_per_byte * 48;


% just get an idea what range the RTTs per ATM quantum can be for different
% bandwidths
% "ATM" cell over full duplex gigabit ethernet 
min_GE_RTT_quantum_ms = (ATM_cell.bit / (1000 * 1000 * 1000) + ATM_cell.bit / (1000 * 1000 * 1000) ) * 1000;	% this estimate is rather a lower bound for fastpath , so search for best fits
% "ATM" cell over theoretical G.fast.vectoring (best case?)
min_GfastV_RTT_quantum_ms = (ATM_cell.bit / (500 * 1000 * 1000) + ATM_cell.bit / (500 * 1000 * 1000) ) * 1000;	% this estimate is rather a lower bound for fastpath , so search for best fits
% the next three are 2014 extreme values for Deutsche Telekom wired
% assume VDSL2.vectoring 100Mbit 40Mbit
min_VDSL2V_RTT_quantum_ms = (ATM_cell.bit / (100 * 1000 * 1000) + ATM_cell.bit / (40 * 1000 * 1000) ) * 1000;	% this estimate is rather a lower bound for fastpath , so search for best fits
% assume ADSL2+ annex J fallback profile 2J R
max_ADSL2aJ_RTT_quantum_ms = (ATM_cell.bit / (448 * 1000) + ATM_cell.bit / (288 * 1000) ) * 1000;	% this estimate is rather a lower bound for fastpath , so search for best fits
% assume ADSL2+ annex B fixed profile dsl light 384
max_ADSL1aB_RTT_quantum_ms = (ATM_cell.bit / (384 * 1000) + ATM_cell.bit / (64 * 1000) ) * 1000;	% this estimate is rather a lower bound for fastpath , so search for best fits



% the RTT should equal the average RTT increase per ATM quantum
% estimate the RTT step size
% at ADSL down 3008kbit/sec up 512kbit/sec we expect, this does not include
% processing time
if ~isempty(down_Kbit) && ~isempty(up_Kbit)	% both rates are needed for the estimate below
	expected_RTT_quantum_ms = (ATM_cell.bit / (down_Kbit * 1000) + ATM_cell.bit / (up_Kbit * 1000) ) * 1000;	% this estimate is rather a lower bound for fastpath , so search for best fits
%	sm network rates are base 10 not base 2	
%	expected_RTT_quantum_ms = (ATM_cell.bit / (down_Kbit * 1024) + ATM_cell.bit / (up_Kbit * 1024) ) * 1000;	% this estimate is rather a lower bound for fastpath , so search for best fits
else
	expected_RTT_quantum_ms = estimated_RTT_quantum_ms;
end
disp(['lower bound estimate for one ATM cell RTT based on specified up and downlink is ', num2str(expected_RTT_quantum_ms), ' ms.']);
disp(['estimate for one ATM cell RTT based on linear fit of the ping sweep data is ', num2str(estimated_RTT_quantum_ms), ' ms.']);

% let's search from expected_RTT_quantum_ms / 2 up to max_search_RTT_ms
% in steps of expected_RTT_quantum_ms / 100
% to allow for interleaved ATM setups increase the search space up to 32
% times best fastpath RTT estimate, 64 interleave seems to add 25ms to the
% per packet latency, but not to the per quantum delta t, so revisit this
% TODO check with high interleave ATM data (if available)
min_search_RTT_ms = expected_RTT_quantum_ms / 2;	% in case the initial estimates are only in the ballpark
search_RTT_steps_ms = expected_RTT_quantum_ms / 100;
max_search_RTT_ms = min([(32 * expected_RTT_quantum_ms) (max_ADSL1aB_RTT_quantum_ms * 1.5)]);
RTT_quantum_list = (min_search_RTT_ms : search_RTT_steps_ms : max_search_RTT_ms);
quantum_list = (1 : 1 : quantum.byte);

% BRUTE FORCE search of best fitting stair...
differences = zeros([length(RTT_quantum_list) length(quantum_list)]);
cumulative_differences = differences;


all_stairs = zeros([length(RTT_quantum_list) length(quantum_list) length(per_size.data(1:last_non_fragmented_pingsize, per_size.cols.(use_measure)))]);
for i_RTT_quant = 1 : length(RTT_quantum_list)
	cur_RTT_quant = RTT_quantum_list(i_RTT_quant);
	for i_quant = 1 : quantum.byte
		[differences(i_RTT_quant, i_quant), cumulative_differences(i_RTT_quant, i_quant), all_stairs(i_RTT_quant, i_quant, :)] = ...
			get_difference_between_data_and_stair( per_size.data(1:last_non_fragmented_pingsize, per_size.cols.size), per_size.data(1:last_non_fragmented_pingsize, per_size.cols.(use_measure)), ...
			quantum_list(i_quant), quantum.byte, 0, cur_RTT_quant );
	end
end

% for the initial test DSL set the best x_offset was 21, corresponding to 32 bytes overhead before the IP header.
[min_cum_diff, min_cum_diff_idx] = min(cumulative_differences(:));
[min_cum_diff_row_idx, min_cum_diff_col_idx] = ind2sub(size(cumulative_differences),min_cum_diff_idx);
best_difference = differences(min_cum_diff_row_idx, min_cum_diff_col_idx);
disp(['Best staircase fit cumulative difference is: ', num2str(cumulative_differences(min_cum_diff_row_idx, min_cum_diff_col_idx))]);
disp(['Best linear fit cumulative difference is: ', num2str(linear_cumulative_difference)]);
% judge the quantization
if (cumulative_differences(min_cum_diff_row_idx, min_cum_diff_col_idx) < linear_cumulative_difference)
	% stair fits better than line
	quant_string = ['Quantized ATM carrier LIKELY (cumulative residual: stair fit ', num2str(cumulative_differences(min_cum_diff_row_idx, min_cum_diff_col_idx)), ' linear fit ', num2str(linear_cumulative_difference), ')'];
else
	quant_string = ['Quantized ATM carrier UNLIKELY (cumulative residual: stair fit ', num2str(cumulative_differences(min_cum_diff_row_idx, min_cum_diff_col_idx)), ' linear fit ', num2str(linear_cumulative_difference), ')'];
end
disp(quant_string);

disp(['remaining ATM cell length after ICMP header is ', num2str(quantum_list(min_cum_diff_col_idx)), ' bytes.']);
disp(['ICMP RTT of a single ATM cell is ', num2str(RTT_quantum_list(min_cum_diff_row_idx)), ' ms.']);


% as first approximation use the ATM cell offset and known offsets (ICMP
% IPv4 min_ping_size) to estimate the number of cells used for per packet
% overhead
% this assumes that no ATM related overhead is >= ATM cell size
% -1 to account for matlab 1 based indices
% what is the offset in the 2nd ATM cell
n_bytes_overhead_2nd_cell = quantum.byte - (quantum_list(min_cum_diff_col_idx) - 1);	% just assume we can not fit all overhead into one cell...
% what is the known overhead size for the first data point:
tmp_idx = find(~isnan(per_size.data(:, per_size.cols.mean)));
known_overhead_first_ping_size = tmp_idx(1);
%pre_IP_overhead = quantum.byte + (n_bytes_overhead_2nd_cell - known_overhead);	% this is the one we are after in the end
pre_IP_overhead = quantum.byte + (n_bytes_overhead_2nd_cell - known_overhead_first_ping_size);	% this is the one we are after in the end
disp(' ');
disp(['Estimated overhead preceding the IP header: ', num2str(pre_IP_overhead), ' bytes']);


figure('Name', 'Comparing ping data with');
hold on
legend_str = {'ping_data', 'fitted_stair', 'fitted_line'};
plot(per_size.data(1:last_non_fragmented_pingsize, per_size.cols.size), per_size.data(1:last_non_fragmented_pingsize, per_size.cols.(use_measure)), 'Color', [1 0 0]);
plot(per_size.data(1:last_non_fragmented_pingsize, per_size.cols.size), squeeze(all_stairs(min_cum_diff_row_idx, min_cum_diff_col_idx, :)) + best_difference, 'Color', [0 1 0]);

fitted_line = polyval(p, per_size.data(1:last_non_fragmented_pingsize, per_size.cols.size), S);
plot(per_size.data(1:last_non_fragmented_pingsize, per_size.cols.size), fitted_line, 'Color', [0 0 1]);

title({['Estimated RTT per quantum: ', num2str(RTT_quantum_list(min_cum_diff_row_idx)), ' ms; ICMP data offset in quantum ', num2str(quantum_list(min_cum_diff_col_idx)), ' bytes'];...
	['Estimated overhead preceding the IP header: ', num2str(pre_IP_overhead), ' bytes'];...
	quant_string});
xlabel('Approximate packet size [bytes]');
ylabel('ICMP round trip times (ping RTT) [ms]');
if (isoctave)
	legend(legend_str, 'Location', 'NorthWest');
else
	%annotation('textbox', [0.0 0.95 1.0 .05], 'String', ['Estimated overhead preceding the IP header: ', num2str(pre_IP_overhead), ' bytes'], 'FontSize', 9, 'Interpreter', 'none', 'Color', [1 0 0], 'LineStyle', 'none');
	legend(legend_str, 'Interpreter', 'none', 'Location', 'NorthWest');
end
hold off



% use http://ace-host.stuart.id.au/russell/files/tc/tc-atm/ to present the
% most likely ATM encapsulation for a given overhead and present a recommendation
% for the tc stab invocation
display_protocol_stack_information(pre_IP_overhead);


% now turn this into tc-stab recommendations:
disp(['Add the following to the egress root qdisc:']);
% disp(' ');
disp(['A) Assuming the router connects over ethernet to the DSL-modem:']);
disp(['stab mtu 2048 tsize 128 overhead ', num2str(pre_IP_overhead), ' linklayer atm']);	% currently tc stab does not account for the ethernet header
% disp(['stab mtu 2048 tsize 128 overhead ', num2str(pre_IP_overhead - offsets.ethernet), ' linklayer atm']);
% disp(' ');
% disp(['B) Assuming the router connects via PPP and non-ethernet to the modem:']);
% disp(['stab mtu 2048 tsize 128 overhead ', num2str(pre_IP_overhead), ' linklayer atm']);

disp(' ');
% on ingress do not exclude the ethernet header?
disp(['Add the following to the ingress root qdisc:']);
disp(' ');
disp(['A) Assuming the router connects over ethernet to the DSL-modem:']);
disp(['stab mtu 2048 tsize 128 overhead ', num2str(pre_IP_overhead), ' linklayer atm']);
disp(' ');
if ~(isoctave)
	timestamps.(mfilename).end = toc(timestamps.(mfilename).start);
	disp([mfilename, ' took: ', num2str(timestamps.(mfilename).end), ' seconds.']);
else
	toc
end

% and now the other end of the data, what is the max MTU for the link and
% what is the best ATM cell aligned MTU

disp('Done...');

return
end


function [ ping_data ] = parse_ping_output( ping_log_fqn )
%PARSE_PING_OUTPUT read the output of a ping run/sweep
% for further processing
% TODO:
%	use a faster parser, using strtok is quite expensive
%


if ~(isoctave)
	timestamps.parse_ping_output.start = tic;
else
	tic();
end

verbose = 0;
n_rows_to_grow_table_by = 10000;	% grow table increment to avoid excessive memory copy ops


ping_data = [];
cur_sweep_fd = fopen(ping_log_fqn, 'r');
if (cur_sweep_fd == -1)
	disp(['Could not open ', ping_log_fqn, '.']);
	if isempty(dir(ping_log_fqn))
		disp('Reason: file does not seem to exist at the given directory...')
	end
	return
end
ping_data.header = {'size', 'icmp_seq', 'ttl', 'time'};
ping_data.field_names_list = {'size', 'icmp_seq', 'seq', 'ttl', 'time'};

ping_data.header = {'size', 'time'};	% save half the size...
ping_data.field_names_list = {'size', 'time'};

ping_data.cols = get_column_name_indices(ping_data.header);

ping_data.data = zeros([n_rows_to_grow_table_by, length(ping_data.header)]);
cur_data_lines = 0;
cur_lines = 0;

% skip the first line
% PING netblock-75-79-143-1.dslextreme.com (75.79.143.1): (16 ... 1000)
% data bytes
header_line = fgetl(cur_sweep_fd);

while ~feof(cur_sweep_fd)
	% grow the data table if need be
	if (size(ping_data.data, 1) == cur_data_lines)
		if (verbose)
			disp('Growing ping data table...');
		end
		ping_data.data = [ping_data.data; zeros([n_rows_to_grow_table_by, length(ping_data.header)])];
	end
	
	cur_line = fgetl(cur_sweep_fd);
	if ~(mod(cur_lines, 1000))
		disp([num2str(cur_lines +1), ' lines parsed...']);
	end
	cur_lines = cur_lines + 1;
	
	[first_element, remainder] = strtok(cur_line);
	first_element_as_number = str2double(first_element);
	if isempty(first_element) || strcmp('Request', first_element) || strcmp('---', first_element)
		% skip empty lines explicitly
		continue;
	end
	% the following will not work for merged ping
	%if strmatch('---', first_element)
	%	%we reached the end of sweeps
	%	break;
	%end
	% now read in the data
	% 30 bytes from 75.79.143.1: icmp_seq=339 ttl=63 time=14.771 ms
	if ~isnan(first_element_as_number)	% str2double returns NaN (never empty) for non-numeric input
		% get the next element
		[tmp_next_item, tmp_remainder] = strtok(remainder);
		if strcmp(tmp_next_item, 'bytes')
			if ~(mod(cur_data_lines, 1000))
				disp(['Milestone ', num2str(cur_data_lines +1), ' ping packets reached...']);
			end
			cur_data_lines = cur_data_lines + 1;
			
			% size of the ICMP package
			ping_data.data(cur_data_lines, ping_data.cols.size) = first_element_as_number;
			% now process the remainder
			while ~isempty(remainder)
				[next_item, remainder] = strtok(remainder);
				equality_pos = strfind(next_item, '=');
				% data items are name+value pairs
				if ~isempty(equality_pos);
					cur_key = next_item(1: equality_pos - 1);
					cur_value = str2double(next_item(equality_pos + 1: end));
					if (ismember(cur_key, ping_data.field_names_list))
						switch cur_key
							% busybox ping and macosx ping return different key names
							case {'seq', 'icmp_seq'}
								ping_data.data(cur_data_lines, ping_data.cols.icmp_seq) = cur_value;
							case 'ttl'
								ping_data.data(cur_data_lines, ping_data.cols.ttl) = cur_value;
							case 'time'
								ping_data.data(cur_data_lines, ping_data.cols.time) = cur_value;
						end
					end
				end
			end
		else
			% skip this line
			if (verbose)
				disp(['Skipping: ', cur_line]);
			end
		end
	else
		if (verbose)
			disp(['Ping output: ', cur_line, ' not handled yet...']);
		end
	end
	
end

% remove empty lines
if (size(ping_data.data, 1) > cur_data_lines)
	ping_data.data = ping_data.data(1:cur_data_lines, :);
end

disp(['Found ', num2str(cur_data_lines), ' ping packets in ', ping_log_fqn]);
% clean up
fclose(cur_sweep_fd);

if ~(isoctave)
	timestamps.parse_ping_output.end = toc(timestamps.parse_ping_output.start);
	disp(['Parsing took: ', num2str(timestamps.parse_ping_output.end), ' seconds.']);
else
	toc
end

return
end


function [ difference , cumulative_difference, stair_y ] = get_difference_between_data_and_stair( data_x, data_y, x_size, stair_x_step_size, y_offset, stair_y_step_size )
% 130619sm: handle NaNs in data_y (marker for missing ping sizes)
% x_size is the flat part of the first stair, that is quantum minus the
% offset
% TODO: understand the offset issue and simplify this function
%		extrapolate the stair towards x = 0 again

debug = 0;
difference = [];

tmp_idx = find(~isnan(data_y));
x_start_val_idx = tmp_idx(1);
x_start_val = data_x(x_start_val_idx);
x_end_val = data_x(end);	% data_x is sorted...

% construct stair
stair_x = data_x;
proto_stair_y = zeros([x_end_val 1]);	% we need the final value in
% make sure the x_size values do not exceed the step size...
if (x_size > stair_x_step_size)
	if mod(x_size, stair_x_step_size) == 0
		x_size = stair_x_step_size;
	else
		x_size = mod(x_size, stair_x_step_size);
	end
end

%stair_y_step_idx = (x_start_val + x_size : stair_x_step_size : x_end_val);
%% we really want steps registered to x_start_val
%stair_y_step_idx = (mod(x_start_val, stair_x_step_size) + x_size : stair_x_step_size : x_end_val);
stair_y_step_idx = (mod(x_start_val + x_size, stair_x_step_size) : stair_x_step_size : x_end_val);
if stair_y_step_idx(1) == 0
	stair_y_step_idx(1) = [];
end

proto_stair_y(stair_y_step_idx) = stair_y_step_size;
stair_y = cumsum(proto_stair_y);
if (debug)
	figure
	hold on;
	title(['x offset used: ', num2str(x_size), ' with quantum ', num2str(stair_x_step_size)]);
	plot(data_x, data_y, 'Color', [0 1 0]);
	plot(stair_x, stair_y, 'Color', [1 0 0]);
	hold off;
end
% missing ping sizes are filled with NaNs, so skip those
notnan_idx = find(~isnan(data_y));
% estimate the best y_offset for the stair
difference = sum(abs(data_y(notnan_idx) - stair_y(notnan_idx))) / length(data_y(notnan_idx));
% calculate the cumulative difference between stair and data...
cumulative_difference = sum(abs(data_y(notnan_idx) - (stair_y(notnan_idx) + difference)));

return
end

% function [ stair ] = build_stair(x_vector, x_size, stair_x_step_size, y_offset, stair_y_step_size )
% stair = [];
%
% return
% end


function [columnnames_struct, n_fields] = get_column_name_indices(name_list)
% return a structure with one field for each member of the name_list cell
% array, giving its position in the name_list; the columnnames_struct can
% then serve to address the columns, so the functions assigning values
% to the columns do not have to care too much about the positions, and it
% becomes easy to add fields.
n_fields = length(name_list);
for i_col = 1 : length(name_list)
	cur_name = name_list{i_col};
	columnnames_struct.(cur_name) = i_col;
end
return
end




function [ci_halfwidth_vector] = calc_cihw(std_vector, n, alpha)
%calc_cihw : calculate the half width of the confidence interval (for 1 - alpha)
%	the t_value lookup depends on alpha and the sample size n; the relevant
%	calculation of the degrees of freedom is performed inside calc_t_val.
%	ci_halfwidth = t_val(alpha, n-1) * std / sqrt(n)
%	Each group's CI ranges from mean - ci_halfwidth to mean + ci_halfwidth, so
%	the calling function has to perform this calculation...
%
% INPUTS:
%	std_vector: vector containing the standard deviations of all requested
%		groups
%	n: number of samples in each group, if the groups have different
%		samplesizes, specify each group's sample size in a vector
%	alpha: the desired maximal uncertainty/error in the range of [0, 1]
% OUTPUT:
%	ci_halfwidth_vector: vector containing the confidence intervals half width
%		for each group

% calc_t_val returns one sided t-values, for the desired two sidedness one has
% to halve the alpha for the table lookup
cur_alpha = alpha / 2;

% if n is scalar use same n for all elements of std_vec
if isscalar(n)
	t_ci = calc_t_val(cur_alpha, n);
	ci_halfwidth_vector = std_vector * t_ci / sqrt(n);
	% if n is a vector, prepare a matching vector of t_ci values
elseif isvector(n)
	t_ci_vector = n;
	% this is probably ugly, but calc_t_val only accepts scalars.
	for i_pos = 1 : length(n)
		t_ci_vector(i_pos) = calc_t_val(cur_alpha, n(i_pos));
	end
	ci_halfwidth_vector = std_vector .* t_ci_vector ./ sqrt(n);
end

return
end

%-----------------------------------------------------------------------------
function [t_val] = calc_t_val(alpha, n)
% the t value for the given alpha and n
% so call with the n of the sample, not with degrees of freedom
% see http://mathworld.wolfram.com/Studentst-Distribution.html for formulas
% return values follow Bortz, Statistik fuer Sozialwissenschaftler, Springer
% 1999, table D page 775. That is it returns one sided t-values.
% primary author S. Moeller

% TODO:
%	sidedness of t-value???

% basic error checking
if nargin < 2
	error('alpha and n have to be specified...');
end

% probability of error
tmp_alpha = alpha ;%/ 2;
if (tmp_alpha < 0) || (tmp_alpha > 1)
	msgbox('alpha has to be taken from [0, 1]...');
	t_val = NaN;
	return
end
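if tmp_alpha == 0
	% cdf(t) = 1 - alpha = 1 is only reached for t -> +Inf
	t_val = Inf;
	return
elseif tmp_alpha == 1
	% cdf(t) = 1 - alpha = 0 is only reached for t -> -Inf
	t_val = -Inf;
	return
end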
% degree of freedom
df = n - 1;
if df < 1
	%msgbox('The n has to be >= 2 (=> df >= 1)...');
	% 	disp('The n has to be >= 2 (=> df >= 1)...');
	t_val = NaN;
	return
end


% only calculate each (alpha, df) combination once, store the results
persistent t_val_array;
% create the t_val_array
if ~iscell(t_val_array)
	t_val_array = {[NaN;NaN]};
end
% search for the (alpha, df) tuple, avoid calculation if already stored
if iscell(t_val_array)
	% cell array of 2d arrays containing alpha / t_val pairs
	if df <= length(t_val_array)
		% test whether the required alpha, t_val tuple exists
		if ~isempty(t_val_array{df})
			% search for alpha
			tmp_array = t_val_array{df};
			alpha_index = find(tmp_array(1,:) == tmp_alpha);
			if any(alpha_index)
				t_val = tmp_array(2, alpha_index);
				return
			end
		end
	else
		% grow t_val_array to length of n
		missing_cols = df - length(t_val_array);
		for i_missing_cols = 1: missing_cols
			t_val_array{end + 1} = [NaN;NaN];
		end
	end
end

% check the sign
cdf_sign = 1;
if (1 - tmp_alpha) == 0.5
	% the t-distribution is symmetric around 0, so its median is t = 0
	t_val = 0;
	return
elseif (1 - tmp_alpha) < 0.5 % the t-cdf is point symmetric around (0, 0.5)
	cdf_sign = -1;
	tmp_alpha = 1 - tmp_alpha; % this will be undone later
end

% init some variables
n_iterations = 0;
delta_t = 1;
last_alpha = 1;
higher_t = 50;
lower_t = 0;
% find a t-value pair around the desired alpha value
while norm_students_cdf(higher_t, df) < (1 - tmp_alpha);
	lower_t = higher_t;
	higher_t = higher_t * 2;
end

% search the t value for the given alpha...
while (n_iterations < 1000) && (abs(delta_t) >= 0.0001)
	n_iterations = n_iterations + 1;
	% get the test_t (TODO linear interpolation)
	% higher_alpha = norm_students_cdf(higher_t, df);
	% lower_alpha = norm_students_cdf(lower_t, df);
	test_t = lower_t + ((higher_t - lower_t) / 2);
	cur_alpha = norm_students_cdf(test_t, df);
	% just in case we hit the right t spot on...
	if cur_alpha == (1 - tmp_alpha)
		t_crit = test_t;
		break;
		% probably we have to search for the right t
	elseif cur_alpha < (1 - tmp_alpha)
		% test_t is the new lower_t
		lower_t = test_t;
		%higher_t = higher_t;	% this stays as is...
	elseif cur_alpha > (1 - tmp_alpha)
		%
		%lower_t = lower_t;	% this stays as is...
		higher_t = test_t;
	end
	delta_t = higher_t - lower_t;
	last_alpha = cur_alpha;
end
t_crit = test_t;

% set the return value, correct for negative t values
t_val = t_crit * cdf_sign;
if cdf_sign < 0
	tmp_alpha = 1 - tmp_alpha;
end

% store the alpha, n, t_val tuple in t_val_array
pos = size(t_val_array{df}, 2);
t_val_array{df}(1, (pos + 1)) = tmp_alpha;
t_val_array{df}(2, (pos + 1)) = t_val;

return
end

%-----------------------------------------------------------------------------
function [scaled_cdf] = norm_students_cdf(t, df)
% calculate the cdf of students distribution for a given degree of freedom df,
% and all given values of t, then normalize the result
% the extreme values depend on the values of df!!!

% get min and max by calculating values for extreme t-values (e.g. -10000000,
% 10000000)
extreme_cdf_vals = students_cdf([-10000000, 10000000], df);

tmp_cdf = students_cdf(t, df);

scaled_cdf =	(tmp_cdf - extreme_cdf_vals(1)) /...
	(extreme_cdf_vals(2) - extreme_cdf_vals(1));
return
end

%-----------------------------------------------------------------------------
function [cdf_value_array] = students_cdf(t_value_array, df)
%students_cdf: calc the cumulative distribution function for a t-distribution
% Calculate the CDF value for each value t of the input array
% see http://mathworld.wolfram.com/Studentst-Distribution.html for formulas
% INPUTS:	t_value_array:	array containing the t values for which to
%							calculate the cdf
%			df:	degree of freedom; equals n - 1 for the t-distribution

cdf_value_array = 0.5 +...
	((betainc(1, 0.5 * df, 0.5) / beta(0.5 * df, 0.5)) - ...
	(betainc((df ./ (df + t_value_array.^2)), 0.5 * df, 0.5) /...
	beta(0.5 * df, 0.5))) .*...
	sign(t_value_array);

return
end

%-----------------------------------------------------------------------------
function [t_prob_dist] = students_pf(df, t_arr)
%  calculate the probability function for students t-distribution

t_prob_dist =	(df ./ (df + t_arr.^2)).^((1 + df) / 2) /...
	(sqrt(df) * beta(0.5 * df, 0.5));

% % calculate and scale the cdf by hand...
% cdf = cumsum(t_prob_dist);
% discrete_t_cdf = (cdf - min(cdf)) / (max(cdf) - min(cdf));
% % numericaly get the t-value for the given alpha
% tmp_index = find(discrete_t_cdf > (1 - tmp_alpha));
% t_crit = t(tmp_index(1));

return
end

function in = isoctave ()
persistent inout;

if isempty(inout),
	inout = exist('OCTAVE_VERSION','builtin') ~= 0;
end;
in = inout;

return;
end


function [] = display_protocol_stack_information(pre_IP_overhead)
% use [1] http://ace-host.stuart.id.au/russell/files/tc/tc-atm/ to present the
% most likely ATM protocol stack setup for a given overhead so the user can
% compare with his prior knowledge

% how much data fits into ATM cells without padding? 32 cells would be 1536
% bytes, which is larger than the 1500 byte max MTU for ethernet
ATM_31_cells_proto_MTU = 31 * 48;	% according to [1] 31 cells are the optimum for all protocol stacks
ATM_32_cells_proto_MTU = 32 * 48;	% should be best for case 44

disp(' ');
disp('According to http://ace-host.stuart.id.au/russell/files/tc/tc-atm/');
disp(['', num2str(pre_IP_overhead), ' bytes overhead indicate']);

switch pre_IP_overhead
	case 8
		disp('Connection: IPoA, VC/Mux RFC-2684');
		disp('Protocol (bytes): ATM AAL5 SAR (8) : Total 8');
		overhead_bytes_around_MTU = 8;
		overhead_bytes_in_MTU = 0;
		
	case 16
		disp('Connection: IPoA, LLC/SNAP RFC-2684');
		disp('Protocol (bytes): ATM LLC (3), ATM SNAP (5), ATM AAL5 SAR (8) : Total 16');
		overhead_bytes_around_MTU = 16;
		overhead_bytes_in_MTU = 0;
		
	case 24
		disp('Connection: Bridged, VC/Mux RFC-1483/2684');
		disp('Protocol (bytes): Ethernet Header (14), ATM pad (2), ATM AAL5 SAR (8) : Total 24');
		overhead_bytes_around_MTU = 24;
		overhead_bytes_in_MTU = 0;
		
	case 28
		disp('Connection: Bridged, VC/Mux+FCS RFC-1483/2684');
		disp('Protocol (bytes): Ethernet Header (14), Ethernet PAD [8] (0), Ethernet Checksum (4), ATM pad (2), ATM AAL5 SAR (8) : Total 28');
		overhead_bytes_around_MTU = 28;
		overhead_bytes_in_MTU = 0;
		
	case 32
		disp('Connection: Bridged, LLC/SNAP RFC-1483/2684');
		disp('Protocol (bytes): Ethernet Header (14), ATM LLC (3), ATM SNAP (5), ATM pad (2), ATM AAL5 SAR (8) : Total 32');
		overhead_bytes_around_MTU = 32;
		overhead_bytes_in_MTU = 0;
		disp('OR');
		disp('Connection: PPPoE, VC/Mux RFC-2684');
		disp('Protocol (bytes): PPP (2), PPPoE (6), Ethernet Header (14), ATM pad (2), ATM AAL5 SAR (8) : Total 32');
		overhead_bytes_around_MTU = 24;
		overhead_bytes_in_MTU = 8;
		
	case 36
		disp('Connection: Bridged, LLC/SNAP+FCS RFC-1483/2684');
		disp('Protocol (bytes): Ethernet Header (14), Ethernet PAD [8] (0), Ethernet Checksum (4), ATM LLC (3), ATM SNAP (5), ATM pad (2), ATM AAL5 SAR (8) : Total 36');
		overhead_bytes_around_MTU = 36;
		overhead_bytes_in_MTU = 0;
		disp('OR');
		disp('Connection: PPPoE, VC/Mux+FCS RFC-2684');
		disp('Protocol (bytes): PPP (2), PPPoE (6), Ethernet Header (14), Ethernet PAD [8] (0), Ethernet Checksum (4), ATM pad (2), ATM AAL5 SAR (8) : Total 36');
		overhead_bytes_around_MTU = 28;
		overhead_bytes_in_MTU = 8;
		
	case 10
		disp('Connection: PPPoA, VC/Mux RFC-2364');
		disp('Protocol (bytes): PPP (2), ATM AAL5 SAR (8) : Total 10');
		overhead_bytes_around_MTU = 8;
		overhead_bytes_in_MTU = 2;
		
	case 14
		disp('Connection: PPPoA, LLC RFC-2364');
		disp('Protocol (bytes): PPP (2), ATM LLC (3), ATM LLC-NLPID (1), ATM AAL5 SAR (8) : Total 14');
		overhead_bytes_around_MTU = 12;
		overhead_bytes_in_MTU = 2;
		
	case 40
		disp('Connection: PPPoE, LLC/SNAP RFC-2684');
		disp('Protocol (bytes): PPP (2), PPPoE (6), Ethernet Header (14), ATM LLC (3), ATM SNAP (5), ATM pad (2), ATM AAL5 SAR (8) : Total 40');
		overhead_bytes_around_MTU = 32;
		overhead_bytes_in_MTU = 8;
		
	case 44
		disp('Connection: PPPoE, LLC/SNAP+FCS RFC-2684');
		disp('Protocol (bytes): PPP (2), PPPoE (6), Ethernet Header (14), Ethernet PAD [8] (0), Ethernet Checksum (4), ATM LLC (3), ATM SNAP (5), ATM pad (2), ATM AAL5 SAR (8) : Total 44');
		overhead_bytes_around_MTU = 36;
		overhead_bytes_in_MTU = 8;
		
	otherwise
		disp('a protocol stack this program does NOT know (yet)...');
end

disp(' ');
return;
end


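function range_mean = robust_mean(value_list, lower_limit_ratio, upper_limit_ratio)
% mean over the sorted values between the lower and upper limit ratios
% (both in [0, 1]), i.e. a mean that ignores the extreme values

n_vals = length(value_list);
sorted_values = sort(value_list);

% clamp the indices so that ratios of 0 and 1 stay inside the vector
lowest_robust_idx = max(1, ceil(n_vals * lower_limit_ratio));
highest_robust_idx = min(n_vals, floor(n_vals * upper_limit_ratio));

range_mean = mean(sorted_values(lowest_robust_idx:highest_robust_idx));

return
end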

[-- Attachment #5: Type: text/plain, Size: 5574 bytes --]


Best Regards
	Sebastian


> 
> On Sat, Sep 20, 2014 at 7:17 PM, Andy Furniss <adf.lists@gmail.com> wrote:
>> Alan Goodman wrote:
>>> 
>>> Hi,
>>> 
>>> I am looking to figure out the most fool proof way to calculate stab
>>> overheads for ADSL/VDSL connections.
>>> 
>>> ppp0      Link encap:Point-to-Point Protocol inet addr:81.149.38.69
>>> P-t-P:81.139.160.1 Mask:255.255.255.255 UP POINTOPOINT RUNNING NOARP
>>> MULTICAST  MTU:1492  Metric:1 RX packets:17368223 errors:0 dropped:0
>>> overruns:0 frame:0 TX packets:12040295 errors:0 dropped:0 overruns:0
>>> carrier:0 collisions:0 txqueuelen:100 RX bytes:17420109286 (16.2 GiB)
>>> TX bytes:3611007028 (3.3 GiB)
>>> 
>>> I am setting a longer txqueuelen as I am not currently using any fair
>>> queuing (buffer bloat issues with sfq)
>> 
>> 
>> Whatever is txqlen is on ppp there is likely some other buffer after it
>> - the default can hurt with eg, htb as if you don't add qdiscs to
>> classes it takes (last time I looked) its qlen from that.
>> 
>> Sfq was only ever meant for bulk, so should really be in addition to
>> some classification to separate interactive - I don't really get the
> 
> Hmm? sfq separates bulk from interactive pretty nicely. It tends to do
> bad things to bulk as it doesn't manage queue length.
> 
> A little bit of prioritization or deprioritization for some traffic is
> helpful, but most traffic is hard to classify.
> 
>> bufferbloat bit, you could make the default 128 limit lower if you wanted.
> 
> htb + fq_codel, if available, is the right thing here....
> 
> http://www.bufferbloat.net/projects/cerowrt/wiki/Wondershaper_Must_Die
> 
>>> The connection is a BT Infinity FTTC VDSL connection synced at
>>> 80mbit/20mbit.  The modem is connected directly to the ethernet port
>>> on a server running a slightly tweaked HFSC setup that you folks
>>> helped me set up in July - back when I was on ADSL.  I am still
>>> running pppoe I believe from my server.
>> 
>> 
>> I have similar since May 2013 and I still haven't got round to reading
>> up on everything yet :-)
>> 
>> I have extra geek score for using mini jumbos = running pppoe with mtu
>> 1500 which works for me on plusnet. You need a recent pppd for this and
>> a nic that works with mtu >= 1508.
>> 
>> As for overheads, initial searching indicated that it's not easy or
>> maybe even truly possible like adsl.
>> 
>>> The largest ping packet that I can fit out onto the wire is 1464
>>> bytes:
>>> 
>>> # ping -c 2 -s 1464 -M do google.com PING google.com (31.55.166.216)
>>> 1464(1492) bytes of data. 1472 bytes from 31.55.166.216: icmp_seq=1
>>> ttl=58 time=11.7 ms 1472 bytes from 31.55.166.216: icmp_seq=2 ttl=58
>>> time=11.9 ms
>>> 
>>> # ping -c 2 -s 1465 -M do google.com PING google.com (31.55.166.212)
>>> 1465(1493) bytes of data. From
>>> host81-149-38-69.in-addr.btopenworld.com (81.149.38.69) icmp_seq=1
>>> Frag needed and DF set (mtu = 1492) From
>>> host81-149-38-69.in-addr.btopenworld.com (81.149.38.69) icmp_seq=1
>>> Frag needed and DF set (mtu = 1492)
>> 
>> 
>> You can't work out your overheads like this.
>> 
>> On slow uplink adsl it was possible with ping to infer the fixed part
>> but you needed to send loads of pings increasing in size and plot the
>> best time for each to make a stepped graph.
>> 
>> 
>>> Based on this I believe overhead should be set to 28, however with 28
>>> set as my overhead and hfsc ls m2 20000kbit ul m2 20000kbit I seem
>>> to be losing about 1.5mbit of upload...
>> 
>> 
>> Even if you could do things perfectly I would back off a few kbit just
>> to be safe. Timers may be different or there may be OAM/Reporting data
>> going up, albeit rarely.
>> 
>>> 
>>> No traffic manager enabled:
>>> 
>>> http://www.thinkbroadband.com/speedtest/results.html?id=141116089424883990118
>>> 
>>> 
>>> HFSC traffic manager:
>>> 
>>> http://www.thinkbroadband.com/speedtest/results.html?id=141116216621093133034
>>> 
>>> 
>>> 
>>> Am I calculating overhead incorrectly?
>> 
>> 
>> VDSL doesn't use ATM I think the PTM it uses is 64/65 - so don't specify
>> atm with stab. Unfortunately stab doesn't do 64/65.
>> 
>> As for the fixed part - I am not sure, but roughly starting with IP as
>> that's what tc sees on ppp (as opposed to ip + 14 on eth)
>> 
>> IP
>> +8 for PPPOE
>> +14 for ethertype and macs
>> +4 because Openreach modem uses vlan
>> +2 CRC ??
>> + "a few" 64/65
>> 
>> That's it for fixed - of course 64/65 adds another one for every 64 TBH
>> I didn't get the precise detail from the spec and not having looked
>> recently I can't remember.
>> 
>> BT Sin 498 does give some of this info and a couple of examples of
>> throughput for different frame sizes - but it's rounded to kbit which
>> means I couldn't work out to the byte what the overheads were.
>> 
>> Worse still VDSL can use link layer retransmits and the sin says that
>> though currently (2013) not enabled, they would be in due course. I have
>> no clue how these work.
>> 
>> 
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe lartc" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
> 
> -- 
> Dave Täht
> 
> https://www.bufferbloat.net/projects/make-wifi-fast
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel



* Re: [Cerowrt-devel] Correctly calculating overheads on unknown connections
  2014-09-21 18:35 [Cerowrt-devel] Correctly calculating overheads on unknown connections Sebastian Moeller
@ 2014-09-21 21:40 ` Alan Goodman
  2014-09-22  9:05 ` Sebastian Moeller
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Alan Goodman @ 2014-09-21 21:40 UTC (permalink / raw)
  To: lartc

Hi Billy,

Please can you share your modified script?

Alan

On 21/09/14 22:18, Billy Tallis wrote:
> On my Linux boxes ping has a -A option for adaptive ping, effectively
> sending out a new ping as soon as the reply to the last one is received,
> instead of having to wait a fixed period of time between pings. I
> modified ping_sweeper to use that last December when I was still on a
> DSL link and was able to find the overhead with only a few minutes of
> collecting data. (The connection was 6Mbps down, 512kbps up.) There was
> a bit of noise in the data from other traffic in the house, but the
> stair-step shape of the plot was unmistakeable and the octave script had
> no trouble identifying the per-packet overhead.



* Re: [Cerowrt-devel] Correctly calculating overheads on unknown connections
  2014-09-21 18:35 [Cerowrt-devel] Correctly calculating overheads on unknown connections Sebastian Moeller
  2014-09-21 21:40 ` Alan Goodman
@ 2014-09-22  9:05 ` Sebastian Moeller
  2014-09-22 10:01 ` Andy Furniss
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Sebastian Moeller @ 2014-09-22  9:05 UTC (permalink / raw)
  To: lartc

Hi Bill,


On Sep 21, 2014, at 23:18 , Billy Tallis <wtallis@gmail.com> wrote:

> On my Linux boxes ping has a -A option for adaptive ping, effectively sending out a new ping as soon as the reply to the last one is received, instead of having to wait a fixed period of time between pings.

	Interesting. The current version of ping_sweeper sends a ping every 10 ms; since the typical RTT on an ADSL link is > 10 ms, I am not sure how much “-A” speeds things up (although your method works with any uplink speed, while with fixed intervals one needs to tweak the ping interval). Another reason for the fixed ping interval is that some routers/BRAS/DSLAMs are rumored to rate-limit ICMP processing, so I wanted to control the rate to be able to work around such limitations.

> I modified ping_sweeper to use that last December when I was still on a DSL link and was able to find the overhead with only a few minutes of collecting data. (The connection was 6Mbps down, 512kbps up.)

	So, the slower the link is, the fewer packets you need, as the per-ATM-cell time increase gets larger and hence easier to detect in the noise. xDSL uses a symbol rate of 4 kHz, so there should be a quantization of 0.25 ms that makes detection on fast uplinks trickier (in my experience it works up to 2600 Kbps, the fastest I had available)...
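
	A rough sketch of that arithmetic, assuming the uplink sync rate alone determines how long one 53 byte ATM cell takes to send, and that kbps means 1000 bit/s (the helper name cell_time_us is made up for this example and is not part of ping_sweeper):

```shell
#!/bin/bash
# Time needed to serialize one 53 byte ATM cell (48 payload + 5 header)
# at a given uplink rate, in microseconds (integer arithmetic, rounded down).
# cell_time_us is a hypothetical helper, not part of ping_sweeper.
cell_time_us() {
    local rate_kbps=$1
    # 53 bytes * 8 = 424 bit; bit / (kbit/s) gives ms, so scale by 1000 for us
    echo $(( 53 * 8 * 1000 / rate_kbps ))
}

SYMBOL_PERIOD_US=250    # 1 / 4 kHz symbol rate

for rate in 512 1024 2600; do
    t=$(cell_time_us "${rate}")
    if [ "${t}" -ge "${SYMBOL_PERIOD_US}" ]; then
        echo "${rate} kbps: ${t} us per cell (step above the ${SYMBOL_PERIOD_US} us quantization)"
    else
        echo "${rate} kbps: ${t} us per cell (step below the ${SYMBOL_PERIOD_US} us quantization)"
    fi
done
```

	At 512 kbps each extra cell adds roughly 0.8 ms, well above the 0.25 ms quantization, while at 2600 kbps the step shrinks to about 0.16 ms, which is why more repetitions are needed there.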


> There was a bit of noise in the data from other traffic in the house, but the stair-step shape of the plot was unmistakeable and the octave script had no trouble identifying the per-packet overhead.

	Great to hear that it worked out.

Best Regards
	Sebastian

> 
> 2014-09-21 14:35 GMT-04:00 Sebastian Moeller <moeller0@gmx.de>:
> Hi Dave, hi Andy,
> 
> 
> 
> On Sep 20, 2014, at 19:55 , Dave Taht <dave.taht@gmail.com> wrote:
> 
> > We'd had a very long thread on cerowrt-devel and in the end sebastian
> > (I think) had developed some scripts to exhaustively (it took hours)
> > derive the right encapsulation frame size on a link. I can't find the
> > relevant link right now, ccing that list…
> 
>         I am certainly not the first to have looked at ATM encapsulation effects on DSL links; e.g. Jesper Dangaard Brouer wrote a thesis about this topic (see http://www.adsl-optimizer.dk), and together with Russell Stuart (http://ace-host.stuart.id.au/russell/files/tc/tc-atm/) I believe they taught the Linux kernel how to account for encapsulation. What you need to tell the kernel is whether or not you have ATM encapsulation (ATM is weird in that each IP packet gets chopped into 48 byte cells, with the last partially full cell padded) and the per packet overhead on your link. You can get this information from your ISP and/or from the DSL modem’s information page, but neither is guaranteed to be available or useful. So I set out to deduce this information empirically from measurements on my own link. I naively started out using ICMP echo requests as probes (as I could easily generate probe packets of different sizes with the linux/macosx ping binary); as it turned out, this works well enough, at least for relatively slow ADSL links. So ping_sweeper6.sh (attached) is the program I use (on an otherwise idle link, typically overnight) to collect ~1000 repetitions of time stamped ping packets spanning two (potential) ATM cells. I then use tc_stab_parameter_guide.m (a matlab/octave program) to read in the output of the ping_sweeper script and process the data. In short, if the link runs ATM encapsulation the plot of the data needs to look like a stair with 48 byte step width; if it is just smoothly increasing, the carrier is not ATM. For ATM links, and only ATM links, the script also tries to figure out the per packet overhead, which has always worked well for me.
> (My home link recently got a silent upgrade where the encapsulation overhead changed from 40 bytes to 44 bytes (probably due to the introduction of VLAN tags), which caused some disturbances in link capacity measurements I was running at the time; so I ran my code again and, lo and behold, the overhead had increased. After taking the real overhead into account the disturbances went away, but I guess I digress ;) )
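
	To make the 48 byte stair concrete, here is a small sketch of the cell arithmetic. It assumes the per packet overhead is simply added to the IP packet size before the result is cut into 48 byte cell payloads; the function names atm_cells and atm_wire_bytes are invented for this example:

```shell
#!/bin/bash
# Number of ATM cells needed for an IP packet of ip_size bytes plus the
# per packet overhead; the last cell is padded up to the full 48 bytes.
# atm_cells and atm_wire_bytes are hypothetical helpers for illustration.
atm_cells() {
    local ip_size=$1 overhead=$2
    # integer ceiling of (ip_size + overhead) / 48
    echo $(( (ip_size + overhead + 47) / 48 ))
}

# Bytes on the wire: every cell carries 48 payload bytes plus a 5 byte header.
atm_wire_bytes() {
    echo $(( $(atm_cells "$1" "$2") * 53 ))
}

# With 40 bytes of overhead, IP sizes 56 and 57 sit on either side of a step.
for size in 56 57 1492; do
    echo "${size} bytes IP + 40 overhead: $(atm_cells "${size}" 40) cells, $(atm_wire_bytes "${size}" 40) bytes on the wire"
done
```

	One extra byte (56 to 57) costs a whole additional cell, which is the step the ping sweep looks for; where the steps fall relative to the packet size is what reveals the overhead.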
> 
> 
> Best Regards
>         Sebastian
> 
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Cerowrt-devel] Correctly calculating overheads on unknown connections
  2014-09-21 18:35 [Cerowrt-devel] Correctly calculating overheads on unknown connections Sebastian Moeller
  2014-09-21 21:40 ` Alan Goodman
  2014-09-22  9:05 ` Sebastian Moeller
@ 2014-09-22 10:01 ` Andy Furniss
  2014-09-22 10:20 ` Sebastian Moeller
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andy Furniss @ 2014-09-22 10:01 UTC (permalink / raw)
  To: lartc

Sebastian Moeller wrote:
> Hi Dave, hi Andy,
>
>
>
>
> On Sep 20, 2014, at 19:55 , Dave Taht <dave.taht@gmail.com> wrote:
>
>> We'd had a very long thread on cerowrt-devel and in the end
>> sebastian (I think) had developed some scripts to exaustively (it
>> took hours) derive the right encapsulation frame size on a link. I
>> can't find the relevant link right now, ccing that list…
>
> I am certainly not the first to have looked at ATM encapsulation
> effects on DSL-links, e.g. Jesper Dangaard Brouer wrote a thesis
> about this topic (see http://www.adsl-optimizer.dk) and together with
> Russel Stuart (http://ace-host.stuart.id.au/russell/files/tc/tc-atm/)
> I believe they taught the linux kernel about how to account for
> encapsulation. What you need to tell the kernel is whether or not you
> have ATM encapsulation (ATM is weird in that each ip Packet gets
> chopped into 48 byte cells, with the last partially full cell padded)
> and the per packet overhead on your link. You can either get this
> information from your ISP and/or from the DSL-modem’s information
> page, but both are not guaranteed to be available/useful. So I set
> out to empirically deduce this information from measurements on my
> own link. I naively started out with using ICMP echo requests as
> probes (as I easily could generate probe packets with different sizes
> with the linux/macosx ping binary), as it turned out, this works well
> enough, at least for relatively slow ADSL-links. So ping_sweeper6.sh
> (attached) is the program I use (on an otherwise idle link, typically
> over night) to collect ~1000 repetitions of time stamped ping packets
> spanning two (potential) ATM cells. I then use
> tc_stab_parameter_guide.m (a matlab/octave program) to read in the
> output of the ping_sweeper script and process the data. In short if
> the link runs ATM encapsulation the plot of the data needs to look
> like a stair with 48 byte step width, if it is just smoothly
> increasing the carrier is not ATM. For ATM links and only ATM links,
> the script also tries to figure out the per packet overhead which
> always worked well for me. (My home-link got recently a silent
> upgrade where the encapsulation changed from 40 bytes to 44 bytes
> (probably due to the introduction of VLAN tags), which caused some
> disturbances in link capacity measurements I was running at the time;
> so I ran my code again and lo and behold the overhead had increased,
> which caused the issues with the measurements, as after taking the
> real overhead into account the disturbances went away, but I guess I
> digress ;) )

Sounds like a handy script, though I am not so sure it would help for
vdsl 64/65 (if that is actually used!). I don't think there is any
padding (but may be wrong!).

As for the history, yes, Jesper got his stuff in, but it didn't allow
negative overheads, so I still had to patch tc to work around that.

Before his work there was some user space code by (IIRC) Dan Singletary,
which I used for a while. Later Ed Wildgoose analysed the kernel code
and posted patches for htb and tc on the original lartc list, which I
used for some time before Jesper's code got in.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Cerowrt-devel] Correctly calculating overheads on unknown connections
  2014-09-21 18:35 [Cerowrt-devel] Correctly calculating overheads on unknown connections Sebastian Moeller
                   ` (2 preceding siblings ...)
  2014-09-22 10:01 ` Andy Furniss
@ 2014-09-22 10:20 ` Sebastian Moeller
  2014-09-22 13:09 ` Alan Goodman
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Sebastian Moeller @ 2014-09-22 10:20 UTC (permalink / raw)
  To: lartc

Hi Andy,


On Sep 22, 2014, at 12:01 , Andy Furniss <adf.lists@gmail.com> wrote:

> Sebastian Moeller wrote:
>> Hi Dave, hi Andy,
>> 
>> 
>> 
>> 
>> On Sep 20, 2014, at 19:55 , Dave Taht <dave.taht@gmail.com> wrote:
>> 
>>> We'd had a very long thread on cerowrt-devel and in the end
>>> sebastian (I think) had developed some scripts to exaustively (it
>>> took hours) derive the right encapsulation frame size on a link. I
>>> can't find the relevant link right now, ccing that list…
>> 
>> I am certainly not the first to have looked at ATM encapsulation
>> effects on DSL-links, e.g. Jesper Dangaard Brouer wrote a thesis
>> about this topic (see http://www.adsl-optimizer.dk) and together with
>> Russel Stuart (http://ace-host.stuart.id.au/russell/files/tc/tc-atm/)

	One note about Russell’s handy list of ADSL overheads: these do not include VLAN tags, so all the combinations shown can be 4 bytes larger if a VLAN tag is added, as is quite common nowadays for double- and triple-play connections. Fun fact: for a rather run-of-the-mill encapsulation, LLC/SNAP over AAL5 with VLAN tags, we now have three independent methods to multiplex different “connections” over the line (and that does not count PPPoE), all provisioned out of the bandwidth the end user pays for, and in the best case only one of them is functional, but I digress...


>> I believe they taught the linux kernel about how to account for
>> encapsulation. What you need to tell the kernel is whether or not you
>> have ATM encapsulation (ATM is weird in that each ip Packet gets
>> chopped into 48 byte cells, with the last partially full cell padded)
>> and the per packet overhead on your link. You can either get this
>> information from your ISP and/or from the DSL-modem’s information
>> page, but both are not guaranteed to be available/useful. So I set
>> out to empirically deduce this information from measurements on my
>> own link. I naively started out with using ICMP echo requests as
>> probes (as I easily could generate probe packets with different sizes
>> with the linux/macosx ping binary), as it turned out, this works well
>> enough, at least for relatively slow ADSL-links. So ping_sweeper6.sh
>> (attached) is the program I use (on an otherwise idle link, typically
>> over night) to collect ~1000 repetitions of time stamped ping packets
>> spanning two (potential) ATM cells. I then use
>> tc_stab_parameter_guide.m (a matlab/octave program) to read in the
>> output of the ping_sweeper script and process the data. In short if
>> the link runs ATM encapsulation the plot of the data needs to look
>> like a stair with 48 byte step width, if it is just smoothly
>> increasing the carrier is not ATM. For ATM links and only ATM links,
>> the script also tries to figure out the per packet overhead which
>> always worked well for me. (My home-link got recently a silent
>> upgrade where the encapsulation changed from 40 bytes to 44 bytes
>> (probably due to the introduction of VLAN tags), which caused some
>> disturbances in link capacity measurements I was running at the time;
>> so I ran my code again and lo and behold the overhead had increased,
>> which caused the issues with the measurements, as after taking the
>> real overhead into account the disturbances went away, but I guess I
>> digress ;) )
> 
> Sounds like a handy script, though I am not so sure it would help for
> vdsl 64/65 (if that is actually used!).

	No, currently my script will tell you whether you have ATM cell encapsulation on your link or not (as far as I know VDSL2 means PTM (64/65), ADSL[1,2,2+] means ATM, not sure about VDSL1, but I think neither is VDSL2 prohibited from using ATM nor is ADSL stuck on ATM). If, and only if ATM is used will the script help to deduce the per packet overhead. I am still waiting for the upgrade to VDSL2 on my home link, once I have that available I will see whether I can figure out information about the per packet overhead or not; all I know is that my current approach will not work, because it relies on the ATM quantization.
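	To make the ATM quantization concrete, here is a minimal sketch of the cell arithmetic the measurement relies on. It assumes the usual ATM framing of 48 payload bytes per 53-byte cell; the function names atm_cells and atm_wire_bytes are made up for illustration and are not part of ping_sweeper6.sh or tc.

```shell
#!/bin/sh
# Number of ATM cells needed for one packet.
# $1 = packet size in bytes as seen by the shaper
# $2 = per-packet overhead in bytes (assumed to be known)
atm_cells() {
    # Integer ceiling of (size + overhead) / 48; the last cell is padded.
    echo $(( ($1 + $2 + 47) / 48 ))
}

# Bytes actually occupied on the ATM link: every cell costs 53 bytes.
atm_wire_bytes() {
    echo $(( $(atm_cells "$1" "$2") * 53 ))
}

atm_cells 1496 40        # 32 cells
atm_cells 1496 44        # 33 cells: 4 more bytes of overhead spill into a new cell
atm_wire_bytes 1496 44   # 1749 bytes on the wire
```

	The 40 versus 44 byte case mirrors the overhead change described in the quoted text: a packet that used to fit exactly into 32 cells suddenly needs 33, which is the kind of step the stair plot makes visible.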


> I don't think there is any
> padding (but may be wrong!).

	No, according to the standard the 64/65 encapsulation is “continuous”, so it is not padded out or reset for each packet.
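	As a rough illustration of what “continuous” 64/65 means for shaping, the sketch below treats it as one extra byte per 64 bytes carried, with no per-packet padding. This is an approximation: rounding up per packet is only for convenience, the function names are hypothetical, and the 28 byte overhead used in the example is just the sum of the tentative list from earlier in the thread (8 PPPoE + 14 ethernet + 4 VLAN + 2 CRC), not a verified value.

```shell
#!/bin/sh
# Approximate on-the-wire size of one packet under 64/65 encoding.
# $1 = packet size in bytes
# $2 = fixed per-packet overhead in bytes (an assumption, see above)
ptm_wire_bytes() {
    # One extra byte per 64 bytes carried, rounded up per packet for simplicity.
    echo $(( (($1 + $2) * 65 + 63) / 64 ))
}

# Shaper rate in kbit/s that leaves room for the 64/65 expansion.
# $1 = sync rate in kbit/s
ptm_payload_rate_kbit() {
    echo $(( $1 * 64 / 65 ))
}

ptm_wire_bytes 1500 28       # 1552
ptm_payload_rate_kbit 20000  # 19692
```

	Since tc's stab has no 64/65 mode, scaling the configured rate by 64/65 and handling the fixed overhead separately is one plausible way to account for it; whether that matches a given line still needs to be measured.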

> 
> As for the history, Yea Jesper got his stuff in - but didn't allow
> negative overheads so I still used to have to patch tc to workaround that.

	True, but the “stab” work for tc got this right. Also note that the stab option does not automatically include the known overhead to the packet as indicated by the outdated man-page, so that the ability to specify negative overheads is basically not needed or useful. And yes the kernel needs to be fixed, one of these days. Speaking of kernel code, Jesper “recently” hoisted HTB’s link layer adjustment method into the present (getting basically rid of the tables and allowing for GRO packets), something that stab also needs to have fixed...
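	For reference, a stab invocation could look roughly like the sketch below. It only prints the tc command so it can be reviewed before being run as root. The helper name stab_qdisc_cmd, the device ppp0, the handle, the hfsc default class and the overhead value are all placeholders; only the stab keywords overhead and linklayer are taken from tc-stab(8).

```shell
#!/bin/sh
# Print (do not execute) a tc command that attaches a size table to the root qdisc.
# $1 = device, $2 = per-packet overhead in bytes, $3 = linklayer (atm or ethernet)
stab_qdisc_cmd() {
    echo "tc qdisc add dev $1 root handle 1: stab overhead $2 linklayer $3 hfsc default 1"
}

# ADSL with ATM cells and a measured overhead of 44 bytes (example value):
stab_qdisc_cmd ppp0 44 atm
# A non-ATM link would use linklayer ethernet with its own overhead value:
stab_qdisc_cmd ppp0 28 ethernet
```

	Printing the command first is a deliberate choice here: the right overhead depends on where in the stack the qdisc sits, so it is worth checking the numbers before applying them.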

> 
> Before his work there was some user space code by IIRC Dan Singletary
> which I used for a while and later Ed Wildgoose analysed the kernel code
> and posted patches for htb and tc on the original lartc list which I
> used for some time before Jespers code got in.

	Interesting piece of history; all that happened before I cared, heck, even Jesper’s thesis was out by then.


Best Regards
	Sebastian

> 
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Cerowrt-devel] Correctly calculating overheads on unknown connections
  2014-09-21 18:35 [Cerowrt-devel] Correctly calculating overheads on unknown connections Sebastian Moeller
                   ` (3 preceding siblings ...)
  2014-09-22 10:20 ` Sebastian Moeller
@ 2014-09-22 13:09 ` Alan Goodman
  2014-09-22 19:52 ` Sebastian Moeller
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Alan Goodman @ 2014-09-22 13:09 UTC (permalink / raw)
  To: lartc

Hello all once again,

I tried running the attached ping sweeper yesterday evening as is and 
didn't get particularly plausible looking results.  I therefore decided 
to increase the upper limit of the size of ping packets sent and let the 
script run overnight while the connection was quiet.

Here is a screen shot of the resulting graph which does appear to have a 
stepped appearance, but perhaps not as expected?
http://imgur.com/RjmT8Qh

This test was run on a BT Infinity VDSL/FTTC connection with the modem 
plugged directly into a CentOS 6 machine which is doing PPPoE.  The 
connection is synced at 80mbit down and 20mbit up.  BT restrict 
downstream speed to 77.44Mbps IP traffic.

I can run the test on a slower BT connection over the weekend if anyone 
is interested in the results?

Alan

On 21/09/14 19:35, Sebastian Moeller wrote:
> Hi Dave, hi Andy,
>
>
>
>
> On Sep 20, 2014, at 19:55 , Dave Taht <dave.taht@gmail.com> wrote:
>
>> We'd had a very long thread on cerowrt-devel and in the end sebastian
>> (I think) had developed some scripts to exaustively (it took hours)
>> derive the right encapsulation frame size on a link. I can't find the
>> relevant link right now, ccing that list…
>
> 	I am certainly not the first to have looked at ATM encapsulation effects on DSL-links, e.g. Jesper Dangaard Brouer wrote a thesis about this topic (see http://www.adsl-optimizer.dk) and together with Russel Stuart (http://ace-host.stuart.id.au/russell/files/tc/tc-atm/)  I believe they taught the linux kernel about how to account for encapsulation. What you need to tell the kernel is whether or not you have ATM encapsulation (ATM is weird in that each ip Packet gets chopped into 48 byte cells, with the last partially full cell padded) and the per packet overhead on your link. You can either get this information from your ISP and/or from the DSL-modem’s information page, but both are not guaranteed to be available/useful. So I set out to empirically deduce this information from measurements on my own link. I naively started out with using ICMP echo requests as probes (as I easily could generate probe packets with different sizes with the linux/macosx ping binary), as it turned out, this works well enough, at least for relatively slow ADSL-links. So ping_sweeper6.sh (attached) is the program I use (on an otherwise idle link, typically over night) to collect ~1000 repetitions of time stamped ping packets spanning two (potential) ATM cells. I then use tc_stab_parameter_guide.m (a matlab/octave program) to read in the output of the ping_sweeper script and process the data. In short if the link runs ATM encapsulation the plot of the data needs to look like a stair with 48 byte step width, if it is just smoothly increasing the carrier is not ATM. For ATM links and only ATM links, the script also tries to figure out the per packet overhead which always worked well for me. (My home-link got recently a silent upgrade where the encapsulation changed from 40 bytes to 44 bytes (probably due to the introduction of VLAN tags), which caused some disturbances in link capacity measurements I was running at the time; so I ran my code again and lo and behold the overhead had increased, which caused the issues with the measurements, as after taking the real overhead into account the disturbances went away, but I guess I digress ;) )
>
>
>
>
> Best Regards
> 	Sebastian
>
>
>>
>> On Sat, Sep 20, 2014 at 7:17 PM, Andy Furniss <adf.lists@gmail.com> wrote:
>>> Alan Goodman wrote:
>>>>
>>>> Hi,
>>>>
>>>> I am looking to figure out the most fool proof way to calculate stab
>>>> overheads for ADSL/VDSL connections.
>>>>
>>>> ppp0      Link encap:Point-to-Point Protocol inet addr:81.149.38.69
>>>> P-t-P:81.139.160.1 Mask:255.255.255.255 UP POINTOPOINT RUNNING NOARP
>>>> MULTICAST  MTU:1492  Metric:1 RX packets:17368223 errors:0 dropped:0
>>>> overruns:0 frame:0 TX packets:12040295 errors:0 dropped:0 overruns:0
>>>> carrier:0 collisions:0 txqueuelen:100 RX bytes:17420109286 (16.2 GiB)
>>>> TX bytes:3611007028 (3.3 GiB)
>>>>
>>>> I am setting a longer txqueuelen as I am not currently using any fair
>>>> queuing (buffer bloat issues with sfq)
>>>
>>>
>>> Whatever is txqlen is on ppp there is likely some other buffer after it
>>> - the default can hurt with eg, htb as if you don't add qdiscs to
>>> classes it takes (last time I looked) its qlen from that.
>>>
>>> Sfq was only ever meant for bulk, so should really be in addition to
>>> some classification to separate interactive - I don't really get the
>>
>> Hmm? sfq separates bulk from interactive pretty nicely. It tends to do
>> bad things to bulk as it doesn't manage queue length.
>>
>> A little bit of prioritization or deprioritization for some traffic is
>> helpful, but most traffic is hard to classify.
>>
>>> bufferbloat bit, you could make the default 128 limit lower if you wanted.
>>
>> htb + fq_codel, if available, is the right thing here....
>>
>> http://www.bufferbloat.net/projects/cerowrt/wiki/Wondershaper_Must_Die
>>
>>>> The connection is a BT Infinity FTTC VDSL connection synced at
>>>> 80mbit/20mbit.  The modem is connected directly to the ethernet port
>>>> on a server running a slightly tweaked HFSC setup that you folks
>>>> helped me set up in July - back when I was on ADSL.  I am still
>>>> running pppoe I believe from my server.
>>>
>>>
>>> I have similar since May 2013 and I still haven't got round to reading
>>> up on everything yet :-)
>>>
>>> I have extra geek score for using mini jumbos = running pppoe with mtu
>>> 1500 which works for me on plusnet. You need a recent pppd for this and
>>> a nic that works with mtu >= 1508.
>>>
>>> As for overheads, initial searching indicated that it's not easy or
>>> maybe even truly possible like adsl.
>>>
>>>> The largest ping packet that I can fit out onto the wire is 1464
>>>> bytes:
>>>>
>>>> # ping -c 2 -s 1464 -M do google.com PING google.com (31.55.166.216)
>>>> 1464(1492) bytes of data. 1472 bytes from 31.55.166.216: icmp_seq=1
>>>> ttl=58 time=11.7 ms 1472 bytes from 31.55.166.216: icmp_seq=2 ttl=58
>>>> time=11.9 ms
>>>>
>>>> # ping -c 2 -s 1465 -M do google.com PING google.com (31.55.166.212)
>>>> 1465(1493) bytes of data. From
>>>> host81-149-38-69.in-addr.btopenworld.com (81.149.38.69) icmp_seq=1
>>>> Frag needed and DF set (mtu = 1492) From
>>>> host81-149-38-69.in-addr.btopenworld.com (81.149.38.69) icmp_seq=1
>>>> Frag needed and DF set (mtu = 1492)
>>>
>>>
>>> You can't work out your overheads like this.
>>>
>>> On slow uplink adsl it was possible with ping to infer the fixed part
>>> but you needed to send loads of pings increasing in size and plot the
>>> best time for each to make a stepped graph.
>>>
>>>
>>>> Based on this I believe overhead should be set to 28, however with 28
>>>> set as my overhead and hfsc ls m2 20000kbit ul m2 20000kbit I seem
>>>> to be loosing about 1.5mbit of upload...
>>>
>>>
>>> Even if you could do things perfectly I would back off a few kbit just
>>> to be safe. Timers may be different or there may be OAM/Reporting data
>>> going up, albeit rarely.
>>>
>>>>
>>>> No traffic manager enabled:
>>>>
>>>> http://www.thinkbroadband.com/speedtest/results.html?id=141116089424883990118
>>>>
>>>>
>>>> HFSC traffic manager:
>>>>
>>>> http://www.thinkbroadband.com/speedtest/results.html?id=141116216621093133034
>>>>
>>>>
>>>>
>>>> Am I calculating overhead incorrectly?
>>>
>>>
>>> VDSL doesn't use ATM I think the PTM it uses is 64/65 - so don't specify
>>> atm with stab. Unfortunately stab doesn't do 64/65.
>>>
>>> As for the fixed part - I am not sure, but roughly starting with IP as
>>> that's what tc sees on ppp (as opposed to ip + 14 on eth)
>>>
>>> IP
>>> +8 for PPPOE
>>> +14 for ethertype and macs
>>> +4 because Openreach modem uses vlan
>>> +2 CRC ??
>>> + "a few" 64/65
>>>
>>> That's it for fixed - of course 64/65 adds another one for every 64 TBH
>>> I didn't get the precice detail from the spec and not having looked
>>> recently I can't remember.
>>>
>>> BT Sin 498 does give some of this info and a couple of examples of
>>> throughput for different frame sizes - but it's rounded to kbit which
>>> means I couldn't work out to the byte what the overheads were.
>>>
>>> Worse still VDSL can use link layer retransmits and the sin says that
>>> though currently (2013) not enabled, they would be in due course. I have
>>> no clue how these work.
>>>
>>>
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe lartc" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
>>
>> --
>> Dave Täht
>>
>> https://www.bufferbloat.net/projects/make-wifi-fast
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Cerowrt-devel] Correctly calculating overheads on unknown connections
  2014-09-21 18:35 [Cerowrt-devel] Correctly calculating overheads on unknown connections Sebastian Moeller
                   ` (4 preceding siblings ...)
  2014-09-22 13:09 ` Alan Goodman
@ 2014-09-22 19:52 ` Sebastian Moeller
  2014-09-22 23:02 ` Alan Goodman
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Sebastian Moeller @ 2014-09-22 19:52 UTC (permalink / raw)
  To: lartc

[-- Attachment #1: Type: text/plain, Size: 1535 bytes --]

Hi Alan,


On Sep 22, 2014, at 15:09 , Alan Goodman <notifications@yescomputersolutions.com> wrote:

> Hello all once again,
> 
> I tried running the attached ping sweeper yesterday evening as is and didnt get particularly plausible looking results.  

	I concur, that does not look like ATM. I somehow like how I hedged my estimate of the ATM quantization by reporting it as likely if the residuals of the stair fit are smaller than the residuals of the linear fit ;) But that clearly is not an ATM carrier...

> I therefore decided to increase the upper limit of the size of ping packets sent and let the script run over night while the connection was quiet.

	Not a bad idea, I guess, but in this case the simplistic heuristic of just comparing cumulative residuals is not good enough. (Note, though, that I do not have enough data sets to find a better statistical test.)
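	For readers who want to see the residual comparison spelled out, here is a small sketch of that heuristic. It is not the code from tc_stab_parameter_guide.m; it is a simplified stand-in that assumes the input is one "size rtt" pair per line (the best RTT seen for each probe size) and uses a hypothetical helper name, classify_carrier. It fits a straight line and, for each of the 48 possible offsets, a stair with 48 byte steps, then compares the summed squared residuals.

```shell
#!/bin/sh
# Read "size rtt" pairs on stdin and report whether a 48-byte stair
# or a plain line fits better (by sum of squared residuals).
classify_carrier() {
    awk '
    # Least-squares fit of y = a + b*x; returns the sum of squared residuals.
    function ssr(x, y, n,    i, sx, sy, sxx, sxy, d, a, b, r, e) {
        for (i = 1; i <= n; i++) {
            sx += x[i]; sy += y[i]; sxx += x[i] * x[i]; sxy += x[i] * y[i]
        }
        d = n * sxx - sx * sx
        b = (d == 0) ? 0 : (n * sxy - sx * sy) / d
        a = (sy - b * sx) / n
        r = 0
        for (i = 1; i <= n; i++) { e = y[i] - (a + b * x[i]); r += e * e }
        return r
    }
    { n++; size[n] = $1; rtt[n] = $2 }
    END {
        lin = ssr(size, rtt, n)
        best = -1
        # Try every alignment of the 48-byte cell grid relative to the size axis.
        for (off = 0; off < 48; off++) {
            for (i = 1; i <= n; i++) cells[i] = int((size[i] + off + 47) / 48)
            s = ssr(cells, rtt, n)
            if (best < 0 || s < best) { best = s; bestoff = off }
        }
        if (best < lin) printf "stair offset_mod48=%d\n", bestoff
        else print "linear"
    }'
}
```

	The reported offset is only the alignment of the steps modulo 48 relative to whatever the size column counts, so it still has to be translated into a per-packet overhead. Because the stair model gets 48 tries, it tends to win on noisy data, which is the weakness of the plain residual comparison mentioned above.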

> 
> Here is a screen shot of the resulting graph which does appear to have a stepped appearance, but perhaps not as expected?
> http://imgur.com/RjmT8Qh

	No, that is not the result to expect from an ATM carrier. Attached you will find example plots from a real ATM quantized link (2558 Kbps upload, 16402 Kbps download); notice how well the red line follows the green stair function in f2? Your example basically shows no stair function in the data, but for FTTC or VDSL2 that is to be expected, as they finally got rid of the ATM carrier (which had overstayed its welcome once the telco backbones switched away from ATM as well...)


[-- Attachment #2: ATM_quantisation_example_f1.tif --]
[-- Type: image/tif, Size: 31917 bytes --]

[-- Attachment #3: ATM_quantisation_example_f2.tif --]
[-- Type: image/tif, Size: 30999 bytes --]

[-- Attachment #4: Type: text/plain, Size: 10052 bytes --]


> 
> This test was ran on a BT Infinity VDSL/FTTC connection with the modem plugged directly into a CentOS 6 machine which is doing PPPoE.  The connection is synced at 80mbit down and 20mbit up.  BT restrict downstream speed to 77.44Mbps IP traffic.

	Thank you very much, this is the first data set on a VDSL line I have seen, and my hypothesis that overhead detection on PTM carriers will not work with the current code is nicely demonstrated. I need to ponder this a bit more, and I might not be able to find a nice solution for those links...


> 
> I can run the test on a slower BT connection over the week end if anyone is interested in the results?

	I would love to see that, especially if the other connection is much slower, as I see two possible issues with this data set:

1) Speed: It might be that your line is fast enough to hide the ATM quantization below another quantization (like the 4 kHz symbol rate of the individual carriers) or too many concurrent carriers ;)

2) ICMP slowpathing: ATM or no ATM, the RTT should increase with increasing packet size, something not really visible in your raw data plot on the right. This could be caused by either your ISP rate limiting your VDSL connection (which is something Dan Siemon encountered before, see: http://www.coverfire.com/archives/2012/12/05/per-packet-overhead-on-vdsl-3/ ) or by limits in the upstream ICMP host (if you are so inclined you could test pinging a different host, like say gstatic.com)
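	To put the speed point into numbers, the following sketch estimates how much the RTT should grow when an echo request and its reply each need a given number of extra bytes on the wire. It only accounts for serialization delay on the access link and assumes the probe travels once up and once down; the function name step_us is made up for illustration.

```shell
#!/bin/sh
# Extra round-trip time in microseconds for N more bytes in each direction.
# $1 = upload rate in kbit/s, $2 = download rate in kbit/s, $3 = extra bytes
step_us() {
    awk -v up="$1" -v down="$2" -v bytes="$3" \
        'BEGIN { printf "%.1f\n", bytes * 8 * 1000 / up + bytes * 8 * 1000 / down }'
}

step_us 2558 16402 53    # 191.6 (one extra ATM cell on the example ADSL link)
step_us 20000 80000 53   # 26.5  (the same 53 bytes at 20/80 Mbit/s sync rates)
```

	On the slow ADSL link one extra cell is worth roughly 0.2 ms of RTT, while at VDSL rates the same number of bytes changes the RTT by only a few tens of microseconds, so any step would be much harder to see above the measurement noise.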

Best Regards
	Sebastian

> 
> Alan
> 
> On 21/09/14 19:35, Sebastian Moeller wrote:
>> Hi Dave, hi Andy,
>> 
>> 
>> 
>> 
>> On Sep 20, 2014, at 19:55 , Dave Taht <dave.taht@gmail.com> wrote:
>> 
>>> We'd had a very long thread on cerowrt-devel and in the end sebastian
>>> (I think) had developed some scripts to exaustively (it took hours)
>>> derive the right encapsulation frame size on a link. I can't find the
>>> relevant link right now, ccing that list…
>> 
>> 	I am certainly not the first to have looked at ATM encapsulation effects on DSL-links, e.g. Jesper Dangaard Brouer wrote a thesis about this topic (see http://www.adsl-optimizer.dk) and together with Russel Stuart (http://ace-host.stuart.id.au/russell/files/tc/tc-atm/)  I believe they taught the linux kernel about how to account for encapsulation. What you need to tell the kernel is whether or not you have ATM encapsulation (ATM is weird in that each ip Packet gets chopped into 48 byte cells, with the last partially full cell padded) and the per packet overhead on your link. You can either get this information from your ISP and/or from the DSL-modem’s information page, but both are not guaranteed to be available/useful. So I set out to empirically deduce this information from measurements on my own link. I naively started out with using ICMP echo requests as probes (as I easily could generate probe packets with different sizes with the linux/macosx ping binary), as it turned out, this works well enough, at least for relatively slow ADSL-links. So ping_sweeper6.sh (attached) is the program I use (on an otherwise idle link, typically over night) to collect ~1000 repetitions of time stamped ping packets spanning two (potential) ATM cells. I then use tc_stab_parameter_guide.m (a matlab/octave program) to read in the output of the ping_sweeper script and process the data. In short if the link runs ATM encapsulation the plot of the data needs to look like a stair with 48 byte step width, if it is just smoothly increasing the carrier is not ATM. For ATM links and only ATM links, the script also tries to figure out the per packet overhead which always worked well for me. (My home-link got recently a silent upgrade where the encapsulation changed from 40 bytes to 44 bytes (probably due to the introduction of VLAN tags), which caused some disturbances in link capacity measurements I was running at the time; so I ran my code again and lo and behold the overhead had increased, which caused the issues with the measurements, as after taking the real overhead into account the disturbances went away, but I guess I digress ;) )
>> 
>> 
>> 
>> 
>> Best Regards
>> 	Sebastian
>> 
>> 
>>> 
>>> On Sat, Sep 20, 2014 at 7:17 PM, Andy Furniss <adf.lists@gmail.com> wrote:
>>>> Alan Goodman wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> I am looking to figure out the most fool proof way to calculate stab
>>>>> overheads for ADSL/VDSL connections.
>>>>> 
>>>>> ppp0      Link encap:Point-to-Point Protocol inet addr:81.149.38.69
>>>>> P-t-P:81.139.160.1 Mask:255.255.255.255 UP POINTOPOINT RUNNING NOARP
>>>>> MULTICAST  MTU:1492  Metric:1 RX packets:17368223 errors:0 dropped:0
>>>>> overruns:0 frame:0 TX packets:12040295 errors:0 dropped:0 overruns:0
>>>>> carrier:0 collisions:0 txqueuelen:100 RX bytes:17420109286 (16.2 GiB)
>>>>> TX bytes:3611007028 (3.3 GiB)
>>>>> 
>>>>> I am setting a longer txqueuelen as I am not currently using any fair
>>>>> queuing (buffer bloat issues with sfq)
>>>> 
>>>> 
>>>> Whatever is txqlen is on ppp there is likely some other buffer after it
>>>> - the default can hurt with eg, htb as if you don't add qdiscs to
>>>> classes it takes (last time I looked) its qlen from that.
>>>> 
>>>> Sfq was only ever meant for bulk, so should really be in addition to
>>>> some classification to separate interactive - I don't really get the
>>> 
>>> Hmm? sfq separates bulk from interactive pretty nicely. It tends to do
>>> bad things to bulk as it doesn't manage queue length.
>>> 
>>> A little bit of prioritization or deprioritization for some traffic is
>>> helpful, but most traffic is hard to classify.
>>> 
>>>> bufferbloat bit, you could make the default 128 limit lower if you wanted.
>>> 
>>> htb + fq_codel, if available, is the right thing here....
>>> 
>>> http://www.bufferbloat.net/projects/cerowrt/wiki/Wondershaper_Must_Die
>>> 
>>>>> The connection is a BT Infinity FTTC VDSL connection synced at
>>>>> 80mbit/20mbit.  The modem is connected directly to the ethernet port
>>>>> on a server running a slightly tweaked HFSC setup that you folks
>>>>> helped me set up in July - back when I was on ADSL.  I am still
>>>>> running pppoe I believe from my server.
>>>> 
>>>> 
>>>> I have similar since May 2013 and I still haven't got round to reading
>>>> up on everything yet :-)
>>>> 
>>>> I have extra geek score for using mini jumbos = running pppoe with mtu
>>>> 1500 which works for me on plusnet. You need a recent pppd for this and
>>>> a nic that works with mtu >= 1508.
>>>> 
>>>> As for overheads, initial searching indicated that it's not easy or
>>>> maybe even truly possible like adsl.
>>>> 
>>>>> The largest ping packet that I can fit out onto the wire is 1464
>>>>> bytes:
>>>>> 
>>>>> # ping -c 2 -s 1464 -M do google.com PING google.com (31.55.166.216)
>>>>> 1464(1492) bytes of data. 1472 bytes from 31.55.166.216: icmp_seq=1
>>>>> ttl=58 time=11.7 ms 1472 bytes from 31.55.166.216: icmp_seq=2 ttl=58
>>>>> time=11.9 ms
>>>>> 
>>>>> # ping -c 2 -s 1465 -M do google.com PING google.com (31.55.166.212)
>>>>> 1465(1493) bytes of data. From
>>>>> host81-149-38-69.in-addr.btopenworld.com (81.149.38.69) icmp_seq=1
>>>>> Frag needed and DF set (mtu = 1492) From
>>>>> host81-149-38-69.in-addr.btopenworld.com (81.149.38.69) icmp_seq=1
>>>>> Frag needed and DF set (mtu = 1492)
>>>> 
>>>> 
>>>> You can't work out your overheads like this.
>>>> 
>>>> On slow uplink adsl it was possible with ping to infer the fixed part
>>>> but you needed to send loads of pings increasing in size and plot the
>>>> best time for each to make a stepped graph.
>>>> 
>>>> 
>>>>> Based on this I believe overhead should be set to 28, however with 28
>>>>> set as my overhead and hfsc ls m2 20000kbit ul m2 20000kbit I seem
>>>>> to be loosing about 1.5mbit of upload...
>>>> 
>>>> 
>>>> Even if you could do things perfectly I would back off a few kbit just
>>>> to be safe. Timers may be different or there may be OAM/Reporting data
>>>> going up, albeit rarely.
>>>> 
>>>>> 
>>>>> No traffic manager enabled:
>>>>> 
>>>>> http://www.thinkbroadband.com/speedtest/results.html?id=141116089424883990118
>>>>> 
>>>>> 
>>>>> HFSC traffic manager:
>>>>> 
>>>>> http://www.thinkbroadband.com/speedtest/results.html?id=141116216621093133034
>>>>> 
>>>>> 
>>>>> 
>>>>> Am I calculating overhead incorrectly?
>>>> 
>>>> 
>>>> VDSL doesn't use ATM I think the PTM it uses is 64/65 - so don't specify
>>>> atm with stab. Unfortunately stab doesn't do 64/65.
>>>> 
>>>> As for the fixed part - I am not sure, but roughly starting with IP as
>>>> that's what tc sees on ppp (as opposed to ip + 14 on eth)
>>>> 
>>>> IP
>>>> +8 for PPPOE
>>>> +14 for ethertype and macs
>>>> +4 because Openreach modem uses vlan
>>>> +2 CRC ??
>>>> + "a few" 64/65
>>>> 
>>>> That's it for fixed - of course 64/65 adds another one for every 64 TBH
>>>> I didn't get the precice detail from the spec and not having looked
>>>> recently I can't remember.
>>>> 
>>>> BT Sin 498 does give some of this info and a couple of examples of
>>>> throughput for different frame sizes - but it's rounded to kbit which
>>>> means I couldn't work out to the byte what the overheads were.
>>>> 
>>>> Worse still VDSL can use link layer retransmits and the sin says that
>>>> though currently (2013) not enabled, they would be in due course. I have
>>>> no clue how these work.
>>>> 
>>>> 
>>>> 
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe lartc" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>> 
>>> 
>>> 
>>> --
>>> Dave Täht
>>> 
>>> https://www.bufferbloat.net/projects/make-wifi-fast
>>> _______________________________________________
>>> Cerowrt-devel mailing list
>>> Cerowrt-devel@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>> 
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Cerowrt-devel] Correctly calculating overheads on unknown connections
  2014-09-21 18:35 [Cerowrt-devel] Correctly calculating overheads on unknown connections Sebastian Moeller
                   ` (5 preceding siblings ...)
  2014-09-22 19:52 ` Sebastian Moeller
@ 2014-09-22 23:02 ` Alan Goodman
  2014-09-23  9:32 ` Sebastian Moeller
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Alan Goodman @ 2014-09-22 23:02 UTC (permalink / raw)
  To: lartc

On 22/09/14 20:52, Sebastian Moeller wrote:
>> >This test was run on a BT Infinity VDSL/FTTC connection with the modem plugged directly into a CentOS 6 machine which is doing PPPoE.  The connection is synced at 80mbit down and 20mbit up.  BT restrict downstream speed to 77.44Mbps IP traffic.
> 	Thank you very much, this is the first data set on a VDSL line I have seen, and it nicely demonstrates my hypothesis that overhead detection on PTM carriers will not work with the current code. I need to ponder this a bit more and I might not be able to find a nice solution for those links...

You're welcome.  If you need any more data feel free to drop me a line.

>>>I can run the test on a slower BT connection over the week end if anyone is interested in the results?
>  I would love to see that especially if the other connection is much slower, as I see two possible issues with this data set:

The other connection is actually ADSL2, we probably know what the 
results there will be...  I think I shall run the test on a really slow 
ADSL connection later in the year to double check my overheads though. 
It seems like a very useful tool.

Also thanks for providing some example plots of how it should look. 
That will allow me to better interpret results in future.

> 1) Speed: It might be that your line is fast enough to hide the ATM quantization below another quantization (like the 4 kHz symbol rate of the individual carriers) or too many concurrent carriers ;)

Would it be useful if I limited my upload speed with say hfsc to 1mbit 
and re-ran the test?

Given the above comments in Sebastian’s very useful emails how would it 
be best to shape these FTTC connections at present?  Without overhead 
set or something else?

Alan


* Re: [Cerowrt-devel] Correctly calculating overheads on unknown connections
  2014-09-21 18:35 [Cerowrt-devel] Correctly calculating overheads on unknown connections Sebastian Moeller
                   ` (6 preceding siblings ...)
  2014-09-22 23:02 ` Alan Goodman
@ 2014-09-23  9:32 ` Sebastian Moeller
  2014-09-23 15:10 ` Andy Furniss
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Sebastian Moeller @ 2014-09-23  9:32 UTC (permalink / raw)
  To: lartc

Hi Alan,


On Sep 23, 2014, at 01:02 , Alan Goodman <notifications@yescomputersolutions.com> wrote:

> On 22/09/14 20:52, Sebastian Moeller wrote:
>>> >This test was run on a BT Infinity VDSL/FTTC connection with the modem plugged directly into a CentOS 6 machine which is doing PPPoE.  The connection is synced at 80mbit down and 20mbit up.  BT restrict downstream speed to 77.44Mbps IP traffic.
>> 	Thank you very much, this is the first data set on a VDSL line I have seen, and it nicely demonstrates my hypothesis that overhead detection on PTM carriers will not work with the current code. I need to ponder this a bit more and I might not be able to find a nice solution for those links...
> 
> You're welcome.  If you need any more data feel free to drop me a line.

	Thanks for the offer, I might take you up on it ;) (next month I hope to upgrade to VDSL2, so I will have an easier time trying new methods...)

> 
>>>> I can run the test on a slower BT connection over the week end if anyone is interested in the results?
>> I would love to see that especially if the other connection is much slower, as I see two possible issues with this data set:
> 
> The other connection is actually ADSL2, we probably know what the results there will be…

	I assume that this will work reasonably well; for all ADSL lines I tested, 1000 samples per ping size and a size range from 16 to 116 bytes worked out well enough to detect quantization and overhead.

>  I think I shall run the test on a really slow ADSL connection later in the year to double check my overheads though.

	I think it is a decent idea to occasionally re-check the encapsulation in use; in my case the ISP added VLAN tags (which I neither need nor want), increasing the overhead from 40 bytes to 44 bytes. If that had not caused irregularities in the netperf-wrapper tests I run, I would probably not have noticed. (If the link is fully saturated the wrong overhead has a strong effect on the link’s latency, but with moderate load that is somewhat hidden and so easy to overlook.)

> It seems like a very useful tool.

	Glad you like it ;) (I think the idea and method are sound, but those I lifted from Jesper’s thesis; my implementation, however, is messy)

> 
> Also thanks for providing some example plots of how it should look. That will allow me to better interpret results in future.
> 
>> 1) Speed: It might be that your line is fast enough to hide the ATM quantization below another quantization (like the 4 kHz symbol rate of the individual carriers) or too many concurrent carriers ;)
> 
> Would it be useful if I limited my upload speed with say hfsc to 1mbit and re-ran the test?

	First I would try against different hosts; the fact that there is no linear increase of the RTT with increasing packet size is a sign that something is messing with our probe packets, and hence the whole thing gets iffy.

BUT, I strongly assume your VDSL link is using packet transfer mode (PTM) not ATM and so all my code can show you is that there is no quantization, and since overhead detection currently requires ATM cell quantization the reported numbers are just not useful. The reason to still report these is that I have not determined a proper statistical test to classify the link carrier.
	Note (I might have explained that earlier, but I am not sure whether that was in this thread): the code tries to find the packet sizes at which the RTT increases, or in other words the boundaries of the ATM cells. Once this is done it uses all the information it has about pre-payload overhead (ICMP header, IP header…) and works out how many bytes are missing to completely fill the first (two) ATM cells (these cells are not really shown in the plots); it then reports these previously unknown pre-IP bytes as the overhead that needs to be accounted for.
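
The boundary-to-overhead step described in this note can be sketched roughly as follows. This is only an illustration and not the code from the attached script: the function name overhead_from_step is made up, and the sketch assumes an IPv4 header without options (20 bytes), the 8-byte ICMP header, and that the reported overhead includes the 8-byte AAL5 trailer. Because each ATM cell carries 48 payload bytes, a step position only determines the overhead modulo 48.

```bash
#!/bin/bash
# Hypothetical helper (not part of the attached script): derive the
# per-packet overhead, modulo 48, from the largest "ping -s" size that
# still fits before the RTT steps up to the next ATM cell.
overhead_from_step() {
    local last_size=$1
    # ICMP payload + 8 bytes ICMP header + 20 bytes IPv4 header (assumed)
    local ip_len=$(( last_size + 8 + 20 ))
    # Bytes missing to completely fill the last 48-byte cell payload
    echo $(( (48 - ip_len % 48) % 48 ))
}

overhead_from_step 28   # 28 + 28 + 40 = 96 = 2 full cells, prints 40
overhead_from_step 24   # same link with a 4-byte VLAN tag added, prints 44
```

The two calls mirror the 40 and 44 byte overheads mentioned above; on a real link the step positions have to come from the RTT measurements.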

> 
> Given the above comments in Sebastian’s very useful emails how would it be best to shape these FTTC connections at present?

	So

>  Without overhead set or something else?

	I would just go and account for all overheads I could deduce, so I would guess: 8 bytes PPPoE, 4 byte VLAN tags, 14 bytes ethernet header (note for tc’s stab method one needs to include the ethernet headers in the specified overhead in spite of the man page), I am uncertain about the 4 byte ethernet frame check sequence (it was typically not included on ATM links). So in total 26 bytes; I would specify those, for PTM getting the overhead wrong is not as bad as with ATM so just try to make a good approximation. 
	A trickier question is how to select the shaping rate. In theory all xDSL modems report some sort of line rate, but unfortunately the standards contain quite a lot of slightly different rates that the modem manufacturer might decide to report, so I guess the best one can do is to start with a guess and then iterate over measure-and-refine cycles to figure out the “optimal” shaping rates. Rich Brown’s betterspeedtest.sh or netperf-wrapper’s RRUL test (see http://www.bufferbloat.net/projects/cerowrt/wiki/Quick_Test_for_Bufferbloat/ ) are decent ways to do the measuring part…

Best Regards
	Sebastian Moeller


> 
> Alan



* Re: [Cerowrt-devel] Correctly calculating overheads on unknown connections
  2014-09-21 18:35 [Cerowrt-devel] Correctly calculating overheads on unknown connections Sebastian Moeller
                   ` (7 preceding siblings ...)
  2014-09-23  9:32 ` Sebastian Moeller
@ 2014-09-23 15:10 ` Andy Furniss
  2014-09-23 17:47 ` Sebastian Moeller
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andy Furniss @ 2014-09-23 15:10 UTC (permalink / raw)
  To: lartc

Sebastian Moeller wrote:

> I would just go and account for all overheads I could deduce, so I
> would guess: 8 bytes PPPoE, 4 byte VLAN tags, 14 bytes ethernet
> header (note for tc’s stab method one needs to include the ethernet
> headers in the specified overhead in spite of the man page)

I don't think the man page is wrong - it includes eth in the pppoe example.

There is a difference between shaping on ppp and shaping on eth which
needs to be and is noted.

FWIW I tried a few pings on my VDSL2 and don't think I'll be any use for
results.

I do get an increase with larger packets but it's more than it should be
:-(.

The trouble is that my ISP does DPI/Ellacoya Qos for my ingress and I
guess this affects things a bit too much for the sub-millisecond accuracy
needed on a 20/80 line.

At least I don't have to bother so much about ingress shaping (not that
I would @80mbit so much anyway).

Ping and game traffic comes in tos marked 0x0a and gets prio on their
egress which is set slightly lower than my sync profile speed.

Additionally it's probably not the best time to test as they had a
recent outage which caused an imbalance on their gateways which seems to
still persist.



* Re: [Cerowrt-devel] Correctly calculating overheads on unknown connections
  2014-09-21 18:35 [Cerowrt-devel] Correctly calculating overheads on unknown connections Sebastian Moeller
                   ` (8 preceding siblings ...)
  2014-09-23 15:10 ` Andy Furniss
@ 2014-09-23 17:47 ` Sebastian Moeller
  2014-09-23 19:05 ` Andy Furniss
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Sebastian Moeller @ 2014-09-23 17:47 UTC (permalink / raw)
  To: lartc

Hi Andy,

On Sep 23, 2014, at 17:10 , Andy Furniss <adf.lists@gmail.com> wrote:

> Sebastian Moeller wrote:
> 
>> I would just go and account for all overheads I could deduce, so I
>> would guess: 8 bytes PPPoE, 4 byte VLAN tags, 14 bytes ethernet
>> header (note for tc’s stab method one needs to include the ethernet
>> headers in the specified overhead in spite of the man page)
> 
> I don't think the man page is wrong - it includes eth in the pppoe example.

	I am not sure we are talking about the same man page then. From openSUSE 13.1 “man tc-stab”:
           When size table is consulted, and you're shaping traffic for the sake of another modem/router, ethernet header
           (without padding) will already be added to initial packet's length. You should compensate for that by subtracting
           14 from the above overheads in such case. If you're shaping directly on the router (for example, with speedtouch
           usb modem) using ppp daemon, you're using raw ip interface without underlying layer2, so nothing will be added.

           For more thorough explanations, please see [1] and [2].

BUT if you look at the kernel code, stab does not automatically include the ethernet overhead, so the “subtract 14” in the above is actually wrong. See http://lxr.free-electrons.com/source/net/sched/sch_api.c#L538 where “pkt_len = skb->len + stab->szopts.overhead;” is used instead of “qdisc_skb_cb(skb)->pkt_len”, which is filled properly in http://lxr.free-electrons.com/source/net/core/dev.c#L2705 . At least to me this clearly looks like the ethernet overhead is not pre-added when using stab, but I could be wrong. 
	And on an ADSL link you can see this quite well: with the proper overhead values sqm-scripts still controls the latency under netperf-wrapper’s RRUL test nicely even if the shaping rate equals the line rate; with the overhead too small, latency goes down the drain ;)



> 
> There is a difference between shaping on ppp and shaping on eth which
> needs to be and is noted.

	Again I am not sure about the validity of the information in the man page...

> 
> FWIW I tried a few pings on my VDSL2 and don't think I'll be any use for
> results.

	Well for the overhead calculation my script absolutely requires ATM cell quantization, with PTM as usual on VDSL2 it has no chance of working at all; the “signal” it is searching for simply does not exist with a PTM carrier ;)


> 
> I do get an increase with larger packets but it's more than it should be
> :-(.

	If it is nicely linear that would be great.

> 
> The trouble is that my ISP does DPI/Ellacoya Qos for my ingress and I
> guess this affects things a bit too much for the sub-millisecond accuracy
> needed on a 20/80 line.

	Okay, so one issue is that with 80/20 you would expect the RTT difference from adding a single ATM cell to your packet to be:
((53*8) / (80000 * 1000) + (53*8) / (20000 * 1000) ) * 1000 = 0.0265 milliseconds

With ping typically only reporting milliseconds with 1 decimal point this means even if you had an ATM carrier you would be in for a long measurement train… but BT VDSL runs on PTM so even with weeks of measurement time all that would show you is that there is no ATM quantization ;)
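
The per-cell RTT step above can be reproduced with integer shell arithmetic. This is just a sketch of the same calculation; the function names serialization_ns and cell_rtt_step_ns are made up for this illustration, and rates are given in kbit/s as in the formula.

```bash
#!/bin/bash
# Serialization time in nanoseconds for a number of bytes at a rate in kbit/s.
# bits / (kbit/s * 1000) seconds  ==  bits * 1000000 / kbit/s  nanoseconds
serialization_ns() {
    local bytes=$1 rate_kbit=$2
    echo $(( bytes * 8 * 1000000 / rate_kbit ))
}

# Extra RTT caused by one additional 53-byte ATM cell, which is sent once
# in the uplink direction (request) and once in the downlink direction (reply).
cell_rtt_step_ns() {
    local down_kbit=$1 up_kbit=$2
    echo $(( $(serialization_ns 53 "$down_kbit") + $(serialization_ns 53 "$up_kbit") ))
}

cell_rtt_step_ns 80000 20000   # prints 26500 (ns), i.e. 0.0265 ms
```

For comparison, the same call with a hypothetical 16000/1000 kbit/s ADSL line gives 450500 ns, roughly 0.45 ms, which is much closer to what ping's output resolution can show.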

> 
> At least I don't have to bother so much about ingress shaping (not that
> I would @80mbit so much anyway).

	I would a) love to have your connection, and b) still try to shape ingress; but currently not many affordable home routers can actually reliably shape an 80/20 connection...


> 
> Ping and game traffic comes in tos marked 0x0a and gets prio on their
> egress which is set slightly lower than my sync profile speed.

	Yeah, it seems excessively hard to calculate the net rate on VDSL links as a number of encapsulation details are well hidden from the end user (think DTU size…), so simply aiming lower and performing a few tests seems like the best approach. A bit of a pity, since on ATM we really could account for everything (and for that reason I saw great latency results even when shaping my line to 100% of the reported line rate). I am quite curious how tricky this is going to be on VDSL...

> 
> Additionally it's probably not the best time to test as they had a
> recent outage which caused an imbalance on their gateways which seems to
> still persist.
> 



* Re: [Cerowrt-devel] Correctly calculating overheads on unknown connections
  2014-09-21 18:35 [Cerowrt-devel] Correctly calculating overheads on unknown connections Sebastian Moeller
                   ` (9 preceding siblings ...)
  2014-09-23 17:47 ` Sebastian Moeller
@ 2014-09-23 19:05 ` Andy Furniss
  2014-09-23 22:16 ` Sebastian Moeller
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andy Furniss @ 2014-09-23 19:05 UTC (permalink / raw)
  To: lartc

Sebastian Moeller wrote:
> Hi Andy,
>
> On Sep 23, 2014, at 17:10 , Andy Furniss <adf.lists@gmail.com>
> wrote:

> BUT if you look at the kernel code, stab does not automatically
> include the ethernet overhead, so the subtract 14 in the above is
> actually wrong. See
> http://lxr.free-electrons.com/source/net/sched/sch_api.c#L538 where
> “pkt_len = skb->len + stab->szopts.overhead;” is used instead of
> “qdisc_skb_cb(skb)->pkt_len”, which is filled properly in
> http://lxr.free-electrons.com/source/net/core/dev.c#L2705 . At least
> to me this clearly looks like the ethernet overhead is not pre-added
> when using stab, but I could be wrong. And on an ADSL link you can
> see this quite well, with the proper overhead values sqm-scripts
> still controls the latency under netperf-wrapper’s RRUL test nicely
> even if the shaping rate equals the line rate, with the overhead too
> small latency goes down the drain ;)

I guess skb->len varies depending on the interface.

Anyway here's a quick test on my desktop PC running a git kernel and tc.

I used to shape remotely pppoa/vc mux dsl so know that for me

ping -s 10 .... = one cell and -s 11 = 2 cells - overhead on IP was 10.

Paste time -

ph4[/mnt/sda8/Qos/stab-tests]# cat stab-hfsc
#set -x
TC=/sbin/tc

$TC qdisc del dev eth0 root &>/dev/null

if [ "$1" = "stop" ]
then
         exit
fi

$TC qdisc add dev eth0 root handle 1: stab overhead -4 linklayer atm hfsc default ffff
$TC class add dev eth0 parent 1: classid 1:1 hfsc sc rate 1kbit ul rate 1kbit
$TC qdisc add dev eth0 parent 1:1 pfifo limit 200
$TC class add dev eth0 parent 1:0 classid 1:ffff hfsc sc rate 80mbit ul rate 80mbit

$TC filter add dev eth0 parent 1: protocol ip prio 1 \
     u32 match ip protocol 1 0xff classid 1:1

ph4[/mnt/sda8/Qos/stab-tests]# ./stab-hfsc
ph4[/mnt/sda8/Qos/stab-tests]# ping -s 10 -c 1 noki
PING noki.andys.lan (192.168.0.1) 10(38) bytes of data.
18 bytes from noki.andys.lan (192.168.0.1): icmp_req=1 ttl=64

--- noki.andys.lan ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms

ph4[/mnt/sda8/Qos/stab-tests]# tc -s qdisc ls dev eth0
qdisc hfsc 1: root refcnt 2 default ffff
  Sent 106 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
  backlog 0b 0p requeues 0
qdisc pfifo 8005: parent 1:1 limit 200p
  Sent 53 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
  backlog 0b 0p requeues 0
ph4[/mnt/sda8/Qos/stab-tests]# ./stab-hfsc
ph4[/mnt/sda8/Qos/stab-tests]# ping -s 11 -c 1 noki
PING noki.andys.lan (192.168.0.1) 11(39) bytes of data.
19 bytes from noki.andys.lan (192.168.0.1): icmp_req=1 ttl=64

--- noki.andys.lan ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms

ph4[/mnt/sda8/Qos/stab-tests]# tc -s qdisc ls dev eth0
qdisc hfsc 1: root refcnt 2 default ffff
  Sent 106 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
  backlog 0b 0p requeues 0
qdisc pfifo 8006: parent 1:1 limit 200p
  Sent 106 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
  backlog 0b 0p requeues 0
ph4[/mnt/sda8/Qos/stab-tests]#

So it seems that overhead -4 is the correct thing to do.
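
The byte counts in the pfifo statistics above can be checked by hand. The following is a rough sketch of the arithmetic, assuming that skb->len on eth0 is the IP length plus the 14-byte ethernet header and that linklayer atm rounds the adjusted size up to whole 48-byte cell payloads of 53 bytes each on the wire; the function names atm_wire_bytes and tx_ms are made up for this illustration.

```bash
#!/bin/bash
# Bytes accounted on an ATM link for one packet: add the stab overhead to
# skb->len, round up to whole 48-byte cell payloads, count 53 bytes per cell.
atm_wire_bytes() {
    local skb_len=$1 overhead=$2
    local payload=$(( skb_len + overhead ))
    local cells=$(( (payload + 47) / 48 ))
    echo $(( cells * 53 ))
}

# Transmission time in milliseconds for a number of bytes at a rate in bit/s.
tx_ms() {
    local bytes=$1 rate_bps=$2
    echo $(( bytes * 8 * 1000 / rate_bps ))
}

# ping -s 10 on eth0: 10 + 8 (ICMP) + 20 (IP) + 14 (ethernet) = 52
atm_wire_bytes 52 -4    # prints 53  (one cell)
# ping -s 11 on eth0: one byte more, 53
atm_wire_bytes 53 -4    # prints 106 (two cells)

# The hfsc class is limited to 1kbit, i.e. 1000 bit/s:
tx_ms 53 1000           # prints 424
tx_ms 106 1000          # prints 848
```

The 424 ms and 848 ms values match the packet spacing that tcpdump reports for the backlogged -s 10 and -s 11 pings.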

I also tested backlogged (-i 0.2) with -s 10 and 11 and tcpdump showed 
the correct deltas -

ph4[/mnt/sda8/Qos/stab-tests]# tcpdump -nnttti eth0 icmp and dst host noki

snip

00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 92, length 18
00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 93, length 18
00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 94, length 18
00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 95, length 18
00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 96, length 18
00:00:00.424001 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 97, length 18
00:00:00.423999 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 98, length 18
00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 1, length 19
00:00:00.848000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 2, length 19
00:00:00.848001 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 3, length 19
00:00:00.847999 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 4, length 19
00:00:00.848000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 5, length 19
00:00:00.847999 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 6, length 19
00:00:00.848002 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 7, length 19




* Re: [Cerowrt-devel] Correctly calculating overheads on unknown connections
  2014-09-21 18:35 [Cerowrt-devel] Correctly calculating overheads on unknown connections Sebastian Moeller
                   ` (10 preceding siblings ...)
  2014-09-23 19:05 ` Andy Furniss
@ 2014-09-23 22:16 ` Sebastian Moeller
  2014-09-24  9:17 ` Andy Furniss
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Sebastian Moeller @ 2014-09-23 22:16 UTC (permalink / raw)
  To: lartc

Hi Andy,


On Sep 23, 2014, at 21:05 , Andy Furniss <adf.lists@gmail.com> wrote:

> Sebastian Moeller wrote:
>> Hi Andy,
>> 
>> On Sep 23, 2014, at 17:10 , Andy Furniss <adf.lists@gmail.com>
>> wrote:
> 
>> BUT if you look at the kernel code, stab does not automatically
>> include the ethernet overhead, so the subtract 14 in the above is
>> actually wrong. See
>> http://lxr.free-electrons.com/source/net/sched/sch_api.c#L538 where
>> “pkt_len = skb->len + stab->szopts.overhead;” is used instead of
>> “qdisc_skb_cb(skb)->pkt_len”, which is filled properly in
>> http://lxr.free-electrons.com/source/net/core/dev.c#L2705 . At least
>> to me this clearly looks like the ethernet overhead is not pre-added
>> when using stab, but I could be wrong. And on an ADSL link you can
>> see this quite well, with the proper overhead values sqm-scripts
>> still controls the latency under netperf-wrapper’s RRUL test nicely
>> even if the shaping rate equals the line rate, with the overhead too
>> small latency goes down the drain ;)
> 
> I guess skb->len varies depending on the interface.
> 
> Anyway here's a quick test on my desktop PC running a git kernel and tc.
> 
> I used to shape remotely pppoa/vc mux dsl so know that for me
> 
> ping -s 10 .... = one cell and -s 11 = 2 cells - overhead on IP was 10.
> 
> Paste time -
> 
> ph4[/mnt/sda8/Qos/stab-tests]# cat stab-hfsc
> #set -x
> TC=/sbin/tc
> 
> $TC qdisc del dev eth0 root &>/dev/null
> 
> if [ "$1" = "stop" ]
> then
>        exit
> fi
> 
> $TC qdisc add dev eth0 root handle 1: stab overhead -4 linklayer atm hfsc default ffff
> $TC class add dev eth0 parent 1: classid 1:1 hfsc sc rate 1kbit ul rate 1kbit
> $TC qdisc add dev eth0 parent 1:1 pfifo limit 200
> $TC class add dev eth0 parent 1:0 classid 1:ffff hfsc sc rate 80mbit ul rate 80mbit
> 
> $TC filter add dev eth0 parent 1: protocol ip prio 1 \
>    u32 match ip protocol 1 0xff classid 1:1
> 
> ph4[/mnt/sda8/Qos/stab-tests]# ./stab-hfsc
> ph4[/mnt/sda8/Qos/stab-tests]# ping -s 10 -c 1 noki
> PING noki.andys.lan (192.168.0.1) 10(38) bytes of data.
> 18 bytes from noki.andys.lan (192.168.0.1): icmp_req=1 ttl=64
> 
> --- noki.andys.lan ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> 
> ph4[/mnt/sda8/Qos/stab-tests]# tc -s qdisc ls dev eth0
> qdisc hfsc 1: root refcnt 2 default ffff
> Sent 106 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc pfifo 8005: parent 1:1 limit 200p
> Sent 53 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> ph4[/mnt/sda8/Qos/stab-tests]# ./stab-hfsc
> ph4[/mnt/sda8/Qos/stab-tests]# ping -s 11 -c 1 noki
> PING noki.andys.lan (192.168.0.1) 11(39) bytes of data.
> 19 bytes from noki.andys.lan (192.168.0.1): icmp_req=1 ttl=64
> 
> --- noki.andys.lan ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> 
> ph4[/mnt/sda8/Qos/stab-tests]# tc -s qdisc ls dev eth0
> qdisc hfsc 1: root refcnt 2 default ffff
> Sent 106 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc pfifo 8006: parent 1:1 limit 200p
> Sent 106 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> ph4[/mnt/sda8/Qos/stab-tests]#

	Thanks for sharing your test case; I can repeat these results exactly on my machines (I also tried htb instead of hfsc for fun: same result, as was to be expected; see below).
Looking back at http://lxr.free-electrons.com/ident?i=qdisc_pkt_len_init (line 2731):

qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len ;

I begin to realize this function is not responsible for adding a single wire packet’s ethernet header, but for figuring out how many on-the-wire packets a GSO packet will be chopped into, and for adding the header overhead of the additional wire packets; I had completely overlooked the (gso_segs - 1) part, oops. 
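
To make the effect of that line concrete, here is a simplified model of the accounting in shell arithmetic. It is only a sketch of the idea and not a port of qdisc_pkt_len_init(): the function name gso_pkt_len is made up, and the segment count is recomputed from an assumed segment size here, whereas the kernel usually takes it from the skb.

```bash
#!/bin/bash
# Simplified model: a GSO packet carries its headers once, but every
# on-the-wire segment will carry its own copy, so (gso_segs - 1) extra
# header copies are added to the accounted length.
gso_pkt_len() {
    local skb_len=$1 hdr_len=$2 gso_size=$3
    if [ "$gso_size" -eq 0 ]; then
        # Not a GSO packet: the accounted length is just skb->len
        echo "$skb_len"
        return
    fi
    # Number of segments, rounding up (assumption: derived from the sizes)
    local gso_segs=$(( (skb_len - hdr_len + gso_size - 1) / gso_size ))
    echo $(( skb_len + (gso_segs - 1) * hdr_len ))
}

# Assumed example: 14 (ethernet) + 20 (IP) + 20 (TCP) = 54 header bytes,
# three segments with 1460 bytes of payload each: skb->len = 54 + 3 * 1460
gso_pkt_len 4434 54 1460   # prints 4542 = 3 * (1460 + 54)
gso_pkt_len 66 54 0        # non-GSO packet, prints 66
```

In both cases the ethernet header of the first (or only) wire packet is already part of skb->len, which is why this function adds nothing for a non-GSO packet.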

	@cerowrt-devel: everyone using link layer ATM, you might want to try reducing the per-packet overhead by 14… (but please test)

So I stand corrected, you are right, tc’s stab automatically adds the ethernet header. So I am off to repeat my netperf-wrapper tests right now with an overhead of 30 instead of 44; again these tests confirm your observation. Interestingly, it seems netperf-wrapper’s RRUL test really is suited to figuring out the overhead: while shaping to 100% of line rate (on ADSL2+, where the line rate is the net line rate (after FEC)), with too small an overhead specified the ICMP latency plot shows larger deviations from the expected unloaded RTT plus 10ms. Too large an overhead, however, just decreases the goodput a bit while leaving the latency well under control.


> 
> So it seems that overhead -4 is the correct thing to do.

	And thanks to your help I fully agree.

> 
> I also tested backlogged (-i 0.2) with -s 10 and 11 and tcpdump showed the correct deltas -
> 
> ph4[/mnt/sda8/Qos/stab-tests]# tcpdump -nnttti eth0 icmp and dst host noki
> 
> snip
> 
> 00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 92, length 18
> 00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 93, length 18
> 00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 94, length 18
> 00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 95, length 18
> 00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 96, length 18
> 00:00:00.424001 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 97, length 18
> 00:00:00.423999 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 98, length 18
> 00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 1, length 19
> 00:00:00.848000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 2, length 19
> 00:00:00.848001 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 3, length 19
> 00:00:00.847999 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 4, length 19
> 00:00:00.848000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 5, length 19
> 00:00:00.847999 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 6, length 19
> 00:00:00.848002 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 7, length 19
> 
> 

	I really appreciate your test script, thanks for taking the time.

Best Regards
	Sebastian

	




* Re: [Cerowrt-devel] Correctly calculating overheads on unknown connections
  2014-09-21 18:35 [Cerowrt-devel] Correctly calculating overheads on unknown connections Sebastian Moeller
                   ` (11 preceding siblings ...)
  2014-09-23 22:16 ` Sebastian Moeller
@ 2014-09-24  9:17 ` Andy Furniss
  2014-09-24 16:23 ` Sebastian Moeller
  2014-09-24 22:48 ` Andy Furniss
  14 siblings, 0 replies; 16+ messages in thread
From: Andy Furniss @ 2014-09-24  9:17 UTC (permalink / raw)
  To: lartc

Sebastian Moeller wrote:

> Thanks for sharing your test case; I can repeat these results
> exactly on my machines (I also tried htb instead hfsc for fun: same
> result as to be expected see below). Looking back at
> http://lxr.free-electrons.com/ident?i=qdisc_pkt_len_init (line
> 2731):
>
> qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len ;
>
> I begin to realize this function is not responsible for adding
> single wire packet’s ethernet header, but for figuring out in how
> many on-the-wire packets to chop down a GSO packet , and add the
> header overhead for the additional wire packets, I had completely
> overlooked the (gso_segs - 1) part, oops.

Glad it helped - I know from trying, and giving up, how hard/error prone
reading kernel code can be :-)

>
> @cerowrt-devel: everyone using link layer ATM you might want to try
> to reduce the per packet overhead by 14… (but please test)

Maybe you mean overhead calculated by a script?

Just to be clear, I expect that wrt would be shaping on ppp, so you
don't need to take 14 if that's the case.


> So I stand corrected, you are right, tc’s stab automatically adds
> the ethernet header. So I am off to repeat my netperf-wrapper tests
> right now again with overhead of 30 instead of 44, again these tests
> confirm your observation. Interestingly, it seems netperf-wrapper’s
> RRUL test really is suited to figure out the overhead: while shaping
> to 100% of line rate (on ADSL2+ where line rate is the net line
> rate (after FEC)) specifying too small an overhead the ICMP latency
> plot shows larger deviations from the expected unloaded RTT plus 10ms.
> Too large an overhead however just decreases the goodput a bit while
> leaving the latency well under control.

I wouldn't word it like "stab adds ..." This is nothing to do with stab
really - just the only length stab knows is skb->len and that means
different things on different interfaces because of how the kernel works.

(I haven't retested all this, but I doubt it's changed)

On ppp skb->len = ip len

On eth skb->len = ip len + 14

On vlan skb->len = ip len + 18

If you ran my script on various interfaces without stab I expect you
would still be able to see the difference - everyone who does any tc on
eth gets shaping with ip+14 sized packets.

Even without tc involved I think you could see the difference looking at
ip -s ls xxxx type stats on different interfaces.
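
As a rough sketch of what this means for the value passed to stab: the overhead to specify is the real per-packet overhead on top of IP minus whatever the kernel already counts in skb->len on that interface. The function name stab_overhead and the interface labels are made up for this illustration, and the pre-added byte counts are simply the ones listed above (not retested).

```bash
#!/bin/bash
# Hypothetical helper: turn the true per-packet overhead on top of the IP
# packet into the value for "stab overhead", depending on where tc runs.
stab_overhead() {
    local wire_overhead=$1 iface_type=$2 pre_added
    case "$iface_type" in
        ppp)  pre_added=0  ;;   # skb->len = ip len
        eth)  pre_added=14 ;;   # skb->len = ip len + 14
        vlan) pre_added=18 ;;   # skb->len = ip len + 18
        *)    echo "unknown interface type: $iface_type" >&2; return 1 ;;
    esac
    echo $(( wire_overhead - pre_added ))
}

stab_overhead 10 eth   # pppoa/vc mux shaped remotely on eth: prints -4
stab_overhead 44 eth   # 44 bytes of real overhead shaped on eth: prints 30
stab_overhead 10 ppp   # same pppoa link shaped directly on ppp: prints 10
```

The first two calls reproduce the -4 and the "30 instead of 44" from earlier in the thread.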



* Re: [Cerowrt-devel] Correctly calculating overheads on unknown connections
  2014-09-21 18:35 [Cerowrt-devel] Correctly calculating overheads on unknown connections Sebastian Moeller
                   ` (12 preceding siblings ...)
  2014-09-24  9:17 ` Andy Furniss
@ 2014-09-24 16:23 ` Sebastian Moeller
  2014-09-24 22:48 ` Andy Furniss
  14 siblings, 0 replies; 16+ messages in thread
From: Sebastian Moeller @ 2014-09-24 16:23 UTC (permalink / raw)
  To: lartc

Hi Andy,


On Sep 24, 2014, at 11:17 , Andy Furniss <adf.lists@gmail.com> wrote:

> Sebastian Moeller wrote:
> 
>> Thanks for sharing your test case; I can repeat these results
>> exactly on my machines (I also tried htb instead hfsc for fun: same
>> result as to be expected see below). Looking back at
>> http://lxr.free-electrons.com/ident?i=qdisc_pkt_len_init (line
>> 2731):
>> 
>> qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len ;
>> 
>> I begin to realize this function is not responsible for adding
>> single wire packet’s ethernet header, but for figuring out in how
>> many on-the-wire packets to chop down a GSO packet , and add the
>> header overhead for the additional wire packets, I had completely
>> overlooked the (gso_segs - 1) part, oops.
> 
> Glad it helped - I know from trying, and giving up, how hard/error prone
> reading kernel code can be :-)

	Especially when all one knows about C is basically from reading K&R with almost no hands-on coding experience ;)

> 
>> 
>> @cerowrt-devel: everyone using link layer ATM you might want to try
>> to reduce the per packet overhead by 14… (but please test)
> 
> Maybe you mean overhead calculated by a script?

	Well, in cerowrt’s SQM-scripts we expose the stab options so users can take link layer and overhead into account. If you naively determine the overhead, either with the help of the scripts I posted earlier or by looking it up in a table (if the encapsulation options are known), you will end up not handling the kernel’s auto-added overhead well. Currently SQM-scripts does not expose PPP devices, only ge00 (ethernet), so -14 seems currently the best recommendation in combination with “please test”. What I am curious about after your message is what happens if the kernel terminates a PPPoE connection but is connected to a “modem” via ethernet: what does the kernel do? And thanks to your pointers I now have an idea of how to test that ;)


> 
> Just to be clear, I expect that wrt would be shaping on ppp, so you
> don't need to take 14 if that's the case.

	Good to know.

> 
> 
>> So I stand corrected, you are right, tc’s stab automatically adds
>> the ethernet header. So I am off to repeat my netperf-wrapper tests
>> right now again with overhead of 30 instead of 44, again these tests
>> confirm your observation. Interestingly, it seems netperf-wrapper’s
>> RRUL test really is suited to figure out the overhead: while shaping
>> to 100% of line rate (on ADSL2+ where line rate is the net line
>> rate (after FEC)) specifying too small an overhead the ICMP latency
>> plot shows larger deviations from the expected unloaded RTT plus 10ms.
>> Too large an overhead however just decreases the goodput a bit while
>> leaving the latency well under control.
> 
> I wouldn't word it like "stab adds ..." This is nothing to do with stab
> really - just the only length stab knows is skb->len and that means
> different things on different interfaces because of how the kernel works.
> 
> (I haven't retested all this, but I doubt it's changed)
> 
> On ppp skb->len = ip len
> 
> On eth skb->len = ip len + 14
> 
> On vlan skb->len = ip len + 18

	So this is the information I actually wanted to find and then somehow thought qdisc_pkt_len_init() was the place. Do you by chance have any pointer where this assignment is handled?
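
	Taking the three skb->len cases quoted above at face value (Andy notes he has not retested them), the overhead to hand to stab could be derived as in this sketch. The function names and the interface-type labels are hypothetical.

```bash
#!/bin/bash
# Bytes beyond the IP packet that skb->len already contains,
# per the table quoted above (assumption: still accurate).
kernel_counted_bytes() {
    case "$1" in
        ppp)  echo 0 ;;
        eth)  echo 14 ;;
        vlan) echo 18 ;;
        *)    echo "unknown interface type: $1" >&2; return 1 ;;
    esac
}

# Overhead to pass to stab = per-packet wire overhead minus what is already counted.
stab_overhead() {
    local wire_overhead=$1 iftype=$2 counted
    counted=$(kernel_counted_bytes "$iftype") || return 1
    echo $(( wire_overhead - counted ))
}

stab_overhead 44 eth
```

	For my 44 bytes this gives 30 on eth, matching the value I re-tested with, and leaves 44 untouched on ppp.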

> 
> If you ran my script on various interfaces without stab I expect you
> would still be able to see the difference - everyone who does any tc on
> eth gets shaping with ip+14 sized packets.
> 
> Even without tc involved I think you could see the difference looking at
> ip -s ls xxxx type stats on different interfaces.
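
	One way to look at those per-interface stats without tc might be to read the counters under /sys/class/net and compare bytes per packet. This is only a sketch: the helper names are made up, and a whole-lifetime average is a crude measure, so comparing counter deltas around a known number of fixed-size pings is probably more telling.

```bash
#!/bin/bash
# Hypothetical helper: integer average of bytes per packet.
avg_pkt_size() {
    local bytes=$1 packets=$2
    if [ "$packets" -le 0 ]; then
        echo 0
        return
    fi
    echo $(( bytes / packets ))
}

# Read the TX counters the kernel exports for a device.
iface_avg_tx() {
    local dev=$1
    local stats=/sys/class/net/$dev/statistics
    avg_pkt_size "$(cat "$stats/tx_bytes")" "$(cat "$stats/tx_packets")"
}
```

	Calling iface_avg_tx with an ethernet device and with a ppp device (the device names depend on the machine) should then show whether the two count different sizes for the same traffic.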

Thanks again, & Best Regards
	Sebastian

> 



* Re: [Cerowrt-devel] Correctly calculating overheads on unknown connections
  2014-09-21 18:35 [Cerowrt-devel] Correctly calculating overheads on unknown connections Sebastian Moeller
                   ` (13 preceding siblings ...)
  2014-09-24 16:23 ` Sebastian Moeller
@ 2014-09-24 22:48 ` Andy Furniss
  14 siblings, 0 replies; 16+ messages in thread
From: Andy Furniss @ 2014-09-24 22:48 UTC (permalink / raw)
  To: lartc

Sebastian Moeller wrote:

>> Maybe you mean overhead calculated by a script?
>
> Well, in cerowrt’s SQM-scripts we expose the stab options so users can
> take link layer and overhead into account. If you naively determine
> the overhead, either with the help of the scripts I posted earlier or
> by looking it up in a table (if the encapsulation options are known),
> you will end up not accounting for the overhead the kernel adds
> automatically. Currently SQM-scripts does not expose PPP devices, only
> ge00 (ethernet), so subtracting 14 currently seems the best
> recommendation, in combination with “please test”.

Oh, OK - I know nothing about wrt.

> What I am curious about after your message
> is what the kernel does if it terminates a PPPoE connection but is
> connected to a “modem” via ethernet. And
> thanks to your pointers I now have an idea of how to test that ;)

Well I can't say I know - testing is always best.

I think we are "seeing" skbs just as they enter an interface - so what 
form they take depends on the particular interface they have just been 
made for.

It's possible to have multiple pppoes/vlans on an eth and use the eth
normally at the same time. What you see, I suppose, depends on where you
are "attached". I guess shaping a pppoe on the eth rather than on the
actual ppp is doable with a bit of filtering - in which case you may
need to allow for the +14 of MACs/ethertype, and for the 8 bytes of
PPPoE/PPP header already being in the payload - a totally untested theory :-)
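
Spelled out as arithmetic, that theory would look something like the sketch below. It is just as untested as the theory itself, and the function name is made up; the 8 is the 6-byte PPPoE header plus the 2-byte PPP protocol field.

```bash
#!/bin/bash
PPPOE_PPP=8    # 6 bytes PPPoE header + 2 bytes PPP protocol field
ETH_HEADER=14  # two MAC addresses + ethertype

# Hypothetical: the length a qdisc on the eth would see for a PPPoE packet,
# assuming skb->len there covers the ethernet header and the PPPoE payload.
pppoe_on_eth_len() {
    local ip_len=$1
    echo $(( ip_len + PPPOE_PPP + ETH_HEADER ))
}

pppoe_on_eth_len 1492
```

A full 1492-byte IP packet would then show up as 1514 on the eth, so a per-packet overhead worked out relative to the IP length would have to come down by 22 rather than 14 in that setup.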


>> On ppp skb->len = ip len
>>
>> On eth skb->len = ip len + 14
>>
>> On vlan skb->len = ip len + 18
>
> So this is the information I actually wanted to find and then somehow
> thought qdisc_pkt_len_init() was the place. Do you by chance have any
> pointer where this assignment is handled?

No, sorry I don't know the code.




Thread overview: 16+ messages
2014-09-21 18:35 [Cerowrt-devel] Correctly calculating overheads on unknown connections Sebastian Moeller
2014-09-21 21:40 ` Alan Goodman
2014-09-22  9:05 ` Sebastian Moeller
2014-09-22 10:01 ` Andy Furniss
2014-09-22 10:20 ` Sebastian Moeller
2014-09-22 13:09 ` Alan Goodman
2014-09-22 19:52 ` Sebastian Moeller
2014-09-22 23:02 ` Alan Goodman
2014-09-23  9:32 ` Sebastian Moeller
2014-09-23 15:10 ` Andy Furniss
2014-09-23 17:47 ` Sebastian Moeller
2014-09-23 19:05 ` Andy Furniss
2014-09-23 22:16 ` Sebastian Moeller
2014-09-24  9:17 ` Andy Furniss
2014-09-24 16:23 ` Sebastian Moeller
2014-09-24 22:48 ` Andy Furniss
