* DCCP_BUG called
@ 2010-08-19 14:21 Eugen Dedu
  2010-08-20  5:15 ` Gerrit Renker
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: Eugen Dedu @ 2010-08-19 14:21 UTC (permalink / raw)
  To: dccp

Hi,

We use DCCP for a video transmission and it works well.  However, when 
we use traffic shaping (i.e. limiting network bandwidth on sender 
machine) using the code:

   sudo $TC qdisc add dev $IF root handle 1: htb default 10
   sudo $TC class add dev $IF parent 1: classid 1:10 htb rate $up_rate
   sudo $TC filter add dev $IF protocol ip parent 1:0 prio 1 u32 match ip dst $dest flowid 1:10
   sudo $TC qdisc add dev $IF parent 1:10 handle 40: sfq perturb 10 limit 2

taken from http://lartc.org/howto/, we receive many errors like this:

[27799.691275] BUG: err=1 after ccid_hc_tx_packet_sent at /build/buildd/linux-2.6.32/net/dccp/output.c:307/dccp_write_xmit()
[27799.691288] <IRQ>  [<ffffffffa0441db5>] dccp_write_xmit+0x165/0x310 [dccp]
[27799.691308]  [<ffffffffa0443ac0>] ? dccp_write_xmit_timer+0x0/0x80 [dccp]
[27799.691315]  [<ffffffffa0443b3a>] dccp_write_xmit_timer+0x7a/0x80 [dccp]

We found that they are triggered by the following code in net/dccp/output.c:
306   err = dccp_transmit_skb(sk, skb);
307   ccid_hc_tx_packet_sent(dp->dccps_hc_tx_ccid, sk, 0, len);
308   if (err)
309     DCCP_BUG("err=%d after ccid_hc_tx_packet_sent",
310              err);

Is there a way to fix or work around that?

Cheers,
-- 
Eugen Dedu


* Re: DCCP_BUG called
  2010-08-19 14:21 DCCP_BUG called Eugen Dedu
@ 2010-08-20  5:15 ` Gerrit Renker
  2010-08-20  8:46 ` Eugen Dedu
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Gerrit Renker @ 2010-08-20  5:15 UTC (permalink / raw)
  To: dccp

>   sudo $TC qdisc add dev $IF root handle 1: htb default 10
>   sudo $TC class add dev $IF parent 1: classid 1:10 htb rate $up_rate
>   sudo $TC filter add dev $IF protocol ip parent 1:0 prio 1 u32 match ip 
> dst $dest flowid 1:10
>   sudo $TC qdisc add dev $IF parent 1:10 handle 40: sfq perturb 10 limit 2
>
> taken from http://lartc.org/howto/, we receive many errors like this:
>
> [27799.691275] BUG: err=1 after ccid_hc_tx_packet_sent at  
> /build/buildd/linux-2.6.32/net/dccp/output.c:307/dccp_write_xmit()
> [27799.691288] <IRQ>  [<ffffffffa0441db5>] dccp_write_xmit+0x165/0x310  
> [dccp]
> [27799.691308]  [<ffffffffa0443ac0>] ? dccp_write_xmit_timer+0x0/0x80 [dccp]
> [27799.691315]  [<ffffffffa0443b3a>] dccp_write_xmit_timer+0x7a/0x80 [dccp]
>
> We found that they are triggered by the following code in net/dccp/output.c:
> 306   err = dccp_transmit_skb(sk, skb);
> 307   ccid_hc_tx_packet_sent(dp->dccps_hc_tx_ccid, sk, 0, len);
> 308   if (err)
> 309     DCCP_BUG("err=%d after ccid_hc_tx_packet_sent",
> 310              err);
>
> Is there a way to fix to workaround that?

The 'err = 1' is triggered because the qdisc can refuse to take a packet, so that

 #define NET_XMIT_DROP           0x01    /* skb dropped                  */

is returned. Thus what you are seeing is not a real bug, but rather a misplaced
BUG statement. 

I haven't verified this, but I am almost sure that the problem is rectified in the
DCCP test tree which I would like to encourage you to use for all testing, since it
contains more up-to-date fixes (patches of the test tree shall soon be submitted).

You can pull the test tree from

    git://eden-feed.erg.abdn.ac.uk/dccp_exp         [subtree 'dccp']
    http://[same-address]
    
Further instructions and alternative check-out possibilities are described on 

    http://www.linuxfoundation.org/collaborate/workgroups/networking/dccp_testing#Experimental_DCCP_source_tree


Gerrit


* Re: DCCP_BUG called
  2010-08-19 14:21 DCCP_BUG called Eugen Dedu
  2010-08-20  5:15 ` Gerrit Renker
@ 2010-08-20  8:46 ` Eugen Dedu
  2010-08-20 10:40 ` Gerrit Renker
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Eugen Dedu @ 2010-08-20  8:46 UTC (permalink / raw)
  To: dccp

On 20/08/10 07:15, Gerrit Renker wrote:
>>    sudo $TC qdisc add dev $IF root handle 1: htb default 10
>>    sudo $TC class add dev $IF parent 1: classid 1:10 htb rate $up_rate
>>    sudo $TC filter add dev $IF protocol ip parent 1:0 prio 1 u32 match ip
>> dst $dest flowid 1:10
>>    sudo $TC qdisc add dev $IF parent 1:10 handle 40: sfq perturb 10 limit 2
>>
>> taken from http://lartc.org/howto/, we receive many errors like this:
>>
>> [27799.691275] BUG: err=1 after ccid_hc_tx_packet_sent at
>> /build/buildd/linux-2.6.32/net/dccp/output.c:307/dccp_write_xmit()
>> [27799.691288]<IRQ>   [<ffffffffa0441db5>] dccp_write_xmit+0x165/0x310
>> [dccp]
>> [27799.691308]  [<ffffffffa0443ac0>] ? dccp_write_xmit_timer+0x0/0x80 [dccp]
>> [27799.691315]  [<ffffffffa0443b3a>] dccp_write_xmit_timer+0x7a/0x80 [dccp]
>>
>> We found that they are triggered by the following code in net/dccp/output.c:
>> 306   err = dccp_transmit_skb(sk, skb);
>> 307   ccid_hc_tx_packet_sent(dp->dccps_hc_tx_ccid, sk, 0, len);
>> 308   if (err)
>> 309     DCCP_BUG("err=%d after ccid_hc_tx_packet_sent",
>> 310              err);
>>
>> Is there a way to fix to workaround that?
>
> The 'err = 1' is triggered because the qdisc can refuse to take a packet, so that
>
>   #define NET_XMIT_DROP           0x01    /* skb dropped                  */
>
> is returned. Thus what you are seeing is not a real bug, but rather a misplaced
> BUG statement.
>
> I haven't verified this, but I am almost sure that the problem is rectified in the
> DCCP test tree which I would like to encourage you to use for all testing, since it

Thank you for your answer.  I have not yet tested with DCCP test tree 
indeed, but I see that the code involved is there too, i.e.:

  306  err = dccp_transmit_skb(sk, skb);
  307  ccid_hc_tx_packet_sent(dp->dccps_hc_tx_ccid, sk, 0, len);
  308  if (err)
  309    DCCP_BUG("err=%d after ccid_hc_tx_packet_sent",
  310             err);

and in dccp_transmit_skb():

  139                 err = icsk->icsk_af_ops->queue_xmit(skb);
  140                 return net_xmit_eval(err);
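
A minimal userspace sketch of what net_xmit_eval() does with the value coming back from queue_xmit(), and hence why 'err' is 1 here. The constants and the macro mirror include/linux/netdevice.h as found in 2.6.3x-era kernels; that is an assumption to verify against your own tree. The helper err_seen_by_dccp() is a made-up name for illustration only.

```c
#include <assert.h>

/* Userspace copies of the qdisc verdicts from include/linux/netdevice.h
 * (values assumed from 2.6.3x-era kernels; verify against your tree). */
#define NET_XMIT_SUCCESS 0x00
#define NET_XMIT_DROP    0x01	/* skb dropped */
#define NET_XMIT_CN      0x02	/* congestion notification */

/* Congestion notification is not reported as an error to the caller;
 * every other value is passed through unchanged. */
#define net_xmit_eval(e) ((e) == NET_XMIT_CN ? 0 : (e))

/* Hypothetical helper: the value that ends up in 'err' in
 * dccp_write_xmit() for a given return value of queue_xmit(). */
static int err_seen_by_dccp(int queue_xmit_ret)
{
	int err = net_xmit_eval(queue_xmit_ret);

	/* A congestion notification must never surface as an error. */
	assert(queue_xmit_ret != NET_XMIT_CN || err == 0);
	return err;
}
```

With these definitions a qdisc drop (NET_XMIT_DROP) comes back as 1, which matches the "err=1" in the log, while NET_XMIT_CN is folded into 0 and a negative errno is passed through as is.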

-- 
Eugen


* Re: DCCP_BUG called
  2010-08-19 14:21 DCCP_BUG called Eugen Dedu
  2010-08-20  5:15 ` Gerrit Renker
  2010-08-20  8:46 ` Eugen Dedu
@ 2010-08-20 10:40 ` Gerrit Renker
  2010-08-23  5:29 ` Gerrit Renker
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Gerrit Renker @ 2010-08-20 10:40 UTC (permalink / raw)
  To: dccp

>> I haven't verified this, but I am almost sure that the problem is rectified in the
>> DCCP test tree which I would like to encourage you to use for all testing, since it
>
> Thank you for your answer.  I have not yet tested with DCCP test tree  
> indeed, but I see that the code involved is there too, i.e.:
>
>  306  err = dccp_transmit_skb(sk, skb);
>  307  ccid_hc_tx_packet_sent(dp->dccps_hc_tx_ccid, sk, 0, len);
>  308  if (err)
>  309    DCCP_BUG("err=%d after ccid_hc_tx_packet_sent",
>  310             err);
>

Agree, in this case it will also trigger the BUG warning. It seems that it rather should be 

   if (err < 0)
      DCCP_BUG(...)

but I need to go through this at home. One reason why this is there is the congestion control.

The assumption in hc_tx_packet_sent() is normally that the packet has been sent, there is no
second function to register packets that have not been sent. That case is treated as if the
packet got lost in the network, i.e. the loss is registered later, via the receiver feedback.

Hence it seems to me that we can remove this, or replace it with a debug/warning statement.


Thanks a lot for reporting the issue, I will reply again Monday.
Gerrit


* Re: DCCP_BUG called
  2010-08-19 14:21 DCCP_BUG called Eugen Dedu
                   ` (2 preceding siblings ...)
  2010-08-20 10:40 ` Gerrit Renker
@ 2010-08-23  5:29 ` Gerrit Renker
  2010-08-25 13:21 ` Eugen Dedu
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Gerrit Renker @ 2010-08-23  5:29 UTC (permalink / raw)
  To: dccp

Eugen,-
>>>    sudo $TC qdisc add dev $IF parent 1:10 handle 40: sfq perturb 10 limit 2
>>>
>>> taken from http://lartc.org/howto/, we receive many errors like this:
>>>
>>> [27799.691275] BUG: err=1 after ccid_hc_tx_packet_sent at
>>> /build/buildd/linux-2.6.32/net/dccp/output.c:307/dccp_write_xmit()
>>> [27799.691288]<IRQ> [<ffffffffa0441db5>] dccp_write_xmit+0x165/0x310 [dccp]
>>> [27799.691308]      [<ffffffffa0443ac0>] ? dccp_write_xmit_timer+0x0/0x80 [dccp]
>>> [27799.691315]      [<ffffffffa0443b3a>] dccp_write_xmit_timer+0x7a/0x80 [dccp]
>>>
>>> We found that they are triggered by the following code in net/dccp/output.c:
>>> 306   err = dccp_transmit_skb(sk, skb);
>>> 307   ccid_hc_tx_packet_sent(dp->dccps_hc_tx_ccid, sk, 0, len);
>>> 308   if (err)
>>> 309     DCCP_BUG("err=%d after ccid_hc_tx_packet_sent",
>>> 310              err);
>>>
>>> Is there a way to fix to workaround that?
<snip>

>> I haven't verified this, but I am almost sure that the problem is rectified in the
>> DCCP test tree which I would like to encourage you to use for all testing, since it
>
> Thank you for your answer.  I have not yet tested with DCCP test tree  
> indeed, but I see that the code involved is there too, i.e.:
>
>  306  err = dccp_transmit_skb(sk, skb);
>  307  ccid_hc_tx_packet_sent(dp->dccps_hc_tx_ccid, sk, 0, len);
>  308  if (err)
>  309    DCCP_BUG("err=%d after ccid_hc_tx_packet_sent",
>  310             err);
>
The above is not in the test tree, please see below. Can you please double-check
that, after pulling the tree from git://eden-feed.erg.abdn.ac.uk/dccp_exp.git,
the subtree 'dccp' is checked out? 

The 'master' branch of that tree is identical with netdev-2.6, which is why the
above two code parts are identical. The 'dccp' subtree is the actual DCCP test
tree, this in turn has the 'ccid4' subtree (which eventually will be integrated
into the test tree).


In the test tree the BUG has been demoted to a debug call,

  286          err = dccp_transmit_skb(sk, skb);
  287          if (err)
  288                  dccp_pr_debug("transmit_skb() returned err=%d\n", err);
  289          /*
  290           * Register this one as sent even if an error occurred. To the remote
  291           * end a local packet drop is indistinguishable from network loss, i.e.
  292           * any local drop will eventually be reported via receiver feedback.
  293           */
  294          ccid_hc_tx_packet_sent(dp->dccps_hc_tx_ccid, sk, len);

The return code will appear in the logs if /sys/module/dccp/parameters/dccp_debug = Y
and if either the queue_xmit() function pointer returned error < 0, qdisc returned a
positive NET_XMIT_.*, or  the device a positive NETDEV_TX_.* code (linux/netdevice.h).
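
The cases listed above can be separated by the sign of the return code. A rough userspace sketch follows; the enum and both function names are hypothetical and not part of any kernel API. Note that a qdisc verdict and a device verdict are both positive, so the sign alone cannot tell those two apart.

```c
#include <assert.h>

/* Hypothetical classification of the value returned by the transmit path. */
enum xmit_outcome {
	XMIT_OK,		/* 0: handed over without complaint */
	XMIT_LOCAL_VERDICT,	/* > 0: qdisc or device refused or flagged the skb */
	XMIT_HARD_ERROR		/* < 0: errno-style failure from queue_xmit() */
};

static enum xmit_outcome classify_xmit_err(int err)
{
	if (err < 0)
		return XMIT_HARD_ERROR;
	if (err > 0)
		return XMIT_LOCAL_VERDICT;
	return XMIT_OK;
}

/* Printable name for log messages. */
static const char *xmit_outcome_name(enum xmit_outcome o)
{
	switch (o) {
	case XMIT_OK:            return "ok";
	case XMIT_LOCAL_VERDICT: return "local drop/congestion verdict";
	case XMIT_HARD_ERROR:    return "hard error";
	}
	assert(!"unknown outcome");
	return "?";
}
```

In this reading, the "err=1" from the original report falls into the middle category: a local verdict rather than a protocol failure.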

DCCP here does not catch the case of local drop or congestion, as it is done by TCP
in net/ipv4/tcp_output.c:tcp_transmit_skb(), the corresponding passage is:

   894          err = icsk->icsk_af_ops->queue_xmit(skb);
   895          if (likely(err <= 0))
   896                  return err;
   897
   898          tcp_enter_cwr(sk, 1);
   899
   900          return net_xmit_eval(err);

If TCP is not yet in a loss-related state (i.e. it is in TCP_CA_Open or TCP_CA_Disorder),
this causes a state transition to the CWR state, and 'CWR' is also signalled to an
ECN-capable receiver.
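
The TCP behaviour can be modelled in userspace as follows. This is a toy sketch, not kernel code: struct toy_sock and the toy_* functions are invented for illustration, the state order is assumed to follow the kernel's enum tcp_ca_state, and the real tcp_enter_cwr() also adjusts the congestion window, which is left out here.

```c
#include <assert.h>

/* Congestion-avoidance states, in the order assumed for the kernel's
 * enum tcp_ca_state (Open < Disorder < CWR < Recovery < Loss). */
enum ca_state { CA_OPEN, CA_DISORDER, CA_CWR, CA_RECOVERY, CA_LOSS };

#define NET_XMIT_CN 0x02
#define net_xmit_eval(e) ((e) == NET_XMIT_CN ? 0 : (e))

struct toy_sock {
	enum ca_state state;
};

/* Model of the state change only: states below CWR move to CWR,
 * states at or above CWR are left alone. */
static void toy_enter_cwr(struct toy_sock *sk)
{
	if (sk->state < CA_CWR)
		sk->state = CA_CWR;
}

/* Model of the tail of tcp_transmit_skb() quoted above. */
static int toy_transmit_tail(struct toy_sock *sk, int queue_xmit_ret)
{
	assert(sk != 0);

	if (queue_xmit_ret <= 0)	/* success or hard error: no reaction */
		return queue_xmit_ret;

	toy_enter_cwr(sk);		/* positive verdict: local congestion */
	return net_xmit_eval(queue_xmit_ret);
}
```

The point of the model is that TCP reacts to a positive return code at once, before any feedback from the receiver arrives.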

I believe that there was an earlier discussion about also catching local congestion
in DCCP; at least that is what I take to be the reason why the above debug statement
is still there.

However, the response would need to be resolved in the TX CCID, since the handling of
loss and ECN depends on the individual CCID (RFC 4340, 12):
 * in TFRC (RFC 5348, 4342, 5622):
   -  without ECN a loss is detected 3 packets later (RFC 5348, 5.1);
   -  the local loss of the packet can not be signalled via ECN in TFRC (CCID-3/4),
      since only the ECN "CE" codepoint in the IP header is evaluated, which requires
      first to deliver the packet (RFC 4342, RFC 5622);
 * CCID-2 (RFC 4341) behaves in the same way, i.e.
   - loss detected with a delay of 3 packets or
   - packet received, but ECN-marked (CE codepoint set);
 * in both cases, ECN nonce sums are returned (1-bit field in CCID-3/4 Loss Intervals 
   option, or Ack Vector type 38/39 for CCID-2), however, as described in section 12.2
   of RFC 4340, (local) packet drop destroys the ECN nonce.

Long story short -- hopefully in future there will be a way of doing something smart
in response to local drop, as currently done in TCP.


* Re: DCCP_BUG called
  2010-08-19 14:21 DCCP_BUG called Eugen Dedu
                   ` (3 preceding siblings ...)
  2010-08-23  5:29 ` Gerrit Renker
@ 2010-08-25 13:21 ` Eugen Dedu
  2010-08-26 11:08 ` gerrit
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Eugen Dedu @ 2010-08-25 13:21 UTC (permalink / raw)
  To: dccp

On 23/08/10 07:29, Gerrit Renker wrote:
> Eugen,-
>>>>     sudo $TC qdisc add dev $IF parent 1:10 handle 40: sfq perturb 10 limit 2
>>>>
>>>> taken from http://lartc.org/howto/, we receive many errors like this:
>>>>
>>>> [27799.691275] BUG: err=1 after ccid_hc_tx_packet_sent at
>>>> /build/buildd/linux-2.6.32/net/dccp/output.c:307/dccp_write_xmit()
>>>> [27799.691288]<IRQ>  [<ffffffffa0441db5>] dccp_write_xmit+0x165/0x310 [dccp]
>>>> [27799.691308]      [<ffffffffa0443ac0>] ? dccp_write_xmit_timer+0x0/0x80 [dccp]
>>>> [27799.691315]      [<ffffffffa0443b3a>] dccp_write_xmit_timer+0x7a/0x80 [dccp]
>>>>
>>>> We found that they are triggered by the following code in net/dccp/output.c:
>>>> 306   err = dccp_transmit_skb(sk, skb);
>>>> 307   ccid_hc_tx_packet_sent(dp->dccps_hc_tx_ccid, sk, 0, len);
>>>> 308   if (err)
>>>> 309     DCCP_BUG("err=%d after ccid_hc_tx_packet_sent",
>>>> 310              err);
>>>>
>>>> Is there a way to fix to workaround that?
> <snip>
>
>>> I haven't verified this, but I am almost sure that the problem is rectified in the
>>> DCCP test tree which I would like to encourage you to use for all testing, since it
>>
>> Thank you for your answer.  I have not yet tested with DCCP test tree
>> indeed, but I see that the code involved is there too, i.e.:
>>
>>   306  err = dccp_transmit_skb(sk, skb);
>>   307  ccid_hc_tx_packet_sent(dp->dccps_hc_tx_ccid, sk, 0, len);
>>   308  if (err)
>>   309    DCCP_BUG("err=%d after ccid_hc_tx_packet_sent",
>>   310             err);
>>
> The above is not in the test tree, please see below. Can you please double-check
> that, after pulling the tree from git://eden-feed.erg.abdn.ac.uk/dccp_exp.git,
> the subtree 'dccp' is checked out?
>
> The 'master' branch of that tree is identical with netdev-2.6, which is why the
> above two code parts are identical. The 'dccp' subtree is the actual DCCP test
> tree, this in turn has the 'ccid4' subtree (which eventually will be integrated
> into the test tree).
>
>
> In the test tree the BUG has been demoted to a debug call,
>
>    286          err = dccp_transmit_skb(sk, skb);
>    287          if (err)
>    288                  dccp_pr_debug("transmit_skb() returned err=%d\n", err);
>    289          /*
>    290           * Register this one as sent even if an error occurred. To the remote
>    291           * end a local packet drop is indistinguishable from network loss, i.e.
>    292           * any local drop will eventually be reported via receiver feedback.
>    293           */
>    294          ccid_hc_tx_packet_sent(dp->dccps_hc_tx_ccid, sk, len);
>
> The return code will appear in the logs if /sys/module/dccp/parameters/dccp_debug = Y
> and if either the queue_xmit() function pointer returned error < 0, qdisc returned a
> positive NET_XMIT_.*, or  the device a positive NETDEV_TX_.* code (linux/netdevice.h).
>
> DCCP here does not catch the case of local drop or congestion, as it is done by TCP
> in net/ipv4/tcp_output.c:tcp_transmit_skb(), the corresponding passage is:
>
>     894          err = icsk->icsk_af_ops->queue_xmit(skb);
>     895          if (likely(err <= 0))
>     896                  return err;
>     897
>     898          tcp_enter_cwr(sk, 1);
>     899
>     900          return net_xmit_eval(err);
>
> If TCP is not already in a loss state (TCP_CA_Open, TCP_CA_Disorder), this causes
> a state transition to the CWR state, and 'CWR' is also signalled to an ECN-capable
> receiver.
>
> I believe that there was an earlier discussion about also catching local congestion
> in DCCP, at least I see in that the reason why the above debug statement is still
> there.
>
> However, the response would need to be resolved in the TX CCID, since the handling of
> loss and ECN depends on the individual CCID (RFC 4340, 12):
>   * in TFRC (RFC 5348, 4342, 5622):
>     -  without ECN a loss is detected 3 packets later (RFC 5348, 5.1);
>     -  the local loss of the packet can not be signalled via ECN in TFRC (CCID-3/4),
>        since only the "ECE" bit in the IP header is evaluated, which requires first
>        to deliver the packet (RFC 4342, RFC 5622);
>   * CCID-2 (RFC 4341) behaves in the same way, i.e.
>     - loss detected with a delay of 3 packets or
>     - packet received, but ECN-marked (ECE bit set);
>   * in both cases, ECN nonce sums are returned (1-bit field in CCID-3/4 Loss Intervals
>     option, or Ack Vector type 38/39 for CCID-2), however, as described in section 12.2
>     of RFC 4340, (local) packet drop destroys the ECN nonce.
>
> Long story short -- hopefully in future there will be a way of doing something smart
> in response to local drop, as currently done in TCP.

Hi,

Sorry for replying so late.

In the test tree you pointed out DCCP_BUG is indeed changed to debug.

I explained the problem wrongly in the original e-mail.  In our tests 
the problem was not:
- that DCCP_BUG was executed (this is not a bug in itself)
- that the DCCP sender does not receive feedback about locally lost 
packets (we assume it receives it, since computed transmit rate is good, 
see below)

but it was the following:

Let's say that qdisc on the sender allows 2Mb/s to get out.  A sender 
application sends a file at 3Mb/s to DCCP.  Currently, DCCP "eats" it 
completely, i.e. at 3Mb/s.  However, about 1Mb/s is "eaten" (lost) 
locally because of qdisc, and only 2Mb/s are sent to the network.  DCCP 
indeed sees that some packets are lost (the ones lost locally), that is 
why it computes a rate ("computed transmit rate") of 2Mb/s indeed (we 
printed it to the screen in our tests).  The problem is that DCCP "eats" 
3Mb/s instead of eating 2Mb/s.  In fact, it seems to us that when a 
packet is lost locally (DCCP_BUG called), the next available packet from 
DCCP socket is immediately taken into account, as if the other had not 
been "eaten" and had not been taken into account as a sent packet.

In other words: let's say that DCCP/TFRC (we have always used TFRC) 
sends one packet every N ms.  Normally, when DCCP sends a packet, no 
matter whether it gets out or is lost locally, it should wait for the 
interval (N ms) before sending the next packet; but this does not happen 
when a packet is lost *locally*.
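
To put a number on "N ms": RFC 5348 defines the nominal inter-packet interval as t_ipi = s/X, with s the packet size and X the allowed sending rate. A small sketch of that arithmetic, assuming 1500-byte packets purely for illustration (the packet size used in our tests is not stated here); the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Nominal inter-packet interval in microseconds, t_ipi = s / X,
 * with s given in bytes and X in bits per second. */
static uint64_t tfrc_ipi_us(uint32_t packet_bytes, uint64_t rate_bps)
{
	assert(rate_bps > 0);
	return (uint64_t)packet_bytes * 8u * 1000000u / rate_bps;
}
```

Under that assumption, a computed rate of 2Mb/s corresponds to one packet every 6000 us, and 3Mb/s to one packet every 4000 us. So if packets really leave the socket at 3Mb/s, they are spaced more closely than the computed rate allows.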

-- 
Eugen


* Re: DCCP_BUG called
  2010-08-19 14:21 DCCP_BUG called Eugen Dedu
                   ` (4 preceding siblings ...)
  2010-08-25 13:21 ` Eugen Dedu
@ 2010-08-26 11:08 ` gerrit
  2010-08-26 16:11 ` Eugen Dedu
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: gerrit @ 2010-08-26 11:08 UTC (permalink / raw)
  To: dccp

> Let's say that qdisc on the sender allows 2Mb/s to get out.  A sender
> application sends a file at 3Mb/s to DCCP.  Currently, DCCP "eats" it
> completely, i.e. at 3Mb/s.  However, about 1Mb/s is "eaten" (lost)
> locally because of qdisc, and only 2Mb/s are sent to the network.  DCCP
> indeed sees that some packets are lost (the ones lost locally), that is
> why it computes a rate ("computed transmit rate") of 2Mb/s indeed (we
> printed it to the screen in our tests).  The problem is that DCCP "eats"
> 3Mb/s instead of eating 2Mb/s.

Up to here I agree; but there is nothing wrong here. DCCP would even
"eat" 10Gbps if it were given large enough buffers. It is not a bug
since the actuator for the sending rate is the output, not the input.

It is made complicated since there are two control circuits wired in series:
 * TFRC as rate-based protocol functions similar to a Token Bucket Filter;
 * the Queueing Discipline attached to the output interface.

There are three different speeds:
 * the speed at which the application puts data into the socket (3Mbps)
 * the output rate of DCCP (circa 2Mbps as printed)
 * the target rate of the qdisc (also set to 2Mbps)

You have not said whether the application uses a constant bitrate; it looks
as if it does. In this case the two control circuits interact:
 * initially TFRC will send at a higher rate (slow-start);
 * to shape outgoing traffic, packets will be dropped at the outgoing
   interface;
 * the receiver (at the other end) will detect loss and feed it back
 * TFRC will recompute its sending rate and adjust in proportion to
   the experienced loss;
 * this stabilizes at some point where TFRC has converged to about 2Mbps


> In fact, it seems to us that when a packet is lost locally (DCCP_BUG
> called), the next available packet from DCCP socket is immediately
> taken into account, as if the other had not been "eaten" and had not been
> taken into account as a sent packet.

Yes, that is what I was trying to say: TCP feeds back local loss immediately
(but also notifies the receiver via ECN CWR), whereas DCCP has to wait
until the receiver reports the loss.

But as per previous email, I think it is not a high-priority issue to
provide a special case for local loss.

For tests involving traffic shaping the recommendation on the list has
been to use a separate "middlebox":

http://www.linuxfoundation.org/collaborate/workgroups/networking/dccptesting#Network_emulation_setup


Have you considered using dccp_probe to look at the other parameters --
some information is on

http://www.erg.abdn.ac.uk/users/gerrit/dccp/testing_dccp/#dccp_probe



* Re: DCCP_BUG called
  2010-08-19 14:21 DCCP_BUG called Eugen Dedu
                   ` (5 preceding siblings ...)
  2010-08-26 11:08 ` gerrit
@ 2010-08-26 16:11 ` Eugen Dedu
  2010-08-26 16:26 ` Ian McDonald
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Eugen Dedu @ 2010-08-26 16:11 UTC (permalink / raw)
  To: dccp

On 26/08/10 13:08, gerrit@erg.abdn.ac.uk wrote:
>> Let's say that qdisc on the sender allows 2Mb/s to get out.  A sender
>> application sends a file at 3Mb/s to DCCP.  Currently, DCCP "eats" it
>> completely, i.e. at 3Mb/s.  However, about 1Mb/s is "eaten" (lost)
>> locally because of qdisc, and only 2Mb/s are sent to the network.  DCCP
>> indeed sees that some packets are lost (the ones lost locally), that is
>> why it computes a rate ("computed transmit rate") of 2Mb/s indeed (we
>> printed it to the screen in our tests).  The problem is that DCCP "eats"
>> 3Mb/s instead of eating 2Mb/s.
>
> Up to here I agree; but there is nothing wrong here. DCCP would even
> "eat" 10Gbps if it were given large enough buffers. It is not a bug
> since the actuator for the sending rate is the output, not the input.
>
> It is made complicated since here are two control circuits wired in series:
>   * TFRC as rate-based protocol functions similar to a Token Bucket Filter;
>   * the Queueing Discipline attached to the output interface.
>
> There are three different speeds:
>   * the speed at which the application puts data into the socket (3Mbps)
>   * the output rate of DCCP (circa 2Mbps as printed)
>   * the target rate of the qdisc (also set to 2Mbps)

This needs a clarification.  Suppose a DCCPsocket with a size of a few 
packets.  The current situation is the following:

App ---------> DCCPsocket --------> qdisc ---------> network
       3Mb/s                 3Mb/s     |     2Mb/s
                                       v
                                     1Mb/s rejected locally

We believe that DCCP acts wrongly when it sends at 3Mb/s (identical to 
the application speed).  It should have been:

App ---------> DCCPsocket --------> qdisc ---------> network
       3Mb/s        |        2Mb/s           2Mb/s
                    v
                  1Mb/s rejected because buffer is full
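
The difference between the two diagrams is only where the excess 1Mb/s is discarded. A toy sketch of that bookkeeping, with rates in kbit/s; struct stage and limit_stage() are invented names and this models neither DCCP nor the qdisc in any detail.

```c
#include <assert.h>

/* Traffic entering a stage, leaving it, and discarded by it (kbit/s). */
struct stage {
	unsigned in_kbps;
	unsigned out_kbps;
	unsigned dropped_kbps;
};

/* A stage forwards at most 'limit_kbps' and discards the excess. */
static struct stage limit_stage(unsigned in_kbps, unsigned limit_kbps)
{
	struct stage s;

	assert(limit_kbps > 0);
	s.in_kbps = in_kbps;
	s.out_kbps = in_kbps < limit_kbps ? in_kbps : limit_kbps;
	s.dropped_kbps = in_kbps - s.out_kbps;
	return s;
}
```

First diagram: the socket passes all 3000 on and limit_stage(3000, 2000) at the qdisc discards 1000. Second diagram: limit_stage(3000, 2000) applies at the socket, so the qdisc sees limit_stage(2000, 2000) and discards nothing.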

Now, we have seen that DCCP correctly computes the estimated transmit 
rate to 2Mb/s.  We believe this should be considered as DCCP (buffer) 
output, not as network output.  What is the point of sending (eating 
from DCCPsocket) more than 2Mb/s if DCCP knows that all further packets 
are lost?  In other words, when a packet is lost locally, why send 
another packet right afterwards instead of waiting the N ms given by the 
TFRC equation?

In fact, when feedback about 1Mb/s lost packets arrives at the sender, 
three cases appear (I do not know how linux DCCP acts in reality):
- either DCCP pays attention to lost packets => it further reduces the 
rate, from 2Mb/s to say 1Mb/s, which is wrong, since the network accepts 
2Mb/s
- or DCCP pays attention to receiver rate (2Mb/s) and does NOT pay 
attention to lost packets, which is strange; in this case it stabilises 
to 2Mb/s indeed
- or other strange case (DCCP bug?)

> You have not said whether the application uses constant bitrate, it looks
> as if. In this case the two control circuits interact
>   * initially TFRC will send at a higher rate (slow-start);
>   * to shape outgoing traffic, packets will be dropped at the outgoing
>     interface;
>   * the receiver (at the other end) will detect loss and feed it back
>   * TFRC will recompute its sending rate and adjust in proportion to
>     the experienced loss;
>   * this stabilizes at some point where TFRC has converged to about 2Mbps
>
>
>> In fact, it seems to us that when a packet is lost locally (DCCP_BUG
>> called), the next available packet from DCCP socket is immediately
>> taken into account, as if the other had not been "eaten" and had not been
>> taken into account as a sent packet.
>
> Yes that is what was trying to say: TCP feeds back local loss immediately
> (but also notifies the receiver via ECN CWR), whereas DCCP has to wait
> until the receiver reports the loss.

It is not that DCCP has to wait one RTT, it is that it does not take 
into account the local losses at all.  DCCP does not act correctly one 
RTT later either.

> But as per previous email, I think it is not a high-priority issue to
> provide a special case for local loss.
>
> For tests involving traffic shaping the recommendation on the list has
> been to use a separate "middlebox":
>
> http://www.linuxfoundation.org/collaborate/workgroups/networking/dccptesting#Network_emulation_setup

Thank you, we have finally used a middlebox, and shaping works (well, 
from time to time there are intervals of 1 second where the receiver 
receives twice as many packets as the middlebox's qdisc would allow, but 
we need to investigate this strange issue further).

> Have you considered using dccp_probe to look at the other parameters --
> some information is on
>
> http://www.erg.abdn.ac.uk/users/gerrit/dccp/testing_dccp/#dccp_probe

We have done this manually, through getsockopt.

-- 
Eugen


* Re: DCCP_BUG called
  2010-08-19 14:21 DCCP_BUG called Eugen Dedu
                   ` (6 preceding siblings ...)
  2010-08-26 16:11 ` Eugen Dedu
@ 2010-08-26 16:26 ` Ian McDonald
  2010-08-26 20:12 ` Eugen Dedu
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Ian McDonald @ 2010-08-26 16:26 UTC (permalink / raw)
  To: dccp

One salient point here:
- are you using qdisc on the same box as the DCCP sender? I had to
shift qdisc onto another box as the point that qdisc intercepts is not
necessarily where you think. In my research I ended up using three
boxes - a sender, a traffic shaper and a receiver. I also had to
create two subnets and have the traffic shaper route also as otherwise
the boxes were smart enough to work out they were on the same Ethernet
segment.

I can supply some scripts etc if you want to look at mine...

Regards

Ian
--
twitter imcdnzl
web http://www.next-genit.co.uk

On 26 August 2010 17:11, Eugen Dedu <Eugen.Dedu@pu-pm.univ-fcomte.fr> wrote:
>
> On 26/08/10 13:08, gerrit@erg.abdn.ac.uk wrote:
>>>
>>> Let's say that qdisc on the sender allows 2Mb/s to get out.  A sender
>>> application sends a file at 3Mb/s to DCCP.  Currently, DCCP "eats" it
>>> completely, i.e. at 3Mb/s.  However, about 1Mb/s is "eaten" (lost)
>>> locally because of qdisc, and only 2Mb/s are sent to the network.  DCCP
>>> indeed sees that some packets are lost (the ones lost locally), that is
>>> why it computes a rate ("computed transmit rate") of 2Mb/s indeed (we
>>> printed it to the screen in our tests).  The problem is that DCCP "eats"
>>> 3Mb/s instead of eating 2Mb/s.
>>
>> Up to here I agree; but there is nothing wrong here. DCCP would even
>> "eat" 10Gbps if it were given large enough buffers. It is not a bug
>> since the actuator for the sending rate is the output, not the input.
>>


* Re: DCCP_BUG called
  2010-08-19 14:21 DCCP_BUG called Eugen Dedu
                   ` (7 preceding siblings ...)
  2010-08-26 16:26 ` Ian McDonald
@ 2010-08-26 20:12 ` Eugen Dedu
  2010-08-27 11:45 ` Gerrit Renker
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Eugen Dedu @ 2010-08-26 20:12 UTC (permalink / raw)
  To: dccp

On 26/08/10 18:26, Ian McDonald wrote:
> One salient point here:
> - are you using qdisc on the same box as the DCCP sender? I had to

Yes, during those tests qdisc was on the sender.  But now I use a 
middlebox, as you say.

> shift qdisc onto another box as the point that qdisc intercepts is not
> necessarily where you think. In my research I ended up using three
> boxes - a sender, a traffic shaper and a receiver. I also had to
> create two subnets and have the traffic shaper route also as otherwise
> the boxes were smart enough to work out they were on the same Ethernet
> segment.

I was aware of that too, so the middlebox was a linux machine as a router.

> I can supply some scripts etc if you want to look at mine...

We have a working environment, but maybe there is some useful stuff in 
your scripts, send them to us if it does not take you much time.

Thank you,
Eugen

> Regards
>
> Ian
> --
> twitter imcdnzl
> web http://www.next-genit.co.uk
>
> On 26 August 2010 17:11, Eugen Dedu<Eugen.Dedu@pu-pm.univ-fcomte.fr>  wrote:
>>
>> On 26/08/10 13:08, gerrit@erg.abdn.ac.uk wrote:
>>>>
>>>> Let's say that qdisc on the sender allows 2Mb/s to get out.  A sender
>>>> application sends a file at 3Mb/s to DCCP.  Currently, DCCP "eats" it
>>>> completely, i.e. at 3Mb/s.  However, about 1Mb/s is "eaten" (lost)
>>>> locally because of qdisc, and only 2Mb/s are sent to the network.  DCCP
>>>> indeed sees that some packets are lost (the ones lost locally), that is
>>>> why it computes a rate ("computed transmit rate") of 2Mb/s indeed (we
>>>> printed it to the screen in our tests).  The problem is that DCCP "eats"
>>>> 3Mb/s instead of eating 2Mb/s.
>>>
>>> Up to here I agree; but there is nothing wrong here. DCCP would even
>>> "eat" 10Gbps if it were given large enough buffers. It is not a bug
>>> since the actuator for the sending rate is the output, not the input.
>>>
>

* Re: DCCP_BUG called
  2010-08-19 14:21 DCCP_BUG called Eugen Dedu
                   ` (8 preceding siblings ...)
  2010-08-26 20:12 ` Eugen Dedu
@ 2010-08-27 11:45 ` Gerrit Renker
  2010-08-27 13:32 ` Eugen Dedu
  2010-08-30 17:33 ` Ian McDonald
  11 siblings, 0 replies; 13+ messages in thread
From: Gerrit Renker @ 2010-08-27 11:45 UTC (permalink / raw)
  To: dccp

> This needs a clarification.  Suppose a DCCPsocket with a size of a few  
> packets.  The current situation is the following:
>
> App ---------> DCCPsocket --------> qdisc ---------> network
>       3Mb/s                 3Mb/s     |     2Mb/s
>                                       v
>                                     1Mb/s rejected locally
>
> We believe that DCCP acts wrongly when it sends at 3Mb/s (identical to
> the application speed).  It should have been:
>
> App ---------> DCCPsocket --------> qdisc ---------> network
>       3Mb/s        |        2Mb/s           2Mb/s
>                    v
>                  1Mb/s rejected because buffer is full
>
> Now, we have seen that DCCP correctly computes the estimated transmit  
> rate to 2Mb/s.  We believe this should be considered as DCCP (buffer)  
> output, not as network output. 
>
I am not sure the above diagrams are correct. If TFRC sets a target bitrate
of 2Mbps, it will also send at that rate. I have just done some tests that
showed that the control of the output rate matches the expected value:
http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/ccid3/sender_notes/rate_mismatch_controller/

Hence in the first diagram the input rate at the qdisc input would be 2Mbps, not
3Mbps. But it is difficult to argue without testing, and as Ian pointed out, it
is not such a good idea to run the traffic shaper on the same box as the sender.

As per previous email, before drawing conclusions as above, I would really like to
encourage you to use dccp_probe. You are testing only one parameter, the computed
allowed sending rate X. This leaves out the current value of the loss rate p,
the computed sending rate X_calc, and the sending rate estimated at the receiver,
X_recv. 

Plus, using getsockopt is very unreliable for polling information, since it always
involves at least the overhead of a system call.

Before suggesting that there is a bug here, please consider your setup.

> What is the point of sending (eating from DCCPsocket) more than 2Mb/s
> if DCCP knows that all further packets are lost?  In other words, when a
> packet is lost locally, why send another packet right afterwards instead
> of waiting the N ms given by the TFRC equation?
>
As per previous email I agree with your suggestion that it would be perfect
if DCCP would also be able to handle local loss, as it is done by TCP.

But handling this special case is not a significant problem, since the sender
does react to this loss -- at the moment it receives feedback and recomputes
the allowed sending rate X.


> In fact, when feedback about 1Mb/s lost packets arrives at the sender,  
> three cases appear (I do not know how linux DCCP acts in reality):
CCID-3 is based on the TFRC specification, originally specified in RFC 3448,
now RFC 5348. The code is still between these two revisions, at the state
of rfc3448bis 0/1 (the working draft leading up to RFC 5348).

>> Yes, that is what I was trying to say: TCP feeds back local loss immediately
>> (but also notifies the receiver via ECN CWR), whereas DCCP has to wait
>> until the receiver reports the loss.
>
> It is not that DCCP has to wait one RTT, it is that it does not take  
> into account the local losses at all.  DCCP does not act correctly one  
> RTT later either.
>
I am not sure I want to believe what you are saying. As said, there are limits
as to what getsockopt can do for you, and hence the time it takes to complete
one getsockopt call can well be within one or more RTTs. There is a context 
switch involved also; so if your RTT is in the order of 1 millisecond, that
is already the granularity of one scheduling timeslice.

> Thank you, we have finally used a middlebox, and shaping works (well,
> from time to time there are 1-second intervals where the receiver
> receives twice as many packets as the middlebox's qdisc would allow, but
> we need to investigate this strange issue further).
>
In theory the limit that TFRC can control is 12Mbps (MTU=1500, HZ=1000);
at speeds higher than that it will send bursts where the instantaneous
speed can be much higher.
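
The 12Mbps figure presumably comes from pacing at most one full-sized
packet per timer tick: 1500 bytes x 8 bits x 1000 ticks per second. A
minimal shell sketch of that arithmetic follows; the function name
max_paced_rate_bps is purely illustrative, and the one-packet-per-tick
model is an assumption of this example, not something taken from the
DCCP code.

```shell
#!/bin/sh
# Assumption: the sender can release at most one MTU-sized packet per
# timer tick (HZ ticks per second). The function name is hypothetical.
max_paced_rate_bps() {
    mtu_bytes=$1
    hz=$2
    echo $((mtu_bytes * 8 * hz))
}

# Bits per second for MTU=1500 bytes, HZ=1000; prints 12000000
max_paced_rate_bps 1500 1000
```

Under the same assumption, a kernel built with HZ=250 would hit the
limit at a quarter of that rate.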

* Re: DCCP_BUG called
  2010-08-19 14:21 DCCP_BUG called Eugen Dedu
                   ` (9 preceding siblings ...)
  2010-08-27 11:45 ` Gerrit Renker
@ 2010-08-27 13:32 ` Eugen Dedu
  2010-08-30 17:33 ` Ian McDonald
  11 siblings, 0 replies; 13+ messages in thread
From: Eugen Dedu @ 2010-08-27 13:32 UTC (permalink / raw)
  To: dccp

On 27/08/10 13:45, Gerrit Renker wrote:
>> This needs a clarification.  Suppose a DCCPsocket with a size of a few
>> packets.  The current situation is the following:
>>
>> App --------->  DCCPsocket -------->  qdisc --------->  network
>>        3Mb/s                 3Mb/s     |     2Mb/s
>>                                        v
>>                                      1Mb/s rejected locally
>>
>> We believe that DCCP acts wrongly when it sends at 3Mb/s (identical to
>> the application speed).  It should have been:
>>
>> App --------->  DCCPsocket -------->  qdisc --------->  network
>>        3Mb/s        |        2Mb/s           2Mb/s
>>                     v
>>                   1Mb/s rejected because buffer is full
>>
>> Now, we have seen that DCCP correctly computes the estimated transmit
>> rate to 2Mb/s.  We believe this should be considered as DCCP (buffer)
>> output, not as network output.
>>
> I am not sure the above diagrams are correct. If TFRC sets a target bitrate
> of 2Mbps, it will also send at that rate. I have just done some tests that
> showed that the control of the output rate matches the expected value:
> http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/ccid3/sender_notes/rate_mismatch_controller/
>
> Hence in the first diagram the input rate at the qdisc input would be 2Mbps, not
> 3Mbps. But it is difficult to argue without testing, and as Ian pointed out, it
> is not such a good idea to run the traffic shaper on the same box as the sender.
>
> As per previous email, before drawing conclusions as above, I would really like to
> encourage you to use dccp_probe. You are testing only one parameter, the computed
> allowed sending rate X. This leaves out the current value of the loss rate p,
> the computed sending rate X_calc, and the sending rate estimated at the receiver,
> X_recv.
>
> Plus, using getsockopt is very unreliable for polling information, since it always
> involves at least the overhead of a system call.
>
> Before suggesting that there is a bug here, please consider your setup.

Ok, I think you are right.  Thank you for this discussion.

>> Thank you, we have finally used a middlebox, and shaping works (well,
>> from time to time there are 1-second intervals where the receiver
>> receives twice as many packets as the middlebox's qdisc would allow, but
>> we need to investigate this strange issue further).
>>
> In theory the limit that TFRC can control is 12Mbps (MTU=1500, HZ=1000);
> at speeds higher than that it will send bursts where the instantaneous
> speed can be much higher.

Thank you for the information, I didn't know that.

Cheers,
-- 
Eugen

* Re: DCCP_BUG called
  2010-08-19 14:21 DCCP_BUG called Eugen Dedu
                   ` (10 preceding siblings ...)
  2010-08-27 13:32 ` Eugen Dedu
@ 2010-08-30 17:33 ` Ian McDonald
  11 siblings, 0 replies; 13+ messages in thread
From: Ian McDonald @ 2010-08-30 17:33 UTC (permalink / raw)
  To: dccp

On 26 August 2010 21:12, Eugen Dedu <Eugen.Dedu@pu-pm.univ-fcomte.fr> wrote:
> We have a working environment, but maybe there is some useful stuff in your scripts, send them to us if it does not take you much time.
>
> Thank you,
> Eugen
>

To turn rate limiting and delay on I do:
/sbin/tc qdisc add dev lan0 root handle 1:0 netem delay $1ms
/sbin/tc qdisc add dev lan1 root handle 1:0 netem delay $1ms
/sbin/tc qdisc add dev lan0 parent 1:1 handle 10: tbf rate $2kbit buffer 10000 limit 30000

and off:
/sbin/tc qdisc del dev lan0 parent 1:1
/sbin/tc qdisc del dev lan0 root
/sbin/tc qdisc del dev lan1 root

To turn on packet loss and delay I do:
/sbin/tc qdisc add dev lan1 root netem delay $1ms loss $3%
/sbin/tc qdisc add dev lan0 root netem delay $1ms loss $2%

and off:
/sbin/tc qdisc del dev lan1 root netem
/sbin/tc qdisc del dev lan0 root netem

These are all from the box in the middle.
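
The rate-limiting commands above could be wrapped in a small script along
these lines. This is only a sketch: the function names shaper_on and
shaper_off and the dry-run TC default are assumptions of this example and
are not part of the original scripts. By default it only prints the tc
commands; setting TC=/sbin/tc (as root, on the middle box) would apply
them.

```shell
#!/bin/sh
# Dry run by default: prints each tc command instead of running it.
# Set TC=/sbin/tc (as root) to apply the commands for real.
TC=${TC:-echo /sbin/tc}

# shaper_on DELAY_MS RATE_KBIT
# Adds netem delay on both interfaces and a tbf rate limit on lan0.
shaper_on() {
    $TC qdisc add dev lan0 root handle 1:0 netem delay "${1}ms"
    $TC qdisc add dev lan1 root handle 1:0 netem delay "${1}ms"
    $TC qdisc add dev lan0 parent 1:1 handle 10: tbf rate "${2}kbit" \
        buffer 10000 limit 30000
}

# shaper_off
# Removes the qdiscs again, child first, then the roots.
shaper_off() {
    $TC qdisc del dev lan0 parent 1:1
    $TC qdisc del dev lan0 root
    $TC qdisc del dev lan1 root
}

# Example: 50 ms delay in each direction, 2000 kbit/s limit on lan0.
shaper_on 50 2000
shaper_off
```

The interface names lan0 and lan1 and the tbf buffer and limit values are
taken unchanged from the commands above.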

Regards

Ian

end of thread, other threads:[~2010-08-30 17:33 UTC | newest]

Thread overview: 13+ messages
2010-08-19 14:21 DCCP_BUG called Eugen Dedu
2010-08-20  5:15 ` Gerrit Renker
2010-08-20  8:46 ` Eugen Dedu
2010-08-20 10:40 ` Gerrit Renker
2010-08-23  5:29 ` Gerrit Renker
2010-08-25 13:21 ` Eugen Dedu
2010-08-26 11:08 ` gerrit
2010-08-26 16:11 ` Eugen Dedu
2010-08-26 16:26 ` Ian McDonald
2010-08-26 20:12 ` Eugen Dedu
2010-08-27 11:45 ` Gerrit Renker
2010-08-27 13:32 ` Eugen Dedu
2010-08-30 17:33 ` Ian McDonald
