All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vlad Yasevich <vyasevich@gmail.com>
To: linux-sctp@vger.kernel.org
Subject: Re: Suspected renege problem in sctp
Date: Thu, 31 Jan 2013 15:08:05 +0000	[thread overview]
Message-ID: <510A88D5.9000904@gmail.com> (raw)
In-Reply-To: <5109DDE8.9050700@gmail.com>

On 01/30/2013 11:30 PM, Roberts, Lee A. wrote:
> Vlad,
>
> The test code that I'm running at the moment has changes similar to the following.
> I think we want to peek at the tail of the queue---and not dequeue (or unlink) the
> data until we're sure we want to renege.

You are right.  If Bob can send a signed-off patch linux-sctp and 
netdev, we can get it upstream and into stable releases.

-vlad

>
>                                                  -- Lee Roberts
>
>
> # diff -c ulpqueue.c~ ulpqueue.c
> *** ulpqueue.c~ 2012-10-09 14:31:35.000000000 -0600
> --- ulpqueue.c  2013-01-30 21:22:49.000000000 -0700
> ***************
> *** 963,973 ****
>
>          tsnmap = &ulpq->asoc->peer.tsn_map;
>
> !       while ((skb = __skb_dequeue_tail(list)) != NULL) {
> !               freed += skb_headlen(skb);
>                  event = sctp_skb2event(skb);
>                  tsn = event->tsn;
>
>                  sctp_ulpevent_free(event);
>                  sctp_tsnmap_renege(tsnmap, tsn);
>                  if (freed >= needed)
> --- 963,976 ----
>
>          tsnmap = &ulpq->asoc->peer.tsn_map;
>
> !       while ((skb = skb_peek_tail(list)) != NULL) {
>                  event = sctp_skb2event(skb);
>                  tsn = event->tsn;
> +               if (TSN_lte(tsn, sctp_tsnmap_get_ctsn(tsnmap)))
> +                       break;
>
> +               __skb_unlink(skb, list);
> +               freed += skb_headlen(skb);
>                  sctp_ulpevent_free(event);
>                  sctp_tsnmap_renege(tsnmap, tsn);
>                  if (freed >= needed)
> #
>
>
>
> -----Original Message-----
> From: Vlad Yasevich [mailto:vyasevich@gmail.com]
> Sent: Wednesday, January 30, 2013 7:59 PM
> To: Montgomery, Bob
> Cc: Roberts, Lee A.; linux-sctp@vger.kernel.org
> Subject: Re: Suspected renege problem in sctp
>
> On 01/30/2013 07:51 PM, Bob Montgomery wrote:
>> Vlad,
>>
>> If you're not the right guy to ask can you let me know who might be?
>
> Hi Bob.  I am still the right person, but it would be even better to
> address these types of questions and writeups to linux-sctp@vger.kernel.org.
>
>>
>> We've been investigating sctpspray hangs for quite some time.
>>
>> I think we're seeing a case where a renege operation is removing
>> fragments from the ulp reasm queue that are at or below the value
>> of cumulative_tsn_ack_point.  We put a BUG statement in
>> sctp_tsnmap_renege so we'd crash instead of ignoring this case:
>>
>>           if (TSN_lt(tsn, map->base_tsn))
>>                   return;
>
>
> Hmm..  Looking at the reneging functions there is no TSN checking at
> all.  You are completely right.  We MUST not renege DATA that has moved
> the cumulative_tsn_ack_point.
>
> Adding something like this in sctp_ulpq_renege_list should fix this:
>
> diff --git a/net/sctp/ulpqueue.c b/net/sctp/ulpqueue.c
> index ada1746..9bd94e6 100644
> --- a/net/sctp/ulpqueue.c
> +++ b/net/sctp/ulpqueue.c
> @@ -970,10 +970,14 @@ static __u16 sctp_ulpq_renege_list(struct
> sctp_ulpq *ulpq,
>           tsnmap = &ulpq->asoc->peer.tsn_map;
>
>           while ((skb = __skb_dequeue_tail(list)) != NULL) {
> -               freed += skb_headlen(skb);
>                   event = sctp_skb2event(skb);
>                   tsn = event->tsn;
> -
> +
> +               /* Make sure we do not renege below CTSN */
> +               if (TSN_lte(tsn, sctp_tsnmap_get_ctsn(tsnmap)))
> +                       break;
> +
> +               freed += skb_headlen(skb);
>                   sctp_ulpevent_free(event);
>                   sctp_tsnmap_renege(tsnmap, tsn);
>                   if (freed >= needed)
>
> I think there also might be a bug here when reneging from the ordered
> queue and the message has been reassembled.  I need to look at that a
> bit more.
>
> -vlad
>
>>
>> And after two days of sctpspray pounding, hit the bug.
>>
>> Here's a partial write-up using examples from the core file:
>>
>> ====================================>> Fact 1:  a renege operation will only be launched if there's a gap in
>> the tsn.
>>
>> So a reassembly queue like this one would not be set upon:
>>
>> PID 1784
>> sctp_association 0xffff88041b6a2000
>>
>>         tsn_map = 0xffff88041dd8d560,
>>         base_tsn = 0x55751715,
>>         cumulative_tsn_ack_point = 0x55751714,
>>         max_tsn_seen = 0x55751714,
>>
>> reasm queue summary:
>>     ssn = 0x345,   tsn = 0x5575170c,   msg_flags = 0x2,   rmem_len = 0x69c
>>     ssn = 0x345,   tsn = 0x5575170d,   msg_flags = 0x0,   rmem_len = 0x69c
>>     ssn = 0x345,   tsn = 0x5575170e,   msg_flags = 0x0,   rmem_len = 0x69c
>>     ssn = 0x345,   tsn = 0x5575170f,   msg_flags = 0x0,   rmem_len = 0x69c
>>     ssn = 0x345,   tsn = 0x55751710,   msg_flags = 0x0,   rmem_len = 0x69c
>>     ssn = 0x345,   tsn = 0x55751711,   msg_flags = 0x0,   rmem_len = 0x69c
>>     ssn = 0x345,   tsn = 0x55751712,   msg_flags = 0x0,   rmem_len = 0x69c
>>     ssn = 0x345,   tsn = 0x55751713,   msg_flags = 0x0,   rmem_len = 0x69c
>>     ssn = 0x345,   tsn = 0x55751714,   msg_flags = 0x0,   rmem_len = 0x69c
>>
>> No gap: the last tsn in the reasm queue matches cumulative_tsn_ack_point = 0x55751714,
>>
>>
>> In our case, I believe the reasm queue looked like this when the renege launched:
>>
>> In sctp_association 0xffff88041b845000
>>
>>         base_tsn = 0x936a6d76,
>>         cumulative_tsn_ack_point = 0x936a6d75,
>>         max_tsn_seen = 0x936a6d79,
>>
>>     ssn = 0x0,   tsn = 0x936a6d6f,   msg_flags = 0x2,   rmem_len = 0x69c
>>     ssn = 0x0,   tsn = 0x936a6d70,   msg_flags = 0x0,   rmem_len = 0x69c
>>     ssn = 0x0,   tsn = 0x936a6d71,   msg_flags = 0x0,   rmem_len = 0x69c
>>     ssn = 0x0,   tsn = 0x936a6d72,   msg_flags = 0x0,   rmem_len = 0x69c
>>     ssn = 0x0,   tsn = 0x936a6d73,   msg_flags = 0x0,   rmem_len = 0x69c
>>     ssn = 0x0,   tsn = 0x936a6d74,   msg_flags = 0x0,   rmem_len = 0x69c
>>     ssn = 0x0,   tsn = 0x936a6d75,   msg_flags = 0x0,   rmem_len = 0x69c
>>
>>     ssn = 0x0,   tsn = 0x936a6d78,   msg_flags = 0x0,   rmem_len = 0x628
>>
>> The gap between 75 and 78 meant that a renege could be launched.
>>
>> It was launched for tsn = 0x936a6d76 (just arriving and apparently out
>> of memory), and the "needed" amount was 0x05ac (1452 bytes).
>>
>> tsn = 0x936a6d78 was removed, and the amount recovered was 0x538 (1336 bytes).
>> The value of "rmem_len" in the event is not what is used to calculated needed
>> and freed.
>>
>> Since 0x538 didn't satisfy 0x5ac, it went for the next one down on the queue
>> (tsn = 0x936a6d75)
>> and recovered 0x5ac from it for a total recovery of 0xae4 (2788 bytes).
>> So because the first post-gap fragment happened to be a LAST_FRAG and shorter than
>> the rest of them, it wasn't enough to satisfy the request and we moved on
>> to the one that caused the BUG.
>>
>> If there had been two gapped frags, or if the gapped frag had been another
>> middle one that was big enough to satisfy the request, it would not have
>> been caught freeing a fragment that was at the cumulative tsn ack point.
>> ====================================
>>
>> Since the base_tsn and cumulative_tsn_ack_point are advanced in
>> sctp_ulpevent_make_rcvmsg() before putting the fragments on the
>> reasm queue, the renege code should not be allowed to dip below
>> that point in sctp_ulpq_renege_list().   Otherwise, you're
>> discarding undelivered data that you've already reported as
>> "delivered" to the sender, right?
>>
>> Thanks,
>> Bob Montgomery
>>
>>
>>
>>
>


  parent reply	other threads:[~2013-01-31 15:08 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-01-31  2:58 Suspected renege problem in sctp Vlad Yasevich
2013-01-31  4:30 ` Roberts, Lee A.
2013-01-31 15:08 ` Vlad Yasevich [this message]
2013-02-04 23:47 ` Bob Montgomery
2013-02-05 15:56 ` Vlad Yasevich
2013-02-05 23:56 ` Bob Montgomery
2013-02-06 15:58 ` Vlad Yasevich
2013-02-06 23:04 ` Bob Montgomery

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=510A88D5.9000904@gmail.com \
    --to=vyasevich@gmail.com \
    --cc=linux-sctp@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.