netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHv2 net-next 00/12] sctp: Implement Stream Interleave: The I-DATA Chunk Supporting User Message Interleaving
@ 2017-12-08 13:03 Xin Long
  2017-12-08 13:03 ` [PATCHv2 net-next 01/12] sctp: add stream interleave enable members and sockopt Xin Long
  2017-12-11 16:23 ` [PATCHv2 net-next 00/12] sctp: Implement Stream Interleave: The I-DATA Chunk Supporting User Message Interleaving David Miller
  0 siblings, 2 replies; 30+ messages in thread
From: Xin Long @ 2017-12-08 13:03 UTC (permalink / raw)
  To: network dev, linux-sctp; +Cc: Marcelo Ricardo Leitner, Neil Horman, davem

Stream Interleave would be Implemented in two Parts:

   1. The I-DATA Chunk Supporting User Message Interleaving
   2. Interaction with Other SCTP Extensions

Overview in section 1.1 of RFC8260 for Part 1:

   This document describes a new chunk carrying payload data called
   I-DATA.  This chunk incorporates the properties of the current SCTP
   DATA chunk, all the flags and fields except the Stream Sequence
   Number (SSN), and also adds two new fields in its chunk header -- the
   Fragment Sequence Number (FSN) and the Message Identifier (MID).  The
   FSN is only used for reassembling all fragments that have the same
   MID and the same ordering property.  The TSN is only used for the
   reliable transfer in combination with Selective Acknowledgment (SACK)
   chunks.

   In addition, the MID is also used for ensuring ordered delivery
   instead of using the stream sequence number (the I-DATA chunk omits
   an SSN).

As the 1st part of Stream Interleave Implementation, this patchset adds
an ops framework named sctp_stream_interleave with a bunch of stuff that
does lots of things needed somewhere.

Then it defines sctp_stream_interleave_0 to work for normal DATA chunks
and sctp_stream_interleave_1 for I-DATA chunks.

With these functions, hundreds of if-else checks for the different process
on I-DATA chunks would be avoided. Besides, very few codes could be shared
in these two function sets.

In this patchset, it adds some basic variables, structures and socket
options firstly, then implement these functions one by one to add the
procedures for ordered idata gradually, at last adjusts some codes to
make them work for unordered idata.

To make it safe to be implemented and also not break the normal data
chunk process, this feature can't be enabled to use until all stream
interleave codes are completely accomplished.

v1 -> v2:
  - fixed a checkpatch warning that a blank line was missed.
  - avoided a kbuild warning reported from gcc-4.9.

Xin Long (12):
  sctp: add stream interleave enable members and sockopt
  sctp: add asoc intl_enable negotiation during 4 shakehands
  sctp: add basic structures and make chunk function for idata
  sctp: implement make_datafrag for sctp_stream_interleave
  sctp: implement assign_number for sctp_stream_interleave
  sctp: implement validate_data for sctp_stream_interleave
  sctp: implement ulpevent_data for sctp_stream_interleave
  sctp: implement enqueue_event for sctp_stream_interleave
  sctp: implement renege_events for sctp_stream_interleave
  sctp: implement start_pd for sctp_stream_interleave
  sctp: implement abort_pd for sctp_stream_interleave
  sctp: add support for the process of unordered idata

 include/linux/sctp.h                 |   20 +
 include/net/netns/sctp.h             |    5 +-
 include/net/sctp/constants.h         |    9 +-
 include/net/sctp/sctp.h              |    4 +-
 include/net/sctp/sm.h                |   15 +-
 include/net/sctp/stream_interleave.h |   54 ++
 include/net/sctp/structs.h           |   56 +-
 include/net/sctp/ulpevent.h          |   23 +-
 include/net/sctp/ulpqueue.h          |   10 +-
 include/uapi/linux/sctp.h            |    3 +
 net/sctp/Makefile                    |    2 +-
 net/sctp/associola.c                 |    2 +-
 net/sctp/chunk.c                     |    8 +-
 net/sctp/output.c                    |    5 +-
 net/sctp/sm_make_chunk.c             |   45 +-
 net/sctp/sm_sideeffect.c             |   23 +-
 net/sctp/sm_statefuns.c              |   21 +-
 net/sctp/sm_statetable.c             |    3 +
 net/sctp/socket.c                    |  130 +++-
 net/sctp/stream.c                    |    1 +
 net/sctp/stream_interleave.c         | 1118 ++++++++++++++++++++++++++++++++++
 net/sctp/ulpevent.c                  |   15 +-
 net/sctp/ulpqueue.c                  |   23 +-
 23 files changed, 1501 insertions(+), 94 deletions(-)
 create mode 100644 include/net/sctp/stream_interleave.h
 create mode 100644 net/sctp/stream_interleave.c

-- 
2.1.0

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCHv2 net-next 01/12] sctp: add stream interleave enable members and sockopt
  2017-12-08 13:03 [PATCHv2 net-next 00/12] sctp: Implement Stream Interleave: The I-DATA Chunk Supporting User Message Interleaving Xin Long
@ 2017-12-08 13:03 ` Xin Long
  2017-12-08 13:03   ` [PATCHv2 net-next 02/12] sctp: add asoc intl_enable negotiation during 4 shakehands Xin Long
  2017-12-11 16:23 ` [PATCHv2 net-next 00/12] sctp: Implement Stream Interleave: The I-DATA Chunk Supporting User Message Interleaving David Miller
  1 sibling, 1 reply; 30+ messages in thread
From: Xin Long @ 2017-12-08 13:03 UTC (permalink / raw)
  To: network dev, linux-sctp; +Cc: Marcelo Ricardo Leitner, Neil Horman, davem

This patch adds intl_enable in asoc and netns, and strm_interleave in
sctp_sock to indicate if stream interleave is enabled and supported.

netns intl_enable would be set via procfs, but that is not added yet
until all stream interleave codes are completely implemented; asoc
intl_enable will be set when doing 4-shakehands.

sp strm_interleave can be set by sockopt SCTP_INTERLEAVING_SUPPORTED
which is also added in this patch. This socket option is defined in
section 4.3.1 of RFC8260.

Note that strm_interleave can only be set by sockopt when both netns
intl_enable and sp frag_interleave are set.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
---
 include/net/netns/sctp.h   |  5 ++-
 include/net/sctp/structs.h |  2 ++
 include/uapi/linux/sctp.h  |  1 +
 net/sctp/socket.c          | 88 +++++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 94 insertions(+), 2 deletions(-)

diff --git a/include/net/netns/sctp.h b/include/net/netns/sctp.h
index ebc8132..0db7fb3 100644
--- a/include/net/netns/sctp.h
+++ b/include/net/netns/sctp.h
@@ -122,9 +122,12 @@ struct netns_sctp {
 	/* Flag to indicate if PR-CONFIG is enabled. */
 	int reconf_enable;
 
-	/* Flag to idicate if SCTP-AUTH is enabled */
+	/* Flag to indicate if SCTP-AUTH is enabled */
 	int auth_enable;
 
+	/* Flag to indicate if stream interleave is enabled */
+	int intl_enable;
+
 	/*
 	 * Policy to control SCTP IPv4 address scoping
 	 * 0   - Disable IPv4 address scoping
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 2f8f93d..7030cbe 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -217,6 +217,7 @@ struct sctp_sock {
 		disable_fragments:1,
 		v4mapped:1,
 		frag_interleave:1,
+		strm_interleave:1,
 		recvrcvinfo:1,
 		recvnxtinfo:1,
 		data_ready_signalled:1;
@@ -1940,6 +1941,7 @@ struct sctp_association {
 	__u8 need_ecne:1,	/* Need to send an ECNE Chunk? */
 	     temp:1,		/* Is it a temporary association? */
 	     force_delay:1,
+	     intl_enable:1,
 	     prsctp_enable:1,
 	     reconf_enable:1;
 
diff --git a/include/uapi/linux/sctp.h b/include/uapi/linux/sctp.h
index d9adab3..6ed934c 100644
--- a/include/uapi/linux/sctp.h
+++ b/include/uapi/linux/sctp.h
@@ -125,6 +125,7 @@ typedef __s32 sctp_assoc_t;
 #define SCTP_SOCKOPT_PEELOFF_FLAGS 122
 #define SCTP_STREAM_SCHEDULER	123
 #define SCTP_STREAM_SCHEDULER_VALUE	124
+#define SCTP_INTERLEAVING_SUPPORTED	125
 
 /* PR-SCTP policies */
 #define SCTP_PR_SCTP_NONE	0x0000
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 014847e..8c33463 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -3350,7 +3350,10 @@ static int sctp_setsockopt_fragment_interleave(struct sock *sk,
 	if (get_user(val, (int __user *)optval))
 		return -EFAULT;
 
-	sctp_sk(sk)->frag_interleave = (val == 0) ? 0 : 1;
+	sctp_sk(sk)->frag_interleave = !!val;
+
+	if (!sctp_sk(sk)->frag_interleave)
+		sctp_sk(sk)->strm_interleave = 0;
 
 	return 0;
 }
@@ -4019,6 +4022,40 @@ static int sctp_setsockopt_scheduler_value(struct sock *sk,
 	return retval;
 }
 
+static int sctp_setsockopt_interleaving_supported(struct sock *sk,
+						  char __user *optval,
+						  unsigned int optlen)
+{
+	struct sctp_sock *sp = sctp_sk(sk);
+	struct net *net = sock_net(sk);
+	struct sctp_assoc_value params;
+	int retval = -EINVAL;
+
+	if (optlen < sizeof(params))
+		goto out;
+
+	optlen = sizeof(params);
+	if (copy_from_user(&params, optval, optlen)) {
+		retval = -EFAULT;
+		goto out;
+	}
+
+	if (params.assoc_id)
+		goto out;
+
+	if (!net->sctp.intl_enable || !sp->frag_interleave) {
+		retval = -EPERM;
+		goto out;
+	}
+
+	sp->strm_interleave = !!params.assoc_value;
+
+	retval = 0;
+
+out:
+	return retval;
+}
+
 /* API 6.2 setsockopt(), getsockopt()
  *
  * Applications use setsockopt() and getsockopt() to set or retrieve
@@ -4206,6 +4243,10 @@ static int sctp_setsockopt(struct sock *sk, int level, int optname,
 	case SCTP_STREAM_SCHEDULER_VALUE:
 		retval = sctp_setsockopt_scheduler_value(sk, optval, optlen);
 		break;
+	case SCTP_INTERLEAVING_SUPPORTED:
+		retval = sctp_setsockopt_interleaving_supported(sk, optval,
+								optlen);
+		break;
 	default:
 		retval = -ENOPROTOOPT;
 		break;
@@ -6981,6 +7022,47 @@ static int sctp_getsockopt_scheduler_value(struct sock *sk, int len,
 	return retval;
 }
 
+static int sctp_getsockopt_interleaving_supported(struct sock *sk, int len,
+						  char __user *optval,
+						  int __user *optlen)
+{
+	struct sctp_assoc_value params;
+	struct sctp_association *asoc;
+	int retval = -EFAULT;
+
+	if (len < sizeof(params)) {
+		retval = -EINVAL;
+		goto out;
+	}
+
+	len = sizeof(params);
+	if (copy_from_user(&params, optval, len))
+		goto out;
+
+	asoc = sctp_id2assoc(sk, params.assoc_id);
+	if (asoc) {
+		params.assoc_value = asoc->intl_enable;
+	} else if (!params.assoc_id) {
+		struct sctp_sock *sp = sctp_sk(sk);
+
+		params.assoc_value = sp->strm_interleave;
+	} else {
+		retval = -EINVAL;
+		goto out;
+	}
+
+	if (put_user(len, optlen))
+		goto out;
+
+	if (copy_to_user(optval, &params, len))
+		goto out;
+
+	retval = 0;
+
+out:
+	return retval;
+}
+
 static int sctp_getsockopt(struct sock *sk, int level, int optname,
 			   char __user *optval, int __user *optlen)
 {
@@ -7171,6 +7253,10 @@ static int sctp_getsockopt(struct sock *sk, int level, int optname,
 		retval = sctp_getsockopt_scheduler_value(sk, len, optval,
 							 optlen);
 		break;
+	case SCTP_INTERLEAVING_SUPPORTED:
+		retval = sctp_getsockopt_interleaving_supported(sk, len, optval,
+								optlen);
+		break;
 	default:
 		retval = -ENOPROTOOPT;
 		break;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCHv2 net-next 02/12] sctp: add asoc intl_enable negotiation during 4 shakehands
  2017-12-08 13:03 ` [PATCHv2 net-next 01/12] sctp: add stream interleave enable members and sockopt Xin Long
@ 2017-12-08 13:03   ` Xin Long
  2017-12-08 13:04     ` [PATCHv2 net-next 03/12] sctp: add basic structures and make chunk function for idata Xin Long
  0 siblings, 1 reply; 30+ messages in thread
From: Xin Long @ 2017-12-08 13:03 UTC (permalink / raw)
  To: network dev, linux-sctp; +Cc: Marcelo Ricardo Leitner, Neil Horman, davem

asoc intl_enable will be set when local sp strm_interleave is set
and there's I-DATA chunk in init and init_ack extensions, as said
in section 2.2.1 of RFC8260.

asoc intl_enable indicates all data will be sent as I-DATA chunks.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
---
 include/linux/sctp.h     |  3 +++
 net/sctp/sm_make_chunk.c | 18 ++++++++++++++++--
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/include/linux/sctp.h b/include/linux/sctp.h
index da803df..6d2bd64 100644
--- a/include/linux/sctp.h
+++ b/include/linux/sctp.h
@@ -102,6 +102,9 @@ enum sctp_cid {
 	/* AUTH Extension Section 4.1 */
 	SCTP_CID_AUTH			= 0x0F,
 
+	/* sctp ndata 5.1. I-DATA */
+	SCTP_CID_I_DATA			= 0x40,
+
 	/* PR-SCTP Sec 3.2 */
 	SCTP_CID_FWD_TSN		= 0xC0,
 
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index 9bf575f..da33c85 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -228,7 +228,7 @@ struct sctp_chunk *sctp_make_init(const struct sctp_association *asoc,
 	struct sctp_inithdr init;
 	union sctp_params addrs;
 	struct sctp_sock *sp;
-	__u8 extensions[4];
+	__u8 extensions[5];
 	size_t chunksize;
 	__be16 types[2];
 	int num_ext = 0;
@@ -278,6 +278,11 @@ struct sctp_chunk *sctp_make_init(const struct sctp_association *asoc,
 	if (sp->adaptation_ind)
 		chunksize += sizeof(aiparam);
 
+	if (sp->strm_interleave) {
+		extensions[num_ext] = SCTP_CID_I_DATA;
+		num_ext += 1;
+	}
+
 	chunksize += vparam_len;
 
 	/* Account for AUTH related parameters */
@@ -392,7 +397,7 @@ struct sctp_chunk *sctp_make_init_ack(const struct sctp_association *asoc,
 	struct sctp_inithdr initack;
 	union sctp_params addrs;
 	struct sctp_sock *sp;
-	__u8 extensions[4];
+	__u8 extensions[5];
 	size_t chunksize;
 	int num_ext = 0;
 	int cookie_len;
@@ -442,6 +447,11 @@ struct sctp_chunk *sctp_make_init_ack(const struct sctp_association *asoc,
 	if (sp->adaptation_ind)
 		chunksize += sizeof(aiparam);
 
+	if (asoc->intl_enable) {
+		extensions[num_ext] = SCTP_CID_I_DATA;
+		num_ext += 1;
+	}
+
 	if (asoc->peer.auth_capable) {
 		auth_random = (struct sctp_paramhdr *)asoc->c.auth_random;
 		chunksize += ntohs(auth_random->length);
@@ -2032,6 +2042,10 @@ static void sctp_process_ext_param(struct sctp_association *asoc,
 			if (net->sctp.addip_enable)
 				asoc->peer.asconf_capable = 1;
 			break;
+		case SCTP_CID_I_DATA:
+			if (sctp_sk(asoc->base.sk)->strm_interleave)
+				asoc->intl_enable = 1;
+			break;
 		default:
 			break;
 		}
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCHv2 net-next 03/12] sctp: add basic structures and make chunk function for idata
  2017-12-08 13:03   ` [PATCHv2 net-next 02/12] sctp: add asoc intl_enable negotiation during 4 shakehands Xin Long
@ 2017-12-08 13:04     ` Xin Long
  2017-12-08 13:04       ` [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave Xin Long
  0 siblings, 1 reply; 30+ messages in thread
From: Xin Long @ 2017-12-08 13:04 UTC (permalink / raw)
  To: network dev, linux-sctp; +Cc: Marcelo Ricardo Leitner, Neil Horman, davem

sctp_idatahdr and sctp_idata_chunk are used to define and parse
I-DATA chunk format, and sctp_make_idata is a function to build
the chunk.

The I-DATA Chunk Format is defined in section 2.1 of RFC8260.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
---
 include/linux/sctp.h       | 17 +++++++++++++++++
 include/net/sctp/sm.h      |  2 ++
 include/net/sctp/structs.h |  1 +
 net/sctp/sm_make_chunk.c   |  6 ++++++
 4 files changed, 26 insertions(+)

diff --git a/include/linux/sctp.h b/include/linux/sctp.h
index 6d2bd64..38e2cf6 100644
--- a/include/linux/sctp.h
+++ b/include/linux/sctp.h
@@ -243,6 +243,23 @@ struct sctp_data_chunk {
 	struct sctp_datahdr data_hdr;
 };
 
+struct sctp_idatahdr {
+	__be32 tsn;
+	__be16 stream;
+	__be16 reserved;
+	__be32 mid;
+	union {
+		__u32 ppid;
+		__be32 fsn;
+	};
+	__u8 payload[0];
+};
+
+struct sctp_idata_chunk {
+	struct sctp_chunkhdr chunk_hdr;
+	struct sctp_idatahdr data_hdr;
+};
+
 /* DATA Chuck Specific Flags */
 enum {
 	SCTP_DATA_MIDDLE_FRAG	= 0x00,
diff --git a/include/net/sctp/sm.h b/include/net/sctp/sm.h
index 70fb397..5389ae0 100644
--- a/include/net/sctp/sm.h
+++ b/include/net/sctp/sm.h
@@ -197,6 +197,8 @@ struct sctp_chunk *sctp_make_cookie_ack(const struct sctp_association *asoc,
 struct sctp_chunk *sctp_make_cwr(const struct sctp_association *asoc,
 				 const __u32 lowest_tsn,
 				 const struct sctp_chunk *chunk);
+struct sctp_chunk *sctp_make_idata(const struct sctp_association *asoc,
+				   __u8 flags, int paylen, gfp_t gfp);
 struct sctp_chunk *sctp_make_datafrag_empty(struct sctp_association *asoc,
 					    const struct sctp_sndrcvinfo *sinfo,
 					    int len, const __u8 flags,
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 7030cbe..7026a80 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -575,6 +575,7 @@ struct sctp_chunk {
 		struct sctp_addiphdr *addip_hdr;
 		struct sctp_fwdtsn_hdr *fwdtsn_hdr;
 		struct sctp_authhdr *auth_hdr;
+		struct sctp_idatahdr *idata_hdr;
 	} subh;
 
 	__u8 *chunk_end;
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index da33c85..b969397 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -1425,6 +1425,12 @@ static struct sctp_chunk *sctp_make_data(const struct sctp_association *asoc,
 	return _sctp_make_chunk(asoc, SCTP_CID_DATA, flags, paylen, gfp);
 }
 
+struct sctp_chunk *sctp_make_idata(const struct sctp_association *asoc,
+				   __u8 flags, int paylen, gfp_t gfp)
+{
+	return _sctp_make_chunk(asoc, SCTP_CID_I_DATA, flags, paylen, gfp);
+}
+
 static struct sctp_chunk *sctp_make_control(const struct sctp_association *asoc,
 					    __u8 type, __u8 flags, int paylen,
 					    gfp_t gfp)
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave
  2017-12-08 13:04     ` [PATCHv2 net-next 03/12] sctp: add basic structures and make chunk function for idata Xin Long
@ 2017-12-08 13:04       ` Xin Long
  2017-12-08 13:04         ` [PATCHv2 net-next 05/12] sctp: implement assign_number " Xin Long
  2017-12-08 14:06         ` [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave David Laight
  0 siblings, 2 replies; 30+ messages in thread
From: Xin Long @ 2017-12-08 13:04 UTC (permalink / raw)
  To: network dev, linux-sctp; +Cc: Marcelo Ricardo Leitner, Neil Horman, davem

To avoid hundreds of checks for the different process on I-DATA chunk,
struct sctp_stream_interleave is defined as a group of functions used
to replace the codes in some place where it needs to do different job
according to if the asoc intl_enabled is set.

With these ops, it only needs to initialize asoc->stream.si with
sctp_stream_interleave_0 for normal data if asoc intl_enable is 0,
or sctp_stream_interleave_1 for idata if asoc intl_enable is set in
sctp_stream_init.

After that, the members in asoc->stream.si can be used directly in
some special places without checking asoc intl_enable.

make_datafrag is the first member for sctp_stream_interleave, it's
used to make data or idata frags, called in sctp_datamsg_from_user.
The old function sctp_make_datafrag_empty needs to be adjust some
to fit in this ops.

Note that as idata and data chunks have different length, it also
defines data_chunk_len for sctp_stream_interleave to describe the
chunk size.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
---
 include/net/sctp/sm.h                |  5 +--
 include/net/sctp/stream_interleave.h | 44 ++++++++++++++++++++
 include/net/sctp/structs.h           | 12 ++++++
 net/sctp/Makefile                    |  2 +-
 net/sctp/chunk.c                     |  6 +--
 net/sctp/sm_make_chunk.c             | 21 ++++------
 net/sctp/stream.c                    |  1 +
 net/sctp/stream_interleave.c         | 79 ++++++++++++++++++++++++++++++++++++
 8 files changed, 149 insertions(+), 21 deletions(-)
 create mode 100644 include/net/sctp/stream_interleave.h
 create mode 100644 net/sctp/stream_interleave.c

diff --git a/include/net/sctp/sm.h b/include/net/sctp/sm.h
index 5389ae0..f950186 100644
--- a/include/net/sctp/sm.h
+++ b/include/net/sctp/sm.h
@@ -199,10 +199,9 @@ struct sctp_chunk *sctp_make_cwr(const struct sctp_association *asoc,
 				 const struct sctp_chunk *chunk);
 struct sctp_chunk *sctp_make_idata(const struct sctp_association *asoc,
 				   __u8 flags, int paylen, gfp_t gfp);
-struct sctp_chunk *sctp_make_datafrag_empty(struct sctp_association *asoc,
+struct sctp_chunk *sctp_make_datafrag_empty(const struct sctp_association *asoc,
 					    const struct sctp_sndrcvinfo *sinfo,
-					    int len, const __u8 flags,
-					    __u16 ssn, gfp_t gfp);
+					    int len, __u8 flags, gfp_t gfp);
 struct sctp_chunk *sctp_make_ecne(const struct sctp_association *asoc,
 				  const __u32 lowest_tsn);
 struct sctp_chunk *sctp_make_sack(const struct sctp_association *asoc);
diff --git a/include/net/sctp/stream_interleave.h b/include/net/sctp/stream_interleave.h
new file mode 100644
index 0000000..7b9fa8d
--- /dev/null
+++ b/include/net/sctp/stream_interleave.h
@@ -0,0 +1,44 @@
+/* SCTP kernel implementation
+ * (C) Copyright Red Hat Inc. 2017
+ *
+ * These are definitions used by the stream schedulers, defined in RFC
+ * draft ndata (https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-11)
+ *
+ * This SCTP implementation is free software;
+ * you can redistribute it and/or modify it under the terms of
+ * the GNU General Public License as published by
+ * the Free Software Foundation; either version 2, or (at your option)
+ * any later version.
+ *
+ * This SCTP implementation  is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; without even the implied
+ *                 ************************
+ * warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with GNU CC; see the file COPYING.  If not, see
+ * <http://www.gnu.org/licenses/>.
+ *
+ * Please send any bug reports or fixes you make to the
+ * email addresses:
+ *    lksctp developers <linux-sctp@vger.kernel.org>
+ *
+ * Written or modified by:
+ *   Xin Long <lucien.xin@gmail.com>
+ */
+
+#ifndef __sctp_stream_interleave_h__
+#define __sctp_stream_interleave_h__
+
+struct sctp_stream_interleave {
+	__u16	data_chunk_len;
+	/* (I-)DATA process */
+	struct sctp_chunk *(*make_datafrag)(const struct sctp_association *asoc,
+					    const struct sctp_sndrcvinfo *sinfo,
+					    int len, __u8 flags, gfp_t gfp);
+};
+
+void sctp_stream_interleave_init(struct sctp_stream *stream);
+
+#endif /* __sctp_stream_interleave_h__ */
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 7026a80..96cc898 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -89,6 +89,7 @@ struct sctp_stream;
 #include <net/sctp/tsnmap.h>
 #include <net/sctp/ulpevent.h>
 #include <net/sctp/ulpqueue.h>
+#include <net/sctp/stream_interleave.h>
 
 /* Structures useful for managing bind/connect. */
 
@@ -1389,11 +1390,22 @@ struct sctp_stream {
 			struct sctp_stream_out_ext *rr_next;
 		};
 	};
+	struct sctp_stream_interleave *si;
 };
 
 #define SCTP_STREAM_CLOSED		0x00
 #define SCTP_STREAM_OPEN		0x01
 
+static inline __u16 sctp_datachk_len(const struct sctp_stream *stream)
+{
+	return stream->si->data_chunk_len;
+}
+
+static inline __u16 sctp_datahdr_len(const struct sctp_stream *stream)
+{
+	return stream->si->data_chunk_len - sizeof(struct sctp_chunkhdr);
+}
+
 /* SCTP_GET_ASSOC_STATS counters */
 struct sctp_priv_assoc_stats {
 	/* Maximum observed rto in the association during subsequent
diff --git a/net/sctp/Makefile b/net/sctp/Makefile
index 1ca84a2..54bd9c1 100644
--- a/net/sctp/Makefile
+++ b/net/sctp/Makefile
@@ -14,7 +14,7 @@ sctp-y := sm_statetable.o sm_statefuns.o sm_sideeffect.o \
 	  tsnmap.o bind_addr.o socket.o primitive.o \
 	  output.o input.o debug.o stream.o auth.o \
 	  offload.o stream_sched.o stream_sched_prio.o \
-	  stream_sched_rr.o
+	  stream_sched_rr.o stream_interleave.o
 
 sctp_probe-y := probe.o
 
diff --git a/net/sctp/chunk.c b/net/sctp/chunk.c
index 7f8baa4..62adaaa 100644
--- a/net/sctp/chunk.c
+++ b/net/sctp/chunk.c
@@ -191,7 +191,7 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
 	 */
 	max_data = asoc->pathmtu -
 		   sctp_sk(asoc->base.sk)->pf->af->net_header_len -
-		   sizeof(struct sctphdr) - sizeof(struct sctp_data_chunk);
+		   sizeof(struct sctphdr) - sctp_datachk_len(&asoc->stream);
 	max_data = SCTP_TRUNC4(max_data);
 
 	/* If the the peer requested that we authenticate DATA chunks
@@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
 				frag |= SCTP_DATA_SACK_IMM;
 		}
 
-		chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
-						 0, GFP_KERNEL);
+		chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
+						       GFP_KERNEL);
 		if (!chunk) {
 			err = -ENOMEM;
 			goto errout;
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index b969397..23a7313 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -721,38 +721,31 @@ struct sctp_chunk *sctp_make_ecne(const struct sctp_association *asoc,
 /* Make a DATA chunk for the given association from the provided
  * parameters.  However, do not populate the data payload.
  */
-struct sctp_chunk *sctp_make_datafrag_empty(struct sctp_association *asoc,
+struct sctp_chunk *sctp_make_datafrag_empty(const struct sctp_association *asoc,
 					    const struct sctp_sndrcvinfo *sinfo,
-					    int data_len, __u8 flags, __u16 ssn,
-					    gfp_t gfp)
+					    int len, __u8 flags, gfp_t gfp)
 {
 	struct sctp_chunk *retval;
 	struct sctp_datahdr dp;
-	int chunk_len;
 
 	/* We assign the TSN as LATE as possible, not here when
 	 * creating the chunk.
 	 */
-	dp.tsn = 0;
+	memset(&dp, 0, sizeof(dp));
+	dp.ppid = sinfo->sinfo_ppid;
 	dp.stream = htons(sinfo->sinfo_stream);
-	dp.ppid   = sinfo->sinfo_ppid;
 
 	/* Set the flags for an unordered send.  */
-	if (sinfo->sinfo_flags & SCTP_UNORDERED) {
+	if (sinfo->sinfo_flags & SCTP_UNORDERED)
 		flags |= SCTP_DATA_UNORDERED;
-		dp.ssn = 0;
-	} else
-		dp.ssn = htons(ssn);
 
-	chunk_len = sizeof(dp) + data_len;
-	retval = sctp_make_data(asoc, flags, chunk_len, gfp);
+	retval = sctp_make_data(asoc, flags, sizeof(dp) + len, gfp);
 	if (!retval)
-		goto nodata;
+		return NULL;
 
 	retval->subh.data_hdr = sctp_addto_chunk(retval, sizeof(dp), &dp);
 	memcpy(&retval->sinfo, sinfo, sizeof(struct sctp_sndrcvinfo));
 
-nodata:
 	return retval;
 }
 
diff --git a/net/sctp/stream.c b/net/sctp/stream.c
index 76ea66b..8370e6c 100644
--- a/net/sctp/stream.c
+++ b/net/sctp/stream.c
@@ -167,6 +167,7 @@ int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt,
 	sched->init(stream);
 
 in:
+	sctp_stream_interleave_init(stream);
 	if (!incnt)
 		goto out;
 
diff --git a/net/sctp/stream_interleave.c b/net/sctp/stream_interleave.c
new file mode 100644
index 0000000..397c3c1
--- /dev/null
+++ b/net/sctp/stream_interleave.c
@@ -0,0 +1,79 @@
+/* SCTP kernel implementation
+ * (C) Copyright Red Hat Inc. 2017
+ *
+ * This file is part of the SCTP kernel implementation
+ *
+ * These functions manipulate sctp stream queue/scheduling.
+ *
+ * This SCTP implementation is free software;
+ * you can redistribute it and/or modify it under the terms of
+ * the GNU General Public License as published by
+ * the Free Software Foundation; either version 2, or (at your option)
+ * any later version.
+ *
+ * This SCTP implementation is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; without even the implied
+ *                 ************************
+ * warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with GNU CC; see the file COPYING.  If not, see
+ * <http://www.gnu.org/licenses/>.
+ *
+ * Please send any bug reports or fixes you make to the
+ * email addresched(es):
+ *    lksctp developers <linux-sctp@vger.kernel.org>
+ *
+ * Written or modified by:
+ *    Xin Long <lucien.xin@gmail.com>
+ */
+
+#include <net/sctp/sctp.h>
+#include <net/sctp/sm.h>
+#include <linux/sctp.h>
+
+static struct sctp_chunk *sctp_make_idatafrag_empty(
+					const struct sctp_association *asoc,
+					const struct sctp_sndrcvinfo *sinfo,
+					int len, __u8 flags, gfp_t gfp)
+{
+	struct sctp_chunk *retval;
+	struct sctp_idatahdr dp;
+
+	memset(&dp, 0, sizeof(dp));
+	dp.stream = htons(sinfo->sinfo_stream);
+
+	if (sinfo->sinfo_flags & SCTP_UNORDERED)
+		flags |= SCTP_DATA_UNORDERED;
+
+	retval = sctp_make_idata(asoc, flags, sizeof(dp) + len, gfp);
+	if (!retval)
+		return NULL;
+
+	retval->subh.idata_hdr = sctp_addto_chunk(retval, sizeof(dp), &dp);
+	memcpy(&retval->sinfo, sinfo, sizeof(struct sctp_sndrcvinfo));
+
+	return retval;
+}
+
+static struct sctp_stream_interleave sctp_stream_interleave_0 = {
+	.data_chunk_len		= sizeof(struct sctp_data_chunk),
+	/* DATA process functions */
+	.make_datafrag		= sctp_make_datafrag_empty,
+};
+
+static struct sctp_stream_interleave sctp_stream_interleave_1 = {
+	.data_chunk_len		= sizeof(struct sctp_idata_chunk),
+	/* I-DATA process functions */
+	.make_datafrag		= sctp_make_idatafrag_empty,
+};
+
+void sctp_stream_interleave_init(struct sctp_stream *stream)
+{
+	struct sctp_association *asoc;
+
+	asoc = container_of(stream, struct sctp_association, stream);
+	stream->si = asoc->intl_enable ? &sctp_stream_interleave_1
+				       : &sctp_stream_interleave_0;
+}
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCHv2 net-next 05/12] sctp: implement assign_number for sctp_stream_interleave
  2017-12-08 13:04       ` [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave Xin Long
@ 2017-12-08 13:04         ` Xin Long
  2017-12-08 13:04           ` [PATCHv2 net-next 06/12] sctp: implement validate_data " Xin Long
  2017-12-08 14:06         ` [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave David Laight
  1 sibling, 1 reply; 30+ messages in thread
From: Xin Long @ 2017-12-08 13:04 UTC (permalink / raw)
  To: network dev, linux-sctp; +Cc: Marcelo Ricardo Leitner, Neil Horman, davem

assign_number is added as a member of sctp_stream_interleave, used
to assign ssn for data or mid (message id) for idata, called in
sctp_packet_append_data. sctp_chunk_assign_ssn is left as it is,
and sctp_chunk_assign_mid is added for sctp_stream_interleave_1.

This procedure is described in section 2.2.2 of RFC8260.

All sizeof(struct sctp_data_chunk) in tx path is replaced with
sctp_datachk_len, to make it right for idata as well. And also
adjust sctp_chunk_is_data for SCTP_CID_I_DATA.

After this patch, idata can be built and sent in tx path.

Note that if sp strm_interleave is set, it has to wait_connect in
sctp_sendmsg, as asoc intl_enable need to be known after 4 shake-
hands, to decide if it should use data or idata later. data and
idata can't be mixed to send in one asoc.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
---
 include/net/sctp/constants.h         |  9 +++++----
 include/net/sctp/sctp.h              |  4 ++--
 include/net/sctp/sm.h                |  2 +-
 include/net/sctp/stream_interleave.h |  1 +
 include/net/sctp/structs.h           | 18 +++++++++++++++++-
 net/sctp/output.c                    |  5 +++--
 net/sctp/socket.c                    | 17 +++++++++++++++--
 net/sctp/stream_interleave.c         | 37 ++++++++++++++++++++++++++++++++++++
 net/sctp/ulpevent.c                  |  4 ++--
 9 files changed, 83 insertions(+), 14 deletions(-)

diff --git a/include/net/sctp/constants.h b/include/net/sctp/constants.h
index deaafa9..20ff237 100644
--- a/include/net/sctp/constants.h
+++ b/include/net/sctp/constants.h
@@ -145,12 +145,13 @@ SCTP_SUBTYPE_CONSTRUCTOR(OTHER,		enum sctp_event_other,	other)
 SCTP_SUBTYPE_CONSTRUCTOR(PRIMITIVE,	enum sctp_event_primitive, primitive)
 
 
-#define sctp_chunk_is_data(a) (a->chunk_hdr->type == SCTP_CID_DATA)
+#define sctp_chunk_is_data(a) (a->chunk_hdr->type == SCTP_CID_DATA || \
+			       a->chunk_hdr->type == SCTP_CID_I_DATA)
 
 /* Calculate the actual data size in a data chunk */
-#define SCTP_DATA_SNDSIZE(c) ((int)((unsigned long)(c->chunk_end)\
-		       		- (unsigned long)(c->chunk_hdr)\
-				- sizeof(struct sctp_data_chunk)))
+#define SCTP_DATA_SNDSIZE(c) ((int)((unsigned long)(c->chunk_end) - \
+				    (unsigned long)(c->chunk_hdr) - \
+				    sctp_datachk_len(&c->asoc->stream)))
 
 /* Internal error codes */
 enum sctp_ierror {
diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
index 906a9c0..63ac57e 100644
--- a/include/net/sctp/sctp.h
+++ b/include/net/sctp/sctp.h
@@ -444,13 +444,13 @@ static inline int sctp_frag_point(const struct sctp_association *asoc, int pmtu)
 	int frag = pmtu;
 
 	frag -= sp->pf->af->net_header_len;
-	frag -= sizeof(struct sctphdr) + sizeof(struct sctp_data_chunk);
+	frag -= sizeof(struct sctphdr) + sctp_datachk_len(&asoc->stream);
 
 	if (asoc->user_frag)
 		frag = min_t(int, frag, asoc->user_frag);
 
 	frag = SCTP_TRUNC4(min_t(int, frag, SCTP_MAX_CHUNK_LEN -
-					    sizeof(struct sctp_data_chunk)));
+					    sctp_datachk_len(&asoc->stream)));
 
 	return frag;
 }
diff --git a/include/net/sctp/sm.h b/include/net/sctp/sm.h
index f950186..ca1db89 100644
--- a/include/net/sctp/sm.h
+++ b/include/net/sctp/sm.h
@@ -343,7 +343,7 @@ static inline __u16 sctp_data_size(struct sctp_chunk *chunk)
 	__u16 size;
 
 	size = ntohs(chunk->chunk_hdr->length);
-	size -= sizeof(struct sctp_data_chunk);
+	size -= sctp_datahdr_len(&chunk->asoc->stream);
 
 	return size;
 }
diff --git a/include/net/sctp/stream_interleave.h b/include/net/sctp/stream_interleave.h
index 7b9fa8d..99f399e 100644
--- a/include/net/sctp/stream_interleave.h
+++ b/include/net/sctp/stream_interleave.h
@@ -37,6 +37,7 @@ struct sctp_stream_interleave {
 	struct sctp_chunk *(*make_datafrag)(const struct sctp_association *asoc,
 					    const struct sctp_sndrcvinfo *sinfo,
 					    int len, __u8 flags, gfp_t gfp);
+	void	(*assign_number)(struct sctp_chunk *chunk);
 };
 
 void sctp_stream_interleave_init(struct sctp_stream *stream);
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 96cc898..fd93973 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -399,6 +399,18 @@ void sctp_stream_update(struct sctp_stream *stream, struct sctp_stream *new);
 #define sctp_ssn_skip(stream, type, sid, ssn) \
 	((stream)->type[sid].ssn = ssn + 1)
 
+/* What is the current MID number for this stream? */
+#define sctp_mid_peek(stream, type, sid) \
+	((stream)->type[sid].mid)
+
+/* Return the next MID number for this stream.  */
+#define sctp_mid_next(stream, type, sid) \
+	((stream)->type[sid].mid++)
+
+/* Skip over this mid and all below. */
+#define sctp_mid_skip(stream, type, sid, mid) \
+	((stream)->type[sid].mid = mid + 1)
+
 /*
  * Pointers to address related SCTP functions.
  * (i.e. things that depend on the address family.)
@@ -623,6 +635,7 @@ struct sctp_chunk {
 	__u16	rtt_in_progress:1,	/* This chunk used for RTT calc? */
 		has_tsn:1,		/* Does this chunk have a TSN yet? */
 		has_ssn:1,		/* Does this chunk have a SSN yet? */
+#define has_mid has_ssn
 		singleton:1,		/* Only chunk in the packet? */
 		end_of_packet:1,	/* Last chunk in the packet? */
 		ecn_ce_done:1,		/* Have we processed the ECN CE bit? */
@@ -1360,7 +1373,10 @@ struct sctp_stream_out_ext {
 };
 
 struct sctp_stream_out {
-	__u16	ssn;
+	union {
+		__u32 mid;
+		__u16 ssn;
+	};
 	__u8	state;
 	struct sctp_stream_out_ext *ext;
 };
diff --git a/net/sctp/output.c b/net/sctp/output.c
index 4a865cd..01a26ee0 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -313,6 +313,7 @@ static enum sctp_xmit __sctp_packet_append_chunk(struct sctp_packet *packet,
 	/* We believe that this chunk is OK to add to the packet */
 	switch (chunk->chunk_hdr->type) {
 	case SCTP_CID_DATA:
+	case SCTP_CID_I_DATA:
 		/* Account for the data being in the packet */
 		sctp_packet_append_data(packet, chunk);
 		/* Disallow SACK bundling after DATA. */
@@ -724,7 +725,7 @@ static enum sctp_xmit sctp_packet_can_append_data(struct sctp_packet *packet,
 	 * or delay in hopes of bundling a full sized packet.
 	 */
 	if (chunk->skb->len + q->out_qlen > transport->pathmtu -
-		packet->overhead - sizeof(struct sctp_data_chunk) - 4)
+	    packet->overhead - sctp_datachk_len(&chunk->asoc->stream) - 4)
 		/* Enough data queued to fill a packet */
 		return SCTP_XMIT_OK;
 
@@ -759,7 +760,7 @@ static void sctp_packet_append_data(struct sctp_packet *packet,
 
 	asoc->peer.rwnd = rwnd;
 	sctp_chunk_assign_tsn(chunk);
-	sctp_chunk_assign_ssn(chunk);
+	asoc->stream.si->assign_number(chunk);
 }
 
 static enum sctp_xmit sctp_packet_will_fit(struct sctp_packet *packet,
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 8c33463..036f945 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -2002,7 +2002,20 @@ static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len)
 		if (err < 0)
 			goto out_free;
 
-		wait_connect = true;
+		/* If stream interleave is enabled, wait_connect has to be
+		 * done earlier than data enqueue, as it needs to make data
+		 * or idata according to asoc->intl_enable which is set
+		 * after connection is done.
+		 */
+		if (sctp_sk(asoc->base.sk)->strm_interleave) {
+			timeo = sock_sndtimeo(sk, 0);
+			err = sctp_wait_for_connect(asoc, &timeo);
+			if (err)
+				goto out_unlock;
+		} else {
+			wait_connect = true;
+		}
+
 		pr_debug("%s: we associated primitively\n", __func__);
 	}
 
@@ -3180,7 +3193,7 @@ static int sctp_setsockopt_maxseg(struct sock *sk, char __user *optval, unsigned
 		if (val == 0) {
 			val = asoc->pathmtu - sp->pf->af->net_header_len;
 			val -= sizeof(struct sctphdr) +
-			       sizeof(struct sctp_data_chunk);
+			       sctp_datachk_len(&asoc->stream);
 		}
 		asoc->user_frag = val;
 		asoc->frag_point = sctp_frag_point(asoc, asoc->pathmtu);
diff --git a/net/sctp/stream_interleave.c b/net/sctp/stream_interleave.c
index 397c3c1..3ac47e7 100644
--- a/net/sctp/stream_interleave.c
+++ b/net/sctp/stream_interleave.c
@@ -57,16 +57,53 @@ static struct sctp_chunk *sctp_make_idatafrag_empty(
 	return retval;
 }
 
+static void sctp_chunk_assign_mid(struct sctp_chunk *chunk)
+{
+	struct sctp_stream *stream;
+	struct sctp_chunk *lchunk;
+	__u32 cfsn = 0;
+	__u16 sid;
+
+	if (chunk->has_mid)
+		return;
+
+	sid = sctp_chunk_stream_no(chunk);
+	stream = &chunk->asoc->stream;
+
+	list_for_each_entry(lchunk, &chunk->msg->chunks, frag_list) {
+		struct sctp_idatahdr *hdr;
+
+		lchunk->has_mid = 1;
+
+		if (lchunk->chunk_hdr->flags & SCTP_DATA_UNORDERED)
+			continue;
+
+		hdr = lchunk->subh.idata_hdr;
+
+		if (lchunk->chunk_hdr->flags & SCTP_DATA_FIRST_FRAG)
+			hdr->ppid = lchunk->sinfo.sinfo_ppid;
+		else
+			hdr->fsn = htonl(cfsn++);
+
+		if (lchunk->chunk_hdr->flags & SCTP_DATA_LAST_FRAG)
+			hdr->mid = htonl(sctp_mid_next(stream, out, sid));
+		else
+			hdr->mid = htonl(sctp_mid_peek(stream, out, sid));
+	}
+}
+
 static struct sctp_stream_interleave sctp_stream_interleave_0 = {
 	.data_chunk_len		= sizeof(struct sctp_data_chunk),
 	/* DATA process functions */
 	.make_datafrag		= sctp_make_datafrag_empty,
+	.assign_number		= sctp_chunk_assign_ssn,
 };
 
 static struct sctp_stream_interleave sctp_stream_interleave_1 = {
 	.data_chunk_len		= sizeof(struct sctp_idata_chunk),
 	/* I-DATA process functions */
 	.make_datafrag		= sctp_make_idatafrag_empty,
+	.assign_number		= sctp_chunk_assign_mid,
 };
 
 void sctp_stream_interleave_init(struct sctp_stream *stream)
diff --git a/net/sctp/ulpevent.c b/net/sctp/ulpevent.c
index 5447228..650b634 100644
--- a/net/sctp/ulpevent.c
+++ b/net/sctp/ulpevent.c
@@ -443,8 +443,8 @@ struct sctp_ulpevent *sctp_ulpevent_make_send_failed(
 		goto fail;
 
 	/* Pull off the common chunk header and DATA header.  */
-	skb_pull(skb, sizeof(struct sctp_data_chunk));
-	len -= sizeof(struct sctp_data_chunk);
+	skb_pull(skb, sctp_datachk_len(&asoc->stream));
+	len -= sctp_datachk_len(&asoc->stream);
 
 	/* Embed the event fields inside the cloned skb.  */
 	event = sctp_skb2event(skb);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCHv2 net-next 06/12] sctp: implement validate_data for sctp_stream_interleave
  2017-12-08 13:04         ` [PATCHv2 net-next 05/12] sctp: implement assign_number " Xin Long
@ 2017-12-08 13:04           ` Xin Long
  2017-12-08 13:04             ` [PATCHv2 net-next 07/12] sctp: implement ulpevent_data " Xin Long
  0 siblings, 1 reply; 30+ messages in thread
From: Xin Long @ 2017-12-08 13:04 UTC (permalink / raw)
  To: network dev, linux-sctp; +Cc: Marcelo Ricardo Leitner, Neil Horman, davem

validate_data is added as a member of sctp_stream_interleave, used
to validate ssn/chunk type for data or mid (message id)/chunk type
for idata, called in sctp_eat_data.

If this check fails, an abort packet will be sent, as said in
section 2.2.3 of RFC8260.

It also adds the process for idata in rx path. As Marcelo pointed
out, there's no need to add event table for idata, but just share
chunk_event_table with data's. It would drop data chunk for idata
and drop idata chunk for data by calling validate_data in
sctp_eat_data.

As last patch did, it also replaces sizeof(struct sctp_data_chunk)
with sctp_datachk_len for rx path.

After this patch, the idata can be accepted and delivered to ulp
layer.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
---
 include/net/sctp/sm.h                |  6 ++++++
 include/net/sctp/stream_interleave.h |  1 +
 include/net/sctp/structs.h           |  6 +++++-
 net/sctp/sm_statefuns.c              | 21 ++++++++-----------
 net/sctp/sm_statetable.c             |  3 +++
 net/sctp/stream_interleave.c         | 39 ++++++++++++++++++++++++++++++++++++
 6 files changed, 62 insertions(+), 14 deletions(-)

diff --git a/include/net/sctp/sm.h b/include/net/sctp/sm.h
index ca1db89..0993b49 100644
--- a/include/net/sctp/sm.h
+++ b/include/net/sctp/sm.h
@@ -359,6 +359,12 @@ static inline __u16 sctp_data_size(struct sctp_chunk *chunk)
 	 typecheck(__u32, b) && \
 	 ((__s32)((a) - (b)) <= 0))
 
+/* Compare two MIDs */
+#define MID_lt(a, b)	\
+	(typecheck(__u32, a) && \
+	 typecheck(__u32, b) && \
+	 ((__s32)((a) - (b)) < 0))
+
 /* Compare two SSNs */
 #define SSN_lt(a,b)		\
 	(typecheck(__u16, a) && \
diff --git a/include/net/sctp/stream_interleave.h b/include/net/sctp/stream_interleave.h
index 99f399e..d8d1b51 100644
--- a/include/net/sctp/stream_interleave.h
+++ b/include/net/sctp/stream_interleave.h
@@ -38,6 +38,7 @@ struct sctp_stream_interleave {
 					    const struct sctp_sndrcvinfo *sinfo,
 					    int len, __u8 flags, gfp_t gfp);
 	void	(*assign_number)(struct sctp_chunk *chunk);
+	bool	(*validate_data)(struct sctp_chunk *chunk);
 };
 
 void sctp_stream_interleave_init(struct sctp_stream *stream);
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index fd93973..be4cc73 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -1382,7 +1382,11 @@ struct sctp_stream_out {
 };
 
 struct sctp_stream_in {
-	__u16	ssn;
+	union {
+		__u32 mid;
+		__u16 ssn;
+	};
+	__u32 fsn;
 };
 
 struct sctp_stream {
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index 8f8ccde..c609c54 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -3013,7 +3013,7 @@ enum sctp_disposition sctp_sf_eat_data_6_2(struct net *net,
 		return sctp_sf_pdiscard(net, ep, asoc, type, arg, commands);
 	}
 
-	if (!sctp_chunk_length_valid(chunk, sizeof(struct sctp_data_chunk)))
+	if (!sctp_chunk_length_valid(chunk, sctp_datachk_len(&asoc->stream)))
 		return sctp_sf_violation_chunklen(net, ep, asoc, type, arg,
 						  commands);
 
@@ -3034,7 +3034,7 @@ enum sctp_disposition sctp_sf_eat_data_6_2(struct net *net,
 	case SCTP_IERROR_PROTO_VIOLATION:
 		return sctp_sf_abort_violation(net, ep, asoc, chunk, commands,
 					       (u8 *)chunk->subh.data_hdr,
-					       sizeof(struct sctp_datahdr));
+					       sctp_datahdr_len(&asoc->stream));
 	default:
 		BUG();
 	}
@@ -3133,7 +3133,7 @@ enum sctp_disposition sctp_sf_eat_data_fast_4_4(
 		return sctp_sf_pdiscard(net, ep, asoc, type, arg, commands);
 	}
 
-	if (!sctp_chunk_length_valid(chunk, sizeof(struct sctp_data_chunk)))
+	if (!sctp_chunk_length_valid(chunk, sctp_datachk_len(&asoc->stream)))
 		return sctp_sf_violation_chunklen(net, ep, asoc, type, arg,
 						  commands);
 
@@ -3150,7 +3150,7 @@ enum sctp_disposition sctp_sf_eat_data_fast_4_4(
 	case SCTP_IERROR_PROTO_VIOLATION:
 		return sctp_sf_abort_violation(net, ep, asoc, chunk, commands,
 					       (u8 *)chunk->subh.data_hdr,
-					       sizeof(struct sctp_datahdr));
+					       sctp_datahdr_len(&asoc->stream));
 	default:
 		BUG();
 	}
@@ -6244,14 +6244,12 @@ static int sctp_eat_data(const struct sctp_association *asoc,
 	struct sctp_chunk *err;
 	enum sctp_verb deliver;
 	size_t datalen;
-	u8 ordered = 0;
-	u16 ssn, sid;
 	__u32 tsn;
 	int tmp;
 
 	data_hdr = (struct sctp_datahdr *)chunk->skb->data;
 	chunk->subh.data_hdr = data_hdr;
-	skb_pull(chunk->skb, sizeof(*data_hdr));
+	skb_pull(chunk->skb, sctp_datahdr_len(&asoc->stream));
 
 	tsn = ntohl(data_hdr->tsn);
 	pr_debug("%s: TSN 0x%x\n", __func__, tsn);
@@ -6299,7 +6297,7 @@ static int sctp_eat_data(const struct sctp_association *asoc,
 	 * Actually, allow a little bit of overflow (up to a MTU).
 	 */
 	datalen = ntohs(chunk->chunk_hdr->length);
-	datalen -= sizeof(struct sctp_data_chunk);
+	datalen -= sctp_datachk_len(&asoc->stream);
 
 	deliver = SCTP_CMD_CHUNK_ULP;
 
@@ -6394,7 +6392,6 @@ static int sctp_eat_data(const struct sctp_association *asoc,
 		SCTP_INC_STATS(net, SCTP_MIB_INORDERCHUNKS);
 		if (chunk->asoc)
 			chunk->asoc->stats.iodchunks++;
-		ordered = 1;
 	}
 
 	/* RFC 2960 6.5 Stream Identifier and Stream Sequence Number
@@ -6405,8 +6402,7 @@ static int sctp_eat_data(const struct sctp_association *asoc,
 	 * with cause set to "Invalid Stream Identifier" (See Section 3.3.10)
 	 * and discard the DATA chunk.
 	 */
-	sid = ntohs(data_hdr->stream);
-	if (sid >= asoc->stream.incnt) {
+	if (ntohs(data_hdr->stream) >= asoc->stream.incnt) {
 		/* Mark tsn as received even though we drop it */
 		sctp_add_cmd_sf(commands, SCTP_CMD_REPORT_TSN, SCTP_U32(tsn));
 
@@ -6427,8 +6423,7 @@ static int sctp_eat_data(const struct sctp_association *asoc,
 	 * SSN is smaller then the next expected one.  If it is, it wrapped
 	 * and is invalid.
 	 */
-	ssn = ntohs(data_hdr->ssn);
-	if (ordered && SSN_lt(ssn, sctp_ssn_peek(&asoc->stream, in, sid)))
+	if (!asoc->stream.si->validate_data(chunk))
 		return SCTP_IERROR_PROTO_VIOLATION;
 
 	/* Send the data up to the user.  Note:  Schedule  the
diff --git a/net/sctp/sm_statetable.c b/net/sctp/sm_statetable.c
index 79b6bee..8c9bb41 100644
--- a/net/sctp/sm_statetable.c
+++ b/net/sctp/sm_statetable.c
@@ -985,6 +985,9 @@ static const struct sctp_sm_table_entry *sctp_chunk_event_lookup(
 	if (state > SCTP_STATE_MAX)
 		return &bug;
 
+	if (net->sctp.intl_enable && cid == SCTP_CID_I_DATA)
+		cid = SCTP_CID_DATA;
+
 	if (cid <= SCTP_CID_BASE_MAX)
 		return &chunk_event_table[cid][state];
 
diff --git a/net/sctp/stream_interleave.c b/net/sctp/stream_interleave.c
index 3ac47e7..3d8733b 100644
--- a/net/sctp/stream_interleave.c
+++ b/net/sctp/stream_interleave.c
@@ -92,11 +92,49 @@ static void sctp_chunk_assign_mid(struct sctp_chunk *chunk)
 	}
 }
 
+static bool sctp_validate_data(struct sctp_chunk *chunk)
+{
+	const struct sctp_stream *stream;
+	__u16 sid, ssn;
+
+	if (chunk->chunk_hdr->type != SCTP_CID_DATA)
+		return false;
+
+	if (chunk->chunk_hdr->flags & SCTP_DATA_UNORDERED)
+		return true;
+
+	stream = &chunk->asoc->stream;
+	sid = sctp_chunk_stream_no(chunk);
+	ssn = ntohs(chunk->subh.data_hdr->ssn);
+
+	return !SSN_lt(ssn, sctp_ssn_peek(stream, in, sid));
+}
+
+static bool sctp_validate_idata(struct sctp_chunk *chunk)
+{
+	struct sctp_stream *stream;
+	__u32 mid;
+	__u16 sid;
+
+	if (chunk->chunk_hdr->type != SCTP_CID_I_DATA)
+		return false;
+
+	if (chunk->chunk_hdr->flags & SCTP_DATA_UNORDERED)
+		return true;
+
+	stream = &chunk->asoc->stream;
+	sid = sctp_chunk_stream_no(chunk);
+	mid = ntohl(chunk->subh.idata_hdr->mid);
+
+	return !MID_lt(mid, sctp_mid_peek(stream, in, sid));
+}
+
 static struct sctp_stream_interleave sctp_stream_interleave_0 = {
 	.data_chunk_len		= sizeof(struct sctp_data_chunk),
 	/* DATA process functions */
 	.make_datafrag		= sctp_make_datafrag_empty,
 	.assign_number		= sctp_chunk_assign_ssn,
+	.validate_data		= sctp_validate_data,
 };
 
 static struct sctp_stream_interleave sctp_stream_interleave_1 = {
@@ -104,6 +142,7 @@ static struct sctp_stream_interleave sctp_stream_interleave_1 = {
 	/* I-DATA process functions */
 	.make_datafrag		= sctp_make_idatafrag_empty,
 	.assign_number		= sctp_chunk_assign_mid,
+	.validate_data		= sctp_validate_idata,
 };
 
 void sctp_stream_interleave_init(struct sctp_stream *stream)
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCHv2 net-next 07/12] sctp: implement ulpevent_data for sctp_stream_interleave
  2017-12-08 13:04           ` [PATCHv2 net-next 06/12] sctp: implement validate_data " Xin Long
@ 2017-12-08 13:04             ` Xin Long
  2017-12-08 13:04               ` [PATCHv2 net-next 08/12] sctp: implement enqueue_event " Xin Long
  0 siblings, 1 reply; 30+ messages in thread
From: Xin Long @ 2017-12-08 13:04 UTC (permalink / raw)
  To: network dev, linux-sctp; +Cc: Marcelo Ricardo Leitner, Neil Horman, davem

ulpevent_data is added as a member of sctp_stream_interleave, used to
do the most process in ulpq, including to convert data or idata chunk
to event, reasm them in reasm queue and put them in lobby queue in
right order, and deliver them up to user sk rx queue.

This procedure is described in section 2.2.3 of RFC8260.

It adds most functions for idata here to do the similar process as
the old functions for data. But since the details are very different
between them, the old functions can not be reused for idata.

event->ssn and event->ppid settings are moved to ulpevent_data from
sctp_ulpevent_make_rcvmsg, so that sctp_ulpevent_make_rcvmsg could
work for both data and idata.

Note that mid is added in sctp_ulpevent for idata, __packed has to
be used for defining sctp_ulpevent, or it would exceeds the skb cb
that saves a sctp_ulpevent variable for ulp layer process.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
---
 include/net/sctp/stream_interleave.h |   2 +
 include/net/sctp/structs.h           |   3 +
 include/net/sctp/ulpevent.h          |  20 +-
 net/sctp/sm_sideeffect.c             |   5 +-
 net/sctp/stream_interleave.c         | 418 +++++++++++++++++++++++++++++++++++
 net/sctp/ulpevent.c                  |   2 -
 net/sctp/ulpqueue.c                  |  12 +-
 7 files changed, 451 insertions(+), 11 deletions(-)

diff --git a/include/net/sctp/stream_interleave.h b/include/net/sctp/stream_interleave.h
index d8d1b51..02f60f5 100644
--- a/include/net/sctp/stream_interleave.h
+++ b/include/net/sctp/stream_interleave.h
@@ -39,6 +39,8 @@ struct sctp_stream_interleave {
 					    int len, __u8 flags, gfp_t gfp);
 	void	(*assign_number)(struct sctp_chunk *chunk);
 	bool	(*validate_data)(struct sctp_chunk *chunk);
+	int	(*ulpevent_data)(struct sctp_ulpq *ulpq,
+				 struct sctp_chunk *chunk, gfp_t gfp);
 };
 
 void sctp_stream_interleave_init(struct sctp_stream *stream);
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index be4cc73..73b315d 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -411,6 +411,8 @@ void sctp_stream_update(struct sctp_stream *stream, struct sctp_stream *new);
 #define sctp_mid_skip(stream, type, sid, mid) \
 	((stream)->type[sid].mid = mid + 1)
 
+#define sctp_stream_in(asoc, sid) (&(asoc)->stream.in[sid])
+
 /*
  * Pointers to address related SCTP functions.
  * (i.e. things that depend on the address family.)
@@ -1387,6 +1389,7 @@ struct sctp_stream_in {
 		__u16 ssn;
 	};
 	__u32 fsn;
+	char pd_mode;
 };
 
 struct sctp_stream {
diff --git a/include/net/sctp/ulpevent.h b/include/net/sctp/ulpevent.h
index 231dc42..ce4f2aa 100644
--- a/include/net/sctp/ulpevent.h
+++ b/include/net/sctp/ulpevent.h
@@ -45,19 +45,29 @@
 /* A structure to carry information to the ULP (e.g. Sockets API) */
 /* Warning: This sits inside an skb.cb[] area.  Be very careful of
  * growing this structure as it is at the maximum limit now.
+ *
+ * sctp_ulpevent is saved in sk->cb(48 bytes), whose last 4 bytes
+ * have been taken by sock_skb_cb, So here it has to use 'packed'
+ * to make sctp_ulpevent fit into the rest 44 bytes.
  */
 struct sctp_ulpevent {
 	struct sctp_association *asoc;
 	struct sctp_chunk *chunk;
 	unsigned int rmem_len;
-	__u32 ppid;
+	union {
+		__u32 mid;
+		__u16 ssn;
+	};
+	union {
+		__u32 ppid;
+		__u32 fsn;
+	};
 	__u32 tsn;
 	__u32 cumtsn;
 	__u16 stream;
-	__u16 ssn;
 	__u16 flags;
 	__u16 msg_flags;
-};
+} __packed;
 
 /* Retrieve the skb this event sits inside of. */
 static inline struct sk_buff *sctp_event2skb(const struct sctp_ulpevent *ev)
@@ -140,6 +150,10 @@ struct sctp_ulpevent *sctp_ulpevent_make_stream_change_event(
 	const struct sctp_association *asoc, __u16 flags,
 	__u32 strchange_instrms, __u32 strchange_outstrms, gfp_t gfp);
 
+struct sctp_ulpevent *sctp_make_reassembled_event(
+	struct net *net, struct sk_buff_head *queue,
+	struct sk_buff *f_frag, struct sk_buff *l_frag);
+
 void sctp_ulpevent_read_sndrcvinfo(const struct sctp_ulpevent *event,
 				   struct msghdr *);
 void sctp_ulpevent_read_rcvinfo(const struct sctp_ulpevent *event,
diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
index df94d77..9d25efb 100644
--- a/net/sctp/sm_sideeffect.c
+++ b/net/sctp/sm_sideeffect.c
@@ -1483,8 +1483,9 @@ static int sctp_cmd_interpreter(enum sctp_event event_type,
 			pr_debug("%s: sm_sideff: chunk_up:%p, ulpq:%p\n",
 				 __func__, cmd->obj.chunk, &asoc->ulpq);
 
-			sctp_ulpq_tail_data(&asoc->ulpq, cmd->obj.chunk,
-					    GFP_ATOMIC);
+			asoc->stream.si->ulpevent_data(&asoc->ulpq,
+						       cmd->obj.chunk,
+						       GFP_ATOMIC);
 			break;
 
 		case SCTP_CMD_EVENT_ULP:
diff --git a/net/sctp/stream_interleave.c b/net/sctp/stream_interleave.c
index 3d8733b..8238311 100644
--- a/net/sctp/stream_interleave.c
+++ b/net/sctp/stream_interleave.c
@@ -29,8 +29,10 @@
  *    Xin Long <lucien.xin@gmail.com>
  */
 
+#include <net/busy_poll.h>
 #include <net/sctp/sctp.h>
 #include <net/sctp/sm.h>
+#include <net/sctp/ulpevent.h>
 #include <linux/sctp.h>
 
 static struct sctp_chunk *sctp_make_idatafrag_empty(
@@ -129,12 +131,427 @@ static bool sctp_validate_idata(struct sctp_chunk *chunk)
 	return !MID_lt(mid, sctp_mid_peek(stream, in, sid));
 }
 
+static void sctp_intl_store_reasm(struct sctp_ulpq *ulpq,
+				  struct sctp_ulpevent *event)
+{
+	struct sctp_ulpevent *cevent;
+	struct sk_buff *pos;
+
+	pos = skb_peek_tail(&ulpq->reasm);
+	if (!pos) {
+		__skb_queue_tail(&ulpq->reasm, sctp_event2skb(event));
+		return;
+	}
+
+	cevent = sctp_skb2event(pos);
+
+	if (event->stream == cevent->stream &&
+	    event->mid == cevent->mid &&
+	    (cevent->msg_flags & SCTP_DATA_FIRST_FRAG ||
+	     (!(event->msg_flags & SCTP_DATA_FIRST_FRAG) &&
+	      event->fsn > cevent->fsn))) {
+		__skb_queue_tail(&ulpq->reasm, sctp_event2skb(event));
+		return;
+	}
+
+	if ((event->stream == cevent->stream &&
+	     MID_lt(cevent->mid, event->mid)) ||
+	    event->stream > cevent->stream) {
+		__skb_queue_tail(&ulpq->reasm, sctp_event2skb(event));
+		return;
+	}
+
+	skb_queue_walk(&ulpq->reasm, pos) {
+		cevent = sctp_skb2event(pos);
+
+		if (event->stream < cevent->stream ||
+		    (event->stream == cevent->stream &&
+		     MID_lt(event->mid, cevent->mid)))
+			break;
+
+		if (event->stream == cevent->stream &&
+		    event->mid == cevent->mid &&
+		    !(cevent->msg_flags & SCTP_DATA_FIRST_FRAG) &&
+		    (event->msg_flags & SCTP_DATA_FIRST_FRAG ||
+		     event->fsn < cevent->fsn))
+			break;
+	}
+
+	__skb_queue_before(&ulpq->reasm, pos, sctp_event2skb(event));
+}
+
+static struct sctp_ulpevent *sctp_intl_retrieve_partial(
+						struct sctp_ulpq *ulpq,
+						struct sctp_ulpevent *event)
+{
+	struct sk_buff *first_frag = NULL;
+	struct sk_buff *last_frag = NULL;
+	struct sctp_ulpevent *retval;
+	struct sctp_stream_in *sin;
+	struct sk_buff *pos;
+	__u32 next_fsn = 0;
+	int is_last = 0;
+
+	sin = sctp_stream_in(ulpq->asoc, event->stream);
+
+	skb_queue_walk(&ulpq->reasm, pos) {
+		struct sctp_ulpevent *cevent = sctp_skb2event(pos);
+
+		if (cevent->stream < event->stream)
+			continue;
+
+		if (cevent->stream > event->stream ||
+		    cevent->mid != sin->mid)
+			break;
+
+		switch (cevent->msg_flags & SCTP_DATA_FRAG_MASK) {
+		case SCTP_DATA_FIRST_FRAG:
+			goto out;
+		case SCTP_DATA_MIDDLE_FRAG:
+			if (!first_frag) {
+				if (cevent->fsn == sin->fsn) {
+					first_frag = pos;
+					last_frag = pos;
+					next_fsn = cevent->fsn + 1;
+				}
+			} else if (cevent->fsn == next_fsn) {
+				last_frag = pos;
+				next_fsn++;
+			} else {
+				goto out;
+			}
+			break;
+		case SCTP_DATA_LAST_FRAG:
+			if (!first_frag) {
+				if (cevent->fsn == sin->fsn) {
+					first_frag = pos;
+					last_frag = pos;
+					next_fsn = 0;
+					is_last = 1;
+				}
+			} else if (cevent->fsn == next_fsn) {
+				last_frag = pos;
+				next_fsn = 0;
+				is_last = 1;
+			}
+			goto out;
+		default:
+			goto out;
+		}
+	}
+
+out:
+	if (!first_frag)
+		return NULL;
+
+	retval = sctp_make_reassembled_event(sock_net(ulpq->asoc->base.sk),
+					     &ulpq->reasm, first_frag,
+					     last_frag);
+	if (retval) {
+		sin->fsn = next_fsn;
+		if (is_last) {
+			retval->msg_flags |= MSG_EOR;
+			sin->pd_mode = 0;
+		}
+	}
+
+	return retval;
+}
+
+static struct sctp_ulpevent *sctp_intl_retrieve_reassembled(
+						struct sctp_ulpq *ulpq,
+						struct sctp_ulpevent *event)
+{
+	struct sctp_association *asoc = ulpq->asoc;
+	struct sk_buff *pos, *first_frag = NULL;
+	struct sctp_ulpevent *retval = NULL;
+	struct sk_buff *pd_first = NULL;
+	struct sk_buff *pd_last = NULL;
+	struct sctp_stream_in *sin;
+	__u32 next_fsn = 0;
+	__u32 pd_point = 0;
+	__u32 pd_len = 0;
+	__u32 mid = 0;
+
+	sin = sctp_stream_in(ulpq->asoc, event->stream);
+
+	skb_queue_walk(&ulpq->reasm, pos) {
+		struct sctp_ulpevent *cevent = sctp_skb2event(pos);
+
+		if (cevent->stream < event->stream)
+			continue;
+		if (cevent->stream > event->stream)
+			break;
+
+		if (MID_lt(cevent->mid, event->mid))
+			continue;
+		if (MID_lt(event->mid, cevent->mid))
+			break;
+
+		switch (cevent->msg_flags & SCTP_DATA_FRAG_MASK) {
+		case SCTP_DATA_FIRST_FRAG:
+			if (cevent->mid == sin->mid) {
+				pd_first = pos;
+				pd_last = pos;
+				pd_len = pos->len;
+			}
+
+			first_frag = pos;
+			next_fsn = 0;
+			mid = cevent->mid;
+			break;
+
+		case SCTP_DATA_MIDDLE_FRAG:
+			if (first_frag && cevent->mid == mid &&
+			    cevent->fsn == next_fsn) {
+				next_fsn++;
+				if (pd_first) {
+					pd_last = pos;
+					pd_len += pos->len;
+				}
+			} else {
+				first_frag = NULL;
+			}
+			break;
+
+		case SCTP_DATA_LAST_FRAG:
+			if (first_frag && cevent->mid == mid &&
+			    cevent->fsn == next_fsn)
+				goto found;
+			else
+				first_frag = NULL;
+			break;
+		}
+	}
+
+	if (!pd_first)
+		goto out;
+
+	pd_point = sctp_sk(asoc->base.sk)->pd_point;
+	if (pd_point && pd_point <= pd_len) {
+		retval = sctp_make_reassembled_event(sock_net(asoc->base.sk),
+						     &ulpq->reasm,
+						     pd_first, pd_last);
+		if (retval) {
+			sin->fsn = next_fsn;
+			sin->pd_mode = 1;
+		}
+	}
+	goto out;
+
+found:
+	retval = sctp_make_reassembled_event(sock_net(asoc->base.sk),
+					     &ulpq->reasm,
+					     first_frag, pos);
+	if (retval)
+		retval->msg_flags |= MSG_EOR;
+
+out:
+	return retval;
+}
+
+static struct sctp_ulpevent *sctp_intl_reasm(struct sctp_ulpq *ulpq,
+					     struct sctp_ulpevent *event)
+{
+	struct sctp_ulpevent *retval = NULL;
+	struct sctp_stream_in *sin;
+
+	if (SCTP_DATA_NOT_FRAG == (event->msg_flags & SCTP_DATA_FRAG_MASK)) {
+		event->msg_flags |= MSG_EOR;
+		return event;
+	}
+
+	sctp_intl_store_reasm(ulpq, event);
+
+	sin = sctp_stream_in(ulpq->asoc, event->stream);
+	if (sin->pd_mode && event->mid == sin->mid &&
+	    event->fsn == sin->fsn)
+		retval = sctp_intl_retrieve_partial(ulpq, event);
+
+	if (!retval)
+		retval = sctp_intl_retrieve_reassembled(ulpq, event);
+
+	return retval;
+}
+
+static void sctp_intl_store_ordered(struct sctp_ulpq *ulpq,
+				    struct sctp_ulpevent *event)
+{
+	struct sctp_ulpevent *cevent;
+	struct sk_buff *pos;
+
+	pos = skb_peek_tail(&ulpq->lobby);
+	if (!pos) {
+		__skb_queue_tail(&ulpq->lobby, sctp_event2skb(event));
+		return;
+	}
+
+	cevent = (struct sctp_ulpevent *)pos->cb;
+	if (event->stream == cevent->stream &&
+	    MID_lt(cevent->mid, event->mid)) {
+		__skb_queue_tail(&ulpq->lobby, sctp_event2skb(event));
+		return;
+	}
+
+	if (event->stream > cevent->stream) {
+		__skb_queue_tail(&ulpq->lobby, sctp_event2skb(event));
+		return;
+	}
+
+	skb_queue_walk(&ulpq->lobby, pos) {
+		cevent = (struct sctp_ulpevent *)pos->cb;
+
+		if (cevent->stream > event->stream)
+			break;
+
+		if (cevent->stream == event->stream &&
+		    MID_lt(event->mid, cevent->mid))
+			break;
+	}
+
+	__skb_queue_before(&ulpq->lobby, pos, sctp_event2skb(event));
+}
+
+static void sctp_intl_retrieve_ordered(struct sctp_ulpq *ulpq,
+				       struct sctp_ulpevent *event)
+{
+	struct sk_buff_head *event_list;
+	struct sctp_stream *stream;
+	struct sk_buff *pos, *tmp;
+	__u16 sid = event->stream;
+
+	stream  = &ulpq->asoc->stream;
+	event_list = (struct sk_buff_head *)sctp_event2skb(event)->prev;
+
+	sctp_skb_for_each(pos, &ulpq->lobby, tmp) {
+		struct sctp_ulpevent *cevent = (struct sctp_ulpevent *)pos->cb;
+
+		if (cevent->stream > sid)
+			break;
+
+		if (cevent->stream < sid)
+			continue;
+
+		if (cevent->mid != sctp_mid_peek(stream, in, sid))
+			break;
+
+		sctp_mid_next(stream, in, sid);
+
+		__skb_unlink(pos, &ulpq->lobby);
+
+		__skb_queue_tail(event_list, pos);
+	}
+}
+
+static struct sctp_ulpevent *sctp_intl_order(struct sctp_ulpq *ulpq,
+					     struct sctp_ulpevent *event)
+{
+	struct sctp_stream *stream;
+	__u16 sid;
+
+	if (event->msg_flags & SCTP_DATA_UNORDERED)
+		return event;
+
+	stream  = &ulpq->asoc->stream;
+	sid = event->stream;
+
+	if (event->mid != sctp_mid_peek(stream, in, sid)) {
+		sctp_intl_store_ordered(ulpq, event);
+		return NULL;
+	}
+
+	sctp_mid_next(stream, in, sid);
+
+	sctp_intl_retrieve_ordered(ulpq, event);
+
+	return event;
+}
+
+static int sctp_enqueue_event(struct sctp_ulpq *ulpq,
+			      struct sctp_ulpevent *event)
+{
+	struct sk_buff *skb = sctp_event2skb(event);
+	struct sock *sk = ulpq->asoc->base.sk;
+	struct sctp_sock *sp = sctp_sk(sk);
+	struct sk_buff_head *skb_list;
+
+	skb_list = (struct sk_buff_head *)skb->prev;
+
+	if (sk->sk_shutdown & RCV_SHUTDOWN &&
+	    (sk->sk_shutdown & SEND_SHUTDOWN ||
+	     !sctp_ulpevent_is_notification(event)))
+		goto out_free;
+
+	if (!sctp_ulpevent_is_notification(event)) {
+		sk_mark_napi_id(sk, skb);
+		sk_incoming_cpu_update(sk);
+	}
+
+	if (!sctp_ulpevent_is_enabled(event, &sp->subscribe))
+		goto out_free;
+
+	if (skb_list)
+		skb_queue_splice_tail_init(skb_list,
+					   &sk->sk_receive_queue);
+	else
+		__skb_queue_tail(&sk->sk_receive_queue, skb);
+
+	if (!sp->data_ready_signalled) {
+		sp->data_ready_signalled = 1;
+		sk->sk_data_ready(sk);
+	}
+
+	return 1;
+
+out_free:
+	if (skb_list)
+		sctp_queue_purge_ulpevents(skb_list);
+	else
+		sctp_ulpevent_free(event);
+
+	return 0;
+}
+
+static int sctp_ulpevent_idata(struct sctp_ulpq *ulpq,
+			       struct sctp_chunk *chunk, gfp_t gfp)
+{
+	struct sctp_ulpevent *event;
+	struct sk_buff_head temp;
+	int event_eor = 0;
+
+	event = sctp_ulpevent_make_rcvmsg(chunk->asoc, chunk, gfp);
+	if (!event)
+		return -ENOMEM;
+
+	event->mid = ntohl(chunk->subh.idata_hdr->mid);
+	if (event->msg_flags & SCTP_DATA_FIRST_FRAG)
+		event->ppid = chunk->subh.idata_hdr->ppid;
+	else
+		event->fsn = ntohl(chunk->subh.idata_hdr->fsn);
+
+	event = sctp_intl_reasm(ulpq, event);
+	if (event && event->msg_flags & MSG_EOR) {
+		skb_queue_head_init(&temp);
+		__skb_queue_tail(&temp, sctp_event2skb(event));
+
+		event = sctp_intl_order(ulpq, event);
+	}
+
+	if (event) {
+		event_eor = (event->msg_flags & MSG_EOR) ? 1 : 0;
+		sctp_enqueue_event(ulpq, event);
+	}
+
+	return event_eor;
+}
+
 static struct sctp_stream_interleave sctp_stream_interleave_0 = {
 	.data_chunk_len		= sizeof(struct sctp_data_chunk),
 	/* DATA process functions */
 	.make_datafrag		= sctp_make_datafrag_empty,
 	.assign_number		= sctp_chunk_assign_ssn,
 	.validate_data		= sctp_validate_data,
+	.ulpevent_data		= sctp_ulpq_tail_data,
 };
 
 static struct sctp_stream_interleave sctp_stream_interleave_1 = {
@@ -143,6 +560,7 @@ static struct sctp_stream_interleave sctp_stream_interleave_1 = {
 	.make_datafrag		= sctp_make_idatafrag_empty,
 	.assign_number		= sctp_chunk_assign_mid,
 	.validate_data		= sctp_validate_idata,
+	.ulpevent_data		= sctp_ulpevent_idata,
 };
 
 void sctp_stream_interleave_init(struct sctp_stream *stream)
diff --git a/net/sctp/ulpevent.c b/net/sctp/ulpevent.c
index 650b634..d3218f3 100644
--- a/net/sctp/ulpevent.c
+++ b/net/sctp/ulpevent.c
@@ -705,8 +705,6 @@ struct sctp_ulpevent *sctp_ulpevent_make_rcvmsg(struct sctp_association *asoc,
 	sctp_ulpevent_receive_data(event, asoc);
 
 	event->stream = ntohs(chunk->subh.data_hdr->stream);
-	event->ssn = ntohs(chunk->subh.data_hdr->ssn);
-	event->ppid = chunk->subh.data_hdr->ppid;
 	if (chunk->chunk_hdr->flags & SCTP_DATA_UNORDERED) {
 		event->flags |= SCTP_UNORDERED;
 		event->cumtsn = sctp_tsnmap_get_ctsn(&asoc->peer.tsn_map);
diff --git a/net/sctp/ulpqueue.c b/net/sctp/ulpqueue.c
index a71be33..0d07f2a 100644
--- a/net/sctp/ulpqueue.c
+++ b/net/sctp/ulpqueue.c
@@ -104,6 +104,9 @@ int sctp_ulpq_tail_data(struct sctp_ulpq *ulpq, struct sctp_chunk *chunk,
 	if (!event)
 		return -ENOMEM;
 
+	event->ssn = ntohs(chunk->subh.data_hdr->ssn);
+	event->ppid = chunk->subh.data_hdr->ppid;
+
 	/* Do reassembly if needed.  */
 	event = sctp_ulpq_reasm(ulpq, event);
 
@@ -328,9 +331,10 @@ static void sctp_ulpq_store_reasm(struct sctp_ulpq *ulpq,
  * payload was fragmented on the way and ip had to reassemble them.
  * We add the rest of skb's to the first skb's fraglist.
  */
-static struct sctp_ulpevent *sctp_make_reassembled_event(struct net *net,
-	struct sk_buff_head *queue, struct sk_buff *f_frag,
-	struct sk_buff *l_frag)
+struct sctp_ulpevent *sctp_make_reassembled_event(struct net *net,
+						  struct sk_buff_head *queue,
+						  struct sk_buff *f_frag,
+						  struct sk_buff *l_frag)
 {
 	struct sk_buff *pos;
 	struct sk_buff *new = NULL;
@@ -853,7 +857,7 @@ static struct sctp_ulpevent *sctp_ulpq_order(struct sctp_ulpq *ulpq,
 	struct sctp_stream *stream;
 
 	/* Check if this message needs ordering.  */
-	if (SCTP_DATA_UNORDERED & event->msg_flags)
+	if (event->msg_flags & SCTP_DATA_UNORDERED)
 		return event;
 
 	/* Note: The stream ID must be verified before this routine.  */
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCHv2 net-next 08/12] sctp: implement enqueue_event for sctp_stream_interleave
  2017-12-08 13:04             ` [PATCHv2 net-next 07/12] sctp: implement ulpevent_data " Xin Long
@ 2017-12-08 13:04               ` Xin Long
  2017-12-08 13:04                 ` [PATCHv2 net-next 09/12] sctp: implement renege_events " Xin Long
  0 siblings, 1 reply; 30+ messages in thread
From: Xin Long @ 2017-12-08 13:04 UTC (permalink / raw)
  To: network dev, linux-sctp; +Cc: Marcelo Ricardo Leitner, Neil Horman, davem

enqueue_event is added as a member of sctp_stream_interleave, used to
enqueue either data, idata or notification events into user socket rx
queue.

It replaces sctp_ulpq_tail_event used in the other places with
enqueue_event.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
---
 include/net/sctp/stream_interleave.h | 2 ++
 net/sctp/associola.c                 | 2 +-
 net/sctp/chunk.c                     | 2 +-
 net/sctp/sm_sideeffect.c             | 9 +++++----
 net/sctp/socket.c                    | 2 +-
 net/sctp/stream_interleave.c         | 2 ++
 6 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/include/net/sctp/stream_interleave.h b/include/net/sctp/stream_interleave.h
index 02f60f5..a0f61bc 100644
--- a/include/net/sctp/stream_interleave.h
+++ b/include/net/sctp/stream_interleave.h
@@ -41,6 +41,8 @@ struct sctp_stream_interleave {
 	bool	(*validate_data)(struct sctp_chunk *chunk);
 	int	(*ulpevent_data)(struct sctp_ulpq *ulpq,
 				 struct sctp_chunk *chunk, gfp_t gfp);
+	int	(*enqueue_event)(struct sctp_ulpq *ulpq,
+				 struct sctp_ulpevent *event);
 };
 
 void sctp_stream_interleave_init(struct sctp_stream *stream);
diff --git a/net/sctp/associola.c b/net/sctp/associola.c
index 69394f4..837806d 100644
--- a/net/sctp/associola.c
+++ b/net/sctp/associola.c
@@ -861,7 +861,7 @@ void sctp_assoc_control_transport(struct sctp_association *asoc,
 		event = sctp_ulpevent_make_peer_addr_change(asoc, &addr,
 					0, spc_state, error, GFP_ATOMIC);
 		if (event)
-			sctp_ulpq_tail_event(&asoc->ulpq, event);
+			asoc->stream.si->enqueue_event(&asoc->ulpq, event);
 	}
 
 	/* Select new active and retran paths. */
diff --git a/net/sctp/chunk.c b/net/sctp/chunk.c
index 62adaaa..991a530 100644
--- a/net/sctp/chunk.c
+++ b/net/sctp/chunk.c
@@ -124,7 +124,7 @@ static void sctp_datamsg_destroy(struct sctp_datamsg *msg)
 			ev = sctp_ulpevent_make_send_failed(asoc, chunk, sent,
 							    error, GFP_ATOMIC);
 			if (ev)
-				sctp_ulpq_tail_event(&asoc->ulpq, ev);
+				asoc->stream.si->enqueue_event(&asoc->ulpq, ev);
 		}
 
 		sctp_chunk_put(chunk);
diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
index 9d25efb..f4e5eca 100644
--- a/net/sctp/sm_sideeffect.c
+++ b/net/sctp/sm_sideeffect.c
@@ -972,7 +972,7 @@ static void sctp_cmd_process_operr(struct sctp_cmd_seq *cmds,
 		if (!ev)
 			return;
 
-		sctp_ulpq_tail_event(&asoc->ulpq, ev);
+		asoc->stream.si->enqueue_event(&asoc->ulpq, ev);
 
 		switch (err_hdr->cause) {
 		case SCTP_ERROR_UNKNOWN_CHUNK:
@@ -1058,7 +1058,7 @@ static void sctp_cmd_assoc_change(struct sctp_cmd_seq *commands,
 					    asoc->c.sinit_max_instreams,
 					    NULL, GFP_ATOMIC);
 	if (ev)
-		sctp_ulpq_tail_event(&asoc->ulpq, ev);
+		asoc->stream.si->enqueue_event(&asoc->ulpq, ev);
 }
 
 /* Helper function to generate an adaptation indication event */
@@ -1070,7 +1070,7 @@ static void sctp_cmd_adaptation_ind(struct sctp_cmd_seq *commands,
 	ev = sctp_ulpevent_make_adaptation_indication(asoc, GFP_ATOMIC);
 
 	if (ev)
-		sctp_ulpq_tail_event(&asoc->ulpq, ev);
+		asoc->stream.si->enqueue_event(&asoc->ulpq, ev);
 }
 
 
@@ -1493,7 +1493,8 @@ static int sctp_cmd_interpreter(enum sctp_event event_type,
 			pr_debug("%s: sm_sideff: event_up:%p, ulpq:%p\n",
 				 __func__, cmd->obj.ulpevent, &asoc->ulpq);
 
-			sctp_ulpq_tail_event(&asoc->ulpq, cmd->obj.ulpevent);
+			asoc->stream.si->enqueue_event(&asoc->ulpq,
+						       cmd->obj.ulpevent);
 			break;
 
 		case SCTP_CMD_REPLY:
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 036f945..fe2cab9 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -2294,7 +2294,7 @@ static int sctp_setsockopt_events(struct sock *sk, char __user *optval,
 			if (!event)
 				return -ENOMEM;
 
-			sctp_ulpq_tail_event(&asoc->ulpq, event);
+			asoc->stream.si->enqueue_event(&asoc->ulpq, event);
 		}
 	}
 
diff --git a/net/sctp/stream_interleave.c b/net/sctp/stream_interleave.c
index 8238311..e853972 100644
--- a/net/sctp/stream_interleave.c
+++ b/net/sctp/stream_interleave.c
@@ -552,6 +552,7 @@ static struct sctp_stream_interleave sctp_stream_interleave_0 = {
 	.assign_number		= sctp_chunk_assign_ssn,
 	.validate_data		= sctp_validate_data,
 	.ulpevent_data		= sctp_ulpq_tail_data,
+	.enqueue_event		= sctp_ulpq_tail_event,
 };
 
 static struct sctp_stream_interleave sctp_stream_interleave_1 = {
@@ -561,6 +562,7 @@ static struct sctp_stream_interleave sctp_stream_interleave_1 = {
 	.assign_number		= sctp_chunk_assign_mid,
 	.validate_data		= sctp_validate_idata,
 	.ulpevent_data		= sctp_ulpevent_idata,
+	.enqueue_event		= sctp_enqueue_event,
 };
 
 void sctp_stream_interleave_init(struct sctp_stream *stream)
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCHv2 net-next 09/12] sctp: implement renege_events for sctp_stream_interleave
  2017-12-08 13:04               ` [PATCHv2 net-next 08/12] sctp: implement enqueue_event " Xin Long
@ 2017-12-08 13:04                 ` Xin Long
  2017-12-08 13:04                   ` [PATCHv2 net-next 10/12] sctp: implement start_pd " Xin Long
  0 siblings, 1 reply; 30+ messages in thread
From: Xin Long @ 2017-12-08 13:04 UTC (permalink / raw)
  To: network dev, linux-sctp; +Cc: Marcelo Ricardo Leitner, Neil Horman, davem

renege_events is added as a member of sctp_stream_interleave, used to
renege some old data or idata in reasm or lobby queue properly to free
some memory for the new data when there's memory stress.

It defines sctp_renege_events for idata, and leaves sctp_ulpq_renege
as it is for data.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
---
 include/net/sctp/stream_interleave.h |   2 +
 include/net/sctp/ulpqueue.h          |   9 +--
 net/sctp/sm_sideeffect.c             |   5 +-
 net/sctp/stream_interleave.c         | 109 +++++++++++++++++++++++++++++++++++
 net/sctp/ulpqueue.c                  |   4 +-
 5 files changed, 119 insertions(+), 10 deletions(-)

diff --git a/include/net/sctp/stream_interleave.h b/include/net/sctp/stream_interleave.h
index a0f61bc..16a71cb 100644
--- a/include/net/sctp/stream_interleave.h
+++ b/include/net/sctp/stream_interleave.h
@@ -43,6 +43,8 @@ struct sctp_stream_interleave {
 				 struct sctp_chunk *chunk, gfp_t gfp);
 	int	(*enqueue_event)(struct sctp_ulpq *ulpq,
 				 struct sctp_ulpevent *event);
+	void	(*renege_events)(struct sctp_ulpq *ulpq,
+				 struct sctp_chunk *chunk, gfp_t gfp);
 };
 
 void sctp_stream_interleave_init(struct sctp_stream *stream);
diff --git a/include/net/sctp/ulpqueue.h b/include/net/sctp/ulpqueue.h
index e0dce07..eb98c71 100644
--- a/include/net/sctp/ulpqueue.h
+++ b/include/net/sctp/ulpqueue.h
@@ -76,11 +76,8 @@ int sctp_clear_pd(struct sock *sk, struct sctp_association *asoc);
 void sctp_ulpq_skip(struct sctp_ulpq *ulpq, __u16 sid, __u16 ssn);
 
 void sctp_ulpq_reasm_flushtsn(struct sctp_ulpq *, __u32);
-#endif /* __sctp_ulpqueue_h__ */
-
-
-
-
-
 
+__u16 sctp_ulpq_renege_list(struct sctp_ulpq *ulpq,
+			    struct sk_buff_head *list, __u16 needed);
 
+#endif /* __sctp_ulpqueue_h__ */
diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
index f4e5eca..2bec17a 100644
--- a/net/sctp/sm_sideeffect.c
+++ b/net/sctp/sm_sideeffect.c
@@ -1735,8 +1735,9 @@ static int sctp_cmd_interpreter(enum sctp_event event_type,
 			break;
 
 		case SCTP_CMD_RENEGE:
-			sctp_ulpq_renege(&asoc->ulpq, cmd->obj.chunk,
-					 GFP_ATOMIC);
+			asoc->stream.si->renege_events(&asoc->ulpq,
+						       cmd->obj.chunk,
+						       GFP_ATOMIC);
 			break;
 
 		case SCTP_CMD_SETUP_T4:
diff --git a/net/sctp/stream_interleave.c b/net/sctp/stream_interleave.c
index e853972..d62ad5c 100644
--- a/net/sctp/stream_interleave.c
+++ b/net/sctp/stream_interleave.c
@@ -545,6 +545,113 @@ static int sctp_ulpevent_idata(struct sctp_ulpq *ulpq,
 	return event_eor;
 }
 
+static struct sctp_ulpevent *sctp_intl_retrieve_first(struct sctp_ulpq *ulpq)
+{
+	struct sctp_stream_in *csin, *sin = NULL;
+	struct sk_buff *first_frag = NULL;
+	struct sk_buff *last_frag = NULL;
+	struct sctp_ulpevent *retval;
+	struct sk_buff *pos;
+	__u32 next_fsn = 0;
+	__u16 sid = 0;
+
+	skb_queue_walk(&ulpq->reasm, pos) {
+		struct sctp_ulpevent *cevent = sctp_skb2event(pos);
+
+		csin = sctp_stream_in(ulpq->asoc, cevent->stream);
+		if (csin->pd_mode)
+			continue;
+
+		switch (cevent->msg_flags & SCTP_DATA_FRAG_MASK) {
+		case SCTP_DATA_FIRST_FRAG:
+			if (first_frag)
+				goto out;
+			if (cevent->mid == csin->mid) {
+				first_frag = pos;
+				last_frag = pos;
+				next_fsn = 0;
+				sin = csin;
+				sid = cevent->stream;
+			}
+			break;
+		case SCTP_DATA_MIDDLE_FRAG:
+			if (!first_frag)
+				break;
+			if (cevent->stream == sid &&
+			    cevent->mid == sin->mid &&
+			    cevent->fsn == next_fsn) {
+				next_fsn++;
+				last_frag = pos;
+			} else {
+				goto out;
+			}
+			break;
+		case SCTP_DATA_LAST_FRAG:
+			if (first_frag)
+				goto out;
+			break;
+		default:
+			break;
+		}
+	}
+
+	if (!first_frag)
+		return NULL;
+
+out:
+	retval = sctp_make_reassembled_event(sock_net(ulpq->asoc->base.sk),
+					     &ulpq->reasm, first_frag,
+					     last_frag);
+	if (retval) {
+		sin->fsn = next_fsn;
+		sin->pd_mode = 1;
+	}
+
+	return retval;
+}
+
+static void sctp_intl_start_pd(struct sctp_ulpq *ulpq, gfp_t gfp)
+{
+	struct sctp_ulpevent *event;
+
+	if (skb_queue_empty(&ulpq->reasm))
+		return;
+
+	do {
+		event = sctp_intl_retrieve_first(ulpq);
+		if (event)
+			sctp_enqueue_event(ulpq, event);
+	} while (event);
+}
+
+static void sctp_renege_events(struct sctp_ulpq *ulpq, struct sctp_chunk *chunk,
+			       gfp_t gfp)
+{
+	struct sctp_association *asoc = ulpq->asoc;
+	__u32 freed = 0;
+	__u16 needed;
+
+	if (chunk) {
+		needed = ntohs(chunk->chunk_hdr->length);
+		needed -= sizeof(struct sctp_idata_chunk);
+	} else {
+		needed = SCTP_DEFAULT_MAXWINDOW;
+	}
+
+	if (skb_queue_empty(&asoc->base.sk->sk_receive_queue)) {
+		freed = sctp_ulpq_renege_list(ulpq, &ulpq->lobby, needed);
+		if (freed < needed)
+			freed += sctp_ulpq_renege_list(ulpq, &ulpq->reasm,
+						       needed);
+	}
+
+	if (chunk && freed >= needed)
+		if (sctp_ulpevent_idata(ulpq, chunk, gfp) <= 0)
+			sctp_intl_start_pd(ulpq, gfp);
+
+	sk_mem_reclaim(asoc->base.sk);
+}
+
 static struct sctp_stream_interleave sctp_stream_interleave_0 = {
 	.data_chunk_len		= sizeof(struct sctp_data_chunk),
 	/* DATA process functions */
@@ -553,6 +660,7 @@ static struct sctp_stream_interleave sctp_stream_interleave_0 = {
 	.validate_data		= sctp_validate_data,
 	.ulpevent_data		= sctp_ulpq_tail_data,
 	.enqueue_event		= sctp_ulpq_tail_event,
+	.renege_events		= sctp_ulpq_renege,
 };
 
 static struct sctp_stream_interleave sctp_stream_interleave_1 = {
@@ -563,6 +671,7 @@ static struct sctp_stream_interleave sctp_stream_interleave_1 = {
 	.validate_data		= sctp_validate_idata,
 	.ulpevent_data		= sctp_ulpevent_idata,
 	.enqueue_event		= sctp_enqueue_event,
+	.renege_events		= sctp_renege_events,
 };
 
 void sctp_stream_interleave_init(struct sctp_stream *stream)
diff --git a/net/sctp/ulpqueue.c b/net/sctp/ulpqueue.c
index 0d07f2a..76ec514 100644
--- a/net/sctp/ulpqueue.c
+++ b/net/sctp/ulpqueue.c
@@ -978,8 +978,8 @@ void sctp_ulpq_skip(struct sctp_ulpq *ulpq, __u16 sid, __u16 ssn)
 	sctp_ulpq_reap_ordered(ulpq, sid);
 }
 
-static __u16 sctp_ulpq_renege_list(struct sctp_ulpq *ulpq,
-		struct sk_buff_head *list, __u16 needed)
+__u16 sctp_ulpq_renege_list(struct sctp_ulpq *ulpq, struct sk_buff_head *list,
+			    __u16 needed)
 {
 	__u16 freed = 0;
 	__u32 tsn, last_tsn;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCHv2 net-next 10/12] sctp: implement start_pd for sctp_stream_interleave
  2017-12-08 13:04                 ` [PATCHv2 net-next 09/12] sctp: implement renege_events " Xin Long
@ 2017-12-08 13:04                   ` Xin Long
  2017-12-08 13:04                     ` [PATCHv2 net-next 11/12] sctp: implement abort_pd " Xin Long
  0 siblings, 1 reply; 30+ messages in thread
From: Xin Long @ 2017-12-08 13:04 UTC (permalink / raw)
  To: network dev, linux-sctp; +Cc: Marcelo Ricardo Leitner, Neil Horman, davem

start_pd is added as a member of sctp_stream_interleave, used to
do partial_delivery for data or idata when datalen >= asoc->rwnd
in sctp_eat_data. The codes have been done in last patches, but
they need to be extracted into start_pd, so that it could be used
for SCTP_CMD_PART_DELIVER cmd as well.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
---
 include/net/sctp/stream_interleave.h | 1 +
 net/sctp/sm_sideeffect.c             | 2 +-
 net/sctp/stream_interleave.c         | 2 ++
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/net/sctp/stream_interleave.h b/include/net/sctp/stream_interleave.h
index 16a71cb..317d9b3 100644
--- a/include/net/sctp/stream_interleave.h
+++ b/include/net/sctp/stream_interleave.h
@@ -45,6 +45,7 @@ struct sctp_stream_interleave {
 				 struct sctp_ulpevent *event);
 	void	(*renege_events)(struct sctp_ulpq *ulpq,
 				 struct sctp_chunk *chunk, gfp_t gfp);
+	void	(*start_pd)(struct sctp_ulpq *ulpq, gfp_t gfp);
 };
 
 void sctp_stream_interleave_init(struct sctp_stream *stream);
diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
index 2bec17a..3671054 100644
--- a/net/sctp/sm_sideeffect.c
+++ b/net/sctp/sm_sideeffect.c
@@ -1731,7 +1731,7 @@ static int sctp_cmd_interpreter(enum sctp_event event_type,
 			break;
 
 		case SCTP_CMD_PART_DELIVER:
-			sctp_ulpq_partial_delivery(&asoc->ulpq, GFP_ATOMIC);
+			asoc->stream.si->start_pd(&asoc->ulpq, GFP_ATOMIC);
 			break;
 
 		case SCTP_CMD_RENEGE:
diff --git a/net/sctp/stream_interleave.c b/net/sctp/stream_interleave.c
index d62ad5c..4dce8d3 100644
--- a/net/sctp/stream_interleave.c
+++ b/net/sctp/stream_interleave.c
@@ -661,6 +661,7 @@ static struct sctp_stream_interleave sctp_stream_interleave_0 = {
 	.ulpevent_data		= sctp_ulpq_tail_data,
 	.enqueue_event		= sctp_ulpq_tail_event,
 	.renege_events		= sctp_ulpq_renege,
+	.start_pd		= sctp_ulpq_partial_delivery,
 };
 
 static struct sctp_stream_interleave sctp_stream_interleave_1 = {
@@ -672,6 +673,7 @@ static struct sctp_stream_interleave sctp_stream_interleave_1 = {
 	.ulpevent_data		= sctp_ulpevent_idata,
 	.enqueue_event		= sctp_enqueue_event,
 	.renege_events		= sctp_renege_events,
+	.start_pd		= sctp_intl_start_pd,
 };
 
 void sctp_stream_interleave_init(struct sctp_stream *stream)
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCHv2 net-next 11/12] sctp: implement abort_pd for sctp_stream_interleave
  2017-12-08 13:04                   ` [PATCHv2 net-next 10/12] sctp: implement start_pd " Xin Long
@ 2017-12-08 13:04                     ` Xin Long
  2017-12-08 13:04                       ` [PATCHv2 net-next 12/12] sctp: add support for the process of unordered idata Xin Long
  0 siblings, 1 reply; 30+ messages in thread
From: Xin Long @ 2017-12-08 13:04 UTC (permalink / raw)
  To: network dev, linux-sctp; +Cc: Marcelo Ricardo Leitner, Neil Horman, davem

abort_pd is added as a member of sctp_stream_interleave, used to abort
partial delivery for data or idata, called in sctp_cmd_assoc_failed.

Since stream interleave allows to do partial delivery for each stream
at the same time, sctp_intl_abort_pd for idata would be very different
from the old function sctp_ulpq_abort_pd for data.

Note that sctp_ulpevent_make_pdapi will support per stream in this
patch by adding pdapi_stream and pdapi_seq in sctp_pdapi_event, as
described in section 6.1.7 of RFC6458.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
---
 include/net/sctp/stream_interleave.h |  1 +
 include/net/sctp/ulpevent.h          |  3 +-
 include/uapi/linux/sctp.h            |  2 +
 net/sctp/sm_sideeffect.c             |  2 +-
 net/sctp/stream_interleave.c         | 99 ++++++++++++++++++++++++++++++++++++
 net/sctp/ulpevent.c                  |  9 ++--
 net/sctp/ulpqueue.c                  |  2 +-
 7 files changed, 112 insertions(+), 6 deletions(-)

diff --git a/include/net/sctp/stream_interleave.h b/include/net/sctp/stream_interleave.h
index 317d9b3..501b2be 100644
--- a/include/net/sctp/stream_interleave.h
+++ b/include/net/sctp/stream_interleave.h
@@ -46,6 +46,7 @@ struct sctp_stream_interleave {
 	void	(*renege_events)(struct sctp_ulpq *ulpq,
 				 struct sctp_chunk *chunk, gfp_t gfp);
 	void	(*start_pd)(struct sctp_ulpq *ulpq, gfp_t gfp);
+	void	(*abort_pd)(struct sctp_ulpq *ulpq, gfp_t gfp);
 };
 
 void sctp_stream_interleave_init(struct sctp_stream *stream);
diff --git a/include/net/sctp/ulpevent.h b/include/net/sctp/ulpevent.h
index ce4f2aa..51b4e06 100644
--- a/include/net/sctp/ulpevent.h
+++ b/include/net/sctp/ulpevent.h
@@ -122,7 +122,8 @@ struct sctp_ulpevent *sctp_ulpevent_make_shutdown_event(
 
 struct sctp_ulpevent *sctp_ulpevent_make_pdapi(
 	const struct sctp_association *asoc,
-	__u32 indication, gfp_t gfp);
+	__u32 indication, __u32 sid, __u32 seq,
+	__u32 flags, gfp_t gfp);
 
 struct sctp_ulpevent *sctp_ulpevent_make_adaptation_indication(
 	const struct sctp_association *asoc, gfp_t gfp);
diff --git a/include/uapi/linux/sctp.h b/include/uapi/linux/sctp.h
index 6ed934c..4c4db14 100644
--- a/include/uapi/linux/sctp.h
+++ b/include/uapi/linux/sctp.h
@@ -460,6 +460,8 @@ struct sctp_pdapi_event {
 	__u32 pdapi_length;
 	__u32 pdapi_indication;
 	sctp_assoc_t pdapi_assoc_id;
+	__u32 pdapi_stream;
+	__u32 pdapi_seq;
 };
 
 enum { SCTP_PARTIAL_DELIVERY_ABORTED=0, };
diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
index 3671054..8adde71 100644
--- a/net/sctp/sm_sideeffect.c
+++ b/net/sctp/sm_sideeffect.c
@@ -632,7 +632,7 @@ static void sctp_cmd_assoc_failed(struct sctp_cmd_seq *commands,
 	struct sctp_chunk *abort;
 
 	/* Cancel any partial delivery in progress. */
-	sctp_ulpq_abort_pd(&asoc->ulpq, GFP_ATOMIC);
+	asoc->stream.si->abort_pd(&asoc->ulpq, GFP_ATOMIC);
 
 	if (event_type == SCTP_EVENT_T_CHUNK && subtype.chunk == SCTP_CID_ABORT)
 		event = sctp_ulpevent_make_assoc_change(asoc, 0, SCTP_COMM_LOST,
diff --git a/net/sctp/stream_interleave.c b/net/sctp/stream_interleave.c
index 4dce8d3..d15645e 100644
--- a/net/sctp/stream_interleave.c
+++ b/net/sctp/stream_interleave.c
@@ -652,6 +652,103 @@ static void sctp_renege_events(struct sctp_ulpq *ulpq, struct sctp_chunk *chunk,
 	sk_mem_reclaim(asoc->base.sk);
 }
 
+static void sctp_intl_stream_abort_pd(struct sctp_ulpq *ulpq, __u16 sid,
+				      __u32 mid, __u16 flags, gfp_t gfp)
+{
+	struct sock *sk = ulpq->asoc->base.sk;
+	struct sctp_ulpevent *ev = NULL;
+
+	if (!sctp_ulpevent_type_enabled(SCTP_PARTIAL_DELIVERY_EVENT,
+					&sctp_sk(sk)->subscribe))
+		return;
+
+	ev = sctp_ulpevent_make_pdapi(ulpq->asoc, SCTP_PARTIAL_DELIVERY_ABORTED,
+				      sid, mid, flags, gfp);
+	if (ev) {
+		__skb_queue_tail(&sk->sk_receive_queue, sctp_event2skb(ev));
+
+		if (!sctp_sk(sk)->data_ready_signalled) {
+			sctp_sk(sk)->data_ready_signalled = 1;
+			sk->sk_data_ready(sk);
+		}
+	}
+}
+
+static void sctp_intl_reap_ordered(struct sctp_ulpq *ulpq, __u16 sid)
+{
+	struct sctp_stream *stream = &ulpq->asoc->stream;
+	struct sctp_ulpevent *cevent, *event = NULL;
+	struct sk_buff_head *lobby = &ulpq->lobby;
+	struct sk_buff *pos, *tmp;
+	struct sk_buff_head temp;
+	__u16 csid;
+	__u32 cmid;
+
+	skb_queue_head_init(&temp);
+	sctp_skb_for_each(pos, lobby, tmp) {
+		cevent = (struct sctp_ulpevent *)pos->cb;
+		csid = cevent->stream;
+		cmid = cevent->mid;
+
+		if (csid > sid)
+			break;
+
+		if (csid < sid)
+			continue;
+
+		if (!MID_lt(cmid, sctp_mid_peek(stream, in, csid)))
+			break;
+
+		__skb_unlink(pos, lobby);
+		if (!event)
+			event = sctp_skb2event(pos);
+
+		__skb_queue_tail(&temp, pos);
+	}
+
+	if (!event && pos != (struct sk_buff *)lobby) {
+		cevent = (struct sctp_ulpevent *)pos->cb;
+		csid = cevent->stream;
+		cmid = cevent->mid;
+
+		if (csid == sid && cmid == sctp_mid_peek(stream, in, csid)) {
+			sctp_mid_next(stream, in, csid);
+			__skb_unlink(pos, lobby);
+			__skb_queue_tail(&temp, pos);
+			event = sctp_skb2event(pos);
+		}
+	}
+
+	if (event) {
+		sctp_intl_retrieve_ordered(ulpq, event);
+		sctp_enqueue_event(ulpq, event);
+	}
+}
+
+static void sctp_intl_abort_pd(struct sctp_ulpq *ulpq, gfp_t gfp)
+{
+	struct sctp_stream *stream = &ulpq->asoc->stream;
+	__u16 sid;
+
+	for (sid = 0; sid < stream->incnt; sid++) {
+		struct sctp_stream_in *sin = &stream->in[sid];
+		__u32 mid;
+
+		if (sin->pd_mode) {
+			sin->pd_mode = 0;
+
+			mid = sin->mid;
+			sctp_intl_stream_abort_pd(ulpq, sid, mid, 0, gfp);
+			sctp_mid_skip(stream, in, sid, mid);
+
+			sctp_intl_reap_ordered(ulpq, sid);
+		}
+	}
+
+	/* intl abort pd happens only when all data needs to be cleaned */
+	sctp_ulpq_flush(ulpq);
+}
+
 static struct sctp_stream_interleave sctp_stream_interleave_0 = {
 	.data_chunk_len		= sizeof(struct sctp_data_chunk),
 	/* DATA process functions */
@@ -662,6 +759,7 @@ static struct sctp_stream_interleave sctp_stream_interleave_0 = {
 	.enqueue_event		= sctp_ulpq_tail_event,
 	.renege_events		= sctp_ulpq_renege,
 	.start_pd		= sctp_ulpq_partial_delivery,
+	.abort_pd		= sctp_ulpq_abort_pd,
 };
 
 static struct sctp_stream_interleave sctp_stream_interleave_1 = {
@@ -674,6 +772,7 @@ static struct sctp_stream_interleave sctp_stream_interleave_1 = {
 	.enqueue_event		= sctp_enqueue_event,
 	.renege_events		= sctp_renege_events,
 	.start_pd		= sctp_intl_start_pd,
+	.abort_pd		= sctp_intl_abort_pd,
 };
 
 void sctp_stream_interleave_init(struct sctp_stream *stream)
diff --git a/net/sctp/ulpevent.c b/net/sctp/ulpevent.c
index d3218f3..84207ad 100644
--- a/net/sctp/ulpevent.c
+++ b/net/sctp/ulpevent.c
@@ -730,8 +730,9 @@ struct sctp_ulpevent *sctp_ulpevent_make_rcvmsg(struct sctp_association *asoc,
  *   various events.
  */
 struct sctp_ulpevent *sctp_ulpevent_make_pdapi(
-	const struct sctp_association *asoc, __u32 indication,
-	gfp_t gfp)
+					const struct sctp_association *asoc,
+					__u32 indication, __u32 sid, __u32 seq,
+					__u32 flags, gfp_t gfp)
 {
 	struct sctp_ulpevent *event;
 	struct sctp_pdapi_event *pd;
@@ -752,7 +753,9 @@ struct sctp_ulpevent *sctp_ulpevent_make_pdapi(
 	 *   Currently unused.
 	 */
 	pd->pdapi_type = SCTP_PARTIAL_DELIVERY_EVENT;
-	pd->pdapi_flags = 0;
+	pd->pdapi_flags = flags;
+	pd->pdapi_stream = sid;
+	pd->pdapi_seq = seq;
 
 	/* pdapi_length: 32 bits (unsigned integer)
 	 *
diff --git a/net/sctp/ulpqueue.c b/net/sctp/ulpqueue.c
index 76ec514..dd53daa 100644
--- a/net/sctp/ulpqueue.c
+++ b/net/sctp/ulpqueue.c
@@ -1144,7 +1144,7 @@ void sctp_ulpq_abort_pd(struct sctp_ulpq *ulpq, gfp_t gfp)
 				       &sctp_sk(sk)->subscribe))
 		ev = sctp_ulpevent_make_pdapi(ulpq->asoc,
 					      SCTP_PARTIAL_DELIVERY_ABORTED,
-					      gfp);
+					      0, 0, 0, gfp);
 	if (ev)
 		__skb_queue_tail(&sk->sk_receive_queue, sctp_event2skb(ev));
 
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCHv2 net-next 12/12] sctp: add support for the process of unordered idata
  2017-12-08 13:04                     ` [PATCHv2 net-next 11/12] sctp: implement abort_pd " Xin Long
@ 2017-12-08 13:04                       ` Xin Long
  0 siblings, 0 replies; 30+ messages in thread
From: Xin Long @ 2017-12-08 13:04 UTC (permalink / raw)
  To: network dev, linux-sctp; +Cc: Marcelo Ricardo Leitner, Neil Horman, davem

Unordered idata process is more complicated than unordered data:

 - It has to add mid into sctp_stream_out to save the next mid value,
   which is separated from ordered idata's.

 - To support pd for unordered idata, another mid and pd_mode need to
   be added to save the message id and pd state in sctp_stream_in.

 - To make  unordered idata reasm easier, it adds a new event queue
   to save frags for idata.

The patch mostly adds the samilar reasm functions for unordered idata
as ordered idata's, and also adjusts some other codes on assign_mid,
abort_pd and ulpevent_data for idata.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
---
 include/net/sctp/structs.h   |  14 +-
 include/net/sctp/ulpqueue.h  |   1 +
 net/sctp/socket.c            |  23 ++-
 net/sctp/stream_interleave.c | 377 ++++++++++++++++++++++++++++++++++++++++---
 net/sctp/ulpqueue.c          |   5 +
 5 files changed, 392 insertions(+), 28 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 73b315d..8ef638d 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -413,6 +413,14 @@ void sctp_stream_update(struct sctp_stream *stream, struct sctp_stream *new);
 
 #define sctp_stream_in(asoc, sid) (&(asoc)->stream.in[sid])
 
+/* What is the current MID_uo number for this stream? */
+#define sctp_mid_uo_peek(stream, type, sid) \
+	((stream)->type[sid].mid_uo)
+
+/* Return the next MID_uo number for this stream.  */
+#define sctp_mid_uo_next(stream, type, sid) \
+	((stream)->type[sid].mid_uo++)
+
 /*
  * Pointers to address related SCTP functions.
  * (i.e. things that depend on the address family.)
@@ -1379,8 +1387,9 @@ struct sctp_stream_out {
 		__u32 mid;
 		__u16 ssn;
 	};
-	__u8	state;
+	__u32 mid_uo;
 	struct sctp_stream_out_ext *ext;
+	__u8 state;
 };
 
 struct sctp_stream_in {
@@ -1388,8 +1397,11 @@ struct sctp_stream_in {
 		__u32 mid;
 		__u16 ssn;
 	};
+	__u32 mid_uo;
 	__u32 fsn;
+	__u32 fsn_uo;
 	char pd_mode;
+	char pd_mode_uo;
 };
 
 struct sctp_stream {
diff --git a/include/net/sctp/ulpqueue.h b/include/net/sctp/ulpqueue.h
index eb98c71..bb0ecba 100644
--- a/include/net/sctp/ulpqueue.h
+++ b/include/net/sctp/ulpqueue.h
@@ -45,6 +45,7 @@ struct sctp_ulpq {
 	char pd_mode;
 	struct sctp_association *asoc;
 	struct sk_buff_head reasm;
+	struct sk_buff_head reasm_uo;
 	struct sk_buff_head lobby;
 };
 
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index fe2cab9..db8e7cc 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -201,6 +201,22 @@ static void sctp_for_each_tx_datachunk(struct sctp_association *asoc,
 		cb(chunk);
 }
 
+static void sctp_for_each_rx_skb(struct sctp_association *asoc, struct sock *sk,
+				 void (*cb)(struct sk_buff *, struct sock *))
+
+{
+	struct sk_buff *skb, *tmp;
+
+	sctp_skb_for_each(skb, &asoc->ulpq.lobby, tmp)
+		cb(skb, sk);
+
+	sctp_skb_for_each(skb, &asoc->ulpq.reasm, tmp)
+		cb(skb, sk);
+
+	sctp_skb_for_each(skb, &asoc->ulpq.reasm_uo, tmp)
+		cb(skb, sk);
+}
+
 /* Verify that this is a valid address. */
 static inline int sctp_verify_addr(struct sock *sk, union sctp_addr *addr,
 				   int len)
@@ -1554,6 +1570,7 @@ static void sctp_close(struct sock *sk, long timeout)
 
 		if (data_was_unread || !skb_queue_empty(&asoc->ulpq.lobby) ||
 		    !skb_queue_empty(&asoc->ulpq.reasm) ||
+		    !skb_queue_empty(&asoc->ulpq.reasm_uo) ||
 		    (sock_flag(sk, SOCK_LINGER) && !sk->sk_lingertime)) {
 			struct sctp_chunk *chunk;
 
@@ -8507,11 +8524,7 @@ static void sctp_sock_migrate(struct sock *oldsk, struct sock *newsk,
 
 	}
 
-	sctp_skb_for_each(skb, &assoc->ulpq.reasm, tmp)
-		sctp_skb_set_owner_r_frag(skb, newsk);
-
-	sctp_skb_for_each(skb, &assoc->ulpq.lobby, tmp)
-		sctp_skb_set_owner_r_frag(skb, newsk);
+	sctp_for_each_rx_skb(assoc, newsk, sctp_skb_set_owner_r_frag);
 
 	/* Set the type of socket to indicate that it is peeled off from the
 	 * original UDP-style socket or created with the accept() call on a
diff --git a/net/sctp/stream_interleave.c b/net/sctp/stream_interleave.c
index d15645e..87b9417 100644
--- a/net/sctp/stream_interleave.c
+++ b/net/sctp/stream_interleave.c
@@ -74,12 +74,10 @@ static void sctp_chunk_assign_mid(struct sctp_chunk *chunk)
 
 	list_for_each_entry(lchunk, &chunk->msg->chunks, frag_list) {
 		struct sctp_idatahdr *hdr;
+		__u32 mid;
 
 		lchunk->has_mid = 1;
 
-		if (lchunk->chunk_hdr->flags & SCTP_DATA_UNORDERED)
-			continue;
-
 		hdr = lchunk->subh.idata_hdr;
 
 		if (lchunk->chunk_hdr->flags & SCTP_DATA_FIRST_FRAG)
@@ -87,10 +85,16 @@ static void sctp_chunk_assign_mid(struct sctp_chunk *chunk)
 		else
 			hdr->fsn = htonl(cfsn++);
 
-		if (lchunk->chunk_hdr->flags & SCTP_DATA_LAST_FRAG)
-			hdr->mid = htonl(sctp_mid_next(stream, out, sid));
-		else
-			hdr->mid = htonl(sctp_mid_peek(stream, out, sid));
+		if (lchunk->chunk_hdr->flags & SCTP_DATA_UNORDERED) {
+			mid = lchunk->chunk_hdr->flags & SCTP_DATA_LAST_FRAG ?
+				sctp_mid_uo_next(stream, out, sid) :
+				sctp_mid_uo_peek(stream, out, sid);
+		} else {
+			mid = lchunk->chunk_hdr->flags & SCTP_DATA_LAST_FRAG ?
+				sctp_mid_next(stream, out, sid) :
+				sctp_mid_peek(stream, out, sid);
+		}
+		hdr->mid = htonl(mid);
 	}
 }
 
@@ -449,9 +453,6 @@ static struct sctp_ulpevent *sctp_intl_order(struct sctp_ulpq *ulpq,
 	struct sctp_stream *stream;
 	__u16 sid;
 
-	if (event->msg_flags & SCTP_DATA_UNORDERED)
-		return event;
-
 	stream  = &ulpq->asoc->stream;
 	sid = event->stream;
 
@@ -512,6 +513,317 @@ static int sctp_enqueue_event(struct sctp_ulpq *ulpq,
 	return 0;
 }
 
+static void sctp_intl_store_reasm_uo(struct sctp_ulpq *ulpq,
+				     struct sctp_ulpevent *event)
+{
+	struct sctp_ulpevent *cevent;
+	struct sk_buff *pos;
+
+	pos = skb_peek_tail(&ulpq->reasm_uo);
+	if (!pos) {
+		__skb_queue_tail(&ulpq->reasm_uo, sctp_event2skb(event));
+		return;
+	}
+
+	cevent = sctp_skb2event(pos);
+
+	if (event->stream == cevent->stream &&
+	    event->mid == cevent->mid &&
+	    (cevent->msg_flags & SCTP_DATA_FIRST_FRAG ||
+	     (!(event->msg_flags & SCTP_DATA_FIRST_FRAG) &&
+	      event->fsn > cevent->fsn))) {
+		__skb_queue_tail(&ulpq->reasm_uo, sctp_event2skb(event));
+		return;
+	}
+
+	if ((event->stream == cevent->stream &&
+	     MID_lt(cevent->mid, event->mid)) ||
+	    event->stream > cevent->stream) {
+		__skb_queue_tail(&ulpq->reasm_uo, sctp_event2skb(event));
+		return;
+	}
+
+	skb_queue_walk(&ulpq->reasm_uo, pos) {
+		cevent = sctp_skb2event(pos);
+
+		if (event->stream < cevent->stream ||
+		    (event->stream == cevent->stream &&
+		     MID_lt(event->mid, cevent->mid)))
+			break;
+
+		if (event->stream == cevent->stream &&
+		    event->mid == cevent->mid &&
+		    !(cevent->msg_flags & SCTP_DATA_FIRST_FRAG) &&
+		    (event->msg_flags & SCTP_DATA_FIRST_FRAG ||
+		     event->fsn < cevent->fsn))
+			break;
+	}
+
+	__skb_queue_before(&ulpq->reasm_uo, pos, sctp_event2skb(event));
+}
+
+static struct sctp_ulpevent *sctp_intl_retrieve_partial_uo(
+						struct sctp_ulpq *ulpq,
+						struct sctp_ulpevent *event)
+{
+	struct sk_buff *first_frag = NULL;
+	struct sk_buff *last_frag = NULL;
+	struct sctp_ulpevent *retval;
+	struct sctp_stream_in *sin;
+	struct sk_buff *pos;
+	__u32 next_fsn = 0;
+	int is_last = 0;
+
+	sin = sctp_stream_in(ulpq->asoc, event->stream);
+
+	skb_queue_walk(&ulpq->reasm_uo, pos) {
+		struct sctp_ulpevent *cevent = sctp_skb2event(pos);
+
+		if (cevent->stream < event->stream)
+			continue;
+		if (cevent->stream > event->stream)
+			break;
+
+		if (MID_lt(cevent->mid, sin->mid_uo))
+			continue;
+		if (MID_lt(sin->mid_uo, cevent->mid))
+			break;
+
+		switch (cevent->msg_flags & SCTP_DATA_FRAG_MASK) {
+		case SCTP_DATA_FIRST_FRAG:
+			goto out;
+		case SCTP_DATA_MIDDLE_FRAG:
+			if (!first_frag) {
+				if (cevent->fsn == sin->fsn_uo) {
+					first_frag = pos;
+					last_frag = pos;
+					next_fsn = cevent->fsn + 1;
+				}
+			} else if (cevent->fsn == next_fsn) {
+				last_frag = pos;
+				next_fsn++;
+			} else {
+				goto out;
+			}
+			break;
+		case SCTP_DATA_LAST_FRAG:
+			if (!first_frag) {
+				if (cevent->fsn == sin->fsn_uo) {
+					first_frag = pos;
+					last_frag = pos;
+					next_fsn = 0;
+					is_last = 1;
+				}
+			} else if (cevent->fsn == next_fsn) {
+				last_frag = pos;
+				next_fsn = 0;
+				is_last = 1;
+			}
+			goto out;
+		default:
+			goto out;
+		}
+	}
+
+out:
+	if (!first_frag)
+		return NULL;
+
+	retval = sctp_make_reassembled_event(sock_net(ulpq->asoc->base.sk),
+					     &ulpq->reasm_uo, first_frag,
+					     last_frag);
+	if (retval) {
+		sin->fsn_uo = next_fsn;
+		if (is_last) {
+			retval->msg_flags |= MSG_EOR;
+			sin->pd_mode_uo = 0;
+		}
+	}
+
+	return retval;
+}
+
+static struct sctp_ulpevent *sctp_intl_retrieve_reassembled_uo(
+						struct sctp_ulpq *ulpq,
+						struct sctp_ulpevent *event)
+{
+	struct sctp_association *asoc = ulpq->asoc;
+	struct sk_buff *pos, *first_frag = NULL;
+	struct sctp_ulpevent *retval = NULL;
+	struct sk_buff *pd_first = NULL;
+	struct sk_buff *pd_last = NULL;
+	struct sctp_stream_in *sin;
+	__u32 next_fsn = 0;
+	__u32 pd_point = 0;
+	__u32 pd_len = 0;
+	__u32 mid = 0;
+
+	sin = sctp_stream_in(ulpq->asoc, event->stream);
+
+	skb_queue_walk(&ulpq->reasm_uo, pos) {
+		struct sctp_ulpevent *cevent = sctp_skb2event(pos);
+
+		if (cevent->stream < event->stream)
+			continue;
+		if (cevent->stream > event->stream)
+			break;
+
+		if (MID_lt(cevent->mid, event->mid))
+			continue;
+		if (MID_lt(event->mid, cevent->mid))
+			break;
+
+		switch (cevent->msg_flags & SCTP_DATA_FRAG_MASK) {
+		case SCTP_DATA_FIRST_FRAG:
+			if (!sin->pd_mode_uo) {
+				sin->mid_uo = cevent->mid;
+				pd_first = pos;
+				pd_last = pos;
+				pd_len = pos->len;
+			}
+
+			first_frag = pos;
+			next_fsn = 0;
+			mid = cevent->mid;
+			break;
+
+		case SCTP_DATA_MIDDLE_FRAG:
+			if (first_frag && cevent->mid == mid &&
+			    cevent->fsn == next_fsn) {
+				next_fsn++;
+				if (pd_first) {
+					pd_last = pos;
+					pd_len += pos->len;
+				}
+			} else {
+				first_frag = NULL;
+			}
+			break;
+
+		case SCTP_DATA_LAST_FRAG:
+			if (first_frag && cevent->mid == mid &&
+			    cevent->fsn == next_fsn)
+				goto found;
+			else
+				first_frag = NULL;
+			break;
+		}
+	}
+
+	if (!pd_first)
+		goto out;
+
+	pd_point = sctp_sk(asoc->base.sk)->pd_point;
+	if (pd_point && pd_point <= pd_len) {
+		retval = sctp_make_reassembled_event(sock_net(asoc->base.sk),
+						     &ulpq->reasm_uo,
+						     pd_first, pd_last);
+		if (retval) {
+			sin->fsn_uo = next_fsn;
+			sin->pd_mode_uo = 1;
+		}
+	}
+	goto out;
+
+found:
+	retval = sctp_make_reassembled_event(sock_net(asoc->base.sk),
+					     &ulpq->reasm_uo,
+					     first_frag, pos);
+	if (retval)
+		retval->msg_flags |= MSG_EOR;
+
+out:
+	return retval;
+}
+
+static struct sctp_ulpevent *sctp_intl_reasm_uo(struct sctp_ulpq *ulpq,
+						struct sctp_ulpevent *event)
+{
+	struct sctp_ulpevent *retval = NULL;
+	struct sctp_stream_in *sin;
+
+	if (SCTP_DATA_NOT_FRAG == (event->msg_flags & SCTP_DATA_FRAG_MASK)) {
+		event->msg_flags |= MSG_EOR;
+		return event;
+	}
+
+	sctp_intl_store_reasm_uo(ulpq, event);
+
+	sin = sctp_stream_in(ulpq->asoc, event->stream);
+	if (sin->pd_mode_uo && event->mid == sin->mid_uo &&
+	    event->fsn == sin->fsn_uo)
+		retval = sctp_intl_retrieve_partial_uo(ulpq, event);
+
+	if (!retval)
+		retval = sctp_intl_retrieve_reassembled_uo(ulpq, event);
+
+	return retval;
+}
+
+static struct sctp_ulpevent *sctp_intl_retrieve_first_uo(struct sctp_ulpq *ulpq)
+{
+	struct sctp_stream_in *csin, *sin = NULL;
+	struct sk_buff *first_frag = NULL;
+	struct sk_buff *last_frag = NULL;
+	struct sctp_ulpevent *retval;
+	struct sk_buff *pos;
+	__u32 next_fsn = 0;
+	__u16 sid = 0;
+
+	skb_queue_walk(&ulpq->reasm_uo, pos) {
+		struct sctp_ulpevent *cevent = sctp_skb2event(pos);
+
+		csin = sctp_stream_in(ulpq->asoc, cevent->stream);
+		if (csin->pd_mode_uo)
+			continue;
+
+		switch (cevent->msg_flags & SCTP_DATA_FRAG_MASK) {
+		case SCTP_DATA_FIRST_FRAG:
+			if (first_frag)
+				goto out;
+			first_frag = pos;
+			last_frag = pos;
+			next_fsn = 0;
+			sin = csin;
+			sid = cevent->stream;
+			sin->mid_uo = cevent->mid;
+			break;
+		case SCTP_DATA_MIDDLE_FRAG:
+			if (!first_frag)
+				break;
+			if (cevent->stream == sid &&
+			    cevent->mid == sin->mid_uo &&
+			    cevent->fsn == next_fsn) {
+				next_fsn++;
+				last_frag = pos;
+			} else {
+				goto out;
+			}
+			break;
+		case SCTP_DATA_LAST_FRAG:
+			if (first_frag)
+				goto out;
+			break;
+		default:
+			break;
+		}
+	}
+
+	if (!first_frag)
+		return NULL;
+
+out:
+	retval = sctp_make_reassembled_event(sock_net(ulpq->asoc->base.sk),
+					     &ulpq->reasm_uo, first_frag,
+					     last_frag);
+	if (retval) {
+		sin->fsn_uo = next_fsn;
+		sin->pd_mode_uo = 1;
+	}
+
+	return retval;
+}
+
 static int sctp_ulpevent_idata(struct sctp_ulpq *ulpq,
 			       struct sctp_chunk *chunk, gfp_t gfp)
 {
@@ -529,12 +841,16 @@ static int sctp_ulpevent_idata(struct sctp_ulpq *ulpq,
 	else
 		event->fsn = ntohl(chunk->subh.idata_hdr->fsn);
 
-	event = sctp_intl_reasm(ulpq, event);
-	if (event && event->msg_flags & MSG_EOR) {
-		skb_queue_head_init(&temp);
-		__skb_queue_tail(&temp, sctp_event2skb(event));
+	if (!(event->msg_flags & SCTP_DATA_UNORDERED)) {
+		event = sctp_intl_reasm(ulpq, event);
+		if (event && event->msg_flags & MSG_EOR) {
+			skb_queue_head_init(&temp);
+			__skb_queue_tail(&temp, sctp_event2skb(event));
 
-		event = sctp_intl_order(ulpq, event);
+			event = sctp_intl_order(ulpq, event);
+		}
+	} else {
+		event = sctp_intl_reasm_uo(ulpq, event);
 	}
 
 	if (event) {
@@ -614,14 +930,21 @@ static void sctp_intl_start_pd(struct sctp_ulpq *ulpq, gfp_t gfp)
 {
 	struct sctp_ulpevent *event;
 
-	if (skb_queue_empty(&ulpq->reasm))
-		return;
+	if (!skb_queue_empty(&ulpq->reasm)) {
+		do {
+			event = sctp_intl_retrieve_first(ulpq);
+			if (event)
+				sctp_enqueue_event(ulpq, event);
+		} while (event);
+	}
 
-	do {
-		event = sctp_intl_retrieve_first(ulpq);
-		if (event)
-			sctp_enqueue_event(ulpq, event);
-	} while (event);
+	if (!skb_queue_empty(&ulpq->reasm_uo)) {
+		do {
+			event = sctp_intl_retrieve_first_uo(ulpq);
+			if (event)
+				sctp_enqueue_event(ulpq, event);
+		} while (event);
+	}
 }
 
 static void sctp_renege_events(struct sctp_ulpq *ulpq, struct sctp_chunk *chunk,
@@ -643,6 +966,9 @@ static void sctp_renege_events(struct sctp_ulpq *ulpq, struct sctp_chunk *chunk,
 		if (freed < needed)
 			freed += sctp_ulpq_renege_list(ulpq, &ulpq->reasm,
 						       needed);
+		if (freed < needed)
+			freed += sctp_ulpq_renege_list(ulpq, &ulpq->reasm_uo,
+						       needed);
 	}
 
 	if (chunk && freed >= needed)
@@ -734,6 +1060,13 @@ static void sctp_intl_abort_pd(struct sctp_ulpq *ulpq, gfp_t gfp)
 		struct sctp_stream_in *sin = &stream->in[sid];
 		__u32 mid;
 
+		if (sin->pd_mode_uo) {
+			sin->pd_mode_uo = 0;
+
+			mid = sin->mid_uo;
+			sctp_intl_stream_abort_pd(ulpq, sid, mid, 0x1, gfp);
+		}
+
 		if (sin->pd_mode) {
 			sin->pd_mode = 0;
 
diff --git a/net/sctp/ulpqueue.c b/net/sctp/ulpqueue.c
index dd53daa..97fae53 100644
--- a/net/sctp/ulpqueue.c
+++ b/net/sctp/ulpqueue.c
@@ -60,6 +60,7 @@ struct sctp_ulpq *sctp_ulpq_init(struct sctp_ulpq *ulpq,
 
 	ulpq->asoc = asoc;
 	skb_queue_head_init(&ulpq->reasm);
+	skb_queue_head_init(&ulpq->reasm_uo);
 	skb_queue_head_init(&ulpq->lobby);
 	ulpq->pd_mode  = 0;
 
@@ -83,6 +84,10 @@ void sctp_ulpq_flush(struct sctp_ulpq *ulpq)
 		sctp_ulpevent_free(event);
 	}
 
+	while ((skb = __skb_dequeue(&ulpq->reasm_uo)) != NULL) {
+		event = sctp_skb2event(skb);
+		sctp_ulpevent_free(event);
+	}
 }
 
 /* Dispose of a ulpqueue.  */
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* RE: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave
  2017-12-08 13:04       ` [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave Xin Long
  2017-12-08 13:04         ` [PATCHv2 net-next 05/12] sctp: implement assign_number " Xin Long
@ 2017-12-08 14:06         ` David Laight
  2017-12-08 14:56           ` Marcelo Ricardo Leitner
  1 sibling, 1 reply; 30+ messages in thread
From: David Laight @ 2017-12-08 14:06 UTC (permalink / raw)
  To: 'Xin Long', network dev, linux-sctp
  Cc: Marcelo Ricardo Leitner, Neil Horman, davem

From: Xin Long
> Sent: 08 December 2017 13:04
...
> @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
>  				frag |= SCTP_DATA_SACK_IMM;
>  		}
> 
> -		chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> -						 0, GFP_KERNEL);
> +		chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
> +						       GFP_KERNEL);

I know that none of the sctp code is very optimised, but that indirect
call is going to be horrid.

	David

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave
  2017-12-08 14:06         ` [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave David Laight
@ 2017-12-08 14:56           ` Marcelo Ricardo Leitner
  2017-12-08 15:01             ` David Laight
  2017-12-08 15:37             ` Neil Horman
  0 siblings, 2 replies; 30+ messages in thread
From: Marcelo Ricardo Leitner @ 2017-12-08 14:56 UTC (permalink / raw)
  To: David Laight
  Cc: 'Xin Long', network dev, linux-sctp, Neil Horman, davem

On Fri, Dec 08, 2017 at 02:06:04PM +0000, David Laight wrote:
> From: Xin Long
> > Sent: 08 December 2017 13:04
> ...
> > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
> >  				frag |= SCTP_DATA_SACK_IMM;
> >  		}
> > 
> > -		chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> > -						 0, GFP_KERNEL);
> > +		chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
> > +						       GFP_KERNEL);
> 
> I know that none of the sctp code is very optimised, but that indirect
> call is going to be horrid.

Yeah.. but there is no way to avoid the double derreference
considering we only have the asoc pointer in there and we have to
reach the contents of the data chunk operations struct, and the .si
part is the same as 'stream' part as it's a constant offset.

Due to the for() in there, we could add a variable to store
asoc->stream.si outside the for and then we can do only a single deref
inside it. Xin, can you please try and see if the generated code is
different?

Other suggestions?

  Marcelo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave
  2017-12-08 14:56           ` Marcelo Ricardo Leitner
@ 2017-12-08 15:01             ` David Laight
  2017-12-08 15:15               ` 'Marcelo Ricardo Leitner'
  2017-12-08 15:37             ` Neil Horman
  1 sibling, 1 reply; 30+ messages in thread
From: David Laight @ 2017-12-08 15:01 UTC (permalink / raw)
  To: 'Marcelo Ricardo Leitner'
  Cc: 'Xin Long', network dev, linux-sctp, Neil Horman, davem

From: Marcelo Ricardo Leitner
> Sent: 08 December 2017 14:57
> 
> On Fri, Dec 08, 2017 at 02:06:04PM +0000, David Laight wrote:
> > From: Xin Long
> > > Sent: 08 December 2017 13:04
> > ...
> > > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
> > >  				frag |= SCTP_DATA_SACK_IMM;
> > >  		}
> > >
> > > -		chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> > > -						 0, GFP_KERNEL);
> > > +		chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
> > > +						       GFP_KERNEL);
> >
> > I know that none of the sctp code is very optimised, but that indirect
> > call is going to be horrid.
> 
> Yeah.. but there is no way to avoid the double derreference
> considering we only have the asoc pointer in there and we have to
> reach the contents of the data chunk operations struct, and the .si
> part is the same as 'stream' part as it's a constant offset.
...

It isn't only the double indirect, the indirect call itself isn't 'fun'.

I think there are other hot paths where you've replaced a sizeof()
with a ?: clause.
Caching the result might be much better.

	David

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave
  2017-12-08 15:01             ` David Laight
@ 2017-12-08 15:15               ` 'Marcelo Ricardo Leitner'
  2017-12-08 15:32                 ` David Laight
  0 siblings, 1 reply; 30+ messages in thread
From: 'Marcelo Ricardo Leitner' @ 2017-12-08 15:15 UTC (permalink / raw)
  To: David Laight
  Cc: 'Xin Long', network dev, linux-sctp, Neil Horman, davem

On Fri, Dec 08, 2017 at 03:01:31PM +0000, David Laight wrote:
> From: Marcelo Ricardo Leitner
> > Sent: 08 December 2017 14:57
> > 
> > On Fri, Dec 08, 2017 at 02:06:04PM +0000, David Laight wrote:
> > > From: Xin Long
> > > > Sent: 08 December 2017 13:04
> > > ...
> > > > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
> > > >  				frag |= SCTP_DATA_SACK_IMM;
> > > >  		}
> > > >
> > > > -		chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> > > > -						 0, GFP_KERNEL);
> > > > +		chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
> > > > +						       GFP_KERNEL);
> > >
> > > I know that none of the sctp code is very optimised, but that indirect
> > > call is going to be horrid.
> > 
> > Yeah.. but there is no way to avoid the double derreference
> > considering we only have the asoc pointer in there and we have to
> > reach the contents of the data chunk operations struct, and the .si
> > part is the same as 'stream' part as it's a constant offset.
> ...
> 
> It isn't only the double indirect, the indirect call itself isn't 'fun'.

I meant in this context.

The indirect call is so we don't have to flood the stack with
if (old data chunk fmt) {
	...
} else {
	...
}

So instead of this, we now have some key operations identified and
wrapped up behind this struct, allowing us to abstract whatever data
chunk format it is.

> 
> I think there are other hot paths where you've replaced a sizeof()
> with a ?: clause.
> Caching the result might be much better.

The only new ?: clause I could find this patchset is on patch 12 and
has nothing to do with sizeof().

The sizeof() results are indeed cached, as you can see in patch 4:
+static struct sctp_stream_interleave sctp_stream_interleave_0 = {
+       .data_chunk_len         = sizeof(struct sctp_data_chunk),
and the two helpers on it at the begining of the patch.

  Marcelo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave
  2017-12-08 15:15               ` 'Marcelo Ricardo Leitner'
@ 2017-12-08 15:32                 ` David Laight
  2017-12-08 16:02                   ` 'Marcelo Ricardo Leitner'
  0 siblings, 1 reply; 30+ messages in thread
From: David Laight @ 2017-12-08 15:32 UTC (permalink / raw)
  To: 'Marcelo Ricardo Leitner'
  Cc: 'Xin Long', network dev, linux-sctp, Neil Horman, davem

From: 'Marcelo Ricardo Leitner'
> Sent: 08 December 2017 15:16
> On Fri, Dec 08, 2017 at 03:01:31PM +0000, David Laight wrote:
> > From: Marcelo Ricardo Leitner
> > > Sent: 08 December 2017 14:57
> > >
> > > On Fri, Dec 08, 2017 at 02:06:04PM +0000, David Laight wrote:
> > > > From: Xin Long
> > > > > Sent: 08 December 2017 13:04
> > > > ...
> > > > > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
> > > > >  				frag |= SCTP_DATA_SACK_IMM;
> > > > >  		}
> > > > >
> > > > > -		chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> > > > > -						 0, GFP_KERNEL);
> > > > > +		chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
> > > > > +						       GFP_KERNEL);
> > > >
> > > > I know that none of the sctp code is very optimised, but that indirect
> > > > call is going to be horrid.
> > >
> > > Yeah.. but there is no way to avoid the double derreference
> > > considering we only have the asoc pointer in there and we have to
> > > reach the contents of the data chunk operations struct, and the .si
> > > part is the same as 'stream' part as it's a constant offset.
> > ...
> >
> > It isn't only the double indirect, the indirect call itself isn't 'fun'.
> 
> I meant in this context.
> 
> The indirect call is so we don't have to flood the stack with
> if (old data chunk fmt) {
> 	...
> } else {
> 	...
> }
> 
> So instead of this, we now have some key operations identified and
> wrapped up behind this struct, allowing us to abstract whatever data
> chunk format it is.

Nothing wrong with:
#define foo(asoc, ...) \
	if (asoc->new_fmt) \
		foo_new(asoc, __VA_LIST__); \
	else \
		foo_old(asoc, __VA_LIST__);

> > I think there are other hot paths where you've replaced a sizeof()
> > with a ?: clause.
> > Caching the result might be much better.
> 
> The only new ?: clause I could find this patchset is on patch 12 and
> has nothing to do with sizeof().
> 
> The sizeof() results are indeed cached, as you can see in patch 4:
> +static struct sctp_stream_interleave sctp_stream_interleave_0 = {
> +       .data_chunk_len         = sizeof(struct sctp_data_chunk),
> and the two helpers on it at the begining of the patch.

I was getting two bits mixed up.
But the code that reads data_chunk_len is following an awful lot of pointers.

	David

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave
  2017-12-08 14:56           ` Marcelo Ricardo Leitner
  2017-12-08 15:01             ` David Laight
@ 2017-12-08 15:37             ` Neil Horman
  2017-12-08 16:00               ` Marcelo Ricardo Leitner
  1 sibling, 1 reply; 30+ messages in thread
From: Neil Horman @ 2017-12-08 15:37 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: David Laight, 'Xin Long', network dev, linux-sctp, davem

On Fri, Dec 08, 2017 at 12:56:30PM -0200, Marcelo Ricardo Leitner wrote:
> On Fri, Dec 08, 2017 at 02:06:04PM +0000, David Laight wrote:
> > From: Xin Long
> > > Sent: 08 December 2017 13:04
> > ...
> > > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
> > >  				frag |= SCTP_DATA_SACK_IMM;
> > >  		}
> > > 
> > > -		chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> > > -						 0, GFP_KERNEL);
> > > +		chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
> > > +						       GFP_KERNEL);
> > 
> > I know that none of the sctp code is very optimised, but that indirect
> > call is going to be horrid.
> 
> Yeah.. but there is no way to avoid the double derreference
> considering we only have the asoc pointer in there and we have to
> reach the contents of the data chunk operations struct, and the .si
> part is the same as 'stream' part as it's a constant offset.
> 
> Due to the for() in there, we could add a variable to store
> asoc->stream.si outside the for and then we can do only a single deref
> inside it. Xin, can you please try and see if the generated code is
> different?
> 
> Other suggestions?
> 
Is it worth replacing the si struct with an index/enum value, and indexing an
array of method pointer structs?  That would save you at least one dereference.

Alternatively you could preform the dereference in two steps (i.e. declare an si
pointer on the stack and set it equal to asoc->stream.si, then deref
si->make_datafrag at call time.  That will at least give the compiler an
opportunity to preload the first pointer.

Neil

>   Marcelo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave
  2017-12-08 15:37             ` Neil Horman
@ 2017-12-08 16:00               ` Marcelo Ricardo Leitner
  2017-12-08 16:04                 ` David Laight
  2017-12-08 16:17                 ` Xin Long
  0 siblings, 2 replies; 30+ messages in thread
From: Marcelo Ricardo Leitner @ 2017-12-08 16:00 UTC (permalink / raw)
  To: Neil Horman
  Cc: David Laight, 'Xin Long', network dev, linux-sctp, davem

On Fri, Dec 08, 2017 at 10:37:34AM -0500, Neil Horman wrote:
> On Fri, Dec 08, 2017 at 12:56:30PM -0200, Marcelo Ricardo Leitner wrote:
> > On Fri, Dec 08, 2017 at 02:06:04PM +0000, David Laight wrote:
> > > From: Xin Long
> > > > Sent: 08 December 2017 13:04
> > > ...
> > > > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
> > > >  				frag |= SCTP_DATA_SACK_IMM;
> > > >  		}
> > > > 
> > > > -		chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> > > > -						 0, GFP_KERNEL);
> > > > +		chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
> > > > +						       GFP_KERNEL);
> > > 
> > > I know that none of the sctp code is very optimised, but that indirect
> > > call is going to be horrid.
> > 
> > Yeah.. but there is no way to avoid the double derreference
> > considering we only have the asoc pointer in there and we have to
> > reach the contents of the data chunk operations struct, and the .si
> > part is the same as 'stream' part as it's a constant offset.
> > 
> > Due to the for() in there, we could add a variable to store
> > asoc->stream.si outside the for and then we can do only a single deref
> > inside it. Xin, can you please try and see if the generated code is
> > different?
> > 
> > Other suggestions?
> > 
> Is it worth replacing the si struct with an index/enum value, and indexing an
> array of method pointer structs?  That would save you at least one dereference.

Hmmm, maybe, yes. It would be like
sctp_stream_interleave[asoc->stream.si].make_datafrag(...)

Then same goes for pf->af, probably.

> 
> Alternatively you could preform the dereference in two steps (i.e. declare an si
> pointer on the stack and set it equal to asoc->stream.si, then deref
> si->make_datafrag at call time.  That will at least give the compiler an
> opportunity to preload the first pointer.

Yep, that was my 2nd paragraph above :-) but it only works for cases
such as this one.

  Marcelo

> 
> Neil
> 
> >   Marcelo
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave
  2017-12-08 15:32                 ` David Laight
@ 2017-12-08 16:02                   ` 'Marcelo Ricardo Leitner'
  0 siblings, 0 replies; 30+ messages in thread
From: 'Marcelo Ricardo Leitner' @ 2017-12-08 16:02 UTC (permalink / raw)
  To: David Laight
  Cc: 'Xin Long', network dev, linux-sctp, Neil Horman, davem

On Fri, Dec 08, 2017 at 03:32:54PM +0000, David Laight wrote:
> From: 'Marcelo Ricardo Leitner'
> > Sent: 08 December 2017 15:16
> > On Fri, Dec 08, 2017 at 03:01:31PM +0000, David Laight wrote:
> > > From: Marcelo Ricardo Leitner
> > > > Sent: 08 December 2017 14:57
> > > >
> > > > On Fri, Dec 08, 2017 at 02:06:04PM +0000, David Laight wrote:
> > > > > From: Xin Long
> > > > > > Sent: 08 December 2017 13:04
> > > > > ...
> > > > > > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
> > > > > >  				frag |= SCTP_DATA_SACK_IMM;
> > > > > >  		}
> > > > > >
> > > > > > -		chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> > > > > > -						 0, GFP_KERNEL);
> > > > > > +		chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
> > > > > > +						       GFP_KERNEL);
> > > > >
> > > > > I know that none of the sctp code is very optimised, but that indirect
> > > > > call is going to be horrid.
> > > >
> > > > Yeah.. but there is no way to avoid the double derreference
> > > > considering we only have the asoc pointer in there and we have to
> > > > reach the contents of the data chunk operations struct, and the .si
> > > > part is the same as 'stream' part as it's a constant offset.
> > > ...
> > >
> > > It isn't only the double indirect, the indirect call itself isn't 'fun'.
> > 
> > I meant in this context.
> > 
> > The indirect call is so we don't have to flood the stack with
> > if (old data chunk fmt) {
> > 	...
> > } else {
> > 	...
> > }
> > 
> > So instead of this, we now have some key operations identified and
> > wrapped up behind this struct, allowing us to abstract whatever data
> > chunk format it is.
> 
> Nothing wrong with:
> #define foo(asoc, ...) \
> 	if (asoc->new_fmt) \

Not all function pointers in sctp_stream_interleave have asoc as
a parameter, so we would have to have something like:

#define foo_asoc(asoc, ...) \
	if (asoc->new_fmt) \
		...

#define foo_chunk(chunk, ...) \
	if (chunk->asoc->new_fmt) \
		...

#define foo_ulpq(ulpq, ...) \
	if (ulpq->asoc->new_fmt) \
		...

and we're pretty much back to double deref.
Maybe some reworking on the parameters could alleviate some of these,
but not all.

> 		foo_new(asoc, __VA_LIST__); \
> 	else \
> 		foo_old(asoc, __VA_LIST__);
> 
> > > I think there are other hot paths where you've replaced a sizeof()
> > > with a ?: clause.
> > > Caching the result might be much better.
> > 
> > The only new ?: clause I could find this patchset is on patch 12 and
> > has nothing to do with sizeof().
> > 
> > The sizeof() results are indeed cached, as you can see in patch 4:
> > +static struct sctp_stream_interleave sctp_stream_interleave_0 = {
> > +       .data_chunk_len         = sizeof(struct sctp_data_chunk),
> > and the two helpers on it at the begining of the patch.
> 
> I was getting two bits mixed up.
> But the code that reads data_chunk_len is following an awful lot of pointers.

>From path 4:
        max_data = asoc->pathmtu -
                   sctp_sk(asoc->base.sk)->pf->af->net_header_len -
-                  sizeof(struct sctphdr) - sizeof(struct sctp_data_chunk);
+                  sizeof(struct sctphdr) - sctp_datachk_len(&asoc->stream);

You're worried with the double deref in sctp_datachk_len() but on the
line right above it we have 4 derefs. There are several other cases
of double deref in sctp stack even on hot path. Not sure why you're
picking on this one, but ok.

  Marcelo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave
  2017-12-08 16:00               ` Marcelo Ricardo Leitner
@ 2017-12-08 16:04                 ` David Laight
  2017-12-08 16:08                   ` Neil Horman
  2017-12-08 16:17                 ` Xin Long
  1 sibling, 1 reply; 30+ messages in thread
From: David Laight @ 2017-12-08 16:04 UTC (permalink / raw)
  To: 'Marcelo Ricardo Leitner', Neil Horman
  Cc: 'Xin Long', network dev, linux-sctp, davem

From: Marcelo Ricardo Leitner
> Sent: 08 December 2017 16:00
...
> > Is it worth replacing the si struct with an index/enum value, and indexing an
> > array of method pointer structs?  That would save you at least one dereference.
> 
> Hmmm, maybe, yes. It would be like
> sctp_stream_interleave[asoc->stream.si].make_datafrag(...)

If you only expect 2 choices then an if () is likely
to produce better code that the above.

The actual implementation can be hidden inside a #define
or static inline function.

	David

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave
  2017-12-08 16:04                 ` David Laight
@ 2017-12-08 16:08                   ` Neil Horman
  2017-12-08 20:37                     ` 'Marcelo Ricardo Leitner'
  0 siblings, 1 reply; 30+ messages in thread
From: Neil Horman @ 2017-12-08 16:08 UTC (permalink / raw)
  To: David Laight
  Cc: 'Marcelo Ricardo Leitner', 'Xin Long',
	network dev, linux-sctp, davem

On Fri, Dec 08, 2017 at 04:04:58PM +0000, David Laight wrote:
> From: Marcelo Ricardo Leitner
> > Sent: 08 December 2017 16:00
> ...
> > > Is it worth replacing the si struct with an index/enum value, and indexing an
> > > array of method pointer structs?  That would save you at least one dereference.
> > 
> > Hmmm, maybe, yes. It would be like
> > sctp_stream_interleave[asoc->stream.si].make_datafrag(...)
> 
> If you only expect 2 choices then an if () is likely
> to produce better code that the above.
> 
> The actual implementation can be hidden inside a #define
> or static inline function.
> 
Thats the real question though, will we expect more than two interleaving
strategies?  Currently its a boolean operation so the answer seems like yes, but
is there a possiblity of a biased interleaving, or other worthwhile algorithm?

Neil

> 	David
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave
  2017-12-08 16:00               ` Marcelo Ricardo Leitner
  2017-12-08 16:04                 ` David Laight
@ 2017-12-08 16:17                 ` Xin Long
  2017-12-08 16:22                   ` David Laight
  1 sibling, 1 reply; 30+ messages in thread
From: Xin Long @ 2017-12-08 16:17 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: Neil Horman, David Laight, network dev, linux-sctp, davem

On Sat, Dec 9, 2017 at 12:00 AM, Marcelo Ricardo Leitner
<marcelo.leitner@gmail.com> wrote:
> On Fri, Dec 08, 2017 at 10:37:34AM -0500, Neil Horman wrote:
>> On Fri, Dec 08, 2017 at 12:56:30PM -0200, Marcelo Ricardo Leitner wrote:
>> > On Fri, Dec 08, 2017 at 02:06:04PM +0000, David Laight wrote:
>> > > From: Xin Long
>> > > > Sent: 08 December 2017 13:04
>> > > ...
>> > > > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
>> > > >                                 frag |= SCTP_DATA_SACK_IMM;
>> > > >                 }
>> > > >
>> > > > -               chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
>> > > > -                                                0, GFP_KERNEL);
>> > > > +               chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
>> > > > +                                                      GFP_KERNEL);
>> > >
>> > > I know that none of the sctp code is very optimised, but that indirect
>> > > call is going to be horrid.
>> >
>> > Yeah.. but there is no way to avoid the double derreference
>> > considering we only have the asoc pointer in there and we have to
>> > reach the contents of the data chunk operations struct, and the .si
>> > part is the same as 'stream' part as it's a constant offset.
>> >
>> > Due to the for() in there, we could add a variable to store
>> > asoc->stream.si outside the for and then we can do only a single deref
>> > inside it. Xin, can you please try and see if the generated code is
>> > different?
>> >
>> > Other suggestions?
>> >
>> Is it worth replacing the si struct with an index/enum value, and indexing an
>> array of method pointer structs?  That would save you at least one dereference.
>
> Hmmm, maybe, yes. It would be like
> sctp_stream_interleave[asoc->stream.si].make_datafrag(...)
>
> Then same goes for pf->af, probably.
>
>>
>> Alternatively you could preform the dereference in two steps (i.e. declare an si
>> pointer on the stack and set it equal to asoc->stream.si, then deref
>> si->make_datafrag at call time.  That will at least give the compiler an
>> opportunity to preload the first pointer.
>
> Yep, that was my 2nd paragraph above :-) but it only works for cases
> such as this one.

Now:
  for(N) {
    ...
    chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
     0x000000000000fb58 <+360>: mov    0x848(%r13),%rax  <---- [a]
     0x000000000000fb5f <+367>: movzbl %cl,%ecx
     0x000000000000fb62 <+370>: mov    $0x14000c0,%r8d
     0x000000000000fb68 <+376>: mov    %r12d,%edx
     0x000000000000fb6b <+379>: mov    (%rsp),%rsi
     0x000000000000fb6f <+383>: mov    %r13,%rdi  <=(X)
     0x000000000000fb72 <+386>: callq  *0x8(%rax)  <---- [b]
     0x000000000000fb78 <+392>: mov    %rax,%r15
   }

   ret = N * ([a] + [b])


After using a variable:
  struct sctp_stream_interleave *si;
  ...
  si = asoc->stream.si;
     0x000000000000fb44 <+340>: mov    0x848(%r14),%rax
     0x000000000000fb4e <+350>: mov    %rax,0x20(%rsp) <----- [1]

  for(N) {
    ...
    chunk = si->make_datafrag(asoc, sinfo, len, frag, GFP_KERNEL);
     0x000000000000fb69 <+377>: mov    0x20(%rsp),%rax <----- [2]
     0x000000000000fb6e <+382>: movzbl %cl,%ecx
     0x000000000000fb71 <+385>: mov    $0x14000c0,%r8d
     0x000000000000fb77 <+391>: mov    %r12d,%edx
     0x000000000000fb7a <+394>: mov    (%rsp),%rsi
     0x000000000000fb7e <+398>: mov    0x28(%rsp),%rdi <=(Y)
     0x000000000000fb83 <+403>: callq  *0x8(%rax) <----- [3]
     0x000000000000fb89 <+409>: mov    %rax,%r14
   }

   ret = [1] + N * ([2] + [3])


Another small difference:
  as you can see, comparing to (X), (Y) is using 0x28(%rsp) in the loop,
  instead of %r13.

So that's what I can see from the related generated code.
If 0x848(%r13) is not worse than 0x28(%rsp) for cpu, I think
asoc->stream.si->make_datafrag() is even better. No ?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave
  2017-12-08 16:17                 ` Xin Long
@ 2017-12-08 16:22                   ` David Laight
  2017-12-08 17:23                     ` Xin Long
  0 siblings, 1 reply; 30+ messages in thread
From: David Laight @ 2017-12-08 16:22 UTC (permalink / raw)
  To: 'Xin Long', Marcelo Ricardo Leitner
  Cc: Neil Horman, network dev, linux-sctp, davem

From: Xin Long
> Sent: 08 December 2017 16:18
> 
...
> >> Alternatively you could preform the dereference in two steps (i.e. declare an si
> >> pointer on the stack and set it equal to asoc->stream.si, then deref
> >> si->make_datafrag at call time.  That will at least give the compiler an
> >> opportunity to preload the first pointer.

You want to save the function pointer itself.

...
> Another small difference:
>   as you can see, comparing to (X), (Y) is using 0x28(%rsp) in the loop,
>   instead of %r13.
> 
> So that's what I can see from the related generated code.
> If 0x848(%r13) is not worse than 0x28(%rsp) for cpu, I think
> asoc->stream.si->make_datafrag() is even better. No ?

That code must have far too many life local variables.
Otherwise there's be a caller saved register available.

	David


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave
  2017-12-08 16:22                   ` David Laight
@ 2017-12-08 17:23                     ` Xin Long
  2017-12-08 17:29                       ` David Laight
  0 siblings, 1 reply; 30+ messages in thread
From: Xin Long @ 2017-12-08 17:23 UTC (permalink / raw)
  To: David Laight
  Cc: Marcelo Ricardo Leitner, Neil Horman, network dev, linux-sctp, davem

On Sat, Dec 9, 2017 at 12:22 AM, David Laight <David.Laight@aculab.com> wrote:
> From: Xin Long
>> Sent: 08 December 2017 16:18
>>
> ...
>> >> Alternatively you could preform the dereference in two steps (i.e. declare an si
>> >> pointer on the stack and set it equal to asoc->stream.si, then deref
>> >> si->make_datafrag at call time.  That will at least give the compiler an
>> >> opportunity to preload the first pointer.
>
> You want to save the function pointer itself.
>
> ...
>> Another small difference:
>>   as you can see, comparing to (X), (Y) is using 0x28(%rsp) in the loop,
>>   instead of %r13.
>>
>> So that's what I can see from the related generated code.
>> If 0x848(%r13) is not worse than 0x28(%rsp) for cpu, I think
>> asoc->stream.si->make_datafrag() is even better. No ?
>
> That code must have far too many life local variables.
> Otherwise there's be a caller saved register available.
>
Hi, David, Sorry,  I'm not sure we're worrying about the cpu cost or
codes style now ?

For cpu cost,  I think 0x848(%r13) operation must be better than the
generated code of if-else.

For the codes style, comparing to the if-else, I think this one is
more readable. (ignore extendible stuff first, as probably no more
new type of data chunk).

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave
  2017-12-08 17:23                     ` Xin Long
@ 2017-12-08 17:29                       ` David Laight
  2017-12-08 17:37                         ` Xin Long
  0 siblings, 1 reply; 30+ messages in thread
From: David Laight @ 2017-12-08 17:29 UTC (permalink / raw)
  To: 'Xin Long'
  Cc: Marcelo Ricardo Leitner, Neil Horman, network dev, linux-sctp, davem

From: Xin Long ...
> Hi, David, Sorry,  I'm not sure we're worrying about the cpu cost or
> codes style now ?
> 
> For cpu cost,  I think 0x848(%r13) operation must be better than the
> generated code of if-else.

Nope - the call xxx(%ryyy) is likely to be a data cache miss and a complete
cpu pipeline stall.

The conditional will be a data cache hit and the code (for one branch)
will be prefetched and speculatively executed.

Some very modern cpu might manage to predict indirect jumps, but for
most it is a full pipeline stall.

	David


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave
  2017-12-08 17:29                       ` David Laight
@ 2017-12-08 17:37                         ` Xin Long
  0 siblings, 0 replies; 30+ messages in thread
From: Xin Long @ 2017-12-08 17:37 UTC (permalink / raw)
  To: David Laight
  Cc: Marcelo Ricardo Leitner, Neil Horman, network dev, linux-sctp, davem

On Sat, Dec 9, 2017 at 1:29 AM, David Laight <David.Laight@aculab.com> wrote:
> From: Xin Long ...
>> Hi, David, Sorry,  I'm not sure we're worrying about the cpu cost or
>> codes style now ?
>>
>> For cpu cost,  I think 0x848(%r13) operation must be better than the
>> generated code of if-else.
>
> Nope - the call xxx(%ryyy) is likely to be a data cache miss and a complete
> cpu pipeline stall.
>
> The conditional will be a data cache hit and the code (for one branch)
> will be prefetched and speculatively executed.
>
> Some very modern cpu might manage to predict indirect jumps, but for
> most it is a full pipeline stall.
Thanks for the CPU information.

The thing is with if-else can't avoid xxx(%ryyy) in this case, as Marcelo
said above. It seems if-else will just be a extra cost compare to this one.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave
  2017-12-08 16:08                   ` Neil Horman
@ 2017-12-08 20:37                     ` 'Marcelo Ricardo Leitner'
  0 siblings, 0 replies; 30+ messages in thread
From: 'Marcelo Ricardo Leitner' @ 2017-12-08 20:37 UTC (permalink / raw)
  To: Neil Horman
  Cc: David Laight, 'Xin Long', network dev, linux-sctp, davem

On Fri, Dec 08, 2017 at 11:08:56AM -0500, Neil Horman wrote:
> On Fri, Dec 08, 2017 at 04:04:58PM +0000, David Laight wrote:
> > From: Marcelo Ricardo Leitner
> > > Sent: 08 December 2017 16:00
> > ...
> > > > Is it worth replacing the si struct with an index/enum value, and indexing an
> > > > array of method pointer structs?  That would save you at least one dereference.
> > > 
> > > Hmmm, maybe, yes. It would be like
> > > sctp_stream_interleave[asoc->stream.si].make_datafrag(...)
> > 
> > If you only expect 2 choices then an if () is likely
> > to produce better code that the above.
> > 
> > The actual implementation can be hidden inside a #define
> > or static inline function.
> > 
> Thats the real question though, will we expect more than two interleaving
> strategies?  Currently its a boolean operation so the answer seems like yes, but
> is there a possiblity of a biased interleaving, or other worthwhile algorithm?

For the chunk format, I don't think so. It would require another RFC
update.
For other possibilities on having a 3rd choice in there, I also don't
think so. Stream scheduling is handled apart from it and rx buffer
stuff is being covered by it now, don't see how we could have a 3rd
option without another chunk format.

But that said, I don't think the macro or even inline wrappers are as
clear as the struct with function pointers here and in some cases it
even won't avoid the second deref. I wouldn't like to compromise code
readability and OO because of 1 fetch, specially considering that this
will be barely noticeable.

Neil's idea on using the array of structs and indexing on it is nice.
Saves a deref and keeps the abstractions without inserting too much
noise on it. Plus, it also allows replacing the struct pointer in
sctp_stream with a bit. If then placed before the union in there, we
can save up to the whole pointer. Looks like a good compromise to me.

  Marcelo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCHv2 net-next 00/12] sctp: Implement Stream Interleave: The I-DATA Chunk Supporting User Message Interleaving
  2017-12-08 13:03 [PATCHv2 net-next 00/12] sctp: Implement Stream Interleave: The I-DATA Chunk Supporting User Message Interleaving Xin Long
  2017-12-08 13:03 ` [PATCHv2 net-next 01/12] sctp: add stream interleave enable members and sockopt Xin Long
@ 2017-12-11 16:23 ` David Miller
  1 sibling, 0 replies; 30+ messages in thread
From: David Miller @ 2017-12-11 16:23 UTC (permalink / raw)
  To: lucien.xin; +Cc: netdev, linux-sctp, marcelo.leitner, nhorman

From: Xin Long <lucien.xin@gmail.com>
Date: Fri,  8 Dec 2017 21:03:57 +0800

> Stream Interleave would be Implemented in two Parts:
> 
>    1. The I-DATA Chunk Supporting User Message Interleaving
>    2. Interaction with Other SCTP Extensions
 ...
> As the 1st part of Stream Interleave Implementation, this patchset adds
> an ops framework named sctp_stream_interleave with a bunch of stuff that
> does lots of things needed somewhere.
> 
> Then it defines sctp_stream_interleave_0 to work for normal DATA chunks
> and sctp_stream_interleave_1 for I-DATA chunks.
> 
> With these functions, hundreds of if-else checks for the different process
> on I-DATA chunks would be avoided. Besides, very few codes could be shared
> in these two function sets.
> 
> In this patchset, it adds some basic variables, structures and socket
> options firstly, then implement these functions one by one to add the
> procedures for ordered idata gradually, at last adjusts some codes to
> make them work for unordered idata.
> 
> To make it safe to be implemented and also not break the normal data
> chunk process, this feature can't be enabled to use until all stream
> interleave codes are completely accomplished.
> 
> v1 -> v2:
>   - fixed a checkpatch warning that a blank line was missed.
>   - avoided a kbuild warning reported from gcc-4.9.

Series applied, thanks Xin.

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2017-12-11 16:23 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-08 13:03 [PATCHv2 net-next 00/12] sctp: Implement Stream Interleave: The I-DATA Chunk Supporting User Message Interleaving Xin Long
2017-12-08 13:03 ` [PATCHv2 net-next 01/12] sctp: add stream interleave enable members and sockopt Xin Long
2017-12-08 13:03   ` [PATCHv2 net-next 02/12] sctp: add asoc intl_enable negotiation during 4 shakehands Xin Long
2017-12-08 13:04     ` [PATCHv2 net-next 03/12] sctp: add basic structures and make chunk function for idata Xin Long
2017-12-08 13:04       ` [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave Xin Long
2017-12-08 13:04         ` [PATCHv2 net-next 05/12] sctp: implement assign_number " Xin Long
2017-12-08 13:04           ` [PATCHv2 net-next 06/12] sctp: implement validate_data " Xin Long
2017-12-08 13:04             ` [PATCHv2 net-next 07/12] sctp: implement ulpevent_data " Xin Long
2017-12-08 13:04               ` [PATCHv2 net-next 08/12] sctp: implement enqueue_event " Xin Long
2017-12-08 13:04                 ` [PATCHv2 net-next 09/12] sctp: implement renege_events " Xin Long
2017-12-08 13:04                   ` [PATCHv2 net-next 10/12] sctp: implement start_pd " Xin Long
2017-12-08 13:04                     ` [PATCHv2 net-next 11/12] sctp: implement abort_pd " Xin Long
2017-12-08 13:04                       ` [PATCHv2 net-next 12/12] sctp: add support for the process of unordered idata Xin Long
2017-12-08 14:06         ` [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave David Laight
2017-12-08 14:56           ` Marcelo Ricardo Leitner
2017-12-08 15:01             ` David Laight
2017-12-08 15:15               ` 'Marcelo Ricardo Leitner'
2017-12-08 15:32                 ` David Laight
2017-12-08 16:02                   ` 'Marcelo Ricardo Leitner'
2017-12-08 15:37             ` Neil Horman
2017-12-08 16:00               ` Marcelo Ricardo Leitner
2017-12-08 16:04                 ` David Laight
2017-12-08 16:08                   ` Neil Horman
2017-12-08 20:37                     ` 'Marcelo Ricardo Leitner'
2017-12-08 16:17                 ` Xin Long
2017-12-08 16:22                   ` David Laight
2017-12-08 17:23                     ` Xin Long
2017-12-08 17:29                       ` David Laight
2017-12-08 17:37                         ` Xin Long
2017-12-11 16:23 ` [PATCHv2 net-next 00/12] sctp: Implement Stream Interleave: The I-DATA Chunk Supporting User Message Interleaving David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).