netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH libnetfilter_queue 0/1] New pktb_usebuf() function
@ 2019-12-10 11:26 Duncan Roe
  2019-12-10 11:26 ` [PATCH libnetfilter_queue 1/1] src: Add alternative function to pktb_alloc to avoid malloc / free overhead Duncan Roe
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Duncan Roe @ 2019-12-10 11:26 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel

pktb_usebuf() is a copy of pktb_alloc() with the first part modified to use
a supplied buffer rather than calloc() one. I thought this would give a
measurable performance boost and it does.
All the code after the memset() call is common to pktb_alloc(). If you like,
I can submit a v2 with this code in a static function called by both.

Duncan Roe (1):
  src: Add alternative function to pktb_alloc to avoid malloc / free
    overhead

 include/libnetfilter_queue/pktbuff.h |  2 +
 src/extra/pktbuff.c                  | 82 ++++++++++++++++++++++++++++++++++++
 2 files changed, 84 insertions(+)

--
2.14.5


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH libnetfilter_queue 1/1] src: Add alternative function to pktb_alloc to avoid malloc / free overhead
  2019-12-10 11:26 [PATCH libnetfilter_queue 0/1] New pktb_usebuf() function Duncan Roe
@ 2019-12-10 11:26 ` Duncan Roe
  2019-12-22  2:09 ` [PATCH libnetfilter_queue 0/1] New pktb_usebuf() function Duncan Roe
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 19+ messages in thread
From: Duncan Roe @ 2019-12-10 11:26 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel

New function pktb_usebuf() works like pktb_alloc() except it has 2 extra
arguments, being a user-supplied buffer and its length.
Performance testing is ongoing, so the 5% improvement claimed in the
documentation is conservative.

Updated:

 include/libnetfilter_queue/pktbuff.h: Add pktb_usebuf() prototype

 src/extra/pktbuff.c: Implement pktb_usebuf()

Signed-off-by: Duncan Roe <duncan_roe@optusnet.com.au>
---
 include/libnetfilter_queue/pktbuff.h |  2 +
 src/extra/pktbuff.c                  | 82 ++++++++++++++++++++++++++++++++++++
 2 files changed, 84 insertions(+)

diff --git a/include/libnetfilter_queue/pktbuff.h b/include/libnetfilter_queue/pktbuff.h
index b15ee1e..ec75727 100644
--- a/include/libnetfilter_queue/pktbuff.h
+++ b/include/libnetfilter_queue/pktbuff.h
@@ -4,6 +4,8 @@
 struct pkt_buff;
 
 struct pkt_buff *pktb_alloc(int family, void *data, size_t len, size_t extra);
+struct pkt_buff *pktb_usebuf(int family, void *data, size_t len, size_t extra,
+			     void *buf, size_t bufsize);
 void pktb_free(struct pkt_buff *pktb);
 
 uint8_t *pktb_data(struct pkt_buff *pktb);
diff --git a/src/extra/pktbuff.c b/src/extra/pktbuff.c
index c4f3da3..774e4ab 100644
--- a/src/extra/pktbuff.c
+++ b/src/extra/pktbuff.c
@@ -96,6 +96,88 @@ struct pkt_buff *pktb_alloc(int family, void *data, size_t len, size_t extra)
 	return pktb;
 }
 
+/**
+ * pktb_usebuf - make a packet buffer from an existing buffer
+ * \param family Indicate what family. Currently supported families are
+ * AF_BRIDGE, AF_INET & AF_INET6.
+ * \param data Pointer to packet data
+ * \param len Packet length
+ * \param extra Extra memory in the tail to be allocated (for mangling)
+ * \param buf Existing buffer to use
+ * \param bufsize Size of _buf_
+ *
+ * This function builds a userspace packet buffer inside a supplied buffer.
+ * The packet buffer contains the packet data and some extra memory room in the
+ * tail (if requested).
+ *
+ * This function uses less CPU cycles than pktb_alloc() + pktb_free().
+ * Users can expect a decrease in overall CPU time in the order of 5%.
+ *
+ * \return Pointer to a userspace packet buffer aligned within _buf_ or NULL on
+ * failure.
+ * \par Errors
+ * __ENOMEM__ _bufsize_ is too small to accomodate _len_
+ * \n
+ * __EPROTONOSUPPORT__ _family_ was __AF_BRIDGE__ and this is not an IP packet
+ * (v4 or v6)
+ * \warning Do __not__ call pktb_free() on the returned pointer
+ */
+EXPORT_SYMBOL
+struct pkt_buff *pktb_usebuf(int family, void *data, size_t len, size_t extra,
+			     void *buf, size_t bufsize)
+{
+	struct pkt_buff *pktb;
+	void *pkt_data;
+
+	/* Better make sure alignment is correct. */
+	size_t extra2 =
+		(size_t)buf & 0x7 ? (~((size_t)buf & 0x7) & 0x7) + 1 : 0;
+
+	pktb = buf + extra2;
+	if (extra2 + sizeof(struct pkt_buff) + len + extra > bufsize)
+	{
+		errno = ENOMEM;
+		return NULL;
+	}
+	memset(pktb, 0, sizeof(struct pkt_buff) + len + extra);
+
+	pkt_data = (uint8_t *)pktb + sizeof(struct pkt_buff);
+	memcpy(pkt_data, data, len);
+
+	pktb->len = len;
+	pktb->data_len = len + extra;
+
+	pktb->head = pkt_data;
+	pktb->data = pkt_data;
+	pktb->tail = pktb->head + len;
+
+	switch(family) {
+	case AF_INET:
+	case AF_INET6:
+		pktb->network_header = pktb->data;
+		break;
+	case AF_BRIDGE: {
+		struct ethhdr *ethhdr = (struct ethhdr *)pktb->data;
+
+		pktb->mac_header = pktb->data;
+
+		switch(ethhdr->h_proto) {
+		case ETH_P_IP:
+		case ETH_P_IPV6:
+			pktb->network_header = pktb->data + ETH_HLEN;
+			break;
+		default:
+			/* This protocol is unsupported. */
+			errno = EPROTONOSUPPORT;
+			free(pktb);
+			return NULL;
+		}
+		break;
+	}
+	}
+	return pktb;
+}
+
 /**
  * pktb_data - get pointer to network packet
  * \param pktb Pointer to userspace packet buffer
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH libnetfilter_queue 0/1] New pktb_usebuf() function
  2019-12-10 11:26 [PATCH libnetfilter_queue 0/1] New pktb_usebuf() function Duncan Roe
  2019-12-10 11:26 ` [PATCH libnetfilter_queue 1/1] src: Add alternative function to pktb_alloc to avoid malloc / free overhead Duncan Roe
@ 2019-12-22  2:09 ` Duncan Roe
  2020-01-03  2:47 ` Duncan Roe
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 19+ messages in thread
From: Duncan Roe @ 2019-12-22  2:09 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Netfilter Development

Hi Pablo,

On Tue, Dec 10, 2019 at 10:26:33PM +1100, Duncan Roe wrote:
> pktb_usebuf() is a copy of pktb_alloc() with the first part modified to use
> a supplied buffer rather than calloc() one. I thought this would give a
> measurable performance boost and it does.
> All the code after the memset() call is common to pktb_alloc(). If you like,
> I can submit a v2 with this code in a static function called by both.
>
> Duncan Roe (1):
>   src: Add alternative function to pktb_alloc to avoid malloc / free
>     overhead
>
>  include/libnetfilter_queue/pktbuff.h |  2 +
>  src/extra/pktbuff.c                  | 82 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 84 insertions(+)
>
> --
> 2.14.5
>
After a day or so of testing, it appears pktb_usebuf() saves 6% overall CPU over
pktb_alloc() in a program similar to libnetfilter_queue/examples/nf-queue.c (but
not printing every packet). The variance is quite high, so plus or minus 1%. Can
supply more details of testing methodology if you want them.

Overall I would say 6% is a worthwhile saving.

What about it?

Cheers ... Duncan.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH libnetfilter_queue 0/1] New pktb_usebuf() function
  2019-12-10 11:26 [PATCH libnetfilter_queue 0/1] New pktb_usebuf() function Duncan Roe
  2019-12-10 11:26 ` [PATCH libnetfilter_queue 1/1] src: Add alternative function to pktb_alloc to avoid malloc / free overhead Duncan Roe
  2019-12-22  2:09 ` [PATCH libnetfilter_queue 0/1] New pktb_usebuf() function Duncan Roe
@ 2020-01-03  2:47 ` Duncan Roe
  2020-01-06  3:17 ` [PATCH libnetfilter_queue v2 0/1] New pktb_make() function Duncan Roe
  2020-01-06  3:17 ` [PATCH libnetfilter_queue v2 1/1] src: Add alternative function to pktb_alloc to avoid malloc / free overhead Duncan Roe
  4 siblings, 0 replies; 19+ messages in thread
From: Duncan Roe @ 2020-01-03  2:47 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel

Hi Pablo,

Do you think pktb_make() might be a better name for this new function?

I noticed an errant free() call after "errno = EPROTONOSUPPORT" so there will
have to be a v2 anyway.

I could merge the (now almost-) identical second halves of pktb_alloc and
pktb_make into a static function to avoid the duplicate code maintenance
overhead. Would you prefer that?

Cheers ... Duncan.

On Tue, Dec 10, 2019 at 10:26:33PM +1100, Duncan Roe wrote:
> pktb_usebuf() is a copy of pktb_alloc() with the first part modified to use
> a supplied buffer rather than calloc() one. I thought this would give a
> measurable performance boost and it does.
> All the code after the memset() call is common to pktb_alloc(). If you like,
> I can submit a v2 with this code in a static function called by both.
>
> Duncan Roe (1):
>   src: Add alternative function to pktb_alloc to avoid malloc / free
>     overhead
>
>  include/libnetfilter_queue/pktbuff.h |  2 +
>  src/extra/pktbuff.c                  | 82 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 84 insertions(+)
>
> --
> 2.14.5
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH libnetfilter_queue v2 0/1] New pktb_make() function
  2019-12-10 11:26 [PATCH libnetfilter_queue 0/1] New pktb_usebuf() function Duncan Roe
                   ` (2 preceding siblings ...)
  2020-01-03  2:47 ` Duncan Roe
@ 2020-01-06  3:17 ` Duncan Roe
  2020-01-08 22:53   ` Pablo Neira Ayuso
  2020-01-06  3:17 ` [PATCH libnetfilter_queue v2 1/1] src: Add alternative function to pktb_alloc to avoid malloc / free overhead Duncan Roe
  4 siblings, 1 reply; 19+ messages in thread
From: Duncan Roe @ 2020-01-06  3:17 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel

This patch offers a faster alternative / replacement function to pktb_alloc().

pktb_make() is a copy of the first part of pktb_alloc() modified to use a
supplied buffer rather than calloc() one. It then calls the second part of
pktb_alloc() which is modified to be a static function.

Can't think of a use case where one would choose to use pktb_alloc over
pktb_make.
In a furure documentation update, might relegate pktb_alloc and pktb_free to
"other functions".

Duncan Roe (1):
  src: Add alternative function to pktb_alloc to avoid malloc / free
    overhead

 include/libnetfilter_queue/pktbuff.h |  1 +
 src/extra/pktbuff.c                  | 60 ++++++++++++++++++++++++++++++++++--
 2 files changed, 59 insertions(+), 2 deletions(-)

v2: - Function name changed to pktb_make (was pktb_usebuf)
    - Cvt 2nd 1/2 of pktb_malloc into a static function common to pktb_make

-- 
2.14.5


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH libnetfilter_queue v2 1/1] src: Add alternative function to pktb_alloc to avoid malloc / free overhead
  2019-12-10 11:26 [PATCH libnetfilter_queue 0/1] New pktb_usebuf() function Duncan Roe
                   ` (3 preceding siblings ...)
  2020-01-06  3:17 ` [PATCH libnetfilter_queue v2 0/1] New pktb_make() function Duncan Roe
@ 2020-01-06  3:17 ` Duncan Roe
  4 siblings, 0 replies; 19+ messages in thread
From: Duncan Roe @ 2020-01-06  3:17 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel

New function pktb_make() works like pktb_alloc() except it has 2 extra
arguments, being a user-supplied buffer and its length.
Performance testing using a program derived from examples/nf_queue.c shows
6% improvement, so the 5% improvement claimed in the documentation is
conservative.

Updated:

 include/libnetfilter_queue/pktbuff.h: Add pktb_make() prototype

 src/extra/pktbuff.c: Implement pktb_make()

Signed-off-by: Duncan Roe <duncan_roe@optusnet.com.au>
---
 include/libnetfilter_queue/pktbuff.h |  1 +
 src/extra/pktbuff.c                  | 60 ++++++++++++++++++++++++++++++++++--
 2 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/include/libnetfilter_queue/pktbuff.h b/include/libnetfilter_queue/pktbuff.h
index 42bc153..481ed59 100644
--- a/include/libnetfilter_queue/pktbuff.h
+++ b/include/libnetfilter_queue/pktbuff.h
@@ -4,6 +4,7 @@
 struct pkt_buff;
 
 struct pkt_buff *pktb_alloc(int family, void *data, size_t len, size_t extra);
+struct pkt_buff *pktb_make(int family, void *data, size_t len, size_t extra, void *buf, size_t bufsize);
 void pktb_free(struct pkt_buff *pktb);
 
 uint8_t *pktb_data(struct pkt_buff *pktb);
diff --git a/src/extra/pktbuff.c b/src/extra/pktbuff.c
index 37f6bc0..d85121c 100644
--- a/src/extra/pktbuff.c
+++ b/src/extra/pktbuff.c
@@ -29,6 +29,8 @@
  * @{
  */
 
+static struct pkt_buff *pktb_alloc_make_common(int family, void *data,
+	size_t len, size_t extra, struct pkt_buff *pktb, bool malloc_used);
 /**
  * pktb_alloc - allocate a new packet buffer
  * \param family Indicate what family. Currently supported families are
@@ -52,11 +54,17 @@ EXPORT_SYMBOL
 struct pkt_buff *pktb_alloc(int family, void *data, size_t len, size_t extra)
 {
 	struct pkt_buff *pktb;
-	void *pkt_data;
 
 	pktb = calloc(1, sizeof(struct pkt_buff) + len + extra);
 	if (pktb == NULL)
 		return NULL;
+	return pktb_alloc_make_common(family, data, len, extra, pktb, true);
+}
+
+static struct pkt_buff *pktb_alloc_make_common(int family, void *data,
+	size_t len, size_t extra, struct pkt_buff *pktb, bool malloc_used)
+{
+	void *pkt_data;
 
 	/* Better make sure alignment is correct. */
 	pkt_data = (uint8_t *)pktb + sizeof(struct pkt_buff);
@@ -87,7 +95,8 @@ struct pkt_buff *pktb_alloc(int family, void *data, size_t len, size_t extra)
 		default:
 			/* This protocol is unsupported. */
 			errno = EPROTONOSUPPORT;
-			free(pktb);
+			if (malloc_used)
+				free(pktb);
 			return NULL;
 		}
 		break;
@@ -96,6 +105,53 @@ struct pkt_buff *pktb_alloc(int family, void *data, size_t len, size_t extra)
 	return pktb;
 }
 
+/**
+ * pktb_make - make a packet buffer from an existing buffer
+ * \param family Indicate what family. Currently supported families are
+ * AF_BRIDGE, AF_INET & AF_INET6.
+ * \param data Pointer to packet data
+ * \param len Packet length
+ * \param extra Extra memory in the tail to be allocated (for mangling)
+ * \param buf Existing buffer to use
+ * \param bufsize Size of _buf_
+ *
+ * This function builds a userspace packet buffer inside a supplied buffer.
+ * The packet buffer contains the packet data and some extra memory room in the
+ * tail (if requested).
+ *
+ * This function uses less CPU cycles than pktb_alloc() + pktb_free().
+ * Users can expect a decrease in overall CPU time in the order of 5%.
+ *
+ * \return Pointer to a userspace packet buffer aligned within _buf_ or NULL on
+ * failure.
+ * \par Errors
+ * __ENOMEM__ _bufsize_ is too small to accomodate _len_
+ * \n
+ * __EPROTONOSUPPORT__ _family_ was __AF_BRIDGE__ and this is not an IP packet
+ * (v4 or v6)
+ * \warning Do __not__ call pktb_free() on the returned pointer
+ */
+EXPORT_SYMBOL
+struct pkt_buff *pktb_make(int family, void *data, size_t len, size_t extra,
+			     void *buf, size_t bufsize)
+{
+	struct pkt_buff *pktb;
+
+	/* Better make sure alignment is correct. */
+	size_t extra2 =
+		(size_t)buf & 0x7 ? (~((size_t)buf & 0x7) & 0x7) + 1 : 0;
+
+	pktb = buf + extra2;
+	if (extra2 + sizeof(struct pkt_buff) + len + extra > bufsize)
+	{
+		errno = ENOMEM;
+		return NULL;
+	}
+	memset(pktb, 0, sizeof(struct pkt_buff) + len + extra);
+
+	return pktb_alloc_make_common(family, data, len, extra, pktb, false);
+}
+
 /**
  * pktb_data - get pointer to network packet
  * \param pktb Pointer to userspace packet buffer
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH libnetfilter_queue v2 0/1] New pktb_make() function
  2020-01-06  3:17 ` [PATCH libnetfilter_queue v2 0/1] New pktb_make() function Duncan Roe
@ 2020-01-08 22:53   ` Pablo Neira Ayuso
  2020-01-10  2:27     ` Duncan Roe
                       ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Pablo Neira Ayuso @ 2020-01-08 22:53 UTC (permalink / raw)
  To: Duncan Roe; +Cc: netfilter-devel

[-- Attachment #1: Type: text/plain, Size: 678 bytes --]

On Mon, Jan 06, 2020 at 02:17:13PM +1100, Duncan Roe wrote:
> This patch offers a faster alternative / replacement function to pktb_alloc().
> 
> pktb_make() is a copy of the first part of pktb_alloc() modified to use a
> supplied buffer rather than calloc() one. It then calls the second part of
> pktb_alloc() which is modified to be a static function.
> 
> Can't think of a use case where one would choose to use pktb_alloc over
> pktb_make.
> In a furure documentation update, might relegate pktb_alloc and pktb_free to
> "other functions".

This is very useful.

Would you explore something looking similar to what I'm attaching?

Warning: Compile tested only :-)

Thanks.

[-- Attachment #2: x.patch --]
[-- Type: text/x-diff, Size: 2901 bytes --]

diff --git a/src/extra/pktbuff.c b/src/extra/pktbuff.c
index 6250fbf3ac8b..985bb48ac986 100644
--- a/src/extra/pktbuff.c
+++ b/src/extra/pktbuff.c
@@ -29,6 +29,58 @@
  * @{
  */
 
+static struct pkt_buff *__pktb_alloc(int family, void *data, size_t len,
+				     size_t extra)
+{
+	struct pkt_buff *pktb;
+
+	pktb = calloc(1, sizeof(struct pkt_buff) + len + extra);
+	if (pktb == NULL)
+		return NULL;
+
+	return pktb;
+}
+
+static int pktb_setup_family(struct pkt_buff *pktb, int family)
+{
+	switch(family) {
+	case AF_INET:
+	case AF_INET6:
+		pktb->network_header = pktb->data;
+		break;
+	case AF_BRIDGE: {
+		struct ethhdr *ethhdr = (struct ethhdr *)pktb->data;
+
+		pktb->mac_header = pktb->data;
+
+		switch(ethhdr->h_proto) {
+		case ETH_P_IP:
+		case ETH_P_IPV6:
+			pktb->network_header = pktb->data + ETH_HLEN;
+			break;
+		default:
+			/* This protocol is unsupported. */
+			errno = EPROTONOSUPPORT;
+			return -1;
+		}
+		break;
+		}
+	}
+
+	return 0;
+}
+
+static void pktb_setup_metadata(struct pkt_buff *pktb, void *pkt_data,
+				size_t len, size_t extra)
+{
+	pktb->len = len;
+	pktb->data_len = len + extra;
+
+	pktb->head = pkt_data;
+	pktb->data = pkt_data;
+	pktb->tail = pktb->head + len;
+}
+
 /**
  * pktb_alloc - allocate a new packet buffer
  * \param family Indicate what family. Currently supported families are
@@ -54,45 +106,41 @@ struct pkt_buff *pktb_alloc(int family, void *data, size_t len, size_t extra)
 	struct pkt_buff *pktb;
 	void *pkt_data;
 
-	pktb = calloc(1, sizeof(struct pkt_buff) + len + extra);
-	if (pktb == NULL)
+	pktb = __pktb_alloc(family, data, len, extra);
+	if (!pktb)
 		return NULL;
 
 	/* Better make sure alignment is correct. */
 	pkt_data = (uint8_t *)pktb + sizeof(struct pkt_buff);
 	memcpy(pkt_data, data, len);
 
-	pktb->len = len;
-	pktb->data_len = len + extra;
+	pktb_setup_metadata(pktb, pkt_data, len, extra);
 
-	pktb->head = pkt_data;
-	pktb->data = pkt_data;
-	pktb->tail = pktb->head + len;
+	if (pktb_setup_family(pktb, family) < 0) {
+		free(pktb);
+		return NULL;
+	}
 
-	switch(family) {
-	case AF_INET:
-	case AF_INET6:
-		pktb->network_header = pktb->data;
-		break;
-	case AF_BRIDGE: {
-		struct ethhdr *ethhdr = (struct ethhdr *)pktb->data;
+	return pktb;
+}
 
-		pktb->mac_header = pktb->data;
+EXPORT_SYMBOL
+struct pkt_buff *pktb_alloc_data(int family, void *data, size_t len)
+{
+	struct pkt_buff *pktb;
 
-		switch(ethhdr->h_proto) {
-		case ETH_P_IP:
-		case ETH_P_IPV6:
-			pktb->network_header = pktb->data + ETH_HLEN;
-			break;
-		default:
-			/* This protocol is unsupported. */
-			errno = EPROTONOSUPPORT;
-			free(pktb);
-			return NULL;
-		}
-		break;
-	}
+	pktb = __pktb_alloc(family, data, 0, 0);
+	if (!pktb)
+		return NULL;
+
+	pktb->data = data;
+	pktb_setup_metadata(pktb, data, len, 0);
+
+	if (pktb_setup_family(pktb, family) < 0) {
+		free(pktb);
+		return NULL;
 	}
+
 	return pktb;
 }
 

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH libnetfilter_queue v2 0/1] New pktb_make() function
  2020-01-08 22:53   ` Pablo Neira Ayuso
@ 2020-01-10  2:27     ` Duncan Roe
  2020-01-13 18:51       ` Pablo Neira Ayuso
  2020-01-27  1:44     ` Duncan Roe
  2020-02-01  6:21     ` [PATCH libnetfilter_queue v2] src: Add faster alternatives to pktb_alloc() Duncan Roe
  2 siblings, 1 reply; 19+ messages in thread
From: Duncan Roe @ 2020-01-10  2:27 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Netfilter Development

On Wed, Jan 08, 2020 at 11:53:23PM +0100, Pablo Neira Ayuso wrote:
> On Mon, Jan 06, 2020 at 02:17:13PM +1100, Duncan Roe wrote:
> > This patch offers a faster alternative / replacement function to pktb_alloc().
> >
> > pktb_make() is a copy of the first part of pktb_alloc() modified to use a
> > supplied buffer rather than calloc() one. It then calls the second part of
> > pktb_alloc() which is modified to be a static function.
> >
> > Can't think of a use case where one would choose to use pktb_alloc over
> > pktb_make.
> > In a furure documentation update, might relegate pktb_alloc and pktb_free to
> > "other functions".
>
> This is very useful.
>
> Would you explore something looking similar to what I'm attaching?
>
> Warning: Compile tested only :-)
>
> Thanks.

> diff --git a/src/extra/pktbuff.c b/src/extra/pktbuff.c
> index 6250fbf3ac8b..985bb48ac986 100644
> --- a/src/extra/pktbuff.c
> +++ b/src/extra/pktbuff.c
> @@ -29,6 +29,58 @@
>   * @{
>   */
>
> +static struct pkt_buff *__pktb_alloc(int family, void *data, size_t len,
> +				     size_t extra)
> +{
> +	struct pkt_buff *pktb;
> +
> +	pktb = calloc(1, sizeof(struct pkt_buff) + len + extra);
> +	if (pktb == NULL)
> +		return NULL;
> +
> +	return pktb;
> +}
> +
> +static int pktb_setup_family(struct pkt_buff *pktb, int family)
> +{
> +	switch(family) {
> +	case AF_INET:
> +	case AF_INET6:
> +		pktb->network_header = pktb->data;
> +		break;
> +	case AF_BRIDGE: {
> +		struct ethhdr *ethhdr = (struct ethhdr *)pktb->data;
> +
> +		pktb->mac_header = pktb->data;
> +
> +		switch(ethhdr->h_proto) {
> +		case ETH_P_IP:
> +		case ETH_P_IPV6:
> +			pktb->network_header = pktb->data + ETH_HLEN;
> +			break;
> +		default:
> +			/* This protocol is unsupported. */
> +			errno = EPROTONOSUPPORT;
> +			return -1;
> +		}
> +		break;
> +		}

GRR! I just wasted 20 minutes looking at these last 3 lines. Fix the indentation

> +	}
> +
> +	return 0;
> +}
> +
> +static void pktb_setup_metadata(struct pkt_buff *pktb, void *pkt_data,
> +				size_t len, size_t extra)
> +{
> +	pktb->len = len;

You know, we only need any 2 of len, data and tail. This has caused bugs in the
past, e.g. commit 8a4316f31. In the code, len seems to be the least used - will
see if I can't get rid of it.

> +	pktb->data_len = len + extra;
> +
> +	pktb->head = pkt_data;
> +	pktb->data = pkt_data;
> +	pktb->tail = pktb->head + len;
> +}
> +
>  /**
>   * pktb_alloc - allocate a new packet buffer
>   * \param family Indicate what family. Currently supported families are
> @@ -54,45 +106,41 @@ struct pkt_buff *pktb_alloc(int family, void *data, size_t len, size_t extra)
>  	struct pkt_buff *pktb;
>  	void *pkt_data;
>
> -	pktb = calloc(1, sizeof(struct pkt_buff) + len + extra);
> -	if (pktb == NULL)
> +	pktb = __pktb_alloc(family, data, len, extra);
> +	if (!pktb)
>  		return NULL;
>
>  	/* Better make sure alignment is correct. */
>  	pkt_data = (uint8_t *)pktb + sizeof(struct pkt_buff);
>  	memcpy(pkt_data, data, len);
>
> -	pktb->len = len;
> -	pktb->data_len = len + extra;
> +	pktb_setup_metadata(pktb, pkt_data, len, extra);
>
> -	pktb->head = pkt_data;
> -	pktb->data = pkt_data;
> -	pktb->tail = pktb->head + len;
> +	if (pktb_setup_family(pktb, family) < 0) {
> +		free(pktb);
> +		return NULL;
> +	}
>
> -	switch(family) {
> -	case AF_INET:
> -	case AF_INET6:
> -		pktb->network_header = pktb->data;
> -		break;
> -	case AF_BRIDGE: {
> -		struct ethhdr *ethhdr = (struct ethhdr *)pktb->data;
> +	return pktb;
> +}
>
> -		pktb->mac_header = pktb->data;
> +EXPORT_SYMBOL
> +struct pkt_buff *pktb_alloc_data(int family, void *data, size_t len)
> +{
> +	struct pkt_buff *pktb;
>
> -		switch(ethhdr->h_proto) {
> -		case ETH_P_IP:
> -		case ETH_P_IPV6:
> -			pktb->network_header = pktb->data + ETH_HLEN;
> -			break;
> -		default:
> -			/* This protocol is unsupported. */
> -			errno = EPROTONOSUPPORT;
> -			free(pktb);
> -			return NULL;
> -		}
> -		break;
> -	}
> +	pktb = __pktb_alloc(family, data, 0, 0);
> +	if (!pktb)
> +		return NULL;
> +
> +	pktb->data = data;
> +	pktb_setup_metadata(pktb, data, len, 0);
> +
> +	if (pktb_setup_family(pktb, family) < 0) {
> +		free(pktb);
> +		return NULL;
>  	}
> +
>  	return pktb;
>  }
>

Ok, so this is another approach to reducing CPU time: avoid memcpy of data.

That's great if you're not mangling content.

But if you are mangling, beware. pktb now has pointers into the buffer you used
for receiving from Netlink so you must use a different buffer when sending.

I'll be updating my performance testing regime to get results more quickly, so
will be reporting back at some stage. It remains the case that malloc / free
were taking at least 5% of overall CPU with short packets (and btw most of that
is free(): as a rule of thumb I have found free cpu = 3 * malloc cpu)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH libnetfilter_queue v2 0/1] New pktb_make() function
  2020-01-10  2:27     ` Duncan Roe
@ 2020-01-13 18:51       ` Pablo Neira Ayuso
  2020-01-27  2:11         ` Duncan Roe
  0 siblings, 1 reply; 19+ messages in thread
From: Pablo Neira Ayuso @ 2020-01-13 18:51 UTC (permalink / raw)
  To: Netfilter Development

On Fri, Jan 10, 2020 at 01:27:57PM +1100, Duncan Roe wrote:
> On Wed, Jan 08, 2020 at 11:53:23PM +0100, Pablo Neira Ayuso wrote:
> > On Mon, Jan 06, 2020 at 02:17:13PM +1100, Duncan Roe wrote:
> > > This patch offers a faster alternative / replacement function to pktb_alloc().
> > >
> > > pktb_make() is a copy of the first part of pktb_alloc() modified to use a
> > > supplied buffer rather than calloc() one. It then calls the second part of
> > > pktb_alloc() which is modified to be a static function.
> > >
> > > Can't think of a use case where one would choose to use pktb_alloc over
> > > pktb_make.
> > > In a furure documentation update, might relegate pktb_alloc and pktb_free to
> > > "other functions".
> >
> > This is very useful.
> >
> > Would you explore something looking similar to what I'm attaching?
> >
> > Warning: Compile tested only :-)
> >
> > Thanks.
> 
> > diff --git a/src/extra/pktbuff.c b/src/extra/pktbuff.c
> > index 6250fbf3ac8b..985bb48ac986 100644
> > --- a/src/extra/pktbuff.c
> > +++ b/src/extra/pktbuff.c
> > @@ -29,6 +29,58 @@
> >   * @{
> >   */
> >
> > +static struct pkt_buff *__pktb_alloc(int family, void *data, size_t len,
> > +				     size_t extra)
> > +{
> > +	struct pkt_buff *pktb;
> > +
> > +	pktb = calloc(1, sizeof(struct pkt_buff) + len + extra);
> > +	if (pktb == NULL)
> > +		return NULL;
> > +
> > +	return pktb;
> > +}
> > +
> > +static int pktb_setup_family(struct pkt_buff *pktb, int family)
> > +{
> > +	switch(family) {
> > +	case AF_INET:
> > +	case AF_INET6:
> > +		pktb->network_header = pktb->data;
> > +		break;
> > +	case AF_BRIDGE: {
> > +		struct ethhdr *ethhdr = (struct ethhdr *)pktb->data;
> > +
> > +		pktb->mac_header = pktb->data;
> > +
> > +		switch(ethhdr->h_proto) {
> > +		case ETH_P_IP:
> > +		case ETH_P_IPV6:
> > +			pktb->network_header = pktb->data + ETH_HLEN;
> > +			break;
> > +		default:
> > +			/* This protocol is unsupported. */
> > +			errno = EPROTONOSUPPORT;
> > +			return -1;
> > +		}
> > +		break;
> > +		}
> 
> GRR! I just wasted 20 minutes looking at these last 3 lines. Fix the indentation
> 
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static void pktb_setup_metadata(struct pkt_buff *pktb, void *pkt_data,
> > +				size_t len, size_t extra)
> > +{
> > +	pktb->len = len;
> 
> You know, we only need any 2 of len, data and tail. This has caused bugs in the
> past, e.g. commit 8a4316f31. In the code, len seems to be the least used - will
> see if I can't get rid of it.
> 
> > +	pktb->data_len = len + extra;
> > +
> > +	pktb->head = pkt_data;
> > +	pktb->data = pkt_data;
> > +	pktb->tail = pktb->head + len;
> > +}
> > +
> >  /**
> >   * pktb_alloc - allocate a new packet buffer
> >   * \param family Indicate what family. Currently supported families are
> > @@ -54,45 +106,41 @@ struct pkt_buff *pktb_alloc(int family, void *data, size_t len, size_t extra)
> >  	struct pkt_buff *pktb;
> >  	void *pkt_data;
> >
> > -	pktb = calloc(1, sizeof(struct pkt_buff) + len + extra);
> > -	if (pktb == NULL)
> > +	pktb = __pktb_alloc(family, data, len, extra);
> > +	if (!pktb)
> >  		return NULL;
> >
> >  	/* Better make sure alignment is correct. */
> >  	pkt_data = (uint8_t *)pktb + sizeof(struct pkt_buff);
> >  	memcpy(pkt_data, data, len);
> >
> > -	pktb->len = len;
> > -	pktb->data_len = len + extra;
> > +	pktb_setup_metadata(pktb, pkt_data, len, extra);
> >
> > -	pktb->head = pkt_data;
> > -	pktb->data = pkt_data;
> > -	pktb->tail = pktb->head + len;
> > +	if (pktb_setup_family(pktb, family) < 0) {
> > +		free(pktb);
> > +		return NULL;
> > +	}
> >
> > -	switch(family) {
> > -	case AF_INET:
> > -	case AF_INET6:
> > -		pktb->network_header = pktb->data;
> > -		break;
> > -	case AF_BRIDGE: {
> > -		struct ethhdr *ethhdr = (struct ethhdr *)pktb->data;
> > +	return pktb;
> > +}
> >
> > -		pktb->mac_header = pktb->data;
> > +EXPORT_SYMBOL
> > +struct pkt_buff *pktb_alloc_data(int family, void *data, size_t len)
> > +{
> > +	struct pkt_buff *pktb;
> >
> > -		switch(ethhdr->h_proto) {
> > -		case ETH_P_IP:
> > -		case ETH_P_IPV6:
> > -			pktb->network_header = pktb->data + ETH_HLEN;
> > -			break;
> > -		default:
> > -			/* This protocol is unsupported. */
> > -			errno = EPROTONOSUPPORT;
> > -			free(pktb);
> > -			return NULL;
> > -		}
> > -		break;
> > -	}
> > +	pktb = __pktb_alloc(family, data, 0, 0);
> > +	if (!pktb)
> > +		return NULL;
> > +
> > +	pktb->data = data;
> > +	pktb_setup_metadata(pktb, data, len, 0);
> > +
> > +	if (pktb_setup_family(pktb, family) < 0) {
> > +		free(pktb);
> > +		return NULL;
> >  	}
> > +
> >  	return pktb;
> >  }
> >
> 
> Ok, so this is another approach to reducing CPU time: avoid memcpy of data.
> 
> That's great if you're not mangling content.
> 
> But if you are mangling, beware. pktb now has pointers into the buffer you used
> for receiving from Netlink so you must use a different buffer when sending.

Probably we can store a flag in the pkt_buff structure to say "this
buffer is now ours" so mangling can just trigger a clone via malloc()
+ memcpy() only for that path.

Or you can just document this, although handling this case for the
library would be make it easier to user for users.

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH libnetfilter_queue v2 0/1] New pktb_make() function
  2020-01-08 22:53   ` Pablo Neira Ayuso
  2020-01-10  2:27     ` Duncan Roe
@ 2020-01-27  1:44     ` Duncan Roe
  2020-02-01  6:21     ` [PATCH libnetfilter_queue v2] src: Add faster alternatives to pktb_alloc() Duncan Roe
  2 siblings, 0 replies; 19+ messages in thread
From: Duncan Roe @ 2020-01-27  1:44 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Netfilter Development

Hi Pablo,

On Wed, Jan 08, 2020 at 11:53:23PM +0100, Pablo Neira Ayuso wrote:
> On Mon, Jan 06, 2020 at 02:17:13PM +1100, Duncan Roe wrote:
> > This patch offers a faster alternative / replacement function to pktb_alloc().
> > 
> > pktb_make() is a copy of the first part of pktb_alloc() modified to use a
> > supplied buffer rather than calloc() one. It then calls the second part of
> > pktb_alloc() which is modified to be a static function.
> > 
> > Can't think of a use case where one would choose to use pktb_alloc over
> > pktb_make.
> > In a furure documentation update, might relegate pktb_alloc and pktb_free to
> > "other functions".
> 
> This is very useful.
> 
> Would you explore something looking similar to what I'm attaching?
> 
> Warning: Compile tested only :-)
> 
> Thanks.

> diff --git a/src/extra/pktbuff.c b/src/extra/pktbuff.c
> index 6250fbf3ac8b..985bb48ac986 100644
> --- a/src/extra/pktbuff.c
> +++ b/src/extra/pktbuff.c
> @@ -29,6 +29,58 @@
>   * @{
>   */
>  
> +static struct pkt_buff *__pktb_alloc(int family, void *data, size_t len,
> +				     size_t extra)
> +{
> +	struct pkt_buff *pktb;
> +
> +	pktb = calloc(1, sizeof(struct pkt_buff) + len + extra);
> +	if (pktb == NULL)
> +		return NULL;
> +
> +	return pktb;
> +}
> +
> +static int pktb_setup_family(struct pkt_buff *pktb, int family)
> +{
> +	switch(family) {
> +	case AF_INET:
> +	case AF_INET6:
> +		pktb->network_header = pktb->data;
> +		break;
> +	case AF_BRIDGE: {
> +		struct ethhdr *ethhdr = (struct ethhdr *)pktb->data;
> +
> +		pktb->mac_header = pktb->data;
> +
> +		switch(ethhdr->h_proto) {
> +		case ETH_P_IP:
> +		case ETH_P_IPV6:
> +			pktb->network_header = pktb->data + ETH_HLEN;
> +			break;
> +		default:
> +			/* This protocol is unsupported. */
> +			errno = EPROTONOSUPPORT;
> +			return -1;
> +		}
> +		break;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static void pktb_setup_metadata(struct pkt_buff *pktb, void *pkt_data,
> +				size_t len, size_t extra)
> +{
> +	pktb->len = len;
> +	pktb->data_len = len + extra;
> +
> +	pktb->head = pkt_data;
> +	pktb->data = pkt_data;
> +	pktb->tail = pktb->head + len;
> +}
> +
>  /**
>   * pktb_alloc - allocate a new packet buffer
>   * \param family Indicate what family. Currently supported families are
> @@ -54,45 +106,41 @@ struct pkt_buff *pktb_alloc(int family, void *data, size_t len, size_t extra)
>  	struct pkt_buff *pktb;
>  	void *pkt_data;
>  
> -	pktb = calloc(1, sizeof(struct pkt_buff) + len + extra);
> -	if (pktb == NULL)
> +	pktb = __pktb_alloc(family, data, len, extra);
> +	if (!pktb)
>  		return NULL;
>  
>  	/* Better make sure alignment is correct. */
>  	pkt_data = (uint8_t *)pktb + sizeof(struct pkt_buff);
>  	memcpy(pkt_data, data, len);
>  
> -	pktb->len = len;
> -	pktb->data_len = len + extra;
> +	pktb_setup_metadata(pktb, pkt_data, len, extra);
>  
> -	pktb->head = pkt_data;
> -	pktb->data = pkt_data;
> -	pktb->tail = pktb->head + len;
> +	if (pktb_setup_family(pktb, family) < 0) {
> +		free(pktb);
> +		return NULL;
> +	}
>  
> -	switch(family) {
> -	case AF_INET:
> -	case AF_INET6:
> -		pktb->network_header = pktb->data;
> -		break;
> -	case AF_BRIDGE: {
> -		struct ethhdr *ethhdr = (struct ethhdr *)pktb->data;
> +	return pktb;
> +}
>  
> -		pktb->mac_header = pktb->data;
> +EXPORT_SYMBOL
> +struct pkt_buff *pktb_alloc_data(int family, void *data, size_t len)
> +{
> +	struct pkt_buff *pktb;
>  
> -		switch(ethhdr->h_proto) {
> -		case ETH_P_IP:
> -		case ETH_P_IPV6:
> -			pktb->network_header = pktb->data + ETH_HLEN;
> -			break;
> -		default:
> -			/* This protocol is unsupported. */
> -			errno = EPROTONOSUPPORT;
> -			free(pktb);
> -			return NULL;
> -		}
> -		break;
> -	}
> +	pktb = __pktb_alloc(family, data, 0, 0);
> +	if (!pktb)
> +		return NULL;
> +
> +	pktb->data = data;
> +	pktb_setup_metadata(pktb, data, len, 0);
> +
> +	if (pktb_setup_family(pktb, family) < 0) {
> +		free(pktb);
> +		return NULL;
>  	}
> +
>  	return pktb;
>  }
>  

The results are in. The spreadsheet is available at
https://github.com/duncan-roe/nfq/blob/master/nfq6/nfq6_timings.ods, but I'll
summarise them here.

Four functions are compared: pktb_alloc(), pktb_alloc_data(), pktb_make() and
pktb_make_data(). The timings for pktb_alloc & pktb_alloc_data include a call to
pktb_free(). When using pktb_make, one must not call pktb_free.

The test is to time calling the function (pair) 100000000 (1e8) times. In all
cases, this takes < 60 seconds. Two series of tests: one with 50-byte packets
(udp/IPv6, 2 data chars) and the other with 1500-byte (max MTU) packets.

Part-way through testing, pktb_make was improved to not zeroise the area into
which packet data will be copied. This is actually a tiny performance hit at
50-byte, could determine a crossover point at which to zeroise in 1 go but
didn't consider the extra complexity to be warrantied.

Here's a table of percentages:

bytes|pktb_alloc|pktb_alloc_data|pktb_make|pktb_make_data
-----|----------|---------------|---------|--------------
   50|       100|           56.2|     50.8|          28.9
-----|----------|---------------|---------|--------------
 1500|       100|           22.9|     59.6|          11.5

I have yet to do the doxygen documentation for the _data variants, then can send
in the code.

Cheers ... Duncan.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH libnetfilter_queue v2 0/1] New pktb_make() function
  2020-01-13 18:51       ` Pablo Neira Ayuso
@ 2020-01-27  2:11         ` Duncan Roe
  0 siblings, 0 replies; 19+ messages in thread
From: Duncan Roe @ 2020-01-27  2:11 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Netfilter Development

On Mon, Jan 13, 2020 at 07:51:50PM +0100, Pablo Neira Ayuso wrote:
> On Fri, Jan 10, 2020 at 01:27:57PM +1100, Duncan Roe wrote:
> > On Wed, Jan 08, 2020 at 11:53:23PM +0100, Pablo Neira Ayuso wrote:
> > > On Mon, Jan 06, 2020 at 02:17:13PM +1100, Duncan Roe wrote:
> > > > This patch offers a faster alternative / replacement function to pktb_alloc().
> > > >
> > > > pktb_make() is a copy of the first part of pktb_alloc() modified to use a
> > > > supplied buffer rather than calloc() one. It then calls the second part of
> > > > pktb_alloc() which is modified to be a static function.
> > > >
> > > > Can't think of a use case where one would choose to use pktb_alloc over
> > > > pktb_make.
> > > > In a furure documentation update, might relegate pktb_alloc and pktb_free to
> > > > "other functions".
> > >
> > > This is very useful.
> > >
> > > Would you explore something looking similar to what I'm attaching?
> > >
> > > Warning: Compile tested only :-)
> > >
> > > Thanks.
> >
[...]
> >
> > Ok, so this is another approach to reducing CPU time: avoid memcpy of data.
> >
> > That's great if you're not mangling content.
> >
> > But if you are mangling, beware. pktb now has pointers into the buffer you used
> > for receiving from Netlink so you must use a different buffer when sending.
>
> Probably we can store a flag in the pkt_buff structure to say "this
> buffer is now ours" so mangling can just trigger a clone via malloc()
> + memcpy() only for that path.

Not for pktb_make: we never call pktb_free.
>
> Or you can just document this, although handling this case for the
> library would be make it easier to user for users.

ATM I have to document that you must use separate Rx & Tx netlink buffers if
mangling at all, because otherwise you may get overlapping data moves or worse.

So I'm inclined to document about no length increase. But ...

There *is* room at the end of the buffer, as long as the received Netlink
message didn't fill it. We just need some way to know about it - maybe extra
args to the _data functions and maybe as well more _data helper functions to
keep it easy to use.

As well, I've been toying with the idea of not copying data out of the Rx buffer
at all, but rather building the verdict message around it. I have in mind a
user-space-only header to go at the front of the buffer, to be created before
receiving into it.

There's still the copies out from and back into kernel space - guess we're stuck
with those.

Cheers ... Duncan.
>
> Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH libnetfilter_queue v2] src: Add faster alternatives to pktb_alloc()
  2020-01-08 22:53   ` Pablo Neira Ayuso
  2020-01-10  2:27     ` Duncan Roe
  2020-01-27  1:44     ` Duncan Roe
@ 2020-02-01  6:21     ` Duncan Roe
  2020-02-07 22:39       ` Duncan Roe
  2020-02-19 18:04       ` Pablo Neira Ayuso
  2 siblings, 2 replies; 19+ messages in thread
From: Duncan Roe @ 2020-02-01  6:21 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel

Functions pktb_alloc_data, pktb_make and pktb_make_data are defined.
The pktb_make pair are syggested as replacements for the pktb_alloc (now) pair
because they are always faster.

- Add prototypes to include/libnetfilter_queue/pktbuff.h
- Add pktb_alloc_data much as per Pablo's email of Wed, 8 Jan 2020
  speedup: point to packet data in netlink receive buffer rather than copy to
           area immediately following pktb struct
- Add pktb_make much like pktb_usebuf proposed on 10 Dec 2019
  2 sppedups: 1. Use an existing buffer rather than calloc and (later) free one.
              2. Only zero struct and extra parts of pktb - the rest is
                 overwritten by copy (calloc has to zero the lot).
- Add pktb_make_data
  3 speedups: All of the above
- Document the new functions
- Move pktb_alloc and pktb_alloc_data into the "other functions" group since
  they are slower than the "make" equivalent functions

Signed-off-by: Duncan Roe <duncan_roe@optusnet.com.au>
---
 include/libnetfilter_queue/pktbuff.h |   3 +
 src/extra/pktbuff.c                  | 296 ++++++++++++++++++++++++++++++-----
 2 files changed, 261 insertions(+), 38 deletions(-)

diff --git a/include/libnetfilter_queue/pktbuff.h b/include/libnetfilter_queue/pktbuff.h
index 42bc153..fc6bf01 100644
--- a/include/libnetfilter_queue/pktbuff.h
+++ b/include/libnetfilter_queue/pktbuff.h
@@ -4,6 +4,9 @@
 struct pkt_buff;
 
 struct pkt_buff *pktb_alloc(int family, void *data, size_t len, size_t extra);
+struct pkt_buff *pktb_alloc_data(int family, void *data, size_t len);
+struct pkt_buff *pktb_make(int family, void *data, size_t len, size_t extra, void *buf, size_t bufsize);
+struct pkt_buff *pktb_make_data(int family, void *data, size_t len, void *buf, size_t bufsize);
 void pktb_free(struct pkt_buff *pktb);
 
 uint8_t *pktb_data(struct pkt_buff *pktb);
diff --git a/src/extra/pktbuff.c b/src/extra/pktbuff.c
index 6dd0ca9..cfd9f15 100644
--- a/src/extra/pktbuff.c
+++ b/src/extra/pktbuff.c
@@ -29,6 +29,59 @@
  * @{
  */
 
+static struct pkt_buff *__pktb_alloc(size_t len, size_t extra)
+{
+	struct pkt_buff *pktb;
+
+	pktb = calloc(1, sizeof(struct pkt_buff) + len + extra);
+
+	return pktb;
+}
+
+static int pktb_setup_family(struct pkt_buff *pktb, int family)
+{
+	struct ethhdr *ethhdr;
+
+	switch(family) {
+	case AF_INET:
+	case AF_INET6:
+		pktb->network_header = pktb->data;
+		break;
+	case AF_BRIDGE:
+		ethhdr = (struct ethhdr *)pktb->data;
+
+		pktb->mac_header = pktb->data;
+
+		switch(ethhdr->h_proto) {
+		case ETH_P_IP:
+		case ETH_P_IPV6:
+			pktb->network_header = pktb->data + ETH_HLEN;
+			break;
+		default:
+			/* This protocol is unsupported. */
+			errno = EPROTONOSUPPORT;
+			return -1;
+		}
+		break;
+	}
+
+	return 0;
+}
+
+static void pktb_setup_metadata(struct pkt_buff *pktb, void *pkt_data,
+				size_t len, size_t extra)
+{
+	pktb->len = len;
+	pktb->data_len = len + extra;
+
+	pktb->data = pkt_data;
+}
+
+/**
+ * \addtogroup otherfns
+ * @{
+ */
+
 /**
  * pktb_alloc - allocate a new packet buffer
  * \param family Indicate what family. Currently supported families are
@@ -52,10 +105,9 @@ EXPORT_SYMBOL
 struct pkt_buff *pktb_alloc(int family, void *data, size_t len, size_t extra)
 {
 	struct pkt_buff *pktb;
-	struct ethhdr *ethhdr;
 	void *pkt_data;
 
-	pktb = calloc(1, sizeof(struct pkt_buff) + len + extra);
+	pktb = __pktb_alloc(len, extra);
 	if (pktb == NULL)
 		return NULL;
 
@@ -63,32 +115,198 @@ struct pkt_buff *pktb_alloc(int family, void *data, size_t len, size_t extra)
 	pkt_data = (uint8_t *)pktb + sizeof(struct pkt_buff);
 	memcpy(pkt_data, data, len);
 
-	pktb->len = len;
-	pktb->data_len = len + extra;
+	pktb_setup_metadata(pktb, pkt_data, len, extra);
 
-	pktb->data = pkt_data;
+	if (pktb_setup_family(pktb, family) < 0) {
+		free(pktb);
+		return NULL;
+	}
 
-	switch(family) {
-	case AF_INET:
-	case AF_INET6:
-		pktb->network_header = pktb->data;
-		break;
-	case AF_BRIDGE:
-		ethhdr = (struct ethhdr *)pktb->data;
-		pktb->mac_header = pktb->data;
+	return pktb;
+}
 
-		switch(ethhdr->h_proto) {
-		case ETH_P_IP:
-		case ETH_P_IPV6:
-			pktb->network_header = pktb->data + ETH_HLEN;
-			break;
-		default:
-			/* This protocol is unsupported. */
-			errno = EPROTONOSUPPORT;
-			free(pktb);
-			return NULL;
-		}
-		break;
+/**
+ * pktb_alloc_data - fast version of pktb_alloc with some restrictions
+ * \param family Indicate what family. Currently supported families are
+ * AF_BRIDGE, AF_INET & AF_INET6.
+ * \param data Pointer to packet data
+ * \param len Packet length
+ *
+ * This function returns a packet buffer that points to the packet data which
+ * remains in its original netlink receive buffer.
+ * This saves the time to copy the data but restricts mangling to leaving the
+ * packet the same size or making it shorter.
+ *
+ * \sa pktb_make_data(), expecially Notes and Warnings
+ */
+EXPORT_SYMBOL
+struct pkt_buff *pktb_alloc_data(int family, void *data, size_t len)
+{
+	struct pkt_buff *pktb;
+
+	pktb = __pktb_alloc(0, 0);
+	if (!pktb)
+		return NULL;
+
+	pktb->data = data;
+	pktb_setup_metadata(pktb, data, len, 0);
+
+	if (pktb_setup_family(pktb, family) < 0) {
+		free(pktb);
+		return NULL;
+	}
+
+	return pktb;
+}
+
+/**
+ * @}
+ */
+
+static struct pkt_buff *__pktb_make(size_t len, size_t extra, void *buf,
+				    size_t bufsize)
+{
+	struct pkt_buff *pktb;
+
+	/* Better make sure alignment is correct. */
+	size_t extra2 =
+		(size_t)buf & 0x3 ? (~((size_t)buf & 0x3) & 0x3) + 1 : 0;
+
+	pktb = buf + extra2;
+	if (extra2 + sizeof(struct pkt_buff) + len + extra > bufsize)
+	{
+		errno = ENOMEM;
+		return NULL;
+	}
+	/* Zero the front bit. memcpy looks after next part */
+	memset(pktb, 0, sizeof(struct pkt_buff) + extra2);
+
+	return pktb;
+}
+
+/**
+ * pktb_make - make a packet buffer from an existing buffer
+ * \param family Indicate what family. Currently supported families are
+ * AF_BRIDGE, AF_INET & AF_INET6.
+ * \param data Pointer to packet data
+ * \param len Packet length
+ * \param extra Extra memory in the tail to be allocated (for mangling)
+ * \param buf Existing buffer to use
+ * \param bufsize Size of _buf_
+ *
+ * This function builds a userspace packet buffer inside a supplied buffer.
+ * The packet buffer contains the packet data and some extra memory room in the
+ * tail (if requested).
+ *
+ * This function uses less CPU cycles than pktb_alloc() + pktb_free().
+ * Users can expect a decrease in overall CPU time in the order of 5%.
+ *
+ * \return Pointer to a userspace packet buffer aligned within _buf_ or NULL on
+ * failure.
+ * \par Errors
+ * __ENOMEM__ _bufsize_ is too small to accomodate _len_
+ * \n
+ * __EPROTONOSUPPORT__ _family_ was __AF_BRIDGE__ and this is not an IP packet
+ * (v4 or v6)
+ * \note Be sure not to __malloc__ _buf_ inside your callback function,
+ * otherwise the performance improvement over pktb_alloc() will be lost.
+ * A static buffer works well in single-threaded programs, especially if
+ * debugging.
+ * Or, declare the buffer as a local (stack) variable in
+ * the callback function.
+ * (Each thread has its own stack of size __uname -s__ (typically 8MB, plenty
+ * of room)).
+ * \warning Do __not__ call pktb_free() on the returned pointer
+ */
+EXPORT_SYMBOL
+struct pkt_buff *pktb_make(int family, void *data, size_t len, size_t extra,
+			     void *buf, size_t bufsize)
+{
+	struct pkt_buff *pktb;
+	void *pkt_data;
+
+	pktb = __pktb_make(len, extra, buf, bufsize);
+
+	if (pktb)
+	{
+		pkt_data = (uint8_t *)pktb + sizeof(struct pkt_buff);
+		memcpy(pkt_data, data, len);
+
+		pktb_setup_metadata(pktb, pkt_data, len, extra);
+		if (pktb_setup_family(pktb, family) < 0)
+			pktb = NULL;
+		if (pktb && extra)
+			memset(pktb_tail(pktb), 0, extra);
+	}
+	return pktb;
+}
+/**
+ * pktb_make_data - fast version of pktb_make with some restrictions
+ * \param family Indicate what family. Currently supported families are
+ * AF_BRIDGE, AF_INET & AF_INET6.
+ * \param data Pointer to packet data
+ * \param len Packet length
+ * \param buf Existing buffer to use
+ * \param bufsize Size of _buf_
+ *
+ * This function builds a userspace packet buffer where the packet data is left
+ * in the original receive buffer and pointed to.
+ * This saves the time to copy the data but restricts mangling to leaving the
+ * packet the same size or making it shorter.
+ *
+ * \par Relative performance timings
+ * The table below shows relative timings for pktb_alloc(), pktb_make(),
+ * pktb_alloc_data() and pktb_make_data().
+ * Timings for the functions with __alloc__ in their name include a call to
+ * __pktb_free__. Tests were done for 50-byte and 1500-byte packets.
+ * <TABLE>
+ * <TR>
+ * <TD><B>Bytes</B></TD>
+ * <TD><B>pktb_alloc</B></TD>
+ * <TD><B>pktb_make</B></TD>
+ * <TD><B>pktb_alloc_data</B></TD>
+ * <TD><B>pktb_make_data</B></TD>
+ * </TR>
+ * <TR>
+ * <TD>50</TD>
+ * <TD>100%</TD>
+ * <TD>51%</TD>
+ * <TD>56%</TD>
+ * <TD>29%</TD>
+ * </TR>
+ * <TR>
+ * <TD>1500</TD>
+ * <TD>100%</TD>
+ * <TD>60%</TD>
+ * <TD>23%</TD>
+ * <TD>12%</TD>
+ * </TR>
+ * </TABLE>
+ *
+ * \note If mangling, one must use different buffers for netlink receive and
+ * send from and to the kernel.
+ * Otherwise, there may be overlapping moves or worse.
+ * examples/nf-queue.c does this anyway.
+ * \note _bufsize_ only needs to be __sizeof(struct pkt_buff)__, since no data
+ * is copied.
+ * \warning The function prototype may change before the next library release,
+ * to accomodate other _data fast function variants
+ * \sa pktb_make(), expecially Notes and Warnings
+ */
+
+EXPORT_SYMBOL
+struct pkt_buff *pktb_make_data(int family, void *data, size_t len, void *buf,
+				size_t bufsize)
+{
+	struct pkt_buff *pktb;
+
+	pktb = __pktb_make(0, 0, buf, bufsize);
+
+	if (pktb)
+	{
+		pktb_setup_metadata(pktb, data, len, 0);
+		if (pktb_setup_family(pktb, family) < 0)
+			pktb = NULL;
 	}
 	return pktb;
 }
@@ -121,30 +339,32 @@ uint32_t pktb_len(struct pkt_buff *pktb)
 	return pktb->len;
 }
 
-/**
- * pktb_free - release packet buffer
- * \param pktb Pointer to userspace packet buffer
- */
-EXPORT_SYMBOL
-void pktb_free(struct pkt_buff *pktb)
-{
-	free(pktb);
-}
-
 /**
  * \defgroup otherfns Other functions
  *
  * The library provides a number of other functions which many user-space
- * programs will never need. These divide into 2 groups:
+ * programs will never need. These divide into 3 groups:
  * \n
  * 1. Functions to get values of members of opaque __struct pktbuff__, described
  * below
  * \n
- * 2. Internal functions, described in Module __Internal functions__
+ * 2. Slower functions that __malloc__ and __free__ the memory to make a buffer
+ * \n
+ * 3. Internal functions, described in Module __Internal functions__
  *
  * @{
  */
 
+/**
+ * pktb_free - release packet buffer
+ * \param pktb Pointer to userspace packet buffer
+ */
+EXPORT_SYMBOL
+void pktb_free(struct pkt_buff *pktb)
+{
+	free(pktb);
+}
+
 /**
  * \defgroup uselessfns Internal functions
  *
@@ -195,7 +415,7 @@ void pktb_put(struct pkt_buff *pktb, unsigned int len)
 /**
  * pktb_trim - set new length for this packet buffer
  * \param pktb Pointer to userspace packet buffer
- * \param len New packet length (tail is adjusted to reflect this)
+ * \param len New packet length
  */
 EXPORT_SYMBOL
 void pktb_trim(struct pkt_buff *pktb, unsigned int len)
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH libnetfilter_queue v2] src: Add faster alternatives to pktb_alloc()
  2020-02-01  6:21     ` [PATCH libnetfilter_queue v2] src: Add faster alternatives to pktb_alloc() Duncan Roe
@ 2020-02-07 22:39       ` Duncan Roe
  2020-02-19 18:04       ` Pablo Neira Ayuso
  1 sibling, 0 replies; 19+ messages in thread
From: Duncan Roe @ 2020-02-07 22:39 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Netfilter Development

Hi Pablo,

On Sat, Feb 01, 2020 at 05:21:27PM +1100, Duncan Roe wrote:
> Functions pktb_alloc_data, pktb_make and pktb_make_data are defined.
> The pktb_make pair are syggested as replacements for the pktb_alloc (now) pair
> because they are always faster.
>
> - Add prototypes to include/libnetfilter_queue/pktbuff.h
> - Add pktb_alloc_data much as per Pablo's email of Wed, 8 Jan 2020
>   speedup: point to packet data in netlink receive buffer rather than copy to
>            area immediately following pktb struct
> - Add pktb_make much like pktb_usebuf proposed on 10 Dec 2019
>   2 sppedups: 1. Use an existing buffer rather than calloc and (later) free one.
>               2. Only zero struct and extra parts of pktb - the rest is
>                  overwritten by copy (calloc has to zero the lot).
> - Add pktb_make_data
>   3 speedups: All of the above
> - Document the new functions
> - Move pktb_alloc and pktb_alloc_data into the "other functions" group since
>   they are slower than the "make" equivalent functions
>
> Signed-off-by: Duncan Roe <duncan_roe@optusnet.com.au>
[...]

Did you notice this? I guess it should have been v3 really.

I'd really appreciate some feedback soon, if you or whoever could find the time.

Cheers ... Duncan.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH libnetfilter_queue v2] src: Add faster alternatives to pktb_alloc()
  2020-02-01  6:21     ` [PATCH libnetfilter_queue v2] src: Add faster alternatives to pktb_alloc() Duncan Roe
  2020-02-07 22:39       ` Duncan Roe
@ 2020-02-19 18:04       ` Pablo Neira Ayuso
  2020-02-20 23:22         ` Duncan Roe
                           ` (3 more replies)
  1 sibling, 4 replies; 19+ messages in thread
From: Pablo Neira Ayuso @ 2020-02-19 18:04 UTC (permalink / raw)
  To: Duncan Roe; +Cc: netfilter-devel

On Sat, Feb 01, 2020 at 05:21:27PM +1100, Duncan Roe wrote:
> Functions pktb_alloc_data, pktb_make and pktb_make_data are defined.
> The pktb_make pair are syggested as replacements for the pktb_alloc (now) pair
> because they are always faster.
> 
> - Add prototypes to include/libnetfilter_queue/pktbuff.h
> - Add pktb_alloc_data much as per Pablo's email of Wed, 8 Jan 2020
>   speedup: point to packet data in netlink receive buffer rather than copy to
>            area immediately following pktb struct
> - Add pktb_make much like pktb_usebuf proposed on 10 Dec 2019
>   2 sppedups: 1. Use an existing buffer rather than calloc and (later) free one.
>               2. Only zero struct and extra parts of pktb - the rest is
>                  overwritten by copy (calloc has to zero the lot).
> - Add pktb_make_data
>   3 speedups: All of the above
> - Document the new functions
> - Move pktb_alloc and pktb_alloc_data into the "other functions" group since
>   they are slower than the "make" equivalent functions
> 
> Signed-off-by: Duncan Roe <duncan_roe@optusnet.com.au>
> ---
>  include/libnetfilter_queue/pktbuff.h |   3 +
>  src/extra/pktbuff.c                  | 296 ++++++++++++++++++++++++++++++-----
>  2 files changed, 261 insertions(+), 38 deletions(-)
> 
> diff --git a/include/libnetfilter_queue/pktbuff.h b/include/libnetfilter_queue/pktbuff.h
> index 42bc153..fc6bf01 100644
> --- a/include/libnetfilter_queue/pktbuff.h
> +++ b/include/libnetfilter_queue/pktbuff.h
> @@ -4,6 +4,9 @@
>  struct pkt_buff;
>  
>  struct pkt_buff *pktb_alloc(int family, void *data, size_t len, size_t extra);
> +struct pkt_buff *pktb_alloc_data(int family, void *data, size_t len);
> +struct pkt_buff *pktb_make(int family, void *data, size_t len, size_t extra, void *buf, size_t bufsize);
> +struct pkt_buff *pktb_make_data(int family, void *data, size_t len, void *buf, size_t bufsize);

Hm, when I delivered the patch to you, I forgot that you main point
was that you wanted to skip the memory allocation.

I wonder if all these new functions can be consolidated into one
single function, something like:

        struct pkt_buff *pktb_alloc2(int family, void *head, size_t head_size, void *data, size_t len, size_t extra);

The idea is that:

* head is the memory area that is large enough for the struct pkt_buff
  (metadata). You can add a new pktb_head_size() function that returns
  the size of opaque struct pkt_buff structure (whose layout is not
  exposed to the user). I think you can this void *buf in your pktb_make
  function.

* data is the memory area to store the network packet itself.

Then, you can allocate head and data in the stack and skip
malloc/calloc.

Would this work for you? I would prefer if there is just one single
advanced function to do this.

Thanks for your patience.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH libnetfilter_queue v2] src: Add faster alternatives to pktb_alloc()
  2020-02-19 18:04       ` Pablo Neira Ayuso
@ 2020-02-20 23:22         ` Duncan Roe
  2020-02-20 23:50           ` Duncan Roe
  2020-04-06  6:17         ` Duncan Roe
                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 19+ messages in thread
From: Duncan Roe @ 2020-02-20 23:22 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Netfilter Development

Hi Pablo,

On Wed, Feb 19, 2020 at 07:04:10PM +0100, Pablo Neira Ayuso wrote:
> On Sat, Feb 01, 2020 at 05:21:27PM +1100, Duncan Roe wrote:
[...]
> >  struct pkt_buff *pktb_alloc(int family, void *data, size_t len, size_t extra);
> > +struct pkt_buff *pktb_alloc_data(int family, void *data, size_t len);
> > +struct pkt_buff *pktb_make(int family, void *data, size_t len, size_t extra, void *buf, size_t bufsize);
> > +struct pkt_buff *pktb_make_data(int family, void *data, size_t len, void *buf, size_t bufsize);
>
> Hm, when I delivered the patch to you, I forgot that you main point
> was that you wanted to skip the memory allocation.
>
> I wonder if all these new functions can be consolidated into one
> single function, something like:
>
>         struct pkt_buff *pktb_alloc2(int family, void *head, size_t head_size, void *data, size_t len, size_t extra);
>
> The idea is that:
>
> * head is the memory area that is large enough for the struct pkt_buff
>   (metadata). You can add a new pktb_head_size() function that returns
>   the size of opaque struct pkt_buff structure (whose layout is not
>   exposed to the user). I think you can this void *buf in your pktb_make
>   function.
>
> * data is the memory area to store the network packet itself.
>
> Then, you can allocate head and data in the stack and skip
> malloc/calloc.
>
> Would this work for you? I would prefer if there is just one single
> advanced function to do this.
>
> Thanks for your patience.

This patch set is not the last word. It would really help my development process
if you could just apply the patch set as-is.

> I would prefer if there is just one single advanced function ...

There *is* only 1 "advanced" function listed on the "User-space network packet
buffer" web page: pktb_make_data.

pktb_alloc must be kept for legacy support but it's documented on the "Other
functions" page.

pktb_alloc_data was only written for the benefit of timing tests. I can send a
patch to withdraw it as soon as this set is accepted.

Or, I can submit v4 with pktb_alloc_data removed. But only if you agree you will
then commit the patch.

> * data is the memory area to store the network packet itself.

Eh?? The plan is to leave the data where it is. pktb_make_data does that, it's
fine to use unless mangling could increase the packet size.

A more advance set of functions could lift that restriction. I'm still keen to
investigate the savings to be had by not having to move packet data for an
advanced variant of nfq_nlmsg_verdict_put_pkt. The idea is to use a new
structure (essentially metadata plus a struct nlmsghdr) which I have tentatively
named struct nlmsg_buffer. A number of advanced funtion variants would use it in
place of struct nlmsghdr. Please LMK if you would be interested in this.

> Thanks for your patience.

I have been working on something else :/

I had to put libnetfilter_queue development on hold because juggling 3 branches
was just getting to be too much.

(others are https://www.spinics.net/lists/netfilter-devel/msg65661.html (man
pages) and https://www.spinics.net/lists/netfilter-devel/msg65585.html (add more
helper functions to simplify coding)).

I can send a man pages update as soon as you commit something.

Cheers ... Duncan.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH libnetfilter_queue v2] src: Add faster alternatives to pktb_alloc()
  2020-02-20 23:22         ` Duncan Roe
@ 2020-02-20 23:50           ` Duncan Roe
  0 siblings, 0 replies; 19+ messages in thread
From: Duncan Roe @ 2020-02-20 23:50 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Netfilter Development

On Fri, Feb 21, 2020 at 10:22:29AM +1100, Duncan Roe wrote:
> Hi Pablo,
>
> On Wed, Feb 19, 2020 at 07:04:10PM +0100, Pablo Neira Ayuso wrote:
[...]
> I had to put libnetfilter_queue development on hold because
> juggling 3 branches was just getting to be too much.
>
> (others are https://www.spinics.net/lists/netfilter-devel/msg65661.html (man
> pages) and https://www.spinics.net/lists/netfilter-devel/msg65585.html (add
> more helper functions to simplify coding)).
>
Please see https://www.spinics.net/lists/netfilter-devel/msg65584.html for a
roadmap of the new helper functions. Another plus for the new functions would
be that they would all have man pages (as long as you take that patch ;)

Cheers ... Duncan.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH libnetfilter_queue v2] src: Add faster alternatives to pktb_alloc()
  2020-02-19 18:04       ` Pablo Neira Ayuso
  2020-02-20 23:22         ` Duncan Roe
@ 2020-04-06  6:17         ` Duncan Roe
  2020-04-11  7:24         ` [PATCH libnetfilter_queue 0/1] src & doc: pktb_alloc2 Duncan Roe
  2020-04-11  7:24         ` [PATCH libnetfilter_queue 1/1] New faster pktb_alloc2 replaces pktb_alloc & pktb_free Duncan Roe
  3 siblings, 0 replies; 19+ messages in thread
From: Duncan Roe @ 2020-04-06  6:17 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Netfilter Development

Hi Pablo,

On Wed, Feb 19, 2020 at 07:04:10PM +0100, Pablo Neira Ayuso wrote:
> On Sat, Feb 01, 2020 at 05:21:27PM +1100, Duncan Roe wrote:
> > Functions pktb_alloc_data, pktb_make and pktb_make_data are defined.
> > The pktb_make pair are syggested as replacements for the pktb_alloc (now) pair
> > because they are always faster.
> >
> > - Add prototypes to include/libnetfilter_queue/pktbuff.h
> > - Add pktb_alloc_data much as per Pablo's email of Wed, 8 Jan 2020
> >   speedup: point to packet data in netlink receive buffer rather than copy to
> >            area immediately following pktb struct
> > - Add pktb_make much like pktb_usebuf proposed on 10 Dec 2019
> >   2 sppedups: 1. Use an existing buffer rather than calloc and (later) free one.
> >               2. Only zero struct and extra parts of pktb - the rest is
> >                  overwritten by copy (calloc has to zero the lot).
> > - Add pktb_make_data
> >   3 speedups: All of the above
> > - Document the new functions
> > - Move pktb_alloc and pktb_alloc_data into the "other functions" group since
> >   they are slower than the "make" equivalent functions
> >
> > Signed-off-by: Duncan Roe <duncan_roe@optusnet.com.au>
> > ---
> >  include/libnetfilter_queue/pktbuff.h |   3 +
> >  src/extra/pktbuff.c                  | 296 ++++++++++++++++++++++++++++++-----
> >  2 files changed, 261 insertions(+), 38 deletions(-)
> >
> > diff --git a/include/libnetfilter_queue/pktbuff.h b/include/libnetfilter_queue/pktbuff.h
> > index 42bc153..fc6bf01 100644
> > --- a/include/libnetfilter_queue/pktbuff.h
> > +++ b/include/libnetfilter_queue/pktbuff.h
> > @@ -4,6 +4,9 @@
> >  struct pkt_buff;
> >
> >  struct pkt_buff *pktb_alloc(int family, void *data, size_t len, size_t extra);
> > +struct pkt_buff *pktb_alloc_data(int family, void *data, size_t len);
> > +struct pkt_buff *pktb_make(int family, void *data, size_t len, size_t extra, void *buf, size_t bufsize);
> > +struct pkt_buff *pktb_make_data(int family, void *data, size_t len, void *buf, size_t bufsize);
>
> Hm, when I delivered the patch to you, I forgot that you main point
> was that you wanted to skip the memory allocation.
>
> I wonder if all these new functions can be consolidated into one
> single function, something like:
>
>         struct pkt_buff *pktb_alloc2(int family, void *head, size_t head_size, void *data, size_t len, size_t extra);
>
> The idea is that:
>
> * head is the memory area that is large enough for the struct pkt_buff
>   (metadata). You can add a new pktb_head_size() function that returns
>   the size of opaque struct pkt_buff structure (whose layout is not
>   exposed to the user). I think you can this void *buf in your pktb_make
>   function.
>
> * data is the memory area to store the network packet itself.

Wait, you need data & len to describe where the data is *now*. You need an extra
buf, buflen pair for where to put it.
>
> Then, you can allocate head and data in the stack and skip
> malloc/calloc.
>
> Would this work for you? I would prefer if there is just one single
> advanced function to do this.
>
> Thanks for your patience.

I think I can do as you requested.

Just one thing: do you really think pktb_alloc2 is a good name when users *must
not* call pktb_free? pktb_put would have been good but it's already taken.

Perhaps pktb_alloc2 could set a flag for pktb_free to become a no-op and maybe
issue a warning the first time it is called - would that be worth doing?

In pktb_alloc2, if extra == 0 then buf can be NULL and buflen 0, since the data
will be left in place. That way, we get a single entry point doing as much
optimising as possible.

Cheers ... Duncan.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH libnetfilter_queue 0/1] src & doc: pktb_alloc2
  2020-02-19 18:04       ` Pablo Neira Ayuso
  2020-02-20 23:22         ` Duncan Roe
  2020-04-06  6:17         ` Duncan Roe
@ 2020-04-11  7:24         ` Duncan Roe
  2020-04-11  7:24         ` [PATCH libnetfilter_queue 1/1] New faster pktb_alloc2 replaces pktb_alloc & pktb_free Duncan Roe
  3 siblings, 0 replies; 19+ messages in thread
From: Duncan Roe @ 2020-04-11  7:24 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel

Hi Pablo,

Herewith pktb_alloc2() as you suggested.
There is no malloc/free and data is only moved if extra > 0.

Cheers ... Duncan.

Duncan Roe (1):
  New faster pktb_alloc2 replaces pktb_alloc & pktb_free

 fixmanpages.sh                       |   6 +-
 include/libnetfilter_queue/pktbuff.h |   2 +
 src/extra/pktbuff.c                  | 204 +++++++++++++++++++++++++++++------
 src/nlmsg.c                          |  14 ++-
 4 files changed, 185 insertions(+), 41 deletions(-)

--
2.14.5


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH libnetfilter_queue 1/1] New faster pktb_alloc2 replaces pktb_alloc & pktb_free
  2020-02-19 18:04       ` Pablo Neira Ayuso
                           ` (2 preceding siblings ...)
  2020-04-11  7:24         ` [PATCH libnetfilter_queue 0/1] src & doc: pktb_alloc2 Duncan Roe
@ 2020-04-11  7:24         ` Duncan Roe
  3 siblings, 0 replies; 19+ messages in thread
From: Duncan Roe @ 2020-04-11  7:24 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel

pktb_alloc2() avoids the malloc()/free() overhead in pktb_alloc().
Additionally, if the "extra" argument is zero, pktb_alloc2() leaves the packet
data in place rather than copying it (mangling is still allowed except to
increae the packet size).
New function pktb_head_size() gives size of opaque struct pkt_buff for
allocating space in the stack.
 - Mark old functions DEPRECATED in the documentation.
 - Update sample code snippet in nfq_nlmsg_verdict_put_pkt documentation.

Signed-off-by: Duncan Roe <duncan_roe@optusnet.com.au>
---
 fixmanpages.sh                       |   6 +-
 include/libnetfilter_queue/pktbuff.h |   2 +
 src/extra/pktbuff.c                  | 204 +++++++++++++++++++++++++++++------
 src/nlmsg.c                          |  14 ++-
 4 files changed, 185 insertions(+), 41 deletions(-)

diff --git a/fixmanpages.sh b/fixmanpages.sh
index dd8b3a4..86f902b 100755
--- a/fixmanpages.sh
+++ b/fixmanpages.sh
@@ -31,11 +31,11 @@ function main
     add2group nfq_nlmsg_verdict_put_mark nfq_nlmsg_verdict_put_pkt
   setgroup nlmsg nfq_nlmsg_parse
     add2group nfq_nlmsg_put
-  setgroup pktbuff pktb_alloc
-    add2group pktb_data pktb_len pktb_mangle pktb_mangled
-    add2group pktb_free
+  setgroup pktbuff pktb_alloc2
+    add2group pktb_data pktb_len pktb_mangle pktb_mangled pktb_head_size
     setgroup otherfns pktb_tailroom
       add2group pktb_mac_header pktb_network_header pktb_transport_header
+      add2group pktb_alloc pktb_free
       setgroup uselessfns pktb_push
         add2group pktb_pull pktb_put pktb_trim
   setgroup tcp nfq_tcp_get_hdr
diff --git a/include/libnetfilter_queue/pktbuff.h b/include/libnetfilter_queue/pktbuff.h
index 42bc153..a71e62b 100644
--- a/include/libnetfilter_queue/pktbuff.h
+++ b/include/libnetfilter_queue/pktbuff.h
@@ -4,6 +4,7 @@
 struct pkt_buff;
 
 struct pkt_buff *pktb_alloc(int family, void *data, size_t len, size_t extra);
+struct pkt_buff *pktb_alloc2(int family, void *head, size_t headsize, void *data, size_t len, size_t extra, void *buf, size_t bufsize);
 void pktb_free(struct pkt_buff *pktb);
 
 uint8_t *pktb_data(struct pkt_buff *pktb);
@@ -22,5 +23,6 @@ uint8_t *pktb_transport_header(struct pkt_buff *pktb);
 int pktb_mangle(struct pkt_buff *pktb, int dataoff, unsigned int match_offset, unsigned int match_len, const char *rep_buffer, unsigned int rep_len);
 
 bool pktb_mangled(const struct pkt_buff *pktb);
+size_t pktb_head_size(void);
 
 #endif
diff --git a/src/extra/pktbuff.c b/src/extra/pktbuff.c
index 6dd0ca9..df97192 100644
--- a/src/extra/pktbuff.c
+++ b/src/extra/pktbuff.c
@@ -29,8 +29,52 @@
  * @{
  */
 
+static int pktb_setup_family(struct pkt_buff *pktb, int family)
+{
+	struct ethhdr *ethhdr;
+
+	switch(family) {
+	case AF_INET:
+	case AF_INET6:
+		pktb->network_header = pktb->data;
+		break;
+	case AF_BRIDGE:
+		ethhdr = (struct ethhdr *)pktb->data;
+
+		pktb->mac_header = pktb->data;
+
+		switch(ethhdr->h_proto) {
+		case ETH_P_IP:
+		case ETH_P_IPV6:
+			pktb->network_header = pktb->data + ETH_HLEN;
+			break;
+		default:
+			/* This protocol is unsupported. */
+			errno = EPROTONOSUPPORT;
+			return -1;
+		}
+		break;
+	}
+
+	return 0;
+}
+
+static void pktb_setup_metadata(struct pkt_buff *pktb, void *pkt_data,
+				size_t len, size_t extra)
+{
+	pktb->len = len;
+	pktb->data_len = len + extra;
+
+	pktb->data = pkt_data;
+}
+
+/**
+ * \addtogroup otherfns
+ * @{
+ */
+
 /**
- * pktb_alloc - allocate a new packet buffer
+ * pktb_alloc - allocate a new packet buffer [DEPRECATED]
  * \param family Indicate what family. Currently supported families are
  * AF_BRIDGE, AF_INET & AF_INET6.
  * \param data Pointer to packet data
@@ -46,13 +90,13 @@
  * \n
  * __EPROTONOSUPPORT__ _family_ was __AF_BRIDGE__ and this is not an IP packet
  * (v4 or v6)
+ * \note __pktb_alloc__ is deprecated. Use pktb_alloc2() in new code
  * \sa __calloc__(3)
  */
 EXPORT_SYMBOL
 struct pkt_buff *pktb_alloc(int family, void *data, size_t len, size_t extra)
 {
 	struct pkt_buff *pktb;
-	struct ethhdr *ethhdr;
 	void *pkt_data;
 
 	pktb = calloc(1, sizeof(struct pkt_buff) + len + extra);
@@ -63,33 +107,113 @@ struct pkt_buff *pktb_alloc(int family, void *data, size_t len, size_t extra)
 	pkt_data = (uint8_t *)pktb + sizeof(struct pkt_buff);
 	memcpy(pkt_data, data, len);
 
-	pktb->len = len;
-	pktb->data_len = len + extra;
+	pktb_setup_metadata(pktb, pkt_data, len, extra);
 
-	pktb->data = pkt_data;
+	if (pktb_setup_family(pktb, family) < 0) {
+		free(pktb);
+		return NULL;
+	}
 
-	switch(family) {
-	case AF_INET:
-	case AF_INET6:
-		pktb->network_header = pktb->data;
-		break;
-	case AF_BRIDGE:
-		ethhdr = (struct ethhdr *)pktb->data;
-		pktb->mac_header = pktb->data;
+	return pktb;
+}
 
-		switch(ethhdr->h_proto) {
-		case ETH_P_IP:
-		case ETH_P_IPV6:
-			pktb->network_header = pktb->data + ETH_HLEN;
-			break;
-		default:
-			/* This protocol is unsupported. */
-			errno = EPROTONOSUPPORT;
-			free(pktb);
-			return NULL;
-		}
-		break;
+/**
+ * @}
+ */
+
+/**
+ * pktb_alloc2 - make a packet buffer from an existing buffer
+ * \param family Indicate what family. Currently supported families are
+ * AF_BRIDGE, AF_INET & AF_INET6.
+ * \param head Buffer to hold packet metadata
+ * \param headsize Size of _head_. Use pktb_head_size() to get the optimum
+ * value
+ * \param data Pointer to packet data
+ * \param len Packet length
+ * \param extra Extra memory in the tail to be allocated (for mangling).
+ * Specify zero (to avoid a data copy) unless mangling may increase packet size
+ * \param buf Buffer to use for packet data copy.
+ * Specify NULL if _extra_ is zero
+ * \param bufsize Size of _buf_. Specify zero if _buf_ is NULL
+ *
+ * This function builds a userspace packet buffer inside a supplied buffer.
+ * The packet buffer contains the packet data and some extra memory room in the
+ * tail (if requested).
+ *
+ * This function uses less CPU cycles than pktb_alloc() + pktb_free().
+ * Users can expect a decrease in overall CPU time in the order of 5%,
+ * 10% if _extra_ is zero.
+ *
+ * \return Pointer to a userspace packet buffer or NULL on failure.
+ * \par Errors
+ * __ENOMEM__ _bufsize_ is too small to accomodate _len_
+ * or _headsize_ is too small to accomodate __struct pkt_buff__
+ * \n
+ * __EPROTONOSUPPORT__ _family_ was __AF_BRIDGE__ and this is not an IP packet
+ * (v4 or v6)
+ * \par Relative performance timings
+ * The table below shows relative timings for pktb_alloc() and pktb_alloc2()
+ * with and without extra bytes.
+ * Tests were done for 50-byte and 1500-byte packets.
+ * \verbatim
+======================================================
+| Bytes | pktb_alloc + | pktb_alloc2, | pktb_alloc2, |
+|       | pktb_free    | extra != 0   | extra == 0   |
+======================================================
+|   50  | 100%         | 51%          | 29%          |
+======================================================
+| 1500  | 100%         | 60%          | 12%          |
+======================================================
+ \endverbatim
+ * \note Be sure not to __malloc__ _buf_ or _head_ inside your callback
+ * function,
+ * otherwise the performance improvement over pktb_alloc() will be lost.
+ * It's suggested to declare the buffers as local (stack) variables in
+ * the callback function.
+ * (Each thread has its own stack of size __uname -s__ (typically 8MB, plenty
+ * of room)).
+ * \n
+ * If debugging a single-threaded program, static buffers may be preferred:
+ * either debug to find __sizeof(struct pkt_buff)__ or allow, say, 64 bytes.
+ * \note If mangling with zero _extra_,
+ * one must use different buffers for netlink receive and
+ * send from and to the kernel.
+ * Otherwise, there may be overlapping moves or worse.
+ * examples/nf-queue.c does this anyway.
+ * \warning Do __not__ call pktb_free() on the returned pointer.
+ * If upgrading existing software to use __pktb_alloc2__,
+ * be sure to remove the __pktb_free__ call.
+ * \sa nfq_nlmsg_verdict_put_pkt() (has sample code using __pktb_alloc2__)
+ */
+EXPORT_SYMBOL
+struct pkt_buff *pktb_alloc2(int family, void *head, size_t headsize,
+			     void *data, size_t len, size_t extra,
+			     void *buf, size_t bufsize)
+{
+	struct pkt_buff *pktb;
+	void *pkt_data;
+
+	if (sizeof(struct pkt_buff) > headsize ||
+		(extra && (len + extra > bufsize)))
+	{
+		errno = ENOMEM;
+		return NULL;
 	}
+	pktb = head;
+	memset(pktb, 0, sizeof(struct pkt_buff));
+
+	if (extra)
+	{
+		pkt_data = buf;
+		memcpy(pkt_data, data, len);
+		memset((uint8_t *)pkt_data + len, 0, extra);
+	}
+	else
+		pkt_data = data;
+
+	pktb_setup_metadata(pktb, pkt_data, len, extra);
+	if (pktb_setup_family(pktb, family) < 0)
+		pktb = NULL;
 	return pktb;
 }
 
@@ -122,29 +246,43 @@ uint32_t pktb_len(struct pkt_buff *pktb)
 }
 
 /**
- * pktb_free - release packet buffer
- * \param pktb Pointer to userspace packet buffer
+ * pktb_head_size - get size of struct pkt_buff
+ * \return Size of (opaque) __struct pkt_buff__
  */
 EXPORT_SYMBOL
-void pktb_free(struct pkt_buff *pktb)
+size_t pktb_head_size(void)
 {
-	free(pktb);
+	return sizeof(struct pkt_buff);
 }
 
 /**
  * \defgroup otherfns Other functions
  *
  * The library provides a number of other functions which many user-space
- * programs will never need. These divide into 2 groups:
+ * programs will never need. These divide into 3 groups:
  * \n
  * 1. Functions to get values of members of opaque __struct pktbuff__, described
  * below
  * \n
- * 2. Internal functions, described in Module __Internal functions__
+ * 2. Slower functions that __malloc__ and __free__ the memory to make a buffer
+ * \n
+ * 3. Internal functions, described in Module __Internal functions__
  *
  * @{
  */
 
+/**
+ * pktb_free - release packet buffer [DEPRECATED]
+ * \param pktb Pointer to userspace packet buffer
+ * \note __pktb_free__ is deprecated.
+ * It is not required and must not be used with pktb_alloc2()
+ */
+EXPORT_SYMBOL
+void pktb_free(struct pkt_buff *pktb)
+{
+	free(pktb);
+}
+
 /**
  * \defgroup uselessfns Internal functions
  *
@@ -195,7 +333,7 @@ void pktb_put(struct pkt_buff *pktb, unsigned int len)
 /**
  * pktb_trim - set new length for this packet buffer
  * \param pktb Pointer to userspace packet buffer
- * \param len New packet length (tail is adjusted to reflect this)
+ * \param len New packet length
  */
 EXPORT_SYMBOL
 void pktb_trim(struct pkt_buff *pktb, unsigned int len)
@@ -306,7 +444,7 @@ static int enlarge_pkt(struct pkt_buff *pktb, unsigned int extra)
  * excess of \b rep_len over \b match_len
  \warning pktb_mangle does not update any checksums. Developers should use the
  appropriate mangler for the protocol level: nfq_ip_mangle(),
- nfq_tcp_mangle_ipv4() or nfq_udp_mangle_ipv4(). IPv6 versions are planned.
+ nfq_tcp_mangle_ipv4(), nfq_udp_mangle_ipv4() or the IPv6 versions of these.
  \n
  It is appropriate to use pktb_mangle to change the MAC header.
  */
diff --git a/src/nlmsg.c b/src/nlmsg.c
index e141156..efc8189 100644
--- a/src/nlmsg.c
+++ b/src/nlmsg.c
@@ -106,26 +106,30 @@ EXPORT_SYMBOL
  *
  * This code snippet uses nfq_udp_mangle_ipv4. See nf-queue.c for
  * context:
- * \verbatim
-// main calls queue_cb (line 64) to process an enqueued packet:
+ * \code {.c}
+// main calls queue_cb (line 49) to process an enqueued packet:
 	// Extra variables
 	uint8_t *payload, *rep_data;
 	unsigned int match_offset, match_len, rep_len;
+	char head[pktb_head_size()];
 
 	// The next line was commented-out (with payload void*)
 	payload = mnl_attr_get_payload(attr[NFQA_PAYLOAD]);
 	// Copy data to a packet buffer (allow 255 bytes for mangling).
-	pktb = pktb_alloc(AF_INET, payload, plen, 255);
+#define EXTRA 255
+	char pktbuf[plen + EXTRA];
+	pktb = pktb_alloc2(AF_INET, head, sizeof head, payload, plen, EXTRA,
+		pktbuf, sizeof pktbuf);
 	// (decide that this packet needs mangling)
 	nfq_udp_mangle_ipv4(pktb, match_offset, match_len, rep_data, rep_len);
 	// nfq_udp_mangle_ipv4 updates packet length, no need to track locally
 
-	// Eventually nfq_send_verdict (line 39) gets called
+	// Eventually nfq_send_verdict (line 24) gets called
 	// The received packet may or may not have been modified.
 	// Add this code before nfq_nlmsg_verdict_put call:
 	if (pktb_mangled(pktb))
 		nfq_nlmsg_verdict_put_pkt(nlh, pktb_data(pktb), pktb_len(pktb));
-\endverbatim
+\endcode
  */
 void nfq_nlmsg_verdict_put_pkt(struct nlmsghdr *nlh, const void *pkt,
 			       uint32_t plen)
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2020-04-11  7:24 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-10 11:26 [PATCH libnetfilter_queue 0/1] New pktb_usebuf() function Duncan Roe
2019-12-10 11:26 ` [PATCH libnetfilter_queue 1/1] src: Add alternative function to pktb_alloc to avoid malloc / free overhead Duncan Roe
2019-12-22  2:09 ` [PATCH libnetfilter_queue 0/1] New pktb_usebuf() function Duncan Roe
2020-01-03  2:47 ` Duncan Roe
2020-01-06  3:17 ` [PATCH libnetfilter_queue v2 0/1] New pktb_make() function Duncan Roe
2020-01-08 22:53   ` Pablo Neira Ayuso
2020-01-10  2:27     ` Duncan Roe
2020-01-13 18:51       ` Pablo Neira Ayuso
2020-01-27  2:11         ` Duncan Roe
2020-01-27  1:44     ` Duncan Roe
2020-02-01  6:21     ` [PATCH libnetfilter_queue v2] src: Add faster alternatives to pktb_alloc() Duncan Roe
2020-02-07 22:39       ` Duncan Roe
2020-02-19 18:04       ` Pablo Neira Ayuso
2020-02-20 23:22         ` Duncan Roe
2020-02-20 23:50           ` Duncan Roe
2020-04-06  6:17         ` Duncan Roe
2020-04-11  7:24         ` [PATCH libnetfilter_queue 0/1] src & doc: pktb_alloc2 Duncan Roe
2020-04-11  7:24         ` [PATCH libnetfilter_queue 1/1] New faster pktb_alloc2 replaces pktb_alloc & pktb_free Duncan Roe
2020-01-06  3:17 ` [PATCH libnetfilter_queue v2 1/1] src: Add alternative function to pktb_alloc to avoid malloc / free overhead Duncan Roe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).