netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC net-next 0/3] docs: document some aspects of struct sk_buff
@ 2022-03-23 23:37 Jakub Kicinski
  2022-03-23 23:37 ` [RFC net-next 1/3] skbuff: add a basic intro doc Jakub Kicinski
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Jakub Kicinski @ 2022-03-23 23:37 UTC (permalink / raw)
  To: davem
  Cc: netdev, pabeni, corbet, imagedong, edumazet, dsahern, talalahmad,
	linux-doc, Jakub Kicinski

I dusted off some old skb documentation patches I had in my tree.
This small set creates a place to render such documentation,
documents one random thing (data-only skbs) and converts the big
checksum comment to kdoc.

Jakub Kicinski (3):
  skbuff: add a basic intro doc
  skbuff: rewrite the doc for data-only skbs
  skbuff: render the checksum comment to documentation

 Documentation/networking/index.rst  |   1 +
 Documentation/networking/skbuff.rst |  37 ++++
 include/linux/skbuff.h              | 292 ++++++++++++++++++----------
 3 files changed, 224 insertions(+), 106 deletions(-)
 create mode 100644 Documentation/networking/skbuff.rst

-- 
2.34.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [RFC net-next 1/3] skbuff: add a basic intro doc
  2022-03-23 23:37 [RFC net-next 0/3] docs: document some aspects of struct sk_buff Jakub Kicinski
@ 2022-03-23 23:37 ` Jakub Kicinski
  2022-03-24 14:16   ` Jonathan Corbet
  2022-03-23 23:37 ` [RFC net-next 2/3] skbuff: rewrite the doc for data-only skbs Jakub Kicinski
  2022-03-23 23:37 ` [RFC net-next 3/3] skbuff: render the checksum comment to documentation Jakub Kicinski
  2 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2022-03-23 23:37 UTC (permalink / raw)
  To: davem
  Cc: netdev, pabeni, corbet, imagedong, edumazet, dsahern, talalahmad,
	linux-doc, Jakub Kicinski

Add basic skb documentation. It's mostly an intro to the subsequent
patches - it would looks strange if we documented advanced topics
without covering the basics in any way.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 Documentation/networking/skbuff.rst | 25 ++++++++++++++++++
 include/linux/skbuff.h              | 40 +++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)
 create mode 100644 Documentation/networking/skbuff.rst

diff --git a/Documentation/networking/skbuff.rst b/Documentation/networking/skbuff.rst
new file mode 100644
index 000000000000..7c6be64f486a
--- /dev/null
+++ b/Documentation/networking/skbuff.rst
@@ -0,0 +1,25 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+struct sk_buff
+==============
+
+:c:type:`struct sk_buff` is the main networking structure representing
+a packet.
+
+Basic sk_buff geometry
+----------------------
+
+.. kernel-doc:: include/linux/skbuff.h
+   :doc: Basic sk_buff geometry
+
+Shared skbs and skb clones
+--------------------------
+
+:c:member:`sk_buff.users` is a simple refcount allowing multiple entities
+to keep a struct sk_buff alive. skbs with a ``sk_buff.users != 1`` are referred
+to as shared skbs (see skb_shared()).
+
+skb_clone() allows for fast duplication of skbs. None of the data buffers
+get copied, but caller gets a new metadata struct (struct sk_buff).
+&skb_shared_info.refcount indicates the number of skbs pointing at the same
+packet data (i.e. clones).
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 3a30cae8b0a5..5431be4aa309 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -764,6 +764,46 @@ typedef unsigned int sk_buff_data_t;
 typedef unsigned char *sk_buff_data_t;
 #endif
 
+/**
+ * DOC: Basic sk_buff geometry
+ *
+ * struct sk_buff itself is a metadata structure and does not hold any packet
+ * data. All the data is held in associated buffers.
+ *
+ * &sk_buff.head points to the main "head" buffer. The head buffer is divided
+ * into two parts:
+ *
+ *  - data buffer, containing headers and sometimes payload data;
+ *    this is the part of the skb operated on by the common helpers
+ *    such as skb_put() or skb_pull();
+ *  - shared info (struct skb_shared_info) which holds an array of pointers
+ *    to read-only payload data in the (page, offset, length) format.
+ *
+ * Optionally &skb_shared_info.frag_list may point to another skb.
+ *
+ * Basic diagram may look like this::
+ *
+ *                                  ---------------
+ *                                 | sk_buff       |
+ *                                  ---------------
+ *     ,---------------------------  + head
+ *    /          ,-----------------  + data
+ *   /          /      ,-----------  + tail
+ *  |          |      |            , + end
+ *  |          |      |           |
+ *  v          v      v           v
+ *   -----------------------------------------------
+ *  | headroom | data |  tailroom | skb_shared_info |
+ *   -----------------------------------------------
+ *                                 + [page frag]
+ *                                 + [page frag]
+ *                                 + [page frag]
+ *                                 + [page frag]       ---------
+ *                                 + frag_list    --> | sk_buff |
+ *                                                     ---------
+ *
+ */
+
 /**
  *	struct sk_buff - socket buffer
  *	@next: Next buffer in list
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC net-next 2/3] skbuff: rewrite the doc for data-only skbs
  2022-03-23 23:37 [RFC net-next 0/3] docs: document some aspects of struct sk_buff Jakub Kicinski
  2022-03-23 23:37 ` [RFC net-next 1/3] skbuff: add a basic intro doc Jakub Kicinski
@ 2022-03-23 23:37 ` Jakub Kicinski
  2022-03-24  8:50   ` Paolo Abeni
  2022-03-23 23:37 ` [RFC net-next 3/3] skbuff: render the checksum comment to documentation Jakub Kicinski
  2 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2022-03-23 23:37 UTC (permalink / raw)
  To: davem
  Cc: netdev, pabeni, corbet, imagedong, edumazet, dsahern, talalahmad,
	linux-doc, Jakub Kicinski

The comment about shinfo->dataref split is really unhelpful,
at least to me. Rewrite it and render it to skb documentation.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 Documentation/networking/index.rst  |  1 +
 Documentation/networking/skbuff.rst |  6 ++++++
 include/linux/skbuff.h              | 33 +++++++++++++++++++----------
 3 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index ce017136ab05..1b3c45add20d 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -96,6 +96,7 @@ Linux Networking Documentation
    sctp
    secid
    seg6-sysctl
+   skbuff
    smc-sysctl
    statistics
    strparser
diff --git a/Documentation/networking/skbuff.rst b/Documentation/networking/skbuff.rst
index 7c6be64f486a..581e5561c362 100644
--- a/Documentation/networking/skbuff.rst
+++ b/Documentation/networking/skbuff.rst
@@ -23,3 +23,9 @@ skb_clone() allows for fast duplication of skbs. None of the data buffers
 get copied, but caller gets a new metadata struct (struct sk_buff).
 &skb_shared_info.refcount indicates the number of skbs pointing at the same
 packet data (i.e. clones).
+
+dataref and headerless skbs
+---------------------------
+
+.. kernel-doc:: include/linux/skbuff.h
+   :doc: dataref and headerless skbs
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 5431be4aa309..5b838350931c 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -691,16 +691,25 @@ struct skb_shared_info {
 	skb_frag_t	frags[MAX_SKB_FRAGS];
 };
 
-/* We divide dataref into two halves.  The higher 16 bits hold references
- * to the payload part of skb->data.  The lower 16 bits hold references to
- * the entire skb->data.  A clone of a headerless skb holds the length of
- * the header in skb->hdr_len.
+/**
+ * DOC: dataref and headerless skbs
+ *
+ * Transport layers send out clones of data skbs they hold for retransmissions.
+ * To allow lower layers of the stack to prepend their headers
+ * we split &skb_shared_info.dataref into two halves.
+ * The lower 16 bits count the overall number of references.
+ * The higher 16 bits indicate number of data-only references.
+ * skb_header_cloned() checks if skb is allowed to add / write the headers.
  *
- * All users must obey the rule that the skb->data reference count must be
- * greater than or equal to the payload reference count.
+ * The creator of the skb (e.g. TCP) marks its data-only skb as &sk_buff.nohdr
+ * (via __skb_header_release()). Any clone created from marked skb will get
+ * &sk_buff.hdr_len populated with the available headroom.
+ * If it's the only clone in existence it's able to modify the headroom at will.
  *
- * Holding a reference to the payload part means that the user does not
- * care about modifications to the header part of skb->data.
+ * This is not a very generic construct and it depends on the transport layers
+ * doing the right thing. In practice there's usually only one data-only skb.
+ * Having multiple data-only skbs with different lengths of hdr_len is not
+ * possible. The data-only skbs should never leave their owner.
  */
 #define SKB_DATAREF_SHIFT 16
 #define SKB_DATAREF_MASK ((1 << SKB_DATAREF_SHIFT) - 1)
@@ -833,7 +842,7 @@ typedef unsigned char *sk_buff_data_t;
  *	@ignore_df: allow local fragmentation
  *	@cloned: Head may be cloned (check refcnt to be sure)
  *	@ip_summed: Driver fed us an IP checksum
- *	@nohdr: Payload reference only, must not modify header
+ *	@nohdr: Data-only skb, must not modify header
  *	@pkt_type: Packet class
  *	@fclone: skbuff clone status
  *	@ipvs_property: skbuff is owned by ipvs
@@ -1962,8 +1971,10 @@ static inline int skb_header_unclone(struct sk_buff *skb, gfp_t pri)
 }
 
 /**
- *	__skb_header_release - release reference to header
- *	@skb: buffer to operate on
+ * __skb_header_release() - allow clones to use the headroom
+ * @skb: buffer to operate on
+ *
+ * See "DOC: dataref and headerless skbs".
  */
 static inline void __skb_header_release(struct sk_buff *skb)
 {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC net-next 3/3] skbuff: render the checksum comment to documentation
  2022-03-23 23:37 [RFC net-next 0/3] docs: document some aspects of struct sk_buff Jakub Kicinski
  2022-03-23 23:37 ` [RFC net-next 1/3] skbuff: add a basic intro doc Jakub Kicinski
  2022-03-23 23:37 ` [RFC net-next 2/3] skbuff: rewrite the doc for data-only skbs Jakub Kicinski
@ 2022-03-23 23:37 ` Jakub Kicinski
  2 siblings, 0 replies; 8+ messages in thread
From: Jakub Kicinski @ 2022-03-23 23:37 UTC (permalink / raw)
  To: davem
  Cc: netdev, pabeni, corbet, imagedong, edumazet, dsahern, talalahmad,
	linux-doc, Jakub Kicinski

Long time ago Tom added a giant comment to skbuff.h explaining
checksums. Now that we have a place in Documentation for skbuff
docs we should render it. Sprinkle some markup while at it.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 Documentation/networking/skbuff.rst |   6 +
 include/linux/skbuff.h              | 219 ++++++++++++++++------------
 2 files changed, 130 insertions(+), 95 deletions(-)

diff --git a/Documentation/networking/skbuff.rst b/Documentation/networking/skbuff.rst
index 581e5561c362..f2b97b5bb0b0 100644
--- a/Documentation/networking/skbuff.rst
+++ b/Documentation/networking/skbuff.rst
@@ -29,3 +29,9 @@ dataref and headerless skbs
 
 .. kernel-doc:: include/linux/skbuff.h
    :doc: dataref and headerless skbs
+
+Checksum information
+--------------------
+
+.. kernel-doc:: include/linux/skbuff.h
+   :doc: skb checksums
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 5b838350931c..a52524805919 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -43,98 +43,112 @@
 #include <linux/netfilter/nf_conntrack_common.h>
 #endif
 
-/* The interface for checksum offload between the stack and networking drivers
+/**
+ * DOC: skb checksums
+ *
+ * The interface for checksum offload between the stack and networking drivers
  * is as follows...
  *
- * A. IP checksum related features
+ * IP checksum related features
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  *
  * Drivers advertise checksum offload capabilities in the features of a device.
  * From the stack's point of view these are capabilities offered by the driver.
  * A driver typically only advertises features that it is capable of offloading
  * to its device.
  *
- * The checksum related features are:
- *
- *	NETIF_F_HW_CSUM	- The driver (or its device) is able to compute one
- *			  IP (one's complement) checksum for any combination
- *			  of protocols or protocol layering. The checksum is
- *			  computed and set in a packet per the CHECKSUM_PARTIAL
- *			  interface (see below).
- *
- *	NETIF_F_IP_CSUM - Driver (device) is only able to checksum plain
- *			  TCP or UDP packets over IPv4. These are specifically
- *			  unencapsulated packets of the form IPv4|TCP or
- *			  IPv4|UDP where the Protocol field in the IPv4 header
- *			  is TCP or UDP. The IPv4 header may contain IP options.
- *			  This feature cannot be set in features for a device
- *			  with NETIF_F_HW_CSUM also set. This feature is being
- *			  DEPRECATED (see below).
- *
- *	NETIF_F_IPV6_CSUM - Driver (device) is only able to checksum plain
- *			  TCP or UDP packets over IPv6. These are specifically
- *			  unencapsulated packets of the form IPv6|TCP or
- *			  IPv6|UDP where the Next Header field in the IPv6
- *			  header is either TCP or UDP. IPv6 extension headers
- *			  are not supported with this feature. This feature
- *			  cannot be set in features for a device with
- *			  NETIF_F_HW_CSUM also set. This feature is being
- *			  DEPRECATED (see below).
- *
- *	NETIF_F_RXCSUM - Driver (device) performs receive checksum offload.
- *			 This flag is only used to disable the RX checksum
- *			 feature for a device. The stack will accept receive
- *			 checksum indication in packets received on a device
- *			 regardless of whether NETIF_F_RXCSUM is set.
- *
- * B. Checksumming of received packets by device. Indication of checksum
- *    verification is set in skb->ip_summed. Possible values are:
- *
- * CHECKSUM_NONE:
+ * .. flat-table:: Checksum related device features
+ *   :widths: 1 10
+ *
+ *   * - %NETIF_F_HW_CSUM
+ *     - The driver (or its device) is able to compute one
+ *	 IP (one's complement) checksum for any combination
+ *	 of protocols or protocol layering. The checksum is
+ *	 computed and set in a packet per the CHECKSUM_PARTIAL
+ *	 interface (see below).
+ *
+ *   * - %NETIF_F_IP_CSUM
+ *     - Driver (device) is only able to checksum plain
+ *	 TCP or UDP packets over IPv4. These are specifically
+ *	 unencapsulated packets of the form IPv4|TCP or
+ *	 IPv4|UDP where the Protocol field in the IPv4 header
+ *	 is TCP or UDP. The IPv4 header may contain IP options.
+ *	 This feature cannot be set in features for a device
+ *	 with NETIF_F_HW_CSUM also set. This feature is being
+ *	 DEPRECATED (see below).
+ *
+ *   * - %NETIF_F_IPV6_CSUM
+ *     - Driver (device) is only able to checksum plain
+ *	 TCP or UDP packets over IPv6. These are specifically
+ *	 unencapsulated packets of the form IPv6|TCP or
+ *	 IPv6|UDP where the Next Header field in the IPv6
+ *	 header is either TCP or UDP. IPv6 extension headers
+ *	 are not supported with this feature. This feature
+ *	 cannot be set in features for a device with
+ *	 NETIF_F_HW_CSUM also set. This feature is being
+ *	 DEPRECATED (see below).
+ *
+ *   * - %NETIF_F_RXCSUM
+ *     - Driver (device) performs receive checksum offload.
+ *	 This flag is only used to disable the RX checksum
+ *	 feature for a device. The stack will accept receive
+ *	 checksum indication in packets received on a device
+ *	 regardless of whether NETIF_F_RXCSUM is set.
+ *
+ * Checksumming of received packets by device
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * Indication of checksum verification is set in &sk_buff.ip_summed.
+ * Possible values are:
+ *
+ * - %CHECKSUM_NONE
  *
  *   Device did not checksum this packet e.g. due to lack of capabilities.
  *   The packet contains full (though not verified) checksum in packet but
  *   not in skb->csum. Thus, skb->csum is undefined in this case.
  *
- * CHECKSUM_UNNECESSARY:
+ * - %CHECKSUM_UNNECESSARY
  *
  *   The hardware you're dealing with doesn't calculate the full checksum
- *   (as in CHECKSUM_COMPLETE), but it does parse headers and verify checksums
- *   for specific protocols. For such packets it will set CHECKSUM_UNNECESSARY
- *   if their checksums are okay. skb->csum is still undefined in this case
+ *   (as in %CHECKSUM_COMPLETE), but it does parse headers and verify checksums
+ *   for specific protocols. For such packets it will set %CHECKSUM_UNNECESSARY
+ *   if their checksums are okay. &sk_buff.csum is still undefined in this case
  *   though. A driver or device must never modify the checksum field in the
  *   packet even if checksum is verified.
  *
- *   CHECKSUM_UNNECESSARY is applicable to following protocols:
- *     TCP: IPv6 and IPv4.
- *     UDP: IPv4 and IPv6. A device may apply CHECKSUM_UNNECESSARY to a
+ *   %CHECKSUM_UNNECESSARY is applicable to following protocols:
+ *
+ *     - TCP: IPv6 and IPv4.
+ *     - UDP: IPv4 and IPv6. A device may apply CHECKSUM_UNNECESSARY to a
  *       zero UDP checksum for either IPv4 or IPv6, the networking stack
  *       may perform further validation in this case.
- *     GRE: only if the checksum is present in the header.
- *     SCTP: indicates the CRC in SCTP header has been validated.
- *     FCOE: indicates the CRC in FC frame has been validated.
+ *     - GRE: only if the checksum is present in the header.
+ *     - SCTP: indicates the CRC in SCTP header has been validated.
+ *     - FCOE: indicates the CRC in FC frame has been validated.
  *
- *   skb->csum_level indicates the number of consecutive checksums found in
- *   the packet minus one that have been verified as CHECKSUM_UNNECESSARY.
+ *   &sk_buff.csum_level indicates the number of consecutive checksums found in
+ *   the packet minus one that have been verified as %CHECKSUM_UNNECESSARY.
  *   For instance if a device receives an IPv6->UDP->GRE->IPv4->TCP packet
  *   and a device is able to verify the checksums for UDP (possibly zero),
- *   GRE (checksum flag is set) and TCP, skb->csum_level would be set to
+ *   GRE (checksum flag is set) and TCP, &sk_buff.csum_level would be set to
  *   two. If the device were only able to verify the UDP checksum and not
  *   GRE, either because it doesn't support GRE checksum or because GRE
  *   checksum is bad, skb->csum_level would be set to zero (TCP checksum is
  *   not considered in this case).
  *
- * CHECKSUM_COMPLETE:
+ * - %CHECKSUM_COMPLETE
  *
  *   This is the most generic way. The device supplied checksum of the _whole_
- *   packet as seen by netif_rx() and fills in skb->csum. This means the
+ *   packet as seen by netif_rx() and fills in &sk_buff.csum. This means the
  *   hardware doesn't need to parse L3/L4 headers to implement this.
  *
  *   Notes:
+ *
  *   - Even if device supports only some protocols, but is able to produce
  *     skb->csum, it MUST use CHECKSUM_COMPLETE, not CHECKSUM_UNNECESSARY.
  *   - CHECKSUM_COMPLETE is not applicable to SCTP and FCoE protocols.
  *
- * CHECKSUM_PARTIAL:
+ * - %CHECKSUM_PARTIAL
  *
  *   A checksum is set up to be offloaded to a device as described in the
  *   output description for CHECKSUM_PARTIAL. This may occur on a packet
@@ -146,14 +160,18 @@
  *   packet that are after the checksum being offloaded are not considered to
  *   be verified.
  *
- * C. Checksumming on transmit for non-GSO. The stack requests checksum offload
- *    in the skb->ip_summed for a packet. Values are:
+ * Checksumming on transmit for non-GSO
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * The stack requests checksum offload in the &sk_buff.ip_summed for a packet.
+ * Values are:
  *
- * CHECKSUM_PARTIAL:
+ * - %CHECKSUM_PARTIAL
  *
  *   The driver is required to checksum the packet as seen by hard_start_xmit()
- *   from skb->csum_start up to the end, and to record/write the checksum at
- *   offset skb->csum_start + skb->csum_offset. A driver may verify that the
+ *   from &sk_buff.csum_start up to the end, and to record/write the checksum at
+ *   offset &sk_buff.csum_start + &sk_buff.csum_offset.
+ *   A driver may verify that the
  *   csum_start and csum_offset values are valid values given the length and
  *   offset of the packet, but it should not attempt to validate that the
  *   checksum refers to a legitimate transport layer checksum -- it is the
@@ -165,55 +183,66 @@
  *   checksum calculation to the device, or call skb_checksum_help (in the case
  *   that the device does not support offload for a particular checksum).
  *
- *   NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM are being deprecated in favor of
- *   NETIF_F_HW_CSUM. New devices should use NETIF_F_HW_CSUM to indicate
+ *   %NETIF_F_IP_CSUM and %NETIF_F_IPV6_CSUM are being deprecated in favor of
+ *   %NETIF_F_HW_CSUM. New devices should use %NETIF_F_HW_CSUM to indicate
  *   checksum offload capability.
- *   skb_csum_hwoffload_help() can be called to resolve CHECKSUM_PARTIAL based
+ *   skb_csum_hwoffload_help() can be called to resolve %CHECKSUM_PARTIAL based
  *   on network device checksumming capabilities: if a packet does not match
- *   them, skb_checksum_help or skb_crc32c_help (depending on the value of
- *   csum_not_inet, see item D.) is called to resolve the checksum.
+ *   them, skb_checksum_help() or skb_crc32c_help() (depending on the value of
+ *   &sk_buff.csum_not_inet, see :ref:`crc`)
+ *   is called to resolve the checksum.
  *
- * CHECKSUM_NONE:
+ * - %CHECKSUM_NONE
  *
  *   The skb was already checksummed by the protocol, or a checksum is not
  *   required.
  *
- * CHECKSUM_UNNECESSARY:
+ * - %CHECKSUM_UNNECESSARY
  *
  *   This has the same meaning as CHECKSUM_NONE for checksum offload on
  *   output.
  *
- * CHECKSUM_COMPLETE:
+ * - %CHECKSUM_COMPLETE
+ *
  *   Not used in checksum output. If a driver observes a packet with this value
- *   set in skbuff, it should treat the packet as if CHECKSUM_NONE were set.
- *
- * D. Non-IP checksum (CRC) offloads
- *
- *   NETIF_F_SCTP_CRC - This feature indicates that a device is capable of
- *     offloading the SCTP CRC in a packet. To perform this offload the stack
- *     will set csum_start and csum_offset accordingly, set ip_summed to
- *     CHECKSUM_PARTIAL and set csum_not_inet to 1, to provide an indication in
- *     the skbuff that the CHECKSUM_PARTIAL refers to CRC32c.
- *     A driver that supports both IP checksum offload and SCTP CRC32c offload
- *     must verify which offload is configured for a packet by testing the
- *     value of skb->csum_not_inet; skb_crc32c_csum_help is provided to resolve
- *     CHECKSUM_PARTIAL on skbs where csum_not_inet is set to 1.
- *
- *   NETIF_F_FCOE_CRC - This feature indicates that a device is capable of
- *     offloading the FCOE CRC in a packet. To perform this offload the stack
- *     will set ip_summed to CHECKSUM_PARTIAL and set csum_start and csum_offset
- *     accordingly. Note that there is no indication in the skbuff that the
- *     CHECKSUM_PARTIAL refers to an FCOE checksum, so a driver that supports
- *     both IP checksum offload and FCOE CRC offload must verify which offload
- *     is configured for a packet, presumably by inspecting packet headers.
- *
- * E. Checksumming on output with GSO.
- *
- * In the case of a GSO packet (skb_is_gso(skb) is true), checksum offload
+ *   set in skbuff, it should treat the packet as if %CHECKSUM_NONE were set.
+ *
+ * .. _crc:
+ *
+ * Non-IP checksum (CRC) offloads
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * .. flat-table::
+ *   :widths: 1 10
+ *
+ *   * - %NETIF_F_SCTP_CRC
+ *     - This feature indicates that a device is capable of
+ *	 offloading the SCTP CRC in a packet. To perform this offload the stack
+ *	 will set csum_start and csum_offset accordingly, set ip_summed to
+ *	 %CHECKSUM_PARTIAL and set csum_not_inet to 1, to provide an indication
+ *	 in the skbuff that the %CHECKSUM_PARTIAL refers to CRC32c.
+ *	 A driver that supports both IP checksum offload and SCTP CRC32c offload
+ *	 must verify which offload is configured for a packet by testing the
+ *	 value of &sk_buff.csum_not_inet; skb_crc32c_csum_help() is provided to
+ *	 resolve %CHECKSUM_PARTIAL on skbs where csum_not_inet is set to 1.
+ *
+ *   * - %NETIF_F_FCOE_CRC
+ *     - This feature indicates that a device is capable of offloading the FCOE
+ *	 CRC in a packet. To perform this offload the stack will set ip_summed
+ *	 to %CHECKSUM_PARTIAL and set csum_start and csum_offset
+ *	 accordingly. Note that there is no indication in the skbuff that the
+ *	 %CHECKSUM_PARTIAL refers to an FCOE checksum, so a driver that supports
+ *	 both IP checksum offload and FCOE CRC offload must verify which offload
+ *	 is configured for a packet, presumably by inspecting packet headers.
+ *
+ * Checksumming on output with GSO
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * In the case of a GSO packet (skb_is_gso() is true), checksum offload
  * is implied by the SKB_GSO_* flags in gso_type. Most obviously, if the
- * gso_type is SKB_GSO_TCPV4 or SKB_GSO_TCPV6, TCP checksum offload as
+ * gso_type is %SKB_GSO_TCPV4 or %SKB_GSO_TCPV6, TCP checksum offload as
  * part of the GSO operation is implied. If a checksum is being offloaded
- * with GSO then ip_summed is CHECKSUM_PARTIAL, and both csum_start and
+ * with GSO then ip_summed is %CHECKSUM_PARTIAL, and both csum_start and
  * csum_offset are set to refer to the outermost checksum being offloaded
  * (two offloaded checksums are possible with UDP encapsulation).
  */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [RFC net-next 2/3] skbuff: rewrite the doc for data-only skbs
  2022-03-23 23:37 ` [RFC net-next 2/3] skbuff: rewrite the doc for data-only skbs Jakub Kicinski
@ 2022-03-24  8:50   ` Paolo Abeni
  2022-03-24 18:31     ` Jakub Kicinski
  0 siblings, 1 reply; 8+ messages in thread
From: Paolo Abeni @ 2022-03-24  8:50 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, corbet, imagedong, edumazet, dsahern, talalahmad, linux-doc

Hello,

On Wed, 2022-03-23 at 16:37 -0700, Jakub Kicinski wrote:
> The comment about shinfo->dataref split is really unhelpful,
> at least to me. Rewrite it and render it to skb documentation.
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
>  Documentation/networking/index.rst  |  1 +
>  Documentation/networking/skbuff.rst |  6 ++++++
>  include/linux/skbuff.h              | 33 +++++++++++++++++++----------
>  3 files changed, 29 insertions(+), 11 deletions(-)
> 
> diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
> index ce017136ab05..1b3c45add20d 100644
> --- a/Documentation/networking/index.rst
> +++ b/Documentation/networking/index.rst
> @@ -96,6 +96,7 @@ Linux Networking Documentation
>     sctp
>     secid
>     seg6-sysctl
> +   skbuff
>     smc-sysctl
>     statistics
>     strparser
> diff --git a/Documentation/networking/skbuff.rst b/Documentation/networking/skbuff.rst
> index 7c6be64f486a..581e5561c362 100644
> --- a/Documentation/networking/skbuff.rst
> +++ b/Documentation/networking/skbuff.rst
> @@ -23,3 +23,9 @@ skb_clone() allows for fast duplication of skbs. None of the data buffers
>  get copied, but caller gets a new metadata struct (struct sk_buff).
>  &skb_shared_info.refcount indicates the number of skbs pointing at the same
>  packet data (i.e. clones).
> +
> +dataref and headerless skbs
> +---------------------------
> +
> +.. kernel-doc:: include/linux/skbuff.h
> +   :doc: dataref and headerless skbs
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 5431be4aa309..5b838350931c 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -691,16 +691,25 @@ struct skb_shared_info {
>  	skb_frag_t	frags[MAX_SKB_FRAGS];
>  };
>  
> -/* We divide dataref into two halves.  The higher 16 bits hold references
> - * to the payload part of skb->data.  The lower 16 bits hold references to
> - * the entire skb->data.  A clone of a headerless skb holds the length of
> - * the header in skb->hdr_len.
> +/**
> + * DOC: dataref and headerless skbs
> + *
> + * Transport layers send out clones of data skbs they hold for retransmissions.
> + * To allow lower layers of the stack to prepend their headers
> + * we split &skb_shared_info.dataref into two halves.
> + * The lower 16 bits count the overall number of references.
> + * The higher 16 bits indicate number of data-only references.
> + * skb_header_cloned() checks if skb is allowed to add / write the headers.

Thank you very much for the IMHO much needed documentation!

Please allow me to do some non-native-english-speaker biased comments;)

The previous patch uses the form  "payload data" instead of data-only,
I think it would be clearer using consistently one or the other. I
personally would go for "payload-data-only" (which is probably a good
reason to pick a different option).

>   *
> - * All users must obey the rule that the skb->data reference count must be
> - * greater than or equal to the payload reference count.
> + * The creator of the skb (e.g. TCP) marks its data-only skb as &sk_buff.nohdr
> + * (via __skb_header_release()). Any clone created from marked skb will get
> + * &sk_buff.hdr_len populated with the available headroom.
> + * If it's the only clone in existence it's able to modify the headroom at will.

I think it would be great if we explicitly list the expected sequence,
e.g.
	<alloc skb>
	skb_reserve
	__skb_header_release
	skb_clone


Thanks!

Paolo


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC net-next 1/3] skbuff: add a basic intro doc
  2022-03-23 23:37 ` [RFC net-next 1/3] skbuff: add a basic intro doc Jakub Kicinski
@ 2022-03-24 14:16   ` Jonathan Corbet
  2022-03-24 18:20     ` Jakub Kicinski
  0 siblings, 1 reply; 8+ messages in thread
From: Jonathan Corbet @ 2022-03-24 14:16 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, pabeni, imagedong, edumazet, dsahern, talalahmad,
	linux-doc, Jakub Kicinski

Jakub Kicinski <kuba@kernel.org> writes:

> Add basic skb documentation. It's mostly an intro to the subsequent
> patches - it would looks strange if we documented advanced topics
> without covering the basics in any way.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Glad to see improved docs!  One nit...

>  Documentation/networking/skbuff.rst | 25 ++++++++++++++++++
>  include/linux/skbuff.h              | 40 +++++++++++++++++++++++++++++
>  2 files changed, 65 insertions(+)
>  create mode 100644 Documentation/networking/skbuff.rst
>
> diff --git a/Documentation/networking/skbuff.rst b/Documentation/networking/skbuff.rst
> new file mode 100644
> index 000000000000..7c6be64f486a
> --- /dev/null
> +++ b/Documentation/networking/skbuff.rst
> @@ -0,0 +1,25 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +struct sk_buff
> +==============
> +
> +:c:type:`struct sk_buff` is the main networking structure representing
> +a packet.

You shouldn't need :c:type: here, our magic stuff should see "struct
sk_buff" and generate the cross reference.  Of course, it will be a
highly local reference in this case...

Thanks,

jon

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC net-next 1/3] skbuff: add a basic intro doc
  2022-03-24 14:16   ` Jonathan Corbet
@ 2022-03-24 18:20     ` Jakub Kicinski
  0 siblings, 0 replies; 8+ messages in thread
From: Jakub Kicinski @ 2022-03-24 18:20 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: davem, netdev, pabeni, imagedong, edumazet, dsahern, talalahmad,
	linux-doc

On Thu, 24 Mar 2022 08:16:49 -0600 Jonathan Corbet wrote:
> > +:c:type:`struct sk_buff` is the main networking structure representing
> > +a packet.  
> 
> You shouldn't need :c:type: here, our magic stuff should see "struct
> sk_buff" and generate the cross reference.  Of course, it will be a
> highly local reference in this case...

Thanks! I'll fix in v1, I must admit I added this last minute and I hit
send before the build finished :S

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC net-next 2/3] skbuff: rewrite the doc for data-only skbs
  2022-03-24  8:50   ` Paolo Abeni
@ 2022-03-24 18:31     ` Jakub Kicinski
  0 siblings, 0 replies; 8+ messages in thread
From: Jakub Kicinski @ 2022-03-24 18:31 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: davem, netdev, corbet, imagedong, edumazet, dsahern, talalahmad,
	linux-doc

On Thu, 24 Mar 2022 09:50:06 +0100 Paolo Abeni wrote:
> > +/**
> > + * DOC: dataref and headerless skbs
> > + *
> > + * Transport layers send out clones of data skbs they hold for retransmissions.
> > + * To allow lower layers of the stack to prepend their headers
> > + * we split &skb_shared_info.dataref into two halves.
> > + * The lower 16 bits count the overall number of references.
> > + * The higher 16 bits indicate number of data-only references.
> > + * skb_header_cloned() checks if skb is allowed to add / write the headers.  
> 
> Thank you very much for the IMHO much needed documentation!
> 
> Please allow me to do some non-native-english-speaker biased comments;)
> 
> The previous patch uses the form  "payload data" instead of data-only,
> I think it would be clearer using consistently one or the other. I
> personally would go for "payload-data-only" (which is probably a good
> reason to pick a different option).

That starts to get long. Let me go for payload everywhere.

> > - * All users must obey the rule that the skb->data reference count must be
> > - * greater than or equal to the payload reference count.
> > + * The creator of the skb (e.g. TCP) marks its data-only skb as &sk_buff.nohdr
> > + * (via __skb_header_release()). Any clone created from marked skb will get
> > + * &sk_buff.hdr_len populated with the available headroom.
> > + * If it's the only clone in existence it's able to modify the headroom at will.  
> 
> I think it would be great if we explicitly list the expected sequence,
> e.g.
> 	<alloc skb>
> 	skb_reserve
> 	__skb_header_release
> 	skb_clone

Will do!

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-03-24 18:31 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-23 23:37 [RFC net-next 0/3] docs: document some aspects of struct sk_buff Jakub Kicinski
2022-03-23 23:37 ` [RFC net-next 1/3] skbuff: add a basic intro doc Jakub Kicinski
2022-03-24 14:16   ` Jonathan Corbet
2022-03-24 18:20     ` Jakub Kicinski
2022-03-23 23:37 ` [RFC net-next 2/3] skbuff: rewrite the doc for data-only skbs Jakub Kicinski
2022-03-24  8:50   ` Paolo Abeni
2022-03-24 18:31     ` Jakub Kicinski
2022-03-23 23:37 ` [RFC net-next 3/3] skbuff: render the checksum comment to documentation Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).