All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC net-next 0/3] docs: document some aspects of struct sk_buff
@ 2022-03-23 23:37 Jakub Kicinski
  2022-03-23 23:37 ` [RFC net-next 1/3] skbuff: add a basic intro doc Jakub Kicinski
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Jakub Kicinski @ 2022-03-23 23:37 UTC (permalink / raw)
  To: davem
  Cc: netdev, pabeni, corbet, imagedong, edumazet, dsahern, talalahmad,
	linux-doc, Jakub Kicinski

I dusted off some old skb documentation patches I had in my tree.
This small set creates a place to render such documentation,
documents one random thing (data-only skbs) and converts the big
checksum comment to kdoc.

Jakub Kicinski (3):
  skbuff: add a basic intro doc
  skbuff: rewrite the doc for data-only skbs
  skbuff: render the checksum comment to documentation

 Documentation/networking/index.rst  |   1 +
 Documentation/networking/skbuff.rst |  37 ++++
 include/linux/skbuff.h              | 292 ++++++++++++++++++----------
 3 files changed, 224 insertions(+), 106 deletions(-)
 create mode 100644 Documentation/networking/skbuff.rst

-- 
2.34.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [RFC net-next 1/3] skbuff: add a basic intro doc
  2022-03-23 23:37 [RFC net-next 0/3] docs: document some aspects of struct sk_buff Jakub Kicinski
@ 2022-03-23 23:37 ` Jakub Kicinski
  2022-03-24 14:16   ` Jonathan Corbet
  2022-03-23 23:37 ` [RFC net-next 2/3] skbuff: rewrite the doc for data-only skbs Jakub Kicinski
  2022-03-23 23:37 ` [RFC net-next 3/3] skbuff: render the checksum comment to documentation Jakub Kicinski
  2 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2022-03-23 23:37 UTC (permalink / raw)
  To: davem
  Cc: netdev, pabeni, corbet, imagedong, edumazet, dsahern, talalahmad,
	linux-doc, Jakub Kicinski

Add basic skb documentation. It's mostly an intro to the subsequent
patches - it would looks strange if we documented advanced topics
without covering the basics in any way.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 Documentation/networking/skbuff.rst | 25 ++++++++++++++++++
 include/linux/skbuff.h              | 40 +++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)
 create mode 100644 Documentation/networking/skbuff.rst

diff --git a/Documentation/networking/skbuff.rst b/Documentation/networking/skbuff.rst
new file mode 100644
index 000000000000..7c6be64f486a
--- /dev/null
+++ b/Documentation/networking/skbuff.rst
@@ -0,0 +1,25 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+struct sk_buff
+==============
+
+:c:type:`struct sk_buff` is the main networking structure representing
+a packet.
+
+Basic sk_buff geometry
+----------------------
+
+.. kernel-doc:: include/linux/skbuff.h
+   :doc: Basic sk_buff geometry
+
+Shared skbs and skb clones
+--------------------------
+
+:c:member:`sk_buff.users` is a simple refcount allowing multiple entities
+to keep a struct sk_buff alive. skbs with a ``sk_buff.users != 1`` are referred
+to as shared skbs (see skb_shared()).
+
+skb_clone() allows for fast duplication of skbs. None of the data buffers
+get copied, but caller gets a new metadata struct (struct sk_buff).
+&skb_shared_info.refcount indicates the number of skbs pointing at the same
+packet data (i.e. clones).
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 3a30cae8b0a5..5431be4aa309 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -764,6 +764,46 @@ typedef unsigned int sk_buff_data_t;
 typedef unsigned char *sk_buff_data_t;
 #endif
 
+/**
+ * DOC: Basic sk_buff geometry
+ *
+ * struct sk_buff itself is a metadata structure and does not hold any packet
+ * data. All the data is held in associated buffers.
+ *
+ * &sk_buff.head points to the main "head" buffer. The head buffer is divided
+ * into two parts:
+ *
+ *  - data buffer, containing headers and sometimes payload data;
+ *    this is the part of the skb operated on by the common helpers
+ *    such as skb_put() or skb_pull();
+ *  - shared info (struct skb_shared_info) which holds an array of pointers
+ *    to read-only payload data in the (page, offset, length) format.
+ *
+ * Optionally &skb_shared_info.frag_list may point to another skb.
+ *
+ * Basic diagram may look like this::
+ *
+ *                                  ---------------
+ *                                 | sk_buff       |
+ *                                  ---------------
+ *     ,---------------------------  + head
+ *    /          ,-----------------  + data
+ *   /          /      ,-----------  + tail
+ *  |          |      |            , + end
+ *  |          |      |           |
+ *  v          v      v           v
+ *   -----------------------------------------------
+ *  | headroom | data |  tailroom | skb_shared_info |
+ *   -----------------------------------------------
+ *                                 + [page frag]
+ *                                 + [page frag]
+ *                                 + [page frag]
+ *                                 + [page frag]       ---------
+ *                                 + frag_list    --> | sk_buff |
+ *                                                     ---------
+ *
+ */
+
 /**
  *	struct sk_buff - socket buffer
  *	@next: Next buffer in list
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC net-next 2/3] skbuff: rewrite the doc for data-only skbs
  2022-03-23 23:37 [RFC net-next 0/3] docs: document some aspects of struct sk_buff Jakub Kicinski
  2022-03-23 23:37 ` [RFC net-next 1/3] skbuff: add a basic intro doc Jakub Kicinski
@ 2022-03-23 23:37 ` Jakub Kicinski
  2022-03-24  8:50   ` Paolo Abeni
  2022-03-23 23:37 ` [RFC net-next 3/3] skbuff: render the checksum comment to documentation Jakub Kicinski
  2 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2022-03-23 23:37 UTC (permalink / raw)
  To: davem
  Cc: netdev, pabeni, corbet, imagedong, edumazet, dsahern, talalahmad,
	linux-doc, Jakub Kicinski

The comment about shinfo->dataref split is really unhelpful,
at least to me. Rewrite it and render it to skb documentation.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 Documentation/networking/index.rst  |  1 +
 Documentation/networking/skbuff.rst |  6 ++++++
 include/linux/skbuff.h              | 33 +++++++++++++++++++----------
 3 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index ce017136ab05..1b3c45add20d 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -96,6 +96,7 @@ Linux Networking Documentation
    sctp
    secid
    seg6-sysctl
+   skbuff
    smc-sysctl
    statistics
    strparser
diff --git a/Documentation/networking/skbuff.rst b/Documentation/networking/skbuff.rst
index 7c6be64f486a..581e5561c362 100644
--- a/Documentation/networking/skbuff.rst
+++ b/Documentation/networking/skbuff.rst
@@ -23,3 +23,9 @@ skb_clone() allows for fast duplication of skbs. None of the data buffers
 get copied, but caller gets a new metadata struct (struct sk_buff).
 &skb_shared_info.refcount indicates the number of skbs pointing at the same
 packet data (i.e. clones).
+
+dataref and headerless skbs
+---------------------------
+
+.. kernel-doc:: include/linux/skbuff.h
+   :doc: dataref and headerless skbs
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 5431be4aa309..5b838350931c 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -691,16 +691,25 @@ struct skb_shared_info {
 	skb_frag_t	frags[MAX_SKB_FRAGS];
 };
 
-/* We divide dataref into two halves.  The higher 16 bits hold references
- * to the payload part of skb->data.  The lower 16 bits hold references to
- * the entire skb->data.  A clone of a headerless skb holds the length of
- * the header in skb->hdr_len.
+/**
+ * DOC: dataref and headerless skbs
+ *
+ * Transport layers send out clones of data skbs they hold for retransmissions.
+ * To allow lower layers of the stack to prepend their headers
+ * we split &skb_shared_info.dataref into two halves.
+ * The lower 16 bits count the overall number of references.
+ * The higher 16 bits indicate number of data-only references.
+ * skb_header_cloned() checks if skb is allowed to add / write the headers.
  *
- * All users must obey the rule that the skb->data reference count must be
- * greater than or equal to the payload reference count.
+ * The creator of the skb (e.g. TCP) marks its data-only skb as &sk_buff.nohdr
+ * (via __skb_header_release()). Any clone created from marked skb will get
+ * &sk_buff.hdr_len populated with the available headroom.
+ * If it's the only clone in existence it's able to modify the headroom at will.
  *
- * Holding a reference to the payload part means that the user does not
- * care about modifications to the header part of skb->data.
+ * This is not a very generic construct and it depends on the transport layers
+ * doing the right thing. In practice there's usually only one data-only skb.
+ * Having multiple data-only skbs with different lengths of hdr_len is not
+ * possible. The data-only skbs should never leave their owner.
  */
 #define SKB_DATAREF_SHIFT 16
 #define SKB_DATAREF_MASK ((1 << SKB_DATAREF_SHIFT) - 1)
@@ -833,7 +842,7 @@ typedef unsigned char *sk_buff_data_t;
  *	@ignore_df: allow local fragmentation
  *	@cloned: Head may be cloned (check refcnt to be sure)
  *	@ip_summed: Driver fed us an IP checksum
- *	@nohdr: Payload reference only, must not modify header
+ *	@nohdr: Data-only skb, must not modify header
  *	@pkt_type: Packet class
  *	@fclone: skbuff clone status
  *	@ipvs_property: skbuff is owned by ipvs
@@ -1962,8 +1971,10 @@ static inline int skb_header_unclone(struct sk_buff *skb, gfp_t pri)
 }
 
 /**
- *	__skb_header_release - release reference to header
- *	@skb: buffer to operate on
+ * __skb_header_release() - allow clones to use the headroom
+ * @skb: buffer to operate on
+ *
+ * See "DOC: dataref and headerless skbs".
  */
 static inline void __skb_header_release(struct sk_buff *skb)
 {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC net-next 3/3] skbuff: render the checksum comment to documentation
  2022-03-23 23:37 [RFC net-next 0/3] docs: document some aspects of struct sk_buff Jakub Kicinski
  2022-03-23 23:37 ` [RFC net-next 1/3] skbuff: add a basic intro doc Jakub Kicinski
  2022-03-23 23:37 ` [RFC net-next 2/3] skbuff: rewrite the doc for data-only skbs Jakub Kicinski
@ 2022-03-23 23:37 ` Jakub Kicinski
  2 siblings, 0 replies; 8+ messages in thread
From: Jakub Kicinski @ 2022-03-23 23:37 UTC (permalink / raw)
  To: davem
  Cc: netdev, pabeni, corbet, imagedong, edumazet, dsahern, talalahmad,
	linux-doc, Jakub Kicinski

Long time ago Tom added a giant comment to skbuff.h explaining
checksums. Now that we have a place in Documentation for skbuff
docs we should render it. Sprinkle some markup while at it.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 Documentation/networking/skbuff.rst |   6 +
 include/linux/skbuff.h              | 219 ++++++++++++++++------------
 2 files changed, 130 insertions(+), 95 deletions(-)

diff --git a/Documentation/networking/skbuff.rst b/Documentation/networking/skbuff.rst
index 581e5561c362..f2b97b5bb0b0 100644
--- a/Documentation/networking/skbuff.rst
+++ b/Documentation/networking/skbuff.rst
@@ -29,3 +29,9 @@ dataref and headerless skbs
 
 .. kernel-doc:: include/linux/skbuff.h
    :doc: dataref and headerless skbs
+
+Checksum information
+--------------------
+
+.. kernel-doc:: include/linux/skbuff.h
+   :doc: skb checksums
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 5b838350931c..a52524805919 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -43,98 +43,112 @@
 #include <linux/netfilter/nf_conntrack_common.h>
 #endif
 
-/* The interface for checksum offload between the stack and networking drivers
+/**
+ * DOC: skb checksums
+ *
+ * The interface for checksum offload between the stack and networking drivers
  * is as follows...
  *
- * A. IP checksum related features
+ * IP checksum related features
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  *
  * Drivers advertise checksum offload capabilities in the features of a device.
  * From the stack's point of view these are capabilities offered by the driver.
  * A driver typically only advertises features that it is capable of offloading
  * to its device.
  *
- * The checksum related features are:
- *
- *	NETIF_F_HW_CSUM	- The driver (or its device) is able to compute one
- *			  IP (one's complement) checksum for any combination
- *			  of protocols or protocol layering. The checksum is
- *			  computed and set in a packet per the CHECKSUM_PARTIAL
- *			  interface (see below).
- *
- *	NETIF_F_IP_CSUM - Driver (device) is only able to checksum plain
- *			  TCP or UDP packets over IPv4. These are specifically
- *			  unencapsulated packets of the form IPv4|TCP or
- *			  IPv4|UDP where the Protocol field in the IPv4 header
- *			  is TCP or UDP. The IPv4 header may contain IP options.
- *			  This feature cannot be set in features for a device
- *			  with NETIF_F_HW_CSUM also set. This feature is being
- *			  DEPRECATED (see below).
- *
- *	NETIF_F_IPV6_CSUM - Driver (device) is only able to checksum plain
- *			  TCP or UDP packets over IPv6. These are specifically
- *			  unencapsulated packets of the form IPv6|TCP or
- *			  IPv6|UDP where the Next Header field in the IPv6
- *			  header is either TCP or UDP. IPv6 extension headers
- *			  are not supported with this feature. This feature
- *			  cannot be set in features for a device with
- *			  NETIF_F_HW_CSUM also set. This feature is being
- *			  DEPRECATED (see below).
- *
- *	NETIF_F_RXCSUM - Driver (device) performs receive checksum offload.
- *			 This flag is only used to disable the RX checksum
- *			 feature for a device. The stack will accept receive
- *			 checksum indication in packets received on a device
- *			 regardless of whether NETIF_F_RXCSUM is set.
- *
- * B. Checksumming of received packets by device. Indication of checksum
- *    verification is set in skb->ip_summed. Possible values are:
- *
- * CHECKSUM_NONE:
+ * .. flat-table:: Checksum related device features
+ *   :widths: 1 10
+ *
+ *   * - %NETIF_F_HW_CSUM
+ *     - The driver (or its device) is able to compute one
+ *	 IP (one's complement) checksum for any combination
+ *	 of protocols or protocol layering. The checksum is
+ *	 computed and set in a packet per the CHECKSUM_PARTIAL
+ *	 interface (see below).
+ *
+ *   * - %NETIF_F_IP_CSUM
+ *     - Driver (device) is only able to checksum plain
+ *	 TCP or UDP packets over IPv4. These are specifically
+ *	 unencapsulated packets of the form IPv4|TCP or
+ *	 IPv4|UDP where the Protocol field in the IPv4 header
+ *	 is TCP or UDP. The IPv4 header may contain IP options.
+ *	 This feature cannot be set in features for a device
+ *	 with NETIF_F_HW_CSUM also set. This feature is being
+ *	 DEPRECATED (see below).
+ *
+ *   * - %NETIF_F_IPV6_CSUM
+ *     - Driver (device) is only able to checksum plain
+ *	 TCP or UDP packets over IPv6. These are specifically
+ *	 unencapsulated packets of the form IPv6|TCP or
+ *	 IPv6|UDP where the Next Header field in the IPv6
+ *	 header is either TCP or UDP. IPv6 extension headers
+ *	 are not supported with this feature. This feature
+ *	 cannot be set in features for a device with
+ *	 NETIF_F_HW_CSUM also set. This feature is being
+ *	 DEPRECATED (see below).
+ *
+ *   * - %NETIF_F_RXCSUM
+ *     - Driver (device) performs receive checksum offload.
+ *	 This flag is only used to disable the RX checksum
+ *	 feature for a device. The stack will accept receive
+ *	 checksum indication in packets received on a device
+ *	 regardless of whether NETIF_F_RXCSUM is set.
+ *
+ * Checksumming of received packets by device
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * Indication of checksum verification is set in &sk_buff.ip_summed.
+ * Possible values are:
+ *
+ * - %CHECKSUM_NONE
  *
  *   Device did not checksum this packet e.g. due to lack of capabilities.
  *   The packet contains full (though not verified) checksum in packet but
  *   not in skb->csum. Thus, skb->csum is undefined in this case.
  *
- * CHECKSUM_UNNECESSARY:
+ * - %CHECKSUM_UNNECESSARY
  *
  *   The hardware you're dealing with doesn't calculate the full checksum
- *   (as in CHECKSUM_COMPLETE), but it does parse headers and verify checksums
- *   for specific protocols. For such packets it will set CHECKSUM_UNNECESSARY
- *   if their checksums are okay. skb->csum is still undefined in this case
+ *   (as in %CHECKSUM_COMPLETE), but it does parse headers and verify checksums
+ *   for specific protocols. For such packets it will set %CHECKSUM_UNNECESSARY
+ *   if their checksums are okay. &sk_buff.csum is still undefined in this case
  *   though. A driver or device must never modify the checksum field in the
  *   packet even if checksum is verified.
  *
- *   CHECKSUM_UNNECESSARY is applicable to following protocols:
- *     TCP: IPv6 and IPv4.
- *     UDP: IPv4 and IPv6. A device may apply CHECKSUM_UNNECESSARY to a
+ *   %CHECKSUM_UNNECESSARY is applicable to following protocols:
+ *
+ *     - TCP: IPv6 and IPv4.
+ *     - UDP: IPv4 and IPv6. A device may apply CHECKSUM_UNNECESSARY to a
  *       zero UDP checksum for either IPv4 or IPv6, the networking stack
  *       may perform further validation in this case.
- *     GRE: only if the checksum is present in the header.
- *     SCTP: indicates the CRC in SCTP header has been validated.
- *     FCOE: indicates the CRC in FC frame has been validated.
+ *     - GRE: only if the checksum is present in the header.
+ *     - SCTP: indicates the CRC in SCTP header has been validated.
+ *     - FCOE: indicates the CRC in FC frame has been validated.
  *
- *   skb->csum_level indicates the number of consecutive checksums found in
- *   the packet minus one that have been verified as CHECKSUM_UNNECESSARY.
+ *   &sk_buff.csum_level indicates the number of consecutive checksums found in
+ *   the packet minus one that have been verified as %CHECKSUM_UNNECESSARY.
  *   For instance if a device receives an IPv6->UDP->GRE->IPv4->TCP packet
  *   and a device is able to verify the checksums for UDP (possibly zero),
- *   GRE (checksum flag is set) and TCP, skb->csum_level would be set to
+ *   GRE (checksum flag is set) and TCP, &sk_buff.csum_level would be set to
  *   two. If the device were only able to verify the UDP checksum and not
  *   GRE, either because it doesn't support GRE checksum or because GRE
  *   checksum is bad, skb->csum_level would be set to zero (TCP checksum is
  *   not considered in this case).
  *
- * CHECKSUM_COMPLETE:
+ * - %CHECKSUM_COMPLETE
  *
  *   This is the most generic way. The device supplied checksum of the _whole_
- *   packet as seen by netif_rx() and fills in skb->csum. This means the
+ *   packet as seen by netif_rx() and fills in &sk_buff.csum. This means the
  *   hardware doesn't need to parse L3/L4 headers to implement this.
  *
  *   Notes:
+ *
  *   - Even if device supports only some protocols, but is able to produce
  *     skb->csum, it MUST use CHECKSUM_COMPLETE, not CHECKSUM_UNNECESSARY.
  *   - CHECKSUM_COMPLETE is not applicable to SCTP and FCoE protocols.
  *
- * CHECKSUM_PARTIAL:
+ * - %CHECKSUM_PARTIAL
  *
  *   A checksum is set up to be offloaded to a device as described in the
  *   output description for CHECKSUM_PARTIAL. This may occur on a packet
@@ -146,14 +160,18 @@
  *   packet that are after the checksum being offloaded are not considered to
  *   be verified.
  *
- * C. Checksumming on transmit for non-GSO. The stack requests checksum offload
- *    in the skb->ip_summed for a packet. Values are:
+ * Checksumming on transmit for non-GSO
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * The stack requests checksum offload in the &sk_buff.ip_summed for a packet.
+ * Values are:
  *
- * CHECKSUM_PARTIAL:
+ * - %CHECKSUM_PARTIAL
  *
  *   The driver is required to checksum the packet as seen by hard_start_xmit()
- *   from skb->csum_start up to the end, and to record/write the checksum at
- *   offset skb->csum_start + skb->csum_offset. A driver may verify that the
+ *   from &sk_buff.csum_start up to the end, and to record/write the checksum at
+ *   offset &sk_buff.csum_start + &sk_buff.csum_offset.
+ *   A driver may verify that the
  *   csum_start and csum_offset values are valid values given the length and
  *   offset of the packet, but it should not attempt to validate that the
  *   checksum refers to a legitimate transport layer checksum -- it is the
@@ -165,55 +183,66 @@
  *   checksum calculation to the device, or call skb_checksum_help (in the case
  *   that the device does not support offload for a particular checksum).
  *
- *   NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM are being deprecated in favor of
- *   NETIF_F_HW_CSUM. New devices should use NETIF_F_HW_CSUM to indicate
+ *   %NETIF_F_IP_CSUM and %NETIF_F_IPV6_CSUM are being deprecated in favor of
+ *   %NETIF_F_HW_CSUM. New devices should use %NETIF_F_HW_CSUM to indicate
  *   checksum offload capability.
- *   skb_csum_hwoffload_help() can be called to resolve CHECKSUM_PARTIAL based
+ *   skb_csum_hwoffload_help() can be called to resolve %CHECKSUM_PARTIAL based
  *   on network device checksumming capabilities: if a packet does not match
- *   them, skb_checksum_help or skb_crc32c_help (depending on the value of
- *   csum_not_inet, see item D.) is called to resolve the checksum.
+ *   them, skb_checksum_help() or skb_crc32c_help() (depending on the value of
+ *   &sk_buff.csum_not_inet, see :ref:`crc`)
+ *   is called to resolve the checksum.
  *
- * CHECKSUM_NONE:
+ * - %CHECKSUM_NONE
  *
  *   The skb was already checksummed by the protocol, or a checksum is not
  *   required.
  *
- * CHECKSUM_UNNECESSARY:
+ * - %CHECKSUM_UNNECESSARY
  *
  *   This has the same meaning as CHECKSUM_NONE for checksum offload on
  *   output.
  *
- * CHECKSUM_COMPLETE:
+ * - %CHECKSUM_COMPLETE
+ *
  *   Not used in checksum output. If a driver observes a packet with this value
- *   set in skbuff, it should treat the packet as if CHECKSUM_NONE were set.
- *
- * D. Non-IP checksum (CRC) offloads
- *
- *   NETIF_F_SCTP_CRC - This feature indicates that a device is capable of
- *     offloading the SCTP CRC in a packet. To perform this offload the stack
- *     will set csum_start and csum_offset accordingly, set ip_summed to
- *     CHECKSUM_PARTIAL and set csum_not_inet to 1, to provide an indication in
- *     the skbuff that the CHECKSUM_PARTIAL refers to CRC32c.
- *     A driver that supports both IP checksum offload and SCTP CRC32c offload
- *     must verify which offload is configured for a packet by testing the
- *     value of skb->csum_not_inet; skb_crc32c_csum_help is provided to resolve
- *     CHECKSUM_PARTIAL on skbs where csum_not_inet is set to 1.
- *
- *   NETIF_F_FCOE_CRC - This feature indicates that a device is capable of
- *     offloading the FCOE CRC in a packet. To perform this offload the stack
- *     will set ip_summed to CHECKSUM_PARTIAL and set csum_start and csum_offset
- *     accordingly. Note that there is no indication in the skbuff that the
- *     CHECKSUM_PARTIAL refers to an FCOE checksum, so a driver that supports
- *     both IP checksum offload and FCOE CRC offload must verify which offload
- *     is configured for a packet, presumably by inspecting packet headers.
- *
- * E. Checksumming on output with GSO.
- *
- * In the case of a GSO packet (skb_is_gso(skb) is true), checksum offload
+ *   set in skbuff, it should treat the packet as if %CHECKSUM_NONE were set.
+ *
+ * .. _crc:
+ *
+ * Non-IP checksum (CRC) offloads
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * .. flat-table::
+ *   :widths: 1 10
+ *
+ *   * - %NETIF_F_SCTP_CRC
+ *     - This feature indicates that a device is capable of
+ *	 offloading the SCTP CRC in a packet. To perform this offload the stack
+ *	 will set csum_start and csum_offset accordingly, set ip_summed to
+ *	 %CHECKSUM_PARTIAL and set csum_not_inet to 1, to provide an indication
+ *	 in the skbuff that the %CHECKSUM_PARTIAL refers to CRC32c.
+ *	 A driver that supports both IP checksum offload and SCTP CRC32c offload
+ *	 must verify which offload is configured for a packet by testing the
+ *	 value of &sk_buff.csum_not_inet; skb_crc32c_csum_help() is provided to
+ *	 resolve %CHECKSUM_PARTIAL on skbs where csum_not_inet is set to 1.
+ *
+ *   * - %NETIF_F_FCOE_CRC
+ *     - This feature indicates that a device is capable of offloading the FCOE
+ *	 CRC in a packet. To perform this offload the stack will set ip_summed
+ *	 to %CHECKSUM_PARTIAL and set csum_start and csum_offset
+ *	 accordingly. Note that there is no indication in the skbuff that the
+ *	 %CHECKSUM_PARTIAL refers to an FCOE checksum, so a driver that supports
+ *	 both IP checksum offload and FCOE CRC offload must verify which offload
+ *	 is configured for a packet, presumably by inspecting packet headers.
+ *
+ * Checksumming on output with GSO
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * In the case of a GSO packet (skb_is_gso() is true), checksum offload
  * is implied by the SKB_GSO_* flags in gso_type. Most obviously, if the
- * gso_type is SKB_GSO_TCPV4 or SKB_GSO_TCPV6, TCP checksum offload as
+ * gso_type is %SKB_GSO_TCPV4 or %SKB_GSO_TCPV6, TCP checksum offload as
  * part of the GSO operation is implied. If a checksum is being offloaded
- * with GSO then ip_summed is CHECKSUM_PARTIAL, and both csum_start and
+ * with GSO then ip_summed is %CHECKSUM_PARTIAL, and both csum_start and
  * csum_offset are set to refer to the outermost checksum being offloaded
  * (two offloaded checksums are possible with UDP encapsulation).
  */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [RFC net-next 2/3] skbuff: rewrite the doc for data-only skbs
  2022-03-23 23:37 ` [RFC net-next 2/3] skbuff: rewrite the doc for data-only skbs Jakub Kicinski
@ 2022-03-24  8:50   ` Paolo Abeni
  2022-03-24 18:31     ` Jakub Kicinski
  0 siblings, 1 reply; 8+ messages in thread
From: Paolo Abeni @ 2022-03-24  8:50 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, corbet, imagedong, edumazet, dsahern, talalahmad, linux-doc

Hello,

On Wed, 2022-03-23 at 16:37 -0700, Jakub Kicinski wrote:
> The comment about shinfo->dataref split is really unhelpful,
> at least to me. Rewrite it and render it to skb documentation.
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
>  Documentation/networking/index.rst  |  1 +
>  Documentation/networking/skbuff.rst |  6 ++++++
>  include/linux/skbuff.h              | 33 +++++++++++++++++++----------
>  3 files changed, 29 insertions(+), 11 deletions(-)
> 
> diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
> index ce017136ab05..1b3c45add20d 100644
> --- a/Documentation/networking/index.rst
> +++ b/Documentation/networking/index.rst
> @@ -96,6 +96,7 @@ Linux Networking Documentation
>     sctp
>     secid
>     seg6-sysctl
> +   skbuff
>     smc-sysctl
>     statistics
>     strparser
> diff --git a/Documentation/networking/skbuff.rst b/Documentation/networking/skbuff.rst
> index 7c6be64f486a..581e5561c362 100644
> --- a/Documentation/networking/skbuff.rst
> +++ b/Documentation/networking/skbuff.rst
> @@ -23,3 +23,9 @@ skb_clone() allows for fast duplication of skbs. None of the data buffers
>  get copied, but caller gets a new metadata struct (struct sk_buff).
>  &skb_shared_info.refcount indicates the number of skbs pointing at the same
>  packet data (i.e. clones).
> +
> +dataref and headerless skbs
> +---------------------------
> +
> +.. kernel-doc:: include/linux/skbuff.h
> +   :doc: dataref and headerless skbs
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 5431be4aa309..5b838350931c 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -691,16 +691,25 @@ struct skb_shared_info {
>  	skb_frag_t	frags[MAX_SKB_FRAGS];
>  };
>  
> -/* We divide dataref into two halves.  The higher 16 bits hold references
> - * to the payload part of skb->data.  The lower 16 bits hold references to
> - * the entire skb->data.  A clone of a headerless skb holds the length of
> - * the header in skb->hdr_len.
> +/**
> + * DOC: dataref and headerless skbs
> + *
> + * Transport layers send out clones of data skbs they hold for retransmissions.
> + * To allow lower layers of the stack to prepend their headers
> + * we split &skb_shared_info.dataref into two halves.
> + * The lower 16 bits count the overall number of references.
> + * The higher 16 bits indicate number of data-only references.
> + * skb_header_cloned() checks if skb is allowed to add / write the headers.

Thank you very much for the IMHO much needed documentation!

Please allow me to do some non-native-english-speaker biased comments;)

The previous patch uses the form  "payload data" instead of data-only,
I think it would be clearer using consistently one or the other. I
personally would go for "payload-data-only" (which is probably a good
reason to pick a different option).

>   *
> - * All users must obey the rule that the skb->data reference count must be
> - * greater than or equal to the payload reference count.
> + * The creator of the skb (e.g. TCP) marks its data-only skb as &sk_buff.nohdr
> + * (via __skb_header_release()). Any clone created from marked skb will get
> + * &sk_buff.hdr_len populated with the available headroom.
> + * If it's the only clone in existence it's able to modify the headroom at will.

I think it would be great if we explicitly list the expected sequence,
e.g.
	<alloc skb>
	skb_reserve
	__skb_header_release
	skb_clone


Thanks!

Paolo


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC net-next 1/3] skbuff: add a basic intro doc
  2022-03-23 23:37 ` [RFC net-next 1/3] skbuff: add a basic intro doc Jakub Kicinski
@ 2022-03-24 14:16   ` Jonathan Corbet
  2022-03-24 18:20     ` Jakub Kicinski
  0 siblings, 1 reply; 8+ messages in thread
From: Jonathan Corbet @ 2022-03-24 14:16 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, pabeni, imagedong, edumazet, dsahern, talalahmad,
	linux-doc, Jakub Kicinski

Jakub Kicinski <kuba@kernel.org> writes:

> Add basic skb documentation. It's mostly an intro to the subsequent
> patches - it would looks strange if we documented advanced topics
> without covering the basics in any way.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Glad to see improved docs!  One nit...

>  Documentation/networking/skbuff.rst | 25 ++++++++++++++++++
>  include/linux/skbuff.h              | 40 +++++++++++++++++++++++++++++
>  2 files changed, 65 insertions(+)
>  create mode 100644 Documentation/networking/skbuff.rst
>
> diff --git a/Documentation/networking/skbuff.rst b/Documentation/networking/skbuff.rst
> new file mode 100644
> index 000000000000..7c6be64f486a
> --- /dev/null
> +++ b/Documentation/networking/skbuff.rst
> @@ -0,0 +1,25 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +struct sk_buff
> +==============
> +
> +:c:type:`struct sk_buff` is the main networking structure representing
> +a packet.

You shouldn't need :c:type: here, our magic stuff should see "struct
sk_buff" and generate the cross reference.  Of course, it will be a
highly local reference in this case...

Thanks,

jon

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC net-next 1/3] skbuff: add a basic intro doc
  2022-03-24 14:16   ` Jonathan Corbet
@ 2022-03-24 18:20     ` Jakub Kicinski
  0 siblings, 0 replies; 8+ messages in thread
From: Jakub Kicinski @ 2022-03-24 18:20 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: davem, netdev, pabeni, imagedong, edumazet, dsahern, talalahmad,
	linux-doc

On Thu, 24 Mar 2022 08:16:49 -0600 Jonathan Corbet wrote:
> > +:c:type:`struct sk_buff` is the main networking structure representing
> > +a packet.  
> 
> You shouldn't need :c:type: here, our magic stuff should see "struct
> sk_buff" and generate the cross reference.  Of course, it will be a
> highly local reference in this case...

Thanks! I'll fix in v1, I must admit I added this last minute and I hit
send before the build finished :S

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC net-next 2/3] skbuff: rewrite the doc for data-only skbs
  2022-03-24  8:50   ` Paolo Abeni
@ 2022-03-24 18:31     ` Jakub Kicinski
  0 siblings, 0 replies; 8+ messages in thread
From: Jakub Kicinski @ 2022-03-24 18:31 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: davem, netdev, corbet, imagedong, edumazet, dsahern, talalahmad,
	linux-doc

On Thu, 24 Mar 2022 09:50:06 +0100 Paolo Abeni wrote:
> > +/**
> > + * DOC: dataref and headerless skbs
> > + *
> > + * Transport layers send out clones of data skbs they hold for retransmissions.
> > + * To allow lower layers of the stack to prepend their headers
> > + * we split &skb_shared_info.dataref into two halves.
> > + * The lower 16 bits count the overall number of references.
> > + * The higher 16 bits indicate number of data-only references.
> > + * skb_header_cloned() checks if skb is allowed to add / write the headers.  
> 
> Thank you very much for the IMHO much needed documentation!
> 
> Please allow me to do some non-native-english-speaker biased comments;)
> 
> The previous patch uses the form  "payload data" instead of data-only,
> I think it would be clearer using consistently one or the other. I
> personally would go for "payload-data-only" (which is probably a good
> reason to pick a different option).

That starts to get long. Let me go for payload everywhere.

> > - * All users must obey the rule that the skb->data reference count must be
> > - * greater than or equal to the payload reference count.
> > + * The creator of the skb (e.g. TCP) marks its data-only skb as &sk_buff.nohdr
> > + * (via __skb_header_release()). Any clone created from marked skb will get
> > + * &sk_buff.hdr_len populated with the available headroom.
> > + * If it's the only clone in existence it's able to modify the headroom at will.  
> 
> I think it would be great if we explicitly list the expected sequence,
> e.g.
> 	<alloc skb>
> 	skb_reserve
> 	__skb_header_release
> 	skb_clone

Will do!

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-03-24 18:31 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-23 23:37 [RFC net-next 0/3] docs: document some aspects of struct sk_buff Jakub Kicinski
2022-03-23 23:37 ` [RFC net-next 1/3] skbuff: add a basic intro doc Jakub Kicinski
2022-03-24 14:16   ` Jonathan Corbet
2022-03-24 18:20     ` Jakub Kicinski
2022-03-23 23:37 ` [RFC net-next 2/3] skbuff: rewrite the doc for data-only skbs Jakub Kicinski
2022-03-24  8:50   ` Paolo Abeni
2022-03-24 18:31     ` Jakub Kicinski
2022-03-23 23:37 ` [RFC net-next 3/3] skbuff: render the checksum comment to documentation Jakub Kicinski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.