All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 net-next 0/7] Cleanup to main DSA structures
@ 2022-01-05 13:21 Vladimir Oltean
  2022-01-05 13:21 ` [PATCH v2 net-next 1/7] net: dsa: move dsa_port :: stp_state near dsa_port :: mac Vladimir Oltean
                   ` (8 more replies)
  0 siblings, 9 replies; 26+ messages in thread
From: Vladimir Oltean @ 2022-01-05 13:21 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot,
	Florian Fainelli

This series contains changes that do the following:

- struct dsa_port reduced from 576 to 544 bytes, and first cache line a
  bit better organized
- struct dsa_switch from 160 to 136 bytes, and first cache line a bit
  better organized
- struct dsa_switch_tree from 112 to 104 bytes, and first cache line a
  bit better organized

No changes compared to v1, just split into a separate patch set.

Vladimir Oltean (7):
  net: dsa: move dsa_port :: stp_state near dsa_port :: mac
  net: dsa: merge all bools of struct dsa_port into a single u8
  net: dsa: move dsa_port :: type near dsa_port :: index
  net: dsa: merge all bools of struct dsa_switch into a single u32
  net: dsa: make dsa_switch :: num_ports an unsigned int
  net: dsa: move dsa_switch_tree :: ports and lags to first cache line
  net: dsa: combine two holes in struct dsa_switch_tree

 include/net/dsa.h | 146 +++++++++++++++++++++++++---------------------
 net/dsa/dsa2.c    |   2 +-
 2 files changed, 81 insertions(+), 67 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 net-next 1/7] net: dsa: move dsa_port :: stp_state near dsa_port :: mac
  2022-01-05 13:21 [PATCH v2 net-next 0/7] Cleanup to main DSA structures Vladimir Oltean
@ 2022-01-05 13:21 ` Vladimir Oltean
  2022-01-05 18:30   ` Florian Fainelli
  2022-01-05 13:21 ` [PATCH v2 net-next 2/7] net: dsa: merge all bools of struct dsa_port into a single u8 Vladimir Oltean
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Vladimir Oltean @ 2022-01-05 13:21 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot,
	Florian Fainelli

The MAC address of a port is 6 octets in size, and this creates a 2
octet hole after it. There are some other u8 members of struct dsa_port
that we can put in that hole. One such member is the stp_state.

Before:

pahole -C dsa_port net/dsa/slave.o
struct dsa_port {
        union {
                struct net_device * master;              /*     0     8 */
                struct net_device * slave;               /*     0     8 */
        };                                               /*     0     8 */
        const struct dsa_device_ops  * tag_ops;          /*     8     8 */
        struct dsa_switch_tree *   dst;                  /*    16     8 */
        struct sk_buff *           (*rcv)(struct sk_buff *, struct net_device *); /*    24     8 */
        enum {
                DSA_PORT_TYPE_UNUSED = 0,
                DSA_PORT_TYPE_CPU    = 1,
                DSA_PORT_TYPE_DSA    = 2,
                DSA_PORT_TYPE_USER   = 3,
        } type;                                          /*    32     4 */

        /* XXX 4 bytes hole, try to pack */

        struct dsa_switch *        ds;                   /*    40     8 */
        unsigned int               index;                /*    48     4 */

        /* XXX 4 bytes hole, try to pack */

        const char  *              name;                 /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        struct dsa_port *          cpu_dp;               /*    64     8 */
        u8                         mac[6];               /*    72     6 */

        /* XXX 2 bytes hole, try to pack */

        struct device_node *       dn;                   /*    80     8 */
        unsigned int               ageing_time;          /*    88     4 */
        bool                       vlan_filtering;       /*    92     1 */
        bool                       learning;             /*    93     1 */
        u8                         stp_state;            /*    94     1 */

        /* XXX 1 byte hole, try to pack */

        struct dsa_bridge *        bridge;               /*    96     8 */
        struct devlink_port        devlink_port;         /*   104   288 */
        /* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
        bool                       devlink_port_setup;   /*   392     1 */

        /* XXX 7 bytes hole, try to pack */

        struct phylink *           pl;                   /*   400     8 */
        struct phylink_config      pl_config;            /*   408    40 */
        /* --- cacheline 7 boundary (448 bytes) --- */
        struct net_device *        lag_dev;              /*   448     8 */
        bool                       lag_tx_enabled;       /*   456     1 */

        /* XXX 7 bytes hole, try to pack */

        struct net_device *        hsr_dev;              /*   464     8 */
        struct list_head           list;                 /*   472    16 */
        const struct ethtool_ops  * orig_ethtool_ops;    /*   488     8 */
        const struct dsa_netdevice_ops  * netdev_ops;    /*   496     8 */
        struct mutex               addr_lists_lock;      /*   504    32 */
        /* --- cacheline 8 boundary (512 bytes) was 24 bytes ago --- */
        struct list_head           fdbs;                 /*   536    16 */
        struct list_head           mdbs;                 /*   552    16 */
        bool                       setup;                /*   568     1 */

        /* size: 576, cachelines: 9, members: 30 */
        /* sum members: 544, holes: 6, sum holes: 25 */
        /* padding: 7 */
};

After:

pahole -C dsa_port net/dsa/slave.o
struct dsa_port {
        union {
                struct net_device * master;              /*     0     8 */
                struct net_device * slave;               /*     0     8 */
        };                                               /*     0     8 */
        const struct dsa_device_ops  * tag_ops;          /*     8     8 */
        struct dsa_switch_tree *   dst;                  /*    16     8 */
        struct sk_buff *           (*rcv)(struct sk_buff *, struct net_device *); /*    24     8 */
        enum {
                DSA_PORT_TYPE_UNUSED = 0,
                DSA_PORT_TYPE_CPU    = 1,
                DSA_PORT_TYPE_DSA    = 2,
                DSA_PORT_TYPE_USER   = 3,
        } type;                                          /*    32     4 */

        /* XXX 4 bytes hole, try to pack */

        struct dsa_switch *        ds;                   /*    40     8 */
        unsigned int               index;                /*    48     4 */

        /* XXX 4 bytes hole, try to pack */

        const char  *              name;                 /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        struct dsa_port *          cpu_dp;               /*    64     8 */
        u8                         mac[6];               /*    72     6 */
        u8                         stp_state;            /*    78     1 */

        /* XXX 1 byte hole, try to pack */

        struct device_node *       dn;                   /*    80     8 */
        unsigned int               ageing_time;          /*    88     4 */
        bool                       vlan_filtering;       /*    92     1 */
        bool                       learning;             /*    93     1 */

        /* XXX 2 bytes hole, try to pack */

        struct dsa_bridge *        bridge;               /*    96     8 */
        struct devlink_port        devlink_port;         /*   104   288 */
        /* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
        bool                       devlink_port_setup;   /*   392     1 */

        /* XXX 7 bytes hole, try to pack */

        struct phylink *           pl;                   /*   400     8 */
        struct phylink_config      pl_config;            /*   408    40 */
        /* --- cacheline 7 boundary (448 bytes) --- */
        struct net_device *        lag_dev;              /*   448     8 */
        bool                       lag_tx_enabled;       /*   456     1 */

        /* XXX 7 bytes hole, try to pack */

        struct net_device *        hsr_dev;              /*   464     8 */
        struct list_head           list;                 /*   472    16 */
        const struct ethtool_ops  * orig_ethtool_ops;    /*   488     8 */
        const struct dsa_netdevice_ops  * netdev_ops;    /*   496     8 */
        struct mutex               addr_lists_lock;      /*   504    32 */
        /* --- cacheline 8 boundary (512 bytes) was 24 bytes ago --- */
        struct list_head           fdbs;                 /*   536    16 */
        struct list_head           mdbs;                 /*   552    16 */
        bool                       setup;                /*   568     1 */

        /* size: 576, cachelines: 9, members: 30 */
        /* sum members: 544, holes: 6, sum holes: 25 */
        /* padding: 7 */
};

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 include/net/dsa.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index f16959444ae1..8878f9ce251b 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -258,12 +258,14 @@ struct dsa_port {
 	const char		*name;
 	struct dsa_port		*cpu_dp;
 	u8			mac[ETH_ALEN];
+
+	u8			stp_state;
+
 	struct device_node	*dn;
 	unsigned int		ageing_time;
 	bool			vlan_filtering;
 	/* Managed by DSA on user ports and by drivers on CPU and DSA ports */
 	bool			learning;
-	u8			stp_state;
 	struct dsa_bridge	*bridge;
 	struct devlink_port	devlink_port;
 	bool			devlink_port_setup;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 net-next 2/7] net: dsa: merge all bools of struct dsa_port into a single u8
  2022-01-05 13:21 [PATCH v2 net-next 0/7] Cleanup to main DSA structures Vladimir Oltean
  2022-01-05 13:21 ` [PATCH v2 net-next 1/7] net: dsa: move dsa_port :: stp_state near dsa_port :: mac Vladimir Oltean
@ 2022-01-05 13:21 ` Vladimir Oltean
  2022-01-05 18:30   ` Florian Fainelli
  2022-01-05 13:21 ` [PATCH v2 net-next 3/7] net: dsa: move dsa_port :: type near dsa_port :: index Vladimir Oltean
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Vladimir Oltean @ 2022-01-05 13:21 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot,
	Florian Fainelli

struct dsa_port has 5 bool members which create quite a number of 7 byte
holes in the structure layout. By merging them all into bitfields of an
u8, and placing that u8 in the 1-byte hole after dp->mac and dp->stp_state,
we can reduce the structure size from 576 bytes to 552 bytes on arm64.

Before:

pahole -C dsa_port net/dsa/slave.o
struct dsa_port {
        union {
                struct net_device * master;              /*     0     8 */
                struct net_device * slave;               /*     0     8 */
        };                                               /*     0     8 */
        const struct dsa_device_ops  * tag_ops;          /*     8     8 */
        struct dsa_switch_tree *   dst;                  /*    16     8 */
        struct sk_buff *           (*rcv)(struct sk_buff *, struct net_device *); /*    24     8 */
        enum {
                DSA_PORT_TYPE_UNUSED = 0,
                DSA_PORT_TYPE_CPU    = 1,
                DSA_PORT_TYPE_DSA    = 2,
                DSA_PORT_TYPE_USER   = 3,
        } type;                                          /*    32     4 */

        /* XXX 4 bytes hole, try to pack */

        struct dsa_switch *        ds;                   /*    40     8 */
        unsigned int               index;                /*    48     4 */

        /* XXX 4 bytes hole, try to pack */

        const char  *              name;                 /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        struct dsa_port *          cpu_dp;               /*    64     8 */
        u8                         mac[6];               /*    72     6 */
        u8                         stp_state;            /*    78     1 */

        /* XXX 1 byte hole, try to pack */

        struct device_node *       dn;                   /*    80     8 */
        unsigned int               ageing_time;          /*    88     4 */
        bool                       vlan_filtering;       /*    92     1 */
        bool                       learning;             /*    93     1 */

        /* XXX 2 bytes hole, try to pack */

        struct dsa_bridge *        bridge;               /*    96     8 */
        struct devlink_port        devlink_port;         /*   104   288 */
        /* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
        bool                       devlink_port_setup;   /*   392     1 */

        /* XXX 7 bytes hole, try to pack */

        struct phylink *           pl;                   /*   400     8 */
        struct phylink_config      pl_config;            /*   408    40 */
        /* --- cacheline 7 boundary (448 bytes) --- */
        struct net_device *        lag_dev;              /*   448     8 */
        bool                       lag_tx_enabled;       /*   456     1 */

        /* XXX 7 bytes hole, try to pack */

        struct net_device *        hsr_dev;              /*   464     8 */
        struct list_head           list;                 /*   472    16 */
        const struct ethtool_ops  * orig_ethtool_ops;    /*   488     8 */
        const struct dsa_netdevice_ops  * netdev_ops;    /*   496     8 */
        struct mutex               addr_lists_lock;      /*   504    32 */
        /* --- cacheline 8 boundary (512 bytes) was 24 bytes ago --- */
        struct list_head           fdbs;                 /*   536    16 */
        struct list_head           mdbs;                 /*   552    16 */
        bool                       setup;                /*   568     1 */

        /* size: 576, cachelines: 9, members: 30 */
        /* sum members: 544, holes: 6, sum holes: 25 */
        /* padding: 7 */
};

After:

pahole -C dsa_port net/dsa/slave.o
struct dsa_port {
        union {
                struct net_device * master;              /*     0     8 */
                struct net_device * slave;               /*     0     8 */
        };                                               /*     0     8 */
        const struct dsa_device_ops  * tag_ops;          /*     8     8 */
        struct dsa_switch_tree *   dst;                  /*    16     8 */
        struct sk_buff *           (*rcv)(struct sk_buff *, struct net_device *); /*    24     8 */
        enum {
                DSA_PORT_TYPE_UNUSED = 0,
                DSA_PORT_TYPE_CPU    = 1,
                DSA_PORT_TYPE_DSA    = 2,
                DSA_PORT_TYPE_USER   = 3,
        } type;                                          /*    32     4 */

        /* XXX 4 bytes hole, try to pack */

        struct dsa_switch *        ds;                   /*    40     8 */
        unsigned int               index;                /*    48     4 */

        /* XXX 4 bytes hole, try to pack */

        const char  *              name;                 /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        struct dsa_port *          cpu_dp;               /*    64     8 */
        u8                         mac[6];               /*    72     6 */
        u8                         stp_state;            /*    78     1 */
        u8                         vlan_filtering:1;     /*    79: 0  1 */
        u8                         learning:1;           /*    79: 1  1 */
        u8                         lag_tx_enabled:1;     /*    79: 2  1 */
        u8                         devlink_port_setup:1; /*    79: 3  1 */
        u8                         setup:1;              /*    79: 4  1 */

        /* XXX 3 bits hole, try to pack */

        struct device_node *       dn;                   /*    80     8 */
        unsigned int               ageing_time;          /*    88     4 */

        /* XXX 4 bytes hole, try to pack */

        struct dsa_bridge *        bridge;               /*    96     8 */
        struct devlink_port        devlink_port;         /*   104   288 */
        /* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
        struct phylink *           pl;                   /*   392     8 */
        struct phylink_config      pl_config;            /*   400    40 */
        struct net_device *        lag_dev;              /*   440     8 */
        /* --- cacheline 7 boundary (448 bytes) --- */
        struct net_device *        hsr_dev;              /*   448     8 */
        struct list_head           list;                 /*   456    16 */
        const struct ethtool_ops  * orig_ethtool_ops;    /*   472     8 */
        const struct dsa_netdevice_ops  * netdev_ops;    /*   480     8 */
        struct mutex               addr_lists_lock;      /*   488    32 */
        /* --- cacheline 8 boundary (512 bytes) was 8 bytes ago --- */
        struct list_head           fdbs;                 /*   520    16 */
        struct list_head           mdbs;                 /*   536    16 */

        /* size: 552, cachelines: 9, members: 30 */
        /* sum members: 539, holes: 3, sum holes: 12 */
        /* sum bitfield members: 5 bits, bit holes: 1, sum bit holes: 3 bits */
        /* last cacheline: 40 bytes */
};

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 include/net/dsa.h | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 8878f9ce251b..a8f0037b58e2 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -261,18 +261,23 @@ struct dsa_port {
 
 	u8			stp_state;
 
+	u8			vlan_filtering:1,
+				/* Managed by DSA on user ports and by
+				 * drivers on CPU and DSA ports
+				 */
+				learning:1,
+				lag_tx_enabled:1,
+				devlink_port_setup:1,
+				setup:1;
+
 	struct device_node	*dn;
 	unsigned int		ageing_time;
-	bool			vlan_filtering;
-	/* Managed by DSA on user ports and by drivers on CPU and DSA ports */
-	bool			learning;
+
 	struct dsa_bridge	*bridge;
 	struct devlink_port	devlink_port;
-	bool			devlink_port_setup;
 	struct phylink		*pl;
 	struct phylink_config	pl_config;
 	struct net_device	*lag_dev;
-	bool			lag_tx_enabled;
 	struct net_device	*hsr_dev;
 
 	struct list_head list;
@@ -293,8 +298,6 @@ struct dsa_port {
 	struct mutex		addr_lists_lock;
 	struct list_head	fdbs;
 	struct list_head	mdbs;
-
-	bool setup;
 };
 
 /* TODO: ideally DSA ports would have a single dp->link_dp member,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 net-next 3/7] net: dsa: move dsa_port :: type near dsa_port :: index
  2022-01-05 13:21 [PATCH v2 net-next 0/7] Cleanup to main DSA structures Vladimir Oltean
  2022-01-05 13:21 ` [PATCH v2 net-next 1/7] net: dsa: move dsa_port :: stp_state near dsa_port :: mac Vladimir Oltean
  2022-01-05 13:21 ` [PATCH v2 net-next 2/7] net: dsa: merge all bools of struct dsa_port into a single u8 Vladimir Oltean
@ 2022-01-05 13:21 ` Vladimir Oltean
  2022-01-05 18:31   ` Florian Fainelli
  2022-01-05 13:21 ` [PATCH v2 net-next 4/7] net: dsa: merge all bools of struct dsa_switch into a single u32 Vladimir Oltean
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Vladimir Oltean @ 2022-01-05 13:21 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot,
	Florian Fainelli

Both dsa_port :: type and dsa_port :: index introduce a 4 octet hole
after them, so we can group them together and the holes would be
eliminated, turning 16 octets of storage into just 8. This makes the
cpu_dp pointer fit in the first cache line, which is good, because
dsa_slave_to_master(), called by dsa_enqueue_skb(), uses it.

Before:

pahole -C dsa_port net/dsa/slave.o
struct dsa_port {
        union {
                struct net_device * master;              /*     0     8 */
                struct net_device * slave;               /*     0     8 */
        };                                               /*     0     8 */
        const struct dsa_device_ops  * tag_ops;          /*     8     8 */
        struct dsa_switch_tree *   dst;                  /*    16     8 */
        struct sk_buff *           (*rcv)(struct sk_buff *, struct net_device *); /*    24     8 */
        enum {
                DSA_PORT_TYPE_UNUSED = 0,
                DSA_PORT_TYPE_CPU    = 1,
                DSA_PORT_TYPE_DSA    = 2,
                DSA_PORT_TYPE_USER   = 3,
        } type;                                          /*    32     4 */

        /* XXX 4 bytes hole, try to pack */

        struct dsa_switch *        ds;                   /*    40     8 */
        unsigned int               index;                /*    48     4 */

        /* XXX 4 bytes hole, try to pack */

        const char  *              name;                 /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        struct dsa_port *          cpu_dp;               /*    64     8 */
        u8                         mac[6];               /*    72     6 */
        u8                         stp_state;            /*    78     1 */
        u8                         vlan_filtering:1;     /*    79: 0  1 */
        u8                         learning:1;           /*    79: 1  1 */
        u8                         lag_tx_enabled:1;     /*    79: 2  1 */
        u8                         devlink_port_setup:1; /*    79: 3  1 */
        u8                         setup:1;              /*    79: 4  1 */

        /* XXX 3 bits hole, try to pack */

        struct device_node *       dn;                   /*    80     8 */
        unsigned int               ageing_time;          /*    88     4 */

        /* XXX 4 bytes hole, try to pack */

        struct dsa_bridge *        bridge;               /*    96     8 */
        struct devlink_port        devlink_port;         /*   104   288 */
        /* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
        struct phylink *           pl;                   /*   392     8 */
        struct phylink_config      pl_config;            /*   400    40 */
        struct net_device *        lag_dev;              /*   440     8 */
        /* --- cacheline 7 boundary (448 bytes) --- */
        struct net_device *        hsr_dev;              /*   448     8 */
        struct list_head           list;                 /*   456    16 */
        const struct ethtool_ops  * orig_ethtool_ops;    /*   472     8 */
        const struct dsa_netdevice_ops  * netdev_ops;    /*   480     8 */
        struct mutex               addr_lists_lock;      /*   488    32 */
        /* --- cacheline 8 boundary (512 bytes) was 8 bytes ago --- */
        struct list_head           fdbs;                 /*   520    16 */
        struct list_head           mdbs;                 /*   536    16 */

        /* size: 552, cachelines: 9, members: 30 */
        /* sum members: 539, holes: 3, sum holes: 12 */
        /* sum bitfield members: 5 bits, bit holes: 1, sum bit holes: 3 bits */
        /* last cacheline: 40 bytes */
};

After:

pahole -C dsa_port net/dsa/slave.o
struct dsa_port {
        union {
                struct net_device * master;              /*     0     8 */
                struct net_device * slave;               /*     0     8 */
        };                                               /*     0     8 */
        const struct dsa_device_ops  * tag_ops;          /*     8     8 */
        struct dsa_switch_tree *   dst;                  /*    16     8 */
        struct sk_buff *           (*rcv)(struct sk_buff *, struct net_device *); /*    24     8 */
        struct dsa_switch *        ds;                   /*    32     8 */
        unsigned int               index;                /*    40     4 */
        enum {
                DSA_PORT_TYPE_UNUSED = 0,
                DSA_PORT_TYPE_CPU    = 1,
                DSA_PORT_TYPE_DSA    = 2,
                DSA_PORT_TYPE_USER   = 3,
        } type;                                          /*    44     4 */
        const char  *              name;                 /*    48     8 */
        struct dsa_port *          cpu_dp;               /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        u8                         mac[6];               /*    64     6 */
        u8                         stp_state;            /*    70     1 */
        u8                         vlan_filtering:1;     /*    71: 0  1 */
        u8                         learning:1;           /*    71: 1  1 */
        u8                         lag_tx_enabled:1;     /*    71: 2  1 */
        u8                         devlink_port_setup:1; /*    71: 3  1 */
        u8                         setup:1;              /*    71: 4  1 */

        /* XXX 3 bits hole, try to pack */

        struct device_node *       dn;                   /*    72     8 */
        unsigned int               ageing_time;          /*    80     4 */

        /* XXX 4 bytes hole, try to pack */

        struct dsa_bridge *        bridge;               /*    88     8 */
        struct devlink_port        devlink_port;         /*    96   288 */
        /* --- cacheline 6 boundary (384 bytes) --- */
        struct phylink *           pl;                   /*   384     8 */
        struct phylink_config      pl_config;            /*   392    40 */
        struct net_device *        lag_dev;              /*   432     8 */
        struct net_device *        hsr_dev;              /*   440     8 */
        /* --- cacheline 7 boundary (448 bytes) --- */
        struct list_head           list;                 /*   448    16 */
        const struct ethtool_ops  * orig_ethtool_ops;    /*   464     8 */
        const struct dsa_netdevice_ops  * netdev_ops;    /*   472     8 */
        struct mutex               addr_lists_lock;      /*   480    32 */
        /* --- cacheline 8 boundary (512 bytes) --- */
        struct list_head           fdbs;                 /*   512    16 */
        struct list_head           mdbs;                 /*   528    16 */

        /* size: 544, cachelines: 9, members: 30 */
        /* sum members: 539, holes: 1, sum holes: 4 */
        /* sum bitfield members: 5 bits, bit holes: 1, sum bit holes: 3 bits */
        /* last cacheline: 32 bytes */
};

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 include/net/dsa.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index a8f0037b58e2..5e42fa7ea377 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -246,6 +246,10 @@ struct dsa_port {
 	struct dsa_switch_tree *dst;
 	struct sk_buff *(*rcv)(struct sk_buff *skb, struct net_device *dev);
 
+	struct dsa_switch	*ds;
+
+	unsigned int		index;
+
 	enum {
 		DSA_PORT_TYPE_UNUSED = 0,
 		DSA_PORT_TYPE_CPU,
@@ -253,8 +257,6 @@ struct dsa_port {
 		DSA_PORT_TYPE_USER,
 	} type;
 
-	struct dsa_switch	*ds;
-	unsigned int		index;
 	const char		*name;
 	struct dsa_port		*cpu_dp;
 	u8			mac[ETH_ALEN];
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 net-next 4/7] net: dsa: merge all bools of struct dsa_switch into a single u32
  2022-01-05 13:21 [PATCH v2 net-next 0/7] Cleanup to main DSA structures Vladimir Oltean
                   ` (2 preceding siblings ...)
  2022-01-05 13:21 ` [PATCH v2 net-next 3/7] net: dsa: move dsa_port :: type near dsa_port :: index Vladimir Oltean
@ 2022-01-05 13:21 ` Vladimir Oltean
  2022-01-05 18:32   ` Florian Fainelli
  2022-01-05 13:21 ` [PATCH v2 net-next 5/7] net: dsa: make dsa_switch :: num_ports an unsigned int Vladimir Oltean
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Vladimir Oltean @ 2022-01-05 13:21 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot,
	Florian Fainelli

struct dsa_switch has 9 boolean properties, many of which are in fact
set by drivers for custom behavior (vlan_filtering_is_global,
needs_standalone_vlan_filtering, etc etc). The binary layout of the
structure could be improved. For example, the "bool setup" at the
beginning introduces a gratuitous 7 byte hole in the first cache line.

The change merges all boolean properties into bitfields of an u32, and
places that u32 in the first cache line of the structure, since many
bools are accessed from the data path (untag_bridge_pvid, vlan_filtering,
vlan_filtering_is_global).

We place this u32 after the existing ds->index, which is also 4 bytes in
size. As a positive side effect, ds->tagger_data now fits into the first
cache line too, because 4 bytes are saved.

Before:

pahole -C dsa_switch net/dsa/slave.o
struct dsa_switch {
        bool                       setup;                /*     0     1 */

        /* XXX 7 bytes hole, try to pack */

        struct device *            dev;                  /*     8     8 */
        struct dsa_switch_tree *   dst;                  /*    16     8 */
        unsigned int               index;                /*    24     4 */

        /* XXX 4 bytes hole, try to pack */

        struct notifier_block      nb;                   /*    32    24 */

        /* XXX last struct has 4 bytes of padding */

        void *                     priv;                 /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        void *                     tagger_data;          /*    64     8 */
        struct dsa_chip_data *     cd;                   /*    72     8 */
        const struct dsa_switch_ops  * ops;              /*    80     8 */
        u32                        phys_mii_mask;        /*    88     4 */

        /* XXX 4 bytes hole, try to pack */

        struct mii_bus *           slave_mii_bus;        /*    96     8 */
        unsigned int               ageing_time_min;      /*   104     4 */
        unsigned int               ageing_time_max;      /*   108     4 */
        struct dsa_8021q_context * tag_8021q_ctx;        /*   112     8 */
        struct devlink *           devlink;              /*   120     8 */
        /* --- cacheline 2 boundary (128 bytes) --- */
        unsigned int               num_tx_queues;        /*   128     4 */
        bool                       vlan_filtering_is_global; /*   132     1 */
        bool                       needs_standalone_vlan_filtering; /*   133     1 */
        bool                       configure_vlan_while_not_filtering; /*   134     1 */
        bool                       untag_bridge_pvid;    /*   135     1 */
        bool                       assisted_learning_on_cpu_port; /*   136     1 */
        bool                       vlan_filtering;       /*   137     1 */
        bool                       pcs_poll;             /*   138     1 */
        bool                       mtu_enforcement_ingress; /*   139     1 */
        unsigned int               num_lag_ids;          /*   140     4 */
        unsigned int               max_num_bridges;      /*   144     4 */

        /* XXX 4 bytes hole, try to pack */

        size_t                     num_ports;            /*   152     8 */

        /* size: 160, cachelines: 3, members: 27 */
        /* sum members: 141, holes: 4, sum holes: 19 */
        /* paddings: 1, sum paddings: 4 */
        /* last cacheline: 32 bytes */
};

After:

pahole -C dsa_switch net/dsa/slave.o
struct dsa_switch {
        struct device *            dev;                  /*     0     8 */
        struct dsa_switch_tree *   dst;                  /*     8     8 */
        unsigned int               index;                /*    16     4 */
        u32                        setup:1;              /*    20: 0  4 */
        u32                        vlan_filtering_is_global:1; /*    20: 1  4 */
        u32                        needs_standalone_vlan_filtering:1; /*    20: 2  4 */
        u32                        configure_vlan_while_not_filtering:1; /*    20: 3  4 */
        u32                        untag_bridge_pvid:1;  /*    20: 4  4 */
        u32                        assisted_learning_on_cpu_port:1; /*    20: 5  4 */
        u32                        vlan_filtering:1;     /*    20: 6  4 */
        u32                        pcs_poll:1;           /*    20: 7  4 */
        u32                        mtu_enforcement_ingress:1; /*    20: 8  4 */

        /* XXX 23 bits hole, try to pack */

        struct notifier_block      nb;                   /*    24    24 */

        /* XXX last struct has 4 bytes of padding */

        void *                     priv;                 /*    48     8 */
        void *                     tagger_data;          /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        struct dsa_chip_data *     cd;                   /*    64     8 */
        const struct dsa_switch_ops  * ops;              /*    72     8 */
        u32                        phys_mii_mask;        /*    80     4 */

        /* XXX 4 bytes hole, try to pack */

        struct mii_bus *           slave_mii_bus;        /*    88     8 */
        unsigned int               ageing_time_min;      /*    96     4 */
        unsigned int               ageing_time_max;      /*   100     4 */
        struct dsa_8021q_context * tag_8021q_ctx;        /*   104     8 */
        struct devlink *           devlink;              /*   112     8 */
        unsigned int               num_tx_queues;        /*   120     4 */
        unsigned int               num_lag_ids;          /*   124     4 */
        /* --- cacheline 2 boundary (128 bytes) --- */
        unsigned int               max_num_bridges;      /*   128     4 */

        /* XXX 4 bytes hole, try to pack */

        size_t                     num_ports;            /*   136     8 */

        /* size: 144, cachelines: 3, members: 27 */
        /* sum members: 132, holes: 2, sum holes: 8 */
        /* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 23 bits */
        /* paddings: 1, sum paddings: 4 */
        /* last cacheline: 16 bytes */
};

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 include/net/dsa.h | 97 +++++++++++++++++++++++++----------------------
 1 file changed, 51 insertions(+), 46 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 5e42fa7ea377..a8a586039033 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -321,8 +321,6 @@ struct dsa_mac_addr {
 };
 
 struct dsa_switch {
-	bool setup;
-
 	struct device *dev;
 
 	/*
@@ -331,6 +329,57 @@ struct dsa_switch {
 	struct dsa_switch_tree	*dst;
 	unsigned int		index;
 
+	u32			setup:1,
+				/* Disallow bridge core from requesting
+				 * different VLAN awareness settings on ports
+				 * if not hardware-supported
+				 */
+				vlan_filtering_is_global:1,
+				/* Keep VLAN filtering enabled on ports not
+				 * offloading any upper
+				 */
+				needs_standalone_vlan_filtering:1,
+				/* Pass .port_vlan_add and .port_vlan_del to
+				 * drivers even for bridges that have
+				 * vlan_filtering=0. All drivers should ideally
+				 * set this (and then the option would get
+				 * removed), but it is unknown whether this
+				 * would break things or not.
+				 */
+				configure_vlan_while_not_filtering:1,
+				/* If the switch driver always programs the CPU
+				 * port as egress tagged despite the VLAN
+				 * configuration indicating otherwise, then
+				 * setting @untag_bridge_pvid will force the
+				 * DSA receive path to pop the bridge's
+				 * default_pvid VLAN tagged frames to offer a
+				 * consistent behavior between a
+				 * vlan_filtering=0 and vlan_filtering=1 bridge
+				 * device.
+				 */
+				untag_bridge_pvid:1,
+				/* Let DSA manage the FDB entries towards the
+				 * CPU, based on the software bridge database.
+				 */
+				assisted_learning_on_cpu_port:1,
+				/* In case vlan_filtering_is_global is set, the
+				 * VLAN awareness state should be retrieved
+				 * from here and not from the per-port
+				 * settings.
+				 */
+				vlan_filtering:1,
+				/* MAC PCS does not provide link state change
+				 * interrupt, and requires polling. Flag passed
+				 * on to PHYLINK.
+				 */
+				pcs_poll:1,
+				/* For switches that only have the MRU
+				 * configurable. To ensure the configured MTU
+				 * is not exceeded, normalization of MRU on all
+				 * bridged interfaces is needed.
+				 */
+				mtu_enforcement_ingress:1;
+
 	/* Listener for switch fabric events */
 	struct notifier_block	nb;
 
@@ -371,50 +420,6 @@ struct dsa_switch {
 	/* Number of switch port queues */
 	unsigned int		num_tx_queues;
 
-	/* Disallow bridge core from requesting different VLAN awareness
-	 * settings on ports if not hardware-supported
-	 */
-	bool			vlan_filtering_is_global;
-
-	/* Keep VLAN filtering enabled on ports not offloading any upper. */
-	bool			needs_standalone_vlan_filtering;
-
-	/* Pass .port_vlan_add and .port_vlan_del to drivers even for bridges
-	 * that have vlan_filtering=0. All drivers should ideally set this (and
-	 * then the option would get removed), but it is unknown whether this
-	 * would break things or not.
-	 */
-	bool			configure_vlan_while_not_filtering;
-
-	/* If the switch driver always programs the CPU port as egress tagged
-	 * despite the VLAN configuration indicating otherwise, then setting
-	 * @untag_bridge_pvid will force the DSA receive path to pop the bridge's
-	 * default_pvid VLAN tagged frames to offer a consistent behavior
-	 * between a vlan_filtering=0 and vlan_filtering=1 bridge device.
-	 */
-	bool			untag_bridge_pvid;
-
-	/* Let DSA manage the FDB entries towards the CPU, based on the
-	 * software bridge database.
-	 */
-	bool			assisted_learning_on_cpu_port;
-
-	/* In case vlan_filtering_is_global is set, the VLAN awareness state
-	 * should be retrieved from here and not from the per-port settings.
-	 */
-	bool			vlan_filtering;
-
-	/* MAC PCS does not provide link state change interrupt, and requires
-	 * polling. Flag passed on to PHYLINK.
-	 */
-	bool			pcs_poll;
-
-	/* For switches that only have the MRU configurable. To ensure the
-	 * configured MTU is not exceeded, normalization of MRU on all bridged
-	 * interfaces is needed.
-	 */
-	bool			mtu_enforcement_ingress;
-
 	/* Drivers that benefit from having an ID associated with each
 	 * offloaded LAG should set this to the maximum number of
 	 * supported IDs. DSA will then maintain a mapping of _at
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 net-next 5/7] net: dsa: make dsa_switch :: num_ports an unsigned int
  2022-01-05 13:21 [PATCH v2 net-next 0/7] Cleanup to main DSA structures Vladimir Oltean
                   ` (3 preceding siblings ...)
  2022-01-05 13:21 ` [PATCH v2 net-next 4/7] net: dsa: merge all bools of struct dsa_switch into a single u32 Vladimir Oltean
@ 2022-01-05 13:21 ` Vladimir Oltean
  2022-01-05 18:33   ` Florian Fainelli
  2022-01-05 13:21 ` [PATCH v2 net-next 6/7] net: dsa: move dsa_switch_tree :: ports and lags to first cache line Vladimir Oltean
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Vladimir Oltean @ 2022-01-05 13:21 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot,
	Florian Fainelli

Currently, num_ports is declared as size_t, which is defined as
__kernel_ulong_t, therefore it occupies 8 bytes of memory.

Even switches with port numbers in the range of tens are exotic, so
there is no need for this amount of storage.

Additionally, because the max_num_bridges member right above it is also
4 bytes, it means the compiler needs to add padding between the last 2
fields. By reducing the size, we don't need that padding and can reduce
the struct size.

Before:

pahole -C dsa_switch net/dsa/slave.o
struct dsa_switch {
        struct device *            dev;                  /*     0     8 */
        struct dsa_switch_tree *   dst;                  /*     8     8 */
        unsigned int               index;                /*    16     4 */
        u32                        setup:1;              /*    20: 0  4 */
        u32                        vlan_filtering_is_global:1; /*    20: 1  4 */
        u32                        needs_standalone_vlan_filtering:1; /*    20: 2  4 */
        u32                        configure_vlan_while_not_filtering:1; /*    20: 3  4 */
        u32                        untag_bridge_pvid:1;  /*    20: 4  4 */
        u32                        assisted_learning_on_cpu_port:1; /*    20: 5  4 */
        u32                        vlan_filtering:1;     /*    20: 6  4 */
        u32                        pcs_poll:1;           /*    20: 7  4 */
        u32                        mtu_enforcement_ingress:1; /*    20: 8  4 */

        /* XXX 23 bits hole, try to pack */

        struct notifier_block      nb;                   /*    24    24 */

        /* XXX last struct has 4 bytes of padding */

        void *                     priv;                 /*    48     8 */
        void *                     tagger_data;          /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        struct dsa_chip_data *     cd;                   /*    64     8 */
        const struct dsa_switch_ops  * ops;              /*    72     8 */
        u32                        phys_mii_mask;        /*    80     4 */

        /* XXX 4 bytes hole, try to pack */

        struct mii_bus *           slave_mii_bus;        /*    88     8 */
        unsigned int               ageing_time_min;      /*    96     4 */
        unsigned int               ageing_time_max;      /*   100     4 */
        struct dsa_8021q_context * tag_8021q_ctx;        /*   104     8 */
        struct devlink *           devlink;              /*   112     8 */
        unsigned int               num_tx_queues;        /*   120     4 */
        unsigned int               num_lag_ids;          /*   124     4 */
        /* --- cacheline 2 boundary (128 bytes) --- */
        unsigned int               max_num_bridges;      /*   128     4 */

        /* XXX 4 bytes hole, try to pack */

        size_t                     num_ports;            /*   136     8 */

        /* size: 144, cachelines: 3, members: 27 */
        /* sum members: 132, holes: 2, sum holes: 8 */
        /* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 23 bits */
        /* paddings: 1, sum paddings: 4 */
        /* last cacheline: 16 bytes */
};

After:

pahole -C dsa_switch net/dsa/slave.o
struct dsa_switch {
        struct device *            dev;                  /*     0     8 */
        struct dsa_switch_tree *   dst;                  /*     8     8 */
        unsigned int               index;                /*    16     4 */
        u32                        setup:1;              /*    20: 0  4 */
        u32                        vlan_filtering_is_global:1; /*    20: 1  4 */
        u32                        needs_standalone_vlan_filtering:1; /*    20: 2  4 */
        u32                        configure_vlan_while_not_filtering:1; /*    20: 3  4 */
        u32                        untag_bridge_pvid:1;  /*    20: 4  4 */
        u32                        assisted_learning_on_cpu_port:1; /*    20: 5  4 */
        u32                        vlan_filtering:1;     /*    20: 6  4 */
        u32                        pcs_poll:1;           /*    20: 7  4 */
        u32                        mtu_enforcement_ingress:1; /*    20: 8  4 */

        /* XXX 23 bits hole, try to pack */

        struct notifier_block      nb;                   /*    24    24 */

        /* XXX last struct has 4 bytes of padding */

        void *                     priv;                 /*    48     8 */
        void *                     tagger_data;          /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        struct dsa_chip_data *     cd;                   /*    64     8 */
        const struct dsa_switch_ops  * ops;              /*    72     8 */
        u32                        phys_mii_mask;        /*    80     4 */

        /* XXX 4 bytes hole, try to pack */

        struct mii_bus *           slave_mii_bus;        /*    88     8 */
        unsigned int               ageing_time_min;      /*    96     4 */
        unsigned int               ageing_time_max;      /*   100     4 */
        struct dsa_8021q_context * tag_8021q_ctx;        /*   104     8 */
        struct devlink *           devlink;              /*   112     8 */
        unsigned int               num_tx_queues;        /*   120     4 */
        unsigned int               num_lag_ids;          /*   124     4 */
        /* --- cacheline 2 boundary (128 bytes) --- */
        unsigned int               max_num_bridges;      /*   128     4 */
        unsigned int               num_ports;            /*   132     4 */

        /* size: 136, cachelines: 3, members: 27 */
        /* sum members: 128, holes: 1, sum holes: 4 */
        /* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 23 bits */
        /* paddings: 1, sum paddings: 4 */
        /* last cacheline: 8 bytes */
};

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 include/net/dsa.h | 2 +-
 net/dsa/dsa2.c    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index a8a586039033..fef9d8bb5190 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -435,7 +435,7 @@ struct dsa_switch {
 	 */
 	unsigned int		max_num_bridges;
 
-	size_t num_ports;
+	unsigned int		num_ports;
 };
 
 static inline struct dsa_port *dsa_to_port(struct dsa_switch *ds, int p)
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index c1da813786a4..3d21521453fe 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -1475,7 +1475,7 @@ static int dsa_switch_parse_ports_of(struct dsa_switch *ds,
 		}
 
 		if (reg >= ds->num_ports) {
-			dev_err(ds->dev, "port %pOF index %u exceeds num_ports (%zu)\n",
+			dev_err(ds->dev, "port %pOF index %u exceeds num_ports (%u)\n",
 				port, reg, ds->num_ports);
 			of_node_put(port);
 			err = -EINVAL;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 net-next 6/7] net: dsa: move dsa_switch_tree :: ports and lags to first cache line
  2022-01-05 13:21 [PATCH v2 net-next 0/7] Cleanup to main DSA structures Vladimir Oltean
                   ` (4 preceding siblings ...)
  2022-01-05 13:21 ` [PATCH v2 net-next 5/7] net: dsa: make dsa_switch :: num_ports an unsigned int Vladimir Oltean
@ 2022-01-05 13:21 ` Vladimir Oltean
  2022-01-05 18:34   ` Florian Fainelli
  2022-01-05 13:21 ` [PATCH v2 net-next 7/7] net: dsa: combine two holes in struct dsa_switch_tree Vladimir Oltean
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Vladimir Oltean @ 2022-01-05 13:21 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot,
	Florian Fainelli

dst->ports is accessed most notably by dsa_master_find_slave(), which is
invoked in the RX path.

dst->lags is accessed by dsa_lag_dev(), which is invoked in the RX path
of tag_dsa.c.

dst->tag_ops, dst->default_proto and dst->pd don't need to be in the
first cache line, so they are moved out by this change.

Before:

pahole -C dsa_switch_tree net/dsa/slave.o
struct dsa_switch_tree {
        struct list_head           list;                 /*     0    16 */
        struct raw_notifier_head   nh;                   /*    16     8 */
        unsigned int               index;                /*    24     4 */
        struct kref                refcount;             /*    28     4 */
        bool                       setup;                /*    32     1 */

        /* XXX 7 bytes hole, try to pack */

        const struct dsa_device_ops  * tag_ops;          /*    40     8 */
        enum dsa_tag_protocol      default_proto;        /*    48     4 */

        /* XXX 4 bytes hole, try to pack */

        struct dsa_platform_data * pd;                   /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        struct list_head           ports;                /*    64    16 */
        struct list_head           rtable;               /*    80    16 */
        struct net_device * *      lags;                 /*    96     8 */
        unsigned int               lags_len;             /*   104     4 */
        unsigned int               last_switch;          /*   108     4 */

        /* size: 112, cachelines: 2, members: 13 */
        /* sum members: 101, holes: 2, sum holes: 11 */
        /* last cacheline: 48 bytes */
};

After:

pahole -C dsa_switch_tree net/dsa/slave.o
struct dsa_switch_tree {
        struct list_head           list;                 /*     0    16 */
        struct list_head           ports;                /*    16    16 */
        struct raw_notifier_head   nh;                   /*    32     8 */
        unsigned int               index;                /*    40     4 */
        struct kref                refcount;             /*    44     4 */
        struct net_device * *      lags;                 /*    48     8 */
        bool                       setup;                /*    56     1 */

        /* XXX 7 bytes hole, try to pack */

        /* --- cacheline 1 boundary (64 bytes) --- */
        const struct dsa_device_ops  * tag_ops;          /*    64     8 */
        enum dsa_tag_protocol      default_proto;        /*    72     4 */

        /* XXX 4 bytes hole, try to pack */

        struct dsa_platform_data * pd;                   /*    80     8 */
        struct list_head           rtable;               /*    88    16 */
        unsigned int               lags_len;             /*   104     4 */
        unsigned int               last_switch;          /*   108     4 */

        /* size: 112, cachelines: 2, members: 13 */
        /* sum members: 101, holes: 2, sum holes: 11 */
        /* last cacheline: 48 bytes */
};

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 include/net/dsa.h | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index fef9d8bb5190..cbbac75138d9 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -119,6 +119,9 @@ struct dsa_netdevice_ops {
 struct dsa_switch_tree {
 	struct list_head	list;
 
+	/* List of switch ports */
+	struct list_head ports;
+
 	/* Notifier chain for switch-wide events */
 	struct raw_notifier_head	nh;
 
@@ -128,6 +131,11 @@ struct dsa_switch_tree {
 	/* Number of switches attached to this tree */
 	struct kref refcount;
 
+	/* Maps offloaded LAG netdevs to a zero-based linear ID for
+	 * drivers that need it.
+	 */
+	struct net_device **lags;
+
 	/* Has this tree been applied to the hardware? */
 	bool setup;
 
@@ -145,16 +153,10 @@ struct dsa_switch_tree {
 	 */
 	struct dsa_platform_data	*pd;
 
-	/* List of switch ports */
-	struct list_head ports;
-
 	/* List of DSA links composing the routing table */
 	struct list_head rtable;
 
-	/* Maps offloaded LAG netdevs to a zero-based linear ID for
-	 * drivers that need it.
-	 */
-	struct net_device **lags;
+	/* Length of "lags" array */
 	unsigned int lags_len;
 
 	/* Track the largest switch index within a tree */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 net-next 7/7] net: dsa: combine two holes in struct dsa_switch_tree
  2022-01-05 13:21 [PATCH v2 net-next 0/7] Cleanup to main DSA structures Vladimir Oltean
                   ` (5 preceding siblings ...)
  2022-01-05 13:21 ` [PATCH v2 net-next 6/7] net: dsa: move dsa_switch_tree :: ports and lags to first cache line Vladimir Oltean
@ 2022-01-05 13:21 ` Vladimir Oltean
  2022-01-05 18:34   ` Florian Fainelli
  2022-01-05 14:28 ` [PATCH v2 net-next 0/7] Cleanup to main DSA structures Vladimir Oltean
  2022-01-05 18:39 ` Florian Fainelli
  8 siblings, 1 reply; 26+ messages in thread
From: Vladimir Oltean @ 2022-01-05 13:21 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot,
	Florian Fainelli

There is a 7 byte hole after dst->setup and a 4 byte hole after
dst->default_proto. Combining them, we have a single hole of just 3
bytes on 64 bit machines.

Before:

pahole -C dsa_switch_tree net/dsa/slave.o
struct dsa_switch_tree {
        struct list_head           list;                 /*     0    16 */
        struct list_head           ports;                /*    16    16 */
        struct raw_notifier_head   nh;                   /*    32     8 */
        unsigned int               index;                /*    40     4 */
        struct kref                refcount;             /*    44     4 */
        struct net_device * *      lags;                 /*    48     8 */
        bool                       setup;                /*    56     1 */

        /* XXX 7 bytes hole, try to pack */

        /* --- cacheline 1 boundary (64 bytes) --- */
        const struct dsa_device_ops  * tag_ops;          /*    64     8 */
        enum dsa_tag_protocol      default_proto;        /*    72     4 */

        /* XXX 4 bytes hole, try to pack */

        struct dsa_platform_data * pd;                   /*    80     8 */
        struct list_head           rtable;               /*    88    16 */
        unsigned int               lags_len;             /*   104     4 */
        unsigned int               last_switch;          /*   108     4 */

        /* size: 112, cachelines: 2, members: 13 */
        /* sum members: 101, holes: 2, sum holes: 11 */
        /* last cacheline: 48 bytes */
};

After:

pahole -C dsa_switch_tree net/dsa/slave.o
struct dsa_switch_tree {
        struct list_head           list;                 /*     0    16 */
        struct list_head           ports;                /*    16    16 */
        struct raw_notifier_head   nh;                   /*    32     8 */
        unsigned int               index;                /*    40     4 */
        struct kref                refcount;             /*    44     4 */
        struct net_device * *      lags;                 /*    48     8 */
        const struct dsa_device_ops  * tag_ops;          /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        enum dsa_tag_protocol      default_proto;        /*    64     4 */
        bool                       setup;                /*    68     1 */

        /* XXX 3 bytes hole, try to pack */

        struct dsa_platform_data * pd;                   /*    72     8 */
        struct list_head           rtable;               /*    80    16 */
        unsigned int               lags_len;             /*    96     4 */
        unsigned int               last_switch;          /*   100     4 */

        /* size: 104, cachelines: 2, members: 13 */
        /* sum members: 101, holes: 1, sum holes: 3 */
        /* last cacheline: 40 bytes */
};

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 include/net/dsa.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index cbbac75138d9..5d0fec6db3ae 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -136,9 +136,6 @@ struct dsa_switch_tree {
 	 */
 	struct net_device **lags;
 
-	/* Has this tree been applied to the hardware? */
-	bool setup;
-
 	/* Tagging protocol operations */
 	const struct dsa_device_ops *tag_ops;
 
@@ -147,6 +144,9 @@ struct dsa_switch_tree {
 	 */
 	enum dsa_tag_protocol default_proto;
 
+	/* Has this tree been applied to the hardware? */
+	bool setup;
+
 	/*
 	 * Configuration data for the platform device that owns
 	 * this dsa switch tree instance.
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 net-next 0/7] Cleanup to main DSA structures
  2022-01-05 13:21 [PATCH v2 net-next 0/7] Cleanup to main DSA structures Vladimir Oltean
                   ` (6 preceding siblings ...)
  2022-01-05 13:21 ` [PATCH v2 net-next 7/7] net: dsa: combine two holes in struct dsa_switch_tree Vladimir Oltean
@ 2022-01-05 14:28 ` Vladimir Oltean
  2022-01-05 18:37   ` Florian Fainelli
  2022-01-05 18:39 ` Florian Fainelli
  8 siblings, 1 reply; 26+ messages in thread
From: Vladimir Oltean @ 2022-01-05 14:28 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot,
	Florian Fainelli

On Wed, Jan 05, 2022 at 03:21:34PM +0200, Vladimir Oltean wrote:
> This series contains changes that do the following:
> 
> - struct dsa_port reduced from 576 to 544 bytes, and first cache line a
>   bit better organized
> - struct dsa_switch from 160 to 136 bytes, and first cache line a bit
>   better organized
> - struct dsa_switch_tree from 112 to 104 bytes, and first cache line a
>   bit better organized
> 
> No changes compared to v1, just split into a separate patch set.
> 
> Vladimir Oltean (7):
>   net: dsa: move dsa_port :: stp_state near dsa_port :: mac
>   net: dsa: merge all bools of struct dsa_port into a single u8
>   net: dsa: move dsa_port :: type near dsa_port :: index
>   net: dsa: merge all bools of struct dsa_switch into a single u32
>   net: dsa: make dsa_switch :: num_ports an unsigned int
>   net: dsa: move dsa_switch_tree :: ports and lags to first cache line
>   net: dsa: combine two holes in struct dsa_switch_tree
> 
>  include/net/dsa.h | 146 +++++++++++++++++++++++++---------------------
>  net/dsa/dsa2.c    |   2 +-
>  2 files changed, 81 insertions(+), 67 deletions(-)
> 
> -- 
> 2.25.1
>

Let's keep this version for review only (RFC). For the final version I
just figured that I can use this syntax:

	u8			vlan_filtering:1;

	/* Managed by DSA on user ports and by drivers on CPU and DSA ports */
	u8			learning:1;

	u8			lag_tx_enabled:1;

	u8			devlink_port_setup:1;

	u8			setup:1;

instead of this syntax:

	u8			vlan_filtering:1,
				/* Managed by DSA on user ports and by
				 * drivers on CPU and DSA ports
				 */
				learning:1,
				lag_tx_enabled:1,
				devlink_port_setup:1,
				setup:1;

which is what I'm going to prefer.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 net-next 1/7] net: dsa: move dsa_port :: stp_state near dsa_port :: mac
  2022-01-05 13:21 ` [PATCH v2 net-next 1/7] net: dsa: move dsa_port :: stp_state near dsa_port :: mac Vladimir Oltean
@ 2022-01-05 18:30   ` Florian Fainelli
  0 siblings, 0 replies; 26+ messages in thread
From: Florian Fainelli @ 2022-01-05 18:30 UTC (permalink / raw)
  To: Vladimir Oltean, netdev
  Cc: David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot

On 1/5/22 5:21 AM, Vladimir Oltean wrote:
> The MAC address of a port is 6 octets in size, and this creates a 2
> octet hole after it. There are some other u8 members of struct dsa_port
> that we can put in that hole. One such member is the stp_state.
> 
> Before:
> 
> pahole -C dsa_port net/dsa/slave.o
> struct dsa_port {
>         union {
>                 struct net_device * master;              /*     0     8 */
>                 struct net_device * slave;               /*     0     8 */
>         };                                               /*     0     8 */
>         const struct dsa_device_ops  * tag_ops;          /*     8     8 */
>         struct dsa_switch_tree *   dst;                  /*    16     8 */
>         struct sk_buff *           (*rcv)(struct sk_buff *, struct net_device *); /*    24     8 */
>         enum {
>                 DSA_PORT_TYPE_UNUSED = 0,
>                 DSA_PORT_TYPE_CPU    = 1,
>                 DSA_PORT_TYPE_DSA    = 2,
>                 DSA_PORT_TYPE_USER   = 3,
>         } type;                                          /*    32     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         struct dsa_switch *        ds;                   /*    40     8 */
>         unsigned int               index;                /*    48     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         const char  *              name;                 /*    56     8 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         struct dsa_port *          cpu_dp;               /*    64     8 */
>         u8                         mac[6];               /*    72     6 */
> 
>         /* XXX 2 bytes hole, try to pack */
> 
>         struct device_node *       dn;                   /*    80     8 */
>         unsigned int               ageing_time;          /*    88     4 */
>         bool                       vlan_filtering;       /*    92     1 */
>         bool                       learning;             /*    93     1 */
>         u8                         stp_state;            /*    94     1 */
> 
>         /* XXX 1 byte hole, try to pack */
> 
>         struct dsa_bridge *        bridge;               /*    96     8 */
>         struct devlink_port        devlink_port;         /*   104   288 */
>         /* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
>         bool                       devlink_port_setup;   /*   392     1 */
> 
>         /* XXX 7 bytes hole, try to pack */
> 
>         struct phylink *           pl;                   /*   400     8 */
>         struct phylink_config      pl_config;            /*   408    40 */
>         /* --- cacheline 7 boundary (448 bytes) --- */
>         struct net_device *        lag_dev;              /*   448     8 */
>         bool                       lag_tx_enabled;       /*   456     1 */
> 
>         /* XXX 7 bytes hole, try to pack */
> 
>         struct net_device *        hsr_dev;              /*   464     8 */
>         struct list_head           list;                 /*   472    16 */
>         const struct ethtool_ops  * orig_ethtool_ops;    /*   488     8 */
>         const struct dsa_netdevice_ops  * netdev_ops;    /*   496     8 */
>         struct mutex               addr_lists_lock;      /*   504    32 */
>         /* --- cacheline 8 boundary (512 bytes) was 24 bytes ago --- */
>         struct list_head           fdbs;                 /*   536    16 */
>         struct list_head           mdbs;                 /*   552    16 */
>         bool                       setup;                /*   568     1 */
> 
>         /* size: 576, cachelines: 9, members: 30 */
>         /* sum members: 544, holes: 6, sum holes: 25 */
>         /* padding: 7 */
> };
> 
> After:
> 
> pahole -C dsa_port net/dsa/slave.o
> struct dsa_port {
>         union {
>                 struct net_device * master;              /*     0     8 */
>                 struct net_device * slave;               /*     0     8 */
>         };                                               /*     0     8 */
>         const struct dsa_device_ops  * tag_ops;          /*     8     8 */
>         struct dsa_switch_tree *   dst;                  /*    16     8 */
>         struct sk_buff *           (*rcv)(struct sk_buff *, struct net_device *); /*    24     8 */
>         enum {
>                 DSA_PORT_TYPE_UNUSED = 0,
>                 DSA_PORT_TYPE_CPU    = 1,
>                 DSA_PORT_TYPE_DSA    = 2,
>                 DSA_PORT_TYPE_USER   = 3,
>         } type;                                          /*    32     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         struct dsa_switch *        ds;                   /*    40     8 */
>         unsigned int               index;                /*    48     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         const char  *              name;                 /*    56     8 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         struct dsa_port *          cpu_dp;               /*    64     8 */
>         u8                         mac[6];               /*    72     6 */
>         u8                         stp_state;            /*    78     1 */
> 
>         /* XXX 1 byte hole, try to pack */
> 
>         struct device_node *       dn;                   /*    80     8 */
>         unsigned int               ageing_time;          /*    88     4 */
>         bool                       vlan_filtering;       /*    92     1 */
>         bool                       learning;             /*    93     1 */
> 
>         /* XXX 2 bytes hole, try to pack */
> 
>         struct dsa_bridge *        bridge;               /*    96     8 */
>         struct devlink_port        devlink_port;         /*   104   288 */
>         /* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
>         bool                       devlink_port_setup;   /*   392     1 */
> 
>         /* XXX 7 bytes hole, try to pack */
> 
>         struct phylink *           pl;                   /*   400     8 */
>         struct phylink_config      pl_config;            /*   408    40 */
>         /* --- cacheline 7 boundary (448 bytes) --- */
>         struct net_device *        lag_dev;              /*   448     8 */
>         bool                       lag_tx_enabled;       /*   456     1 */
> 
>         /* XXX 7 bytes hole, try to pack */
> 
>         struct net_device *        hsr_dev;              /*   464     8 */
>         struct list_head           list;                 /*   472    16 */
>         const struct ethtool_ops  * orig_ethtool_ops;    /*   488     8 */
>         const struct dsa_netdevice_ops  * netdev_ops;    /*   496     8 */
>         struct mutex               addr_lists_lock;      /*   504    32 */
>         /* --- cacheline 8 boundary (512 bytes) was 24 bytes ago --- */
>         struct list_head           fdbs;                 /*   536    16 */
>         struct list_head           mdbs;                 /*   552    16 */
>         bool                       setup;                /*   568     1 */
> 
>         /* size: 576, cachelines: 9, members: 30 */
>         /* sum members: 544, holes: 6, sum holes: 25 */
>         /* padding: 7 */
> };
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 net-next 2/7] net: dsa: merge all bools of struct dsa_port into a single u8
  2022-01-05 13:21 ` [PATCH v2 net-next 2/7] net: dsa: merge all bools of struct dsa_port into a single u8 Vladimir Oltean
@ 2022-01-05 18:30   ` Florian Fainelli
  2022-01-05 18:39     ` Vladimir Oltean
  0 siblings, 1 reply; 26+ messages in thread
From: Florian Fainelli @ 2022-01-05 18:30 UTC (permalink / raw)
  To: Vladimir Oltean, netdev
  Cc: David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot

On 1/5/22 5:21 AM, Vladimir Oltean wrote:
> struct dsa_port has 5 bool members which create quite a number of 7 byte
> holes in the structure layout. By merging them all into bitfields of an
> u8, and placing that u8 in the 1-byte hole after dp->mac and dp->stp_state,
> we can reduce the structure size from 576 bytes to 552 bytes on arm64.
> 
> Before:
> 
> pahole -C dsa_port net/dsa/slave.o
> struct dsa_port {
>         union {
>                 struct net_device * master;              /*     0     8 */
>                 struct net_device * slave;               /*     0     8 */
>         };                                               /*     0     8 */
>         const struct dsa_device_ops  * tag_ops;          /*     8     8 */
>         struct dsa_switch_tree *   dst;                  /*    16     8 */
>         struct sk_buff *           (*rcv)(struct sk_buff *, struct net_device *); /*    24     8 */
>         enum {
>                 DSA_PORT_TYPE_UNUSED = 0,
>                 DSA_PORT_TYPE_CPU    = 1,
>                 DSA_PORT_TYPE_DSA    = 2,
>                 DSA_PORT_TYPE_USER   = 3,
>         } type;                                          /*    32     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         struct dsa_switch *        ds;                   /*    40     8 */
>         unsigned int               index;                /*    48     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         const char  *              name;                 /*    56     8 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         struct dsa_port *          cpu_dp;               /*    64     8 */
>         u8                         mac[6];               /*    72     6 */
>         u8                         stp_state;            /*    78     1 */
> 
>         /* XXX 1 byte hole, try to pack */
> 
>         struct device_node *       dn;                   /*    80     8 */
>         unsigned int               ageing_time;          /*    88     4 */
>         bool                       vlan_filtering;       /*    92     1 */
>         bool                       learning;             /*    93     1 */
> 
>         /* XXX 2 bytes hole, try to pack */
> 
>         struct dsa_bridge *        bridge;               /*    96     8 */
>         struct devlink_port        devlink_port;         /*   104   288 */
>         /* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
>         bool                       devlink_port_setup;   /*   392     1 */
> 
>         /* XXX 7 bytes hole, try to pack */
> 
>         struct phylink *           pl;                   /*   400     8 */
>         struct phylink_config      pl_config;            /*   408    40 */
>         /* --- cacheline 7 boundary (448 bytes) --- */
>         struct net_device *        lag_dev;              /*   448     8 */
>         bool                       lag_tx_enabled;       /*   456     1 */
> 
>         /* XXX 7 bytes hole, try to pack */
> 
>         struct net_device *        hsr_dev;              /*   464     8 */
>         struct list_head           list;                 /*   472    16 */
>         const struct ethtool_ops  * orig_ethtool_ops;    /*   488     8 */
>         const struct dsa_netdevice_ops  * netdev_ops;    /*   496     8 */
>         struct mutex               addr_lists_lock;      /*   504    32 */
>         /* --- cacheline 8 boundary (512 bytes) was 24 bytes ago --- */
>         struct list_head           fdbs;                 /*   536    16 */
>         struct list_head           mdbs;                 /*   552    16 */
>         bool                       setup;                /*   568     1 */
> 
>         /* size: 576, cachelines: 9, members: 30 */
>         /* sum members: 544, holes: 6, sum holes: 25 */
>         /* padding: 7 */
> };
> 
> After:
> 
> pahole -C dsa_port net/dsa/slave.o
> struct dsa_port {
>         union {
>                 struct net_device * master;              /*     0     8 */
>                 struct net_device * slave;               /*     0     8 */
>         };                                               /*     0     8 */
>         const struct dsa_device_ops  * tag_ops;          /*     8     8 */
>         struct dsa_switch_tree *   dst;                  /*    16     8 */
>         struct sk_buff *           (*rcv)(struct sk_buff *, struct net_device *); /*    24     8 */
>         enum {
>                 DSA_PORT_TYPE_UNUSED = 0,
>                 DSA_PORT_TYPE_CPU    = 1,
>                 DSA_PORT_TYPE_DSA    = 2,
>                 DSA_PORT_TYPE_USER   = 3,
>         } type;                                          /*    32     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         struct dsa_switch *        ds;                   /*    40     8 */
>         unsigned int               index;                /*    48     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         const char  *              name;                 /*    56     8 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         struct dsa_port *          cpu_dp;               /*    64     8 */
>         u8                         mac[6];               /*    72     6 */
>         u8                         stp_state;            /*    78     1 */
>         u8                         vlan_filtering:1;     /*    79: 0  1 */
>         u8                         learning:1;           /*    79: 1  1 */
>         u8                         lag_tx_enabled:1;     /*    79: 2  1 */
>         u8                         devlink_port_setup:1; /*    79: 3  1 */
>         u8                         setup:1;              /*    79: 4  1 */
> 
>         /* XXX 3 bits hole, try to pack */
> 
>         struct device_node *       dn;                   /*    80     8 */
>         unsigned int               ageing_time;          /*    88     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         struct dsa_bridge *        bridge;               /*    96     8 */
>         struct devlink_port        devlink_port;         /*   104   288 */
>         /* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
>         struct phylink *           pl;                   /*   392     8 */
>         struct phylink_config      pl_config;            /*   400    40 */
>         struct net_device *        lag_dev;              /*   440     8 */
>         /* --- cacheline 7 boundary (448 bytes) --- */
>         struct net_device *        hsr_dev;              /*   448     8 */
>         struct list_head           list;                 /*   456    16 */
>         const struct ethtool_ops  * orig_ethtool_ops;    /*   472     8 */
>         const struct dsa_netdevice_ops  * netdev_ops;    /*   480     8 */
>         struct mutex               addr_lists_lock;      /*   488    32 */
>         /* --- cacheline 8 boundary (512 bytes) was 8 bytes ago --- */
>         struct list_head           fdbs;                 /*   520    16 */
>         struct list_head           mdbs;                 /*   536    16 */
> 
>         /* size: 552, cachelines: 9, members: 30 */
>         /* sum members: 539, holes: 3, sum holes: 12 */
>         /* sum bitfield members: 5 bits, bit holes: 1, sum bit holes: 3 bits */
>         /* last cacheline: 40 bytes */
> };
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 net-next 3/7] net: dsa: move dsa_port :: type near dsa_port :: index
  2022-01-05 13:21 ` [PATCH v2 net-next 3/7] net: dsa: move dsa_port :: type near dsa_port :: index Vladimir Oltean
@ 2022-01-05 18:31   ` Florian Fainelli
  0 siblings, 0 replies; 26+ messages in thread
From: Florian Fainelli @ 2022-01-05 18:31 UTC (permalink / raw)
  To: Vladimir Oltean, netdev
  Cc: David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot

On 1/5/22 5:21 AM, Vladimir Oltean wrote:
> Both dsa_port :: type and dsa_port :: index introduce a 4 octet hole
> after them, so we can group them together and the holes would be
> eliminated, turning 16 octets of storage into just 8. This makes the
> cpu_dp pointer fit in the first cache line, which is good, because
> dsa_slave_to_master(), called by dsa_enqueue_skb(), uses it.
> 
> Before:
> 
> pahole -C dsa_port net/dsa/slave.o
> struct dsa_port {
>         union {
>                 struct net_device * master;              /*     0     8 */
>                 struct net_device * slave;               /*     0     8 */
>         };                                               /*     0     8 */
>         const struct dsa_device_ops  * tag_ops;          /*     8     8 */
>         struct dsa_switch_tree *   dst;                  /*    16     8 */
>         struct sk_buff *           (*rcv)(struct sk_buff *, struct net_device *); /*    24     8 */
>         enum {
>                 DSA_PORT_TYPE_UNUSED = 0,
>                 DSA_PORT_TYPE_CPU    = 1,
>                 DSA_PORT_TYPE_DSA    = 2,
>                 DSA_PORT_TYPE_USER   = 3,
>         } type;                                          /*    32     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         struct dsa_switch *        ds;                   /*    40     8 */
>         unsigned int               index;                /*    48     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         const char  *              name;                 /*    56     8 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         struct dsa_port *          cpu_dp;               /*    64     8 */
>         u8                         mac[6];               /*    72     6 */
>         u8                         stp_state;            /*    78     1 */
>         u8                         vlan_filtering:1;     /*    79: 0  1 */
>         u8                         learning:1;           /*    79: 1  1 */
>         u8                         lag_tx_enabled:1;     /*    79: 2  1 */
>         u8                         devlink_port_setup:1; /*    79: 3  1 */
>         u8                         setup:1;              /*    79: 4  1 */
> 
>         /* XXX 3 bits hole, try to pack */
> 
>         struct device_node *       dn;                   /*    80     8 */
>         unsigned int               ageing_time;          /*    88     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         struct dsa_bridge *        bridge;               /*    96     8 */
>         struct devlink_port        devlink_port;         /*   104   288 */
>         /* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
>         struct phylink *           pl;                   /*   392     8 */
>         struct phylink_config      pl_config;            /*   400    40 */
>         struct net_device *        lag_dev;              /*   440     8 */
>         /* --- cacheline 7 boundary (448 bytes) --- */
>         struct net_device *        hsr_dev;              /*   448     8 */
>         struct list_head           list;                 /*   456    16 */
>         const struct ethtool_ops  * orig_ethtool_ops;    /*   472     8 */
>         const struct dsa_netdevice_ops  * netdev_ops;    /*   480     8 */
>         struct mutex               addr_lists_lock;      /*   488    32 */
>         /* --- cacheline 8 boundary (512 bytes) was 8 bytes ago --- */
>         struct list_head           fdbs;                 /*   520    16 */
>         struct list_head           mdbs;                 /*   536    16 */
> 
>         /* size: 552, cachelines: 9, members: 30 */
>         /* sum members: 539, holes: 3, sum holes: 12 */
>         /* sum bitfield members: 5 bits, bit holes: 1, sum bit holes: 3 bits */
>         /* last cacheline: 40 bytes */
> };
> 
> After:
> 
> pahole -C dsa_port net/dsa/slave.o
> struct dsa_port {
>         union {
>                 struct net_device * master;              /*     0     8 */
>                 struct net_device * slave;               /*     0     8 */
>         };                                               /*     0     8 */
>         const struct dsa_device_ops  * tag_ops;          /*     8     8 */
>         struct dsa_switch_tree *   dst;                  /*    16     8 */
>         struct sk_buff *           (*rcv)(struct sk_buff *, struct net_device *); /*    24     8 */
>         struct dsa_switch *        ds;                   /*    32     8 */
>         unsigned int               index;                /*    40     4 */
>         enum {
>                 DSA_PORT_TYPE_UNUSED = 0,
>                 DSA_PORT_TYPE_CPU    = 1,
>                 DSA_PORT_TYPE_DSA    = 2,
>                 DSA_PORT_TYPE_USER   = 3,
>         } type;                                          /*    44     4 */
>         const char  *              name;                 /*    48     8 */
>         struct dsa_port *          cpu_dp;               /*    56     8 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         u8                         mac[6];               /*    64     6 */
>         u8                         stp_state;            /*    70     1 */
>         u8                         vlan_filtering:1;     /*    71: 0  1 */
>         u8                         learning:1;           /*    71: 1  1 */
>         u8                         lag_tx_enabled:1;     /*    71: 2  1 */
>         u8                         devlink_port_setup:1; /*    71: 3  1 */
>         u8                         setup:1;              /*    71: 4  1 */
> 
>         /* XXX 3 bits hole, try to pack */
> 
>         struct device_node *       dn;                   /*    72     8 */
>         unsigned int               ageing_time;          /*    80     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         struct dsa_bridge *        bridge;               /*    88     8 */
>         struct devlink_port        devlink_port;         /*    96   288 */
>         /* --- cacheline 6 boundary (384 bytes) --- */
>         struct phylink *           pl;                   /*   384     8 */
>         struct phylink_config      pl_config;            /*   392    40 */
>         struct net_device *        lag_dev;              /*   432     8 */
>         struct net_device *        hsr_dev;              /*   440     8 */
>         /* --- cacheline 7 boundary (448 bytes) --- */
>         struct list_head           list;                 /*   448    16 */
>         const struct ethtool_ops  * orig_ethtool_ops;    /*   464     8 */
>         const struct dsa_netdevice_ops  * netdev_ops;    /*   472     8 */
>         struct mutex               addr_lists_lock;      /*   480    32 */
>         /* --- cacheline 8 boundary (512 bytes) --- */
>         struct list_head           fdbs;                 /*   512    16 */
>         struct list_head           mdbs;                 /*   528    16 */
> 
>         /* size: 544, cachelines: 9, members: 30 */
>         /* sum members: 539, holes: 1, sum holes: 4 */
>         /* sum bitfield members: 5 bits, bit holes: 1, sum bit holes: 3 bits */
>         /* last cacheline: 32 bytes */
> };
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 net-next 4/7] net: dsa: merge all bools of struct dsa_switch into a single u32
  2022-01-05 13:21 ` [PATCH v2 net-next 4/7] net: dsa: merge all bools of struct dsa_switch into a single u32 Vladimir Oltean
@ 2022-01-05 18:32   ` Florian Fainelli
  0 siblings, 0 replies; 26+ messages in thread
From: Florian Fainelli @ 2022-01-05 18:32 UTC (permalink / raw)
  To: Vladimir Oltean, netdev
  Cc: David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot

On 1/5/22 5:21 AM, Vladimir Oltean wrote:
> struct dsa_switch has 9 boolean properties, many of which are in fact
> set by drivers for custom behavior (vlan_filtering_is_global,
> needs_standalone_vlan_filtering, etc etc). The binary layout of the
> structure could be improved. For example, the "bool setup" at the
> beginning introduces a gratuitous 7 byte hole in the first cache line.
> 
> The change merges all boolean properties into bitfields of an u32, and
> places that u32 in the first cache line of the structure, since many
> bools are accessed from the data path (untag_bridge_pvid, vlan_filtering,
> vlan_filtering_is_global).
> 
> We place this u32 after the existing ds->index, which is also 4 bytes in
> size. As a positive side effect, ds->tagger_data now fits into the first
> cache line too, because 4 bytes are saved.
> 
> Before:
> 
> pahole -C dsa_switch net/dsa/slave.o
> struct dsa_switch {
>         bool                       setup;                /*     0     1 */
> 
>         /* XXX 7 bytes hole, try to pack */
> 
>         struct device *            dev;                  /*     8     8 */
>         struct dsa_switch_tree *   dst;                  /*    16     8 */
>         unsigned int               index;                /*    24     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         struct notifier_block      nb;                   /*    32    24 */
> 
>         /* XXX last struct has 4 bytes of padding */
> 
>         void *                     priv;                 /*    56     8 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         void *                     tagger_data;          /*    64     8 */
>         struct dsa_chip_data *     cd;                   /*    72     8 */
>         const struct dsa_switch_ops  * ops;              /*    80     8 */
>         u32                        phys_mii_mask;        /*    88     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         struct mii_bus *           slave_mii_bus;        /*    96     8 */
>         unsigned int               ageing_time_min;      /*   104     4 */
>         unsigned int               ageing_time_max;      /*   108     4 */
>         struct dsa_8021q_context * tag_8021q_ctx;        /*   112     8 */
>         struct devlink *           devlink;              /*   120     8 */
>         /* --- cacheline 2 boundary (128 bytes) --- */
>         unsigned int               num_tx_queues;        /*   128     4 */
>         bool                       vlan_filtering_is_global; /*   132     1 */
>         bool                       needs_standalone_vlan_filtering; /*   133     1 */
>         bool                       configure_vlan_while_not_filtering; /*   134     1 */
>         bool                       untag_bridge_pvid;    /*   135     1 */
>         bool                       assisted_learning_on_cpu_port; /*   136     1 */
>         bool                       vlan_filtering;       /*   137     1 */
>         bool                       pcs_poll;             /*   138     1 */
>         bool                       mtu_enforcement_ingress; /*   139     1 */
>         unsigned int               num_lag_ids;          /*   140     4 */
>         unsigned int               max_num_bridges;      /*   144     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         size_t                     num_ports;            /*   152     8 */
> 
>         /* size: 160, cachelines: 3, members: 27 */
>         /* sum members: 141, holes: 4, sum holes: 19 */
>         /* paddings: 1, sum paddings: 4 */
>         /* last cacheline: 32 bytes */
> };
> 
> After:
> 
> pahole -C dsa_switch net/dsa/slave.o
> struct dsa_switch {
>         struct device *            dev;                  /*     0     8 */
>         struct dsa_switch_tree *   dst;                  /*     8     8 */
>         unsigned int               index;                /*    16     4 */
>         u32                        setup:1;              /*    20: 0  4 */
>         u32                        vlan_filtering_is_global:1; /*    20: 1  4 */
>         u32                        needs_standalone_vlan_filtering:1; /*    20: 2  4 */
>         u32                        configure_vlan_while_not_filtering:1; /*    20: 3  4 */
>         u32                        untag_bridge_pvid:1;  /*    20: 4  4 */
>         u32                        assisted_learning_on_cpu_port:1; /*    20: 5  4 */
>         u32                        vlan_filtering:1;     /*    20: 6  4 */
>         u32                        pcs_poll:1;           /*    20: 7  4 */
>         u32                        mtu_enforcement_ingress:1; /*    20: 8  4 */
> 
>         /* XXX 23 bits hole, try to pack */
> 
>         struct notifier_block      nb;                   /*    24    24 */
> 
>         /* XXX last struct has 4 bytes of padding */
> 
>         void *                     priv;                 /*    48     8 */
>         void *                     tagger_data;          /*    56     8 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         struct dsa_chip_data *     cd;                   /*    64     8 */
>         const struct dsa_switch_ops  * ops;              /*    72     8 */
>         u32                        phys_mii_mask;        /*    80     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         struct mii_bus *           slave_mii_bus;        /*    88     8 */
>         unsigned int               ageing_time_min;      /*    96     4 */
>         unsigned int               ageing_time_max;      /*   100     4 */
>         struct dsa_8021q_context * tag_8021q_ctx;        /*   104     8 */
>         struct devlink *           devlink;              /*   112     8 */
>         unsigned int               num_tx_queues;        /*   120     4 */
>         unsigned int               num_lag_ids;          /*   124     4 */
>         /* --- cacheline 2 boundary (128 bytes) --- */
>         unsigned int               max_num_bridges;      /*   128     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         size_t                     num_ports;            /*   136     8 */
> 
>         /* size: 144, cachelines: 3, members: 27 */
>         /* sum members: 132, holes: 2, sum holes: 8 */
>         /* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 23 bits */
>         /* paddings: 1, sum paddings: 4 */
>         /* last cacheline: 16 bytes */
> };
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 net-next 5/7] net: dsa: make dsa_switch :: num_ports an unsigned int
  2022-01-05 13:21 ` [PATCH v2 net-next 5/7] net: dsa: make dsa_switch :: num_ports an unsigned int Vladimir Oltean
@ 2022-01-05 18:33   ` Florian Fainelli
  0 siblings, 0 replies; 26+ messages in thread
From: Florian Fainelli @ 2022-01-05 18:33 UTC (permalink / raw)
  To: Vladimir Oltean, netdev
  Cc: David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot

On 1/5/22 5:21 AM, Vladimir Oltean wrote:
> Currently, num_ports is declared as size_t, which is defined as
> __kernel_ulong_t, therefore it occupies 8 bytes of memory.
> 
> Even switches with port numbers in the range of tens are exotic, so
> there is no need for this amount of storage.
> 
> Additionally, because the max_num_bridges member right above it is also
> 4 bytes, it means the compiler needs to add padding between the last 2
> fields. By reducing the size, we don't need that padding and can reduce
> the struct size.
> 
> Before:
> 
> pahole -C dsa_switch net/dsa/slave.o
> struct dsa_switch {
>         struct device *            dev;                  /*     0     8 */
>         struct dsa_switch_tree *   dst;                  /*     8     8 */
>         unsigned int               index;                /*    16     4 */
>         u32                        setup:1;              /*    20: 0  4 */
>         u32                        vlan_filtering_is_global:1; /*    20: 1  4 */
>         u32                        needs_standalone_vlan_filtering:1; /*    20: 2  4 */
>         u32                        configure_vlan_while_not_filtering:1; /*    20: 3  4 */
>         u32                        untag_bridge_pvid:1;  /*    20: 4  4 */
>         u32                        assisted_learning_on_cpu_port:1; /*    20: 5  4 */
>         u32                        vlan_filtering:1;     /*    20: 6  4 */
>         u32                        pcs_poll:1;           /*    20: 7  4 */
>         u32                        mtu_enforcement_ingress:1; /*    20: 8  4 */
> 
>         /* XXX 23 bits hole, try to pack */
> 
>         struct notifier_block      nb;                   /*    24    24 */
> 
>         /* XXX last struct has 4 bytes of padding */
> 
>         void *                     priv;                 /*    48     8 */
>         void *                     tagger_data;          /*    56     8 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         struct dsa_chip_data *     cd;                   /*    64     8 */
>         const struct dsa_switch_ops  * ops;              /*    72     8 */
>         u32                        phys_mii_mask;        /*    80     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         struct mii_bus *           slave_mii_bus;        /*    88     8 */
>         unsigned int               ageing_time_min;      /*    96     4 */
>         unsigned int               ageing_time_max;      /*   100     4 */
>         struct dsa_8021q_context * tag_8021q_ctx;        /*   104     8 */
>         struct devlink *           devlink;              /*   112     8 */
>         unsigned int               num_tx_queues;        /*   120     4 */
>         unsigned int               num_lag_ids;          /*   124     4 */
>         /* --- cacheline 2 boundary (128 bytes) --- */
>         unsigned int               max_num_bridges;      /*   128     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         size_t                     num_ports;            /*   136     8 */
> 
>         /* size: 144, cachelines: 3, members: 27 */
>         /* sum members: 132, holes: 2, sum holes: 8 */
>         /* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 23 bits */
>         /* paddings: 1, sum paddings: 4 */
>         /* last cacheline: 16 bytes */
> };
> 
> After:
> 
> pahole -C dsa_switch net/dsa/slave.o
> struct dsa_switch {
>         struct device *            dev;                  /*     0     8 */
>         struct dsa_switch_tree *   dst;                  /*     8     8 */
>         unsigned int               index;                /*    16     4 */
>         u32                        setup:1;              /*    20: 0  4 */
>         u32                        vlan_filtering_is_global:1; /*    20: 1  4 */
>         u32                        needs_standalone_vlan_filtering:1; /*    20: 2  4 */
>         u32                        configure_vlan_while_not_filtering:1; /*    20: 3  4 */
>         u32                        untag_bridge_pvid:1;  /*    20: 4  4 */
>         u32                        assisted_learning_on_cpu_port:1; /*    20: 5  4 */
>         u32                        vlan_filtering:1;     /*    20: 6  4 */
>         u32                        pcs_poll:1;           /*    20: 7  4 */
>         u32                        mtu_enforcement_ingress:1; /*    20: 8  4 */
> 
>         /* XXX 23 bits hole, try to pack */
> 
>         struct notifier_block      nb;                   /*    24    24 */
> 
>         /* XXX last struct has 4 bytes of padding */
> 
>         void *                     priv;                 /*    48     8 */
>         void *                     tagger_data;          /*    56     8 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         struct dsa_chip_data *     cd;                   /*    64     8 */
>         const struct dsa_switch_ops  * ops;              /*    72     8 */
>         u32                        phys_mii_mask;        /*    80     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         struct mii_bus *           slave_mii_bus;        /*    88     8 */
>         unsigned int               ageing_time_min;      /*    96     4 */
>         unsigned int               ageing_time_max;      /*   100     4 */
>         struct dsa_8021q_context * tag_8021q_ctx;        /*   104     8 */
>         struct devlink *           devlink;              /*   112     8 */
>         unsigned int               num_tx_queues;        /*   120     4 */
>         unsigned int               num_lag_ids;          /*   124     4 */
>         /* --- cacheline 2 boundary (128 bytes) --- */
>         unsigned int               max_num_bridges;      /*   128     4 */
>         unsigned int               num_ports;            /*   132     4 */
> 
>         /* size: 136, cachelines: 3, members: 27 */
>         /* sum members: 128, holes: 1, sum holes: 4 */
>         /* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 23 bits */
>         /* paddings: 1, sum paddings: 4 */
>         /* last cacheline: 8 bytes */
> };
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 net-next 6/7] net: dsa: move dsa_switch_tree :: ports and lags to first cache line
  2022-01-05 13:21 ` [PATCH v2 net-next 6/7] net: dsa: move dsa_switch_tree :: ports and lags to first cache line Vladimir Oltean
@ 2022-01-05 18:34   ` Florian Fainelli
  0 siblings, 0 replies; 26+ messages in thread
From: Florian Fainelli @ 2022-01-05 18:34 UTC (permalink / raw)
  To: Vladimir Oltean, netdev
  Cc: David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot

On 1/5/22 5:21 AM, Vladimir Oltean wrote:
> dst->ports is accessed most notably by dsa_master_find_slave(), which is
> invoked in the RX path.
> 
> dst->lags is accessed by dsa_lag_dev(), which is invoked in the RX path
> of tag_dsa.c.
> 
> dst->tag_ops, dst->default_proto and dst->pd don't need to be in the
> first cache line, so they are moved out by this change.
> 
> Before:
> 
> pahole -C dsa_switch_tree net/dsa/slave.o
> struct dsa_switch_tree {
>         struct list_head           list;                 /*     0    16 */
>         struct raw_notifier_head   nh;                   /*    16     8 */
>         unsigned int               index;                /*    24     4 */
>         struct kref                refcount;             /*    28     4 */
>         bool                       setup;                /*    32     1 */
> 
>         /* XXX 7 bytes hole, try to pack */
> 
>         const struct dsa_device_ops  * tag_ops;          /*    40     8 */
>         enum dsa_tag_protocol      default_proto;        /*    48     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         struct dsa_platform_data * pd;                   /*    56     8 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         struct list_head           ports;                /*    64    16 */
>         struct list_head           rtable;               /*    80    16 */
>         struct net_device * *      lags;                 /*    96     8 */
>         unsigned int               lags_len;             /*   104     4 */
>         unsigned int               last_switch;          /*   108     4 */
> 
>         /* size: 112, cachelines: 2, members: 13 */
>         /* sum members: 101, holes: 2, sum holes: 11 */
>         /* last cacheline: 48 bytes */
> };
> 
> After:
> 
> pahole -C dsa_switch_tree net/dsa/slave.o
> struct dsa_switch_tree {
>         struct list_head           list;                 /*     0    16 */
>         struct list_head           ports;                /*    16    16 */
>         struct raw_notifier_head   nh;                   /*    32     8 */
>         unsigned int               index;                /*    40     4 */
>         struct kref                refcount;             /*    44     4 */
>         struct net_device * *      lags;                 /*    48     8 */
>         bool                       setup;                /*    56     1 */
> 
>         /* XXX 7 bytes hole, try to pack */
> 
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         const struct dsa_device_ops  * tag_ops;          /*    64     8 */
>         enum dsa_tag_protocol      default_proto;        /*    72     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         struct dsa_platform_data * pd;                   /*    80     8 */
>         struct list_head           rtable;               /*    88    16 */
>         unsigned int               lags_len;             /*   104     4 */
>         unsigned int               last_switch;          /*   108     4 */
> 
>         /* size: 112, cachelines: 2, members: 13 */
>         /* sum members: 101, holes: 2, sum holes: 11 */
>         /* last cacheline: 48 bytes */
> };
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 net-next 7/7] net: dsa: combine two holes in struct dsa_switch_tree
  2022-01-05 13:21 ` [PATCH v2 net-next 7/7] net: dsa: combine two holes in struct dsa_switch_tree Vladimir Oltean
@ 2022-01-05 18:34   ` Florian Fainelli
  0 siblings, 0 replies; 26+ messages in thread
From: Florian Fainelli @ 2022-01-05 18:34 UTC (permalink / raw)
  To: Vladimir Oltean, netdev
  Cc: David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot

On 1/5/22 5:21 AM, Vladimir Oltean wrote:
> There is a 7 byte hole after dst->setup and a 4 byte hole after
> dst->default_proto. Combining them, we have a single hole of just 3
> bytes on 64 bit machines.
> 
> Before:
> 
> pahole -C dsa_switch_tree net/dsa/slave.o
> struct dsa_switch_tree {
>         struct list_head           list;                 /*     0    16 */
>         struct list_head           ports;                /*    16    16 */
>         struct raw_notifier_head   nh;                   /*    32     8 */
>         unsigned int               index;                /*    40     4 */
>         struct kref                refcount;             /*    44     4 */
>         struct net_device * *      lags;                 /*    48     8 */
>         bool                       setup;                /*    56     1 */
> 
>         /* XXX 7 bytes hole, try to pack */
> 
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         const struct dsa_device_ops  * tag_ops;          /*    64     8 */
>         enum dsa_tag_protocol      default_proto;        /*    72     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         struct dsa_platform_data * pd;                   /*    80     8 */
>         struct list_head           rtable;               /*    88    16 */
>         unsigned int               lags_len;             /*   104     4 */
>         unsigned int               last_switch;          /*   108     4 */
> 
>         /* size: 112, cachelines: 2, members: 13 */
>         /* sum members: 101, holes: 2, sum holes: 11 */
>         /* last cacheline: 48 bytes */
> };
> 
> After:
> 
> pahole -C dsa_switch_tree net/dsa/slave.o
> struct dsa_switch_tree {
>         struct list_head           list;                 /*     0    16 */
>         struct list_head           ports;                /*    16    16 */
>         struct raw_notifier_head   nh;                   /*    32     8 */
>         unsigned int               index;                /*    40     4 */
>         struct kref                refcount;             /*    44     4 */
>         struct net_device * *      lags;                 /*    48     8 */
>         const struct dsa_device_ops  * tag_ops;          /*    56     8 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         enum dsa_tag_protocol      default_proto;        /*    64     4 */
>         bool                       setup;                /*    68     1 */
> 
>         /* XXX 3 bytes hole, try to pack */
> 
>         struct dsa_platform_data * pd;                   /*    72     8 */
>         struct list_head           rtable;               /*    80    16 */
>         unsigned int               lags_len;             /*    96     4 */
>         unsigned int               last_switch;          /*   100     4 */
> 
>         /* size: 104, cachelines: 2, members: 13 */
>         /* sum members: 101, holes: 1, sum holes: 3 */
>         /* last cacheline: 40 bytes */
> };
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 net-next 0/7] Cleanup to main DSA structures
  2022-01-05 14:28 ` [PATCH v2 net-next 0/7] Cleanup to main DSA structures Vladimir Oltean
@ 2022-01-05 18:37   ` Florian Fainelli
  0 siblings, 0 replies; 26+ messages in thread
From: Florian Fainelli @ 2022-01-05 18:37 UTC (permalink / raw)
  To: Vladimir Oltean, netdev
  Cc: David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot

On 1/5/22 6:28 AM, Vladimir Oltean wrote:
> On Wed, Jan 05, 2022 at 03:21:34PM +0200, Vladimir Oltean wrote:
>> This series contains changes that do the following:
>>
>> - struct dsa_port reduced from 576 to 544 bytes, and first cache line a
>>   bit better organized
>> - struct dsa_switch from 160 to 136 bytes, and first cache line a bit
>>   better organized
>> - struct dsa_switch_tree from 112 to 104 bytes, and first cache line a
>>   bit better organized
>>
>> No changes compared to v1, just split into a separate patch set.
>>
>> Vladimir Oltean (7):
>>   net: dsa: move dsa_port :: stp_state near dsa_port :: mac
>>   net: dsa: merge all bools of struct dsa_port into a single u8
>>   net: dsa: move dsa_port :: type near dsa_port :: index
>>   net: dsa: merge all bools of struct dsa_switch into a single u32
>>   net: dsa: make dsa_switch :: num_ports an unsigned int
>>   net: dsa: move dsa_switch_tree :: ports and lags to first cache line
>>   net: dsa: combine two holes in struct dsa_switch_tree
>>
>>  include/net/dsa.h | 146 +++++++++++++++++++++++++---------------------
>>  net/dsa/dsa2.c    |   2 +-
>>  2 files changed, 81 insertions(+), 67 deletions(-)
>>
>> -- 
>> 2.25.1
>>
> 
> Let's keep this version for review only (RFC). For the final version I
> just figured that I can use this syntax:
> 
> 	u8			vlan_filtering:1;
> 
> 	/* Managed by DSA on user ports and by drivers on CPU and DSA ports */
> 	u8			learning:1;
> 
> 	u8			lag_tx_enabled:1;
> 
> 	u8			devlink_port_setup:1;
> 
> 	u8			setup:1;
> 
> instead of this syntax:
> 
> 	u8			vlan_filtering:1,
> 				/* Managed by DSA on user ports and by
> 				 * drivers on CPU and DSA ports
> 				 */
> 				learning:1,
> 				lag_tx_enabled:1,
> 				devlink_port_setup:1,
> 				setup:1;
> 
> which is what I'm going to prefer.

Yes this is indeed more readable. Thanks!
-- 
Florian

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 net-next 0/7] Cleanup to main DSA structures
  2022-01-05 13:21 [PATCH v2 net-next 0/7] Cleanup to main DSA structures Vladimir Oltean
                   ` (7 preceding siblings ...)
  2022-01-05 14:28 ` [PATCH v2 net-next 0/7] Cleanup to main DSA structures Vladimir Oltean
@ 2022-01-05 18:39 ` Florian Fainelli
  2022-01-05 18:59   ` Vladimir Oltean
  8 siblings, 1 reply; 26+ messages in thread
From: Florian Fainelli @ 2022-01-05 18:39 UTC (permalink / raw)
  To: Vladimir Oltean, netdev
  Cc: David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot

On 1/5/22 5:21 AM, Vladimir Oltean wrote:
> This series contains changes that do the following:
> 
> - struct dsa_port reduced from 576 to 544 bytes, and first cache line a
>   bit better organized
> - struct dsa_switch from 160 to 136 bytes, and first cache line a bit
>   better organized
> - struct dsa_switch_tree from 112 to 104 bytes, and first cache line a
>   bit better organized
> 
> No changes compared to v1, just split into a separate patch set.

This is all looking good to me. I suppose we could possibly swap the
'nh' and 'tag_ops' member since dst->tag_ops is used in
dsa_tag_generic_flow_dissect() which is a fast path, what do you think?
-- 
Florian

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 net-next 2/7] net: dsa: merge all bools of struct dsa_port into a single u8
  2022-01-05 18:30   ` Florian Fainelli
@ 2022-01-05 18:39     ` Vladimir Oltean
  2022-01-05 18:46       ` Florian Fainelli
  2022-01-05 22:10       ` Andrew Lunn
  0 siblings, 2 replies; 26+ messages in thread
From: Vladimir Oltean @ 2022-01-05 18:39 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: netdev, David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot

Hi Florian,

On Wed, Jan 05, 2022 at 10:30:54AM -0800, Florian Fainelli wrote:
> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>

Thanks a lot for the review.

I'm a bit on the fence on this patch and the other one for dsa_switch.
The thing is that bit fields are not atomic in C89, so if we update any
of the flags inside dp or ds concurrently (like dp->vlan_filtering),
we're in trouble. Right now this isn't a problem, because most of the
flags are set either during probe, or during ds->ops->setup, or are
serialized by the rtnl_mutex in ways that are there to stay (switchdev
notifiers). That's why I didn't say anything about it. But it may be a
caveat to watch out for in the future. Do you think we need to do
something about it? A lock would not be necessary, strictly speaking.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 net-next 2/7] net: dsa: merge all bools of struct dsa_port into a single u8
  2022-01-05 18:39     ` Vladimir Oltean
@ 2022-01-05 18:46       ` Florian Fainelli
  2022-01-05 18:56         ` Vladimir Oltean
  2022-01-05 22:10       ` Andrew Lunn
  1 sibling, 1 reply; 26+ messages in thread
From: Florian Fainelli @ 2022-01-05 18:46 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot

On 1/5/22 10:39 AM, Vladimir Oltean wrote:
> Hi Florian,
> 
> On Wed, Jan 05, 2022 at 10:30:54AM -0800, Florian Fainelli wrote:
>> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
> 
> Thanks a lot for the review.
> 
> I'm a bit on the fence on this patch and the other one for dsa_switch.
> The thing is that bit fields are not atomic in C89, so if we update any
> of the flags inside dp or ds concurrently (like dp->vlan_filtering),
> we're in trouble. Right now this isn't a problem, because most of the
> flags are set either during probe, or during ds->ops->setup, or are
> serialized by the rtnl_mutex in ways that are there to stay (switchdev
> notifiers). That's why I didn't say anything about it. But it may be a
> caveat to watch out for in the future. Do you think we need to do
> something about it? A lock would not be necessary, strictly speaking.

I would probably start with a comment that describes these pitfalls, I
wish we had a programmatic way to ensure that these flags would not be
set dynamically and outside of the probe/setup path but that won't
happen easily.

Should we be switching to a bitmask and bitmap helpers to be future proof?
-- 
Florian

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 net-next 2/7] net: dsa: merge all bools of struct dsa_port into a single u8
  2022-01-05 18:46       ` Florian Fainelli
@ 2022-01-05 18:56         ` Vladimir Oltean
  2022-01-05 19:42           ` Florian Fainelli
  0 siblings, 1 reply; 26+ messages in thread
From: Vladimir Oltean @ 2022-01-05 18:56 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: netdev, David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot

On Wed, Jan 05, 2022 at 10:46:44AM -0800, Florian Fainelli wrote:
> On 1/5/22 10:39 AM, Vladimir Oltean wrote:
> > Hi Florian,
> > 
> > On Wed, Jan 05, 2022 at 10:30:54AM -0800, Florian Fainelli wrote:
> >> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
> > 
> > Thanks a lot for the review.
> > 
> > I'm a bit on the fence on this patch and the other one for dsa_switch.
> > The thing is that bit fields are not atomic in C89, so if we update any
> > of the flags inside dp or ds concurrently (like dp->vlan_filtering),
> > we're in trouble. Right now this isn't a problem, because most of the
> > flags are set either during probe, or during ds->ops->setup, or are
> > serialized by the rtnl_mutex in ways that are there to stay (switchdev
> > notifiers). That's why I didn't say anything about it. But it may be a
> > caveat to watch out for in the future. Do you think we need to do
> > something about it? A lock would not be necessary, strictly speaking.
> 
> I would probably start with a comment that describes these pitfalls, I
> wish we had a programmatic way to ensure that these flags would not be
> set dynamically and outside of the probe/setup path but that won't
> happen easily.

A comment is probably good.

> Should we be switching to a bitmask and bitmap helpers to be future proof?

We could have done that, and it would certainly raise the awareness a
bit more, but the reason I went with the bit fields at least in the
first place was to reduce the churn in drivers. Otherwise, sure, if
driver changes are on the table for this, we can even discuss about
adding more arguments to dsa_register_switch(). The amount of poking
that drivers do inside struct dsa_switch has grown, and sometimes it's
not even immediately clear which members of that structure are supposed
to be populated by whom and when. We could definitely just tell them to
provide their desired behavior as arguments (vlan_filtering_is_global,
needs_standalone_vlan_filtering, configure_vlan_while_not_filtering,
untag_bridge_pvid, assisted_learning_on_cpu_port, ageing_time_min,
ageing_time_max, phys_mii_mask, num_tx_queues, num_lag_ids, max_num_bridges)
and only DSA will update struct dsa_switch, and we could then control
races better that way. But the downside is that we'd have 11 extra
arguments to dsa_register_switch()....

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 net-next 0/7] Cleanup to main DSA structures
  2022-01-05 18:39 ` Florian Fainelli
@ 2022-01-05 18:59   ` Vladimir Oltean
  2022-01-05 19:04     ` Florian Fainelli
  0 siblings, 1 reply; 26+ messages in thread
From: Vladimir Oltean @ 2022-01-05 18:59 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: netdev, David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot

On Wed, Jan 05, 2022 at 10:39:04AM -0800, Florian Fainelli wrote:
> On 1/5/22 5:21 AM, Vladimir Oltean wrote:
> > This series contains changes that do the following:
> >
> > - struct dsa_port reduced from 576 to 544 bytes, and first cache line a
> >   bit better organized
> > - struct dsa_switch from 160 to 136 bytes, and first cache line a bit
> >   better organized
> > - struct dsa_switch_tree from 112 to 104 bytes, and first cache line a
> >   bit better organized
> >
> > No changes compared to v1, just split into a separate patch set.
>
> This is all looking good to me. I suppose we could possibly swap the
> 'nh' and 'tag_ops' member since dst->tag_ops is used in
> dsa_tag_generic_flow_dissect() which is a fast path, what do you think?

pahole is telling me that dst->tag_ops is in the first cache line on
both arm64 and arm32. Are you saying that it's better for it to take
dst->nh's place?

[/opt/arm64-linux] $ pahole -C dsa_switch_tree net/dsa/slave.o
struct dsa_switch_tree {
        struct list_head           list;                 /*     0    16 */
        struct list_head           ports;                /*    16    16 */
        struct raw_notifier_head   nh;                   /*    32     8 */
        unsigned int               index;                /*    40     4 */
        struct kref                refcount;             /*    44     4 */
        struct net_device * *      lags;                 /*    48     8 */
        const struct dsa_device_ops  * tag_ops;          /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        enum dsa_tag_protocol      default_proto;        /*    64     4 */
        bool                       setup;                /*    68     1 */

        /* XXX 3 bytes hole, try to pack */

        struct dsa_platform_data * pd;                   /*    72     8 */
        struct list_head           rtable;               /*    80    16 */
        unsigned int               lags_len;             /*    96     4 */
        unsigned int               last_switch;          /*   100     4 */

        /* size: 104, cachelines: 2, members: 13 */
        /* sum members: 101, holes: 1, sum holes: 3 */
        /* last cacheline: 40 bytes */
};

[/opt/arm-linux] $ pahole -C dsa_switch_tree net/dsa/slave.o
struct dsa_switch_tree {
        struct list_head           list;                 /*     0     8 */
        struct list_head           ports;                /*     8     8 */
        struct raw_notifier_head   nh;                   /*    16     4 */
        unsigned int               index;                /*    20     4 */
        struct kref                refcount;             /*    24     4 */
        struct net_device * *      lags;                 /*    28     4 */
        const struct dsa_device_ops  * tag_ops;          /*    32     4 */
        enum dsa_tag_protocol      default_proto;        /*    36     4 */
        bool                       setup;                /*    40     1 */

        /* XXX 3 bytes hole, try to pack */

        struct dsa_platform_data * pd;                   /*    44     4 */
        struct list_head           rtable;               /*    48     8 */
        unsigned int               lags_len;             /*    56     4 */
        unsigned int               last_switch;          /*    60     4 */

        /* size: 64, cachelines: 1, members: 13 */
        /* sum members: 61, holes: 1, sum holes: 3 */
};

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 net-next 0/7] Cleanup to main DSA structures
  2022-01-05 18:59   ` Vladimir Oltean
@ 2022-01-05 19:04     ` Florian Fainelli
  2022-01-05 19:22       ` Vladimir Oltean
  0 siblings, 1 reply; 26+ messages in thread
From: Florian Fainelli @ 2022-01-05 19:04 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot

On 1/5/22 10:59 AM, Vladimir Oltean wrote:
> On Wed, Jan 05, 2022 at 10:39:04AM -0800, Florian Fainelli wrote:
>> On 1/5/22 5:21 AM, Vladimir Oltean wrote:
>>> This series contains changes that do the following:
>>>
>>> - struct dsa_port reduced from 576 to 544 bytes, and first cache line a
>>>   bit better organized
>>> - struct dsa_switch from 160 to 136 bytes, and first cache line a bit
>>>   better organized
>>> - struct dsa_switch_tree from 112 to 104 bytes, and first cache line a
>>>   bit better organized
>>>
>>> No changes compared to v1, just split into a separate patch set.
>>
>> This is all looking good to me. I suppose we could possibly swap the
>> 'nh' and 'tag_ops' member since dst->tag_ops is used in
>> dsa_tag_generic_flow_dissect() which is a fast path, what do you think?
> 
> pahole is telling me that dst->tag_ops is in the first cache line on
> both arm64 and arm32. Are you saying that it's better for it to take
> dst->nh's place?

Sorry it stuck in my head somehow based upon patch 7's pahole output
that tag_ops was still in the second cache line when the after clearly
shows that it moved to the first cache line. No need to add additional
changes then, thanks!
-- 
Florian

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 net-next 0/7] Cleanup to main DSA structures
  2022-01-05 19:04     ` Florian Fainelli
@ 2022-01-05 19:22       ` Vladimir Oltean
  0 siblings, 0 replies; 26+ messages in thread
From: Vladimir Oltean @ 2022-01-05 19:22 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: netdev, David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot

On Wed, Jan 05, 2022 at 11:04:24AM -0800, Florian Fainelli wrote:
> On 1/5/22 10:59 AM, Vladimir Oltean wrote:
> > On Wed, Jan 05, 2022 at 10:39:04AM -0800, Florian Fainelli wrote:
> >> On 1/5/22 5:21 AM, Vladimir Oltean wrote:
> >>> This series contains changes that do the following:
> >>>
> >>> - struct dsa_port reduced from 576 to 544 bytes, and first cache line a
> >>>   bit better organized
> >>> - struct dsa_switch from 160 to 136 bytes, and first cache line a bit
> >>>   better organized
> >>> - struct dsa_switch_tree from 112 to 104 bytes, and first cache line a
> >>>   bit better organized
> >>>
> >>> No changes compared to v1, just split into a separate patch set.
> >>
> >> This is all looking good to me. I suppose we could possibly swap the
> >> 'nh' and 'tag_ops' member since dst->tag_ops is used in
> >> dsa_tag_generic_flow_dissect() which is a fast path, what do you think?
> > 
> > pahole is telling me that dst->tag_ops is in the first cache line on
> > both arm64 and arm32. Are you saying that it's better for it to take
> > dst->nh's place?
> 
> Sorry it stuck in my head somehow based upon patch 7's pahole output
> that tag_ops was still in the second cache line when the after clearly
> shows that it moved to the first cache line. No need to add additional
> changes then, thanks!

Yes, that output is a bit verbose and small details are easy to miss
(adding --expand_types makes it even worse). Happened to me quite a few
times during the session...

I'll prepare a v3 once we figure out what would suffice in terms of
warning other developers about the lack of bit field atomicity.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 net-next 2/7] net: dsa: merge all bools of struct dsa_port into a single u8
  2022-01-05 18:56         ` Vladimir Oltean
@ 2022-01-05 19:42           ` Florian Fainelli
  0 siblings, 0 replies; 26+ messages in thread
From: Florian Fainelli @ 2022-01-05 19:42 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Jakub Kicinski, Andrew Lunn, Vivien Didelot

On 1/5/22 10:56 AM, Vladimir Oltean wrote:
> On Wed, Jan 05, 2022 at 10:46:44AM -0800, Florian Fainelli wrote:
>> On 1/5/22 10:39 AM, Vladimir Oltean wrote:
>>> Hi Florian,
>>>
>>> On Wed, Jan 05, 2022 at 10:30:54AM -0800, Florian Fainelli wrote:
>>>> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
>>>
>>> Thanks a lot for the review.
>>>
>>> I'm a bit on the fence on this patch and the other one for dsa_switch.
>>> The thing is that bit fields are not atomic in C89, so if we update any
>>> of the flags inside dp or ds concurrently (like dp->vlan_filtering),
>>> we're in trouble. Right now this isn't a problem, because most of the
>>> flags are set either during probe, or during ds->ops->setup, or are
>>> serialized by the rtnl_mutex in ways that are there to stay (switchdev
>>> notifiers). That's why I didn't say anything about it. But it may be a
>>> caveat to watch out for in the future. Do you think we need to do
>>> something about it? A lock would not be necessary, strictly speaking.
>>
>> I would probably start with a comment that describes these pitfalls, I
>> wish we had a programmatic way to ensure that these flags would not be
>> set dynamically and outside of the probe/setup path but that won't
>> happen easily.
> 
> A comment is probably good.
> 
>> Should we be switching to a bitmask and bitmap helpers to be future proof?
> 
> We could have done that, and it would certainly raise the awareness a
> bit more, but the reason I went with the bit fields at least in the
> first place was to reduce the churn in drivers. Otherwise, sure, if
> driver changes are on the table for this, we can even discuss about
> adding more arguments to dsa_register_switch(). The amount of poking
> that drivers do inside struct dsa_switch has grown, and sometimes it's
> not even immediately clear which members of that structure are supposed
> to be populated by whom and when. We could definitely just tell them to
> provide their desired behavior as arguments (vlan_filtering_is_global,
> needs_standalone_vlan_filtering, configure_vlan_while_not_filtering,
> untag_bridge_pvid, assisted_learning_on_cpu_port, ageing_time_min,
> ageing_time_max, phys_mii_mask, num_tx_queues, num_lag_ids, max_num_bridges)
> and only DSA will update struct dsa_switch, and we could then control
> races better that way. But the downside is that we'd have 11 extra
> arguments to dsa_register_switch()....

I would not mind if we introduced a dsa_switch::features or similar
bitmap and require driver to set bits in there before
dsa_register_switch() but that seems outside the scope of your patch
set, and I am not sure what the benefits would be unless we do happen to
do dynamic bit/feature toggling.

So for now, I believe a comment is enough, and if we spot a driver
changing that paradigm we consider a more "dynamic" solution.
-- 
Florian

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 net-next 2/7] net: dsa: merge all bools of struct dsa_port into a single u8
  2022-01-05 18:39     ` Vladimir Oltean
  2022-01-05 18:46       ` Florian Fainelli
@ 2022-01-05 22:10       ` Andrew Lunn
  1 sibling, 0 replies; 26+ messages in thread
From: Andrew Lunn @ 2022-01-05 22:10 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Florian Fainelli, netdev, David S. Miller, Jakub Kicinski,
	Vivien Didelot

On Wed, Jan 05, 2022 at 06:39:34PM +0000, Vladimir Oltean wrote:
> Hi Florian,
> 
> On Wed, Jan 05, 2022 at 10:30:54AM -0800, Florian Fainelli wrote:
> > Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
> 
> Thanks a lot for the review.
> 
> I'm a bit on the fence on this patch and the other one for dsa_switch.
> The thing is that bit fields are not atomic in C89, so if we update any
> of the flags inside dp or ds concurrently (like dp->vlan_filtering),
> we're in trouble. Right now this isn't a problem, because most of the
> flags are set either during probe, or during ds->ops->setup

I've no idea if it ever got merged, but:

https://lwn.net/Articles/724319/

	Andrew

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2022-01-05 22:11 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-05 13:21 [PATCH v2 net-next 0/7] Cleanup to main DSA structures Vladimir Oltean
2022-01-05 13:21 ` [PATCH v2 net-next 1/7] net: dsa: move dsa_port :: stp_state near dsa_port :: mac Vladimir Oltean
2022-01-05 18:30   ` Florian Fainelli
2022-01-05 13:21 ` [PATCH v2 net-next 2/7] net: dsa: merge all bools of struct dsa_port into a single u8 Vladimir Oltean
2022-01-05 18:30   ` Florian Fainelli
2022-01-05 18:39     ` Vladimir Oltean
2022-01-05 18:46       ` Florian Fainelli
2022-01-05 18:56         ` Vladimir Oltean
2022-01-05 19:42           ` Florian Fainelli
2022-01-05 22:10       ` Andrew Lunn
2022-01-05 13:21 ` [PATCH v2 net-next 3/7] net: dsa: move dsa_port :: type near dsa_port :: index Vladimir Oltean
2022-01-05 18:31   ` Florian Fainelli
2022-01-05 13:21 ` [PATCH v2 net-next 4/7] net: dsa: merge all bools of struct dsa_switch into a single u32 Vladimir Oltean
2022-01-05 18:32   ` Florian Fainelli
2022-01-05 13:21 ` [PATCH v2 net-next 5/7] net: dsa: make dsa_switch :: num_ports an unsigned int Vladimir Oltean
2022-01-05 18:33   ` Florian Fainelli
2022-01-05 13:21 ` [PATCH v2 net-next 6/7] net: dsa: move dsa_switch_tree :: ports and lags to first cache line Vladimir Oltean
2022-01-05 18:34   ` Florian Fainelli
2022-01-05 13:21 ` [PATCH v2 net-next 7/7] net: dsa: combine two holes in struct dsa_switch_tree Vladimir Oltean
2022-01-05 18:34   ` Florian Fainelli
2022-01-05 14:28 ` [PATCH v2 net-next 0/7] Cleanup to main DSA structures Vladimir Oltean
2022-01-05 18:37   ` Florian Fainelli
2022-01-05 18:39 ` Florian Fainelli
2022-01-05 18:59   ` Vladimir Oltean
2022-01-05 19:04     ` Florian Fainelli
2022-01-05 19:22       ` Vladimir Oltean

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.