linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver
@ 2019-04-13  1:27 Vladimir Oltean
  2019-04-13  1:27 ` [PATCH v3 net-next 01/24] lib: Add support for generic packing operations Vladimir Oltean
                   ` (23 more replies)
  0 siblings, 24 replies; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:27 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

This patchset adds a DSA driver for the SPI-managed NXP SJA1105 switch.
Due to the hardware's unfriendliness, most of its state needs to be
shadowed in kernel memory by the driver. To support this and keep a
decent amount of cleanliness in the code, a new generic API for
converting between CPU-accessible ("unpacked") structures and
hardware-accessible ("packed") structures is proposed and used.

Supporting a fully-featured (traffic-capable) driver for this switch
requires some rework in DSA and also leaves behind a more generic
infrastructure for other dumb switches that rely on 802.1Q pseudo-switch
tagging for port separation. Among the DSA changes required are:

* Permitting the .setup callback to invoke switchdev operations that
  will loop back into the driver through the switchdev notifier chain.
* Adding DSA awareness of switches where VLAN filtering is a global
  property and saving that setting in cpu_dp->vlan_filtering.
* Generic xmit and rcv functions for DSA interaction with 802.1Q tags on
  skb's. This is modeled as a tagging protocol of its own but must be
  customized by drivers to fit their own hardware possibilities.
* Intervention in the DSA receive hotpath, where a new filtering
  function called from eth_type_trans() is needed. This is because
  switches that employ 802.1Q tagging might actually have some limited
  means of source port decoding, such as only for management traffic.
  In order for the 802.1Q tagging protocol (which cannot be enabled
  under all conditions, unlike the management traffic decoding) to not
  be an all-or-nothing choice, the filtering function matches everything
  that can be decoded, and everything else is left to pass to the master
  netdevice.

The SJA1105 driver then proceeds to extend this 8021q switch tagging
protocol while adding its own (tag_sja1105). This is done because
SJA1105 needs SPI intervention during transmission of link-local
traffic, which cannot be done from the xmit handler but requires a
deferred worker thread.

The driver is GPL-2.0 licensed. The source code files which are licensed
as BSD-3-Clause are hardware support files and derivative of the
userspace NXP sja1105-tool program, which is BSD-3-Clause licensed.

TODO items:
* Add full support for the P/Q/R/S series. The patches were mostly
  tested on a first-generation T device.
* Add timestamping support and PTP clock manipulation.
* Figure out how the tc-taprio hardware offload that was just proposed
  by Vinicius can be used to configure the switch's time-aware scheduler.
* Rework link state callbacks to use phylink once the SGMII port
  is supported.

Changes in v3:
1. Removed the patch for a dedicated Ethertype to use with 802.1Q DSA
   tagging
2. Changed the SJA1105 switch tagging protocol sysfs label from
   "sja1105" to "8021q", to indicate to users such as tcpdump that the
   structure is more generic.
3. Respun previous patch "net: dsa: Allow drivers to modulate between
   presence and absence of tagging". Current equivalent patch is called
   "net: dsa: Allow drivers to filter packets they can decode source
   port from" and at least allows reception of management traffic during
   the time when switch tagging is not enabled.
4. Added DSA-level fixes for the bridge core not unsetting
   vlan_filtering when ports leave. The global VLAN filtering is treated
   as a special case. Made the mt7530 driver use this. This patch
   benefits the SJA1105 because otherwise traffic in standalone mode
   would no longer work after removing the ports from a vlan_filtering
   bridge, since the driver and the hardware would be in an inconsistent
   state.
5. Restructured the documentation as rst. This depends upon the recently
   submitted "[PATCH net-next] Documentation: net: dsa: transition to
   the rst format": https://patchwork.ozlabs.org/patch/1084658/.

v2 patchset can be found at:
https://www.spinics.net/lists/netdev/msg563454.html

Changes in v2:
1. Device ID is no longer auto-detected but enforced based on explicit DT
   compatible string. This helps with stricter checking of DT bindings.
2. Group all device-specific operations into a sja1105_info structure and
   avoid using the IS_ET() and IS_PQRS() macros at runtime as much as possible.
3. Added more verbiage to commit messages and documentation.
4. Handle the case where RGMII internal delays are requested through DT
   bindings by returning an error.
5. Miscellaneous cosmetic cleanup in sja1105_clocking.c
6. Not advertising link features that are not supported, such as pause frames
   and the half duplex modes.
7. Fixed a mistake in previous patchset where the switch tagging was not
   actually enabled (lost during a rebase). This brought up another uncaught
   issue where switching at runtime between tagging and no-tagging was not
   supported by DSA. Fixed up the mistake in "net: dsa: sja1105: Add support
   for traffic through standalone ports", and added the new patch "net: dsa:
   Allow drivers to modulate between presence and absence of tagging" to
   address the other issue.
8. Added a workaround for switch resets cutting a frame in the middle of
   transmission, which would throw off some link partners.
9. Changed the TPID from ETH_P_EDSA (0xDADA) to a newly introduced one:
   ETH_P_DSA_8021Q (0xDADB). Uncovered another mistake in the previous patchset
   with a missing ntohs(), which was not caught because 0xDADA is
   endian-agnostic.
10. Made NET_DSA_TAG_8021Q select VLAN_8021Q
11. Renamed __dsa_port_vlan_add to dsa_port_vid_add and not to
    dsa_port_vlan_add_trans, as suggested, because the corresponding _del function
    does not have a transactional phase and the naming is more uniform this way.

v1 patchset can be found at:
https://www.spinics.net/lists/netdev/msg561589.html

Changes from RFC:
1. Removed the packing code for the static configuration tables that were
   not currently used
2. Removed the code for unpacking a static configuration structure from
   a memory buffer (not used)
3. Completely removed the SGMII stubs, since the configuration is not
   complete anyway.
4. Moved some code from the SJA1105 introduction commit into the patch
   that used it.
5. Made the code for checking global VLAN filtering generic and made b53
   driver use it.
6. Made mt7530 driver use the new generic dp->vlan_filtering
7. Fixed check for stringset in .get_sset_count
8. Minor cleanup in sja1105_clocking.c
9. Fixed a confusing typo in DSA

RFC can be found at:
https://www.mail-archive.com/netdev@vger.kernel.org/msg291717.html

Vladimir Oltean (24):
  lib: Add support for generic packing operations
  net: dsa: Fix pharse -> phase typo
  net: dsa: Store vlan_filtering as a property of dsa_port
  net: dsa: mt7530: Use vlan_filtering property from dsa_port
  net: dsa: Add more convenient functions for installing port VLANs
  net: dsa: Call driver's setup callback after setting up its switchdev
    notifier
  net: dsa: Optional VLAN-based port separation for switches without
    tagging
  net: dsa: Be aware of switches where VLAN filtering is a global
    setting
  net: dsa: b53: Let DSA handle mismatched VLAN filtering settings
  net: dsa: Unset vlan_filtering when ports leave the bridge
  net: dsa: mt7530: Let DSA handle the unsetting of vlan_filtering
  net: dsa: Copy the vlan_filtering setting on the CPU port if it's
    global
  net: dsa: Allow drivers to filter packets they can decode source port
    from
  net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch
  net: dsa: sja1105: Add support for FDB and MDB management
  net: dsa: sja1105: Add support for VLAN operations
  net: dsa: sja1105: Add support for ethtool port counters
  net: dsa: sja1105: Add support for traffic through standalone ports
  net: dsa: sja1105: Add support for Spanning Tree Protocol
  net: dsa: sja1105: Error out if RGMII delays are requested in DT
  net: dsa: sja1105: Prevent PHY jabbering during switch reset
  net: dsa: sja1105: Reject unsupported link modes for AN
  Documentation: net: dsa: Add details about NXP SJA1105 driver
  dt-bindings: net: dsa: Add documentation for NXP SJA1105 driver

 .../devicetree/bindings/net/dsa/sja1105.txt   |  157 ++
 Documentation/networking/dsa/index.rst        |    1 +
 Documentation/networking/dsa/sja1105.rst      |  216 +++
 Documentation/packing.txt                     |  150 ++
 MAINTAINERS                                   |   14 +
 drivers/net/dsa/Kconfig                       |    2 +
 drivers/net/dsa/Makefile                      |    1 +
 drivers/net/dsa/b53/b53_common.c              |   25 +-
 drivers/net/dsa/mt7530.c                      |   18 +-
 drivers/net/dsa/mt7530.h                      |    1 -
 drivers/net/dsa/sja1105/Kconfig               |   17 +
 drivers/net/dsa/sja1105/Makefile              |   10 +
 drivers/net/dsa/sja1105/sja1105.h             |  164 ++
 drivers/net/dsa/sja1105/sja1105_clocking.c    |  612 ++++++
 .../net/dsa/sja1105/sja1105_dynamic_config.c  |  504 +++++
 .../net/dsa/sja1105/sja1105_dynamic_config.h  |   43 +
 drivers/net/dsa/sja1105/sja1105_ethtool.c     |  414 ++++
 drivers/net/dsa/sja1105/sja1105_main.c        | 1663 +++++++++++++++++
 drivers/net/dsa/sja1105/sja1105_spi.c         |  594 ++++++
 .../net/dsa/sja1105/sja1105_static_config.c   | 1042 +++++++++++
 .../net/dsa/sja1105/sja1105_static_config.h   |  298 +++
 include/linux/dsa/sja1105.h                   |   52 +
 include/linux/packing.h                       |   49 +
 include/net/dsa.h                             |   26 +
 lib/Makefile                                  |    2 +-
 lib/packing.c                                 |  211 +++
 net/dsa/Kconfig                               |   13 +
 net/dsa/Makefile                              |    2 +
 net/dsa/dsa.c                                 |    6 +
 net/dsa/dsa2.c                                |    9 +-
 net/dsa/dsa_priv.h                            |   15 +
 net/dsa/legacy.c                              |    1 +
 net/dsa/port.c                                |   87 +-
 net/dsa/slave.c                               |   24 +-
 net/dsa/switch.c                              |   31 +-
 net/dsa/tag_8021q.c                           |  203 ++
 net/dsa/tag_sja1105.c                         |  148 ++
 net/ethernet/eth.c                            |    6 +-
 38 files changed, 6769 insertions(+), 62 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/net/dsa/sja1105.txt
 create mode 100644 Documentation/networking/dsa/sja1105.rst
 create mode 100644 Documentation/packing.txt
 create mode 100644 drivers/net/dsa/sja1105/Kconfig
 create mode 100644 drivers/net/dsa/sja1105/Makefile
 create mode 100644 drivers/net/dsa/sja1105/sja1105.h
 create mode 100644 drivers/net/dsa/sja1105/sja1105_clocking.c
 create mode 100644 drivers/net/dsa/sja1105/sja1105_dynamic_config.c
 create mode 100644 drivers/net/dsa/sja1105/sja1105_dynamic_config.h
 create mode 100644 drivers/net/dsa/sja1105/sja1105_ethtool.c
 create mode 100644 drivers/net/dsa/sja1105/sja1105_main.c
 create mode 100644 drivers/net/dsa/sja1105/sja1105_spi.c
 create mode 100644 drivers/net/dsa/sja1105/sja1105_static_config.c
 create mode 100644 drivers/net/dsa/sja1105/sja1105_static_config.h
 create mode 100644 include/linux/dsa/sja1105.h
 create mode 100644 include/linux/packing.h
 create mode 100644 lib/packing.c
 create mode 100644 net/dsa/tag_8021q.c
 create mode 100644 net/dsa/tag_sja1105.c

-- 
2.17.1


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v3 net-next 01/24] lib: Add support for generic packing operations
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
@ 2019-04-13  1:27 ` Vladimir Oltean
  2019-04-13  1:28 ` [PATCH v3 net-next 02/24] net: dsa: Fix pharse -> phase typo Vladimir Oltean
                   ` (22 subsequent siblings)
  23 siblings, 0 replies; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:27 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

This provides a unified API for accessing register bit fields
regardless of memory layout. The basic unit of data for these API
functions is the u64. The process of transforming a u64 from native CPU
encoding into the peripheral's encoding is called 'pack', and
transforming it from peripheral to native CPU encoding is 'unpack'.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
Changes in v3:
None

Changes in v2:
None

 Documentation/packing.txt | 150 +++++++++++++++++++++++++++
 MAINTAINERS               |   8 ++
 include/linux/packing.h   |  49 +++++++++
 lib/Makefile              |   2 +-
 lib/packing.c             | 211 ++++++++++++++++++++++++++++++++++++++
 5 files changed, 419 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/packing.txt
 create mode 100644 include/linux/packing.h
 create mode 100644 lib/packing.c

diff --git a/Documentation/packing.txt b/Documentation/packing.txt
new file mode 100644
index 000000000000..32eba9d23611
--- /dev/null
+++ b/Documentation/packing.txt
@@ -0,0 +1,150 @@
+=============================================
+Generic field packing and unpacking functions
+=============================================
+
+Problem statement
+-----------------
+
+When working with hardware, one has to choose between several approaches of
+interfacing with it.
+One can memory-map a pointer to a carefully crafted struct over the hardware
+device's memory region, and access its fields as struct members (potentially
+declared as bit fields). But writing code this way would make it less portable,
+due to potential endianness mismatches between the CPU and the hardware device.
+Additionally, one has to pay close attention when translating register
+definitions from the hardware documentation into bit field indices for the
+structs. Also, some hardware (typically networking equipment) tends to group
+its register fields in ways that violate any reasonable word boundaries
+(sometimes even 64 bit ones). This creates the inconvenience of having to
+define "high" and "low" portions of register fields within the struct.
+A more robust alternative to struct field definitions would be to extract the
+required fields by shifting the appropriate number of bits. But this would
+still not protect from endianness mismatches, except if all memory accesses
+were performed byte-by-byte. Also the code can easily get cluttered, and the
+high-level idea might get lost among the many bit shifts required.
+Many drivers take the bit-shifting approach and then attempt to reduce the
+clutter with tailored macros, but more often than not these macros take
+shortcuts that still prevent the code from being truly portable.
+
+The solution
+------------
+
+This API deals with 2 basic operations:
+  - Packing a CPU-usable number into a memory buffer (with hardware
+    constraints/quirks)
+  - Unpacking a memory buffer (which has hardware constraints/quirks)
+    into a CPU-usable number.
+
+The API offers an abstraction over said hardware constraints and quirks,
+over CPU endianness, and therefore over the possible mismatches between
+the two.
+
+The basic unit of these API functions is the u64. From the CPU's
+perspective, bit 63 always means bit offset 7 of byte 7, albeit only
+logically. The question is: where do we lay this bit out in memory?
+
+The following examples cover the memory layout of a packed u64 field.
+The byte offsets in the packed buffer are always implicitly 0, 1, ... 7.
+What the examples show is where the logical bytes and bits sit.
+
+1. Normally (no quirks), we would do it like this:
+
+63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32
+7                       6                       5                        4
+31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
+3                       2                       1                        0
+
+That is, the MSByte (7) of the CPU-usable u64 sits at memory offset 0, and the
+LSByte (0) of the u64 sits at memory offset 7.
+This corresponds to what most folks would regard as "big endian", where
+bit i corresponds to the number 2^i. This is also referred to in the code
+comments as "logical" notation.
+
+
+2. If QUIRK_MSB_ON_THE_RIGHT is set, we do it like this:
+
+56 57 58 59 60 61 62 63 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47 32 33 34 35 36 37 38 39
+7                       6                        5                       4
+24 25 26 27 28 29 30 31 16 17 18 19 20 21 22 23  8  9 10 11 12 13 14 15  0  1  2  3  4  5  6  7
+3                       2                        1                       0
+
+That is, QUIRK_MSB_ON_THE_RIGHT does not affect byte positioning, but
+inverts bit offsets inside a byte.
+
+
+3. If QUIRK_LITTLE_ENDIAN is set, we do it like this:
+
+39 38 37 36 35 34 33 32 47 46 45 44 43 42 41 40 55 54 53 52 51 50 49 48 63 62 61 60 59 58 57 56
+4                       5                       6                       7
+7  6  5  4  3  2  1  0  15 14 13 12 11 10  9  8 23 22 21 20 19 18 17 16 31 30 29 28 27 26 25 24
+0                       1                       2                       3
+
+Therefore, QUIRK_LITTLE_ENDIAN means that inside the memory region, every
+byte from each 4-byte word is placed at its mirrored position compared to
+the boundary of that word.
+
+4. If QUIRK_MSB_ON_THE_RIGHT and QUIRK_LITTLE_ENDIAN are both set, we do it
+   like this:
+
+32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
+4                       5                       6                       7
+0  1  2  3  4  5  6  7  8   9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
+0                       1                       2                       3
+
+
+5. If just QUIRK_LSW32_IS_FIRST is set, we do it like this:
+
+31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
+3                       2                       1                        0
+63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32
+7                       6                       5                        4
+
+In this case the 8-byte memory region is interpreted as follows: the first
+4 bytes correspond to the least significant 4-byte word, and the next 4
+bytes to the more significant 4-byte word.
+
+
+6. If QUIRK_LSW32_IS_FIRST and QUIRK_MSB_ON_THE_RIGHT are set, we do it like
+   this:
+
+24 25 26 27 28 29 30 31 16 17 18 19 20 21 22 23  8  9 10 11 12 13 14 15  0  1  2  3  4  5  6  7
+3                       2                        1                       0
+56 57 58 59 60 61 62 63 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47 32 33 34 35 36 37 38 39
+7                       6                        5                       4
+
+
+7. If QUIRK_LSW32_IS_FIRST and QUIRK_LITTLE_ENDIAN are set, it looks like
+   this:
+
+7  6  5  4  3  2  1  0  15 14 13 12 11 10  9  8 23 22 21 20 19 18 17 16 31 30 29 28 27 26 25 24
+0                       1                       2                       3
+39 38 37 36 35 34 33 32 47 46 45 44 43 42 41 40 55 54 53 52 51 50 49 48 63 62 61 60 59 58 57 56
+4                       5                       6                       7
+
+
+8. If QUIRK_LSW32_IS_FIRST, QUIRK_LITTLE_ENDIAN and QUIRK_MSB_ON_THE_RIGHT
+   are set, it looks like this:
+
+0  1  2  3  4  5  6  7  8   9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
+0                       1                       2                       3
+32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
+4                       5                       6                       7
+
+
+We always think of our offsets as if there were no quirk, and we translate
+them afterwards, before accessing the memory region.
+
+Intended use
+------------
+
+Drivers that opt to use this API first need to identify which combination of
+the above 3 quirks (8 possibilities in total) matches what the hardware
+documentation describes. Then they should wrap the packing() function in a
+new xxx_packing() that calls it with the proper QUIRK_* one-hot bits set.
+
+The packing() function returns an int-encoded error code, which protects the
+programmer against incorrect API use.  The errors are not expected to occur
+during runtime, therefore it is reasonable for xxx_packing() to return void
+and simply swallow those errors. Optionally it can dump stack or print the
+error description.
+
diff --git a/MAINTAINERS b/MAINTAINERS
index 2b08063c8275..fd80c14973ea 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11671,6 +11671,14 @@ L:	linux-i2c@vger.kernel.org
 S:	Orphan
 F:	drivers/i2c/busses/i2c-pasemi.c
 
+PACKING
+M:	Vladimir Oltean <olteanv@gmail.com>
+L:	netdev@vger.kernel.org
+S:	Supported
+F:	lib/packing.c
+F:	include/linux/packing.h
+F:	Documentation/packing.txt
+
 PADATA PARALLEL EXECUTION MECHANISM
 M:	Steffen Klassert <steffen.klassert@secunet.com>
 L:	linux-crypto@vger.kernel.org
diff --git a/include/linux/packing.h b/include/linux/packing.h
new file mode 100644
index 000000000000..cc646e4f5df1
--- /dev/null
+++ b/include/linux/packing.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2016-2018, NXP Semiconductors
+ * Copyright (c) 2018-2019, Vladimir Oltean <olteanv@gmail.com>
+ */
+#ifndef _LINUX_PACKING_H
+#define _LINUX_PACKING_H
+
+#include <linux/types.h>
+#include <linux/bitops.h>
+
+#define QUIRK_MSB_ON_THE_RIGHT BIT(0)
+#define QUIRK_LITTLE_ENDIAN    BIT(1)
+#define QUIRK_LSW32_IS_FIRST   BIT(2)
+
+enum packing_op {
+	PACK,
+	UNPACK,
+};
+
+/**
+ * packing - Convert numbers (currently u64) between a packed and an unpacked
+ *	     format. Unpacked means laid out in memory in the CPU's native
+ *	     understanding of integers, while packed means anything else that
+ *	     requires translation.
+ *
+ * @pbuf: Pointer to a buffer holding the packed value.
+ * @uval: Pointer to a u64 holding the unpacked value.
+ * @startbit: The index (in logical notation, compensated for quirks) where
+ *	      the packed value starts within pbuf. Must be larger than, or
+ *	      equal to, endbit.
+ * @endbit: The index (in logical notation, compensated for quirks) where
+ *	    the packed value ends within pbuf. Must be smaller than, or equal
+ *	    to, startbit.
+ * @op: If PACK, then uval will be treated as const pointer and copied (packed)
+ *	into pbuf, between startbit and endbit.
+ *	If UNPACK, then pbuf will be treated as const pointer and the logical
+ *	value between startbit and endbit will be copied (unpacked) to uval.
+ * @quirks: A bit mask of QUIRK_LITTLE_ENDIAN, QUIRK_LSW32_IS_FIRST and
+ *	    QUIRK_MSB_ON_THE_RIGHT.
+ *
+ * Return: 0 on success, -EINVAL or -ERANGE if called incorrectly. Assuming
+ *	   correct usage, return code may be discarded.
+ *	   If op is PACK, pbuf is modified.
+ *	   If op is UNPACK, uval is modified.
+ */
+int packing(void *pbuf, u64 *uval, int startbit, int endbit, size_t pbuflen,
+	    enum packing_op op, u8 quirks);
+
+#endif
diff --git a/lib/Makefile b/lib/Makefile
index 3b08673e8881..ae4b1f52c1ec 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -37,7 +37,7 @@ obj-y += bcd.o div64.o sort.o parser.o debug_locks.o random32.o \
 	 bust_spinlocks.o kasprintf.o bitmap.o scatterlist.o \
 	 gcd.o lcm.o list_sort.o uuid.o iov_iter.o clz_ctz.o \
 	 bsearch.o find_bit.o llist.o memweight.o kfifo.o \
-	 percpu-refcount.o rhashtable.o reciprocal_div.o \
+	 packing.o percpu-refcount.o rhashtable.o reciprocal_div.o \
 	 once.o refcount.o usercopy.o errseq.o bucket_locks.o \
 	 generic-radix-tree.o
 obj-$(CONFIG_STRING_SELFTEST) += test_string.o
diff --git a/lib/packing.c b/lib/packing.c
new file mode 100644
index 000000000000..2d0bfd78bfe9
--- /dev/null
+++ b/lib/packing.c
@@ -0,0 +1,211 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/* Copyright (c) 2016-2018, NXP Semiconductors
+ * Copyright (c) 2018-2019, Vladimir Oltean <olteanv@gmail.com>
+ */
+#include <linux/packing.h>
+#include <linux/module.h>
+#include <linux/bitops.h>
+#include <linux/errno.h>
+#include <linux/types.h>
+
+static int get_le_offset(int offset)
+{
+	int closest_multiple_of_4;
+
+	closest_multiple_of_4 = (offset / 4) * 4;
+	offset -= closest_multiple_of_4;
+	return closest_multiple_of_4 + (3 - offset);
+}
+
+static int get_reverse_lsw32_offset(int offset, size_t len)
+{
+	int closest_multiple_of_4;
+	int word_index;
+
+	word_index = offset / 4;
+	closest_multiple_of_4 = word_index * 4;
+	offset -= closest_multiple_of_4;
+	word_index = (len / 4) - word_index - 1;
+	return word_index * 4 + offset;
+}
+
+static u64 bit_reverse(u64 val, unsigned int width)
+{
+	u64 new_val = 0;
+	unsigned int bit;
+	unsigned int i;
+
+	for (i = 0; i < width; i++) {
+		bit = (val >> i) & 1;
+		new_val |= ((u64)bit << (width - i - 1));
+	}
+	return new_val;
+}
+
+static void adjust_for_msb_right_quirk(u64 *to_write, int *box_start_bit,
+				       int *box_end_bit, u8 *box_mask)
+{
+	int box_bit_width = *box_start_bit - *box_end_bit + 1;
+	int new_box_start_bit, new_box_end_bit;
+
+	*to_write >>= *box_end_bit;
+	*to_write = bit_reverse(*to_write, box_bit_width);
+	*to_write <<= *box_end_bit;
+
+	new_box_end_bit   = box_bit_width - *box_start_bit - 1;
+	new_box_start_bit = box_bit_width - *box_end_bit - 1;
+	*box_mask = GENMASK_ULL(new_box_start_bit, new_box_end_bit);
+	*box_start_bit = new_box_start_bit;
+	*box_end_bit   = new_box_end_bit;
+}
+
+/**
+ * packing - Convert numbers (currently u64) between a packed and an unpacked
+ *	     format. Unpacked means laid out in memory in the CPU's native
+ *	     understanding of integers, while packed means anything else that
+ *	     requires translation.
+ *
+ * @pbuf: Pointer to a buffer holding the packed value.
+ * @uval: Pointer to a u64 holding the unpacked value.
+ * @startbit: The index (in logical notation, compensated for quirks) where
+ *	      the packed value starts within pbuf. Must be larger than, or
+ *	      equal to, endbit.
+ * @endbit: The index (in logical notation, compensated for quirks) where
+ *	    the packed value ends within pbuf. Must be smaller than, or equal
+ *	    to, startbit.
+ * @op: If PACK, then uval will be treated as const pointer and copied (packed)
+ *	into pbuf, between startbit and endbit.
+ *	If UNPACK, then pbuf will be treated as const pointer and the logical
+ *	value between startbit and endbit will be copied (unpacked) to uval.
+ * @quirks: A bit mask of QUIRK_LITTLE_ENDIAN, QUIRK_LSW32_IS_FIRST and
+ *	    QUIRK_MSB_ON_THE_RIGHT.
+ *
+ * Return: 0 on success, -EINVAL or -ERANGE if called incorrectly. Assuming
+ *	   correct usage, return code may be discarded.
+ *	   If op is PACK, pbuf is modified.
+ *	   If op is UNPACK, uval is modified.
+ */
+int packing(void *pbuf, u64 *uval, int startbit, int endbit, size_t pbuflen,
+	    enum packing_op op, u8 quirks)
+{
+	/* Number of bits for storing "uval";
+	 * also the width of the field to access in the pbuf.
+	 */
+	u64 value_width;
+	/* Logical byte indices corresponding to the
+	 * start and end of the field.
+	 */
+	int plogical_first_u8, plogical_last_u8, box;
+
+	/* startbit is expected to be larger than endbit */
+	if (startbit < endbit)
+		/* Invalid function call */
+		return -EINVAL;
+
+	value_width = startbit - endbit + 1;
+	if (value_width > 64)
+		return -ERANGE;
+
+	/* Check if "uval" fits in "value_width" bits.
+	 * If value_width is 64, the check will fail, but any
+	 * 64-bit uval will surely fit.
+	 */
+	if (op == PACK && value_width < 64 && (*uval >= (1ull << value_width)))
+		/* Cannot store "uval" inside "value_width" bits.
+		 * Truncating "uval" is most certainly not desirable,
+		 * so simply erroring out is appropriate.
+		 */
+		return -ERANGE;
+
+	/* Initialize parameter */
+	if (op == UNPACK)
+		*uval = 0;
+
+	/* Iterate through an idealized view of the pbuf as a u64 with
+	 * no quirks, u8 by u8 (aligned at u8 boundaries), from high to low
+	 * logical bit significance. "box" denotes the current logical u8.
+	 */
+	plogical_first_u8 = startbit / 8;
+	plogical_last_u8  = endbit / 8;
+
+	for (box = plogical_first_u8; box >= plogical_last_u8; box--) {
+		/* Bit indices into the currently accessed 8-bit box */
+		int box_start_bit, box_end_bit, box_addr;
+		u8  box_mask;
+		/* Corresponding bits from the unpacked u64 parameter */
+		int proj_start_bit, proj_end_bit;
+		u64 proj_mask;
+
+		/* This u8 may need to be accessed in its entirety
+		 * (from bit 7 to bit 0), or not, depending on the
+		 * input arguments startbit and endbit.
+		 */
+		if (box == plogical_first_u8)
+			box_start_bit = startbit % 8;
+		else
+			box_start_bit = 7;
+		if (box == plogical_last_u8)
+			box_end_bit = endbit % 8;
+		else
+			box_end_bit = 0;
+
+		/* We have determined the box bit start and end.
+		 * Now we calculate where this (masked) u8 box would fit
+		 * in the unpacked (CPU-readable) u64 - the u8 box's
+		 * projection onto the unpacked u64. Though the
+		 * box is u8, the projection is u64 because it may fall
+		 * anywhere within the unpacked u64.
+		 */
+		proj_start_bit = ((box * 8) + box_start_bit) - endbit;
+		proj_end_bit   = ((box * 8) + box_end_bit) - endbit;
+		proj_mask = GENMASK_ULL(proj_start_bit, proj_end_bit);
+		box_mask  = GENMASK_ULL(box_start_bit, box_end_bit);
+
+		/* Determine the offset of the u8 box inside the pbuf,
+		 * adjusted for quirks. The adjusted box_addr will be used for
+		 * effective addressing inside the pbuf (so it's not
+		 * logical any longer).
+		 */
+		box_addr = pbuflen - box - 1;
+		if (quirks & QUIRK_LITTLE_ENDIAN)
+			box_addr = get_le_offset(box_addr);
+		if (quirks & QUIRK_LSW32_IS_FIRST)
+			box_addr = get_reverse_lsw32_offset(box_addr,
+							    pbuflen);
+
+		if (op == UNPACK) {
+			u64 pval;
+
+			/* Read from pbuf, write to uval */
+			pval = ((u8 *)pbuf)[box_addr] & box_mask;
+			if (quirks & QUIRK_MSB_ON_THE_RIGHT)
+				adjust_for_msb_right_quirk(&pval,
+							   &box_start_bit,
+							   &box_end_bit,
+							   &box_mask);
+
+			pval >>= box_end_bit;
+			pval <<= proj_end_bit;
+			*uval &= ~proj_mask;
+			*uval |= pval;
+		} else {
+			u64 pval;
+
+			/* Write to pbuf, read from uval */
+			pval = (*uval) & proj_mask;
+			pval >>= proj_end_bit;
+			if (quirks & QUIRK_MSB_ON_THE_RIGHT)
+				adjust_for_msb_right_quirk(&pval,
+							   &box_start_bit,
+							   &box_end_bit,
+							   &box_mask);
+
+			pval <<= box_end_bit;
+			((u8 *)pbuf)[box_addr] &= ~box_mask;
+			((u8 *)pbuf)[box_addr] |= pval;
+		}
+	}
+	return 0;
+}
+EXPORT_SYMBOL(packing);
+
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v3 net-next 02/24] net: dsa: Fix pharse -> phase typo
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
  2019-04-13  1:27 ` [PATCH v3 net-next 01/24] lib: Add support for generic packing operations Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  2019-04-13  1:28 ` [PATCH v3 net-next 03/24] net: dsa: Store vlan_filtering as a property of dsa_port Vladimir Oltean
                   ` (21 subsequent siblings)
  23 siblings, 0 replies; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
---
Changes in v3:
None

Changes in v2:
None

 net/dsa/switch.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index e1fae969aa73..fde4e9195709 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -196,7 +196,7 @@ static int dsa_port_vlan_check(struct dsa_switch *ds, int port,
 	if (!dp->bridge_dev)
 		return err;
 
-	/* dsa_slave_vlan_rx_{add,kill}_vid() cannot use the prepare pharse and
+	/* dsa_slave_vlan_rx_{add,kill}_vid() cannot use the prepare phase and
 	 * already checks whether there is an overlapping bridge VLAN entry
 	 * with the same VID, so here we only need to check that if we are
 	 * adding a bridge VLAN entry there is not an overlapping VLAN device
-- 
2.17.1



* [PATCH v3 net-next 03/24] net: dsa: Store vlan_filtering as a property of dsa_port
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
  2019-04-13  1:27 ` [PATCH v3 net-next 01/24] lib: Add support for generic packing operations Vladimir Oltean
  2019-04-13  1:28 ` [PATCH v3 net-next 02/24] net: dsa: Fix pharse -> phase typo Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  2019-04-13  1:28 ` [PATCH v3 net-next 04/24] net: dsa: mt7530: Use vlan_filtering property from dsa_port Vladimir Oltean
                   ` (20 subsequent siblings)
  23 siblings, 0 replies; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

This allows drivers to query the VLAN setting imposed by the bridge
driver directly from DSA, instead of keeping their own state based on
the .port_vlan_filtering callback.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
---
Changes in v3:
None

Changes in v2:
None

 include/net/dsa.h |  1 +
 net/dsa/port.c    | 12 ++++++++----
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 0cfc2f828b87..e8f7a6302a38 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -143,6 +143,7 @@ struct dsa_port {
 	const char		*mac;
 	struct device_node	*dn;
 	unsigned int		ageing_time;
+	bool			vlan_filtering;
 	u8			stp_state;
 	struct net_device	*bridge_dev;
 	struct devlink_port	devlink_port;
diff --git a/net/dsa/port.c b/net/dsa/port.c
index ea848596afe3..0caf7f9bfb57 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -158,15 +158,19 @@ int dsa_port_vlan_filtering(struct dsa_port *dp, bool vlan_filtering,
 			    struct switchdev_trans *trans)
 {
 	struct dsa_switch *ds = dp->ds;
+	int err;
 
 	/* bridge skips -EOPNOTSUPP, so skip the prepare phase */
 	if (switchdev_trans_ph_prepare(trans))
 		return 0;
 
-	if (ds->ops->port_vlan_filtering)
-		return ds->ops->port_vlan_filtering(ds, dp->index,
-						    vlan_filtering);
-
+	if (ds->ops->port_vlan_filtering) {
+		err = ds->ops->port_vlan_filtering(ds, dp->index,
+						   vlan_filtering);
+		if (err)
+			return err;
+		dp->vlan_filtering = vlan_filtering;
+	}
 	return 0;
 }
 
-- 
2.17.1



* [PATCH v3 net-next 04/24] net: dsa: mt7530: Use vlan_filtering property from dsa_port
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
                   ` (2 preceding siblings ...)
  2019-04-13  1:28 ` [PATCH v3 net-next 03/24] net: dsa: Store vlan_filtering as a property of dsa_port Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  2019-04-13  1:28 ` [PATCH v3 net-next 05/24] net: dsa: Add more convenient functions for installing port VLANs Vladimir Oltean
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

This field was recently introduced, so keeping state inside the driver
is no longer necessary.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
---
Changes in v3:
None

Changes in v2:
None

 drivers/net/dsa/mt7530.c | 14 ++++----------
 drivers/net/dsa/mt7530.h |  1 -
 2 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index 7357b4fc0185..8bb0837792b1 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -828,11 +828,9 @@ mt7530_port_set_vlan_unaware(struct dsa_switch *ds, int port)
 	mt7530_rmw(priv, MT7530_PVC_P(port), VLAN_ATTR_MASK,
 		   VLAN_ATTR(MT7530_VLAN_TRANSPARENT));
 
-	priv->ports[port].vlan_filtering = false;
-
 	for (i = 0; i < MT7530_NUM_PORTS; i++) {
 		if (dsa_is_user_port(ds, i) &&
-		    priv->ports[i].vlan_filtering) {
+		    ds->ports[i].vlan_filtering) {
 			all_user_ports_removed = false;
 			break;
 		}
@@ -891,7 +889,7 @@ mt7530_port_bridge_leave(struct dsa_switch *ds, int port,
 		 * And the other port's port matrix cannot be broken when the
 		 * other port is still a VLAN-aware port.
 		 */
-		if (!priv->ports[i].vlan_filtering &&
+		if (!ds->ports[i].vlan_filtering &&
 		    dsa_is_user_port(ds, i) && i != port) {
 			if (dsa_to_port(ds, i)->bridge_dev != bridge)
 				continue;
@@ -1013,10 +1011,6 @@ static int
 mt7530_port_vlan_filtering(struct dsa_switch *ds, int port,
 			   bool vlan_filtering)
 {
-	struct mt7530_priv *priv = ds->priv;
-
-	priv->ports[port].vlan_filtering = vlan_filtering;
-
 	if (vlan_filtering) {
 		/* The port is being kept as VLAN-unaware port when bridge is
 		 * set up with vlan_filtering not being set, Otherwise, the
@@ -1139,7 +1133,7 @@ mt7530_port_vlan_add(struct dsa_switch *ds, int port,
 	/* The port is kept as VLAN-unaware if bridge with vlan_filtering not
 	 * being set.
 	 */
-	if (!priv->ports[port].vlan_filtering)
+	if (!ds->ports[port].vlan_filtering)
 		return;
 
 	mutex_lock(&priv->reg_mutex);
@@ -1170,7 +1164,7 @@ mt7530_port_vlan_del(struct dsa_switch *ds, int port,
 	/* The port is kept as VLAN-unaware if bridge with vlan_filtering not
 	 * being set.
 	 */
-	if (!priv->ports[port].vlan_filtering)
+	if (!ds->ports[port].vlan_filtering)
 		return 0;
 
 	mutex_lock(&priv->reg_mutex);
diff --git a/drivers/net/dsa/mt7530.h b/drivers/net/dsa/mt7530.h
index a95ed958df5b..1eec7bdc283a 100644
--- a/drivers/net/dsa/mt7530.h
+++ b/drivers/net/dsa/mt7530.h
@@ -410,7 +410,6 @@ struct mt7530_port {
 	bool enable;
 	u32 pm;
 	u16 pvid;
-	bool vlan_filtering;
 };
 
 /* struct mt7530_priv -	This is the main data structure for holding the state
-- 
2.17.1



* [PATCH v3 net-next 05/24] net: dsa: Add more convenient functions for installing port VLANs
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
                   ` (3 preceding siblings ...)
  2019-04-13  1:28 ` [PATCH v3 net-next 04/24] net: dsa: mt7530: Use vlan_filtering property from dsa_port Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  2019-04-16 23:49   ` Florian Fainelli
  2019-04-13  1:28 ` [PATCH v3 net-next 06/24] net: dsa: Call driver's setup callback after setting up its switchdev notifier Vladimir Oltean
                   ` (18 subsequent siblings)
  23 siblings, 1 reply; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

This hides the need to perform a two-phase transaction and construct a
switchdev_obj_port_vlan struct.

Call graph (including a function that will be introduced in a follow-up
patch) looks like this now (same for the *_vlan_del function):

dsa_slave_vlan_rx_add_vid   dsa_port_setup_8021q_tagging
            |                        |
            |                        |
            |          +-------------+
            |          |
            v          v
           dsa_port_vid_add      dsa_slave_port_obj_add
                  |                         |
                  +-------+         +-------+
                          |         |
                          v         v
                       dsa_port_vlan_add

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
Changes in v3:
Reworked dsa_slave_vlan_rx_kill_vid so that symmetry of the calling
graph is kept with the vlan_add functions.

Changes in v2:
Renamed __dsa_port_vlan_add to dsa_port_vid_add and not to
dsa_port_vlan_add_trans, as suggested, because the corresponding _del function
does not have a transactional phase and the naming is more uniform this way.

 net/dsa/dsa_priv.h |  2 ++
 net/dsa/port.c     | 31 +++++++++++++++++++++++++++++++
 net/dsa/slave.c    | 24 +++---------------------
 3 files changed, 36 insertions(+), 21 deletions(-)

diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 093b7d145eb1..4246523e3133 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -169,6 +169,8 @@ int dsa_port_vlan_add(struct dsa_port *dp,
 		      struct switchdev_trans *trans);
 int dsa_port_vlan_del(struct dsa_port *dp,
 		      const struct switchdev_obj_port_vlan *vlan);
+int dsa_port_vid_add(struct dsa_port *dp, u16 vid, u16 flags);
+int dsa_port_vid_del(struct dsa_port *dp, u16 vid);
 int dsa_port_link_register_of(struct dsa_port *dp);
 void dsa_port_link_unregister_of(struct dsa_port *dp);
 
diff --git a/net/dsa/port.c b/net/dsa/port.c
index 0caf7f9bfb57..029169c2dd3b 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -326,6 +326,37 @@ int dsa_port_vlan_del(struct dsa_port *dp,
 	return 0;
 }
 
+int dsa_port_vid_add(struct dsa_port *dp, u16 vid, u16 flags)
+{
+	struct switchdev_obj_port_vlan vlan = {
+		.obj.id = SWITCHDEV_OBJ_ID_PORT_VLAN,
+		.flags = flags,
+		.vid_begin = vid,
+		.vid_end = vid,
+	};
+	struct switchdev_trans trans;
+	int err;
+
+	trans.ph_prepare = true;
+	err = dsa_port_vlan_add(dp, &vlan, &trans);
+	if (err == -EOPNOTSUPP)
+		return 0;
+
+	trans.ph_prepare = false;
+	return dsa_port_vlan_add(dp, &vlan, &trans);
+}
+
+int dsa_port_vid_del(struct dsa_port *dp, u16 vid)
+{
+	struct switchdev_obj_port_vlan vlan = {
+		.obj.id = SWITCHDEV_OBJ_ID_PORT_VLAN,
+		.vid_begin = vid,
+		.vid_end = vid,
+	};
+
+	return dsa_port_vlan_del(dp, &vlan);
+}
+
 static struct phy_device *dsa_port_get_phy_device(struct dsa_port *dp)
 {
 	struct device_node *phy_dn;
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index ce26dddc8270..8ad9bf957da1 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -1001,13 +1001,6 @@ static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto,
 				     u16 vid)
 {
 	struct dsa_port *dp = dsa_slave_to_port(dev);
-	struct switchdev_obj_port_vlan vlan = {
-		.vid_begin = vid,
-		.vid_end = vid,
-		/* This API only allows programming tagged, non-PVID VIDs */
-		.flags = 0,
-	};
-	struct switchdev_trans trans;
 	struct bridge_vlan_info info;
 	int ret;
 
@@ -1024,25 +1017,14 @@ static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto,
 			return -EBUSY;
 	}
 
-	trans.ph_prepare = true;
-	ret = dsa_port_vlan_add(dp, &vlan, &trans);
-	if (ret == -EOPNOTSUPP)
-		return 0;
-
-	trans.ph_prepare = false;
-	return dsa_port_vlan_add(dp, &vlan, &trans);
+	/* This API only allows programming tagged, non-PVID VIDs */
+	return dsa_port_vid_add(dp, vid, 0);
 }
 
 static int dsa_slave_vlan_rx_kill_vid(struct net_device *dev, __be16 proto,
 				      u16 vid)
 {
 	struct dsa_port *dp = dsa_slave_to_port(dev);
-	struct switchdev_obj_port_vlan vlan = {
-		.vid_begin = vid,
-		.vid_end = vid,
-		/* This API only allows programming tagged, non-PVID VIDs */
-		.flags = 0,
-	};
 	struct bridge_vlan_info info;
 	int ret;
 
@@ -1059,7 +1041,7 @@ static int dsa_slave_vlan_rx_kill_vid(struct net_device *dev, __be16 proto,
 			return -EBUSY;
 	}
 
-	ret = dsa_port_vlan_del(dp, &vlan);
+	ret = dsa_port_vid_del(dp, vid);
 	if (ret == -EOPNOTSUPP)
 		ret = 0;
 
-- 
2.17.1



* [PATCH v3 net-next 06/24] net: dsa: Call driver's setup callback after setting up its switchdev notifier
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
                   ` (4 preceding siblings ...)
  2019-04-13  1:28 ` [PATCH v3 net-next 05/24] net: dsa: Add more convenient functions for installing port VLANs Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  2019-04-13 15:05   ` Andrew Lunn
  2019-04-13  1:28 ` [PATCH v3 net-next 07/24] net: dsa: Optional VLAN-based port separation for switches without tagging Vladimir Oltean
                   ` (17 subsequent siblings)
  23 siblings, 1 reply; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

This allows the driver to perform some manipulations of its own during
setup, using generic switchdev calls. Having the notifiers registered at
setup time is important because otherwise any switchdev transaction
emitted during this time would be ignored (dispatched to an empty call
chain).

One current usage scenario is for the driver to request DSA to set up
802.1Q based switch tagging for its ports.

There is no danger for the driver setup code to start racing now with
switchdev events emitted from the network stack (such as bridge core)
even if the notifier is registered earlier. This is because the network
stack needs a net_device as a vehicle to perform switchdev operations,
and the slave net_devices are registered later than the core driver
setup anyway (ds->ops->setup in dsa_switch_setup vs dsa_port_setup).

Luckily DSA doesn't need a net_device to carry out switchdev callbacks,
and therefore drivers shouldn't assume either that net_devices are
available at the time their switchdev callbacks get invoked.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
---
Changes in v3:
None.

Changes in v2:
More verbiage in commit message.

 net/dsa/dsa2.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index d122f1bcdab2..17817c1a7fbd 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -369,14 +369,14 @@ static int dsa_switch_setup(struct dsa_switch *ds)
 	if (err)
 		return err;
 
-	err = ds->ops->setup(ds);
-	if (err < 0)
-		return err;
-
 	err = dsa_switch_register_notifier(ds);
 	if (err)
 		return err;
 
+	err = ds->ops->setup(ds);
+	if (err < 0)
+		return err;
+
 	if (!ds->slave_mii_bus && ds->ops->phy_read) {
 		ds->slave_mii_bus = devm_mdiobus_alloc(ds->dev);
 		if (!ds->slave_mii_bus)
-- 
2.17.1



* [PATCH v3 net-next 07/24] net: dsa: Optional VLAN-based port separation for switches without tagging
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
                   ` (5 preceding siblings ...)
  2019-04-13  1:28 ` [PATCH v3 net-next 06/24] net: dsa: Call driver's setup callback after setting up its switchdev notifier Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  2019-04-13  1:28 ` [PATCH v3 net-next 08/24] net: dsa: Be aware of switches where VLAN filtering is a global setting Vladimir Oltean
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

This patch provides generic DSA code for using VLAN (802.1Q) tags for
the same purpose as a dedicated switch tag for injection/extraction.
It is based on the discussions and interest that have so far been
expressed in https://www.spinics.net/lists/netdev/msg556125.html.

Unlike all other DSA-supported tagging protocols, CONFIG_NET_DSA_TAG_8021Q
does not offer a complete solution for drivers (nor can it). Instead, it
provides generic code that drivers can opt into calling:
- dsa_8021q_xmit: Inserts a VLAN header with the specified contents.
  Can be called from another tagging protocol's xmit function.
  Currently the LAN9303 driver is inserting headers that are simply
  802.1Q with custom fields, so this is an opportunity for code reuse.
- dsa_8021q_rcv: Retrieves the TPID and TCI from a VLAN-tagged skb.
  Removing the VLAN header is left as a decision for the caller to make.
- dsa_port_setup_8021q_tagging: For each user port, installs an Rx VID
  and a Tx VID, for proper untagged traffic identification on ingress
  and steering on egress. Also sets up the VLAN trunk on the upstream
  (CPU or DSA) port. Drivers are intentionally left to call this
  function explicitly, depending on the context and hardware support.
  The expected switch behavior and VLAN semantics should not be violated
  under any conditions. That is, after calling
  dsa_port_setup_8021q_tagging, the hardware should still pass all
  ingress traffic, be it tagged or untagged.

This only works when switch ports are standalone, or when they are added
to a VLAN-unaware bridge. It will probably remain this way for the
reasons below.

When added to a bridge that has vlan_filtering 1, the bridge core will
install its own VLANs and reset the pvids through switchdev. For the
bridge core, switchdev is a write-only pipe. All VLAN-related state is
kept in the bridge core and nothing is read from DSA/switchdev or from
the driver. So the bridge core will break this port separation because
it will install the vlan_default_pvid into all switchdev ports.

Even if we could teach the bridge driver about a switchdev preference
for a certain vlan_default_pvid (a task difficult in itself, since the
current setting is per-bridge but we would need it per-port), many
other challenges would still exist.

Firstly, in the DSA rcv callback, a driver would have to perform an
iterative reverse lookup to find the correct switch port. That is
because the port is a bridge slave, so its Rx VID (port PVID) is subject
to user configuration. How would we ensure that the user doesn't reset
the pvid to a different value (which would make an O(1) translation
impossible), or to a non-unique value within this DSA switch tree (which
would make any translation impossible)?

Finally, not all switch ports are equal in DSA, and that makes it
difficult for the bridge to be completely aware of this anyway.
The CPU port needs to transmit tagged packets (VLAN trunk) in order for
the DSA rcv code to be able to decode source information.
But the bridge code has no idea which switch port is the CPU port, if
for no other reason than that DSA registers no net device for the CPU
port.
Also DSA does not currently allow the user to specify that they want the
CPU port to do VLAN trunking anyway. VLANs are added to the CPU port
using the same flags as they were added on the user port.

So the VLANs installed by dsa_port_setup_8021q_tagging per driver
request should remain private from the bridge's and user's perspective,
and should not alter the VLAN semantics observed by the user.

In the current implementation, a VLAN range just below VLAN_N_VID (4096)
is reserved for this purpose. Each port receives a unique Rx VLAN and a
unique Tx VLAN. Separate VLANs are needed for Rx and Tx because they
serve different purposes: on Rx the switch must process traffic as
untagged and process it with a port-based VLAN, but with care not to
hinder bridging. On the other hand, the Tx VLAN is where the
reachability restrictions are imposed, since by tagging frames in the
xmit callback we are telling the switch onto which port to steer the
frame.

Some general guidance on how this support might be employed for
real-life hardware (some comments made by Florian Fainelli):

- If the hardware supports VLAN tag stacking, it should somehow back
  up its private VLAN settings when the bridge tries to override them.
  Then the driver could re-apply them as outer tags. Dedicating an outer
  tag per bridge device would allow identical inner tag VID numbers to
  co-exist, yet preserve broadcast domain isolation.

- If the switch cannot handle VLAN tag stacking, it should disable this
  port separation when added as slave to a vlan_filtering bridge, in
  that case having reduced functionality.

- Drivers for old switches that don't support the entire VLAN_N_VID
  range will need to rework the current range selection mechanism.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
---
Changes in v3:
None.

Changes in v2:
Made NET_DSA_TAG_8021Q select VLAN_8021Q
More verbiage in commit message
More comments
Adapted to the function name change from 05/22.

 include/net/dsa.h   |   4 +
 net/dsa/Kconfig     |  10 +++
 net/dsa/Makefile    |   1 +
 net/dsa/dsa_priv.h  |  10 +++
 net/dsa/tag_8021q.c | 203 ++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 228 insertions(+)
 create mode 100644 net/dsa/tag_8021q.c

diff --git a/include/net/dsa.h b/include/net/dsa.h
index e8f7a6302a38..809046f6a718 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -575,5 +575,9 @@ int dsa_port_get_phy_strings(struct dsa_port *dp, uint8_t *data);
 int dsa_port_get_ethtool_phy_stats(struct dsa_port *dp, uint64_t *data);
 int dsa_port_get_phy_sset_count(struct dsa_port *dp);
 void dsa_port_phylink_mac_change(struct dsa_switch *ds, int port, bool up);
+#ifdef CONFIG_NET_DSA_TAG_8021Q
+int dsa_port_setup_8021q_tagging(struct dsa_switch *ds, int index,
+				 bool enabled);
+#endif
 
 #endif
diff --git a/net/dsa/Kconfig b/net/dsa/Kconfig
index b695170795c2..b2fc07de8bcb 100644
--- a/net/dsa/Kconfig
+++ b/net/dsa/Kconfig
@@ -27,6 +27,16 @@ config NET_DSA_LEGACY
 	  This feature is scheduled for removal in 4.17.
 
 # tagging formats
+config NET_DSA_TAG_8021Q
+	select VLAN_8021Q
+	bool
+	help
+	  Unlike the other tagging protocols, the 802.1Q config option simply
+	  provides helpers for other tagging implementations that might rely on
+	  VLAN in one way or another. It is not a complete solution.
+
+	  Drivers which use these helpers should select this as dependency.
+
 config NET_DSA_TAG_BRCM
 	bool
 
diff --git a/net/dsa/Makefile b/net/dsa/Makefile
index 6e721f7a2947..d7fc3253d497 100644
--- a/net/dsa/Makefile
+++ b/net/dsa/Makefile
@@ -5,6 +5,7 @@ dsa_core-y += dsa.o dsa2.o master.o port.o slave.o switch.o
 dsa_core-$(CONFIG_NET_DSA_LEGACY) += legacy.o
 
 # tagging formats
+dsa_core-$(CONFIG_NET_DSA_TAG_8021Q) += tag_8021q.o
 dsa_core-$(CONFIG_NET_DSA_TAG_BRCM) += tag_brcm.o
 dsa_core-$(CONFIG_NET_DSA_TAG_BRCM_PREPEND) += tag_brcm.o
 dsa_core-$(CONFIG_NET_DSA_TAG_DSA) += tag_dsa.o
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 4246523e3133..cc5ec3759952 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -203,6 +203,16 @@ dsa_slave_to_master(const struct net_device *dev)
 int dsa_switch_register_notifier(struct dsa_switch *ds);
 void dsa_switch_unregister_notifier(struct dsa_switch *ds);
 
+/* tag_8021q.c */
+struct sk_buff *dsa_8021q_xmit(struct sk_buff *skb, struct net_device *netdev,
+			       u16 tpid, u16 tci);
+struct sk_buff *dsa_8021q_rcv(struct sk_buff *skb, struct net_device *netdev,
+			      struct packet_type *pt, u16 *tpid, u16 *tci);
+u16 dsa_tagging_tx_vid(struct dsa_switch *ds, int port);
+u16 dsa_tagging_rx_vid(struct dsa_switch *ds, int port);
+int dsa_tagging_rx_switch_id(u16 vid);
+int dsa_tagging_rx_source_port(u16 vid);
+
 /* tag_brcm.c */
 extern const struct dsa_device_ops brcm_netdev_ops;
 extern const struct dsa_device_ops brcm_prepend_netdev_ops;
diff --git a/net/dsa/tag_8021q.c b/net/dsa/tag_8021q.c
new file mode 100644
index 000000000000..3eb2d5a43837
--- /dev/null
+++ b/net/dsa/tag_8021q.c
@@ -0,0 +1,203 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2019, Vladimir Oltean <olteanv@gmail.com>
+ */
+#include <linux/if_bridge.h>
+#include <linux/if_vlan.h>
+
+#include "dsa_priv.h"
+
+/* Allocating two VLAN tags per port - one for the Rx VID and
+ * the other for the Tx VID - see below
+ */
+#define DSA_TAGGING_VID_RANGE    (DSA_MAX_SWITCHES * DSA_MAX_PORTS)
+#define DSA_TAGGING_VID_BASE     (VLAN_N_VID - 2 * DSA_TAGGING_VID_RANGE - 1)
+#define DSA_TAGGING_RX_VID_BASE  (DSA_TAGGING_VID_BASE)
+#define DSA_TAGGING_TX_VID_BASE  (DSA_TAGGING_VID_BASE + DSA_TAGGING_VID_RANGE)
+
+/* Returns the VID to be inserted into the frame from xmit for switch steering
+ * instructions on egress. Encodes switch ID and port ID.
+ */
+u16 dsa_tagging_tx_vid(struct dsa_switch *ds, int port)
+{
+	return DSA_TAGGING_TX_VID_BASE + (DSA_MAX_PORTS * ds->index) + port;
+}
+
+/* Returns the VID that will be installed as pvid for this switch port, sent as
+ * tagged egress towards the CPU port and decoded by the rcv function.
+ */
+u16 dsa_tagging_rx_vid(struct dsa_switch *ds, int port)
+{
+	return DSA_TAGGING_RX_VID_BASE + (DSA_MAX_PORTS * ds->index) + port;
+}
+
+/* Returns the decoded switch ID from the Rx VID. */
+int dsa_tagging_rx_switch_id(u16 vid)
+{
+	return ((vid - DSA_TAGGING_RX_VID_BASE) / DSA_MAX_PORTS);
+}
+
+/* Returns the decoded port ID from the Rx VID. */
+int dsa_tagging_rx_source_port(u16 vid)
+{
+	return ((vid - DSA_TAGGING_RX_VID_BASE) % DSA_MAX_PORTS);
+}
+
+/* Rx VLAN tagging (left) and Tx VLAN tagging (right) setup shown for a single
+ * front-panel switch port (here swp0).
+ *
+ * Port identification through VLAN (802.1Q) tags has different requirements
+ * for it to work effectively:
+ *  - On Rx (ingress from network): each front-panel port must have a pvid
+ *    that uniquely identifies it, and the egress of this pvid must be tagged
+ *    towards the CPU port, so that software can recover the source port based
+ *    on the VID in the frame. But this would only work for standalone ports;
+ *    if bridged, this VLAN setup would break autonomous forwarding and would
+ *    force all switched traffic to pass through the CPU. So we must also make
+ *    the other front-panel ports members of this VID we're adding, albeit
+ *    we're not making it their PVID (they'll still have their own).
+ *    By the way - just because we're installing the same VID in multiple
+ *    switch ports doesn't mean that they'll start to talk to one another, even
+ *    while not bridged: the final forwarding decision is still an AND between
+ *    the L2 forwarding information (which is limiting forwarding in this case)
+ *    and the VLAN-based restrictions (of which there are none in this case,
+ *    since all ports are members).
+ *  - On Tx (ingress from CPU and towards network) we are faced with a problem.
+ *    If we were to tag traffic (from within DSA) with the port's pvid, all
+ *    would be well, assuming the switch ports were standalone. Frames would
+ *    have no choice but to be directed towards the correct front-panel port.
+ *    But because we also want the Rx VLAN to not break bridging, then
+ *    inevitably that means that we have to give them a choice (of what
+ *    front-panel port to go out on), and therefore we cannot steer traffic
+ *    based on the Rx VID. So what we do is simply install one more VID on the
+ *    front-panel and CPU ports, and profit off of the fact that steering will
+ *    work just by virtue of the fact that there is only one other port that's
+ *    a member of the VID we're tagging the traffic with - the desired one.
+ *
+ * So at the end, each front-panel port will have one Rx VID (also the PVID),
+ * the Rx VID of all other front-panel ports, and one Tx VID. Whereas the CPU
+ * port will have the Rx and Tx VIDs of all front-panel ports, and on top of
+ * that, is also tagged-input and tagged-output (VLAN trunk).
+ *
+ *               CPU port                               CPU port
+ * +-------------+-----+-------------+    +-------------+-----+-------------+
+ * |  Rx VID     |     |             |    |  Tx VID     |     |             |
+ * |  of swp0    |     |             |    |  of swp0    |     |             |
+ * |             +-----+             |    |             +-----+             |
+ * |                ^ T              |    |                | Tagged         |
+ * |                |                |    |                | ingress        |
+ * |    +-------+---+---+-------+    |    |    +-----------+                |
+ * |    |       |       |       |    |    |    | Untagged                   |
+ * |    |     U v     U v     U v    |    |    v egress                     |
+ * | +-----+ +-----+ +-----+ +-----+ |    | +-----+ +-----+ +-----+ +-----+ |
+ * | |     | |     | |     | |     | |    | |     | |     | |     | |     | |
+ * | |PVID | |     | |     | |     | |    | |     | |     | |     | |     | |
+ * +-+-----+-+-----+-+-----+-+-----+-+    +-+-----+-+-----+-+-----+-+-----+-+
+ *   swp0    swp1    swp2    swp3           swp0    swp1    swp2    swp3
+ */
+int dsa_port_setup_8021q_tagging(struct dsa_switch *ds, int port, bool enabled)
+{
+	int upstream = dsa_upstream_port(ds, port);
+	struct dsa_port *dp = &ds->ports[port];
+	struct dsa_port *upstream_dp = &ds->ports[upstream];
+	u16 rx_vid = dsa_tagging_rx_vid(ds, port);
+	u16 tx_vid = dsa_tagging_tx_vid(ds, port);
+	int i, err;
+
+	/* The CPU port is implicitly configured by
+	 * configuring the front-panel ports
+	 */
+	if (!dsa_is_user_port(ds, port))
+		return 0;
+
+	/* Add this user port's Rx VID to the membership list of all others
+	 * (including itself). This is so that bridging will not be hindered.
+	 * L2 forwarding rules still take precedence when there are no VLAN
+	 * restrictions, so there are no concerns about leaking traffic.
+	 */
+	for (i = 0; i < ds->num_ports; i++) {
+		struct dsa_port *other_dp = &ds->ports[i];
+		u16 flags;
+
+		if (i == upstream)
+			/* CPU port needs to see this port's Rx VID
+			 * as tagged egress.
+			 */
+			flags = 0;
+		else if (i == port)
+			/* The Rx VID is pvid on this port */
+			flags = BRIDGE_VLAN_INFO_UNTAGGED |
+				BRIDGE_VLAN_INFO_PVID;
+		else
+			/* The Rx VID is a regular VLAN on all others */
+			flags = BRIDGE_VLAN_INFO_UNTAGGED;
+
+		if (enabled)
+			err = dsa_port_vid_add(other_dp, rx_vid, flags);
+		else
+			err = dsa_port_vid_del(other_dp, rx_vid);
+		if (err) {
+			dev_err(ds->dev, "Failed to apply Rx VID %d to port %d: %d\n",
+				rx_vid, port, err);
+			return err;
+		}
+	}
+	/* Finally apply the Tx VID on this port and on the CPU port */
+	if (enabled)
+		err = dsa_port_vid_add(dp, tx_vid, BRIDGE_VLAN_INFO_UNTAGGED);
+	else
+		err = dsa_port_vid_del(dp, tx_vid);
+	if (err) {
+		dev_err(ds->dev, "Failed to apply Tx VID %d on port %d: %d\n",
+			tx_vid, port, err);
+		return err;
+	}
+	if (enabled)
+		err = dsa_port_vid_add(upstream_dp, tx_vid, 0);
+	else
+		err = dsa_port_vid_del(upstream_dp, tx_vid);
+	if (err) {
+		dev_err(ds->dev, "Failed to apply Tx VID %d on port %d: %d\n",
+			tx_vid, upstream, err);
+		return err;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(dsa_port_setup_8021q_tagging);
+
+struct sk_buff *dsa_8021q_xmit(struct sk_buff *skb, struct net_device *netdev,
+			       u16 tpid, u16 tci)
+{
+	/* skb->data points at skb_mac_header, which
+	 * is fine for vlan_insert_tag.
+	 */
+	return vlan_insert_tag(skb, htons(tpid), tci);
+}
+EXPORT_SYMBOL_GPL(dsa_8021q_xmit);
+
+struct sk_buff *dsa_8021q_rcv(struct sk_buff *skb, struct net_device *netdev,
+			      struct packet_type *pt, u16 *tpid, u16 *tci)
+{
+	struct vlan_ethhdr *tag;
+
+	if (unlikely(!pskb_may_pull(skb, VLAN_HLEN)))
+		return NULL;
+
+	tag = vlan_eth_hdr(skb);
+	*tpid = ntohs(tag->h_vlan_proto);
+	*tci = ntohs(tag->h_vlan_TCI);
+
+	/* skb->data points in the middle of the VLAN tag,
+	 * after tpid and before tci. This is because so far,
+	 * ETH_HLEN (DMAC, SMAC, EtherType) bytes were pulled.
+	 * There are 2 bytes of VLAN tag left in skb->data, and upper
+	 * layers expect the 'real' EtherType to be consumed as well.
+	 * Coincidentally, a VLAN header is also of the same size as
+	 * the number of bytes that need to be pulled.
+	 */
+	skb_pull_rcsum(skb, VLAN_HLEN);
+
+	return skb;
+}
+EXPORT_SYMBOL_GPL(dsa_8021q_rcv);
+
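
The offset arithmetic in the comment above can be checked with a small
userspace sketch (plain C, no skb; the struct and function names here are
made up for illustration):

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_ALEN   6
#define ETH_HLEN  14  /* DMAC + SMAC + EtherType */
#define VLAN_HLEN  4  /* TPID + TCI */

struct vlan_fields {
	uint16_t tpid;        /* bytes 12..13 of the frame */
	uint16_t tci;         /* bytes 14..15 */
	uint16_t inner_proto; /* bytes 16..17, the 'real' EtherType */
	size_t consumed;      /* total bytes the receive path pulls */
};

static uint16_t get_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Walks the same offsets as the comment in dsa_8021q_rcv():
 * eth_type_trans() has already pulled ETH_HLEN bytes (ending just past
 * the TPID), and pulling VLAN_HLEN more consumes the TCI plus the
 * inner EtherType.
 */
struct vlan_fields parse_8021q(const uint8_t *frame)
{
	struct vlan_fields f;

	f.tpid = get_be16(frame + 2 * ETH_ALEN);
	f.tci = get_be16(frame + 2 * ETH_ALEN + 2);
	f.inner_proto = get_be16(frame + 2 * ETH_ALEN + VLAN_HLEN);
	f.consumed = ETH_HLEN + VLAN_HLEN;
	return f;
}
```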
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v3 net-next 08/24] net: dsa: Be aware of switches where VLAN filtering is a global setting
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
                   ` (6 preceding siblings ...)
  2019-04-13  1:28 ` [PATCH v3 net-next 07/24] net: dsa: Optional VLAN-based port separation for switches without tagging Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  2019-04-16 23:54   ` Florian Fainelli
  2019-04-13  1:28 ` [PATCH v3 net-next 09/24] net: dsa: b53: Let DSA handle mismatched VLAN filtering settings Vladimir Oltean
                   ` (15 subsequent siblings)
  23 siblings, 1 reply; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

On some switches, whether to parse VLAN frame headers and use that
information for ingress admission is configurable, but not per
port. Such is the case for the Broadcom BCM53xx and the NXP SJA1105
families, for example. In that case, DSA can prevent the bridge core
from trying to apply different VLAN filtering settings on net devices
that belong to the same switch.
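
A userspace model (not the kernel code) of the cross-bridge check this
patch introduces may help; 'bridge' here is a hypothetical integer bridge
handle, with 0 meaning the port is standalone:

```c
#include <stdbool.h>
#include <stddef.h>

struct model_port {
	int bridge;           /* 0 = not bridged */
	bool vlan_filtering;  /* the port's bridge's current setting */
};

/* When VLAN filtering is global to the switch, a new setting is only
 * acceptable if no port enslaved to a *different* bridge runs with the
 * opposite setting.
 */
bool can_apply_vlan_filtering(const struct model_port *ports,
			      size_t num_ports, size_t port,
			      bool vlan_filtering)
{
	size_t i;

	for (i = 0; i < num_ports; i++) {
		if (!ports[i].bridge)
			continue;
		/* Ports under the same bridge share the setting anyway */
		if (ports[i].bridge == ports[port].bridge)
			continue;
		if (ports[i].vlan_filtering != vlan_filtering)
			return false;
	}
	return true;
}
```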

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
---
Changes in v3:
Reduced the indentation level by 1 in dsa_port_vlan_filtering().

Changes in v2:
None

 include/net/dsa.h |  5 +++++
 net/dsa/port.c    | 52 ++++++++++++++++++++++++++++++++++++++++-------
 net/dsa/switch.c  |  1 +
 3 files changed, 51 insertions(+), 7 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 809046f6a718..94a9f096568d 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -210,6 +210,11 @@ struct dsa_switch {
 	/* Number of switch port queues */
 	unsigned int		num_tx_queues;
 
+	/* Disallow bridge core from requesting different VLAN awareness
+	 * settings on ports if not hardware-supported
+	 */
+	bool			vlan_filtering_is_global;
+
 	unsigned long		*bitmap;
 	unsigned long		_bitmap;
 
diff --git a/net/dsa/port.c b/net/dsa/port.c
index 029169c2dd3b..c8eb2cbcea6e 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -154,6 +154,39 @@ void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br)
 	dsa_port_set_state_now(dp, BR_STATE_FORWARDING);
 }
 
+static bool dsa_port_can_apply_vlan_filtering(struct dsa_port *dp,
+					      bool vlan_filtering)
+{
+	struct dsa_switch *ds = dp->ds;
+	int i;
+
+	if (!ds->vlan_filtering_is_global)
+		return true;
+
+	/* For cases where enabling/disabling VLAN awareness is global to the
+	 * switch, we need to handle the case where multiple bridges span
+	 * different ports of the same switch device and one of them has a
+	 * different setting than what is being requested.
+	 */
+	for (i = 0; i < ds->num_ports; i++) {
+		struct net_device *other_bridge;
+
+		other_bridge = dsa_to_port(ds, i)->bridge_dev;
+		if (!other_bridge)
+			continue;
+		/* If it's the same bridge, it also has same
+		 * vlan_filtering setting => no need to check
+		 */
+		if (other_bridge == dp->bridge_dev)
+			continue;
+		if (br_vlan_enabled(other_bridge) != vlan_filtering) {
+			dev_err(ds->dev, "VLAN filtering is a global setting\n");
+			return false;
+		}
+	}
+	return true;
+}
+
 int dsa_port_vlan_filtering(struct dsa_port *dp, bool vlan_filtering,
 			    struct switchdev_trans *trans)
 {
@@ -164,13 +197,18 @@ int dsa_port_vlan_filtering(struct dsa_port *dp, bool vlan_filtering,
 	if (switchdev_trans_ph_prepare(trans))
 		return 0;
 
-	if (ds->ops->port_vlan_filtering) {
-		err = ds->ops->port_vlan_filtering(ds, dp->index,
-						   vlan_filtering);
-		if (err)
-			return err;
-		dp->vlan_filtering = vlan_filtering;
-	}
+	if (!ds->ops->port_vlan_filtering)
+		return 0;
+
+	if (!dsa_port_can_apply_vlan_filtering(dp, vlan_filtering))
+		return -EINVAL;
+
+	err = ds->ops->port_vlan_filtering(ds, dp->index,
+					   vlan_filtering);
+	if (err)
+		return err;
+
+	dp->vlan_filtering = vlan_filtering;
 	return 0;
 }
 
diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index fde4e9195709..03b8d8928651 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -10,6 +10,7 @@
  * (at your option) any later version.
  */
 
+#include <linux/if_bridge.h>
 #include <linux/netdevice.h>
 #include <linux/notifier.h>
 #include <linux/if_vlan.h>
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v3 net-next 09/24] net: dsa: b53: Let DSA handle mismatched VLAN filtering settings
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
                   ` (7 preceding siblings ...)
  2019-04-13  1:28 ` [PATCH v3 net-next 08/24] net: dsa: Be aware of switches where VLAN filtering is a global setting Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  2019-04-16 23:52   ` Florian Fainelli
  2019-04-13  1:28 ` [PATCH v3 net-next 10/24] net: dsa: Unset vlan_filtering when ports leave the bridge Vladimir Oltean
                   ` (14 subsequent siblings)
  23 siblings, 1 reply; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

The DSA core is now able to do this check prior to calling the
.port_vlan_filtering callback, so tell it that VLAN filtering is global
for this particular hardware.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
---
Changes in v3:
None

Changes in v2:
None

 drivers/net/dsa/b53/b53_common.c | 25 +++++++------------------
 1 file changed, 7 insertions(+), 18 deletions(-)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 0852e5e08177..a779b9c3ab6e 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -966,6 +966,13 @@ static int b53_setup(struct dsa_switch *ds)
 			b53_disable_port(ds, port);
 	}
 
+	/* Let DSA handle the case where multiple bridges span the same switch
+	 * device and different VLAN awareness settings are requested, which
+	 * would break filtering semantics for any of the other bridge
+	 * devices (not supported by the hardware).
+	 */
+	ds->vlan_filtering_is_global = true;
+
 	return ret;
 }
 
@@ -1275,26 +1282,8 @@ EXPORT_SYMBOL(b53_phylink_mac_link_up);
 int b53_vlan_filtering(struct dsa_switch *ds, int port, bool vlan_filtering)
 {
 	struct b53_device *dev = ds->priv;
-	struct net_device *bridge_dev;
-	unsigned int i;
 	u16 pvid, new_pvid;
 
-	/* Handle the case were multiple bridges span the same switch device
-	 * and one of them has a different setting than what is being requested
-	 * which would be breaking filtering semantics for any of the other
-	 * bridge devices.
-	 */
-	b53_for_each_port(dev, i) {
-		bridge_dev = dsa_to_port(ds, i)->bridge_dev;
-		if (bridge_dev &&
-		    bridge_dev != dsa_to_port(ds, port)->bridge_dev &&
-		    br_vlan_enabled(bridge_dev) != vlan_filtering) {
-			netdev_err(bridge_dev,
-				   "VLAN filtering is global to the switch!\n");
-			return -EINVAL;
-		}
-	}
-
 	b53_read16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(port), &pvid);
 	new_pvid = pvid;
 	if (dev->vlan_filtering_enabled && !vlan_filtering) {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v3 net-next 10/24] net: dsa: Unset vlan_filtering when ports leave the bridge
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
                   ` (8 preceding siblings ...)
  2019-04-13  1:28 ` [PATCH v3 net-next 09/24] net: dsa: b53: Let DSA handle mismatched VLAN filtering settings Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  2019-04-13 15:11   ` Andrew Lunn
  2019-04-16 23:59   ` Florian Fainelli
  2019-04-13  1:28 ` [PATCH v3 net-next 11/24] net: dsa: mt7530: Let DSA handle the unsetting of vlan_filtering Vladimir Oltean
                   ` (13 subsequent siblings)
  23 siblings, 2 replies; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

When ports are standalone (after they left the bridge), they should have
no VLAN filtering semantics (they should pass all traffic to the CPU).
Currently this is not true for switchdev drivers, because the bridge
core "forgets" to unset it.

Normally one would think that doing this at the bridge layer would be a
better idea, i.e. call br_vlan_filter_toggle() from br_del_if(), similar
to how nbp_vlan_init() is called from br_add_if().

However what complicates that approach, and makes this one preferable,
is the fact that for the bridge core, vlan_filtering is a per-bridge
setting, whereas for switchdev/DSA it is per-port. Also there are
switches where the setting is per the entire device, and unsetting
vlan_filtering one by one, for each leaving port, would not be possible
from the bridge core without a certain level of awareness. So do this in
DSA and let drivers be unaware of it.
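
A userspace sketch of the "last port to leave" condition described
above; port_bridge[i] is a hypothetical per-port bridge handle
(0 = standalone):

```c
#include <stdbool.h>
#include <stddef.h>

/* vlan_filtering may only be globally unset once no port other than
 * the leaving one is still a member of the bridge being left.
 */
bool last_port_leaving(const int *port_bridge, size_t num_ports,
		       size_t leaving_port, int br)
{
	size_t i;

	for (i = 0; i < num_ports; i++) {
		if (i == leaving_port)
			continue;
		if (port_bridge[i] == br)
			return false;
	}
	return true;
}
```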

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
Changes in v3:
Patch is new.

 net/dsa/switch.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index 03b8d8928651..7d8cd9bc0ecc 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -72,6 +72,9 @@ static int dsa_switch_bridge_join(struct dsa_switch *ds,
 static int dsa_switch_bridge_leave(struct dsa_switch *ds,
 				   struct dsa_notifier_bridge_info *info)
 {
+	bool unset_vlan_filtering = br_vlan_enabled(info->br);
+	int err, i;
+
 	if (ds->index == info->sw_index && ds->ops->port_bridge_leave)
 		ds->ops->port_bridge_leave(ds, info->port, info->br);
 
@@ -79,6 +82,31 @@ static int dsa_switch_bridge_leave(struct dsa_switch *ds,
 		ds->ops->crosschip_bridge_leave(ds, info->sw_index, info->port,
 						info->br);
 
+	/* If the bridge was vlan_filtering, the bridge core doesn't trigger an
+	 * event for changing vlan_filtering setting upon slave ports leaving
+	 * it. That is a good thing, because that lets us handle it and also
+	 * handle the case where the switch's vlan_filtering setting is global
+	 * (not per port). When that happens, the correct moment to trigger the
+	 * vlan_filtering callback is only when the last port left this bridge.
+	 */
+	if (unset_vlan_filtering && ds->vlan_filtering_is_global) {
+		for (i = 0; i < ds->num_ports; i++) {
+			if (i == info->port)
+				continue;
+			if (dsa_to_port(ds, i)->bridge_dev == info->br) {
+				unset_vlan_filtering = false;
+				break;
+			}
+		}
+	}
+	if (unset_vlan_filtering) {
+		struct switchdev_trans trans = {0};
+
+		err = dsa_port_vlan_filtering(&ds->ports[info->port],
+					      false, &trans);
+		if (err && err != -EOPNOTSUPP)
+			return err;
+	}
 	return 0;
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v3 net-next 11/24] net: dsa: mt7530: Let DSA handle the unsetting of vlan_filtering
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
                   ` (9 preceding siblings ...)
  2019-04-13  1:28 ` [PATCH v3 net-next 10/24] net: dsa: Unset vlan_filtering when ports leave the bridge Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  2019-04-13 15:12   ` Andrew Lunn
  2019-04-16 23:59   ` Florian Fainelli
  2019-04-13  1:28 ` [PATCH v3 net-next 12/24] net: dsa: Copy the vlan_filtering setting on the CPU port if it's global Vladimir Oltean
                   ` (12 subsequent siblings)
  23 siblings, 2 replies; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

The driver, recognizing that the .port_vlan_filtering callback was never
coming after the port left its parent bridge, decided to take that duty
into its own hands. DSA now takes care of this condition, so fix that.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
Changes in v3:
Patch is new.

 drivers/net/dsa/mt7530.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index 8bb0837792b1..2279362450fc 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -908,8 +908,6 @@ mt7530_port_bridge_leave(struct dsa_switch *ds, int port,
 			   PCR_MATRIX(BIT(MT7530_CPU_PORT)));
 	priv->ports[port].pm = PCR_MATRIX(BIT(MT7530_CPU_PORT));
 
-	mt7530_port_set_vlan_unaware(ds, port);
-
 	mutex_unlock(&priv->reg_mutex);
 }
 
@@ -1019,6 +1017,8 @@ mt7530_port_vlan_filtering(struct dsa_switch *ds, int port,
 		 */
 		mt7530_port_set_vlan_aware(ds, port);
 		mt7530_port_set_vlan_aware(ds, MT7530_CPU_PORT);
+	} else {
+		mt7530_port_set_vlan_unaware(ds, port);
 	}
 
 	return 0;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v3 net-next 12/24] net: dsa: Copy the vlan_filtering setting on the CPU port if it's global
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
                   ` (10 preceding siblings ...)
  2019-04-13  1:28 ` [PATCH v3 net-next 11/24] net: dsa: mt7530: Let DSA handle the unsetting of vlan_filtering Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  2019-04-13 15:23   ` Andrew Lunn
  2019-04-13  1:28 ` [PATCH v3 net-next 13/24] net: dsa: Allow drivers to filter packets they can decode source port from Vladimir Oltean
                   ` (11 subsequent siblings)
  23 siblings, 1 reply; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

The current behavior is not as obvious as one would assume: even when
the driver sets vlan_filtering_is_global = 1, checking an arbitrary
dp->vlan_filtering does not yield a consistent result, because only the
ports which are actively enslaved into a bridge have vlan_filtering set.

This makes it tricky for drivers to check what the global state is.
Moreover, the most obvious place to check for this setting, the CPU
port, never holds it, since the CPU port is not enslaved to any bridge.
So fix this and make the CPU port hold the global state of VLAN
filtering on this switch.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
Changes in v3:
Patch is new.

 net/dsa/port.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/net/dsa/port.c b/net/dsa/port.c
index c8eb2cbcea6e..acb4ed1f9929 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -190,6 +190,8 @@ static bool dsa_port_can_apply_vlan_filtering(struct dsa_port *dp,
 int dsa_port_vlan_filtering(struct dsa_port *dp, bool vlan_filtering,
 			    struct switchdev_trans *trans)
 {
+	/* Violate a const pointer here */
+	struct dsa_port *cpu_dp = (struct dsa_port *)dp->cpu_dp;
 	struct dsa_switch *ds = dp->ds;
 	int err;
 
@@ -209,6 +211,12 @@ int dsa_port_vlan_filtering(struct dsa_port *dp, bool vlan_filtering,
 		return err;
 
 	dp->vlan_filtering = vlan_filtering;
+	/* In case of switches where VLAN filtering is not per-port,
+	 * also put the setting in the most unambiguous place to
+	 * retrieve it from.
+	 */
+	if (ds->vlan_filtering_is_global)
+		cpu_dp->vlan_filtering = vlan_filtering;
 	return 0;
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v3 net-next 13/24] net: dsa: Allow drivers to filter packets they can decode source port from
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
                   ` (11 preceding siblings ...)
  2019-04-13  1:28 ` [PATCH v3 net-next 12/24] net: dsa: Copy the vlan_filtering setting on the CPU port if it's global Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  2019-04-13 15:39   ` Andrew Lunn
  2019-04-13  1:28 ` [PATCH v3 net-next 14/24] net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch Vladimir Oltean
                   ` (10 subsequent siblings)
  23 siblings, 1 reply; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

Frames get processed by DSA and redirected to switch port net devices
based on the ETH_P_XDSA multiplexed packet_type handler found by the
network stack when calling eth_type_trans().

The running assumption is that once the DSA .rcv function is called, DSA
is always able to decode the switch tag in order to change the skb->dev
from its master.

However there are tagging protocols (such as the new DSA_TAG_PROTO_SJA1105,
user of DSA_TAG_PROTO_8021Q) where this assumption is not completely
true, since switch tagging piggybacks on the absence of a vlan_filtering
bridge. Moreover, management traffic (BPDU, PTP) for this switch doesn't
rely on switch tagging, but on a different mechanism. So it would make
sense to at least be able to terminate that.

Having DSA receive traffic it can't decode would put it in an impossible
situation: the eth_type_trans() function would invoke the DSA .rcv(),
which could not change skb->dev, then eth_type_trans() would be invoked
again, which again would call the DSA .rcv, and the packet would never
be able to exit the DSA filter and would spiral in a loop until the
whole system dies.

This happens because eth_type_trans() doesn't actually look at the skb
(so as to identify a potential tag) when it deems it as being
ETH_P_XDSA. It just checks whether skb->dev has a DSA private pointer
installed (therefore it's a DSA master) and that there exists a .rcv
callback (everybody except DSA_TAG_PROTO_NONE has that). This is
understandable as there are many switch tags out there, and exhaustively
checking for all of them is far from ideal.

The solution lies in introducing a filtering function for each tagging
protocol. In the absence of a filtering function, all traffic is passed
to the .rcv DSA callback. The tagging protocol should see the filtering
function as a pre-validation that it can decode the incoming skb. The
traffic that doesn't match the filter will bypass the DSA .rcv callback
and be left on the master netdevice, which wasn't previously possible.
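
As an illustration of what such a filtering function might look like
(this is a sketch in the spirit of the patch, not the actual sja1105
implementation): accept only frames whose destination MAC lies in a
link-local range known to be trapped to the CPU port, e.g.
01:80:C2:xx:xx:xx, and leave everything else on the master:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Link-local range the switch is assumed to trap to the CPU */
static const uint8_t linklocal_prefix[3] = { 0x01, 0x80, 0xC2 };

/* Pre-validation: can the tagger decode the source port of this frame? */
bool example_can_decode(const uint8_t *dmac)
{
	return memcmp(dmac, linklocal_prefix,
		      sizeof(linklocal_prefix)) == 0;
}
```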

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
Changes in v3:
Reworked from a simple boolean (uses_tag_protocol) into a function where
the driver has full insight into both the skb as well as the state of
the master netdevice (and implicitly the cpu_dp and the entire DSA
switch tree).

Changes in v2:
Patch is new.

 include/net/dsa.h  | 15 +++++++++++++++
 net/dsa/dsa2.c     |  1 +
 net/dsa/legacy.c   |  1 +
 net/ethernet/eth.c |  6 +++++-
 4 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 94a9f096568d..e46c107507d8 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -55,6 +55,11 @@ struct dsa_device_ops {
 			       struct packet_type *pt);
 	int (*flow_dissect)(const struct sk_buff *skb, __be16 *proto,
 			    int *offset);
+	/* Used to determine which traffic should match the DSA filter in
+	 * eth_type_trans, and which, if any, should bypass it and be
+	 * processed as regular traffic on the master net device.
+	 */
+	bool (*filter)(const struct sk_buff *skb, struct net_device *dev);
 	unsigned int overhead;
 };
 
@@ -128,6 +133,7 @@ struct dsa_port {
 	struct dsa_switch_tree *dst;
 	struct sk_buff *(*rcv)(struct sk_buff *skb, struct net_device *dev,
 			       struct packet_type *pt);
+	bool (*filter)(const struct sk_buff *skb, struct net_device *dev);
 
 	enum {
 		DSA_PORT_TYPE_UNUSED = 0,
@@ -508,6 +514,15 @@ static inline bool netdev_uses_dsa(struct net_device *dev)
 	return false;
 }
 
+static inline bool dsa_can_decode(const struct sk_buff *skb,
+				  struct net_device *dev)
+{
+#if IS_ENABLED(CONFIG_NET_DSA)
+	return !dev->dsa_ptr->filter || dev->dsa_ptr->filter(skb, dev);
+#endif
+	return false;
+}
+
 struct dsa_switch *dsa_switch_alloc(struct device *dev, size_t n);
 void dsa_unregister_switch(struct dsa_switch *ds);
 int dsa_register_switch(struct dsa_switch *ds);
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 17817c1a7fbd..e28cc98cf358 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -584,6 +584,7 @@ static int dsa_port_parse_cpu(struct dsa_port *dp, struct net_device *master)
 	}
 
 	dp->type = DSA_PORT_TYPE_CPU;
+	dp->filter = tag_ops->filter;
 	dp->rcv = tag_ops->rcv;
 	dp->tag_ops = tag_ops;
 	dp->master = master;
diff --git a/net/dsa/legacy.c b/net/dsa/legacy.c
index cb42939db776..33cb4e7cf74b 100644
--- a/net/dsa/legacy.c
+++ b/net/dsa/legacy.c
@@ -159,6 +159,7 @@ static int dsa_switch_setup_one(struct dsa_switch *ds,
 		dst->cpu_dp->tag_ops = tag_ops;
 
 		/* Few copies for faster access in master receive hot path */
+		dst->cpu_dp->filter = dst->cpu_dp->tag_ops->filter;
 		dst->cpu_dp->rcv = dst->cpu_dp->tag_ops->rcv;
 		dst->cpu_dp->dst = dst;
 	}
diff --git a/net/ethernet/eth.c b/net/ethernet/eth.c
index f7a3d7a171c7..0c984bb01c0a 100644
--- a/net/ethernet/eth.c
+++ b/net/ethernet/eth.c
@@ -183,8 +183,12 @@ __be16 eth_type_trans(struct sk_buff *skb, struct net_device *dev)
 	 * at all, so we check here whether one of those tagging
 	 * variants has been configured on the receiving interface,
 	 * and if so, set skb->protocol without looking at the packet.
+	 * The DSA tagging protocol may be able to decode some but not all
+	 * traffic (for example only for management). In that case give it the
+	 * option to filter the packets from which it can decode source port
+	 * information.
 	 */
-	if (unlikely(netdev_uses_dsa(dev)))
+	if (unlikely(netdev_uses_dsa(dev)) && dsa_can_decode(skb, dev))
 		return htons(ETH_P_XDSA);
 
 	if (likely(eth_proto_is_802_3(eth->h_proto)))
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v3 net-next 14/24] net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
                   ` (12 preceding siblings ...)
  2019-04-13  1:28 ` [PATCH v3 net-next 13/24] net: dsa: Allow drivers to filter packets they can decode source port from Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  2019-04-13 15:42   ` Andrew Lunn
  2019-04-13  1:28 ` [PATCH v3 net-next 15/24] net: dsa: sja1105: Add support for FDB and MDB management Vladimir Oltean
                   ` (9 subsequent siblings)
  23 siblings, 1 reply; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

At this moment the following is supported:
* Link state management through phylib
* Autonomous L2 forwarding managed through iproute2 bridge commands. The
  switch ports are initialized in a mode where they can only talk to the
  CPU port. However, IP termination must currently be done through the
  master netdevice.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
---
Changes in v3:
None.

Changes in v2:
1. Device ID is no longer auto-detected but enforced based on explicit DT
   compatible string. This helps with stricter checking of DT bindings.
2. Group all device-specific operations into a sja1105_info structure and
   avoid using the IS_ET() and IS_PQRS() macros at runtime as much as possible.
5. Miscellaneous cosmetic cleanup in sja1105_clocking.c

 MAINTAINERS                                   |    6 +
 drivers/net/dsa/Kconfig                       |    2 +
 drivers/net/dsa/Makefile                      |    1 +
 drivers/net/dsa/sja1105/Kconfig               |   17 +
 drivers/net/dsa/sja1105/Makefile              |    9 +
 drivers/net/dsa/sja1105/sja1105.h             |  145 +++
 drivers/net/dsa/sja1105/sja1105_clocking.c    |  607 ++++++++++
 .../net/dsa/sja1105/sja1105_dynamic_config.c  |  464 ++++++++
 .../net/dsa/sja1105/sja1105_dynamic_config.h  |   43 +
 drivers/net/dsa/sja1105/sja1105_main.c        |  945 ++++++++++++++++
 drivers/net/dsa/sja1105/sja1105_spi.c         |  551 +++++++++
 .../net/dsa/sja1105/sja1105_static_config.c   | 1004 +++++++++++++++++
 .../net/dsa/sja1105/sja1105_static_config.h   |  274 +++++
 13 files changed, 4068 insertions(+)
 create mode 100644 drivers/net/dsa/sja1105/Kconfig
 create mode 100644 drivers/net/dsa/sja1105/Makefile
 create mode 100644 drivers/net/dsa/sja1105/sja1105.h
 create mode 100644 drivers/net/dsa/sja1105/sja1105_clocking.c
 create mode 100644 drivers/net/dsa/sja1105/sja1105_dynamic_config.c
 create mode 100644 drivers/net/dsa/sja1105/sja1105_dynamic_config.h
 create mode 100644 drivers/net/dsa/sja1105/sja1105_main.c
 create mode 100644 drivers/net/dsa/sja1105/sja1105_spi.c
 create mode 100644 drivers/net/dsa/sja1105/sja1105_static_config.c
 create mode 100644 drivers/net/dsa/sja1105/sja1105_static_config.h

diff --git a/MAINTAINERS b/MAINTAINERS
index fd80c14973ea..a8c9c7fafde7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11118,6 +11118,12 @@ S:	Maintained
 F:	Documentation/devicetree/bindings/sound/sgtl5000.txt
 F:	sound/soc/codecs/sgtl5000*
 
+NXP SJA1105 ETHERNET SWITCH DRIVER
+M:	Vladimir Oltean <olteanv@gmail.com>
+L:	linux-kernel@vger.kernel.org
+S:	Maintained
+F:	drivers/net/dsa/sja1105
+
 NXP TDA998X DRM DRIVER
 M:	Russell King <linux@armlinux.org.uk>
 S:	Maintained
diff --git a/drivers/net/dsa/Kconfig b/drivers/net/dsa/Kconfig
index 71bb3aebded4..d38e7e00c4e8 100644
--- a/drivers/net/dsa/Kconfig
+++ b/drivers/net/dsa/Kconfig
@@ -51,6 +51,8 @@ source "drivers/net/dsa/microchip/Kconfig"
 
 source "drivers/net/dsa/mv88e6xxx/Kconfig"
 
+source "drivers/net/dsa/sja1105/Kconfig"
+
 config NET_DSA_QCA8K
 	tristate "Qualcomm Atheros QCA8K Ethernet switch family support"
 	depends on NET_DSA
diff --git a/drivers/net/dsa/Makefile b/drivers/net/dsa/Makefile
index 82e5d794c41f..fefb6aaa82ba 100644
--- a/drivers/net/dsa/Makefile
+++ b/drivers/net/dsa/Makefile
@@ -18,3 +18,4 @@ obj-$(CONFIG_NET_DSA_VITESSE_VSC73XX) += vitesse-vsc73xx.o
 obj-y				+= b53/
 obj-y				+= microchip/
 obj-y				+= mv88e6xxx/
+obj-y				+= sja1105/
diff --git a/drivers/net/dsa/sja1105/Kconfig b/drivers/net/dsa/sja1105/Kconfig
new file mode 100644
index 000000000000..e6bb6579a809
--- /dev/null
+++ b/drivers/net/dsa/sja1105/Kconfig
@@ -0,0 +1,17 @@
+config NET_DSA_SJA1105
+	tristate "NXP SJA1105 Ethernet switch family support"
+	depends on NET_DSA && SPI
+	select NET_DSA_TAG_SJA1105
+	select NET_DSA_TAG_8021Q
+	help
+	  This is the driver for the NXP SJA1105 automotive Ethernet switch
+	  family. These are 5-port devices and are managed over an SPI
+	  interface. Probing is handled based on OF bindings and so is the
+	  linkage to phylib. The driver supports the following revisions:
+	    - SJA1105E (Gen. 1, No TT-Ethernet)
+	    - SJA1105T (Gen. 1, TT-Ethernet)
+	    - SJA1105P (Gen. 2, No SGMII, No TT-Ethernet)
+	    - SJA1105Q (Gen. 2, No SGMII, TT-Ethernet)
+	    - SJA1105R (Gen. 2, SGMII, No TT-Ethernet)
+	    - SJA1105S (Gen. 2, SGMII, TT-Ethernet)
+
diff --git a/drivers/net/dsa/sja1105/Makefile b/drivers/net/dsa/sja1105/Makefile
new file mode 100644
index 000000000000..ed00840802f4
--- /dev/null
+++ b/drivers/net/dsa/sja1105/Makefile
@@ -0,0 +1,9 @@
+obj-$(CONFIG_NET_DSA_SJA1105) += sja1105.o
+
+sja1105-objs := \
+    sja1105_spi.o \
+    sja1105_main.o \
+    sja1105_clocking.o \
+    sja1105_static_config.o \
+    sja1105_dynamic_config.o \
+
diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
new file mode 100644
index 000000000000..ef555dd385a3
--- /dev/null
+++ b/drivers/net/dsa/sja1105/sja1105.h
@@ -0,0 +1,145 @@
+/* SPDX-License-Identifier: GPL-2.0
+ * Copyright (c) 2018, Sensor-Technik Wiedemann GmbH
+ * Copyright (c) 2018-2019, Vladimir Oltean <olteanv@gmail.com>
+ */
+#ifndef _SJA1105_H
+#define _SJA1105_H
+
+#include <net/dsa.h>
+#include "sja1105_static_config.h"
+
+/* IEEE 802.3 Annex 57A: Slow Protocols PDUs (01:80:C2:xx:xx:xx) */
+#define SJA1105_LINKLOCAL_FILTER_A	0x0180C2000000ull
+#define SJA1105_LINKLOCAL_FILTER_A_MASK	0xFFFFFF000000ull
+/* IEEE 1588 Annex F: Transport of PTP over Ethernet (01:1B:19:xx:xx:xx) */
+#define SJA1105_LINKLOCAL_FILTER_B	0x011B19000000ull
+#define SJA1105_LINKLOCAL_FILTER_B_MASK	0xFFFFFF000000ull
+
+#define SJA1105_NUM_PORTS		5
+#define SJA1105_NUM_TC			8
+#define SJA1105ET_FDB_BIN_SIZE		4
+
+/* Keeps the different addresses between E/T and P/Q/R/S */
+struct sja1105_regs {
+	u64 device_id;
+	u64 prod_id;
+	u64 status;
+	u64 rgu;
+	u64 config;
+	u64 rmii_pll1;
+	u64 pad_mii_tx[SJA1105_NUM_PORTS];
+	u64 cgu_idiv[SJA1105_NUM_PORTS];
+	u64 rgmii_pad_mii_tx[SJA1105_NUM_PORTS];
+	u64 mii_tx_clk[SJA1105_NUM_PORTS];
+	u64 mii_rx_clk[SJA1105_NUM_PORTS];
+	u64 mii_ext_tx_clk[SJA1105_NUM_PORTS];
+	u64 mii_ext_rx_clk[SJA1105_NUM_PORTS];
+	u64 rgmii_tx_clk[SJA1105_NUM_PORTS];
+	u64 rmii_ref_clk[SJA1105_NUM_PORTS];
+	u64 rmii_ext_tx_clk[SJA1105_NUM_PORTS];
+	u64 mac[SJA1105_NUM_PORTS];
+	u64 mac_hl1[SJA1105_NUM_PORTS];
+	u64 mac_hl2[SJA1105_NUM_PORTS];
+	u64 qlevel[SJA1105_NUM_PORTS];
+};
+
+struct sja1105_info {
+	u64 device_id;
+	/* Needed for distinction between P and R, and between Q and S
+	 * (since the parts with/without SGMII share the same
+	 * switch core and device_id)
+	 */
+	u64 part_no;
+	const struct sja1105_dynamic_table_ops *dyn_ops;
+	const struct sja1105_table_ops *static_ops;
+	const struct sja1105_regs *regs;
+	int (*reset_cmd)(const void *ctx, const void *data);
+	const char *name;
+};
+
+struct sja1105_private {
+	struct sja1105_static_config static_config;
+	const struct sja1105_info *info;
+	struct gpio_desc *reset_gpio;
+	struct spi_device *spidev;
+	struct dsa_switch *ds;
+};
+
+#include "sja1105_dynamic_config.h"
+
+struct sja1105_spi_message {
+	u64 access;
+	u64 read_count;
+	u64 address;
+};
+
+typedef enum {
+	SPI_READ = 0,
+	SPI_WRITE = 1,
+} sja1105_spi_rw_mode_t;
+
+/* From sja1105_spi.c */
+int sja1105_spi_send_packed_buf(const struct sja1105_private *priv,
+				sja1105_spi_rw_mode_t rw, u64 reg_addr,
+				void *packed_buf, size_t size_bytes);
+int sja1105_spi_send_int(const struct sja1105_private *priv,
+			 sja1105_spi_rw_mode_t rw, u64 reg_addr,
+			 u64 *value, u64 size_bytes);
+int sja1105_spi_send_long_packed_buf(const struct sja1105_private *priv,
+				     sja1105_spi_rw_mode_t rw, u64 base_addr,
+				     void *packed_buf, u64 buf_len);
+int sja1105_static_config_upload(struct sja1105_private *priv);
+
+extern struct sja1105_info sja1105e_info;
+extern struct sja1105_info sja1105t_info;
+extern struct sja1105_info sja1105p_info;
+extern struct sja1105_info sja1105q_info;
+extern struct sja1105_info sja1105r_info;
+extern struct sja1105_info sja1105s_info;
+
+/* From sja1105_clocking.c */
+
+typedef enum {
+	XMII_MAC = 0,
+	XMII_PHY = 1,
+} sja1105_mii_role_t;
+
+typedef enum {
+	XMII_MODE_MII		= 0,
+	XMII_MODE_RMII		= 1,
+	XMII_MODE_RGMII		= 2,
+} sja1105_phy_interface_t;
+
+typedef enum {
+	SJA1105_SPEED_10MBPS	= 3,
+	SJA1105_SPEED_100MBPS	= 2,
+	SJA1105_SPEED_1000MBPS	= 1,
+	SJA1105_SPEED_AUTO	= 0,
+} sja1105_speed_t;
+
+int sja1105_clocking_setup_port(struct sja1105_private *priv, int port);
+int sja1105_clocking_setup(struct sja1105_private *priv);
+
+/* From sja1105_dynamic_config.c */
+
+int sja1105_dynamic_config_read(struct sja1105_private *priv,
+				enum sja1105_blk_idx blk_idx,
+				int index, void *entry);
+int sja1105_dynamic_config_write(struct sja1105_private *priv,
+				 enum sja1105_blk_idx blk_idx,
+				 int index, void *entry, bool keep);
+
+/* Common implementations for the static and dynamic configs */
+size_t sja1105_l2_forwarding_entry_packing(void *buf, void *entry_ptr,
+					   enum packing_op op);
+size_t sja1105pqrs_l2_lookup_entry_packing(void *buf, void *entry_ptr,
+					   enum packing_op op);
+size_t sja1105et_l2_lookup_entry_packing(void *buf, void *entry_ptr,
+					 enum packing_op op);
+size_t sja1105_vlan_lookup_entry_packing(void *buf, void *entry_ptr,
+					 enum packing_op op);
+size_t sja1105pqrs_mac_config_entry_packing(void *buf, void *entry_ptr,
+					    enum packing_op op);
+
+#endif
+
diff --git a/drivers/net/dsa/sja1105/sja1105_clocking.c b/drivers/net/dsa/sja1105/sja1105_clocking.c
new file mode 100644
index 000000000000..d40da3d52464
--- /dev/null
+++ b/drivers/net/dsa/sja1105/sja1105_clocking.c
@@ -0,0 +1,607 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/* Copyright (c) 2016-2018, NXP Semiconductors
+ * Copyright (c) 2018-2019, Vladimir Oltean <olteanv@gmail.com>
+ */
+#include <linux/packing.h>
+#include "sja1105.h"
+
+#define SIZE_CGU_CMD	4
+
+struct sja1105_cfg_pad_mii_tx {
+	u64 d32_os;
+	u64 d32_ipud;
+	u64 d10_os;
+	u64 d10_ipud;
+	u64 ctrl_os;
+	u64 ctrl_ipud;
+	u64 clk_os;
+	u64 clk_ih;
+	u64 clk_ipud;
+};
+
+/* UM10944 Table 82.
+ * IDIV_0_C to IDIV_4_C control registers
+ * (addr. 10000Bh to 10000Fh)
+ */
+struct sja1105_cgu_idiv {
+	u64 clksrc;
+	u64 autoblock;
+	u64 idiv;
+	u64 pd;
+};
+
+/* PLL_1_C control register
+ *
+ * SJA1105 E/T: UM10944 Table 81 (address 10000Ah)
+ * SJA1105 P/Q/R/S: UM11040 Table 116 (address 10000Ah)
+ */
+struct sja1105_cgu_pll_ctrl {
+	u64 pllclksrc;
+	u64 msel;
+	u64 nsel; /* Only for P/Q/R/S series */
+	u64 autoblock;
+	u64 psel;
+	u64 direct;
+	u64 fbsel;
+	u64 p23en; /* Only for P/Q/R/S series */
+	u64 bypass;
+	u64 pd;
+};
+
+enum {
+	CLKSRC_MII0_TX_CLK	= 0x00,
+	CLKSRC_MII0_RX_CLK	= 0x01,
+	CLKSRC_MII1_TX_CLK	= 0x02,
+	CLKSRC_MII1_RX_CLK	= 0x03,
+	CLKSRC_MII2_TX_CLK	= 0x04,
+	CLKSRC_MII2_RX_CLK	= 0x05,
+	CLKSRC_MII3_TX_CLK	= 0x06,
+	CLKSRC_MII3_RX_CLK	= 0x07,
+	CLKSRC_MII4_TX_CLK	= 0x08,
+	CLKSRC_MII4_RX_CLK	= 0x09,
+	CLKSRC_PLL0		= 0x0B,
+	CLKSRC_PLL1		= 0x0E,
+	CLKSRC_IDIV0		= 0x11,
+	CLKSRC_IDIV1		= 0x12,
+	CLKSRC_IDIV2		= 0x13,
+	CLKSRC_IDIV3		= 0x14,
+	CLKSRC_IDIV4		= 0x15,
+};
+
+/* UM10944 Table 83.
+ * MIIx clock control registers 1 to 30
+ * (addresses 100013h to 100035h)
+ */
+struct sja1105_cgu_mii_ctrl {
+	u64 clksrc;
+	u64 autoblock;
+	u64 pd;
+};
+
+static void sja1105_cgu_idiv_packing(void *buf, struct sja1105_cgu_idiv *idiv,
+				     enum packing_op op)
+{
+	const int size = 4;
+
+	sja1105_packing(buf, &idiv->clksrc,    28, 24, size, op);
+	sja1105_packing(buf, &idiv->autoblock, 11, 11, size, op);
+	sja1105_packing(buf, &idiv->idiv,       5,  2, size, op);
+	sja1105_packing(buf, &idiv->pd,         0,  0, size, op);
+}
+
+static int sja1105_cgu_idiv_config(struct sja1105_private *priv, int port,
+				   bool enabled, int factor)
+{
+	const struct sja1105_regs *regs = priv->info->regs;
+	struct device *dev = priv->ds->dev;
+	struct sja1105_cgu_idiv idiv;
+	u8 packed_buf[SIZE_CGU_CMD] = {0};
+
+	if (enabled && factor != 1 && factor != 10) {
+		dev_err(dev, "idiv factor must be 1 or 10\n");
+		return -ERANGE;
+	}
+
+	/* Payload for packed_buf */
+	idiv.clksrc    = 0x0A;            /* 25MHz */
+	idiv.autoblock = 1;               /* Block clk automatically */
+	idiv.idiv      = factor - 1;      /* Divide by 1 or 10 */
+	idiv.pd        = enabled ? 0 : 1; /* Power down? */
+	sja1105_cgu_idiv_packing(packed_buf, &idiv, PACK);
+
+	return sja1105_spi_send_packed_buf(priv, SPI_WRITE,
+					   regs->cgu_idiv[port],
+					   packed_buf, SIZE_CGU_CMD);
+}
+
+static void
+sja1105_cgu_mii_control_packing(void *buf, struct sja1105_cgu_mii_ctrl *cmd,
+				enum packing_op op)
+{
+	const int size = 4;
+
+	sja1105_packing(buf, &cmd->clksrc,    28, 24, size, op);
+	sja1105_packing(buf, &cmd->autoblock, 11, 11, size, op);
+	sja1105_packing(buf, &cmd->pd,         0,  0, size, op);
+}
+
+static int sja1105_cgu_mii_tx_clk_config(struct sja1105_private *priv,
+					 int port, sja1105_mii_role_t role)
+{
+	const struct sja1105_regs *regs = priv->info->regs;
+	struct sja1105_cgu_mii_ctrl mii_tx_clk;
+	const int mac_clk_sources[] = {
+		CLKSRC_MII0_TX_CLK,
+		CLKSRC_MII1_TX_CLK,
+		CLKSRC_MII2_TX_CLK,
+		CLKSRC_MII3_TX_CLK,
+		CLKSRC_MII4_TX_CLK,
+	};
+	const int phy_clk_sources[] = {
+		CLKSRC_IDIV0,
+		CLKSRC_IDIV1,
+		CLKSRC_IDIV2,
+		CLKSRC_IDIV3,
+		CLKSRC_IDIV4,
+	};
+	u8 packed_buf[SIZE_CGU_CMD] = {0};
+	int clksrc;
+
+	if (role == XMII_MAC)
+		clksrc = mac_clk_sources[port];
+	else
+		clksrc = phy_clk_sources[port];
+
+	/* Payload for packed_buf */
+	mii_tx_clk.clksrc    = clksrc;
+	mii_tx_clk.autoblock = 1;  /* Autoblock clk while changing clksrc */
+	mii_tx_clk.pd        = 0;  /* Power Down off => enabled */
+	sja1105_cgu_mii_control_packing(packed_buf, &mii_tx_clk, PACK);
+
+	return sja1105_spi_send_packed_buf(priv, SPI_WRITE,
+					   regs->mii_tx_clk[port],
+					   packed_buf, SIZE_CGU_CMD);
+}
+
+static int
+sja1105_cgu_mii_rx_clk_config(struct sja1105_private *priv, int port)
+{
+	const struct sja1105_regs *regs = priv->info->regs;
+	struct sja1105_cgu_mii_ctrl mii_rx_clk;
+	u8 packed_buf[SIZE_CGU_CMD] = {0};
+	const int clk_sources[] = {
+		CLKSRC_MII0_RX_CLK,
+		CLKSRC_MII1_RX_CLK,
+		CLKSRC_MII2_RX_CLK,
+		CLKSRC_MII3_RX_CLK,
+		CLKSRC_MII4_RX_CLK,
+	};
+
+	/* Payload for packed_buf */
+	mii_rx_clk.clksrc    = clk_sources[port];
+	mii_rx_clk.autoblock = 1;  /* Autoblock clk while changing clksrc */
+	mii_rx_clk.pd        = 0;  /* Power Down off => enabled */
+	sja1105_cgu_mii_control_packing(packed_buf, &mii_rx_clk, PACK);
+
+	return sja1105_spi_send_packed_buf(priv, SPI_WRITE,
+					   regs->mii_rx_clk[port],
+					   packed_buf, SIZE_CGU_CMD);
+}
+
+static int
+sja1105_cgu_mii_ext_tx_clk_config(struct sja1105_private *priv, int port)
+{
+	const struct sja1105_regs *regs = priv->info->regs;
+	struct sja1105_cgu_mii_ctrl mii_ext_tx_clk;
+	u8 packed_buf[SIZE_CGU_CMD] = {0};
+	const int clk_sources[] = {
+		CLKSRC_IDIV0,
+		CLKSRC_IDIV1,
+		CLKSRC_IDIV2,
+		CLKSRC_IDIV3,
+		CLKSRC_IDIV4,
+	};
+
+	/* Payload for packed_buf */
+	mii_ext_tx_clk.clksrc    = clk_sources[port];
+	mii_ext_tx_clk.autoblock = 1; /* Autoblock clk while changing clksrc */
+	mii_ext_tx_clk.pd        = 0; /* Power Down off => enabled */
+	sja1105_cgu_mii_control_packing(packed_buf, &mii_ext_tx_clk, PACK);
+
+	return sja1105_spi_send_packed_buf(priv, SPI_WRITE,
+					   regs->mii_ext_tx_clk[port],
+					   packed_buf, SIZE_CGU_CMD);
+}
+
+static int
+sja1105_cgu_mii_ext_rx_clk_config(struct sja1105_private *priv, int port)
+{
+	const struct sja1105_regs *regs = priv->info->regs;
+	struct sja1105_cgu_mii_ctrl mii_ext_rx_clk;
+	u8 packed_buf[SIZE_CGU_CMD] = {0};
+	const int clk_sources[] = {
+		CLKSRC_IDIV0,
+		CLKSRC_IDIV1,
+		CLKSRC_IDIV2,
+		CLKSRC_IDIV3,
+		CLKSRC_IDIV4,
+	};
+
+	/* Payload for packed_buf */
+	mii_ext_rx_clk.clksrc    = clk_sources[port];
+	mii_ext_rx_clk.autoblock = 1; /* Autoblock clk while changing clksrc */
+	mii_ext_rx_clk.pd        = 0; /* Power Down off => enabled */
+	sja1105_cgu_mii_control_packing(packed_buf, &mii_ext_rx_clk, PACK);
+
+	return sja1105_spi_send_packed_buf(priv, SPI_WRITE,
+					   regs->mii_ext_rx_clk[port],
+					   packed_buf, SIZE_CGU_CMD);
+}
+
+static int mii_clocking_setup(struct sja1105_private *priv, int port,
+			      sja1105_mii_role_t role)
+{
+	struct device *dev = priv->ds->dev;
+	int rc;
+
+	dev_dbg(dev, "Configuring MII-%s clocking\n",
+		(role == XMII_MAC) ? "MAC" : "PHY");
+	/* If role is MAC, disable IDIV
+	 * If role is PHY, enable IDIV and configure for 1/1 divider
+	 */
+	rc = sja1105_cgu_idiv_config(priv, port, (role == XMII_PHY), 1);
+	if (rc < 0)
+		return rc;
+
+	/* Configure CLKSRC of MII_TX_CLK_n
+	 *   * If role is MAC, select TX_CLK_n
+	 *   * If role is PHY, select IDIV_n
+	 */
+	rc = sja1105_cgu_mii_tx_clk_config(priv, port, role);
+	if (rc < 0)
+		return rc;
+
+	/* Configure CLKSRC of MII_RX_CLK_n
+	 * Select RX_CLK_n
+	 */
+	rc = sja1105_cgu_mii_rx_clk_config(priv, port);
+	if (rc < 0)
+		return rc;
+
+	if (role == XMII_PHY) {
+		/* Per MII spec, the PHY (which is us) drives the TX_CLK pin */
+
+		/* Configure CLKSRC of EXT_TX_CLK_n
+		 * Select IDIV_n
+		 */
+		rc = sja1105_cgu_mii_ext_tx_clk_config(priv, port);
+		if (rc < 0)
+			return rc;
+
+		/* Configure CLKSRC of EXT_RX_CLK_n
+		 * Select IDIV_n
+		 */
+		rc = sja1105_cgu_mii_ext_rx_clk_config(priv, port);
+		if (rc < 0)
+			return rc;
+	}
+	return 0;
+}
+
+static void
+sja1105_cgu_pll_control_packing(void *buf, struct sja1105_cgu_pll_ctrl *cmd,
+				enum packing_op op)
+{
+	const int size = 4;
+
+	sja1105_packing(buf, &cmd->pllclksrc, 28, 24, size, op);
+	sja1105_packing(buf, &cmd->msel,      23, 16, size, op);
+	sja1105_packing(buf, &cmd->autoblock, 11, 11, size, op);
+	sja1105_packing(buf, &cmd->psel,       9,  8, size, op);
+	sja1105_packing(buf, &cmd->direct,     7,  7, size, op);
+	sja1105_packing(buf, &cmd->fbsel,      6,  6, size, op);
+	sja1105_packing(buf, &cmd->bypass,     1,  1, size, op);
+	sja1105_packing(buf, &cmd->pd,         0,  0, size, op);
+	/* P/Q/R/S only, but packing zeroes for E/T doesn't hurt */
+	sja1105_packing(buf, &cmd->nsel,      13, 12, size, op);
+	sja1105_packing(buf, &cmd->p23en,      2,  2, size, op);
+}
+
+static int sja1105_cgu_rgmii_tx_clk_config(struct sja1105_private *priv,
+					   int port, sja1105_speed_t speed)
+{
+	const struct sja1105_regs *regs = priv->info->regs;
+	struct sja1105_cgu_mii_ctrl txc;
+	u8 packed_buf[SIZE_CGU_CMD] = {0};
+	int clksrc;
+
+	if (speed == SJA1105_SPEED_1000MBPS) {
+		clksrc = CLKSRC_PLL0;
+	} else {
+		int clk_sources[] = {CLKSRC_IDIV0, CLKSRC_IDIV1, CLKSRC_IDIV2,
+				     CLKSRC_IDIV3, CLKSRC_IDIV4};
+		clksrc = clk_sources[port];
+	}
+
+	/* RGMII: 125MHz for 1000, 25MHz for 100, 2.5MHz for 10 */
+	txc.clksrc = clksrc;
+	/* Autoblock clk while changing clksrc */
+	txc.autoblock = 1;
+	/* Power Down off => enabled */
+	txc.pd = 0;
+	sja1105_cgu_mii_control_packing(packed_buf, &txc, PACK);
+
+	return sja1105_spi_send_packed_buf(priv, SPI_WRITE,
+					   regs->rgmii_tx_clk[port],
+					   packed_buf, SIZE_CGU_CMD);
+}
+
+/* AGU */
+static void
+sja1105_cfg_pad_mii_tx_packing(void *buf, struct sja1105_cfg_pad_mii_tx *cmd,
+			       enum packing_op op)
+{
+	const int size = 4;
+
+	sja1105_packing(buf, &cmd->d32_os,   28, 27, size, op);
+	sja1105_packing(buf, &cmd->d32_ipud, 25, 24, size, op);
+	sja1105_packing(buf, &cmd->d10_os,   20, 19, size, op);
+	sja1105_packing(buf, &cmd->d10_ipud, 17, 16, size, op);
+	sja1105_packing(buf, &cmd->ctrl_os,  12, 11, size, op);
+	sja1105_packing(buf, &cmd->ctrl_ipud, 9,  8, size, op);
+	sja1105_packing(buf, &cmd->clk_os,    4,  3, size, op);
+	sja1105_packing(buf, &cmd->clk_ih,    2,  2, size, op);
+	sja1105_packing(buf, &cmd->clk_ipud,  1,  0, size, op);
+}
+
+static int sja1105_rgmii_cfg_pad_tx_config(struct sja1105_private *priv,
+					   int port)
+{
+	const struct sja1105_regs *regs = priv->info->regs;
+	struct sja1105_cfg_pad_mii_tx pad_mii_tx;
+	u8 packed_buf[SIZE_CGU_CMD] = {0};
+
+	/* Payload */
+	pad_mii_tx.d32_os    = 3; /* TXD[3:2] output stage: */
+				  /*          high noise/high speed */
+	pad_mii_tx.d10_os    = 3; /* TXD[1:0] output stage: */
+				  /*          high noise/high speed */
+	pad_mii_tx.d32_ipud  = 2; /* TXD[3:2] input stage: */
+				  /*          plain input (default) */
+	pad_mii_tx.d10_ipud  = 2; /* TXD[1:0] input stage: */
+				  /*          plain input (default) */
+	pad_mii_tx.ctrl_os   = 3; /* TX_CTL / TX_ER output stage */
+	pad_mii_tx.ctrl_ipud = 2; /* TX_CTL / TX_ER input stage (default) */
+	pad_mii_tx.clk_os    = 3; /* TX_CLK output stage */
+	pad_mii_tx.clk_ih    = 0; /* TX_CLK input hysteresis (default) */
+	pad_mii_tx.clk_ipud  = 2; /* TX_CLK input stage (default) */
+	sja1105_cfg_pad_mii_tx_packing(packed_buf, &pad_mii_tx, PACK);
+
+	return sja1105_spi_send_packed_buf(priv, SPI_WRITE,
+					   regs->rgmii_pad_mii_tx[port],
+					   packed_buf, SIZE_CGU_CMD);
+}
+
+static int rgmii_clocking_setup(struct sja1105_private *priv, int port)
+{
+	struct device *dev = priv->ds->dev;
+	struct sja1105_table *mac;
+	sja1105_speed_t speed;
+	int rc;
+
+	mac = &priv->static_config.tables[BLK_IDX_MAC_CONFIG];
+	speed = ((struct sja1105_mac_config_entry *)mac->entries)[port].speed;
+
+	dev_dbg(dev, "Configuring port %d RGMII clocking (speed code %d)\n",
+		port, speed);
+
+	switch (speed) {
+	case SJA1105_SPEED_1000MBPS:
+		/* 1000Mbps, IDIV disabled (125 MHz) */
+		rc = sja1105_cgu_idiv_config(priv, port, false, 1);
+		break;
+	case SJA1105_SPEED_100MBPS:
+		/* 100Mbps, IDIV enabled, divide by 1 (25 MHz) */
+		rc = sja1105_cgu_idiv_config(priv, port, true, 1);
+		break;
+	case SJA1105_SPEED_10MBPS:
+		/* 10Mbps, IDIV enabled, divide by 10 (2.5 MHz) */
+		rc = sja1105_cgu_idiv_config(priv, port, true, 10);
+		break;
+	case SJA1105_SPEED_AUTO:
+		/* Skip CGU configuration if there is no speed available
+		 * (e.g. link is not established yet)
+		 */
+		dev_dbg(dev, "Speed not available, skipping CGU config\n");
+		return 0;
+	default:
+		rc = -EINVAL;
+	}
+
+	if (rc < 0) {
+		dev_err(dev, "Failed to configure idiv\n");
+		return rc;
+	}
+	rc = sja1105_cgu_rgmii_tx_clk_config(priv, port, speed);
+	if (rc < 0) {
+		dev_err(dev, "Failed to configure RGMII Tx clock\n");
+		return rc;
+	}
+	rc = sja1105_rgmii_cfg_pad_tx_config(priv, port);
+	if (rc < 0) {
+		dev_err(dev, "Failed to configure Tx pad registers\n");
+		return rc;
+	}
+	return 0;
+}
+
+static int sja1105_cgu_rmii_ref_clk_config(struct sja1105_private *priv,
+					   int port)
+{
+	const struct sja1105_regs *regs = priv->info->regs;
+	struct sja1105_cgu_mii_ctrl ref_clk;
+	u8 packed_buf[SIZE_CGU_CMD] = {0};
+	const int clk_sources[] = {
+		CLKSRC_MII0_TX_CLK,
+		CLKSRC_MII1_TX_CLK,
+		CLKSRC_MII2_TX_CLK,
+		CLKSRC_MII3_TX_CLK,
+		CLKSRC_MII4_TX_CLK,
+	};
+
+	/* Payload for packed_buf */
+	ref_clk.clksrc    = clk_sources[port];
+	ref_clk.autoblock = 1;      /* Autoblock clk while changing clksrc */
+	ref_clk.pd        = 0;      /* Power Down off => enabled */
+	sja1105_cgu_mii_control_packing(packed_buf, &ref_clk, PACK);
+
+	return sja1105_spi_send_packed_buf(priv, SPI_WRITE,
+					   regs->rmii_ref_clk[port],
+					   packed_buf, SIZE_CGU_CMD);
+}
+
+static int
+sja1105_cgu_rmii_ext_tx_clk_config(struct sja1105_private *priv, int port)
+{
+	const struct sja1105_regs *regs = priv->info->regs;
+	struct sja1105_cgu_mii_ctrl ext_tx_clk;
+	u8 packed_buf[SIZE_CGU_CMD] = {0};
+
+	/* Payload for packed_buf */
+	ext_tx_clk.clksrc    = CLKSRC_PLL1;
+	ext_tx_clk.autoblock = 1;   /* Autoblock clk while changing clksrc */
+	ext_tx_clk.pd        = 0;   /* Power Down off => enabled */
+	sja1105_cgu_mii_control_packing(packed_buf, &ext_tx_clk, PACK);
+
+	return sja1105_spi_send_packed_buf(priv, SPI_WRITE,
+					   regs->rmii_ext_tx_clk[port],
+					   packed_buf, SIZE_CGU_CMD);
+}
+
+static int sja1105_cgu_rmii_pll_config(struct sja1105_private *priv)
+{
+	const struct sja1105_regs *regs = priv->info->regs;
+	struct device *dev = priv->ds->dev;
+	struct sja1105_cgu_pll_ctrl pll;
+	u8 packed_buf[SIZE_CGU_CMD] = {0};
+	int rc;
+
+	/* PLL1 must be enabled and output 50 MHz.
+	 * This is done by first writing 0x0A010941 to
+	 * the PLL_1_C register and then deasserting
+	 * power down (PD): 0x0A010940.
+	 */
+
+	/* Step 1: PLL1 setup for 50 MHz */
+	pll.pllclksrc = 0xA;
+	pll.msel      = 0x1;
+	pll.autoblock = 0x1;
+	pll.psel      = 0x1;
+	pll.direct    = 0x0;
+	pll.fbsel     = 0x1;
+	pll.bypass    = 0x0;
+	pll.pd        = 0x1;
+	/* P/Q/R/S only */
+	pll.nsel      = 0x0; /* PLL pre-divider is 1 (nsel + 1) */
+	pll.p23en     = 0x0; /* disable 120 and 240 degree phase PLL outputs */
+
+	sja1105_cgu_pll_control_packing(packed_buf, &pll, PACK);
+	rc = sja1105_spi_send_packed_buf(priv, SPI_WRITE, regs->rmii_pll1,
+					 packed_buf, SIZE_CGU_CMD);
+	if (rc < 0) {
+		dev_err(dev, "failed to configure PLL1 for 50MHz\n");
+		return rc;
+	}
+
+	/* Step 2: Enable PLL1 */
+	pll.pd = 0x0;
+
+	sja1105_cgu_pll_control_packing(packed_buf, &pll, PACK);
+	rc = sja1105_spi_send_packed_buf(priv, SPI_WRITE, regs->rmii_pll1,
+					 packed_buf, SIZE_CGU_CMD);
+	if (rc < 0) {
+		dev_err(dev, "failed to enable PLL1\n");
+		return rc;
+	}
+	return rc;
+}
+
+static int rmii_clocking_setup(struct sja1105_private *priv, int port,
+			       sja1105_mii_role_t role)
+{
+	struct device *dev = priv->ds->dev;
+	int rc;
+
+	dev_dbg(dev, "Configuring RMII-%s clocking\n",
+		(role == XMII_MAC) ? "MAC" : "PHY");
+	/* AH1601.pdf chapter 2.5.1. Sources */
+	if (role == XMII_MAC) {
+		/* Configure and enable PLL1 for 50 MHz output */
+		rc = sja1105_cgu_rmii_pll_config(priv);
+		if (rc < 0)
+			return rc;
+	}
+	/* Disable IDIV for this port */
+	rc = sja1105_cgu_idiv_config(priv, port, false, 1);
+	if (rc < 0)
+		return rc;
+	/* Source to sink mappings */
+	rc = sja1105_cgu_rmii_ref_clk_config(priv, port);
+	if (rc < 0)
+		return rc;
+	if (role == XMII_MAC) {
+		rc = sja1105_cgu_rmii_ext_tx_clk_config(priv, port);
+		if (rc < 0)
+			return rc;
+	}
+	return 0;
+}
+
+int sja1105_clocking_setup_port(struct sja1105_private *priv, int port)
+{
+	struct sja1105_xmii_params_entry *mii;
+	struct device *dev = priv->ds->dev;
+	sja1105_phy_interface_t phy_mode;
+	sja1105_mii_role_t role;
+	int rc;
+
+	mii = priv->static_config.tables[BLK_IDX_XMII_PARAMS].entries;
+
+	/* Interface type: MII, RMII or RGMII */
+	phy_mode = mii->xmii_mode[port];
+	/* MAC or PHY, for applicable types (not RGMII) */
+	role = mii->phy_mac[port];
+
+	switch (phy_mode) {
+	case XMII_MODE_MII:
+		rc = mii_clocking_setup(priv, port, role);
+		break;
+	case XMII_MODE_RMII:
+		rc = rmii_clocking_setup(priv, port, role);
+		break;
+	case XMII_MODE_RGMII:
+		rc = rgmii_clocking_setup(priv, port);
+		break;
+	default:
+		dev_err(dev, "Invalid interface mode specified: %d\n",
+			phy_mode);
+		return -EINVAL;
+	}
+	if (rc)
+		dev_err(dev, "Clocking setup for port %d failed: %d\n",
+			port, rc);
+	return rc;
+}
+
+int sja1105_clocking_setup(struct sja1105_private *priv)
+{
+	int port, rc;
+
+	for (port = 0; port < SJA1105_NUM_PORTS; port++) {
+		rc = sja1105_clocking_setup_port(priv, port);
+		if (rc < 0)
+			return rc;
+	}
+	return 0;
+}
+
diff --git a/drivers/net/dsa/sja1105/sja1105_dynamic_config.c b/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
new file mode 100644
index 000000000000..74c3a00d453c
--- /dev/null
+++ b/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
@@ -0,0 +1,464 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2018-2019, Vladimir Oltean <olteanv@gmail.com>
+ */
+#include "sja1105.h"
+
+#define SIZE_DYN_CMD                     4
+#define SIZE_MAC_CONFIG_DYN_ENTRY_ET     SIZE_DYN_CMD
+#define SIZE_L2_LOOKUP_DYN_CMD_ET       (SIZE_DYN_CMD + SIZE_L2_LOOKUP_ENTRY_ET)
+#define SIZE_L2_LOOKUP_DYN_CMD_PQRS     (SIZE_DYN_CMD + SIZE_L2_LOOKUP_ENTRY_PQRS)
+#define SIZE_VLAN_LOOKUP_DYN_CMD        (SIZE_DYN_CMD + 4 + SIZE_VLAN_LOOKUP_ENTRY)
+#define SIZE_L2_FORWARDING_DYN_CMD      (SIZE_DYN_CMD + SIZE_L2_FORWARDING_ENTRY)
+#define SIZE_MAC_CONFIG_DYN_CMD_ET      (SIZE_DYN_CMD + SIZE_MAC_CONFIG_DYN_ENTRY_ET)
+#define SIZE_MAC_CONFIG_DYN_CMD_PQRS    (SIZE_DYN_CMD + SIZE_MAC_CONFIG_ENTRY_PQRS)
+#define SIZE_L2_LOOKUP_PARAMS_DYN_CMD_ET SIZE_DYN_CMD
+#define SIZE_GENERAL_PARAMS_DYN_CMD_ET   SIZE_DYN_CMD
+#define MAX_DYN_CMD_SIZE                 SIZE_MAC_CONFIG_DYN_CMD_PQRS
+
+static void
+sja1105pqrs_l2_lookup_cmd_packing(void *buf, struct sja1105_dyn_cmd *cmd,
+				  enum packing_op op)
+{
+	u8 *p = buf + SIZE_L2_LOOKUP_ENTRY_PQRS;
+
+	sja1105_packing(p, &cmd->valid,    31, 31, SIZE_DYN_CMD, op);
+	sja1105_packing(p, &cmd->rdwrset,  30, 30, SIZE_DYN_CMD, op);
+	sja1105_packing(p, &cmd->errors,   29, 29, SIZE_DYN_CMD, op);
+	sja1105_packing(p, &cmd->valident, 27, 27, SIZE_DYN_CMD, op);
+	/* Hack - The hardware takes the 'index' field within
+	 * struct sja1105_l2_lookup_entry as the index on which this command
+	 * will operate. However, it will ignore everything else, so 'index'
+	 * is logically part of the command but physically part of the entry.
+	 * Populate the 'index' entry field from within the command callback,
+	 * such that our API doesn't need to ask for a full-blown entry
+	 * structure when e.g. a delete is requested.
+	 */
+	sja1105_packing(buf, &cmd->index, 29, 20, SIZE_L2_LOOKUP_ENTRY_PQRS, op);
+	/* TODO hostcmd */
+}
+
+static void
+sja1105et_l2_lookup_cmd_packing(void *buf, struct sja1105_dyn_cmd *cmd,
+				enum packing_op op)
+{
+	u8 *p = buf + SIZE_L2_LOOKUP_ENTRY_ET;
+
+	sja1105_packing(p, &cmd->valid,    31, 31, SIZE_DYN_CMD, op);
+	sja1105_packing(p, &cmd->rdwrset,  30, 30, SIZE_DYN_CMD, op);
+	sja1105_packing(p, &cmd->errors,   29, 29, SIZE_DYN_CMD, op);
+	sja1105_packing(p, &cmd->valident, 27, 27, SIZE_DYN_CMD, op);
+	/* Hack - see comments above. */
+	sja1105_packing(buf, &cmd->index, 29, 20, SIZE_L2_LOOKUP_ENTRY_ET, op);
+}
+
+static void
+sja1105et_mgmt_route_cmd_packing(void *buf, struct sja1105_dyn_cmd *cmd,
+				 enum packing_op op)
+{
+	u8 *p = buf + SIZE_L2_LOOKUP_ENTRY_ET;
+	u64 mgmtroute = 1;
+
+	sja1105et_l2_lookup_cmd_packing(buf, cmd, op);
+	if (op == PACK)
+		sja1105_pack(p, &mgmtroute, 26, 26, SIZE_DYN_CMD);
+}
+
+static size_t sja1105et_mgmt_route_entry_packing(void *buf, void *entry_ptr,
+						 enum packing_op op)
+{
+	struct sja1105_mgmt_entry *entry = entry_ptr;
+	const size_t size = SIZE_L2_LOOKUP_ENTRY_ET;
+
+	/* UM10944: To specify if a PTP egress timestamp shall be captured on
+	 * each port upon transmission of the frame, the LSB of VLANID in the
+	 * ENTRY field provided by the host must be set.
+	 * Bit 1 of VLANID then specifies the register in which the
+	 * timestamp for this port is stored.
+	 */
+	sja1105_packing(buf, &entry->tsreg,     85, 85, size, op);
+	sja1105_packing(buf, &entry->takets,    84, 84, size, op);
+	sja1105_packing(buf, &entry->macaddr,   83, 36, size, op);
+	sja1105_packing(buf, &entry->destports, 35, 31, size, op);
+	sja1105_packing(buf, &entry->enfport,   30, 30, size, op);
+	return size;
+}
+
+/* In E/T, the entry is at addresses 0x27-0x28. There is a 4-byte gap at 0x29,
+ * and the command is at 0x2a. Similarly, in P/Q/R/S there is a 1-register gap
+ * between the entry (0x2d, 0x2e) and the command (0x30).
+ */
+static void
+sja1105_vlan_lookup_cmd_packing(void *buf, struct sja1105_dyn_cmd *cmd,
+				enum packing_op op)
+{
+	u8 *p = buf + SIZE_VLAN_LOOKUP_ENTRY + 4;
+
+	sja1105_packing(p, &cmd->valid,    31, 31, SIZE_DYN_CMD, op);
+	sja1105_packing(p, &cmd->rdwrset,  30, 30, SIZE_DYN_CMD, op);
+	sja1105_packing(p, &cmd->valident, 27, 27, SIZE_DYN_CMD, op);
+	/* Hack - see comments above, applied for 'vlanid' field of
+	 * struct sja1105_vlan_lookup_entry.
+	 */
+	sja1105_packing(buf, &cmd->index, 38, 27, SIZE_VLAN_LOOKUP_ENTRY, op);
+}
+
+static void
+sja1105_l2_forwarding_cmd_packing(void *buf, struct sja1105_dyn_cmd *cmd,
+				  enum packing_op op)
+{
+	u8 *p = buf + SIZE_L2_FORWARDING_ENTRY;
+
+	sja1105_packing(p, &cmd->valid,   31, 31, SIZE_DYN_CMD, op);
+	sja1105_packing(p, &cmd->errors,  30, 30, SIZE_DYN_CMD, op);
+	sja1105_packing(p, &cmd->rdwrset, 29, 29, SIZE_DYN_CMD, op);
+	sja1105_packing(p, &cmd->index,    4,  0, SIZE_DYN_CMD, op);
+}
+
+static void
+sja1105et_mac_config_cmd_packing(void *buf, struct sja1105_dyn_cmd *cmd,
+				 enum packing_op op)
+{
+	/* Yup, user manual definitions are reversed */
+	u8 *reg1 = buf + 4;
+
+	sja1105_packing(reg1, &cmd->valid, 31, 31, SIZE_DYN_CMD, op);
+	sja1105_packing(reg1, &cmd->index, 26, 24, SIZE_DYN_CMD, op);
+}
+
+static size_t sja1105et_mac_config_entry_packing(void *buf, void *entry_ptr,
+						 enum packing_op op)
+{
+	struct sja1105_mac_config_entry *entry = entry_ptr;
+	const int size = SIZE_MAC_CONFIG_DYN_ENTRY_ET;
+	/* Yup, user manual definitions are reversed */
+	u8 *reg1 = buf + 4;
+	u8 *reg2 = buf;
+
+	sja1105_packing(reg1, &entry->speed,     30, 29, size, op);
+	sja1105_packing(reg1, &entry->drpdtag,   23, 23, size, op);
+	sja1105_packing(reg1, &entry->drpuntag,  22, 22, size, op);
+	sja1105_packing(reg1, &entry->retag,     21, 21, size, op);
+	sja1105_packing(reg1, &entry->dyn_learn, 20, 20, size, op);
+	sja1105_packing(reg1, &entry->egress,    19, 19, size, op);
+	sja1105_packing(reg1, &entry->ingress,   18, 18, size, op);
+	sja1105_packing(reg1, &entry->ing_mirr,  17, 17, size, op);
+	sja1105_packing(reg1, &entry->egr_mirr,  16, 16, size, op);
+	sja1105_packing(reg1, &entry->vlanprio,  14, 12, size, op);
+	sja1105_packing(reg1, &entry->vlanid,    11,  0, size, op);
+	sja1105_packing(reg2, &entry->tp_delin,  31, 16, size, op);
+	sja1105_packing(reg2, &entry->tp_delout, 15,  0, size, op);
+	/* MAC configuration table entries which can't be reconfigured:
+	 * top, base, enabled, ifg, maxage, drpnona664
+	 */
+	/* Bogus return value, not used anywhere */
+	return 0;
+}
+
+static void
+sja1105pqrs_mac_config_cmd_packing(void *buf, struct sja1105_dyn_cmd *cmd,
+				   enum packing_op op)
+{
+	u8 *p = buf + SIZE_MAC_CONFIG_ENTRY_PQRS;
+
+	sja1105_packing(p, &cmd->valid,   31, 31, SIZE_DYN_CMD, op);
+	sja1105_packing(p, &cmd->errors,  30, 30, SIZE_DYN_CMD, op);
+	sja1105_packing(p, &cmd->rdwrset, 29, 29, SIZE_DYN_CMD, op);
+	sja1105_packing(p, &cmd->index,    2,  0, SIZE_DYN_CMD, op);
+}
+
+static void
+sja1105et_l2_lookup_params_cmd_packing(void *buf, struct sja1105_dyn_cmd *cmd,
+				       enum packing_op op)
+{
+	sja1105_packing(buf, &cmd->valid, 31, 31,
+			SIZE_L2_LOOKUP_PARAMS_DYN_CMD_ET, op);
+}
+
+static size_t
+sja1105et_l2_lookup_params_entry_packing(void *buf, void *entry_ptr,
+					 enum packing_op op)
+{
+	struct sja1105_l2_lookup_params_entry *entry = entry_ptr;
+
+	sja1105_packing(buf, &entry->poly, 7, 0,
+			SIZE_L2_LOOKUP_PARAMS_DYN_CMD_ET, op);
+	/* Bogus return value, not used anywhere */
+	return 0;
+}
+
+static void
+sja1105et_general_params_cmd_packing(void *buf, struct sja1105_dyn_cmd *cmd,
+				     enum packing_op op)
+{
+	const int size = SIZE_GENERAL_PARAMS_DYN_CMD_ET;
+
+	sja1105_packing(buf, &cmd->valid,  31, 31, size, op);
+	sja1105_packing(buf, &cmd->errors, 30, 30, size, op);
+}
+
+static size_t
+sja1105et_general_params_entry_packing(void *buf, void *entry_ptr,
+				       enum packing_op op)
+{
+	struct sja1105_general_params_entry *entry = entry_ptr;
+	const int size = SIZE_GENERAL_PARAMS_DYN_CMD_ET;
+
+	sja1105_packing(buf, &entry->mirr_port, 2, 0, size, op);
+	/* Bogus return value, not used anywhere */
+	return 0;
+}
+
+#define OP_READ		BIT(0)
+#define OP_WRITE	BIT(1)
+#define OP_DEL		BIT(2)
+
+/* SJA1105E/T: First generation */
+struct sja1105_dynamic_table_ops sja1105et_dyn_ops[BLK_IDX_MAX_DYN] = {
+	[BLK_IDX_L2_LOOKUP] = {
+		.entry_packing = sja1105et_l2_lookup_entry_packing,
+		.cmd_packing = sja1105et_l2_lookup_cmd_packing,
+		.access = (OP_READ | OP_WRITE | OP_DEL),
+		.max_entry_count = MAX_L2_LOOKUP_COUNT,
+		.packed_size = SIZE_L2_LOOKUP_DYN_CMD_ET,
+		.addr = 0x20,
+	},
+	[BLK_IDX_MGMT_ROUTE] = {
+		.entry_packing = sja1105et_mgmt_route_entry_packing,
+		.cmd_packing = sja1105et_mgmt_route_cmd_packing,
+		.access = (OP_READ | OP_WRITE),
+		.max_entry_count = SJA1105_NUM_PORTS,
+		.packed_size = SIZE_L2_LOOKUP_DYN_CMD_ET,
+		.addr = 0x20,
+	},
+	[BLK_IDX_L2_POLICING] = { 0 },
+	[BLK_IDX_VLAN_LOOKUP] = {
+		.entry_packing = sja1105_vlan_lookup_entry_packing,
+		.cmd_packing = sja1105_vlan_lookup_cmd_packing,
+		.access = (OP_WRITE | OP_DEL),
+		.max_entry_count = MAX_VLAN_LOOKUP_COUNT,
+		.packed_size = SIZE_VLAN_LOOKUP_DYN_CMD,
+		.addr = 0x27,
+	},
+	[BLK_IDX_L2_FORWARDING] = {
+		.entry_packing = sja1105_l2_forwarding_entry_packing,
+		.cmd_packing = sja1105_l2_forwarding_cmd_packing,
+		.max_entry_count = MAX_L2_FORWARDING_COUNT,
+		.access = OP_WRITE,
+		.packed_size = SIZE_L2_FORWARDING_DYN_CMD,
+		.addr = 0x24,
+	},
+	[BLK_IDX_MAC_CONFIG] = {
+		.entry_packing = sja1105et_mac_config_entry_packing,
+		.cmd_packing = sja1105et_mac_config_cmd_packing,
+		.max_entry_count = MAX_MAC_CONFIG_COUNT,
+		.access = OP_WRITE,
+		.packed_size = SIZE_MAC_CONFIG_DYN_CMD_ET,
+		.addr = 0x36,
+	},
+	[BLK_IDX_L2_LOOKUP_PARAMS] = {
+		.entry_packing = sja1105et_l2_lookup_params_entry_packing,
+		.cmd_packing = sja1105et_l2_lookup_params_cmd_packing,
+		.max_entry_count = MAX_L2_LOOKUP_PARAMS_COUNT,
+		.access = OP_WRITE,
+		.packed_size = SIZE_L2_LOOKUP_PARAMS_DYN_CMD_ET,
+		.addr = 0x38,
+	},
+	[BLK_IDX_L2_FORWARDING_PARAMS] = { 0 },
+	[BLK_IDX_GENERAL_PARAMS] = {
+		.entry_packing = sja1105et_general_params_entry_packing,
+		.cmd_packing = sja1105et_general_params_cmd_packing,
+		.max_entry_count = MAX_GENERAL_PARAMS_COUNT,
+		.access = OP_WRITE,
+		.packed_size = SIZE_GENERAL_PARAMS_DYN_CMD_ET,
+		.addr = 0x34,
+	},
+	[BLK_IDX_XMII_PARAMS] = { 0 },
+};
+
+/* SJA1105P/Q/R/S: Second generation: TODO */
+struct sja1105_dynamic_table_ops sja1105pqrs_dyn_ops[BLK_IDX_MAX_DYN] = {
+	[BLK_IDX_L2_LOOKUP] = {
+		.entry_packing = sja1105pqrs_l2_lookup_entry_packing,
+		.cmd_packing = sja1105pqrs_l2_lookup_cmd_packing,
+		.access = (OP_READ | OP_WRITE | OP_DEL),
+		.max_entry_count = MAX_L2_LOOKUP_COUNT,
+		.packed_size = SIZE_L2_LOOKUP_DYN_CMD_PQRS,
+		.addr = 0x24,
+	},
+	[BLK_IDX_L2_POLICING] = { 0 },
+	[BLK_IDX_VLAN_LOOKUP] = {
+		.entry_packing = sja1105_vlan_lookup_entry_packing,
+		.cmd_packing = sja1105_vlan_lookup_cmd_packing,
+		.access = (OP_READ | OP_WRITE | OP_DEL),
+		.max_entry_count = MAX_VLAN_LOOKUP_COUNT,
+		.packed_size = SIZE_VLAN_LOOKUP_DYN_CMD,
+		.addr = 0x2D,
+	},
+	[BLK_IDX_L2_FORWARDING] = {
+		.entry_packing = sja1105_l2_forwarding_entry_packing,
+		.cmd_packing = sja1105_l2_forwarding_cmd_packing,
+		.max_entry_count = MAX_L2_FORWARDING_COUNT,
+		.access = OP_WRITE,
+		.packed_size = SIZE_L2_FORWARDING_DYN_CMD,
+		.addr = 0x2A,
+	},
+	[BLK_IDX_MAC_CONFIG] = {
+		.entry_packing = sja1105pqrs_mac_config_entry_packing,
+		.cmd_packing = sja1105pqrs_mac_config_cmd_packing,
+		.max_entry_count = MAX_MAC_CONFIG_COUNT,
+		.access = (OP_READ | OP_WRITE),
+		.packed_size = SIZE_MAC_CONFIG_DYN_CMD_PQRS,
+		.addr = 0x4B,
+	},
+	[BLK_IDX_L2_LOOKUP_PARAMS] = {
+		.entry_packing = sja1105et_l2_lookup_params_entry_packing,
+		.cmd_packing = sja1105et_l2_lookup_params_cmd_packing,
+		.max_entry_count = MAX_L2_LOOKUP_PARAMS_COUNT,
+		.access = (OP_READ | OP_WRITE),
+		.packed_size = SIZE_L2_LOOKUP_PARAMS_DYN_CMD_ET,
+		.addr = 0x38,
+	},
+	[BLK_IDX_L2_FORWARDING_PARAMS] = { 0 },
+	[BLK_IDX_GENERAL_PARAMS] = {
+		.entry_packing = sja1105et_general_params_entry_packing,
+		.cmd_packing = sja1105et_general_params_cmd_packing,
+		.max_entry_count = MAX_GENERAL_PARAMS_COUNT,
+		.access = OP_WRITE,
+		.packed_size = SIZE_GENERAL_PARAMS_DYN_CMD_ET,
+		.addr = 0x34,
+	},
+	[BLK_IDX_XMII_PARAMS] = { 0 },
+};
+
+int sja1105_dynamic_config_read(struct sja1105_private *priv,
+				enum sja1105_blk_idx blk_idx,
+				int index, void *entry)
+{
+	const struct sja1105_dynamic_table_ops *ops;
+	struct sja1105_dyn_cmd cmd = { 0 };
+	/* SPI payload buffer */
+	u8 packed_buf[MAX_DYN_CMD_SIZE];
+	int retries = 3;
+	int rc;
+
+	if (blk_idx >= BLK_IDX_MAX_DYN)
+		return -ERANGE;
+
+	ops = &priv->info->dyn_ops[blk_idx];
+
+	if (index >= ops->max_entry_count)
+		return -ERANGE;
+	if (!(ops->access & OP_READ))
+		return -EOPNOTSUPP;
+	if (ops->packed_size > MAX_DYN_CMD_SIZE)
+		return -ERANGE;
+	if (!ops->cmd_packing)
+		return -EOPNOTSUPP;
+	if (!ops->entry_packing)
+		return -EOPNOTSUPP;
+
+	memset(packed_buf, 0, ops->packed_size);
+
+	cmd.valid = true; /* Trigger action on table entry */
+	cmd.rdwrset = SPI_READ; /* Action is read */
+	cmd.index = index;
+	ops->cmd_packing(packed_buf, &cmd, PACK);
+
+	/* Send SPI write operation: read config table entry */
+	rc = sja1105_spi_send_packed_buf(priv, SPI_WRITE, ops->addr,
+					 packed_buf, ops->packed_size);
+	if (rc < 0)
+		return rc;
+
+	/* Loop until we have confirmation that hardware has finished
+	 * processing the command and has cleared the VALID field
+	 */
+	do {
+		memset(packed_buf, 0, ops->packed_size);
+
+		/* Retrieve the read operation's result */
+		rc = sja1105_spi_send_packed_buf(priv, SPI_READ, ops->addr,
+						 packed_buf, ops->packed_size);
+		if (rc < 0)
+			return rc;
+
+		memset(&cmd, 0, sizeof(cmd));
+		ops->cmd_packing(packed_buf, &cmd, UNPACK);
+		/* UM10944: [valident] will always be found cleared
+		 * during a read access with MGMTROUTE set.
+		 * So don't error out in that case.
+		 */
+		if (!cmd.valident && blk_idx != BLK_IDX_MGMT_ROUTE)
+			return -EINVAL;
+		cpu_relax();
+	} while (cmd.valid && --retries);
+
+	if (cmd.valid)
+		return -ETIMEDOUT;
+
+	/* Don't dereference possibly NULL pointer - maybe caller
+	 * only wanted to see whether the entry existed or not.
+	 */
+	if (entry)
+		ops->entry_packing(packed_buf, entry, UNPACK);
+	return 0;
+}
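The read path above follows a fixed handshake: write a command word with the VALID bit set, then poll until the hardware clears VALID, giving up after a few retries. A minimal user-space sketch of that polling pattern, with a hypothetical mock in place of the SPI transfer (names here are illustrative, not from the driver):

```c
#include <stdbool.h>

/* Hypothetical mock: "hardware" clears VALID after two polls */
static int mock_polls_left = 2;

static bool mock_read_valid(void)
{
	if (mock_polls_left > 0)
		mock_polls_left--;
	return mock_polls_left > 0; /* true while the command is still busy */
}

/* Poll until VALID clears or retries are exhausted, mirroring the
 * do/while loop in sja1105_dynamic_config_read(). Returns 0 on
 * completion, -1 (standing in for -ETIMEDOUT) otherwise.
 */
static int poll_for_completion(int retries)
{
	bool valid;

	do {
		valid = mock_read_valid();
	} while (valid && --retries);

	return valid ? -1 : 0;
}
```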
+
+int sja1105_dynamic_config_write(struct sja1105_private *priv,
+				 enum sja1105_blk_idx blk_idx,
+				 int index, void *entry, bool keep)
+{
+	const struct sja1105_dynamic_table_ops *ops;
+	struct sja1105_dyn_cmd cmd = { 0 };
+	/* SPI payload buffer */
+	u8 packed_buf[MAX_DYN_CMD_SIZE];
+	int rc;
+
+	if (blk_idx >= BLK_IDX_MAX_DYN)
+		return -ERANGE;
+
+	ops = &priv->info->dyn_ops[blk_idx];
+
+	if (index >= ops->max_entry_count)
+		return -ERANGE;
+	if (!(ops->access & OP_WRITE))
+		return -EOPNOTSUPP;
+	if (!keep && !(ops->access & OP_DEL))
+		return -EOPNOTSUPP;
+	if (ops->packed_size > MAX_DYN_CMD_SIZE)
+		return -ERANGE;
+
+	memset(packed_buf, 0, ops->packed_size);
+
+	cmd.valident = keep; /* If false, deletes entry */
+	cmd.valid = true; /* Trigger action on table entry */
+	cmd.rdwrset = SPI_WRITE; /* Action is write */
+	cmd.index = index;
+
+	if (!ops->cmd_packing)
+		return -EOPNOTSUPP;
+	ops->cmd_packing(packed_buf, &cmd, PACK);
+
+	if (!ops->entry_packing)
+		return -EOPNOTSUPP;
+	/* Don't dereference a potentially NULL pointer if the request
+	 * was just to delete a table entry. For the cases where the
+	 * 'index' field is physically part of the entry structure and
+	 * is needed here, the cmd_packing callback deals with that.
+	 */
+	if (keep)
+		ops->entry_packing(packed_buf, entry, PACK);
+
+	/* Send SPI write operation: write config table entry */
+	rc = sja1105_spi_send_packed_buf(priv, SPI_WRITE, ops->addr,
+					 packed_buf, ops->packed_size);
+	if (rc < 0)
+		return rc;
+
+	memset(&cmd, 0, sizeof(cmd));
+	ops->cmd_packing(packed_buf, &cmd, UNPACK);
+	if (cmd.errors)
+		return -EINVAL;
+
+	return 0;
+}
diff --git a/drivers/net/dsa/sja1105/sja1105_dynamic_config.h b/drivers/net/dsa/sja1105/sja1105_dynamic_config.h
new file mode 100644
index 000000000000..77be59546a55
--- /dev/null
+++ b/drivers/net/dsa/sja1105/sja1105_dynamic_config.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2019, Vladimir Oltean <olteanv@gmail.com>
+ */
+#ifndef _SJA1105_DYNAMIC_CONFIG_H
+#define _SJA1105_DYNAMIC_CONFIG_H
+
+#include "sja1105.h"
+#include <linux/packing.h>
+
+struct sja1105_dyn_cmd {
+	u64 valid;
+	u64 rdwrset;
+	u64 errors;
+	u64 valident;
+	u64 index;
+};
+
+struct sja1105_dynamic_table_ops {
+	/* This returns size_t just to keep the same prototype as the
+	 * static config ops, some of whose functions we are reusing.
+	 */
+	size_t (*entry_packing)(void *buf, void *entry_ptr, enum packing_op op);
+	void (*cmd_packing)(void *buf, struct sja1105_dyn_cmd *cmd,
+			    enum packing_op op);
+	size_t max_entry_count;
+	size_t packed_size;
+	u64 addr;
+	u8 access;
+};
+
+struct sja1105_mgmt_entry {
+	u64 tsreg;
+	u64 takets;
+	u64 macaddr;
+	u64 destports;
+	u64 enfport;
+	u64 index;
+};
+
+extern struct sja1105_dynamic_table_ops sja1105et_dyn_ops[BLK_IDX_MAX_DYN];
+extern struct sja1105_dynamic_table_ops sja1105pqrs_dyn_ops[BLK_IDX_MAX_DYN];
+
+#endif
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
new file mode 100644
index 000000000000..c3e4fff11101
--- /dev/null
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -0,0 +1,945 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2018, Sensor-Technik Wiedemann GmbH
+ * Copyright (c) 2018-2019, Vladimir Oltean <olteanv@gmail.com>
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/delay.h>
+#include <linux/module.h>
+#include <linux/printk.h>
+#include <linux/spi/spi.h>
+#include <linux/errno.h>
+#include <linux/gpio/consumer.h>
+#include <linux/of.h>
+#include <linux/of_net.h>
+#include <linux/of_mdio.h>
+#include <linux/of_device.h>
+#include <linux/netdev_features.h>
+#include <linux/netdevice.h>
+#include <linux/if_bridge.h>
+#include <linux/if_ether.h>
+#include "sja1105.h"
+
+static void sja1105_hw_reset(struct gpio_desc *gpio, unsigned int pulse_len,
+			     unsigned int startup_delay)
+{
+	gpiod_set_value_cansleep(gpio, 1);
+	/* Wait for minimum reset pulse length */
+	msleep(pulse_len);
+	gpiod_set_value_cansleep(gpio, 0);
+	/* Wait until chip is ready after reset */
+	msleep(startup_delay);
+}
+
+static void
+sja1105_port_allow_traffic(struct sja1105_l2_forwarding_entry *l2_fwd,
+			   int from, int to, bool allow)
+{
+	if (allow) {
+		l2_fwd[from].bc_domain  |= BIT(to);
+		l2_fwd[from].reach_port |= BIT(to);
+		l2_fwd[from].fl_domain  |= BIT(to);
+	} else {
+		l2_fwd[from].bc_domain  &= ~BIT(to);
+		l2_fwd[from].reach_port &= ~BIT(to);
+		l2_fwd[from].fl_domain  &= ~BIT(to);
+	}
+}
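The helper above sets or clears the same port bit in three per-port bitmasks in lockstep (broadcast domain, reachability, flood domain). The same logic as a stand-alone sketch, with a simplified stand-in for struct sja1105_l2_forwarding_entry:

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Simplified, illustrative stand-in for the driver's entry type */
struct fwd_entry {
	uint32_t bc_domain;   /* broadcast domain */
	uint32_t reach_port;  /* reachable ports */
	uint32_t fl_domain;   /* flood domain */
};

/* Allow (or forbid) traffic from port 'from' towards port 'to' */
static void allow_traffic(struct fwd_entry *l2_fwd, int from, int to,
			  bool allow)
{
	if (allow) {
		l2_fwd[from].bc_domain  |= BIT(to);
		l2_fwd[from].reach_port |= BIT(to);
		l2_fwd[from].fl_domain  |= BIT(to);
	} else {
		l2_fwd[from].bc_domain  &= ~BIT(to);
		l2_fwd[from].reach_port &= ~BIT(to);
		l2_fwd[from].fl_domain  &= ~BIT(to);
	}
}
```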
+
+/* Structure used to temporarily transport device tree
+ * settings into sja1105_setup
+ */
+struct sja1105_dt_port {
+	phy_interface_t phy_mode;
+	sja1105_mii_role_t role;
+};
+
+static int sja1105_init_mac_settings(struct sja1105_private *priv)
+{
+	struct sja1105_mac_config_entry default_mac = {
+		/* Enable all 8 priority queues on egress.
+		 * Every queue i holds top[i] - base[i] frames.
+		 * Sum of top[i] - base[i] is 511 (max hardware limit).
+		 */
+		.top  = {0x3F, 0x7F, 0xBF, 0xFF, 0x13F, 0x17F, 0x1BF, 0x1FF},
+		.base = {0x0, 0x40, 0x80, 0xC0, 0x100, 0x140, 0x180, 0x1C0},
+		.enabled = {true, true, true, true, true, true, true, true},
+		/* Keep standard IFG of 12 bytes on egress. */
+		.ifg = 0,
+		/* Always put the MAC speed in automatic mode, where it can be
+		 * retrieved from the PHY object through phylib and
+		 * sja1105_adjust_port_config.
+		 */
+		.speed = SJA1105_SPEED_AUTO,
+		/* No static correction for 1-step 1588 events */
+		.tp_delin = 0,
+		.tp_delout = 0,
+		/* Disable aging for critical TTEthernet traffic */
+		.maxage = 0xFF,
+		/* Internal VLAN (pvid) to apply to untagged ingress */
+		.vlanprio = 0,
+		.vlanid = 0,
+		.ing_mirr = false,
+		.egr_mirr = false,
+		/* Don't drop traffic with other EtherType than ETH_P_IP */
+		.drpnona664 = false,
+		/* Don't drop double-tagged traffic */
+		.drpdtag = false,
+		/* Don't drop VLAN with single outer tag - P/Q/R/S only */
+		.drpsotag = false,
+		/* Don't drop VLAN with single inner tag - P/Q/R/S only */
+		.drpsitag = false,
+		/* Don't drop untagged traffic */
+		.drpuntag = false,
+		/* Don't retag 802.1p (VID 0) traffic with the pvid */
+		.retag = false,
+		/* Enable learning and I/O on user ports by default. */
+		.dyn_learn = true,
+		.egress = false,
+		.ingress = false,
+		.mirrcie = 0,
+		.mirrcetag = 0,
+		.ingmirrvid = 0,
+		.ingmirrpcp = 0,
+		.ingmirrdei = 0,
+	};
+	struct sja1105_mac_config_entry *mac;
+	struct sja1105_table *table;
+	int i;
+
+	table = &priv->static_config.tables[BLK_IDX_MAC_CONFIG];
+
+	/* Discard previous MAC Configuration Table */
+	if (table->entry_count) {
+		kfree(table->entries);
+		table->entry_count = 0;
+	}
+
+	table->entries = kcalloc(SJA1105_NUM_PORTS,
+				 table->ops->unpacked_entry_size, GFP_KERNEL);
+	if (!table->entries)
+		return -ENOMEM;
+
+	/* Initialize all ports with the default MAC configuration */
+	table->entry_count = SJA1105_NUM_PORTS;
+
+	mac = table->entries;
+
+	for (i = 0; i < SJA1105_NUM_PORTS; i++)
+		mac[i] = default_mac;
+
+	return 0;
+}
+
+static int sja1105_init_mii_settings(struct sja1105_private *priv,
+				     struct sja1105_dt_port *ports)
+{
+	struct device *dev = &priv->spidev->dev;
+	struct sja1105_xmii_params_entry *mii;
+	struct sja1105_table *table;
+	int i;
+
+	table = &priv->static_config.tables[BLK_IDX_XMII_PARAMS];
+
+	/* Discard previous xMII Mode Parameters Table */
+	if (table->entry_count) {
+		kfree(table->entries);
+		table->entry_count = 0;
+	}
+
+	table->entries = kcalloc(MAX_XMII_PARAMS_COUNT,
+				 table->ops->unpacked_entry_size, GFP_KERNEL);
+	if (!table->entries)
+		return -ENOMEM;
+
+	/* Override table based on phylib DT bindings */
+	table->entry_count = MAX_XMII_PARAMS_COUNT;
+
+	mii = table->entries;
+
+	for (i = 0; i < SJA1105_NUM_PORTS; i++) {
+		switch (ports[i].phy_mode) {
+		case PHY_INTERFACE_MODE_MII:
+			mii->xmii_mode[i] = XMII_MODE_MII;
+			break;
+		case PHY_INTERFACE_MODE_RMII:
+			mii->xmii_mode[i] = XMII_MODE_RMII;
+			break;
+		case PHY_INTERFACE_MODE_RGMII:
+		case PHY_INTERFACE_MODE_RGMII_ID:
+		case PHY_INTERFACE_MODE_RGMII_RXID:
+		case PHY_INTERFACE_MODE_RGMII_TXID:
+			mii->xmii_mode[i] = XMII_MODE_RGMII;
+			break;
+		default:
+			dev_err(dev, "Unsupported PHY mode %s!\n",
+				phy_modes(ports[i].phy_mode));
+			return -EINVAL;
+
+		mii->phy_mac[i] = ports[i].role;
+	}
+	return 0;
+}
+
+static int sja1105_init_static_fdb(struct sja1105_private *priv)
+{
+	struct sja1105_table *table;
+
+	table = &priv->static_config.tables[BLK_IDX_L2_LOOKUP];
+
+	if (table->entry_count) {
+		kfree(table->entries);
+		table->entry_count = 0;
+	}
+	return 0;
+}
+
+static int sja1105_init_l2_lookup_params(struct sja1105_private *priv)
+{
+	struct sja1105_table *table;
+	struct sja1105_l2_lookup_params_entry default_l2_lookup_params = {
+		/* TODO Learned FDB entries are never forgotten */
+		.maxage = 0,
+		/* All entries within a FDB bin are available for learning */
+		.dyn_tbsz = SJA1105ET_FDB_BIN_SIZE,
+		/* 2^8 + 2^5 + 2^3 + 2^2 + 2^1 + 1 in Koopman notation */
+		.poly = 0x97,
+		/* This selects between Independent VLAN Learning (IVL) and
+		 * Shared VLAN Learning (SVL)
+		 */
+		.shared_learn = false,
+		/* Don't discard management traffic based on ENFPORT -
+		 * we don't perform SMAC port enforcement anyway, so
+		 * what we are setting here doesn't matter.
+		 */
+		.no_enf_hostprt = false,
+		/* Don't learn SMAC for mac_fltres1 and mac_fltres0.
+		 * TODO Maybe correlate with no_linklocal_learn from bridge
+		 * driver?
+		 */
+		.no_mgmt_learn = true,
+	};
+
+	table = &priv->static_config.tables[BLK_IDX_L2_LOOKUP_PARAMS];
+
+	if (table->entry_count) {
+		kfree(table->entries);
+		table->entry_count = 0;
+	}
+
+	table->entries = kcalloc(MAX_L2_LOOKUP_PARAMS_COUNT,
+				 table->ops->unpacked_entry_size, GFP_KERNEL);
+	if (!table->entries)
+		return -ENOMEM;
+
+	table->entry_count = MAX_L2_LOOKUP_PARAMS_COUNT;
+
+	/* This table only has a single entry */
+	((struct sja1105_l2_lookup_params_entry *)table->entries)[0] =
+				default_l2_lookup_params;
+
+	return 0;
+}
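The .poly value in the initializer above is given in Koopman notation, where the always-present low-order +1 term is implicit and the remaining bits are shifted right by one, so the x^8 term lands in bit 7. A small sketch of the conversion back to the conventional full polynomial (the helper name is illustrative, not from the driver):

```c
#include <stdint.h>

/* Convert a CRC-8 polynomial from Koopman notation (implicit +1 term,
 * x^8 term in bit 7) to its full conventional form. For 0x97 this
 * yields 0x12F, i.e. x^8 + x^5 + x^3 + x^2 + x + 1, matching the
 * comment in the initializer above.
 */
static uint16_t koopman_to_full_poly(uint8_t koopman)
{
	return ((uint16_t)koopman << 1) | 1;
}
```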
+
+static int sja1105_init_static_vlan(struct sja1105_private *priv)
+{
+	struct sja1105_table *table;
+	struct sja1105_vlan_lookup_entry pvid = {
+		.ving_mirr = 0,
+		.vegr_mirr = 0,
+		.vmemb_port = 0,
+		.vlan_bc = 0,
+		.tag_port = 0,
+		.vlanid = 0,
+	};
+	int i;
+
+	table = &priv->static_config.tables[BLK_IDX_VLAN_LOOKUP];
+
+	/* The static VLAN table will only contain the initial pvid of 0 */
+	if (table->entry_count) {
+		kfree(table->entries);
+		table->entry_count = 0;
+	}
+
+	table->entries = kcalloc(1, table->ops->unpacked_entry_size,
+				 GFP_KERNEL);
+	if (!table->entries)
+		return -ENOMEM;
+
+	table->entry_count = 1;
+
+	/* VLAN ID 0: all DT-defined ports are members; no restrictions on
+	 * forwarding; always transmit priority-tagged frames as untagged.
+	 */
+	for (i = 0; i < SJA1105_NUM_PORTS; i++) {
+		pvid.vmemb_port |= BIT(i);
+		pvid.vlan_bc |= BIT(i);
+		pvid.tag_port &= ~BIT(i);
+	}
+
+	((struct sja1105_vlan_lookup_entry *)table->entries)[0] = pvid;
+	return 0;
+}
+
+static int sja1105_init_l2_forwarding(struct sja1105_private *priv)
+{
+	struct sja1105_l2_forwarding_entry *l2fwd;
+	struct sja1105_table *table;
+	int i, j;
+
+	table = &priv->static_config.tables[BLK_IDX_L2_FORWARDING];
+
+	if (table->entry_count) {
+		kfree(table->entries);
+		table->entry_count = 0;
+	}
+
+	table->entries = kcalloc(MAX_L2_FORWARDING_COUNT,
+				 table->ops->unpacked_entry_size, GFP_KERNEL);
+	if (!table->entries)
+		return -ENOMEM;
+
+	table->entry_count = MAX_L2_FORWARDING_COUNT;
+
+	l2fwd = table->entries;
+
+	/* First 5 entries define the forwarding rules */
+	for (i = 0; i < SJA1105_NUM_PORTS; i++) {
+		unsigned int upstream = dsa_upstream_port(priv->ds, i);
+
+		for (j = 0; j < SJA1105_NUM_TC; j++)
+			l2fwd[i].vlan_pmap[j] = j;
+
+		if (i == upstream)
+			continue;
+
+		sja1105_port_allow_traffic(l2fwd, i, upstream, true);
+		sja1105_port_allow_traffic(l2fwd, upstream, i, true);
+	}
+	/* Next 8 entries define VLAN PCP mapping from ingress to egress.
+	 * Create a one-to-one mapping.
+	 */
+	for (i = 0; i < SJA1105_NUM_TC; i++)
+		for (j = 0; j < SJA1105_NUM_PORTS; j++)
+			l2fwd[SJA1105_NUM_PORTS + i].vlan_pmap[j] = i;
+
+	return 0;
+}
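The function above lays out the L2 Forwarding table in two regions: the first SJA1105_NUM_PORTS (5) entries hold per-port forwarding rules, and the next SJA1105_NUM_TC (8) entries hold one PCP remap table per ingress priority. A sketch of that index arithmetic, with the constants spelled out for illustration:

```c
#define NUM_PORTS 5
#define NUM_TC    8

/* Index of the L2 Forwarding entry holding the PCP remap table for
 * ingress priority 'prio': entries 5..12 in the layout above.
 */
static int pcp_map_entry(int prio)
{
	return NUM_PORTS + prio;
}
```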
+
+static int sja1105_init_l2_forwarding_params(struct sja1105_private *priv)
+{
+	struct sja1105_l2_forwarding_params_entry default_l2fwd_params = {
+		/* Disallow dynamic reconfiguration of vlan_pmap */
+		.max_dynp = 0,
+		/* Use a single memory partition for all ingress queues */
+		.part_spc = { MAX_FRAME_MEMORY, 0, 0, 0, 0, 0, 0, 0 },
+	};
+	struct sja1105_table *table;
+
+	table = &priv->static_config.tables[BLK_IDX_L2_FORWARDING_PARAMS];
+
+	if (table->entry_count) {
+		kfree(table->entries);
+		table->entry_count = 0;
+	}
+
+	table->entries = kcalloc(MAX_L2_FORWARDING_PARAMS_COUNT,
+				 table->ops->unpacked_entry_size, GFP_KERNEL);
+	if (!table->entries)
+		return -ENOMEM;
+
+	table->entry_count = MAX_L2_FORWARDING_PARAMS_COUNT;
+
+	/* This table only has a single entry */
+	((struct sja1105_l2_forwarding_params_entry *)table->entries)[0] =
+				default_l2fwd_params;
+
+	return 0;
+}
+
+static int sja1105_init_general_params(struct sja1105_private *priv)
+{
+	struct sja1105_general_params_entry default_general_params = {
+		/* Disallow dynamic changing of the mirror port */
+		.mirr_ptacu = 0,
+		.switchid = priv->ds->index,
+		/* Priority queue for link-local frames trapped to CPU */
+		.hostprio = 0,
+		.mac_fltres1 = SJA1105_LINKLOCAL_FILTER_A,
+		.mac_flt1    = SJA1105_LINKLOCAL_FILTER_A_MASK,
+		.incl_srcpt1 = true,
+		.send_meta1  = false,
+		.mac_fltres0 = SJA1105_LINKLOCAL_FILTER_B,
+		.mac_flt0    = SJA1105_LINKLOCAL_FILTER_B_MASK,
+		.incl_srcpt0 = true,
+		.send_meta0  = false,
+		/* The destination for traffic matching mac_fltres1 and
+		 * mac_fltres0 on all ports except host_port. Such traffic
+		 * received on host_port itself would be dropped, except
+		 * by installing a temporary 'management route'
+		 */
+		.host_port = dsa_upstream_port(priv->ds, 0),
+		/* Same as host port */
+		.mirr_port = dsa_upstream_port(priv->ds, 0),
+		/* Link-local traffic received on casc_port will be forwarded
+		 * to host_port without embedding the source port and device ID
+		 * info in the destination MAC address (presumably because it
+		 * is a cascaded port and a downstream SJA switch already did
+		 * that). Default to an invalid port (to disable the feature)
+		 * and overwrite this if we find any DSA (cascaded) ports.
+		 */
+		.casc_port = SJA1105_NUM_PORTS,
+		/* No TTEthernet */
+		.vllupformat = 0,
+		.vlmarker = 0,
+		.vlmask = 0,
+		/* Only update correctionField for 1-step PTP (L2 transport) */
+		.ignore2stf = 0,
+		.tpid = ETH_P_8021Q,
+		.tpid2 = ETH_P_8021Q,
+		/* P/Q/R/S only */
+		.queue_ts = 0,
+		.egrmirrvid = 0,
+		.egrmirrpcp = 0,
+		.egrmirrdei = 0,
+		.replay_port = 0,
+	};
+	struct sja1105_table *table;
+	int i;
+
+	for (i = 0; i < SJA1105_NUM_PORTS; i++)
+		if (dsa_is_dsa_port(priv->ds, i))
+			default_general_params.casc_port = i;
+
+	table = &priv->static_config.tables[BLK_IDX_GENERAL_PARAMS];
+
+	if (table->entry_count) {
+		kfree(table->entries);
+		table->entry_count = 0;
+	}
+
+	table->entries = kcalloc(MAX_GENERAL_PARAMS_COUNT,
+				 table->ops->unpacked_entry_size, GFP_KERNEL);
+	if (!table->entries)
+		return -ENOMEM;
+
+	table->entry_count = MAX_GENERAL_PARAMS_COUNT;
+
+	/* This table only has a single entry */
+	((struct sja1105_general_params_entry *)table->entries)[0] =
+				default_general_params;
+
+	return 0;
+}
+
+static inline void
+sja1105_setup_policer(struct sja1105_l2_policing_entry *policing,
+		      int index)
+{
+#define RATE_MBPS(speed) (((speed) * 64000) / 1000)
+	policing[index].sharindx = index;
+	policing[index].smax = 65535; /* Burst size in bytes */
+	policing[index].rate = RATE_MBPS(1000);
+	policing[index].maxlen = ETH_FRAME_LEN + VLAN_HLEN + ETH_FCS_LEN;
+	policing[index].partition = 0;
+#undef RATE_MBPS
+}
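RATE_MBPS() above is plain integer arithmetic: (speed * 64000) / 1000, i.e. 64 units per Mbps, which implies (by the macro's own math) that one hardware rate unit corresponds to 1/64 Mbps, or 15.625 kbps. The same computation as a testable function:

```c
/* Same arithmetic as the RATE_MBPS() macro above: encode a link rate
 * in Mbps into policer credit units of 1/64 Mbps each.
 */
static unsigned int rate_mbps(unsigned int speed)
{
	return (speed * 64000) / 1000;
}
```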
+
+static int sja1105_init_l2_policing(struct sja1105_private *priv)
+{
+	struct sja1105_l2_policing_entry *policing;
+	struct sja1105_table *table;
+	int i, j, k;
+
+	table = &priv->static_config.tables[BLK_IDX_L2_POLICING];
+
+	/* Discard previous L2 Policing Table */
+	if (table->entry_count) {
+		kfree(table->entries);
+		table->entry_count = 0;
+	}
+
+	table->entries = kcalloc(MAX_L2_POLICING_COUNT,
+				 table->ops->unpacked_entry_size, GFP_KERNEL);
+	if (!table->entries)
+		return -ENOMEM;
+
+	table->entry_count = MAX_L2_POLICING_COUNT;
+
+	policing = table->entries;
+
+	/* k sweeps through all unicast policers (0-39).
+	 * bcast sweeps through policers 40-44.
+	 */
+	for (i = 0, k = 0; i < SJA1105_NUM_PORTS; i++) {
+		int bcast = (SJA1105_NUM_PORTS * SJA1105_NUM_TC) + i;
+
+		for (j = 0; j < SJA1105_NUM_TC; j++, k++)
+			sja1105_setup_policer(policing, k);
+
+		/* Set up this port's policer for broadcast traffic */
+		sja1105_setup_policer(policing, bcast);
+	}
+	return 0;
+}
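The loop above assigns policer indices in a fixed layout: unicast policers occupy indices 0-39 (port * 8 + tc for 5 ports and 8 traffic classes), and broadcast policers occupy 40-44 (one per port). A sketch of that index math:

```c
#define NUM_PORTS 5
#define NUM_TC    8

/* Unicast policer index for (port, tc), matching k in the loop above */
static int ucast_policer(int port, int tc)
{
	return port * NUM_TC + tc;
}

/* Broadcast policer index for a port, matching bcast in the loop above */
static int bcast_policer(int port)
{
	return NUM_PORTS * NUM_TC + port;
}
```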
+
+static int sja1105_static_config_load(struct sja1105_private *priv,
+				      struct sja1105_dt_port *ports)
+{
+	int rc;
+
+	sja1105_static_config_free(&priv->static_config);
+	rc = sja1105_static_config_init(&priv->static_config,
+					priv->info->static_ops,
+					priv->info->device_id);
+	if (rc)
+		return rc;
+
+	/* Build static configuration */
+	rc = sja1105_init_mac_settings(priv);
+	if (rc < 0)
+		return rc;
+	rc = sja1105_init_mii_settings(priv, ports);
+	if (rc < 0)
+		return rc;
+	rc = sja1105_init_static_fdb(priv);
+	if (rc < 0)
+		return rc;
+	rc = sja1105_init_static_vlan(priv);
+	if (rc < 0)
+		return rc;
+	rc = sja1105_init_l2_lookup_params(priv);
+	if (rc < 0)
+		return rc;
+	rc = sja1105_init_l2_forwarding(priv);
+	if (rc < 0)
+		return rc;
+	rc = sja1105_init_l2_forwarding_params(priv);
+	if (rc < 0)
+		return rc;
+	rc = sja1105_init_l2_policing(priv);
+	if (rc < 0)
+		return rc;
+	rc = sja1105_init_general_params(priv);
+	if (rc < 0)
+		return rc;
+
+	/* Send initial configuration to hardware via SPI */
+	return sja1105_static_config_upload(priv);
+}
+
+static int sja1105_parse_ports_node(struct sja1105_private *priv,
+				    struct sja1105_dt_port *ports,
+				    struct device_node *ports_node)
+{
+	struct device *dev = &priv->spidev->dev;
+	struct device_node *child;
+
+	for_each_child_of_node(ports_node, child) {
+		struct device_node *phy_node;
+		int phy_mode;
+		u32 index;
+
+		/* Get switch port number from DT */
+		if (of_property_read_u32(child, "reg", &index) < 0) {
+			dev_err(dev, "Port number not defined in device tree "
+				"(property \"reg\")\n");
+			of_node_put(child);
+			return -ENODEV;
+		}
+
+		/* Get PHY mode from DT */
+		phy_mode = of_get_phy_mode(child);
+		if (phy_mode < 0) {
+			dev_err(dev, "Failed to read phy-mode or "
+				"phy-interface-type property for port %d\n",
+				index);
+			of_node_put(child);
+			return -ENODEV;
+		}
+		ports[index].phy_mode = phy_mode;
+
+		phy_node = of_parse_phandle(child, "phy-handle", 0);
+		if (!phy_node) {
+			if (!of_phy_is_fixed_link(child)) {
+				dev_err(dev, "phy-handle or fixed-link "
+					"properties missing!\n");
+				of_node_put(child);
+				return -ENODEV;
+			}
+			/* phy-handle is missing, but fixed-link isn't.
+			 * So it's a fixed link. Default to PHY role.
+			 */
+			ports[index].role = XMII_PHY;
+		} else {
+			/* phy-handle present => put port in MAC role */
+			ports[index].role = XMII_MAC;
+			of_node_put(phy_node);
+		}
+
+		/* The MAC/PHY role can be overridden with explicit bindings */
+		if (of_property_read_bool(child, "sja1105,role-mac"))
+			ports[index].role = XMII_MAC;
+		else if (of_property_read_bool(child, "sja1105,role-phy"))
+			ports[index].role = XMII_PHY;
+	}
+
+	return 0;
+}
+
+static int sja1105_parse_dt(struct sja1105_private *priv,
+			    struct sja1105_dt_port *ports)
+{
+	struct device *dev = &priv->spidev->dev;
+	struct device_node *switch_node = dev->of_node;
+	struct device_node *ports_node;
+	int rc;
+
+	ports_node = of_get_child_by_name(switch_node, "ports");
+	if (!ports_node) {
+		dev_err(dev, "Incorrect bindings: absent \"ports\" node\n");
+		return -ENODEV;
+	}
+
+	rc = sja1105_parse_ports_node(priv, ports, ports_node);
+	of_node_put(ports_node);
+
+	return rc;
+}
+
+/* Convert between the SJA1105 link speed encoding and Mbps */
+static int sja1105_speed[] = {
+	[SJA1105_SPEED_AUTO]     = 0,
+	[SJA1105_SPEED_10MBPS]   = 10,
+	[SJA1105_SPEED_100MBPS]  = 100,
+	[SJA1105_SPEED_1000MBPS] = 1000,
+};
+
+static int sja1105_get_speed_cfg(unsigned int speed_mbps)
+{
+	int i;
+
+	for (i = SJA1105_SPEED_AUTO; i <= SJA1105_SPEED_1000MBPS; i++)
+		if (sja1105_speed[i] == speed_mbps)
+			return i;
+	return -EINVAL;
+}
+
+/* Set link speed and enable/disable traffic I/O in the MAC configuration
+ * for a specific port.
+ *
+ * @speed_mbps: If 0, fall back to SJA1105_SPEED_AUTO, else adapt the MAC
+ *		to the PHY speed.
+ * @enabled: Manage Rx and Tx settings for this port. Overrides the static
+ *	     configuration settings.
+ */
+static int sja1105_adjust_port_config(struct sja1105_private *priv, int port,
+				      int speed_mbps, bool enabled)
+{
+	struct sja1105_xmii_params_entry *mii;
+	struct sja1105_mac_config_entry *mac;
+	struct device *dev = priv->ds->dev;
+	sja1105_phy_interface_t phy_mode;
+	int speed;
+	int rc;
+
+	mii = priv->static_config.tables[BLK_IDX_XMII_PARAMS].entries;
+	mac = priv->static_config.tables[BLK_IDX_MAC_CONFIG].entries;
+
+	speed = sja1105_get_speed_cfg(speed_mbps);
+	if (speed_mbps && speed < 0) {
+		dev_err(dev, "Invalid speed %iMbps\n", speed_mbps);
+		return -EINVAL;
+	}
+
+	/* If requested, overwrite SJA1105_SPEED_AUTO from the static MAC
+	 * configuration table, since this will be used for the clocking setup,
+	 * and we no longer need to store it in the static config (already told
+	 * hardware we want auto during upload phase).
+	 */
+	if (speed_mbps)
+		mac[port].speed = speed;
+	else
+		mac[port].speed = SJA1105_SPEED_AUTO;
+
+	/* On P/Q/R/S, one can read from the device via the MAC reconfiguration
+	 * tables. On E/T, MAC reconfig tables are not readable, only writable.
+	 * We have to *know* what the MAC looks like.  For the sake of keeping
+	 * the code common, we'll use the static configuration tables as a
+	 * reasonable approximation for both E/T and P/Q/R/S.
+	 */
+	mac[port].ingress = enabled;
+	mac[port].egress  = enabled;
+
+	/* Write to the dynamic reconfiguration tables */
+	rc = sja1105_dynamic_config_write(priv, BLK_IDX_MAC_CONFIG,
+					  port, &mac[port], true);
+	if (rc < 0) {
+		dev_err(dev, "Failed to write MAC config: %d\n", rc);
+		return rc;
+	}
+
+	/* Reconfigure the PLLs for the RGMII interfaces (required 125 MHz at
+	 * gigabit, 25 MHz at 100 Mbps and 2.5 MHz at 10 Mbps). For MII and
+	 * RMII no change of the clock setup is required. Actually, changing
+	 * the clock setup does interrupt the clock signal for a certain time
+	 * which causes trouble for all PHYs relying on this signal.
+	 */
+	if (!enabled)
+		return 0;
+
+	phy_mode = mii->xmii_mode[port];
+	if (phy_mode != XMII_MODE_RGMII)
+		return 0;
+
+	return sja1105_clocking_setup_port(priv, port);
+}
+
+static void sja1105_adjust_link(struct dsa_switch *ds, int port,
+				struct phy_device *phydev)
+{
+	struct sja1105_private *priv = ds->priv;
+
+	if (!phydev->link)
+		sja1105_adjust_port_config(priv, port, 0, false);
+	else
+		sja1105_adjust_port_config(priv, port, phydev->speed, true);
+}
+
+static int sja1105_bridge_member(struct dsa_switch *ds, int port,
+				 struct net_device *br, bool member)
+{
+	struct sja1105_l2_forwarding_entry *l2_fwd;
+	struct sja1105_private *priv = ds->priv;
+	int i, rc;
+
+	l2_fwd = priv->static_config.tables[BLK_IDX_L2_FORWARDING].entries;
+
+	for (i = 0; i < SJA1105_NUM_PORTS; i++) {
+		/* Add this port to the forwarding matrix of the
+		 * other ports in the same bridge, and vice versa.
+		 */
+		if (!dsa_is_user_port(ds, i))
+			continue;
+		/* For the ports already under the bridge, only one thing needs
+		 * to be done, and that is to add this port to their
+		 * reachability domain. So we can perform the SPI write for
+		 * them immediately. However, for this port itself (the one
+		 * that is new to the bridge), we need to add all other ports
+		 * to its reachability domain. So we do that incrementally in
+		 * this loop, and perform the SPI write only at the end, once
+		 * the domain contains all other bridge ports.
+		 */
+		if (i == port)
+			continue;
+		if (dsa_to_port(ds, i)->bridge_dev != br)
+			continue;
+		sja1105_port_allow_traffic(l2_fwd, i, port, member);
+		sja1105_port_allow_traffic(l2_fwd, port, i, member);
+
+		rc = sja1105_dynamic_config_write(priv, BLK_IDX_L2_FORWARDING,
+						  i, &l2_fwd[i], true);
+		if (rc < 0)
+			return rc;
+	}
+
+	return sja1105_dynamic_config_write(priv, BLK_IDX_L2_FORWARDING,
+					    port, &l2_fwd[port], true);
+}
+
+static int sja1105_bridge_join(struct dsa_switch *ds, int port,
+			       struct net_device *br)
+{
+	return sja1105_bridge_member(ds, port, br, true);
+}
+
+static void sja1105_bridge_leave(struct dsa_switch *ds, int port,
+				 struct net_device *br)
+{
+	sja1105_bridge_member(ds, port, br, false);
+}
+
+static enum dsa_tag_protocol
+sja1105_get_tag_protocol(struct dsa_switch *ds, int port)
+{
+	return DSA_TAG_PROTO_NONE;
+}
+
+/* The programming model for the SJA1105 switch is "all-at-once" via static
+ * configuration tables. Some of these can be dynamically modified at runtime,
+ * but not the xMII mode parameters table.
+ * Furthermore, some PHYs may not have crystals for generating their clocks
+ * (e.g. RMII). Instead, their 50MHz clock is supplied via the SJA1105 port's
+ * ref_clk pin. So port clocking needs to be initialized early, before
+ * connecting to PHYs is attempted, otherwise they won't respond through MDIO.
+ * Setting the correct PHY link speed does not matter yet. However,
+ * dsa_slave_phy_setup is called later than sja1105_setup, so the PHY
+ * bindings are not yet parsed by the DSA core. We need to parse them early
+ * so that we can populate the xMII mode parameters table.
+ */
+static int sja1105_setup(struct dsa_switch *ds)
+{
+	struct sja1105_dt_port ports[SJA1105_NUM_PORTS];
+	struct sja1105_private *priv = ds->priv;
+	int rc;
+
+	rc = sja1105_parse_dt(priv, ports);
+	if (rc < 0) {
+		dev_err(ds->dev, "Failed to parse DT: %d\n", rc);
+		return rc;
+	}
+	/* Create and send configuration down to device */
+	rc = sja1105_static_config_load(priv, ports);
+	if (rc < 0) {
+		dev_err(ds->dev, "Failed to load static config: %d\n", rc);
+		return rc;
+	}
+	/* Configure the CGU (PHY link modes and speeds) */
+	rc = sja1105_clocking_setup(priv);
+	if (rc < 0) {
+		dev_err(ds->dev, "Failed to configure MII clocking: %d\n", rc);
+		return rc;
+	}
+
+	return 0;
+}
+
+static const struct dsa_switch_ops sja1105_switch_ops = {
+	.get_tag_protocol	= sja1105_get_tag_protocol,
+	.setup			= sja1105_setup,
+	.adjust_link		= sja1105_adjust_link,
+	.port_bridge_join	= sja1105_bridge_join,
+	.port_bridge_leave	= sja1105_bridge_leave,
+};
+
+static int sja1105_check_device_id(struct sja1105_private *priv)
+{
+	const struct sja1105_regs *regs = priv->info->regs;
+	u8 prod_id[SIZE_SJA1105_DEVICE_ID] = {0};
+	struct device *dev = &priv->spidev->dev;
+	u64 device_id;
+	u64 part_no;
+	int rc;
+
+	rc = sja1105_spi_send_int(priv, SPI_READ, regs->device_id,
+				  &device_id, SIZE_SJA1105_DEVICE_ID);
+	if (rc < 0)
+		return rc;
+
+	if (device_id != priv->info->device_id) {
+		dev_err(dev, "Expected device ID 0x%llx but read 0x%llx\n",
+			priv->info->device_id, device_id);
+		return -ENODEV;
+	}
+
+	rc = sja1105_spi_send_packed_buf(priv, SPI_READ, regs->prod_id,
+					 prod_id, SIZE_SJA1105_DEVICE_ID);
+	if (rc < 0)
+		return rc;
+
+	sja1105_unpack(prod_id, &part_no, 19, 4, SIZE_SJA1105_DEVICE_ID);
+
+	if (part_no != priv->info->part_no) {
+		dev_err(dev, "Expected part number 0x%llx but read 0x%llx\n",
+			priv->info->part_no, part_no);
+		return -ENODEV;
+	}
+
+	return 0;
+}
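sja1105_unpack() above extracts PART_NO from bits 19:4 of the production ID word. The equivalent shift-and-mask arithmetic, as a hedged stand-alone sketch (the real helper also handles the big-endian buffer layout):

```c
#include <stdint.h>

/* Extract bit field [msb:lsb] from a 32-bit register value, as
 * sja1105_unpack(prod_id, &part_no, 19, 4, ...) does for PART_NO.
 * Valid for msb < 31 (no shift overflow).
 */
static uint32_t extract_field(uint32_t reg, int msb, int lsb)
{
	return (reg >> lsb) & ((1u << (msb - lsb + 1)) - 1);
}
```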
+
+static int sja1105_probe(struct spi_device *spi)
+{
+	struct device *dev = &spi->dev;
+	struct sja1105_private *priv;
+	struct dsa_switch *ds;
+	int rc;
+
+	if (!dev->of_node) {
+		dev_err(dev, "No DTS bindings for SJA1105 driver\n");
+		return -EINVAL;
+	}
+
+	priv = devm_kzalloc(dev, sizeof(struct sja1105_private), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	/* Configure the optional reset pin and bring up switch */
+	priv->reset_gpio = devm_gpiod_get(dev, "reset", GPIOD_OUT_HIGH);
+	if (IS_ERR(priv->reset_gpio))
+		dev_dbg(dev, "reset-gpios not defined, ignoring\n");
+	else
+		sja1105_hw_reset(priv->reset_gpio, 1, 1);
+
+	/* Populate our driver private structure (priv) based on
+	 * the device tree node that was probed (spi)
+	 */
+	priv->spidev = spi;
+	spi_set_drvdata(spi, priv);
+
+	/* Configure the SPI bus */
+	spi->mode = SPI_CPHA;
+	spi->bits_per_word = 8;
+	rc = spi_setup(spi);
+	if (rc < 0) {
+		dev_err(dev, "Could not init SPI\n");
+		return rc;
+	}
+
+	priv->info = of_device_get_match_data(dev);
+
+	/* Detect hardware device */
+	rc = sja1105_check_device_id(priv);
+	if (rc < 0) {
+		dev_err(dev, "Device ID check failed: %d\n", rc);
+		return rc;
+	}
+
+	dev_info(dev, "Probed switch chip: %s\n", priv->info->name);
+
+	ds = dsa_switch_alloc(dev, SJA1105_NUM_PORTS);
+	if (!ds)
+		return -ENOMEM;
+
+	ds->ops = &sja1105_switch_ops;
+	ds->priv = priv;
+	priv->ds = ds;
+
+	return dsa_register_switch(priv->ds);
+}
+
+static int sja1105_remove(struct spi_device *spi)
+{
+	struct sja1105_private *priv = spi_get_drvdata(spi);
+
+	dsa_unregister_switch(priv->ds);
+	sja1105_static_config_free(&priv->static_config);
+	return 0;
+}
+
+static const struct of_device_id sja1105_dt_ids[] = {
+	{ .compatible = "nxp,sja1105e", .data = &sja1105e_info },
+	{ .compatible = "nxp,sja1105t", .data = &sja1105t_info },
+	{ .compatible = "nxp,sja1105p", .data = &sja1105p_info },
+	{ .compatible = "nxp,sja1105q", .data = &sja1105q_info },
+	{ .compatible = "nxp,sja1105r", .data = &sja1105r_info },
+	{ .compatible = "nxp,sja1105s", .data = &sja1105s_info },
+	{ /* sentinel */ },
+};
+MODULE_DEVICE_TABLE(of, sja1105_dt_ids);
+
+static struct spi_driver sja1105_driver = {
+	.driver = {
+		.name  = "sja1105",
+		.of_match_table = of_match_ptr(sja1105_dt_ids),
+	},
+	.probe  = sja1105_probe,
+	.remove = sja1105_remove,
+};
+
+module_spi_driver(sja1105_driver);
+
+MODULE_AUTHOR("Vladimir Oltean <olteanv@gmail.com>");
+MODULE_AUTHOR("Georg Waibel <georg.waibel@sensor-technik.de>");
+MODULE_DESCRIPTION("SJA1105 Driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/net/dsa/sja1105/sja1105_spi.c b/drivers/net/dsa/sja1105/sja1105_spi.c
new file mode 100644
index 000000000000..09cb28e9be20
--- /dev/null
+++ b/drivers/net/dsa/sja1105/sja1105_spi.c
@@ -0,0 +1,551 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/* Copyright (c) 2016-2018, NXP Semiconductors
+ * Copyright (c) 2018, Sensor-Technik Wiedemann GmbH
+ * Copyright (c) 2018-2019, Vladimir Oltean <olteanv@gmail.com>
+ */
+#include <linux/spi/spi.h>
+#include <linux/packing.h>
+#include "sja1105.h"
+
+#define SIZE_SPI_MSG_HEADER	4
+#define SIZE_SPI_MSG_MAXLEN	(64 * 4)
+#define SPI_TRANSFER_SIZE_MAX	(SIZE_SPI_MSG_HEADER + SIZE_SPI_MSG_MAXLEN)
+
+static int sja1105_spi_transfer(const struct sja1105_private *priv,
+				const void *tx, void *rx, int size)
+{
+	struct spi_device *spi = priv->spidev;
+	struct spi_transfer transfer = {
+		.tx_buf = tx,
+		.rx_buf = rx,
+		.len = size,
+	};
+	struct spi_message msg;
+	int rc;
+
+	if (size > SPI_TRANSFER_SIZE_MAX) {
+		dev_err(&spi->dev, "SPI message (%d) longer than max of %d\n",
+			size, SPI_TRANSFER_SIZE_MAX);
+		return -EMSGSIZE;
+	}
+
+	spi_message_init(&msg);
+	spi_message_add_tail(&transfer, &msg);
+
+	rc = spi_sync(spi, &msg);
+	if (rc < 0) {
+		dev_err(&spi->dev, "SPI transfer failed: %d\n", rc);
+		return rc;
+	}
+
+	return rc;
+}
+
+static void
+sja1105_spi_message_pack(void *buf, const struct sja1105_spi_message *msg)
+{
+	const int size = SIZE_SPI_MSG_HEADER;
+
+	memset(buf, 0, size);
+
+	sja1105_pack(buf, &msg->access,     31, 31, size);
+	sja1105_pack(buf, &msg->read_count, 30, 25, size);
+	sja1105_pack(buf, &msg->address,    24,  4, size);
+}
+
+/* If @rw is:
+ * - SPI_WRITE: creates and sends an SPI write message at absolute
+ *		address reg_addr, taking size_bytes from *packed_buf
+ * - SPI_READ:  creates and sends an SPI read message from absolute
+ *		address reg_addr, writing size_bytes into *packed_buf
+ *
+ * This function should only be called if it is known beforehand that
+ * @size_bytes does not exceed SIZE_SPI_MSG_MAXLEN. Larger packed buffers
+ * are chunked into smaller pieces by sja1105_spi_send_long_packed_buf below.
+ */
+int sja1105_spi_send_packed_buf(const struct sja1105_private *priv,
+				sja1105_spi_rw_mode_t rw, u64 reg_addr,
+				void *packed_buf, size_t size_bytes)
+{
+	u8 tx_buf[SIZE_SPI_MSG_HEADER + SIZE_SPI_MSG_MAXLEN] = {0};
+	u8 rx_buf[SIZE_SPI_MSG_HEADER + SIZE_SPI_MSG_MAXLEN] = {0};
+	const int msg_len = size_bytes + SIZE_SPI_MSG_HEADER;
+	struct sja1105_spi_message msg = {0};
+	int rc;
+
+	if (msg_len > SIZE_SPI_MSG_HEADER + SIZE_SPI_MSG_MAXLEN)
+		return -ERANGE;
+
+	msg.access = rw;
+	msg.address = reg_addr;
+	if (rw == SPI_READ)
+		msg.read_count = size_bytes / 4;
+
+	sja1105_spi_message_pack(tx_buf, &msg);
+
+	if (rw == SPI_WRITE)
+		memcpy(tx_buf + SIZE_SPI_MSG_HEADER, packed_buf, size_bytes);
+
+	rc = sja1105_spi_transfer(priv, tx_buf, rx_buf, msg_len);
+	if (rc < 0)
+		return rc;
+
+	if (rw == SPI_READ)
+		memcpy(packed_buf, rx_buf + SIZE_SPI_MSG_HEADER, size_bytes);
+
+	return 0;
+}
+
+/* If @rw is:
+ * - SPI_WRITE: creates and sends an SPI write message at absolute
+ *		address reg_addr, taking size_bytes from *packed_buf
+ * - SPI_READ:  creates and sends an SPI read message from absolute
+ *		address reg_addr, writing size_bytes into *packed_buf
+ *
+ * The u64 *value is unpacked, meaning that it's stored in the native
+ * CPU endianness and directly usable by software running on the core.
+ *
+ * This is a wrapper around sja1105_spi_send_packed_buf().
+ */
+int sja1105_spi_send_int(const struct sja1105_private *priv,
+			 sja1105_spi_rw_mode_t rw, u64 reg_addr,
+			 u64 *value, u64 size_bytes)
+{
+	u8 packed_buf[SIZE_SPI_MSG_MAXLEN];
+	int rc;
+
+	if (size_bytes > SIZE_SPI_MSG_MAXLEN)
+		return -ERANGE;
+
+	if (rw == SPI_WRITE)
+		sja1105_pack(packed_buf, value, 8 * size_bytes - 1, 0,
+			     size_bytes);
+
+	rc = sja1105_spi_send_packed_buf(priv, rw, reg_addr, packed_buf,
+					 size_bytes);
+
+	if (rw == SPI_READ)
+		sja1105_unpack(packed_buf, value, 8 * size_bytes - 1, 0,
+			       size_bytes);
+
+	return rc;
+}
+
+/* Should be used if a @packed_buf larger than SIZE_SPI_MSG_MAXLEN must be
+ * sent/received. Splitting the buffer into chunks and assembling those
+ * into SPI messages is done automatically by this function.
+ */
+int sja1105_spi_send_long_packed_buf(const struct sja1105_private *priv,
+				     sja1105_spi_rw_mode_t rw, u64 base_addr,
+				     void *packed_buf, u64 buf_len)
+{
+	struct chunk {
+		void *buf_ptr;
+		int len;
+		u64 spi_address;
+	} chunk;
+	int distance_to_end;
+	int rc;
+
+	/* Initialize chunk */
+	chunk.buf_ptr = packed_buf;
+	chunk.spi_address = base_addr;
+	chunk.len = min_t(int, buf_len, SIZE_SPI_MSG_MAXLEN);
+
+	while (chunk.len) {
+		rc = sja1105_spi_send_packed_buf(priv, rw, chunk.spi_address,
+						 chunk.buf_ptr, chunk.len);
+		if (rc < 0)
+			return rc;
+
+		chunk.buf_ptr += chunk.len;
+		chunk.spi_address += chunk.len / 4;
+		distance_to_end = (uintptr_t)(packed_buf + buf_len -
+					      chunk.buf_ptr);
+		chunk.len = min(distance_to_end, SIZE_SPI_MSG_MAXLEN);
+	}
+
+	return 0;
+}
+
+/* Back-ported structure from UM11040 Table 112.
+ * Reset control register (addr. 100440h)
+ * In the SJA1105 E/T, only warm_rst and cold_rst are
+ * supported (exposed in UM10944 as rst_ctrl), but the bit
+ * offsets of warm_rst and cold_rst are actually reversed.
+ */
+struct sja1105_reset_cmd {
+	u64 switch_rst;
+	u64 cfg_rst;
+	u64 car_rst;
+	u64 otp_rst;
+	u64 warm_rst;
+	u64 cold_rst;
+	u64 por_rst;
+};
+
+static void
+sja1105et_reset_cmd_pack(void *buf, const struct sja1105_reset_cmd *reset)
+{
+	const int size = 4;
+
+	memset(buf, 0, size);
+
+	sja1105_pack(buf, &reset->cold_rst, 3, 3, size);
+	sja1105_pack(buf, &reset->warm_rst, 2, 2, size);
+}
+
+static void
+sja1105pqrs_reset_cmd_pack(void *buf, const struct sja1105_reset_cmd *reset)
+{
+	const int size = 4;
+
+	memset(buf, 0, size);
+
+	sja1105_pack(buf, &reset->switch_rst, 8, 8, size);
+	sja1105_pack(buf, &reset->cfg_rst,    7, 7, size);
+	sja1105_pack(buf, &reset->car_rst,    5, 5, size);
+	sja1105_pack(buf, &reset->otp_rst,    4, 4, size);
+	sja1105_pack(buf, &reset->warm_rst,   3, 3, size);
+	sja1105_pack(buf, &reset->cold_rst,   2, 2, size);
+	sja1105_pack(buf, &reset->por_rst,    1, 1, size);
+}
+
+static int sja1105et_reset_cmd(const void *ctx, const void *data)
+{
+	const struct sja1105_private *priv = ctx;
+	const struct sja1105_reset_cmd *reset = data;
+	const struct sja1105_regs *regs = priv->info->regs;
+	struct device *dev = priv->ds->dev;
+	u8 packed_buf[4];
+
+	if (reset->switch_rst ||
+	    reset->cfg_rst ||
+	    reset->car_rst ||
+	    reset->otp_rst ||
+	    reset->por_rst) {
+		dev_err(dev, "Only warm and cold reset are supported for SJA1105 E/T!\n");
+		return -EINVAL;
+	}
+
+	if (reset->warm_rst)
+		dev_dbg(dev, "Warm reset requested\n");
+	if (reset->cold_rst)
+		dev_dbg(dev, "Cold reset requested\n");
+
+	sja1105et_reset_cmd_pack(packed_buf, reset);
+
+	return sja1105_spi_send_packed_buf(priv, SPI_WRITE, regs->rgu,
+					   packed_buf, 4);
+}
+
+static int sja1105pqrs_reset_cmd(const void *ctx, const void *data)
+{
+	const struct sja1105_private *priv = ctx;
+	const struct sja1105_reset_cmd *reset = data;
+	const struct sja1105_regs *regs = priv->info->regs;
+	struct device *dev = priv->ds->dev;
+	u8 packed_buf[4];
+
+	if (reset->switch_rst)
+		dev_dbg(dev, "Main reset for all functional modules requested\n");
+	if (reset->cfg_rst)
+		dev_dbg(dev, "Chip configuration reset requested\n");
+	if (reset->car_rst)
+		dev_dbg(dev, "Clock and reset control logic reset requested\n");
+	if (reset->otp_rst)
+		dev_dbg(dev, "OTP read cycle for reading product config settings requested\n");
+	if (reset->warm_rst)
+		dev_dbg(dev, "Warm reset requested\n");
+	if (reset->cold_rst)
+		dev_dbg(dev, "Cold reset requested\n");
+	if (reset->por_rst)
+		dev_dbg(dev, "Power-on reset requested\n");
+
+	sja1105pqrs_reset_cmd_pack(packed_buf, reset);
+
+	return sja1105_spi_send_packed_buf(priv, SPI_WRITE, regs->rgu,
+					   packed_buf, 4);
+}
+
+static int sja1105_cold_reset(const struct sja1105_private *priv)
+{
+	struct sja1105_reset_cmd reset = {0};
+
+	reset.cold_rst = 1;
+	return priv->info->reset_cmd(priv, &reset);
+}
+
+struct sja1105_status {
+	u64 configs;
+	u64 crcchkl;
+	u64 ids;
+	u64 crcchkg;
+};
+
+/* This is not reading the entire General Status area, which is also
+ * divergent between E/T and P/Q/R/S, but only the relevant bits for
+ * ensuring that the static config upload procedure was successful.
+ */
+static void sja1105_status_unpack(void *buf, struct sja1105_status *status)
+{
+	/* So that pointer arithmetic advances in 4-byte steps */
+	u32 *p = (u32 *)buf;
+
+	memset(status, 0, sizeof(*status));
+	/* device_id is missing from the buffer, but we don't
+	 * want to diverge from the manual definition of the
+	 * register addresses, so we'll back off one step with
+	 * the register pointer, and never access p[0].
+	 */
+	p--;
+	sja1105_unpack(p + 0x1, &status->configs,   31, 31, 4);
+	sja1105_unpack(p + 0x1, &status->crcchkl,   30, 30, 4);
+	sja1105_unpack(p + 0x1, &status->ids,       29, 29, 4);
+	sja1105_unpack(p + 0x1, &status->crcchkg,   28, 28, 4);
+}
+
+static int sja1105_status_get(struct sja1105_private *priv,
+			      struct sja1105_status *status)
+{
+	const struct sja1105_regs *regs = priv->info->regs;
+	u8 packed_buf[4];
+	int rc;
+
+	rc = sja1105_spi_send_packed_buf(priv, SPI_READ,
+					 regs->status,
+					 packed_buf, 4);
+	if (rc < 0)
+		return rc;
+
+	sja1105_status_unpack(packed_buf, status);
+
+	return 0;
+}
+
+/* Not const because packing priv->static_config into buffers and preparing
+ * for upload requires the recalculation of table CRCs and updating the
+ * structures with these.
+ */
+static int
+static_config_buf_prepare_for_upload(struct sja1105_private *priv,
+				     void *config_buf, int buf_len)
+{
+	struct sja1105_static_config *config = &priv->static_config;
+	struct sja1105_table_header final_header;
+	sja1105_config_valid_t valid;
+	char *final_header_ptr;
+	int crc_len;
+
+	valid = sja1105_static_config_check_valid(config);
+	if (valid != SJA1105_CONFIG_OK) {
+		dev_err(&priv->spidev->dev, "%s\n",
+			sja1105_static_config_error_msg[valid]);
+		return -EINVAL;
+	}
+
+	/* Write Device ID and config tables to config_buf */
+	sja1105_static_config_pack(config_buf, config);
+	/* Recalculate CRC of the last header (right now 0xDEADBEEF).
+	 * Don't include the CRC field itself.
+	 */
+	crc_len = buf_len - 4;
+	/* Read the whole table header */
+	final_header_ptr = config_buf + buf_len - SIZE_TABLE_HEADER;
+	sja1105_table_header_packing(final_header_ptr, &final_header, UNPACK);
+	/* Modify */
+	final_header.crc = sja1105_crc32(config_buf, crc_len);
+	/* Rewrite */
+	sja1105_table_header_packing(final_header_ptr, &final_header, PACK);
+
+	return 0;
+}
+
+int sja1105_static_config_upload(struct sja1105_private *priv)
+{
+#define RETRIES 10
+	struct sja1105_static_config *config = &priv->static_config;
+	const struct sja1105_regs *regs = priv->info->regs;
+	struct device *dev = &priv->spidev->dev;
+	struct sja1105_status status = {0};
+	int rc, retries = RETRIES;
+	u8 *config_buf;
+	int buf_len;
+
+	buf_len = sja1105_static_config_get_length(config);
+	config_buf = kcalloc(buf_len, sizeof(char), GFP_KERNEL);
+	if (!config_buf)
+		return -ENOMEM;
+
+	rc = static_config_buf_prepare_for_upload(priv, config_buf, buf_len);
+	if (rc < 0) {
+		dev_err(dev, "Invalid config, cannot upload\n");
+		rc = -EINVAL;
+		goto out;
+	}
+	do {
+		/* Put the SJA1105 in programming mode */
+		rc = sja1105_cold_reset(priv);
+		if (rc < 0) {
+			dev_err(dev, "Failed to reset switch, retrying...\n");
+			continue;
+		}
+		/* Wait for the switch to come out of reset */
+		usleep_range(1000, 5000);
+		/* Upload the static config to the device */
+		rc = sja1105_spi_send_long_packed_buf(priv, SPI_WRITE,
+						      regs->config,
+						      config_buf, buf_len);
+		if (rc < 0) {
+			dev_err(dev, "Failed to upload config, retrying...\n");
+			continue;
+		}
+		/* Check that SJA1105 responded well to the config upload */
+		rc = sja1105_status_get(priv, &status);
+		if (rc < 0)
+			continue;
+
+		if (status.ids == 1) {
+			dev_err(dev, "Mismatch between hardware and static config device id. Wrote 0x%llx, wants 0x%llx\n",
+				config->device_id, priv->info->device_id);
+			continue;
+		}
+		if (status.crcchkl == 1) {
+			dev_err(dev, "Switch reported invalid local CRC on the uploaded config, retrying...\n");
+			continue;
+		}
+		if (status.crcchkg == 1) {
+			dev_err(dev, "Switch reported invalid global CRC on the uploaded config, retrying...\n");
+			continue;
+		}
+		if (status.configs == 0) {
+			dev_err(dev, "Switch reported that configuration is invalid, retrying...\n");
+			continue;
+		}
+	} while (--retries && (status.crcchkl == 1 || status.crcchkg == 1 ||
+		 status.configs == 0 || status.ids == 1));
+
+	if (!retries) {
+		rc = -EIO;
+		dev_err(dev, "Failed to upload config to device, giving up\n");
+		goto out;
+	} else if (retries != RETRIES - 1) {
+		dev_info(dev, "Succeeded after %d tries\n", RETRIES - retries);
+	}
+
+	dev_info(dev, "Reset switch and programmed static config\n");
+out:
+	kfree(config_buf);
+	return rc;
+#undef RETRIES
+}
+
+struct sja1105_regs sja1105et_regs = {
+	.device_id = 0x0,
+	.prod_id = 0x100BC3,
+	.status = 0x1,
+	.config = 0x020000,
+	.rgu = 0x100440,
+	.pad_mii_tx = {0x100800, 0x100802, 0x100804, 0x100806, 0x100808},
+	.rmii_pll1 = 0x10000A,
+	.cgu_idiv = {0x10000B, 0x10000C, 0x10000D, 0x10000E, 0x10000F},
+	/* UM10944.pdf, Table 86, ACU Register overview */
+	.rgmii_pad_mii_tx = {0x100800, 0x100802, 0x100804, 0x100806, 0x100808},
+	.mac = {0x200, 0x202, 0x204, 0x206, 0x208},
+	.mac_hl1 = {0x400, 0x410, 0x420, 0x430, 0x440},
+	.mac_hl2 = {0x600, 0x610, 0x620, 0x630, 0x640},
+	/* UM10944.pdf, Table 78, CGU Register overview */
+	.mii_tx_clk = {0x100013, 0x10001A, 0x100021, 0x100028, 0x10002F},
+	.mii_rx_clk = {0x100014, 0x10001B, 0x100022, 0x100029, 0x100030},
+	.mii_ext_tx_clk = {0x100018, 0x10001F, 0x100026, 0x10002D, 0x100034},
+	.mii_ext_rx_clk = {0x100019, 0x100020, 0x100027, 0x10002E, 0x100035},
+	.rgmii_tx_clk = {0x100016, 0x10001D, 0x100024, 0x10002B, 0x100032},
+	.rmii_ref_clk = {0x100015, 0x10001C, 0x100023, 0x10002A, 0x100031},
+	.rmii_ext_tx_clk = {0x100018, 0x10001F, 0x100026, 0x10002D, 0x100034},
+};
+
+struct sja1105_regs sja1105pqrs_regs = {
+	.device_id = 0x0,
+	.prod_id = 0x100BC3,
+	.status = 0x1,
+	.config = 0x020000,
+	.rgu = 0x100440,
+	.pad_mii_tx = {0x100800, 0x100802, 0x100804, 0x100806, 0x100808},
+	.rmii_pll1 = 0x10000A,
+	.cgu_idiv = {0x10000B, 0x10000C, 0x10000D, 0x10000E, 0x10000F},
+	/* UM10944.pdf, Table 86, ACU Register overview */
+	.rgmii_pad_mii_tx = {0x100800, 0x100802, 0x100804, 0x100806, 0x100808},
+	.mac = {0x200, 0x202, 0x204, 0x206, 0x208},
+	.mac_hl1 = {0x400, 0x410, 0x420, 0x430, 0x440},
+	.mac_hl2 = {0x600, 0x610, 0x620, 0x630, 0x640},
+	/* UM11040.pdf, Table 114 */
+	.mii_tx_clk = {0x100013, 0x100019, 0x10001F, 0x100025, 0x10002B},
+	.mii_rx_clk = {0x100014, 0x10001A, 0x100020, 0x100026, 0x10002C},
+	.mii_ext_tx_clk = {0x100017, 0x10001D, 0x100023, 0x100029, 0x10002F},
+	.mii_ext_rx_clk = {0x100018, 0x10001E, 0x100024, 0x10002A, 0x100030},
+	.rgmii_tx_clk = {0x100016, 0x10001C, 0x100022, 0x100028, 0x10002E},
+	.rmii_ref_clk = {0x100015, 0x10001B, 0x100021, 0x100027, 0x10002D},
+	.rmii_ext_tx_clk = {0x100017, 0x10001D, 0x100023, 0x100029, 0x10002F},
+	.qlevel = {0x604, 0x614, 0x624, 0x634, 0x644},
+};
+
+struct sja1105_info sja1105e_info = {
+	.device_id		= SJA1105E_DEVICE_ID,
+	.part_no		= SJA1105ET_PART_NO,
+	.static_ops		= sja1105e_table_ops,
+	.dyn_ops		= sja1105et_dyn_ops,
+	.reset_cmd		= sja1105et_reset_cmd,
+	.regs			= &sja1105et_regs,
+	.name			= "SJA1105E",
+};
+struct sja1105_info sja1105t_info = {
+	.device_id		= SJA1105T_DEVICE_ID,
+	.part_no		= SJA1105ET_PART_NO,
+	.static_ops		= sja1105t_table_ops,
+	.dyn_ops		= sja1105et_dyn_ops,
+	.reset_cmd		= sja1105et_reset_cmd,
+	.regs			= &sja1105et_regs,
+	.name			= "SJA1105T",
+};
+struct sja1105_info sja1105p_info = {
+	.device_id		= SJA1105PR_DEVICE_ID,
+	.part_no		= SJA1105P_PART_NO,
+	.static_ops		= sja1105p_table_ops,
+	.dyn_ops		= sja1105pqrs_dyn_ops,
+	.reset_cmd		= sja1105pqrs_reset_cmd,
+	.regs			= &sja1105pqrs_regs,
+	.name			= "SJA1105P",
+};
+struct sja1105_info sja1105q_info = {
+	.device_id		= SJA1105QS_DEVICE_ID,
+	.part_no		= SJA1105Q_PART_NO,
+	.static_ops		= sja1105q_table_ops,
+	.dyn_ops		= sja1105pqrs_dyn_ops,
+	.reset_cmd		= sja1105pqrs_reset_cmd,
+	.regs			= &sja1105pqrs_regs,
+	.name			= "SJA1105Q",
+};
+struct sja1105_info sja1105r_info = {
+	.device_id		= SJA1105PR_DEVICE_ID,
+	.part_no		= SJA1105R_PART_NO,
+	.static_ops		= sja1105r_table_ops,
+	.dyn_ops		= sja1105pqrs_dyn_ops,
+	.reset_cmd		= sja1105pqrs_reset_cmd,
+	.regs			= &sja1105pqrs_regs,
+	.name			= "SJA1105R",
+};
+struct sja1105_info sja1105s_info = {
+	.device_id		= SJA1105QS_DEVICE_ID,
+	.part_no		= SJA1105S_PART_NO,
+	.static_ops		= sja1105s_table_ops,
+	.dyn_ops		= sja1105pqrs_dyn_ops,
+	.regs			= &sja1105pqrs_regs,
+	.reset_cmd		= sja1105pqrs_reset_cmd,
+	.name			= "SJA1105S",
+};
diff --git a/drivers/net/dsa/sja1105/sja1105_static_config.c b/drivers/net/dsa/sja1105/sja1105_static_config.c
new file mode 100644
index 000000000000..ae5c2551ad90
--- /dev/null
+++ b/drivers/net/dsa/sja1105/sja1105_static_config.c
@@ -0,0 +1,1004 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/* Copyright (c) 2016-2018, NXP Semiconductors
+ * Copyright (c) 2018-2019, Vladimir Oltean <olteanv@gmail.com>
+ */
+#include "sja1105_static_config.h"
+#include <linux/crc32.h>
+#include <linux/slab.h>
+#include <linux/string.h>
+#include <linux/errno.h>
+
+/* Convenience wrappers over the generic packing functions. These take into
+ * account the SJA1105 memory layout quirks and provide some level of
+ * programmer protection against incorrect API use. The errors are not expected
+ * to occur at runtime, therefore printing and swallowing them here is
+ * appropriate instead of cluttering up higher-level code.
+ */
+void sja1105_pack(void *buf, const u64 *val, int start, int end, size_t len)
+{
+	int rc = packing(buf, (u64 *)val, start, end, len,
+			 PACK, QUIRK_LSW32_IS_FIRST);
+
+	if (likely(!rc))
+		return;
+
+	if (rc == -EINVAL) {
+		pr_err("Start bit (%d) expected to be larger than end (%d)\n",
+		       start, end);
+	} else if (rc == -ERANGE) {
+		if ((start - end + 1) > 64)
+			pr_err("Field %d-%d too large for 64 bits!\n",
+			       start, end);
+		else
+			pr_err("Cannot store %llx inside bits %d-%d (would truncate)\n",
+			       *val, start, end);
+	}
+	dump_stack();
+}
+
+void sja1105_unpack(const void *buf, u64 *val, int start, int end, size_t len)
+{
+	int rc = packing((void *)buf, val, start, end, len,
+			 UNPACK, QUIRK_LSW32_IS_FIRST);
+
+	if (likely(!rc))
+		return;
+
+	if (rc == -EINVAL)
+		pr_err("Start bit (%d) expected to be larger than end (%d)\n",
+		       start, end);
+	else if (rc == -ERANGE)
+		pr_err("Field %d-%d too large for 64 bits!\n",
+		       start, end);
+	dump_stack();
+}
+
+void sja1105_packing(void *buf, u64 *val, int start, int end,
+		     size_t len, enum packing_op op)
+{
+	int rc = packing(buf, val, start, end, len, op, QUIRK_LSW32_IS_FIRST);
+
+	if (likely(!rc))
+		return;
+
+	if (rc == -EINVAL) {
+		pr_err("Start bit (%d) expected to be larger than end (%d)\n",
+		       start, end);
+	} else if (rc == -ERANGE) {
+		if ((start - end + 1) > 64)
+			pr_err("Field %d-%d too large for 64 bits!\n",
+			       start, end);
+		else
+			pr_err("Cannot store %llx inside bits %d-%d (would truncate)\n",
+			       *val, start, end);
+	}
+	dump_stack();
+}
+
+/* Little-endian Ethernet CRC32 of data packed as big-endian u32 words */
+u32 sja1105_crc32(const void *buf, size_t len)
+{
+	unsigned int i;
+	u64 word;
+	u32 crc;
+
+	/* seed */
+	crc = ~0;
+	for (i = 0; i < len; i += 4) {
+		sja1105_unpack((void *)buf + i, &word, 31, 0, 4);
+		crc = crc32_le(crc, (u8 *)&word, 4);
+	}
+	return ~crc;
+}
+
+static size_t sja1105et_general_params_entry_packing(void *buf, void *entry_ptr,
+						     enum packing_op op)
+{
+	const size_t size = SIZE_GENERAL_PARAMS_ENTRY_ET;
+	struct sja1105_general_params_entry *entry;
+
+	entry = (struct sja1105_general_params_entry *)entry_ptr;
+
+	sja1105_packing(buf, &entry->vllupformat, 319, 319, size, op);
+	sja1105_packing(buf, &entry->mirr_ptacu,  318, 318, size, op);
+	sja1105_packing(buf, &entry->switchid,    317, 315, size, op);
+	sja1105_packing(buf, &entry->hostprio,    314, 312, size, op);
+	sja1105_packing(buf, &entry->mac_fltres1, 311, 264, size, op);
+	sja1105_packing(buf, &entry->mac_fltres0, 263, 216, size, op);
+	sja1105_packing(buf, &entry->mac_flt1,    215, 168, size, op);
+	sja1105_packing(buf, &entry->mac_flt0,    167, 120, size, op);
+	sja1105_packing(buf, &entry->incl_srcpt1, 119, 119, size, op);
+	sja1105_packing(buf, &entry->incl_srcpt0, 118, 118, size, op);
+	sja1105_packing(buf, &entry->send_meta1,  117, 117, size, op);
+	sja1105_packing(buf, &entry->send_meta0,  116, 116, size, op);
+	sja1105_packing(buf, &entry->casc_port,   115, 113, size, op);
+	sja1105_packing(buf, &entry->host_port,   112, 110, size, op);
+	sja1105_packing(buf, &entry->mirr_port,   109, 107, size, op);
+	sja1105_packing(buf, &entry->vlmarker,    106,  75, size, op);
+	sja1105_packing(buf, &entry->vlmask,       74,  43, size, op);
+	sja1105_packing(buf, &entry->tpid,         42,  27, size, op);
+	sja1105_packing(buf, &entry->ignore2stf,   26,  26, size, op);
+	sja1105_packing(buf, &entry->tpid2,        25,  10, size, op);
+	return size;
+}
+
+static size_t
+sja1105pqrs_general_params_entry_packing(void *buf, void *entry_ptr,
+					 enum packing_op op)
+{
+	const size_t size = SIZE_GENERAL_PARAMS_ENTRY_PQRS;
+	struct sja1105_general_params_entry *entry;
+
+	entry = (struct sja1105_general_params_entry *)entry_ptr;
+
+	sja1105_packing(buf, &entry->vllupformat, 351, 351, size, op);
+	sja1105_packing(buf, &entry->mirr_ptacu,  350, 350, size, op);
+	sja1105_packing(buf, &entry->switchid,    349, 347, size, op);
+	sja1105_packing(buf, &entry->hostprio,    346, 344, size, op);
+	sja1105_packing(buf, &entry->mac_fltres1, 343, 296, size, op);
+	sja1105_packing(buf, &entry->mac_fltres0, 295, 248, size, op);
+	sja1105_packing(buf, &entry->mac_flt1,    247, 200, size, op);
+	sja1105_packing(buf, &entry->mac_flt0,    199, 152, size, op);
+	sja1105_packing(buf, &entry->incl_srcpt1, 151, 151, size, op);
+	sja1105_packing(buf, &entry->incl_srcpt0, 150, 150, size, op);
+	sja1105_packing(buf, &entry->send_meta1,  149, 149, size, op);
+	sja1105_packing(buf, &entry->send_meta0,  148, 148, size, op);
+	sja1105_packing(buf, &entry->casc_port,   147, 145, size, op);
+	sja1105_packing(buf, &entry->host_port,   144, 142, size, op);
+	sja1105_packing(buf, &entry->mirr_port,   141, 139, size, op);
+	sja1105_packing(buf, &entry->vlmarker,    138, 107, size, op);
+	sja1105_packing(buf, &entry->vlmask,      106,  75, size, op);
+	sja1105_packing(buf, &entry->tpid,         74,  59, size, op);
+	sja1105_packing(buf, &entry->ignore2stf,   58,  58, size, op);
+	sja1105_packing(buf, &entry->tpid2,        57,  42, size, op);
+	sja1105_packing(buf, &entry->queue_ts,     41,  41, size, op);
+	sja1105_packing(buf, &entry->egrmirrvid,   40,  29, size, op);
+	sja1105_packing(buf, &entry->egrmirrpcp,   28,  26, size, op);
+	sja1105_packing(buf, &entry->egrmirrdei,   25,  25, size, op);
+	sja1105_packing(buf, &entry->replay_port,  24,  22, size, op);
+	return size;
+}
+
+static size_t
+sja1105_l2_forwarding_params_entry_packing(void *buf, void *entry_ptr,
+					   enum packing_op op)
+{
+	const size_t size = SIZE_L2_FORWARDING_PARAMS_ENTRY;
+	struct sja1105_l2_forwarding_params_entry *entry;
+	int offset, i;
+
+	entry = (struct sja1105_l2_forwarding_params_entry *)entry_ptr;
+
+	sja1105_packing(buf, &entry->max_dynp, 95, 93, size, op);
+	for (i = 0, offset = 13; i < 8; i++, offset += 10)
+		sja1105_packing(buf, &entry->part_spc[i],
+				offset + 9, offset + 0, size, op);
+	return size;
+}
+
+size_t sja1105_l2_forwarding_entry_packing(void *buf, void *entry_ptr,
+					   enum packing_op op)
+{
+	const size_t size = SIZE_L2_FORWARDING_ENTRY;
+	struct sja1105_l2_forwarding_entry *entry;
+	int offset, i;
+
+	entry = (struct sja1105_l2_forwarding_entry *)entry_ptr;
+
+	sja1105_packing(buf, &entry->bc_domain,  63, 59, size, op);
+	sja1105_packing(buf, &entry->reach_port, 58, 54, size, op);
+	sja1105_packing(buf, &entry->fl_domain,  53, 49, size, op);
+	for (i = 0, offset = 25; i < 8; i++, offset += 3)
+		sja1105_packing(buf, &entry->vlan_pmap[i],
+				offset + 2, offset + 0, size, op);
+	return size;
+}
+
+static size_t
+sja1105et_l2_lookup_params_entry_packing(void *buf, void *entry_ptr,
+					 enum packing_op op)
+{
+	const size_t size = SIZE_L2_LOOKUP_PARAMS_ENTRY_ET;
+	struct sja1105_l2_lookup_params_entry *entry;
+
+	entry = (struct sja1105_l2_lookup_params_entry *)entry_ptr;
+
+	sja1105_packing(buf, &entry->maxage,         31, 17, size, op);
+	sja1105_packing(buf, &entry->dyn_tbsz,       16, 14, size, op);
+	sja1105_packing(buf, &entry->poly,           13,  6, size, op);
+	sja1105_packing(buf, &entry->shared_learn,    5,  5, size, op);
+	sja1105_packing(buf, &entry->no_enf_hostprt,  4,  4, size, op);
+	sja1105_packing(buf, &entry->no_mgmt_learn,   3,  3, size, op);
+	return size;
+}
+
+static size_t
+sja1105pqrs_l2_lookup_params_entry_packing(void *buf, void *entry_ptr,
+					   enum packing_op op)
+{
+	const size_t size = SIZE_L2_LOOKUP_PARAMS_ENTRY_PQRS;
+	struct sja1105_l2_lookup_params_entry *entry;
+	int offset, i;
+
+	entry = (struct sja1105_l2_lookup_params_entry *)entry_ptr;
+
+	sja1105_packing(buf, &entry->drpbc,         127, 123, size, op);
+	sja1105_packing(buf, &entry->drpmc,         122, 118, size, op);
+	sja1105_packing(buf, &entry->drpuni,        117, 113, size, op);
+	for (i = 0, offset = 58; i < 5; i++, offset += 11)
+		sja1105_packing(buf, &entry->maxaddrp[i],
+				offset + 10, offset + 0, size, op);
+	sja1105_packing(buf, &entry->maxage,         57,  43, size, op);
+	sja1105_packing(buf, &entry->start_dynspc,   42,  33, size, op);
+	sja1105_packing(buf, &entry->drpnolearn,     32,  28, size, op);
+	sja1105_packing(buf, &entry->shared_learn,   27,  27, size, op);
+	sja1105_packing(buf, &entry->no_enf_hostprt, 26,  26, size, op);
+	sja1105_packing(buf, &entry->no_mgmt_learn,  25,  25, size, op);
+	sja1105_packing(buf, &entry->use_static,     24,  24, size, op);
+	sja1105_packing(buf, &entry->owr_dyn,        23,  23, size, op);
+	sja1105_packing(buf, &entry->learn_once,     22,  22, size, op);
+	return size;
+}
+
+size_t sja1105et_l2_lookup_entry_packing(void *buf, void *entry_ptr,
+					 enum packing_op op)
+{
+	const size_t size = SIZE_L2_LOOKUP_ENTRY_ET;
+	struct sja1105_l2_lookup_entry *entry;
+
+	entry = (struct sja1105_l2_lookup_entry *)entry_ptr;
+
+	sja1105_packing(buf, &entry->vlanid,    95, 84, size, op);
+	sja1105_packing(buf, &entry->macaddr,   83, 36, size, op);
+	sja1105_packing(buf, &entry->destports, 35, 31, size, op);
+	sja1105_packing(buf, &entry->enfport,   30, 30, size, op);
+	sja1105_packing(buf, &entry->index,     29, 20, size, op);
+	return size;
+}
+
+size_t sja1105pqrs_l2_lookup_entry_packing(void *buf, void *entry_ptr,
+					   enum packing_op op)
+{
+	const size_t size = SIZE_L2_LOOKUP_ENTRY_PQRS;
+	struct sja1105_l2_lookup_entry *entry;
+
+	entry = (struct sja1105_l2_lookup_entry *)entry_ptr;
+
+	/* These are static L2 lookup entries, so the structure
+	 * should match UM11040 Table 16/17 definitions when
+	 * LOCKEDS is 1.
+	 */
+	sja1105_packing(buf, &entry->mirrvlan,     158, 147, size, op);
+	sja1105_packing(buf, &entry->mirr,         145, 145, size, op);
+	sja1105_packing(buf, &entry->retag,        144, 144, size, op);
+	sja1105_packing(buf, &entry->mask_iotag,   143, 143, size, op);
+	sja1105_packing(buf, &entry->mask_vlanid,  142, 131, size, op);
+	sja1105_packing(buf, &entry->mask_macaddr, 130,  83, size, op);
+	sja1105_packing(buf, &entry->iotag,         82,  82, size, op);
+	sja1105_packing(buf, &entry->vlanid,        81,  70, size, op);
+	sja1105_packing(buf, &entry->macaddr,       69,  22, size, op);
+	sja1105_packing(buf, &entry->destports,     21,  17, size, op);
+	sja1105_packing(buf, &entry->enfport,       16,  16, size, op);
+	sja1105_packing(buf, &entry->index,         15,   6, size, op);
+	return size;
+}
+
+static size_t sja1105_l2_policing_entry_packing(void *buf, void *entry_ptr,
+						enum packing_op op)
+{
+	const size_t size = SIZE_L2_POLICING_ENTRY;
+	struct sja1105_l2_policing_entry *entry;
+
+	entry = (struct sja1105_l2_policing_entry *)entry_ptr;
+
+	sja1105_packing(buf, &entry->sharindx,  63, 58, size, op);
+	sja1105_packing(buf, &entry->smax,      57, 42, size, op);
+	sja1105_packing(buf, &entry->rate,      41, 26, size, op);
+	sja1105_packing(buf, &entry->maxlen,    25, 15, size, op);
+	sja1105_packing(buf, &entry->partition, 14, 12, size, op);
+	return size;
+}
+
+static size_t sja1105et_mac_config_entry_packing(void *buf, void *entry_ptr,
+						 enum packing_op op)
+{
+	const size_t size = SIZE_MAC_CONFIG_ENTRY_ET;
+	struct sja1105_mac_config_entry *entry;
+	int offset, i;
+
+	entry = (struct sja1105_mac_config_entry *)entry_ptr;
+
+	for (i = 0, offset = 72; i < 8; i++, offset += 19) {
+		sja1105_packing(buf, &entry->enabled[i],
+				offset +  0, offset +  0, size, op);
+		sja1105_packing(buf, &entry->base[i],
+				offset +  9, offset +  1, size, op);
+		sja1105_packing(buf, &entry->top[i],
+				offset + 18, offset + 10, size, op);
+	}
+	sja1105_packing(buf, &entry->ifg,       71, 67, size, op);
+	sja1105_packing(buf, &entry->speed,     66, 65, size, op);
+	sja1105_packing(buf, &entry->tp_delin,  64, 49, size, op);
+	sja1105_packing(buf, &entry->tp_delout, 48, 33, size, op);
+	sja1105_packing(buf, &entry->maxage,    32, 25, size, op);
+	sja1105_packing(buf, &entry->vlanprio,  24, 22, size, op);
+	sja1105_packing(buf, &entry->vlanid,    21, 10, size, op);
+	sja1105_packing(buf, &entry->ing_mirr,   9,  9, size, op);
+	sja1105_packing(buf, &entry->egr_mirr,   8,  8, size, op);
+	sja1105_packing(buf, &entry->drpnona664, 7,  7, size, op);
+	sja1105_packing(buf, &entry->drpdtag,    6,  6, size, op);
+	sja1105_packing(buf, &entry->drpuntag,   5,  5, size, op);
+	sja1105_packing(buf, &entry->retag,      4,  4, size, op);
+	sja1105_packing(buf, &entry->dyn_learn,  3,  3, size, op);
+	sja1105_packing(buf, &entry->egress,     2,  2, size, op);
+	sja1105_packing(buf, &entry->ingress,    1,  1, size, op);
+	return size;
+}
+
+size_t sja1105pqrs_mac_config_entry_packing(void *buf, void *entry_ptr,
+					    enum packing_op op)
+{
+	const size_t size = SIZE_MAC_CONFIG_ENTRY_PQRS;
+	struct sja1105_mac_config_entry *entry;
+	int offset, i;
+
+	entry = (struct sja1105_mac_config_entry *)entry_ptr;
+
+	for (i = 0, offset = 104; i < 8; i++, offset += 19) {
+		sja1105_packing(buf, &entry->enabled[i],
+				offset +  0, offset +  0, size, op);
+		sja1105_packing(buf, &entry->base[i],
+				offset +  9, offset +  1, size, op);
+		sja1105_packing(buf, &entry->top[i],
+				offset + 18, offset + 10, size, op);
+	}
+	sja1105_packing(buf, &entry->ifg,       103, 99, size, op);
+	sja1105_packing(buf, &entry->speed,      98, 97, size, op);
+	sja1105_packing(buf, &entry->tp_delin,   96, 81, size, op);
+	sja1105_packing(buf, &entry->tp_delout,  80, 65, size, op);
+	sja1105_packing(buf, &entry->maxage,     64, 57, size, op);
+	sja1105_packing(buf, &entry->vlanprio,   56, 54, size, op);
+	sja1105_packing(buf, &entry->vlanid,     53, 42, size, op);
+	sja1105_packing(buf, &entry->ing_mirr,   41, 41, size, op);
+	sja1105_packing(buf, &entry->egr_mirr,   40, 40, size, op);
+	sja1105_packing(buf, &entry->drpnona664, 39, 39, size, op);
+	sja1105_packing(buf, &entry->drpdtag,    38, 38, size, op);
+	sja1105_packing(buf, &entry->drpsotag,   37, 37, size, op);
+	sja1105_packing(buf, &entry->drpsitag,   36, 36, size, op);
+	sja1105_packing(buf, &entry->drpuntag,   35, 35, size, op);
+	sja1105_packing(buf, &entry->retag,      34, 34, size, op);
+	sja1105_packing(buf, &entry->dyn_learn,  33, 33, size, op);
+	sja1105_packing(buf, &entry->egress,     32, 32, size, op);
+	sja1105_packing(buf, &entry->ingress,    31, 31, size, op);
+	sja1105_packing(buf, &entry->mirrcie,    30, 30, size, op);
+	sja1105_packing(buf, &entry->mirrcetag,  29, 29, size, op);
+	sja1105_packing(buf, &entry->ingmirrvid, 28, 17, size, op);
+	sja1105_packing(buf, &entry->ingmirrpcp, 16, 14, size, op);
+	sja1105_packing(buf, &entry->ingmirrdei, 13, 13, size, op);
+	return size;
+}
+
+size_t sja1105_vlan_lookup_entry_packing(void *buf, void *entry_ptr,
+					 enum packing_op op)
+{
+	const size_t size = SIZE_VLAN_LOOKUP_ENTRY;
+	struct sja1105_vlan_lookup_entry *entry;
+
+	entry = (struct sja1105_vlan_lookup_entry *)entry_ptr;
+
+	sja1105_packing(buf, &entry->ving_mirr,  63, 59, size, op);
+	sja1105_packing(buf, &entry->vegr_mirr,  58, 54, size, op);
+	sja1105_packing(buf, &entry->vmemb_port, 53, 49, size, op);
+	sja1105_packing(buf, &entry->vlan_bc,    48, 44, size, op);
+	sja1105_packing(buf, &entry->tag_port,   43, 39, size, op);
+	sja1105_packing(buf, &entry->vlanid,     38, 27, size, op);
+	return size;
+}
+
+static size_t sja1105_xmii_params_entry_packing(void *buf, void *entry_ptr,
+						enum packing_op op)
+{
+	const size_t size = SIZE_XMII_PARAMS_ENTRY;
+	struct sja1105_xmii_params_entry *entry;
+	int offset, i;
+
+	entry = (struct sja1105_xmii_params_entry *)entry_ptr;
+
+	for (i = 0, offset = 17; i < 5; i++, offset += 3) {
+		sja1105_packing(buf, &entry->xmii_mode[i],
+				offset + 1, offset + 0, size, op);
+		sja1105_packing(buf, &entry->phy_mac[i],
+				offset + 2, offset + 2, size, op);
+	}
+	return size;
+}
+
+size_t sja1105_table_header_packing(void *buf, void *entry_ptr,
+				    enum packing_op op)
+{
+	const size_t size = SIZE_TABLE_HEADER;
+	struct sja1105_table_header *entry;
+
+	entry = (struct sja1105_table_header *)entry_ptr;
+
+	sja1105_packing(buf, &entry->block_id, 31, 24, size, op);
+	sja1105_packing(buf, &entry->len,      55, 32, size, op);
+	sja1105_packing(buf, &entry->crc,      95, 64, size, op);
+	return size;
+}
+
+/* WARNING: the *hdr pointer is really non-const, because the
+ * function updates the header's CRC in place as part of the 2-stage
+ * packing operation
+ */
+void
+sja1105_table_header_pack_with_crc(void *buf, struct sja1105_table_header *hdr)
+{
+	/* First pack the table as-is, then calculate the CRC, and
+	 * finally put the proper CRC into the packed buffer
+	 */
+	memset(buf, 0, SIZE_TABLE_HEADER);
+	sja1105_table_header_packing(buf, hdr, PACK);
+	hdr->crc = sja1105_crc32(buf, SIZE_TABLE_HEADER - 4);
+	sja1105_pack(buf + SIZE_TABLE_HEADER - 4, &hdr->crc, 31, 0, 4);
+}
+
+static void sja1105_table_write_crc(u8 *table_start, u8 *crc_ptr)
+{
+	u64 computed_crc;
+	int len_bytes;
+
+	len_bytes = (uintptr_t)(crc_ptr - table_start);
+	computed_crc = sja1105_crc32(table_start, len_bytes);
+	sja1105_pack(crc_ptr, &computed_crc, 31, 0, 4);
+}
+
+/* The block IDs that the switches support are unfortunately sparse, so
+ * keep a mapping table of contiguous "block indices" and translate back
+ * and forth, so that we don't waste memory in struct sja1105_static_config.
+ * Also, since the block ID comes from essentially untrusted input
+ * (unpacking the static config from userspace), it has to be sanitized
+ * (range-checked) before being used to index kernel memory via blk_idx.
+ */
+static u64 blk_id_map[BLK_IDX_MAX] = {
+	[BLK_IDX_L2_LOOKUP] = BLKID_L2_LOOKUP,
+	[BLK_IDX_L2_POLICING] = BLKID_L2_POLICING,
+	[BLK_IDX_VLAN_LOOKUP] = BLKID_VLAN_LOOKUP,
+	[BLK_IDX_L2_FORWARDING] = BLKID_L2_FORWARDING,
+	[BLK_IDX_MAC_CONFIG] = BLKID_MAC_CONFIG,
+	[BLK_IDX_L2_LOOKUP_PARAMS] = BLKID_L2_LOOKUP_PARAMS,
+	[BLK_IDX_L2_FORWARDING_PARAMS] = BLKID_L2_FORWARDING_PARAMS,
+	[BLK_IDX_GENERAL_PARAMS] = BLKID_GENERAL_PARAMS,
+	[BLK_IDX_XMII_PARAMS] = BLKID_XMII_PARAMS,
+};
+
+const char *sja1105_static_config_error_msg[] = {
+	[SJA1105_CONFIG_OK] = "",
+	[SJA1105_MISSING_L2_POLICING_TABLE] =
+		"l2-policing-table needs to have at least one entry",
+	[SJA1105_MISSING_L2_FORWARDING_TABLE] =
+		"l2-forwarding-table is either missing or incomplete",
+	[SJA1105_MISSING_L2_FORWARDING_PARAMS_TABLE] =
+		"l2-forwarding-parameters-table is missing",
+	[SJA1105_MISSING_GENERAL_PARAMS_TABLE] =
+		"general-parameters-table is missing",
+	[SJA1105_MISSING_VLAN_TABLE] =
+		"vlan-lookup-table needs to have at least the default untagged VLAN",
+	[SJA1105_MISSING_XMII_TABLE] =
+		"xmii-table is missing",
+	[SJA1105_MISSING_MAC_TABLE] =
+		"mac-configuration-table needs to contain an entry for each port",
+	[SJA1105_OVERCOMMITTED_FRAME_MEMORY] =
+		"Not allowed to overcommit frame memory. L2 memory partitions "
+		"and VL memory partitions share the same space. The sum of all "
+		"16 memory partitions is not allowed to be larger than 929 "
+		"128-byte blocks (or 910 with retagging). Please adjust "
+		"l2-forwarding-parameters-table.part_spc and/or "
+		"vl-forwarding-parameters-table.partspc.",
+};
+
+static sja1105_config_valid_t
+static_config_check_memory_size(const struct sja1105_table *tables)
+{
+	const struct sja1105_l2_forwarding_params_entry *l2_fwd_params;
+	int i, mem = 0;
+
+	l2_fwd_params = tables[BLK_IDX_L2_FORWARDING_PARAMS].entries;
+
+	for (i = 0; i < 8; i++)
+		mem += l2_fwd_params->part_spc[i];
+
+	if (mem > MAX_FRAME_MEMORY)
+		return SJA1105_OVERCOMMITTED_FRAME_MEMORY;
+
+	return SJA1105_CONFIG_OK;
+}
+
+sja1105_config_valid_t
+sja1105_static_config_check_valid(const struct sja1105_static_config *config)
+{
+	const struct sja1105_table *tables = config->tables;
+#define IS_FULL(blk_idx) \
+	(tables[blk_idx].entry_count == tables[blk_idx].ops->max_entry_count)
+
+	if (tables[BLK_IDX_L2_POLICING].entry_count == 0)
+		return SJA1105_MISSING_L2_POLICING_TABLE;
+
+	if (tables[BLK_IDX_VLAN_LOOKUP].entry_count == 0)
+		return SJA1105_MISSING_VLAN_TABLE;
+
+	if (!IS_FULL(BLK_IDX_L2_FORWARDING))
+		return SJA1105_MISSING_L2_FORWARDING_TABLE;
+
+	if (!IS_FULL(BLK_IDX_MAC_CONFIG))
+		return SJA1105_MISSING_MAC_TABLE;
+
+	if (!IS_FULL(BLK_IDX_L2_FORWARDING_PARAMS))
+		return SJA1105_MISSING_L2_FORWARDING_PARAMS_TABLE;
+
+	if (!IS_FULL(BLK_IDX_GENERAL_PARAMS))
+		return SJA1105_MISSING_GENERAL_PARAMS_TABLE;
+
+	if (!IS_FULL(BLK_IDX_XMII_PARAMS))
+		return SJA1105_MISSING_XMII_TABLE;
+
+	return static_config_check_memory_size(tables);
+#undef IS_FULL
+}
+
+void
+sja1105_static_config_pack(void *buf, struct sja1105_static_config *config)
+{
+	struct sja1105_table_header header = {0};
+	enum sja1105_blk_idx i;
+	char *p = buf;
+	int j;
+
+	sja1105_pack(p, &config->device_id, 31, 0, 4);
+	p += SIZE_SJA1105_DEVICE_ID;
+
+	for (i = 0; i < BLK_IDX_MAX; i++) {
+		const struct sja1105_table *table;
+		char *table_start;
+
+		table = &config->tables[i];
+		if (!table->entry_count)
+			continue;
+
+		header.block_id = blk_id_map[i];
+		header.len = table->entry_count *
+			     table->ops->packed_entry_size / 4;
+		sja1105_table_header_pack_with_crc(p, &header);
+		p += SIZE_TABLE_HEADER;
+		table_start = p;
+		for (j = 0; j < table->entry_count; j++) {
+			u8 *entry_ptr = table->entries;
+
+			entry_ptr += j * table->ops->unpacked_entry_size;
+			memset(p, 0, table->ops->packed_entry_size);
+			table->ops->packing(p, entry_ptr, PACK);
+			p += table->ops->packed_entry_size;
+		}
+		sja1105_table_write_crc(table_start, p);
+		p += 4;
+	}
+	/* Final header:
+	 * Block ID does not matter
+	 * Length of 0 marks that header is final
+	 * CRC will be replaced on-the-fly on "config upload"
+	 */
+	header.block_id = 0;
+	header.len = 0;
+	header.crc = 0xDEADBEEF;
+	memset(p, 0, SIZE_TABLE_HEADER);
+	sja1105_table_header_packing(p, &header, PACK);
+}
+
+size_t
+sja1105_static_config_get_length(const struct sja1105_static_config *config)
+{
+	unsigned int sum;
+	unsigned int header_count;
+	enum sja1105_blk_idx i;
+
+	/* Ending header */
+	header_count = 1;
+	sum = SIZE_SJA1105_DEVICE_ID;
+
+	/* Tables (headers and entries) */
+	for (i = 0; i < BLK_IDX_MAX; i++) {
+		const struct sja1105_table *table;
+
+		table = &config->tables[i];
+		if (table->entry_count)
+			header_count++;
+
+		sum += table->ops->packed_entry_size * table->entry_count;
+	}
+	/* Headers have an additional CRC at the end */
+	sum += header_count * (SIZE_TABLE_HEADER + 4);
+	/* Last header does not have an extra CRC because there is no data */
+	sum -= 4;
+
+	return sum;
+}
+
+/* Compatibility matrices */
+
+/* SJA1105E: First generation, no TTEthernet */
+struct sja1105_table_ops sja1105e_table_ops[BLK_IDX_MAX] = {
+	[BLK_IDX_L2_LOOKUP] = {
+		.packing = sja1105et_l2_lookup_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_lookup_entry),
+		.packed_entry_size = SIZE_L2_LOOKUP_ENTRY_ET,
+		.max_entry_count = MAX_L2_LOOKUP_COUNT,
+	},
+	[BLK_IDX_L2_POLICING] = {
+		.packing = sja1105_l2_policing_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_policing_entry),
+		.packed_entry_size = SIZE_L2_POLICING_ENTRY,
+		.max_entry_count = MAX_L2_POLICING_COUNT,
+	},
+	[BLK_IDX_VLAN_LOOKUP] = {
+		.packing = sja1105_vlan_lookup_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_vlan_lookup_entry),
+		.packed_entry_size = SIZE_VLAN_LOOKUP_ENTRY,
+		.max_entry_count = MAX_VLAN_LOOKUP_COUNT,
+	},
+	[BLK_IDX_L2_FORWARDING] = {
+		.packing = sja1105_l2_forwarding_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_forwarding_entry),
+		.packed_entry_size = SIZE_L2_FORWARDING_ENTRY,
+		.max_entry_count = MAX_L2_FORWARDING_COUNT,
+	},
+	[BLK_IDX_MAC_CONFIG] = {
+		.packing = sja1105et_mac_config_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_mac_config_entry),
+		.packed_entry_size = SIZE_MAC_CONFIG_ENTRY_ET,
+		.max_entry_count = MAX_MAC_CONFIG_COUNT,
+	},
+	[BLK_IDX_L2_LOOKUP_PARAMS] = {
+		.packing = sja1105et_l2_lookup_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_lookup_params_entry),
+		.packed_entry_size = SIZE_L2_LOOKUP_PARAMS_ENTRY_ET,
+		.max_entry_count = MAX_L2_LOOKUP_PARAMS_COUNT,
+	},
+	[BLK_IDX_L2_FORWARDING_PARAMS] = {
+		.packing = sja1105_l2_forwarding_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_forwarding_params_entry),
+		.packed_entry_size = SIZE_L2_FORWARDING_PARAMS_ENTRY,
+		.max_entry_count = MAX_L2_FORWARDING_PARAMS_COUNT,
+	},
+	[BLK_IDX_GENERAL_PARAMS] = {
+		.packing = sja1105et_general_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_general_params_entry),
+		.packed_entry_size = SIZE_GENERAL_PARAMS_ENTRY_ET,
+		.max_entry_count = MAX_GENERAL_PARAMS_COUNT,
+	},
+	[BLK_IDX_XMII_PARAMS] = {
+		.packing = sja1105_xmii_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_xmii_params_entry),
+		.packed_entry_size = SIZE_XMII_PARAMS_ENTRY,
+		.max_entry_count = MAX_XMII_PARAMS_COUNT,
+	},
+};
+
+/* SJA1105T: First generation, TTEthernet */
+struct sja1105_table_ops sja1105t_table_ops[BLK_IDX_MAX] = {
+	[BLK_IDX_L2_LOOKUP] = {
+		.packing = sja1105et_l2_lookup_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_lookup_entry),
+		.packed_entry_size = SIZE_L2_LOOKUP_ENTRY_ET,
+		.max_entry_count = MAX_L2_LOOKUP_COUNT,
+	},
+	[BLK_IDX_L2_POLICING] = {
+		.packing = sja1105_l2_policing_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_policing_entry),
+		.packed_entry_size = SIZE_L2_POLICING_ENTRY,
+		.max_entry_count = MAX_L2_POLICING_COUNT,
+	},
+	[BLK_IDX_VLAN_LOOKUP] = {
+		.packing = sja1105_vlan_lookup_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_vlan_lookup_entry),
+		.packed_entry_size = SIZE_VLAN_LOOKUP_ENTRY,
+		.max_entry_count = MAX_VLAN_LOOKUP_COUNT,
+	},
+	[BLK_IDX_L2_FORWARDING] = {
+		.packing = sja1105_l2_forwarding_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_forwarding_entry),
+		.packed_entry_size = SIZE_L2_FORWARDING_ENTRY,
+		.max_entry_count = MAX_L2_FORWARDING_COUNT,
+	},
+	[BLK_IDX_MAC_CONFIG] = {
+		.packing = sja1105et_mac_config_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_mac_config_entry),
+		.packed_entry_size = SIZE_MAC_CONFIG_ENTRY_ET,
+		.max_entry_count = MAX_MAC_CONFIG_COUNT,
+	},
+	[BLK_IDX_L2_LOOKUP_PARAMS] = {
+		.packing = sja1105et_l2_lookup_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_lookup_params_entry),
+		.packed_entry_size = SIZE_L2_LOOKUP_PARAMS_ENTRY_ET,
+		.max_entry_count = MAX_L2_LOOKUP_PARAMS_COUNT,
+	},
+	[BLK_IDX_L2_FORWARDING_PARAMS] = {
+		.packing = sja1105_l2_forwarding_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_forwarding_params_entry),
+		.packed_entry_size = SIZE_L2_FORWARDING_PARAMS_ENTRY,
+		.max_entry_count = MAX_L2_FORWARDING_PARAMS_COUNT,
+	},
+	[BLK_IDX_GENERAL_PARAMS] = {
+		.packing = sja1105et_general_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_general_params_entry),
+		.packed_entry_size = SIZE_GENERAL_PARAMS_ENTRY_ET,
+		.max_entry_count = MAX_GENERAL_PARAMS_COUNT,
+	},
+	[BLK_IDX_XMII_PARAMS] = {
+		.packing = sja1105_xmii_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_xmii_params_entry),
+		.packed_entry_size = SIZE_XMII_PARAMS_ENTRY,
+		.max_entry_count = MAX_XMII_PARAMS_COUNT,
+	},
+};
+
+/* SJA1105P: Second generation, no TTEthernet, no SGMII */
+struct sja1105_table_ops sja1105p_table_ops[BLK_IDX_MAX] = {
+	[BLK_IDX_L2_LOOKUP] = {
+		.packing = sja1105pqrs_l2_lookup_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_lookup_entry),
+		.packed_entry_size = SIZE_L2_LOOKUP_ENTRY_PQRS,
+		.max_entry_count = MAX_L2_LOOKUP_COUNT,
+	},
+	[BLK_IDX_L2_POLICING] = {
+		.packing = sja1105_l2_policing_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_policing_entry),
+		.packed_entry_size = SIZE_L2_POLICING_ENTRY,
+		.max_entry_count = MAX_L2_POLICING_COUNT,
+	},
+	[BLK_IDX_VLAN_LOOKUP] = {
+		.packing = sja1105_vlan_lookup_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_vlan_lookup_entry),
+		.packed_entry_size = SIZE_VLAN_LOOKUP_ENTRY,
+		.max_entry_count = MAX_VLAN_LOOKUP_COUNT,
+	},
+	[BLK_IDX_L2_FORWARDING] = {
+		.packing = sja1105_l2_forwarding_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_forwarding_entry),
+		.packed_entry_size = SIZE_L2_FORWARDING_ENTRY,
+		.max_entry_count = MAX_L2_FORWARDING_COUNT,
+	},
+	[BLK_IDX_MAC_CONFIG] = {
+		.packing = sja1105pqrs_mac_config_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_mac_config_entry),
+		.packed_entry_size = SIZE_MAC_CONFIG_ENTRY_PQRS,
+		.max_entry_count = MAX_MAC_CONFIG_COUNT,
+	},
+	[BLK_IDX_L2_LOOKUP_PARAMS] = {
+		.packing = sja1105pqrs_l2_lookup_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_lookup_params_entry),
+		.packed_entry_size = SIZE_L2_LOOKUP_PARAMS_ENTRY_PQRS,
+		.max_entry_count = MAX_L2_LOOKUP_PARAMS_COUNT,
+	},
+	[BLK_IDX_L2_FORWARDING_PARAMS] = {
+		.packing = sja1105_l2_forwarding_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_forwarding_params_entry),
+		.packed_entry_size = SIZE_L2_FORWARDING_PARAMS_ENTRY,
+		.max_entry_count = MAX_L2_FORWARDING_PARAMS_COUNT,
+	},
+	[BLK_IDX_GENERAL_PARAMS] = {
+		.packing = sja1105pqrs_general_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_general_params_entry),
+		.packed_entry_size = SIZE_GENERAL_PARAMS_ENTRY_PQRS,
+		.max_entry_count = MAX_GENERAL_PARAMS_COUNT,
+	},
+	[BLK_IDX_XMII_PARAMS] = {
+		.packing = sja1105_xmii_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_xmii_params_entry),
+		.packed_entry_size = SIZE_XMII_PARAMS_ENTRY,
+		.max_entry_count = MAX_XMII_PARAMS_COUNT,
+	},
+};
+
+/* SJA1105Q: Second generation, TTEthernet, no SGMII */
+struct sja1105_table_ops sja1105q_table_ops[BLK_IDX_MAX] = {
+	[BLK_IDX_L2_LOOKUP] = {
+		.packing = sja1105pqrs_l2_lookup_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_lookup_entry),
+		.packed_entry_size = SIZE_L2_LOOKUP_ENTRY_PQRS,
+		.max_entry_count = MAX_L2_LOOKUP_COUNT,
+	},
+	[BLK_IDX_L2_POLICING] = {
+		.packing = sja1105_l2_policing_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_policing_entry),
+		.packed_entry_size = SIZE_L2_POLICING_ENTRY,
+		.max_entry_count = MAX_L2_POLICING_COUNT,
+	},
+	[BLK_IDX_VLAN_LOOKUP] = {
+		.packing = sja1105_vlan_lookup_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_vlan_lookup_entry),
+		.packed_entry_size = SIZE_VLAN_LOOKUP_ENTRY,
+		.max_entry_count = MAX_VLAN_LOOKUP_COUNT,
+	},
+	[BLK_IDX_L2_FORWARDING] = {
+		.packing = sja1105_l2_forwarding_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_forwarding_entry),
+		.packed_entry_size = SIZE_L2_FORWARDING_ENTRY,
+		.max_entry_count = MAX_L2_FORWARDING_COUNT,
+	},
+	[BLK_IDX_MAC_CONFIG] = {
+		.packing = sja1105pqrs_mac_config_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_mac_config_entry),
+		.packed_entry_size = SIZE_MAC_CONFIG_ENTRY_PQRS,
+		.max_entry_count = MAX_MAC_CONFIG_COUNT,
+	},
+	[BLK_IDX_L2_LOOKUP_PARAMS] = {
+		.packing = sja1105pqrs_l2_lookup_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_lookup_params_entry),
+		.packed_entry_size = SIZE_L2_LOOKUP_PARAMS_ENTRY_PQRS,
+		.max_entry_count = MAX_L2_LOOKUP_PARAMS_COUNT,
+	},
+	[BLK_IDX_L2_FORWARDING_PARAMS] = {
+		.packing = sja1105_l2_forwarding_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_forwarding_params_entry),
+		.packed_entry_size = SIZE_L2_FORWARDING_PARAMS_ENTRY,
+		.max_entry_count = MAX_L2_FORWARDING_PARAMS_COUNT,
+	},
+	[BLK_IDX_GENERAL_PARAMS] = {
+		.packing = sja1105pqrs_general_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_general_params_entry),
+		.packed_entry_size = SIZE_GENERAL_PARAMS_ENTRY_PQRS,
+		.max_entry_count = MAX_GENERAL_PARAMS_COUNT,
+	},
+	[BLK_IDX_XMII_PARAMS] = {
+		.packing = sja1105_xmii_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_xmii_params_entry),
+		.packed_entry_size = SIZE_XMII_PARAMS_ENTRY,
+		.max_entry_count = MAX_XMII_PARAMS_COUNT,
+	},
+};
+
+/* SJA1105R: Second generation, no TTEthernet, SGMII */
+struct sja1105_table_ops sja1105r_table_ops[BLK_IDX_MAX] = {
+	[BLK_IDX_L2_LOOKUP] = {
+		.packing = sja1105pqrs_l2_lookup_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_lookup_entry),
+		.packed_entry_size = SIZE_L2_LOOKUP_ENTRY_PQRS,
+		.max_entry_count = MAX_L2_LOOKUP_COUNT,
+	},
+	[BLK_IDX_L2_POLICING] = {
+		.packing = sja1105_l2_policing_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_policing_entry),
+		.packed_entry_size = SIZE_L2_POLICING_ENTRY,
+		.max_entry_count = MAX_L2_POLICING_COUNT,
+	},
+	[BLK_IDX_VLAN_LOOKUP] = {
+		.packing = sja1105_vlan_lookup_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_vlan_lookup_entry),
+		.packed_entry_size = SIZE_VLAN_LOOKUP_ENTRY,
+		.max_entry_count = MAX_VLAN_LOOKUP_COUNT,
+	},
+	[BLK_IDX_L2_FORWARDING] = {
+		.packing = sja1105_l2_forwarding_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_forwarding_entry),
+		.packed_entry_size = SIZE_L2_FORWARDING_ENTRY,
+		.max_entry_count = MAX_L2_FORWARDING_COUNT,
+	},
+	[BLK_IDX_MAC_CONFIG] = {
+		.packing = sja1105pqrs_mac_config_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_mac_config_entry),
+		.packed_entry_size = SIZE_MAC_CONFIG_ENTRY_PQRS,
+		.max_entry_count = MAX_MAC_CONFIG_COUNT,
+	},
+	[BLK_IDX_L2_LOOKUP_PARAMS] = {
+		.packing = sja1105pqrs_l2_lookup_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_lookup_params_entry),
+		.packed_entry_size = SIZE_L2_LOOKUP_PARAMS_ENTRY_PQRS,
+		.max_entry_count = MAX_L2_LOOKUP_PARAMS_COUNT,
+	},
+	[BLK_IDX_L2_FORWARDING_PARAMS] = {
+		.packing = sja1105_l2_forwarding_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_forwarding_params_entry),
+		.packed_entry_size = SIZE_L2_FORWARDING_PARAMS_ENTRY,
+		.max_entry_count = MAX_L2_FORWARDING_PARAMS_COUNT,
+	},
+	[BLK_IDX_GENERAL_PARAMS] = {
+		.packing = sja1105pqrs_general_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_general_params_entry),
+		.packed_entry_size = SIZE_GENERAL_PARAMS_ENTRY_PQRS,
+		.max_entry_count = MAX_GENERAL_PARAMS_COUNT,
+	},
+	[BLK_IDX_XMII_PARAMS] = {
+		.packing = sja1105_xmii_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_xmii_params_entry),
+		.packed_entry_size = SIZE_XMII_PARAMS_ENTRY,
+		.max_entry_count = MAX_XMII_PARAMS_COUNT,
+	},
+};
+
+/* SJA1105S: Second generation, TTEthernet, SGMII */
+struct sja1105_table_ops sja1105s_table_ops[BLK_IDX_MAX] = {
+	[BLK_IDX_L2_LOOKUP] = {
+		.packing = sja1105pqrs_l2_lookup_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_lookup_entry),
+		.packed_entry_size = SIZE_L2_LOOKUP_ENTRY_PQRS,
+		.max_entry_count = MAX_L2_LOOKUP_COUNT,
+	},
+	[BLK_IDX_L2_POLICING] = {
+		.packing = sja1105_l2_policing_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_policing_entry),
+		.packed_entry_size = SIZE_L2_POLICING_ENTRY,
+		.max_entry_count = MAX_L2_POLICING_COUNT,
+	},
+	[BLK_IDX_VLAN_LOOKUP] = {
+		.packing = sja1105_vlan_lookup_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_vlan_lookup_entry),
+		.packed_entry_size = SIZE_VLAN_LOOKUP_ENTRY,
+		.max_entry_count = MAX_VLAN_LOOKUP_COUNT,
+	},
+	[BLK_IDX_L2_FORWARDING] = {
+		.packing = sja1105_l2_forwarding_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_forwarding_entry),
+		.packed_entry_size = SIZE_L2_FORWARDING_ENTRY,
+		.max_entry_count = MAX_L2_FORWARDING_COUNT,
+	},
+	[BLK_IDX_MAC_CONFIG] = {
+		.packing = sja1105pqrs_mac_config_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_mac_config_entry),
+		.packed_entry_size = SIZE_MAC_CONFIG_ENTRY_PQRS,
+		.max_entry_count = MAX_MAC_CONFIG_COUNT,
+	},
+	[BLK_IDX_L2_LOOKUP_PARAMS] = {
+		.packing = sja1105pqrs_l2_lookup_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_lookup_params_entry),
+		.packed_entry_size = SIZE_L2_LOOKUP_PARAMS_ENTRY_PQRS,
+		.max_entry_count = MAX_L2_LOOKUP_PARAMS_COUNT,
+	},
+	[BLK_IDX_L2_FORWARDING_PARAMS] = {
+		.packing = sja1105_l2_forwarding_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_l2_forwarding_params_entry),
+		.packed_entry_size = SIZE_L2_FORWARDING_PARAMS_ENTRY,
+		.max_entry_count = MAX_L2_FORWARDING_PARAMS_COUNT,
+	},
+	[BLK_IDX_GENERAL_PARAMS] = {
+		.packing = sja1105pqrs_general_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_general_params_entry),
+		.packed_entry_size = SIZE_GENERAL_PARAMS_ENTRY_PQRS,
+		.max_entry_count = MAX_GENERAL_PARAMS_COUNT,
+	},
+	[BLK_IDX_XMII_PARAMS] = {
+		.packing = sja1105_xmii_params_entry_packing,
+		.unpacked_entry_size = sizeof(struct sja1105_xmii_params_entry),
+		.packed_entry_size = SIZE_XMII_PARAMS_ENTRY,
+		.max_entry_count = MAX_XMII_PARAMS_COUNT,
+	},
+};
+
+int sja1105_static_config_init(struct sja1105_static_config *config,
+			       const struct sja1105_table_ops *static_ops,
+			       u64 device_id)
+{
+	enum sja1105_blk_idx i;
+
+	memset(config, 0, sizeof(*config));
+
+	/* Transfer the caller's static_ops array into per-table ops
+	 * pointers for easier access
+	 */
+	for (i = 0; i < BLK_IDX_MAX; i++)
+		config->tables[i].ops = &static_ops[i];
+
+	config->device_id = device_id;
+	return 0;
+}
+
+void sja1105_static_config_free(struct sja1105_static_config *config)
+{
+	enum sja1105_blk_idx i;
+
+	for (i = 0; i < BLK_IDX_MAX; i++) {
+		if (config->tables[i].entry_count) {
+			kfree(config->tables[i].entries);
+			config->tables[i].entry_count = 0;
+		}
+	}
+}
+
diff --git a/drivers/net/dsa/sja1105/sja1105_static_config.h b/drivers/net/dsa/sja1105/sja1105_static_config.h
new file mode 100644
index 000000000000..a2e2ef2a0d8b
--- /dev/null
+++ b/drivers/net/dsa/sja1105/sja1105_static_config.h
@@ -0,0 +1,274 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2016-2018, NXP Semiconductors
+ * Copyright (c) 2018-2019, Vladimir Oltean <olteanv@gmail.com>
+ */
+#ifndef _SJA1105_STATIC_CONFIG_H
+#define _SJA1105_STATIC_CONFIG_H
+
+#include <linux/packing.h>
+#include <linux/types.h>
+#include <asm/types.h>
+
+#define SIZE_SJA1105_DEVICE_ID			4
+#define SIZE_TABLE_HEADER			12
+#define SIZE_L2_LOOKUP_ENTRY_ET			12
+#define SIZE_L2_LOOKUP_ENTRY_PQRS		20
+#define SIZE_L2_POLICING_ENTRY			8
+#define SIZE_VLAN_LOOKUP_ENTRY			8
+#define SIZE_L2_FORWARDING_ENTRY		8
+#define SIZE_MAC_CONFIG_ENTRY_ET		28
+#define SIZE_MAC_CONFIG_ENTRY_PQRS		32
+#define SIZE_L2_LOOKUP_PARAMS_ENTRY_ET		4
+#define SIZE_L2_LOOKUP_PARAMS_ENTRY_PQRS	16
+#define SIZE_L2_FORWARDING_PARAMS_ENTRY		12
+#define SIZE_GENERAL_PARAMS_ENTRY_ET		40
+#define SIZE_GENERAL_PARAMS_ENTRY_PQRS		44
+#define SIZE_XMII_PARAMS_ENTRY			4
+
+/* UM10944.pdf Page 11, Table 2. Configuration Blocks */
+enum {
+	BLKID_L2_LOOKUP				= 0x05,
+	BLKID_L2_POLICING			= 0x06,
+	BLKID_VLAN_LOOKUP			= 0x07,
+	BLKID_L2_FORWARDING			= 0x08,
+	BLKID_MAC_CONFIG			= 0x09,
+	BLKID_L2_LOOKUP_PARAMS			= 0x0D,
+	BLKID_L2_FORWARDING_PARAMS		= 0x0E,
+	BLKID_GENERAL_PARAMS			= 0x11,
+	BLKID_XMII_PARAMS			= 0x4E,
+};
+
+enum sja1105_blk_idx {
+	BLK_IDX_L2_LOOKUP = 0,
+	BLK_IDX_L2_POLICING,
+	BLK_IDX_VLAN_LOOKUP,
+	BLK_IDX_L2_FORWARDING,
+	BLK_IDX_MAC_CONFIG,
+	BLK_IDX_L2_LOOKUP_PARAMS,
+	BLK_IDX_L2_FORWARDING_PARAMS,
+	BLK_IDX_GENERAL_PARAMS,
+	BLK_IDX_XMII_PARAMS,
+	BLK_IDX_MAX,
+	/* Fake block indices that are only valid for dynamic access */
+	BLK_IDX_MGMT_ROUTE,
+	BLK_IDX_MAX_DYN,
+	BLK_IDX_INVAL = -1,
+};
+
+#define MAX_L2_LOOKUP_COUNT			1024
+#define MAX_L2_POLICING_COUNT			45
+#define MAX_VLAN_LOOKUP_COUNT			4096
+#define MAX_L2_FORWARDING_COUNT			13
+#define MAX_MAC_CONFIG_COUNT			5
+#define MAX_L2_LOOKUP_PARAMS_COUNT		1
+#define MAX_L2_FORWARDING_PARAMS_COUNT		1
+#define MAX_GENERAL_PARAMS_COUNT		1
+#define MAX_XMII_PARAMS_COUNT			1
+
+#define MAX_FRAME_MEMORY			929
+
+#define SJA1105E_DEVICE_ID			0x9C00000Cull
+#define SJA1105T_DEVICE_ID			0x9E00030Eull
+#define SJA1105PR_DEVICE_ID			0xAF00030Eull
+#define SJA1105QS_DEVICE_ID			0xAE00030Eull
+
+#define SJA1105ET_PART_NO			0x9A83
+#define SJA1105P_PART_NO			0x9A84
+#define SJA1105Q_PART_NO			0x9A85
+#define SJA1105R_PART_NO			0x9A86
+#define SJA1105S_PART_NO			0x9A87
+
+struct sja1105_general_params_entry {
+	u64 vllupformat;
+	u64 mirr_ptacu;
+	u64 switchid;
+	u64 hostprio;
+	u64 mac_fltres1;
+	u64 mac_fltres0;
+	u64 mac_flt1;
+	u64 mac_flt0;
+	u64 incl_srcpt1;
+	u64 incl_srcpt0;
+	u64 send_meta1;
+	u64 send_meta0;
+	u64 casc_port;
+	u64 host_port;
+	u64 mirr_port;
+	u64 vlmarker;
+	u64 vlmask;
+	u64 tpid;
+	u64 ignore2stf;
+	u64 tpid2;
+	/* P/Q/R/S only */
+	u64 queue_ts;
+	u64 egrmirrvid;
+	u64 egrmirrpcp;
+	u64 egrmirrdei;
+	u64 replay_port;
+};
+
+struct sja1105_vlan_lookup_entry {
+	u64 ving_mirr;
+	u64 vegr_mirr;
+	u64 vmemb_port;
+	u64 vlan_bc;
+	u64 tag_port;
+	u64 vlanid;
+};
+
+struct sja1105_l2_lookup_entry {
+	u64 mirrvlan;      /* P/Q/R/S only - LOCKEDS=1 */
+	u64 mirr;          /* P/Q/R/S only - LOCKEDS=1 */
+	u64 retag;         /* P/Q/R/S only - LOCKEDS=1 */
+	u64 mask_iotag;    /* P/Q/R/S only */
+	u64 mask_vlanid;   /* P/Q/R/S only */
+	u64 mask_macaddr;  /* P/Q/R/S only */
+	u64 iotag;         /* P/Q/R/S only */
+	u64 vlanid;
+	u64 macaddr;
+	u64 destports;
+	u64 enfport;
+	u64 index;
+};
+
+struct sja1105_l2_lookup_params_entry {
+	u64 drpbc;           /* P/Q/R/S only */
+	u64 drpmc;           /* P/Q/R/S only */
+	u64 drpuni;          /* P/Q/R/S only */
+	u64 maxaddrp[5];     /* P/Q/R/S only */
+	u64 start_dynspc;    /* P/Q/R/S only */
+	u64 drpnolearn;      /* P/Q/R/S only */
+	u64 use_static;      /* P/Q/R/S only */
+	u64 owr_dyn;         /* P/Q/R/S only */
+	u64 learn_once;      /* P/Q/R/S only */
+	u64 maxage;          /* Shared */
+	u64 dyn_tbsz;        /* E/T only */
+	u64 poly;            /* E/T only */
+	u64 shared_learn;    /* Shared */
+	u64 no_enf_hostprt;  /* Shared */
+	u64 no_mgmt_learn;   /* Shared */
+};
+
+struct sja1105_l2_forwarding_entry {
+	u64 bc_domain;
+	u64 reach_port;
+	u64 fl_domain;
+	u64 vlan_pmap[8];
+};
+
+struct sja1105_l2_forwarding_params_entry {
+	u64 max_dynp;
+	u64 part_spc[8];
+};
+
+struct sja1105_l2_policing_entry {
+	u64 sharindx;
+	u64 smax;
+	u64 rate;
+	u64 maxlen;
+	u64 partition;
+};
+
+struct sja1105_mac_config_entry {
+	u64 top[8];
+	u64 base[8];
+	u64 enabled[8];
+	u64 ifg;
+	u64 speed;
+	u64 tp_delin;
+	u64 tp_delout;
+	u64 maxage;
+	u64 vlanprio;
+	u64 vlanid;
+	u64 ing_mirr;
+	u64 egr_mirr;
+	u64 drpnona664;
+	u64 drpdtag;
+	u64 drpsotag;   /* only on P/Q/R/S */
+	u64 drpsitag;   /* only on P/Q/R/S */
+	u64 drpuntag;
+	u64 retag;
+	u64 dyn_learn;
+	u64 egress;
+	u64 ingress;
+	u64 mirrcie;    /* only on P/Q/R/S */
+	u64 mirrcetag;  /* only on P/Q/R/S */
+	u64 ingmirrvid; /* only on P/Q/R/S */
+	u64 ingmirrpcp; /* only on P/Q/R/S */
+	u64 ingmirrdei; /* only on P/Q/R/S */
+};
+
+struct sja1105_xmii_params_entry {
+	u64 phy_mac[5];
+	u64 xmii_mode[5];
+};
+
+struct sja1105_table_header {
+	u64 block_id;
+	u64 len;
+	u64 crc;
+};
+
+struct sja1105_table_ops {
+	size_t (*packing)(void *buf, void *entry_ptr, enum packing_op op);
+	size_t unpacked_entry_size;
+	size_t packed_entry_size;
+	size_t max_entry_count;
+};
+
+struct sja1105_table {
+	const struct sja1105_table_ops *ops;
+	size_t entry_count;
+	void *entries;
+};
+
+struct sja1105_static_config {
+	u64 device_id;
+	struct sja1105_table tables[BLK_IDX_MAX];
+};
+
+extern struct sja1105_table_ops sja1105e_table_ops[BLK_IDX_MAX];
+extern struct sja1105_table_ops sja1105t_table_ops[BLK_IDX_MAX];
+extern struct sja1105_table_ops sja1105p_table_ops[BLK_IDX_MAX];
+extern struct sja1105_table_ops sja1105q_table_ops[BLK_IDX_MAX];
+extern struct sja1105_table_ops sja1105r_table_ops[BLK_IDX_MAX];
+extern struct sja1105_table_ops sja1105s_table_ops[BLK_IDX_MAX];
+
+size_t sja1105_table_header_packing(void *buf, void *hdr, enum packing_op op);
+void
+sja1105_table_header_pack_with_crc(void *buf, struct sja1105_table_header *hdr);
+size_t
+sja1105_static_config_get_length(const struct sja1105_static_config *config);
+
+typedef enum {
+	SJA1105_CONFIG_OK = 0,
+	SJA1105_MISSING_L2_POLICING_TABLE,
+	SJA1105_MISSING_L2_FORWARDING_TABLE,
+	SJA1105_MISSING_L2_FORWARDING_PARAMS_TABLE,
+	SJA1105_MISSING_GENERAL_PARAMS_TABLE,
+	SJA1105_MISSING_VLAN_TABLE,
+	SJA1105_MISSING_XMII_TABLE,
+	SJA1105_MISSING_MAC_TABLE,
+	SJA1105_OVERCOMMITTED_FRAME_MEMORY,
+} sja1105_config_valid_t;
+
+extern const char *sja1105_static_config_error_msg[];
+
+sja1105_config_valid_t
+sja1105_static_config_check_valid(const struct sja1105_static_config *config);
+void
+sja1105_static_config_pack(void *buf, struct sja1105_static_config *config);
+int sja1105_static_config_init(struct sja1105_static_config *config,
+			       const struct sja1105_table_ops *static_ops,
+			       u64 device_id);
+void sja1105_static_config_free(struct sja1105_static_config *config);
+
+u32 sja1105_crc32(const void *buf, size_t len);
+
+void sja1105_pack(void *buf, const u64 *val, int start, int end, size_t len);
+void sja1105_unpack(const void *buf, u64 *val, int start, int end, size_t len);
+void sja1105_packing(void *buf, u64 *val, int start, int end,
+		     size_t len, enum packing_op op);
+
+#endif
+
-- 
2.17.1



* [PATCH v3 net-next 15/24] net: dsa: sja1105: Add support for FDB and MDB management
@ 2019-04-13  1:28 ` Vladimir Oltean
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

Currently only the (more difficult) first generation E/T series is
supported. Here the TCAM is only 4-way associative, and to know where
the hardware will search for an FDB entry, we need to perform the same
hash algorithm in order to install the entry in the correct bin.

On P/Q/R/S, the TCAM should be fully associative. However, the SPI
command interface is different, and because I don't have access to a
new-generation device at the moment, support for it is TODO.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
---
Changes from v3:
None

Changes from v2:
None

 drivers/net/dsa/sja1105/sja1105.h             |   2 +
 .../net/dsa/sja1105/sja1105_dynamic_config.c  |  40 ++++
 drivers/net/dsa/sja1105/sja1105_main.c        | 193 ++++++++++++++++++
 3 files changed, 235 insertions(+)

diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
index ef555dd385a3..4c9df44a4478 100644
--- a/drivers/net/dsa/sja1105/sja1105.h
+++ b/drivers/net/dsa/sja1105/sja1105.h
@@ -129,6 +129,8 @@ int sja1105_dynamic_config_write(struct sja1105_private *priv,
 				 enum sja1105_blk_idx blk_idx,
 				 int index, void *entry, bool keep);
 
+u8 sja1105_fdb_hash(struct sja1105_private *priv, const u8 *addr, u16 vid);
+
 /* Common implementations for the static and dynamic configs */
 size_t sja1105_l2_forwarding_entry_packing(void *buf, void *entry_ptr,
 					   enum packing_op op);
diff --git a/drivers/net/dsa/sja1105/sja1105_dynamic_config.c b/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
index 74c3a00d453c..0aeda6868c27 100644
--- a/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
+++ b/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
@@ -462,3 +462,43 @@ int sja1105_dynamic_config_write(struct sja1105_private *priv,
 
 	return 0;
 }
+
+static u8 crc8_add(u8 crc, u8 byte, u8 poly)
+{
+	int i;
+
+	for (i = 0; i < 8; i++) {
+		if ((crc ^ byte) & (1 << 7)) {
+			crc <<= 1;
+			crc ^= poly;
+		} else {
+			crc <<= 1;
+		}
+		byte <<= 1;
+	}
+	return crc;
+}
+
+/* CRC8 algorithm with non-reversed input, non-reversed output,
+ * no input xor and no output xor. Code customized for receiving
+ * the SJA1105 E/T FDB keys (vlanid, macaddr) as input. The CRC
+ * polynomial is also received as an argument, in the Koopman notation
+ * in which the switch hardware stores it.
+ */
+u8 sja1105_fdb_hash(struct sja1105_private *priv, const u8 *addr, u16 vid)
+{
+	struct sja1105_l2_lookup_params_entry *l2_lookup_params =
+		priv->static_config.tables[BLK_IDX_L2_LOOKUP_PARAMS].entries;
+	u64 poly_koopman = l2_lookup_params->poly;
+	/* Convert polynomial from Koopman to 'normal' notation */
+	u8 poly = (u8)(1 + (poly_koopman << 1));
+	u64 vlanid = l2_lookup_params->shared_learn ? 0 : vid;
+	u64 input = (vlanid << 48) | ether_addr_to_u64(addr);
+	u8 crc = 0; /* seed */
+	int i;
+
+	/* Mask the eight bytes starting from MSB one at a time */
+	for (i = 56; i >= 0; i -= 8)
+		crc = crc8_add(crc, (input & (0xffull << i)) >> i, poly);
+	return crc;
+}
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index c3e4fff11101..e37181bd2a6a 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -188,6 +188,9 @@ static int sja1105_init_static_fdb(struct sja1105_private *priv)
 
 	table = &priv->static_config.tables[BLK_IDX_L2_LOOKUP];
 
+	/* We only populate the FDB table through dynamic
+	 * L2 Address Lookup entries
+	 */
 	if (table->entry_count) {
 		kfree(table->entries);
 		table->entry_count = 0;
@@ -705,6 +708,190 @@ static void sja1105_adjust_link(struct dsa_switch *ds, int port,
 		sja1105_adjust_port_config(priv, port, phydev->speed, true);
 }
 
+#define fdb(bin, index) \
+	((bin) * SJA1105ET_FDB_BIN_SIZE + (index))
+#define is_bin_index_valid(i) \
+	((i) >= 0 && (i) < SJA1105ET_FDB_BIN_SIZE)
+
+static int
+sja1105_is_fdb_entry_in_bin(struct sja1105_private *priv, int bin,
+			    const u8 *addr, u16 vid,
+			    struct sja1105_l2_lookup_entry *fdb_match,
+			    int *last_unused)
+{
+	int index_in_bin;
+
+	for (index_in_bin = 0; index_in_bin < SJA1105ET_FDB_BIN_SIZE;
+	     index_in_bin++) {
+		struct sja1105_l2_lookup_entry l2_lookup = { 0 };
+
+		/* Skip unused entries, optionally reporting the last
+		 * one seen through *last_unused
+		 */
+		if (sja1105_dynamic_config_read(priv, BLK_IDX_L2_LOOKUP,
+						fdb(bin, index_in_bin),
+						&l2_lookup)) {
+			if (last_unused)
+				*last_unused = index_in_bin;
+			continue;
+		}
+
+		if (l2_lookup.macaddr == ether_addr_to_u64(addr) &&
+		    l2_lookup.vlanid == vid) {
+			if (fdb_match)
+				*fdb_match = l2_lookup;
+			return index_in_bin;
+		}
+	}
+	/* Return an invalid entry index if not found */
+	return SJA1105ET_FDB_BIN_SIZE;
+}
+
+static int sja1105_fdb_add(struct dsa_switch *ds, int port,
+			   const unsigned char *addr, u16 vid)
+{
+	struct sja1105_l2_lookup_entry l2_lookup = { 0 };
+	struct sja1105_private *priv = ds->priv;
+	struct device *dev = ds->dev;
+	int bin, index_in_bin;
+	int last_unused;
+
+	bin = sja1105_fdb_hash(priv, addr, vid);
+
+	index_in_bin = sja1105_is_fdb_entry_in_bin(priv, bin, addr, vid,
+						   &l2_lookup, &last_unused);
+	if (is_bin_index_valid(index_in_bin)) {
+		/* We have an FDB entry. Is our port in the destination
+		 * mask? If yes, we need to do nothing. If not, we need
+		 * to rewrite the entry by adding this port to it.
+		 */
+		if (l2_lookup.destports & BIT(port))
+			return 0;
+		l2_lookup.destports |= BIT(port);
+	} else {
+		/* We don't have an FDB entry. We construct a new one and
+		 * try to find a place for it within the FDB table.
+		 */
+		l2_lookup.macaddr = ether_addr_to_u64(addr);
+		l2_lookup.destports = BIT(port);
+		l2_lookup.vlanid = vid;
+
+		if (is_bin_index_valid(last_unused)) {
+			index_in_bin = last_unused;
+		} else {
+			/* Bin is full, need to evict somebody.
+			 * Choose victim at random. If you get these messages
+			 * often, you may need to consider changing the
+			 * distribution function:
+			 * static_config[BLK_IDX_L2_LOOKUP_PARAMS].entries->poly
+			 */
+			index_in_bin = get_random_int() %
+				       SJA1105ET_FDB_BIN_SIZE;
+			dev_warn(dev, "Warning, FDB bin %d full while adding entry for %pM. Evicting entry %u.\n",
+				 bin, addr, index_in_bin);
+			/* Evict entry */
+			sja1105_dynamic_config_write(priv, BLK_IDX_L2_LOOKUP,
+						     fdb(bin, index_in_bin),
+						     NULL, false);
+		}
+	}
+	l2_lookup.index = fdb(bin, index_in_bin);
+
+	return sja1105_dynamic_config_write(priv, BLK_IDX_L2_LOOKUP,
+				l2_lookup.index, &l2_lookup, true);
+}
+
+static int sja1105_fdb_del(struct dsa_switch *ds, int port,
+			   const unsigned char *addr, u16 vid)
+{
+	struct sja1105_l2_lookup_entry l2_lookup = { 0 };
+	struct sja1105_private *priv = ds->priv;
+	u8 bin, index_in_bin;
+	bool keep;
+
+	bin = sja1105_fdb_hash(priv, addr, vid);
+
+	index_in_bin = sja1105_is_fdb_entry_in_bin(priv, bin, addr, vid,
+						   &l2_lookup, NULL);
+	if (!is_bin_index_valid(index_in_bin))
+		return 0;
+
+	/* We have an FDB entry. Is our port in the destination mask? If yes,
+	 * we need to remove it. If the resulting port mask becomes empty, we
+	 * need to completely evict the FDB entry.
+	 * Otherwise we just write it back.
+	 */
+	if (l2_lookup.destports & BIT(port))
+		l2_lookup.destports &= ~BIT(port);
+	if (l2_lookup.destports)
+		keep = true;
+	else
+		keep = false;
+
+	return sja1105_dynamic_config_write(priv, BLK_IDX_L2_LOOKUP,
+					    fdb(bin, index_in_bin),
+					    &l2_lookup, keep);
+}
+
+static int sja1105_fdb_dump(struct dsa_switch *ds, int port,
+			    dsa_fdb_dump_cb_t *cb, void *data)
+{
+	struct sja1105_private *priv = ds->priv;
+	struct device *dev = ds->dev;
+	int i;
+
+	for (i = 0; i < MAX_L2_LOOKUP_COUNT; i++) {
+		struct sja1105_l2_lookup_entry l2_lookup;
+		u8 macaddr[ETH_ALEN];
+		int rc;
+
+		memset(&l2_lookup, 0, sizeof(l2_lookup));
+		rc = sja1105_dynamic_config_read(priv, BLK_IDX_L2_LOOKUP,
+						 i, &l2_lookup);
+		/* No fdb entry at i, not an issue */
+		if (rc == -EINVAL)
+			continue;
+		if (rc) {
+			dev_err(dev, "Failed to dump FDB: %d\n", rc);
+			return rc;
+		}
+
+		/* FDB dump callback is per port. This means we have to
+		 * disregard a valid entry if it's not for this port, even if
+		 * only to revisit it later. This is inefficient because the
+		 * 1024-sized FDB table needs to be traversed 4 times through
+		 * SPI during a 'bridge fdb show' command.
+		 */
+		if (!(l2_lookup.destports & BIT(port)))
+			continue;
+		u64_to_ether_addr(l2_lookup.macaddr, macaddr);
+		cb(macaddr, l2_lookup.vlanid, false, data);
+	}
+	return 0;
+}
+
+#undef fdb
+#undef is_bin_index_valid
+
+/* This callback needs to be present */
+static int sja1105_mdb_prepare(struct dsa_switch *ds, int port,
+			       const struct switchdev_obj_port_mdb *mdb)
+{
+	return 0;
+}
+
+static void sja1105_mdb_add(struct dsa_switch *ds, int port,
+			    const struct switchdev_obj_port_mdb *mdb)
+{
+	sja1105_fdb_add(ds, port, mdb->addr, mdb->vid);
+}
+
+static int sja1105_mdb_del(struct dsa_switch *ds, int port,
+			   const struct switchdev_obj_port_mdb *mdb)
+{
+	return sja1105_fdb_del(ds, port, mdb->addr, mdb->vid);
+}
+
 static int sja1105_bridge_member(struct dsa_switch *ds, int port,
 				 struct net_device *br, bool member)
 {
@@ -807,8 +994,14 @@ static const struct dsa_switch_ops sja1105_switch_ops = {
 	.get_tag_protocol	= sja1105_get_tag_protocol,
 	.setup			= sja1105_setup,
 	.adjust_link		= sja1105_adjust_link,
+	.port_fdb_dump		= sja1105_fdb_dump,
+	.port_fdb_add		= sja1105_fdb_add,
+	.port_fdb_del		= sja1105_fdb_del,
 	.port_bridge_join	= sja1105_bridge_join,
 	.port_bridge_leave	= sja1105_bridge_leave,
+	.port_mdb_prepare	= sja1105_mdb_prepare,
+	.port_mdb_add		= sja1105_mdb_add,
+	.port_mdb_del		= sja1105_mdb_del,
 };
 
 static int sja1105_check_device_id(struct sja1105_private *priv)
-- 
2.17.1



* [PATCH v3 net-next 16/24] net: dsa: sja1105: Add support for VLAN operations
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
                   ` (14 preceding siblings ...)
  2019-04-13  1:28 ` [PATCH v3 net-next 15/24] net: dsa: sja1105: Add support for FDB and MDB management Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  2019-04-13 20:56   ` Jiri Pirko
  2019-04-13  1:28 ` [PATCH v3 net-next 17/24] net: dsa: sja1105: Add support for ethtool port counters Vladimir Oltean
                   ` (7 subsequent siblings)
  23 siblings, 1 reply; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

VLAN filtering cannot be properly disabled in SJA1105. So in order to
emulate the "no VLAN awareness" behavior (not dropping traffic that is
tagged with a VID that isn't configured on the port), we need to hack
another switch feature: programmable TPID (which is 0x8100 for 802.1Q).
We reprogram the TPID to a bogus value (ETH_P_EDSA), which makes the
switch treat all traffic as untagged and therefore accept it.

Under a vlan_filtering bridge, the proper TPID of ETH_P_8021Q is
installed again, and the switch starts identifying 802.1Q-tagged
traffic.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
---
Changes from v3:
Changed back to ETH_P_EDSA.

Changes from v2:
Changed the TPID from ETH_P_EDSA (0xDADA) to a newly introduced one:
ETH_P_DSA_8021Q (0xDADB).

 drivers/net/dsa/sja1105/sja1105_main.c        | 254 +++++++++++++++++-
 .../net/dsa/sja1105/sja1105_static_config.c   |  38 +++
 .../net/dsa/sja1105/sja1105_static_config.h   |   3 +
 3 files changed, 293 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index e37181bd2a6a..a0851d1c9d89 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -261,6 +261,13 @@ static int sja1105_init_static_vlan(struct sja1105_private *priv)
 	table = &priv->static_config.tables[BLK_IDX_VLAN_LOOKUP];
 
 	/* The static VLAN table will only contain the initial pvid of 0.
+	 * All other VLANs are to be configured through dynamic entries,
+	 * and kept in the static configuration table as backing memory.
+	 * The pvid of 0 is sufficient to pass traffic while the ports are
+	 * standalone and when vlan_filtering is disabled. When filtering
+	 * gets enabled, the switchdev core sets up the VLAN ID 1 and sets
+	 * it as the new pvid. Actually 'pvid 1' still comes up in 'bridge
+	 * vlan' even when vlan_filtering is off, but it has no effect.
 	 */
 	if (table->entry_count) {
 		kfree(table->entries);
@@ -401,8 +408,11 @@ static int sja1105_init_general_params(struct sja1105_private *priv)
 		.vlmask = 0,
 		/* Only update correctionField for 1-step PTP (L2 transport) */
 		.ignore2stf = 0,
-		.tpid = ETH_P_8021Q,
-		.tpid2 = ETH_P_8021Q,
+		/* Forcefully disable VLAN filtering by telling
+		 * the switch that VLAN has a different EtherType.
+		 */
+		.tpid = ETH_P_EDSA,
+		.tpid2 = ETH_P_EDSA,
 		/* P/Q/R/S only */
 		.queue_ts = 0,
 		.egrmirrvid = 0,
@@ -945,12 +955,240 @@ static void sja1105_bridge_leave(struct dsa_switch *ds, int port,
 	sja1105_bridge_member(ds, port, br, false);
 }
 
+/* For situations where we need to change a setting at runtime that is only
+ * available through the static configuration, resetting the switch in order
+ * to upload the new static config is unavoidable. Back up the settings we
+ * modify at runtime (currently only MAC) and restore them after uploading,
+ * such that this operation is relatively seamless.
+ */
+static int sja1105_static_config_reload(struct sja1105_private *priv)
+{
+	struct sja1105_mac_config_entry *mac;
+	int speed_mbps[SJA1105_NUM_PORTS];
+	int rc, i;
+
+	mac = priv->static_config.tables[BLK_IDX_MAC_CONFIG].entries;
+
+	/* Back up settings changed by sja1105_adjust_port_config and
+	 * restore their defaults.
+	 */
+	for (i = 0; i < SJA1105_NUM_PORTS; i++) {
+		speed_mbps[i] = sja1105_speed[mac[i].speed];
+		mac[i].speed = SJA1105_SPEED_AUTO;
+	}
+
+	/* Reset switch and send updated static configuration */
+	rc = sja1105_static_config_upload(priv);
+	if (rc < 0)
+		goto out;
+
+	/* Configure the CGU (PLLs) for MII and RMII PHYs.
+	 * For these interfaces there is no dynamic configuration
+	 * needed, since the PLLs have the same settings at all speeds.
+	 */
+	rc = sja1105_clocking_setup(priv);
+	if (rc < 0)
+		goto out;
+
+	for (i = 0; i < SJA1105_NUM_PORTS; i++) {
+		bool enabled = (speed_mbps[i] != 0);
+
+		rc = sja1105_adjust_port_config(priv, i, speed_mbps[i],
+						enabled);
+		if (rc < 0)
+			goto out;
+	}
+out:
+	return rc;
+}
+
+#define sja1105_vlan_filtering_enabled(priv) \
+	(((struct sja1105_general_params_entry *) \
+	((struct sja1105_private *)priv)->static_config. \
+	tables[BLK_IDX_GENERAL_PARAMS].entries)->tpid == ETH_P_8021Q)
+
+/* The TPID setting belongs to the General Parameters table,
+ * which can only be partially reconfigured at runtime (and not the TPID).
+ * So a switch reset is required.
+ */
+static int sja1105_change_tpid(struct sja1105_private *priv,
+			       u16 tpid, u16 tpid2)
+{
+	struct sja1105_general_params_entry *general_params;
+	struct sja1105_table *table;
+
+	table = &priv->static_config.tables[BLK_IDX_GENERAL_PARAMS];
+	general_params = table->entries;
+	general_params->tpid = tpid;
+	general_params->tpid2 = tpid2;
+	return sja1105_static_config_reload(priv);
+}
+
+static int sja1105_pvid_apply(struct sja1105_private *priv, int port, u16 pvid)
+{
+	struct sja1105_mac_config_entry *mac;
+
+	mac = priv->static_config.tables[BLK_IDX_MAC_CONFIG].entries;
+
+	mac[port].vlanid = pvid;
+
+	return sja1105_dynamic_config_write(priv, BLK_IDX_MAC_CONFIG, port,
+					   &mac[port], true);
+}
+
+static int sja1105_is_vlan_configured(struct sja1105_private *priv, u16 vid)
+{
+	struct sja1105_vlan_lookup_entry *vlan;
+	int count, i;
+
+	vlan = priv->static_config.tables[BLK_IDX_VLAN_LOOKUP].entries;
+	count = priv->static_config.tables[BLK_IDX_VLAN_LOOKUP].entry_count;
+
+	for (i = 0; i < count; i++)
+		if (vlan[i].vlanid == vid)
+			return i;
+
+	/* Return an invalid entry index if not found */
+	return -1;
+}
+
+static int sja1105_vlan_apply(struct sja1105_private *priv, int port, u16 vid,
+			      bool enabled, bool untagged)
+{
+	struct sja1105_vlan_lookup_entry *vlan;
+	struct sja1105_table *table;
+	bool keep = true;
+	int match, rc;
+
+	table = &priv->static_config.tables[BLK_IDX_VLAN_LOOKUP];
+
+	match = sja1105_is_vlan_configured(priv, vid);
+	if (match < 0) {
+		/* Can't delete a missing entry. */
+		if (!enabled)
+			return 0;
+		rc = sja1105_table_resize(table, table->entry_count + 1);
+		if (rc)
+			return rc;
+		match = table->entry_count - 1;
+	}
+	/* Assign pointer after the resize (it's new memory) */
+	vlan = table->entries;
+	vlan[match].vlanid = vid;
+	if (enabled) {
+		vlan[match].vlan_bc |= BIT(port);
+		vlan[match].vmemb_port |= BIT(port);
+	} else {
+		vlan[match].vlan_bc &= ~BIT(port);
+		vlan[match].vmemb_port &= ~BIT(port);
+	}
+	/* Also unset tag_port if removing this VLAN was requested,
+	 * just so we don't have a confusing bitmap (no practical purpose).
+	 */
+	if (untagged || !enabled)
+		vlan[match].tag_port &= ~BIT(port);
+	else
+		vlan[match].tag_port |= BIT(port);
+	/* If there's no port left as member of this VLAN,
+	 * it's time for it to go.
+	 */
+	if (!vlan[match].vmemb_port)
+		keep = false;
+
+	dev_dbg(priv->ds->dev,
+		"%s: port %d, vid %llu, broadcast domain 0x%llx, "
+		"port members 0x%llx, tagged ports 0x%llx, keep %d\n",
+		__func__, port, vlan[match].vlanid, vlan[match].vlan_bc,
+		vlan[match].vmemb_port, vlan[match].tag_port, keep);
+
+	rc = sja1105_dynamic_config_write(priv, BLK_IDX_VLAN_LOOKUP, vid,
+					  &vlan[match], keep);
+	if (rc < 0)
+		return rc;
+
+	if (!keep)
+		return sja1105_table_delete_entry(table, match);
+
+	return 0;
+}
+
 static enum dsa_tag_protocol
 sja1105_get_tag_protocol(struct dsa_switch *ds, int port)
 {
 	return DSA_TAG_PROTO_NONE;
 }
 
+/* This callback needs to be present */
+static int sja1105_vlan_prepare(struct dsa_switch *ds, int port,
+				const struct switchdev_obj_port_vlan *vlan)
+{
+	return 0;
+}
+
+static int sja1105_vlan_filtering(struct dsa_switch *ds, int port, bool enabled)
+{
+	struct sja1105_private *priv = ds->priv;
+	int rc;
+
+	if (enabled && !sja1105_vlan_filtering_enabled(priv))
+		/* Enable VLAN filtering. */
+		rc = sja1105_change_tpid(priv, ETH_P_8021Q, ETH_P_8021AD);
+	else if (!enabled && sja1105_vlan_filtering_enabled(priv))
+		/* Disable VLAN filtering. */
+		rc = sja1105_change_tpid(priv, ETH_P_EDSA, ETH_P_EDSA);
+	else
+		return 0;
+	if (rc)
+		dev_err(ds->dev, "Failed to change VLAN Ethertype\n");
+
+	return rc;
+}
+
+static void sja1105_vlan_add(struct dsa_switch *ds, int port,
+			     const struct switchdev_obj_port_vlan *vlan)
+{
+	struct sja1105_private *priv = ds->priv;
+	u16 vid;
+	int rc;
+
+	for (vid = vlan->vid_begin; vid <= vlan->vid_end; vid++) {
+		rc = sja1105_vlan_apply(priv, port, vid, true, vlan->flags &
+					BRIDGE_VLAN_INFO_UNTAGGED);
+		if (rc < 0) {
+			dev_err(ds->dev, "Failed to add VLAN %d to port %d: %d\n",
+				vid, port, rc);
+			return;
+		}
+		if (vlan->flags & BRIDGE_VLAN_INFO_PVID) {
+			rc = sja1105_pvid_apply(ds->priv, port, vid);
+			if (rc < 0) {
+				dev_err(ds->dev, "Failed to set pvid %d on port %d: %d\n",
+					vid, port, rc);
+				return;
+			}
+		}
+	}
+}
+
+static int sja1105_vlan_del(struct dsa_switch *ds, int port,
+			    const struct switchdev_obj_port_vlan *vlan)
+{
+	struct sja1105_private *priv = ds->priv;
+	u16 vid;
+	int rc;
+
+	for (vid = vlan->vid_begin; vid <= vlan->vid_end; vid++) {
+		rc = sja1105_vlan_apply(priv, port, vid, false, vlan->flags &
+					BRIDGE_VLAN_INFO_UNTAGGED);
+		if (rc < 0) {
+			dev_err(ds->dev, "Failed to remove VLAN %d from port %d: %d\n",
+				vid, port, rc);
+			return rc;
+		}
+	}
+	return 0;
+}
+
 /* The programming model for the SJA1105 switch is "all-at-once" via static
  * configuration tables. Some of these can be dynamically modified at runtime,
  * but not the xMII mode parameters table.
@@ -986,6 +1224,14 @@ static int sja1105_setup(struct dsa_switch *ds)
 		dev_err(ds->dev, "Failed to configure MII clocking: %d\n", rc);
 		return rc;
 	}
+	/* On SJA1105, VLAN filtering per se is always enabled in hardware.
+	 * The only thing we can do to disable it is lie about what the 802.1Q
+	 * EtherType is. So it will still try to apply VLAN filtering, but all
+	 * ingress traffic (except frames received with EtherType of
+	 * ETH_P_EDSA) will be internally tagged with a distorted VLAN header
+	 * where the TPID is ETH_P_EDSA, and the VLAN ID is the port pvid.
+	 */
+	ds->vlan_filtering_is_global = true;
 
 	return 0;
 }
@@ -999,6 +1245,10 @@ static const struct dsa_switch_ops sja1105_switch_ops = {
 	.port_fdb_del		= sja1105_fdb_del,
 	.port_bridge_join	= sja1105_bridge_join,
 	.port_bridge_leave	= sja1105_bridge_leave,
+	.port_vlan_prepare	= sja1105_vlan_prepare,
+	.port_vlan_filtering	= sja1105_vlan_filtering,
+	.port_vlan_add		= sja1105_vlan_add,
+	.port_vlan_del		= sja1105_vlan_del,
 	.port_mdb_prepare	= sja1105_mdb_prepare,
 	.port_mdb_add		= sja1105_mdb_add,
 	.port_mdb_del		= sja1105_mdb_del,
diff --git a/drivers/net/dsa/sja1105/sja1105_static_config.c b/drivers/net/dsa/sja1105/sja1105_static_config.c
index ae5c2551ad90..42f71e373d1c 100644
--- a/drivers/net/dsa/sja1105/sja1105_static_config.c
+++ b/drivers/net/dsa/sja1105/sja1105_static_config.c
@@ -1002,3 +1002,41 @@ void sja1105_static_config_free(struct sja1105_static_config *config)
 	}
 }
 
+int sja1105_table_delete_entry(struct sja1105_table *table, int i)
+{
+	size_t entry_size = table->ops->unpacked_entry_size;
+	u8 *entries = table->entries;
+
+	if (i >= table->entry_count)
+		return -ERANGE;
+
+	memmove(entries + i * entry_size, entries + (i + 1) * entry_size,
+		(table->entry_count - i - 1) * entry_size);
+
+	table->entry_count--;
+
+	return 0;
+}
+
+/* No pointers to table->entries should be kept when this is called. */
+int sja1105_table_resize(struct sja1105_table *table, size_t new_count)
+{
+	size_t entry_size = table->ops->unpacked_entry_size;
+	void *new_entries, *old_entries = table->entries;
+
+	if (new_count > table->ops->max_entry_count)
+		return -ERANGE;
+
+	new_entries = kcalloc(new_count, entry_size, GFP_KERNEL);
+	if (!new_entries)
+		return -ENOMEM;
+
+	memcpy(new_entries, old_entries, min(new_count, table->entry_count) *
+		entry_size);
+
+	table->entries = new_entries;
+	table->entry_count = new_count;
+	kfree(old_entries);
+	return 0;
+}
+
diff --git a/drivers/net/dsa/sja1105/sja1105_static_config.h b/drivers/net/dsa/sja1105/sja1105_static_config.h
index a2e2ef2a0d8b..c71ef9d366db 100644
--- a/drivers/net/dsa/sja1105/sja1105_static_config.h
+++ b/drivers/net/dsa/sja1105/sja1105_static_config.h
@@ -263,6 +263,9 @@ int sja1105_static_config_init(struct sja1105_static_config *config,
 			       u64 device_id);
 void sja1105_static_config_free(struct sja1105_static_config *config);
 
+int sja1105_table_delete_entry(struct sja1105_table *table, int i);
+int sja1105_table_resize(struct sja1105_table *table, size_t new_count);
+
 u32 sja1105_crc32(const void *buf, size_t len);
 
 void sja1105_pack(void *buf, const u64 *val, int start, int end, size_t len);
-- 
2.17.1



* [PATCH v3 net-next 17/24] net: dsa: sja1105: Add support for ethtool port counters
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
                   ` (15 preceding siblings ...)
  2019-04-13  1:28 ` [PATCH v3 net-next 16/24] net: dsa: sja1105: Add support for VLAN operations Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  2019-04-13 20:53   ` Jiri Pirko
  2019-04-13  1:28 ` [PATCH v3 net-next 18/24] net: dsa: sja1105: Add support for traffic through standalone ports Vladimir Oltean
                   ` (6 subsequent siblings)
  23 siblings, 1 reply; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
---
Changes in v3:
None.

Changes in v2:
None functional. Moved the IS_ET() and IS_PQRS() device identification
macros here since they are not used in earlier patches.

 drivers/net/dsa/sja1105/Makefile              |   1 +
 drivers/net/dsa/sja1105/sja1105.h             |   7 +-
 drivers/net/dsa/sja1105/sja1105_ethtool.c     | 414 ++++++++++++++++++
 drivers/net/dsa/sja1105/sja1105_main.c        |   3 +
 .../net/dsa/sja1105/sja1105_static_config.h   |  21 +
 5 files changed, 445 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/dsa/sja1105/sja1105_ethtool.c

diff --git a/drivers/net/dsa/sja1105/Makefile b/drivers/net/dsa/sja1105/Makefile
index ed00840802f4..bb4404c79eb2 100644
--- a/drivers/net/dsa/sja1105/Makefile
+++ b/drivers/net/dsa/sja1105/Makefile
@@ -3,6 +3,7 @@ obj-$(CONFIG_NET_DSA_SJA1105) += sja1105.o
 sja1105-objs := \
     sja1105_spi.o \
     sja1105_main.o \
+    sja1105_ethtool.o \
     sja1105_clocking.o \
     sja1105_static_config.o \
     sja1105_dynamic_config.o \
diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
index 4c9df44a4478..80b20bdd8f9c 100644
--- a/drivers/net/dsa/sja1105/sja1105.h
+++ b/drivers/net/dsa/sja1105/sja1105.h
@@ -120,8 +120,13 @@ typedef enum {
 int sja1105_clocking_setup_port(struct sja1105_private *priv, int port);
 int sja1105_clocking_setup(struct sja1105_private *priv);
 
-/* From sja1105_dynamic_config.c */
+/* From sja1105_ethtool.c */
+void sja1105_get_ethtool_stats(struct dsa_switch *ds, int port, u64 *data);
+void sja1105_get_strings(struct dsa_switch *ds, int port,
+			 u32 stringset, u8 *data);
+int sja1105_get_sset_count(struct dsa_switch *ds, int port, int sset);
 
+/* From sja1105_dynamic_config.c */
 int sja1105_dynamic_config_read(struct sja1105_private *priv,
 				enum sja1105_blk_idx blk_idx,
 				int index, void *entry);
diff --git a/drivers/net/dsa/sja1105/sja1105_ethtool.c b/drivers/net/dsa/sja1105/sja1105_ethtool.c
new file mode 100644
index 000000000000..c082599702bd
--- /dev/null
+++ b/drivers/net/dsa/sja1105/sja1105_ethtool.c
@@ -0,0 +1,414 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2018-2019, Vladimir Oltean <olteanv@gmail.com>
+ */
+#include "sja1105.h"
+
+#define SIZE_MAC_AREA		(0x02 * 4)
+#define SIZE_HL1_AREA		(0x10 * 4)
+#define SIZE_HL2_AREA		(0x4 * 4)
+#define SIZE_QLEVEL_AREA	(0x8 * 4) /* 0x4 to 0xB */
+
+struct sja1105_port_status_mac {
+	u64 n_runt;
+	u64 n_soferr;
+	u64 n_alignerr;
+	u64 n_miierr;
+	u64 typeerr;
+	u64 sizeerr;
+	u64 tctimeout;
+	u64 priorerr;
+	u64 nomaster;
+	u64 memov;
+	u64 memerr;
+	u64 invtyp;
+	u64 intcyov;
+	u64 domerr;
+	u64 pcfbagdrop;
+	u64 spcprior;
+	u64 ageprior;
+	u64 portdrop;
+	u64 lendrop;
+	u64 bagdrop;
+	u64 policeerr;
+	u64 drpnona664err;
+	u64 spcerr;
+	u64 agedrp;
+};
+
+struct sja1105_port_status_hl1 {
+	u64 n_n664err;
+	u64 n_vlanerr;
+	u64 n_unreleased;
+	u64 n_sizeerr;
+	u64 n_crcerr;
+	u64 n_vlnotfound;
+	u64 n_ctpolerr;
+	u64 n_polerr;
+	u64 n_rxfrmsh;
+	u64 n_rxfrm;
+	u64 n_rxbytesh;
+	u64 n_rxbyte;
+	u64 n_txfrmsh;
+	u64 n_txfrm;
+	u64 n_txbytesh;
+	u64 n_txbyte;
+};
+
+struct sja1105_port_status_hl2 {
+	u64 n_qfull;
+	u64 n_part_drop;
+	u64 n_egr_disabled;
+	u64 n_not_reach;
+	u64 qlevel_hwm[8]; /* Only for P/Q/R/S */
+	u64 qlevel[8];     /* Only for P/Q/R/S */
+};
+
+struct sja1105_port_status {
+	struct sja1105_port_status_mac mac;
+	struct sja1105_port_status_hl1 hl1;
+	struct sja1105_port_status_hl2 hl2;
+};
+
+static void
+sja1105_port_status_mac_unpack(void *buf,
+			       struct sja1105_port_status_mac *status)
+{
+	/* Make pointer arithmetic work on 4 bytes */
+	u32 *p = (u32 *)buf;
+
+	sja1105_unpack(p + 0x0, &status->n_runt,       31, 24, 4);
+	sja1105_unpack(p + 0x0, &status->n_soferr,     23, 16, 4);
+	sja1105_unpack(p + 0x0, &status->n_alignerr,   15,  8, 4);
+	sja1105_unpack(p + 0x0, &status->n_miierr,      7,  0, 4);
+	sja1105_unpack(p + 0x1, &status->typeerr,      27, 27, 4);
+	sja1105_unpack(p + 0x1, &status->sizeerr,      26, 26, 4);
+	sja1105_unpack(p + 0x1, &status->tctimeout,    25, 25, 4);
+	sja1105_unpack(p + 0x1, &status->priorerr,     24, 24, 4);
+	sja1105_unpack(p + 0x1, &status->nomaster,     23, 23, 4);
+	sja1105_unpack(p + 0x1, &status->memov,        22, 22, 4);
+	sja1105_unpack(p + 0x1, &status->memerr,       21, 21, 4);
+	sja1105_unpack(p + 0x1, &status->invtyp,       19, 19, 4);
+	sja1105_unpack(p + 0x1, &status->intcyov,      18, 18, 4);
+	sja1105_unpack(p + 0x1, &status->domerr,       17, 17, 4);
+	sja1105_unpack(p + 0x1, &status->pcfbagdrop,   16, 16, 4);
+	sja1105_unpack(p + 0x1, &status->spcprior,     15, 12, 4);
+	sja1105_unpack(p + 0x1, &status->ageprior,     11,  8, 4);
+	sja1105_unpack(p + 0x1, &status->portdrop,      6,  6, 4);
+	sja1105_unpack(p + 0x1, &status->lendrop,       5,  5, 4);
+	sja1105_unpack(p + 0x1, &status->bagdrop,       4,  4, 4);
+	sja1105_unpack(p + 0x1, &status->policeerr,     3,  3, 4);
+	sja1105_unpack(p + 0x1, &status->drpnona664err, 2,  2, 4);
+	sja1105_unpack(p + 0x1, &status->spcerr,        1,  1, 4);
+	sja1105_unpack(p + 0x1, &status->agedrp,        0,  0, 4);
+}
+
+static void
+sja1105_port_status_hl1_unpack(void *buf,
+			       struct sja1105_port_status_hl1 *status)
+{
+	/* Make pointer arithmetic work on 4 bytes */
+	u32 *p = (u32 *)buf;
+
+	sja1105_unpack(p + 0xF, &status->n_n664err,    31,  0, 4);
+	sja1105_unpack(p + 0xE, &status->n_vlanerr,    31,  0, 4);
+	sja1105_unpack(p + 0xD, &status->n_unreleased, 31,  0, 4);
+	sja1105_unpack(p + 0xC, &status->n_sizeerr,    31,  0, 4);
+	sja1105_unpack(p + 0xB, &status->n_crcerr,     31,  0, 4);
+	sja1105_unpack(p + 0xA, &status->n_vlnotfound, 31,  0, 4);
+	sja1105_unpack(p + 0x9, &status->n_ctpolerr,   31,  0, 4);
+	sja1105_unpack(p + 0x8, &status->n_polerr,     31,  0, 4);
+	sja1105_unpack(p + 0x7, &status->n_rxfrmsh,    31,  0, 4);
+	sja1105_unpack(p + 0x6, &status->n_rxfrm,      31,  0, 4);
+	sja1105_unpack(p + 0x5, &status->n_rxbytesh,   31,  0, 4);
+	sja1105_unpack(p + 0x4, &status->n_rxbyte,     31,  0, 4);
+	sja1105_unpack(p + 0x3, &status->n_txfrmsh,    31,  0, 4);
+	sja1105_unpack(p + 0x2, &status->n_txfrm,      31,  0, 4);
+	sja1105_unpack(p + 0x1, &status->n_txbytesh,   31,  0, 4);
+	sja1105_unpack(p + 0x0, &status->n_txbyte,     31,  0, 4);
+	status->n_rxfrm  += status->n_rxfrmsh  << 32;
+	status->n_rxbyte += status->n_rxbytesh << 32;
+	status->n_txfrm  += status->n_txfrmsh  << 32;
+	status->n_txbyte += status->n_txbytesh << 32;
+}
+
+static void
+sja1105_port_status_hl2_unpack(void *buf,
+			       struct sja1105_port_status_hl2 *status)
+{
+	/* Make pointer arithmetic work on 4 bytes */
+	u32 *p = (u32 *)buf;
+
+	sja1105_unpack(p + 0x3, &status->n_qfull,        31,  0, 4);
+	sja1105_unpack(p + 0x2, &status->n_part_drop,    31,  0, 4);
+	sja1105_unpack(p + 0x1, &status->n_egr_disabled, 31,  0, 4);
+	sja1105_unpack(p + 0x0, &status->n_not_reach,    31,  0, 4);
+}
+
+static void
+sja1105pqrs_port_status_qlevel_unpack(void *buf,
+				      struct sja1105_port_status_hl2 *status)
+{
+	/* Make pointer arithmetic work on 4 bytes */
+	u32 *p = (u32 *)buf;
+	int i;
+
+	for (i = 0; i < 8; i++) {
+		sja1105_unpack(p + i, &status->qlevel_hwm[i], 24, 16, 4);
+		sja1105_unpack(p + i, &status->qlevel[i],      8,  0, 4);
+	}
+}
+
+static int sja1105_port_status_get_mac(struct sja1105_private *priv,
+				       struct sja1105_port_status_mac *status,
+				       int port)
+{
+	const struct sja1105_regs *regs = priv->info->regs;
+	u8 packed_buf[SIZE_MAC_AREA] = {0};
+	int rc;
+
+	/* MAC area */
+	rc = sja1105_spi_send_packed_buf(priv, SPI_READ, regs->mac[port],
+					 packed_buf, SIZE_MAC_AREA);
+	if (rc < 0)
+		return rc;
+
+	sja1105_port_status_mac_unpack(packed_buf, status);
+
+	return 0;
+}
+
+static int sja1105_port_status_get_hl1(struct sja1105_private *priv,
+				       struct sja1105_port_status_hl1 *status,
+				       int port)
+{
+	const struct sja1105_regs *regs = priv->info->regs;
+	u8 packed_buf[SIZE_HL1_AREA] = {0};
+	int rc;
+
+	rc = sja1105_spi_send_packed_buf(priv, SPI_READ, regs->mac_hl1[port],
+					 packed_buf, SIZE_HL1_AREA);
+	if (rc < 0)
+		return rc;
+
+	sja1105_port_status_hl1_unpack(packed_buf, status);
+
+	return 0;
+}
+
+static int sja1105_port_status_get_hl2(struct sja1105_private *priv,
+				       struct sja1105_port_status_hl2 *status,
+				       int port)
+{
+	const struct sja1105_regs *regs = priv->info->regs;
+	u8 packed_buf[SIZE_QLEVEL_AREA] = {0};
+	int rc;
+
+	rc = sja1105_spi_send_packed_buf(priv, SPI_READ, regs->mac_hl2[port],
+					 packed_buf, SIZE_HL2_AREA);
+	if (rc < 0)
+		return rc;
+
+	sja1105_port_status_hl2_unpack(packed_buf, status);
+
+	if (IS_ET(priv->info->device_id))
+		/* Code below is strictly P/Q/R/S specific. */
+		return 0;
+
+	rc = sja1105_spi_send_packed_buf(priv, SPI_READ, regs->qlevel[port],
+					 packed_buf, SIZE_QLEVEL_AREA);
+	if (rc < 0)
+		return rc;
+
+	sja1105pqrs_port_status_qlevel_unpack(packed_buf, status);
+
+	return 0;
+}
+
+static int sja1105_port_status_get(struct sja1105_private *priv,
+				   struct sja1105_port_status *status,
+				   int port)
+{
+	int rc;
+
+	rc = sja1105_port_status_get_mac(priv, &status->mac, port);
+	if (rc < 0)
+		return rc;
+	rc = sja1105_port_status_get_hl1(priv, &status->hl1, port);
+	if (rc < 0)
+		return rc;
+	rc = sja1105_port_status_get_hl2(priv, &status->hl2, port);
+	if (rc < 0)
+		return rc;
+
+	return 0;
+}
+
+static char sja1105_port_stats[][ETH_GSTRING_LEN] = {
+	/* MAC-Level Diagnostic Counters */
+	"n_runt",
+	"n_soferr",
+	"n_alignerr",
+	"n_miierr",
+	/* MAC-Level Diagnostic Flags */
+	"typeerr",
+	"sizeerr",
+	"tctimeout",
+	"priorerr",
+	"nomaster",
+	"memov",
+	"memerr",
+	"invtyp",
+	"intcyov",
+	"domerr",
+	"pcfbagdrop",
+	"spcprior",
+	"ageprior",
+	"portdrop",
+	"lendrop",
+	"bagdrop",
+	"policeerr",
+	"drpnona664err",
+	"spcerr",
+	"agedrp",
+	/* High-Level Diagnostic Counters */
+	"n_n664err",
+	"n_vlanerr",
+	"n_unreleased",
+	"n_sizeerr",
+	"n_crcerr",
+	"n_vlnotfound",
+	"n_ctpolerr",
+	"n_polerr",
+	"n_rxfrm",
+	"n_rxbyte",
+	"n_txfrm",
+	"n_txbyte",
+	"n_qfull",
+	"n_part_drop",
+	"n_egr_disabled",
+	"n_not_reach",
+};
+
+static char sja1105pqrs_extra_port_stats[][ETH_GSTRING_LEN] = {
+	/* Queue Levels */
+	"qlevel_hwm_0",
+	"qlevel_hwm_1",
+	"qlevel_hwm_2",
+	"qlevel_hwm_3",
+	"qlevel_hwm_4",
+	"qlevel_hwm_5",
+	"qlevel_hwm_6",
+	"qlevel_hwm_7",
+	"qlevel_0",
+	"qlevel_1",
+	"qlevel_2",
+	"qlevel_3",
+	"qlevel_4",
+	"qlevel_5",
+	"qlevel_6",
+	"qlevel_7",
+};
+
+void sja1105_get_ethtool_stats(struct dsa_switch *ds, int port, u64 *data)
+{
+	struct sja1105_private *priv = ds->priv;
+	struct sja1105_port_status status = {0};
+	int rc, i, k = 0;
+
+	rc = sja1105_port_status_get(priv, &status, port);
+	if (rc < 0) {
+		dev_err(ds->dev, "Failed to read port %d counters: %d\n",
+			port, rc);
+		return;
+	}
+	memset(data, 0, ARRAY_SIZE(sja1105_port_stats) * sizeof(u64));
+	data[k++] = status.mac.n_runt;
+	data[k++] = status.mac.n_soferr;
+	data[k++] = status.mac.n_alignerr;
+	data[k++] = status.mac.n_miierr;
+	data[k++] = status.mac.typeerr;
+	data[k++] = status.mac.sizeerr;
+	data[k++] = status.mac.tctimeout;
+	data[k++] = status.mac.priorerr;
+	data[k++] = status.mac.nomaster;
+	data[k++] = status.mac.memov;
+	data[k++] = status.mac.memerr;
+	data[k++] = status.mac.invtyp;
+	data[k++] = status.mac.intcyov;
+	data[k++] = status.mac.domerr;
+	data[k++] = status.mac.pcfbagdrop;
+	data[k++] = status.mac.spcprior;
+	data[k++] = status.mac.ageprior;
+	data[k++] = status.mac.portdrop;
+	data[k++] = status.mac.lendrop;
+	data[k++] = status.mac.bagdrop;
+	data[k++] = status.mac.policeerr;
+	data[k++] = status.mac.drpnona664err;
+	data[k++] = status.mac.spcerr;
+	data[k++] = status.mac.agedrp;
+	data[k++] = status.hl1.n_n664err;
+	data[k++] = status.hl1.n_vlanerr;
+	data[k++] = status.hl1.n_unreleased;
+	data[k++] = status.hl1.n_sizeerr;
+	data[k++] = status.hl1.n_crcerr;
+	data[k++] = status.hl1.n_vlnotfound;
+	data[k++] = status.hl1.n_ctpolerr;
+	data[k++] = status.hl1.n_polerr;
+	data[k++] = status.hl1.n_rxfrm;
+	data[k++] = status.hl1.n_rxbyte;
+	data[k++] = status.hl1.n_txfrm;
+	data[k++] = status.hl1.n_txbyte;
+	data[k++] = status.hl2.n_qfull;
+	data[k++] = status.hl2.n_part_drop;
+	data[k++] = status.hl2.n_egr_disabled;
+	data[k++] = status.hl2.n_not_reach;
+
+	if (!IS_PQRS(priv->info->device_id))
+		return;
+
+	memset(data + k, 0, ARRAY_SIZE(sja1105pqrs_extra_port_stats) *
+			sizeof(u64));
+	for (i = 0; i < 8; i++) {
+		data[k++] = status.hl2.qlevel_hwm[i];
+		data[k++] = status.hl2.qlevel[i];
+	}
+}
+
+void sja1105_get_strings(struct dsa_switch *ds, int port,
+			 u32 stringset, u8 *data)
+{
+	struct sja1105_private *priv = ds->priv;
+	u8 *p = data;
+	int i;
+
+	switch (stringset) {
+	case ETH_SS_STATS:
+		for (i = 0; i < ARRAY_SIZE(sja1105_port_stats); i++) {
+			strlcpy(p, sja1105_port_stats[i], ETH_GSTRING_LEN);
+			p += ETH_GSTRING_LEN;
+		}
+		if (!IS_PQRS(priv->info->device_id))
+			return;
+		for (i = 0; i < ARRAY_SIZE(sja1105pqrs_extra_port_stats); i++) {
+			strlcpy(p, sja1105pqrs_extra_port_stats[i],
+				ETH_GSTRING_LEN);
+			p += ETH_GSTRING_LEN;
+		}
+		break;
+	}
+}
+
+int sja1105_get_sset_count(struct dsa_switch *ds, int port, int sset)
+{
+	int count = ARRAY_SIZE(sja1105_port_stats);
+	struct sja1105_private *priv = ds->priv;
+
+	if (sset != ETH_SS_STATS)
+		return -EOPNOTSUPP;
+
+	if (IS_PQRS(priv->info->device_id))
+		count += ARRAY_SIZE(sja1105pqrs_extra_port_stats);
+
+	return count;
+}
+
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index a0851d1c9d89..9d28436fe466 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -1240,6 +1240,9 @@ static const struct dsa_switch_ops sja1105_switch_ops = {
 	.get_tag_protocol	= sja1105_get_tag_protocol,
 	.setup			= sja1105_setup,
 	.adjust_link		= sja1105_adjust_link,
+	.get_strings		= sja1105_get_strings,
+	.get_ethtool_stats	= sja1105_get_ethtool_stats,
+	.get_sset_count		= sja1105_get_sset_count,
 	.port_fdb_dump		= sja1105_fdb_dump,
 	.port_fdb_add		= sja1105_fdb_add,
 	.port_fdb_del		= sja1105_fdb_del,
diff --git a/drivers/net/dsa/sja1105/sja1105_static_config.h b/drivers/net/dsa/sja1105/sja1105_static_config.h
index c71ef9d366db..395d817e5fed 100644
--- a/drivers/net/dsa/sja1105/sja1105_static_config.h
+++ b/drivers/net/dsa/sja1105/sja1105_static_config.h
@@ -78,6 +78,27 @@ enum sja1105_blk_idx {
 #define SJA1105R_PART_NO			0x9A86
 #define SJA1105S_PART_NO			0x9A87
 
+#define IS_PQRS(device_id) \
+	(((device_id) == SJA1105PR_DEVICE_ID) || \
+	 ((device_id) == SJA1105QS_DEVICE_ID))
+#define IS_ET(device_id) \
+	(((device_id) == SJA1105E_DEVICE_ID) || \
+	 ((device_id) == SJA1105T_DEVICE_ID))
+/* P and R have the same Device ID, and differ by Part Number */
+#define IS_P(device_id, part_nr) \
+	(((device_id) == SJA1105PR_DEVICE_ID) && \
+	 ((part_nr) == SJA1105P_PART_NO))
+#define IS_R(device_id, part_nr) \
+	(((device_id) == SJA1105PR_DEVICE_ID) && \
+	 ((part_nr) == SJA1105R_PART_NO))
+/* Same for Q and S */
+#define IS_Q(device_id, part_nr) \
+	(((device_id) == SJA1105QS_DEVICE_ID) && \
+	 ((part_nr) == SJA1105Q_PART_NO))
+#define IS_S(device_id, part_nr) \
+	(((device_id) == SJA1105QS_DEVICE_ID) && \
+	 ((part_nr) == SJA1105S_PART_NO))
+
 struct sja1105_general_params_entry {
 	u64 vllupformat;
 	u64 mirr_ptacu;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v3 net-next 18/24] net: dsa: sja1105: Add support for traffic through standalone ports
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
                   ` (16 preceding siblings ...)
  2019-04-13  1:28 ` [PATCH v3 net-next 17/24] net: dsa: sja1105: Add support for ethtool port counters Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  2019-04-13 16:37   ` Andrew Lunn
  2019-04-13  1:28 ` [PATCH v3 net-next 19/24] net: dsa: sja1105: Add support for Spanning Tree Protocol Vladimir Oltean
                   ` (5 subsequent siblings)
  23 siblings, 1 reply; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

In order to support this, we are creating a makeshift switch tag out of
a VLAN trunk configured on the CPU port. Termination of normal traffic
on switch ports only works when not under a vlan_filtering bridge.
Termination of management (PTP, BPDU) traffic works under all
circumstances because it uses a different tagging mechanism
(incl_srcpt). We are making use of the generic CONFIG_NET_DSA_TAG_8021Q
code and leveraging it from our own CONFIG_NET_DSA_TAG_SJA1105.
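
As a rough illustration of the tag arithmetic this makeshift scheme relies
on (the exact dsa_8021q VID layout lives in the tagging patches; the bit
positions below are just the standard 802.1Q TCI fields, and the function
names are illustrative, not the driver's):

```c
#include <assert.h>
#include <stdint.h>

#define VLAN_PRIO_SHIFT	13
#define VLAN_PRIO_MASK	0xe000
#define VLAN_VID_MASK	0x0fff

/* Compose the 16-bit TCI that the xmit path pushes: the skb priority
 * becomes the PCP field and the per-port tx_vid becomes the VID.
 */
static uint16_t tci_compose(uint8_t pcp, uint16_t tx_vid)
{
	return (uint16_t)((pcp << VLAN_PRIO_SHIFT) | (tx_vid & VLAN_VID_MASK));
}

/* On receive, the tagger splits the TCI back apart: the PCP field feeds
 * skb->priority, and the VID encodes the source port information.
 */
static uint8_t tci_to_pcp(uint16_t tci)
{
	return (uint8_t)((tci & VLAN_PRIO_MASK) >> VLAN_PRIO_SHIFT);
}

static uint16_t tci_to_vid(uint16_t tci)
{
	return tci & VLAN_VID_MASK;
}
```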

There are two types of traffic: regular and link-local.
The link-local traffic received on the CPU port is trapped from the
switch's regular forwarding decisions because it matched one of the two
DMAC filters for management traffic.
On transmission, the switch requires special massaging for these
link-local frames. Due to a weird implementation of the switching IP, by
default it drops link-local frames that originate on the CPU port. It
needs to be told where to forward them to, through an SPI command
("management route") that is valid for only a single frame.
So when we're sending link-local traffic, we need to clone skb's from
DSA and send them in our custom xmit worker that also performs SPI access.

For that purpose, the DSA xmit handler and the xmit worker communicate
through a per-port "skb ring" software structure, with a producer and a
consumer index. At the moment this structure is rather fragile
(ping-flooding to a link-local DMAC would cause most of the frames to
get dropped). I would like to move the management traffic on a separate
netdev queue that I can stop when the skb ring got full and hardware is
busy processing, so that we are not forced to drop traffic.
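
The drop behaviour described above follows directly from the ring being
bounded: once count reaches the ring size, the producer gives up instead
of blocking. A userspace sketch of that producer/consumer logic, with
integer tokens standing in for skb pointers (the ring size mirrors the
patch; everything else is illustration only):

```c
#include <assert.h>

#define RING_SIZE 20

struct ring {
	int slot[RING_SIZE];
	int count;
	int pi; /* Producer index */
	int ci; /* Consumer index */
};

/* Returns the slot used, or -1 when the ring is full (frame dropped). */
static int ring_add(struct ring *r, int token)
{
	int index;

	if (r->count == RING_SIZE)
		return -1;

	index = r->pi;
	r->slot[index] = token;
	r->pi = (index + 1) % RING_SIZE;
	r->count++;
	return index;
}

/* Returns the slot consumed, or -1 when the ring is empty. */
static int ring_get(struct ring *r, int *token)
{
	int index;

	if (r->count == 0)
		return -1;

	index = r->ci;
	*token = r->slot[index];
	r->ci = (index + 1) % RING_SIZE;
	r->count--;
	return index;
}
```

A burst of more than RING_SIZE frames before the worker runs is exactly
the ping-flood scenario: the extra frames get -1 back and are freed.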

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
---
Changes in v3:
Made management traffic receivable on the DSA netdevices even when
switch tagging is disabled, and regular traffic receivable on the
master netdevice in the same scenario. Both are accomplished using
the sja1105_filter() function and some small touch-ups in the .rcv
callback.

Changes in v2:
Fixed a mistake with a missing ntohs(), which was not caught because
the previously used EtherType (0xDADA) was endian-agnostic.

 drivers/net/dsa/sja1105/sja1105.h      |   8 ++
 drivers/net/dsa/sja1105/sja1105_main.c | 125 ++++++++++++++++++++-
 include/linux/dsa/sja1105.h            |  52 +++++++++
 include/net/dsa.h                      |   1 +
 net/dsa/Kconfig                        |   3 +
 net/dsa/Makefile                       |   1 +
 net/dsa/dsa.c                          |   6 +
 net/dsa/dsa_priv.h                     |   3 +
 net/dsa/tag_sja1105.c                  | 148 +++++++++++++++++++++++++
 9 files changed, 345 insertions(+), 2 deletions(-)
 create mode 100644 include/linux/dsa/sja1105.h
 create mode 100644 net/dsa/tag_sja1105.c

diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
index 80b20bdd8f9c..b7e745c0bb3a 100644
--- a/drivers/net/dsa/sja1105/sja1105.h
+++ b/drivers/net/dsa/sja1105/sja1105.h
@@ -5,6 +5,7 @@
 #ifndef _SJA1105_H
 #define _SJA1105_H
 
+#include <linux/dsa/sja1105.h>
 #include <net/dsa.h>
 #include "sja1105_static_config.h"
 
@@ -19,6 +20,12 @@
 #define SJA1105_NUM_TC			8
 #define SJA1105ET_FDB_BIN_SIZE		4
 
+struct sja1105_port {
+	struct dsa_port *dp;
+	struct work_struct xmit_work;
+	struct sja1105_skb_ring xmit_ring;
+};
+
 /* Keeps the different addresses between E/T and P/Q/R/S */
 struct sja1105_regs {
 	u64 device_id;
@@ -63,6 +70,7 @@ struct sja1105_private {
 	struct gpio_desc *reset_gpio;
 	struct spi_device *spidev;
 	struct dsa_switch *ds;
+	struct sja1105_port ports[SJA1105_NUM_PORTS];
 };
 
 #include "sja1105_dynamic_config.h"
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index 9d28436fe466..018044f358fd 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -1112,10 +1112,27 @@ static int sja1105_vlan_apply(struct sja1105_private *priv, int port, u16 vid,
 	return 0;
 }
 
+static int sja1105_setup_8021q_tagging(struct dsa_switch *ds, bool enabled)
+{
+	int rc, i;
+
+	for (i = 0; i < SJA1105_NUM_PORTS; i++) {
+		rc = dsa_port_setup_8021q_tagging(ds, i, enabled);
+		if (rc < 0) {
+			dev_err(ds->dev, "Failed to setup VLAN tagging for port %d: %d\n",
+				i, rc);
+			return rc;
+		}
+	}
+	dev_info(ds->dev, "%s switch tagging\n",
+		 enabled ? "Enabled" : "Disabled");
+	return 0;
+}
+
 static enum dsa_tag_protocol
 sja1105_get_tag_protocol(struct dsa_switch *ds, int port)
 {
-	return DSA_TAG_PROTO_NONE;
+	return DSA_TAG_PROTO_SJA1105;
 }
 
 /* This callback needs to be present */
@@ -1141,7 +1158,11 @@ static int sja1105_vlan_filtering(struct dsa_switch *ds, int port, bool enabled)
 	if (rc)
 		dev_err(ds->dev, "Failed to change VLAN Ethertype\n");
 
-	return rc;
+	/* Switch port identification based on 802.1Q is only passable
+	 * if we are not under a vlan_filtering bridge. So make sure
+	 * the two configurations are mutually exclusive.
+	 */
+	return sja1105_setup_8021q_tagging(ds, !enabled);
 }
 
 static void sja1105_vlan_add(struct dsa_switch *ds, int port,
@@ -1233,9 +1254,107 @@ static int sja1105_setup(struct dsa_switch *ds)
 	 */
 	ds->vlan_filtering_is_global = true;
 
+	/* The DSA/switchdev model brings up switch ports in standalone mode by
+	 * default, and that means vlan_filtering is 0 since they're not under
+	 * a bridge, so it's safe to set up switch tagging at this time.
+	 */
+	return sja1105_setup_8021q_tagging(ds, true);
+}
+
+#include "../../../net/dsa/dsa_priv.h"
+/* Deferred work is unfortunately necessary because setting up the management
+ * route cannot be done from atomic context (SPI transfer takes a sleepable
+ * lock on the bus)
+ */
+static void sja1105_xmit_work_handler(struct work_struct *work)
+{
+	struct sja1105_port *sp = container_of(work, struct sja1105_port,
+						xmit_work);
+	struct sja1105_private *priv = sp->dp->ds->priv;
+	struct net_device *slave = sp->dp->slave;
+	struct net_device *master = dsa_slave_to_master(slave);
+	int port = (uintptr_t)(sp - priv->ports);
+	struct sk_buff *skb;
+	int i, rc;
+
+	while ((i = sja1105_skb_ring_get(&sp->xmit_ring, &skb)) >= 0) {
+		struct sja1105_mgmt_entry mgmt_route = { 0 };
+		struct ethhdr *hdr;
+		int timeout = 10;
+		int skb_len;
+
+		skb_len = skb->len;
+		hdr = eth_hdr(skb);
+
+		mgmt_route.macaddr = ether_addr_to_u64(hdr->h_dest);
+		mgmt_route.destports = BIT(port);
+		mgmt_route.enfport = 1;
+		mgmt_route.tsreg = 0;
+		mgmt_route.takets = false;
+
+		rc = sja1105_dynamic_config_write(priv, BLK_IDX_MGMT_ROUTE,
+						  port, &mgmt_route, true);
+		if (rc < 0) {
+			kfree_skb(skb);
+			slave->stats.tx_dropped++;
+			continue;
+		}
+
+		/* Transfer skb to the host port. */
+		skb->dev = master;
+		dev_queue_xmit(skb);
+
+		/* Wait until the switch has processed the frame */
+		do {
+			rc = sja1105_dynamic_config_read(priv, BLK_IDX_MGMT_ROUTE,
+							 port, &mgmt_route);
+			if (rc < 0) {
+				slave->stats.tx_errors++;
+				dev_err(priv->ds->dev,
+					"xmit: failed to poll for mgmt route\n");
+				continue;
+			}
+
+			/* UM10944: The ENFPORT flag of the respective entry is
+			 * cleared when a match is found. The host can use this
+			 * flag as an acknowledgment.
+			 */
+			cpu_relax();
+		} while (mgmt_route.enfport && --timeout);
+
+		if (!timeout) {
+			dev_err(priv->ds->dev, "xmit timed out\n");
+			slave->stats.tx_errors++;
+			continue;
+		}
+
+		slave->stats.tx_packets++;
+		slave->stats.tx_bytes += skb_len;
+	}
+}
+
+static int sja1105_port_enable(struct dsa_switch *ds, int port,
+			       struct phy_device *phydev)
+{
+	struct sja1105_private *priv = ds->priv;
+	struct sja1105_port *sp = &priv->ports[port];
+
+	sp->dp = &ds->ports[port];
+	INIT_WORK(&sp->xmit_work, sja1105_xmit_work_handler);
 	return 0;
 }
 
+static void sja1105_port_disable(struct dsa_switch *ds, int port)
+{
+	struct sja1105_private *priv = ds->priv;
+	struct sja1105_port *sp = &priv->ports[port];
+	struct sk_buff *skb;
+
+	cancel_work_sync(&sp->xmit_work);
+	while (sja1105_skb_ring_get(&sp->xmit_ring, &skb) >= 0)
+		kfree_skb(skb);
+}
+
 static const struct dsa_switch_ops sja1105_switch_ops = {
 	.get_tag_protocol	= sja1105_get_tag_protocol,
 	.setup			= sja1105_setup,
@@ -1255,6 +1374,8 @@ static const struct dsa_switch_ops sja1105_switch_ops = {
 	.port_mdb_prepare	= sja1105_mdb_prepare,
 	.port_mdb_add		= sja1105_mdb_add,
 	.port_mdb_del		= sja1105_mdb_del,
+	.port_enable		= sja1105_port_enable,
+	.port_disable		= sja1105_port_disable,
 };
 
 static int sja1105_check_device_id(struct sja1105_private *priv)
diff --git a/include/linux/dsa/sja1105.h b/include/linux/dsa/sja1105.h
new file mode 100644
index 000000000000..dd40ffa475cc
--- /dev/null
+++ b/include/linux/dsa/sja1105.h
@@ -0,0 +1,52 @@
+/* SPDX-License-Identifier: GPL-2.0
+ * Copyright (c) 2019, Vladimir Oltean <olteanv@gmail.com>
+ */
+
+/* Included by drivers/net/dsa/sja1105/sja1105.h and net/dsa/tag_sja1105.c */
+
+#ifndef _NET_DSA_SJA1105_H
+#define _NET_DSA_SJA1105_H
+
+#include <linux/skbuff.h>
+#include <net/dsa.h>
+
+#define SJA1105_SKB_RING_SIZE	20
+
+struct sja1105_skb_ring {
+	struct sk_buff *skb[SJA1105_SKB_RING_SIZE];
+	int count;
+	int pi; /* Producer index */
+	int ci; /* Consumer index */
+};
+
+static inline int sja1105_skb_ring_add(struct sja1105_skb_ring *ring,
+				       struct sk_buff *skb)
+{
+	int index;
+
+	if (ring->count == SJA1105_SKB_RING_SIZE)
+		return -1;
+
+	index = ring->pi;
+	ring->skb[index] = skb;
+	ring->pi = (index + 1) % SJA1105_SKB_RING_SIZE;
+	ring->count++;
+	return index;
+}
+
+static inline int sja1105_skb_ring_get(struct sja1105_skb_ring *ring,
+				       struct sk_buff **skb)
+{
+	int index;
+
+	if (ring->count == 0)
+		return -1;
+
+	index = ring->ci;
+	*skb = ring->skb[index];
+	ring->ci = (index + 1) % SJA1105_SKB_RING_SIZE;
+	ring->count--;
+	return index;
+}
+
+#endif /* _NET_DSA_SJA1105_H */
diff --git a/include/net/dsa.h b/include/net/dsa.h
index e46c107507d8..9c7e29c4f22a 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -41,6 +41,7 @@ enum dsa_tag_protocol {
 	DSA_TAG_PROTO_KSZ9893,
 	DSA_TAG_PROTO_LAN9303,
 	DSA_TAG_PROTO_MTK,
+	DSA_TAG_PROTO_SJA1105,
 	DSA_TAG_PROTO_QCA,
 	DSA_TAG_PROTO_TRAILER,
 	DSA_TAG_LAST,		/* MUST BE LAST */
diff --git a/net/dsa/Kconfig b/net/dsa/Kconfig
index b2fc07de8bcb..18deab52890f 100644
--- a/net/dsa/Kconfig
+++ b/net/dsa/Kconfig
@@ -65,6 +65,9 @@ config NET_DSA_TAG_LAN9303
 config NET_DSA_TAG_MTK
 	bool
 
+config NET_DSA_TAG_SJA1105
+	bool
+
 config NET_DSA_TAG_TRAILER
 	bool
 
diff --git a/net/dsa/Makefile b/net/dsa/Makefile
index d7fc3253d497..8c294cdb895a 100644
--- a/net/dsa/Makefile
+++ b/net/dsa/Makefile
@@ -15,4 +15,5 @@ dsa_core-$(CONFIG_NET_DSA_TAG_KSZ) += tag_ksz.o
 dsa_core-$(CONFIG_NET_DSA_TAG_LAN9303) += tag_lan9303.o
 dsa_core-$(CONFIG_NET_DSA_TAG_MTK) += tag_mtk.o
 dsa_core-$(CONFIG_NET_DSA_TAG_QCA) += tag_qca.o
+dsa_core-$(CONFIG_NET_DSA_TAG_SJA1105) += tag_sja1105.o
 dsa_core-$(CONFIG_NET_DSA_TAG_TRAILER) += tag_trailer.o
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 36de4f2a3366..7e2542d756e0 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -65,6 +65,9 @@ const struct dsa_device_ops *dsa_device_ops[DSA_TAG_LAST] = {
 #ifdef CONFIG_NET_DSA_TAG_MTK
 	[DSA_TAG_PROTO_MTK] = &mtk_netdev_ops,
 #endif
+#ifdef CONFIG_NET_DSA_TAG_SJA1105
+	[DSA_TAG_PROTO_SJA1105] = &sja1105_netdev_ops,
+#endif
 #ifdef CONFIG_NET_DSA_TAG_QCA
 	[DSA_TAG_PROTO_QCA] = &qca_netdev_ops,
 #endif
@@ -102,6 +105,9 @@ const char *dsa_tag_protocol_to_str(const struct dsa_device_ops *ops)
 #ifdef CONFIG_NET_DSA_TAG_MTK
 		[DSA_TAG_PROTO_MTK] = "mtk",
 #endif
+#ifdef CONFIG_NET_DSA_TAG_SJA1105
+		[DSA_TAG_PROTO_SJA1105] = "8021q",
+#endif
 #ifdef CONFIG_NET_DSA_TAG_QCA
 		[DSA_TAG_PROTO_QCA] = "qca",
 #endif
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index cc5ec3759952..d58a749c8b5b 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -236,6 +236,9 @@ extern const struct dsa_device_ops lan9303_netdev_ops;
 /* tag_mtk.c */
 extern const struct dsa_device_ops mtk_netdev_ops;
 
+/* tag_sja1105.c */
+extern const struct dsa_device_ops sja1105_netdev_ops;
+
 /* tag_qca.c */
 extern const struct dsa_device_ops qca_netdev_ops;
 
diff --git a/net/dsa/tag_sja1105.c b/net/dsa/tag_sja1105.c
new file mode 100644
index 000000000000..5c76a06c9093
--- /dev/null
+++ b/net/dsa/tag_sja1105.c
@@ -0,0 +1,148 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2019, Vladimir Oltean <olteanv@gmail.com>
+ */
+#include <linux/etherdevice.h>
+#include <linux/if_vlan.h>
+#include <linux/dsa/sja1105.h>
+#include "../../drivers/net/dsa/sja1105/sja1105.h"
+
+#include "dsa_priv.h"
+
+/* Similar to is_link_local_ether_addr(hdr->h_dest) but also covers PTP */
+static inline bool sja1105_is_link_local(const struct sk_buff *skb)
+{
+	const struct ethhdr *hdr = eth_hdr(skb);
+	u64 dmac = ether_addr_to_u64(hdr->h_dest);
+
+	if ((dmac & SJA1105_LINKLOCAL_FILTER_A_MASK) ==
+		    SJA1105_LINKLOCAL_FILTER_A)
+		return true;
+	if ((dmac & SJA1105_LINKLOCAL_FILTER_B_MASK) ==
+		    SJA1105_LINKLOCAL_FILTER_B)
+		return true;
+	return false;
+}
+
+static bool sja1105_filter(const struct sk_buff *skb, struct net_device *dev)
+{
+	if (sja1105_is_link_local(skb))
+		return true;
+	if (!dev->dsa_ptr->vlan_filtering)
+		return true;
+	return false;
+}
+
+static struct sk_buff *sja1105_xmit(struct sk_buff *skb,
+				    struct net_device *netdev)
+{
+	struct dsa_port *dp = dsa_slave_to_port(netdev);
+	struct dsa_switch *ds = dp->ds;
+	struct sja1105_private *priv = ds->priv;
+	struct sja1105_port *sp = &priv->ports[dp->index];
+	struct sk_buff *clone;
+
+	if (likely(!sja1105_is_link_local(skb))) {
+		/* Normal traffic path. */
+		u16 tx_vid = dsa_tagging_tx_vid(ds, dp->index);
+		u8 pcp = skb->priority;
+
+		/* If we are under a vlan_filtering bridge, IP termination on
+		 * switch ports based on 802.1Q tags is simply too brittle to
+		 * be passable. So just defer to the dsa_slave_notag_xmit
+		 * implementation.
+		 */
+		if (dp->vlan_filtering)
+			return skb;
+
+		return dsa_8021q_xmit(skb, netdev, ETH_P_EDSA,
+				     ((pcp << VLAN_PRIO_SHIFT) | tx_vid));
+	}
+
+	/* Code path for transmitting management traffic. This does not rely
+	 * upon switch tagging, but instead SPI-installed management routes.
+	 */
+	clone = skb_clone(skb, GFP_ATOMIC);
+	if (!clone) {
+		dev_err(ds->dev, "xmit: failed to clone skb\n");
+		return NULL;
+	}
+
+	if (sja1105_skb_ring_add(&sp->xmit_ring, clone) < 0) {
+		dev_err(ds->dev, "xmit: skb ring full\n");
+		kfree_skb(clone);
+		return NULL;
+	}
+
+	if (sp->xmit_ring.count == SJA1105_SKB_RING_SIZE)
+		/* TODO setup a dedicated netdev queue for management traffic
+		 * so that we can selectively apply backpressure and not be
+		 * required to stop the entire traffic when the software skb
+		 * ring is full. This requires hooking the ndo_select_queue
+		 * from DSA and matching on mac_fltres.
+		 */
+		dev_err(ds->dev, "xmit: reached maximum skb ring size\n");
+
+	schedule_work(&sp->xmit_work);
+	/* Let DSA free its reference to the skb and we will free
+	 * the clone in the deferred worker
+	 */
+	return NULL;
+}
+
+static struct sk_buff *sja1105_rcv(struct sk_buff *skb,
+				   struct net_device *netdev,
+				   struct packet_type *pt)
+{
+	unsigned int source_port, switch_id;
+	struct ethhdr *hdr = eth_hdr(skb);
+	struct sk_buff *nskb;
+	u16 tpid, vid, tci;
+	bool is_tagged;
+
+	nskb = dsa_8021q_rcv(skb, netdev, pt, &tpid, &tci);
+	is_tagged = (nskb && tpid == ETH_P_EDSA);
+
+	skb->priority = (tci & VLAN_PRIO_MASK) >> VLAN_PRIO_SHIFT;
+	vid = tci & VLAN_VID_MASK;
+
+	skb->offload_fwd_mark = 1;
+
+	if (likely(!sja1105_is_link_local(skb))) {
+		/* Normal traffic path. */
+		source_port = dsa_tagging_rx_source_port(vid);
+		switch_id = dsa_tagging_rx_switch_id(vid);
+	} else {
+		/* Management traffic path. Switch embeds the switch ID and
+		 * port ID into bytes of the destination MAC, courtesy of
+		 * the incl_srcpt options.
+		 */
+		source_port = hdr->h_dest[3];
+		switch_id = hdr->h_dest[4];
+		/* Clear the DMAC bytes that were mangled by the switch */
+		hdr->h_dest[3] = 0;
+		hdr->h_dest[4] = 0;
+	}
+
+	skb->dev = dsa_master_find_slave(netdev, switch_id, source_port);
+	if (!skb->dev) {
+		netdev_warn(netdev, "Couldn't decode source port\n");
+		return NULL;
+	}
+
+	/* Delete/overwrite fake VLAN header, DSA expects to not find
+	 * it there, see dsa_switch_rcv: skb_push(skb, ETH_HLEN).
+	 */
+	if (is_tagged)
+		memmove(skb->data - ETH_HLEN, skb->data - ETH_HLEN - VLAN_HLEN,
+			ETH_HLEN - VLAN_HLEN);
+
+	return skb;
+}
+
+const struct dsa_device_ops sja1105_netdev_ops = {
+	.xmit = sja1105_xmit,
+	.rcv = sja1105_rcv,
+	.filter = sja1105_filter,
+	.overhead = VLAN_HLEN,
+};
+
-- 
2.17.1



* [PATCH v3 net-next 19/24] net: dsa: sja1105: Add support for Spanning Tree Protocol
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
                   ` (17 preceding siblings ...)
  2019-04-13  1:28 ` [PATCH v3 net-next 18/24] net: dsa: sja1105: Add support for traffic through standalone ports Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  2019-04-13 16:41   ` Andrew Lunn
  2019-04-13  1:28 ` [PATCH v3 net-next 20/24] net: dsa: sja1105: Error out if RGMII delays are requested in DT Vladimir Oltean
                   ` (4 subsequent siblings)
  23 siblings, 1 reply; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

While not explicitly documented as supported in UM10944, compliance with
the STP states can be obtained by manipulating 3 settings at the
(per-port) MAC config level: dynamic learning, inhibiting reception of
regular traffic, and inhibiting transmission of regular traffic.

In all these modes, transmission and reception of special BPDU frames
from the stack is still enabled (not inhibited by the MAC-level
settings).

On ingress, BPDUs are classified by the MAC filter as link-local
(01-80-C2-00-00-00) and forwarded to the CPU port.  This mechanism works
under all conditions (even without the custom 802.1Q tagging) because
the switch hardware inserts the source port and switch ID into bytes 4
and 5 of the MAC-filtered frames. Then the DSA .rcv handler needs to put
back zeroes into the MAC address after decoding the source port
information.
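
That receive-side fixup amounts to reading the two mangled DMAC bytes and
then restoring them, roughly as follows (indices as in tag_sja1105.c,
where h_dest[3] carries the source port and h_dest[4] the switch ID; the
function name is illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* Recover (source port, switch ID) from a MAC-filtered frame's DMAC,
 * then zero the bytes the switch overwrote so the stack sees the
 * original 01-80-C2-00-00-00 link-local address again.
 */
static void decode_incl_srcpt(uint8_t dmac[6], unsigned int *source_port,
			      unsigned int *switch_id)
{
	*source_port = dmac[3];
	*switch_id = dmac[4];
	dmac[3] = 0;
	dmac[4] = 0;
}
```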

On egress, BPDUs are transmitted using management routes from the xmit
worker thread. Again this does not require switch tagging, as the switch
port is programmed through SPI to hold a temporary (single-fire) route
for a frame with the programmed destination MAC (01-80-C2-00-00-00).

STP is activated using the following commands and was tested by
connecting two front-panel ports together and noticing that switching
loops were prevented (one port remains in the blocking state):

$ ip link add name br0 type bridge stp_state 1 && ip link set br0 up
$ for eth in $(ls /sys/devices/platform/soc/2100000.spi/spi_master/spi0/spi0.1/net/);
  do ip link set ${eth} master br0 && ip link set ${eth} up; done
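
The three per-port knobs combine per STP state as sketched below (a
standalone illustration of the mapping the patch implements; the enum
values are stand-ins for the kernel's BR_STATE_* constants):

```c
#include <assert.h>
#include <stdbool.h>

enum stp_state { ST_DISABLED, ST_BLOCKING, ST_LISTENING, ST_LEARNING,
		 ST_FORWARDING };

struct mac_io {
	bool ingress;	/* accept regular traffic */
	bool egress;	/* transmit regular traffic */
	bool dyn_learn;	/* learn source addresses dynamically */
};

/* DISABLED and BLOCKING are treated identically; BPDUs still reach the
 * CPU in those states because management traffic is not inhibited by
 * the INGRESS flag (per UM10944).
 */
static struct mac_io stp_to_mac(enum stp_state state)
{
	struct mac_io io = { false, false, false };

	switch (state) {
	case ST_LISTENING:
		io.ingress = true;
		break;
	case ST_LEARNING:
		io.ingress = true;
		io.dyn_learn = true;
		break;
	case ST_FORWARDING:
		io.ingress = true;
		io.egress = true;
		io.dyn_learn = true;
		break;
	default:
		break;
	}
	return io;
}
```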

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
Changes in v3:
Added verbiage to the commit message.

Changes in v2:
None.

 drivers/net/dsa/sja1105/sja1105_main.c | 108 ++++++++++++++++++++++---
 1 file changed, 99 insertions(+), 9 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index 018044f358fd..e4abf8fb2013 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -94,8 +94,10 @@ static int sja1105_init_mac_settings(struct sja1105_private *priv)
 		.drpuntag = false,
 		/* Don't retag 802.1p (VID 0) traffic with the pvid */
 		.retag = false,
-		/* Enable learning and I/O on user ports by default. */
-		.dyn_learn = true,
+		/* Disable learning and I/O on user ports by default -
+		 * STP will enable it.
+		 */
+		.dyn_learn = false,
 		.egress = false,
 		.ingress = false,
 		.mirrcie = 0,
@@ -126,8 +128,17 @@ static int sja1105_init_mac_settings(struct sja1105_private *priv)
 
 	mac = table->entries;
 
-	for (i = 0; i < SJA1105_NUM_PORTS; i++)
+	for (i = 0; i < SJA1105_NUM_PORTS; i++) {
 		mac[i] = default_mac;
+		if (i == dsa_upstream_port(priv->ds, i)) {
+			/* STP doesn't get called for CPU port, so we need to
+			 * set the I/O parameters statically.
+			 */
+			mac[i].dyn_learn = true;
+			mac[i].ingress = true;
+			mac[i].egress = true;
+		}
+	}
 
 	return 0;
 }
@@ -642,12 +653,14 @@ static sja1105_speed_t sja1105_get_speed_cfg(unsigned int speed_mbps)
  * for a specific port.
  *
  * @speed_mbps: If 0, leave the speed unchanged, else adapt MAC to PHY speed.
- * @enabled: Manage Rx and Tx settings for this port. Overrides the static
- *	     configuration settings.
+ * @enabled: Manage Rx and Tx settings for this port. If false, overrides the
+ *	     settings from the STP state, but not persistently (does not
+ *	     overwrite the static MAC info for this port).
  */
 static int sja1105_adjust_port_config(struct sja1105_private *priv, int port,
 				      int speed_mbps, bool enabled)
 {
+	struct sja1105_mac_config_entry dyn_mac;
 	struct sja1105_xmii_params_entry *mii;
 	struct sja1105_mac_config_entry *mac;
 	struct device *dev = priv->ds->dev;
@@ -680,12 +693,13 @@ static int sja1105_adjust_port_config(struct sja1105_private *priv, int port,
 	 * the code common, we'll use the static configuration tables as a
 	 * reasonable approximation for both E/T and P/Q/R/S.
 	 */
-	mac[port].ingress = enabled;
-	mac[port].egress  = enabled;
+	dyn_mac = mac[port];
+	dyn_mac.ingress = enabled && mac[port].ingress;
+	dyn_mac.egress  = enabled && mac[port].egress;
 
 	/* Write to the dynamic reconfiguration tables */
 	rc = sja1105_dynamic_config_write(priv, BLK_IDX_MAC_CONFIG,
-					  port, &mac[port], true);
+					  port, &dyn_mac, true);
 	if (rc < 0) {
 		dev_err(dev, "Failed to write MAC config: %d\n", rc);
 		return rc;
@@ -943,6 +957,50 @@ static int sja1105_bridge_member(struct dsa_switch *ds, int port,
 					    port, &l2_fwd[port], true);
 }
 
+static void sja1105_bridge_stp_state_set(struct dsa_switch *ds, int port,
+					 u8 state)
+{
+	struct sja1105_private *priv = ds->priv;
+	struct sja1105_mac_config_entry *mac;
+
+	mac = priv->static_config.tables[BLK_IDX_MAC_CONFIG].entries;
+
+	switch (state) {
+	case BR_STATE_DISABLED:
+	case BR_STATE_BLOCKING:
+		/* From UM10944 description of DRPDTAG (why put this there?):
+		 * "Management traffic flows to the port regardless of the state
+		 * of the INGRESS flag". So BPDUs will still be allowed to pass.
+		 * At the moment no difference between DISABLED and BLOCKING.
+		 */
+		mac[port].ingress   = false;
+		mac[port].egress    = false;
+		mac[port].dyn_learn = false;
+		break;
+	case BR_STATE_LISTENING:
+		mac[port].ingress   = true;
+		mac[port].egress    = false;
+		mac[port].dyn_learn = false;
+		break;
+	case BR_STATE_LEARNING:
+		mac[port].ingress   = true;
+		mac[port].egress    = false;
+		mac[port].dyn_learn = true;
+		break;
+	case BR_STATE_FORWARDING:
+		mac[port].ingress   = true;
+		mac[port].egress    = true;
+		mac[port].dyn_learn = true;
+		break;
+	default:
+		dev_err(ds->dev, "invalid STP state: %d\n", state);
+		return;
+	}
+
+	sja1105_dynamic_config_write(priv, BLK_IDX_MAC_CONFIG, port,
+				     &mac[port], true);
+}
+
 static int sja1105_bridge_join(struct dsa_switch *ds, int port,
 			       struct net_device *br)
 {
@@ -955,6 +1013,23 @@ static void sja1105_bridge_leave(struct dsa_switch *ds, int port,
 	sja1105_bridge_member(ds, port, br, false);
 }
 
+static u8 sja1105_stp_state_get(struct sja1105_private *priv, int port)
+{
+	struct sja1105_mac_config_entry *mac;
+
+	mac = priv->static_config.tables[BLK_IDX_MAC_CONFIG].entries;
+
+	if (!mac[port].ingress && !mac[port].egress && !mac[port].dyn_learn)
+		return BR_STATE_BLOCKING;
+	if (mac[port].ingress && !mac[port].egress && !mac[port].dyn_learn)
+		return BR_STATE_LISTENING;
+	if (mac[port].ingress && !mac[port].egress && mac[port].dyn_learn)
+		return BR_STATE_LEARNING;
+	if (mac[port].ingress && mac[port].egress && mac[port].dyn_learn)
+		return BR_STATE_FORWARDING;
+	return -EINVAL;
+}
+
 /* For situations where we need to change a setting at runtime that is only
  * available through the static configuration, resetting the switch in order
  * to upload the new static config is unavoidable. Back up the settings we
@@ -965,16 +1040,27 @@ static int sja1105_static_config_reload(struct sja1105_private *priv)
 {
 	struct sja1105_mac_config_entry *mac;
 	int speed_mbps[SJA1105_NUM_PORTS];
+	u8 stp_state[SJA1105_NUM_PORTS];
 	int rc, i;
 
 	mac = priv->static_config.tables[BLK_IDX_MAC_CONFIG].entries;
 
 	/* Back up settings changed by sja1105_adjust_port_config and
-	 * and restore their defaults.
+	 * sja1105_bridge_stp_state_set and restore their defaults.
 	 */
 	for (i = 0; i < SJA1105_NUM_PORTS; i++) {
 		speed_mbps[i] = sja1105_speed[mac[i].speed];
 		mac[i].speed = SJA1105_SPEED_AUTO;
+		if (i == dsa_upstream_port(priv->ds, i)) {
+			mac[i].ingress = true;
+			mac[i].egress = true;
+			mac[i].dyn_learn = true;
+		} else {
+			stp_state[i] = sja1105_stp_state_get(priv, i);
+			mac[i].ingress = false;
+			mac[i].egress = false;
+			mac[i].dyn_learn = false;
+		}
 	}
 
 	/* Reset switch and send updated static configuration */
@@ -993,6 +1079,9 @@ static int sja1105_static_config_reload(struct sja1105_private *priv)
 	for (i = 0; i < SJA1105_NUM_PORTS; i++) {
 		bool enabled = (speed_mbps[i] != 0);
 
+		if (i != dsa_upstream_port(priv->ds, i))
+			sja1105_bridge_stp_state_set(priv->ds, i, stp_state[i]);
+
 		rc = sja1105_adjust_port_config(priv, i, speed_mbps[i],
 						enabled);
 		if (rc < 0)
@@ -1367,6 +1456,7 @@ static const struct dsa_switch_ops sja1105_switch_ops = {
 	.port_fdb_del		= sja1105_fdb_del,
 	.port_bridge_join	= sja1105_bridge_join,
 	.port_bridge_leave	= sja1105_bridge_leave,
+	.port_stp_state_set	= sja1105_bridge_stp_state_set,
 	.port_vlan_prepare	= sja1105_vlan_prepare,
 	.port_vlan_filtering	= sja1105_vlan_filtering,
 	.port_vlan_add		= sja1105_vlan_add,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v3 net-next 20/24] net: dsa: sja1105: Error out if RGMII delays are requested in DT
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
                   ` (18 preceding siblings ...)
  2019-04-13  1:28 ` [PATCH v3 net-next 19/24] net: dsa: sja1105: Add support for Spanning Tree Protocol Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  2019-04-13 16:49   ` Andrew Lunn
  2019-04-13 20:47   ` Jiri Pirko
  2019-04-13  1:28 ` [PATCH v3 net-next 21/24] net: dsa: sja1105: Prevent PHY jabbering during switch reset Vladimir Oltean
                   ` (3 subsequent siblings)
  23 siblings, 2 replies; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

Documentation/devicetree/bindings/net/ethernet.txt is confusing because
it says what the MAC should not do, but not what it *should* do:

  * "rgmii-rxid" (RGMII with internal RX delay provided by the PHY, the MAC
     should not add an RX delay in this case)

The gap in semantics is threefold:
1. Is it illegal for the MAC to apply the Rx internal delay by itself,
   and simplify the phy_mode (mask off "rgmii-rxid" into "rgmii") before
   passing it to of_phy_connect? The documentation would suggest yes.
2. For "rgmii-rxid", while the situation with the Rx clock skew is more
   or less clear (needs to be added by the PHY), what should the MAC
   driver do about the Tx delays? Is it an implicit wild card for the
   MAC to apply delays in the Tx direction if it can? What if those were
   already added as serpentine PCB traces, how could that be made more
   obvious through DT bindings so that the MAC doesn't attempt to add
   them twice and again potentially break the link?
3. If the interface is a fixed-link and therefore the PHY object is
   fixed (a purely software entity that obviously cannot add clock
   skew), what is the meaning of the above property?

So an interpretation of the RGMII bindings was chosen that hopefully
does not contradict their intention while making them more actionable.
The SJA1105 driver acts upon the "rgmii-*id" phy-mode bindings if the
port is in the PHY role (either explicitly, or because it is a
fixed-link). Otherwise it always leaves the duty of setting up delays to
the PHY driver.

The error behavior that this patch adds is required on SJA1105E/T where
the MAC really cannot apply internal delays. If the other end of the
fixed-link cannot apply RGMII delays either (this would be specified
through its own DT bindings), then the situation requires PCB delays.

For SJA1105P/Q/R/S, however, this is supported in hardware, and the error
is thus only temporary. I created a stub function pointer for configuring
delays per-port on RXC and TXC, and will implement it when I have access
to a board with this hardware setup.

Meanwhile do not allow the user to select an invalid configuration.
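As a sketch of the rule just described (illustrative C only, with enum
stand-ins for the kernel's PHY_INTERFACE_MODE_* constants; the driver's
real helper is sja1105_parse_rgmii_delay() in the diff below):

```c
#include <stdbool.h>

/* Stand-ins for PHY_INTERFACE_MODE_RGMII{,_RXID,_TXID,_ID} */
enum rgmii_mode { RGMII, RGMII_RXID, RGMII_TXID, RGMII_ID };

/* Decide which internal delays the switch MAC should apply. When the
 * port is in the MAC role, delay setup is always left to the PHY
 * driver, so both flags stay false.
 */
static void parse_rgmii_delay(enum rgmii_mode mode, bool is_mac_role,
			      bool *rx_delay, bool *tx_delay)
{
	*rx_delay = false;
	*tx_delay = false;

	if (is_mac_role)
		return;

	*rx_delay = (mode == RGMII_RXID || mode == RGMII_ID);
	*tx_delay = (mode == RGMII_TXID || mode == RGMII_ID);
}
```

On SJA1105E/T, any true delay flag combined with a NULL .setup_rgmii_delay
is what triggers the -EINVAL path.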

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
---
Changes in v3:
None.

Changes in v2:
Patch is new.

 drivers/net/dsa/sja1105/sja1105.h          |  3 ++
 drivers/net/dsa/sja1105/sja1105_clocking.c |  7 ++++-
 drivers/net/dsa/sja1105/sja1105_main.c     | 32 +++++++++++++++++++++-
 drivers/net/dsa/sja1105/sja1105_spi.c      |  6 ++++
 4 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
index b7e745c0bb3a..3c16b991032c 100644
--- a/drivers/net/dsa/sja1105/sja1105.h
+++ b/drivers/net/dsa/sja1105/sja1105.h
@@ -22,6 +22,8 @@
 
 struct sja1105_port {
 	struct dsa_port *dp;
+	bool rgmii_rx_delay;
+	bool rgmii_tx_delay;
 	struct work_struct xmit_work;
 	struct sja1105_skb_ring xmit_ring;
 };
@@ -61,6 +63,7 @@ struct sja1105_info {
 	const struct sja1105_table_ops *static_ops;
 	const struct sja1105_regs *regs;
 	int (*reset_cmd)(const void *ctx, const void *data);
+	int (*setup_rgmii_delay)(const void *ctx, int port, bool rx, bool tx);
 	const char *name;
 };
 
diff --git a/drivers/net/dsa/sja1105/sja1105_clocking.c b/drivers/net/dsa/sja1105/sja1105_clocking.c
index d40da3d52464..c02fec181676 100644
--- a/drivers/net/dsa/sja1105/sja1105_clocking.c
+++ b/drivers/net/dsa/sja1105/sja1105_clocking.c
@@ -432,7 +432,12 @@ static int rgmii_clocking_setup(struct sja1105_private *priv, int port)
 		dev_err(dev, "Failed to configure Tx pad registers\n");
 		return rc;
 	}
-	return 0;
+	if (!priv->info->setup_rgmii_delay)
+		return 0;
+
+	return priv->info->setup_rgmii_delay(priv, port,
+					     priv->ports[port].rgmii_rx_delay,
+					     priv->ports[port].rgmii_tx_delay);
 }
 
 static int sja1105_cgu_rmii_ref_clk_config(struct sja1105_private *priv,
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index e4abf8fb2013..5f7ddb1da006 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -555,6 +555,21 @@ static int sja1105_static_config_load(struct sja1105_private *priv,
 	return sja1105_static_config_upload(priv);
 }
 
+static void sja1105_parse_rgmii_delay(const struct sja1105_dt_port *in,
+				      struct sja1105_port *out)
+{
+	if (in->role == XMII_MAC)
+		return;
+
+	if (in->phy_mode == PHY_INTERFACE_MODE_RGMII_RXID ||
+	    in->phy_mode == PHY_INTERFACE_MODE_RGMII_ID)
+		out->rgmii_rx_delay = true;
+
+	if (in->phy_mode == PHY_INTERFACE_MODE_RGMII_TXID ||
+	    in->phy_mode == PHY_INTERFACE_MODE_RGMII_ID)
+		out->rgmii_tx_delay = true;
+}
+
 static int sja1105_parse_ports_node(struct sja1105_private *priv,
 				    struct sja1105_dt_port *ports,
 				    struct device_node *ports_node)
@@ -1315,13 +1330,28 @@ static int sja1105_setup(struct dsa_switch *ds)
 {
 	struct sja1105_dt_port ports[SJA1105_NUM_PORTS];
 	struct sja1105_private *priv = ds->priv;
-	int rc;
+	int rc, i;
 
 	rc = sja1105_parse_dt(priv, ports);
 	if (rc < 0) {
 		dev_err(ds->dev, "Failed to parse DT: %d\n", rc);
 		return rc;
 	}
+
+	/* Error out early if internal delays are required through DT
+	 * and we can't apply them.
+	 */
+	for (i = 0; i < SJA1105_NUM_PORTS; i++) {
+		sja1105_parse_rgmii_delay(&ports[i], &priv->ports[i]);
+
+		if ((priv->ports[i].rgmii_rx_delay ||
+		     priv->ports[i].rgmii_tx_delay) &&
+		     !priv->info->setup_rgmii_delay) {
+			dev_err(ds->dev, "RGMII delay not supported\n");
+			return -EINVAL;
+		}
+	}
+
 	/* Create and send configuration down to device */
 	rc = sja1105_static_config_load(priv, ports);
 	if (rc < 0) {
diff --git a/drivers/net/dsa/sja1105/sja1105_spi.c b/drivers/net/dsa/sja1105/sja1105_spi.c
index 09cb28e9be20..e4ef4d8048b2 100644
--- a/drivers/net/dsa/sja1105/sja1105_spi.c
+++ b/drivers/net/dsa/sja1105/sja1105_spi.c
@@ -499,6 +499,7 @@ struct sja1105_info sja1105e_info = {
 	.part_no		= SJA1105ET_PART_NO,
 	.static_ops		= sja1105e_table_ops,
 	.dyn_ops		= sja1105et_dyn_ops,
+	.setup_rgmii_delay	= NULL,
 	.reset_cmd		= sja1105et_reset_cmd,
 	.regs			= &sja1105et_regs,
 	.name			= "SJA1105E",
@@ -508,6 +509,7 @@ struct sja1105_info sja1105t_info = {
 	.part_no		= SJA1105ET_PART_NO,
 	.static_ops		= sja1105t_table_ops,
 	.dyn_ops		= sja1105et_dyn_ops,
+	.setup_rgmii_delay	= NULL,
 	.reset_cmd		= sja1105et_reset_cmd,
 	.regs			= &sja1105et_regs,
 	.name			= "SJA1105T",
@@ -517,6 +519,7 @@ struct sja1105_info sja1105p_info = {
 	.part_no		= SJA1105P_PART_NO,
 	.static_ops		= sja1105p_table_ops,
 	.dyn_ops		= sja1105pqrs_dyn_ops,
+	.setup_rgmii_delay	= NULL,
 	.reset_cmd		= sja1105pqrs_reset_cmd,
 	.regs			= &sja1105pqrs_regs,
 	.name			= "SJA1105P",
@@ -526,6 +529,7 @@ struct sja1105_info sja1105q_info = {
 	.part_no		= SJA1105Q_PART_NO,
 	.static_ops		= sja1105q_table_ops,
 	.dyn_ops		= sja1105pqrs_dyn_ops,
+	.setup_rgmii_delay	= NULL,
 	.reset_cmd		= sja1105pqrs_reset_cmd,
 	.regs			= &sja1105pqrs_regs,
 	.name			= "SJA1105Q",
@@ -535,6 +539,7 @@ struct sja1105_info sja1105r_info = {
 	.part_no		= SJA1105R_PART_NO,
 	.static_ops		= sja1105r_table_ops,
 	.dyn_ops		= sja1105pqrs_dyn_ops,
+	.setup_rgmii_delay	= NULL,
 	.reset_cmd		= sja1105pqrs_reset_cmd,
 	.regs			= &sja1105pqrs_regs,
 	.name			= "SJA1105R",
@@ -545,6 +550,7 @@ struct sja1105_info sja1105s_info = {
 	.static_ops		= sja1105s_table_ops,
 	.dyn_ops		= sja1105pqrs_dyn_ops,
 	.regs			= &sja1105pqrs_regs,
+	.setup_rgmii_delay	= NULL,
 	.reset_cmd		= sja1105pqrs_reset_cmd,
 	.name			= "SJA1105S",
 };
-- 
2.17.1



* [PATCH v3 net-next 21/24] net: dsa: sja1105: Prevent PHY jabbering during switch reset
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
                   ` (19 preceding siblings ...)
  2019-04-13  1:28 ` [PATCH v3 net-next 20/24] net: dsa: sja1105: Error out if RGMII delays are requested in DT Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  2019-04-13 16:54   ` Andrew Lunn
  2019-04-13  1:28 ` [PATCH v3 net-next 22/24] net: dsa: sja1105: Reject unsupported link modes for AN Vladimir Oltean
                   ` (2 subsequent siblings)
  23 siblings, 1 reply; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

Resetting the switch at runtime is currently done while changing the
vlan_filtering setting (due to the required TPID change).

But reset is asynchronous with packet egress, and the switch core will
not wait for egress to finish before carrying on with the reset
operation.

As a result, a connected PHY such as the BCM5464 would see an
unterminated Ethernet frame and start to jabber (repeat the last seen
Ethernet symbols - jabber is by definition an oversized Ethernet frame
with bad FCS). This behavior is strange in itself, but it also causes
the MACs of some link partners (such as the FRDM-LS1012A) to completely
lock up.

So as a remedy for this situation, when switch reset is required, simply
inhibit Tx on all ports, and wait long enough for any frame still left in
the egress queue (not even the Tx inhibit command is instantaneous) to be
flushed.
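The port inhibit step is a plain read-modify-write; a minimal sketch of
just the bitmask computation (the SPI transfers are elided, and the
one-bit-per-port register layout follows the patch below):

```c
#include <stdint.h>

#define SJA1105_NUM_PORTS 5

/* Given the current PORT_CONTROL word, set the Tx inhibit bit of every
 * port present in the bitmap. The caller would read the word over SPI,
 * pass it through here, and write it back.
 */
static uint64_t inhibit_cmd(uint64_t cur, unsigned long port_bitmap)
{
	int port;

	for (port = 0; port < SJA1105_NUM_PORTS; port++)
		if (port_bitmap & (1UL << port))
			cur |= 1ULL << port;

	return cur;
}
```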

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
Changes in v3:
None.

Changes in v2:
Patch is new.

 drivers/net/dsa/sja1105/sja1105.h     |  1 +
 drivers/net/dsa/sja1105/sja1105_spi.c | 37 +++++++++++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
index 3c16b991032c..9cea23ba2806 100644
--- a/drivers/net/dsa/sja1105/sja1105.h
+++ b/drivers/net/dsa/sja1105/sja1105.h
@@ -33,6 +33,7 @@ struct sja1105_regs {
 	u64 device_id;
 	u64 prod_id;
 	u64 status;
+	u64 port_control;
 	u64 rgu;
 	u64 config;
 	u64 rmii_pll1;
diff --git a/drivers/net/dsa/sja1105/sja1105_spi.c b/drivers/net/dsa/sja1105/sja1105_spi.c
index e4ef4d8048b2..622f7afcbf43 100644
--- a/drivers/net/dsa/sja1105/sja1105_spi.c
+++ b/drivers/net/dsa/sja1105/sja1105_spi.c
@@ -10,6 +10,7 @@
 #define SIZE_SPI_MSG_HEADER	4
 #define SIZE_SPI_MSG_MAXLEN	(64 * 4)
 #define SPI_TRANSFER_SIZE_MAX	(SIZE_SPI_MSG_HEADER + SIZE_SPI_MSG_MAXLEN)
+#define SIZE_PORT_CTRL		4
 
 static int sja1105_spi_transfer(const struct sja1105_private *priv,
 				const void *tx, void *rx, int size)
@@ -278,6 +279,25 @@ static int sja1105_cold_reset(const struct sja1105_private *priv)
 	return priv->info->reset_cmd(priv, &reset);
 }
 
+static int sja1105_inhibit_tx(const struct sja1105_private *priv,
+			      const unsigned long *port_bitmap)
+{
+	const struct sja1105_regs *regs = priv->info->regs;
+	u64 inhibit_cmd;
+	int port, rc;
+
+	rc = sja1105_spi_send_int(priv, SPI_READ, regs->port_control,
+				  &inhibit_cmd, SIZE_PORT_CTRL);
+	if (rc < 0)
+		return rc;
+
+	for_each_set_bit(port, port_bitmap, SJA1105_NUM_PORTS)
+		inhibit_cmd |= BIT(port);
+
+	return sja1105_spi_send_int(priv, SPI_WRITE, regs->port_control,
+				    &inhibit_cmd, SIZE_PORT_CTRL);
+}
+
 struct sja1105_status {
 	u64 configs;
 	u64 crcchkl;
@@ -366,6 +386,7 @@ static_config_buf_prepare_for_upload(struct sja1105_private *priv,
 int sja1105_static_config_upload(struct sja1105_private *priv)
 {
 #define RETRIES 10
+	unsigned long port_bitmap = GENMASK_ULL(SJA1105_NUM_PORTS - 1, 0);
 	struct sja1105_static_config *config = &priv->static_config;
 	const struct sja1105_regs *regs = priv->info->regs;
 	struct device *dev = &priv->spidev->dev;
@@ -384,6 +405,20 @@ int sja1105_static_config_upload(struct sja1105_private *priv)
 		dev_err(dev, "Invalid config, cannot upload\n");
 		return -EINVAL;
 	}
+	/* Prevent PHY jabbering during switch reset by inhibiting
+	 * Tx on all ports and waiting for current packet to drain.
+	 * Otherwise, the PHY will see an unterminated Ethernet packet.
+	 */
+	rc = sja1105_inhibit_tx(priv, &port_bitmap);
+	if (rc < 0) {
+		dev_err(dev, "Failed to inhibit Tx on ports\n");
+		return -ENXIO;
+	}
+	/* Wait for a possible egress packet to finish transmission
+	 * (reach IFG). It is guaranteed that a second one will not
+	 * follow, and that switch cold reset is thus safe.
+	 */
+	usleep_range(500, 1000);
 	do {
 		/* Put the SJA1105 in programming mode */
 		rc = sja1105_cold_reset(priv);
@@ -449,6 +484,7 @@ struct sja1105_regs sja1105et_regs = {
 	.device_id = 0x0,
 	.prod_id = 0x100BC3,
 	.status = 0x1,
+	.port_control = 0x11,
 	.config = 0x020000,
 	.rgu = 0x100440,
 	.pad_mii_tx = {0x100800, 0x100802, 0x100804, 0x100806, 0x100808},
@@ -473,6 +509,7 @@ struct sja1105_regs sja1105pqrs_regs = {
 	.device_id = 0x0,
 	.prod_id = 0x100BC3,
 	.status = 0x1,
+	.port_control = 0x12,
 	.config = 0x020000,
 	.rgu = 0x100440,
 	.pad_mii_tx = {0x100800, 0x100802, 0x100804, 0x100806, 0x100808},
-- 
2.17.1



* [PATCH v3 net-next 22/24] net: dsa: sja1105: Reject unsupported link modes for AN
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
                   ` (20 preceding siblings ...)
  2019-04-13  1:28 ` [PATCH v3 net-next 21/24] net: dsa: sja1105: Prevent PHY jabbering during switch reset Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  2019-04-13  1:28 ` [PATCH v3 net-next 23/24] Documentation: net: dsa: Add details about NXP SJA1105 driver Vladimir Oltean
  2019-04-13  1:28 ` [PATCH v3 net-next 24/24] dt-bindings: net: dsa: Add documentation for " Vladimir Oltean
  23 siblings, 0 replies; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

Ethernet flow control:

The switch MAC does not consume, nor does it emit pause frames. It
simply forwards them as any other Ethernet frame (and since the DMAC is,
per IEEE spec, 01-80-C2-00-00-01, it means they are filtered as
link-local traffic and forwarded to the CPU, which can't do anything
useful with them).

Duplex:

There is no duplex setting in the SJA1105 MAC. It is known to forward
traffic at line rate on the same port in both directions. Therefore it
must be that it only supports full duplex.
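Condensed into plain C (bit positions are stand-ins for the ethtool
link-mode bitmap, and the MII/Autoneg details of the real
phylink_validate implementation in the diff below are omitted):

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in bit positions for a few ethtool link modes */
enum { M_10_HALF, M_10_FULL, M_100_HALF, M_100_FULL, M_1000_FULL, M_AUTONEG };

/* What the MAC can do: full duplex only, no pause frames, and gigabit
 * only when the port operates in RGMII mode.
 */
static uint32_t sja1105_mac_mask(bool port_is_rgmii)
{
	uint32_t mask = (1u << M_AUTONEG) | (1u << M_10_FULL) | (1u << M_100_FULL);

	if (port_is_rgmii)
		mask |= 1u << M_1000_FULL;

	return mask;
}

/* Validation ANDs the PHY's capabilities with the MAC's */
static uint32_t sja1105_validate(uint32_t supported, bool port_is_rgmii)
{
	return supported & sja1105_mac_mask(port_is_rgmii);
}
```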

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
---
Changes in v3:
None.

Changes in v2:
Patch is new.

 drivers/net/dsa/sja1105/sja1105_main.c | 31 ++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index 5f7ddb1da006..e22f7b666259 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -11,6 +11,7 @@
 #include <linux/spi/spi.h>
 #include <linux/errno.h>
 #include <linux/gpio/consumer.h>
+#include <linux/phylink.h>
 #include <linux/of.h>
 #include <linux/of_net.h>
 #include <linux/of_mdio.h>
@@ -747,6 +748,35 @@ static void sja1105_adjust_link(struct dsa_switch *ds, int port,
 		sja1105_adjust_port_config(priv, port, phydev->speed, true);
 }
 
+static void sja1105_phylink_validate(struct dsa_switch *ds, int port,
+				     unsigned long *supported,
+				     struct phylink_link_state *state)
+{
+	/* Construct a new mask which exhaustively contains all link features
+	 * supported by the MAC, and then apply that (logical AND) to what will
+	 * be sent to the PHY for "marketing".
+	 */
+	__ETHTOOL_DECLARE_LINK_MODE_MASK(mask) = { 0, };
+	struct sja1105_private *priv = ds->priv;
+	struct sja1105_xmii_params_entry *mii;
+
+	mii = priv->static_config.tables[BLK_IDX_XMII_PARAMS].entries;
+
+	/* The MAC does not support pause frames, and also doesn't
+	 * support half-duplex traffic modes.
+	 */
+	phylink_set(mask, Autoneg);
+	phylink_set(mask, MII);
+	phylink_set(mask, 10baseT_Full);
+	phylink_set(mask, 100baseT_Full);
+	if (mii->xmii_mode[port] == XMII_MODE_RGMII)
+		phylink_set(mask, 1000baseT_Full);
+
+	bitmap_and(supported, supported, mask, __ETHTOOL_LINK_MODE_MASK_NBITS);
+	bitmap_and(state->advertising, state->advertising, mask,
+		   __ETHTOOL_LINK_MODE_MASK_NBITS);
+}
+
 #define fdb(bin, index) \
 	((bin) * SJA1105ET_FDB_BIN_SIZE + (index))
 #define is_bin_index_valid(i) \
@@ -1478,6 +1508,7 @@ static const struct dsa_switch_ops sja1105_switch_ops = {
 	.get_tag_protocol	= sja1105_get_tag_protocol,
 	.setup			= sja1105_setup,
 	.adjust_link		= sja1105_adjust_link,
+	.phylink_validate	= sja1105_phylink_validate,
 	.get_strings		= sja1105_get_strings,
 	.get_ethtool_stats	= sja1105_get_ethtool_stats,
 	.get_sset_count		= sja1105_get_sset_count,
-- 
2.17.1



* [PATCH v3 net-next 23/24] Documentation: net: dsa: Add details about NXP SJA1105 driver
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
                   ` (21 preceding siblings ...)
  2019-04-13  1:28 ` [PATCH v3 net-next 22/24] net: dsa: sja1105: Reject unsupported link modes for AN Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  2019-04-17  0:20   ` Florian Fainelli
  2019-04-13  1:28 ` [PATCH v3 net-next 24/24] dt-bindings: net: dsa: Add documentation for " Vladimir Oltean
  23 siblings, 1 reply; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
Changes in v3:
Reworked as rst, added a table for traffic support, added chapter for
switching features.

Changes in v2:
More verbiage at the end, regarding RGMII delays and potentially other
hardware-related caveats.

 Documentation/networking/dsa/index.rst   |   1 +
 Documentation/networking/dsa/sja1105.rst | 216 +++++++++++++++++++++++
 2 files changed, 217 insertions(+)
 create mode 100644 Documentation/networking/dsa/sja1105.rst

diff --git a/Documentation/networking/dsa/index.rst b/Documentation/networking/dsa/index.rst
index 5c488d345a1e..0e5b7a9be406 100644
--- a/Documentation/networking/dsa/index.rst
+++ b/Documentation/networking/dsa/index.rst
@@ -8,3 +8,4 @@ Distributed Switch Architecture
    dsa
    bcm_sf2
    lan9303
+   sja1105
diff --git a/Documentation/networking/dsa/sja1105.rst b/Documentation/networking/dsa/sja1105.rst
new file mode 100644
index 000000000000..9614d127da57
--- /dev/null
+++ b/Documentation/networking/dsa/sja1105.rst
@@ -0,0 +1,216 @@
+=========================
+NXP SJA1105 switch driver
+=========================
+
+Overview
+========
+
+The NXP SJA1105 is a family of 6 devices:
+
+- SJA1105E: First generation, no TTEthernet
+- SJA1105T: First generation, TTEthernet
+- SJA1105P: Second generation, no TTEthernet, no SGMII
+- SJA1105Q: Second generation, TTEthernet, no SGMII
+- SJA1105R: Second generation, no TTEthernet, SGMII
+- SJA1105S: Second generation, TTEthernet, SGMII
+
+These are SPI-managed automotive switches, with all ports being gigabit
+capable, and supporting MII/RMII/RGMII and optionally SGMII on one port.
+
+Being automotive parts, their configuration interface is geared towards
+set-and-forget use, with minimal dynamic interaction at runtime. They
+require a static configuration to be composed by software and packed
+with CRC and table headers, and sent over SPI.
+
+The static configuration is composed of several configuration tables. Each
+table takes a number of entries. Some configuration tables can be (partially)
+reconfigured at runtime, some not. Some tables are mandatory, some not:
+
+============================= ================== =============================
+Table                          Mandatory          Reconfigurable
+============================= ================== =============================
+Schedule                       no                 no
+Schedule entry points          if Scheduling      no
+VL Lookup                      no                 no
+VL Policing                    if VL Lookup       no
+VL Forwarding                  if VL Lookup       no
+L2 Lookup                      no                 no
+L2 Policing                    yes                no
+VLAN Lookup                    yes                yes
+L2 Forwarding                  yes                partially (fully on P/Q/R/S)
+MAC Config                     yes                partially (fully on P/Q/R/S)
+Schedule Params                if Scheduling      no
+Schedule Entry Points Params   if Scheduling      no
+VL Forwarding Params           if VL Forwarding   no
+L2 Lookup Params               no                 partially (fully on P/Q/R/S)
+L2 Forwarding Params           yes                no
+Clock Sync Params              no                 no
+AVB Params                     no                 no
+General Params                 yes                partially
+Retagging                      no                 yes
+xMII Params                    yes                no
+SGMII                          no                 yes
+============================= ================== =============================
+
+
+Also, the configuration is write-only (software cannot read it back from the
+switch, except for very few exceptions).
+
+The driver creates a static configuration at probe time, and keeps it at
+all times in memory, as a shadow for the hardware state. When required to
+change a hardware setting, the static configuration is also updated.
+If that changed setting can be transmitted to the switch through the dynamic
+reconfiguration interface, it is; otherwise the switch is reset and
+reprogrammed with the updated static configuration.
+
+Traffic support
+===============
+
+The switches do not support switch tagging in hardware. But they do support
+customizing the TPID by which VLAN traffic is identified as such. The switch
+driver is leveraging ``CONFIG_NET_DSA_TAG_8021Q`` by requesting that special
+VLANs (with a custom TPID of ``ETH_P_EDSA`` instead of ``ETH_P_8021Q``) are
+installed on its ports when not in ``vlan_filtering`` mode. This does not
+interfere with the reception and transmission of real 802.1Q-tagged traffic,
+because the switch no longer parses those packets as VLAN-tagged after the
+TPID change.
+The TPID is restored when ``vlan_filtering`` is requested by the user through
+the bridge layer, and general IP termination is no longer possible through
+the switch netdevices in this mode.
+
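The TPID selection described above boils down to one conditional (a
sketch, not the driver's literal code; ethertype values are from the
kernel's uapi list):

```c
#include <stdbool.h>
#include <stdint.h>

#define ETH_P_8021Q 0x8100	/* standard 802.1Q TPID */
#define ETH_P_EDSA  0xDADA	/* repurposed as the tag_8021q "hidden" TPID */

/* With vlan_filtering off, the switch matches VLAN tags against a TPID
 * that real 802.1Q traffic never carries, so genuine tagged frames pass
 * through unparsed. With vlan_filtering on, the standard TPID is
 * restored.
 */
static uint16_t sja1105_tpid(bool vlan_filtering)
{
	return vlan_filtering ? ETH_P_8021Q : ETH_P_EDSA;
}
```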
+The switches have two programmable filters for link-local destination MACs.
+These are used to trap BPDUs and PTP traffic to the master netdevice, and are
+further used to support STP and 1588 ordinary clock/boundary clock
+functionality.
+
+The following traffic modes are supported over the switch netdevices:
+
++--------------------+------------+------------------+------------------+
+|                    | Standalone |   Bridged with   |   Bridged with   |
+|                    |    ports   | vlan_filtering 0 | vlan_filtering 1 |
++====================+============+==================+==================+
+| Regular traffic    |     Yes    |       Yes        |  No (use master) |
++--------------------+------------+------------------+------------------+
+| Management traffic |     Yes    |       Yes        |       Yes        |
+|    (BPDU, PTP)     |            |                  |                  |
++--------------------+------------+------------------+------------------+
+
+Switching features
+==================
+
+The driver supports the configuration of L2 forwarding rules in hardware for
+port bridging. The forwarding, broadcast and flooding domain between ports can
+be restricted through two methods: either at the L2 forwarding level (isolate
+one bridge's ports from another's) or at the VLAN port membership level
+(isolate ports within the same bridge). The final forwarding decision taken by
+the hardware is a logical AND of these two sets of rules.
+
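The AND semantics can be made concrete with a hypothetical helper (port
masks modeled as plain bitmaps, one bit per port; not the driver's
actual data layout):

```c
#include <stdint.h>

/* The final set of ports a frame may be forwarded to is the
 * intersection of the L2 forwarding domain (bridging rules) and the
 * VLAN port membership of the frame's classified VLAN.
 */
static uint32_t fwd_domain(uint32_t l2_fwd_mask, uint32_t vlan_member_mask)
{
	return l2_fwd_mask & vlan_member_mask;
}
```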
+The hardware tags all traffic internally with a port-based VLAN (pvid), or it
+decodes the VLAN information from the 802.1Q tag. Advanced VLAN classification
+is not possible. Once attributed a VLAN tag, frames are checked against the
+port's membership rules and dropped at ingress if they don't match any VLAN.
+This behavior is available when switch ports are enslaved to a bridge with
+``vlan_filtering 1``.
+
+Normally the hardware is not configurable with respect to VLAN awareness, but
+by changing what TPID the switch searches 802.1Q tags for, the semantics of a
+bridge with ``vlan_filtering 0`` can be kept (accept all traffic, tagged or
+untagged), and therefore this mode is also supported.
+
+Segregating the switch ports in multiple bridges is supported (e.g. 2 + 2), but
+all bridges should have the same level of VLAN awareness (either both have
+``vlan_filtering`` 0, or both 1). Also, because VLAN awareness is global at
+the switch level, an inevitable limitation is that once a bridge with
+``vlan_filtering`` enabled enslaves at least one switch port, the other
+un-bridged ports are no longer available for standalone traffic termination.
+
+Topology and loop detection through STP is supported.
+
+L2 FDB manipulation (add/delete/dump) is currently possible for the first
+generation devices. The aging time of FDB entries, as well as fully static
+management (no address learning and no flooding of unknown traffic), is not
+yet configurable in the driver.
+
+Other notable features
+======================
+
+The switches have a PTP Hardware Clock that can be steered through SPI and used
+for timestamping management traffic on ingress and egress.
+Also, the T, Q and S devices support TTEthernet (an implementation of SAE
+AS6802 from TTTech), which is a set of Ethernet QoS enhancements somewhat
+similar in behavior to IEEE TSN (time-aware shaping, time-based policing).
+Configuring these features is currently not supported in the driver.
+
+Device Tree bindings and board design
+=====================================
+
+This section references ``Documentation/devicetree/bindings/net/dsa/sja1105.txt``
+and aims to showcase some potential switch caveats.
+
+RMII PHY role and out-of-band signaling
+---------------------------------------
+
+In the RMII spec, the 50 MHz clock signals are either driven by the MAC or by
+an external oscillator (but not by the PHY).
+But the spec is rather loose and devices go outside it in several ways.
+Some PHYs go against the spec and may provide an output pin where they source
+the 50 MHz clock themselves, in an attempt to be helpful.
+On the other hand, the SJA1105 is only binary configurable - when in the RMII
+MAC role it will also attempt to drive the clock signal. To prevent this from
+happening it must be put in RMII PHY role.
+But doing so has some unintended consequences.
+In the RMII spec, the PHY can transmit extra out-of-band signals via RXD[1:0].
+These are practically some extra code words (/J/ and /K/) sent prior to the
+preamble of each frame. The MAC does not have this out-of-band signaling
+mechanism defined by the RMII spec.
+So when the SJA1105 port is put in PHY role to avoid having 2 drivers on the
+clock signal, inevitably an RMII PHY-to-PHY connection is created. The SJA1105
+emulates a PHY interface fully and generates the /J/ and /K/ symbols prior to
+frame preambles, which the real PHY is not expected to understand. So the PHY
+simply encodes the extra symbols received from the SJA1105-as-PHY onto the
+100Base-Tx wire.
+On the other side of the wire, some link partners might discard these extra
+symbols, while others might choke on them and discard the entire Ethernet
+frames that follow along. This looks like packet loss with some link partners
+but not with others.
+The take-away is that in RMII mode, the SJA1105 must be allowed to drive the
+reference clock if connected to a PHY.
+
+RGMII fixed-link and internal delays
+------------------------------------
+
+As mentioned in the bindings document, the second generation of devices has
+tunable delay lines as part of the MAC, which can be used to establish the
+correct RGMII timing budget.
+When powered up, these can shift the Rx and Tx clocks with a phase difference
+between 73.8 and 101.7 degrees.
+The catch is that the delay lines need to lock onto a clock signal with a
+stable frequency. This means that there must be at least 2 microseconds of
+silence between the clock at the old frequency and the clock at the new one.
+Otherwise the lock is lost and the delay lines must be reset (powered down
+and back up).
+In RGMII, the clock frequency changes with link speed (125 MHz at 1000 Mbps,
+25 MHz at 100 Mbps and 2.5 MHz at 10 Mbps), and the link speed might change
+during the auto-negotiation (AN) process.
+When the switch port is connected through an RGMII fixed-link to a link
+partner whose link state life cycle is outside the control of Linux (such as
+a different SoC), the delay lines would remain unlocked (and inactive) until
+there is manual intervention (ifdown/ifup on the switch port).
+The take-away is that in RGMII mode, the switch's internal delays are only
+reliable if the link partner never changes link speeds, or if it does, it does
+so in a way that is coordinated with the switch port (practically, both ends of
+the fixed-link are under control of the same Linux system).
+As to why a fixed-link interface would ever change link speeds: there are
+Ethernet controllers out there which come out of reset in 100 Mbps mode, and
+their driver inevitably needs to change the speed and clock frequency if it
+is required to work at gigabit.
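+As a sketch only (second-generation P/Q/R/S devices, with both ends of the
+fixed-link assumed to be under control of the same Linux system), a
+fixed-link RGMII port where the SJA1105 MAC applies the internal delays could
+look like:
+
+	port@2 {
+		phy-mode = "rgmii-id";
+		reg = <2>;
+		/* Implicit "sja1105,role-phy;" - MAC-side delay lines in use */
+		fixed-link {
+			speed = <1000>;
+			full-duplex;
+		};
+	};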
+
+MDIO bus and PHY management
+---------------------------
+
+The SJA1105 does not have an MDIO bus and does not perform in-band AN either.
+Therefore there is no link state notification coming from the switch device.
+A board would need to hook up the PHYs connected to the switch to any other
+MDIO bus available to Linux within the system (e.g. to the DSA master's MDIO
+bus). Link state management then works with the driver manually keeping the
+MAC link speed in sync (via SPI commands) with the settings negotiated by the
+PHY.
+
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v3 net-next 24/24] dt-bindings: net: dsa: Add documentation for NXP SJA1105 driver
  2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
                   ` (22 preceding siblings ...)
  2019-04-13  1:28 ` [PATCH v3 net-next 23/24] Documentation: net: dsa: Add details about NXP SJA1105 driver Vladimir Oltean
@ 2019-04-13  1:28 ` Vladimir Oltean
  23 siblings, 0 replies; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13  1:28 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel, Vladimir Oltean

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
---
Changes in v3:
None.

Changes in v2:
Renamed sja1105,phy-mode to sja1105,role-phy and similarly for mac.
Clarified the switch situation with RGMII delays.

 .../devicetree/bindings/net/dsa/sja1105.txt   | 157 ++++++++++++++++++
 1 file changed, 157 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/dsa/sja1105.txt

diff --git a/Documentation/devicetree/bindings/net/dsa/sja1105.txt b/Documentation/devicetree/bindings/net/dsa/sja1105.txt
new file mode 100644
index 000000000000..529401cf01ea
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/dsa/sja1105.txt
@@ -0,0 +1,157 @@
+NXP SJA1105 switch driver
+=========================
+
+Required properties:
+
+- compatible:
+	Must be one of:
+	- "nxp,sja1105e"
+	- "nxp,sja1105t"
+	- "nxp,sja1105p"
+	- "nxp,sja1105q"
+	- "nxp,sja1105r"
+	- "nxp,sja1105s"
+
+	Although the device ID could be detected at runtime, explicit
+	compatible strings are required so that their validity can be checked
+	statically. For example, SGMII can only be specified on port 4 of R
+	and S devices, and the non-SGMII devices, while pin-compatible, are
+	not equal in terms of support for RGMII internal delays (supported on
+	P/Q/R/S, but not on E/T).
+
+Optional properties:
+
+- sja1105,role-mac:
+- sja1105,role-phy:
+	Boolean properties that can be assigned under each port node. By
+	default (unless otherwise specified) a port is configured as MAC if it
+	is driving a PHY (phy-handle is present) or as PHY if it is PHY-less
+	(fixed-link specified, presumably because it is connected to a MAC).
+	The effect of this property (in either its implicit or explicit form)
+	is:
+	- In the case of MII or RMII it specifies whether the SJA1105 port is a
+	  clock source or sink for this interface (not applicable for RGMII
+	  where there is a Tx and an Rx clock).
+	- In the case of RGMII it affects the behavior regarding internal
+	  delays:
+	  1. If sja1105,role-mac is specified, and the phy-mode property is one
+	     of "rgmii-id", "rgmii-txid" or "rgmii-rxid", then the entity
+	     designated to apply the delay/clock skew necessary for RGMII
+	     is the PHY. The SJA1105 MAC does not apply any internal delays.
+	  2. If sja1105,role-phy is specified, and the phy-mode property is one
+	     of the above, the designated entity to apply the internal delays
+	     is the SJA1105 MAC (if hardware-supported). This is only supported
+	     by the second-generation (P/Q/R/S) hardware. On a first-generation
+	     E or T device, it is an error to specify an RGMII phy-mode other
+	     than "rgmii" for a port that is in fixed-link mode. In that case,
+	     the clock skew must either be added by the MAC at the other end of
+	     the fixed-link, or by PCB serpentine traces on the board.
+	These properties are required, for example, in the case where SJA1105
+	ports are at both ends of a MII/RMII PHY-less setup. One end would need
+	to have sja1105,role-mac, while the other sja1105,role-phy.
+
+See Documentation/devicetree/bindings/net/dsa/dsa.txt for the list of standard
+DSA required and optional properties.
+
+Other observations
+------------------
+
+The SJA1105 SPI interface requires a CS-to-CLK time (t2 in UM10944) of at
+least one half of t_CLK. At an SPI frequency of 1 MHz, this means a minimum
+cs_sck_delay of 500 ns; at 4 MHz, t_CLK is 250 ns, so any delay of at least
+125 ns (such as the 1000 ns configured below via fsl,spi-cs-sck-delay) is
+sufficient. Ensuring that this SPI timing requirement is observed depends on
+the SPI bus master driver.
+
+Example
+-------
+
+Ethernet switch connected via SPI to the host, CPU port wired to enet2:
+
+arch/arm/boot/dts/ls1021a-tsn.dts:
+
+/* SPI controller of the LS1021 */
+&dspi0 {
+	sja1105@1 {
+		reg = <0x1>;
+		#address-cells = <1>;
+		#size-cells = <0>;
+		compatible = "nxp,sja1105";
+		spi-max-frequency = <4000000>;
+		fsl,spi-cs-sck-delay = <1000>;
+		fsl,spi-sck-cs-delay = <1000>;
+		ports {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			port@0 {
+				/* ETH5 written on chassis */
+				label = "swp5";
+				phy-handle = <&rgmii_phy6>;
+				phy-mode = "rgmii";
+				reg = <0>;
+				/* Implicit "sja1105,role-mac;" */
+			};
+			port@1 {
+				/* ETH2 written on chassis */
+				label = "swp2";
+				phy-handle = <&rgmii_phy3>;
+				phy-mode = "rgmii";
+				reg = <1>;
+				/* Implicit "sja1105,role-mac;" */
+			};
+			port@2 {
+				/* ETH3 written on chassis */
+				label = "swp3";
+				phy-handle = <&rgmii_phy4>;
+				phy-mode = "rgmii";
+				reg = <2>;
+				/* Implicit "sja1105,role-mac;" */
+			};
+			port@3 {
+				/* ETH4 written on chassis */
+				phy-handle = <&rgmii_phy5>;
+				label = "swp4";
+				phy-mode = "rgmii";
+				reg = <3>;
+				/* Implicit "sja1105,role-mac;" */
+			};
+			port@4 {
+				/* Internal port connected to eth2 */
+				ethernet = <&enet2>;
+				phy-mode = "rgmii";
+				reg = <4>;
+				/* Implicit "sja1105,role-phy;" */
+				fixed-link {
+					speed = <1000>;
+					full-duplex;
+				};
+			};
+		};
+	};
+};
+
+/* MDIO controller of the LS1021 */
+&mdio0 {
+	/* BCM5464 */
+	rgmii_phy3: ethernet-phy@3 {
+		reg = <0x3>;
+	};
+	rgmii_phy4: ethernet-phy@4 {
+		reg = <0x4>;
+	};
+	rgmii_phy5: ethernet-phy@5 {
+		reg = <0x5>;
+	};
+	rgmii_phy6: ethernet-phy@6 {
+		reg = <0x6>;
+	};
+};
+
+/* Ethernet master port of the LS1021 */
+&enet2 {
+	phy-connection-type = "rgmii";
+	status = "ok";
+	fixed-link {
+		speed = <1000>;
+		full-duplex;
+	};
+};
+
-- 
2.17.1



* Re: [PATCH v3 net-next 06/24] net: dsa: Call driver's setup callback after setting up its switchdev notifier
  2019-04-13  1:28 ` [PATCH v3 net-next 06/24] net: dsa: Call driver's setup callback after setting up its switchdev notifier Vladimir Oltean
@ 2019-04-13 15:05   ` Andrew Lunn
  0 siblings, 0 replies; 68+ messages in thread
From: Andrew Lunn @ 2019-04-13 15:05 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: f.fainelli, vivien.didelot, davem, netdev, linux-kernel, georg.waibel

On Sat, Apr 13, 2019 at 04:28:04AM +0300, Vladimir Oltean wrote:
> This allows the driver to perform some manipulations of its own during
> setup, using generic switchdev calls. Having the notifiers registered at
> setup time is important because otherwise any switchdev transaction
> emitted during this time would be ignored (dispatched to an empty call
> chain).
> 
> One current usage scenario is for the driver to request DSA to set up
> 802.1Q based switch tagging for its ports.
> 
> There is no danger for the driver setup code to start racing now with
> switchdev events emitted from the network stack (such as bridge core)
> even if the notifier is registered earlier. This is because the network
> stack needs a net_device as a vehicle to perform switchdev operations,
> and the slave net_devices are registered later than the core driver
> setup anyway (ds->ops->setup in dsa_switch_setup vs dsa_port_setup).
> 
> Luckily DSA doesn't need a net_device to carry out switchdev callbacks,
> and therefore drivers shouldn't assume either that net_devices are
> available at the time their switchdev callbacks get invoked.

Hi Vladimir

Thanks for adding this explanation to the commit message.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew


* Re: [PATCH v3 net-next 10/24] net: dsa: Unset vlan_filtering when ports leave the bridge
  2019-04-13  1:28 ` [PATCH v3 net-next 10/24] net: dsa: Unset vlan_filtering when ports leave the bridge Vladimir Oltean
@ 2019-04-13 15:11   ` Andrew Lunn
  2019-04-16 23:59   ` Florian Fainelli
  1 sibling, 0 replies; 68+ messages in thread
From: Andrew Lunn @ 2019-04-13 15:11 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: f.fainelli, vivien.didelot, davem, netdev, linux-kernel, georg.waibel

On Sat, Apr 13, 2019 at 04:28:08AM +0300, Vladimir Oltean wrote:
> When ports are standalone (after they left the bridge), they should have
> no VLAN filtering semantics (they should pass all traffic to the CPU).
> Currently this is not true for switchdev drivers, because the bridge
> "forgets" to unset that.
> 
> Normally one would think that doing this at the bridge layer would be a
> better idea, i.e. call br_vlan_filter_toggle() from br_del_if(), similar
> to how nbp_vlan_init() is called from br_add_if().
> 
> However what complicates that approach, and makes this one preferable,
> is the fact that for the bridge core, vlan_filtering is a per-bridge
> setting, whereas for switchdev/DSA it is per-port. Also there are
> switches where the setting is per the entire device, and unsetting
> vlan_filtering one by one, for each leaving port, would not be possible
> from the bridge core without a certain level of awareness. So do this in
> DSA and let drivers be unaware of it.
> 
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew


* Re: [PATCH v3 net-next 11/24] net: dsa: mt7530: Let DSA handle the unsetting of vlan_filtering
  2019-04-13  1:28 ` [PATCH v3 net-next 11/24] net: dsa: mt7530: Let DSA handle the unsetting of vlan_filtering Vladimir Oltean
@ 2019-04-13 15:12   ` Andrew Lunn
  2019-04-16 23:59   ` Florian Fainelli
  1 sibling, 0 replies; 68+ messages in thread
From: Andrew Lunn @ 2019-04-13 15:12 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: f.fainelli, vivien.didelot, davem, netdev, linux-kernel, georg.waibel

On Sat, Apr 13, 2019 at 04:28:09AM +0300, Vladimir Oltean wrote:
> The driver, recognizing that the .port_vlan_filtering callback was never
> coming after the port left its parent bridge, decided to take that duty
> in its own hands. DSA now takes care of this condition, so fix that.
> 
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew


* Re: [PATCH v3 net-next 12/24] net: dsa: Copy the vlan_filtering setting on the CPU port if it's global
  2019-04-13  1:28 ` [PATCH v3 net-next 12/24] net: dsa: Copy the vlan_filtering setting on the CPU port if it's global Vladimir Oltean
@ 2019-04-13 15:23   ` Andrew Lunn
  2019-04-13 15:37     ` Vladimir Oltean
  0 siblings, 1 reply; 68+ messages in thread
From: Andrew Lunn @ 2019-04-13 15:23 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: f.fainelli, vivien.didelot, davem, netdev, linux-kernel, georg.waibel

On Sat, Apr 13, 2019 at 04:28:10AM +0300, Vladimir Oltean wrote:
> The current behavior is not as obvious as one would assume (which is
> that, if the driver set vlan_filtering_is_global = 1, then checking any
> dp->vlan_filtering would yield the same result). Only the ports which
> are actively enslaved into a bridge would have vlan_filtering set.
> 
> This makes it tricky for drivers to check what the global state is.
> Moreover, the most obvious place to check for this setting, the CPU
> port, is not populated since it's not being enslaved to the bridge.
> So fix this and make the CPU port hold the global state of VLAN
> filtering on this switch.
> 
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> ---
> Changes in v3:
> Patch is new.
> 
>  net/dsa/port.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/net/dsa/port.c b/net/dsa/port.c
> index c8eb2cbcea6e..acb4ed1f9929 100644
> --- a/net/dsa/port.c
> +++ b/net/dsa/port.c
> @@ -190,6 +190,8 @@ static bool dsa_port_can_apply_vlan_filtering(struct dsa_port *dp,
>  int dsa_port_vlan_filtering(struct dsa_port *dp, bool vlan_filtering,
>  			    struct switchdev_trans *trans)
>  {
> +	/* Violate a const pointer here */
> +	struct dsa_port *cpu_dp = (struct dsa_port *)dp->cpu_dp;

Hi Vladimir

As compilers get more picky, i expect that is going to result in a
warning. 

Since this is a switch global attribute, putting it in dsa_switch
would be better, next to vlan_filtering_is_global.

	Andrew


* Re: [PATCH v3 net-next 12/24] net: dsa: Copy the vlan_filtering setting on the CPU port if it's global
  2019-04-13 15:23   ` Andrew Lunn
@ 2019-04-13 15:37     ` Vladimir Oltean
  0 siblings, 0 replies; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13 15:37 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Florian Fainelli, vivien.didelot, davem, netdev, linux-kernel,
	Georg Waibel

On Sat, 13 Apr 2019 at 18:23, Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Sat, Apr 13, 2019 at 04:28:10AM +0300, Vladimir Oltean wrote:
> > The current behavior is not as obvious as one would assume (which is
> > that, if the driver set vlan_filtering_is_global = 1, then checking any
> > dp->vlan_filtering would yield the same result). Only the ports which
> > are actively enslaved into a bridge would have vlan_filtering set.
> >
> > This makes it tricky for drivers to check what the global state is.
> > Moreover, the most obvious place to check for this setting, the CPU
> > port, is not populated since it's not being enslaved to the bridge.
> > So fix this and make the CPU port hold the global state of VLAN
> > filtering on this switch.
> >
> > Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> > ---
> > Changes in v3:
> > Patch is new.
> >
> >  net/dsa/port.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/net/dsa/port.c b/net/dsa/port.c
> > index c8eb2cbcea6e..acb4ed1f9929 100644
> > --- a/net/dsa/port.c
> > +++ b/net/dsa/port.c
> > @@ -190,6 +190,8 @@ static bool dsa_port_can_apply_vlan_filtering(struct dsa_port *dp,
> >  int dsa_port_vlan_filtering(struct dsa_port *dp, bool vlan_filtering,
> >                           struct switchdev_trans *trans)
> >  {
> > +     /* Violate a const pointer here */
> > +     struct dsa_port *cpu_dp = (struct dsa_port *)dp->cpu_dp;
>
> Hi Vladimir
>
> As compilers get more picky, i expect that is going to result in a
> warning.
>
> Since this is a switch global attribute, putting it in dsa_switch
> would be better, next to vlan_filteris_is_global.
>
>         Andrew

Hi Andrew,

Creating a bool ds->vlan_filtering wouldn't make a lot of sense for
the majority of drivers.
Additionally in my sja1105_filter() function, that would require me to
pass through one more pointer (dev->dsa_ptr->vlan_filtering vs
dev->dsa_ptr->*ds->*vlan_filtering) to reach the same information.
I don't think that keeping it in cpu_dp->vlan_filtering has any
semantical overlap with anything else that might appear in the future.
And I don't know why the cpu_dp pointer is const. In the
dsa_switch_tree it isn't.

Thanks,
-Vladimir


* Re: [PATCH v3 net-next 13/24] net: dsa: Allow drivers to filter packets they can decode source port from
  2019-04-13  1:28 ` [PATCH v3 net-next 13/24] net: dsa: Allow drivers to filter packets they can decode source port from Vladimir Oltean
@ 2019-04-13 15:39   ` Andrew Lunn
  2019-04-13 15:48     ` Vladimir Oltean
  0 siblings, 1 reply; 68+ messages in thread
From: Andrew Lunn @ 2019-04-13 15:39 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: f.fainelli, vivien.didelot, davem, netdev, linux-kernel, georg.waibel

On Sat, Apr 13, 2019 at 04:28:11AM +0300, Vladimir Oltean wrote:
> Frames get processed by DSA and redirected to switch port net devices
> based on the ETH_P_XDSA multiplexed packet_type handler found by the
> network stack when calling eth_type_trans().
> 
> The running assumption is that once the DSA .rcv function is called, DSA
> is always able to decode the switch tag in order to change the skb->dev
> from its master.
> 
> However there are tagging protocols (such as the new DSA_TAG_PROTO_SJA1105,
> user of DSA_TAG_PROTO_8021Q) where this assumption is not completely
> true, since switch tagging piggybacks on the absence of a vlan_filtering
> bridge. Moreover, management traffic (BPDU, PTP) for this switch doesn't
> rely on switch tagging, but on a different mechanism. So it would make
> sense to at least be able to terminate that.

Hi Vladimir

Let me see if i get this correct.

If the filter fails to match, the frame is received on the master
interface? So BPDUs and PTP packets are going to go to the master
interface?

How does the bridge get these BPDUs, and associated to the correct
slave port?

How does the PTP core code get these frames, and associated to the
correct slave port?

You say there is a different mechanism to do this. Maybe a later patch
i've not yet looked at. But cannot this mechanism be built into the
tagger? That is what the tagger is there for, to demultiplex a frame
to the correct slave. The current code assume the needed information
is in the header, but there is nothing to stop it looking deeper into
the packet if needed. So far, we have been reluctant for a tagger to
call into the DSA driver, but if need be, it could happen.

     Andrew


* Re: [PATCH v3 net-next 14/24] net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch
  2019-04-13  1:28 ` [PATCH v3 net-next 14/24] net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch Vladimir Oltean
@ 2019-04-13 15:42   ` Andrew Lunn
  2019-04-13 15:46     ` Vladimir Oltean
  0 siblings, 1 reply; 68+ messages in thread
From: Andrew Lunn @ 2019-04-13 15:42 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: f.fainelli, vivien.didelot, davem, netdev, linux-kernel, georg.waibel

On Sat, Apr 13, 2019 at 04:28:12AM +0300, Vladimir Oltean wrote:
> At this moment the following is supported:
> * Link state management through phylib
> * Autonomous L2 forwarding managed through iproute2 bridge commands. The
>   switch ports are initialized in a mode where they can only talk to the
>   CPU port. However, IP termination must be done currently through the
>   master netdevice.

Please could you explain that last sentence in more detail. Normally
the master device is just a dumb pipe. Nothing, except tcpdump,
uses it.

Thanks

     Andrew


* Re: [PATCH v3 net-next 14/24] net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch
  2019-04-13 15:42   ` Andrew Lunn
@ 2019-04-13 15:46     ` Vladimir Oltean
  2019-04-13 16:44       ` Andrew Lunn
  0 siblings, 1 reply; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13 15:46 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Florian Fainelli, vivien.didelot, davem, netdev, linux-kernel,
	Georg Waibel

Hi Andrew,

All I'm saying is that at this point in time (patch 14/24), the driver
is being introduced with DSA_TAG_PROTO_NONE. Then support for traffic
and switch tagging is added later on (18/24). I was just explaining
what can be done with the driver up to this point in the patchset.

Thanks,
-Vladimir


On Sat, 13 Apr 2019 at 18:43, Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Sat, Apr 13, 2019 at 04:28:12AM +0300, Vladimir Oltean wrote:
> > At this moment the following is supported:
> > * Link state management through phylib
> > * Autonomous L2 forwarding managed through iproute2 bridge commands. The
> >   switch ports are initialized in a mode where they can only talk to the
> >   CPU port. However, IP termination must be done currently through the
> >   master netdevice.
>
> Please could you explain that last sentence in more detail. Normally
> the master device is just a dumb pipe. Nothing, except tcpdump,
> uses it.
>
> Thanks
>
>      Andrew


* Re: [PATCH v3 net-next 13/24] net: dsa: Allow drivers to filter packets they can decode source port from
  2019-04-13 15:39   ` Andrew Lunn
@ 2019-04-13 15:48     ` Vladimir Oltean
  0 siblings, 0 replies; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13 15:48 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Florian Fainelli, vivien.didelot, davem, netdev, linux-kernel,
	Georg Waibel

On Sat, 13 Apr 2019 at 18:40, Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Sat, Apr 13, 2019 at 04:28:11AM +0300, Vladimir Oltean wrote:
> > Frames get processed by DSA and redirected to switch port net devices
> > based on the ETH_P_XDSA multiplexed packet_type handler found by the
> > network stack when calling eth_type_trans().
> >
> > The running assumption is that once the DSA .rcv function is called, DSA
> > is always able to decode the switch tag in order to change the skb->dev
> > from its master.
> >
> > However there are tagging protocols (such as the new DSA_TAG_PROTO_SJA1105,
> > user of DSA_TAG_PROTO_8021Q) where this assumption is not completely
> > true, since switch tagging piggybacks on the absence of a vlan_filtering
> > bridge. Moreover, management traffic (BPDU, PTP) for this switch doesn't
> > rely on switch tagging, but on a different mechanism. So it would make
> > sense to at least be able to terminate that.
>
> Hi Vladimir
>
> Let me see if i get this correct.
>
> If the filter fails to match, the frame is received on the master
> interface? So BPDUs and PTP packets are going to go to the master
> interface?
>
> How does the bridge get these BPDUs, and associated to the correct
> slave port?
>
> How does the PTP core code get these frames, and associated to the
> correct slave port?
>
> You say there is a different mechanism to do this. Maybe a later patch
> i've not yet looked at. But cannot this mechanism be built into the
> tagger? That is what the tagger is there for, to demultiplex a frame
> to the correct slave. The current code assume the needed information
> is in the header, but there is nothing to stop it looking deeper into
> the packet if needed. So far, we have been reluctant for a tagger to
> call into the DSA driver, but if need be, it could happen.
>
>      Andrew


Hi Andrew,

Yes it's explained in a further patch. See the checks for
sja1105_is_link_local() in patch 18/24.
There's also a table in the documentation patch (23/24).

Thanks,
-Vladimir


* Re: [PATCH v3 net-next 18/24] net: dsa: sja1105: Add support for traffic through standalone ports
  2019-04-13  1:28 ` [PATCH v3 net-next 18/24] net: dsa: sja1105: Add support for traffic through standalone ports Vladimir Oltean
@ 2019-04-13 16:37   ` Andrew Lunn
  2019-04-13 21:27     ` Vladimir Oltean
  0 siblings, 1 reply; 68+ messages in thread
From: Andrew Lunn @ 2019-04-13 16:37 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: f.fainelli, vivien.didelot, davem, netdev, linux-kernel, georg.waibel

On Sat, Apr 13, 2019 at 04:28:16AM +0300, Vladimir Oltean wrote:
> In order to support this, we are creating a make-shift switch tag out of
> a VLAN trunk configured on the CPU port. Termination of normal traffic
> on switch ports only works when not under a vlan_filtering bridge.
> Termination of management (PTP, BPDU) traffic works under all
> circumstances because it uses a different tagging mechanism
> (incl_srcpt). We are making use of the generic CONFIG_NET_DSA_TAG_8021Q
> code and leveraging it from our own CONFIG_NET_DSA_TAG_SJA1105.
> 
> There are two types of traffic: regular and link-local.
> The link-local traffic received on the CPU port is trapped from the
> switch's regular forwarding decisions because it matched one of the two
> DMAC filters for management traffic.
> On transmission, the switch requires special massaging for these
> link-local frames. Due to a weird implementation of the switching IP, by
> default it drops link-local frames that originate on the CPU port. It
> needs to be told where to forward them to, through an SPI command
> ("management route") that is valid for only a single frame.
> So when we're sending link-local traffic, we need to clone skb's from
> DSA and send them in our custom xmit worker that also performs SPI access.
> 
> For that purpose, the DSA xmit handler and the xmit worker communicate
> through a per-port "skb ring" software structure, with a producer and a
> consumer index. At the moment this structure is rather fragile
> (ping-flooding to a link-local DMAC would cause most of the frames to
> get dropped). I would like to move the management traffic on a separate
> netdev queue that I can stop when the skb ring got full and hardware is
> busy processing, so that we are not forced to drop traffic.
> 
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
> ---
> Changes in v3:
> Made management traffic be receivable on the DSA netdevices even when
> switch tagging is disabled, as well as regular traffic be receivable on
> the master netdevice in the same scenario. Both are accomplished using
> the sja1105_filter() function and some small touch-ups in the .rcv
> callback.

It seems like you made major changes to this. When you do that, you
should drop any reviewed-by tags you have. They are no longer valid
because of the major changes.

>  /* This callback needs to be present */
> @@ -1141,7 +1158,11 @@ static int sja1105_vlan_filtering(struct dsa_switch *ds, int port, bool enabled)
>  	if (rc)
>  		dev_err(ds->dev, "Failed to change VLAN Ethertype\n");
>  
> -	return rc;
> +	/* Switch port identification based on 802.1Q is only passable

possible, not passable.

> +	 * if we are not under a vlan_filtering bridge. So make sure
> +	 * the two configurations are mutually exclusive.
> +	 */
> +	return sja1105_setup_8021q_tagging(ds, !enabled);
>  }
>  
>  static void sja1105_vlan_add(struct dsa_switch *ds, int port,
> @@ -1233,9 +1254,107 @@ static int sja1105_setup(struct dsa_switch *ds)
>  	 */
>  	ds->vlan_filtering_is_global = true;
>  
> +	/* The DSA/switchdev model brings up switch ports in standalone mode by
> +	 * default, and that means vlan_filtering is 0 since they're not under
> +	 * a bridge, so it's safe to set up switch tagging at this time.
> +	 */
> +	return sja1105_setup_8021q_tagging(ds, true);
> +}
> +
> +#include "../../../net/dsa/dsa_priv.h"

No. Don't use relative includes like this.

What do you need from the header? Maybe move it into
include/linux/net/dsa.h

> +/* Deferred work is unfortunately necessary because setting up the management
> + * route cannot be done from atomic context (SPI transfer takes a sleepable
> + * lock on the bus)
> + */
> +static void sja1105_xmit_work_handler(struct work_struct *work)
> +{
> +	struct sja1105_port *sp = container_of(work, struct sja1105_port,
> +						xmit_work);
> +	struct sja1105_private *priv = sp->dp->ds->priv;
> +	struct net_device *slave = sp->dp->slave;
> +	struct net_device *master = dsa_slave_to_master(slave);
> +	int port = (uintptr_t)(sp - priv->ports);
> +	struct sk_buff *skb;
> +	int i, rc;
> +
> +	while ((i = sja1105_skb_ring_get(&sp->xmit_ring, &skb)) >= 0) {
> +		struct sja1105_mgmt_entry mgmt_route = { 0 };
> +		struct ethhdr *hdr;
> +		int timeout = 10;
> +		int skb_len;
> +
> +		skb_len = skb->len;
> +		hdr = eth_hdr(skb);
> +
> +		mgmt_route.macaddr = ether_addr_to_u64(hdr->h_dest);
> +		mgmt_route.destports = BIT(port);
> +		mgmt_route.enfport = 1;
> +		mgmt_route.tsreg = 0;
> +		mgmt_route.takets = false;
> +
> +		rc = sja1105_dynamic_config_write(priv, BLK_IDX_MGMT_ROUTE,
> +						  port, &mgmt_route, true);
> +		if (rc < 0) {
> +			kfree_skb(skb);
> +			slave->stats.tx_dropped++;
> +			continue;
> +		}
> +
> +		/* Transfer skb to the host port. */
> +		skb->dev = master;
> +		dev_queue_xmit(skb);
> +
> +		/* Wait until the switch has processed the frame */
> +		do {
> +			rc = sja1105_dynamic_config_read(priv, BLK_IDX_MGMT_ROUTE,
> +							 port, &mgmt_route);
> +			if (rc < 0) {
> +				slave->stats.tx_errors++;
> +				dev_err(priv->ds->dev,
> +					"xmit: failed to poll for mgmt route\n");
> +				continue;
> +			}
> +
> +			/* UM10944: The ENFPORT flag of the respective entry is
> +			 * cleared when a match is found. The host can use this
> +			 * flag as an acknowledgment.
> +			 */
> +			cpu_relax();
> +		} while (mgmt_route.enfport && --timeout);
> +
> +		if (!timeout) {
> +			dev_err(priv->ds->dev, "xmit timed out\n");
> +			slave->stats.tx_errors++;
> +			continue;
> +		}
> +
> +		slave->stats.tx_packets++;
> +		slave->stats.tx_bytes += skb_len;
> +	}
> +}
> +
> +static int sja1105_port_enable(struct dsa_switch *ds, int port,
> +			       struct phy_device *phydev)
> +{
> +	struct sja1105_private *priv = ds->priv;
> +	struct sja1105_port *sp = &priv->ports[port];
> +
> +	sp->dp = &ds->ports[port];
> +	INIT_WORK(&sp->xmit_work, sja1105_xmit_work_handler);
>  	return 0;
>  }

I think i'm missing something here. You have a per port queue of link
local frames which need special handling. And you have a per-port work
queue. To send such a frame, you need to write some register, send the
frame, and then wait until the mgmt_route.enfport is reset.

Why are you doing this per port? How do you stop two ports/work queues
running at the same time? It seems like one queue, with one work queue
would be a better structure.

Also, please move all this code into the tagger. Just add exports for 
sja1105_dynamic_config_write() and sja1105_dynamic_config_read().

> +static void sja1105_port_disable(struct dsa_switch *ds, int port)
> +{
> +	struct sja1105_private *priv = ds->priv;
> +	struct sja1105_port *sp = &priv->ports[port];
> +	struct sk_buff *skb;
> +
> +	cancel_work_sync(&sp->xmit_work);
> +	while (sja1105_skb_ring_get(&sp->xmit_ring, &skb) >= 0)
> +		kfree_skb(skb);
> +}
> +
> diff --git a/net/dsa/tag_sja1105.c b/net/dsa/tag_sja1105.c
> new file mode 100644
> index 000000000000..5c76a06c9093
> --- /dev/null
> +++ b/net/dsa/tag_sja1105.c
> @@ -0,0 +1,148 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2019, Vladimir Oltean <olteanv@gmail.com>
> + */
> +#include <linux/etherdevice.h>
> +#include <linux/if_vlan.h>
> +#include <linux/dsa/sja1105.h>
> +#include "../../drivers/net/dsa/sja1105/sja1105.h"

Again, no, don't do this.

> +
> +#include "dsa_priv.h"
> +
> +/* Similar to is_link_local_ether_addr(hdr->h_dest) but also covers PTP */
> +static inline bool sja1105_is_link_local(const struct sk_buff *skb)
> +{
> +	const struct ethhdr *hdr = eth_hdr(skb);
> +	u64 dmac = ether_addr_to_u64(hdr->h_dest);
> +
> +	if ((dmac & SJA1105_LINKLOCAL_FILTER_A_MASK) ==
> +		    SJA1105_LINKLOCAL_FILTER_A)
> +		return true;
> +	if ((dmac & SJA1105_LINKLOCAL_FILTER_B_MASK) ==
> +		    SJA1105_LINKLOCAL_FILTER_B)
> +		return true;
> +	return false;
> +}
> +
> +static bool sja1105_filter(const struct sk_buff *skb, struct net_device *dev)
> +{
> +	if (sja1105_is_link_local(skb))
> +		return true;
> +	if (!dev->dsa_ptr->vlan_filtering)
> +		return true;
> +	return false;
> +}

Please add a comment here about what frames cannot be handled by the
tagger. However, I'm not too happy about this design...

> +
> +static struct sk_buff *sja1105_xmit(struct sk_buff *skb,
> +				    struct net_device *netdev)
> +{
> +	struct dsa_port *dp = dsa_slave_to_port(netdev);
> +	struct dsa_switch *ds = dp->ds;
> +	struct sja1105_private *priv = ds->priv;
> +	struct sja1105_port *sp = &priv->ports[dp->index];
> +	struct sk_buff *clone;
> +
> +	if (likely(!sja1105_is_link_local(skb))) {
> +		/* Normal traffic path. */
> +		u16 tx_vid = dsa_tagging_tx_vid(ds, dp->index);
> +		u8 pcp = skb->priority;
> +
> +		/* If we are under a vlan_filtering bridge, IP termination on
> +		 * switch ports based on 802.1Q tags is simply too brittle to
> +		 * be passable. So just defer to the dsa_slave_notag_xmit
> +		 * implementation.
> +		 */
> +		if (dp->vlan_filtering)
> +			return skb;
> +
> +		return dsa_8021q_xmit(skb, netdev, ETH_P_EDSA,
> +				     ((pcp << VLAN_PRIO_SHIFT) | tx_vid));

Please don't reuse ETH_P_EDSA. Define an ETH_P_SJA1105.

> +	}
> +
> +	/* Code path for transmitting management traffic. This does not rely
> +	 * upon switch tagging, but instead SPI-installed management routes.
> +	 */
> +	clone = skb_clone(skb, GFP_ATOMIC);
> +	if (!clone) {
> +		dev_err(ds->dev, "xmit: failed to clone skb\n");
> +		return NULL;
> +	}
> +
> +	if (sja1105_skb_ring_add(&sp->xmit_ring, clone) < 0) {
> +		dev_err(ds->dev, "xmit: skb ring full\n");
> +		kfree_skb(clone);
> +		return NULL;
> +	}
> +
> +	if (sp->xmit_ring.count == SJA1105_SKB_RING_SIZE)
> +		/* TODO setup a dedicated netdev queue for management traffic
> +		 * so that we can selectively apply backpressure and not be
> +		 * required to stop the entire traffic when the software skb
> +		 * ring is full. This requires hooking the ndo_select_queue
> +		 * from DSA and matching on mac_fltres.
> +		 */
> +		dev_err(ds->dev, "xmit: reached maximum skb ring size\n");

This should be rate limited.

     Andrew

> +
> +	schedule_work(&sp->xmit_work);
> +	/* Let DSA free its reference to the skb and we will free
> +	 * the clone in the deferred worker
> +	 */
> +	return NULL;
> +}
> +
> +static struct sk_buff *sja1105_rcv(struct sk_buff *skb,
> +				   struct net_device *netdev,
> +				   struct packet_type *pt)
> +{
> +	unsigned int source_port, switch_id;
> +	struct ethhdr *hdr = eth_hdr(skb);
> +	struct sk_buff *nskb;
> +	u16 tpid, vid, tci;
> +	bool is_tagged;
> +
> +	nskb = dsa_8021q_rcv(skb, netdev, pt, &tpid, &tci);
> +	is_tagged = (nskb && tpid == ETH_P_EDSA);
> +
> +	skb->priority = (tci & VLAN_PRIO_MASK) >> VLAN_PRIO_SHIFT;
> +	vid = tci & VLAN_VID_MASK;
> +
> +	skb->offload_fwd_mark = 1;
> +
> +	if (likely(!sja1105_is_link_local(skb))) {
> +		/* Normal traffic path. */
> +		source_port = dsa_tagging_rx_source_port(vid);
> +		switch_id = dsa_tagging_rx_switch_id(vid);
> +	} else {
> +		/* Management traffic path. Switch embeds the switch ID and
> +		 * port ID into bytes of the destination MAC, courtesy of
> +		 * the incl_srcpt options.
> +		 */
> +		source_port = hdr->h_dest[3];
> +		switch_id = hdr->h_dest[4];
> +		/* Clear the DMAC bytes that were mangled by the switch */
> +		hdr->h_dest[3] = 0;
> +		hdr->h_dest[4] = 0;
> +	}
> +
> +	skb->dev = dsa_master_find_slave(netdev, switch_id, source_port);
> +	if (!skb->dev) {
> +		netdev_warn(netdev, "Couldn't decode source port\n");
> +		return NULL;
> +	}
> +
> +	/* Delete/overwrite fake VLAN header, DSA expects to not find
> +	 * it there, see dsa_switch_rcv: skb_push(skb, ETH_HLEN).
> +	 */
> +	if (is_tagged)
> +		memmove(skb->data - ETH_HLEN, skb->data - ETH_HLEN - VLAN_HLEN,
> +			ETH_HLEN - VLAN_HLEN);
> +
> +	return skb;
> +}
> +
> +const struct dsa_device_ops sja1105_netdev_ops = {
> +	.xmit = sja1105_xmit,
> +	.rcv = sja1105_rcv,
> +	.filter = sja1105_filter,
> +	.overhead = VLAN_HLEN,
> +};
> +
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v3 net-next 19/24] net: dsa: sja1105: Add support for Spanning Tree Protocol
  2019-04-13  1:28 ` [PATCH v3 net-next 19/24] net: dsa: sja1105: Add support for Spanning Tree Protocol Vladimir Oltean
@ 2019-04-13 16:41   ` Andrew Lunn
  0 siblings, 0 replies; 68+ messages in thread
From: Andrew Lunn @ 2019-04-13 16:41 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: f.fainelli, vivien.didelot, davem, netdev, linux-kernel, georg.waibel

On Sat, Apr 13, 2019 at 04:28:17AM +0300, Vladimir Oltean wrote:
> While not explicitly documented as supported in UM10944, compliance with
> the STP states can be obtained by manipulating 3 settings at the
> (per-port) MAC config level: dynamic learning, inhibiting reception of
> regular traffic, and inhibiting transmission of regular traffic.
> 
> In all these modes, transmission and reception of special BPDU frames
> from the stack is still enabled (not inhibited by the MAC-level
> settings).
> 
> On ingress, BPDUs are classified by the MAC filter as link-local
> (01-80-C2-00-00-00) and forwarded to the CPU port.  This mechanism works
> under all conditions (even without the custom 802.1Q tagging) because
> the switch hardware inserts the source port and switch ID into bytes 4
> and 5 of the MAC-filtered frames. Then the DSA .rcv handler needs to put
> back zeroes into the MAC address after decoding the source port
> information.
> 
> On egress, BPDUs are transmitted using management routes from the xmit
> worker thread. Again this does not require switch tagging, as the switch
> port is programmed through SPI to hold a temporary (single-fire) route
> for a frame with the programmed destination MAC (01-80-C2-00-00-00).
> 
> STP is activated using the following commands and was tested by
> connecting two front-panel ports together and noticing that switching
> loops were prevented (one port remains in the blocking state):
> 
> $ ip link add name br0 type bridge stp_state 1 && ip link set br0 up
> $ for eth in $(ls /sys/devices/platform/soc/2100000.spi/spi_master/spi0/spi0.1/net/);
>   do ip link set ${eth} master br0 && ip link set ${eth} up; done
> 
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew


* Re: [PATCH v3 net-next 14/24] net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch
  2019-04-13 15:46     ` Vladimir Oltean
@ 2019-04-13 16:44       ` Andrew Lunn
  2019-04-13 21:29         ` Vladimir Oltean
  0 siblings, 1 reply; 68+ messages in thread
From: Andrew Lunn @ 2019-04-13 16:44 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Florian Fainelli, vivien.didelot, davem, netdev, linux-kernel,
	Georg Waibel

On Sat, Apr 13, 2019 at 06:46:57PM +0300, Vladimir Oltean wrote:
> Hi Andrew,
> 
> All I'm saying is that at this point in time (patch 14/24), the driver
> is being introduced with DSA_TAG_PROTO_NONE. Then support for traffic
> and switch tagging is added later on (18/24). I was just explaining
> what can be done with the driver up to this point in the patchset.

O.K. So at this point in time, the driver is unusable. Too much other
stuff is missing. So I don't see any point in mentioning this.

      Andrew


* Re: [PATCH v3 net-next 20/24] net: dsa: sja1105: Error out if RGMII delays are requested in DT
  2019-04-13  1:28 ` [PATCH v3 net-next 20/24] net: dsa: sja1105: Error out if RGMII delays are requested in DT Vladimir Oltean
@ 2019-04-13 16:49   ` Andrew Lunn
  2019-04-13 20:47   ` Jiri Pirko
  1 sibling, 0 replies; 68+ messages in thread
From: Andrew Lunn @ 2019-04-13 16:49 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: f.fainelli, vivien.didelot, davem, netdev, linux-kernel, georg.waibel

On Sat, Apr 13, 2019 at 04:28:18AM +0300, Vladimir Oltean wrote:
> Documentation/devicetree/bindings/net/ethernet.txt is confusing because
> it says what the MAC should not do, but not what it *should* do:
> 
>   * "rgmii-rxid" (RGMII with internal RX delay provided by the PHY, the MAC
>      should not add an RX delay in this case)
> 
> The gap in semantics is threefold:
> 1. Is it illegal for the MAC to apply the Rx internal delay by itself,
>    and simplify the phy_mode (mask off "rgmii-rxid" into "rgmii") before
>    passing it to of_phy_connect? The documentation would suggest yes.
> 2. For "rgmii-rxid", while the situation with the Rx clock skew is more
>    or less clear (needs to be added by the PHY), what should the MAC
>    driver do about the Tx delays? Is it an implicit wild card for the
>    MAC to apply delays in the Tx direction if it can? What if those were
>    already added as serpentine PCB traces, how could that be made more
>    obvious through DT bindings so that the MAC doesn't attempt to add
>    them twice and again potentially break the link?
> 3. If the interface is a fixed-link and therefore the PHY object is
>    fixed (a purely software entity that obviously cannot add clock
>    skew), what is the meaning of the above property?
> 
> So an interpretation of the RGMII bindings was chosen that hopefully
> does not contradict their intention but also makes them more actionable.
> The SJA1105 driver knows to act upon "rgmii-*id" phy-mode bindings
> if the port is in the PHY role (either explicitly, or if it is a
> fixed-link). Otherwise it always passes the duty of setting up delays to
> the PHY driver.

That is a good interpretation. I always recommend the PHY does the
delay, because in general the PHY can, and often the MAC cannot.

> 
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew


* Re: [PATCH v3 net-next 21/24] net: dsa: sja1105: Prevent PHY jabbering during switch reset
  2019-04-13  1:28 ` [PATCH v3 net-next 21/24] net: dsa: sja1105: Prevent PHY jabbering during switch reset Vladimir Oltean
@ 2019-04-13 16:54   ` Andrew Lunn
  0 siblings, 0 replies; 68+ messages in thread
From: Andrew Lunn @ 2019-04-13 16:54 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: f.fainelli, vivien.didelot, davem, netdev, linux-kernel, georg.waibel

On Sat, Apr 13, 2019 at 04:28:19AM +0300, Vladimir Oltean wrote:
> Resetting the switch at runtime is currently done while changing the
> vlan_filtering setting (due to the required TPID change).
> 
> But reset is asynchronous with packet egress, and the switch core will
> not wait for egress to finish before carrying on with the reset
> operation.
> 
> As a result, a connected PHY such as the BCM5464 would see an
> unterminated Ethernet frame and start to jabber (repeat the last seen
> Ethernet symbols - jabber is by definition an oversized Ethernet frame
> with bad FCS). This behavior is strange in itself, but it also causes
> the MACs of some link partners (such as the FRDM-LS1012A) to completely
> lock up.
> 
> So as a remedy for this situation, when switch reset is required, simply
> inhibit Tx on all ports, and wait for the necessary time for the
> eventual one frame left in the egress queue (not even the Tx inhibit
> command is instantaneous) to be flushed.
> 
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew


* Re: [PATCH v3 net-next 20/24] net: dsa: sja1105: Error out if RGMII delays are requested in DT
  2019-04-13  1:28 ` [PATCH v3 net-next 20/24] net: dsa: sja1105: Error out if RGMII delays are requested in DT Vladimir Oltean
  2019-04-13 16:49   ` Andrew Lunn
@ 2019-04-13 20:47   ` Jiri Pirko
  2019-04-13 21:31     ` Vladimir Oltean
  1 sibling, 1 reply; 68+ messages in thread
From: Jiri Pirko @ 2019-04-13 20:47 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: f.fainelli, vivien.didelot, andrew, davem, netdev, linux-kernel,
	georg.waibel

Sat, Apr 13, 2019 at 03:28:18AM CEST, olteanv@gmail.com wrote:
>Documentation/devicetree/bindings/net/ethernet.txt is confusing because
>it says what the MAC should not do, but not what it *should* do:
>
>  * "rgmii-rxid" (RGMII with internal RX delay provided by the PHY, the MAC
>     should not add an RX delay in this case)
>
>The gap in semantics is threefold:
>1. Is it illegal for the MAC to apply the Rx internal delay by itself,
>   and simplify the phy_mode (mask off "rgmii-rxid" into "rgmii") before
>   passing it to of_phy_connect? The documentation would suggest yes.
>2. For "rgmii-rxid", while the situation with the Rx clock skew is more
>   or less clear (needs to be added by the PHY), what should the MAC
>   driver do about the Tx delays? Is it an implicit wild card for the
>   MAC to apply delays in the Tx direction if it can? What if those were
>   already added as serpentine PCB traces, how could that be made more
>   obvious through DT bindings so that the MAC doesn't attempt to add
>   them twice and again potentially break the link?
>3. If the interface is a fixed-link and therefore the PHY object is
>   fixed (a purely software entity that obviously cannot add clock
>   skew), what is the meaning of the above property?
>
>So an interpretation of the RGMII bindings was chosen that hopefully
>does not contradict their intention but also makes them more actionable.
>The SJA1105 driver knows to act upon "rgmii-*id" phy-mode bindings
>if the port is in the PHY role (either explicitly, or if it is a
>fixed-link). Otherwise it always passes the duty of setting up delays to
>the PHY driver.
>
>The error behavior that this patch adds is required on SJA1105E/T where
>the MAC really cannot apply internal delays. If the other end of the
>fixed-link cannot apply RGMII delays either (this would be specified
>through its own DT bindings), then the situation requires PCB delays.
>
>For SJA1105P/Q/R/S, this is however hardware supported and the error is
>thus only temporary. I created a stub function pointer for configuring
>delays per-port on RXC and TXC, and will implement it when I have access
>to a board with this hardware setup.
>
>Meanwhile do not allow the user to select an invalid configuration.
>
>Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
>Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
>---
>Changes in v3:
>None.
>
>Changes in v2:
>Patch is new.
>
> drivers/net/dsa/sja1105/sja1105.h          |  3 ++
> drivers/net/dsa/sja1105/sja1105_clocking.c |  7 ++++-
> drivers/net/dsa/sja1105/sja1105_main.c     | 32 +++++++++++++++++++++-
> drivers/net/dsa/sja1105/sja1105_spi.c      |  6 ++++
> 4 files changed, 46 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
>index b7e745c0bb3a..3c16b991032c 100644
>--- a/drivers/net/dsa/sja1105/sja1105.h
>+++ b/drivers/net/dsa/sja1105/sja1105.h
>@@ -22,6 +22,8 @@
> 
> struct sja1105_port {
> 	struct dsa_port *dp;
>+	bool rgmii_rx_delay;
>+	bool rgmii_tx_delay;
> 	struct work_struct xmit_work;
> 	struct sja1105_skb_ring xmit_ring;
> };
>@@ -61,6 +63,7 @@ struct sja1105_info {
> 	const struct sja1105_table_ops *static_ops;
> 	const struct sja1105_regs *regs;
> 	int (*reset_cmd)(const void *ctx, const void *data);
>+	int (*setup_rgmii_delay)(const void *ctx, int port, bool rx, bool tx);
> 	const char *name;
> };
> 
>diff --git a/drivers/net/dsa/sja1105/sja1105_clocking.c b/drivers/net/dsa/sja1105/sja1105_clocking.c
>index d40da3d52464..c02fec181676 100644
>--- a/drivers/net/dsa/sja1105/sja1105_clocking.c
>+++ b/drivers/net/dsa/sja1105/sja1105_clocking.c
>@@ -432,7 +432,12 @@ static int rgmii_clocking_setup(struct sja1105_private *priv, int port)
> 		dev_err(dev, "Failed to configure Tx pad registers\n");
> 		return rc;
> 	}
>-	return 0;
>+	if (!priv->info->setup_rgmii_delay)
>+		return 0;
>+
>+	return priv->info->setup_rgmii_delay(priv, port,
>+					     priv->ports[port].rgmii_rx_delay,
>+					     priv->ports[port].rgmii_tx_delay);
> }
> 
> static int sja1105_cgu_rmii_ref_clk_config(struct sja1105_private *priv,
>diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
>index e4abf8fb2013..5f7ddb1da006 100644
>--- a/drivers/net/dsa/sja1105/sja1105_main.c
>+++ b/drivers/net/dsa/sja1105/sja1105_main.c
>@@ -555,6 +555,21 @@ static int sja1105_static_config_load(struct sja1105_private *priv,
> 	return sja1105_static_config_upload(priv);
> }
> 
>+static void sja1105_parse_rgmii_delay(const struct sja1105_dt_port *in,
>+				      struct sja1105_port *out)
>+{
>+	if (in->role == XMII_MAC)
>+		return;
>+
>+	if (in->phy_mode == PHY_INTERFACE_MODE_RGMII_RXID ||
>+	    in->phy_mode == PHY_INTERFACE_MODE_RGMII_ID)
>+		out->rgmii_rx_delay = true;
>+
>+	if (in->phy_mode == PHY_INTERFACE_MODE_RGMII_TXID ||
>+	    in->phy_mode == PHY_INTERFACE_MODE_RGMII_ID)
>+		out->rgmii_tx_delay = true;
>+}
>+
> static int sja1105_parse_ports_node(struct sja1105_private *priv,
> 				    struct sja1105_dt_port *ports,
> 				    struct device_node *ports_node)
>@@ -1315,13 +1330,28 @@ static int sja1105_setup(struct dsa_switch *ds)
> {
> 	struct sja1105_dt_port ports[SJA1105_NUM_PORTS];
> 	struct sja1105_private *priv = ds->priv;
>-	int rc;
>+	int rc, i;
> 
> 	rc = sja1105_parse_dt(priv, ports);
> 	if (rc < 0) {
> 		dev_err(ds->dev, "Failed to parse DT: %d\n", rc);
> 		return rc;
> 	}
>+
>+	/* Error out early if internal delays are required through DT
>+	 * and we can't apply them.
>+	 */
>+	for (i = 0; i < SJA1105_NUM_PORTS; i++) {
>+		sja1105_parse_rgmii_delay(&ports[i], &priv->ports[i]);
>+
>+		if ((priv->ports[i].rgmii_rx_delay ||
>+		     priv->ports[i].rgmii_tx_delay) &&
>+		     !priv->info->setup_rgmii_delay) {
>+			dev_err(ds->dev, "RGMII delay not supported\n");
>+			return -EINVAL;
>+		}
>+	}
>+
> 	/* Create and send configuration down to device */
> 	rc = sja1105_static_config_load(priv, ports);
> 	if (rc < 0) {
>diff --git a/drivers/net/dsa/sja1105/sja1105_spi.c b/drivers/net/dsa/sja1105/sja1105_spi.c
>index 09cb28e9be20..e4ef4d8048b2 100644
>--- a/drivers/net/dsa/sja1105/sja1105_spi.c
>+++ b/drivers/net/dsa/sja1105/sja1105_spi.c
>@@ -499,6 +499,7 @@ struct sja1105_info sja1105e_info = {
> 	.part_no		= SJA1105ET_PART_NO,
> 	.static_ops		= sja1105e_table_ops,
> 	.dyn_ops		= sja1105et_dyn_ops,
>+	.setup_rgmii_delay	= NULL,
> 	.reset_cmd		= sja1105et_reset_cmd,
> 	.regs			= &sja1105et_regs,
> 	.name			= "SJA1105E",
>@@ -508,6 +509,7 @@ struct sja1105_info sja1105t_info = {
> 	.part_no		= SJA1105ET_PART_NO,
> 	.static_ops		= sja1105t_table_ops,
> 	.dyn_ops		= sja1105et_dyn_ops,
>+	.setup_rgmii_delay	= NULL,
> 	.reset_cmd		= sja1105et_reset_cmd,
> 	.regs			= &sja1105et_regs,
> 	.name			= "SJA1105T",
>@@ -517,6 +519,7 @@ struct sja1105_info sja1105p_info = {
> 	.part_no		= SJA1105P_PART_NO,
> 	.static_ops		= sja1105p_table_ops,
> 	.dyn_ops		= sja1105pqrs_dyn_ops,
>+	.setup_rgmii_delay	= NULL,
> 	.reset_cmd		= sja1105pqrs_reset_cmd,
> 	.regs			= &sja1105pqrs_regs,
> 	.name			= "SJA1105P",
>@@ -526,6 +529,7 @@ struct sja1105_info sja1105q_info = {
> 	.part_no		= SJA1105Q_PART_NO,
> 	.static_ops		= sja1105q_table_ops,
> 	.dyn_ops		= sja1105pqrs_dyn_ops,
>+	.setup_rgmii_delay	= NULL,
> 	.reset_cmd		= sja1105pqrs_reset_cmd,
> 	.regs			= &sja1105pqrs_regs,
> 	.name			= "SJA1105Q",
>@@ -535,6 +539,7 @@ struct sja1105_info sja1105r_info = {
> 	.part_no		= SJA1105R_PART_NO,
> 	.static_ops		= sja1105r_table_ops,
> 	.dyn_ops		= sja1105pqrs_dyn_ops,
>+	.setup_rgmii_delay	= NULL,
> 	.reset_cmd		= sja1105pqrs_reset_cmd,
> 	.regs			= &sja1105pqrs_regs,
> 	.name			= "SJA1105R",
>@@ -545,6 +550,7 @@ struct sja1105_info sja1105s_info = {
> 	.static_ops		= sja1105s_table_ops,
> 	.dyn_ops		= sja1105pqrs_dyn_ops,
> 	.regs			= &sja1105pqrs_regs,
>+	.setup_rgmii_delay	= NULL,

You don't need to set this to NULL. Please avoid that.


> 	.reset_cmd		= sja1105pqrs_reset_cmd,
> 	.name			= "SJA1105S",
> };
>-- 
>2.17.1
>


* Re: [PATCH v3 net-next 17/24] net: dsa: sja1105: Add support for ethtool port counters
  2019-04-13  1:28 ` [PATCH v3 net-next 17/24] net: dsa: sja1105: Add support for ethtool port counters Vladimir Oltean
@ 2019-04-13 20:53   ` Jiri Pirko
  2019-04-13 21:55     ` Vladimir Oltean
  0 siblings, 1 reply; 68+ messages in thread
From: Jiri Pirko @ 2019-04-13 20:53 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: f.fainelli, vivien.didelot, andrew, davem, netdev, linux-kernel,
	georg.waibel

Sat, Apr 13, 2019 at 03:28:15AM CEST, olteanv@gmail.com wrote:
>Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
>Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
>---
>Changes in v3:
>None.
>
>Changes in v2:
>None functional. Moved the IS_ET() and IS_PQRS() device identification
>macros here since they are not used in earlier patches.
>
> drivers/net/dsa/sja1105/Makefile              |   1 +
> drivers/net/dsa/sja1105/sja1105.h             |   7 +-
> drivers/net/dsa/sja1105/sja1105_ethtool.c     | 414 ++++++++++++++++++
> drivers/net/dsa/sja1105/sja1105_main.c        |   3 +
> .../net/dsa/sja1105/sja1105_static_config.h   |  21 +
> 5 files changed, 445 insertions(+), 1 deletion(-)
> create mode 100644 drivers/net/dsa/sja1105/sja1105_ethtool.c
>
>diff --git a/drivers/net/dsa/sja1105/Makefile b/drivers/net/dsa/sja1105/Makefile
>index ed00840802f4..bb4404c79eb2 100644
>--- a/drivers/net/dsa/sja1105/Makefile
>+++ b/drivers/net/dsa/sja1105/Makefile
>@@ -3,6 +3,7 @@ obj-$(CONFIG_NET_DSA_SJA1105) += sja1105.o
> sja1105-objs := \
>     sja1105_spi.o \
>     sja1105_main.o \
>+    sja1105_ethtool.o \
>     sja1105_clocking.o \
>     sja1105_static_config.o \
>     sja1105_dynamic_config.o \
>diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
>index 4c9df44a4478..80b20bdd8f9c 100644
>--- a/drivers/net/dsa/sja1105/sja1105.h
>+++ b/drivers/net/dsa/sja1105/sja1105.h
>@@ -120,8 +120,13 @@ typedef enum {
> int sja1105_clocking_setup_port(struct sja1105_private *priv, int port);
> int sja1105_clocking_setup(struct sja1105_private *priv);
> 
>-/* From sja1105_dynamic_config.c */
>+/* From sja1105_ethtool.c */
>+void sja1105_get_ethtool_stats(struct dsa_switch *ds, int port, u64 *data);
>+void sja1105_get_strings(struct dsa_switch *ds, int port,
>+			 u32 stringset, u8 *data);
>+int sja1105_get_sset_count(struct dsa_switch *ds, int port, int sset);
> 
>+/* From sja1105_dynamic_config.c */
> int sja1105_dynamic_config_read(struct sja1105_private *priv,
> 				enum sja1105_blk_idx blk_idx,
> 				int index, void *entry);
>diff --git a/drivers/net/dsa/sja1105/sja1105_ethtool.c b/drivers/net/dsa/sja1105/sja1105_ethtool.c
>new file mode 100644
>index 000000000000..c082599702bd
>--- /dev/null
>+++ b/drivers/net/dsa/sja1105/sja1105_ethtool.c
>@@ -0,0 +1,414 @@
>+// SPDX-License-Identifier: GPL-2.0
>+/* Copyright (c) 2018-2019, Vladimir Oltean <olteanv@gmail.com>
>+ */
>+#include "sja1105.h"
>+
>+#define SIZE_MAC_AREA		(0x02 * 4)
>+#define SIZE_HL1_AREA		(0x10 * 4)
>+#define SIZE_HL2_AREA		(0x4 * 4)
>+#define SIZE_QLEVEL_AREA	(0x8 * 4) /* 0x4 to 0xB */

Please use prefixes for defines like this. For example "SIZE_MAC_AREA"
sounds way too generic.

[...]


>+static void
>+sja1105_port_status_hl1_unpack(void *buf,
>+			       struct sja1105_port_status_hl1 *status)
>+{
>+	/* Make pointer arithmetic work on 4 bytes */
>+	u32 *p = (u32 *)buf;

You don't need to cast void *. Please avoid it in the whole patchset.

[...]


>+	if (!IS_PQRS(priv->info->device_id))
>+		return;
>+
>+	memset(data + k, 0, ARRAY_SIZE(sja1105pqrs_extra_port_stats) *
>+			sizeof(u64));
>+	for (i = 0; i < 8; i++) {

Array size instead of "8"?


>+		data[k++] = status.hl2.qlevel_hwm[i];
>+		data[k++] = status.hl2.qlevel[i];
>+	}

[...]
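[The ARRAY_SIZE() idiom the review comment alludes to derives the loop bound from the array itself instead of a magic 8; a minimal standalone equivalent of the kernel macro (without the kernel's extra type-checking):]

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of the kernel's ARRAY_SIZE(); only valid on true
 * arrays, not on pointers, since sizeof a pointer is not the array size.
 */
#define ARRAY_SIZE_MODEL(arr) (sizeof(arr) / sizeof((arr)[0]))
```

With it, the loop becomes `for (i = 0; i < ARRAY_SIZE(status.hl2.qlevel); i++)`, which stays correct if the queue count ever changes.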


> 
>+#define IS_PQRS(device_id) \
>+	(((device_id) == SJA1105PR_DEVICE_ID) || \
>+	 ((device_id) == SJA1105QS_DEVICE_ID))
>+#define IS_ET(device_id) \
>+	(((device_id) == SJA1105E_DEVICE_ID) || \
>+	 ((device_id) == SJA1105T_DEVICE_ID))
>+/* P and R have same Device ID, and differ by Part Number */
>+#define IS_P(device_id, part_nr) \
>+	(((device_id) == SJA1105PR_DEVICE_ID) && \
>+	 ((part_nr) == SJA1105P_PART_NR))
>+#define IS_R(device_id, part_nr) \
>+	(((device_id) == SJA1105PR_DEVICE_ID) && \
>+	 ((part_nr) == SJA1105R_PART_NR))
>+/* Same for Q and S */
>+#define IS_Q(device_id, part_nr) \
>+	(((device_id) == SJA1105QS_DEVICE_ID) && \
>+	 ((part_nr) == SJA1105Q_PART_NR))
>+#define IS_S(device_id, part_nr) \

Please have a prefix for macros like this. "IS_S" sounds way too
generic...


>+	(((device_id) == SJA1105QS_DEVICE_ID) && \
>+	 ((part_nr) == SJA1105S_PART_NR))
>+
> struct sja1105_general_params_entry {
> 	u64 vllupformat;
> 	u64 mirr_ptacu;
>-- 
>2.17.1
>


* Re: [PATCH v3 net-next 16/24] net: dsa: sja1105: Add support for VLAN operations
  2019-04-13  1:28 ` [PATCH v3 net-next 16/24] net: dsa: sja1105: Add support for VLAN operations Vladimir Oltean
@ 2019-04-13 20:56   ` Jiri Pirko
  2019-04-13 21:39     ` Vladimir Oltean
  0 siblings, 1 reply; 68+ messages in thread
From: Jiri Pirko @ 2019-04-13 20:56 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: f.fainelli, vivien.didelot, andrew, davem, netdev, linux-kernel,
	georg.waibel

Sat, Apr 13, 2019 at 03:28:14AM CEST, olteanv@gmail.com wrote:
>VLAN filtering cannot be properly disabled in SJA1105. So in order to
>emulate the "no VLAN awareness" behavior (not dropping traffic that is
>tagged with a VID that isn't configured on the port), we need to hack
>another switch feature: programmable TPID (which is 0x8100 for 802.1Q).
>We are reprogramming the TPID to a bogus value (ETH_P_EDSA) which leaves
>the switch thinking that all traffic is untagged, and therefore accepts
>it.
>
>Under a vlan_filtering bridge, the proper TPID of ETH_P_8021Q is
>installed again, and the switch starts identifying 802.1Q-tagged
>traffic.
>
>Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
>Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
>---
>Changes from v3:
>Changed back to ETH_P_EDSA.
>
>Changes from v2:
>Changed the TPID from ETH_P_EDSA (0xDADA) to a newly introduced one:
>ETH_P_DSA_8021Q (0xDADB).
>
> drivers/net/dsa/sja1105/sja1105_main.c        | 254 +++++++++++++++++-
> .../net/dsa/sja1105/sja1105_static_config.c   |  38 +++
> .../net/dsa/sja1105/sja1105_static_config.h   |   3 +
> 3 files changed, 293 insertions(+), 2 deletions(-)
>

[...]


>+#define sja1105_vlan_filtering_enabled(priv) \
>+	(((struct sja1105_general_params_entry *) \
>+	((struct sja1105_private *)priv)->static_config. \
>+	tables[BLK_IDX_GENERAL_PARAMS].entries)->tpid == ETH_P_8021Q)

This is unreadable. Please turn it into a function.
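[A function form of that check might look like the sketch below. The struct layouts are reduced here to the one field the check needs; in the driver this would take a struct sja1105_private * and walk static_config.tables[BLK_IDX_GENERAL_PARAMS].entries, so the reduced types are assumptions for illustration.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ETH_P_8021Q	0x8100	/* standard 802.1Q TPID */
#define ETH_P_EDSA	0xDADA	/* bogus TPID used for "VLAN-unaware" mode */

/* Stand-in for sja1105_general_params_entry, reduced to the tpid field */
struct general_params_model {
	uint64_t tpid;
};

/* VLAN filtering is emulated purely through the programmed TPID: with
 * the real 802.1Q TPID the switch identifies tagged traffic, with the
 * bogus one it treats everything as untagged.
 */
static bool vlan_filtering_enabled(const struct general_params_model *gp)
{
	return gp->tpid == ETH_P_8021Q;
}
```

Beyond readability, the function form also avoids the double cast through `void *priv` that the macro needs.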



* Re: [PATCH v3 net-next 15/24] net: dsa: sja1105: Add support for FDB and MDB management
  2019-04-13  1:28 ` [PATCH v3 net-next 15/24] net: dsa: sja1105: Add support for FDB and MDB management Vladimir Oltean
@ 2019-04-13 20:58   ` Jiri Pirko
  0 siblings, 0 replies; 68+ messages in thread
From: Jiri Pirko @ 2019-04-13 20:58 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: f.fainelli, vivien.didelot, andrew, davem, netdev, linux-kernel,
	georg.waibel

Sat, Apr 13, 2019 at 03:28:13AM CEST, olteanv@gmail.com wrote:
>Currently only the (more difficult) first generation E/T series is
>supported. Here the TCAM is only 4-way associative, and to know where
>the hardware will search for a FDB entry, we need to perform the same
>hash algorithm in order to install the entry in the correct bin.
>
>On P/Q/R/S, the TCAM should be fully associative. However the SPI
>command interface is different, and because I don't have access to a
>new-generation device at the moment, support for it is TODO.
>
>Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
>Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
>---
>Changes from v3:
>None
>
>Changes from v2:
>None
>
> drivers/net/dsa/sja1105/sja1105.h             |   2 +
> .../net/dsa/sja1105/sja1105_dynamic_config.c  |  40 ++++
> drivers/net/dsa/sja1105/sja1105_main.c        | 193 ++++++++++++++++++
> 3 files changed, 235 insertions(+)
>
>diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
>index ef555dd385a3..4c9df44a4478 100644
>--- a/drivers/net/dsa/sja1105/sja1105.h
>+++ b/drivers/net/dsa/sja1105/sja1105.h
>@@ -129,6 +129,8 @@ int sja1105_dynamic_config_write(struct sja1105_private *priv,
> 				 enum sja1105_blk_idx blk_idx,
> 				 int index, void *entry, bool keep);
> 
>+u8 sja1105_fdb_hash(struct sja1105_private *priv, const u8 *addr, u16 vid);
>+
> /* Common implementations for the static and dynamic configs */
> size_t sja1105_l2_forwarding_entry_packing(void *buf, void *entry_ptr,
> 					   enum packing_op op);
>diff --git a/drivers/net/dsa/sja1105/sja1105_dynamic_config.c b/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
>index 74c3a00d453c..0aeda6868c27 100644
>--- a/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
>+++ b/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
>@@ -462,3 +462,43 @@ int sja1105_dynamic_config_write(struct sja1105_private *priv,
> 
> 	return 0;
> }
>+
>+static u8 crc8_add(u8 crc, u8 byte, u8 poly)

Prefix please.


>+{
>+	int i;
>+
>+	for (i = 0; i < 8; i++) {
>+		if ((crc ^ byte) & (1 << 7)) {
>+			crc <<= 1;
>+			crc ^= poly;
>+		} else {
>+			crc <<= 1;
>+		}
>+		byte <<= 1;
>+	}
>+	return crc;
>+}

[...]


> 
>+#define fdb(bin, index) \
>+	((bin) * SJA1105ET_FDB_BIN_SIZE + (index))
>+#define is_bin_index_valid(i) \
>+	((i) >= 0 && (i) < SJA1105ET_FDB_BIN_SIZE)

Please use prefixes and sane names. Also, consider using functions
instead of macros.


[...]


>+
>+#undef fdb
>+#undef is_bin_index_valid

Odd.
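[The quoted crc8_add() is a plain MSB-first CRC-8 step, retyped below with stdint types so it can be checked standalone. Assuming the conventional polynomial 0x07 and a zero initial value (whether the driver feeds those exact parameters, and how it packs the VID/MAC bits, is defined elsewhere), the step can be validated against the well-known CRC-8 check value for "123456789":]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One CRC-8 step over a single byte, MSB first, as in the quoted patch */
static uint8_t crc8_add(uint8_t crc, uint8_t byte, uint8_t poly)
{
	int i;

	for (i = 0; i < 8; i++) {
		if ((crc ^ byte) & (1 << 7)) {
			crc <<= 1;
			crc ^= poly;
		} else {
			crc <<= 1;
		}
		byte <<= 1;
	}
	return crc;
}

/* Fold a whole buffer; init 0x00, no reflection, no final XOR */
static uint8_t crc8(const uint8_t *buf, size_t len, uint8_t poly)
{
	uint8_t crc = 0;
	size_t i;

	for (i = 0; i < len; i++)
		crc = crc8_add(crc, buf[i], poly);
	return crc;
}
```

The resulting 8-bit hash is what selects one of the 4-entry bins in the E/T generation's 4-way associative FDB TCAM.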




* Re: [PATCH v3 net-next 18/24] net: dsa: sja1105: Add support for traffic through standalone ports
  2019-04-13 16:37   ` Andrew Lunn
@ 2019-04-13 21:27     ` Vladimir Oltean
  2019-04-13 22:08       ` Vladimir Oltean
                         ` (2 more replies)
  0 siblings, 3 replies; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13 21:27 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Florian Fainelli, vivien.didelot, davem, netdev, linux-kernel,
	Georg Waibel

On Sat, 13 Apr 2019 at 19:38, Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Sat, Apr 13, 2019 at 04:28:16AM +0300, Vladimir Oltean wrote:
> > In order to support this, we are creating a make-shift switch tag out of
> > a VLAN trunk configured on the CPU port. Termination of normal traffic
> > on switch ports only works when not under a vlan_filtering bridge.
> > Termination of management (PTP, BPDU) traffic works under all
> > circumstances because it uses a different tagging mechanism
> > (incl_srcpt). We are making use of the generic CONFIG_NET_DSA_TAG_8021Q
> > code and leveraging it from our own CONFIG_NET_DSA_TAG_SJA1105.
> >
> > There are two types of traffic: regular and link-local.
> > The link-local traffic received on the CPU port is trapped from the
> > switch's regular forwarding decisions because it matched one of the two
> > DMAC filters for management traffic.
> > On transmission, the switch requires special massaging for these
> > link-local frames. Due to a weird implementation of the switching IP, by
> > default it drops link-local frames that originate on the CPU port. It
> > needs to be told where to forward them to, through an SPI command
> > ("management route") that is valid for only a single frame.
> > So when we're sending link-local traffic, we need to clone skb's from
> > DSA and send them in our custom xmit worker that also performs SPI access.
> >
> > For that purpose, the DSA xmit handler and the xmit worker communicate
> > through a per-port "skb ring" software structure, with a producer and a
> > consumer index. At the moment this structure is rather fragile
> > (ping-flooding to a link-local DMAC would cause most of the frames to
> > get dropped). I would like to move the management traffic on a separate
> > netdev queue that I can stop when the skb ring got full and hardware is
> > busy processing, so that we are not forced to drop traffic.
> >
> > Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> > Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
> > ---
> > Changes in v3:
> > Made management traffic be receivable on the DSA netdevices even when
> > switch tagging is disabled, as well as regular traffic be receivable on
> > the master netdevice in the same scenario. Both are accomplished using
> > the sja1105_filter() function and some small touch-ups in the .rcv
> > callback.
>
> It seems like you made major changes to this. When you do that, you
> should drop any reviewed-by tags you have. They are no longer valid
> because of the major changes.
>

Ok, noted.

> >  /* This callback needs to be present */
> > @@ -1141,7 +1158,11 @@ static int sja1105_vlan_filtering(struct dsa_switch *ds, int port, bool enabled)
> >       if (rc)
> >               dev_err(ds->dev, "Failed to change VLAN Ethertype\n");
> >
> > -     return rc;
> > +     /* Switch port identification based on 802.1Q is only passable
>
> possible, not passable.
>

Passable (satisfactory, decent, acceptable) is what I wanted to say.
Tagging using VLANs is possible even when the bridge wants to use
them, but it's smarter not to go there. But I get your point, maybe
I'll rephrase.

> > +      * if we are not under a vlan_filtering bridge. So make sure
> > +      * the two configurations are mutually exclusive.
> > +      */
> > +     return sja1105_setup_8021q_tagging(ds, !enabled);
> >  }
> >
> >  static void sja1105_vlan_add(struct dsa_switch *ds, int port,
> > @@ -1233,9 +1254,107 @@ static int sja1105_setup(struct dsa_switch *ds)
> >        */
> >       ds->vlan_filtering_is_global = true;
> >
> > +     /* The DSA/switchdev model brings up switch ports in standalone mode by
> > +      * default, and that means vlan_filtering is 0 since they're not under
> > +      * a bridge, so it's safe to set up switch tagging at this time.
> > +      */
> > +     return sja1105_setup_8021q_tagging(ds, true);
> > +}
> > +
> > +#include "../../../net/dsa/dsa_priv.h"
>
> No. Don't use relative includes like this.
>
> What do you need from the header? Maybe move it into
> include/linux/net/dsa.h
>

dsa_slave_to_master()

> > +/* Deferred work is unfortunately necessary because setting up the management
> > + * route cannot be done from atomic context (SPI transfer takes a sleepable
> > + * lock on the bus)
> > + */
> > +static void sja1105_xmit_work_handler(struct work_struct *work)
> > +{
> > +     struct sja1105_port *sp = container_of(work, struct sja1105_port,
> > +                                             xmit_work);
> > +     struct sja1105_private *priv = sp->dp->ds->priv;
> > +     struct net_device *slave = sp->dp->slave;
> > +     struct net_device *master = dsa_slave_to_master(slave);
> > +     int port = (uintptr_t)(sp - priv->ports);
> > +     struct sk_buff *skb;
> > +     int i, rc;
> > +
> > +     while ((i = sja1105_skb_ring_get(&sp->xmit_ring, &skb)) >= 0) {
> > +             struct sja1105_mgmt_entry mgmt_route = { 0 };
> > +             struct ethhdr *hdr;
> > +             int timeout = 10;
> > +             int skb_len;
> > +
> > +             skb_len = skb->len;
> > +             hdr = eth_hdr(skb);
> > +
> > +             mgmt_route.macaddr = ether_addr_to_u64(hdr->h_dest);
> > +             mgmt_route.destports = BIT(port);
> > +             mgmt_route.enfport = 1;
> > +             mgmt_route.tsreg = 0;
> > +             mgmt_route.takets = false;
> > +
> > +             rc = sja1105_dynamic_config_write(priv, BLK_IDX_MGMT_ROUTE,
> > +                                               port, &mgmt_route, true);
> > +             if (rc < 0) {
> > +                     kfree_skb(skb);
> > +                     slave->stats.tx_dropped++;
> > +                     continue;
> > +             }
> > +
> > +             /* Transfer skb to the host port. */
> > +             skb->dev = master;
> > +             dev_queue_xmit(skb);
> > +
> > +             /* Wait until the switch has processed the frame */
> > +             do {
> > +                     rc = sja1105_dynamic_config_read(priv, BLK_IDX_MGMT_ROUTE,
> > +                                                      port, &mgmt_route);
> > +                     if (rc < 0) {
> > +                             slave->stats.tx_errors++;
> > +                             dev_err(priv->ds->dev,
> > +                                     "xmit: failed to poll for mgmt route\n");
> > +                             continue;
> > +                     }
> > +
> > +                     /* UM10944: The ENFPORT flag of the respective entry is
> > +                      * cleared when a match is found. The host can use this
> > +                      * flag as an acknowledgment.
> > +                      */
> > +                     cpu_relax();
> > +             } while (mgmt_route.enfport && --timeout);
> > +
> > +             if (!timeout) {
> > +                     dev_err(priv->ds->dev, "xmit timed out\n");
> > +                     slave->stats.tx_errors++;
> > +                     continue;
> > +             }
> > +
> > +             slave->stats.tx_packets++;
> > +             slave->stats.tx_bytes += skb_len;
> > +     }
> > +}
> > +
> > +static int sja1105_port_enable(struct dsa_switch *ds, int port,
> > +                            struct phy_device *phydev)
> > +{
> > +     struct sja1105_private *priv = ds->priv;
> > +     struct sja1105_port *sp = &priv->ports[port];
> > +
> > +     sp->dp = &ds->ports[port];
> > +     INIT_WORK(&sp->xmit_work, sja1105_xmit_work_handler);
> >       return 0;
> >  }
>
> I think i'm missing something here. You have a per port queue of link
> local frames which need special handling. And you have a per-port work
> queue. To send such a frame, you need to write some register, send the
> frame, and then wait until the mgmt_route.enfport is reset.
>
> Why are you doing this per port? How do you stop two ports/work queues
> running at the same time? It seems like one queue, with one work queue
> would be a better structure.
>

See the "port" parameter to this call here:

        rc = sja1105_dynamic_config_write(priv, BLK_IDX_MGMT_ROUTE,
                          *port*, &mgmt_route, true);

The switch IP aptly allocates 4 slots for management routes. And it's
a 5-port switch where 1 port is the management port. I think the
structure is fine.

> Also, please move all this code into the tagger. Just add exports for
> sja1105_dynamic_config_write() and sja1105_dynamic_config_read().
>

Well, you see, the tagger code is part of the dsa_core object. If I
export function symbols from the driver, those still won't be there if
I compile the driver as a module. On the other hand, the way I'm doing
it, I think the schedule_work() gives me a pretty good separation.
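For context, the per-port "skb ring" discussed here is just a bounded FIFO of skb pointers with producer and consumer indices. A single-threaded sketch (structure and names only approximate the driver's; the real thing additionally needs producer/consumer synchronization between the xmit hook and the worker, which is exactly the fragility acknowledged in the commit message):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SJA1105_SKB_RING_SIZE 20	/* capacity assumed for illustration */

struct sk_buff;	/* opaque here; the real struct lives in the kernel */

struct sja1105_skb_ring {
	struct sk_buff *skb[SJA1105_SKB_RING_SIZE];
	int count;
	int pi;	/* producer index, advanced by the xmit hook */
	int ci;	/* consumer index, advanced by the worker */
};

/* Returns the slot written, or -1 if the ring is full */
static int sja1105_skb_ring_add(struct sja1105_skb_ring *ring,
				struct sk_buff *skb)
{
	int index;

	if (ring->count == SJA1105_SKB_RING_SIZE)
		return -1;

	index = ring->pi;
	ring->skb[index] = skb;
	ring->pi = (index + 1) % SJA1105_SKB_RING_SIZE;
	ring->count++;
	return index;
}

/* Returns the slot consumed, or -1 if the ring is empty */
static int sja1105_skb_ring_get(struct sja1105_skb_ring *ring,
				struct sk_buff **skb)
{
	int index;

	if (ring->count == 0)
		return -1;

	index = ring->ci;
	*skb = ring->skb[index];
	ring->ci = (index + 1) % SJA1105_SKB_RING_SIZE;
	ring->count--;
	return index;
}

/* Minimal in-order add/get exercise of the ring */
static bool sja1105_skb_ring_selftest(void)
{
	struct sja1105_skb_ring ring = { {0}, 0, 0, 0 };
	struct sk_buff *fake = (struct sk_buff *)&ring;	/* dummy pointer */
	struct sk_buff *out = NULL;

	if (sja1105_skb_ring_add(&ring, fake) != 0 || ring.count != 1)
		return false;
	if (sja1105_skb_ring_get(&ring, &out) != 0 || out != fake)
		return false;
	return sja1105_skb_ring_get(&ring, &out) == -1;
}
```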

> > +static void sja1105_port_disable(struct dsa_switch *ds, int port)
> > +{
> > +     struct sja1105_private *priv = ds->priv;
> > +     struct sja1105_port *sp = &priv->ports[port];
> > +     struct sk_buff *skb;
> > +
> > +     cancel_work_sync(&sp->xmit_work);
> > +     while (sja1105_skb_ring_get(&sp->xmit_ring, &skb) >= 0)
> > +             kfree_skb(skb);
> > +}
> > +
> > diff --git a/net/dsa/tag_sja1105.c b/net/dsa/tag_sja1105.c
> > new file mode 100644
> > index 000000000000..5c76a06c9093
> > --- /dev/null
> > +++ b/net/dsa/tag_sja1105.c
> > @@ -0,0 +1,148 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/* Copyright (c) 2019, Vladimir Oltean <olteanv@gmail.com>
> > + */
> > +#include <linux/etherdevice.h>
> > +#include <linux/if_vlan.h>
> > +#include <linux/dsa/sja1105.h>
> > +#include "../../drivers/net/dsa/sja1105/sja1105.h"
>
> Again, no, don't do this.
>

This separation between driver and tagger is fairly arbitrary.
I need access to the driver's private structure, in order to get a
hold of the private shadow of the dsa_port. Moving the driver private
structure to include/linux/dsa/ would pull in quite a number of
dependencies. Maybe I could provide declarations for most of them,
but anyway the private structure wouldn't be so private any longer,
would it?
Put differently, would you prefer a dp->priv similar to the already
existing ds->priv? struct sja1105_port is much more lightweight to
keep in include/linux/dsa/.

> > +
> > +#include "dsa_priv.h"
> > +
> > +/* Similar to is_link_local_ether_addr(hdr->h_dest) but also covers PTP */
> > +static inline bool sja1105_is_link_local(const struct sk_buff *skb)
> > +{
> > +     const struct ethhdr *hdr = eth_hdr(skb);
> > +     u64 dmac = ether_addr_to_u64(hdr->h_dest);
> > +
> > +     if ((dmac & SJA1105_LINKLOCAL_FILTER_A_MASK) ==
> > +                 SJA1105_LINKLOCAL_FILTER_A)
> > +             return true;
> > +     if ((dmac & SJA1105_LINKLOCAL_FILTER_B_MASK) ==
> > +                 SJA1105_LINKLOCAL_FILTER_B)
> > +             return true;
> > +     return false;
> > +}
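As an aside, the two DMAC filters correspond to 24-bit OUI matches: 01:80:C2:xx:xx:xx for the bridge group addresses (STP et al.) and 01:1B:19:xx:xx:xx for IEEE 1588 (PTP). A standalone sketch — the mask/value constants below are assumptions derived from those OUIs, not copied from the driver header:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed constants: a 24-bit OUI match on the destination MAC */
#define SJA1105_LINKLOCAL_FILTER_A	0x0180C2000000ull /* bridge group */
#define SJA1105_LINKLOCAL_FILTER_A_MASK	0xFFFFFF000000ull
#define SJA1105_LINKLOCAL_FILTER_B	0x011B19000000ull /* IEEE 1588   */
#define SJA1105_LINKLOCAL_FILTER_B_MASK	0xFFFFFF000000ull

/* Big-endian fold of a 6-byte MAC address into a u64 */
static uint64_t ether_addr_to_u64(const uint8_t addr[6])
{
	uint64_t u = 0;
	int i;

	for (i = 0; i < 6; i++)
		u = (u << 8) | addr[i];
	return u;
}

static bool sja1105_dmac_is_link_local(const uint8_t dmac[6])
{
	uint64_t mac = ether_addr_to_u64(dmac);

	return (mac & SJA1105_LINKLOCAL_FILTER_A_MASK) ==
				SJA1105_LINKLOCAL_FILTER_A ||
	       (mac & SJA1105_LINKLOCAL_FILTER_B_MASK) ==
				SJA1105_LINKLOCAL_FILTER_B;
}
```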
> > +
> > +static bool sja1105_filter(const struct sk_buff *skb, struct net_device *dev)
> > +{
> > +     if (sja1105_is_link_local(skb))
> > +             return true;
> > +     if (!dev->dsa_ptr->vlan_filtering)
> > +             return true;
> > +     return false;
> > +}
>
> Please add a comment here about what frames cannot be handled by the
> tagger. However, i'm not too happy about this design...
>

Ok, let's put this another way.

A switch is primarily a device used to offload the forwarding of
traffic based on L2 rules. Additionally there may be some management
traffic for stuff like STP that needs to be terminated on the host
port of the switch. For that, the hardware's job is to filter and tag
management frames on their way to the host port, and the software's
job is to process the source port and switch id information in a
meaningful way.

Now both this particular switch hardware, and DSA, are taking the
above definitions to extremes.

The switch says: "that's all you want to see? ok, so that's all I'm
going to give you". So its native (hardware) tagging protocol is to
trap link-local traffic and overwrite two bytes of its destination MAC
with the switch ID and the source port. No more, no less. It is an
incomplete solution, but it does the job for practical use cases.

Now DSA says: "I want these to be fully capable net devices, I want
the user to not even realize what's going on under the hood". I don't
think that terminating iperf traffic through switch ports is a
realistic usage scenario, so in a way, discussions about performance
and optimizations on the DSA hotpath are slightly pointless IMO.

What my driver offers is a bit of both. It speaks the hardware's
tagging protocol, so it is capable of management traffic, but it also
speaks the DSA paradigm, so in a way it pushes the hardware to work in
a mode it was never intended to, by repurposing VLANs when the user
doesn't request them. So on one hand there is some overlap between the
hardware tagging protocol and the VLAN one (in standalone mode and in
VLAN-unaware bridged mode, management traffic *could* use VLAN tagging
but it doesn't rely on it), and on the other hand the union of the two
tagging protocols is decent but still doesn't cover the entire
spectrum (when put under a VLAN-aware bridge, you lose the ability to
decode general traffic). So you'd better not rely on VLANs to decode
the management traffic, because you won't always be able to rely on
that, and that is a shame since a bridge with both vlan_filtering 1
and stp_state 1 is a real usage scenario, and the hardware is capable
of that combination.

But all of that is secondary. Let's forget about VLAN tagging for a
second and concentrate on the tagging of management traffic. The
limiting factor here is the software architecture of DSA, because in
order for me to decode that in the driver/tagger, I'd have to drop
everything else coming on the master net device (I explained why in
13/24). I believe that DSA, being all-or-nothing about switch tagging,
is turning a blind eye to the devices that don't go overboard with
features, and give you what's needed in a real-world design but not
much else.

What would you improve about this design (assuming you're talking
about the filter function)?

Thanks,
-Vladimir

> > +
> > +static struct sk_buff *sja1105_xmit(struct sk_buff *skb,
> > +                                 struct net_device *netdev)
> > +{
> > +     struct dsa_port *dp = dsa_slave_to_port(netdev);
> > +     struct dsa_switch *ds = dp->ds;
> > +     struct sja1105_private *priv = ds->priv;
> > +     struct sja1105_port *sp = &priv->ports[dp->index];
> > +     struct sk_buff *clone;
> > +
> > +     if (likely(!sja1105_is_link_local(skb))) {
> > +             /* Normal traffic path. */
> > +             u16 tx_vid = dsa_tagging_tx_vid(ds, dp->index);
> > +             u8 pcp = skb->priority;
> > +
> > +             /* If we are under a vlan_filtering bridge, IP termination on
> > +              * switch ports based on 802.1Q tags is simply too brittle to
> > +              * be passable. So just defer to the dsa_slave_notag_xmit
> > +              * implementation.
> > +              */
> > +             if (dp->vlan_filtering)
> > +                     return skb;
> > +
> > +             return dsa_8021q_xmit(skb, netdev, ETH_P_EDSA,
> > +                                  ((pcp << VLAN_PRIO_SHIFT) | tx_vid));
>
> Please don't reuse ETH_P_EDSA. Define an ETH_P_SJA1105.
>
> > +     }
> > +
> > +     /* Code path for transmitting management traffic. This does not rely
> > +      * upon switch tagging, but instead SPI-installed management routes.
> > +      */
> > +     clone = skb_clone(skb, GFP_ATOMIC);
> > +     if (!clone) {
> > +             dev_err(ds->dev, "xmit: failed to clone skb\n");
> > +             return NULL;
> > +     }
> > +
> > +     if (sja1105_skb_ring_add(&sp->xmit_ring, clone) < 0) {
> > +             dev_err(ds->dev, "xmit: skb ring full\n");
> > +             kfree_skb(clone);
> > +             return NULL;
> > +     }
> > +
> > +     if (sp->xmit_ring.count == SJA1105_SKB_RING_SIZE)
> > +             /* TODO setup a dedicated netdev queue for management traffic
> > +              * so that we can selectively apply backpressure and not be
> > +              * required to stop the entire traffic when the software skb
> > +              * ring is full. This requires hooking the ndo_select_queue
> > +              * from DSA and matching on mac_fltres.
> > +              */
> > +             dev_err(ds->dev, "xmit: reached maximum skb ring size\n");
>
> This should be rate limited.
>
>      Andrew
>
> > +
> > +     schedule_work(&sp->xmit_work);
> > +     /* Let DSA free its reference to the skb and we will free
> > +      * the clone in the deferred worker
> > +      */
> > +     return NULL;
> > +}
> > +
> > +static struct sk_buff *sja1105_rcv(struct sk_buff *skb,
> > +                                struct net_device *netdev,
> > +                                struct packet_type *pt)
> > +{
> > +     unsigned int source_port, switch_id;
> > +     struct ethhdr *hdr = eth_hdr(skb);
> > +     struct sk_buff *nskb;
> > +     u16 tpid, vid, tci;
> > +     bool is_tagged;
> > +
> > +     nskb = dsa_8021q_rcv(skb, netdev, pt, &tpid, &tci);
> > +     is_tagged = (nskb && tpid == ETH_P_EDSA);
> > +
> > +     skb->priority = (tci & VLAN_PRIO_MASK) >> VLAN_PRIO_SHIFT;
> > +     vid = tci & VLAN_VID_MASK;
> > +
> > +     skb->offload_fwd_mark = 1;
> > +
> > +     if (likely(!sja1105_is_link_local(skb))) {
> > +             /* Normal traffic path. */
> > +             source_port = dsa_tagging_rx_source_port(vid);
> > +             switch_id = dsa_tagging_rx_switch_id(vid);
> > +     } else {
> > +             /* Management traffic path. Switch embeds the switch ID and
> > +              * port ID into bytes of the destination MAC, courtesy of
> > +              * the incl_srcpt options.
> > +              */
> > +             source_port = hdr->h_dest[3];
> > +             switch_id = hdr->h_dest[4];
> > +             /* Clear the DMAC bytes that were mangled by the switch */
> > +             hdr->h_dest[3] = 0;
> > +             hdr->h_dest[4] = 0;
> > +     }
> > +
> > +     skb->dev = dsa_master_find_slave(netdev, switch_id, source_port);
> > +     if (!skb->dev) {
> > +             netdev_warn(netdev, "Couldn't decode source port\n");
> > +             return NULL;
> > +     }
> > +
> > +     /* Delete/overwrite fake VLAN header, DSA expects to not find
> > +      * it there, see dsa_switch_rcv: skb_push(skb, ETH_HLEN).
> > +      */
> > +     if (is_tagged)
> > +             memmove(skb->data - ETH_HLEN, skb->data - ETH_HLEN - VLAN_HLEN,
> > +                     ETH_HLEN - VLAN_HLEN);
> > +
> > +     return skb;
> > +}
> > +
> > +const struct dsa_device_ops sja1105_netdev_ops = {
> > +     .xmit = sja1105_xmit,
> > +     .rcv = sja1105_rcv,
> > +     .filter = sja1105_filter,
> > +     .overhead = VLAN_HLEN,
> > +};
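For readers following along: dsa_tagging_tx_vid() and the rx helpers used in this tagger pack the switch ID and source port into bit fields of the 802.1Q VID. The exact field layout belongs to the generic tag_8021q code and is not shown in this hunk, so the base value and bit positions below are assumptions purely for illustration of the scheme:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: a reserved base range, then switch ID and port as
 * small bit fields, all fitting in the 12-bit VID space (1..4094).
 */
#define DSA_8021Q_VID_BASE	3072	/* 0xC00, bits 10-11 */
#define DSA_8021Q_SWITCH_SHIFT	4	/* switch ID in bits 4-6 */
#define DSA_8021Q_SWITCH_MASK	0x7
#define DSA_8021Q_PORT_MASK	0xf	/* port in bits 0-3 */

static uint16_t example_tx_vid(int switch_id, int port)
{
	return DSA_8021Q_VID_BASE |
	       ((switch_id & DSA_8021Q_SWITCH_MASK) << DSA_8021Q_SWITCH_SHIFT) |
	       (port & DSA_8021Q_PORT_MASK);
}

static int example_rx_source_port(uint16_t vid)
{
	return vid & DSA_8021Q_PORT_MASK;
}

static int example_rx_switch_id(uint16_t vid)
{
	return (vid >> DSA_8021Q_SWITCH_SHIFT) & DSA_8021Q_SWITCH_MASK;
}
```

On xmit the VID is inserted into the 802.1Q tag; on rcv the same bit fields are extracted to look up the slave netdev via dsa_master_find_slave().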
> > +
> > --
> > 2.17.1
> >

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v3 net-next 14/24] net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch
  2019-04-13 16:44       ` Andrew Lunn
@ 2019-04-13 21:29         ` Vladimir Oltean
  0 siblings, 0 replies; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13 21:29 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Florian Fainelli, vivien.didelot, davem, netdev, linux-kernel,
	Georg Waibel

On Sat, 13 Apr 2019 at 19:45, Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Sat, Apr 13, 2019 at 06:46:57PM +0300, Vladimir Oltean wrote:
> > Hi Andrew,
> >
> > All I'm saying is that at this point in time (patch 14/24), the driver
> > is being introduced with DSA_TAG_PROTO_NONE. Then support for traffic
> > and switch tagging is added later on (18/24). I was just explaining
> > what can be done with the driver up to this point in the patchset.
>
> O.K. So at this point in time, the driver is unusable. Too much other
> stuff is missing. So i don't see any point in mentioning this.
>
>       Andrew

At this point the driver sees an unmanaged switch. Whether you
consider that unusable or not is debatable.

Thanks,
-Vladimir

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v3 net-next 20/24] net: dsa: sja1105: Error out if RGMII delays are requested in DT
  2019-04-13 20:47   ` Jiri Pirko
@ 2019-04-13 21:31     ` Vladimir Oltean
  2019-04-14  8:35       ` Jiri Pirko
  0 siblings, 1 reply; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13 21:31 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Florian Fainelli, vivien.didelot, Andrew Lunn, davem, netdev,
	linux-kernel, Georg Waibel

On Sat, 13 Apr 2019 at 23:47, Jiri Pirko <jiri@resnulli.us> wrote:
>
> Sat, Apr 13, 2019 at 03:28:18AM CEST, olteanv@gmail.com wrote:
> >Documentation/devicetree/bindings/net/ethernet.txt is confusing because
> >it says what the MAC should not do, but not what it *should* do:
> >
> >  * "rgmii-rxid" (RGMII with internal RX delay provided by the PHY, the MAC
> >     should not add an RX delay in this case)
> >
> >The gap in semantics is threefold:
> >1. Is it illegal for the MAC to apply the Rx internal delay by itself,
> >   and simplify the phy_mode (mask off "rgmii-rxid" into "rgmii") before
> >   passing it to of_phy_connect? The documentation would suggest yes.
> >1. For "rgmii-rxid", while the situation with the Rx clock skew is more
> >   or less clear (needs to be added by the PHY), what should the MAC
> >   driver do about the Tx delays? Is it an implicit wild card for the
> >   MAC to apply delays in the Tx direction if it can? What if those were
> >   already added as serpentine PCB traces, how could that be made more
> >   obvious through DT bindings so that the MAC doesn't attempt to add
> >   them twice and again potentially break the link?
> >3. If the interface is a fixed-link and therefore the PHY object is
> >   fixed (a purely software entity that obviously cannot add clock
> >   skew), what is the meaning of the above property?
> >
> >So an interpretation of the RGMII bindings was chosen that hopefully
> >does not contradict their intention but also makes them more applied.
> >The SJA1105 driver understands to act upon "rgmii-*id" phy-mode bindings
> >if the port is in the PHY role (either explicitly, or if it is a
> >fixed-link). Otherwise it always passes the duty of setting up delays to
> >the PHY driver.
> >
> >The error behavior that this patch adds is required on SJA1105E/T where
> >the MAC really cannot apply internal delays. If the other end of the
> >fixed-link cannot apply RGMII delays either (this would be specified
> >through its own DT bindings), then the situation requires PCB delays.
> >
> >For SJA1105P/Q/R/S, this is however hardware supported and the error is
> >thus only temporary. I created a stub function pointer for configuring
> >delays per-port on RXC and TXC, and will implement it when I have access
> >to a board with this hardware setup.
> >
> >Meanwhile do not allow the user to select an invalid configuration.
> >
> >Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> >Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
> >---
> >Changes in v3:
> >None.
> >
> >Changes in v2:
> >Patch is new.
> >
> > drivers/net/dsa/sja1105/sja1105.h          |  3 ++
> > drivers/net/dsa/sja1105/sja1105_clocking.c |  7 ++++-
> > drivers/net/dsa/sja1105/sja1105_main.c     | 32 +++++++++++++++++++++-
> > drivers/net/dsa/sja1105/sja1105_spi.c      |  6 ++++
> > 4 files changed, 46 insertions(+), 2 deletions(-)
> >
> >diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
> >index b7e745c0bb3a..3c16b991032c 100644
> >--- a/drivers/net/dsa/sja1105/sja1105.h
> >+++ b/drivers/net/dsa/sja1105/sja1105.h
> >@@ -22,6 +22,8 @@
> >
> > struct sja1105_port {
> >       struct dsa_port *dp;
> >+      bool rgmii_rx_delay;
> >+      bool rgmii_tx_delay;
> >       struct work_struct xmit_work;
> >       struct sja1105_skb_ring xmit_ring;
> > };
> >@@ -61,6 +63,7 @@ struct sja1105_info {
> >       const struct sja1105_table_ops *static_ops;
> >       const struct sja1105_regs *regs;
> >       int (*reset_cmd)(const void *ctx, const void *data);
> >+      int (*setup_rgmii_delay)(const void *ctx, int port, bool rx, bool tx);
> >       const char *name;
> > };
> >
> >diff --git a/drivers/net/dsa/sja1105/sja1105_clocking.c b/drivers/net/dsa/sja1105/sja1105_clocking.c
> >index d40da3d52464..c02fec181676 100644
> >--- a/drivers/net/dsa/sja1105/sja1105_clocking.c
> >+++ b/drivers/net/dsa/sja1105/sja1105_clocking.c
> >@@ -432,7 +432,12 @@ static int rgmii_clocking_setup(struct sja1105_private *priv, int port)
> >               dev_err(dev, "Failed to configure Tx pad registers\n");
> >               return rc;
> >       }
> >-      return 0;
> >+      if (!priv->info->setup_rgmii_delay)
> >+              return 0;
> >+
> >+      return priv->info->setup_rgmii_delay(priv, port,
> >+                                           priv->ports[port].rgmii_rx_delay,
> >+                                           priv->ports[port].rgmii_tx_delay);
> > }
> >
> > static int sja1105_cgu_rmii_ref_clk_config(struct sja1105_private *priv,
> >diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
> >index e4abf8fb2013..5f7ddb1da006 100644
> >--- a/drivers/net/dsa/sja1105/sja1105_main.c
> >+++ b/drivers/net/dsa/sja1105/sja1105_main.c
> >@@ -555,6 +555,21 @@ static int sja1105_static_config_load(struct sja1105_private *priv,
> >       return sja1105_static_config_upload(priv);
> > }
> >
> >+static void sja1105_parse_rgmii_delay(const struct sja1105_dt_port *in,
> >+                                    struct sja1105_port *out)
> >+{
> >+      if (in->role == XMII_MAC)
> >+              return;
> >+
> >+      if (in->phy_mode == PHY_INTERFACE_MODE_RGMII_RXID ||
> >+          in->phy_mode == PHY_INTERFACE_MODE_RGMII_ID)
> >+              out->rgmii_rx_delay = true;
> >+
> >+      if (in->phy_mode == PHY_INTERFACE_MODE_RGMII_TXID ||
> >+          in->phy_mode == PHY_INTERFACE_MODE_RGMII_ID)
> >+              out->rgmii_tx_delay = true;
> >+}
> >+
> > static int sja1105_parse_ports_node(struct sja1105_private *priv,
> >                                   struct sja1105_dt_port *ports,
> >                                   struct device_node *ports_node)
> >@@ -1315,13 +1330,28 @@ static int sja1105_setup(struct dsa_switch *ds)
> > {
> >       struct sja1105_dt_port ports[SJA1105_NUM_PORTS];
> >       struct sja1105_private *priv = ds->priv;
> >-      int rc;
> >+      int rc, i;
> >
> >       rc = sja1105_parse_dt(priv, ports);
> >       if (rc < 0) {
> >               dev_err(ds->dev, "Failed to parse DT: %d\n", rc);
> >               return rc;
> >       }
> >+
> >+      /* Error out early if internal delays are required through DT
> >+       * and we can't apply them.
> >+       */
> >+      for (i = 0; i < SJA1105_NUM_PORTS; i++) {
> >+              sja1105_parse_rgmii_delay(&ports[i], &priv->ports[i]);
> >+
> >+              if ((priv->ports[i].rgmii_rx_delay ||
> >+                   priv->ports[i].rgmii_tx_delay) &&
> >+                   !priv->info->setup_rgmii_delay) {
> >+                      dev_err(ds->dev, "RGMII delay not supported\n");
> >+                      return -EINVAL;
> >+              }
> >+      }
> >+
> >       /* Create and send configuration down to device */
> >       rc = sja1105_static_config_load(priv, ports);
> >       if (rc < 0) {
> >diff --git a/drivers/net/dsa/sja1105/sja1105_spi.c b/drivers/net/dsa/sja1105/sja1105_spi.c
> >index 09cb28e9be20..e4ef4d8048b2 100644
> >--- a/drivers/net/dsa/sja1105/sja1105_spi.c
> >+++ b/drivers/net/dsa/sja1105/sja1105_spi.c
> >@@ -499,6 +499,7 @@ struct sja1105_info sja1105e_info = {
> >       .part_no                = SJA1105ET_PART_NO,
> >       .static_ops             = sja1105e_table_ops,
> >       .dyn_ops                = sja1105et_dyn_ops,
> >+      .setup_rgmii_delay      = NULL,
> >       .reset_cmd              = sja1105et_reset_cmd,
> >       .regs                   = &sja1105et_regs,
> >       .name                   = "SJA1105E",
> >@@ -508,6 +509,7 @@ struct sja1105_info sja1105t_info = {
> >       .part_no                = SJA1105ET_PART_NO,
> >       .static_ops             = sja1105t_table_ops,
> >       .dyn_ops                = sja1105et_dyn_ops,
> >+      .setup_rgmii_delay      = NULL,
> >       .reset_cmd              = sja1105et_reset_cmd,
> >       .regs                   = &sja1105et_regs,
> >       .name                   = "SJA1105T",
> >@@ -517,6 +519,7 @@ struct sja1105_info sja1105p_info = {
> >       .part_no                = SJA1105P_PART_NO,
> >       .static_ops             = sja1105p_table_ops,
> >       .dyn_ops                = sja1105pqrs_dyn_ops,
> >+      .setup_rgmii_delay      = NULL,
> >       .reset_cmd              = sja1105pqrs_reset_cmd,
> >       .regs                   = &sja1105pqrs_regs,
> >       .name                   = "SJA1105P",
> >@@ -526,6 +529,7 @@ struct sja1105_info sja1105q_info = {
> >       .part_no                = SJA1105Q_PART_NO,
> >       .static_ops             = sja1105q_table_ops,
> >       .dyn_ops                = sja1105pqrs_dyn_ops,
> >+      .setup_rgmii_delay      = NULL,
> >       .reset_cmd              = sja1105pqrs_reset_cmd,
> >       .regs                   = &sja1105pqrs_regs,
> >       .name                   = "SJA1105Q",
> >@@ -535,6 +539,7 @@ struct sja1105_info sja1105r_info = {
> >       .part_no                = SJA1105R_PART_NO,
> >       .static_ops             = sja1105r_table_ops,
> >       .dyn_ops                = sja1105pqrs_dyn_ops,
> >+      .setup_rgmii_delay      = NULL,
> >       .reset_cmd              = sja1105pqrs_reset_cmd,
> >       .regs                   = &sja1105pqrs_regs,
> >       .name                   = "SJA1105R",
> >@@ -545,6 +550,7 @@ struct sja1105_info sja1105s_info = {
> >       .static_ops             = sja1105s_table_ops,
> >       .dyn_ops                = sja1105pqrs_dyn_ops,
> >       .regs                   = &sja1105pqrs_regs,
> >+      .setup_rgmii_delay      = NULL,
>
> You don't need to set this to NULL. Please avoid that.
>

Hi Jiri, why not?

Thanks,
-Vladimir

>
> >       .reset_cmd              = sja1105pqrs_reset_cmd,
> >       .name                   = "SJA1105S",
> > };
> >--
> >2.17.1
> >

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v3 net-next 16/24] net: dsa: sja1105: Add support for VLAN operations
  2019-04-13 20:56   ` Jiri Pirko
@ 2019-04-13 21:39     ` Vladimir Oltean
  0 siblings, 0 replies; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13 21:39 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Florian Fainelli, vivien.didelot, Andrew Lunn, davem, netdev,
	linux-kernel, Georg Waibel

On Sat, 13 Apr 2019 at 23:56, Jiri Pirko <jiri@resnulli.us> wrote:
>
> Sat, Apr 13, 2019 at 03:28:14AM CEST, olteanv@gmail.com wrote:
> >VLAN filtering cannot be properly disabled in SJA1105. So in order to
> >emulate the "no VLAN awareness" behavior (not dropping traffic that is
> >tagged with a VID that isn't configured on the port), we need to hack
> >another switch feature: programmable TPID (which is 0x8100 for 802.1Q).
> >We are reprogramming the TPID to a bogus value (ETH_P_EDSA) which leaves
> >the switch thinking that all traffic is untagged, and therefore accepts
> >it.
> >
> >Under a vlan_filtering bridge, the proper TPID of ETH_P_8021Q is
> >installed again, and the switch starts identifying 802.1Q-tagged
> >traffic.
> >
> >Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> >Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
> >---
> >Changes from v3:
> >Changed back to ETH_P_EDSA.
> >
> >Changes from v2:
> >Changed the TPID from ETH_P_EDSA (0xDADA) to a newly introduced one:
> >ETH_P_DSA_8021Q (0xDADB).
> >
> > drivers/net/dsa/sja1105/sja1105_main.c        | 254 +++++++++++++++++-
> > .../net/dsa/sja1105/sja1105_static_config.c   |  38 +++
> > .../net/dsa/sja1105/sja1105_static_config.h   |   3 +
> > 3 files changed, 293 insertions(+), 2 deletions(-)
> >
>
> [...]
>
>
> >+#define sja1105_vlan_filtering_enabled(priv) \
> >+      (((struct sja1105_general_params_entry *) \
> >+      ((struct sja1105_private *)priv)->static_config. \
> >+      tables[BLK_IDX_GENERAL_PARAMS].entries)->tpid == ETH_P_8021Q)
>
> This is unreadable. Please have it as function.
>
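For what it's worth, the macro does read much better as a function. A sketch of the function form, with abbreviated stand-ins for the driver structures (the block index value and struct layouts here are placeholders, not the real definitions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ETH_P_8021Q		0x8100
#define BLK_IDX_GENERAL_PARAMS	0	/* placeholder index */
#define BLK_IDX_MAX		1

/* Abbreviated stand-ins for the structures referenced by the macro */
struct sja1105_general_params_entry {
	uint16_t tpid;
};

struct sja1105_table {
	void *entries;
};

struct sja1105_static_config {
	struct sja1105_table tables[BLK_IDX_MAX];
};

struct sja1105_private {
	struct sja1105_static_config static_config;
};

/* Function form of the macro: a named local instead of a cast chain */
static bool sja1105_vlan_filtering_enabled(const struct sja1105_private *priv)
{
	const struct sja1105_general_params_entry *general_params =
		priv->static_config.tables[BLK_IDX_GENERAL_PARAMS].entries;

	return general_params->tpid == ETH_P_8021Q;
}
```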

I admit to that. If I reach a consensus with Andrew on 12/24,
ideally I could just say cpu_dp->vlan_filtering.

Thanks,
-Vladimir

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v3 net-next 17/24] net: dsa: sja1105: Add support for ethtool port counters
  2019-04-13 20:53   ` Jiri Pirko
@ 2019-04-13 21:55     ` Vladimir Oltean
  2019-04-14  8:34       ` Jiri Pirko
  0 siblings, 1 reply; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13 21:55 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Florian Fainelli, vivien.didelot, Andrew Lunn, davem, netdev,
	linux-kernel, Georg Waibel

On Sat, 13 Apr 2019 at 23:53, Jiri Pirko <jiri@resnulli.us> wrote:
>
> Sat, Apr 13, 2019 at 03:28:15AM CEST, olteanv@gmail.com wrote:
> >Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> >Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
> >---
> >Changes in v3:
> >None.
> >
> >Changes in v2:
> >None functional. Moved the IS_ET() and IS_PQRS() device identification
> >macros here since they are not used in earlier patches.
> >
> > drivers/net/dsa/sja1105/Makefile              |   1 +
> > drivers/net/dsa/sja1105/sja1105.h             |   7 +-
> > drivers/net/dsa/sja1105/sja1105_ethtool.c     | 414 ++++++++++++++++++
> > drivers/net/dsa/sja1105/sja1105_main.c        |   3 +
> > .../net/dsa/sja1105/sja1105_static_config.h   |  21 +
> > 5 files changed, 445 insertions(+), 1 deletion(-)
> > create mode 100644 drivers/net/dsa/sja1105/sja1105_ethtool.c
> >
> >diff --git a/drivers/net/dsa/sja1105/Makefile b/drivers/net/dsa/sja1105/Makefile
> >index ed00840802f4..bb4404c79eb2 100644
> >--- a/drivers/net/dsa/sja1105/Makefile
> >+++ b/drivers/net/dsa/sja1105/Makefile
> >@@ -3,6 +3,7 @@ obj-$(CONFIG_NET_DSA_SJA1105) += sja1105.o
> > sja1105-objs := \
> >     sja1105_spi.o \
> >     sja1105_main.o \
> >+    sja1105_ethtool.o \
> >     sja1105_clocking.o \
> >     sja1105_static_config.o \
> >     sja1105_dynamic_config.o \
> >diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
> >index 4c9df44a4478..80b20bdd8f9c 100644
> >--- a/drivers/net/dsa/sja1105/sja1105.h
> >+++ b/drivers/net/dsa/sja1105/sja1105.h
> >@@ -120,8 +120,13 @@ typedef enum {
> > int sja1105_clocking_setup_port(struct sja1105_private *priv, int port);
> > int sja1105_clocking_setup(struct sja1105_private *priv);
> >
> >-/* From sja1105_dynamic_config.c */
> >+/* From sja1105_ethtool.c */
> >+void sja1105_get_ethtool_stats(struct dsa_switch *ds, int port, u64 *data);
> >+void sja1105_get_strings(struct dsa_switch *ds, int port,
> >+                       u32 stringset, u8 *data);
> >+int sja1105_get_sset_count(struct dsa_switch *ds, int port, int sset);
> >
> >+/* From sja1105_dynamic_config.c */
> > int sja1105_dynamic_config_read(struct sja1105_private *priv,
> >                               enum sja1105_blk_idx blk_idx,
> >                               int index, void *entry);
> >diff --git a/drivers/net/dsa/sja1105/sja1105_ethtool.c b/drivers/net/dsa/sja1105/sja1105_ethtool.c
> >new file mode 100644
> >index 000000000000..c082599702bd
> >--- /dev/null
> >+++ b/drivers/net/dsa/sja1105/sja1105_ethtool.c
> >@@ -0,0 +1,414 @@
> >+// SPDX-License-Identifier: GPL-2.0
> >+/* Copyright (c) 2018-2019, Vladimir Oltean <olteanv@gmail.com>
> >+ */
> >+#include "sja1105.h"
> >+
> >+#define SIZE_MAC_AREA         (0x02 * 4)
> >+#define SIZE_HL1_AREA         (0x10 * 4)
> >+#define SIZE_HL2_AREA         (0x4 * 4)
> >+#define SIZE_QLEVEL_AREA      (0x8 * 4) /* 0x4 to 0xB */
>
> Please use prefixes for defines like this. For example "SIZE_MAC_AREA"
> sounds way too generic.
>

What you propose sounds nice, but then consistency would be a concern,
so I'd have to redo the entire driver. And then there are tables named
like "L2 Address Lookup Parameters Table"; as if
SIZE_L2_LOOKUP_PARAMS_DYN_CMD_ET weren't long enough already, a prefix
would top it off.
I don't think it's as much of an issue for the reader (for the
compiler it clearly isn't, as it's restricted to this C file only) as
it is for tools like ctags?

> [...]
>
>
> >+static void
> >+sja1105_port_status_hl1_unpack(void *buf,
> >+                             struct sja1105_port_status_hl1 *status)
> >+{
> >+      /* Make pointer arithmetic work on 4 bytes */
> >+      u32 *p = (u32 *)buf;
>
> You don't need to cast void *. Please avoid it in the whole patchset.
>

Ok.

> [...]
>
>
> >+      if (!IS_PQRS(priv->info->device_id))
> >+              return;
> >+
> >+      memset(data + k, 0, ARRAY_SIZE(sja1105pqrs_extra_port_stats) *
> >+                      sizeof(u64));
> >+      for (i = 0; i < 8; i++) {
>
> Array size instead of "8"?
>

Perhaps SJA1105_NUM_TC, since the egress queue occupancy levels are
per traffic class. The size of the array is 16.

>
> >+              data[k++] = status.hl2.qlevel_hwm[i];
> >+              data[k++] = status.hl2.qlevel[i];
> >+      }
>
> [...]
>
>
> >
> >+#define IS_PQRS(device_id) \
> >+      (((device_id) == SJA1105PR_DEVICE_ID) || \
> >+       ((device_id) == SJA1105QS_DEVICE_ID))
> >+#define IS_ET(device_id) \
> >+      (((device_id) == SJA1105E_DEVICE_ID) || \
> >+       ((device_id) == SJA1105T_DEVICE_ID))
> >+/* P and R have same Device ID, and differ by Part Number */
> >+#define IS_P(device_id, part_nr) \
> >+      (((device_id) == SJA1105PR_DEVICE_ID) && \
> >+       ((part_nr) == SJA1105P_PART_NR))
> >+#define IS_R(device_id, part_nr) \
> >+      (((device_id) == SJA1105PR_DEVICE_ID) && \
> >+       ((part_nr) == SJA1105R_PART_NR))
> >+/* Same for Q and S */
> >+#define IS_Q(device_id, part_nr) \
> >+      (((device_id) == SJA1105QS_DEVICE_ID) && \
> >+       ((part_nr) == SJA1105Q_PART_NR))
> >+#define IS_S(device_id, part_nr) \
>
> Please have a prefix for macros like this. "IS_S" sounds way too
> generic...
>

Ok, I can think of a more descriptive name, since there are under 5
occurrences of these macros after the reorg, now that I have the
sja1105_info structure which also holds per-revision function
pointers.

>
> >+      (((device_id) == SJA1105QS_DEVICE_ID) && \
> >+       ((part_nr) == SJA1105S_PART_NR))
> >+
> > struct sja1105_general_params_entry {
> >       u64 vllupformat;
> >       u64 mirr_ptacu;
> >--
> >2.17.1
> >


Thanks!
-Vladimir


* Re: [PATCH v3 net-next 18/24] net: dsa: sja1105: Add support for traffic through standalone ports
  2019-04-13 21:27     ` Vladimir Oltean
@ 2019-04-13 22:08       ` Vladimir Oltean
  2019-04-13 22:26         ` Vladimir Oltean
  2019-04-14 16:05       ` Andrew Lunn
  2019-04-17  0:16       ` Florian Fainelli
  2 siblings, 1 reply; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13 22:08 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Florian Fainelli, vivien.didelot, davem, netdev, linux-kernel,
	Georg Waibel

On Sun, 14 Apr 2019 at 00:27, Vladimir Oltean <olteanv@gmail.com> wrote:
>
> On Sat, 13 Apr 2019 at 19:38, Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > On Sat, Apr 13, 2019 at 04:28:16AM +0300, Vladimir Oltean wrote:
> > > In order to support this, we are creating a makeshift switch tag out of
> > > a VLAN trunk configured on the CPU port. Termination of normal traffic
> > > on switch ports only works when not under a vlan_filtering bridge.
> > > Termination of management (PTP, BPDU) traffic works under all
> > > circumstances because it uses a different tagging mechanism
> > > (incl_srcpt). We are making use of the generic CONFIG_NET_DSA_TAG_8021Q
> > > code and leveraging it from our own CONFIG_NET_DSA_TAG_SJA1105.
> > >
> > > There are two types of traffic: regular and link-local.
> > > The link-local traffic received on the CPU port is trapped from the
> > > switch's regular forwarding decisions because it matched one of the two
> > > DMAC filters for management traffic.
> > > On transmission, the switch requires special massaging for these
> > > link-local frames. Due to a weird implementation of the switching IP, by
> > > default it drops link-local frames that originate on the CPU port. It
> > > needs to be told where to forward them to, through an SPI command
> > > ("management route") that is valid for only a single frame.
> > > So when we're sending link-local traffic, we need to clone skb's from
> > > DSA and send them in our custom xmit worker that also performs SPI access.
> > >
> > > For that purpose, the DSA xmit handler and the xmit worker communicate
> > > through a per-port "skb ring" software structure, with a producer and a
> > > consumer index. At the moment this structure is rather fragile
> > > (ping-flooding to a link-local DMAC would cause most of the frames to
> > > get dropped). I would like to move the management traffic on a separate
> > > netdev queue that I can stop when the skb ring got full and hardware is
> > > busy processing, so that we are not forced to drop traffic.
> > >
> > > Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> > > Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
> > > ---
> > > Changes in v3:
> > > Made management traffic be receivable on the DSA netdevices even when
> > > switch tagging is disabled, as well as regular traffic be receivable on
> > > the master netdevice in the same scenario. Both are accomplished using
> > > the sja1105_filter() function and some small touch-ups in the .rcv
> > > callback.
> >
> > It seems like you made major changes to this. When you do that, you
> > should drop any reviewed-by tags you have. They are no longer valid
> > because of the major changes.
> >
>
> Ok, noted.
>
> > >  /* This callback needs to be present */
> > > @@ -1141,7 +1158,11 @@ static int sja1105_vlan_filtering(struct dsa_switch *ds, int port, bool enabled)
> > >       if (rc)
> > >               dev_err(ds->dev, "Failed to change VLAN Ethertype\n");
> > >
> > > -     return rc;
> > > +     /* Switch port identification based on 802.1Q is only passable
> >
> > possible, not passable.
> >
>
> Passable (satisfactory, decent, acceptable) is what I wanted to say.
> Tagging using VLANs is possible even when the bridge wants to use
> them, but it's smarter not to go there. But I get your point, maybe
> I'll rephrase.
>
> > > +      * if we are not under a vlan_filtering bridge. So make sure
> > > +      * the two configurations are mutually exclusive.
> > > +      */
> > > +     return sja1105_setup_8021q_tagging(ds, !enabled);
> > >  }
> > >
> > >  static void sja1105_vlan_add(struct dsa_switch *ds, int port,
> > > @@ -1233,9 +1254,107 @@ static int sja1105_setup(struct dsa_switch *ds)
> > >        */
> > >       ds->vlan_filtering_is_global = true;
> > >
> > > +     /* The DSA/switchdev model brings up switch ports in standalone mode by
> > > +      * default, and that means vlan_filtering is 0 since they're not under
> > > +      * a bridge, so it's safe to set up switch tagging at this time.
> > > +      */
> > > +     return sja1105_setup_8021q_tagging(ds, true);
> > > +}
> > > +
> > > +#include "../../../net/dsa/dsa_priv.h"
> >
> > No. Don't use relative includes like this.
> >
> > What do you need from the header? Maybe move it into
> > include/linux/net/dsa.h
> >
>
> dsa_slave_to_master()
>
> > > +/* Deferred work is unfortunately necessary because setting up the management
> > > + * route cannot be done from atomic context (SPI transfer takes a sleepable
> > > + * lock on the bus)
> > > + */
> > > +static void sja1105_xmit_work_handler(struct work_struct *work)
> > > +{
> > > +     struct sja1105_port *sp = container_of(work, struct sja1105_port,
> > > +                                             xmit_work);
> > > +     struct sja1105_private *priv = sp->dp->ds->priv;
> > > +     struct net_device *slave = sp->dp->slave;
> > > +     struct net_device *master = dsa_slave_to_master(slave);
> > > +     int port = (uintptr_t)(sp - priv->ports);
> > > +     struct sk_buff *skb;
> > > +     int i, rc;
> > > +
> > > +     while ((i = sja1105_skb_ring_get(&sp->xmit_ring, &skb)) >= 0) {
> > > +             struct sja1105_mgmt_entry mgmt_route = { 0 };
> > > +             struct ethhdr *hdr;
> > > +             int timeout = 10;
> > > +             int skb_len;
> > > +
> > > +             skb_len = skb->len;
> > > +             hdr = eth_hdr(skb);
> > > +
> > > +             mgmt_route.macaddr = ether_addr_to_u64(hdr->h_dest);
> > > +             mgmt_route.destports = BIT(port);
> > > +             mgmt_route.enfport = 1;
> > > +             mgmt_route.tsreg = 0;
> > > +             mgmt_route.takets = false;
> > > +
> > > +             rc = sja1105_dynamic_config_write(priv, BLK_IDX_MGMT_ROUTE,
> > > +                                               port, &mgmt_route, true);
> > > +             if (rc < 0) {
> > > +                     kfree_skb(skb);
> > > +                     slave->stats.tx_dropped++;
> > > +                     continue;
> > > +             }
> > > +
> > > +             /* Transfer skb to the host port. */
> > > +             skb->dev = master;
> > > +             dev_queue_xmit(skb);
> > > +
> > > +             /* Wait until the switch has processed the frame */
> > > +             do {
> > > +                     rc = sja1105_dynamic_config_read(priv, BLK_IDX_MGMT_ROUTE,
> > > +                                                      port, &mgmt_route);
> > > +                     if (rc < 0) {
> > > +                             slave->stats.tx_errors++;
> > > +                             dev_err(priv->ds->dev,
> > > +                                     "xmit: failed to poll for mgmt route\n");
> > > +                             continue;
> > > +                     }
> > > +
> > > +                     /* UM10944: The ENFPORT flag of the respective entry is
> > > +                      * cleared when a match is found. The host can use this
> > > +                      * flag as an acknowledgment.
> > > +                      */
> > > +                     cpu_relax();
> > > +             } while (mgmt_route.enfport && --timeout);
> > > +
> > > +             if (!timeout) {
> > > +                     dev_err(priv->ds->dev, "xmit timed out\n");
> > > +                     slave->stats.tx_errors++;
> > > +                     continue;
> > > +             }
> > > +
> > > +             slave->stats.tx_packets++;
> > > +             slave->stats.tx_bytes += skb_len;
> > > +     }
> > > +}
> > > +
> > > +static int sja1105_port_enable(struct dsa_switch *ds, int port,
> > > +                            struct phy_device *phydev)
> > > +{
> > > +     struct sja1105_private *priv = ds->priv;
> > > +     struct sja1105_port *sp = &priv->ports[port];
> > > +
> > > +     sp->dp = &ds->ports[port];
> > > +     INIT_WORK(&sp->xmit_work, sja1105_xmit_work_handler);
> > >       return 0;
> > >  }
> >
> > I think i'm missing something here. You have a per port queue of link
> > local frames which need special handling. And you have a per-port work
> > queue. To send such a frame, you need to write some register, send the
> > frame, and then wait until the mgmt_route.enfport is reset.
> >
> > Why are you doing this per port? How do you stop two ports/work queues
> > running at the same time? It seems like one queue, with one work queue
> > would be a better structure.
> >
>
> See the "port" parameter to this call here:
>
>         rc = sja1105_dynamic_config_write(priv, BLK_IDX_MGMT_ROUTE,
>                           *port*, &mgmt_route, true);
>
> The switch IP aptly allocates 4 slots for management routes. And it's
> a 5-port switch where 1 port is the management port. I think the
> structure is fine.
>

"How do you stop two work queues": if you're talking about contention
on the hardware management route, I responded to that.
If you're talking about netif_stop_queue(), I think I'm going to avoid
that altogether by not having a finite sized ring structure. While
studying the Marvell 88e6060 driver I found out that sk_buff_head
exists. Now I'm part of the elite club of wheel reinventors with my
struct sja1105_skb_ring :)


> > Also, please move all this code into the tagger. Just add exports for
> > sja1105_dynamic_config_write() and sja1105_dynamic_config_read().
> >
>
> Well, you see, the tagger code is part of the dsa_core object. If I
> export function symbols from the driver, those still won't be there if
> I compile the driver as a module. On the other hand, the way I'm doing
> it, I think the schedule_work() gives me a pretty good separation.
>
> > > +static void sja1105_port_disable(struct dsa_switch *ds, int port)
> > > +{
> > > +     struct sja1105_private *priv = ds->priv;
> > > +     struct sja1105_port *sp = &priv->ports[port];
> > > +     struct sk_buff *skb;
> > > +
> > > +     cancel_work_sync(&sp->xmit_work);
> > > +     while (sja1105_skb_ring_get(&sp->xmit_ring, &skb) >= 0)
> > > +             kfree_skb(skb);
> > > +}
> > > +
> > > diff --git a/net/dsa/tag_sja1105.c b/net/dsa/tag_sja1105.c
> > > new file mode 100644
> > > index 000000000000..5c76a06c9093
> > > --- /dev/null
> > > +++ b/net/dsa/tag_sja1105.c
> > > @@ -0,0 +1,148 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/* Copyright (c) 2019, Vladimir Oltean <olteanv@gmail.com>
> > > + */
> > > +#include <linux/etherdevice.h>
> > > +#include <linux/if_vlan.h>
> > > +#include <linux/dsa/sja1105.h>
> > > +#include "../../drivers/net/dsa/sja1105/sja1105.h"
> >
> > Again, no, don't do this.
> >
>
> This separation between driver and tagger is fairly arbitrary.
> I need access to the driver's private structure, in order to get a
> hold of the private shadow of the dsa_port. Moving the driver private
> structure to include/linux/dsa/ would pull in quite a number of
> dependencies. Maybe I could provide declarations for the most of them,
> but anyway the private structure wouldn't be so private any longer,
> would it?
> Otherwise put, would you prefer a dp->priv similar to the already
> existing ds->priv? struct sja1105_port is much more lightweight to
> keep in include/linux/dsa/.
>
> > > +
> > > +#include "dsa_priv.h"
> > > +
> > > +/* Similar to is_link_local_ether_addr(hdr->h_dest) but also covers PTP */
> > > +static inline bool sja1105_is_link_local(const struct sk_buff *skb)
> > > +{
> > > +     const struct ethhdr *hdr = eth_hdr(skb);
> > > +     u64 dmac = ether_addr_to_u64(hdr->h_dest);
> > > +
> > > +     if ((dmac & SJA1105_LINKLOCAL_FILTER_A_MASK) ==
> > > +                 SJA1105_LINKLOCAL_FILTER_A)
> > > +             return true;
> > > +     if ((dmac & SJA1105_LINKLOCAL_FILTER_B_MASK) ==
> > > +                 SJA1105_LINKLOCAL_FILTER_B)
> > > +             return true;
> > > +     return false;
> > > +}
> > > +
> > > +static bool sja1105_filter(const struct sk_buff *skb, struct net_device *dev)
> > > +{
> > > +     if (sja1105_is_link_local(skb))
> > > +             return true;
> > > +     if (!dev->dsa_ptr->vlan_filtering)
> > > +             return true;
> > > +     return false;
> > > +}
> >
> > Please add a comment here about what frames cannot be handled by the
> > tagger. However, i'm not too happy about this design...
> >
>
> Ok, let's put this another way.
> A switch is primarily a device used to offload the forwarding of
> traffic based on L2 rules. Additionally there may be some management
> traffic for stuff like STP that needs to be terminated on the host
> port of the switch. For that, the hardware's job is to filter and tag
> management frames on their way to the host port, and the software's
> job is to process the source port and switch id information in a
> meaningful way.
> Now both this particular switch hardware, and DSA, are taking the
> above definitions to extremes.
> The switch says: "that's all you want to see? ok, so that's all I'm
> going to give you". So its native (hardware) tagging protocol is to
> trap link-local traffic and overwrite two bytes of its destination MAC
> with the switch ID and the source port. No more, no less. It is an
> incomplete solution, but it does the job for practical use cases.
> Now DSA says: "I want these to be fully capable net devices, I want
> the user to not even realize what's going on under the hood". I don't
> think that terminating iperf traffic through switch ports is a
> realistic usage scenario. So in a way discussions about performance
> and optimizations on DSA hotpath are slightly pointless IMO.
> Now what my driver says is that it offers a bit of both. It speaks the
> hardware's tagging protocol so it is capable of management traffic,
> but it also speaks the DSA paradigm, so in a way pushes the hardware
> to work in a mode it was never intended to, by repurposing VLANs when
> the user doesn't request them. So on one hand there is some overlap
> between the hardware tagging protocol and the VLAN one (in standalone
> mode and in VLAN-unaware bridged mode, management traffic *could* use
> VLAN tagging but it doesn't rely on it), and on the other hand the
> reunion of the two tagging protocols is decent, but still doesn't
> cover the entire spectrum (when put under a VLAN-aware bridge, you
> lose the ability to decode general traffic). So you'd better not rely
> on VLANs to decode the management traffic, because you won't be able
> to always rely on that, and that is a shame since a bridge with both
> vlan_filtering 1 and stp_state 1 is a real usage scenario, and the
> hardware is capable of that combination.
> But all of that is secondary. Let's forget about VLAN tagging for a
> second and concentrate on the tagging of management traffic. The
> limiting factor here is the software architecture of DSA, because in
> order for me to decode that in the driver/tagger, I'd have to drop
> everything else coming on the master net device (I explained in 13/24
> why). I believe that DSA being all-or-nothing about switch tagging is
> turning a blind eye to the devices that don't go overboard with
> features, and give you what's needed in a real-world design but not
> much else.
> What would you improve about this design (assuming you're talking
> about the filter function)?
>
> Thanks,
> -Vladimir
>
>
>
>
>
> > > +
> > > +static struct sk_buff *sja1105_xmit(struct sk_buff *skb,
> > > +                                 struct net_device *netdev)
> > > +{
> > > +     struct dsa_port *dp = dsa_slave_to_port(netdev);
> > > +     struct dsa_switch *ds = dp->ds;
> > > +     struct sja1105_private *priv = ds->priv;
> > > +     struct sja1105_port *sp = &priv->ports[dp->index];
> > > +     struct sk_buff *clone;
> > > +
> > > +     if (likely(!sja1105_is_link_local(skb))) {
> > > +             /* Normal traffic path. */
> > > +             u16 tx_vid = dsa_tagging_tx_vid(ds, dp->index);
> > > +             u8 pcp = skb->priority;
> > > +
> > > +             /* If we are under a vlan_filtering bridge, IP termination on
> > > +              * switch ports based on 802.1Q tags is simply too brittle to
> > > +              * be passable. So just defer to the dsa_slave_notag_xmit
> > > +              * implementation.
> > > +              */
> > > +             if (dp->vlan_filtering)
> > > +                     return skb;
> > > +
> > > +             return dsa_8021q_xmit(skb, netdev, ETH_P_EDSA,
> > > +                                  ((pcp << VLAN_PRIO_SHIFT) | tx_vid));
> >
> > Please don't reuse ETH_P_EDSA. Define an ETH_P_SJA1105.
> >
> > > +     }
> > > +
> > > +     /* Code path for transmitting management traffic. This does not rely
> > > +      * upon switch tagging, but instead SPI-installed management routes.
> > > +      */
> > > +     clone = skb_clone(skb, GFP_ATOMIC);
> > > +     if (!clone) {
> > > +             dev_err(ds->dev, "xmit: failed to clone skb\n");
> > > +             return NULL;
> > > +     }
> > > +
> > > +     if (sja1105_skb_ring_add(&sp->xmit_ring, clone) < 0) {
> > > +             dev_err(ds->dev, "xmit: skb ring full\n");
> > > +             kfree_skb(clone);
> > > +             return NULL;
> > > +     }
> > > +
> > > +     if (sp->xmit_ring.count == SJA1105_SKB_RING_SIZE)
> > > +             /* TODO setup a dedicated netdev queue for management traffic
> > > +              * so that we can selectively apply backpressure and not be
> > > +              * required to stop the entire traffic when the software skb
> > > +              * ring is full. This requires hooking the ndo_select_queue
> > > +              * from DSA and matching on mac_fltres.
> > > +              */
> > > +             dev_err(ds->dev, "xmit: reached maximum skb ring size\n");
> >
> > This should be rate limited.
> >
> >      Andrew
> >
> > > +
> > > +     schedule_work(&sp->xmit_work);
> > > +     /* Let DSA free its reference to the skb and we will free
> > > +      * the clone in the deferred worker
> > > +      */
> > > +     return NULL;
> > > +}
> > > +
> > > +static struct sk_buff *sja1105_rcv(struct sk_buff *skb,
> > > +                                struct net_device *netdev,
> > > +                                struct packet_type *pt)
> > > +{
> > > +     unsigned int source_port, switch_id;
> > > +     struct ethhdr *hdr = eth_hdr(skb);
> > > +     struct sk_buff *nskb;
> > > +     u16 tpid, vid, tci;
> > > +     bool is_tagged;
> > > +
> > > +     nskb = dsa_8021q_rcv(skb, netdev, pt, &tpid, &tci);
> > > +     is_tagged = (nskb && tpid == ETH_P_EDSA);
> > > +
> > > +     skb->priority = (tci & VLAN_PRIO_MASK) >> VLAN_PRIO_SHIFT;
> > > +     vid = tci & VLAN_VID_MASK;
> > > +
> > > +     skb->offload_fwd_mark = 1;
> > > +
> > > +     if (likely(!sja1105_is_link_local(skb))) {
> > > +             /* Normal traffic path. */
> > > +             source_port = dsa_tagging_rx_source_port(vid);
> > > +             switch_id = dsa_tagging_rx_switch_id(vid);
> > > +     } else {
> > > +             /* Management traffic path. Switch embeds the switch ID and
> > > +              * port ID into bytes of the destination MAC, courtesy of
> > > +              * the incl_srcpt options.
> > > +              */
> > > +             source_port = hdr->h_dest[3];
> > > +             switch_id = hdr->h_dest[4];
> > > +             /* Clear the DMAC bytes that were mangled by the switch */
> > > +             hdr->h_dest[3] = 0;
> > > +             hdr->h_dest[4] = 0;
> > > +     }
> > > +
> > > +     skb->dev = dsa_master_find_slave(netdev, switch_id, source_port);
> > > +     if (!skb->dev) {
> > > +             netdev_warn(netdev, "Couldn't decode source port\n");
> > > +             return NULL;
> > > +     }
> > > +
> > > +     /* Delete/overwrite fake VLAN header, DSA expects to not find
> > > +      * it there, see dsa_switch_rcv: skb_push(skb, ETH_HLEN).
> > > +      */
> > > +     if (is_tagged)
> > > +             memmove(skb->data - ETH_HLEN, skb->data - ETH_HLEN - VLAN_HLEN,
> > > +                     ETH_HLEN - VLAN_HLEN);
> > > +
> > > +     return skb;
> > > +}
> > > +
> > > +const struct dsa_device_ops sja1105_netdev_ops = {
> > > +     .xmit = sja1105_xmit,
> > > +     .rcv = sja1105_rcv,
> > > +     .filter = sja1105_filter,
> > > +     .overhead = VLAN_HLEN,
> > > +};
> > > +
> > > --
> > > 2.17.1
> > >


* Re: [PATCH v3 net-next 18/24] net: dsa: sja1105: Add support for traffic through standalone ports
  2019-04-13 22:08       ` Vladimir Oltean
@ 2019-04-13 22:26         ` Vladimir Oltean
  2019-04-14 16:17           ` Andrew Lunn
  0 siblings, 1 reply; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-13 22:26 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Florian Fainelli, vivien.didelot, davem, netdev, linux-kernel,
	Georg Waibel

On Sun, 14 Apr 2019 at 01:08, Vladimir Oltean <olteanv@gmail.com> wrote:
>
> On Sun, 14 Apr 2019 at 00:27, Vladimir Oltean <olteanv@gmail.com> wrote:
> >
> > On Sat, 13 Apr 2019 at 19:38, Andrew Lunn <andrew@lunn.ch> wrote:
> > >
> > > On Sat, Apr 13, 2019 at 04:28:16AM +0300, Vladimir Oltean wrote:
> > > > In order to support this, we are creating a makeshift switch tag out of
> > > > a VLAN trunk configured on the CPU port. Termination of normal traffic
> > > > on switch ports only works when not under a vlan_filtering bridge.
> > > > Termination of management (PTP, BPDU) traffic works under all
> > > > circumstances because it uses a different tagging mechanism
> > > > (incl_srcpt). We are making use of the generic CONFIG_NET_DSA_TAG_8021Q
> > > > code and leveraging it from our own CONFIG_NET_DSA_TAG_SJA1105.
> > > >
> > > > There are two types of traffic: regular and link-local.
> > > > The link-local traffic received on the CPU port is trapped from the
> > > > switch's regular forwarding decisions because it matched one of the two
> > > > DMAC filters for management traffic.
> > > > On transmission, the switch requires special massaging for these
> > > > link-local frames. Due to a weird implementation of the switching IP, by
> > > > default it drops link-local frames that originate on the CPU port. It
> > > > needs to be told where to forward them to, through an SPI command
> > > > ("management route") that is valid for only a single frame.
> > > > So when we're sending link-local traffic, we need to clone skb's from
> > > > DSA and send them in our custom xmit worker that also performs SPI access.
> > > >
> > > > For that purpose, the DSA xmit handler and the xmit worker communicate
> > > > through a per-port "skb ring" software structure, with a producer and a
> > > > consumer index. At the moment this structure is rather fragile
> > > > (ping-flooding to a link-local DMAC would cause most of the frames to
> > > > get dropped). I would like to move the management traffic on a separate
> > > > netdev queue that I can stop when the skb ring got full and hardware is
> > > > busy processing, so that we are not forced to drop traffic.
> > > >
> > > > Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> > > > Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
> > > > ---
> > > > Changes in v3:
> > > > Made management traffic be receivable on the DSA netdevices even when
> > > > switch tagging is disabled, as well as regular traffic be receivable on
> > > > the master netdevice in the same scenario. Both are accomplished using
> > > > the sja1105_filter() function and some small touch-ups in the .rcv
> > > > callback.
> > >
> > > It seems like you made major changes to this. When you do that, you
> > > should drop any reviewed-by tags you have. They are no longer valid
> > > because of the major changes.
> > >
> >
> > Ok, noted.
> >
> > > >  /* This callback needs to be present */
> > > > @@ -1141,7 +1158,11 @@ static int sja1105_vlan_filtering(struct dsa_switch *ds, int port, bool enabled)
> > > >       if (rc)
> > > >               dev_err(ds->dev, "Failed to change VLAN Ethertype\n");
> > > >
> > > > -     return rc;
> > > > +     /* Switch port identification based on 802.1Q is only passable
> > >
> > > possible, not passable.
> > >
> >
> > Passable (satisfactory, decent, acceptable) is what I wanted to say.
> > Tagging using VLANs is possible even when the bridge wants to use
> > them, but it's smarter not to go there. But I get your point, maybe
> > I'll rephrase.
> >
> > > > +      * if we are not under a vlan_filtering bridge. So make sure
> > > > +      * the two configurations are mutually exclusive.
> > > > +      */
> > > > +     return sja1105_setup_8021q_tagging(ds, !enabled);
> > > >  }
> > > >
> > > >  static void sja1105_vlan_add(struct dsa_switch *ds, int port,
> > > > @@ -1233,9 +1254,107 @@ static int sja1105_setup(struct dsa_switch *ds)
> > > >        */
> > > >       ds->vlan_filtering_is_global = true;
> > > >
> > > > +     /* The DSA/switchdev model brings up switch ports in standalone mode by
> > > > +      * default, and that means vlan_filtering is 0 since they're not under
> > > > +      * a bridge, so it's safe to set up switch tagging at this time.
> > > > +      */
> > > > +     return sja1105_setup_8021q_tagging(ds, true);
> > > > +}
> > > > +
> > > > +#include "../../../net/dsa/dsa_priv.h"
> > >
> > > No. Don't use relative includes like this.
> > >
> > > What do you need from the header? Maybe move it into
> > > include/linux/net/dsa.h
> > >
> >
> > dsa_slave_to_master()
> >
> > > > +/* Deferred work is unfortunately necessary because setting up the management
> > > > + * route cannot be done from atomic context (SPI transfer takes a sleepable
> > > > + * lock on the bus)
> > > > + */
> > > > +static void sja1105_xmit_work_handler(struct work_struct *work)
> > > > +{
> > > > +     struct sja1105_port *sp = container_of(work, struct sja1105_port,
> > > > +                                             xmit_work);
> > > > +     struct sja1105_private *priv = sp->dp->ds->priv;
> > > > +     struct net_device *slave = sp->dp->slave;
> > > > +     struct net_device *master = dsa_slave_to_master(slave);
> > > > +     int port = (uintptr_t)(sp - priv->ports);
> > > > +     struct sk_buff *skb;
> > > > +     int i, rc;
> > > > +
> > > > +     while ((i = sja1105_skb_ring_get(&sp->xmit_ring, &skb)) >= 0) {
> > > > +             struct sja1105_mgmt_entry mgmt_route = { 0 };
> > > > +             struct ethhdr *hdr;
> > > > +             int timeout = 10;
> > > > +             int skb_len;
> > > > +
> > > > +             skb_len = skb->len;
> > > > +             hdr = eth_hdr(skb);
> > > > +
> > > > +             mgmt_route.macaddr = ether_addr_to_u64(hdr->h_dest);
> > > > +             mgmt_route.destports = BIT(port);
> > > > +             mgmt_route.enfport = 1;
> > > > +             mgmt_route.tsreg = 0;
> > > > +             mgmt_route.takets = false;
> > > > +
> > > > +             rc = sja1105_dynamic_config_write(priv, BLK_IDX_MGMT_ROUTE,
> > > > +                                               port, &mgmt_route, true);
> > > > +             if (rc < 0) {
> > > > +                     kfree_skb(skb);
> > > > +                     slave->stats.tx_dropped++;
> > > > +                     continue;
> > > > +             }
> > > > +
> > > > +             /* Transfer skb to the host port. */
> > > > +             skb->dev = master;
> > > > +             dev_queue_xmit(skb);
> > > > +
> > > > +             /* Wait until the switch has processed the frame */
> > > > +             do {
> > > > +                     rc = sja1105_dynamic_config_read(priv, BLK_IDX_MGMT_ROUTE,
> > > > +                                                      port, &mgmt_route);
> > > > +                     if (rc < 0) {
> > > > +                             slave->stats.tx_errors++;
> > > > +                             dev_err(priv->ds->dev,
> > > > +                                     "xmit: failed to poll for mgmt route\n");
> > > > +                             continue;
> > > > +                     }
> > > > +
> > > > +                     /* UM10944: The ENFPORT flag of the respective entry is
> > > > +                      * cleared when a match is found. The host can use this
> > > > +                      * flag as an acknowledgment.
> > > > +                      */
> > > > +                     cpu_relax();
> > > > +             } while (mgmt_route.enfport && --timeout);
> > > > +
> > > > +             if (!timeout) {
> > > > +                     dev_err(priv->ds->dev, "xmit timed out\n");
> > > > +                     slave->stats.tx_errors++;
> > > > +                     continue;
> > > > +             }
> > > > +
> > > > +             slave->stats.tx_packets++;
> > > > +             slave->stats.tx_bytes += skb_len;
> > > > +     }
> > > > +}
> > > > +
> > > > +static int sja1105_port_enable(struct dsa_switch *ds, int port,
> > > > +                            struct phy_device *phydev)
> > > > +{
> > > > +     struct sja1105_private *priv = ds->priv;
> > > > +     struct sja1105_port *sp = &priv->ports[port];
> > > > +
> > > > +     sp->dp = &ds->ports[port];
> > > > +     INIT_WORK(&sp->xmit_work, sja1105_xmit_work_handler);
> > > >       return 0;
> > > >  }
> > >
> > > I think i'm missing something here. You have a per port queue of link
> > > local frames which need special handling. And you have a per-port work
> > > queue. To send such a frame, you need to write some register, send the
> > > frame, and then wait until the mgmt_route.enfport is reset.
> > >
> > > Why are you doing this per port? How do you stop two ports/work queues
> > > running at the same time? It seems like one queue, with one work queue
> > > would be a better structure.
> > >
> >
> > See the "port" parameter to this call here:
> >
> >         rc = sja1105_dynamic_config_write(priv, BLK_IDX_MGMT_ROUTE,
> >                           *port*, &mgmt_route, true);
> >
> > The switch IP aptly allocates 4 slots for management routes. And it's
> > a 5-port switch where 1 port is the management port. I think the
> > structure is fine.
> >
>
> "How do you stop two work queues": if you're talking about contention on
> the hardware management route, I responded to that.
> If you're talking about netif_stop_queue(), I think I'm going to avoid
> that altogether by not having a finite sized ring structure. While
> studying the Marvell 88e6060 driver I found out that sk_buff_head
> exists. Now I'm part of the elite club of wheel reinventors with my
> struct sja1105_skb_ring :)
>
>
> > > Also, please move all this code into the tagger. Just add exports for
> > > sja1105_dynamic_config_write() and sja1105_dynamic_config_read().
> > >
> >
> > Well, you see, the tagger code is part of the dsa_core object. If I
> > export function symbols from the driver, those still won't be there if
> > I compile the driver as a module. On the other hand, the way I'm doing
> > it, I think the schedule_work() gives me a pretty good separation.
> >
> > > > +static void sja1105_port_disable(struct dsa_switch *ds, int port)
> > > > +{
> > > > +     struct sja1105_private *priv = ds->priv;
> > > > +     struct sja1105_port *sp = &priv->ports[port];
> > > > +     struct sk_buff *skb;
> > > > +
> > > > +     cancel_work_sync(&sp->xmit_work);
> > > > +     while (sja1105_skb_ring_get(&sp->xmit_ring, &skb) >= 0)
> > > > +             kfree_skb(skb);
> > > > +}
> > > > +
> > > > diff --git a/net/dsa/tag_sja1105.c b/net/dsa/tag_sja1105.c
> > > > new file mode 100644
> > > > index 000000000000..5c76a06c9093
> > > > --- /dev/null
> > > > +++ b/net/dsa/tag_sja1105.c
> > > > @@ -0,0 +1,148 @@
> > > > +// SPDX-License-Identifier: GPL-2.0
> > > > +/* Copyright (c) 2019, Vladimir Oltean <olteanv@gmail.com>
> > > > + */
> > > > +#include <linux/etherdevice.h>
> > > > +#include <linux/if_vlan.h>
> > > > +#include <linux/dsa/sja1105.h>
> > > > +#include "../../drivers/net/dsa/sja1105/sja1105.h"
> > >
> > > Again, no, don't do this.
> > >
> >
> > This separation between driver and tagger is fairly arbitrary.
> > I need access to the driver's private structure, in order to get a
> > hold of the private shadow of the dsa_port. Moving the driver private
> > structure to include/linux/dsa/ would pull in quite a number of
> > dependencies. Maybe I could provide declarations for the most of them,
> > but anyway the private structure wouldn't be so private any longer,
> > would it?
> > Otherwise put, would you prefer a dp->priv similar to the already
> > existing ds->priv? struct sja1105_port is much more lightweight to
> > keep in include/linux/dsa/.
> >
> > > > +
> > > > +#include "dsa_priv.h"
> > > > +
> > > > +/* Similar to is_link_local_ether_addr(hdr->h_dest) but also covers PTP */
> > > > +static inline bool sja1105_is_link_local(const struct sk_buff *skb)
> > > > +{
> > > > +     const struct ethhdr *hdr = eth_hdr(skb);
> > > > +     u64 dmac = ether_addr_to_u64(hdr->h_dest);
> > > > +
> > > > +     if ((dmac & SJA1105_LINKLOCAL_FILTER_A_MASK) ==
> > > > +                 SJA1105_LINKLOCAL_FILTER_A)
> > > > +             return true;
> > > > +     if ((dmac & SJA1105_LINKLOCAL_FILTER_B_MASK) ==
> > > > +                 SJA1105_LINKLOCAL_FILTER_B)
> > > > +             return true;
> > > > +     return false;
> > > > +}
> > > > +
> > > > +static bool sja1105_filter(const struct sk_buff *skb, struct net_device *dev)
> > > > +{
> > > > +     if (sja1105_is_link_local(skb))
> > > > +             return true;
> > > > +     if (!dev->dsa_ptr->vlan_filtering)
> > > > +             return true;
> > > > +     return false;
> > > > +}
> > >
> > > Please add a comment here about what frames cannot be handled by the
> > > tagger. However, i'm not too happy about this design...
> > >
> >
> > Ok, let's put this another way.
> > A switch is primarily a device used to offload the forwarding of
> > traffic based on L2 rules. Additionally there may be some management
> > traffic for stuff like STP that needs to be terminated on the host
> > port of the switch. For that, the hardware's job is to filter and tag
> > management frames on their way to the host port, and the software's
> > job is to process the source port and switch id information in a
> > meaningful way.
> > Now both this particular switch hardware, and DSA, are taking the
> > above definitions to extremes.
> > The switch says: "that's all you want to see? ok, so that's all I'm
> > going to give you". So its native (hardware) tagging protocol is to
> > trap link-local traffic and overwrite two bytes of its destination MAC
> > with the switch ID and the source port. No more, no less. It is an
> > incomplete solution, but it does the job for practical use cases.
> > Now DSA says: "I want these to be fully capable net devices, I want
> > the user to not even realize what's going on under the hood". I don't
> > think that terminating iperf traffic through switch ports is a
> > realistic usage scenario. So in a way discussions about performance
> > and optimizations on DSA hotpath are slightly pointless IMO.
> > Now what my driver says is that it offers a bit of both. It speaks the
> > hardware's tagging protocol so it is capable of management traffic,
> > but it also speaks the DSA paradigm, so in a way pushes the hardware
> > to work in a mode it was never intended to, by repurposing VLANs when
> > the user doesn't request them. So on one hand there is some overlap
> > between the hardware tagging protocol and the VLAN one (in standalone
> > mode and in VLAN-unaware bridged mode, management traffic *could* use
> > VLAN tagging but it doesn't rely on it), and on the other hand the
> > union of the two tagging protocols is decent, but still doesn't
> > cover the entire spectrum (when put under a VLAN-aware bridge, you
> > lose the ability to decode general traffic). So you'd better not rely
> > on VLANs to decode the management traffic, because you won't be able
> > to always rely on that, and that is a shame since a bridge with both
> > vlan_filtering 1 and stp_state 1 is a real usage scenario, and the
> > hardware is capable of that combination.
> > But all of that is secondary. Let's forget about VLAN tagging for a
> > second and concentrate on the tagging of management traffic. The
> > limiting factor here is the software architecture of DSA, because in
> > order for me to decode that in the driver/tagger, I'd have to drop
> > everything else coming on the master net device (I explained in 13/24
> > why). I believe that DSA being all-or-nothing about switch tagging is
> > turning a blind eye to the devices that don't go overboard with
> > features, and give you what's needed in a real-world design but not
> > much else.
> > What would you improve about this design (assuming you're talking
> > about the filter function)?
> >
> > Thanks,
> > -Vladimir
> >
> >
> >
> >
> >
> > > > +
> > > > +static struct sk_buff *sja1105_xmit(struct sk_buff *skb,
> > > > +                                 struct net_device *netdev)
> > > > +{
> > > > +     struct dsa_port *dp = dsa_slave_to_port(netdev);
> > > > +     struct dsa_switch *ds = dp->ds;
> > > > +     struct sja1105_private *priv = ds->priv;
> > > > +     struct sja1105_port *sp = &priv->ports[dp->index];
> > > > +     struct sk_buff *clone;
> > > > +
> > > > +     if (likely(!sja1105_is_link_local(skb))) {
> > > > +             /* Normal traffic path. */
> > > > +             u16 tx_vid = dsa_tagging_tx_vid(ds, dp->index);
> > > > +             u8 pcp = skb->priority;
> > > > +
> > > > +             /* If we are under a vlan_filtering bridge, IP termination on
> > > > +              * switch ports based on 802.1Q tags is simply too brittle to
> > > > +              * be passable. So just defer to the dsa_slave_notag_xmit
> > > > +              * implementation.
> > > > +              */
> > > > +             if (dp->vlan_filtering)
> > > > +                     return skb;
> > > > +
> > > > +             return dsa_8021q_xmit(skb, netdev, ETH_P_EDSA,
> > > > +                                  ((pcp << VLAN_PRIO_SHIFT) | tx_vid));
> > >
> > > Please don't reuse ETH_P_EDSA. Define an ETH_P_SJA1105.
> > >

I'm receiving contradictory advice on this. Why should I define a new
ethertype, and if I do, what scope should the definition have (local
to the driver and the tagger, local to DSA, UAPI)?

> > > > +     }
> > > > +
> > > > +     /* Code path for transmitting management traffic. This does not rely
> > > > +      * upon switch tagging, but instead SPI-installed management routes.
> > > > +      */
> > > > +     clone = skb_clone(skb, GFP_ATOMIC);
> > > > +     if (!clone) {
> > > > +             dev_err(ds->dev, "xmit: failed to clone skb\n");
> > > > +             return NULL;
> > > > +     }
> > > > +
> > > > +     if (sja1105_skb_ring_add(&sp->xmit_ring, clone) < 0) {
> > > > +             dev_err(ds->dev, "xmit: skb ring full\n");
> > > > +             kfree_skb(clone);
> > > > +             return NULL;
> > > > +     }
> > > > +
> > > > +     if (sp->xmit_ring.count == SJA1105_SKB_RING_SIZE)
> > > > +             /* TODO setup a dedicated netdev queue for management traffic
> > > > +              * so that we can selectively apply backpressure and not be
> > > > +              * required to stop the entire traffic when the software skb
> > > > +              * ring is full. This requires hooking the ndo_select_queue
> > > > +              * from DSA and matching on mac_fltres.
> > > > +              */
> > > > +             dev_err(ds->dev, "xmit: reached maximum skb ring size\n");
> > >
> > > This should be rate limited.
> > >

Again, with sk_buff_head it'll probably completely go away.

> > >      Andrew
> > >
> > > > +
> > > > +     schedule_work(&sp->xmit_work);
> > > > +     /* Let DSA free its reference to the skb and we will free
> > > > +      * the clone in the deferred worker
> > > > +      */
> > > > +     return NULL;
> > > > +}
> > > > +
> > > > +static struct sk_buff *sja1105_rcv(struct sk_buff *skb,
> > > > +                                struct net_device *netdev,
> > > > +                                struct packet_type *pt)
> > > > +{
> > > > +     unsigned int source_port, switch_id;
> > > > +     struct ethhdr *hdr = eth_hdr(skb);
> > > > +     struct sk_buff *nskb;
> > > > +     u16 tpid, vid, tci;
> > > > +     bool is_tagged;
> > > > +
> > > > +     nskb = dsa_8021q_rcv(skb, netdev, pt, &tpid, &tci);
> > > > +     is_tagged = (nskb && tpid == ETH_P_EDSA);
> > > > +
> > > > +     skb->priority = (tci & VLAN_PRIO_MASK) >> VLAN_PRIO_SHIFT;
> > > > +     vid = tci & VLAN_VID_MASK;
> > > > +
> > > > +     skb->offload_fwd_mark = 1;
> > > > +
> > > > +     if (likely(!sja1105_is_link_local(skb))) {
> > > > +             /* Normal traffic path. */
> > > > +             source_port = dsa_tagging_rx_source_port(vid);
> > > > +             switch_id = dsa_tagging_rx_switch_id(vid);
> > > > +     } else {
> > > > +             /* Management traffic path. Switch embeds the switch ID and
> > > > +              * port ID into bytes of the destination MAC, courtesy of
> > > > +              * the incl_srcpt options.
> > > > +              */
> > > > +             source_port = hdr->h_dest[3];
> > > > +             switch_id = hdr->h_dest[4];
> > > > +             /* Clear the DMAC bytes that were mangled by the switch */
> > > > +             hdr->h_dest[3] = 0;
> > > > +             hdr->h_dest[4] = 0;
> > > > +     }
> > > > +
> > > > +     skb->dev = dsa_master_find_slave(netdev, switch_id, source_port);
> > > > +     if (!skb->dev) {
> > > > +             netdev_warn(netdev, "Couldn't decode source port\n");
> > > > +             return NULL;
> > > > +     }
> > > > +
> > > > +     /* Delete/overwrite fake VLAN header, DSA expects to not find
> > > > +      * it there, see dsa_switch_rcv: skb_push(skb, ETH_HLEN).
> > > > +      */
> > > > +     if (is_tagged)
> > > > +             memmove(skb->data - ETH_HLEN, skb->data - ETH_HLEN - VLAN_HLEN,
> > > > +                     ETH_HLEN - VLAN_HLEN);
> > > > +
> > > > +     return skb;
> > > > +}
> > > > +
> > > > +const struct dsa_device_ops sja1105_netdev_ops = {
> > > > +     .xmit = sja1105_xmit,
> > > > +     .rcv = sja1105_rcv,
> > > > +     .filter = sja1105_filter,
> > > > +     .overhead = VLAN_HLEN,
> > > > +};
> > > > +
> > > > --
> > > > 2.17.1
> > > >


* Re: [PATCH v3 net-next 17/24] net: dsa: sja1105: Add support for ethtool port counters
  2019-04-13 21:55     ` Vladimir Oltean
@ 2019-04-14  8:34       ` Jiri Pirko
  0 siblings, 0 replies; 68+ messages in thread
From: Jiri Pirko @ 2019-04-14  8:34 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Florian Fainelli, vivien.didelot, Andrew Lunn, davem, netdev,
	linux-kernel, Georg Waibel

Sat, Apr 13, 2019 at 11:55:52PM CEST, olteanv@gmail.com wrote:
>On Sat, 13 Apr 2019 at 23:53, Jiri Pirko <jiri@resnulli.us> wrote:
>>
>> Sat, Apr 13, 2019 at 03:28:15AM CEST, olteanv@gmail.com wrote:
>> >Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
>> >Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
>> >---
>> >Changes in v3:
>> >None.
>> >
>> >Changes in v2:
>> >None functional. Moved the IS_ET() and IS_PQRS() device identification
>> >macros here since they are not used in earlier patches.
>> >
>> > drivers/net/dsa/sja1105/Makefile              |   1 +
>> > drivers/net/dsa/sja1105/sja1105.h             |   7 +-
>> > drivers/net/dsa/sja1105/sja1105_ethtool.c     | 414 ++++++++++++++++++
>> > drivers/net/dsa/sja1105/sja1105_main.c        |   3 +
>> > .../net/dsa/sja1105/sja1105_static_config.h   |  21 +
>> > 5 files changed, 445 insertions(+), 1 deletion(-)
>> > create mode 100644 drivers/net/dsa/sja1105/sja1105_ethtool.c
>> >
>> >diff --git a/drivers/net/dsa/sja1105/Makefile b/drivers/net/dsa/sja1105/Makefile
>> >index ed00840802f4..bb4404c79eb2 100644
>> >--- a/drivers/net/dsa/sja1105/Makefile
>> >+++ b/drivers/net/dsa/sja1105/Makefile
>> >@@ -3,6 +3,7 @@ obj-$(CONFIG_NET_DSA_SJA1105) += sja1105.o
>> > sja1105-objs := \
>> >     sja1105_spi.o \
>> >     sja1105_main.o \
>> >+    sja1105_ethtool.o \
>> >     sja1105_clocking.o \
>> >     sja1105_static_config.o \
>> >     sja1105_dynamic_config.o \
>> >diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
>> >index 4c9df44a4478..80b20bdd8f9c 100644
>> >--- a/drivers/net/dsa/sja1105/sja1105.h
>> >+++ b/drivers/net/dsa/sja1105/sja1105.h
>> >@@ -120,8 +120,13 @@ typedef enum {
>> > int sja1105_clocking_setup_port(struct sja1105_private *priv, int port);
>> > int sja1105_clocking_setup(struct sja1105_private *priv);
>> >
>> >-/* From sja1105_dynamic_config.c */
>> >+/* From sja1105_ethtool.c */
>> >+void sja1105_get_ethtool_stats(struct dsa_switch *ds, int port, u64 *data);
>> >+void sja1105_get_strings(struct dsa_switch *ds, int port,
>> >+                       u32 stringset, u8 *data);
>> >+int sja1105_get_sset_count(struct dsa_switch *ds, int port, int sset);
>> >
>> >+/* From sja1105_dynamic_config.c */
>> > int sja1105_dynamic_config_read(struct sja1105_private *priv,
>> >                               enum sja1105_blk_idx blk_idx,
>> >                               int index, void *entry);
>> >diff --git a/drivers/net/dsa/sja1105/sja1105_ethtool.c b/drivers/net/dsa/sja1105/sja1105_ethtool.c
>> >new file mode 100644
>> >index 000000000000..c082599702bd
>> >--- /dev/null
>> >+++ b/drivers/net/dsa/sja1105/sja1105_ethtool.c
>> >@@ -0,0 +1,414 @@
>> >+// SPDX-License-Identifier: GPL-2.0
>> >+/* Copyright (c) 2018-2019, Vladimir Oltean <olteanv@gmail.com>
>> >+ */
>> >+#include "sja1105.h"
>> >+
>> >+#define SIZE_MAC_AREA         (0x02 * 4)
>> >+#define SIZE_HL1_AREA         (0x10 * 4)
>> >+#define SIZE_HL2_AREA         (0x4 * 4)
>> >+#define SIZE_QLEVEL_AREA      (0x8 * 4) /* 0x4 to 0xB */
>>
>> Please use prefixes for defines like this. For example "SIZE_MAC_AREA"
>> sounds way too generic.
>>
>
>What you propose sounds nice but then consistency would be a concern,
>so I'd have to redo the entire driver. And then there are tables named

Yep


>like "L2 Address Lookup Parameters Table", and as if
>SIZE_L2_LOOKUP_PARAMS_DYN_CMD_ET wasn't long enough, a prefix would
>top it off.

It's a tradeoff. But most of the time, it is doable. Then the code is
easier to read.


>I don't think it's as much of an issue for the reader (for the
>compiler it clearly isn't, as it's restricted to this C file only) as
>it is for tools like ctags?
>
>> [...]
>>
>>
>> >+static void
>> >+sja1105_port_status_hl1_unpack(void *buf,
>> >+                             struct sja1105_port_status_hl1 *status)
>> >+{
>> >+      /* Make pointer arithmetic work on 4 bytes */
>> >+      u32 *p = (u32 *)buf;
>>
>> You don't need to cast void *. Please avoid it in the whole patchset.
>>
>
>Ok.
>
>> [...]
>>
>>
>> >+      if (!IS_PQRS(priv->info->device_id))
>> >+              return;
>> >+
>> >+      memset(data + k, 0, ARRAY_SIZE(sja1105pqrs_extra_port_stats) *
>> >+                      sizeof(u64));
>> >+      for (i = 0; i < 8; i++) {
>>
>> Array size instead of "8"?
>>
>
>Perhaps SJA1105_NUM_TC, since the egress queue occupancy levels are
>per traffic class. The size of the array is 16.
>
>>
>> >+              data[k++] = status.hl2.qlevel_hwm[i];
>> >+              data[k++] = status.hl2.qlevel[i];
>> >+      }
>>
>> [...]
>>
>>
>> >
>> >+#define IS_PQRS(device_id) \
>> >+      (((device_id) == SJA1105PR_DEVICE_ID) || \
>> >+       ((device_id) == SJA1105QS_DEVICE_ID))
>> >+#define IS_ET(device_id) \
>> >+      (((device_id) == SJA1105E_DEVICE_ID) || \
>> >+       ((device_id) == SJA1105T_DEVICE_ID))
>> >+/* P and R have the same Device ID, and differ by Part Number */
>> >+#define IS_P(device_id, part_nr) \
>> >+      (((device_id) == SJA1105PR_DEVICE_ID) && \
>> >+       ((part_nr) == SJA1105P_PART_NR))
>> >+#define IS_R(device_id, part_nr) \
>> >+      (((device_id) == SJA1105PR_DEVICE_ID) && \
>> >+       ((part_nr) == SJA1105R_PART_NR))
>> >+/* Same for Q and S */
>> >+#define IS_Q(device_id, part_nr) \
>> >+      (((device_id) == SJA1105QS_DEVICE_ID) && \
>> >+       ((part_nr) == SJA1105Q_PART_NR))
>> >+#define IS_S(device_id, part_nr) \
>>
>> Please have a prefix for macros like this. "IS_S" sounds way too
>> generic...
>>
>
>Ok, I can think of a more descriptive name, since there are under 5
>occurrences of these macros after the reorg, now that I have the
>sja1105_info structure which also holds per-revision function
>pointers.
>
>>
>> >+      (((device_id) == SJA1105QS_DEVICE_ID) && \
>> >+       ((part_nr) == SJA1105S_PART_NR))
>> >+
>> > struct sja1105_general_params_entry {
>> >       u64 vllupformat;
>> >       u64 mirr_ptacu;
>> >--
>> >2.17.1
>> >
>
>
>Thanks!
>-Vladimir


* Re: [PATCH v3 net-next 20/24] net: dsa: sja1105: Error out if RGMII delays are requested in DT
  2019-04-13 21:31     ` Vladimir Oltean
@ 2019-04-14  8:35       ` Jiri Pirko
  0 siblings, 0 replies; 68+ messages in thread
From: Jiri Pirko @ 2019-04-14  8:35 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Florian Fainelli, vivien.didelot, Andrew Lunn, davem, netdev,
	linux-kernel, Georg Waibel

Sat, Apr 13, 2019 at 11:31:01PM CEST, olteanv@gmail.com wrote:
>On Sat, 13 Apr 2019 at 23:47, Jiri Pirko <jiri@resnulli.us> wrote:
>>
>> Sat, Apr 13, 2019 at 03:28:18AM CEST, olteanv@gmail.com wrote:
>> >Documentation/devicetree/bindings/net/ethernet.txt is confusing because
>> >it says what the MAC should not do, but not what it *should* do:
>> >
>> >  * "rgmii-rxid" (RGMII with internal RX delay provided by the PHY, the MAC
>> >     should not add an RX delay in this case)
>> >
>> >The gap in semantics is threefold:
>> >1. Is it illegal for the MAC to apply the Rx internal delay by itself,
>> >   and simplify the phy_mode (mask off "rgmii-rxid" into "rgmii") before
>> >   passing it to of_phy_connect? The documentation would suggest yes.
>> >2. For "rgmii-rxid", while the situation with the Rx clock skew is more
>> >   or less clear (needs to be added by the PHY), what should the MAC
>> >   driver do about the Tx delays? Is it an implicit wild card for the
>> >   MAC to apply delays in the Tx direction if it can? What if those were
>> >   already added as serpentine PCB traces, how could that be made more
>> >   obvious through DT bindings so that the MAC doesn't attempt to add
>> >   them twice and again potentially break the link?
>> >3. If the interface is a fixed-link and therefore the PHY object is
>> >   fixed (a purely software entity that obviously cannot add clock
>> >   skew), what is the meaning of the above property?
>> >
>> >So an interpretation of the RGMII bindings was chosen that hopefully
>> >does not contradict their intention but also makes them more actionable.
>> >The SJA1105 driver understands to act upon "rgmii-*id" phy-mode bindings
>> >if the port is in the PHY role (either explicitly, or if it is a
>> >fixed-link). Otherwise it always passes the duty of setting up delays to
>> >the PHY driver.
>> >
>> >The error behavior that this patch adds is required on SJA1105E/T where
>> >the MAC really cannot apply internal delays. If the other end of the
>> >fixed-link cannot apply RGMII delays either (this would be specified
>> >through its own DT bindings), then the situation requires PCB delays.
>> >
>> >For SJA1105P/Q/R/S, this is however hardware supported and the error is
>> >thus only temporary. I created a stub function pointer for configuring
>> >delays per-port on RXC and TXC, and will implement it when I have access
>> >to a board with this hardware setup.
>> >
>> >Meanwhile do not allow the user to select an invalid configuration.
>> >
>> >Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
>> >Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
>> >---
>> >Changes in v3:
>> >None.
>> >
>> >Changes in v2:
>> >Patch is new.
>> >
>> > drivers/net/dsa/sja1105/sja1105.h          |  3 ++
>> > drivers/net/dsa/sja1105/sja1105_clocking.c |  7 ++++-
>> > drivers/net/dsa/sja1105/sja1105_main.c     | 32 +++++++++++++++++++++-
>> > drivers/net/dsa/sja1105/sja1105_spi.c      |  6 ++++
>> > 4 files changed, 46 insertions(+), 2 deletions(-)
>> >
>> >diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
>> >index b7e745c0bb3a..3c16b991032c 100644
>> >--- a/drivers/net/dsa/sja1105/sja1105.h
>> >+++ b/drivers/net/dsa/sja1105/sja1105.h
>> >@@ -22,6 +22,8 @@
>> >
>> > struct sja1105_port {
>> >       struct dsa_port *dp;
>> >+      bool rgmii_rx_delay;
>> >+      bool rgmii_tx_delay;
>> >       struct work_struct xmit_work;
>> >       struct sja1105_skb_ring xmit_ring;
>> > };
>> >@@ -61,6 +63,7 @@ struct sja1105_info {
>> >       const struct sja1105_table_ops *static_ops;
>> >       const struct sja1105_regs *regs;
>> >       int (*reset_cmd)(const void *ctx, const void *data);
>> >+      int (*setup_rgmii_delay)(const void *ctx, int port, bool rx, bool tx);
>> >       const char *name;
>> > };
>> >
>> >diff --git a/drivers/net/dsa/sja1105/sja1105_clocking.c b/drivers/net/dsa/sja1105/sja1105_clocking.c
>> >index d40da3d52464..c02fec181676 100644
>> >--- a/drivers/net/dsa/sja1105/sja1105_clocking.c
>> >+++ b/drivers/net/dsa/sja1105/sja1105_clocking.c
>> >@@ -432,7 +432,12 @@ static int rgmii_clocking_setup(struct sja1105_private *priv, int port)
>> >               dev_err(dev, "Failed to configure Tx pad registers\n");
>> >               return rc;
>> >       }
>> >-      return 0;
>> >+      if (!priv->info->setup_rgmii_delay)
>> >+              return 0;
>> >+
>> >+      return priv->info->setup_rgmii_delay(priv, port,
>> >+                                           priv->ports[port].rgmii_rx_delay,
>> >+                                           priv->ports[port].rgmii_tx_delay);
>> > }
>> >
>> > static int sja1105_cgu_rmii_ref_clk_config(struct sja1105_private *priv,
>> >diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
>> >index e4abf8fb2013..5f7ddb1da006 100644
>> >--- a/drivers/net/dsa/sja1105/sja1105_main.c
>> >+++ b/drivers/net/dsa/sja1105/sja1105_main.c
>> >@@ -555,6 +555,21 @@ static int sja1105_static_config_load(struct sja1105_private *priv,
>> >       return sja1105_static_config_upload(priv);
>> > }
>> >
>> >+static void sja1105_parse_rgmii_delay(const struct sja1105_dt_port *in,
>> >+                                    struct sja1105_port *out)
>> >+{
>> >+      if (in->role == XMII_MAC)
>> >+              return;
>> >+
>> >+      if (in->phy_mode == PHY_INTERFACE_MODE_RGMII_RXID ||
>> >+          in->phy_mode == PHY_INTERFACE_MODE_RGMII_ID)
>> >+              out->rgmii_rx_delay = true;
>> >+
>> >+      if (in->phy_mode == PHY_INTERFACE_MODE_RGMII_TXID ||
>> >+          in->phy_mode == PHY_INTERFACE_MODE_RGMII_ID)
>> >+              out->rgmii_tx_delay = true;
>> >+}
>> >+
>> > static int sja1105_parse_ports_node(struct sja1105_private *priv,
>> >                                   struct sja1105_dt_port *ports,
>> >                                   struct device_node *ports_node)
>> >@@ -1315,13 +1330,28 @@ static int sja1105_setup(struct dsa_switch *ds)
>> > {
>> >       struct sja1105_dt_port ports[SJA1105_NUM_PORTS];
>> >       struct sja1105_private *priv = ds->priv;
>> >-      int rc;
>> >+      int rc, i;
>> >
>> >       rc = sja1105_parse_dt(priv, ports);
>> >       if (rc < 0) {
>> >               dev_err(ds->dev, "Failed to parse DT: %d\n", rc);
>> >               return rc;
>> >       }
>> >+
>> >+      /* Error out early if internal delays are required through DT
>> >+       * and we can't apply them.
>> >+       */
>> >+      for (i = 0; i < SJA1105_NUM_PORTS; i++) {
>> >+              sja1105_parse_rgmii_delay(&ports[i], &priv->ports[i]);
>> >+
>> >+              if ((priv->ports[i].rgmii_rx_delay ||
>> >+                   priv->ports[i].rgmii_tx_delay) &&
>> >+                   !priv->info->setup_rgmii_delay) {
>> >+                      dev_err(ds->dev, "RGMII delay not supported\n");
>> >+                      return -EINVAL;
>> >+              }
>> >+      }
>> >+
>> >       /* Create and send configuration down to device */
>> >       rc = sja1105_static_config_load(priv, ports);
>> >       if (rc < 0) {
>> >diff --git a/drivers/net/dsa/sja1105/sja1105_spi.c b/drivers/net/dsa/sja1105/sja1105_spi.c
>> >index 09cb28e9be20..e4ef4d8048b2 100644
>> >--- a/drivers/net/dsa/sja1105/sja1105_spi.c
>> >+++ b/drivers/net/dsa/sja1105/sja1105_spi.c
>> >@@ -499,6 +499,7 @@ struct sja1105_info sja1105e_info = {
>> >       .part_no                = SJA1105ET_PART_NO,
>> >       .static_ops             = sja1105e_table_ops,
>> >       .dyn_ops                = sja1105et_dyn_ops,
>> >+      .setup_rgmii_delay      = NULL,
>> >       .reset_cmd              = sja1105et_reset_cmd,
>> >       .regs                   = &sja1105et_regs,
>> >       .name                   = "SJA1105E",
>> >@@ -508,6 +509,7 @@ struct sja1105_info sja1105t_info = {
>> >       .part_no                = SJA1105ET_PART_NO,
>> >       .static_ops             = sja1105t_table_ops,
>> >       .dyn_ops                = sja1105et_dyn_ops,
>> >+      .setup_rgmii_delay      = NULL,
>> >       .reset_cmd              = sja1105et_reset_cmd,
>> >       .regs                   = &sja1105et_regs,
>> >       .name                   = "SJA1105T",
>> >@@ -517,6 +519,7 @@ struct sja1105_info sja1105p_info = {
>> >       .part_no                = SJA1105P_PART_NO,
>> >       .static_ops             = sja1105p_table_ops,
>> >       .dyn_ops                = sja1105pqrs_dyn_ops,
>> >+      .setup_rgmii_delay      = NULL,
>> >       .reset_cmd              = sja1105pqrs_reset_cmd,
>> >       .regs                   = &sja1105pqrs_regs,
>> >       .name                   = "SJA1105P",
>> >@@ -526,6 +529,7 @@ struct sja1105_info sja1105q_info = {
>> >       .part_no                = SJA1105Q_PART_NO,
>> >       .static_ops             = sja1105q_table_ops,
>> >       .dyn_ops                = sja1105pqrs_dyn_ops,
>> >+      .setup_rgmii_delay      = NULL,
>> >       .reset_cmd              = sja1105pqrs_reset_cmd,
>> >       .regs                   = &sja1105pqrs_regs,
>> >       .name                   = "SJA1105Q",
>> >@@ -535,6 +539,7 @@ struct sja1105_info sja1105r_info = {
>> >       .part_no                = SJA1105R_PART_NO,
>> >       .static_ops             = sja1105r_table_ops,
>> >       .dyn_ops                = sja1105pqrs_dyn_ops,
>> >+      .setup_rgmii_delay      = NULL,
>> >       .reset_cmd              = sja1105pqrs_reset_cmd,
>> >       .regs                   = &sja1105pqrs_regs,
>> >       .name                   = "SJA1105R",
>> >@@ -545,6 +550,7 @@ struct sja1105_info sja1105s_info = {
>> >       .static_ops             = sja1105s_table_ops,
>> >       .dyn_ops                = sja1105pqrs_dyn_ops,
>> >       .regs                   = &sja1105pqrs_regs,
>> >+      .setup_rgmii_delay      = NULL,
>>
>> You don't need to set this to NULL. Please avoid that.
>>
>
>Hi Jiri, why not?

If you don't assign, it is already NULL, so the assignment to NULL is
pointless.


>
>Thanks,
>-Vladimir
>
>>
>> >       .reset_cmd              = sja1105pqrs_reset_cmd,
>> >       .name                   = "SJA1105S",
>> > };
>> >--
>> >2.17.1
>> >

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v3 net-next 18/24] net: dsa: sja1105: Add support for traffic through standalone ports
  2019-04-13 21:27     ` Vladimir Oltean
  2019-04-13 22:08       ` Vladimir Oltean
@ 2019-04-14 16:05       ` Andrew Lunn
  2019-04-14 18:42         ` Vladimir Oltean
  2019-04-17  0:16       ` Florian Fainelli
  2 siblings, 1 reply; 68+ messages in thread
From: Andrew Lunn @ 2019-04-14 16:05 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Florian Fainelli, vivien.didelot, davem, netdev, linux-kernel,
	Georg Waibel

On Sun, Apr 14, 2019 at 12:27:50AM +0300, Vladimir Oltean wrote:
> > > +             mgmt_route.macaddr = ether_addr_to_u64(hdr->h_dest);
> > > +             mgmt_route.destports = BIT(port);
> > > +             mgmt_route.enfport = 1;
> > > +             mgmt_route.tsreg = 0;
> > > +             mgmt_route.takets = false;
> > > +
> > > +             rc = sja1105_dynamic_config_write(priv, BLK_IDX_MGMT_ROUTE,
> > > +                                               port, &mgmt_route, true);

>         rc = sja1105_dynamic_config_write(priv, BLK_IDX_MGMT_ROUTE,
>                           *port*, &mgmt_route, true);
> 
> The switch IP aptly allocates 4 slots for management routes. And it's
> a 5-port switch where 1 port is the management port. I think the
> structure is fine.

So does the hardware look over all the slots and find the first one
which has a matching mgmt_route.macaddr destination MAC address? You
wait for the enfport to be cleared. I assume a slot with enfport
cleared is not active and won't match?

So we need to consider if there is a race condition where we have
multiple slots with the same destination MAC address, but different
destination ports? Say the bridge sends out BPDU to all ports of a
bridge in quick succession.

These work queues run in any order, and can sleep. Can we get into a
situation where we get the two slots setup, and then the frames sent
in reverse order? The match then happens backwards, and the frames get
sent out the wrong port?

Or say the two slots are setup, the two frames are sent in order, but
the stack decided to drop the first frame because buffers are
full. Can the second frame make it to the switch and match on the
first slot and go out the wrong port?

> > Also, please move all this code into the tagger. Just add exports for
> > sja1105_dynamic_config_write() and sja1105_dynamic_config_read().
> >
> 
> Well, you see, the tagger code is part of the dsa_core object. If I
> export function symbols from the driver, those still won't be there if
> I compile the driver as a module. On the other hand, the way I'm doing
> it, I think the schedule_work() gives me a pretty good separation.

That is solvable via Kconfig, don't allow it to be built as a module.

Also, DSA has been very successful, we keep getting more switches from
different vendors, and hence more taggers. So at some point, we should
turn the taggers into modules. I'm not saying that should happen now,
but when it does happen, this driver can then become a module.

The real reason is, taggers are all about handling frames, whereas
drivers are all about configuring the switch. The majority of this
code is about frames, so it belongs in the tagger.
 
> > > +#include <linux/etherdevice.h>
> > > +#include <linux/if_vlan.h>
> > > +#include <linux/dsa/sja1105.h>
> > > +#include "../../drivers/net/dsa/sja1105/sja1105.h"
> >
> > Again, no, don't do this.
> >
> 
> This separation between driver and tagger is fairly arbitrary.
> I need access to the driver's private structure, in order to get a
> hold of the private shadow of the dsa_port. Moving the driver private
> structure to include/linux/dsa/ would pull in quite a number of
> dependencies. Maybe I could provide declarations for the most of them,
> but anyway the private structure wouldn't be so private any longer,
> would it?
> Otherwise put, would you prefer a dp->priv similar to the already
> existing ds->priv? struct sja1105_port is much more lightweight to
> keep in include/linux/dsa/.

Linux simply does not make use of relative paths going between
directories like this. That is the key point here. Whatever you need
to share between the tagger and the driver has to be put into
include/linux/dsa/. 

Assuming we are just exporting something like
sja1105_dynamic_config_write() and _read():


> > > +             rc = sja1105_dynamic_config_write(priv, BLK_IDX_MGMT_ROUTE,
> > > +                                               port, &mgmt_route, true);

priv can be replaced with ds, which the tagger has. port is
known. BLK_IDX_MGMT_ROUTE is implicit, and all that the tagger needs
to pass for mgmt_route is the destination MAC address, which it has.

The tagger does need somewhere to keep the queue of frames to be sent
and its workqueue. I would probably add a void *tagger_priv to
dsa_switch, and two new methods to dsa_device_ops, .probe and
.release, to allow it to create and destroy what it needs in
tagger_priv.
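The ownership model Andrew sketches here can be illustrated in miniature: the tagger allocates its own private state in a hypothetical .probe hook and tears it down in .release. None of these names exist in mainline as shown; this only models the suggested lifetime, with heap allocation standing in for a queue plus workqueue.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model of the suggestion: a per-switch tagger_priv pointer owned
 * by the tagging protocol, created in .probe and freed in .release. */
struct dsa_switch_sketch {
	void *tagger_priv;		/* owned by the tagger */
};

struct tagger_state {
	int frames_queued;		/* stand-in for a frame queue + workqueue */
};

static int tagger_probe(struct dsa_switch_sketch *ds)
{
	struct tagger_state *st = calloc(1, sizeof(*st));

	if (!st)
		return -1;
	ds->tagger_priv = st;
	return 0;
}

static void tagger_release(struct dsa_switch_sketch *ds)
{
	free(ds->tagger_priv);
	ds->tagger_priv = NULL;
}
```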

> > > +#include "dsa_priv.h"
> > > +
> > > +/* Similar to is_link_local_ether_addr(hdr->h_dest) but also covers PTP */
> > > +static inline bool sja1105_is_link_local(const struct sk_buff *skb)
> > > +{
> > > +     const struct ethhdr *hdr = eth_hdr(skb);
> > > +     u64 dmac = ether_addr_to_u64(hdr->h_dest);
> > > +
> > > +     if ((dmac & SJA1105_LINKLOCAL_FILTER_A_MASK) ==
> > > +                 SJA1105_LINKLOCAL_FILTER_A)
> > > +             return true;
> > > +     if ((dmac & SJA1105_LINKLOCAL_FILTER_B_MASK) ==
> > > +                 SJA1105_LINKLOCAL_FILTER_B)
> > > +             return true;
> > > +     return false;
> > > +}
> > > +
> > > +static bool sja1105_filter(const struct sk_buff *skb, struct net_device *dev)
> > > +{
> > > +     if (sja1105_is_link_local(skb))
> > > +             return true;
> > > +     if (!dev->dsa_ptr->vlan_filtering)
> > > +             return true;
> > > +     return false;
> > > +}
> >
> > Please add a comment here about what frames cannot be handled by the
> > tagger. However, i'm not too happy about this design...
> >

> What would you improve about this design (assuming you're talking
> about the filter function)?

I want to understand what frames get passed via the master device, and
how ultimately they get to where they should be going.

Once I understand what sort of frames they are and what is
generating/consuming them, maybe we can find a better solution which
preserves the DSA concepts.

To me, it looks like they are not management frames, at least not BPDU
or PTP, since they are link local. If VLAN filtering is off, the VLAN
tag tells us which port they came in, so we can strip the tag and pass
them to the correct slave.

So it looks like real user frames with a VLAN tag are getting passed
to the master device. So I then assume you put VLAN interfaces on top
of the master device, and your application then uses the VLAN
interfaces? Your application does not care where the frames came from,
it is using the switch as a dumb switch. The DSA slaves are unused?

Could we enforce that a VLAN can only be assigned to a single port?
The tagger could then pass the tagged frame to the correct slave? Is
that too restrictive for your use case? Do you need the same VLAN on
multiple ports?

	 Andrew
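The sja1105_filter() helper quoted earlier in this message traps frames by DMAC prefix. A standalone version of that DMAC check, operating on the u64 form of the address: the 802.1D reserved range 01:80:C2:xx:xx:xx and the PTP range 01:1B:19:xx:xx:xx. The mask values are assumed from the quoted patch context (top three octets matched); they are not verified against the merged driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* DMAC prefixes treated as link-local by the quoted filter (assumed
 * constants): 802.1D reserved addresses and the PTP multicast range. */
#define LINKLOCAL_FILTER_A	0x0180C2000000ULL
#define LINKLOCAL_FILTER_A_MASK	0xFFFFFF000000ULL
#define LINKLOCAL_FILTER_B	0x011B19000000ULL
#define LINKLOCAL_FILTER_B_MASK	0xFFFFFF000000ULL

static bool is_link_local_dmac(uint64_t dmac)
{
	return (dmac & LINKLOCAL_FILTER_A_MASK) == LINKLOCAL_FILTER_A ||
	       (dmac & LINKLOCAL_FILTER_B_MASK) == LINKLOCAL_FILTER_B;
}
```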


* Re: [PATCH v3 net-next 18/24] net: dsa: sja1105: Add support for traffic through standalone ports
  2019-04-13 22:26         ` Vladimir Oltean
@ 2019-04-14 16:17           ` Andrew Lunn
  2019-04-14 18:53             ` Vladimir Oltean
  0 siblings, 1 reply; 68+ messages in thread
From: Andrew Lunn @ 2019-04-14 16:17 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Florian Fainelli, vivien.didelot, davem, netdev, linux-kernel,
	Georg Waibel

> > > > > +             return dsa_8021q_xmit(skb, netdev, ETH_P_EDSA,
> > > > > +                                  ((pcp << VLAN_PRIO_SHIFT) | tx_vid));
> > > >
> > > > Please don't reuse ETH_P_EDSA. Define an ETH_P_SJA1105.
> > > >
> 
> I'm receiving contradictory advice on this. Why should I define a new
> ethertype, and if I do, what scope should the definition have (local
> to the driver and the tagger, local to DSA, UAPI)?

ETH_P_EDSA has a well defined meaning. It is a true global EtherType,
and means a Marvell EtherType DSA header follows.

You are polluting this meaning of ETH_P_EDSA. Would you put ETH_P_IP
or ETH_P_8021AD here?

   Andrew
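Aside from the EtherType question, the second argument of the quoted dsa_8021q_xmit() call is a 16-bit 802.1Q TCI: PCP in bits 15..13 (VLAN_PRIO_SHIFT is 13), DEI in bit 12, VID in bits 11..0. A sketch of that packing, with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

#define VLAN_PRIO_SHIFT	13
#define VLAN_VID_MASK	0x0fff

/* Compose a TCI as in the quoted (pcp << VLAN_PRIO_SHIFT) | tx_vid
 * expression: PCP in bits 15..13, VID in bits 11..0, DEI left at 0. */
static uint16_t tci_compose(uint8_t pcp, uint16_t vid)
{
	return (uint16_t)((pcp << VLAN_PRIO_SHIFT) | (vid & VLAN_VID_MASK));
}

static uint16_t tci_vid(uint16_t tci)
{
	return tci & VLAN_VID_MASK;
}

static uint8_t tci_pcp(uint16_t tci)
{
	return (uint8_t)(tci >> VLAN_PRIO_SHIFT);
}
```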


* Re: [PATCH v3 net-next 18/24] net: dsa: sja1105: Add support for traffic through standalone ports
  2019-04-14 16:05       ` Andrew Lunn
@ 2019-04-14 18:42         ` Vladimir Oltean
  2019-04-14 19:06           ` Andrew Lunn
  0 siblings, 1 reply; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-14 18:42 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Florian Fainelli, vivien.didelot, davem, netdev, linux-kernel,
	Georg Waibel

On Sun, 14 Apr 2019 at 19:05, Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Sun, Apr 14, 2019 at 12:27:50AM +0300, Vladimir Oltean wrote:
> > > > +             mgmt_route.macaddr = ether_addr_to_u64(hdr->h_dest);
> > > > +             mgmt_route.destports = BIT(port);
> > > > +             mgmt_route.enfport = 1;
> > > > +             mgmt_route.tsreg = 0;
> > > > +             mgmt_route.takets = false;
> > > > +
> > > > +             rc = sja1105_dynamic_config_write(priv, BLK_IDX_MGMT_ROUTE,
> > > > +                                               port, &mgmt_route, true);
>
> >         rc = sja1105_dynamic_config_write(priv, BLK_IDX_MGMT_ROUTE,
> >                           *port*, &mgmt_route, true);
> >
> > The switch IP aptly allocates 4 slots for management routes. And it's
> > a 5-port switch where 1 port is the management port. I think the
> > structure is fine.
>
> So does the hardware look over all the slots and find the first one
> which has a matching mgmt_route.macaddr destination MAC address? You
> wait for the enfport to be cleared. I assume a slot with enfport
> cleared is not active and won't match?
>
> So we need to consider if there is a race condition where we have
> multiple slots with the same destination MAC address, but different
> destination ports? Say the bridge sends out BPDU to all ports of a
> bridge in quick succession.
>
> These work queues run in any order, and can sleep. Can we get into a
> situation where we get the two slots setup, and then the frames sent
> in reverse order? The match then happens backwards, and the frames get
> sent out the wrong port?
>

Yes, it looks like the hardware isn't doing me any favors on this one.
From UM10944: "If the host provides several management route entries
with identical values for the MACADDR, the one at the lowest index is
used first."
So the 4 hardware management slots serve no purpose unless I'm willing
to lock the sja1105_xmit_work_handler with a mutex. And even then
there's no reason to use separate slots since the workers are
serialized anyway. Weird.
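The race Andrew describes follows directly from the quoted UM10944 rule: among active slots with the same MACADDR, the lowest index always wins, so two in-flight frames to different ports cannot be told apart by slot. A toy model of that matching rule (field names mirror the posted driver's mgmt_route entry, but this is a simulation, not the hardware):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_MGMT_SLOTS	4

/* Simulated management route slot; enfport set means the slot is active. */
struct mgmt_slot {
	uint64_t macaddr;
	bool enfport;
};

/* UM10944 rule: of all active slots matching the DMAC, the one at the
 * lowest index is used first. Returns the slot index, or -1 if none. */
static int mgmt_match(const struct mgmt_slot *slots, uint64_t dmac)
{
	int i;

	for (i = 0; i < NUM_MGMT_SLOTS; i++)
		if (slots[i].enfport && slots[i].macaddr == dmac)
			return i;
	return -1;
}
```

Two slots programmed with the same BPDU DMAC both resolve to slot 0, which is why the slots buy nothing without serializing the xmit handler.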

> Or say the two slots are setup, the two frames are sent in order, but
> the stack decided to drop the first frame because buffers are
> full. Can the second frame make it to the switch and match on the
> first slot and go out the wrong port?
>

Yes; if waiting for enfport times out, there's some cleanup work I'm
currently not doing.

> > > Also, please move all this code into the tagger. Just add exports for
> > > sja1105_dynamic_config_write() and sja1105_dynamic_config_read().
> > >
> >
> > Well, you see, the tagger code is part of the dsa_core object. If I
> > export function symbols from the driver, those still won't be there if
> > I compile the driver as a module. On the other hand, the way I'm doing
> > it, I think the schedule_work() gives me a pretty good separation.
>
> That is solvable via Kconfig, don't allow it to be built as a module.
>
> Also, DSA has been very successful, we keep getting more switches from
> different vendors, and hence more taggers. So at some point, we should
> turn the taggers into modules. I'm not saying that should happen now,
> but when it does happen, this driver can then become a module.
>
> The real reason is, taggers are all about handling frames, whereas
> drivers are all about configuring the switch. The majority of this
> code is about frames, so it belongs in the tagger.
>

The xmit worker needs to configure the switch to be able to handle
frames. Why doesn't that belong in the driver?
Not allowing the driver to be built as module is hardly any cleaner
than a schedule_work().
I only need dsa_slave_to_master for the delayed enqueue. And if DSA
supported a delayed enqueue method natively I wouldn't need it at all.

> > > > +#include <linux/etherdevice.h>
> > > > +#include <linux/if_vlan.h>
> > > > +#include <linux/dsa/sja1105.h>
> > > > +#include "../../drivers/net/dsa/sja1105/sja1105.h"
> > >
> > > Again, no, don't do this.
> > >
> >
> > This separation between driver and tagger is fairly arbitrary.
> > I need access to the driver's private structure, in order to get a
> > hold of the private shadow of the dsa_port. Moving the driver private
> > structure to include/linux/dsa/ would pull in quite a number of
> > dependencies. Maybe I could provide declarations for the most of them,
> > but anyway the private structure wouldn't be so private any longer,
> > would it?
> > Otherwise put, would you prefer a dp->priv similar to the already
> > existing ds->priv? struct sja1105_port is much more lightweight to
> > keep in include/linux/dsa/.
>
> Linux simply does not make use of relative paths going between
> directories like this. That is the key point here. Whatever you need
> to share between the tagger and the driver has to be put into
> include/linux/dsa/.
>
> Assuming we are just exporting something like
> sja1105_dynamic_config_write() and _read()
>
>
> > > > +             rc = sja1105_dynamic_config_write(priv, BLK_IDX_MGMT_ROUTE,
> > > > +                                               port, &mgmt_route, true);
>
> priv can be replaced with ds, which the tagger has. port is
> known. BLK_IDX_MGMT_ROUTE is implicit, and all that the tagger needs
> to pass for mgmt_route is the destination MAC address, which it has.
>
> The tagger does need somewhere to keep the queue of frames to be sent
> and its workqueue. I would probably add a void *tagger_priv to
> dsa_switch, and two new methods to dsa_device_ops, .probe and
> .release, to allow it to create and destroy what it needs in
> tagger_priv.
>

I need to think about this.

> > > > +#include "dsa_priv.h"
> > > > +
> > > > +/* Similar to is_link_local_ether_addr(hdr->h_dest) but also covers PTP */
> > > > +static inline bool sja1105_is_link_local(const struct sk_buff *skb)
> > > > +{
> > > > +     const struct ethhdr *hdr = eth_hdr(skb);
> > > > +     u64 dmac = ether_addr_to_u64(hdr->h_dest);
> > > > +
> > > > +     if ((dmac & SJA1105_LINKLOCAL_FILTER_A_MASK) ==
> > > > +                 SJA1105_LINKLOCAL_FILTER_A)
> > > > +             return true;
> > > > +     if ((dmac & SJA1105_LINKLOCAL_FILTER_B_MASK) ==
> > > > +                 SJA1105_LINKLOCAL_FILTER_B)
> > > > +             return true;
> > > > +     return false;
> > > > +}
> > > > +
> > > > +static bool sja1105_filter(const struct sk_buff *skb, struct net_device *dev)
> > > > +{
> > > > +     if (sja1105_is_link_local(skb))
> > > > +             return true;
> > > > +     if (!dev->dsa_ptr->vlan_filtering)
> > > > +             return true;
> > > > +     return false;
> > > > +}
> > >
> > > Please add a comment here about what frames cannot be handled by the
> > > tagger. However, i'm not too happy about this design...
> > >
>
> > What would you improve about this design (assuming you're talking
> > about the filter function)?
>
> I want to understand what frames get passed via the master device, and
> how ultimately they get to where they should be going.
>
> Once I understand what sort of frames they are and what is
> generating/consuming them, maybe we can find a better solution which
> preserves the DSA concepts.
>
> To me, it looks like they are not management frames, at least not BPDU
> or PTP, since they are link local. If VLAN filtering is off, the VLAN
> tag tells us which port they came in, so we can strip the tag and pass
> them to the correct slave.
>
> So it looks like real user frames with a VLAN tag are getting passed
> to the master device. So I then assume you put VLAN interfaces on top
> of the master device, and your application then uses the VLAN
> interfaces? Your application does not care where the frames came from,
> it is using the switch as a dumb switch. The DSA slaves are unused?
>

Whatever the application may be, the DSA solution to switches that
can't decode all incoming traffic is to drop the rest. In this case it
means that the host port is no longer a valid destination for the L2
switching process.

> Could we enforce that a VLAN can only be assigned to a single port?
> The tagger could then pass the tagged frame to the correct slave? Is
> that too restrictive for your use case? Do you need the same VLAN on
> multiple ports?
>
>          Andrew

No we can't enforce that. The commit message of 07/24 has a pretty
lengthy explanation why.

Thanks,
-Vladimir


* Re: [PATCH v3 net-next 18/24] net: dsa: sja1105: Add support for traffic through standalone ports
  2019-04-14 16:17           ` Andrew Lunn
@ 2019-04-14 18:53             ` Vladimir Oltean
  2019-04-14 19:13               ` Andrew Lunn
  0 siblings, 1 reply; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-14 18:53 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Florian Fainelli, vivien.didelot, davem, netdev, linux-kernel,
	Georg Waibel

On Sun, 14 Apr 2019 at 19:18, Andrew Lunn <andrew@lunn.ch> wrote:
>
> > > > > > +             return dsa_8021q_xmit(skb, netdev, ETH_P_EDSA,
> > > > > > +                                  ((pcp << VLAN_PRIO_SHIFT) | tx_vid));
> > > > >
> > > > > Please don't reuse ETH_P_EDSA. Define an ETH_P_SJA1105.
> > > > >
> >
> > I'm receiving contradictory advice on this. Why should I define a new
> > ethertype, and if I do, what scope should the definition have (local
> > to the driver and the tagger, local to DSA, UAPI)?
>
> ETH_P_EDSA has a well defined meaning. It is a true global EtherType,
> and means a Marvell EtherType DSA header follows.
>
> You are polluting this meaning of ETH_P_EDSA. Would you put ETH_P_IP
> or ETH_P_8021AD here?
>
>    Andrew

You are putting an equality sign here between things that are not quite equal.
The MEDSA EtherType is used for the same purpose as what I'm using it for.
The only situation when I can receive ETH_P_EDSA frames is if somebody
designed a system with a cascaded SJA1105 and a MV88E6xx. I think
that's unlikely but I might be wrong.
Don't get me wrong, I could use literally any EtherType and that's
exactly why I'm reluctant to define a new one.
The only thing is that if I pick an EtherType smaller than 1500
(LLC/SNAP) like ETH_P_XDSA (or even zero works), then I get the
hardware incrementing the n_sizeerr counter for each received tagged
frame (it doesn't drop it, though).

-Vladimir
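The 1500-byte boundary Vladimir observes is the IEEE 802.3 type/length rule: a value of 1500 (0x05DC) or below in the Ethernet type/length field is a payload length (LLC/SNAP framing), and only values of 0x0600 (1536) and above name a protocol, which is presumably why a small "EtherType" makes the switch bump its size-error counter. A standalone sketch of that rule:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ETH_P_802_3_MIN	0x0600	/* smallest valid EtherType (1536) */

/* Per IEEE 802.3, values <= 1500 in the type/length field are lengths,
 * not EtherTypes; only values >= 0x0600 identify a protocol. */
static bool is_ethertype(uint16_t type_or_len)
{
	return type_or_len >= ETH_P_802_3_MIN;
}
```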


* Re: [PATCH v3 net-next 18/24] net: dsa: sja1105: Add support for traffic through standalone ports
  2019-04-14 18:42         ` Vladimir Oltean
@ 2019-04-14 19:06           ` Andrew Lunn
  0 siblings, 0 replies; 68+ messages in thread
From: Andrew Lunn @ 2019-04-14 19:06 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Florian Fainelli, vivien.didelot, davem, netdev, linux-kernel,
	Georg Waibel

> Yes, it looks like the hardware isn't doing me any favors on this one.
> From UM10944: "If the host provides several management route entries
> with identical values for the MACADDR, the one at the lowest index is
> used first."
> So the 4 hardware management slots serve no purpose unless I'm willing
> to lock the sja1105_xmit_work_handler with a mutex. And even then
> there's no reason to use separate slots since the workers are
> serialized anyway. Weird.

So you can simplify this down to just using one queue. Maybe even one
queue for all instances of this switch. You can then keep it all as
static private structures in the tag driver, maybe not even needing
ds->tag_priv.

> > Also, DSA has been very successful, we keep getting more switches from
> > different vendors, and hence more taggers. So at some point, we should
> > turn the taggers into modules. I'm not saying that should happen now,
> > but when it does happen, this driver can then become a module.

FYI: I started on this already. There might be patches today to allow
the tag drivers to be kernel modules.

    Andrew


* Re: [PATCH v3 net-next 18/24] net: dsa: sja1105: Add support for traffic through standalone ports
  2019-04-14 18:53             ` Vladimir Oltean
@ 2019-04-14 19:13               ` Andrew Lunn
  2019-04-14 22:30                 ` Vladimir Oltean
  0 siblings, 1 reply; 68+ messages in thread
From: Andrew Lunn @ 2019-04-14 19:13 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Florian Fainelli, vivien.didelot, davem, netdev, linux-kernel,
	Georg Waibel

On Sun, Apr 14, 2019 at 09:53:42PM +0300, Vladimir Oltean wrote:
> On Sun, 14 Apr 2019 at 19:18, Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > > > > > > +             return dsa_8021q_xmit(skb, netdev, ETH_P_EDSA,
> > > > > > > +                                  ((pcp << VLAN_PRIO_SHIFT) | tx_vid));
> > > > > >
> > > > > > Please don't reuse ETH_P_EDSA. Define an ETH_P_SJA1105.
> > > > > >
> > >
> > > I'm receiving contradictory advice on this. Why should I define a new
> > > ethertype, and if I do, what scope should the definition have (local
> > > to the driver and the tagger, local to DSA, UAPI)?
> >
> > ETH_P_EDSA has a well defined meaning. It is a true global EtherType,
> > and means a Marvell EtherType DSA header follows.
> >
> > You are polluting this meaning of ETH_P_EDSA. Would you put ETH_P_IP
> > or ETH_P_8021AD here?
> >
> >    Andrew
> 
> You are putting an equality sign here between things that are not quite equal.
> The MEDSA EtherType is used for the same purpose as what I'm using it for.

I don't think it is. tcpdump will match on this EtherType and decode
what follows as an EDSA header, just in the same way it matches a
ETH_P_IP and decodes what comes next as an IP packet. I also have
wireshark patches, which i never submitted, which do the same.

Please run tcpdump on the master interface with your test system and
see what it does.

    Andrew


* Re: [PATCH v3 net-next 18/24] net: dsa: sja1105: Add support for traffic through standalone ports
  2019-04-14 19:13               ` Andrew Lunn
@ 2019-04-14 22:30                 ` Vladimir Oltean
  2019-04-15  3:07                   ` Andrew Lunn
  0 siblings, 1 reply; 68+ messages in thread
From: Vladimir Oltean @ 2019-04-14 22:30 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Florian Fainelli, vivien.didelot, davem, netdev, linux-kernel,
	Georg Waibel

On Sun, 14 Apr 2019 at 22:13, Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Sun, Apr 14, 2019 at 09:53:42PM +0300, Vladimir Oltean wrote:
> > On Sun, 14 Apr 2019 at 19:18, Andrew Lunn <andrew@lunn.ch> wrote:
> > >
> > > > > > > > +             return dsa_8021q_xmit(skb, netdev, ETH_P_EDSA,
> > > > > > > > +                                  ((pcp << VLAN_PRIO_SHIFT) | tx_vid));
> > > > > > >
> > > > > > > Please don't reuse ETH_P_EDSA. Define an ETH_P_SJA1105.
> > > > > > >
> > > >
> > > > I'm receiving contradictory advice on this. Why should I define a new
> > > > ethertype, and if I do, what scope should the definition have (local
> > > > to the driver and the tagger, local to DSA, UAPI)?
> > >
> > > ETH_P_EDSA has a well defined meaning. It is a true global EtherType,
> > > and means a Marvell EtherType DSA header follows.
> > >
> > > You are polluting this meaning of ETH_P_EDSA. Would you put ETH_P_IP
> > > or ETH_P_8021AD here?
> > >
> > >    Andrew
> >
> > You are putting an equality sign here between things that are not quite equal.
> > The MEDSA EtherType is used for the same purpose as what I'm using it for.
>
> I don't think it is. tcpdump will match on this EtherType and decode
> what follows as an EDSA header, just in the same way it matches a
> ETH_P_IP and decodes what comes next as an IP packet. I also have
> wireshark patches, which i never submitted, which do the same.
>
> Please run tcpdump on the master interface with your test system and
> see what it does.
>
>     Andrew

It fails to decode the frames, obviously. But so does any other EtherType.
Florian was hinting
(https://lwn.net/ml/netdev/b52f4cdf-edcf-0757-1d6e-d4a831ec7943@gmail.com/)
at the recent pull requests submitted to tcpdump and libpcap that make
it possible to decode based on the string in
/sys/class/net/${master}/dsa/tagging. I admit I haven't actually
tested or studied those closely yet (there are more important things
to focus on ATM), but since my driver returns "8021q" in sysfs and
yours returns "edsa", I would presume tcpdump could use that
information. Anyway, since you obviously know more on this topic than
I do, please help me understand what the real problems are in spoofing
the EtherType as a Marvell one.

Thanks,
-Vladimir


* Re: [PATCH v3 net-next 18/24] net: dsa: sja1105: Add support for traffic through standalone ports
  2019-04-14 22:30                 ` Vladimir Oltean
@ 2019-04-15  3:07                   ` Andrew Lunn
  2019-04-17  0:09                     ` Florian Fainelli
  0 siblings, 1 reply; 68+ messages in thread
From: Andrew Lunn @ 2019-04-15  3:07 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Florian Fainelli, vivien.didelot, davem, netdev, linux-kernel,
	Georg Waibel

> It fails to decode the frames, obviously. But so does any other EtherType.

> Florian was hinting
> (https://lwn.net/ml/netdev/b52f4cdf-edcf-0757-1d6e-d4a831ec7943@gmail.com/)
> at the recent pull requests submitted to tcpdump and libpcap that make
> it possible to decode based on the string in
> /sys/class/net/${master}/dsa/tagging. I admit I haven't actually
> tested or studied those closely yet (there are more important things
> to focus on ATM), but since my driver returns "8021q" in sysfs and
> yours returns "edsa", I would presume tcpdump could use that
> information.

No, it does not. It is a valid EtherType; that is what is used to
trigger the decoding. It takes no notice of what is in
/sys/class/net/${master}/dsa/tagging, nor the extra meta-data added to
the pcap file. There is no need. You can identify it is a Marvell EDSA
header from the EtherType.

In fact, this tcpdump code for decoding EDSA pre-dated Florian's
patches by a few years.

You only need the code which Florian added when you cannot identify
the header directly from the packet. And that is true for most of the
tagging protocols. But EDSA you can.

> Anyway, since you obviously know more on this topic than I do,
> please make me understand what are the real problems in spoofing the
> Ethertype as a Marvell one.

Despite there being an EDSA EtherType in the frame, what follows is
not an EDSA header. It is like having the IPv4 EtherType but what
follows is not an IP header. Broken.

    Andrew


* Re: [PATCH v3 net-next 05/24] net: dsa: Add more convenient functions for installing port VLANs
  2019-04-13  1:28 ` [PATCH v3 net-next 05/24] net: dsa: Add more convenient functions for installing port VLANs Vladimir Oltean
@ 2019-04-16 23:49   ` Florian Fainelli
  0 siblings, 0 replies; 68+ messages in thread
From: Florian Fainelli @ 2019-04-16 23:49 UTC (permalink / raw)
  To: Vladimir Oltean, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel



On 12/04/2019 18:28, Vladimir Oltean wrote:
> This hides the need to perform a two-phase transaction and construct a
> switchdev_obj_port_vlan struct.
> 
> Call graph (including a function that will be introduced in a follow-up
> patch) looks like this now (same for the *_vlan_del function):
> 
> dsa_slave_vlan_rx_add_vid   dsa_port_setup_8021q_tagging
>              |                        |
>              |                        |
>              |          +-------------+
>              |          |
>              v          v
>             dsa_port_vid_add      dsa_slave_port_obj_add
>                    |                         |
>                    +-------+         +-------+
>                            |         |
>                            v         v
>                         dsa_port_vlan_add
> 
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian


* Re: [PATCH v3 net-next 09/24] net: dsa: b53: Let DSA handle mismatched VLAN filtering settings
  2019-04-13  1:28 ` [PATCH v3 net-next 09/24] net: dsa: b53: Let DSA handle mismatched VLAN filtering settings Vladimir Oltean
@ 2019-04-16 23:52   ` Florian Fainelli
  0 siblings, 0 replies; 68+ messages in thread
From: Florian Fainelli @ 2019-04-16 23:52 UTC (permalink / raw)
  To: Vladimir Oltean, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel



On 12/04/2019 18:28, Vladimir Oltean wrote:
> The DSA core is now able to do this check prior to calling the
> .port_vlan_filtering callback, so tell it that VLAN filtering is global
> for this particular hardware.
> 
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> Suggested-by: Florian Fainelli <f.fainelli@gmail.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian


* Re: [PATCH v3 net-next 08/24] net: dsa: Be aware of switches where VLAN filtering is a global setting
  2019-04-13  1:28 ` [PATCH v3 net-next 08/24] net: dsa: Be aware of switches where VLAN filtering is a global setting Vladimir Oltean
@ 2019-04-16 23:54   ` Florian Fainelli
  0 siblings, 0 replies; 68+ messages in thread
From: Florian Fainelli @ 2019-04-16 23:54 UTC (permalink / raw)
  To: Vladimir Oltean, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel



On 12/04/2019 18:28, Vladimir Oltean wrote:
> On some switches, the action of whether to parse VLAN frame headers and use
> that information for ingress admission is configurable, but not per
> port. Such is the case for the Broadcom BCM53xx and the NXP SJA1105
> families, for example. In that case, DSA can prevent the bridge core
> from trying to apply different VLAN filtering settings on net devices
> that belong to the same switch.
> 
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> Suggested-by: Florian Fainelli <f.fainelli@gmail.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian


* Re: [PATCH v3 net-next 10/24] net: dsa: Unset vlan_filtering when ports leave the bridge
  2019-04-13  1:28 ` [PATCH v3 net-next 10/24] net: dsa: Unset vlan_filtering when ports leave the bridge Vladimir Oltean
  2019-04-13 15:11   ` Andrew Lunn
@ 2019-04-16 23:59   ` Florian Fainelli
  1 sibling, 0 replies; 68+ messages in thread
From: Florian Fainelli @ 2019-04-16 23:59 UTC (permalink / raw)
  To: Vladimir Oltean, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel



On 12/04/2019 18:28, Vladimir Oltean wrote:
> When ports are standalone (after they left the bridge), they should have
> no VLAN filtering semantics (they should pass all traffic to the CPU).

The rest of the commit message is fine, but this particular sentence 
sort of conflicts with the fact that we set NETIF_F_HW_VLAN_CTAG_FILTER 
after my recent changes in order to support standalone ports + bridge 
ports w/ VLAN filtering on. What this feature flag means is that ingress 
VLAN filtering is active/supported.

On that particular topic, this also means that for things like tcpdump to 
keep working on standalone ports while we have bridged ports w/ VLAN 
filtering turned on, we might have to re-configure how the switch 
performs ingress VLAN filtering checks when we have a standalone port 
in promiscuous mode. On BCM53xx we can achieve that by having the CPU 
port be part of all VLANs (there is a shorthand register for that 
purpose) and doing ingress VID checking + copying violating frames to 
the CPU port. Food for thought...

> Currently this is not true for switchdev drivers, because the bridge
> "forgets" to unset that.
> 
> Normally one would think that doing this at the bridge layer would be a
> better idea, i.e. call br_vlan_filter_toggle() from br_del_if(), similar
> to how nbp_vlan_init() is called from br_add_if().
> 
> However what complicates that approach, and makes this one preferable,
> is the fact that for the bridge core, vlan_filtering is a per-bridge
> setting, whereas for switchdev/DSA it is per-port. Also there are
> switches where the setting is per the entire device, and unsetting
> vlan_filtering one by one, for each leaving port, would not be possible
> from the bridge core without a certain level of awareness. So do this in
> DSA and let drivers be unaware of it.
> 
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian


* Re: [PATCH v3 net-next 11/24] net: dsa: mt7530: Let DSA handle the unsetting of vlan_filtering
  2019-04-13  1:28 ` [PATCH v3 net-next 11/24] net: dsa: mt7530: Let DSA handle the unsetting of vlan_filtering Vladimir Oltean
  2019-04-13 15:12   ` Andrew Lunn
@ 2019-04-16 23:59   ` Florian Fainelli
  1 sibling, 0 replies; 68+ messages in thread
From: Florian Fainelli @ 2019-04-16 23:59 UTC (permalink / raw)
  To: Vladimir Oltean, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel



On 12/04/2019 18:28, Vladimir Oltean wrote:
> The driver, recognizing that the .port_vlan_filtering callback was never
> coming after the port left its parent bridge, decided to take that duty
> into its own hands. DSA now takes care of this condition, so fix that.
> 
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian


* Re: [PATCH v3 net-next 18/24] net: dsa: sja1105: Add support for traffic through standalone ports
  2019-04-15  3:07                   ` Andrew Lunn
@ 2019-04-17  0:09                     ` Florian Fainelli
  0 siblings, 0 replies; 68+ messages in thread
From: Florian Fainelli @ 2019-04-17  0:09 UTC (permalink / raw)
  To: Andrew Lunn, Vladimir Oltean
  Cc: vivien.didelot, davem, netdev, linux-kernel, Georg Waibel



On 14/04/2019 20:07, Andrew Lunn wrote:
>> It fails to decode the frames, obviously. But so does any other EtherType.
> 
>> Florian was hinting
>> (https://lwn.net/ml/netdev/b52f4cdf-edcf-0757-1d6e-d4a831ec7943@gmail.com/)
>> at the recent pull requests submitted to tcpdump and libpcap that make
>> it possible to decode based on the string in
>> /sys/class/net/${master}/dsa/tagging. I admit I haven't actually
>> tested or studied those closely yet (there are more important things
>> to focus on ATM), but since my driver returns "8021q" in sysfs and
>> yours returns "edsa", I would presume tcpdump could use that
>> information.
> 
> No, it does not. ETH_P_EDSA is a valid EtherType, and that is what is
> used to trigger the decoding; tcpdump takes no notice of what is in
> /sys/class/net/${master}/dsa/tagging, nor of the extra meta-data added
> to the pcap file. There is no need: you can identify a Marvell EDSA
> header from the EtherType alone.
> 
> In fact, the tcpdump code for decoding EDSA pre-dates Florian's
> patches by a few years.
> 
> You only need the code which Florian added when you cannot identify
> the header directly from the packet. And that is true for most of the
> tagging protocols. But EDSA you can.

Correct.

> 
>> Anyway, since you obviously know more on this topic than I do,
>> please make me understand what are the real problems in spoofing the
>> Ethertype as a Marvell one.
> 
> Despite there being an EDSA EtherType in the frame, what follows is
> not an ESDA header. It is like having the IPv4 EtherType but what
> following is not an IP header. Broken.
I suppose this is a valid point, but in that case any EtherType would do; 
technically, ETH_P_EDSA is just one of the many possible choices for 
configuring the Marvell EDSA EtherType. You just need to pick one that is 
not going to trick the switch into thinking this is an invalid LLC/SNAP 
frame.

With Vivien's latest tcpdump changes, I don't think we need to rely on 
ETH_P_EDSA being present anyway, since the kernel tells us the tagging 
protocol (when available).
-- 
Florian


* Re: [PATCH v3 net-next 18/24] net: dsa: sja1105: Add support for traffic through standalone ports
  2019-04-13 21:27     ` Vladimir Oltean
  2019-04-13 22:08       ` Vladimir Oltean
  2019-04-14 16:05       ` Andrew Lunn
@ 2019-04-17  0:16       ` Florian Fainelli
  2 siblings, 0 replies; 68+ messages in thread
From: Florian Fainelli @ 2019-04-17  0:16 UTC (permalink / raw)
  To: Vladimir Oltean, Andrew Lunn
  Cc: vivien.didelot, davem, netdev, linux-kernel, Georg Waibel



On 13/04/2019 14:27, Vladimir Oltean wrote:
> 
> Ok, let's put this another way.
> A switch is primarily a device used to offload the forwarding of
> traffic based on L2 rules. Additionally there may be some management
> traffic for stuff like STP that needs to be terminated on the host
> port of the switch. For that, the hardware's job is to filter and tag
> management frames on their way to the host port, and the software's
> job is to process the source port and switch id information in a
> meaningful way.
> Now both this particular switch hardware, and DSA, are taking the
> above definitions to extremes.
> The switch says: "that's all you want to see? ok, so that's all I'm
> going to give you". So its native (hardware) tagging protocol is to
> trap link-local traffic and overwrite two bytes of its destination MAC
> with the switch ID and the source port. No more, no less. It is an
> incomplete solution, but it does the job for practical use cases.

Indeed.

> Now DSA says: "I want these to be fully capable net devices, I want
> the user to not even realize what's going on under the hood". I don't
> think that terminating iperf traffic through switch ports is a
> realistic usage scenario. So in a way discussions about performance
> and optimizations on DSA hotpath are slightly pointless IMO.

Actually it is: on the Broadcom devices that I directly or indirectly 
helped support with bcm_sf2/b53, we have 2Gb/sec-capable management ports 
and we run iperf directly on the host CPUs. Some ports remain standalone 
(e.g.: WAN) and the others can be bridged together (LAN + WLAN).

> Now what my driver says is that it offers a bit of both. It speaks the
> hardware's tagging protocol so it is capable of management traffic,
> but it also speaks the DSA paradigm, so in a way pushes the hardware
> to work in a mode it was never intended to, by repurposing VLANs when
> the user doesn't request them. So on one hand there is some overlap
> between the hardware tagging protocol and the VLAN one (in standalone
> mode and in VLAN-unaware bridged mode, management traffic *could* use
> VLAN tagging but it doesn't rely on it), and on the other hand the
> reunion of the two tagging protocols is decent, but still doesn't
> cover the entire spectrum (when put under a VLAN-aware bridge, you
> lose the ability to decode general traffic). So you'd better not rely
> on VLANs to decode the management traffic, because you won't be able
> to always rely on that, and that is a shame since a bridge with both
> vlan_filtering 1 and stp_state 1 is a real usage scenario, and the
> hardware is capable of that combination.
> But all of that is secondary. Let's forget about VLAN tagging for a
> second and concentrate on the tagging of management traffic. The
> limiting factor here is the software architecture of DSA, because in
> order for me to decode that in the driver/tagger, I'd have to drop
> everything else coming on the master net device (I explained in 13/24
> why). I believe that DSA being all-or-nothing about switch tagging is
> turning a blind eye to the devices that don't go overboard with
> features, and give you what's needed in a real-world design but not
> much else.

I would word it differently and say that up until now, whatever DSA 
assumed about switches was supportable, and with the sja1105 we are faced 
with an interesting test of the limits of both designs. I don't think DSA 
is unreasonable in assuming that management frames are always tagged with 
a proprietary switch protocol, because that is what has happened across a 
wide variety of vendors. The NXP SJA1105 is not unreasonable either, but 
it does present some challenges.

> What would you improve about this design (assuming you're talking
> about the filter function)?

Would assigning different MAC addresses to each standalone port help in 
any way such that you could leverage filtering in HW based on MAC DA?
-- 
Florian


* Re: [PATCH v3 net-next 23/24] Documentation: net: dsa: Add details about NXP SJA1105 driver
  2019-04-13  1:28 ` [PATCH v3 net-next 23/24] Documentation: net: dsa: Add details about NXP SJA1105 driver Vladimir Oltean
@ 2019-04-17  0:20   ` Florian Fainelli
  0 siblings, 0 replies; 68+ messages in thread
From: Florian Fainelli @ 2019-04-17  0:20 UTC (permalink / raw)
  To: Vladimir Oltean, vivien.didelot, andrew, davem
  Cc: netdev, linux-kernel, georg.waibel



On 12/04/2019 18:28, Vladimir Oltean wrote:

[snip]

> +Segregating the switch ports in multiple bridges is supported (e.g. 2 + 2), but
> +all bridges should have the same level of VLAN awareness (either both have
> +``vlan_filtering`` 0, or both 1). Also an inevitable limitation of the fact
> +that VLAN awareness is global at the switch level is that once a bridge with
> +``vlan_filtering`` enslaves at least one switch port, the other un-bridged
> +ports are no longer available for standalone traffic termination.

That is quite a limitation that I don't think I had fully grasped until 
reading your different patches. Since enslaving ports into a bridge 
comes after the network device was already made available for use, maybe 
you should force the carrier down or something along those lines as soon 
as a port is enslaved into a bridge with vlan_filtering=1 to make this 
more predictable for the user?
-- 
Florian


end of thread, other threads:[~2019-04-17  0:20 UTC | newest]

Thread overview: 68+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-13  1:27 [PATCH v3 net-next 00/24] NXP SJA1105 DSA driver Vladimir Oltean
2019-04-13  1:27 ` [PATCH v3 net-next 01/24] lib: Add support for generic packing operations Vladimir Oltean
2019-04-13  1:28 ` [PATCH v3 net-next 02/24] net: dsa: Fix pharse -> phase typo Vladimir Oltean
2019-04-13  1:28 ` [PATCH v3 net-next 03/24] net: dsa: Store vlan_filtering as a property of dsa_port Vladimir Oltean
2019-04-13  1:28 ` [PATCH v3 net-next 04/24] net: dsa: mt7530: Use vlan_filtering property from dsa_port Vladimir Oltean
2019-04-13  1:28 ` [PATCH v3 net-next 05/24] net: dsa: Add more convenient functions for installing port VLANs Vladimir Oltean
2019-04-16 23:49   ` Florian Fainelli
2019-04-13  1:28 ` [PATCH v3 net-next 06/24] net: dsa: Call driver's setup callback after setting up its switchdev notifier Vladimir Oltean
2019-04-13 15:05   ` Andrew Lunn
2019-04-13  1:28 ` [PATCH v3 net-next 07/24] net: dsa: Optional VLAN-based port separation for switches without tagging Vladimir Oltean
2019-04-13  1:28 ` [PATCH v3 net-next 08/24] net: dsa: Be aware of switches where VLAN filtering is a global setting Vladimir Oltean
2019-04-16 23:54   ` Florian Fainelli
2019-04-13  1:28 ` [PATCH v3 net-next 09/24] net: dsa: b53: Let DSA handle mismatched VLAN filtering settings Vladimir Oltean
2019-04-16 23:52   ` Florian Fainelli
2019-04-13  1:28 ` [PATCH v3 net-next 10/24] net: dsa: Unset vlan_filtering when ports leave the bridge Vladimir Oltean
2019-04-13 15:11   ` Andrew Lunn
2019-04-16 23:59   ` Florian Fainelli
2019-04-13  1:28 ` [PATCH v3 net-next 11/24] net: dsa: mt7530: Let DSA handle the unsetting of vlan_filtering Vladimir Oltean
2019-04-13 15:12   ` Andrew Lunn
2019-04-16 23:59   ` Florian Fainelli
2019-04-13  1:28 ` [PATCH v3 net-next 12/24] net: dsa: Copy the vlan_filtering setting on the CPU port if it's global Vladimir Oltean
2019-04-13 15:23   ` Andrew Lunn
2019-04-13 15:37     ` Vladimir Oltean
2019-04-13  1:28 ` [PATCH v3 net-next 13/24] net: dsa: Allow drivers to filter packets they can decode source port from Vladimir Oltean
2019-04-13 15:39   ` Andrew Lunn
2019-04-13 15:48     ` Vladimir Oltean
2019-04-13  1:28 ` [PATCH v3 net-next 14/24] net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch Vladimir Oltean
2019-04-13 15:42   ` Andrew Lunn
2019-04-13 15:46     ` Vladimir Oltean
2019-04-13 16:44       ` Andrew Lunn
2019-04-13 21:29         ` Vladimir Oltean
2019-04-13  1:28 ` [PATCH v3 net-next 15/24] net: dsa: sja1105: Add support for FDB and MDB management Vladimir Oltean
2019-04-13 20:58   ` Jiri Pirko
2019-04-13  1:28 ` [PATCH v3 net-next 16/24] net: dsa: sja1105: Add support for VLAN operations Vladimir Oltean
2019-04-13 20:56   ` Jiri Pirko
2019-04-13 21:39     ` Vladimir Oltean
2019-04-13  1:28 ` [PATCH v3 net-next 17/24] net: dsa: sja1105: Add support for ethtool port counters Vladimir Oltean
2019-04-13 20:53   ` Jiri Pirko
2019-04-13 21:55     ` Vladimir Oltean
2019-04-14  8:34       ` Jiri Pirko
2019-04-13  1:28 ` [PATCH v3 net-next 18/24] net: dsa: sja1105: Add support for traffic through standalone ports Vladimir Oltean
2019-04-13 16:37   ` Andrew Lunn
2019-04-13 21:27     ` Vladimir Oltean
2019-04-13 22:08       ` Vladimir Oltean
2019-04-13 22:26         ` Vladimir Oltean
2019-04-14 16:17           ` Andrew Lunn
2019-04-14 18:53             ` Vladimir Oltean
2019-04-14 19:13               ` Andrew Lunn
2019-04-14 22:30                 ` Vladimir Oltean
2019-04-15  3:07                   ` Andrew Lunn
2019-04-17  0:09                     ` Florian Fainelli
2019-04-14 16:05       ` Andrew Lunn
2019-04-14 18:42         ` Vladimir Oltean
2019-04-14 19:06           ` Andrew Lunn
2019-04-17  0:16       ` Florian Fainelli
2019-04-13  1:28 ` [PATCH v3 net-next 19/24] net: dsa: sja1105: Add support for Spanning Tree Protocol Vladimir Oltean
2019-04-13 16:41   ` Andrew Lunn
2019-04-13  1:28 ` [PATCH v3 net-next 20/24] net: dsa: sja1105: Error out if RGMII delays are requested in DT Vladimir Oltean
2019-04-13 16:49   ` Andrew Lunn
2019-04-13 20:47   ` Jiri Pirko
2019-04-13 21:31     ` Vladimir Oltean
2019-04-14  8:35       ` Jiri Pirko
2019-04-13  1:28 ` [PATCH v3 net-next 21/24] net: dsa: sja1105: Prevent PHY jabbering during switch reset Vladimir Oltean
2019-04-13 16:54   ` Andrew Lunn
2019-04-13  1:28 ` [PATCH v3 net-next 22/24] net: dsa: sja1105: Reject unsupported link modes for AN Vladimir Oltean
2019-04-13  1:28 ` [PATCH v3 net-next 23/24] Documentation: net: dsa: Add details about NXP SJA1105 driver Vladimir Oltean
2019-04-17  0:20   ` Florian Fainelli
2019-04-13  1:28 ` [PATCH v3 net-next 24/24] dt-bindings: net: dsa: Add documentation for " Vladimir Oltean

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).