All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol
@ 2023-12-07 22:29 Pierre-Louis Bossart
  2023-12-07 22:29 ` [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description Pierre-Louis Bossart
                   ` (16 more replies)
  0 siblings, 17 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-07 22:29 UTC (permalink / raw)
  To: linux-sound
  Cc: alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou, Pierre-Louis Bossart

This RFC patchset suggests a new API for ASoC codec drivers to use for
firmware/table downloads.

Problem statement:

All existing transfers initiated by codec drivers rely on SoundWire
read/write commands, which can only support ONE byte per frame. With
the typical 48kHz frame rate, this means 384 kbits/s.

In addition, the command/control is typically handled with a FIFO and
interrupts which adds more software overhead. To give a practical
reference, sending 32Kb takes 2.5s on Intel platforms, which means
about 105kbit/s only. Additional non-audio activity is likely to
adversely impact interrupt scheduling and further decrease the
transfer speeds.

New SDCA-based codecs have a need to download tables and DSP firmware
which are typically between 20 and 256 Kb. The slow bus operation has
a direct impact on boot/resume times, and clearly having to wait more
than 300ms is a showstopper in terms of latency requirements and
user-experience.

Suggested solution:

The MIPI specification and most of the new codecs support the Bulk
Transfer Protocol (BTP) and specifically the Bulk Register Access
(BRA) configuration. This mode reclaims the 'audio' data space of the
SoundWire frame to send firmware/coefficients over the DataPort 0
(DP0).

The API suggested is rather simple, with the following sequence
expected:
open(): reserve resources and prepare hardware
send_async(): trigger DMAs and perform SoundWire bank switch
wait(): wait for DMA completion and disable DMAs
close(): release resources

Benefits:

Even after accounting for the protocol overhead, the data can be sent
8x or 16x faster on the link than with the regular commands.

With the use of DMAs, the software overhead becomes limited to the
initialization. Measured results show that transferring the same 32Kb
takes about 100ms, a 25x improvement on the baseline write() commands
with an actual bitrate of 2.6 Mbits/s. These results are a measure of
bus/hardware performance mainly, and will typically not be too
modified by the CPU activity and scheduling.

The performance for reads is similar, with a 25x speedup measured.

Limitations:

Setting up the transfers over DP0 takes time, and the reliance on DMAs
on the host side brings alignment restrictions. The BTP/BRA protocol
is really only relevant for "large" transfers done during boot/resume
*before* audio transfers take place. Mixing BTP/BRA and audio is a
nightmare, this patchset suggests a mutual-exclusion between two
usages.

Scope:

This patchset only exposes the API and a debugfs interface to initiate
commands, validate results and measure performance. The actual use of
the API is left as an exercise for codec driver developers.

This patchset depends on a number of pre-requisite patches and will
not build on top of any for-next branches. The main intent of this RFC
is to gather comments on the usage, API, benefits and restrictions.

The code and functionality was tested on an Intel LunarLake RVP platform
connected to a Realtek RT711-SDCA device.

Acknowledgements:

Thanks to Zeek Tsai at Realtek for providing test sequences that
helped reconcile the data formatted by the host driver with the
expected results on the code side.

Pierre-Louis Bossart (16):
  Documentation: driver: add SoundWire BRA description
  soundwire: cadence: add BTP support for DP0
  soundwire: stream: extend sdw_alloc_stream() to take 'type' parameter
  soundwire: extend sdw_stream_type to BPT
  soundwire: stream: special-case the bus compute_params() routine
  soundwire: stream: reuse existing code for BPT stream
  soundwire: bus: add API for BPT protocol
  soundwire: bus: add bpt_stream pointer
  soundwire: crc8: add constant table
  soundwire: cadence: add BTP/BRA helpers to format data
  soundwire: intel_auxdevice: add indirection for BPT
    open/close/send_async/wait
  ASoC: SOF: Intel: hda-sdw-bpt: add helpers for SoundWire BPT DMA
  soundwire: intel: add BPT context definition
  soundwire: intel_ace2x: add BPT open/close/send_async/wait
  soundwire: debugfs: add interface for BPT/BRA transfers
  ASoC: rt711-sdca: add DP0 support

 Documentation/driver-api/soundwire/bra.rst    | 478 +++++++++++++
 Documentation/driver-api/soundwire/index.rst  |   1 +
 Documentation/driver-api/soundwire/stream.rst |   2 +-
 .../driver-api/soundwire/summary.rst          |   5 +-
 drivers/soundwire/Kconfig                     |   1 +
 drivers/soundwire/Makefile                    |   4 +-
 drivers/soundwire/amd_manager.c               |   2 +-
 drivers/soundwire/bus.c                       |  77 +++
 drivers/soundwire/bus.h                       |  18 +
 drivers/soundwire/cadence_master.c            | 646 +++++++++++++++++-
 drivers/soundwire/cadence_master.h            |  30 +
 drivers/soundwire/crc8.c                      | 277 ++++++++
 drivers/soundwire/crc8.h                      |  11 +
 drivers/soundwire/debugfs.c                   | 122 +++-
 .../soundwire/generic_bandwidth_allocation.c  |  84 ++-
 drivers/soundwire/intel.h                     |  12 +
 drivers/soundwire/intel_ace2x.c               | 377 ++++++++++
 drivers/soundwire/intel_auxdevice.c           |  55 ++
 drivers/soundwire/qcom.c                      |   2 +-
 drivers/soundwire/stream.c                    | 137 +++-
 include/linux/soundwire/sdw.h                 |  91 ++-
 include/linux/soundwire/sdw_intel.h           |  16 +
 include/sound/hda-sdw-bpt.h                   |  76 +++
 sound/soc/codecs/rt711-sdca-sdw.c             |   8 +
 sound/soc/qcom/sdw.c                          |   2 +-
 sound/soc/sof/intel/Kconfig                   |   8 +-
 sound/soc/sof/intel/Makefile                  |   4 +
 sound/soc/sof/intel/hda-sdw-bpt.c             | 328 +++++++++
 28 files changed, 2810 insertions(+), 64 deletions(-)
 create mode 100644 Documentation/driver-api/soundwire/bra.rst
 create mode 100644 drivers/soundwire/crc8.c
 create mode 100644 drivers/soundwire/crc8.h
 create mode 100644 include/sound/hda-sdw-bpt.h
 create mode 100644 sound/soc/sof/intel/hda-sdw-bpt.c

-- 
2.39.2


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description
  2023-12-07 22:29 [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Pierre-Louis Bossart
@ 2023-12-07 22:29 ` Pierre-Louis Bossart
  2023-12-07 23:29   ` Mark Brown
                     ` (2 more replies)
  2023-12-07 22:29 ` [RFC PATCH 02/16] soundwire: cadence: add BTP support for DP0 Pierre-Louis Bossart
                   ` (15 subsequent siblings)
  16 siblings, 3 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-07 22:29 UTC (permalink / raw)
  To: linux-sound
  Cc: alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou, Pierre-Louis Bossart

The Bulk Register Access protocol was left as a TODO topic since
2018. It's time to document this protocol and the design of its Linux
support.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 Documentation/driver-api/soundwire/bra.rst    | 478 ++++++++++++++++++
 Documentation/driver-api/soundwire/index.rst  |   1 +
 .../driver-api/soundwire/summary.rst          |   5 +-
 3 files changed, 480 insertions(+), 4 deletions(-)
 create mode 100644 Documentation/driver-api/soundwire/bra.rst

diff --git a/Documentation/driver-api/soundwire/bra.rst b/Documentation/driver-api/soundwire/bra.rst
new file mode 100644
index 000000000000..4cc934bf614d
--- /dev/null
+++ b/Documentation/driver-api/soundwire/bra.rst
@@ -0,0 +1,478 @@
+==========================
+Bulk Register Access (BRA)
+==========================
+
+Conventions
+-----------
+
+Capitalized words used in this documentation are intentional and refer
+to concepts of the SoundWire 1.x specification.
+
+Introduction
+------------
+
+The SoundWire 1.x specification provides a mechanism to speed-up
+command/control transfers by reclaiming parts of the audio
+bandwidth. The Bulk Register Access (BRA) protocol is a standard
+solution based on the Bulk Payload Transport (BPT) definitions.
+
+The regular control channel uses Column 0 and can only send/retrieve
+one byte per frame with write/read commands. With a typical 48kHz
+frame rate, only 48kB/s can be transferred.
+
+The optional Bulk Register Access capability can transmit up to 12
+Mbits/s and reduce transfer times by several orders of magnitude, but
+has multiple design constraints:
+
+  (1) Each frame can only support a read or a write transfer, with a
+      10-byte overhead per frame (header and footer response).
+
+  (2) The read/writes SHALL be from/to contiguous register addresses
+      in the same frame. A fragmented register space decreases the
+      efficiency of the protocol by requiring multiple BRA transfers
+      scheduled in different frames.
+
+  (3) The targeted Peripheral device SHALL support the optional Data
+      Port 0, and likewise the Manager SHALL expose audio-like Ports
+      to insert BRA packets in the audio payload using the concepts of
+      Sample Interval, HSTART, HSTOP, etc.
+
+  (4) The BRA transport efficiency depends on the available
+      bandwidth. If there are no on-going audio transfers, the entire
+      frame minus Column 0 can be reclaimed for BRA. The frame shape
+      also impacts efficiency: since Column0 cannot be used for
+      BTP/BRA, the frame should rely on a large number of columns and
+      minimize the number of rows. The bus clock should be as high as
+      possible.
+
+  (5) The number of bits transferred per frame SHALL be a multiple of
+      8 bits. Padding bits SHALL be inserted if necessary at the end
+      of the data.
+
+  (6) The regular read/write commands can be issued in parallel with
+      BRA transfers. This is convenient to e.g. deal with alerts, jack
+      detection or change the volume during firmware download, but
+      accessing the same address with two independent protocols has to
+      be avoided to avoid undefined behavior.
+
+  (7) Some implementations may not be capable of handling the
+      bandwidth of the BRA protocol, e.g. in the case of a slow I2C
+      bus behind the SoundWire IP. In this case, the transfers may
+      need to be spaced in time or flow-controlled.
+
+  (8) Each BRA packet SHALL be marked as 'Active' when valid data is
+      to be transmitted. This allows for software to allocate a BRA
+      stream but not transmit/discard data while processing the
+      results or preparing the next batch of data, or allowing the
+      peripheral to deal with the previous transfer. In addition BRA
+      transfer can be started early on without data being ready.
+
+  (9) Up to 470 bytes may be transmitted per frame.
+
+  (10) The address is represented with 32 bits and does not rely on
+       the paging registers used for the regular command/control
+       protocol in Column 0.
+
+
+Error checking
+--------------
+
+Firmware download is one of the key usages of the Bulk Register Access
+protocol. To make sure the binary data integrity is not compromised by
+transmission or programming errors, each BRA packet provides:
+
+  (1) A CRC on the 7-byte header. This CRC helps the Peripheral Device
+      check if it is addressed and set the start address and number of
+      bytes. The Peripheral Device provides a response in Byte 7.
+
+  (2) A CRC on the data block (header excluded). This CRC is
+      transmitted as the last-but-one byte in the packet, prior to the
+      footer response.
+
+The header response can be one of
+  (a) Ack
+  (b) Nak
+  (c) Not Ready
+
+The footer response can be one of
+  (1) Ack
+  (2) Nak  (CRC failure)
+  (3) Good (operation completed)
+  (4) Bad  (operation failed)
+
+Example frame
+-------------
+
+The example below is not to scale and makes simplifying assumptions
+for clarity. The different chunks in the BRA packets are not required
+to start on a new SoundWire Row, and the scale of data may vary.
+
+      ::
+
+	+---+--------------------------------------------+
+	+   |                                            |
+	+   |             BRA HEADER                     |
+        +   |	                                         |
+	+   +--------------------------------------------+
+	+ C |             HEADER CRC                     |
+	+ O +--------------------------------------------+
+	+ M | 	          HEADER RESPONSE                |
+	+ M +--------------------------------------------+
+	+ A |                                            |
+	+ N |                                            |
+	+ D |                 DATA                       |
+	+   |                                            |
+	+   |                                            |
+	+   |                                            |
+	+   +--------------------------------------------+
+	+   |             DATA CRC                       |
+	+   +--------------------------------------------+
+	+   | 	          FOOTER RESPONSE                |
+	+---+--------------------------------------------+
+
+
+Assuming the frame uses N columns, the configuration shown above can
+be programmed by setting the DP0 registers as:
+
+    - HSTART = 1
+    - HSTOP = N - 1
+    - Sampling Interval = N
+    - WordLength = N - 1
+
+Addressing restrictions
+-----------------------
+
+The Device Number specified in the Header follows the SoundWire
+definitions, and broadcast and group addressing are
+permitted. However, in reality it is very unlikely that the exact same
+binary data needs to be provided to the two different Peripheral
+devices. The Linux implementation only allows for transfers to a
+single device at a time.
+
+In the case of multiple Peripheral devices attached to different
+Managers, the broadcast and group addressing is not supported by the
+SoundWire specification. Each device must be handled with separate BRA
+streams, possibly in parallel - the links are really independent.
+
+Unsupported features
+--------------------
+
+The Bulk Register Access specification provides a number of
+capabilities that are not supported in known implementations, such as:
+
+  (1) Transfers initiated by a Peripheral Device. The BRA Initiator is
+      always the Manager Device.
+
+  (2) Flow-control capabilities and retransmission based on the
+      'NotReady' header response require extra buffering in the
+      SoundWire IP and are not implemented.
+
+Bi-directional handling
+-----------------------
+
+The BRA protocol can handle writes as well as reads, and in each
+packet the header and footer response are provided by the Peripheral
+Target device. On the Peripheral device, the BRA protocol is handled
+by a single DP0 data port, and at the low-level the bus ownership can
+will change for header/footer response as well as the data transmitted
+during a read.
+
+On the host side, most implementations rely on a Port-like concept,
+with two FIFOs consuming/generating data transfers in parallel
+(Host->Peripheral and Peripheral->Host). The amount of data
+consumed/produced by these FIFOs is not symmetrical, as a result
+hardware typically inserts markers to help software and hardware
+interpret raw data
+
+Each packet will typically have
+
+  (1) a 'Start of Packet' indicator.
+
+  (2) an 'End of Packet' indicator.
+
+  (3) a packet identifier to correlate the data requested and
+      transmitted, and the error status for each frame
+
+Hardware implementations can check errors at the frame level, and
+retry a transfer in case of errors. However, as for the flow-control
+case, this requires extra buffering and intelligence in the
+hardware. The Linux support assumes that the entire transfer is
+cancelled if a single error is detected in one of the responses.
+
+Cadence IP BRA support
+----------------------
+
+Format requirements
+~~~~~~~~~~~~~~~~~~~
+
+The Cadence IP relies on PDI0 for TX and PDI1 for RX. The data needs
+to be formatted with the following conventions:
+
+  (1) all Data is stored in bits 15..0 of the 32-bit PDI FIFOs.
+
+  (2) the start of packet is BIT(31).
+
+  (3) the end of packet is BIT(30).
+
+  (4) A packet ID is stored in bits 19..16. This packet ID is
+      determined by software and is typically a rolling counter.
+
+  (5) Padding shall be inserted as needed so that the Header CRC,
+      Header response, Footer CRC, Footer response are always in
+      Byte0. Padding is inserted by software for writes, and on reads
+      software shall discard the padding added by the hardware.
+
+Example format
+~~~~~~~~~~~~~~
+
+The following table represents the sequence provided to PDI0 for a
+write command followed by a read command.
+
+::
+
+	+---+---+--------+---------------+---------------+
+	+ 1 | 0 | ID = 0 |  WR HDR[1]    |  WR HDR[0]    |
+	+   |   |        |  WR HDR[3]    |  WR HDR[2]    |
+	+   |   |        |  WR HDR[5]    |  WR HDR[4]    |
+	+   |   |        |  pad          |  WR HDR CRC   |
+	+   |   |        |  WR Data[1]   |  WR Data[0]   |
+	+   |   |        |  WR Data[3]   |  WR Data[2]   |
+	+   |   |        |  WR Data[n-2] |  WR Data[n-3] |
+	+   |   |        |  pad          |  WR Data[n-1] |
+	+ 0 | 1 |        |  pad          |  WR Data CRC  |
+	+---+---+--------+---------------+---------------+
+	+ 1 | 0 | ID = 1 |  RD HDR[1]    |  RD HDR[0]    |
+	+   |   |        |  RD HDR[3]    |  RD HDR[2]    |
+	+   |   |        |  RD HDR[5]    |  RD HDR[4]    |
+	+ 0 | 1 |        |  pad          |  RD HDR CRC   |
+	+---+---+--------+---------------+---------------+
+
+
+The table below represents the data received on PDI1 for the same
+write command followed by a read command.
+
+::
+
+	+---+---+--------+---------------+---------------+
+	+ 1 | 0 | ID = 0 |  pad          |  WR Hdr Rsp   |
+	+ 0 | 1 |        |  pad          |  WR Ftr Rsp   |
+	+---+---+--------+---------------+---------------+
+	+ 1 | 0 | ID = 0 |  pad          |  Rd Hdr Rsp   |
+	+   |   |        |  RD Data[1]   |  RD Data[0]   |
+	+   |   |        |  RD Data[3]   |  RD Data[2]   |
+	+   |   |        |  RD HDR[n-2]  |  RD Data[n-3] |
+	+   |   |        |  pad          |  RD Data[n-1] |
+	+   |   |        |  pad          |  RD Data CRC  |
+	+ 0 | 1 |        |  pad          |  RD Ftr Rsp   |
+	+---+---+--------+---------------+---------------+
+
+
+Peripheral/bus interface
+------------------------
+
+Regmap use
+~~~~~~~~~~
+
+Existing codec drivers rely on regmap to download firmware to
+Peripherals, so at a high-level it would seem natural to combine BRA
+and regmap. The regmap layer could check if BRA is available or not,
+and use a regular read-write command channel in the latter case.
+
+However, the regmap layer does not have information on latency or how
+critical a BRA transfer is. It would make more sense to let the codec
+driver make decisions (wait, timeout, fallback to regular
+read/writes).
+
+In addition, the hardware may lose context and ask for the firmware to
+be downloaded again. The firmware is not however a fixed definition,
+the SDCA definition allows the hardware to request an updated firmware
+or a different coefficient table to deal with specific environment
+conditions. In other words, traditional regmap cache management is not
+good enough, the Peripheral driver is required track hardware
+notifications and react accordingly.
+
+Abstraction required
+~~~~~~~~~~~~~~~~~~~~
+
+There are no standard registers or mandatory implementation at the
+Manager level, so the low-level BPT/BRA details must be hidden in
+Manager-specific code. For example the Cadence IP format above is not
+known to the codec drivers.
+
+Likewise, codec drivers should not have to know the frame size. The
+computation of CRC and handling of responses is handled in helpers and
+Manager-specific code.
+
+The host BRA driver may also have restrictions on pages allocated for
+DMA, or other host-DSP communication protocols. The codec driver
+should not be aware of any of these restrictions, since it might be
+reused in combination with different implementations of Manager IPs.
+
+Concurrency between BRA and regular read/write
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The existing 'nread/nwrite' API already relies on a notion of start
+address and number of bytes, so it would be possible to extend this
+API with a 'hint' requesting BPT/BRA be used.
+
+However BRA transfers could be quite long, and the use of a single
+mutex for regular read/write and BRA is a show-stopper. Independent
+operation of the control/command and BRA transfers is a fundamental
+requirement, e.g. to change the volume level with the existing regmap
+interface while downloading firmware.
+
+This places the burden on the codec driver to verify that there is no
+concurrent access to the same address with the command/control
+protocol and the BRA protocol.
+
+In addition, the 'sdw_msg' structure hard-codes support for 16-bit
+addresses and paging registers which are irrelevant for BPT/BRA
+support based on native 32-bit addresses. A separate API with
+'sdw_bpt_msg' makes more sense.
+
+One possible strategy to speed-up all initialization tasks would be to
+start a BRA transfer for firmware download, then deal with all the
+"regular" read/writes in parallel with the command channel, and last
+to wait for the BRA transfers to complete. This would allow for a
+degree of overlap instead of a purely sequential solution. As a
+results, the BRA API must support async transfers and expose a
+separate wait function.
+
+Error handling
+~~~~~~~~~~~~~~
+
+The expected response to a 'bra_message' and follow-up behavior may
+vary:
+
+   (1) A Peripheral driver may want to receive an immediate -EBUSY
+       response if the BRA protocol is not available at a given time.
+
+   (2) A Peripheral driver may want to wait until a timeout for the
+       on-going transfer to be handled
+
+   (3) A Peripheral driver may want to wait until existing BRA
+       transfers complete or deal with BRA as a background task when
+       audio transfers stop. In this case, there would be no timeout,
+       and the operation may not happen if the platform is suspended.
+
+BRA stream model
+----------------
+
+For regular audio transfers, the machine driver exposes a dailink
+connecting CPU DAI(s) and Codec DAI(s).
+
+This model is not required BRA support:
+
+   (1) The SoundWire DAIs are mainly wrappers for SoundWire Data
+       Ports, with possibly some analog or audio conversion
+       capabilities bolted behind the Data Port. In the context of
+       BRA, the DP0 is the destination. DP0 registers are standard and
+       can be programmed blindly without knowing what Peripheral is
+       connected to each link. In addition, if there are multiple
+       Peripherals on a link and some of them do not support DP0, the
+       write commands to program DP0 registers will generate harmless
+       COMMAND_IGNORED responses that will be wired-ORed with
+       responses from Peripherals which support DP0. In other words,
+       the DP0 programming can be done with broadcast commands, and
+       the information on the Target device can be added only in the
+       BRA Header.
+
+   (2) At the CPU level, the DAI concept is not useful for BRA; the
+       machine driver will not create a dailink relying on DP0. The
+       only concept that is needed is the notion of port.
+
+   (3) The stream concept relies on a set of master_rt and slave_rt
+       concepts. All of these entities represent ports and not DAIs.
+
+   (4) With the assumption that a single BRA stream is used per link,
+       that stream can connect master ports as well as all peripheral
+       DP0 ports.
+
+   (5) BRA transfers only make sense in the concept of one
+       Manager/Link, so the BRA stream handling does not rely on the
+       concept of multi-link aggregation allowed by regular DAI links.
+
+Audio DMA support
+-----------------
+
+Some DMAs, such as HDaudio, require an audio format field to be
+set. This format is in turn used to define acceptable bursts. BPT/BRA
+support is not fully compatible with these definitions in that the
+format may vary between read and write commands.
+
+In addition, on Intel HDaudio Intel platforms the DMAs need to be
+programmed with a PCM format matching the bandwidth of the BPT/BRA
+transfer. The format is based on 48kHz 32-bit samples, and the number
+of channels varies to adjust the bandwidth. The notion of channel is
+completely notional since the data is not typical audio
+PCM. Programming channels helps reserve enough bandwidth and adjust
+FIFO sizes to avoid xruns. Note that the quality of service comes as a
+cost. Since all channels need to be present as a sample block, data
+sizes not aligned to 128-bytes are not supported.
+
+BTP/BRA API available to peripheral drivers
+-------------------------------------------
+
+ASoC Peripheral drivers may use
+
+   - sdw_bpt_stream_open(mode)
+
+      This function verifies that the BPT protocol with the
+      'mode'. For now only BRA is accepted as a mode. This function
+      allocates a work buffer internally. This buffer is not exposed
+      to the caller.
+
+     errors:
+         -ENODEV: BPT/BRA is not supported by the Manager.
+
+         -EBUSY: another agent is already using the audio payload for
+          audio transfers. There is no way to predict when the audio
+          streams might stop, this will require the Peripheral driver
+          to fall back to the regular (slow) command channel.
+
+         -EAGAIN: another agent is already transferring data using the
+          BPT/BRA protocol. Since the transfers will typically last
+          10s or 100s of ms, the Peripheral driver may wait and retry
+          later.
+
+    - sdw_bpt_message_send_async(bpt_message)
+
+      This function sends the data using the Manager
+      implementation-defined capabilities (typically DMA or IPC
+      protocol). If the message exceeds the size of the buffer
+      allocated in the 'open' stage, the data will be copied and
+      transmitted in multiple chunks using the buffer. This function
+      cannot be called multiple times to queue transfers, the codec
+      driver needs to wait for completion of the requested transfer.
+
+      errors:
+
+         -ENODEV: BPT/BRA is not supported by the Manager.
+
+	 -EINVAL: no resources available.
+
+    - sdw_bpt_message_wait(timeout)
+
+      This function waits for the entire message provided by the codec
+      driver in the 'send_async' stage. Intermediate status for
+      smaller chunks will not be provided back to the codec driver,
+      only a return code will be provided.
+
+      errors:
+
+         -ENODEV: BPT/BRA is not supported by the Manager.
+
+	 -EINVAL: no transfer queued
+
+         -EIO: some sort of transmisson error happened, typically a
+          bad CRC was detected.
+
+	 -ETIMEDOUT: transfer did not complete
+
+    - sdw_bpt_stream_close()
+
+      This functions releases the buffer allocated in the open stage
+      and decreases the refcounts.
+
+      Note that it's possible to call send_async/message_wait multiple
+      times, it's not required to close/open.
diff --git a/Documentation/driver-api/soundwire/index.rst b/Documentation/driver-api/soundwire/index.rst
index 234911a0db99..8d826fd5781f 100644
--- a/Documentation/driver-api/soundwire/index.rst
+++ b/Documentation/driver-api/soundwire/index.rst
@@ -9,6 +9,7 @@ SoundWire Documentation
    stream
    error_handling
    locking
+   bra
 
 .. only::  subproject and html
 
diff --git a/Documentation/driver-api/soundwire/summary.rst b/Documentation/driver-api/soundwire/summary.rst
index 01dcb954f6d7..260a1c78545e 100644
--- a/Documentation/driver-api/soundwire/summary.rst
+++ b/Documentation/driver-api/soundwire/summary.rst
@@ -187,10 +187,7 @@ reconfigurations.
 Future enhancements to be done
 ==============================
 
- (1) Bulk Register Access (BRA) transfers.
-
-
- (2) Multiple data lane support.
+ (1) Multiple data lane support.
 
 Links
 =====
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [RFC PATCH 02/16] soundwire: cadence: add BTP support for DP0
  2023-12-07 22:29 [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Pierre-Louis Bossart
  2023-12-07 22:29 ` [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description Pierre-Louis Bossart
@ 2023-12-07 22:29 ` Pierre-Louis Bossart
  2023-12-07 22:29 ` [RFC PATCH 03/16] soundwire: stream: extend sdw_alloc_stream() to take 'type' parameter Pierre-Louis Bossart
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-07 22:29 UTC (permalink / raw)
  To: linux-sound
  Cc: alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou, Pierre-Louis Bossart

The register definitions are missing a BULK_ENABLE bitfield which must
be set for DP0.

In addition, the existing mapping from PDI to Data Port is 1:1. That's
fine for PCM streams which are by construction in one direction
only. The BTP/BRA protocol is bidirectional and relies on DP0 only,
which breaks the 1:1 mapping. DP0 MUST be mapped to both PDI0 and
PDI1, with PDI1 taking care of the TX direction and PDI1 of the RX
direction.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 drivers/soundwire/cadence_master.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/soundwire/cadence_master.c b/drivers/soundwire/cadence_master.c
index 725222036853..bb623f82826c 100644
--- a/drivers/soundwire/cadence_master.c
+++ b/drivers/soundwire/cadence_master.c
@@ -183,6 +183,7 @@ MODULE_PARM_DESC(cdns_mcp_int_mask, "Cadence MCP IntMask");
 #define CDNS_PORTCTRL_TEST_FAILED		BIT(1)
 #define CDNS_PORTCTRL_DIRN			BIT(7)
 #define CDNS_PORTCTRL_BANK_INVERT		BIT(8)
+#define CDNS_PORTCTRL_BULK_ENABLE		BIT(16)
 
 #define CDNS_PORT_OFFSET			0x80
 
@@ -1831,13 +1832,20 @@ void sdw_cdns_config_stream(struct sdw_cdns *cdns,
 
 		if (cdns->bus.params.m_data_mode != SDW_PORT_DATA_MODE_NORMAL)
 			val |= CDNS_PORTCTRL_TEST_FAILED;
+	} else if (pdi->num == 0) {
+		val |= CDNS_PORTCTRL_BULK_ENABLE;
 	}
 	offset = CDNS_PORTCTRL + pdi->num * CDNS_PORT_OFFSET;
 	cdns_updatel(cdns, offset,
-		     CDNS_PORTCTRL_DIRN | CDNS_PORTCTRL_TEST_FAILED,
+		     CDNS_PORTCTRL_DIRN | CDNS_PORTCTRL_TEST_FAILED |
+		     CDNS_PORTCTRL_BULK_ENABLE,
 		     val);
 
-	val = pdi->num;
+	/* The DataPort0 needs to be mapped to both PDI and PDI1 ! */
+	if (pdi->num == 1)
+		val = 0;
+	else
+		val = pdi->num;
 	val |= CDNS_PDI_CONFIG_SOFT_RESET;
 	val |= FIELD_PREP(CDNS_PDI_CONFIG_CHANNEL, (1 << ch) - 1);
 	cdns_writel(cdns, CDNS_PDI_CONFIG(pdi->num), val);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [RFC PATCH 03/16] soundwire: stream: extend sdw_alloc_stream() to take 'type' parameter
  2023-12-07 22:29 [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Pierre-Louis Bossart
  2023-12-07 22:29 ` [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description Pierre-Louis Bossart
  2023-12-07 22:29 ` [RFC PATCH 02/16] soundwire: cadence: add BTP support for DP0 Pierre-Louis Bossart
@ 2023-12-07 22:29 ` Pierre-Louis Bossart
  2023-12-07 22:29 ` [RFC PATCH 04/16] soundwire: extend sdw_stream_type to BPT Pierre-Louis Bossart
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-07 22:29 UTC (permalink / raw)
  To: linux-sound
  Cc: alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou, Pierre-Louis Bossart

In the existing definition of sdw_stream_runtime, the 'type' member is
never set and defaults to PCM. To prepare for the BPT/BRA support, we
need to special-case streams and make use of the 'type'.

No functional change for now, the implicit PCM type is now explicit.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 Documentation/driver-api/soundwire/stream.rst | 2 +-
 drivers/soundwire/stream.c                    | 7 +++++--
 include/linux/soundwire/sdw.h                 | 3 ++-
 sound/soc/qcom/sdw.c                          | 2 +-
 4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/Documentation/driver-api/soundwire/stream.rst b/Documentation/driver-api/soundwire/stream.rst
index b432a2de45d3..65b9bb742689 100644
--- a/Documentation/driver-api/soundwire/stream.rst
+++ b/Documentation/driver-api/soundwire/stream.rst
@@ -291,7 +291,7 @@ per stream. From ASoC DPCM framework, this stream state maybe linked to
 
 .. code-block:: c
 
-  int sdw_alloc_stream(char * stream_name);
+  int sdw_alloc_stream(char * stream_name, enum sdw_stream_type type);
 
 The SoundWire core provides a sdw_startup_stream() helper function,
 typically called during a dailink .startup() callback, which performs
diff --git a/drivers/soundwire/stream.c b/drivers/soundwire/stream.c
index c4ab360e727f..c1e3bb897fb9 100644
--- a/drivers/soundwire/stream.c
+++ b/drivers/soundwire/stream.c
@@ -1743,12 +1743,14 @@ static int set_stream(struct snd_pcm_substream *substream,
  * sdw_alloc_stream() - Allocate and return stream runtime
  *
  * @stream_name: SoundWire stream name
+ * @type: stream type (could be PCM, PDM or BPT)
  *
  * Allocates a SoundWire stream runtime instance.
  * sdw_alloc_stream should be called only once per stream. Typically
  * invoked from ALSA/ASoC machine/platform driver.
  */
-struct sdw_stream_runtime *sdw_alloc_stream(const char *stream_name)
+struct sdw_stream_runtime *sdw_alloc_stream(const char *stream_name,
+					    enum sdw_stream_type type)
 {
 	struct sdw_stream_runtime *stream;
 
@@ -1760,6 +1762,7 @@ struct sdw_stream_runtime *sdw_alloc_stream(const char *stream_name)
 	INIT_LIST_HEAD(&stream->master_list);
 	stream->state = SDW_STREAM_ALLOCATED;
 	stream->m_rt_count = 0;
+	stream->type = type;
 
 	return stream;
 }
@@ -1788,7 +1791,7 @@ int sdw_startup_stream(void *sdw_substream)
 	if (!name)
 		return -ENOMEM;
 
-	sdw_stream = sdw_alloc_stream(name);
+	sdw_stream = sdw_alloc_stream(name, SDW_STREAM_PCM);
 	if (!sdw_stream) {
 		dev_err(rtd->dev, "alloc stream failed for substream DAI %s\n", substream->name);
 		ret = -ENOMEM;
diff --git a/include/linux/soundwire/sdw.h b/include/linux/soundwire/sdw.h
index 71a7031f7b3a..5cc7fed1bf19 100644
--- a/include/linux/soundwire/sdw.h
+++ b/include/linux/soundwire/sdw.h
@@ -1024,7 +1024,8 @@ struct sdw_stream_runtime {
 	int m_rt_count;
 };
 
-struct sdw_stream_runtime *sdw_alloc_stream(const char *stream_name);
+struct sdw_stream_runtime *sdw_alloc_stream(const char *stream_name,
+					    enum sdw_stream_type type);
 void sdw_release_stream(struct sdw_stream_runtime *stream);
 
 int sdw_compute_params(struct sdw_bus *bus);
diff --git a/sound/soc/qcom/sdw.c b/sound/soc/qcom/sdw.c
index 77dbe0c28b29..d91a1bd0c66c 100644
--- a/sound/soc/qcom/sdw.c
+++ b/sound/soc/qcom/sdw.c
@@ -27,7 +27,7 @@ int qcom_snd_sdw_startup(struct snd_pcm_substream *substream)
 	struct snd_soc_dai *codec_dai;
 	int ret, i;
 
-	sruntime = sdw_alloc_stream(cpu_dai->name);
+	sruntime = sdw_alloc_stream(cpu_dai->name, SDW_STREAM_PCM);
 	if (!sruntime)
 		return -ENOMEM;
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [RFC PATCH 04/16] soundwire: extend sdw_stream_type to BPT
  2023-12-07 22:29 [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Pierre-Louis Bossart
                   ` (2 preceding siblings ...)
  2023-12-07 22:29 ` [RFC PATCH 03/16] soundwire: stream: extend sdw_alloc_stream() to take 'type' parameter Pierre-Louis Bossart
@ 2023-12-07 22:29 ` Pierre-Louis Bossart
  2023-12-07 22:29 ` [RFC PATCH 05/16] soundwire: stream: special-case the bus compute_params() routine Pierre-Louis Bossart
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-07 22:29 UTC (permalink / raw)
  To: linux-sound
  Cc: alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou, Pierre-Louis Bossart

BPT/BRA need to be special cased, i.e. there's no point in using the
bandwidth allocation since the entire frame can be used.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 include/linux/soundwire/sdw.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/soundwire/sdw.h b/include/linux/soundwire/sdw.h
index 5cc7fed1bf19..9f04f3ad044f 100644
--- a/include/linux/soundwire/sdw.h
+++ b/include/linux/soundwire/sdw.h
@@ -140,12 +140,14 @@ enum sdw_dpn_pkg_mode {
  *
  * @SDW_STREAM_PCM: PCM data stream
  * @SDW_STREAM_PDM: PDM data stream
+ * @SDW_STREAM_BPT: BPT data stream
  *
  * spec doesn't define this, but is used in implementation
  */
 enum sdw_stream_type {
 	SDW_STREAM_PCM = 0,
 	SDW_STREAM_PDM = 1,
+	SDW_STREAM_BPT = 2,
 };
 
 /**
@@ -959,7 +961,7 @@ struct sdw_port_config {
  * @ch_count: Channel count of the stream
  * @bps: Number of bits per audio sample
  * @direction: Data direction
- * @type: Stream type PCM or PDM
+ * @type: Stream type PCM, PDM or BPT
  */
 struct sdw_stream_config {
 	unsigned int frame_rate;
@@ -1009,7 +1011,7 @@ struct sdw_stream_params {
  * @name: SoundWire stream name
  * @params: Stream parameters
  * @state: Current state of the stream
- * @type: Stream type PCM or PDM
+ * @type: Stream type PCM, PDM or BPT
  * @master_list: List of Master runtime(s) in this stream.
  * master_list can contain only one m_rt per Master instance
  * for a stream
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [RFC PATCH 05/16] soundwire: stream: special-case the bus compute_params() routine
  2023-12-07 22:29 [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Pierre-Louis Bossart
                   ` (3 preceding siblings ...)
  2023-12-07 22:29 ` [RFC PATCH 04/16] soundwire: extend sdw_stream_type to BPT Pierre-Louis Bossart
@ 2023-12-07 22:29 ` Pierre-Louis Bossart
  2023-12-07 22:29 ` [RFC PATCH 06/16] soundwire: stream: reuse existing code for BPT stream Pierre-Louis Bossart
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-07 22:29 UTC (permalink / raw)
  To: linux-sound
  Cc: alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou, Pierre-Louis Bossart

For BPT support, we want to allocate the entire audio payload and
bypass the allocation based on PCM/PDM parameters.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 drivers/soundwire/amd_manager.c               |  2 +-
 .../soundwire/generic_bandwidth_allocation.c  | 84 ++++++++++++++++++-
 drivers/soundwire/qcom.c                      |  2 +-
 drivers/soundwire/stream.c                    |  6 +-
 include/linux/soundwire/sdw.h                 |  4 +-
 5 files changed, 91 insertions(+), 7 deletions(-)

diff --git a/drivers/soundwire/amd_manager.c b/drivers/soundwire/amd_manager.c
index a3b1f4e6f0f9..96480f6ddd3f 100644
--- a/drivers/soundwire/amd_manager.c
+++ b/drivers/soundwire/amd_manager.c
@@ -389,7 +389,7 @@ static u32 amd_sdw_read_ping_status(struct sdw_bus *bus)
 	return slave_stat;
 }
 
-static int amd_sdw_compute_params(struct sdw_bus *bus)
+static int amd_sdw_compute_params(struct sdw_bus *bus, bool bpt_stream)
 {
 	struct sdw_transport_data t_data = {0};
 	struct sdw_master_runtime *m_rt;
diff --git a/drivers/soundwire/generic_bandwidth_allocation.c b/drivers/soundwire/generic_bandwidth_allocation.c
index c70a63d009ae..04083122acde 100644
--- a/drivers/soundwire/generic_bandwidth_allocation.c
+++ b/drivers/soundwire/generic_bandwidth_allocation.c
@@ -81,6 +81,82 @@ void sdw_compute_slave_ports(struct sdw_master_runtime *m_rt,
 }
 EXPORT_SYMBOL(sdw_compute_slave_ports);
 
+static void sdw_compute_slave_dp0_ports(struct sdw_master_runtime *m_rt,
+					struct sdw_transport_data *t_data)
+{
+	struct sdw_slave_runtime *s_rt = NULL;
+	struct sdw_port_runtime *p_rt;
+	int port_bo;
+	int sample_int;
+	unsigned int bps;
+
+	port_bo = t_data->block_offset;
+	bps = m_rt->bus->params.col - 1;
+	sample_int = m_rt->bus->params.col;
+
+	list_for_each_entry(s_rt, &m_rt->slave_rt_list, m_rt_node) {
+		list_for_each_entry(p_rt, &s_rt->port_list, port_node) {
+			sdw_fill_xport_params(&p_rt->transport_params,
+					      p_rt->num, false,
+					      SDW_BLK_GRP_CNT_1,
+					      sample_int, port_bo, port_bo >> 8,
+					      t_data->hstart,
+					      t_data->hstop,
+					      SDW_BLK_PKG_PER_PORT, 0x0);
+
+			sdw_fill_port_params(&p_rt->port_params,
+					     p_rt->num, bps,
+					     SDW_PORT_FLOW_MODE_ISOCH,
+					     SDW_PORT_DATA_MODE_NORMAL);
+		}
+	}
+}
+
+static void sdw_compute_dp0_master_ports(struct sdw_master_runtime *m_rt)
+{
+	struct sdw_transport_data t_data = {0};
+	struct sdw_port_runtime *p_rt;
+	struct sdw_bus *bus = m_rt->bus;
+	unsigned int bps;
+	int sample_int;
+	int port_bo;
+	int hstart;
+	int hstop;
+
+	bps = bus->params.col - 1;
+	sample_int = bus->params.col;
+	hstart = 1;
+	hstop = bus->params.col - 1;
+	port_bo = 0;
+
+	t_data.hstart = hstart;
+	t_data.hstop = hstop;
+	t_data.block_offset = port_bo;
+	t_data.sub_block_offset = 0;
+
+	list_for_each_entry(p_rt, &m_rt->port_list, port_node) {
+		sdw_fill_xport_params(&p_rt->transport_params, p_rt->num,
+				      false, SDW_BLK_GRP_CNT_1, sample_int,
+				      port_bo, port_bo >> 8, hstart, hstop,
+				      SDW_BLK_PKG_PER_PORT, 0x0);
+
+		sdw_fill_port_params(&p_rt->port_params,
+				     p_rt->num, bps,
+				     SDW_PORT_FLOW_MODE_ISOCH,
+				     SDW_PORT_DATA_MODE_NORMAL);
+	}
+
+	sdw_compute_slave_dp0_ports(m_rt, &t_data);
+}
+
+static void sdw_compute_dp0_port_params(struct sdw_bus *bus)
+{
+	struct sdw_master_runtime *m_rt;
+
+	list_for_each_entry(m_rt, &bus->m_rt_list, bus_node)
+		sdw_compute_dp0_master_ports(m_rt);
+}
+
 static void sdw_compute_master_ports(struct sdw_master_runtime *m_rt,
 				     struct sdw_group_params *params,
 				     int port_bo, int hstop)
@@ -392,8 +468,9 @@ static int sdw_compute_bus_params(struct sdw_bus *bus)
  * sdw_compute_params: Compute bus, transport and port parameters
  *
  * @bus: SDW Bus instance
+ * @bpt_stream: boolean to conditionally use dedicated bit allocation
  */
-int sdw_compute_params(struct sdw_bus *bus)
+int sdw_compute_params(struct sdw_bus *bus, bool bpt_stream)
 {
 	int ret;
 
@@ -402,6 +479,11 @@ int sdw_compute_params(struct sdw_bus *bus)
 	if (ret < 0)
 		return ret;
 
+	if (bpt_stream) {
+		sdw_compute_dp0_port_params(bus);
+		return 0;
+	}
+
 	/* Compute transport and port params */
 	ret = sdw_compute_port_params(bus);
 	if (ret < 0) {
diff --git a/drivers/soundwire/qcom.c b/drivers/soundwire/qcom.c
index 3c4d6debab1f..8dd3319e8e40 100644
--- a/drivers/soundwire/qcom.c
+++ b/drivers/soundwire/qcom.c
@@ -1060,7 +1060,7 @@ static const struct sdw_master_ops qcom_swrm_ops = {
 	.pre_bank_switch = qcom_swrm_pre_bank_switch,
 };
 
-static int qcom_swrm_compute_params(struct sdw_bus *bus)
+static int qcom_swrm_compute_params(struct sdw_bus *bus, bool bpt_stream)
 {
 	struct qcom_swrm_ctrl *ctrl = to_qcom_sdw(bus);
 	struct sdw_master_runtime *m_rt;
diff --git a/drivers/soundwire/stream.c b/drivers/soundwire/stream.c
index c1e3bb897fb9..16ee58d4b7a2 100644
--- a/drivers/soundwire/stream.c
+++ b/drivers/soundwire/stream.c
@@ -1356,6 +1356,7 @@ static void sdw_release_bus_lock(struct sdw_stream_runtime *stream)
 static int _sdw_prepare_stream(struct sdw_stream_runtime *stream,
 			       bool update_params)
 {
+	bool bpt_stream = (stream->type == SDW_STREAM_BPT) ? true : false;
 	struct sdw_master_runtime *m_rt;
 	struct sdw_bus *bus;
 	struct sdw_master_prop *prop;
@@ -1382,7 +1383,7 @@ static int _sdw_prepare_stream(struct sdw_stream_runtime *stream,
 
 			/* Compute params */
 			if (bus->compute_params) {
-				ret = bus->compute_params(bus);
+				ret = bus->compute_params(bus, bpt_stream);
 				if (ret < 0) {
 					dev_err(bus->dev, "Compute params failed: %d\n",
 						ret);
@@ -1640,6 +1641,7 @@ EXPORT_SYMBOL(sdw_disable_stream);
 
 static int _sdw_deprepare_stream(struct sdw_stream_runtime *stream)
 {
+	bool bpt_stream = (stream->type == SDW_STREAM_BPT) ? true : false;
 	struct sdw_master_runtime *m_rt;
 	struct sdw_bus *bus;
 	int ret = 0;
@@ -1660,7 +1662,7 @@ static int _sdw_deprepare_stream(struct sdw_stream_runtime *stream)
 
 		/* Compute params */
 		if (bus->compute_params) {
-			ret = bus->compute_params(bus);
+			ret = bus->compute_params(bus, bpt_stream);
 			if (ret < 0) {
 				dev_err(bus->dev, "Compute params failed: %d\n",
 					ret);
diff --git a/include/linux/soundwire/sdw.h b/include/linux/soundwire/sdw.h
index 9f04f3ad044f..1dda65d91bad 100644
--- a/include/linux/soundwire/sdw.h
+++ b/include/linux/soundwire/sdw.h
@@ -918,7 +918,7 @@ struct sdw_bus {
 	struct lock_class_key bus_lock_key;
 	struct mutex msg_lock;
 	struct lock_class_key msg_lock_key;
-	int (*compute_params)(struct sdw_bus *bus);
+	int (*compute_params)(struct sdw_bus *bus, bool bpt_stream);
 	const struct sdw_master_ops *ops;
 	const struct sdw_master_port_ops *port_ops;
 	struct sdw_bus_params params;
@@ -1030,7 +1030,7 @@ struct sdw_stream_runtime *sdw_alloc_stream(const char *stream_name,
 					    enum sdw_stream_type type);
 void sdw_release_stream(struct sdw_stream_runtime *stream);
 
-int sdw_compute_params(struct sdw_bus *bus);
+int sdw_compute_params(struct sdw_bus *bus, bool bpt_stream);
 
 int sdw_stream_add_master(struct sdw_bus *bus,
 		struct sdw_stream_config *stream_config,
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [RFC PATCH 06/16] soundwire: stream: reuse existing code for BPT stream
  2023-12-07 22:29 [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Pierre-Louis Bossart
                   ` (4 preceding siblings ...)
  2023-12-07 22:29 ` [RFC PATCH 05/16] soundwire: stream: special-case the bus compute_params() routine Pierre-Louis Bossart
@ 2023-12-07 22:29 ` Pierre-Louis Bossart
  2023-12-12 12:30   ` Charles Keepax
  2023-12-07 22:29 ` [RFC PATCH 07/16] soundwire: bus: add API for BPT protocol Pierre-Louis Bossart
                   ` (10 subsequent siblings)
  16 siblings, 1 reply; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-07 22:29 UTC (permalink / raw)
  To: linux-sound
  Cc: alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou, Pierre-Louis Bossart

DP0 (Data Port 0) is very similar to regular data ports, with minor
tweaks we can reuse the same code.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 drivers/soundwire/stream.c | 94 +++++++++++++++++++++++++++-----------
 1 file changed, 68 insertions(+), 26 deletions(-)

diff --git a/drivers/soundwire/stream.c b/drivers/soundwire/stream.c
index 16ee58d4b7a2..f9762610f051 100644
--- a/drivers/soundwire/stream.c
+++ b/drivers/soundwire/stream.c
@@ -87,6 +87,10 @@ static int _sdw_program_slave_port_params(struct sdw_bus *bus,
 		return ret;
 	}
 
+	/* DP0 does not implement BlockCtrl3 */
+	if (!t_params->port_num)
+		goto skip_block_ctrl3;
+
 	/* Program DPN_BlockCtrl3 register */
 	ret = sdw_write_no_pm(slave, addr2, t_params->blk_pkg_mode);
 	if (ret < 0) {
@@ -94,6 +98,7 @@ static int _sdw_program_slave_port_params(struct sdw_bus *bus,
 		return ret;
 	}
 
+skip_block_ctrl3:
 	/*
 	 * Data ports are FULL, SIMPLE and REDUCED. This function handles
 	 * FULL and REDUCED only and beyond this point only FULL is
@@ -130,18 +135,29 @@ static int sdw_program_slave_port_params(struct sdw_bus *bus,
 	struct sdw_port_params *p_params = &p_rt->port_params;
 	struct sdw_slave_prop *slave_prop = &s_rt->slave->prop;
 	u32 addr1, addr2, addr3, addr4, addr5, addr6;
-	struct sdw_dpn_prop *dpn_prop;
+	enum sdw_dpn_type port_type;
+	bool read_only_wordlength;
 	int ret;
 	u8 wbuf;
 
 	if (s_rt->slave->is_mockup_device)
 		return 0;
 
-	dpn_prop = sdw_get_slave_dpn_prop(s_rt->slave,
-					  s_rt->direction,
-					  t_params->port_num);
-	if (!dpn_prop)
-		return -EINVAL;
+	if (t_params->port_num) {
+		struct sdw_dpn_prop *dpn_prop;
+
+		dpn_prop = sdw_get_slave_dpn_prop(s_rt->slave,
+						  s_rt->direction,
+						  t_params->port_num);
+		if (!dpn_prop)
+			return -EINVAL;
+
+		read_only_wordlength = dpn_prop->read_only_wordlength;
+		port_type = dpn_prop->type;
+	} else {
+		read_only_wordlength = false;
+		port_type = SDW_DPN_FULL;
+	}
 
 	addr1 = SDW_DPN_PORTCTRL(t_params->port_num);
 	addr2 = SDW_DPN_BLOCKCTRL1(t_params->port_num);
@@ -171,7 +187,7 @@ static int sdw_program_slave_port_params(struct sdw_bus *bus,
 		return ret;
 	}
 
-	if (!dpn_prop->read_only_wordlength) {
+	if (!read_only_wordlength) {
 		/* Program DPN_BlockCtrl1 register */
 		ret = sdw_write_no_pm(s_rt->slave, addr2, (p_params->bps - 1));
 		if (ret < 0) {
@@ -223,9 +239,9 @@ static int sdw_program_slave_port_params(struct sdw_bus *bus,
 		}
 	}
 
-	if (dpn_prop->type != SDW_DPN_SIMPLE) {
+	if (port_type != SDW_DPN_SIMPLE) {
 		ret = _sdw_program_slave_port_params(bus, s_rt->slave,
-						     t_params, dpn_prop->type);
+						     t_params, port_type);
 		if (ret < 0)
 			dev_err(&s_rt->slave->dev,
 				"Transport reg write failed for port: %d\n",
@@ -432,6 +448,9 @@ static int sdw_prep_deprep_slave_ports(struct sdw_bus *bus,
 	struct completion *port_ready;
 	struct sdw_dpn_prop *dpn_prop;
 	struct sdw_prepare_ch prep_ch;
+	u32 imp_def_interrupts;
+	bool simple_ch_prep_sm;
+	u32 ch_prep_timeout;
 	bool intr = false;
 	int ret = 0, val;
 	u32 addr;
@@ -439,20 +458,31 @@ static int sdw_prep_deprep_slave_ports(struct sdw_bus *bus,
 	prep_ch.num = p_rt->num;
 	prep_ch.ch_mask = p_rt->ch_mask;
 
-	dpn_prop = sdw_get_slave_dpn_prop(s_rt->slave,
-					  s_rt->direction,
-					  prep_ch.num);
-	if (!dpn_prop) {
-		dev_err(bus->dev,
-			"Slave Port:%d properties not found\n", prep_ch.num);
-		return -EINVAL;
+	if (p_rt->num) {
+		dpn_prop = sdw_get_slave_dpn_prop(s_rt->slave,
+						  s_rt->direction,
+						  prep_ch.num);
+		if (!dpn_prop) {
+			dev_err(bus->dev,
+				"Slave Port:%d properties not found\n", prep_ch.num);
+			return -EINVAL;
+		}
+
+		imp_def_interrupts = dpn_prop->imp_def_interrupts;
+		simple_ch_prep_sm = dpn_prop->simple_ch_prep_sm;
+	} else {
+		struct sdw_dp0_prop *dp0_prop = s_rt->slave->prop.dp0_prop;
+
+		imp_def_interrupts = dp0_prop->imp_def_interrupts;
+		simple_ch_prep_sm =  dp0_prop->simple_ch_prep_sm;
+		ch_prep_timeout = dp0_prop->ch_prep_timeout;
 	}
 
 	prep_ch.prepare = prep;
 
 	prep_ch.bank = bus->params.next_bank;
 
-	if (dpn_prop->imp_def_interrupts || !dpn_prop->simple_ch_prep_sm ||
+	if (imp_def_interrupts || !simple_ch_prep_sm ||
 	    bus->params.s_data_mode != SDW_PORT_DATA_MODE_NORMAL)
 		intr = true;
 
@@ -463,7 +493,7 @@ static int sdw_prep_deprep_slave_ports(struct sdw_bus *bus,
 	 */
 	if (prep && intr) {
 		ret = sdw_configure_dpn_intr(s_rt->slave, p_rt->num, prep,
-					     dpn_prop->imp_def_interrupts);
+					     imp_def_interrupts);
 		if (ret < 0)
 			return ret;
 	}
@@ -472,7 +502,7 @@ static int sdw_prep_deprep_slave_ports(struct sdw_bus *bus,
 	sdw_do_port_prep(s_rt, prep_ch, prep ? SDW_OPS_PORT_PRE_PREP : SDW_OPS_PORT_PRE_DEPREP);
 
 	/* Prepare Slave port implementing CP_SM */
-	if (!dpn_prop->simple_ch_prep_sm) {
+	if (!simple_ch_prep_sm) {
 		addr = SDW_DPN_PREPARECTRL(p_rt->num);
 
 		if (prep)
@@ -489,7 +519,7 @@ static int sdw_prep_deprep_slave_ports(struct sdw_bus *bus,
 		/* Wait for completion on port ready */
 		port_ready = &s_rt->slave->port_ready[prep_ch.num];
 		wait_for_completion_timeout(port_ready,
-			msecs_to_jiffies(dpn_prop->ch_prep_timeout));
+			msecs_to_jiffies(ch_prep_timeout));
 
 		val = sdw_read_no_pm(s_rt->slave, SDW_DPN_PREPARESTATUS(p_rt->num));
 		if ((val < 0) || (val & p_rt->ch_mask)) {
@@ -506,7 +536,7 @@ static int sdw_prep_deprep_slave_ports(struct sdw_bus *bus,
 	/* Disable interrupt after Port de-prepare */
 	if (!prep && intr)
 		ret = sdw_configure_dpn_intr(s_rt->slave, p_rt->num, prep,
-					     dpn_prop->imp_def_interrupts);
+					     imp_def_interrupts);
 
 	return ret;
 }
@@ -970,7 +1000,8 @@ static int sdw_slave_port_is_valid_range(struct device *dev, int num)
 
 static int sdw_slave_port_config(struct sdw_slave *slave,
 				 struct sdw_slave_runtime *s_rt,
-				 const struct sdw_port_config *port_config)
+				 const struct sdw_port_config *port_config,
+				 bool is_bpt_stream)
 {
 	struct sdw_port_runtime *p_rt;
 	int ret;
@@ -982,9 +1013,14 @@ static int sdw_slave_port_config(struct sdw_slave *slave,
 		 * TODO: Check valid port range as defined by DisCo/
 		 * slave
 		 */
-		ret = sdw_slave_port_is_valid_range(&slave->dev, port_config[i].num);
-		if (ret < 0)
-			return ret;
+		if (!is_bpt_stream) {
+			ret = sdw_slave_port_is_valid_range(&slave->dev, port_config[i].num);
+			if (ret < 0)
+				return ret;
+		} else {
+			if (port_config[i].num)
+				return -EINVAL;
+		}
 
 		ret = sdw_port_config(p_rt, port_config, i);
 		if (ret < 0)
@@ -1293,6 +1329,11 @@ struct sdw_dpn_prop *sdw_get_slave_dpn_prop(struct sdw_slave *slave,
 	u8 num_ports;
 	int i;
 
+	if (!port_num) {
+		dev_err(&slave->dev, "%s: port_num is zero\n", __func__);
+		return NULL;
+	}
+
 	if (direction == SDW_DATA_DIR_TX) {
 		num_ports = hweight32(slave->prop.source_ports);
 		dpn_prop = slave->prop.src_dpn_prop;
@@ -2056,7 +2097,8 @@ int sdw_stream_add_slave(struct sdw_slave *slave,
 	if (ret)
 		goto unlock;
 
-	ret = sdw_slave_port_config(slave, s_rt, port_config);
+	ret = sdw_slave_port_config(slave, s_rt, port_config,
+				    stream->type == SDW_STREAM_BPT ? true : false);
 	if (ret)
 		goto unlock;
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [RFC PATCH 07/16] soundwire: bus: add API for BPT protocol
  2023-12-07 22:29 [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Pierre-Louis Bossart
                   ` (5 preceding siblings ...)
  2023-12-07 22:29 ` [RFC PATCH 06/16] soundwire: stream: reuse existing code for BPT stream Pierre-Louis Bossart
@ 2023-12-07 22:29 ` Pierre-Louis Bossart
  2023-12-12 12:19   ` Charles Keepax
  2023-12-18 11:54   ` Vinod Koul
  2023-12-07 22:29 ` [RFC PATCH 08/16] soundwire: bus: add bpt_stream pointer Pierre-Louis Bossart
                   ` (9 subsequent siblings)
  16 siblings, 2 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-07 22:29 UTC (permalink / raw)
  To: linux-sound
  Cc: alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou, Pierre-Louis Bossart

Add definitions and helpers for the BPT/BRA protocol. Peripheral
drivers (aka ASoC codec drivers) can use this API to send bulk data
such as firmware or tables.

The API is only available when no other audio streams have been
allocated, and only one BTP/BRA stream is allowed per link.  To avoid
the addition of yet another lock, the refcount tests are handled in
the stream master_runtime alloc/free routines where the bus_lock is
already held. Another benefit of this approach is that the same
bus_lock is used to handle runtime and port linked lists, which
reduces the potential for misaligned configurations.

In addition to exclusion with audio streams, BPT transfers have a lot
of overhead, specifically registers writes are needed to enable
transport in DP0. In addition, most DMAs don't handle too well very
small data sets.

This patch suggests a minimum bound of 64 bytes, for smaller transfers
codec drivers should rely on the regular read/write commands in
Column0.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 drivers/soundwire/bus.c       | 77 +++++++++++++++++++++++++++++++++++
 drivers/soundwire/bus.h       | 18 ++++++++
 drivers/soundwire/stream.c    | 30 ++++++++++++++
 include/linux/soundwire/sdw.h | 76 ++++++++++++++++++++++++++++++++++
 4 files changed, 201 insertions(+)

diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c
index f3fec15c3112..e5758d2ed88f 100644
--- a/drivers/soundwire/bus.c
+++ b/drivers/soundwire/bus.c
@@ -2015,3 +2015,80 @@ void sdw_clear_slave_status(struct sdw_bus *bus, u32 request)
 	}
 }
 EXPORT_SYMBOL(sdw_clear_slave_status);
+
+int sdw_bpt_close_stream(struct sdw_bus *bus,
+			 struct sdw_slave *slave,
+			 enum sdw_bpt_type mode,
+			 struct sdw_bpt_msg *msg)
+{
+	int ret;
+
+	ret = bus->ops->bpt_close_stream(bus, slave, mode, msg);
+	if (ret < 0)
+		dev_err(bus->dev, "BPT stream close, err %d\n", ret);
+
+	return ret;
+}
+EXPORT_SYMBOL(sdw_bpt_close_stream);
+
+int sdw_bpt_open_stream(struct sdw_bus *bus,
+			struct sdw_slave *slave,
+			enum sdw_bpt_type mode,
+			struct sdw_bpt_msg *msg)
+{
+	int ret;
+
+	/* only Bulk Register Access (BRA) is supported for now */
+	if (mode != SDW_BRA)
+		return -EINVAL;
+
+	if (msg->len < SDW_BPT_MSG_MIN_BYTES) {
+		dev_err(bus->dev, "BPT message length %d, min supported %d\n",
+			msg->len, SDW_BPT_MSG_MIN_BYTES);
+		return -EINVAL;
+	}
+
+	if (msg->len % SDW_BPT_MSG_BYTE_ALIGNMENT) {
+		dev_err(bus->dev, "BPT message length %d is not a multiple of %d bytes\n",
+			msg->len, SDW_BPT_MSG_BYTE_ALIGNMENT);
+		return -EINVAL;
+	}
+
+	/* check device is enumerated */
+	if (slave->dev_num == SDW_ENUM_DEV_NUM ||
+	    slave->dev_num > SDW_MAX_DEVICES)
+		return -ENODEV;
+
+	/* make sure all callbacks are defined */
+	if (!bus->ops->bpt_open_stream ||
+	    !bus->ops->bpt_close_stream ||
+	    !bus->ops->bpt_send_async ||
+	    !bus->ops->bpt_wait)
+		return -ENOTSUPP;
+
+	ret = bus->ops->bpt_open_stream(bus, slave, mode, msg);
+	if (ret < 0)
+		dev_err(bus->dev, "BPT stream open, err %d\n", ret);
+
+	return ret;
+}
+EXPORT_SYMBOL(sdw_bpt_open_stream);
+
+int sdw_bpt_send_async(struct sdw_bus *bus,
+		       struct sdw_slave *slave,
+		       struct sdw_bpt_msg *msg)
+{
+	if (msg->len > SDW_BPT_MSG_MAX_BYTES)
+		return -EINVAL;
+
+	return bus->ops->bpt_send_async(bus, slave, msg);
+}
+EXPORT_SYMBOL(sdw_bpt_send_async);
+
+int sdw_bpt_wait(struct sdw_bus *bus,
+		 struct sdw_slave *slave,
+		 struct sdw_bpt_msg *msg)
+{
+	return bus->ops->bpt_wait(bus, slave, msg);
+}
+EXPORT_SYMBOL(sdw_bpt_wait);
diff --git a/drivers/soundwire/bus.h b/drivers/soundwire/bus.h
index fda6b24ac2da..936852ab3154 100644
--- a/drivers/soundwire/bus.h
+++ b/drivers/soundwire/bus.h
@@ -72,6 +72,24 @@ struct sdw_msg {
 	bool page;
 };
 
+/**
+ * struct sdw_btp_msg - Message structure
+ * @addr: Start Register address accessed in the Slave
+ * @len: number of bytes to transfer. More than 64Kb can be transferred
+ * but a practical limit of SDW_BPT_MSG_MAX_BYTES is enforced.
+ * @dev_num: Slave device number
+ * @flags: transfer flags, indicate if xfer is read or write
+ * @buf: message data buffer (filled by host for write, filled
+ * by Peripheral hardware for reads)
+ */
+struct sdw_bpt_msg {
+	u32 addr;
+	u32 len;
+	u8 dev_num;
+	u8 flags;
+	u8 *buf;
+};
+
 #define SDW_DOUBLE_RATE_FACTOR		2
 #define SDW_STRM_RATE_GROUPING		1
 
diff --git a/drivers/soundwire/stream.c b/drivers/soundwire/stream.c
index f9762610f051..b097b1a808f9 100644
--- a/drivers/soundwire/stream.c
+++ b/drivers/soundwire/stream.c
@@ -1189,6 +1189,20 @@ static struct sdw_master_runtime
 	struct sdw_master_runtime *m_rt, *walk_m_rt;
 	struct list_head *insert_after;
 
+	if (stream->type == SDW_STREAM_BPT) {
+		if (bus->stream_refcount > 0 || bus->bpt_stream_refcount > 0) {
+			dev_err(bus->dev, "%s: invalid configuration %d streams %d BPT streams\n",
+				__func__, bus->stream_refcount, bus->bpt_stream_refcount);
+			return ERR_PTR(-EBUSY);
+		}
+	} else {
+		if (bus->bpt_stream_refcount > 0) {
+			dev_err(bus->dev, "%s: BTP stream already allocated\n",
+				__func__);
+			return ERR_PTR(-EAGAIN);
+		}
+	}
+
 	m_rt = kzalloc(sizeof(*m_rt), GFP_KERNEL);
 	if (!m_rt)
 		return NULL;
@@ -1217,6 +1231,8 @@ static struct sdw_master_runtime
 	m_rt->stream = stream;
 
 	bus->stream_refcount++;
+	if (stream->type == SDW_STREAM_BPT)
+		bus->bpt_stream_refcount++;
 
 	return m_rt;
 }
@@ -1265,6 +1281,8 @@ static void sdw_master_rt_free(struct sdw_master_runtime *m_rt,
 	list_del(&m_rt->bus_node);
 	kfree(m_rt);
 
+	if (stream->type == SDW_STREAM_BPT)
+		bus->bpt_stream_refcount--;
 	bus->stream_refcount--;
 }
 
@@ -1941,6 +1959,12 @@ int sdw_stream_add_master(struct sdw_bus *bus,
 	m_rt = sdw_master_rt_find(bus, stream);
 	if (!m_rt) {
 		m_rt = sdw_master_rt_alloc(bus, stream);
+		if (IS_ERR(m_rt)) {
+			ret = PTR_ERR(m_rt);
+			dev_err(bus->dev, "%s: Master runtime alloc failed for stream:%s: %d\n",
+				__func__, stream->name, ret);
+			goto unlock;
+		}
 		if (!m_rt) {
 			dev_err(bus->dev, "%s: Master runtime alloc failed for stream:%s\n",
 				__func__, stream->name);
@@ -2056,6 +2080,12 @@ int sdw_stream_add_slave(struct sdw_slave *slave,
 		 * So, allocate m_rt and add Slave to it.
 		 */
 		m_rt = sdw_master_rt_alloc(slave->bus, stream);
+		if (IS_ERR(m_rt)) {
+			ret = PTR_ERR(m_rt);
+			dev_err(&slave->dev, "%s: Master runtime alloc failed for stream:%s: %d\n",
+				__func__, stream->name, ret);
+			goto unlock;
+		}
 		if (!m_rt) {
 			dev_err(&slave->dev, "%s: Master runtime alloc failed for stream:%s\n",
 				__func__, stream->name);
diff --git a/include/linux/soundwire/sdw.h b/include/linux/soundwire/sdw.h
index 1dda65d91bad..e54c5bbd2b91 100644
--- a/include/linux/soundwire/sdw.h
+++ b/include/linux/soundwire/sdw.h
@@ -835,6 +835,46 @@ struct sdw_defer {
 	struct sdw_msg *msg;
 };
 
+/*
+ * Add a practical limit to BPT transfer sizes. BPT is typically used
+ * to transfer firmware, and larger firmware transfers will increase
+ * the cold latency beyond typical OS or user requirements.
+ */
+#define SDW_BPT_MSG_MAX_BYTES  (1024 * 1024)
+
+/*
+ * Add a minimum number of bytes for BPT transfer sizes. BPT has a lot of
+ * overhead, enabling/disabling a stream costs 6 write commands, plus the bank
+ * switch that could be delayed in time. In addition, transferring very small
+ * data sets over DMA is known to be problematic on multiple platforms.
+ */
+#define SDW_BPT_MSG_MIN_BYTES      128
+
+/*
+ * Add an alignment requirement for BPT transfer sizes. BPT is
+ * typically with DMAs which are most-efficient with data aligned to a
+ * cache line.  For example HDaudio DMAs strongly recommend to use
+ * multiples of 128 bytes.
+ * This could be a platform-specific value, but that would be problematic for
+ * codec drivers which are typically not platform-specific by definition.
+ * It's probably simpler for everyone to align on a baseline and use regular
+ * read/write commands for the remaining bytes.
+ */
+#define SDW_BPT_MSG_BYTE_ALIGNMENT 128
+
+/**
+ * sdw_bpt_type - SoundWire Bulk Payload Transfer (BPT) type
+ * @SDW_BRA: Soundwire Bulk Register Access (BRA)
+ *
+ * The SoundWire specification allows for implementation-defined
+ * solutions, they would be added after BRA.
+ */
+enum sdw_bpt_type {
+	SDW_BRA,
+};
+
+struct sdw_bpt_msg;
+
 /**
  * struct sdw_master_ops - Master driver ops
  * @read_prop: Read Master properties
@@ -850,6 +890,10 @@ struct sdw_defer {
  * @get_device_num: Callback for vendor-specific device_number allocation
  * @put_device_num: Callback for vendor-specific device_number release
  * @new_peripheral_assigned: Callback to handle enumeration of new peripheral.
+ * @bpt_open_stream: reserve resources for BPT stream
+ * @bpt_close_stream: release resources for BPT stream
+ * @bpt_send_async: send message using BTP protocol
+ * @bpt_wait: wait for message completion using BTP protocol
  */
 struct sdw_master_ops {
 	int (*read_prop)(struct sdw_bus *bus);
@@ -869,6 +913,20 @@ struct sdw_master_ops {
 	void (*new_peripheral_assigned)(struct sdw_bus *bus,
 					struct sdw_slave *slave,
 					int dev_num);
+	int (*bpt_open_stream)(struct sdw_bus *bus,
+			       struct sdw_slave *slave,
+			       enum sdw_bpt_type mode,
+			       struct sdw_bpt_msg *msg);
+	int (*bpt_close_stream)(struct sdw_bus *bus,
+				struct sdw_slave *slave,
+				enum sdw_bpt_type mode,
+				struct sdw_bpt_msg *msg);
+	int (*bpt_send_async)(struct sdw_bus *bus,
+			      struct sdw_slave *slave,
+			      struct sdw_bpt_msg *msg);
+	int (*bpt_wait)(struct sdw_bus *bus,
+			struct sdw_slave *slave,
+			struct sdw_bpt_msg *msg);
 };
 
 /**
@@ -905,6 +963,8 @@ struct sdw_master_ops {
  * synchronization will be used even if a stream only uses a single
  * SoundWire segment.
  * @stream_refcount: number of streams currently using this bus
+ * @btp_stream_refcount: number of BTP streams currently using this bus (should
+ * be zero or one, multiple streams per link is not supported).
  */
 struct sdw_bus {
 	struct device *dev;
@@ -935,6 +995,7 @@ struct sdw_bus {
 	bool multi_link;
 	int hw_sync_min_links;
 	int stream_refcount;
+	int bpt_stream_refcount;
 };
 
 int sdw_bus_master_add(struct sdw_bus *bus, struct device *parent,
@@ -1052,6 +1113,21 @@ int sdw_bus_exit_clk_stop(struct sdw_bus *bus);
 int sdw_compare_devid(struct sdw_slave *slave, struct sdw_slave_id id);
 void sdw_extract_slave_id(struct sdw_bus *bus, u64 addr, struct sdw_slave_id *id);
 
+int sdw_bpt_open_stream(struct sdw_bus *bus,
+			struct sdw_slave *slave,
+			enum sdw_bpt_type mode,
+			struct sdw_bpt_msg *msg);
+int sdw_bpt_close_stream(struct sdw_bus *bus,
+			 struct sdw_slave *slave,
+			 enum sdw_bpt_type mode,
+			 struct sdw_bpt_msg *msg);
+int sdw_bpt_send_async(struct sdw_bus *bus,
+		       struct sdw_slave *slave,
+		       struct sdw_bpt_msg *msg);
+int sdw_bpt_wait(struct sdw_bus *bus,
+		 struct sdw_slave *slave,
+		 struct sdw_bpt_msg *msg);
+
 #if IS_ENABLED(CONFIG_SOUNDWIRE)
 
 int sdw_stream_add_slave(struct sdw_slave *slave,
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [RFC PATCH 08/16] soundwire: bus: add bpt_stream pointer
  2023-12-07 22:29 [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Pierre-Louis Bossart
                   ` (6 preceding siblings ...)
  2023-12-07 22:29 ` [RFC PATCH 07/16] soundwire: bus: add API for BPT protocol Pierre-Louis Bossart
@ 2023-12-07 22:29 ` Pierre-Louis Bossart
  2023-12-18 11:55   ` Vinod Koul
  2023-12-07 22:29 ` [RFC PATCH 09/16] soundwire: crc8: add constant table Pierre-Louis Bossart
                   ` (8 subsequent siblings)
  16 siblings, 1 reply; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-07 22:29 UTC (permalink / raw)
  To: linux-sound
  Cc: alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou, Pierre-Louis Bossart

Add a convenience pointer to the 'sdw_bus' structure. BPT is a
dedicated stream which will typically not be handled by DAIs or
dailinks. Since there's only one BPT stream per link, storing the
pointer at the link level seems rather natural.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 include/linux/soundwire/sdw.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/soundwire/sdw.h b/include/linux/soundwire/sdw.h
index e54c5bbd2b91..8db0cd7d0d89 100644
--- a/include/linux/soundwire/sdw.h
+++ b/include/linux/soundwire/sdw.h
@@ -965,6 +965,7 @@ struct sdw_master_ops {
  * @stream_refcount: number of streams currently using this bus
  * @btp_stream_refcount: number of BTP streams currently using this bus (should
  * be zero or one, multiple streams per link is not supported).
+ * @bpt_stream: pointer stored for convenience.
  */
 struct sdw_bus {
 	struct device *dev;
@@ -996,6 +997,7 @@ struct sdw_bus {
 	int hw_sync_min_links;
 	int stream_refcount;
 	int bpt_stream_refcount;
+	struct sdw_stream_runtime *bpt_stream;
 };
 
 int sdw_bus_master_add(struct sdw_bus *bus, struct device *parent,
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [RFC PATCH 09/16] soundwire: crc8: add constant table
  2023-12-07 22:29 [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Pierre-Louis Bossart
                   ` (7 preceding siblings ...)
  2023-12-07 22:29 ` [RFC PATCH 08/16] soundwire: bus: add bpt_stream pointer Pierre-Louis Bossart
@ 2023-12-07 22:29 ` Pierre-Louis Bossart
  2023-12-18 12:01   ` Vinod Koul
  2023-12-07 22:29 ` [RFC PATCH 10/16] soundwire: cadence: add BTP/BRA helpers to format data Pierre-Louis Bossart
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-07 22:29 UTC (permalink / raw)
  To: linux-sound
  Cc: alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou, Pierre-Louis Bossart

Add the lookup table required by crc8(). All configuration values were
directly table from the MIPI SoundWire 1.x specification.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 drivers/soundwire/Makefile |   4 +-
 drivers/soundwire/crc8.c   | 277 +++++++++++++++++++++++++++++++++++++
 drivers/soundwire/crc8.h   |  11 ++
 3 files changed, 291 insertions(+), 1 deletion(-)
 create mode 100644 drivers/soundwire/crc8.c
 create mode 100644 drivers/soundwire/crc8.h

diff --git a/drivers/soundwire/Makefile b/drivers/soundwire/Makefile
index 657f5888a77b..170128dd9318 100644
--- a/drivers/soundwire/Makefile
+++ b/drivers/soundwire/Makefile
@@ -5,7 +5,9 @@
 
 #Bus Objs
 soundwire-bus-y := bus_type.o bus.o master.o slave.o mipi_disco.o stream.o  \
-			sysfs_slave.o sysfs_slave_dpn.o
+			sysfs_slave.o sysfs_slave_dpn.o \
+			crc8.o
+
 obj-$(CONFIG_SOUNDWIRE) += soundwire-bus.o
 
 soundwire-generic-allocation-objs := generic_bandwidth_allocation.o
diff --git a/drivers/soundwire/crc8.c b/drivers/soundwire/crc8.c
new file mode 100644
index 000000000000..b6b984d7f39a
--- /dev/null
+++ b/drivers/soundwire/crc8.c
@@ -0,0 +1,277 @@
+// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
+// Copyright(c) 2024 Intel Corporation.
+
+#include <linux/crc8.h>
+#include <linux/module.h>
+#include "crc8.h"
+
+/*
+ * the MIPI SoundWire CRC8 polynomial is X^8 + X^6 + X^3 + X^2 + 1, MSB first
+ * The value is (1)01001101 = 0x4D
+ *
+ * the table below was generated with
+ *
+ *	u8 crc8_lookup_table[CRC8_TABLE_SIZE];
+ *	crc8_populate_msb(crc8_lookup_table, SDW_CRC8_POLY);
+ *
+ */
+
+const u8 sdw_crc8_lookup_msb[CRC8_TABLE_SIZE] = {
+	0x00, /* 0 */
+	0x4d, /* 1 */
+	0x9a, /* 2 */
+	0xd7, /* 3 */
+	0x79, /* 4 */
+	0x34, /* 5 */
+	0xe3, /* 6 */
+	0xae, /* 7 */
+	0xf2, /* 8 */
+	0xbf, /* 9 */
+	0x68, /* 10 */
+	0x25, /* 11 */
+	0x8b, /* 12 */
+	0xc6, /* 13 */
+	0x11, /* 14 */
+	0x5c, /* 15 */
+	0xa9, /* 16 */
+	0xe4, /* 17 */
+	0x33, /* 18 */
+	0x7e, /* 19 */
+	0xd0, /* 20 */
+	0x9d, /* 21 */
+	0x4a, /* 22 */
+	0x07, /* 23 */
+	0x5b, /* 24 */
+	0x16, /* 25 */
+	0xc1, /* 26 */
+	0x8c, /* 27 */
+	0x22, /* 28 */
+	0x6f, /* 29 */
+	0xb8, /* 30 */
+	0xf5, /* 31 */
+	0x1f, /* 32 */
+	0x52, /* 33 */
+	0x85, /* 34 */
+	0xc8, /* 35 */
+	0x66, /* 36 */
+	0x2b, /* 37 */
+	0xfc, /* 38 */
+	0xb1, /* 39 */
+	0xed, /* 40 */
+	0xa0, /* 41 */
+	0x77, /* 42 */
+	0x3a, /* 43 */
+	0x94, /* 44 */
+	0xd9, /* 45 */
+	0x0e, /* 46 */
+	0x43, /* 47 */
+	0xb6, /* 48 */
+	0xfb, /* 49 */
+	0x2c, /* 50 */
+	0x61, /* 51 */
+	0xcf, /* 52 */
+	0x82, /* 53 */
+	0x55, /* 54 */
+	0x18, /* 55 */
+	0x44, /* 56 */
+	0x09, /* 57 */
+	0xde, /* 58 */
+	0x93, /* 59 */
+	0x3d, /* 60 */
+	0x70, /* 61 */
+	0xa7, /* 62 */
+	0xea, /* 63 */
+	0x3e, /* 64 */
+	0x73, /* 65 */
+	0xa4, /* 66 */
+	0xe9, /* 67 */
+	0x47, /* 68 */
+	0x0a, /* 69 */
+	0xdd, /* 70 */
+	0x90, /* 71 */
+	0xcc, /* 72 */
+	0x81, /* 73 */
+	0x56, /* 74 */
+	0x1b, /* 75 */
+	0xb5, /* 76 */
+	0xf8, /* 77 */
+	0x2f, /* 78 */
+	0x62, /* 79 */
+	0x97, /* 80 */
+	0xda, /* 81 */
+	0x0d, /* 82 */
+	0x40, /* 83 */
+	0xee, /* 84 */
+	0xa3, /* 85 */
+	0x74, /* 86 */
+	0x39, /* 87 */
+	0x65, /* 88 */
+	0x28, /* 89 */
+	0xff, /* 90 */
+	0xb2, /* 91 */
+	0x1c, /* 92 */
+	0x51, /* 93 */
+	0x86, /* 94 */
+	0xcb, /* 95 */
+	0x21, /* 96 */
+	0x6c, /* 97 */
+	0xbb, /* 98 */
+	0xf6, /* 99 */
+	0x58, /* 100 */
+	0x15, /* 101 */
+	0xc2, /* 102 */
+	0x8f, /* 103 */
+	0xd3, /* 104 */
+	0x9e, /* 105 */
+	0x49, /* 106 */
+	0x04, /* 107 */
+	0xaa, /* 108 */
+	0xe7, /* 109 */
+	0x30, /* 110 */
+	0x7d, /* 111 */
+	0x88, /* 112 */
+	0xc5, /* 113 */
+	0x12, /* 114 */
+	0x5f, /* 115 */
+	0xf1, /* 116 */
+	0xbc, /* 117 */
+	0x6b, /* 118 */
+	0x26, /* 119 */
+	0x7a, /* 120 */
+	0x37, /* 121 */
+	0xe0, /* 122 */
+	0xad, /* 123 */
+	0x03, /* 124 */
+	0x4e, /* 125 */
+	0x99, /* 126 */
+	0xd4, /* 127 */
+	0x7c, /* 128 */
+	0x31, /* 129 */
+	0xe6, /* 130 */
+	0xab, /* 131 */
+	0x05, /* 132 */
+	0x48, /* 133 */
+	0x9f, /* 134 */
+	0xd2, /* 135 */
+	0x8e, /* 136 */
+	0xc3, /* 137 */
+	0x14, /* 138 */
+	0x59, /* 139 */
+	0xf7, /* 140 */
+	0xba, /* 141 */
+	0x6d, /* 142 */
+	0x20, /* 143 */
+	0xd5, /* 144 */
+	0x98, /* 145 */
+	0x4f, /* 146 */
+	0x02, /* 147 */
+	0xac, /* 148 */
+	0xe1, /* 149 */
+	0x36, /* 150 */
+	0x7b, /* 151 */
+	0x27, /* 152 */
+	0x6a, /* 153 */
+	0xbd, /* 154 */
+	0xf0, /* 155 */
+	0x5e, /* 156 */
+	0x13, /* 157 */
+	0xc4, /* 158 */
+	0x89, /* 159 */
+	0x63, /* 160 */
+	0x2e, /* 161 */
+	0xf9, /* 162 */
+	0xb4, /* 163 */
+	0x1a, /* 164 */
+	0x57, /* 165 */
+	0x80, /* 166 */
+	0xcd, /* 167 */
+	0x91, /* 168 */
+	0xdc, /* 169 */
+	0x0b, /* 170 */
+	0x46, /* 171 */
+	0xe8, /* 172 */
+	0xa5, /* 173 */
+	0x72, /* 174 */
+	0x3f, /* 175 */
+	0xca, /* 176 */
+	0x87, /* 177 */
+	0x50, /* 178 */
+	0x1d, /* 179 */
+	0xb3, /* 180 */
+	0xfe, /* 181 */
+	0x29, /* 182 */
+	0x64, /* 183 */
+	0x38, /* 184 */
+	0x75, /* 185 */
+	0xa2, /* 186 */
+	0xef, /* 187 */
+	0x41, /* 188 */
+	0x0c, /* 189 */
+	0xdb, /* 190 */
+	0x96, /* 191 */
+	0x42, /* 192 */
+	0x0f, /* 193 */
+	0xd8, /* 194 */
+	0x95, /* 195 */
+	0x3b, /* 196 */
+	0x76, /* 197 */
+	0xa1, /* 198 */
+	0xec, /* 199 */
+	0xb0, /* 200 */
+	0xfd, /* 201 */
+	0x2a, /* 202 */
+	0x67, /* 203 */
+	0xc9, /* 204 */
+	0x84, /* 205 */
+	0x53, /* 206 */
+	0x1e, /* 207 */
+	0xeb, /* 208 */
+	0xa6, /* 209 */
+	0x71, /* 210 */
+	0x3c, /* 211 */
+	0x92, /* 212 */
+	0xdf, /* 213 */
+	0x08, /* 214 */
+	0x45, /* 215 */
+	0x19, /* 216 */
+	0x54, /* 217 */
+	0x83, /* 218 */
+	0xce, /* 219 */
+	0x60, /* 220 */
+	0x2d, /* 221 */
+	0xfa, /* 222 */
+	0xb7, /* 223 */
+	0x5d, /* 224 */
+	0x10, /* 225 */
+	0xc7, /* 226 */
+	0x8a, /* 227 */
+	0x24, /* 228 */
+	0x69, /* 229 */
+	0xbe, /* 230 */
+	0xf3, /* 231 */
+	0xaf, /* 232 */
+	0xe2, /* 233 */
+	0x35, /* 234 */
+	0x78, /* 235 */
+	0xd6, /* 236 */
+	0x9b, /* 237 */
+	0x4c, /* 238 */
+	0x01, /* 239 */
+	0xf4, /* 240 */
+	0xb9, /* 241 */
+	0x6e, /* 242 */
+	0x23, /* 243 */
+	0x8d, /* 244 */
+	0xc0, /* 245 */
+	0x17, /* 246 */
+	0x5a, /* 247 */
+	0x06, /* 248 */
+	0x4b, /* 249 */
+	0x9c, /* 250 */
+	0xd1, /* 251 */
+	0x7f, /* 252 */
+	0x32, /* 253 */
+	0xe5, /* 254 */
+	0xa8  /* 255 */
+};
+EXPORT_SYMBOL(sdw_crc8_lookup_msb);
diff --git a/drivers/soundwire/crc8.h b/drivers/soundwire/crc8.h
new file mode 100644
index 000000000000..9a88d3866016
--- /dev/null
+++ b/drivers/soundwire/crc8.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause) */
+/* Copyright(c) 2024 Intel Corporation. */
+
+#ifndef __SDW_CRC8_H
+#define __SDW_CRC8_H
+
+#define SDW_CRC8_SEED 0xFF
+#define SDW_CRC8_POLY 0x4D
+extern const u8 sdw_crc8_lookup_msb[CRC8_TABLE_SIZE];
+
+#endif /* __SDW_CRC8_H */
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [RFC PATCH 10/16] soundwire: cadence: add BTP/BRA helpers to format data
  2023-12-07 22:29 [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Pierre-Louis Bossart
                   ` (8 preceding siblings ...)
  2023-12-07 22:29 ` [RFC PATCH 09/16] soundwire: crc8: add constant table Pierre-Louis Bossart
@ 2023-12-07 22:29 ` Pierre-Louis Bossart
  2023-12-07 22:29 ` [RFC PATCH 11/16] soundwire: intel_auxdevice: add indirection for BPT open/close/send_async/wait Pierre-Louis Bossart
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-07 22:29 UTC (permalink / raw)
  To: linux-sound
  Cc: alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou, Pierre-Louis Bossart

The Cadence IP expects a specific format (detailed in the
Documentation). Add helpers to copy the data into the DMA buffer.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 drivers/soundwire/Kconfig          |   1 +
 drivers/soundwire/cadence_master.c | 634 +++++++++++++++++++++++++++++
 drivers/soundwire/cadence_master.h |  30 ++
 3 files changed, 665 insertions(+)

diff --git a/drivers/soundwire/Kconfig b/drivers/soundwire/Kconfig
index 4d8f3b7024ae..18ca21edd552 100644
--- a/drivers/soundwire/Kconfig
+++ b/drivers/soundwire/Kconfig
@@ -30,6 +30,7 @@ config SOUNDWIRE_AMD
 
 config SOUNDWIRE_CADENCE
 	tristate
+	select CRC8
 
 config SOUNDWIRE_INTEL
 	tristate "Intel SoundWire Master driver"
diff --git a/drivers/soundwire/cadence_master.c b/drivers/soundwire/cadence_master.c
index bb623f82826c..2da0c415c125 100644
--- a/drivers/soundwire/cadence_master.c
+++ b/drivers/soundwire/cadence_master.c
@@ -6,6 +6,7 @@
  * Used by Master driver
  */
 
+#include <linux/crc8.h>
 #include <linux/delay.h>
 #include <linux/device.h>
 #include <linux/debugfs.h>
@@ -20,6 +21,7 @@
 #include <sound/soc.h>
 #include <linux/workqueue.h>
 #include "bus.h"
+#include "crc8.h"
 #include "cadence_master.h"
 
 static int interrupt_mask;
@@ -1890,5 +1892,637 @@ struct sdw_cdns_pdi *sdw_cdns_alloc_pdi(struct sdw_cdns *cdns,
 }
 EXPORT_SYMBOL(sdw_cdns_alloc_pdi);
 
+/* BPT/BRA helpers */
+
+#define SDW_CDNS_BRA_HDR			6 /* defined by MIPI */
+#define SDW_CDNS_BRA_HDR_CRC			1 /* defined by MIPI */
+#define SDW_CDNS_BRA_HDR_CRC_PAD		1 /* Cadence only */
+#define SDW_CDNS_BRA_HDR_RESP			1 /* defined by MIPI */
+#define SDW_CDNS_BRA_HDR_RESP_PAD		1 /* Cadence only */
+
+#define SDW_CDNS_BRA_DATA_PAD			1 /* Cadence only */
+#define SDW_CDNS_BRA_DATA_CRC			1 /* defined by MIPI */
+#define SDW_CDNS_BRA_DATA_CRC_PAD		1 /* Cadence only */
+
+#define SDW_CDNS_BRA_FOOTER_RESP		1 /* defined by MIPI */
+#define SDW_CDNS_BRA_FOOTER_RESP_PAD		1 /* Cadence only */
+
+static unsigned int sdw_cdns_bra_actual_data_size(unsigned int allocated_bytes_per_frame)
+{
+	unsigned int total;
+
+	total = allocated_bytes_per_frame -
+		SDW_CDNS_BRA_HDR -
+		SDW_CDNS_BRA_HDR_CRC -
+		SDW_CDNS_BRA_HDR_RESP;
+
+	total -= SDW_CDNS_BRA_DATA_CRC;
+
+	total -= SDW_CDNS_BRA_FOOTER_RESP;
+
+	return total;
+}
+
+static unsigned int sdw_cdns_write_pdi0_buffer_size(unsigned int actual_data_size)
+{
+	unsigned int total;
+
+	total = SDW_CDNS_BRA_HDR +
+		SDW_CDNS_BRA_HDR_CRC +
+		SDW_CDNS_BRA_HDR_CRC_PAD;
+
+	total += actual_data_size;
+	if (actual_data_size & 1)
+		total += SDW_CDNS_BRA_DATA_PAD;
+
+	total += SDW_CDNS_BRA_DATA_CRC +
+		SDW_CDNS_BRA_DATA_CRC_PAD;
+
+	return total * 2;
+}
+
+static unsigned int sdw_cdns_write_pdi1_buffer_size(unsigned int actual_data_size)
+{
+	unsigned int total;
+
+	total = SDW_CDNS_BRA_HDR_RESP +
+		SDW_CDNS_BRA_HDR_RESP_PAD;
+
+	total += SDW_CDNS_BRA_FOOTER_RESP +
+		SDW_CDNS_BRA_FOOTER_RESP_PAD;
+
+	return total * 2;
+}
+
+static unsigned int sdw_cdns_read_pdi0_buffer_size(unsigned int actual_data_size)
+{
+	unsigned int total;
+
+	total = SDW_CDNS_BRA_HDR +
+		SDW_CDNS_BRA_HDR_CRC +
+		SDW_CDNS_BRA_HDR_CRC_PAD;
+
+	return total * 2;
+}
+
+static unsigned int sdw_cdns_read_pdi1_buffer_size(unsigned int actual_data_size)
+{
+	unsigned int total;
+
+	total = SDW_CDNS_BRA_HDR_RESP +
+		SDW_CDNS_BRA_HDR_RESP_PAD;
+
+	total += actual_data_size;
+	if (actual_data_size & 1)
+		total += SDW_CDNS_BRA_DATA_PAD;
+
+	total += SDW_CDNS_BRA_HDR_CRC +
+		SDW_CDNS_BRA_HDR_CRC_PAD;
+
+	total += SDW_CDNS_BRA_FOOTER_RESP +
+		SDW_CDNS_BRA_FOOTER_RESP_PAD;
+
+	return total * 2;
+}
+
+int sdw_cdns_bpt_find_buffer_sizes(int command, /* 0: write, 1: read */
+				   int row, int col,
+				   unsigned int data_bytes,
+				   unsigned int requested_bytes_per_frame,
+				   unsigned int *data_per_frame,
+				   unsigned int *pdi0_buffer_size,
+				   unsigned int *pdi1_buffer_size,
+				   unsigned int *num_frames)
+{
+	unsigned int bpt_bits = row * (col - 1);
+	unsigned int bpt_bytes = bpt_bits >> 3;
+	unsigned int actual_bpt_bytes;
+	unsigned int pdi0_tx_size;
+	unsigned int pdi1_rx_size;
+	unsigned int remainder;
+
+	if (!data_bytes)
+		return -EINVAL;
+
+	actual_bpt_bytes = sdw_cdns_bra_actual_data_size(bpt_bytes);
+	if (data_bytes < actual_bpt_bytes)
+		actual_bpt_bytes = data_bytes;
+
+	/*
+	 * the caller may want to set the number of bytes per frame,
+	 * allow when possible
+	 */
+	if (requested_bytes_per_frame < actual_bpt_bytes)
+		actual_bpt_bytes = requested_bytes_per_frame;
+
+	*data_per_frame = actual_bpt_bytes;
+
+	if (command == 0) {
+		/*
+		 * for writes we need to send all the data_bytes per frame,
+		 * even for the last frame which may only transport fewer bytes
+		 */
+
+		*num_frames = DIV_ROUND_UP(data_bytes, actual_bpt_bytes);
+
+		pdi0_tx_size = sdw_cdns_write_pdi0_buffer_size(actual_bpt_bytes);
+		pdi1_rx_size = sdw_cdns_write_pdi1_buffer_size(actual_bpt_bytes);
+
+		*pdi0_buffer_size = pdi0_tx_size * *num_frames;
+		*pdi1_buffer_size = pdi1_rx_size * *num_frames;
+	} else {
+		/*
+		 * for reads we need to retrieve only what is requested in the BTP
+		 * header, so the last frame needs to be special-cased
+		 */
+		*num_frames = data_bytes / actual_bpt_bytes;
+
+		pdi0_tx_size = sdw_cdns_read_pdi0_buffer_size(actual_bpt_bytes);
+		pdi1_rx_size = sdw_cdns_read_pdi1_buffer_size(actual_bpt_bytes);
+
+		*pdi0_buffer_size = pdi0_tx_size * *num_frames;
+		*pdi1_buffer_size = pdi1_rx_size * *num_frames;
+
+		remainder = data_bytes % actual_bpt_bytes;
+		if (remainder) {
+			pdi0_tx_size = sdw_cdns_read_pdi0_buffer_size(remainder);
+			pdi1_rx_size = sdw_cdns_read_pdi1_buffer_size(remainder);
+
+			*num_frames = *num_frames + 1;
+			*pdi0_buffer_size += pdi0_tx_size;
+			*pdi1_buffer_size += pdi1_rx_size;
+		}
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(sdw_cdns_bpt_find_buffer_sizes);
+
+static int sdw_cdns_copy_write_data(u8 *data, int data_size, u8 *dma_buffer, int dma_buffer_size)
+{
+	int i;
+	int j;
+
+	/* size check to prevent out of bounds access */
+	i = data_size - 1;
+	j = (2 * i) - (i & 1);
+	if (data_size & 1)
+		j++;
+	j += 2;
+	if (j >= dma_buffer_size)
+		return -EINVAL;
+
+	/* copy data */
+	for (i = 0; i < data_size; i++) {
+		j = (2 * i) - (i & 1);
+		dma_buffer[j] = data[i];
+	}
+	/* add required pad */
+	if (data_size & 1)
+		dma_buffer[++j] = 0;
+	/* skip last two bytes */
+	j += 2;
+
+	/* offset and data are off-by-one */
+	return j + 1;
+}
+
+static int sdw_cdns_prepare_write_pd0_buffer(u8 *header, unsigned int header_size,
+					     u8 *data, unsigned int data_size,
+					     u8 *dma_buffer, unsigned int dma_buffer_size,
+					     unsigned int *dma_data_written,
+					     unsigned int frame_counter)
+{
+	u8 crc;
+	int data_written;
+	u8 *last_byte;
+
+	/* clear buffer */
+	memset(dma_buffer, 0, dma_buffer_size);
+	*dma_data_written = 0;
+
+	dma_buffer[3] = BIT(7);
+	dma_buffer[3] |= frame_counter & GENMASK(3, 0);
+
+	data_written = sdw_cdns_copy_write_data(header, header_size, dma_buffer, dma_buffer_size);
+	if (data_written < 0)
+		return data_written;
+	dma_buffer += data_written;
+	dma_buffer_size -= data_written;
+	*dma_data_written += data_written;
+
+	crc = SDW_CRC8_SEED;
+	crc = crc8(sdw_crc8_lookup_msb, header, header_size, crc);
+
+	data_written = sdw_cdns_copy_write_data(&crc, 1, dma_buffer, dma_buffer_size);
+	if (data_written < 0)
+		return data_written;
+	dma_buffer += data_written;
+	dma_buffer_size -= data_written;
+	*dma_data_written += data_written;
+
+	data_written = sdw_cdns_copy_write_data(data, data_size, dma_buffer, dma_buffer_size);
+	if (data_written < 0)
+		return data_written;
+	dma_buffer += data_written;
+	dma_buffer_size -= data_written;
+	*dma_data_written += data_written;
+
+	crc = SDW_CRC8_SEED;
+	crc = crc8(sdw_crc8_lookup_msb, data, data_size, crc);
+	data_written = sdw_cdns_copy_write_data(&crc, 1, dma_buffer, dma_buffer_size);
+	if (data_written < 0)
+		return data_written;
+	dma_buffer += data_written;
+	dma_buffer_size -= data_written;
+	*dma_data_written += data_written;
+
+	/* tag last byte */
+	last_byte = dma_buffer - 1;
+	last_byte[0] = BIT(6);
+
+	return 0;
+}
+
+static int sdw_cdns_prepare_read_pd0_buffer(u8 *header, unsigned int header_size,
+					    u8 *dma_buffer, unsigned int dma_buffer_size,
+					    unsigned int *dma_data_written,
+					    unsigned int frame_counter)
+{
+	u8 crc;
+	int data_written;
+	u8 *last_byte;
+
+	/* clear buffer */
+	memset(dma_buffer, 0, dma_buffer_size);
+	*dma_data_written = 0;
+
+	dma_buffer[3] = BIT(7);
+	dma_buffer[3] |= frame_counter & GENMASK(3, 0);
+
+	data_written = sdw_cdns_copy_write_data(header, header_size, dma_buffer, dma_buffer_size);
+	if (data_written < 0)
+		return data_written;
+	dma_buffer += data_written;
+	dma_buffer_size -= data_written;
+	*dma_data_written += data_written;
+
+	crc = SDW_CRC8_SEED;
+	crc = crc8(sdw_crc8_lookup_msb, header, header_size, crc);
+
+	data_written = sdw_cdns_copy_write_data(&crc, 1, dma_buffer, dma_buffer_size);
+	if (data_written < 0)
+		return data_written;
+	dma_buffer += data_written;
+	dma_buffer_size -= data_written;
+	*dma_data_written += data_written;
+
+	/* tag last byte */
+	last_byte = dma_buffer - 1;
+	last_byte[0] = BIT(6);
+
+	return 0;
+}
+
+#define CDNS_BTP_ROLLING_COUNTER_START 1
+
+int sdw_cdns_prepare_write_dma_buffer(u8 dev_num, u32 start_register,
+				      u8 *data, int data_size,
+				      int data_per_frame,
+				      u8 *dma_buffer, int dma_buffer_size,
+				      int *dma_buffer_total_bytes)
+{
+	u8 header[SDW_CDNS_BRA_HDR];
+	u8 *p_dma_buffer;
+	u8 *p_data;
+	int dma_data_written;
+	int total_dma_data_written;
+	u8 counter;
+	int remainder;
+	int ret;
+
+	counter = CDNS_BTP_ROLLING_COUNTER_START;
+
+	header[0] = BIT(1);		/* write command: BIT(1) set */
+	header[0] |= GENMASK(7, 6);	/* header is active */
+	header[0] |= (dev_num << 2);
+
+	p_data = &data[0];
+	p_dma_buffer = &dma_buffer[0];
+	total_dma_data_written = 0;
+
+	while (data_size / data_per_frame) {
+		header[1] = data_per_frame;
+		header[2] = start_register >> 24 & 0xFF;
+		header[3] = start_register >> 16 & 0xFF;
+		header[4] = start_register >> 8 & 0xFF;
+		header[5] = start_register >> 0 & 0xFF;
+
+		ret = sdw_cdns_prepare_write_pd0_buffer(header, SDW_CDNS_BRA_HDR,
+							p_data, data_per_frame,
+							p_dma_buffer, dma_buffer_size,
+							&dma_data_written,
+							counter);
+		if (ret < 0)
+			return ret;
+
+		counter++;
+
+		p_data += data_per_frame;
+		data_size -= data_per_frame;
+
+		p_dma_buffer += dma_data_written;
+		dma_buffer_size -= dma_data_written;
+		total_dma_data_written += dma_data_written;
+
+		start_register += data_per_frame;
+	}
+
+	remainder = data_size % data_per_frame;
+	if (remainder) {
+		header[1] = remainder;
+		header[2] = start_register >> 24 & 0xFF;
+		header[3] = start_register >> 16 & 0xFF;
+		header[4] = start_register >> 8 & 0xFF;
+		header[5] = start_register >> 0 & 0xFF;
+
+		ret = sdw_cdns_prepare_write_pd0_buffer(header, SDW_CDNS_BRA_HDR,
+							p_data, remainder,
+							p_dma_buffer, dma_buffer_size,
+							&dma_data_written,
+							counter);
+		if (ret < 0)
+			return ret;
+
+		total_dma_data_written += dma_data_written;
+	}
+
+	*dma_buffer_total_bytes = total_dma_data_written;
+
+	return 0;
+}
+EXPORT_SYMBOL(sdw_cdns_prepare_write_dma_buffer);
+
+int sdw_cdns_prepare_read_dma_buffer(u8 dev_num, u32 start_register,
+				     int data_size,
+				     int data_per_frame,
+				     u8 *dma_buffer, int dma_buffer_size,
+				     int *dma_buffer_total_bytes)
+{
+	u8 header[SDW_CDNS_BRA_HDR];
+	u8 *p_dma_buffer;
+	int dma_data_written;
+	int total_dma_data_written;
+	u8 counter;
+	int remainder;
+	int ret;
+
+	counter = CDNS_BTP_ROLLING_COUNTER_START;
+
+	header[0] = 0;			/* read command: BIT(1) cleared */
+	header[0] |= GENMASK(7, 6);	/* header is active */
+	header[0] |= (dev_num << 2);
+
+	p_dma_buffer = &dma_buffer[0];
+	total_dma_data_written = 0;
+
+	while (data_size / data_per_frame) {
+		header[1] = data_per_frame;
+		header[2] = start_register >> 24 & 0xFF;
+		header[3] = start_register >> 16 & 0xFF;
+		header[4] = start_register >> 8 & 0xFF;
+		header[5] = start_register >> 0 & 0xFF;
+
+		ret = sdw_cdns_prepare_read_pd0_buffer(header, SDW_CDNS_BRA_HDR,
+						       p_dma_buffer, dma_buffer_size,
+						       &dma_data_written,
+						       counter);
+		if (ret < 0)
+			return ret;
+
+		counter++;
+
+		data_size -= data_per_frame;
+
+		p_dma_buffer += dma_data_written;
+		dma_buffer_size -= dma_data_written;
+		total_dma_data_written += dma_data_written;
+
+		start_register += data_per_frame;
+	}
+
+	remainder = data_size % data_per_frame;
+	if (remainder) {
+		header[1] = remainder;
+		header[2] = start_register >> 24 & 0xFF;
+		header[3] = start_register >> 16 & 0xFF;
+		header[4] = start_register >> 8 & 0xFF;
+		header[5] = start_register >> 0 & 0xFF;
+
+		ret = sdw_cdns_prepare_read_pd0_buffer(header, SDW_CDNS_BRA_HDR,
+						       p_dma_buffer, dma_buffer_size,
+						       &dma_data_written,
+						       counter);
+		if (ret < 0)
+			return ret;
+
+		total_dma_data_written += dma_data_written;
+	}
+
+	*dma_buffer_total_bytes = total_dma_data_written;
+
+	return 0;
+}
+EXPORT_SYMBOL(sdw_cdns_prepare_read_dma_buffer);
+
+static int check_counter(u32 val, u8 counter)
+{
+	u8 frame;
+
+	frame = (val >> 24) & GENMASK(3, 0);
+	if (counter != frame)
+		return -EIO;
+	return 0;
+}
+
+static int check_response(u32 val)
+{
+	u8 response;
+
+	response = (val >> 3) & GENMASK(1, 0);
+	if (response == 0) /* Ignored */
+		return -ENODATA;
+	if (response != 1) /* ACK */
+		return -EIO;
+
+	return 0;
+}
+
+static int check_frame_start(u32 header, u8 counter)
+{
+	int ret;
+
+	/* check frame_start marker */
+	if (!(header & BIT(31)))
+		return -EIO;
+
+	ret = check_counter(header, counter);
+	if (ret < 0)
+		return ret;
+
+	return check_response(header);
+}
+
+static int check_frame_end(u32 footer)
+{
+	/* check frame_end marker */
+	if (!(footer & BIT(30)))
+		return -EIO;
+
+	return check_response(footer);
+}
+
+int sdw_cdns_check_write_response(struct device *dev,
+				  u8 *dma_buffer, int dma_buffer_size,
+				  int num_frames)
+{
+	u32 *p_data;
+	int counter;
+	u32 header;
+	u32 footer;
+	int ret;
+	int i;
+
+	/* paranoia check on buffer size */
+	if (dma_buffer_size != num_frames * 8)
+		return -EINVAL;
+
+	counter = CDNS_BTP_ROLLING_COUNTER_START;
+	p_data = (u32 *)dma_buffer;
+
+	for (i = 0; i < num_frames; i++) {
+		header = *p_data++;
+		footer = *p_data++;
+
+		ret = check_frame_start(header, counter);
+		if (ret < 0) {
+			dev_err(dev, "%s: bad frame %d/%d start header %x\n",
+				__func__, i, num_frames, header);
+			return ret;
+		}
+
+		ret = check_frame_end(footer);
+		if (ret < 0) {
+			dev_err(dev, "%s: bad frame %d/%d end footer %x\n",
+				__func__, i, num_frames, footer);
+			return ret;
+		}
+
+		counter++;
+		counter &= GENMASK(3, 0);
+	}
+	return 0;
+}
+EXPORT_SYMBOL(sdw_cdns_check_write_response);
+
+static u8 extract_read_data(u32 *data, int num_bytes, u8 *buffer)
+{
+	u32 val;
+	int i;
+	u8 crc;
+	u8 b0;
+	u8 b1;
+
+	crc = SDW_CRC8_SEED;
+
+	/* process two bytes at a time */
+	for (i = 0; i < num_bytes / 2; i++) {
+		val = *data++;
+
+		b0 = val & 0xff;
+		b1 = (val >> 8) & 0xff;
+
+		*buffer++ = b0;
+		crc = crc8(sdw_crc8_lookup_msb, &b0, 1, crc);
+
+		*buffer++ = b1;
+		crc = crc8(sdw_crc8_lookup_msb, &b1, 1, crc);
+	}
+	/* handle remaining byte if it exists */
+	if (num_bytes & 1) {
+		val = *data;
+
+		b0 = val & 0xff;
+
+		*buffer++ = b0;
+		crc = crc8(sdw_crc8_lookup_msb, &b0, 1, crc);
+	}
+	return crc;
+}
+
+int sdw_cdns_check_read_response(struct device *dev,
+				 u8 *dma_buffer, int dma_buffer_size,
+				 u8 *buffer, int buffer_size,
+				 int num_frames, int data_per_frame)
+{
+	int total_num_bytes = 0;
+	u32 *p_data;
+	u8 *p_buf;
+	int counter;
+	u32 header;
+	u32 footer;
+	u8 expected_crc;
+	u8 crc;
+	int len;
+	int ret;
+	int i;
+
+	counter = CDNS_BTP_ROLLING_COUNTER_START;
+	p_data = (u32 *)dma_buffer;
+	p_buf = buffer;
+
+	for (i = 0; i < num_frames; i++) {
+		header = *p_data++;
+
+		ret = check_frame_start(header, counter);
+		if (ret < 0) {
+			dev_err(dev, "%s: bad frame %d/%d start header %x\n",
+				__func__, i, num_frames, header);
+			return ret;
+		}
+
+		len = data_per_frame;
+		if (total_num_bytes + data_per_frame > buffer_size)
+			len = buffer_size - total_num_bytes;
+
+		crc = extract_read_data(p_data, len, p_buf);
+
+		p_data += (len + 1) / 2;
+		expected_crc = *p_data++ & 0xff;
+
+		if (crc != expected_crc) {
+			dev_err(dev, "%s: bad frame %d/%d crc %#x expected %#x\n",
+				__func__, i, num_frames, crc, expected_crc);
+			return -EIO;
+		}
+
+		p_buf += len;
+		total_num_bytes += len;
+
+		footer = *p_data++;
+		ret = check_frame_end(footer);
+		if (ret < 0) {
+			dev_err(dev, "%s: bad frame %d/%d end footer %x\n",
+				__func__, i, num_frames, footer);
+			return ret;
+		}
+
+		counter++;
+		counter &= GENMASK(3, 0);
+	}
+	return 0;
+}
+EXPORT_SYMBOL(sdw_cdns_check_read_response);
+
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_DESCRIPTION("Cadence Soundwire Library");
diff --git a/drivers/soundwire/cadence_master.h b/drivers/soundwire/cadence_master.h
index bc84435e420f..0463440d8486 100644
--- a/drivers/soundwire/cadence_master.h
+++ b/drivers/soundwire/cadence_master.h
@@ -202,4 +202,34 @@ void sdw_cdns_check_self_clearing_bits(struct sdw_cdns *cdns, const char *string
 void sdw_cdns_config_update(struct sdw_cdns *cdns);
 int sdw_cdns_config_update_set_wait(struct sdw_cdns *cdns);
 
+/* SoundWire BPT/BRA helpers to format data */
+int sdw_cdns_bpt_find_buffer_sizes(int command, /* 0: write, 1: read */
+				   int row, int col,
+				   unsigned int data_bytes,
+				   unsigned int requested_bytes_per_frame,
+				   unsigned int *data_per_frame,
+				   unsigned int *pdi0_buffer_size,
+				   unsigned int *pdi1_buffer_size,
+				   unsigned int *num_frames);
+
+int sdw_cdns_prepare_write_dma_buffer(u8 dev_num, u32 start_register,
+				      u8 *data, int data_size,
+				      int data_per_frame,
+				      u8 *dma_buffer, int dma_buffer_size,
+				      int *dma_buffer_total_bytes);
+
+int sdw_cdns_prepare_read_dma_buffer(u8 dev_num, u32 start_register,
+				     int data_size,
+				     int data_per_frame,
+				     u8 *dma_buffer, int dma_buffer_size,
+				     int *dma_buffer_total_bytes);
+
+int sdw_cdns_check_write_response(struct device *dev,
+				  u8 *dma_buffer, int dma_buffer_size,
+				  int num_frames);
+
+int sdw_cdns_check_read_response(struct device *dev,
+				 u8 *dma_buffer, int dma_buffer_size,
+				 u8 *buffer, int buffer_size,
+				 int num_frames, int data_per_frame);
 #endif /* __SDW_CADENCE_H */
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [RFC PATCH 11/16] soundwire: intel_auxdevice: add indirection for BPT open/close/send_async/wait
  2023-12-07 22:29 [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Pierre-Louis Bossart
                   ` (9 preceding siblings ...)
  2023-12-07 22:29 ` [RFC PATCH 10/16] soundwire: cadence: add BTP/BRA helpers to format data Pierre-Louis Bossart
@ 2023-12-07 22:29 ` Pierre-Louis Bossart
  2023-12-07 22:29 ` [RFC PATCH 12/16] ASoC: SOF: Intel: hda-sdw-bpt: add helpers for SoundWire BPT DMA Pierre-Louis Bossart
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-07 22:29 UTC (permalink / raw)
  To: linux-sound
  Cc: alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou, Pierre-Louis Bossart

Mirror abstraction added for master ops.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 drivers/soundwire/intel_auxdevice.c | 55 +++++++++++++++++++++++++++++
 include/linux/soundwire/sdw_intel.h | 16 +++++++++
 2 files changed, 71 insertions(+)

diff --git a/drivers/soundwire/intel_auxdevice.c b/drivers/soundwire/intel_auxdevice.c
index 93698532deac..1cdf4699cf2a 100644
--- a/drivers/soundwire/intel_auxdevice.c
+++ b/drivers/soundwire/intel_auxdevice.c
@@ -74,6 +74,56 @@ static bool is_wake_capable(struct sdw_slave *slave)
 	return false;
 }
 
+static int generic_bpt_stream_open(struct sdw_bus *bus,
+				   struct sdw_slave *slave,
+				   enum sdw_bpt_type mode,
+				   struct sdw_bpt_msg *msg)
+{
+	struct sdw_cdns *cdns = bus_to_cdns(bus);
+	struct sdw_intel *sdw = cdns_to_intel(cdns);
+
+	if (sdw->link_res->hw_ops->bpt_open_stream)
+		return sdw->link_res->hw_ops->bpt_open_stream(sdw, slave, mode, msg);
+	return -ENOTSUPP;
+}
+
+static int generic_bpt_stream_close(struct sdw_bus *bus,
+				    struct sdw_slave *slave,
+				    enum sdw_bpt_type mode,
+				    struct sdw_bpt_msg *msg)
+{
+	struct sdw_cdns *cdns = bus_to_cdns(bus);
+	struct sdw_intel *sdw = cdns_to_intel(cdns);
+
+	if (sdw->link_res->hw_ops->bpt_close_stream)
+		return sdw->link_res->hw_ops->bpt_close_stream(sdw, slave, mode, msg);
+	return -ENOTSUPP;
+}
+
+static int generic_bpt_send_async(struct sdw_bus *bus,
+				  struct sdw_slave *slave,
+				  struct sdw_bpt_msg *msg)
+{
+	struct sdw_cdns *cdns = bus_to_cdns(bus);
+	struct sdw_intel *sdw = cdns_to_intel(cdns);
+
+	if (sdw->link_res->hw_ops->bpt_send_async)
+		return sdw->link_res->hw_ops->bpt_send_async(sdw, slave, msg);
+	return -ENOTSUPP;
+}
+
+static int generic_bpt_wait(struct sdw_bus *bus,
+			    struct sdw_slave *slave,
+			    struct sdw_bpt_msg *msg)
+{
+	struct sdw_cdns *cdns = bus_to_cdns(bus);
+	struct sdw_intel *sdw = cdns_to_intel(cdns);
+
+	if (sdw->link_res->hw_ops->bpt_wait)
+		return sdw->link_res->hw_ops->bpt_wait(sdw, slave, msg);
+	return -ENOTSUPP;
+}
+
 static int generic_pre_bank_switch(struct sdw_bus *bus)
 {
 	struct sdw_cdns *cdns = bus_to_cdns(bus);
@@ -203,6 +253,11 @@ static struct sdw_master_ops sdw_intel_ops = {
 	.get_device_num =  intel_get_device_num_ida,
 	.put_device_num = intel_put_device_num_ida,
 	.new_peripheral_assigned = generic_new_peripheral_assigned,
+
+	.bpt_open_stream = generic_bpt_stream_open,
+	.bpt_close_stream = generic_bpt_stream_close,
+	.bpt_send_async = generic_bpt_send_async,
+	.bpt_wait = generic_bpt_wait,
 };
 
 /*
diff --git a/include/linux/soundwire/sdw_intel.h b/include/linux/soundwire/sdw_intel.h
index 00bb22d96ae5..9e114df37a5b 100644
--- a/include/linux/soundwire/sdw_intel.h
+++ b/include/linux/soundwire/sdw_intel.h
@@ -423,6 +423,22 @@ struct sdw_intel_hw_ops {
 	bool (*sync_check_cmdsync_unlocked)(struct sdw_intel *sdw);
 
 	void (*program_sdi)(struct sdw_intel *sdw, int dev_num);
+
+	int (*bpt_open_stream)(struct sdw_intel *sdw,
+			       struct sdw_slave *slave,
+			       enum sdw_bpt_type mode,
+			       struct sdw_bpt_msg *msg);
+	int (*bpt_close_stream)(struct sdw_intel *sdw,
+				struct sdw_slave *slave,
+				enum sdw_bpt_type mode,
+				struct sdw_bpt_msg *msg);
+	int (*bpt_send_async)(struct sdw_intel *sdw,
+			      struct sdw_slave *slave,
+			      struct sdw_bpt_msg *msg);
+	int (*bpt_wait)(struct sdw_intel *sdw,
+			struct sdw_slave *slave,
+			struct sdw_bpt_msg *msg);
+
 };
 
 extern const struct sdw_intel_hw_ops sdw_intel_cnl_hw_ops;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [RFC PATCH 12/16] ASoC: SOF: Intel: hda-sdw-bpt: add helpers for SoundWire BPT DMA
  2023-12-07 22:29 [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Pierre-Louis Bossart
                   ` (10 preceding siblings ...)
  2023-12-07 22:29 ` [RFC PATCH 11/16] soundwire: intel_auxdevice: add indirection for BPT open/close/send_async/wait Pierre-Louis Bossart
@ 2023-12-07 22:29 ` Pierre-Louis Bossart
  2023-12-07 22:29 ` [RFC PATCH 13/16] soundwire: intel: add BPT context definition Pierre-Louis Bossart
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-07 22:29 UTC (permalink / raw)
  To: linux-sound
  Cc: alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou, Pierre-Louis Bossart

Add SoundWire BPT DMA helpers as a separate module to avoid circular
dependencies.

For now this assumes no link DMA, only coupled mode.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 include/sound/hda-sdw-bpt.h       |  76 +++++++
 sound/soc/sof/intel/Kconfig       |   8 +-
 sound/soc/sof/intel/Makefile      |   4 +
 sound/soc/sof/intel/hda-sdw-bpt.c | 328 ++++++++++++++++++++++++++++++
 4 files changed, 415 insertions(+), 1 deletion(-)
 create mode 100644 include/sound/hda-sdw-bpt.h
 create mode 100644 sound/soc/sof/intel/hda-sdw-bpt.c

diff --git a/include/sound/hda-sdw-bpt.h b/include/sound/hda-sdw-bpt.h
new file mode 100644
index 000000000000..163c4513c26f
--- /dev/null
+++ b/include/sound/hda-sdw-bpt.h
@@ -0,0 +1,76 @@
+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-3-Clause) */
+/*
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * Copyright(c) 2023 Intel Corporation. All rights reserved.
+ */
+
+#include <linux/device.h>
+
+struct hdac_ext_stream;
+struct snd_dma_buffer;
+
+#if IS_ENABLED(CONFIG_SND_SOF_SOF_HDA_SDW_BPT)
+int hda_sdw_bpt_open(struct device *dev,
+		     int link_id,
+		     struct hdac_ext_stream **bpt_tx_stream,
+		     struct snd_dma_buffer *dmab_tx_bdl,
+		     u32 bpt_tx_num_bytes,
+		     u32 tx_dma_bandwidth,
+		     struct hdac_ext_stream **bpt_rx_stream,
+		     struct snd_dma_buffer *dmab_rx_bdl,
+		     u32 bpt_rx_num_bytes,
+		     u32 rx_dma_bandwidth);
+
+int hda_sdw_bpt_send_async(struct device *dev,
+			   struct hdac_ext_stream *bpt_tx_stream,
+			   struct hdac_ext_stream *bpt_rx_stream);
+
+int hda_sdw_bpt_wait(struct device *dev,
+		     struct hdac_ext_stream *bpt_tx_stream,
+		     struct hdac_ext_stream *bpt_rx_stream);
+
+int hda_sdw_bpt_close(struct device *dev,
+		      struct hdac_ext_stream *bpt_tx_stream,
+		      struct snd_dma_buffer *dmab_tx_bdl,
+		      struct hdac_ext_stream *bpt_rx_stream,
+		      struct snd_dma_buffer *dmab_rx_bdl);
+#else
+static inline int hda_sdw_bpt_open(struct device *dev,
+				   int link_id,
+				   struct hdac_ext_stream **bpt_tx_stream,
+				   struct snd_dma_buffer *dmab_tx_bdl,
+				   u32 bpt_tx_num_bytes,
+				   u32 tx_dma_bandwidth,
+				   struct hdac_ext_stream **bpt_rx_stream,
+				   struct snd_dma_buffer *dmab_rx_bdl,
+				   u32 bpt_rx_num_bytes,
+				   u32 rx_dma_bandwidth)
+{
+	return 0;
+}
+
+static inline int hda_sdw_bpt_send_async(struct device *dev,
+					 struct hdac_ext_stream *bpt_tx_stream,
+					 struct hdac_ext_stream *bpt_rx_stream)
+{
+	return 0;
+}
+
+static inline int hda_sdw_bpt_wait(struct device *dev,
+				   struct hdac_ext_stream *bpt_tx_stream,
+				   struct hdac_ext_stream *bpt_rx_stream)
+{
+	return 0;
+}
+
+static inline int hda_sdw_bpt_close(struct device *dev,
+				    struct hdac_ext_stream *bpt_tx_stream,
+				    struct snd_dma_buffer *dmab_tx_bdl,
+				    struct hdac_ext_stream *bpt_rx_stream,
+				    struct snd_dma_buffer *dmab_rx_bdl)
+{
+	return 0;
+}
+#endif
diff --git a/sound/soc/sof/intel/Kconfig b/sound/soc/sof/intel/Kconfig
index 3396bd46b778..d2650348987b 100644
--- a/sound/soc/sof/intel/Kconfig
+++ b/sound/soc/sof/intel/Kconfig
@@ -268,6 +268,7 @@ config SND_SOC_SOF_INTEL_LNL
 	tristate
 	select SND_SOC_SOF_HDA_GENERIC
 	select SND_SOC_SOF_INTEL_SOUNDWIRE_LINK_BASELINE
+	select SND_SOF_SOF_HDA_SDW_BPT if SND_SOC_SOF_INTEL_SOUNDWIRE
 	select SND_SOC_SOF_IPC4
 	select SND_SOC_SOF_INTEL_MTL
 
@@ -325,12 +326,17 @@ config SND_SOC_SOF_HDA_AUDIO_CODEC
 
 endif ## SND_SOC_SOF_HDA_GENERIC
 
+config SND_SOF_SOF_HDA_SDW_BPT
+	tristate
+	help
+	  This option is not user-selectable but automagically handled by
+	  'select' statements at a higher level.
+
 config SND_SOC_SOF_HDA_LINK_BASELINE
 	tristate
 	select SND_SOC_SOF_HDA if SND_SOC_SOF_HDA_LINK
 	help
 	  This option is not user-selectable but automagically handled by
-	  'select' statements at a higher level.
 
 config SND_SOC_SOF_HDA
 	tristate
diff --git a/sound/soc/sof/intel/Makefile b/sound/soc/sof/intel/Makefile
index 806df08e3fd5..de1e37a76c29 100644
--- a/sound/soc/sof/intel/Makefile
+++ b/sound/soc/sof/intel/Makefile
@@ -12,6 +12,8 @@ snd-sof-intel-hda-generic-objs := hda.o hda-common-ops.o
 
 snd-sof-intel-hda-mlink-objs := hda-mlink.o
 
+snd-sof-intel-hda-sdw-bpt-objs := hda-sdw-bpt.o
+
 snd-sof-intel-hda-common-$(CONFIG_SND_SOC_SOF_HDA_PROBES) += hda-probes.o
 
 snd-sof-intel-hda-objs := hda-codec.o
@@ -26,6 +28,8 @@ obj-$(CONFIG_SND_SOC_SOF_HDA_GENERIC) += snd-sof-intel-hda-generic.o
 obj-$(CONFIG_SND_SOC_SOF_HDA_MLINK) += snd-sof-intel-hda-mlink.o
 obj-$(CONFIG_SND_SOC_SOF_HDA) += snd-sof-intel-hda.o
 
+obj-$(CONFIG_SND_SOF_SOF_HDA_SDW_BPT) += snd-sof-intel-hda-sdw-bpt.o
+
 snd-sof-pci-intel-tng-objs := pci-tng.o
 snd-sof-pci-intel-skl-objs := pci-skl.o skl.o hda-loader-skl.o
 snd-sof-pci-intel-apl-objs := pci-apl.o apl.o
diff --git a/sound/soc/sof/intel/hda-sdw-bpt.c b/sound/soc/sof/intel/hda-sdw-bpt.c
new file mode 100644
index 000000000000..980c75c3a6bd
--- /dev/null
+++ b/sound/soc/sof/intel/hda-sdw-bpt.c
@@ -0,0 +1,328 @@
+// SPDX-License-Identifier: (GPL-2.0-only OR BSD-3-Clause)
+//
+// This file is provided under a dual BSD/GPLv2 license.  When using or
+// redistributing this file, you may do so under either license.
+//
+// Copyright(c) 2024 Intel Corporation. All rights reserved.
+//
+
+/*
+ * Hardware interface for SoundWire BPT support with HDA DMA
+ */
+
+#include <sound/hdaudio_ext.h>
+#include <sound/hda-mlink.h>
+#include <sound/hda-sdw-bpt.h>
+#include <sound/sof.h>
+#include "../ops.h"
+#include "../sof-priv.h"
+#include "hda.h"
+
+static int hda_sdw_bpt_dma_prepare(struct device *dev,
+				   struct hdac_ext_stream **sdw_bpt_stream,
+				   struct snd_dma_buffer *dmab_bdl,
+				   u32 bpt_num_bytes,
+				   unsigned int num_channels,
+				   int direction)
+{
+	struct snd_sof_dev *sdev = dev_get_drvdata(dev);
+	struct hdac_ext_stream *bpt_stream;
+	unsigned int format = HDA_CL_STREAM_FORMAT;
+
+	/*
+	 * the baseline format needs to be adjusted to
+	 * bandwidth requirements
+	 */
+	format |= (num_channels - 1);
+
+	dev_dbg(dev, "direction %d format_val %x\n", direction, format);
+
+	bpt_stream = hda_cl_prepare(dev, format,
+				    bpt_num_bytes,
+				    dmab_bdl,
+				    direction,
+				    false);
+	if (IS_ERR(bpt_stream)) {
+		dev_err(sdev->dev, "%s: SDW BPT DMA prepare failed: dir %d\n",
+			__func__, direction);
+		return PTR_ERR(bpt_stream);
+	}
+	*sdw_bpt_stream = bpt_stream;
+
+	if (hdac_stream(bpt_stream)->direction == SNDRV_PCM_STREAM_PLAYBACK) {
+		struct hdac_bus *bus = sof_to_bus(sdev);
+		struct hdac_ext_link *hlink;
+		int stream_tag;
+
+		stream_tag = hdac_stream(bpt_stream)->stream_tag;
+		hlink = hdac_bus_eml_sdw_get_hlink(bus);
+
+		snd_hdac_ext_bus_link_set_stream_id(hlink, stream_tag);
+	}
+	return 0;
+}
+
+static int hda_sdw_bpt_dma_deprepare(struct device *dev,
+				     struct hdac_ext_stream *sdw_bpt_stream,
+				     struct snd_dma_buffer *dmab_bdl)
+{
+	struct snd_sof_dev *sdev = dev_get_drvdata(dev);
+	int ret;
+
+	ret = hda_cl_cleanup(sdev->dev, dmab_bdl, sdw_bpt_stream);
+	if (ret < 0) {
+		dev_err(sdev->dev, "%s: SDW BPT DMA cleanup failed\n",
+			__func__);
+		return ret;
+	}
+
+	if (hdac_stream(sdw_bpt_stream)->direction == SNDRV_PCM_STREAM_PLAYBACK) {
+		struct hdac_bus *bus = sof_to_bus(sdev);
+		struct hdac_ext_link *hlink;
+		int stream_tag;
+
+		stream_tag = hdac_stream(sdw_bpt_stream)->stream_tag;
+		hlink = hdac_bus_eml_sdw_get_hlink(bus);
+
+		snd_hdac_ext_bus_link_clear_stream_id(hlink, stream_tag);
+	}
+
+	return 0;
+}
+
+static int hda_sdw_bpt_dma_enable(struct device *dev,
+				  struct hdac_ext_stream *sdw_bpt_stream)
+{
+	struct snd_sof_dev *sdev = dev_get_drvdata(dev);
+	int ret;
+
+	ret = hda_cl_trigger(sdev->dev, sdw_bpt_stream, SNDRV_PCM_TRIGGER_START);
+	if (ret < 0) {
+		dev_err(sdev->dev, "%s: SDW BPT DMA trigger start failed\n", __func__);
+		return ret;
+	}
+	return 0;
+}
+
+static int hda_sdw_bpt_dma_disable(struct device *dev,
+				   struct hdac_ext_stream *sdw_bpt_stream)
+{
+	struct snd_sof_dev *sdev = dev_get_drvdata(dev);
+	int ret;
+
+	ret = hda_cl_trigger(sdev->dev, sdw_bpt_stream, SNDRV_PCM_TRIGGER_STOP);
+	if (ret < 0) {
+		dev_err(sdev->dev, "%s: SDW BPT DMA trigger stop failed\n", __func__);
+		return ret;
+	}
+	return 0;
+}
+
+int hda_sdw_bpt_open(struct device *dev,
+		     int link_id,
+		     struct hdac_ext_stream **bpt_tx_stream,
+		     struct snd_dma_buffer *dmab_tx_bdl,
+		     u32 bpt_tx_num_bytes,
+		     u32 tx_dma_bandwidth,
+		     struct hdac_ext_stream **bpt_rx_stream,
+		     struct snd_dma_buffer *dmab_rx_bdl,
+		     u32 bpt_rx_num_bytes,
+		     u32 rx_dma_bandwidth)
+{
+	struct snd_sof_dev *sdev = dev_get_drvdata(dev);
+	unsigned int num_channels_tx;
+	unsigned int num_channels_rx;
+	int ret2;
+	int ret1;
+	int ret;
+
+	num_channels_tx = DIV_ROUND_UP(tx_dma_bandwidth, 48000 * 32);
+
+	ret = hda_sdw_bpt_dma_prepare(dev,
+				      bpt_tx_stream,
+				      dmab_tx_bdl,
+				      bpt_tx_num_bytes,
+				      num_channels_tx,
+				      SNDRV_PCM_STREAM_PLAYBACK);
+	if (ret < 0) {
+		dev_err(dev, "%s: hda_sdw_bpt_dma_prepare failed for TX: %d\n",
+			__func__, ret);
+		return ret;
+	}
+
+	num_channels_rx = DIV_ROUND_UP(rx_dma_bandwidth, 48000 * 32);
+
+	ret = hda_sdw_bpt_dma_prepare(dev,
+				      bpt_rx_stream,
+				      dmab_rx_bdl,
+				      bpt_rx_num_bytes,
+				      num_channels_rx,
+				      SNDRV_PCM_STREAM_CAPTURE);
+	if (ret < 0) {
+		dev_err(dev, "%s: hda_sdw_bpt_dma_prepare failed for RX: %d\n",
+			__func__, ret);
+
+		ret1 = hda_sdw_bpt_dma_deprepare(dev,
+						 *bpt_tx_stream,
+						 dmab_tx_bdl);
+		if (ret1 < 0)
+			dev_err(dev, "%s: hda_sdw_bpt_dma_deprepare failed for TX: %d\n",
+				__func__, ret1);
+		return ret;
+	}
+
+	/* we need to map the channels in PCMSyCM registers */
+	ret = hdac_bus_eml_sdw_map_stream_ch(sof_to_bus(sdev),
+					     link_id,
+					     0, /* cpu_dai->id -> PDI0 */
+					     GENMASK(num_channels_tx - 1, 0),
+					     hdac_stream(*bpt_tx_stream)->stream_tag,
+					     SNDRV_PCM_STREAM_PLAYBACK);
+	if (ret < 0)
+		dev_err(dev, "%s: hdac_bus_eml_sdw_map_stream_ch failed for TX: %d\n",
+			__func__, ret);
+
+	ret1 = hdac_bus_eml_sdw_map_stream_ch(sof_to_bus(sdev),
+					      link_id,
+					      1, /* cpu_dai->id -> PDI1 */
+					      GENMASK(num_channels_rx - 1, 0),
+					      hdac_stream(*bpt_rx_stream)->stream_tag,
+					      SNDRV_PCM_STREAM_CAPTURE);
+	if (ret1 < 0)
+		dev_err(dev, "%s: hdac_bus_eml_sdw_map_stream_ch failed for RX: %d\n",
+			__func__, ret1);
+
+	if (!ret)
+		ret = ret1;
+
+	if (ret < 0) {
+		ret2 = hda_sdw_bpt_close(dev,
+					 *bpt_tx_stream,
+					 dmab_tx_bdl,
+					 *bpt_rx_stream,
+					 dmab_rx_bdl);
+		if (ret2 < 0)
+			dev_err(dev, "%s: hda_sdw_bpt_close failed: %d\n",
+				__func__, ret2);
+		return ret;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_NS(hda_sdw_bpt_open, SND_SOC_SOF_INTEL_HDA_SDW_BPT);
+
+int hda_sdw_bpt_send_async(struct device *dev,
+			   struct hdac_ext_stream *bpt_tx_stream,
+			   struct hdac_ext_stream *bpt_rx_stream)
+{
+	int ret1;
+	int ret;
+
+	ret = hda_sdw_bpt_dma_enable(dev, bpt_tx_stream);
+	if (ret < 0) {
+		dev_err(dev, "%s: hda_sdw_bpt_dma_enable failed for TX: %d\n",
+			__func__, ret);
+		return ret;
+	}
+
+	ret = hda_sdw_bpt_dma_enable(dev, bpt_rx_stream);
+	if (ret < 0) {
+		dev_err(dev, "%s: hda_sdw_bpt_dma_enable failed for RX: %d\n",
+			__func__, ret);
+
+		ret1 = hda_sdw_bpt_dma_disable(dev, bpt_tx_stream);
+		if (ret1 < 0)
+			dev_err(dev, "%s: hda_sdw_bpt_dma_disable failed for TX: %d\n",
+				__func__, ret1);
+		return ret;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_NS(hda_sdw_bpt_send_async, SND_SOC_SOF_INTEL_HDA_SDW_BPT);
+
+/*
+ * 3s is several orders of magnitude larger than what is needed for a
+ * typical firmware download.
+ */
+#define HDA_BPT_IOC_TIMEOUT_MS 3000
+
+int hda_sdw_bpt_wait(struct device *dev,
+		     struct hdac_ext_stream *bpt_tx_stream,
+		     struct hdac_ext_stream *bpt_rx_stream)
+{
+	struct sof_intel_hda_stream *hda_tx_stream;
+	struct sof_intel_hda_stream *hda_rx_stream;
+	unsigned long time_tx_left;
+	unsigned long time_rx_left;
+	snd_pcm_uframes_t tx_position;
+	snd_pcm_uframes_t rx_position;
+	int ret = 0;
+	int ret1;
+
+	hda_tx_stream = container_of(bpt_tx_stream, struct sof_intel_hda_stream,
+				     hext_stream);
+	hda_rx_stream = container_of(bpt_rx_stream, struct sof_intel_hda_stream,
+				     hext_stream);
+
+	time_tx_left = wait_for_completion_timeout(&hda_tx_stream->ioc,
+						   msecs_to_jiffies(HDA_BPT_IOC_TIMEOUT_MS));
+	if (!time_tx_left) {
+		tx_position = hda_dsp_stream_get_position(hdac_stream(bpt_tx_stream),
+							  SNDRV_PCM_STREAM_PLAYBACK,
+							  false);
+		dev_err(dev, "%s: SDW BPT TX DMA did not complete: %ld\n",
+			__func__, tx_position);
+		ret = -ETIMEDOUT;
+	}
+
+	/* the wait should be minimal here */
+	time_rx_left = wait_for_completion_timeout(&hda_rx_stream->ioc,
+						   msecs_to_jiffies(HDA_BPT_IOC_TIMEOUT_MS));
+	if (!time_rx_left) {
+		rx_position = hda_dsp_stream_get_position(hdac_stream(bpt_rx_stream),
+							  SNDRV_PCM_STREAM_CAPTURE,
+							  false);
+		dev_err(dev, "%s: SDW BPT RX DMA did not complete: %ld\n",
+			__func__, rx_position);
+		ret = -ETIMEDOUT;
+	}
+
+	ret1 = hda_sdw_bpt_dma_disable(dev, bpt_rx_stream);
+	if (!ret)
+		ret = ret1;
+
+	ret1 = hda_sdw_bpt_dma_disable(dev, bpt_tx_stream);
+	if (!ret)
+		ret = ret1;
+
+	return ret;
+}
+EXPORT_SYMBOL_NS(hda_sdw_bpt_wait, SND_SOC_SOF_INTEL_HDA_SDW_BPT);
+
+int hda_sdw_bpt_close(struct device *dev,
+		      struct hdac_ext_stream *bpt_tx_stream,
+		      struct snd_dma_buffer *dmab_tx_bdl,
+		      struct hdac_ext_stream *bpt_rx_stream,
+		      struct snd_dma_buffer *dmab_rx_bdl)
+{
+	int ret;
+	int ret1;
+
+	ret = hda_sdw_bpt_dma_deprepare(dev,
+					bpt_tx_stream,
+					dmab_tx_bdl);
+
+	ret1 = hda_sdw_bpt_dma_deprepare(dev,
+					 bpt_rx_stream,
+					 dmab_rx_bdl);
+	if (!ret)
+		ret = ret1;
+
+	return ret;
+}
+EXPORT_SYMBOL_NS(hda_sdw_bpt_close, SND_SOC_SOF_INTEL_HDA_SDW_BPT);
+
+MODULE_LICENSE("Dual BSD/GPL");
+MODULE_IMPORT_NS(SND_SOC_SOF_INTEL_HDA_COMMON);
+MODULE_IMPORT_NS(SND_SOC_SOF_HDA_MLINK);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [RFC PATCH 13/16] soundwire: intel: add BPT context definition
  2023-12-07 22:29 [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Pierre-Louis Bossart
                   ` (11 preceding siblings ...)
  2023-12-07 22:29 ` [RFC PATCH 12/16] ASoC: SOF: Intel: hda-sdw-bpt: add helpers for SoundWire BPT DMA Pierre-Louis Bossart
@ 2023-12-07 22:29 ` Pierre-Louis Bossart
  2023-12-07 22:29 ` [RFC PATCH 14/16] soundwire: intel_ace2x: add BPT open/close/send_async/wait Pierre-Louis Bossart
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-07 22:29 UTC (permalink / raw)
  To: linux-sound
  Cc: alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou, Pierre-Louis Bossart

This is needed to be shared between open/send_async/close.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 drivers/soundwire/intel.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/soundwire/intel.h b/drivers/soundwire/intel.h
index 511932c55216..2289719f318c 100644
--- a/drivers/soundwire/intel.h
+++ b/drivers/soundwire/intel.h
@@ -48,11 +48,23 @@ struct sdw_intel_link_res {
 	struct hdac_bus *hbus;
 };
 
+struct sdw_intel_bpt {
+	struct hdac_ext_stream *bpt_tx_stream;
+	struct snd_dma_buffer dmab_tx_bdl;
+	struct hdac_ext_stream *bpt_rx_stream;
+	struct snd_dma_buffer dmab_rx_bdl;
+	unsigned int pdi0_buffer_size;
+	unsigned int pdi1_buffer_size;
+	unsigned int num_frames;
+	unsigned int data_per_frame;
+};
+
 struct sdw_intel {
 	struct sdw_cdns cdns;
 	int instance;
 	struct sdw_intel_link_res *link_res;
 	bool startup_done;
+	struct sdw_intel_bpt bpt_ctx;
 #ifdef CONFIG_DEBUG_FS
 	struct dentry *debugfs;
 #endif
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [RFC PATCH 14/16] soundwire: intel_ace2x: add BPT open/close/send_async/wait
  2023-12-07 22:29 [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Pierre-Louis Bossart
                   ` (12 preceding siblings ...)
  2023-12-07 22:29 ` [RFC PATCH 13/16] soundwire: intel: add BPT context definition Pierre-Louis Bossart
@ 2023-12-07 22:29 ` Pierre-Louis Bossart
  2023-12-07 22:29 ` [RFC PATCH 15/16] soundwire: debugfs: add interface for BPT/BRA transfers Pierre-Louis Bossart
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-07 22:29 UTC (permalink / raw)
  To: linux-sound
  Cc: alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou, Pierre-Louis Bossart

Add support for BTP API using Cadence and hda-sdw-bpt helpers.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 drivers/soundwire/intel_ace2x.c | 377 ++++++++++++++++++++++++++++++++
 1 file changed, 377 insertions(+)

diff --git a/drivers/soundwire/intel_ace2x.c b/drivers/soundwire/intel_ace2x.c
index fb64c6d37ab0..f2f5dd503572 100644
--- a/drivers/soundwire/intel_ace2x.c
+++ b/drivers/soundwire/intel_ace2x.c
@@ -12,12 +12,383 @@
 #include <linux/soundwire/sdw_intel.h>
 #include <sound/hdaudio.h>
 #include <sound/hda-mlink.h>
+#include <sound/hda-sdw-bpt.h>
 #include <sound/hda_register.h>
 #include <sound/pcm_params.h>
 #include "cadence_master.h"
 #include "bus.h"
 #include "intel.h"
 
+static int sdw_slave_bpt_stream_add(struct sdw_slave *slave,
+				    struct sdw_stream_runtime *stream)
+{
+	struct sdw_stream_config sconfig = {0};
+	struct sdw_port_config pconfig = {0};
+	int ret;
+
+	/* arbitrary configuration */
+	sconfig.frame_rate = 16000;
+	sconfig.ch_count = 1;
+	sconfig.bps = 32; /* this is required for BPT/BRA */
+	sconfig.direction = SDW_DATA_DIR_RX;
+	sconfig.type = SDW_STREAM_BPT;
+
+	pconfig.num = 0;
+	pconfig.ch_mask = BIT(0);
+
+	ret = sdw_stream_add_slave(slave, &sconfig,
+				   &pconfig, 1, stream);
+	if (ret) {
+		dev_err(&slave->dev, "%s: failed: %d\n", __func__, ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+static int intel_ace2x_bpt_open_stream(struct sdw_intel *sdw,
+				       struct sdw_slave *slave,
+				       enum sdw_bpt_type mode,
+				       struct sdw_bpt_msg *msg)
+{
+	struct sdw_cdns *cdns = &sdw->cdns;
+	struct sdw_bus *bus = &cdns->bus;
+	struct sdw_master_prop *prop = &bus->prop;
+	struct sdw_stream_runtime *stream;
+	struct sdw_stream_config sconfig;
+	struct sdw_port_config *pconfig;
+	unsigned int pdi0_buffer_size;
+	unsigned int tx_dma_bandwidth;
+	unsigned int pdi1_buffer_size;
+	unsigned int rx_dma_bandwidth;
+	unsigned int num_frames;
+	unsigned int data_per_frame;
+	unsigned int tx_total_bytes;
+	struct sdw_cdns_pdi *pdi0;
+	struct sdw_cdns_pdi *pdi1;
+	const int ch = 1;
+	const int num_pdi = 2;
+	int command;
+	int ret1;
+	int ret;
+	int dir;
+	int i;
+
+	if (mode != SDW_BRA)
+		return -EINVAL;
+
+	stream = sdw_alloc_stream("BPT", SDW_STREAM_BPT);
+	if (!stream)
+		return -ENOMEM;
+	cdns->bus.bpt_stream = stream;
+
+	ret = sdw_slave_bpt_stream_add(slave, stream);
+	if (ret < 0)
+		goto release_stream;
+
+	/* handle PDI0 first */
+	dir = SDW_DATA_DIR_TX;
+
+	pdi0 = sdw_cdns_alloc_pdi(cdns, &cdns->pcm, ch,  dir, 0);
+	if (!pdi0) {
+		dev_err(cdns->dev, "%s: sdw_cdns_alloc_pdi0 failed\n", __func__);
+		ret = -EINVAL;
+		goto remove_slave;
+	}
+
+	sdw_cdns_config_stream(cdns, ch, dir, pdi0);
+
+	/* handle PDI1  */
+	dir = SDW_DATA_DIR_RX;
+
+	pdi1 = sdw_cdns_alloc_pdi(cdns, &cdns->pcm, ch,  dir, 1);
+	if (!pdi1) {
+		dev_err(cdns->dev, "%s: sdw_cdns_alloc_pdi1 failed\n", __func__);
+		ret = -EINVAL;
+		goto remove_slave;
+	}
+
+	sdw_cdns_config_stream(cdns, ch, dir, pdi1);
+
+	/*
+	 * the port config direction, number of channels and frame
+	 * rate is totally arbitrary
+	 */
+	sconfig.direction = dir;
+	sconfig.ch_count = ch;
+	sconfig.frame_rate = 16000;
+	sconfig.type = SDW_STREAM_BPT;
+	sconfig.bps = 32; /* this is required for BPT/BRA */
+
+	/* Port configuration */
+	pconfig = kcalloc(num_pdi, sizeof(*pconfig), GFP_KERNEL);
+	if (!pconfig) {
+		ret =  -ENOMEM;
+		goto remove_slave;
+	}
+
+	for (i = 0; i < num_pdi; i++) {
+		pconfig[i].num = i;
+		pconfig[i].ch_mask = ch;
+	}
+
+	ret = sdw_stream_add_master(&cdns->bus, &sconfig,
+				    pconfig, num_pdi, stream);
+	kfree(pconfig);
+
+	if (ret < 0) {
+		dev_err(cdns->dev, "add master to stream failed:%d\n", ret);
+		goto remove_slave;
+	}
+
+	ret = sdw_prepare_stream(cdns->bus.bpt_stream);
+	if (ret < 0)
+		goto remove_master;
+
+	command = (msg->flags & SDW_MSG_FLAG_WRITE) ? 0 : 1;
+
+	ret = sdw_cdns_bpt_find_buffer_sizes(command,
+					     cdns->bus.params.row,
+					     cdns->bus.params.col,
+					     msg->len,
+					     SDW_BPT_MSG_MAX_BYTES,
+					     &data_per_frame,
+					     &pdi0_buffer_size,
+					     &pdi1_buffer_size,
+					     &num_frames);
+	if (ret < 0)
+		goto deprepare_stream;
+
+	sdw->bpt_ctx.pdi0_buffer_size = pdi0_buffer_size;
+	sdw->bpt_ctx.pdi1_buffer_size = pdi1_buffer_size;
+	sdw->bpt_ctx.num_frames = num_frames;
+	sdw->bpt_ctx.data_per_frame = data_per_frame;
+	tx_dma_bandwidth = div_u64((u64)pdi0_buffer_size * 8 * (u64)prop->default_frame_rate,
+				   num_frames);
+	rx_dma_bandwidth = div_u64((u64)pdi1_buffer_size * 8 * (u64)prop->default_frame_rate,
+				   num_frames);
+
+	dev_dbg(cdns->dev, "Message len %d transferred in %d frames (%d per frame)\n",
+		msg->len, num_frames, data_per_frame);
+	dev_dbg(cdns->dev, "sizes pdi0 %d pdi1 %d tx_bandwidth %d rx_bandwidth %d\n",
+		pdi0_buffer_size, pdi1_buffer_size,
+		tx_dma_bandwidth,
+		rx_dma_bandwidth);
+
+	ret = hda_sdw_bpt_open(cdns->dev->parent, /* PCI device */
+			       sdw->instance,
+			       &sdw->bpt_ctx.bpt_tx_stream,
+			       &sdw->bpt_ctx.dmab_tx_bdl,
+			       pdi0_buffer_size,
+			       tx_dma_bandwidth,
+			       &sdw->bpt_ctx.bpt_rx_stream,
+			       &sdw->bpt_ctx.dmab_rx_bdl,
+			       pdi1_buffer_size,
+			       rx_dma_bandwidth);
+	if (ret < 0) {
+		dev_err(cdns->dev, "%s: hda_sdw_bpt_open failed %d\n", __func__, ret);
+		goto deprepare_stream;
+	}
+
+	if (!command) {
+		ret = sdw_cdns_prepare_write_dma_buffer(
+			msg->dev_num, msg->addr,
+			msg->buf, msg->len,
+			data_per_frame,
+			sdw->bpt_ctx.dmab_tx_bdl.area,
+			pdi0_buffer_size,
+			&tx_total_bytes);
+	} else {
+		ret = sdw_cdns_prepare_read_dma_buffer(
+			msg->dev_num, msg->addr,
+			msg->len,
+			data_per_frame,
+			sdw->bpt_ctx.dmab_tx_bdl.area,
+			pdi0_buffer_size,
+			&tx_total_bytes);
+
+	}
+	if (ret < 0) {
+		dev_err(cdns->dev, "%s: sdw_prepare_%s_dma_buffer failed %d\n",
+			__func__, command ? "read" : "write", ret);
+		goto bpt_close;
+	}
+
+	return 0;
+
+bpt_close:
+	ret1 = hda_sdw_bpt_close(cdns->dev->parent, /* PCI device */
+				 sdw->bpt_ctx.bpt_tx_stream,
+				 &sdw->bpt_ctx.dmab_tx_bdl,
+				 sdw->bpt_ctx.bpt_rx_stream,
+				 &sdw->bpt_ctx.dmab_rx_bdl);
+	if (ret1 < 0)
+		dev_err(cdns->dev, "%s:  hda_sdw_bpt_close failed: ret %d\n",
+			__func__, ret1);
+
+deprepare_stream:
+	sdw_deprepare_stream(cdns->bus.bpt_stream);
+
+remove_master:
+	ret1 = sdw_stream_remove_master(&cdns->bus, cdns->bus.bpt_stream);
+	if (ret1 < 0)
+		dev_err(cdns->dev, "%s: remove master failed: %d\n",
+			__func__, ret1);
+
+remove_slave:
+	ret1 = sdw_stream_remove_slave(slave, cdns->bus.bpt_stream);
+	if (ret1 < 0)
+		dev_err(cdns->dev, "%s: remove slave failed: %d\n",
+			__func__, ret1);
+
+release_stream:
+	sdw_release_stream(cdns->bus.bpt_stream);
+	cdns->bus.bpt_stream = NULL;
+
+	return ret;
+}
+
+static int intel_ace2x_bpt_close_stream(struct sdw_intel *sdw,
+					struct sdw_slave *slave,
+					enum sdw_bpt_type mode,
+					struct sdw_bpt_msg *msg)
+{
+	struct sdw_cdns *cdns = &sdw->cdns;
+	int ret1;
+	int ret;
+
+	if (mode != SDW_BRA)
+		return -EINVAL;
+
+	ret = hda_sdw_bpt_close(cdns->dev->parent, /* PCI device */
+				sdw->bpt_ctx.bpt_tx_stream,
+				&sdw->bpt_ctx.dmab_tx_bdl,
+				sdw->bpt_ctx.bpt_rx_stream,
+				&sdw->bpt_ctx.dmab_rx_bdl);
+	if (ret < 0)
+		dev_err(cdns->dev, "%s:  hda_sdw_bpt_close failed: ret %d\n",
+			__func__, ret);
+
+	ret1 = sdw_deprepare_stream(cdns->bus.bpt_stream);
+	if (ret1 < 0) {
+		dev_err(cdns->dev, "%s: sdw_deprepare_stream failed: ret %d\n",
+			__func__, ret1);
+		if (!ret)
+			ret = ret1;
+	}
+
+	ret1 = sdw_stream_remove_master(&cdns->bus, cdns->bus.bpt_stream);
+	if (ret1 < 0) {
+		dev_err(cdns->dev, "%s: remove master failed: %d\n",
+			__func__, ret1);
+		if (!ret)
+			ret = ret1;
+	}
+
+	ret1 = sdw_stream_remove_slave(slave, cdns->bus.bpt_stream);
+	if (ret1 < 0) {
+		dev_err(cdns->dev, "%s: remove slave failed: %d\n",
+			__func__, ret1);
+		if (!ret)
+			ret = ret1;
+	}
+
+	cdns->bus.bpt_stream = NULL;
+
+	return ret;
+}
+
+static int intel_ace2x_bpt_send_async(struct sdw_intel *sdw,
+				      struct sdw_slave *slave,
+				      struct sdw_bpt_msg *msg)
+{
+	struct sdw_cdns *cdns = &sdw->cdns;
+	int ret;
+
+	dev_dbg(cdns->dev, "BPT Transfer start\n");
+
+	ret = hda_sdw_bpt_send_async(cdns->dev->parent, /* PCI device */
+				     sdw->bpt_ctx.bpt_tx_stream,
+				     sdw->bpt_ctx.bpt_rx_stream);
+	if (ret < 0)
+		return ret;
+
+	ret = sdw_enable_stream(cdns->bus.bpt_stream);
+	if (ret < 0) {
+		dev_err(cdns->dev, "%s: sdw_stream_enable failed: %d\n",
+			__func__, ret);
+		return ret;
+	}
+
+	dev_info(cdns->dev, "%s: done\n", __func__);
+
+	return 0;
+}
+
+static int intel_ace2x_bpt_wait(struct sdw_intel *sdw,
+				struct sdw_slave *slave,
+				struct sdw_bpt_msg *msg)
+{
+	struct sdw_cdns *cdns = &sdw->cdns;
+	int ret;
+
+	dev_dbg(cdns->dev, "BTP Transfer wait\n");
+
+	ret = hda_sdw_bpt_wait(cdns->dev->parent, /* PCI device */
+			       sdw->bpt_ctx.bpt_tx_stream,
+			       sdw->bpt_ctx.bpt_rx_stream);
+	if (ret < 0)
+		dev_err(cdns->dev, "%s: hda_sdw_bpt_wait failed: %d\n", __func__, ret);
+
+	ret = sdw_disable_stream(cdns->bus.bpt_stream);
+	if (ret < 0) {
+		dev_err(cdns->dev, "%s: sdw_stream_enable failed: %d\n",
+			__func__, ret);
+		return ret;
+	}
+
+	if (0) {
+		/* dump DMA TX buffer */
+		print_hex_dump_debug("DMA out: ", DUMP_PREFIX_OFFSET,
+				     4, 4,
+				     sdw->bpt_ctx.dmab_tx_bdl.area,
+				     sdw->bpt_ctx.pdi0_buffer_size,
+				     false);
+
+		print_hex_dump_debug("DMA in: ", DUMP_PREFIX_OFFSET,
+				     4, 4,
+				     sdw->bpt_ctx.dmab_rx_bdl.area,
+				     sdw->bpt_ctx.pdi1_buffer_size,
+				     false);
+	}
+
+	if (msg->flags & SDW_MSG_FLAG_WRITE) {
+		ret = sdw_cdns_check_write_response(cdns->dev,
+						    sdw->bpt_ctx.dmab_rx_bdl.area,
+						    sdw->bpt_ctx.pdi1_buffer_size,
+						    sdw->bpt_ctx.num_frames);
+		if (ret < 0) {
+			dev_err(cdns->dev, "%s: BPT Write failed %d\n", __func__, ret);
+			return ret;
+		}
+	} else {
+		ret = sdw_cdns_check_read_response(cdns->dev,
+						   sdw->bpt_ctx.dmab_rx_bdl.area,
+						   sdw->bpt_ctx.pdi1_buffer_size,
+						   msg->buf, msg->len,
+						   sdw->bpt_ctx.num_frames,
+						   sdw->bpt_ctx.data_per_frame);
+		if (ret < 0) {
+			dev_err(cdns->dev, "%s: BPT Read failed %d\n", __func__, ret);
+			return ret;
+		}
+	}
+
+	dev_dbg(cdns->dev, "BPT Transfer done\n");
+
+	return 0;
+}
+
 /*
  * shim vendor-specific (vs) ops
  */
@@ -692,7 +1063,13 @@ const struct sdw_intel_hw_ops sdw_intel_lnl_hw_ops = {
 	.sync_check_cmdsync_unlocked = intel_check_cmdsync_unlocked,
 
 	.program_sdi = intel_program_sdi,
+
+	.bpt_open_stream = intel_ace2x_bpt_open_stream,
+	.bpt_close_stream = intel_ace2x_bpt_close_stream,
+	.bpt_send_async = intel_ace2x_bpt_send_async,
+	.bpt_wait = intel_ace2x_bpt_wait,
 };
 EXPORT_SYMBOL_NS(sdw_intel_lnl_hw_ops, SOUNDWIRE_INTEL);
 
 MODULE_IMPORT_NS(SND_SOC_SOF_HDA_MLINK);
+MODULE_IMPORT_NS(SND_SOC_SOF_INTEL_HDA_SDW_BPT);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [RFC PATCH 15/16] soundwire: debugfs: add interface for BPT/BRA transfers
  2023-12-07 22:29 [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Pierre-Louis Bossart
                   ` (13 preceding siblings ...)
  2023-12-07 22:29 ` [RFC PATCH 14/16] soundwire: intel_ace2x: add BPT open/close/send_async/wait Pierre-Louis Bossart
@ 2023-12-07 22:29 ` Pierre-Louis Bossart
  2023-12-07 22:29 ` [RFC PATCH 16/16] ASoC: rt711-sdca: add DP0 support Pierre-Louis Bossart
  2023-12-07 22:56 ` [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Mark Brown
  16 siblings, 0 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-07 22:29 UTC (permalink / raw)
  To: linux-sound
  Cc: alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou, Pierre-Louis Bossart

Add code to show what codec drivers will need to do to enable BPT/BRA
transfers. The only difference is to set the 'command_type' file to
1. A zero-value will rely on regular read/write commands in Column0.

For now the code will only return error codes since the Manager side
is not enabled yet.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 drivers/soundwire/debugfs.c | 122 +++++++++++++++++++++++++++++++-----
 1 file changed, 106 insertions(+), 16 deletions(-)

diff --git a/drivers/soundwire/debugfs.c b/drivers/soundwire/debugfs.c
index 6d253d69871d..de35f5beec4d 100644
--- a/drivers/soundwire/debugfs.c
+++ b/drivers/soundwire/debugfs.c
@@ -138,9 +138,10 @@ static int sdw_slave_reg_show(struct seq_file *s_file, void *data)
 }
 DEFINE_SHOW_ATTRIBUTE(sdw_slave_reg);
 
-#define MAX_CMD_BYTES 256
+#define MAX_CMD_BYTES (1024 * 1024)
 
 static int cmd;
+static int cmd_type;
 static u32 start_addr;
 static size_t num_bytes;
 static u8 read_buffer[MAX_CMD_BYTES];
@@ -164,6 +165,25 @@ static int set_command(void *data, u64 value)
 DEFINE_DEBUGFS_ATTRIBUTE(set_command_fops, NULL,
 			 set_command, "%llu\n");
 
+static int set_command_type(void *data, u64 value)
+{
+	struct sdw_slave *slave = data;
+
+	if (value > 1)
+		return -EINVAL;
+
+	/* Userspace changed the hardware state behind the kernel's back */
+	add_taint(TAINT_USER, LOCKDEP_STILL_OK);
+
+	dev_dbg(&slave->dev, "command type: %s\n", value ? "BRA" : "Column0");
+
+	cmd_type = value;
+
+	return 0;
+}
+DEFINE_DEBUGFS_ATTRIBUTE(set_command_type_fops, NULL,
+			 set_command_type, "%llu\n");
+
 static int set_start_address(void *data, u64 value)
 {
 	struct sdw_slave *slave = data;
@@ -199,9 +219,64 @@ static int set_num_bytes(void *data, u64 value)
 DEFINE_DEBUGFS_ATTRIBUTE(set_num_bytes_fops, NULL,
 			 set_num_bytes, "%llu\n");
 
+static int do_bpt_sequence(struct sdw_slave *slave, bool write,
+			   u8 *buffer)
+{
+	struct sdw_bpt_msg msg = {0};
+	int ret1;
+	int ret;
+
+	msg.addr = start_addr;
+	msg.len = num_bytes;
+	msg.dev_num = slave->dev_num;
+	if (write)
+		msg.flags = SDW_MSG_FLAG_WRITE;
+	else
+		msg.flags = SDW_MSG_FLAG_READ;
+	msg.buf = buffer;
+
+	ret = sdw_bpt_open_stream(slave->bus, slave, SDW_BRA, &msg);
+	if (ret < 0) {
+		dev_err(&slave->dev, "bpt_open_stream failed: %d\n", ret);
+		return ret;
+	}
+
+	ret = sdw_bpt_send_async(slave->bus,
+				 slave,
+				 &msg);
+	if (ret < 0) {
+		dev_err(&slave->dev, "bpt_send_async failed: %d\n", ret);
+		goto close;
+	}
+
+	dev_dbg(&slave->dev, "send_async done\n");
+
+	ret = sdw_bpt_wait(slave->bus,
+			   slave,
+			   &msg);
+	if (ret < 0)
+		dev_err(&slave->dev, "%s: bpt_wait failed: %d\n", __func__, ret);
+
+	dev_dbg(&slave->dev, "wait done\n");
+
+close:
+	ret1 = sdw_bpt_close_stream(slave->bus, slave, SDW_BRA, &msg);
+	if (ret1 < 0) {
+		dev_err(&slave->dev, "%s: bpt_close_stream failed: %d\n",
+			__func__, ret1);
+		if (!ret)
+			ret = ret1;
+	}
+
+	return ret;
+}
+
 static int cmd_go(void *data, u64 value)
 {
+	const struct firmware *fw = NULL;
 	struct sdw_slave *slave = data;
+	ktime_t start_t;
+	ktime_t finish_t;
 	int ret;
 
 	if (value != 1)
@@ -218,40 +293,54 @@ static int cmd_go(void *data, u64 value)
 		return ret;
 	}
 
-	/* Userspace changed the hardware state behind the kernel's back */
-	add_taint(TAINT_USER, LOCKDEP_STILL_OK);
-
-	dev_dbg(&slave->dev, "starting command\n");
-
 	if (cmd == 0) {
-		const struct firmware *fw;
-
 		ret = request_firmware(&fw, firmware_file, &slave->dev);
 		if (ret < 0) {
 			dev_err(&slave->dev, "firmware %s not found\n", firmware_file);
 			goto out;
 		}
-
-		if (fw->size != num_bytes) {
+		if (fw->size < num_bytes) {
 			dev_err(&slave->dev,
-				"firmware %s: unexpected size %zd, desired %zd\n",
+				"firmware %s: firmware size %zd, desired %zd\n",
 				firmware_file, fw->size, num_bytes);
-			release_firmware(fw);
 			goto out;
 		}
+	}
 
-		ret = sdw_nwrite_no_pm(slave, start_addr, num_bytes, fw->data);
-		release_firmware(fw);
+	/* Userspace changed the hardware state behind the kernel's back */
+	add_taint(TAINT_USER, LOCKDEP_STILL_OK);
+
+	dev_dbg(&slave->dev, "starting command\n");
+	start_t = ktime_get();
+
+	if (cmd == 0) {
+		if (cmd_type) {
+			ret = do_bpt_sequence(slave, true, (u8 *)fw->data);
+		} else {
+			ret = sdw_nwrite_no_pm(slave, start_addr, num_bytes, fw->data);
+		}
 	} else {
-		ret = sdw_nread_no_pm(slave, start_addr, num_bytes, read_buffer);
+		memset(read_buffer, 0, sizeof(read_buffer));
+
+		if (cmd_type) {
+			ret = do_bpt_sequence(slave, false, read_buffer);
+		} else {
+			ret = sdw_nread_no_pm(slave, start_addr, num_bytes, read_buffer);
+		}
 	}
 
-	dev_dbg(&slave->dev, "command completed %d\n", ret);
+	finish_t = ktime_get();
 
 out:
+	if (fw)
+		release_firmware(fw);
+
 	pm_runtime_mark_last_busy(&slave->dev);
 	pm_runtime_put(&slave->dev);
 
+	dev_dbg(&slave->dev, "command completed, status %d, time %lld ms\n",
+		ret, div_u64(finish_t - start_t, NSEC_PER_MSEC));
+
 	return ret;
 }
 DEFINE_DEBUGFS_ATTRIBUTE(cmd_go_fops, NULL,
@@ -293,6 +382,7 @@ void sdw_slave_debugfs_init(struct sdw_slave *slave)
 
 	/* interface to send arbitrary commands */
 	debugfs_create_file("command", 0200, d, slave, &set_command_fops);
+	debugfs_create_file("command_type", 0200, d, slave, &set_command_type_fops);
 	debugfs_create_file("start_address", 0200, d, slave, &set_start_address_fops);
 	debugfs_create_file("num_bytes", 0200, d, slave, &set_num_bytes_fops);
 	debugfs_create_file("go", 0200, d, slave, &cmd_go_fops);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [RFC PATCH 16/16] ASoC: rt711-sdca: add DP0 support
  2023-12-07 22:29 [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Pierre-Louis Bossart
                   ` (14 preceding siblings ...)
  2023-12-07 22:29 ` [RFC PATCH 15/16] soundwire: debugfs: add interface for BPT/BRA transfers Pierre-Louis Bossart
@ 2023-12-07 22:29 ` Pierre-Louis Bossart
  2023-12-07 22:56 ` [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Mark Brown
  16 siblings, 0 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-07 22:29 UTC (permalink / raw)
  To: linux-sound
  Cc: alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou, Pierre-Louis Bossart

DP0 is required for BPT/BRA transport.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 sound/soc/codecs/rt711-sdca-sdw.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/sound/soc/codecs/rt711-sdca-sdw.c b/sound/soc/codecs/rt711-sdca-sdw.c
index 935e597022d3..583c839e0e49 100644
--- a/sound/soc/codecs/rt711-sdca-sdw.c
+++ b/sound/soc/codecs/rt711-sdca-sdw.c
@@ -225,6 +225,14 @@ static int rt711_sdca_read_prop(struct sdw_slave *slave)
 		j++;
 	}
 
+	prop->dp0_prop = devm_kzalloc(&slave->dev, sizeof(*prop->dp0_prop),
+				      GFP_KERNEL);
+	if (!prop->dp0_prop)
+		return -ENOMEM;
+
+	prop->dp0_prop->simple_ch_prep_sm = true;
+	prop->dp0_prop->ch_prep_timeout = 10;
+
 	/* set the timeout values */
 	prop->clk_stop_timeout = 700;
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol
  2023-12-07 22:29 [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Pierre-Louis Bossart
                   ` (15 preceding siblings ...)
  2023-12-07 22:29 ` [RFC PATCH 16/16] ASoC: rt711-sdca: add DP0 support Pierre-Louis Bossart
@ 2023-12-07 22:56 ` Mark Brown
  2023-12-07 23:06   ` Pierre-Louis Bossart
  16 siblings, 1 reply; 57+ messages in thread
From: Mark Brown @ 2023-12-07 22:56 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: linux-sound, alsa-devel, tiwai, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou

[-- Attachment #1: Type: text/plain, Size: 491 bytes --]

On Thu, Dec 07, 2023 at 04:29:28PM -0600, Pierre-Louis Bossart wrote:

> The MIPI specification and most of the new codecs support the Bulk
> Transfer Protocol (BTP) and specifically the Bulk Register Access
> (BRA) configuration. This mode reclaims the 'audio' data space of the
> SoundWire frame to send firmware/coefficients over the DataPort 0
> (DP0).

So the bulk register access is accessing registers that are also visible
through the one register at at time interface, just faster?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol
  2023-12-07 22:56 ` [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Mark Brown
@ 2023-12-07 23:06   ` Pierre-Louis Bossart
  0 siblings, 0 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-07 23:06 UTC (permalink / raw)
  To: Mark Brown
  Cc: linux-sound, alsa-devel, tiwai, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou



On 12/7/23 16:56, Mark Brown wrote:
> On Thu, Dec 07, 2023 at 04:29:28PM -0600, Pierre-Louis Bossart wrote:
> 
>> The MIPI specification and most of the new codecs support the Bulk
>> Transfer Protocol (BTP) and specifically the Bulk Register Access
>> (BRA) configuration. This mode reclaims the 'audio' data space of the
>> SoundWire frame to send firmware/coefficients over the DataPort 0
>> (DP0).
> 
> So the bulk register access is accessing registers that are also visible
> through the one register at at time interface, just faster?

Yes, each frame can transmit a packet with a start address, length and a
bunch of data bytes protected with a CRC. With the default 50x4 frame
size we use, we can send 8 contiguous bytes per frame instead of 1. With
a larger frame you get even more bytes per frame.

Also because we program a large buffer with all the packets
pre-formatted by software, we don't have much software overhead. The
packets are streamed over DMA and inserted in the frame by hardware at
the relevant time. That means waiting for one DMA complete event instead
of dealing with thousands of command/responses with interrupts.

There are limitations though, if the frame is already transmitting audio
data then obviously we have a conflict.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description
  2023-12-07 22:29 ` [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description Pierre-Louis Bossart
@ 2023-12-07 23:29   ` Mark Brown
  2023-12-08  0:56     ` Pierre-Louis Bossart
  2023-12-08 16:27   ` Charles Keepax
  2023-12-18 11:40   ` Vinod Koul
  2 siblings, 1 reply; 57+ messages in thread
From: Mark Brown @ 2023-12-07 23:29 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: linux-sound, alsa-devel, tiwai, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou

[-- Attachment #1: Type: text/plain, Size: 5359 bytes --]

On Thu, Dec 07, 2023 at 04:29:29PM -0600, Pierre-Louis Bossart wrote:

> +Addressing restrictions
> +-----------------------
> +
> +The Device Number specified in the Header follows the SoundWire
> +definitions, and broadcast and group addressing are
> +permitted. However, in reality it is very unlikely that the exact same
> +binary data needs to be provided to the two different Peripheral
> +devices. The Linux implementation only allows for transfers to a
> +single device at a time.

For the firmware download case it seems likely that this won't always be
the case, but it's definitely a thing that could be done incrementally.

> +Regmap use
> +~~~~~~~~~~

> +Existing codec drivers rely on regmap to download firmware to
> +Peripherals, so at a high-level it would seem natural to combine BRA
> +and regmap. The regmap layer could check if BRA is available or not,
> +and use a regular read-write command channel in the latter case.

> +However, the regmap layer does not have information on latency or how
> +critical a BRA transfer is. It would make more sense to let the codec
> +driver make decisions (wait, timeout, fallback to regular
> +read/writes).

These don't seem like insurmountable obstacles - they feel more like a
copy break kind of situation where we can infer things from the pattern
of transactions, and there's always the possibility of adding some calls
to give regmap more high level information about the overall state of
the device.  One of the expected usage patterns with cache only mode is
to build up a final register state then let the cache code work out the
optimal pattern to actually write that out.

> +In addition, the hardware may lose context and ask for the firmware to
> +be downloaded again. The firmware is not however a fixed definition,
> +the SDCA definition allows the hardware to request an updated firmware
> +or a different coefficient table to deal with specific environment
> +conditions. In other words, traditional regmap cache management is not
> +good enough, the Peripheral driver is required track hardware
> +notifications and react accordingly.

I might be missing something but those requests for redownload sound
like things that would occur regardless of the mechanism used to perform
the I/O?

> +Concurrency between BRA and regular read/write
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

> +The existing 'nread/nwrite' API already relies on a notion of start
> +address and number of bytes, so it would be possible to extend this
> +API with a 'hint' requesting BPT/BRA be used.

> +However BRA transfers could be quite long, and the use of a single
> +mutex for regular read/write and BRA is a show-stopper. Independent
> +operation of the control/command and BRA transfers is a fundamental
> +requirement, e.g. to change the volume level with the existing regmap
> +interface while downloading firmware.

> +This places the burden on the codec driver to verify that there is no
> +concurrent access to the same address with the command/control
> +protocol and the BRA protocol.

This makes me feel very nervous about robustness.  I do note that regmap
has an async interface which is currently only used for SPI and really
only by the Cirrus DSPs (plus opportunisticly by rbtree cache sync), the
original usage was to fill the pipleine of SPI controllers so we can
ideally push data entirely from interrupt.  TBH that was done before SMP
became standard and it's a lot less clear in this day and age that this
is actually helpful, the overhead of cross core synchronisation likely
obliterates any benefit.  There's sufficently few users that we could
make API changes readily to fit better.

> +One possible strategy to speed-up all initialization tasks would be to
> +start a BRA transfer for firmware download, then deal with all the
> +"regular" read/writes in parallel with the command channel, and last
> +to wait for the BRA transfers to complete. This would allow for a
> +degree of overlap instead of a purely sequential solution. As a
> +results, the BRA API must support async transfers and expose a
> +separate wait function.

This feels a lot like it could map onto the regmap async interface, it
would need a bit of a rework (mainly that currently they provide
ordering guarantees with respect to both each other and sync I/O) but
those could be removed with some care) but it's got the "here's a list
of I/O, here's another call to ensure it's all actually happened" thing.
It sounds very much like how I was thinking of the async API when I was
writing it and the initial users.

It's this bit that really got me thinking it could fit well into regmap.

> +Error handling
> +~~~~~~~~~~~~~~
> +
> +The expected response to a 'bra_message' and follow-up behavior may
> +vary:
> +
> +   (1) A Peripheral driver may want to receive an immediate -EBUSY
> +       response if the BRA protocol is not available at a given time.
> +
> +   (2) A Peripheral driver may want to wait until a timeout for the
> +       on-going transfer to be handled
> +
> +   (3) A Peripheral driver may want to wait until existing BRA
> +       transfers complete or deal with BRA as a background task when
> +       audio transfers stop. In this case, there would be no timeout,
> +       and the operation may not happen if the platform is suspended.

Option 3 would be a jump for regmap.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description
  2023-12-07 23:29   ` Mark Brown
@ 2023-12-08  0:56     ` Pierre-Louis Bossart
  2023-12-08 21:49       ` Mark Brown
  0 siblings, 1 reply; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-08  0:56 UTC (permalink / raw)
  To: Mark Brown
  Cc: linux-sound, alsa-devel, tiwai, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou

Thanks for the comments Mark, much appreciated.

>> +Addressing restrictions
>> +-----------------------
>> +
>> +The Device Number specified in the Header follows the SoundWire
>> +definitions, and broadcast and group addressing are
>> +permitted. However, in reality it is very unlikely that the exact same
>> +binary data needs to be provided to the two different Peripheral
>> +devices. The Linux implementation only allows for transfers to a
>> +single device at a time.
> 
> For the firmware download case it seems likely that this won't always be
> the case, but it's definitely a thing that could be done incrementally.

One example discussed this week on the mailing list is the Cirrus Logic
case, where the firmware contains information on which speakers a given
amplifier should be doing, and each firmware is named as AMP-n.

I have really not found any practical case where the same data needs to
be sent to N devices, and I don't have a burning desire to tie the codec
initialization together with all the asynchronous nature of
enumeration/probe.

>> +Regmap use
>> +~~~~~~~~~~
> 
>> +Existing codec drivers rely on regmap to download firmware to
>> +Peripherals, so at a high-level it would seem natural to combine BRA
>> +and regmap. The regmap layer could check if BRA is available or not,
>> +and use a regular read-write command channel in the latter case.
> 
>> +However, the regmap layer does not have information on latency or how
>> +critical a BRA transfer is. It would make more sense to let the codec
>> +driver make decisions (wait, timeout, fallback to regular
>> +read/writes).
> 
> These don't seem like insurmountable obstacles - they feel more like a
> copy break kind of situation where we can infer things from the pattern
> of transactions, and there's always the possibility of adding some calls
> to give regmap more high level information about the overall state of
> the device.  One of the expected usage patterns with cache only mode is
> to build up a final register state then let the cache code work out the
> optimal pattern to actually write that out.

I did expect some pushback on regmap :-)

The point is really that the main use for this download is a set of
write-once memory areas which happen to be mapped as registers. There is
no real need to declare or access each memory address individually.

In addition in terms of error handling, the expectation is that the set
of writes are handled in a pass/fail manner. There's no real way to know
which of the individual writes failed.

>> +In addition, the hardware may lose context and ask for the firmware to
>> +be downloaded again. The firmware is not however a fixed definition,
>> +the SDCA definition allows the hardware to request an updated firmware
>> +or a different coefficient table to deal with specific environment
>> +conditions. In other words, traditional regmap cache management is not
>> +good enough, the Peripheral driver is required track hardware
>> +notifications and react accordingly.
> 
> I might be missing something but those requests for redownload sound
> like things that would occur regardless of the mechanism used to perform
> the I/O?

What I meant is that the codec driver would e.g. need to fetch a
different firmware table and download it. There's no real need to
maintain a cache on the host side since the entire table will be downloaded.

>> +Concurrency between BRA and regular read/write
>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
>> +The existing 'nread/nwrite' API already relies on a notion of start
>> +address and number of bytes, so it would be possible to extend this
>> +API with a 'hint' requesting BPT/BRA be used.
> 
>> +However BRA transfers could be quite long, and the use of a single
>> +mutex for regular read/write and BRA is a show-stopper. Independent
>> +operation of the control/command and BRA transfers is a fundamental
>> +requirement, e.g. to change the volume level with the existing regmap
>> +interface while downloading firmware.
> 
>> +This places the burden on the codec driver to verify that there is no
>> +concurrent access to the same address with the command/control
>> +protocol and the BRA protocol.
> 
> This makes me feel very nervous about robustness.  I do note that regmap
> has an async interface which is currently only used for SPI and really
> only by the Cirrus DSPs (plus opportunisticly by rbtree cache sync), the
> original usage was to fill the pipleine of SPI controllers so we can
> ideally push data entirely from interrupt.  TBH that was done before SMP
> became standard and it's a lot less clear in this day and age that this
> is actually helpful, the overhead of cross core synchronisation likely
> obliterates any benefit.  There's sufficently few users that we could
> make API changes readily to fit better.

I am not that nervous, it's a known hardware issue and the software
drivers SHALL make sure that the two methods of accessing a register are
not used concurrently. We could add some sort of mutex on specific
memory areas.

>> +One possible strategy to speed-up all initialization tasks would be to
>> +start a BRA transfer for firmware download, then deal with all the
>> +"regular" read/writes in parallel with the command channel, and last
>> +to wait for the BRA transfers to complete. This would allow for a
>> +degree of overlap instead of a purely sequential solution. As a
>> +results, the BRA API must support async transfers and expose a
>> +separate wait function.
> 
> This feels a lot like it could map onto the regmap async interface, it
> would need a bit of a rework (mainly that currently they provide
> ordering guarantees with respect to both each other and sync I/O) but
> those could be removed with some care) but it's got the "here's a list
> of I/O, here's another call to ensure it's all actually happened" thing.
> It sounds very much like how I was thinking of the async API when I was
> writing it and the initial users.
> 
> It's this bit that really got me thinking it could fit well into regmap.

I really don't know anything about this async interface, if you have
pointers on existing examples I am all ears....I am not aware of any
Intel platform or codec used on an Intel platform making use of this API.

At any rate the low-level behavior is to
a) allocate and configure all the SoundWire resources
b) allocate and configure all the DMA resources
c) trigger DMA and enable SoundWire transfers
d) wait for the DMA to complete

The codec API can be modified as needed, as long as there are provisions
for these 4 steps.

>> +Error handling
>> +~~~~~~~~~~~~~~
>> +
>> +The expected response to a 'bra_message' and follow-up behavior may
>> +vary:
>> +
>> +   (1) A Peripheral driver may want to receive an immediate -EBUSY
>> +       response if the BRA protocol is not available at a given time.
>> +
>> +   (2) A Peripheral driver may want to wait until a timeout for the
>> +       on-going transfer to be handled
>> +
>> +   (3) A Peripheral driver may want to wait until existing BRA
>> +       transfers complete or deal with BRA as a background task when
>> +       audio transfers stop. In this case, there would be no timeout,
>> +       and the operation may not happen if the platform is suspended.
> 
> Option 3 would be a jump for regmap.

Sorry, I don't get what a 'jump' means in this context.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description
  2023-12-07 22:29 ` [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description Pierre-Louis Bossart
  2023-12-07 23:29   ` Mark Brown
@ 2023-12-08 16:27   ` Charles Keepax
  2023-12-08 18:45     ` Pierre-Louis Bossart
  2023-12-18 11:40   ` Vinod Koul
  2 siblings, 1 reply; 57+ messages in thread
From: Charles Keepax @ 2023-12-08 16:27 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: linux-sound, alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Richard Fitzgerald, Shuming Fan, Jack Yu, Oder Chiou

On Thu, Dec 07, 2023 at 04:29:29PM -0600, Pierre-Louis Bossart wrote:
> The Bulk Register Access protocol was left as a TODO topic since
> 2018. It's time to document this protocol and the design of its Linux
> support.
> 
> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> ---
> +Concurrency between BRA and regular read/write
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The existing 'nread/nwrite' API already relies on a notion of start
> +address and number of bytes, so it would be possible to extend this
> +API with a 'hint' requesting BPT/BRA be used.
> +
> +However BRA transfers could be quite long, and the use of a single
> +mutex for regular read/write and BRA is a show-stopper. Independent
> +operation of the control/command and BRA transfers is a fundamental
> +requirement, e.g. to change the volume level with the existing regmap
> +interface while downloading firmware.

Is this definitely a show stopper? Not saying that it wouldn't be
desirable to do both from a speed perspective, but current
systems that download firmware (I2C/SPI) typically will block the
bus for some amount of time. There are also some desirable properties
to a single lock such as not needing to worry about accessing the
same register in the bulk transfer and a normal command transfer.

> +Audio DMA support
> +-----------------
> +
> +Some DMAs, such as HDaudio, require an audio format field to be
> +set. This format is in turn used to define acceptable bursts. BPT/BRA
> +support is not fully compatible with these definitions in that the
> +format may vary between read and write commands.
> +
> +In addition, on Intel HDaudio Intel platforms the DMAs need to be
> +programmed with a PCM format matching the bandwidth of the BPT/BRA
> +transfer. The format is based on 48kHz 32-bit samples, and the number
> +of channels varies to adjust the bandwidth. The notion of channel is
> +completely notional since the data is not typical audio
> +PCM. Programming channels helps reserve enough bandwidth and adjust
> +FIFO sizes to avoid xruns. Note that the quality of service comes as a
> +cost. Since all channels need to be present as a sample block, data
> +sizes not aligned to 128-bytes are not supported.

Apologies but could you elaborate a litte on this? I am not sure
I follow the reasoning, how does the 48k 32bit DMA implementation
result in 128-byte limitation? I would have thought 1 channel would
be 4-bytes and you are varying the channels so I would have expected
4-byte aligned maybe 8-byte if the DMA expects stereo pairs.

And what exactly do we mean by aligned, are we saying the length
all transfers needs to be a multiple of 128-bytes?

I think we might have some annoying restrictions on the block
size on our hardware as well I will go dig into that and report
back.

Thanks,
Charles

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description
  2023-12-08 16:27   ` Charles Keepax
@ 2023-12-08 18:45     ` Pierre-Louis Bossart
  2023-12-08 18:55       ` Mark Brown
  0 siblings, 1 reply; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-08 18:45 UTC (permalink / raw)
  To: Charles Keepax
  Cc: linux-sound, alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Richard Fitzgerald, Shuming Fan, Jack Yu, Oder Chiou



On 12/8/23 10:27, Charles Keepax wrote:
> On Thu, Dec 07, 2023 at 04:29:29PM -0600, Pierre-Louis Bossart wrote:
>> The Bulk Register Access protocol was left as a TODO topic since
>> 2018. It's time to document this protocol and the design of its Linux
>> support.
>>
>> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
>> ---
>> +Concurrency between BRA and regular read/write
>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +The existing 'nread/nwrite' API already relies on a notion of start
>> +address and number of bytes, so it would be possible to extend this
>> +API with a 'hint' requesting BPT/BRA be used.
>> +
>> +However BRA transfers could be quite long, and the use of a single
>> +mutex for regular read/write and BRA is a show-stopper. Independent
>> +operation of the control/command and BRA transfers is a fundamental
>> +requirement, e.g. to change the volume level with the existing regmap
>> +interface while downloading firmware.
> 
> Is this definitely a show stopper? Not saying that it wouldn't be
> desirable to do both from a speed perspective, but current
> systems that download firmware (I2C/SPI) typically will block the
> bus for some amount of time. There are also some desirable properties
> to a single lock such as not needing to worry about accessing the
> same register in the bulk transfer and a normal command transfer.

There are other cases where we do want to interact with the device even
if we are in the middle of downloading stuff. We currently have a
msg_lock that's used to prevent multiple read/writes from being sent on
the bus, I really don't think it would be wise to use this same
"lightweight" lock for much longer transfers. This could impact response
to alerts, changes in state reported by the hardware, etc.

>> +Audio DMA support
>> +-----------------
>> +
>> +Some DMAs, such as HDaudio, require an audio format field to be
>> +set. This format is in turn used to define acceptable bursts. BPT/BRA
>> +support is not fully compatible with these definitions in that the
>> +format may vary between read and write commands.
>> +
>> +In addition, on Intel HDaudio Intel platforms the DMAs need to be
>> +programmed with a PCM format matching the bandwidth of the BPT/BRA
>> +transfer. The format is based on 48kHz 32-bit samples, and the number
>> +of channels varies to adjust the bandwidth. The notion of channel is
>> +completely notional since the data is not typical audio
>> +PCM. Programming channels helps reserve enough bandwidth and adjust
>> +FIFO sizes to avoid xruns. Note that the quality of service comes as a
>> +cost. Since all channels need to be present as a sample block, data
>> +sizes not aligned to 128-bytes are not supported.
> 
> Apologies but could you elaborate a litte on this? I am not sure
> I follow the reasoning, how does the 48k 32bit DMA implementation
> result in 128-byte limitation? I would have thought 1 channel would
> be 4-bytes and you are varying the channels so I would have expected
> 4-byte aligned maybe 8-byte if the DMA expects stereo pairs.

With the 50x4 default frame size that we're using, a BTP/BRA write
requires 13.824 Mbits/s for TX and 3.072Mbits/s for RX. We have to treat
these "non-audio' streams as respectively 10 and 2 channels of 48kHz
32bit data to reserve the equivalent PCM bandwidth, and that creates an
alignment requirement since the DMA expects all 'channels' to be present
in the same block.

We could try and play with parameters, e.g. I am not sure what happens
if we used 192kHz, maybe that reduces the alignment back to 32 bytes.
But the point is arbitrary sizes *cannot* be handled.

The HDaudio spec also mentions that for efficiency reason it's strongly
recommended to use 128-byte multiples.

> And what exactly do we mean by aligned, are we saying the length
> all transfers needs to be a multiple of 128-bytes?

Yes.

> I think we might have some annoying restrictions on the block
> size on our hardware as well I will go dig into that and report
> back.

That would be very useful indeed, we have to make sure this alignment is
reasonably supported across hosts and devices.
If we can't agree we'll have to make the alignment host-dependent, but
that will make the logic in codec drivers more complicated.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description
  2023-12-08 18:45     ` Pierre-Louis Bossart
@ 2023-12-08 18:55       ` Mark Brown
  0 siblings, 0 replies; 57+ messages in thread
From: Mark Brown @ 2023-12-08 18:55 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: Charles Keepax, linux-sound, alsa-devel, tiwai, vinod.koul,
	Bard liao, Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Richard Fitzgerald, Shuming Fan, Jack Yu, Oder Chiou

[-- Attachment #1: Type: text/plain, Size: 635 bytes --]

On Fri, Dec 08, 2023 at 12:45:21PM -0600, Pierre-Louis Bossart wrote:
> On 12/8/23 10:27, Charles Keepax wrote:

> > I think we might have some annoying restrictions on the block
> > size on our hardware as well I will go dig into that and report
> > back.

> That would be very useful indeed, we have to make sure this alignment is
> reasonably supported across hosts and devices.
> If we can't agree we'll have to make the alignment host-dependent, but
> that will make the logic in codec drivers more complicated.

Hopefully even if it's complicated it's something that can be factored
out into library code rather than replicated.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description
  2023-12-08  0:56     ` Pierre-Louis Bossart
@ 2023-12-08 21:49       ` Mark Brown
  2023-12-19 16:50         ` Pierre-Louis Bossart
  0 siblings, 1 reply; 57+ messages in thread
From: Mark Brown @ 2023-12-08 21:49 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: linux-sound, alsa-devel, tiwai, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou

[-- Attachment #1: Type: text/plain, Size: 5044 bytes --]

On Thu, Dec 07, 2023 at 06:56:55PM -0600, Pierre-Louis Bossart wrote:

> >> +The Device Number specified in the Header follows the SoundWire
> >> +definitions, and broadcast and group addressing are
> >> +permitted. However, in reality it is very unlikely that the exact same
> >> +binary data needs to be provided to the two different Peripheral
> >> +devices. The Linux implementation only allows for transfers to a
> >> +single device at a time.

> > For the firmware download case it seems likely that this won't always be
> > the case, but it's definitely a thing that could be done incrementally.

> One example discussed this week on the mailing list is the Cirrus Logic
> case, where the firmware contains information on which speakers a given
> amplifier should be doing, and each firmware is named as AMP-n.

I can imagine a vendor structuring things so they've got separate
firmware and coefficent/configuration images, the firmware images could
be shared.

> I have really not found any practical case where the same data needs to
> be sent to N devices, and I don't have a burning desire to tie the codec
> initialization together with all the asynchronous nature of
> enumeration/probe.

Like I say it does seem like something that could be done incrementally.

> > These don't seem like insurmountable obstacles - they feel more like a
> > copy break kind of situation where we can infer things from the pattern
> > of transactions, and there's always the possibility of adding some calls
> > to give regmap more high level information about the overall state of
> > the device.  One of the expected usage patterns with cache only mode is
> > to build up a final register state then let the cache code work out the
> > optimal pattern to actually write that out.

> I did expect some pushback on regmap :-)

> The point is really that the main use for this download is a set of
> write-once memory areas which happen to be mapped as registers. There is
> no real need to declare or access each memory address individually.

Yeah, normally it's just write only stuff but I've seen things like
controls being added in the DSP memory before - the 

> In addition in terms of error handling, the expectation is that the set
> of writes are handled in a pass/fail manner. There's no real way to know
> which of the individual writes failed.

That's the case for any block writes.

> > I might be missing something but those requests for redownload sound
> > like things that would occur regardless of the mechanism used to perform
> > the I/O?

> What I meant is that the codec driver would e.g. need to fetch a
> different firmware table and download it. There's no real need to
> maintain a cache on the host side since the entire table will be downloaded.

I mean, if that's the usage pattern surely things would just be marked
volatile anyway?  A cache is entirely optional.

> > This feels a lot like it could map onto the regmap async interface, it
> > would need a bit of a rework (mainly that currently they provide
> > ordering guarantees with respect to both each other and sync I/O) but
> > those could be removed with some care) but it's got the "here's a list
> > of I/O, here's another call to ensure it's all actually happened" thing.
> > It sounds very much like how I was thinking of the async API when I was
> > writing it and the initial users.

> > It's this bit that really got me thinking it could fit well into regmap.

> I really don't know anything about this async interface, if you have
> pointers on existing examples I am all ears....I am not aware of any
> Intel platform or codec used on an Intel platform making use of this API.

grep for regmap_.*async - cs_dsp.c is the upstream example in a driver,
or there's the rbtree cache sync code which uses a back door to go into
an async mode.  Basically just variants of all the normal regmap I/O
calls with a _complete() call you can use to wait for everything to
happen.  The implementation is a bit heavyweight since it was written to
work with fairly slow buses.

> At any rate the low-level behavior is to
> a) allocate and configure all the SoundWire resources
> b) allocate and configure all the DMA resources
> c) trigger DMA and enable SoundWire transfers
> d) wait for the DMA to complete

> The codec API can be modified as needed, as long as there are provisions
> for these 4 steps.

That's not quite how the current API works, but it feels like it's close
enough to the intent and there's literally one user of the actual API.

> >> +   (3) A Peripheral driver may want to wait until existing BRA
> >> +       transfers complete or deal with BRA as a background task when
> >> +       audio transfers stop. In this case, there would be no timeout,
> >> +       and the operation may not happen if the platform is suspended.

> > Option 3 would be a jump for regmap.

> Sorry, I don't get what a 'jump' means in this context.

Big change.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 07/16] soundwire: bus: add API for BPT protocol
  2023-12-07 22:29 ` [RFC PATCH 07/16] soundwire: bus: add API for BPT protocol Pierre-Louis Bossart
@ 2023-12-12 12:19   ` Charles Keepax
  2023-12-18 10:38     ` Pierre-Louis Bossart
  2023-12-18 11:54   ` Vinod Koul
  1 sibling, 1 reply; 57+ messages in thread
From: Charles Keepax @ 2023-12-12 12:19 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: linux-sound, alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Richard Fitzgerald, Shuming Fan, Jack Yu, Oder Chiou

On Thu, Dec 07, 2023 at 04:29:35PM -0600, Pierre-Louis Bossart wrote:
> This patch suggests a minimum bound of 64 bytes, for smaller transfers

128 in the code.

> +int sdw_bpt_close_stream(struct sdw_bus *bus,
> +			 struct sdw_slave *slave,
> +			 enum sdw_bpt_type mode,
> +			 struct sdw_bpt_msg *msg)
> +{

Seems weird to pass the message to close?

> +int sdw_bpt_open_stream(struct sdw_bus *bus,
> +			struct sdw_slave *slave,
> +			enum sdw_bpt_type mode,
> +			struct sdw_bpt_msg *msg)

Ditto, here does it make sense to pass the msg to open? I guess
the idea is that one only sends a single message in one
open->send->wait->close cycle? Would be much nicer if multiple
send/waits could be done in a single open/close, or if the
limitation is unavoidable why split out open/send into separate
calls at all? Just have send and wait and wrap open/close into
them.

> +	if (msg->len % SDW_BPT_MSG_BYTE_ALIGNMENT) {
> +		dev_err(bus->dev, "BPT message length %d is not a multiple of %d bytes\n",
> +			msg->len, SDW_BPT_MSG_BYTE_ALIGNMENT);
> +		return -EINVAL;
> +	}

Should this really be here? My understanding is this alignment is
a limitation of specific hardware so should this check not be in
the Cadence master code.

> +int sdw_bpt_send_async(struct sdw_bus *bus,
> +		       struct sdw_slave *slave,
> +		       struct sdw_bpt_msg *msg)
> +{
> +	if (msg->len > SDW_BPT_MSG_MAX_BYTES)
> +		return -EINVAL;

Does this check make sense since this was already checked in
open? I guess the user could pass in a different message at this
stage, but that I guess is part of what feels weird about passing
the message into open.

> +/**
> + * struct sdw_btp_msg - Message structure
> + * @addr: Start Register address accessed in the Slave
> + * @len: number of bytes to transfer. More than 64Kb can be transferred
> + * but a practical limit of SDW_BPT_MSG_MAX_BYTES is enforced.

Where is the 64kb coming from here?

> +/*
> + * Add a minimum number of bytes for BPT transfer sizes. BPT has a lot of
> + * overhead, enabling/disabling a stream costs 6 write commands, plus the bank
> + * switch that could be delayed in time. In addition, transferring very small
> + * data sets over DMA is known to be problematic on multiple platforms.
> + */
> +#define SDW_BPT_MSG_MIN_BYTES      128
> +

Is it really necessary for the core to enforce a minimum transfer
size (well except for the required alignment thing)? Yes small
transfers don't obviously make sense, but there is nothing inherently
wrong with doing one, which makes me feel it is excessive for the
core to block more than it has to here.

I would be of the opinion its up to driver writers if they have
some reason to do small BRA transfers. Firstly, since we are so
keen on allowing BRA and normal writes to overlap, one could see
use-cases when you want to do something through BRA such that it
doesn't block the normal command stream. Also there is already 1
feature on cs42l43 that can only be accessed through BRA, yes that
is a heroically questionable hardware design choice. Whilst we are
mostly ignoring that for now, I can see this being something some
other hardware teams decide to heroically do at some point as well.

Thanks,
Charles

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 06/16] soundwire: stream: reuse existing code for BPT stream
  2023-12-07 22:29 ` [RFC PATCH 06/16] soundwire: stream: reuse existing code for BPT stream Pierre-Louis Bossart
@ 2023-12-12 12:30   ` Charles Keepax
  2023-12-18 10:45     ` Pierre-Louis Bossart
  0 siblings, 1 reply; 57+ messages in thread
From: Charles Keepax @ 2023-12-12 12:30 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: linux-sound, alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Richard Fitzgerald, Shuming Fan, Jack Yu, Oder Chiou

On Thu, Dec 07, 2023 at 04:29:34PM -0600, Pierre-Louis Bossart wrote:
> DP0 (Data Port 0) is very similar to regular data ports, with minor
> tweaks we can reuse the same code.
> 
> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> ---
> -	dpn_prop = sdw_get_slave_dpn_prop(s_rt->slave,
> -					  s_rt->direction,
> -					  t_params->port_num);
> -	if (!dpn_prop)
> -		return -EINVAL;
> +	if (t_params->port_num) {
> +		struct sdw_dpn_prop *dpn_prop;
> +
> +		dpn_prop = sdw_get_slave_dpn_prop(s_rt->slave,
> +						  s_rt->direction,
> +						  t_params->port_num);
> +		if (!dpn_prop)
> +			return -EINVAL;
> +
> +		read_only_wordlength = dpn_prop->read_only_wordlength;
> +		port_type = dpn_prop->type;
> +	} else {
> +		read_only_wordlength = false;
> +		port_type = SDW_DPN_FULL;
> +	}

Would it be nicer to just add a special case sdw_get_slave_dpn_prop
to return dp0_prop and avoid all this special casing in the rest
of the code?

Thanks,
Charles

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 07/16] soundwire: bus: add API for BPT protocol
  2023-12-12 12:19   ` Charles Keepax
@ 2023-12-18 10:38     ` Pierre-Louis Bossart
  0 siblings, 0 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-18 10:38 UTC (permalink / raw)
  To: Charles Keepax
  Cc: linux-sound, alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Richard Fitzgerald, Shuming Fan, Jack Yu, Oder Chiou



On 12/12/23 06:19, Charles Keepax wrote:
> On Thu, Dec 07, 2023 at 04:29:35PM -0600, Pierre-Louis Bossart wrote:
>> This patch suggests a minimum bound of 64 bytes, for smaller transfers
> 
> 128 in the code.

indeed, that's a left-over from an earlier trial. Will fix.

>> +int sdw_bpt_close_stream(struct sdw_bus *bus,
>> +			 struct sdw_slave *slave,
>> +			 enum sdw_bpt_type mode,
>> +			 struct sdw_bpt_msg *msg)
>> +{
> 
> Seems weird to pass the message to close?

It's not necessary in my current solution, but I opted to keep the
arguments aligned.

>> +int sdw_bpt_open_stream(struct sdw_bus *bus,
>> +			struct sdw_slave *slave,
>> +			enum sdw_bpt_type mode,
>> +			struct sdw_bpt_msg *msg)
> 
> Ditto, here does it make sense to pass the msg to open? I guess
> the idea is that one only sends a single message in one
> open->send->wait->close cycle? Would be much nicer if multiple
> send/waits could be done in a single open/close, or if the
> limitation is unavoidable why split out open/send into separate
> calls at all? Just have send and wait and wrap open/close into
> them.

it's needed for the Intel solution, the open() will allocate the
required DMA buffers and copy the data from the messsage into the DMA
buffers with the required formatting expected by the IP.

An alternative would be to simplify open/close but then we have to add a
hw_params/prepare and hw_free steps. I didn't see a need for such a
split, unlike for audio the arguments are not going to be variable.

But yes it's a good question, what exactly is an open/startup callback
supposed to do in ALSA/ASoC?

> 
>> +	if (msg->len % SDW_BPT_MSG_BYTE_ALIGNMENT) {
>> +		dev_err(bus->dev, "BPT message length %d is not a multiple of %d bytes\n",
>> +			msg->len, SDW_BPT_MSG_BYTE_ALIGNMENT);
>> +		return -EINVAL;
>> +	}
> 
> Should this really be here? My understanding is this alignment is
> a limitation of specific hardware so should this check not be in
> the Cadence master code.

As discussed earlier, we *could* move this to host-specific parts, but
then the codec driver would need to know about host alignment
requirements. One of those 'be careful what you ask for' things where
you may have more work to do on the codec driver side...

>> +int sdw_bpt_send_async(struct sdw_bus *bus,
>> +		       struct sdw_slave *slave,
>> +		       struct sdw_bpt_msg *msg)
>> +{
>> +	if (msg->len > SDW_BPT_MSG_MAX_BYTES)
>> +		return -EINVAL;
> 
> Does this check make sense since this was already checked in
> open? I guess the user could pass in a different message at this
> stage, but that I guess is part of what feels weird about passing
> the message into open.

The error check is a big open, we could assume that all the previous
steps were done the right way and skip tests, or we would add a set of
paranoia checks.

The primary intent of those checks is to help the integration of codec
drivers, it's better to get an error code and a clear error message than
a kernel oops because the expected sequence is not followed. Once the
integration is done, those tests are probably not very useful indeed.

>> +/**
>> + * struct sdw_btp_msg - Message structure
>> + * @addr: Start Register address accessed in the Slave
>> + * @len: number of bytes to transfer. More than 64Kb can be transferred
>> + * but a practical limit of SDW_BPT_MSG_MAX_BYTES is enforced.
> 
> Where is the 64kb coming from here?

Ah, this is a reference to the 16 bit address limitation in the regular
read/write commands.

>> +/*
>> + * Add a minimum number of bytes for BPT transfer sizes. BPT has a lot of
>> + * overhead, enabling/disabling a stream costs 6 write commands, plus the bank
>> + * switch that could be delayed in time. In addition, transferring very small
>> + * data sets over DMA is known to be problematic on multiple platforms.
>> + */
>> +#define SDW_BPT_MSG_MIN_BYTES      128
>> +
> 
> Is it really necessary for the core to enforce a minimum transfer
> size (well except for the required alignment thing)? Yes small
> transfers don't obviously make sense, but there is nothing inherently
> wrong with doing one, which makes me feel it is excessive for the
> core to block more than it has to here.

I think it's really better to have a clear design intent that BRA is not
meant for small transfers. It would be opening a can of worms if we
start seeing uses of this protocol beyond the intended firmware/table
download and history buffer retrieval.

> I would be of the opinion its up to driver writers if they have
> some reason to do small BRA transfers. Firstly, since we are so
> keen on allowing BRA and normal writes to overlap, one could see
> use-cases when you want to do something through BRA such that it
> doesn't block the normal command stream. Also there is already 1
> feature on cs42l43 that can only be accessed through BRA, yes that
> is a heroically questionable hardware design choice. Whilst we are
> mostly ignoring that for now, I can see this being something some
> other hardware teams decide to heroically do at some point as well.

To be clear, I would like to allow BRA in parallel with normal writes
*to a different set of registers* to deal with alerts, etc.

A set of registers only accessible through BRA is a very questionable
design indeed. That's not what the SoundWire spec intended and it's not
what this patchset had in mind. You'll have to tell us more on what
exactly the hardware is and does, and how strongly you want this
capability supported...

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 06/16] soundwire: stream: reuse existing code for BPT stream
  2023-12-12 12:30   ` Charles Keepax
@ 2023-12-18 10:45     ` Pierre-Louis Bossart
  0 siblings, 0 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-18 10:45 UTC (permalink / raw)
  To: Charles Keepax
  Cc: linux-sound, alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Richard Fitzgerald, Shuming Fan, Jack Yu, Oder Chiou



On 12/12/23 06:30, Charles Keepax wrote:
> On Thu, Dec 07, 2023 at 04:29:34PM -0600, Pierre-Louis Bossart wrote:
>> DP0 (Data Port 0) is very similar to regular data ports, with minor
>> tweaks we can reuse the same code.
>>
>> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
>> ---
>> -	dpn_prop = sdw_get_slave_dpn_prop(s_rt->slave,
>> -					  s_rt->direction,
>> -					  t_params->port_num);
>> -	if (!dpn_prop)
>> -		return -EINVAL;
>> +	if (t_params->port_num) {
>> +		struct sdw_dpn_prop *dpn_prop;
>> +
>> +		dpn_prop = sdw_get_slave_dpn_prop(s_rt->slave,
>> +						  s_rt->direction,
>> +						  t_params->port_num);
>> +		if (!dpn_prop)
>> +			return -EINVAL;
>> +
>> +		read_only_wordlength = dpn_prop->read_only_wordlength;
>> +		port_type = dpn_prop->type;
>> +	} else {
>> +		read_only_wordlength = false;
>> +		port_type = SDW_DPN_FULL;
>> +	}
> 
> Would it be nicer to just add a special case sdw_get_slave_dpn_prop
> to return dp0_prop and avoid all this special casing in the rest
> of the code?

There are multiple cases in the same function where we need to check for
DP0. We could move the read_only_wordlength and port_type to the
sdw_get_slave_dpn_prop() helper, but then we would spread the special
cases in multiple locations. It's not pretty either way.



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description
  2023-12-07 22:29 ` [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description Pierre-Louis Bossart
  2023-12-07 23:29   ` Mark Brown
  2023-12-08 16:27   ` Charles Keepax
@ 2023-12-18 11:40   ` Vinod Koul
  2023-12-18 12:58     ` Pierre-Louis Bossart
  2 siblings, 1 reply; 57+ messages in thread
From: Vinod Koul @ 2023-12-18 11:40 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: linux-sound, alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou

Hi Pierre,

On 07-12-23, 16:29, Pierre-Louis Bossart wrote:
> The Bulk Register Access protocol was left as a TODO topic since
> 2018. It's time to document this protocol and the design of its Linux
> support.

Thanks for letting me know about this thread. At least now with B4 we
can grab threads we are not copied.

> 
> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> ---
>  Documentation/driver-api/soundwire/bra.rst    | 478 ++++++++++++++++++

Can we split the cadence parts of this to bra-cadence.rst that way this
file documents the core parts only

> +Concurrency between BRA and regular read/write
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The existing 'nread/nwrite' API already relies on a notion of start
> +address and number of bytes, so it would be possible to extend this
> +API with a 'hint' requesting BPT/BRA be used.
> +
> +However BRA transfers could be quite long, and the use of a single
> +mutex for regular read/write and BRA is a show-stopper. Independent
> +operation of the control/command and BRA transfers is a fundamental
> +requirement, e.g. to change the volume level with the existing regmap
> +interface while downloading firmware.
> +
> +This places the burden on the codec driver to verify that there is no
> +concurrent access to the same address with the command/control
> +protocol and the BRA protocol.
> +
> +In addition, the 'sdw_msg' structure hard-codes support for 16-bit
> +addresses and paging registers which are irrelevant for BPT/BRA
> +support based on native 32-bit addresses. A separate API with
> +'sdw_bpt_msg' makes more sense.
> +
> +One possible strategy to speed-up all initialization tasks would be to
> +start a BRA transfer for firmware download, then deal with all the
> +"regular" read/writes in parallel with the command channel, and last
> +to wait for the BRA transfers to complete. This would allow for a
> +degree of overlap instead of a purely sequential solution. As a
> +results, the BRA API must support async transfers and expose a
> +separate wait function.

That sounds logical to me

> +
> +Error handling
> +~~~~~~~~~~~~~~
> +
> +The expected response to a 'bra_message' and follow-up behavior may
> +vary:
> +
> +   (1) A Peripheral driver may want to receive an immediate -EBUSY
> +       response if the BRA protocol is not available at a given time.
> +
> +   (2) A Peripheral driver may want to wait until a timeout for the
> +       on-going transfer to be handled
> +
> +   (3) A Peripheral driver may want to wait until existing BRA
> +       transfers complete or deal with BRA as a background task when
> +       audio transfers stop. In this case, there would be no timeout,
> +       and the operation may not happen if the platform is suspended.

Is this runtime suspend or S3/S4 case?

> +BTP/BRA API available to peripheral drivers
> +-------------------------------------------
> +
> +ASoC Peripheral drivers may use
> +
> +   - sdw_bpt_stream_open(mode)
> +
> +      This function verifies that the BPT protocol with the
> +      'mode'. For now only BRA is accepted as a mode. This function
> +      allocates a work buffer internally. This buffer is not exposed
> +      to the caller.
> +
> +     errors:
> +         -ENODEV: BPT/BRA is not supported by the Manager.
> +
> +         -EBUSY: another agent is already using the audio payload for
> +          audio transfers. There is no way to predict when the audio
> +          streams might stop, this will require the Peripheral driver
> +          to fall back to the regular (slow) command channel.
> +
> +         -EAGAIN: another agent is already transferring data using the
> +          BPT/BRA protocol. Since the transfers will typically last
> +          10s or 100s of ms, the Peripheral driver may wait and retry
> +          later.
> +
> +    - sdw_bpt_message_send_async(bpt_message)

why not have a single API that does both? First check if it is supported
and then allocate buffers and do the transfer.. What are the advantages
of using this two step process

-- 
~Vinod

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 07/16] soundwire: bus: add API for BPT protocol
  2023-12-07 22:29 ` [RFC PATCH 07/16] soundwire: bus: add API for BPT protocol Pierre-Louis Bossart
  2023-12-12 12:19   ` Charles Keepax
@ 2023-12-18 11:54   ` Vinod Koul
  2023-12-18 13:12     ` Pierre-Louis Bossart
  1 sibling, 1 reply; 57+ messages in thread
From: Vinod Koul @ 2023-12-18 11:54 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: linux-sound, alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou

On 07-12-23, 16:29, Pierre-Louis Bossart wrote:
> Add definitions and helpers for the BPT/BRA protocol. Peripheral
> drivers (aka ASoC codec drivers) can use this API to send bulk data
> such as firmware or tables.
> 
> The API is only available when no other audio streams have been
> allocated, and only one BTP/BRA stream is allowed per link.  To avoid
> the addition of yet another lock, the refcount tests are handled in
> the stream master_runtime alloc/free routines where the bus_lock is
> already held. Another benefit of this approach is that the same
> bus_lock is used to handle runtime and port linked lists, which
> reduces the potential for misaligned configurations.
> 
> In addition to exclusion with audio streams, BPT transfers have a lot
> of overhead, specifically registers writes are needed to enable
> transport in DP0. In addition, most DMAs don't handle too well very
> small data sets.
> 
> This patch suggests a minimum bound of 64 bytes, for smaller transfers
> codec drivers should rely on the regular read/write commands in
> Column0.
> 
> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> ---
>  drivers/soundwire/bus.c       | 77 +++++++++++++++++++++++++++++++++++
>  drivers/soundwire/bus.h       | 18 ++++++++
>  drivers/soundwire/stream.c    | 30 ++++++++++++++
>  include/linux/soundwire/sdw.h | 76 ++++++++++++++++++++++++++++++++++
>  4 files changed, 201 insertions(+)
> 
> diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c
> index f3fec15c3112..e5758d2ed88f 100644
> --- a/drivers/soundwire/bus.c
> +++ b/drivers/soundwire/bus.c
> @@ -2015,3 +2015,80 @@ void sdw_clear_slave_status(struct sdw_bus *bus, u32 request)
>  	}
>  }
>  EXPORT_SYMBOL(sdw_clear_slave_status);
> +
> +int sdw_bpt_close_stream(struct sdw_bus *bus,
> +			 struct sdw_slave *slave,
> +			 enum sdw_bpt_type mode,
> +			 struct sdw_bpt_msg *msg)
> +{
> +	int ret;
> +
> +	ret = bus->ops->bpt_close_stream(bus, slave, mode, msg);
> +	if (ret < 0)
> +		dev_err(bus->dev, "BPT stream close, err %d\n", ret);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(sdw_bpt_close_stream);
> +
> +int sdw_bpt_open_stream(struct sdw_bus *bus,
> +			struct sdw_slave *slave,
> +			enum sdw_bpt_type mode,
> +			struct sdw_bpt_msg *msg)
> +{
> +	int ret;
> +
> +	/* only Bulk Register Access (BRA) is supported for now */
> +	if (mode != SDW_BRA)
> +		return -EINVAL;
> +
> +	if (msg->len < SDW_BPT_MSG_MIN_BYTES) {
> +		dev_err(bus->dev, "BPT message length %d, min supported %d\n",
> +			msg->len, SDW_BPT_MSG_MIN_BYTES);
> +		return -EINVAL;
> +	}
> +
> +	if (msg->len % SDW_BPT_MSG_BYTE_ALIGNMENT) {
> +		dev_err(bus->dev, "BPT message length %d is not a multiple of %d bytes\n",
> +			msg->len, SDW_BPT_MSG_BYTE_ALIGNMENT);
> +		return -EINVAL;
> +	}

Is this a protocol requirement?

> +
> +	/* check device is enumerated */
> +	if (slave->dev_num == SDW_ENUM_DEV_NUM ||
> +	    slave->dev_num > SDW_MAX_DEVICES)
> +		return -ENODEV;
> +
> +	/* make sure all callbacks are defined */
> +	if (!bus->ops->bpt_open_stream ||
> +	    !bus->ops->bpt_close_stream ||
> +	    !bus->ops->bpt_send_async ||
> +	    !bus->ops->bpt_wait)
> +		return -ENOTSUPP;

should this not be checked at probe time, if device declares the support

> +
> +	ret = bus->ops->bpt_open_stream(bus, slave, mode, msg);
> +	if (ret < 0)
> +		dev_err(bus->dev, "BPT stream open, err %d\n", ret);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(sdw_bpt_open_stream);

can we open multiple times (i dont see a check preventing that), how do
we close..?

Re-iterating my comment on documentation patch, can we do with a async api
and wait api, that makes symantics a lot simpler, right..?

> +
> +int sdw_bpt_send_async(struct sdw_bus *bus,
> +		       struct sdw_slave *slave,
> +		       struct sdw_bpt_msg *msg)
> +{
> +	if (msg->len > SDW_BPT_MSG_MAX_BYTES)
> +		return -EINVAL;
> +
> +	return bus->ops->bpt_send_async(bus, slave, msg);
> +}
> +EXPORT_SYMBOL(sdw_bpt_send_async);

Can we call this multiple times after open, it is unclear to me. Can you
please add kernel-doc comments about the APIs here as well

>  struct sdw_master_ops {
>  	int (*read_prop)(struct sdw_bus *bus);
> @@ -869,6 +913,20 @@ struct sdw_master_ops {
>  	void (*new_peripheral_assigned)(struct sdw_bus *bus,
>  					struct sdw_slave *slave,
>  					int dev_num);
> +	int (*bpt_open_stream)(struct sdw_bus *bus,
> +			       struct sdw_slave *slave,
> +			       enum sdw_bpt_type mode,
> +			       struct sdw_bpt_msg *msg);
> +	int (*bpt_close_stream)(struct sdw_bus *bus,
> +				struct sdw_slave *slave,
> +				enum sdw_bpt_type mode,
> +				struct sdw_bpt_msg *msg);
> +	int (*bpt_send_async)(struct sdw_bus *bus,
> +			      struct sdw_slave *slave,
> +			      struct sdw_bpt_msg *msg);
> +	int (*bpt_wait)(struct sdw_bus *bus,
> +			struct sdw_slave *slave,
> +			struct sdw_bpt_msg *msg);

do we need both bus and slave, that was a mistake in orignal design IMO.
We should fix that for bpt_ apis

-- 
~Vinod

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 08/16] soundwire: bus: add bpt_stream pointer
  2023-12-07 22:29 ` [RFC PATCH 08/16] soundwire: bus: add bpt_stream pointer Pierre-Louis Bossart
@ 2023-12-18 11:55   ` Vinod Koul
  2023-12-18 13:20     ` Pierre-Louis Bossart
  0 siblings, 1 reply; 57+ messages in thread
From: Vinod Koul @ 2023-12-18 11:55 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: linux-sound, alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou

On 07-12-23, 16:29, Pierre-Louis Bossart wrote:
> Add a convenience pointer to the 'sdw_bus' structure. BPT is a
> dedicated stream which will typically not be handled by DAIs or
> dailinks. Since there's only one BPT stream per link, storing the
> pointer at the link level seems rather natural.
> 
> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> ---
>  include/linux/soundwire/sdw.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/include/linux/soundwire/sdw.h b/include/linux/soundwire/sdw.h
> index e54c5bbd2b91..8db0cd7d0d89 100644
> --- a/include/linux/soundwire/sdw.h
> +++ b/include/linux/soundwire/sdw.h
> @@ -965,6 +965,7 @@ struct sdw_master_ops {
>   * @stream_refcount: number of streams currently using this bus
>   * @btp_stream_refcount: number of BTP streams currently using this bus (should
>   * be zero or one, multiple streams per link is not supported).
> + * @bpt_stream: pointer stored for convenience.
>   */
>  struct sdw_bus {
>  	struct device *dev;
> @@ -996,6 +997,7 @@ struct sdw_bus {
>  	int hw_sync_min_links;
>  	int stream_refcount;
>  	int bpt_stream_refcount;
> +	struct sdw_stream_runtime *bpt_stream;

So we are limiting to single stream? Can we have multiple transfers
queued up, which I guess might imply multiple streams?

-- 
~Vinod

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 09/16] soundwire: crc8: add constant table
  2023-12-07 22:29 ` [RFC PATCH 09/16] soundwire: crc8: add constant table Pierre-Louis Bossart
@ 2023-12-18 12:01   ` Vinod Koul
  2023-12-18 13:26     ` Pierre-Louis Bossart
  0 siblings, 1 reply; 57+ messages in thread
From: Vinod Koul @ 2023-12-18 12:01 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: linux-sound, alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou

On 07-12-23, 16:29, Pierre-Louis Bossart wrote:
> Add the lookup table required by crc8(). All configuration values were
> directly table from the MIPI SoundWire 1.x specification.
> 
> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> ---
>  drivers/soundwire/Makefile |   4 +-
>  drivers/soundwire/crc8.c   | 277 +++++++++++++++++++++++++++++++++++++
>  drivers/soundwire/crc8.h   |  11 ++
>  3 files changed, 291 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/soundwire/crc8.c
>  create mode 100644 drivers/soundwire/crc8.h
> 
> diff --git a/drivers/soundwire/Makefile b/drivers/soundwire/Makefile
> index 657f5888a77b..170128dd9318 100644
> --- a/drivers/soundwire/Makefile
> +++ b/drivers/soundwire/Makefile
> @@ -5,7 +5,9 @@
>  
>  #Bus Objs
>  soundwire-bus-y := bus_type.o bus.o master.o slave.o mipi_disco.o stream.o  \
> -			sysfs_slave.o sysfs_slave_dpn.o
> +			sysfs_slave.o sysfs_slave_dpn.o \
> +			crc8.o
> +
>  obj-$(CONFIG_SOUNDWIRE) += soundwire-bus.o
>  
>  soundwire-generic-allocation-objs := generic_bandwidth_allocation.o
> diff --git a/drivers/soundwire/crc8.c b/drivers/soundwire/crc8.c
> new file mode 100644
> index 000000000000..b6b984d7f39a
> --- /dev/null
> +++ b/drivers/soundwire/crc8.c
> @@ -0,0 +1,277 @@
> +// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
> +// Copyright(c) 2024 Intel Corporation.
> +
> +#include <linux/crc8.h>
> +#include <linux/module.h>
> +#include "crc8.h"
> +
> +/*
> + * the MIPI SoundWire CRC8 polynomial is X^8 + X^6 + X^3 + X^2 + 1, MSB first
> + * The value is (1)01001101 = 0x4D
> + *
> + * the table below was generated with
> + *
> + *	u8 crc8_lookup_table[CRC8_TABLE_SIZE];
> + *	crc8_populate_msb(crc8_lookup_table, SDW_CRC8_POLY);

Good that you found this API, so next question would be why should we
have this static table in kernel and not generate on probe if bpt is
supported..? Many subsystems use these APIs to generate the tables..


> + *
> + */
> +
> +const u8 sdw_crc8_lookup_msb[CRC8_TABLE_SIZE] = {
> +	0x00, /* 0 */
> +	0x4d, /* 1 */
> +	0x9a, /* 2 */
> +	0xd7, /* 3 */
> +	0x79, /* 4 */
> +	0x34, /* 5 */
> +	0xe3, /* 6 */
> +	0xae, /* 7 */
> +	0xf2, /* 8 */
> +	0xbf, /* 9 */
> +	0x68, /* 10 */
> +	0x25, /* 11 */
> +	0x8b, /* 12 */
> +	0xc6, /* 13 */
> +	0x11, /* 14 */
> +	0x5c, /* 15 */
> +	0xa9, /* 16 */
> +	0xe4, /* 17 */
> +	0x33, /* 18 */
> +	0x7e, /* 19 */
> +	0xd0, /* 20 */
> +	0x9d, /* 21 */
> +	0x4a, /* 22 */
> +	0x07, /* 23 */
> +	0x5b, /* 24 */
> +	0x16, /* 25 */
> +	0xc1, /* 26 */
> +	0x8c, /* 27 */
> +	0x22, /* 28 */
> +	0x6f, /* 29 */
> +	0xb8, /* 30 */
> +	0xf5, /* 31 */
> +	0x1f, /* 32 */
> +	0x52, /* 33 */
> +	0x85, /* 34 */
> +	0xc8, /* 35 */
> +	0x66, /* 36 */
> +	0x2b, /* 37 */
> +	0xfc, /* 38 */
> +	0xb1, /* 39 */
> +	0xed, /* 40 */
> +	0xa0, /* 41 */
> +	0x77, /* 42 */
> +	0x3a, /* 43 */
> +	0x94, /* 44 */
> +	0xd9, /* 45 */
> +	0x0e, /* 46 */
> +	0x43, /* 47 */
> +	0xb6, /* 48 */
> +	0xfb, /* 49 */
> +	0x2c, /* 50 */
> +	0x61, /* 51 */
> +	0xcf, /* 52 */
> +	0x82, /* 53 */
> +	0x55, /* 54 */
> +	0x18, /* 55 */
> +	0x44, /* 56 */
> +	0x09, /* 57 */
> +	0xde, /* 58 */
> +	0x93, /* 59 */
> +	0x3d, /* 60 */
> +	0x70, /* 61 */
> +	0xa7, /* 62 */
> +	0xea, /* 63 */
> +	0x3e, /* 64 */
> +	0x73, /* 65 */
> +	0xa4, /* 66 */
> +	0xe9, /* 67 */
> +	0x47, /* 68 */
> +	0x0a, /* 69 */
> +	0xdd, /* 70 */
> +	0x90, /* 71 */
> +	0xcc, /* 72 */
> +	0x81, /* 73 */
> +	0x56, /* 74 */
> +	0x1b, /* 75 */
> +	0xb5, /* 76 */
> +	0xf8, /* 77 */
> +	0x2f, /* 78 */
> +	0x62, /* 79 */
> +	0x97, /* 80 */
> +	0xda, /* 81 */
> +	0x0d, /* 82 */
> +	0x40, /* 83 */
> +	0xee, /* 84 */
> +	0xa3, /* 85 */
> +	0x74, /* 86 */
> +	0x39, /* 87 */
> +	0x65, /* 88 */
> +	0x28, /* 89 */
> +	0xff, /* 90 */
> +	0xb2, /* 91 */
> +	0x1c, /* 92 */
> +	0x51, /* 93 */
> +	0x86, /* 94 */
> +	0xcb, /* 95 */
> +	0x21, /* 96 */
> +	0x6c, /* 97 */
> +	0xbb, /* 98 */
> +	0xf6, /* 99 */
> +	0x58, /* 100 */
> +	0x15, /* 101 */
> +	0xc2, /* 102 */
> +	0x8f, /* 103 */
> +	0xd3, /* 104 */
> +	0x9e, /* 105 */
> +	0x49, /* 106 */
> +	0x04, /* 107 */
> +	0xaa, /* 108 */
> +	0xe7, /* 109 */
> +	0x30, /* 110 */
> +	0x7d, /* 111 */
> +	0x88, /* 112 */
> +	0xc5, /* 113 */
> +	0x12, /* 114 */
> +	0x5f, /* 115 */
> +	0xf1, /* 116 */
> +	0xbc, /* 117 */
> +	0x6b, /* 118 */
> +	0x26, /* 119 */
> +	0x7a, /* 120 */
> +	0x37, /* 121 */
> +	0xe0, /* 122 */
> +	0xad, /* 123 */
> +	0x03, /* 124 */
> +	0x4e, /* 125 */
> +	0x99, /* 126 */
> +	0xd4, /* 127 */
> +	0x7c, /* 128 */
> +	0x31, /* 129 */
> +	0xe6, /* 130 */
> +	0xab, /* 131 */
> +	0x05, /* 132 */
> +	0x48, /* 133 */
> +	0x9f, /* 134 */
> +	0xd2, /* 135 */
> +	0x8e, /* 136 */
> +	0xc3, /* 137 */
> +	0x14, /* 138 */
> +	0x59, /* 139 */
> +	0xf7, /* 140 */
> +	0xba, /* 141 */
> +	0x6d, /* 142 */
> +	0x20, /* 143 */
> +	0xd5, /* 144 */
> +	0x98, /* 145 */
> +	0x4f, /* 146 */
> +	0x02, /* 147 */
> +	0xac, /* 148 */
> +	0xe1, /* 149 */
> +	0x36, /* 150 */
> +	0x7b, /* 151 */
> +	0x27, /* 152 */
> +	0x6a, /* 153 */
> +	0xbd, /* 154 */
> +	0xf0, /* 155 */
> +	0x5e, /* 156 */
> +	0x13, /* 157 */
> +	0xc4, /* 158 */
> +	0x89, /* 159 */
> +	0x63, /* 160 */
> +	0x2e, /* 161 */
> +	0xf9, /* 162 */
> +	0xb4, /* 163 */
> +	0x1a, /* 164 */
> +	0x57, /* 165 */
> +	0x80, /* 166 */
> +	0xcd, /* 167 */
> +	0x91, /* 168 */
> +	0xdc, /* 169 */
> +	0x0b, /* 170 */
> +	0x46, /* 171 */
> +	0xe8, /* 172 */
> +	0xa5, /* 173 */
> +	0x72, /* 174 */
> +	0x3f, /* 175 */
> +	0xca, /* 176 */
> +	0x87, /* 177 */
> +	0x50, /* 178 */
> +	0x1d, /* 179 */
> +	0xb3, /* 180 */
> +	0xfe, /* 181 */
> +	0x29, /* 182 */
> +	0x64, /* 183 */
> +	0x38, /* 184 */
> +	0x75, /* 185 */
> +	0xa2, /* 186 */
> +	0xef, /* 187 */
> +	0x41, /* 188 */
> +	0x0c, /* 189 */
> +	0xdb, /* 190 */
> +	0x96, /* 191 */
> +	0x42, /* 192 */
> +	0x0f, /* 193 */
> +	0xd8, /* 194 */
> +	0x95, /* 195 */
> +	0x3b, /* 196 */
> +	0x76, /* 197 */
> +	0xa1, /* 198 */
> +	0xec, /* 199 */
> +	0xb0, /* 200 */
> +	0xfd, /* 201 */
> +	0x2a, /* 202 */
> +	0x67, /* 203 */
> +	0xc9, /* 204 */
> +	0x84, /* 205 */
> +	0x53, /* 206 */
> +	0x1e, /* 207 */
> +	0xeb, /* 208 */
> +	0xa6, /* 209 */
> +	0x71, /* 210 */
> +	0x3c, /* 211 */
> +	0x92, /* 212 */
> +	0xdf, /* 213 */
> +	0x08, /* 214 */
> +	0x45, /* 215 */
> +	0x19, /* 216 */
> +	0x54, /* 217 */
> +	0x83, /* 218 */
> +	0xce, /* 219 */
> +	0x60, /* 220 */
> +	0x2d, /* 221 */
> +	0xfa, /* 222 */
> +	0xb7, /* 223 */
> +	0x5d, /* 224 */
> +	0x10, /* 225 */
> +	0xc7, /* 226 */
> +	0x8a, /* 227 */
> +	0x24, /* 228 */
> +	0x69, /* 229 */
> +	0xbe, /* 230 */
> +	0xf3, /* 231 */
> +	0xaf, /* 232 */
> +	0xe2, /* 233 */
> +	0x35, /* 234 */
> +	0x78, /* 235 */
> +	0xd6, /* 236 */
> +	0x9b, /* 237 */
> +	0x4c, /* 238 */
> +	0x01, /* 239 */
> +	0xf4, /* 240 */
> +	0xb9, /* 241 */
> +	0x6e, /* 242 */
> +	0x23, /* 243 */
> +	0x8d, /* 244 */
> +	0xc0, /* 245 */
> +	0x17, /* 246 */
> +	0x5a, /* 247 */
> +	0x06, /* 248 */
> +	0x4b, /* 249 */
> +	0x9c, /* 250 */
> +	0xd1, /* 251 */
> +	0x7f, /* 252 */
> +	0x32, /* 253 */
> +	0xe5, /* 254 */
> +	0xa8  /* 255 */
> +};
> +EXPORT_SYMBOL(sdw_crc8_lookup_msb);
> diff --git a/drivers/soundwire/crc8.h b/drivers/soundwire/crc8.h
> new file mode 100644
> index 000000000000..9a88d3866016
> --- /dev/null
> +++ b/drivers/soundwire/crc8.h
> @@ -0,0 +1,11 @@
> +/* SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause) */
> +/* Copyright(c) 2024 Intel Corporation. */
> +
> +#ifndef __SDW_CRC8_H
> +#define __SDW_CRC8_H
> +
> +#define SDW_CRC8_SEED 0xFF
> +#define SDW_CRC8_POLY 0x4D
> +extern const u8 sdw_crc8_lookup_msb[CRC8_TABLE_SIZE];
> +
> +#endif /* __SDW_CRC8_H */
> -- 
> 2.39.2
> 

-- 
~Vinod

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description
  2023-12-18 11:40   ` Vinod Koul
@ 2023-12-18 12:58     ` Pierre-Louis Bossart
  2023-12-18 14:29       ` Charles Keepax
  2023-12-21 14:44       ` Vinod Koul
  0 siblings, 2 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-18 12:58 UTC (permalink / raw)
  To: Vinod Koul
  Cc: linux-sound, alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou


>>  Documentation/driver-api/soundwire/bra.rst    | 478 ++++++++++++++++++
> 
> Can we split the cadence parts of this to bra-cadence.rst that way this
> file documents the core parts only

Yes, we can split the Cadence parts out.


>> +Error handling
>> +~~~~~~~~~~~~~~
>> +
>> +The expected response to a 'bra_message' and follow-up behavior may
>> +vary:
>> +
>> +   (1) A Peripheral driver may want to receive an immediate -EBUSY
>> +       response if the BRA protocol is not available at a given time.
>> +
>> +   (2) A Peripheral driver may want to wait until a timeout for the
>> +       on-going transfer to be handled
>> +
>> +   (3) A Peripheral driver may want to wait until existing BRA
>> +       transfers complete or deal with BRA as a background task when
>> +       audio transfers stop. In this case, there would be no timeout,
>> +       and the operation may not happen if the platform is suspended.
> 
> Is this runtime suspend or S3/S4 case?

System suspend (which can also mean S0i1).

I don't think we can have a case where a peripheral driver waits on
something without having done a pm_runtime_get_sync() to prevent
runtime_pm suspend.

> 
>> +BTP/BRA API available to peripheral drivers
>> +-------------------------------------------
>> +
>> +ASoC Peripheral drivers may use
>> +
>> +   - sdw_bpt_stream_open(mode)
>> +
>> +      This function verifies that the BPT protocol with the
>> +      'mode'. For now only BRA is accepted as a mode. This function
>> +      allocates a work buffer internally. This buffer is not exposed
>> +      to the caller.
>> +
>> +     errors:
>> +         -ENODEV: BPT/BRA is not supported by the Manager.
>> +
>> +         -EBUSY: another agent is already using the audio payload for
>> +          audio transfers. There is no way to predict when the audio
>> +          streams might stop, this will require the Peripheral driver
>> +          to fall back to the regular (slow) command channel.
>> +
>> +         -EAGAIN: another agent is already transferring data using the
>> +          BPT/BRA protocol. Since the transfers will typically last
>> +          10s or 100s of ms, the Peripheral driver may wait and retry
>> +          later.
>> +
>> +    - sdw_bpt_message_send_async(bpt_message)
> 
> why not have a single API that does both? First check if it is supported
> and then allocate buffers and do the transfer.. What are the advantages
> of using this two step process

Symmetry is the only thing that comes to my mind. Open - close and send
- wait are natural matches, aren't they?

We do need a wait(), so bundling open() and send() would be odd.

But you have a point that the open() is not generic in that it also
prepares the DMA buffers for transmission. Maybe it's more natural to
follow the traditional open(), hw_params(), hw_free, close() from ALSA.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 07/16] soundwire: bus: add API for BPT protocol
  2023-12-18 11:54   ` Vinod Koul
@ 2023-12-18 13:12     ` Pierre-Louis Bossart
  2023-12-18 14:57       ` Charles Keepax
  2023-12-21 14:49       ` Vinod Koul
  0 siblings, 2 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-18 13:12 UTC (permalink / raw)
  To: Vinod Koul
  Cc: linux-sound, alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou


>> +int sdw_bpt_open_stream(struct sdw_bus *bus,
>> +			struct sdw_slave *slave,
>> +			enum sdw_bpt_type mode,
>> +			struct sdw_bpt_msg *msg)
>> +{
>> +	int ret;
>> +
>> +	/* only Bulk Register Access (BRA) is supported for now */
>> +	if (mode != SDW_BRA)
>> +		return -EINVAL;
>> +
>> +	if (msg->len < SDW_BPT_MSG_MIN_BYTES) {
>> +		dev_err(bus->dev, "BPT message length %d, min supported %d\n",
>> +			msg->len, SDW_BPT_MSG_MIN_BYTES);
>> +		return -EINVAL;
>> +	}
>> +
>> +	if (msg->len % SDW_BPT_MSG_BYTE_ALIGNMENT) {
>> +		dev_err(bus->dev, "BPT message length %d is not a multiple of %d bytes\n",
>> +			msg->len, SDW_BPT_MSG_BYTE_ALIGNMENT);
>> +		return -EINVAL;
>> +	}
> 
> Is this a protocol requirement?

No, it's an implementation requirement.

We could move this to host-specific parts but then the codec drivers
will have to know about alignment requirements for each host they are
use with. IOW, it's more work for codec drivers if we don't have a
minimum bar for alignment requirement across all platforms.

> 
>> +
>> +	/* check device is enumerated */
>> +	if (slave->dev_num == SDW_ENUM_DEV_NUM ||
>> +	    slave->dev_num > SDW_MAX_DEVICES)
>> +		return -ENODEV;
>> +
>> +	/* make sure all callbacks are defined */
>> +	if (!bus->ops->bpt_open_stream ||
>> +	    !bus->ops->bpt_close_stream ||
>> +	    !bus->ops->bpt_send_async ||
>> +	    !bus->ops->bpt_wait)
>> +		return -ENOTSUPP;
> 
> should this not be checked at probe time, if device declares the support

sdw_bpt_open_stream() would be called by the peripheral driver (or
regmap as a proxy). The peripheral driver could also decide to check for
those callback during its probe, but that's beyond the scope of this
patchset.

These checks are just there for paranoia, in case a peripheral driver
uses BTP/BRA on a host where they are not supported.

It's not science-fiction, we see AMD- and INTEL-based platforms using
the same SoundWire-based codecs.

>> +	ret = bus->ops->bpt_open_stream(bus, slave, mode, msg);
>> +	if (ret < 0)
>> +		dev_err(bus->dev, "BPT stream open, err %d\n", ret);
>> +
>> +	return ret;
>> +}
>> +EXPORT_SYMBOL(sdw_bpt_open_stream);
> 
> can we open multiple times (i dont see a check preventing that), how do
> we close..?

there's a refcount preventing multiples BTP streams from being opened.

> Re-iterating my comment on documentation patch, can we do with a async api
> and wait api, that makes symantics a lot simpler, right..?

see reply in previous email, combining open+send is weird IMHO.

>> +
>> +int sdw_bpt_send_async(struct sdw_bus *bus,
>> +		       struct sdw_slave *slave,
>> +		       struct sdw_bpt_msg *msg)
>> +{
>> +	if (msg->len > SDW_BPT_MSG_MAX_BYTES)
>> +		return -EINVAL;
>> +
>> +	return bus->ops->bpt_send_async(bus, slave, msg);
>> +}
>> +EXPORT_SYMBOL(sdw_bpt_send_async);
> 
> Can we call this multiple times after open, it is unclear to me. Can you
> please add kernel-doc comments about the APIs here as well

This can be called multiple times but it's useless: all the buffers are
prepared in the open() stage. This is the moral equivalent of a trigger
step, just enable data transfers.

> 
>>  struct sdw_master_ops {
>>  	int (*read_prop)(struct sdw_bus *bus);
>> @@ -869,6 +913,20 @@ struct sdw_master_ops {
>>  	void (*new_peripheral_assigned)(struct sdw_bus *bus,
>>  					struct sdw_slave *slave,
>>  					int dev_num);
>> +	int (*bpt_open_stream)(struct sdw_bus *bus,
>> +			       struct sdw_slave *slave,
>> +			       enum sdw_bpt_type mode,
>> +			       struct sdw_bpt_msg *msg);
>> +	int (*bpt_close_stream)(struct sdw_bus *bus,
>> +				struct sdw_slave *slave,
>> +				enum sdw_bpt_type mode,
>> +				struct sdw_bpt_msg *msg);
>> +	int (*bpt_send_async)(struct sdw_bus *bus,
>> +			      struct sdw_slave *slave,
>> +			      struct sdw_bpt_msg *msg);
>> +	int (*bpt_wait)(struct sdw_bus *bus,
>> +			struct sdw_slave *slave,
>> +			struct sdw_bpt_msg *msg);
> 
> do we need both bus and slave, that was a mistake in orignal design IMO.
> We should fix that for bpt_ apis

No disagreement. All the routines follow the same template, if we change
one we should also change the others.

The main question as discussed with Charles is whether we want to pass
the 'msg' argument in all routines.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 08/16] soundwire: bus: add bpt_stream pointer
  2023-12-18 11:55   ` Vinod Koul
@ 2023-12-18 13:20     ` Pierre-Louis Bossart
  2023-12-21 14:39       ` Vinod Koul
  0 siblings, 1 reply; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-18 13:20 UTC (permalink / raw)
  To: Vinod Koul
  Cc: linux-sound, alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou



On 12/18/23 05:55, Vinod Koul wrote:
> On 07-12-23, 16:29, Pierre-Louis Bossart wrote:
>> Add a convenience pointer to the 'sdw_bus' structure. BPT is a
>> dedicated stream which will typically not be handled by DAIs or
>> dailinks. Since there's only one BPT stream per link, storing the
>> pointer at the link level seems rather natural.
>>
>> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
>> ---
>>  include/linux/soundwire/sdw.h | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/include/linux/soundwire/sdw.h b/include/linux/soundwire/sdw.h
>> index e54c5bbd2b91..8db0cd7d0d89 100644
>> --- a/include/linux/soundwire/sdw.h
>> +++ b/include/linux/soundwire/sdw.h
>> @@ -965,6 +965,7 @@ struct sdw_master_ops {
>>   * @stream_refcount: number of streams currently using this bus
>>   * @btp_stream_refcount: number of BTP streams currently using this bus (should
>>   * be zero or one, multiple streams per link is not supported).
>> + * @bpt_stream: pointer stored for convenience.
>>   */
>>  struct sdw_bus {
>>  	struct device *dev;
>> @@ -996,6 +997,7 @@ struct sdw_bus {
>>  	int hw_sync_min_links;
>>  	int stream_refcount;
>>  	int bpt_stream_refcount;
>> +	struct sdw_stream_runtime *bpt_stream;
> 
> So we are limiting to single stream? Can we have multiple transfers
> queued up, which I guess might imply multiple streams?


Yes for now there is a BTP/BRA single stream active when there are no
audio transfers taking place. This is the only way to guarantee
predictable download/resume times.

There is no mechanism to queue up transfers, be it from the same
peripheral device or different ones. It would be up to the peripheral
driver to wait for the BTP stream to be available.

Adding a queuing mechanism is a bridge too far for now, most platforms
only have 1 or 2 devices only, and a peripheral driver may or may not be
ok with delayed downloads. For now the main ask is to reduce download
times, a single stream is already a good start. There are other
refinements we need to add as well, such as changing clocks to use the
fastest gear. I'd like to proceed with baby steps...

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 09/16] soundwire: crc8: add constant table
  2023-12-18 12:01   ` Vinod Koul
@ 2023-12-18 13:26     ` Pierre-Louis Bossart
  2023-12-21 14:42       ` Vinod Koul
  0 siblings, 1 reply; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-18 13:26 UTC (permalink / raw)
  To: Vinod Koul
  Cc: linux-sound, alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou



On 12/18/23 06:01, Vinod Koul wrote:
> On 07-12-23, 16:29, Pierre-Louis Bossart wrote:
>> Add the lookup table required by crc8(). All configuration values were
>> directly table from the MIPI SoundWire 1.x specification.
>>
>> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
>> ---
>>  drivers/soundwire/Makefile |   4 +-
>>  drivers/soundwire/crc8.c   | 277 +++++++++++++++++++++++++++++++++++++
>>  drivers/soundwire/crc8.h   |  11 ++
>>  3 files changed, 291 insertions(+), 1 deletion(-)
>>  create mode 100644 drivers/soundwire/crc8.c
>>  create mode 100644 drivers/soundwire/crc8.h
>>
>> diff --git a/drivers/soundwire/Makefile b/drivers/soundwire/Makefile
>> index 657f5888a77b..170128dd9318 100644
>> --- a/drivers/soundwire/Makefile
>> +++ b/drivers/soundwire/Makefile
>> @@ -5,7 +5,9 @@
>>  
>>  #Bus Objs
>>  soundwire-bus-y := bus_type.o bus.o master.o slave.o mipi_disco.o stream.o  \
>> -			sysfs_slave.o sysfs_slave_dpn.o
>> +			sysfs_slave.o sysfs_slave_dpn.o \
>> +			crc8.o
>> +
>>  obj-$(CONFIG_SOUNDWIRE) += soundwire-bus.o
>>  
>>  soundwire-generic-allocation-objs := generic_bandwidth_allocation.o
>> diff --git a/drivers/soundwire/crc8.c b/drivers/soundwire/crc8.c
>> new file mode 100644
>> index 000000000000..b6b984d7f39a
>> --- /dev/null
>> +++ b/drivers/soundwire/crc8.c
>> @@ -0,0 +1,277 @@
>> +// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
>> +// Copyright(c) 2024 Intel Corporation.
>> +
>> +#include <linux/crc8.h>
>> +#include <linux/module.h>
>> +#include "crc8.h"
>> +
>> +/*
>> + * the MIPI SoundWire CRC8 polynomial is X^8 + X^6 + X^3 + X^2 + 1, MSB first
>> + * The value is (1)01001101 = 0x4D
>> + *
>> + * the table below was generated with
>> + *
>> + *	u8 crc8_lookup_table[CRC8_TABLE_SIZE];
>> + *	crc8_populate_msb(crc8_lookup_table, SDW_CRC8_POLY);
> 
> Good that you found this API, so next question would be why should we
> have this static table in kernel and not generate on probe if bpt is
> supported..? Many subsystems use these APIs to generate the tables..

The table is going to be the same for all hosts, it's simpler if
everyone uses a constant table, no? We're talking about 256 bytes added
for the common bus parts, be it with dynamically allocated memory or a
static table.

I don't mind reverting to a dynamically allocated table populated at
boot if that was the consensus.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description
  2023-12-18 12:58     ` Pierre-Louis Bossart
@ 2023-12-18 14:29       ` Charles Keepax
  2023-12-18 16:33         ` Pierre-Louis Bossart
  2023-12-21 14:44         ` Vinod Koul
  2023-12-21 14:44       ` Vinod Koul
  1 sibling, 2 replies; 57+ messages in thread
From: Charles Keepax @ 2023-12-18 14:29 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: Vinod Koul, linux-sound, alsa-devel, tiwai, broonie, vinod.koul,
	Bard liao, Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Richard Fitzgerald, Shuming Fan, Jack Yu, Oder Chiou

On Mon, Dec 18, 2023 at 01:58:47PM +0100, Pierre-Louis Bossart wrote:
> > why not have a single API that does both? First check if it is supported
> > and then allocate buffers and do the transfer.. What are the advantages
> > of using this two step process
> 
> Symmetry is the only thing that comes to my mind. Open - close and send
> - wait are natural matches, aren't they?
> 
> We do need a wait(), so bundling open() and send() would be odd.
> 

I agree send->wait->close would be odd, But you just bundle close
into wait. So the API becomes just send->wait, which seems pretty
logical.

> But you have a point that the open() is not generic in that it also
> prepares the DMA buffers for transmission. Maybe it's more natural to
> follow the traditional open(), hw_params(), hw_free, close() from ALSA.

I think this just makes it worse, you are now adding even more
calls. The problem I see here is that, open and close (at least to
me) strongly implies that you can do multiple operations between
them and unless I have misunderstood something here you can't.

Thanks,
Charles

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 07/16] soundwire: bus: add API for BPT protocol
  2023-12-18 13:12     ` Pierre-Louis Bossart
@ 2023-12-18 14:57       ` Charles Keepax
  2023-12-18 16:44         ` Pierre-Louis Bossart
  2023-12-21 14:49       ` Vinod Koul
  1 sibling, 1 reply; 57+ messages in thread
From: Charles Keepax @ 2023-12-18 14:57 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: Vinod Koul, linux-sound, alsa-devel, tiwai, broonie, vinod.koul,
	Bard liao, Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Richard Fitzgerald, Shuming Fan, Jack Yu, Oder Chiou

On Mon, Dec 18, 2023 at 02:12:36PM +0100, Pierre-Louis Bossart wrote:
> > Is this a protocol requirement?
> 
> No, it's an implementation requirement.
> 
> We could move this to host-specific parts but then the codec drivers
> will have to know about alignment requirements for each host they are
> use with. IOW, it's more work for codec drivers if we don't have a
> minimum bar for alignment requirement across all platforms.
> 

I do certainly see that side of the argument and it does probably
warrant some thought as to how a slave might learn the alignment
requirements. I guess maybe some sort of core helper function to
return the alignment? Or putting it in properties the slave can
access? One could even keep the check here, but just pull the
value from something system specific.

The danger with putting it in the core is IMHO:

a) It rules out certain use-cases, generally I think its a bad
idea if the framework design prohibits stuff the underlying bus
could do because someone will, at some point, want to do it.

b) The core limit could get a bit out of hand once more
controllers are added. The core limit needs to be a multiple of
all the controller limits, if a controller comes along with a
weird alignment requirement, that gets problematic fast.

Thanks,
Charles

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description
  2023-12-18 14:29       ` Charles Keepax
@ 2023-12-18 16:33         ` Pierre-Louis Bossart
  2023-12-21 14:45           ` Vinod Koul
  2023-12-21 14:44         ` Vinod Koul
  1 sibling, 1 reply; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-18 16:33 UTC (permalink / raw)
  To: Charles Keepax
  Cc: Vinod Koul, linux-sound, alsa-devel, tiwai, broonie, vinod.koul,
	Bard liao, Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Richard Fitzgerald, Shuming Fan, Jack Yu, Oder Chiou



On 12/18/23 08:29, Charles Keepax wrote:
> On Mon, Dec 18, 2023 at 01:58:47PM +0100, Pierre-Louis Bossart wrote:
>>> why not have a single API that does both? First check if it is supported
>>> and then allocate buffers and do the transfer.. What are the advantages
>>> of using this two step process
>>
>> Symmetry is the only thing that comes to my mind. Open - close and send
>> - wait are natural matches, aren't they?
>>
>> We do need a wait(), so bundling open() and send() would be odd.
>>
> 
> I agree send->wait->close would be odd, But you just bundle close
> into wait. So the API becomes just send->wait, which seems pretty
> logical.

Fair enough, send()/wait() would work indeed.

I guess I wanted to keep the callbacks reasonably small (already 200
lines for the open), but we can split the 'send' callback into smaller
helpers to keep the code readable. There's no good reason to expose
these smaller helpers to codec drivers.

>> But you have a point that the open() is not generic in that it also
>> prepares the DMA buffers for transmission. Maybe it's more natural to
>> follow the traditional open(), hw_params(), hw_free, close() from ALSA.
> 
> I think this just makes it worse, you are now adding even more
> calls. The problem I see here is that, open and close (at least to
> me) strongly implies that you can do multiple operations between
> them and unless I have misunderstood something here you can't.

That's right, the open was not compatible with multiple operations.
Collapsing open/send and wait/close sounds more logical, thanks for the
feedback.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 07/16] soundwire: bus: add API for BPT protocol
  2023-12-18 14:57       ` Charles Keepax
@ 2023-12-18 16:44         ` Pierre-Louis Bossart
  0 siblings, 0 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-18 16:44 UTC (permalink / raw)
  To: Charles Keepax
  Cc: Vinod Koul, linux-sound, alsa-devel, tiwai, broonie, vinod.koul,
	Bard liao, Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Richard Fitzgerald, Shuming Fan, Jack Yu, Oder Chiou



On 12/18/23 08:57, Charles Keepax wrote:
> On Mon, Dec 18, 2023 at 02:12:36PM +0100, Pierre-Louis Bossart wrote:
>>> Is this a protocol requirement?
>>
>> No, it's an implementation requirement.
>>
>> We could move this to host-specific parts but then the codec drivers
>> will have to know about alignment requirements for each host they are
>> use with. IOW, it's more work for codec drivers if we don't have a
>> minimum bar for alignment requirement across all platforms.
>>
> 
> I do certainly see that side of the argument and it does probably
> warrant some thought as to how a slave might learn the alignment
> requirements. I guess maybe some sort of core helper function to
> return the alignment? Or putting it in properties the slave can
> access? One could even keep the check here, but just pull the
> value from something system specific.
> 
> The danger with putting it in the core is IMHO:
> 
> a) It rules out certain use-cases, generally I think its a bad
> idea if the framework design prohibits stuff the underlying bus
> could do because someone will, at some point, want to do it.

SoundWire has lots of fancy and borderline nebulous concepts, my take is
"let's do few things and do them well". We can always revisit new usages
later, for now my main objective is "speed up downloads".

> b) The core limit could get a bit out of hand once more
> controllers are added. The core limit needs to be a multiple of
> all the controller limits, if a controller comes along with a
> weird alignment requirement, that gets problematic fast.

I don't have any information on other controllers, but I wouldn't be
surprised if most of the quirks are due to peripheral limitations and
ambiguous interpretations of what 'ACK' means. I tried writing to
reserved parts of the memory or non-existent registers and nothing bad
was reported...

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description
  2023-12-08 21:49       ` Mark Brown
@ 2023-12-19 16:50         ` Pierre-Louis Bossart
  2023-12-19 16:53           ` Mark Brown
  0 siblings, 1 reply; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-19 16:50 UTC (permalink / raw)
  To: Mark Brown
  Cc: linux-sound, alsa-devel, tiwai, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou



>>> This feels a lot like it could map onto the regmap async interface, it
>>> would need a bit of a rework (mainly that currently they provide
>>> ordering guarantees with respect to both each other and sync I/O) but
>>> those could be removed with some care) but it's got the "here's a list
>>> of I/O, here's another call to ensure it's all actually happened" thing.
>>> It sounds very much like how I was thinking of the async API when I was
>>> writing it and the initial users.
> 
>>> It's this bit that really got me thinking it could fit well into regmap.
> 
>> I really don't know anything about this async interface, if you have
>> pointers on existing examples I am all ears....I am not aware of any
>> Intel platform or codec used on an Intel platform making use of this API.
> 
> grep for regmap_.*async - cs_dsp.c is the upstream example in a driver,
> or there's the rbtree cache sync code which uses a back door to go into
> an async mode.  Basically just variants of all the normal regmap I/O
> calls with a _complete() call you can use to wait for everything to
> happen.  The implementation is a bit heavyweight since it was written to
> work with fairly slow buses.

I spent a fair amount of time this afternoon trying to understand the
regmap_async parts, and I am not following where in the code there is an
ordering requirement/enforcement between async and sync usages.

I do see this comment in the code

 * @async_write: Write operation which completes asynchronously,
optional and must serialise with respect to non-async I/O.

But that 'must' is a requirement on the codec side, isn't it?

When using regmap_raw_write_async(), the lock is released immediately. I
don't see anything at the regmap level that would prevent a codec driver
from using regmap_raw_write() immediately.

That's probably a violation of the API to do so, but it's the same
problem I was referring earlier in the conversation where 'regular'
read/writes cannot happen in parallel with BTP/BRA transactions.

Also is this just me spacing out or there is no regmap_raw_read_async()?


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description
  2023-12-19 16:50         ` Pierre-Louis Bossart
@ 2023-12-19 16:53           ` Mark Brown
  2023-12-19 17:08             ` Pierre-Louis Bossart
  0 siblings, 1 reply; 57+ messages in thread
From: Mark Brown @ 2023-12-19 16:53 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: linux-sound, alsa-devel, tiwai, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou

[-- Attachment #1: Type: text/plain, Size: 928 bytes --]

On Tue, Dec 19, 2023 at 05:50:30PM +0100, Pierre-Louis Bossart wrote:

> > grep for regmap_.*async - cs_dsp.c is the upstream example in a driver,
> > or there's the rbtree cache sync code which uses a back door to go into
> > an async mode.  Basically just variants of all the normal regmap I/O
> > calls with a _complete() call you can use to wait for everything to
> > happen.  The implementation is a bit heavyweight since it was written to
> > work with fairly slow buses.

> I spent a fair amount of time this afternoon trying to understand the
> regmap_async parts, and I am not following where in the code there is an
> ordering requirement/enforcement between async and sync usages.

The only actual async implementation is SPI which processes things in
order of submission, the sync API wraps the async API.

> Also is this just me spacing out or there is no regmap_raw_read_async()?

Right, there was never any need.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description
  2023-12-19 16:53           ` Mark Brown
@ 2023-12-19 17:08             ` Pierre-Louis Bossart
  2023-12-20 15:16               ` Charles Keepax
  0 siblings, 1 reply; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-19 17:08 UTC (permalink / raw)
  To: Mark Brown
  Cc: linux-sound, alsa-devel, tiwai, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou



On 12/19/23 17:53, Mark Brown wrote:
> On Tue, Dec 19, 2023 at 05:50:30PM +0100, Pierre-Louis Bossart wrote:
> 
>>> grep for regmap_.*async - cs_dsp.c is the upstream example in a driver,
>>> or there's the rbtree cache sync code which uses a back door to go into
>>> an async mode.  Basically just variants of all the normal regmap I/O
>>> calls with a _complete() call you can use to wait for everything to
>>> happen.  The implementation is a bit heavyweight since it was written to
>>> work with fairly slow buses.
> 
>> I spent a fair amount of time this afternoon trying to understand the
>> regmap_async parts, and I am not following where in the code there is an
>> ordering requirement/enforcement between async and sync usages.
> 
> The only actual async implementation is SPI which processes things in
> order of submission, the sync API wraps the async API.
> 
>> Also is this just me spacing out or there is no regmap_raw_read_async()?
> 
> Right, there was never any need.

ok. I am starting to think that we could have a new type of regmap, say
"regmap-sdw-bra", where the use of write_raw_async() would rely on the
send/wait bus primitives, and write_raw() would fallback to the regular
read/write commands. We'd need a mutual exclusion to prevent parallel
async/sync access to the same regmap.

In other words, "memory" areas that are used for firmware downloads
would be moved to a different regmap with async capabilities and no
caching support.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description
  2023-12-19 17:08             ` Pierre-Louis Bossart
@ 2023-12-20 15:16               ` Charles Keepax
  2023-12-20 18:26                 ` Pierre-Louis Bossart
  0 siblings, 1 reply; 57+ messages in thread
From: Charles Keepax @ 2023-12-20 15:16 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: Mark Brown, linux-sound, alsa-devel, tiwai, vinod.koul,
	Bard liao, Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Richard Fitzgerald, Shuming Fan, Jack Yu, Oder Chiou

On Tue, Dec 19, 2023 at 06:08:15PM +0100, Pierre-Louis Bossart wrote:
> On 12/19/23 17:53, Mark Brown wrote:
> > On Tue, Dec 19, 2023 at 05:50:30PM +0100, Pierre-Louis Bossart wrote:
> >>> grep for regmap_.*async - cs_dsp.c is the upstream example in a driver,
> >>> or there's the rbtree cache sync code which uses a back door to go into
> >>> an async mode.  Basically just variants of all the normal regmap I/O
> >>> calls with a _complete() call you can use to wait for everything to
> >>> happen.  The implementation is a bit heavyweight since it was written to
> >>> work with fairly slow buses.
> > 
> >> I spent a fair amount of time this afternoon trying to understand the
> >> regmap_async parts, and I am not following where in the code there is an
> >> ordering requirement/enforcement between async and sync usages.
> > 
> > The only actual async implementation is SPI which processes things in
> > order of submission, the sync API wraps the async API.
> > 
> >> Also is this just me spacing out or there is no regmap_raw_read_async()?
> > 
> > Right, there was never any need.
> 
> ok. I am starting to think that we could have a new type of regmap, say
> "regmap-sdw-bra", where the use of write_raw_async() would rely on the
> send/wait bus primitives, and write_raw() would fallback to the regular
> read/write commands. We'd need a mutual exclusion to prevent parallel
> async/sync access to the same regmap.
> 
> In other words, "memory" areas that are used for firmware downloads
> would be moved to a different regmap with async capabilities and no
> caching support.

I would be a little inclined to say leave adding a regmap for a
follow up series, whether we add it to the existing regmap or add
a new one, or whatever, it should all sit happily on top of the
API being added in this series. Makes it a little more contained
to focus on one area at a time, and leave this series as adding
core support for BRA. But that said, if we really want to I don't
feel mega strongly on this one.

Thanks,
Charles

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description
  2023-12-20 15:16               ` Charles Keepax
@ 2023-12-20 18:26                 ` Pierre-Louis Bossart
  2023-12-20 18:28                   ` Mark Brown
  2023-12-21  9:46                   ` Charles Keepax
  0 siblings, 2 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-20 18:26 UTC (permalink / raw)
  To: Charles Keepax
  Cc: Mark Brown, linux-sound, alsa-devel, tiwai, vinod.koul,
	Bard liao, Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Richard Fitzgerald, Shuming Fan, Jack Yu, Oder Chiou



On 12/20/23 16:16, Charles Keepax wrote:
> On Tue, Dec 19, 2023 at 06:08:15PM +0100, Pierre-Louis Bossart wrote:
>> On 12/19/23 17:53, Mark Brown wrote:
>>> On Tue, Dec 19, 2023 at 05:50:30PM +0100, Pierre-Louis Bossart wrote:
>>>>> grep for regmap_.*async - cs_dsp.c is the upstream example in a driver,
>>>>> or there's the rbtree cache sync code which uses a back door to go into
>>>>> an async mode.  Basically just variants of all the normal regmap I/O
>>>>> calls with a _complete() call you can use to wait for everything to
>>>>> happen.  The implementation is a bit heavyweight since it was written to
>>>>> work with fairly slow buses.
>>>
>>>> I spent a fair amount of time this afternoon trying to understand the
>>>> regmap_async parts, and I am not following where in the code there is an
>>>> ordering requirement/enforcement between async and sync usages.
>>>
>>> The only actual async implementation is SPI which processes things in
>>> order of submission, the sync API wraps the async API.
>>>
>>>> Also is this just me spacing out or there is no regmap_raw_read_async()?
>>>
>>> Right, there was never any need.
>>
>> ok. I am starting to think that we could have a new type of regmap, say
>> "regmap-sdw-bra", where the use of write_raw_async() would rely on the
>> send/wait bus primitives, and write_raw() would fallback to the regular
>> read/write commands. We'd need a mutual exclusion to prevent parallel
>> async/sync access to the same regmap.
>>
>> In other words, "memory" areas that are used for firmware downloads
>> would be moved to a different regmap with async capabilities and no
>> caching support.
> 
> I would be a little inclined to say leave adding a regmap for a
> follow up series, whether we add it to the existing regmap or add
> a new one, or whatever, it should all sit happily on top of the
> API being added in this series. Makes it a little more contained
> to focus on one area at a time, and leave this series as adding
> core support for BRA. But that said, if we really want to I don't
> feel mega strongly on this one.

Right, I was probably going too far down in the details for a December
20 post.

The point I was trying to make it seems there's consensus that regmap
with the async parts would be the API used by SoundWire/ASoC codecs, and
the regmap implementation would rely on the bus/host send/wait routines.

The regmap stuff will need joint work with codec driver folks so it
should indeed be done in a second step when the SoundWire bus/host parts
are available.

Put differently: is there any sustained objection to the proposal of
extending regmap with async BRA transfers?

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description
  2023-12-20 18:26                 ` Pierre-Louis Bossart
@ 2023-12-20 18:28                   ` Mark Brown
  2023-12-21  9:46                   ` Charles Keepax
  1 sibling, 0 replies; 57+ messages in thread
From: Mark Brown @ 2023-12-20 18:28 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: Charles Keepax, linux-sound, alsa-devel, tiwai, vinod.koul,
	Bard liao, Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Richard Fitzgerald, Shuming Fan, Jack Yu, Oder Chiou

[-- Attachment #1: Type: text/plain, Size: 291 bytes --]

On Wed, Dec 20, 2023 at 07:26:24PM +0100, Pierre-Louis Bossart wrote:

> Put differently: is there any sustained objection to the proposal of
> extending regmap with async BRA transfers?

Not from me, and I agree that it makes sense to do it once the
underlying SoundWire bits are in place.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description
  2023-12-20 18:26                 ` Pierre-Louis Bossart
  2023-12-20 18:28                   ` Mark Brown
@ 2023-12-21  9:46                   ` Charles Keepax
  1 sibling, 0 replies; 57+ messages in thread
From: Charles Keepax @ 2023-12-21  9:46 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: Mark Brown, linux-sound, alsa-devel, tiwai, vinod.koul,
	Bard liao, Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Richard Fitzgerald, Shuming Fan, Jack Yu, Oder Chiou

On Wed, Dec 20, 2023 at 07:26:24PM +0100, Pierre-Louis Bossart wrote:
> On 12/20/23 16:16, Charles Keepax wrote:
> > On Tue, Dec 19, 2023 at 06:08:15PM +0100, Pierre-Louis Bossart wrote:
> >> On 12/19/23 17:53, Mark Brown wrote:
> >>> On Tue, Dec 19, 2023 at 05:50:30PM +0100, Pierre-Louis Bossart wrote:
> Put differently: is there any sustained objection to the proposal of
> extending regmap with async BRA transfers?

Seems good to me, was not my intention to object to the idea itself.

Thanks,
Charles

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 08/16] soundwire: bus: add bpt_stream pointer
  2023-12-18 13:20     ` Pierre-Louis Bossart
@ 2023-12-21 14:39       ` Vinod Koul
  2023-12-21 17:09         ` Pierre-Louis Bossart
  0 siblings, 1 reply; 57+ messages in thread
From: Vinod Koul @ 2023-12-21 14:39 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: linux-sound, alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou

On 18-12-23, 14:20, Pierre-Louis Bossart wrote:
> 
> 
> On 12/18/23 05:55, Vinod Koul wrote:
> > On 07-12-23, 16:29, Pierre-Louis Bossart wrote:
> >> Add a convenience pointer to the 'sdw_bus' structure. BPT is a
> >> dedicated stream which will typically not be handled by DAIs or
> >> dailinks. Since there's only one BPT stream per link, storing the
> >> pointer at the link level seems rather natural.
> >>
> >> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> >> ---
> >>  include/linux/soundwire/sdw.h | 2 ++
> >>  1 file changed, 2 insertions(+)
> >>
> >> diff --git a/include/linux/soundwire/sdw.h b/include/linux/soundwire/sdw.h
> >> index e54c5bbd2b91..8db0cd7d0d89 100644
> >> --- a/include/linux/soundwire/sdw.h
> >> +++ b/include/linux/soundwire/sdw.h
> >> @@ -965,6 +965,7 @@ struct sdw_master_ops {
> >>   * @stream_refcount: number of streams currently using this bus
> >>   * @btp_stream_refcount: number of BTP streams currently using this bus (should
> >>   * be zero or one, multiple streams per link is not supported).
> >> + * @bpt_stream: pointer stored for convenience.
> >>   */
> >>  struct sdw_bus {
> >>  	struct device *dev;
> >> @@ -996,6 +997,7 @@ struct sdw_bus {
> >>  	int hw_sync_min_links;
> >>  	int stream_refcount;
> >>  	int bpt_stream_refcount;
> >> +	struct sdw_stream_runtime *bpt_stream;
> > 
> > So we are limiting to single stream? Can we have multiple transfers
> > queued up, which I guess might imply multiple streams?
> 
> 
> Yes for now there is a BTP/BRA single stream active when there are no
> audio transfers taking place. This is the only way to guarantee
> predictable download/resume times.
> 
> There is no mechanism to queue up transfers, be it from the same
> peripheral device or different ones. It would be up to the peripheral
> driver to wait for the BTP stream to be available.
> 
> Adding a queuing mechanism is a bridge too far for now, most platforms
> only have 1 or 2 devices only, and a peripheral driver may or may not be
> ok with delayed downloads. For now the main ask is to reduce download
> times, a single stream is already a good start. There are other
> refinements we need to add as well, such as changing clocks to use the
> fastest gear. I'd like to proceed with baby steps...

Since the API is async in nature, I think it is very reasonable that we
add the queue support (a simple list would do) and notify when the
transfer is complete..

-- 
~Vinod

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 09/16] soundwire: crc8: add constant table
  2023-12-18 13:26     ` Pierre-Louis Bossart
@ 2023-12-21 14:42       ` Vinod Koul
  2023-12-21 17:15         ` Pierre-Louis Bossart
  0 siblings, 1 reply; 57+ messages in thread
From: Vinod Koul @ 2023-12-21 14:42 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: linux-sound, alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou

On 18-12-23, 14:26, Pierre-Louis Bossart wrote:
> 
> 
> On 12/18/23 06:01, Vinod Koul wrote:
> > On 07-12-23, 16:29, Pierre-Louis Bossart wrote:
> >> Add the lookup table required by crc8(). All configuration values were
> >> directly table from the MIPI SoundWire 1.x specification.
> >>
> >> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> >> ---
> >>  drivers/soundwire/Makefile |   4 +-
> >>  drivers/soundwire/crc8.c   | 277 +++++++++++++++++++++++++++++++++++++
> >>  drivers/soundwire/crc8.h   |  11 ++
> >>  3 files changed, 291 insertions(+), 1 deletion(-)
> >>  create mode 100644 drivers/soundwire/crc8.c
> >>  create mode 100644 drivers/soundwire/crc8.h
> >>
> >> diff --git a/drivers/soundwire/Makefile b/drivers/soundwire/Makefile
> >> index 657f5888a77b..170128dd9318 100644
> >> --- a/drivers/soundwire/Makefile
> >> +++ b/drivers/soundwire/Makefile
> >> @@ -5,7 +5,9 @@
> >>  
> >>  #Bus Objs
> >>  soundwire-bus-y := bus_type.o bus.o master.o slave.o mipi_disco.o stream.o  \
> >> -			sysfs_slave.o sysfs_slave_dpn.o
> >> +			sysfs_slave.o sysfs_slave_dpn.o \
> >> +			crc8.o
> >> +
> >>  obj-$(CONFIG_SOUNDWIRE) += soundwire-bus.o
> >>  
> >>  soundwire-generic-allocation-objs := generic_bandwidth_allocation.o
> >> diff --git a/drivers/soundwire/crc8.c b/drivers/soundwire/crc8.c
> >> new file mode 100644
> >> index 000000000000..b6b984d7f39a
> >> --- /dev/null
> >> +++ b/drivers/soundwire/crc8.c
> >> @@ -0,0 +1,277 @@
> >> +// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
> >> +// Copyright(c) 2024 Intel Corporation.
> >> +
> >> +#include <linux/crc8.h>
> >> +#include <linux/module.h>
> >> +#include "crc8.h"
> >> +
> >> +/*
> >> + * the MIPI SoundWire CRC8 polynomial is X^8 + X^6 + X^3 + X^2 + 1, MSB first
> >> + * The value is (1)01001101 = 0x4D
> >> + *
> >> + * the table below was generated with
> >> + *
> >> + *	u8 crc8_lookup_table[CRC8_TABLE_SIZE];
> >> + *	crc8_populate_msb(crc8_lookup_table, SDW_CRC8_POLY);
> > 
> > Good that you found this API, so next question would be why should we
> > have this static table in kernel and not generate on probe if bpt is
> > supported..? Many subsystems use these APIs to generate the tables..
> 
> The table is going to be the same for all hosts, it's simpler if
> everyone uses a constant table, no? We're talking about 256 bytes added
> for the common bus parts, be it with dynamically allocated memory or a
> static table.
> 
> I don't mind reverting to a dynamically allocated table populated at
> boot if that was the consensus.

Most of the kernel users I looked have dynamically allocated table
populated at boot, also out of many users how many would support BTP.?
Your older platforms, current qcom dont, so not point is adding for
everyone...


-- 
~Vinod

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description
  2023-12-18 12:58     ` Pierre-Louis Bossart
  2023-12-18 14:29       ` Charles Keepax
@ 2023-12-21 14:44       ` Vinod Koul
  1 sibling, 0 replies; 57+ messages in thread
From: Vinod Koul @ 2023-12-21 14:44 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: linux-sound, alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou

On 18-12-23, 13:58, Pierre-Louis Bossart wrote:
> 
> >>  Documentation/driver-api/soundwire/bra.rst    | 478 ++++++++++++++++++
> > 
> > Can we split the cadence parts of this to bra-cadence.rst that way this
> > file documents the core parts only
> 
> Yes, we can split the Cadence parts out.

Great

> 
> 
> >> +Error handling
> >> +~~~~~~~~~~~~~~
> >> +
> >> +The expected response to a 'bra_message' and follow-up behavior may
> >> +vary:
> >> +
> >> +   (1) A Peripheral driver may want to receive an immediate -EBUSY
> >> +       response if the BRA protocol is not available at a given time.
> >> +
> >> +   (2) A Peripheral driver may want to wait until a timeout for the
> >> +       on-going transfer to be handled
> >> +
> >> +   (3) A Peripheral driver may want to wait until existing BRA
> >> +       transfers complete or deal with BRA as a background task when
> >> +       audio transfers stop. In this case, there would be no timeout,
> >> +       and the operation may not happen if the platform is suspended.
> > 
> > Is this runtime suspend or S3/S4 case?
> 
> System suspend (which can also mean S0i1).
> 
> I don't think we can have a case where a peripheral driver waits on
> something without having done a pm_runtime_get_sync() to prevent
> runtime_pm suspend.
> 
> > 
> >> +BTP/BRA API available to peripheral drivers
> >> +-------------------------------------------
> >> +
> >> +ASoC Peripheral drivers may use
> >> +
> >> +   - sdw_bpt_stream_open(mode)
> >> +
> >> +      This function verifies that the BPT protocol with the
> >> +      'mode'. For now only BRA is accepted as a mode. This function
> >> +      allocates a work buffer internally. This buffer is not exposed
> >> +      to the caller.
> >> +
> >> +     errors:
> >> +         -ENODEV: BPT/BRA is not supported by the Manager.
> >> +
> >> +         -EBUSY: another agent is already using the audio payload for
> >> +          audio transfers. There is no way to predict when the audio
> >> +          streams might stop, this will require the Peripheral driver
> >> +          to fall back to the regular (slow) command channel.
> >> +
> >> +         -EAGAIN: another agent is already transferring data using the
> >> +          BPT/BRA protocol. Since the transfers will typically last
> >> +          10s or 100s of ms, the Peripheral driver may wait and retry
> >> +          later.
> >> +
> >> +    - sdw_bpt_message_send_async(bpt_message)
> > 
> > why not have a single API that does both? First check if it is supported
> > and then allocate buffers and do the transfer.. What are the advantages
> > of using this two step process
> 
> Symmetry is the only thing that comes to my mind. Open - close and send
> - wait are natural matches, aren't they?

Why have symmetry to DAI apis, why not symmetry to regmap write APIs..?
This is data transfer, so I am not sure why would we model it as a DAI.
(Internal implementation may rely on that but from API design, i dont
think that should be a concern)

> 
> We do need a wait(), so bundling open() and send() would be odd.
> 
> But you have a point that the open() is not generic in that it also
> prepares the DMA buffers for transmission. Maybe it's more natural to
> follow the traditional open(), hw_params(), hw_free, close() from ALSA.

-- 
~Vinod

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description
  2023-12-18 14:29       ` Charles Keepax
  2023-12-18 16:33         ` Pierre-Louis Bossart
@ 2023-12-21 14:44         ` Vinod Koul
  1 sibling, 0 replies; 57+ messages in thread
From: Vinod Koul @ 2023-12-21 14:44 UTC (permalink / raw)
  To: Charles Keepax
  Cc: Pierre-Louis Bossart, linux-sound, alsa-devel, tiwai, broonie,
	vinod.koul, Bard liao, Ranjani Sridharan, Peter Ujfalusi,
	Kai Vehmanen, srinivas.kandagatla, Krzysztof Kozlowski,
	vijendar.mukunda, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou

On 18-12-23, 14:29, Charles Keepax wrote:
> On Mon, Dec 18, 2023 at 01:58:47PM +0100, Pierre-Louis Bossart wrote:
> > > why not have a single API that does both? First check if it is supported
> > > and then allocate buffers and do the transfer.. What are the advantages
> > > of using this two step process
> > 
> > Symmetry is the only thing that comes to my mind. Open - close and send
> > - wait are natural matches, aren't they?
> > 
> > We do need a wait(), so bundling open() and send() would be odd.
> > 
> 
> I agree send->wait->close would be odd, But you just bundle close
> into wait. So the API becomes just send->wait, which seems pretty
> logical.

You would drop close as well, only send and wait...

> 
> > But you have a point that the open() is not generic in that it also
> > prepares the DMA buffers for transmission. Maybe it's more natural to
> > follow the traditional open(), hw_params(), hw_free, close() from ALSA.
> 
> I think this just makes it worse, you are now adding even more
> calls. The problem I see here is that, open and close (at least to
> me) strongly implies that you can do multiple operations between
> them and unless I have misunderstood something here you can't.
> 
> Thanks,
> Charles

-- 
~Vinod

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description
  2023-12-18 16:33         ` Pierre-Louis Bossart
@ 2023-12-21 14:45           ` Vinod Koul
  0 siblings, 0 replies; 57+ messages in thread
From: Vinod Koul @ 2023-12-21 14:45 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: Charles Keepax, linux-sound, alsa-devel, tiwai, broonie,
	vinod.koul, Bard liao, Ranjani Sridharan, Peter Ujfalusi,
	Kai Vehmanen, srinivas.kandagatla, Krzysztof Kozlowski,
	vijendar.mukunda, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou

On 18-12-23, 17:33, Pierre-Louis Bossart wrote:
> 
> 
> On 12/18/23 08:29, Charles Keepax wrote:
> > On Mon, Dec 18, 2023 at 01:58:47PM +0100, Pierre-Louis Bossart wrote:
> >>> why not have a single API that does both? First check if it is supported
> >>> and then allocate buffers and do the transfer.. What are the advantages
> >>> of using this two step process
> >>
> >> Symmetry is the only thing that comes to my mind. Open - close and send
> >> - wait are natural matches, aren't they?
> >>
> >> We do need a wait(), so bundling open() and send() would be odd.
> >>
> > 
> > I agree send->wait->close would be odd, But you just bundle close
> > into wait. So the API becomes just send->wait, which seems pretty
> > logical.
> 
> Fair enough, send()/wait() would work indeed.
> 
> I guess I wanted to keep the callbacks reasonably small (already 200
> lines for the open), but we can split the 'send' callback into smaller
> helpers to keep the code readable. There's no good reason to expose
> these smaller helpers to codec drivers.

Yes! that would be a better design IMO

> 
> >> But you have a point that the open() is not generic in that it also
> >> prepares the DMA buffers for transmission. Maybe it's more natural to
> >> follow the traditional open(), hw_params(), hw_free, close() from ALSA.
> > 
> > I think this just makes it worse, you are now adding even more
> > calls. The problem I see here is that, open and close (at least to
> > me) strongly implies that you can do multiple operations between
> > them and unless I have misunderstood something here you can't.
> 
> That's right, the open was not compatible with multiple operations.
> Collapsing open/send and wait/close sounds more logical, thanks for the
> feedback.

Sure

-- 
~Vinod

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 07/16] soundwire: bus: add API for BPT protocol
  2023-12-18 13:12     ` Pierre-Louis Bossart
  2023-12-18 14:57       ` Charles Keepax
@ 2023-12-21 14:49       ` Vinod Koul
  1 sibling, 0 replies; 57+ messages in thread
From: Vinod Koul @ 2023-12-21 14:49 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: linux-sound, alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou

On 18-12-23, 14:12, Pierre-Louis Bossart wrote:
> 
> >> +int sdw_bpt_open_stream(struct sdw_bus *bus,
> >> +			struct sdw_slave *slave,
> >> +			enum sdw_bpt_type mode,
> >> +			struct sdw_bpt_msg *msg)
> >> +{
> >> +	int ret;
> >> +
> >> +	/* only Bulk Register Access (BRA) is supported for now */
> >> +	if (mode != SDW_BRA)
> >> +		return -EINVAL;
> >> +
> >> +	if (msg->len < SDW_BPT_MSG_MIN_BYTES) {
> >> +		dev_err(bus->dev, "BPT message length %d, min supported %d\n",
> >> +			msg->len, SDW_BPT_MSG_MIN_BYTES);
> >> +		return -EINVAL;
> >> +	}
> >> +
> >> +	if (msg->len % SDW_BPT_MSG_BYTE_ALIGNMENT) {
> >> +		dev_err(bus->dev, "BPT message length %d is not a multiple of %d bytes\n",
> >> +			msg->len, SDW_BPT_MSG_BYTE_ALIGNMENT);
> >> +		return -EINVAL;
> >> +	}
> > 
> > Is this a protocol requirement?
> 
> No, it's an implementation requirement.
> 
> We could move this to host-specific parts but then the codec drivers
> will have to know about alignment requirements for each host they are
> use with. IOW, it's more work for codec drivers if we don't have a
> minimum bar for alignment requirement across all platforms.
> 
> > 
> >> +
> >> +	/* check device is enumerated */
> >> +	if (slave->dev_num == SDW_ENUM_DEV_NUM ||
> >> +	    slave->dev_num > SDW_MAX_DEVICES)
> >> +		return -ENODEV;
> >> +
> >> +	/* make sure all callbacks are defined */
> >> +	if (!bus->ops->bpt_open_stream ||
> >> +	    !bus->ops->bpt_close_stream ||
> >> +	    !bus->ops->bpt_send_async ||
> >> +	    !bus->ops->bpt_wait)
> >> +		return -ENOTSUPP;
> > 
> > should this not be checked at probe time, if device declares the support
> 
> sdw_bpt_open_stream() would be called by the peripheral driver (or
> regmap as a proxy). The peripheral driver could also decide to check for
> those callback during its probe, but that's beyond the scope of this
> patchset.

I would think that it is better to have capablities registered by the
driver and those are checked at registration, so we know if bpt is
supported or not for a particular platform.

This make more sense to me as some driver, depending on the SoC may or
maynot support this, so easy way would be to turn off caps, what do you
think?

> 
> These checks are just there for paranoia, in case a peripheral driver
> uses BTP/BRA on a host where they are not supported.
> 
> It's not science-fiction, we see AMD- and INTEL-based platforms using
> the same SoundWire-based codecs.

Ofcourse, it is entrely reasonable thing to do, event across x86/arm64

> 
> >> +	ret = bus->ops->bpt_open_stream(bus, slave, mode, msg);
> >> +	if (ret < 0)
> >> +		dev_err(bus->dev, "BPT stream open, err %d\n", ret);
> >> +
> >> +	return ret;
> >> +}
> >> +EXPORT_SYMBOL(sdw_bpt_open_stream);
> > 
> > can we open multiple times (i dont see a check preventing that), how do
> > we close..?
> 
> there's a refcount preventing multiples BTP streams from being opened.
> 
> > Re-iterating my comment on documentation patch, can we do with a async api
> > and wait api, that makes symantics a lot simpler, right..?
> 
> see reply in previous email, combining open+send is weird IMHO.
> 
> >> +
> >> +int sdw_bpt_send_async(struct sdw_bus *bus,
> >> +		       struct sdw_slave *slave,
> >> +		       struct sdw_bpt_msg *msg)
> >> +{
> >> +	if (msg->len > SDW_BPT_MSG_MAX_BYTES)
> >> +		return -EINVAL;
> >> +
> >> +	return bus->ops->bpt_send_async(bus, slave, msg);
> >> +}
> >> +EXPORT_SYMBOL(sdw_bpt_send_async);
> > 
> > Can we call this multiple times after open, it is unclear to me. Can you
> > please add kernel-doc comments about the APIs here as well
> 
> This can be called multiple times but it's useless: all the buffers are
> prepared in the open() stage. This is the moral equivalent of a trigger
> step, just enable data transfers.
> 
> > 
> >>  struct sdw_master_ops {
> >>  	int (*read_prop)(struct sdw_bus *bus);
> >> @@ -869,6 +913,20 @@ struct sdw_master_ops {
> >>  	void (*new_peripheral_assigned)(struct sdw_bus *bus,
> >>  					struct sdw_slave *slave,
> >>  					int dev_num);
> >> +	int (*bpt_open_stream)(struct sdw_bus *bus,
> >> +			       struct sdw_slave *slave,
> >> +			       enum sdw_bpt_type mode,
> >> +			       struct sdw_bpt_msg *msg);
> >> +	int (*bpt_close_stream)(struct sdw_bus *bus,
> >> +				struct sdw_slave *slave,
> >> +				enum sdw_bpt_type mode,
> >> +				struct sdw_bpt_msg *msg);
> >> +	int (*bpt_send_async)(struct sdw_bus *bus,
> >> +			      struct sdw_slave *slave,
> >> +			      struct sdw_bpt_msg *msg);
> >> +	int (*bpt_wait)(struct sdw_bus *bus,
> >> +			struct sdw_slave *slave,
> >> +			struct sdw_bpt_msg *msg);
> > 
> > do we need both bus and slave, that was a mistake in orignal design IMO.
> > We should fix that for bpt_ apis
> 
> No disagreement. All the routines follow the same template, if we change
> one we should also change the others.
> 
> The main question as discussed with Charles is whether we want to pass
> the 'msg' argument in all routines.

Lets revisit when we have new API

-- 
~Vinod

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 08/16] soundwire: bus: add bpt_stream pointer
  2023-12-21 14:39       ` Vinod Koul
@ 2023-12-21 17:09         ` Pierre-Louis Bossart
  0 siblings, 0 replies; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-21 17:09 UTC (permalink / raw)
  To: Vinod Koul
  Cc: linux-sound, alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou



On 12/21/23 15:39, Vinod Koul wrote:
> On 18-12-23, 14:20, Pierre-Louis Bossart wrote:
>>
>>
>> On 12/18/23 05:55, Vinod Koul wrote:
>>> On 07-12-23, 16:29, Pierre-Louis Bossart wrote:
>>>> Add a convenience pointer to the 'sdw_bus' structure. BPT is a
>>>> dedicated stream which will typically not be handled by DAIs or
>>>> dailinks. Since there's only one BPT stream per link, storing the
>>>> pointer at the link level seems rather natural.
>>>>
>>>> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
>>>> ---
>>>>  include/linux/soundwire/sdw.h | 2 ++
>>>>  1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/include/linux/soundwire/sdw.h b/include/linux/soundwire/sdw.h
>>>> index e54c5bbd2b91..8db0cd7d0d89 100644
>>>> --- a/include/linux/soundwire/sdw.h
>>>> +++ b/include/linux/soundwire/sdw.h
>>>> @@ -965,6 +965,7 @@ struct sdw_master_ops {
>>>>   * @stream_refcount: number of streams currently using this bus
>>>>   * @btp_stream_refcount: number of BTP streams currently using this bus (should
>>>>   * be zero or one, multiple streams per link is not supported).
>>>> + * @bpt_stream: pointer stored for convenience.
>>>>   */
>>>>  struct sdw_bus {
>>>>  	struct device *dev;
>>>> @@ -996,6 +997,7 @@ struct sdw_bus {
>>>>  	int hw_sync_min_links;
>>>>  	int stream_refcount;
>>>>  	int bpt_stream_refcount;
>>>> +	struct sdw_stream_runtime *bpt_stream;
>>>
>>> So we are limiting to single stream? Can we have multiple transfers
>>> queued up, which I guess might imply multiple streams?
>>
>>
>> Yes for now there is a BTP/BRA single stream active when there are no
>> audio transfers taking place. This is the only way to guarantee
>> predictable download/resume times.
>>
>> There is no mechanism to queue up transfers, be it from the same
>> peripheral device or different ones. It would be up to the peripheral
>> driver to wait for the BTP stream to be available.
>>
>> Adding a queuing mechanism is a bridge too far for now, most platforms
>> only have 1 or 2 devices only, and a peripheral driver may or may not be
>> ok with delayed downloads. For now the main ask is to reduce download
>> times, a single stream is already a good start. There are other
>> refinements we need to add as well, such as changing clocks to use the
>> fastest gear. I'd like to proceed with baby steps...
> 
> Since the API is async in nature, I think it is very reasonable that we
> add the queue support (a simple list would do) and notify when the
> transfer is complete..

On paper queueing is nice/elegant.

In practice it's likely that there's a bit of configuration needed
before/after each download, e.g. restart the DSP.

It would also be an absolute horror to deal with error flow if one of
the queued transfers did not succeed.

I would be more than happy if we successfully enable a single transfer
across multiple platforms, and look at queueing as a optimization.

BTW even if there are multiple transfers queued, there's still only one
stream active at a given time.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 09/16] soundwire: crc8: add constant table
  2023-12-21 14:42       ` Vinod Koul
@ 2023-12-21 17:15         ` Pierre-Louis Bossart
  2023-12-21 17:21           ` Vinod Koul
  0 siblings, 1 reply; 57+ messages in thread
From: Pierre-Louis Bossart @ 2023-12-21 17:15 UTC (permalink / raw)
  To: Vinod Koul
  Cc: linux-sound, alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou



On 12/21/23 15:42, Vinod Koul wrote:
> On 18-12-23, 14:26, Pierre-Louis Bossart wrote:
>>
>>
>> On 12/18/23 06:01, Vinod Koul wrote:
>>> On 07-12-23, 16:29, Pierre-Louis Bossart wrote:
>>>> Add the lookup table required by crc8(). All configuration values were
>>>> directly table from the MIPI SoundWire 1.x specification.
>>>>
>>>> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
>>>> ---
>>>>  drivers/soundwire/Makefile |   4 +-
>>>>  drivers/soundwire/crc8.c   | 277 +++++++++++++++++++++++++++++++++++++
>>>>  drivers/soundwire/crc8.h   |  11 ++
>>>>  3 files changed, 291 insertions(+), 1 deletion(-)
>>>>  create mode 100644 drivers/soundwire/crc8.c
>>>>  create mode 100644 drivers/soundwire/crc8.h
>>>>
>>>> diff --git a/drivers/soundwire/Makefile b/drivers/soundwire/Makefile
>>>> index 657f5888a77b..170128dd9318 100644
>>>> --- a/drivers/soundwire/Makefile
>>>> +++ b/drivers/soundwire/Makefile
>>>> @@ -5,7 +5,9 @@
>>>>  
>>>>  #Bus Objs
>>>>  soundwire-bus-y := bus_type.o bus.o master.o slave.o mipi_disco.o stream.o  \
>>>> -			sysfs_slave.o sysfs_slave_dpn.o
>>>> +			sysfs_slave.o sysfs_slave_dpn.o \
>>>> +			crc8.o
>>>> +
>>>>  obj-$(CONFIG_SOUNDWIRE) += soundwire-bus.o
>>>>  
>>>>  soundwire-generic-allocation-objs := generic_bandwidth_allocation.o
>>>> diff --git a/drivers/soundwire/crc8.c b/drivers/soundwire/crc8.c
>>>> new file mode 100644
>>>> index 000000000000..b6b984d7f39a
>>>> --- /dev/null
>>>> +++ b/drivers/soundwire/crc8.c
>>>> @@ -0,0 +1,277 @@
>>>> +// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
>>>> +// Copyright(c) 2024 Intel Corporation.
>>>> +
>>>> +#include <linux/crc8.h>
>>>> +#include <linux/module.h>
>>>> +#include "crc8.h"
>>>> +
>>>> +/*
>>>> + * the MIPI SoundWire CRC8 polynomial is X^8 + X^6 + X^3 + X^2 + 1, MSB first
>>>> + * The value is (1)01001101 = 0x4D
>>>> + *
>>>> + * the table below was generated with
>>>> + *
>>>> + *	u8 crc8_lookup_table[CRC8_TABLE_SIZE];
>>>> + *	crc8_populate_msb(crc8_lookup_table, SDW_CRC8_POLY);
>>>
>>> Good that you found this API, so next question would be why should we
>>> have this static table in kernel and not generate on probe if bpt is
>>> supported..? Many subsystems use these APIs to generate the tables..
>>
>> The table is going to be the same for all hosts, it's simpler if
>> everyone uses a constant table, no? We're talking about 256 bytes added
>> for the common bus parts, be it with dynamically allocated memory or a
>> static table.
>>
>> I don't mind reverting to a dynamically allocated table populated at
>> boot if that was the consensus.
> 
> Most of the kernel users I looked have dynamically allocated table
> populated at boot, also out of many users how many would support BTP.?
> Your older platforms, current qcom dont, so not point is adding for
> everyone...

All Intel hardware supports BTP/BRA, we just didn't have a compelling
reason to enable it so far. I've seen AMD stating that they also have
BTP/BRA. That's 2/3 of controllers.

I don't mind reverting to a devm_ allocated table, I am not sure I see
the benefits for 256 bytes of constant data.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 09/16] soundwire: crc8: add constant table
  2023-12-21 17:15         ` Pierre-Louis Bossart
@ 2023-12-21 17:21           ` Vinod Koul
  0 siblings, 0 replies; 57+ messages in thread
From: Vinod Koul @ 2023-12-21 17:21 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: linux-sound, alsa-devel, tiwai, broonie, vinod.koul, Bard liao,
	Ranjani Sridharan, Peter Ujfalusi, Kai Vehmanen,
	srinivas.kandagatla, Krzysztof Kozlowski, vijendar.mukunda,
	Charles Keepax, Richard Fitzgerald, Shuming Fan, Jack Yu,
	Oder Chiou

On 21-12-23, 18:15, Pierre-Louis Bossart wrote:

> >> I don't mind reverting to a dynamically allocated table populated at
> >> boot if that was the consensus.
> > 
> > Most of the kernel users I looked have dynamically allocated table
> > populated at boot, also out of many users how many would support BTP.?
> > Your older platforms, current qcom dont, so not point is adding for
> > everyone...
> 
> All Intel hardware supports BTP/BRA, we just didn't have a compelling
> reason to enable it so far. I've seen AMD stating that they also have
> BTP/BRA. That's 2/3 of controllers.

Not all qcom platforms support this, so this would be enabled
selectively for them

-- 
~Vinod

^ permalink raw reply	[flat|nested] 57+ messages in thread

end of thread, other threads:[~2023-12-21 17:21 UTC | newest]

Thread overview: 57+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-07 22:29 [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Pierre-Louis Bossart
2023-12-07 22:29 ` [RFC PATCH 01/16] Documentation: driver: add SoundWire BRA description Pierre-Louis Bossart
2023-12-07 23:29   ` Mark Brown
2023-12-08  0:56     ` Pierre-Louis Bossart
2023-12-08 21:49       ` Mark Brown
2023-12-19 16:50         ` Pierre-Louis Bossart
2023-12-19 16:53           ` Mark Brown
2023-12-19 17:08             ` Pierre-Louis Bossart
2023-12-20 15:16               ` Charles Keepax
2023-12-20 18:26                 ` Pierre-Louis Bossart
2023-12-20 18:28                   ` Mark Brown
2023-12-21  9:46                   ` Charles Keepax
2023-12-08 16:27   ` Charles Keepax
2023-12-08 18:45     ` Pierre-Louis Bossart
2023-12-08 18:55       ` Mark Brown
2023-12-18 11:40   ` Vinod Koul
2023-12-18 12:58     ` Pierre-Louis Bossart
2023-12-18 14:29       ` Charles Keepax
2023-12-18 16:33         ` Pierre-Louis Bossart
2023-12-21 14:45           ` Vinod Koul
2023-12-21 14:44         ` Vinod Koul
2023-12-21 14:44       ` Vinod Koul
2023-12-07 22:29 ` [RFC PATCH 02/16] soundwire: cadence: add BTP support for DP0 Pierre-Louis Bossart
2023-12-07 22:29 ` [RFC PATCH 03/16] soundwire: stream: extend sdw_alloc_stream() to take 'type' parameter Pierre-Louis Bossart
2023-12-07 22:29 ` [RFC PATCH 04/16] soundwire: extend sdw_stream_type to BPT Pierre-Louis Bossart
2023-12-07 22:29 ` [RFC PATCH 05/16] soundwire: stream: special-case the bus compute_params() routine Pierre-Louis Bossart
2023-12-07 22:29 ` [RFC PATCH 06/16] soundwire: stream: reuse existing code for BPT stream Pierre-Louis Bossart
2023-12-12 12:30   ` Charles Keepax
2023-12-18 10:45     ` Pierre-Louis Bossart
2023-12-07 22:29 ` [RFC PATCH 07/16] soundwire: bus: add API for BPT protocol Pierre-Louis Bossart
2023-12-12 12:19   ` Charles Keepax
2023-12-18 10:38     ` Pierre-Louis Bossart
2023-12-18 11:54   ` Vinod Koul
2023-12-18 13:12     ` Pierre-Louis Bossart
2023-12-18 14:57       ` Charles Keepax
2023-12-18 16:44         ` Pierre-Louis Bossart
2023-12-21 14:49       ` Vinod Koul
2023-12-07 22:29 ` [RFC PATCH 08/16] soundwire: bus: add bpt_stream pointer Pierre-Louis Bossart
2023-12-18 11:55   ` Vinod Koul
2023-12-18 13:20     ` Pierre-Louis Bossart
2023-12-21 14:39       ` Vinod Koul
2023-12-21 17:09         ` Pierre-Louis Bossart
2023-12-07 22:29 ` [RFC PATCH 09/16] soundwire: crc8: add constant table Pierre-Louis Bossart
2023-12-18 12:01   ` Vinod Koul
2023-12-18 13:26     ` Pierre-Louis Bossart
2023-12-21 14:42       ` Vinod Koul
2023-12-21 17:15         ` Pierre-Louis Bossart
2023-12-21 17:21           ` Vinod Koul
2023-12-07 22:29 ` [RFC PATCH 10/16] soundwire: cadence: add BTP/BRA helpers to format data Pierre-Louis Bossart
2023-12-07 22:29 ` [RFC PATCH 11/16] soundwire: intel_auxdevice: add indirection for BPT open/close/send_async/wait Pierre-Louis Bossart
2023-12-07 22:29 ` [RFC PATCH 12/16] ASoC: SOF: Intel: hda-sdw-bpt: add helpers for SoundWire BPT DMA Pierre-Louis Bossart
2023-12-07 22:29 ` [RFC PATCH 13/16] soundwire: intel: add BPT context definition Pierre-Louis Bossart
2023-12-07 22:29 ` [RFC PATCH 14/16] soundwire: intel_ace2x: add BPT open/close/send_async/wait Pierre-Louis Bossart
2023-12-07 22:29 ` [RFC PATCH 15/16] soundwire: debugfs: add interface for BPT/BRA transfers Pierre-Louis Bossart
2023-12-07 22:29 ` [RFC PATCH 16/16] ASoC: rt711-sdca: add DP0 support Pierre-Louis Bossart
2023-12-07 22:56 ` [RFC PATCH 00/16] soundwire/ASoC: speed-up downloads with BTP/BRA protocol Mark Brown
2023-12-07 23:06   ` Pierre-Louis Bossart

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.