All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] Yet another patchset for LPE audio refactoring
@ 2017-02-03 16:43 Takashi Iwai
  2017-02-03 16:43 ` [PATCH 1/4] ALSA: x86: Don't pass SNDRV_PCM_INFO_BATCH flag Takashi Iwai
                   ` (4 more replies)
  0 siblings, 5 replies; 26+ messages in thread
From: Takashi Iwai @ 2017-02-03 16:43 UTC (permalink / raw)
  To: alsa-devel; +Cc: Pierre-Louis Bossart, Jerome Anand

Hi,

this is the final part of my cleanup / refactoring patchsets for Intel
LPE audio.  Now the PCM stream engine has been rewritten.  It supports
more flexible configuration, not only fixed 4 periods, for example,
yet with a lot of code reduction.

As usual, the patches are found in topic/intel-lpe-audio-cleanup
branch.


thanks,

Takashi

===

Takashi Iwai (4):
  ALSA: x86: Don't pass SNDRV_PCM_INFO_BATCH flag
  ALSA: x86: Explicit specify 32bit DMA
  ALSA: x86: Don't check connection in lowlevel accessors
  ALSA: x86: Refactor PCM process engine

 sound/x86/intel_hdmi_audio.c     | 613 ++++++++++++++-------------------------
 sound/x86/intel_hdmi_audio.h     |  23 +-
 sound/x86/intel_hdmi_lpe_audio.h |  26 +-
 3 files changed, 236 insertions(+), 426 deletions(-)

-- 
2.11.0

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/4] ALSA: x86: Don't pass SNDRV_PCM_INFO_BATCH flag
  2017-02-03 16:43 [PATCH 0/4] Yet another patchset for LPE audio refactoring Takashi Iwai
@ 2017-02-03 16:43 ` Takashi Iwai
  2017-02-03 19:39   ` Pierre-Louis Bossart
  2017-02-03 16:43 ` [PATCH 2/4] ALSA: x86: Explicit specify 32bit DMA Takashi Iwai
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 26+ messages in thread
From: Takashi Iwai @ 2017-02-03 16:43 UTC (permalink / raw)
  To: alsa-devel; +Cc: Pierre-Louis Bossart, Jerome Anand

The PCM engine on LPE audio isn't like a batch-style process, but
rather it deals with the standard ring buffer.  Passing the BATCH info
flag is inappropriate.

Similarly, the DOUBLE flag is also superfluous.  Drop both bits.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/x86/intel_hdmi_audio.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c
index c83f02c2593e..32a21422e6f5 100644
--- a/sound/x86/intel_hdmi_audio.c
+++ b/sound/x86/intel_hdmi_audio.c
@@ -131,10 +131,8 @@ static const struct channel_map_table map_tables[] = {
 /* hardware capability structure */
 static const struct snd_pcm_hardware snd_intel_hadstream = {
 	.info =	(SNDRV_PCM_INFO_INTERLEAVED |
-		SNDRV_PCM_INFO_DOUBLE |
 		SNDRV_PCM_INFO_MMAP|
-		SNDRV_PCM_INFO_MMAP_VALID |
-		SNDRV_PCM_INFO_BATCH),
+		SNDRV_PCM_INFO_MMAP_VALID),
 	.formats = (SNDRV_PCM_FMTBIT_S24 |
 		SNDRV_PCM_FMTBIT_U24),
 	.rates = SNDRV_PCM_RATE_32000 |
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 2/4] ALSA: x86: Explicit specify 32bit DMA
  2017-02-03 16:43 [PATCH 0/4] Yet another patchset for LPE audio refactoring Takashi Iwai
  2017-02-03 16:43 ` [PATCH 1/4] ALSA: x86: Don't pass SNDRV_PCM_INFO_BATCH flag Takashi Iwai
@ 2017-02-03 16:43 ` Takashi Iwai
  2017-02-03 16:43 ` [PATCH 3/4] ALSA: x86: Don't check connection in lowlevel accessors Takashi Iwai
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 26+ messages in thread
From: Takashi Iwai @ 2017-02-03 16:43 UTC (permalink / raw)
  To: alsa-devel; +Cc: Pierre-Louis Bossart, Jerome Anand

LPE audio is capable only up to 32bit address, as it seems.
Then we should limit the DMA addresses accordingly via dma-mapping
API.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/x86/intel_hdmi_audio.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c
index 32a21422e6f5..9e08132fd332 100644
--- a/sound/x86/intel_hdmi_audio.c
+++ b/sound/x86/intel_hdmi_audio.c
@@ -28,6 +28,7 @@
 #include <linux/module.h>
 #include <linux/interrupt.h>
 #include <linux/pm_runtime.h>
+#include <linux/dma-mapping.h>
 #include <asm/cacheflush.h>
 #include <sound/core.h>
 #include <sound/asoundef.h>
@@ -1823,8 +1824,6 @@ static int hdmi_lpe_audio_probe(struct platform_device *pdev)
 	struct resource *res_mmio;
 	int i, ret;
 
-	dev_dbg(&pdev->dev, "dma_mask: %p\n", pdev->dev.dma_mask);
-
 	pdata = pdev->dev.platform_data;
 	if (!pdata) {
 		dev_err(&pdev->dev, "%s: quit: pdata not allocated by i915!!\n", __func__);
@@ -1905,6 +1904,11 @@ static int hdmi_lpe_audio_probe(struct platform_device *pdev)
 	/* setup the ops for playabck */
 	snd_pcm_set_ops(pcm, SNDRV_PCM_STREAM_PLAYBACK,
 			    &snd_intelhad_playback_ops);
+
+	/* only 32bit addressable */
+	dma_set_mask(&pdev->dev, DMA_BIT_MASK(32));
+	dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32));
+
 	/* allocate dma pages for ALSA stream operations
 	 * memory allocated is based on size, not max value
 	 * thus using same argument for max & size
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 3/4] ALSA: x86: Don't check connection in lowlevel accessors
  2017-02-03 16:43 [PATCH 0/4] Yet another patchset for LPE audio refactoring Takashi Iwai
  2017-02-03 16:43 ` [PATCH 1/4] ALSA: x86: Don't pass SNDRV_PCM_INFO_BATCH flag Takashi Iwai
  2017-02-03 16:43 ` [PATCH 2/4] ALSA: x86: Explicit specify 32bit DMA Takashi Iwai
@ 2017-02-03 16:43 ` Takashi Iwai
  2017-02-03 16:44 ` [PATCH 4/4] ALSA: x86: Refactor PCM process engine Takashi Iwai
  2017-02-06 19:07 ` [PATCH 0/4] Yet another patchset for LPE audio refactoring Takashi Iwai
  4 siblings, 0 replies; 26+ messages in thread
From: Takashi Iwai @ 2017-02-03 16:43 UTC (permalink / raw)
  To: alsa-devel; +Cc: Pierre-Louis Bossart, Jerome Anand

The lowlevel register read/write don't have to be careful about the
connection state.  It should be checked in the caller side instead.
By dropping the check, we can simplify the code, and readability.

This patch also refacors the functions slightly: namely,
- drop the useless always-zero return values
- fold the inline functions to the main accessor functions themselves
- move the DP audio hack for AUD_CONFIG to the caller side
- simplify snd_intelhad_eanble_audio() and drop the unused
  had_read_modify()

Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/x86/intel_hdmi_audio.c | 91 ++++++++++++--------------------------------
 1 file changed, 25 insertions(+), 66 deletions(-)

diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c
index 9e08132fd332..c503f0de3975 100644
--- a/sound/x86/intel_hdmi_audio.c
+++ b/sound/x86/intel_hdmi_audio.c
@@ -186,69 +186,20 @@ static void had_substream_put(struct snd_intelhad *intelhaddata)
 }
 
 /* Register access functions */
-static inline void
-mid_hdmi_audio_read(struct snd_intelhad *ctx, u32 reg, u32 *val)
+static void had_read_register(struct snd_intelhad *ctx, u32 reg, u32 *val)
 {
 	*val = ioread32(ctx->mmio_start + ctx->had_config_offset + reg);
 }
 
-static inline void
-mid_hdmi_audio_write(struct snd_intelhad *ctx, u32 reg, u32 val)
+static void had_write_register(struct snd_intelhad *ctx, u32 reg, u32 val)
 {
 	iowrite32(val, ctx->mmio_start + ctx->had_config_offset + reg);
 }
 
-static int had_read_register(struct snd_intelhad *intelhaddata,
-			     u32 offset, u32 *data)
-{
-	if (!intelhaddata->connected)
-		return -ENODEV;
-
-	mid_hdmi_audio_read(intelhaddata, offset, data);
-	return 0;
-}
-
-static void fixup_dp_config(struct snd_intelhad *intelhaddata,
-			    u32 offset, u32 *data)
-{
-	if (intelhaddata->dp_output) {
-		if (offset == AUD_CONFIG && (*data & AUD_CONFIG_VALID_BIT))
-			*data |= AUD_CONFIG_DP_MODE | AUD_CONFIG_BLOCK_BIT;
-	}
-}
-
-static int had_write_register(struct snd_intelhad *intelhaddata,
-			      u32 offset, u32 data)
-{
-	if (!intelhaddata->connected)
-		return -ENODEV;
-
-	fixup_dp_config(intelhaddata, offset, &data);
-	mid_hdmi_audio_write(intelhaddata, offset, data);
-	return 0;
-}
-
-static int had_read_modify(struct snd_intelhad *intelhaddata, u32 offset,
-			   u32 data, u32 mask)
-{
-	u32 val_tmp;
-
-	if (!intelhaddata->connected)
-		return -ENODEV;
-
-	mid_hdmi_audio_read(intelhaddata, offset, &val_tmp);
-	val_tmp &= ~mask;
-	val_tmp |= (data & mask);
-
-	fixup_dp_config(intelhaddata, offset, &val_tmp);
-	mid_hdmi_audio_write(intelhaddata, offset, val_tmp);
-	return 0;
-}
-
 /*
  * enable / disable audio configuration
  *
- * The had_read_modify() function should not directly be used on VLV2 for
+ * The normal read/modify should not directly be used on VLV2 for
  * updating AUD_CONFIG register.
  * This is because:
  * Bit6 of AUD_CONFIG register is writeonly due to a silicon bug on VLV2
@@ -265,24 +216,25 @@ static void snd_intelhad_enable_audio(struct snd_pcm_substream *substream,
 				      bool enable)
 {
 	union aud_cfg cfg_val = {.regval = 0};
-	u8 channels, data, mask;
+	u8 channels;
+	u32 mask, val;
 
 	/*
 	 * If substream is NULL, there is no active stream.
 	 * In this case just set channels to 2
 	 */
 	channels = substream ? substream->runtime->channels : 2;
-	cfg_val.regx.num_ch = channels - 2;
+	dev_dbg(intelhaddata->dev, "enable %d, ch=%d\n", enable, channels);
 
-	data = cfg_val.regval;
+	cfg_val.regx.num_ch = channels - 2;
 	if (enable)
-		data |= 1;
+		cfg_val.regx.aud_en = 1;
 	mask = AUD_CONFIG_CH_MASK | 1;
 
-	dev_dbg(intelhaddata->dev, "%s : data = %x, mask =%x\n",
-		__func__, data, mask);
-
-	had_read_modify(intelhaddata, AUD_CONFIG, data, mask);
+	had_read_register(intelhaddata, AUD_CONFIG, &val);
+	val &= ~mask;
+	val |= cfg_val.regval;
+	had_write_register(intelhaddata, AUD_CONFIG, val);
 }
 
 /* enable / disable the audio interface */
@@ -291,10 +243,10 @@ static void snd_intelhad_enable_audio_int(struct snd_intelhad *ctx, bool enable)
 	u32 status_reg;
 
 	if (enable) {
-		mid_hdmi_audio_read(ctx, AUD_HDMI_STATUS, &status_reg);
+		had_read_register(ctx, AUD_HDMI_STATUS, &status_reg);
 		status_reg |= HDMI_AUDIO_BUFFER_DONE | HDMI_AUDIO_UNDERRUN;
-		mid_hdmi_audio_write(ctx, AUD_HDMI_STATUS, status_reg);
-		mid_hdmi_audio_read(ctx, AUD_HDMI_STATUS, &status_reg);
+		had_write_register(ctx, AUD_HDMI_STATUS, status_reg);
+		had_read_register(ctx, AUD_HDMI_STATUS, &status_reg);
 	}
 }
 
@@ -399,6 +351,13 @@ static int snd_intelhad_audio_ctrl(struct snd_pcm_substream *substream,
 		cfg_val.regx.layout = LAYOUT1;
 
 	cfg_val.regx.val_bit = 1;
+
+	/* fix up the DP bits */
+	if (intelhaddata->dp_output) {
+		cfg_val.regx.dp_modei = 1;
+		cfg_val.regx.set = 1;
+	}
+
 	had_write_register(intelhaddata, AUD_CONFIG, cfg_val.regval);
 	return 0;
 }
@@ -1682,15 +1641,15 @@ static irqreturn_t display_pipe_interrupt_handler(int irq, void *dev_id)
 	u32 audio_stat, audio_reg;
 
 	audio_reg = AUD_HDMI_STATUS;
-	mid_hdmi_audio_read(ctx, audio_reg, &audio_stat);
+	had_read_register(ctx, audio_reg, &audio_stat);
 
 	if (audio_stat & HDMI_AUDIO_UNDERRUN) {
-		mid_hdmi_audio_write(ctx, audio_reg, HDMI_AUDIO_UNDERRUN);
+		had_write_register(ctx, audio_reg, HDMI_AUDIO_UNDERRUN);
 		had_process_buffer_underrun(ctx);
 	}
 
 	if (audio_stat & HDMI_AUDIO_BUFFER_DONE) {
-		mid_hdmi_audio_write(ctx, audio_reg, HDMI_AUDIO_BUFFER_DONE);
+		had_write_register(ctx, audio_reg, HDMI_AUDIO_BUFFER_DONE);
 		had_process_buffer_done(ctx);
 	}
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 4/4] ALSA: x86: Refactor PCM process engine
  2017-02-03 16:43 [PATCH 0/4] Yet another patchset for LPE audio refactoring Takashi Iwai
                   ` (2 preceding siblings ...)
  2017-02-03 16:43 ` [PATCH 3/4] ALSA: x86: Don't check connection in lowlevel accessors Takashi Iwai
@ 2017-02-03 16:44 ` Takashi Iwai
  2017-02-03 19:47   ` Pierre-Louis Bossart
  2017-02-06 19:07 ` [PATCH 0/4] Yet another patchset for LPE audio refactoring Takashi Iwai
  4 siblings, 1 reply; 26+ messages in thread
From: Takashi Iwai @ 2017-02-03 16:44 UTC (permalink / raw)
  To: alsa-devel; +Cc: Pierre-Louis Bossart, Jerome Anand

This is again a big rewrite of the driver; now it touches the code to
process PCM stream transfers.

The most fundamental change is that now the driver supports more than
four periods.  Instead of keeping the same index between the ring
buffers (from A to D) and the PCM buffer periods, now we keep
difference indices for both.  Also, for the cases with less periods
than four, we track the head index, too.  That is, we now have four
indices: ringbuf_head, pcm_head, ringbuf_filled, and pcm_filled.

By this flexibility, we can use even dmix, which requires 16 periods
as default.

The buffer size could be up to 20bit, so it's set to that value.  But
the pre-allocation is limited to 128k as default.  It's because the
chip requires the continuous pages (unfortunately no-SG possible),
thus we don't want too much continuous pages.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/x86/intel_hdmi_audio.c     | 510 ++++++++++++++-------------------------
 sound/x86/intel_hdmi_audio.h     |  23 +-
 sound/x86/intel_hdmi_lpe_audio.h |  26 +-
 3 files changed, 204 insertions(+), 355 deletions(-)

diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c
index c503f0de3975..2c3dcdcb43f5 100644
--- a/sound/x86/intel_hdmi_audio.c
+++ b/sound/x86/intel_hdmi_audio.c
@@ -619,82 +619,6 @@ static void snd_intelhad_prog_dip(struct snd_pcm_substream *substream,
 	had_write_register(intelhaddata, AUD_CNTL_ST, ctrl_state.regval);
 }
 
-/*
- * Programs buffer address and length registers
- * This function programs ring buffer address and length into registers.
- */
-static int snd_intelhad_prog_buffer(struct snd_pcm_substream *substream,
-				    struct snd_intelhad *intelhaddata,
-				    int start, int end)
-{
-	u32 ring_buf_addr, ring_buf_size, period_bytes;
-	u8 i, num_periods;
-
-	ring_buf_addr = substream->runtime->dma_addr;
-	ring_buf_size = snd_pcm_lib_buffer_bytes(substream);
-	intelhaddata->stream_info.ring_buf_size = ring_buf_size;
-	period_bytes = frames_to_bytes(substream->runtime,
-				substream->runtime->period_size);
-	num_periods = substream->runtime->periods;
-
-	/*
-	 * buffer addr should  be 64 byte aligned, period bytes
-	 * will be used to calculate addr offset
-	 */
-	period_bytes &= ~0x3F;
-
-	/* Hardware supports MAX_PERIODS buffers */
-	if (end >= HAD_MAX_PERIODS)
-		return -EINVAL;
-
-	for (i = start; i <= end; i++) {
-		/* Program the buf registers with addr and len */
-		intelhaddata->buf_info[i].buf_addr = ring_buf_addr +
-							 (i * period_bytes);
-		if (i < num_periods-1)
-			intelhaddata->buf_info[i].buf_size = period_bytes;
-		else
-			intelhaddata->buf_info[i].buf_size = ring_buf_size -
-							(i * period_bytes);
-
-		had_write_register(intelhaddata,
-				   AUD_BUF_A_ADDR + (i * HAD_REG_WIDTH),
-					intelhaddata->buf_info[i].buf_addr |
-					BIT(0) | BIT(1));
-		had_write_register(intelhaddata,
-				   AUD_BUF_A_LENGTH + (i * HAD_REG_WIDTH),
-					period_bytes);
-		intelhaddata->buf_info[i].is_valid = true;
-	}
-	dev_dbg(intelhaddata->dev, "%s:buf[%d-%d] addr=%#x  and size=%d\n",
-		__func__, start, end,
-		intelhaddata->buf_info[start].buf_addr,
-		intelhaddata->buf_info[start].buf_size);
-	intelhaddata->valid_buf_cnt = num_periods;
-	return 0;
-}
-
-static int snd_intelhad_read_len(struct snd_intelhad *intelhaddata)
-{
-	int i, retval = 0;
-	u32 len[4];
-
-	for (i = 0; i < 4 ; i++) {
-		had_read_register(intelhaddata,
-				  AUD_BUF_A_LENGTH + (i * HAD_REG_WIDTH),
-				  &len[i]);
-		if (!len[i])
-			retval++;
-	}
-	if (retval != 1) {
-		for (i = 0; i < 4 ; i++)
-			dev_dbg(intelhaddata->dev, "buf[%d] size=%d\n",
-				i, len[i]);
-	}
-
-	return retval;
-}
-
 static int had_calculate_maud_value(u32 aud_samp_freq, u32 link_rate)
 {
 	u32 maud_val;
@@ -882,34 +806,192 @@ static int snd_intelhad_prog_n(u32 aud_samp_freq, u32 *n_param,
 	return 0;
 }
 
+/*
+ * PCM ring buffer handling
+ */
+
+#define AUD_BUF_ADDR(x)		(AUD_BUF_A_ADDR + (x) * HAD_REG_WIDTH)
+#define AUD_BUF_LEN(x)		(AUD_BUF_A_LENGTH + (x) * HAD_REG_WIDTH)
+
+/* Set up a ring buffer at the "filled" position */
+static void had_prog_ringbuf(struct snd_pcm_substream *substream,
+			     struct snd_intelhad *intelhaddata)
+{
+	int idx = intelhaddata->ringbuf_filled;
+	int ofs = intelhaddata->pcmbuf_filled * intelhaddata->period_bytes;
+	u32 addr = substream->runtime->dma_addr + ofs;
+
+	addr |= AUD_BUF_VALID | AUD_BUF_INTR_EN;
+	had_write_register(intelhaddata, AUD_BUF_ADDR(idx), addr);
+	had_write_register(intelhaddata, AUD_BUF_LEN(idx),
+			   intelhaddata->period_bytes);
+
+	/* advance the indices to the next */
+	intelhaddata->ringbuf_filled++;
+	intelhaddata->ringbuf_filled %= HAD_NUM_RINGBUFS;
+	intelhaddata->pcmbuf_filled++;
+	intelhaddata->pcmbuf_filled %= substream->runtime->periods;
+}
+
+/* invalidate a ring buffer with the given index */
+static void had_invalidate_ringbuf(struct snd_intelhad *intelhaddata,
+				   int idx)
+{
+	had_write_register(intelhaddata, AUD_BUF_ADDR(idx), 0);
+	had_write_register(intelhaddata, AUD_BUF_LEN(idx), 0);
+}
+
+/* Initial programming of ring buffers */
+static void had_init_ringbufs(struct snd_pcm_substream *substream,
+			      struct snd_intelhad *intelhaddata)
+{
+	struct snd_pcm_runtime *runtime = substream->runtime;
+	int i, period_bytes, num_periods;
+
+	num_periods = runtime->periods;
+	intelhaddata->period_bytes = period_bytes =
+		frames_to_bytes(runtime, runtime->period_size);
+	WARN_ON(period_bytes & 0x3f);
+
+	intelhaddata->ringbuf_head = 0;
+	intelhaddata->pcmbuf_head = 0;
+	intelhaddata->ringbuf_filled = 0;
+	intelhaddata->pcmbuf_filled = 0;
+
+	for (i = 0; i < HAD_NUM_RINGBUFS; i++) {
+		if (i < num_periods)
+			had_prog_ringbuf(substream, intelhaddata);
+		else /* invalidate the rest */
+			had_invalidate_ringbuf(intelhaddata, i);
+	}
+}
+
+/* process a ring buf, advance to the next */
+static void had_advance_ringbuf(struct snd_pcm_substream *substream,
+				struct snd_intelhad *intelhaddata)
+{
+	int num_periods = substream->runtime->periods;
+
+	/* reprogram the next buffer */
+	had_prog_ringbuf(substream, intelhaddata);
+
+	/* invalidate the processed buf for shorter PCM buffer */
+	if (num_periods < HAD_NUM_RINGBUFS)
+		had_invalidate_ringbuf(intelhaddata,
+				       intelhaddata->ringbuf_head);
+
+	/* proceed to next */
+	intelhaddata->ringbuf_head++;
+	intelhaddata->ringbuf_head %= HAD_NUM_RINGBUFS;
+	intelhaddata->pcmbuf_head++;
+	intelhaddata->pcmbuf_head %= num_periods;
+}
+
+/* process the current ring buffer(s);
+ * returns the current PCM buffer byte position, or -EPIPE for underrun.
+ */
+static int had_process_ringbufs(struct snd_pcm_substream *substream,
+				struct snd_intelhad *intelhaddata)
+{
+	int len, processed;
+	unsigned long flags;
+
+	processed = 0;
+	spin_lock_irqsave(&intelhaddata->had_spinlock, flags);
+	for (;;) {
+		/* get the remaining bytes on the buffer */
+		had_read_register(intelhaddata,
+				  AUD_BUF_LEN(intelhaddata->ringbuf_head),
+				  &len);
+		if (len < 0 || len > intelhaddata->period_bytes) {
+			dev_dbg(intelhaddata->dev, "Invalid buf length %d\n",
+				len);
+			len = -EPIPE;
+			goto out;
+		}
+
+		if (len > 0) /* OK, this is the current buffer */
+			break;
+
+		/* len=0 => already empty, check the next buffer */
+		if (++processed >= HAD_NUM_RINGBUFS) {
+			len = -EPIPE; /* all empty? - report underrun */
+			goto out;
+		}
+		had_advance_ringbuf(substream, intelhaddata);
+	}
+
+	len = intelhaddata->period_bytes - len;
+	len += intelhaddata->period_bytes * intelhaddata->pcmbuf_head;
+ out:
+	spin_unlock_irqrestore(&intelhaddata->had_spinlock, flags);
+	return len;
+}
+
+/* called from irq handler */
+static void had_process_buffer_done(struct snd_intelhad *intelhaddata)
+{
+	struct snd_pcm_substream *substream;
+
+	if (!intelhaddata->connected)
+		return; /* disconnected? - bail out */
+
+	substream = had_substream_get(intelhaddata);
+	if (!substream)
+		return; /* no stream? - bail out */
+
+	/* process or stop the stream */
+	if (had_process_ringbufs(substream, intelhaddata) < 0)
+		snd_pcm_stop_xrun(substream);
+	else
+		snd_pcm_period_elapsed(substream);
+
+	had_substream_put(intelhaddata);
+}
+
 #define MAX_CNT			0xFF
 
-static void snd_intelhad_handle_underrun(struct snd_intelhad *intelhaddata)
+/*
+ * The interrupt status 'sticky' bits might not be cleared by
+ * setting '1' to that bit once...
+ */
+static void wait_clear_underrun_bit(struct snd_intelhad *intelhaddata)
 {
-	u32 hdmi_status = 0, i = 0;
+	int i;
+	u32 val;
+
+	for (i = 0; i < MAX_CNT; i++) {
+		/* clear bit30, 31 AUD_HDMI_STATUS */
+		had_read_register(intelhaddata, AUD_HDMI_STATUS, &val);
+		if (!(val & AUD_CONFIG_MASK_UNDERRUN))
+			return;
+		had_write_register(intelhaddata, AUD_HDMI_STATUS, val);
+	}
+	dev_err(intelhaddata->dev, "Unable to clear UNDERRUN bits\n");
+}
+
+/* called from irq handler */
+static void had_process_buffer_underrun(struct snd_intelhad *intelhaddata)
+{
+	struct snd_pcm_substream *substream;
 
 	/* Handle Underrun interrupt within Audio Unit */
 	had_write_register(intelhaddata, AUD_CONFIG, 0);
 	/* Reset buffer pointers */
 	had_write_register(intelhaddata, AUD_HDMI_STATUS, 1);
 	had_write_register(intelhaddata, AUD_HDMI_STATUS, 0);
-	/*
-	 * The interrupt status 'sticky' bits might not be cleared by
-	 * setting '1' to that bit once...
-	 */
-	do { /* clear bit30, 31 AUD_HDMI_STATUS */
-		had_read_register(intelhaddata, AUD_HDMI_STATUS,
-				  &hdmi_status);
-		dev_dbg(intelhaddata->dev, "HDMI status =0x%x\n", hdmi_status);
-		if (hdmi_status & AUD_CONFIG_MASK_UNDERRUN) {
-			i++;
-			had_write_register(intelhaddata,
-					   AUD_HDMI_STATUS, hdmi_status);
-		} else
-			break;
-	} while (i < MAX_CNT);
-	if (i >= MAX_CNT)
-		dev_err(intelhaddata->dev, "Unable to clear UNDERRUN bits\n");
+
+	wait_clear_underrun_bit(intelhaddata);
+
+	if (!intelhaddata->connected)
+		return; /* disconnected? - bail out */
+
+	/* Report UNDERRUN error to above layers */
+	substream = had_substream_get(intelhaddata);
+	if (substream) {
+		snd_pcm_stop_xrun(substream);
+		had_substream_put(intelhaddata);
+	}
 }
 
 /*
@@ -955,11 +1037,6 @@ static int snd_intelhad_open(struct snd_pcm_substream *substream)
 	intelhaddata->stream_info.substream_refcount++;
 	spin_unlock_irq(&intelhaddata->had_spinlock);
 
-	/* these are cleared in prepare callback, but just to be sure */
-	intelhaddata->curr_buf = 0;
-	intelhaddata->underrun_count = 0;
-	intelhaddata->stream_info.buffer_rendered = 0;
-
 	return retval;
  error:
 	pm_runtime_put(intelhaddata->dev);
@@ -1123,10 +1200,6 @@ static int snd_intelhad_pcm_prepare(struct snd_pcm_substream *substream)
 	dev_dbg(intelhaddata->dev, "rate=%d\n", runtime->rate);
 	dev_dbg(intelhaddata->dev, "channels=%d\n", runtime->channels);
 
-	intelhaddata->curr_buf = 0;
-	intelhaddata->underrun_count = 0;
-	intelhaddata->stream_info.buffer_rendered = 0;
-
 	/* Get N value in KHz */
 	disp_samp_freq = intelhaddata->tmds_clock_speed;
 
@@ -1150,8 +1223,7 @@ static int snd_intelhad_pcm_prepare(struct snd_pcm_substream *substream)
 	retval = snd_intelhad_audio_ctrl(substream, intelhaddata);
 
 	/* Prog buffer address */
-	retval = snd_intelhad_prog_buffer(substream, intelhaddata,
-			HAD_BUF_TYPE_A, HAD_BUF_TYPE_D);
+	had_init_ringbufs(substream, intelhaddata);
 
 	/*
 	 * Program channel mapping in following order:
@@ -1171,48 +1243,17 @@ static snd_pcm_uframes_t
 snd_intelhad_pcm_pointer(struct snd_pcm_substream *substream)
 {
 	struct snd_intelhad *intelhaddata;
-	u32 bytes_rendered = 0;
-	u32 t;
-	int buf_id;
+	int len;
 
 	intelhaddata = snd_pcm_substream_chip(substream);
 
 	if (!intelhaddata->connected)
 		return SNDRV_PCM_POS_XRUN;
 
-	/* Use a hw register to calculate sub-period position reports.
-	 * This makes PulseAudio happier.
-	 */
-
-	buf_id = intelhaddata->curr_buf % 4;
-	had_read_register(intelhaddata,
-			  AUD_BUF_A_LENGTH + (buf_id * HAD_REG_WIDTH), &t);
-
-	if ((t == 0) || (t == ((u32)-1L))) {
-		intelhaddata->underrun_count++;
-		dev_dbg(intelhaddata->dev,
-			"discovered buffer done for buf %d, count = %d\n",
-			 buf_id, intelhaddata->underrun_count);
-
-		if (intelhaddata->underrun_count > (HAD_MIN_PERIODS/2)) {
-			dev_dbg(intelhaddata->dev,
-				"assume audio_codec_reset, underrun = %d - do xrun\n",
-				 intelhaddata->underrun_count);
-			return SNDRV_PCM_POS_XRUN;
-		}
-	} else {
-		/* Reset Counter */
-		intelhaddata->underrun_count = 0;
-	}
-
-	t = intelhaddata->buf_info[buf_id].buf_size - t;
-
-	if (intelhaddata->stream_info.buffer_rendered)
-		div_u64_rem(intelhaddata->stream_info.buffer_rendered,
-			intelhaddata->stream_info.ring_buf_size,
-			&(bytes_rendered));
-
-	return bytes_to_frames(substream->runtime, bytes_rendered + t);
+	len = had_process_ringbufs(substream, intelhaddata);
+	if (len < 0)
+		return SNDRV_PCM_POS_XRUN;
+	return bytes_to_frames(substream->runtime, len);
 }
 
 /*
@@ -1283,179 +1324,9 @@ static int hdmi_audio_mode_change(struct snd_intelhad *intelhaddata)
 	return retval;
 }
 
-static inline int had_chk_intrmiss(struct snd_intelhad *intelhaddata,
-		enum intel_had_aud_buf_type buf_id)
-{
-	int i, intr_count = 0;
-	enum intel_had_aud_buf_type buff_done;
-	u32 buf_size, buf_addr;
-
-	buff_done = buf_id;
-
-	intr_count = snd_intelhad_read_len(intelhaddata);
-	if (intr_count > 1) {
-		/* In case of active playback */
-		dev_err(intelhaddata->dev,
-			"Driver detected %d missed buffer done interrupt(s)\n",
-			(intr_count - 1));
-		if (intr_count > 3)
-			return intr_count;
-
-		buf_id += (intr_count - 1);
-		/* Reprogram registers*/
-		for (i = buff_done; i < buf_id; i++) {
-			int j = i % 4;
-
-			buf_size = intelhaddata->buf_info[j].buf_size;
-			buf_addr = intelhaddata->buf_info[j].buf_addr;
-			had_write_register(intelhaddata,
-					   AUD_BUF_A_LENGTH +
-					   (j * HAD_REG_WIDTH), buf_size);
-			had_write_register(intelhaddata,
-					   AUD_BUF_A_ADDR+(j * HAD_REG_WIDTH),
-					   (buf_addr | BIT(0) | BIT(1)));
-		}
-		buf_id = buf_id % 4;
-		intelhaddata->buff_done = buf_id;
-	}
-
-	return intr_count;
-}
-
-/* called from irq handler */
-static int had_process_buffer_done(struct snd_intelhad *intelhaddata)
-{
-	u32 len = 1;
-	enum intel_had_aud_buf_type buf_id;
-	enum intel_had_aud_buf_type buff_done;
-	struct pcm_stream_info *stream;
-	struct snd_pcm_substream *substream;
-	u32 buf_size;
-	int intr_count;
-	unsigned long flags;
-
-	stream = &intelhaddata->stream_info;
-	intr_count = 1;
-
-	spin_lock_irqsave(&intelhaddata->had_spinlock, flags);
-	if (!intelhaddata->connected) {
-		spin_unlock_irqrestore(&intelhaddata->had_spinlock, flags);
-		dev_dbg(intelhaddata->dev,
-			"%s:Device already disconnected\n", __func__);
-		return 0;
-	}
-	buf_id = intelhaddata->curr_buf;
-	intelhaddata->buff_done = buf_id;
-	buff_done = intelhaddata->buff_done;
-	buf_size = intelhaddata->buf_info[buf_id].buf_size;
-
-	/* Every debug statement has an implication
-	 * of ~5msec. Thus, avoid having >3 debug statements
-	 * for each buffer_done handling.
-	 */
-
-	/* Check for any intr_miss in case of active playback */
-	if (stream->running) {
-		intr_count = had_chk_intrmiss(intelhaddata, buf_id);
-		if (!intr_count || (intr_count > 3)) {
-			spin_unlock_irqrestore(&intelhaddata->had_spinlock,
-					       flags);
-			dev_err(intelhaddata->dev,
-				"HAD SW state in non-recoverable mode\n");
-			return 0;
-		}
-		buf_id += (intr_count - 1);
-		buf_id = buf_id % 4;
-	}
-
-	intelhaddata->buf_info[buf_id].is_valid = true;
-	if (intelhaddata->valid_buf_cnt-1 == buf_id) {
-		if (stream->running)
-			intelhaddata->curr_buf = HAD_BUF_TYPE_A;
-	} else
-		intelhaddata->curr_buf = buf_id + 1;
-
-	spin_unlock_irqrestore(&intelhaddata->had_spinlock, flags);
-
-	if (!intelhaddata->connected) {
-		dev_dbg(intelhaddata->dev, "HDMI cable plugged-out\n");
-		return 0;
-	}
-
-	/* Reprogram the registers with addr and length */
-	had_write_register(intelhaddata,
-			   AUD_BUF_A_LENGTH + (buf_id * HAD_REG_WIDTH),
-			   buf_size);
-	had_write_register(intelhaddata,
-			   AUD_BUF_A_ADDR + (buf_id * HAD_REG_WIDTH),
-			   intelhaddata->buf_info[buf_id].buf_addr |
-			   BIT(0) | BIT(1));
-
-	had_read_register(intelhaddata,
-			  AUD_BUF_A_LENGTH + (buf_id * HAD_REG_WIDTH),
-			  &len);
-	dev_dbg(intelhaddata->dev, "%s:Enabled buf[%d]\n", __func__, buf_id);
-
-	/* In case of actual data,
-	 * report buffer_done to above ALSA layer
-	 */
-	substream = had_substream_get(intelhaddata);
-	if (substream) {
-		buf_size = intelhaddata->buf_info[buf_id].buf_size;
-		intelhaddata->stream_info.buffer_rendered +=
-			(intr_count * buf_size);
-		snd_pcm_period_elapsed(substream);
-		had_substream_put(intelhaddata);
-	}
-
-	return 0;
-}
-
-/* called from irq handler */
-static int had_process_buffer_underrun(struct snd_intelhad *intelhaddata)
-{
-	enum intel_had_aud_buf_type buf_id;
-	struct pcm_stream_info *stream;
-	struct snd_pcm_substream *substream;
-	unsigned long flags;
-	int connected;
-
-	stream = &intelhaddata->stream_info;
-
-	spin_lock_irqsave(&intelhaddata->had_spinlock, flags);
-	buf_id = intelhaddata->curr_buf;
-	intelhaddata->buff_done = buf_id;
-	connected = intelhaddata->connected;
-	if (stream->running)
-		intelhaddata->curr_buf = HAD_BUF_TYPE_A;
-
-	spin_unlock_irqrestore(&intelhaddata->had_spinlock, flags);
-
-	dev_dbg(intelhaddata->dev, "Enter:%s buf_id=%d, stream_running=%d\n",
-			__func__, buf_id, stream->running);
-
-	snd_intelhad_handle_underrun(intelhaddata);
-
-	if (!connected) {
-		dev_dbg(intelhaddata->dev,
-			"%s:Device already disconnected\n", __func__);
-		return 0;
-	}
-
-	/* Report UNDERRUN error to above layers */
-	substream = had_substream_get(intelhaddata);
-	if (substream) {
-		snd_pcm_stop_xrun(substream);
-		had_substream_put(intelhaddata);
-	}
-
-	return 0;
-}
-
 /* process hot plug, called from wq with mutex locked */
 static void had_process_hot_plug(struct snd_intelhad *intelhaddata)
 {
-	enum intel_had_aud_buf_type buf_id;
 	struct snd_pcm_substream *substream;
 
 	spin_lock_irq(&intelhaddata->had_spinlock);
@@ -1465,17 +1336,12 @@ static void had_process_hot_plug(struct snd_intelhad *intelhaddata)
 		return;
 	}
 
-	buf_id = intelhaddata->curr_buf;
-	intelhaddata->buff_done = buf_id;
 	intelhaddata->connected = true;
 	dev_dbg(intelhaddata->dev,
 		"%s @ %d:DEBUG PLUG/UNPLUG : HAD_DRV_CONNECTED\n",
 			__func__, __LINE__);
 	spin_unlock_irq(&intelhaddata->had_spinlock);
 
-	dev_dbg(intelhaddata->dev, "Processing HOT_PLUG, buf_id = %d\n",
-		buf_id);
-
 	/* Safety check */
 	substream = had_substream_get(intelhaddata);
 	if (substream) {
@@ -1492,11 +1358,8 @@ static void had_process_hot_plug(struct snd_intelhad *intelhaddata)
 /* process hot unplug, called from wq with mutex locked */
 static void had_process_hot_unplug(struct snd_intelhad *intelhaddata)
 {
-	enum intel_had_aud_buf_type buf_id;
 	struct snd_pcm_substream *substream;
 
-	buf_id = intelhaddata->curr_buf;
-
 	substream = had_substream_get(intelhaddata);
 
 	spin_lock_irq(&intelhaddata->had_spinlock);
@@ -1868,13 +1731,12 @@ static int hdmi_lpe_audio_probe(struct platform_device *pdev)
 	dma_set_mask(&pdev->dev, DMA_BIT_MASK(32));
 	dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32));
 
-	/* allocate dma pages for ALSA stream operations
-	 * memory allocated is based on size, not max value
-	 * thus using same argument for max & size
+	/* allocate dma pages;
+	 * try to allocate 128k buffer as default which is large enough
 	 */
 	snd_pcm_lib_preallocate_pages_for_all(pcm,
 			SNDRV_DMA_TYPE_DEV, NULL,
-			HAD_MAX_BUFFER, HAD_MAX_BUFFER);
+			128 * 1024, HAD_MAX_BUFFER);
 
 	/* create controls */
 	for (i = 0; i < ARRAY_SIZE(had_controls); i++) {
diff --git a/sound/x86/intel_hdmi_audio.h b/sound/x86/intel_hdmi_audio.h
index 9f713a8a88bc..6178b09801e7 100644
--- a/sound/x86/intel_hdmi_audio.h
+++ b/sound/x86/intel_hdmi_audio.h
@@ -64,24 +64,15 @@
 
 struct pcm_stream_info {
 	struct snd_pcm_substream *substream;
-	u64		buffer_rendered;
-	u32		ring_buf_size;
 	int substream_refcount;
 	bool running;
 };
 
-struct ring_buf_info {
-	u32	buf_addr;
-	u32	buf_size;
-	u8	is_valid;
-};
-
 /*
  * struct snd_intelhad - intelhad driver structure
  *
  * @card: ptr to hold card details
  * @connected: the monitor connection status
- * @buf_info: ring buffer info
  * @stream_info: stream information
  * @eld: holds ELD info
  * @curr_buf: pointer to hold current active ring buf
@@ -91,26 +82,28 @@ struct ring_buf_info {
  * @buff_done: id of current buffer done intr
  * @dev: platoform device handle
  * @chmap: holds channel map info
- * @underrun_count: PCM stream underrun counter
  */
 struct snd_intelhad {
 	struct snd_card	*card;
 	bool		connected;
-	struct		ring_buf_info buf_info[HAD_NUM_OF_RING_BUFS];
 	struct		pcm_stream_info stream_info;
 	unsigned char	eld[HDMI_MAX_ELD_BYTES];
 	bool dp_output;
-	enum		intel_had_aud_buf_type curr_buf;
-	int		valid_buf_cnt;
 	unsigned int	aes_bits;
 	spinlock_t had_spinlock;
-	enum		intel_had_aud_buf_type buff_done;
 	struct device *dev;
 	struct snd_pcm_chmap *chmap;
-	int underrun_count;
 	int tmds_clock_speed;
 	int link_rate;
 
+	/* positions (indices) for ring buffer [A-D] */
+	unsigned int ringbuf_head;	/* being processed */
+	unsigned int ringbuf_filled;	/* to be filled */
+	/* positions (indices) for PCM buffer */
+	unsigned int pcmbuf_head;	/* being processed */
+	unsigned int pcmbuf_filled;	/* to be filled */
+	unsigned int period_bytes;	/* PCM period size in bytes */
+
 	/* internal stuff */
 	int irq;
 	void __iomem *mmio_start;
diff --git a/sound/x86/intel_hdmi_lpe_audio.h b/sound/x86/intel_hdmi_lpe_audio.h
index be9783910a3a..612714eecaff 100644
--- a/sound/x86/intel_hdmi_lpe_audio.h
+++ b/sound/x86/intel_hdmi_lpe_audio.h
@@ -26,15 +26,14 @@
 #define HAD_MAX_DEVICES		1
 #define HAD_MIN_CHANNEL		2
 #define HAD_MAX_CHANNEL		8
-#define HAD_NUM_OF_RING_BUFS	4
+#define HAD_NUM_RINGBUFS	4
 
-/* Assume 192KHz, 8channel, 25msec period */
-#define HAD_MAX_BUFFER		(600*1024)
-#define HAD_MIN_BUFFER		(32*1024)
-#define HAD_MAX_PERIODS		4
-#define HAD_MIN_PERIODS		4
-#define HAD_MAX_PERIOD_BYTES	(HAD_MAX_BUFFER/HAD_MIN_PERIODS)
-#define HAD_MIN_PERIOD_BYTES	256
+/* max 20bit address, aligned to 64 */
+#define HAD_MAX_BUFFER		((1024 * 1024 - 1) & ~0x3f)
+#define HAD_MAX_PERIODS		256	/* arbitrary, but should suffice */
+#define HAD_MIN_PERIODS		2
+#define HAD_MAX_PERIOD_BYTES	((HAD_MAX_BUFFER / HAD_MIN_PERIODS) & ~0x3f)
+#define HAD_MIN_PERIOD_BYTES	1024	/* might be smaller */
 #define HAD_FIFO_SIZE		0 /* fifo not being used */
 #define MAX_SPEAKERS		8
 
@@ -82,14 +81,6 @@
 /* Naud Value */
 #define DP_NAUD_VAL					32768
 
-/* enum intel_had_aud_buf_type - HDMI controller ring buffer types */
-enum intel_had_aud_buf_type {
-	HAD_BUF_TYPE_A = 0,
-	HAD_BUF_TYPE_B = 1,
-	HAD_BUF_TYPE_C = 2,
-	HAD_BUF_TYPE_D = 3,
-};
-
 /* HDMI Controller register offsets - audio domain common */
 /* Base address for below regs = 0x65000 */
 enum hdmi_ctrl_reg_offset_common {
@@ -274,6 +265,9 @@ union aud_buf_addr {
 	u32 regval;
 };
 
+#define AUD_BUF_VALID		(1U << 0)
+#define AUD_BUF_INTR_EN		(1U << 1)
+
 /* Length of Audio Buffer */
 union aud_buf_len {
 	struct {
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/4] ALSA: x86: Don't pass SNDRV_PCM_INFO_BATCH flag
  2017-02-03 16:43 ` [PATCH 1/4] ALSA: x86: Don't pass SNDRV_PCM_INFO_BATCH flag Takashi Iwai
@ 2017-02-03 19:39   ` Pierre-Louis Bossart
  2017-02-03 20:13     ` Takashi Iwai
  0 siblings, 1 reply; 26+ messages in thread
From: Pierre-Louis Bossart @ 2017-02-03 19:39 UTC (permalink / raw)
  To: Takashi Iwai, alsa-devel; +Cc: Jerome Anand

On 2/3/17 10:43 AM, Takashi Iwai wrote:
> The PCM engine on LPE audio isn't like a batch-style process, but
> rather it deals with the standard ring buffer.  Passing the BATCH info
> flag is inappropriate.

Humm, this is controversial. There are 4 DMA descriptors and getting a 
precise position in the stream is not straightforward. Rewind is also 
not supported so if you remove the BATCH flag you will push PulseAudio 
into doing timer-based scheduling and rewinds that probably don't work.

>
> Similarly, the DOUBLE flag is also superfluous.  Drop both bits.
>
> Signed-off-by: Takashi Iwai <tiwai@suse.de>
> ---
>  sound/x86/intel_hdmi_audio.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c
> index c83f02c2593e..32a21422e6f5 100644
> --- a/sound/x86/intel_hdmi_audio.c
> +++ b/sound/x86/intel_hdmi_audio.c
> @@ -131,10 +131,8 @@ static const struct channel_map_table map_tables[] = {
>  /* hardware capability structure */
>  static const struct snd_pcm_hardware snd_intel_hadstream = {
>  	.info =	(SNDRV_PCM_INFO_INTERLEAVED |
> -		SNDRV_PCM_INFO_DOUBLE |
>  		SNDRV_PCM_INFO_MMAP|
> -		SNDRV_PCM_INFO_MMAP_VALID |
> -		SNDRV_PCM_INFO_BATCH),
> +		SNDRV_PCM_INFO_MMAP_VALID),
>  	.formats = (SNDRV_PCM_FMTBIT_S24 |
>  		SNDRV_PCM_FMTBIT_U24),
>  	.rates = SNDRV_PCM_RATE_32000 |
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/4] ALSA: x86: Refactor PCM process engine
  2017-02-03 16:44 ` [PATCH 4/4] ALSA: x86: Refactor PCM process engine Takashi Iwai
@ 2017-02-03 19:47   ` Pierre-Louis Bossart
  2017-02-03 20:11     ` Takashi Iwai
  0 siblings, 1 reply; 26+ messages in thread
From: Pierre-Louis Bossart @ 2017-02-03 19:47 UTC (permalink / raw)
  To: Takashi Iwai, alsa-devel; +Cc: Jerome Anand

On 2/3/17 10:44 AM, Takashi Iwai wrote:
> This is again a big rewrite of the driver; now it touches the code to
> process PCM stream transfers.
>
> The most fundamental change is that now the driver supports more than
> four periods.  Instead of keeping the same index between the ring
> buffers (from A to D) and the PCM buffer periods, now we keep
> difference indices for both.  Also, for the cases with less periods
> than four, we track the head index, too.  That is, we now have four
> indices: ringbuf_head, pcm_head, ringbuf_filled, and pcm_filled.

Well that's not completely right. The DMA can only generate an interrupt 
once the buffer you submit was played, and with 4 descriptors you can't 
have more than 4 points where interrupts are generated.  If you program 
different values in different descriptors then the notion of periodic 
hardware interrupts will be lost.

>
> By this flexibility, we can use even dmix, which requires 16 periods
> as default.
>
> The buffer size could be up to 20bit, so it's set to that value.  But
> the pre-allocation is limited to 128k as default.  It's because the
> chip requires the continuous pages (unfortunately no-SG possible),
> thus we don't want too much continuous pages.

No, that's not true. You need contiguous regions for each descriptor, 
but the entire buffer doesn't need to be contiguous.

>
> Signed-off-by: Takashi Iwai <tiwai@suse.de>
> ---
>  sound/x86/intel_hdmi_audio.c     | 510 ++++++++++++++-------------------------
>  sound/x86/intel_hdmi_audio.h     |  23 +-
>  sound/x86/intel_hdmi_lpe_audio.h |  26 +-
>  3 files changed, 204 insertions(+), 355 deletions(-)
>
> diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c
> index c503f0de3975..2c3dcdcb43f5 100644
> --- a/sound/x86/intel_hdmi_audio.c
> +++ b/sound/x86/intel_hdmi_audio.c
> @@ -619,82 +619,6 @@ static void snd_intelhad_prog_dip(struct snd_pcm_substream *substream,
>  	had_write_register(intelhaddata, AUD_CNTL_ST, ctrl_state.regval);
>  }
>
> -/*
> - * Programs buffer address and length registers
> - * This function programs ring buffer address and length into registers.
> - */
> -static int snd_intelhad_prog_buffer(struct snd_pcm_substream *substream,
> -				    struct snd_intelhad *intelhaddata,
> -				    int start, int end)
> -{
> -	u32 ring_buf_addr, ring_buf_size, period_bytes;
> -	u8 i, num_periods;
> -
> -	ring_buf_addr = substream->runtime->dma_addr;
> -	ring_buf_size = snd_pcm_lib_buffer_bytes(substream);
> -	intelhaddata->stream_info.ring_buf_size = ring_buf_size;
> -	period_bytes = frames_to_bytes(substream->runtime,
> -				substream->runtime->period_size);
> -	num_periods = substream->runtime->periods;
> -
> -	/*
> -	 * buffer addr should  be 64 byte aligned, period bytes
> -	 * will be used to calculate addr offset
> -	 */
> -	period_bytes &= ~0x3F;
> -
> -	/* Hardware supports MAX_PERIODS buffers */
> -	if (end >= HAD_MAX_PERIODS)
> -		return -EINVAL;
> -
> -	for (i = start; i <= end; i++) {
> -		/* Program the buf registers with addr and len */
> -		intelhaddata->buf_info[i].buf_addr = ring_buf_addr +
> -							 (i * period_bytes);
> -		if (i < num_periods-1)
> -			intelhaddata->buf_info[i].buf_size = period_bytes;
> -		else
> -			intelhaddata->buf_info[i].buf_size = ring_buf_size -
> -							(i * period_bytes);
> -
> -		had_write_register(intelhaddata,
> -				   AUD_BUF_A_ADDR + (i * HAD_REG_WIDTH),
> -					intelhaddata->buf_info[i].buf_addr |
> -					BIT(0) | BIT(1));
> -		had_write_register(intelhaddata,
> -				   AUD_BUF_A_LENGTH + (i * HAD_REG_WIDTH),
> -					period_bytes);
> -		intelhaddata->buf_info[i].is_valid = true;
> -	}
> -	dev_dbg(intelhaddata->dev, "%s:buf[%d-%d] addr=%#x  and size=%d\n",
> -		__func__, start, end,
> -		intelhaddata->buf_info[start].buf_addr,
> -		intelhaddata->buf_info[start].buf_size);
> -	intelhaddata->valid_buf_cnt = num_periods;
> -	return 0;
> -}
> -
> -static int snd_intelhad_read_len(struct snd_intelhad *intelhaddata)
> -{
> -	int i, retval = 0;
> -	u32 len[4];
> -
> -	for (i = 0; i < 4 ; i++) {
> -		had_read_register(intelhaddata,
> -				  AUD_BUF_A_LENGTH + (i * HAD_REG_WIDTH),
> -				  &len[i]);
> -		if (!len[i])
> -			retval++;
> -	}
> -	if (retval != 1) {
> -		for (i = 0; i < 4 ; i++)
> -			dev_dbg(intelhaddata->dev, "buf[%d] size=%d\n",
> -				i, len[i]);
> -	}
> -
> -	return retval;
> -}
> -
>  static int had_calculate_maud_value(u32 aud_samp_freq, u32 link_rate)
>  {
>  	u32 maud_val;
> @@ -882,34 +806,192 @@ static int snd_intelhad_prog_n(u32 aud_samp_freq, u32 *n_param,
>  	return 0;
>  }
>
> +/*
> + * PCM ring buffer handling
> + */
> +
> +#define AUD_BUF_ADDR(x)		(AUD_BUF_A_ADDR + (x) * HAD_REG_WIDTH)
> +#define AUD_BUF_LEN(x)		(AUD_BUF_A_LENGTH + (x) * HAD_REG_WIDTH)
> +
> +/* Set up a ring buffer at the "filled" position */
> +static void had_prog_ringbuf(struct snd_pcm_substream *substream,
> +			     struct snd_intelhad *intelhaddata)
> +{
> +	int idx = intelhaddata->ringbuf_filled;
> +	int ofs = intelhaddata->pcmbuf_filled * intelhaddata->period_bytes;
> +	u32 addr = substream->runtime->dma_addr + ofs;
> +
> +	addr |= AUD_BUF_VALID | AUD_BUF_INTR_EN;
> +	had_write_register(intelhaddata, AUD_BUF_ADDR(idx), addr);
> +	had_write_register(intelhaddata, AUD_BUF_LEN(idx),
> +			   intelhaddata->period_bytes);
> +
> +	/* advance the indices to the next */
> +	intelhaddata->ringbuf_filled++;
> +	intelhaddata->ringbuf_filled %= HAD_NUM_RINGBUFS;
> +	intelhaddata->pcmbuf_filled++;
> +	intelhaddata->pcmbuf_filled %= substream->runtime->periods;
> +}
> +
> +/* invalidate a ring buffer with the given index */
> +static void had_invalidate_ringbuf(struct snd_intelhad *intelhaddata,
> +				   int idx)
> +{
> +	had_write_register(intelhaddata, AUD_BUF_ADDR(idx), 0);
> +	had_write_register(intelhaddata, AUD_BUF_LEN(idx), 0);
> +}
> +
> +/* Initial programming of ring buffers */
> +static void had_init_ringbufs(struct snd_pcm_substream *substream,
> +			      struct snd_intelhad *intelhaddata)
> +{
> +	struct snd_pcm_runtime *runtime = substream->runtime;
> +	int i, period_bytes, num_periods;
> +
> +	num_periods = runtime->periods;
> +	intelhaddata->period_bytes = period_bytes =
> +		frames_to_bytes(runtime, runtime->period_size);
> +	WARN_ON(period_bytes & 0x3f);
> +
> +	intelhaddata->ringbuf_head = 0;
> +	intelhaddata->pcmbuf_head = 0;
> +	intelhaddata->ringbuf_filled = 0;
> +	intelhaddata->pcmbuf_filled = 0;
> +
> +	for (i = 0; i < HAD_NUM_RINGBUFS; i++) {
> +		if (i < num_periods)
> +			had_prog_ringbuf(substream, intelhaddata);
> +		else /* invalidate the rest */
> +			had_invalidate_ringbuf(intelhaddata, i);
> +	}
> +}
> +
> +/* process a ring buf, advance to the next */
> +static void had_advance_ringbuf(struct snd_pcm_substream *substream,
> +				struct snd_intelhad *intelhaddata)
> +{
> +	int num_periods = substream->runtime->periods;
> +
> +	/* reprogram the next buffer */
> +	had_prog_ringbuf(substream, intelhaddata);
> +
> +	/* invalidate the processed buf for shorter PCM buffer */
> +	if (num_periods < HAD_NUM_RINGBUFS)
> +		had_invalidate_ringbuf(intelhaddata,
> +				       intelhaddata->ringbuf_head);
> +
> +	/* proceed to next */
> +	intelhaddata->ringbuf_head++;
> +	intelhaddata->ringbuf_head %= HAD_NUM_RINGBUFS;
> +	intelhaddata->pcmbuf_head++;
> +	intelhaddata->pcmbuf_head %= num_periods;
> +}
> +
> +/* process the current ring buffer(s);
> + * returns the current PCM buffer byte position, or -EPIPE for underrun.
> + */
> +static int had_process_ringbufs(struct snd_pcm_substream *substream,
> +				struct snd_intelhad *intelhaddata)
> +{
> +	int len, processed;
> +	unsigned long flags;
> +
> +	processed = 0;
> +	spin_lock_irqsave(&intelhaddata->had_spinlock, flags);
> +	for (;;) {
> +		/* get the remaining bytes on the buffer */
> +		had_read_register(intelhaddata,
> +				  AUD_BUF_LEN(intelhaddata->ringbuf_head),
> +				  &len);
> +		if (len < 0 || len > intelhaddata->period_bytes) {
> +			dev_dbg(intelhaddata->dev, "Invalid buf length %d\n",
> +				len);
> +			len = -EPIPE;
> +			goto out;
> +		}
> +
> +		if (len > 0) /* OK, this is the current buffer */
> +			break;
> +
> +		/* len=0 => already empty, check the next buffer */
> +		if (++processed >= HAD_NUM_RINGBUFS) {
> +			len = -EPIPE; /* all empty? - report underrun */
> +			goto out;
> +		}
> +		had_advance_ringbuf(substream, intelhaddata);
> +	}
> +
> +	len = intelhaddata->period_bytes - len;
> +	len += intelhaddata->period_bytes * intelhaddata->pcmbuf_head;
> + out:
> +	spin_unlock_irqrestore(&intelhaddata->had_spinlock, flags);
> +	return len;
> +}
> +
> +/* called from irq handler */
> +static void had_process_buffer_done(struct snd_intelhad *intelhaddata)
> +{
> +	struct snd_pcm_substream *substream;
> +
> +	if (!intelhaddata->connected)
> +		return; /* disconnected? - bail out */
> +
> +	substream = had_substream_get(intelhaddata);
> +	if (!substream)
> +		return; /* no stream? - bail out */
> +
> +	/* process or stop the stream */
> +	if (had_process_ringbufs(substream, intelhaddata) < 0)
> +		snd_pcm_stop_xrun(substream);
> +	else
> +		snd_pcm_period_elapsed(substream);
> +
> +	had_substream_put(intelhaddata);
> +}
> +
>  #define MAX_CNT			0xFF
>
> -static void snd_intelhad_handle_underrun(struct snd_intelhad *intelhaddata)
> +/*
> + * The interrupt status 'sticky' bits might not be cleared by
> + * setting '1' to that bit once...
> + */
> +static void wait_clear_underrun_bit(struct snd_intelhad *intelhaddata)
>  {
> -	u32 hdmi_status = 0, i = 0;
> +	int i;
> +	u32 val;
> +
> +	for (i = 0; i < MAX_CNT; i++) {
> +		/* clear bit30, 31 AUD_HDMI_STATUS */
> +		had_read_register(intelhaddata, AUD_HDMI_STATUS, &val);
> +		if (!(val & AUD_CONFIG_MASK_UNDERRUN))
> +			return;
> +		had_write_register(intelhaddata, AUD_HDMI_STATUS, val);
> +	}
> +	dev_err(intelhaddata->dev, "Unable to clear UNDERRUN bits\n");
> +}
> +
> +/* called from irq handler */
> +static void had_process_buffer_underrun(struct snd_intelhad *intelhaddata)
> +{
> +	struct snd_pcm_substream *substream;
>
>  	/* Handle Underrun interrupt within Audio Unit */
>  	had_write_register(intelhaddata, AUD_CONFIG, 0);
>  	/* Reset buffer pointers */
>  	had_write_register(intelhaddata, AUD_HDMI_STATUS, 1);
>  	had_write_register(intelhaddata, AUD_HDMI_STATUS, 0);
> -	/*
> -	 * The interrupt status 'sticky' bits might not be cleared by
> -	 * setting '1' to that bit once...
> -	 */
> -	do { /* clear bit30, 31 AUD_HDMI_STATUS */
> -		had_read_register(intelhaddata, AUD_HDMI_STATUS,
> -				  &hdmi_status);
> -		dev_dbg(intelhaddata->dev, "HDMI status =0x%x\n", hdmi_status);
> -		if (hdmi_status & AUD_CONFIG_MASK_UNDERRUN) {
> -			i++;
> -			had_write_register(intelhaddata,
> -					   AUD_HDMI_STATUS, hdmi_status);
> -		} else
> -			break;
> -	} while (i < MAX_CNT);
> -	if (i >= MAX_CNT)
> -		dev_err(intelhaddata->dev, "Unable to clear UNDERRUN bits\n");
> +
> +	wait_clear_underrun_bit(intelhaddata);
> +
> +	if (!intelhaddata->connected)
> +		return; /* disconnected? - bail out */
> +
> +	/* Report UNDERRUN error to above layers */
> +	substream = had_substream_get(intelhaddata);
> +	if (substream) {
> +		snd_pcm_stop_xrun(substream);
> +		had_substream_put(intelhaddata);
> +	}
>  }
>
>  /*
> @@ -955,11 +1037,6 @@ static int snd_intelhad_open(struct snd_pcm_substream *substream)
>  	intelhaddata->stream_info.substream_refcount++;
>  	spin_unlock_irq(&intelhaddata->had_spinlock);
>
> -	/* these are cleared in prepare callback, but just to be sure */
> -	intelhaddata->curr_buf = 0;
> -	intelhaddata->underrun_count = 0;
> -	intelhaddata->stream_info.buffer_rendered = 0;
> -
>  	return retval;
>   error:
>  	pm_runtime_put(intelhaddata->dev);
> @@ -1123,10 +1200,6 @@ static int snd_intelhad_pcm_prepare(struct snd_pcm_substream *substream)
>  	dev_dbg(intelhaddata->dev, "rate=%d\n", runtime->rate);
>  	dev_dbg(intelhaddata->dev, "channels=%d\n", runtime->channels);
>
> -	intelhaddata->curr_buf = 0;
> -	intelhaddata->underrun_count = 0;
> -	intelhaddata->stream_info.buffer_rendered = 0;
> -
>  	/* Get N value in KHz */
>  	disp_samp_freq = intelhaddata->tmds_clock_speed;
>
> @@ -1150,8 +1223,7 @@ static int snd_intelhad_pcm_prepare(struct snd_pcm_substream *substream)
>  	retval = snd_intelhad_audio_ctrl(substream, intelhaddata);
>
>  	/* Prog buffer address */
> -	retval = snd_intelhad_prog_buffer(substream, intelhaddata,
> -			HAD_BUF_TYPE_A, HAD_BUF_TYPE_D);
> +	had_init_ringbufs(substream, intelhaddata);
>
>  	/*
>  	 * Program channel mapping in following order:
> @@ -1171,48 +1243,17 @@ static snd_pcm_uframes_t
>  snd_intelhad_pcm_pointer(struct snd_pcm_substream *substream)
>  {
>  	struct snd_intelhad *intelhaddata;
> -	u32 bytes_rendered = 0;
> -	u32 t;
> -	int buf_id;
> +	int len;
>
>  	intelhaddata = snd_pcm_substream_chip(substream);
>
>  	if (!intelhaddata->connected)
>  		return SNDRV_PCM_POS_XRUN;
>
> -	/* Use a hw register to calculate sub-period position reports.
> -	 * This makes PulseAudio happier.
> -	 */
> -
> -	buf_id = intelhaddata->curr_buf % 4;
> -	had_read_register(intelhaddata,
> -			  AUD_BUF_A_LENGTH + (buf_id * HAD_REG_WIDTH), &t);
> -
> -	if ((t == 0) || (t == ((u32)-1L))) {
> -		intelhaddata->underrun_count++;
> -		dev_dbg(intelhaddata->dev,
> -			"discovered buffer done for buf %d, count = %d\n",
> -			 buf_id, intelhaddata->underrun_count);
> -
> -		if (intelhaddata->underrun_count > (HAD_MIN_PERIODS/2)) {
> -			dev_dbg(intelhaddata->dev,
> -				"assume audio_codec_reset, underrun = %d - do xrun\n",
> -				 intelhaddata->underrun_count);
> -			return SNDRV_PCM_POS_XRUN;
> -		}
> -	} else {
> -		/* Reset Counter */
> -		intelhaddata->underrun_count = 0;
> -	}
> -
> -	t = intelhaddata->buf_info[buf_id].buf_size - t;
> -
> -	if (intelhaddata->stream_info.buffer_rendered)
> -		div_u64_rem(intelhaddata->stream_info.buffer_rendered,
> -			intelhaddata->stream_info.ring_buf_size,
> -			&(bytes_rendered));
> -
> -	return bytes_to_frames(substream->runtime, bytes_rendered + t);
> +	len = had_process_ringbufs(substream, intelhaddata);
> +	if (len < 0)
> +		return SNDRV_PCM_POS_XRUN;
> +	return bytes_to_frames(substream->runtime, len);
>  }
>
>  /*
> @@ -1283,179 +1324,9 @@ static int hdmi_audio_mode_change(struct snd_intelhad *intelhaddata)
>  	return retval;
>  }
>
> -static inline int had_chk_intrmiss(struct snd_intelhad *intelhaddata,
> -		enum intel_had_aud_buf_type buf_id)
> -{
> -	int i, intr_count = 0;
> -	enum intel_had_aud_buf_type buff_done;
> -	u32 buf_size, buf_addr;
> -
> -	buff_done = buf_id;
> -
> -	intr_count = snd_intelhad_read_len(intelhaddata);
> -	if (intr_count > 1) {
> -		/* In case of active playback */
> -		dev_err(intelhaddata->dev,
> -			"Driver detected %d missed buffer done interrupt(s)\n",
> -			(intr_count - 1));
> -		if (intr_count > 3)
> -			return intr_count;
> -
> -		buf_id += (intr_count - 1);
> -		/* Reprogram registers*/
> -		for (i = buff_done; i < buf_id; i++) {
> -			int j = i % 4;
> -
> -			buf_size = intelhaddata->buf_info[j].buf_size;
> -			buf_addr = intelhaddata->buf_info[j].buf_addr;
> -			had_write_register(intelhaddata,
> -					   AUD_BUF_A_LENGTH +
> -					   (j * HAD_REG_WIDTH), buf_size);
> -			had_write_register(intelhaddata,
> -					   AUD_BUF_A_ADDR+(j * HAD_REG_WIDTH),
> -					   (buf_addr | BIT(0) | BIT(1)));
> -		}
> -		buf_id = buf_id % 4;
> -		intelhaddata->buff_done = buf_id;
> -	}
> -
> -	return intr_count;
> -}
> -
> -/* called from irq handler */
> -static int had_process_buffer_done(struct snd_intelhad *intelhaddata)
> -{
> -	u32 len = 1;
> -	enum intel_had_aud_buf_type buf_id;
> -	enum intel_had_aud_buf_type buff_done;
> -	struct pcm_stream_info *stream;
> -	struct snd_pcm_substream *substream;
> -	u32 buf_size;
> -	int intr_count;
> -	unsigned long flags;
> -
> -	stream = &intelhaddata->stream_info;
> -	intr_count = 1;
> -
> -	spin_lock_irqsave(&intelhaddata->had_spinlock, flags);
> -	if (!intelhaddata->connected) {
> -		spin_unlock_irqrestore(&intelhaddata->had_spinlock, flags);
> -		dev_dbg(intelhaddata->dev,
> -			"%s:Device already disconnected\n", __func__);
> -		return 0;
> -	}
> -	buf_id = intelhaddata->curr_buf;
> -	intelhaddata->buff_done = buf_id;
> -	buff_done = intelhaddata->buff_done;
> -	buf_size = intelhaddata->buf_info[buf_id].buf_size;
> -
> -	/* Every debug statement has an implication
> -	 * of ~5msec. Thus, avoid having >3 debug statements
> -	 * for each buffer_done handling.
> -	 */
> -
> -	/* Check for any intr_miss in case of active playback */
> -	if (stream->running) {
> -		intr_count = had_chk_intrmiss(intelhaddata, buf_id);
> -		if (!intr_count || (intr_count > 3)) {
> -			spin_unlock_irqrestore(&intelhaddata->had_spinlock,
> -					       flags);
> -			dev_err(intelhaddata->dev,
> -				"HAD SW state in non-recoverable mode\n");
> -			return 0;
> -		}
> -		buf_id += (intr_count - 1);
> -		buf_id = buf_id % 4;
> -	}
> -
> -	intelhaddata->buf_info[buf_id].is_valid = true;
> -	if (intelhaddata->valid_buf_cnt-1 == buf_id) {
> -		if (stream->running)
> -			intelhaddata->curr_buf = HAD_BUF_TYPE_A;
> -	} else
> -		intelhaddata->curr_buf = buf_id + 1;
> -
> -	spin_unlock_irqrestore(&intelhaddata->had_spinlock, flags);
> -
> -	if (!intelhaddata->connected) {
> -		dev_dbg(intelhaddata->dev, "HDMI cable plugged-out\n");
> -		return 0;
> -	}
> -
> -	/* Reprogram the registers with addr and length */
> -	had_write_register(intelhaddata,
> -			   AUD_BUF_A_LENGTH + (buf_id * HAD_REG_WIDTH),
> -			   buf_size);
> -	had_write_register(intelhaddata,
> -			   AUD_BUF_A_ADDR + (buf_id * HAD_REG_WIDTH),
> -			   intelhaddata->buf_info[buf_id].buf_addr |
> -			   BIT(0) | BIT(1));
> -
> -	had_read_register(intelhaddata,
> -			  AUD_BUF_A_LENGTH + (buf_id * HAD_REG_WIDTH),
> -			  &len);
> -	dev_dbg(intelhaddata->dev, "%s:Enabled buf[%d]\n", __func__, buf_id);
> -
> -	/* In case of actual data,
> -	 * report buffer_done to above ALSA layer
> -	 */
> -	substream = had_substream_get(intelhaddata);
> -	if (substream) {
> -		buf_size = intelhaddata->buf_info[buf_id].buf_size;
> -		intelhaddata->stream_info.buffer_rendered +=
> -			(intr_count * buf_size);
> -		snd_pcm_period_elapsed(substream);
> -		had_substream_put(intelhaddata);
> -	}
> -
> -	return 0;
> -}
> -
> -/* called from irq handler */
> -static int had_process_buffer_underrun(struct snd_intelhad *intelhaddata)
> -{
> -	enum intel_had_aud_buf_type buf_id;
> -	struct pcm_stream_info *stream;
> -	struct snd_pcm_substream *substream;
> -	unsigned long flags;
> -	int connected;
> -
> -	stream = &intelhaddata->stream_info;
> -
> -	spin_lock_irqsave(&intelhaddata->had_spinlock, flags);
> -	buf_id = intelhaddata->curr_buf;
> -	intelhaddata->buff_done = buf_id;
> -	connected = intelhaddata->connected;
> -	if (stream->running)
> -		intelhaddata->curr_buf = HAD_BUF_TYPE_A;
> -
> -	spin_unlock_irqrestore(&intelhaddata->had_spinlock, flags);
> -
> -	dev_dbg(intelhaddata->dev, "Enter:%s buf_id=%d, stream_running=%d\n",
> -			__func__, buf_id, stream->running);
> -
> -	snd_intelhad_handle_underrun(intelhaddata);
> -
> -	if (!connected) {
> -		dev_dbg(intelhaddata->dev,
> -			"%s:Device already disconnected\n", __func__);
> -		return 0;
> -	}
> -
> -	/* Report UNDERRUN error to above layers */
> -	substream = had_substream_get(intelhaddata);
> -	if (substream) {
> -		snd_pcm_stop_xrun(substream);
> -		had_substream_put(intelhaddata);
> -	}
> -
> -	return 0;
> -}
> -
>  /* process hot plug, called from wq with mutex locked */
>  static void had_process_hot_plug(struct snd_intelhad *intelhaddata)
>  {
> -	enum intel_had_aud_buf_type buf_id;
>  	struct snd_pcm_substream *substream;
>
>  	spin_lock_irq(&intelhaddata->had_spinlock);
> @@ -1465,17 +1336,12 @@ static void had_process_hot_plug(struct snd_intelhad *intelhaddata)
>  		return;
>  	}
>
> -	buf_id = intelhaddata->curr_buf;
> -	intelhaddata->buff_done = buf_id;
>  	intelhaddata->connected = true;
>  	dev_dbg(intelhaddata->dev,
>  		"%s @ %d:DEBUG PLUG/UNPLUG : HAD_DRV_CONNECTED\n",
>  			__func__, __LINE__);
>  	spin_unlock_irq(&intelhaddata->had_spinlock);
>
> -	dev_dbg(intelhaddata->dev, "Processing HOT_PLUG, buf_id = %d\n",
> -		buf_id);
> -
>  	/* Safety check */
>  	substream = had_substream_get(intelhaddata);
>  	if (substream) {
> @@ -1492,11 +1358,8 @@ static void had_process_hot_plug(struct snd_intelhad *intelhaddata)
>  /* process hot unplug, called from wq with mutex locked */
>  static void had_process_hot_unplug(struct snd_intelhad *intelhaddata)
>  {
> -	enum intel_had_aud_buf_type buf_id;
>  	struct snd_pcm_substream *substream;
>
> -	buf_id = intelhaddata->curr_buf;
> -
>  	substream = had_substream_get(intelhaddata);
>
>  	spin_lock_irq(&intelhaddata->had_spinlock);
> @@ -1868,13 +1731,12 @@ static int hdmi_lpe_audio_probe(struct platform_device *pdev)
>  	dma_set_mask(&pdev->dev, DMA_BIT_MASK(32));
>  	dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32));
>
> -	/* allocate dma pages for ALSA stream operations
> -	 * memory allocated is based on size, not max value
> -	 * thus using same argument for max & size
> +	/* allocate dma pages;
> +	 * try to allocate 128k buffer as default which is large enough
>  	 */
>  	snd_pcm_lib_preallocate_pages_for_all(pcm,
>  			SNDRV_DMA_TYPE_DEV, NULL,
> -			HAD_MAX_BUFFER, HAD_MAX_BUFFER);
> +			128 * 1024, HAD_MAX_BUFFER);
>
>  	/* create controls */
>  	for (i = 0; i < ARRAY_SIZE(had_controls); i++) {
> diff --git a/sound/x86/intel_hdmi_audio.h b/sound/x86/intel_hdmi_audio.h
> index 9f713a8a88bc..6178b09801e7 100644
> --- a/sound/x86/intel_hdmi_audio.h
> +++ b/sound/x86/intel_hdmi_audio.h
> @@ -64,24 +64,15 @@
>
>  struct pcm_stream_info {
>  	struct snd_pcm_substream *substream;
> -	u64		buffer_rendered;
> -	u32		ring_buf_size;
>  	int substream_refcount;
>  	bool running;
>  };
>
> -struct ring_buf_info {
> -	u32	buf_addr;
> -	u32	buf_size;
> -	u8	is_valid;
> -};
> -
>  /*
>   * struct snd_intelhad - intelhad driver structure
>   *
>   * @card: ptr to hold card details
>   * @connected: the monitor connection status
> - * @buf_info: ring buffer info
>   * @stream_info: stream information
>   * @eld: holds ELD info
>   * @curr_buf: pointer to hold current active ring buf
> @@ -91,26 +82,28 @@ struct ring_buf_info {
>   * @buff_done: id of current buffer done intr
>   * @dev: platoform device handle
>   * @chmap: holds channel map info
> - * @underrun_count: PCM stream underrun counter
>   */
>  struct snd_intelhad {
>  	struct snd_card	*card;
>  	bool		connected;
> -	struct		ring_buf_info buf_info[HAD_NUM_OF_RING_BUFS];
>  	struct		pcm_stream_info stream_info;
>  	unsigned char	eld[HDMI_MAX_ELD_BYTES];
>  	bool dp_output;
> -	enum		intel_had_aud_buf_type curr_buf;
> -	int		valid_buf_cnt;
>  	unsigned int	aes_bits;
>  	spinlock_t had_spinlock;
> -	enum		intel_had_aud_buf_type buff_done;
>  	struct device *dev;
>  	struct snd_pcm_chmap *chmap;
> -	int underrun_count;
>  	int tmds_clock_speed;
>  	int link_rate;
>
> +	/* positions (indices) for ring buffer [A-D] */
> +	unsigned int ringbuf_head;	/* being processed */
> +	unsigned int ringbuf_filled;	/* to be filled */
> +	/* positions (indices) for PCM buffer */
> +	unsigned int pcmbuf_head;	/* being processed */
> +	unsigned int pcmbuf_filled;	/* to be filled */
> +	unsigned int period_bytes;	/* PCM period size in bytes */
> +
>  	/* internal stuff */
>  	int irq;
>  	void __iomem *mmio_start;
> diff --git a/sound/x86/intel_hdmi_lpe_audio.h b/sound/x86/intel_hdmi_lpe_audio.h
> index be9783910a3a..612714eecaff 100644
> --- a/sound/x86/intel_hdmi_lpe_audio.h
> +++ b/sound/x86/intel_hdmi_lpe_audio.h
> @@ -26,15 +26,14 @@
>  #define HAD_MAX_DEVICES		1
>  #define HAD_MIN_CHANNEL		2
>  #define HAD_MAX_CHANNEL		8
> -#define HAD_NUM_OF_RING_BUFS	4
> +#define HAD_NUM_RINGBUFS	4
>
> -/* Assume 192KHz, 8channel, 25msec period */
> -#define HAD_MAX_BUFFER		(600*1024)
> -#define HAD_MIN_BUFFER		(32*1024)
> -#define HAD_MAX_PERIODS		4
> -#define HAD_MIN_PERIODS		4
> -#define HAD_MAX_PERIOD_BYTES	(HAD_MAX_BUFFER/HAD_MIN_PERIODS)
> -#define HAD_MIN_PERIOD_BYTES	256
> +/* max 20bit address, aligned to 64 */
> +#define HAD_MAX_BUFFER		((1024 * 1024 - 1) & ~0x3f)
> +#define HAD_MAX_PERIODS		256	/* arbitrary, but should suffice */
> +#define HAD_MIN_PERIODS		2
> +#define HAD_MAX_PERIOD_BYTES	((HAD_MAX_BUFFER / HAD_MIN_PERIODS) & ~0x3f)
> +#define HAD_MIN_PERIOD_BYTES	1024	/* might be smaller */
>  #define HAD_FIFO_SIZE		0 /* fifo not being used */
>  #define MAX_SPEAKERS		8
>
> @@ -82,14 +81,6 @@
>  /* Naud Value */
>  #define DP_NAUD_VAL					32768
>
> -/* enum intel_had_aud_buf_type - HDMI controller ring buffer types */
> -enum intel_had_aud_buf_type {
> -	HAD_BUF_TYPE_A = 0,
> -	HAD_BUF_TYPE_B = 1,
> -	HAD_BUF_TYPE_C = 2,
> -	HAD_BUF_TYPE_D = 3,
> -};
> -
>  /* HDMI Controller register offsets - audio domain common */
>  /* Base address for below regs = 0x65000 */
>  enum hdmi_ctrl_reg_offset_common {
> @@ -274,6 +265,9 @@ union aud_buf_addr {
>  	u32 regval;
>  };
>
> +#define AUD_BUF_VALID		(1U << 0)
> +#define AUD_BUF_INTR_EN		(1U << 1)
> +
>  /* Length of Audio Buffer */
>  union aud_buf_len {
>  	struct {
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/4] ALSA: x86: Refactor PCM process engine
  2017-02-03 19:47   ` Pierre-Louis Bossart
@ 2017-02-03 20:11     ` Takashi Iwai
  2017-02-03 23:22       ` Pierre-Louis Bossart
  0 siblings, 1 reply; 26+ messages in thread
From: Takashi Iwai @ 2017-02-03 20:11 UTC (permalink / raw)
  To: Pierre-Louis Bossart; +Cc: alsa-devel, Jerome Anand

On Fri, 03 Feb 2017 20:47:55 +0100,
Pierre-Louis Bossart wrote:
> 
> On 2/3/17 10:44 AM, Takashi Iwai wrote:
> > This is again a big rewrite of the driver; now it touches the code to
> > process PCM stream transfers.
> >
> > The most fundamental change is that now the driver supports more than
> > four periods.  Instead of keeping the same index between the ring
> > buffers (from A to D) and the PCM buffer periods, now we keep
> > difference indices for both.  Also, for the cases with less periods
> > than four, we track the head index, too.  That is, we now have four
> > indices: ringbuf_head, pcm_head, ringbuf_filled, and pcm_filled.
> 
> Well that's not completely right. The DMA can only generate an
> interrupt once the buffer you submit was played, and with 4
> descriptors you can't have more than 4 points where interrupts are
> generated.  If you program different values in different descriptors
> then the notion of periodic hardware interrupts will be lost.

There is a standard trick such as used for ICH or other drivers.
With this kind of hardware, the address to be written to each buffer
descriptor can be changed dynamically while streaming.  (BTW, on some
hardware like HD-audio, it's not allowed, but HD-audio has a larger
table, so it doesn't matter.)

That is, each BD maps to each period on the PCM buffer, and it moves
there.  A picture like below would illustrate:

At time=0

  PCM | 0 | 1 | 2 | 3 | 4 | 5 | .... |n-1|
  BD  | A | B | C | D |

At time=1 (period elapsed)

  PCM | 0 | 1 | 2 | 3 | 4 | 5 | .... |n-1|
  BD      | B | C | D | A |

At time=2

  PCM | 0 | 1 | 2 | 3 | 4 | 5 | .... |n-1|
  BD          | C | D | A | B |

and so on.

For the case periods < 4, it works other way round:

t=0
  PCM | 0 | 1 |
  BD  | A | B | - | - |

t=1
  PCM     | 1 | 0 |
  BD  | - | B | C | - |

t=2
  PCM         | 0 | 1 |
  BD  | - | - | C | D |


> > By this flexibility, we can use even dmix, which requires 16 periods
> > as default.
> >
> > The buffer size could be up to 20bit, so it's set to that value.  But
> > the pre-allocation is limited to 128k as default.  It's because the
> > chip requires the continuous pages (unfortunately no-SG possible),
> > thus we don't want too much continuous pages.
> 
> No, that's not true. You need contiguous regions for each descriptor,
> but the entire buffer doesn't need to be contiguous.

Yeah, in theory, it's possible, but it actually doesn't help much with
only four descriptors.  Usually SG buffer management is per page.  And
if we fix the size per descriptor, it won't fit what user-space apps
want in most cases.  So practically it mandates the continuous pages.


Takashi

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/4] ALSA: x86: Don't pass SNDRV_PCM_INFO_BATCH flag
  2017-02-03 19:39   ` Pierre-Louis Bossart
@ 2017-02-03 20:13     ` Takashi Iwai
  2017-02-03 23:13       ` Pierre-Louis Bossart
  0 siblings, 1 reply; 26+ messages in thread
From: Takashi Iwai @ 2017-02-03 20:13 UTC (permalink / raw)
  To: Pierre-Louis Bossart; +Cc: alsa-devel, Jerome Anand

On Fri, 03 Feb 2017 20:39:49 +0100,
Pierre-Louis Bossart wrote:
> 
> On 2/3/17 10:43 AM, Takashi Iwai wrote:
> > The PCM engine on LPE audio isn't like a batch-style process, but
> > rather it deals with the standard ring buffer.  Passing the BATCH info
> > flag is inappropriate.
> 
> Humm, this is controversial. There are 4 DMA descriptors and getting a
> precise position in the stream is not straightforward.

Well, as far as I've tested, it is.  The buffer length register keeps
the remaining bytes, and you can easily read out and compute the
current position in fairly exact manner.  Even the old driver has that
information -- the patch David added does it.

> Rewind is also
> not supported so if you remove the BATCH flag you will push PulseAudio
> into doing timer-based scheduling and rewinds that probably don't
> work.

I don't think there is much difference regarding this.
Please check the buffer manage description in my previous post.

In anyway, it'd be appreciated if you can test on your hardware.
I could test only on a single machine.


thanks,

Takashi


> >
> > Similarly, the DOUBLE flag is also superfluous.  Drop both bits.
> >
> > Signed-off-by: Takashi Iwai <tiwai@suse.de>
> > ---
> >  sound/x86/intel_hdmi_audio.c | 4 +---
> >  1 file changed, 1 insertion(+), 3 deletions(-)
> >
> > diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c
> > index c83f02c2593e..32a21422e6f5 100644
> > --- a/sound/x86/intel_hdmi_audio.c
> > +++ b/sound/x86/intel_hdmi_audio.c
> > @@ -131,10 +131,8 @@ static const struct channel_map_table map_tables[] = {
> >  /* hardware capability structure */
> >  static const struct snd_pcm_hardware snd_intel_hadstream = {
> >  	.info =	(SNDRV_PCM_INFO_INTERLEAVED |
> > -		SNDRV_PCM_INFO_DOUBLE |
> >  		SNDRV_PCM_INFO_MMAP|
> > -		SNDRV_PCM_INFO_MMAP_VALID |
> > -		SNDRV_PCM_INFO_BATCH),
> > +		SNDRV_PCM_INFO_MMAP_VALID),
> >  	.formats = (SNDRV_PCM_FMTBIT_S24 |
> >  		SNDRV_PCM_FMTBIT_U24),
> >  	.rates = SNDRV_PCM_RATE_32000 |
> >
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/4] ALSA: x86: Don't pass SNDRV_PCM_INFO_BATCH flag
  2017-02-03 20:13     ` Takashi Iwai
@ 2017-02-03 23:13       ` Pierre-Louis Bossart
  2017-02-04  7:57         ` Takashi Iwai
  0 siblings, 1 reply; 26+ messages in thread
From: Pierre-Louis Bossart @ 2017-02-03 23:13 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Jerome Anand



On 02/03/2017 02:13 PM, Takashi Iwai wrote:
> On Fri, 03 Feb 2017 20:39:49 +0100,
> Pierre-Louis Bossart wrote:
>> On 2/3/17 10:43 AM, Takashi Iwai wrote:
>>> The PCM engine on LPE audio isn't like a batch-style process, but
>>> rather it deals with the standard ring buffer.  Passing the BATCH info
>>> flag is inappropriate.
>> Humm, this is controversial. There are 4 DMA descriptors and getting a
>> precise position in the stream is not straightforward.
> Well, as far as I've tested, it is.  The buffer length register keeps
> the remaining bytes, and you can easily read out and compute the
> current position in fairly exact manner.  Even the old driver has that
> information -- the patch David added does it.
Yes, and I don't think anyone on the Intel side quite understood what 
David did there. The code didn't seem quite right.
>
>> Rewind is also
>> not supported so if you remove the BATCH flag you will push PulseAudio
>> into doing timer-based scheduling and rewinds that probably don't
>> work.
> I don't think there is much difference regarding this.
> Please check the buffer manage description in my previous post
We've never tested the hardware in those configurations so I'd like a 
bit more time to double check how well this might work. None of us has a 
clear view of how much buffering and pre-fetch is done by the DMA engine 
so if you rewind 'too much' you may be out of luck.
>
> In anyway, it'd be appreciated if you can test on your hardware.
> I could test only on a single machine.
I can test more but only in 10 days from now so if we could delay this 
patch a bit it'd be better.
>
>
> thanks,
>
> Takashi
>
>
>>> Similarly, the DOUBLE flag is also superfluous.  Drop both bits.
>>>
>>> Signed-off-by: Takashi Iwai <tiwai@suse.de>
>>> ---
>>>   sound/x86/intel_hdmi_audio.c | 4 +---
>>>   1 file changed, 1 insertion(+), 3 deletions(-)
>>>
>>> diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c
>>> index c83f02c2593e..32a21422e6f5 100644
>>> --- a/sound/x86/intel_hdmi_audio.c
>>> +++ b/sound/x86/intel_hdmi_audio.c
>>> @@ -131,10 +131,8 @@ static const struct channel_map_table map_tables[] = {
>>>   /* hardware capability structure */
>>>   static const struct snd_pcm_hardware snd_intel_hadstream = {
>>>   	.info =	(SNDRV_PCM_INFO_INTERLEAVED |
>>> -		SNDRV_PCM_INFO_DOUBLE |
>>>   		SNDRV_PCM_INFO_MMAP|
>>> -		SNDRV_PCM_INFO_MMAP_VALID |
>>> -		SNDRV_PCM_INFO_BATCH),
>>> +		SNDRV_PCM_INFO_MMAP_VALID),
>>>   	.formats = (SNDRV_PCM_FMTBIT_S24 |
>>>   		SNDRV_PCM_FMTBIT_U24),
>>>   	.rates = SNDRV_PCM_RATE_32000 |
>>>
> _______________________________________________
> Alsa-devel mailing list
> Alsa-devel@alsa-project.org
> http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/4] ALSA: x86: Refactor PCM process engine
  2017-02-03 20:11     ` Takashi Iwai
@ 2017-02-03 23:22       ` Pierre-Louis Bossart
  2017-02-04  7:51         ` Takashi Iwai
  0 siblings, 1 reply; 26+ messages in thread
From: Pierre-Louis Bossart @ 2017-02-03 23:22 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Jerome Anand



On 02/03/2017 02:11 PM, Takashi Iwai wrote:
> On Fri, 03 Feb 2017 20:47:55 +0100,
> Pierre-Louis Bossart wrote:
>> On 2/3/17 10:44 AM, Takashi Iwai wrote:
>>> This is again a big rewrite of the driver; now it touches the code to
>>> process PCM stream transfers.
>>>
>>> The most fundamental change is that now the driver supports more than
>>> four periods.  Instead of keeping the same index between the ring
>>> buffers (from A to D) and the PCM buffer periods, now we keep
>>> difference indices for both.  Also, for the cases with less periods
>>> than four, we track the head index, too.  That is, we now have four
>>> indices: ringbuf_head, pcm_head, ringbuf_filled, and pcm_filled.
>> Well that's not completely right. The DMA can only generate an
>> interrupt once the buffer you submit was played, and with 4
>> descriptors you can't have more than 4 points where interrupts are
>> generated.  If you program different values in different descriptors
>> then the notion of periodic hardware interrupts will be lost.
> There is a standard trick such as used for ICH or other drivers.
> With this kind of hardware, the address to be written to each buffer
> descriptor can be changed dynamically while streaming.  (BTW, on some
> hardware like HD-audio, it's not allowed, but HD-audio has a larger
> table, so it doesn't matter.)
>
> That is, each BD maps to each period on the PCM buffer, and it moves
> there.  A picture like below would illustrate:
>
> At time=0
>
>    PCM | 0 | 1 | 2 | 3 | 4 | 5 | .... |n-1|
>    BD  | A | B | C | D |
>
> At time=1 (period elapsed)
>
>    PCM | 0 | 1 | 2 | 3 | 4 | 5 | .... |n-1|
>    BD      | B | C | D | A |
>
> At time=2
>
>    PCM | 0 | 1 | 2 | 3 | 4 | 5 | .... |n-1|
>    BD          | C | D | A | B |
Yes I thought this was a case of a moving window being used but what 
sort of application needs this? I never understood the benefits of 
expanding the number of periods, it just forces additional wake-ups for 
no good reason. It's really silly from a power perspective.

>
> and so on.
>
> For the case periods < 4, it works other way round:
>
> t=0
>    PCM | 0 | 1 |
>    BD  | A | B | - | - |
>
> t=1
>    PCM     | 1 | 0 |
>    BD  | - | B | C | - |
>
> t=2
>    PCM         | 0 | 1 |
>    BD  | - | - | C | D |
it can be done this way but I don't believe this is required. I think 
you can mark 2 descriptors as invalid and use only two, i.e. to follow 
your example you'd loop on A-B. The hardware will loop on the 4 
descriptors, ignore the invalid ones and stop if it can't find a valid one.

>
>
>>> By this flexibility, we can use even dmix, which requires 16 periods
>>> as default.
>>>
>>> The buffer size could be up to 20bit, so it's set to that value.  But
>>> the pre-allocation is limited to 128k as default.  It's because the
>>> chip requires the continuous pages (unfortunately no-SG possible),
>>> thus we don't want too much continuous pages.
>> No, that's not true. You need contiguous regions for each descriptor,
>> but the entire buffer doesn't need to be contiguous.
> Yeah, in theory, it's possible, but it actually doesn't help much with
> only four descriptors.  Usually SG buffer management is per page.  And
> if we fix the size per descriptor, it won't fit what user-space apps
> want in most cases.  So practically it mandates the continuous pages.
>
>
> Takashi
> _______________________________________________
> Alsa-devel mailing list
> Alsa-devel@alsa-project.org
> http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/4] ALSA: x86: Refactor PCM process engine
  2017-02-03 23:22       ` Pierre-Louis Bossart
@ 2017-02-04  7:51         ` Takashi Iwai
  2017-02-06 11:22           ` Takashi Iwai
  0 siblings, 1 reply; 26+ messages in thread
From: Takashi Iwai @ 2017-02-04  7:51 UTC (permalink / raw)
  To: Pierre-Louis Bossart; +Cc: alsa-devel, Jerome Anand

On Sat, 04 Feb 2017 00:22:00 +0100,
Pierre-Louis Bossart wrote:
> 
> 
> 
> On 02/03/2017 02:11 PM, Takashi Iwai wrote:
> > On Fri, 03 Feb 2017 20:47:55 +0100,
> > Pierre-Louis Bossart wrote:
> >> On 2/3/17 10:44 AM, Takashi Iwai wrote:
> >>> This is again a big rewrite of the driver; now it touches the code to
> >>> process PCM stream transfers.
> >>>
> >>> The most fundamental change is that now the driver supports more than
> >>> four periods.  Instead of keeping the same index between the ring
> >>> buffers (from A to D) and the PCM buffer periods, now we keep
> >>> difference indices for both.  Also, for the cases with less periods
> >>> than four, we track the head index, too.  That is, we now have four
> >>> indices: ringbuf_head, pcm_head, ringbuf_filled, and pcm_filled.
> >> Well that's not completely right. The DMA can only generate an
> >> interrupt once the buffer you submit was played, and with 4
> >> descriptors you can't have more than 4 points where interrupts are
> >> generated.  If you program different values in different descriptors
> >> then the notion of periodic hardware interrupts will be lost.
> > There is a standard trick such as used for ICH or other drivers.
> > With this kind of hardware, the address to be written to each buffer
> > descriptor can be changed dynamically while streaming.  (BTW, on some
> > hardware like HD-audio, it's not allowed, but HD-audio has a larger
> > table, so it doesn't matter.)
> >
> > That is, each BD maps to each period on the PCM buffer, and it moves
> > there.  A picture like below would illustrate:
> >
> > At time=0
> >
> >    PCM | 0 | 1 | 2 | 3 | 4 | 5 | .... |n-1|
> >    BD  | A | B | C | D |
> >
> > At time=1 (period elapsed)
> >
> >    PCM | 0 | 1 | 2 | 3 | 4 | 5 | .... |n-1|
> >    BD      | B | C | D | A |
> >
> > At time=2
> >
> >    PCM | 0 | 1 | 2 | 3 | 4 | 5 | .... |n-1|
> >    BD          | C | D | A | B |
> Yes I thought this was a case of a moving window being used but what
> sort of application needs this? I never understood the benefits of
> expanding the number of periods, it just forces additional wake-ups
> for no good reason. It's really silly from a power perspective.

The usage like dmix requires more periods for more finer wakeups.
With the old driver with the fixed 4 periods, dmix doesn't work
properly.  The dmix needs the synced timing control over multiple
streams, and the period wakeup is used.  It could be rewritten with
POSIX timer, but it'd make things a bit complicated.

> > and so on.
> >
> > For the case periods < 4, it works other way round:
> >
> > t=0
> >    PCM | 0 | 1 |
> >    BD  | A | B | - | - |
> >
> > t=1
> >    PCM     | 1 | 0 |
> >    BD  | - | B | C | - |
> >
> > t=2
> >    PCM         | 0 | 1 |
> >    BD  | - | - | C | D |
> it can be done this way but I don't believe this is required. I think
> you can mark 2 descriptors as invalid and use only two, i.e. to follow
> your example you'd loop on A-B. The hardware will loop on the 4
> descriptors, ignore the invalid ones and stop if it can't find a valid
> one.

Ah, that's good to know.  So I just keep to set the rest invalid.
(The code change would be likely only one or two line reduction,
 though :)


thanks,

Takashi

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/4] ALSA: x86: Don't pass SNDRV_PCM_INFO_BATCH flag
  2017-02-03 23:13       ` Pierre-Louis Bossart
@ 2017-02-04  7:57         ` Takashi Iwai
  2017-02-06 19:01           ` Takashi Iwai
  0 siblings, 1 reply; 26+ messages in thread
From: Takashi Iwai @ 2017-02-04  7:57 UTC (permalink / raw)
  To: Pierre-Louis Bossart; +Cc: alsa-devel, Jerome Anand

On Sat, 04 Feb 2017 00:13:49 +0100,
Pierre-Louis Bossart wrote:
> 
> 
> 
> On 02/03/2017 02:13 PM, Takashi Iwai wrote:
> > On Fri, 03 Feb 2017 20:39:49 +0100,
> > Pierre-Louis Bossart wrote:
> >> On 2/3/17 10:43 AM, Takashi Iwai wrote:
> >>> The PCM engine on LPE audio isn't like a batch-style process, but
> >>> rather it deals with the standard ring buffer.  Passing the BATCH info
> >>> flag is inappropriate.
> >> Humm, this is controversial. There are 4 DMA descriptors and getting a
> >> precise position in the stream is not straightforward.
> > Well, as far as I've tested, it is.  The buffer length register keeps
> > the remaining bytes, and you can easily read out and compute the
> > current position in fairly exact manner.  Even the old driver has that
> > information -- the patch David added does it.
> Yes, and I don't think anyone on the Intel side quite understood what
> David did there. The code didn't seem quite right.

The David's code should be mostly OK.  It just reads the length buffer
register that keeps the remaining bytes on the current BD.

The only doubtful part in the original code is the handling when it
reads zero or -1 from the length register.  It tries to ignore for a
couple of times.  The zero read-back may actually happen at any time,
and actually it means that we've missed one irq handling.  OTOH, a
negative value must not happen, and it means some serious hardware
issue, and we should stop streaming at this point.

> >> Rewind is also
> >> not supported so if you remove the BATCH flag you will push PulseAudio
> >> into doing timer-based scheduling and rewinds that probably don't
> >> work.
> > I don't think there is much difference regarding this.
> > Please check the buffer manage description in my previous post
> We've never tested the hardware in those configurations so I'd like a
> bit more time to double check how well this might work. None of us has
> a clear view of how much buffering and pre-fetch is done by the DMA
> engine so if you rewind 'too much' you may be out of luck.

Right, that's unclear part, especially for me without any hardware
datasheet ;)


> > In anyway, it'd be appreciated if you can test on your hardware.
> > I could test only on a single machine.
> I can test more but only in 10 days from now so if we could delay this
> patch a bit it'd be better.

OK, I can postpone this change and keep BATCH and DOUBLE.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/4] ALSA: x86: Refactor PCM process engine
  2017-02-04  7:51         ` Takashi Iwai
@ 2017-02-06 11:22           ` Takashi Iwai
  2017-02-06 15:46             ` Pierre-Louis Bossart
  0 siblings, 1 reply; 26+ messages in thread
From: Takashi Iwai @ 2017-02-06 11:22 UTC (permalink / raw)
  To: Pierre-Louis Bossart; +Cc: alsa-devel, Jerome Anand

On Sat, 04 Feb 2017 08:51:54 +0100,
Takashi Iwai wrote:
> 
> > > and so on.
> > >
> > > For the case periods < 4, it works other way round:
> > >
> > > t=0
> > >    PCM | 0 | 1 |
> > >    BD  | A | B | - | - |
> > >
> > > t=1
> > >    PCM     | 1 | 0 |
> > >    BD  | - | B | C | - |
> > >
> > > t=2
> > >    PCM         | 0 | 1 |
> > >    BD  | - | - | C | D |
> > it can be done this way but I don't believe this is required. I think
> > you can mark 2 descriptors as invalid and use only two, i.e. to follow
> > your example you'd loop on A-B. The hardware will loop on the 4
> > descriptors, ignore the invalid ones and stop if it can't find a valid
> > one.
> 
> Ah, that's good to know.  So I just keep to set the rest invalid.
> (The code change would be likely only one or two line reduction,
>  though :)

Below is the revised patch.  I ended up adding more comments,
refactoring the codes with one more field removal.

Also, the default buffer pre-allocation size isn't changed; no need to
reduce the size if it worked beforehand.


Takashi

-- 8< --
From: Takashi Iwai <tiwai@suse.de>
Subject: [PATCH v2] ALSA: x86: Refactor PCM process engine

This is again a big rewrite of the driver; now it touches the code to
process PCM stream transfers.

The most fundamental change is that the driver may support more than
four periods.  Instead of keeping the same index between both the ring
buffer (with the fixed four buffer descriptors) and the PCM buffer
periods, we keep difference indices for both (bd_head and pcm_head
fields).  In addition, when the periods are more than four, we need to
track both head and next indices.  That is, we now have three indices:
bd_head, pcm_head and pcm_filled.

Also, the driver works better for periods < 4, too: the remaining BDs
out of four are marked as invalid, so that the hardware skips those
BDs in its loop.

By this flexibility, we can use even ALSA-lib dmix plugin, which
requires 16 periods as default.

The buffer size could be up to 20bit, so the max buffer size was
increased accordingly.  However, the buffer pre-allocation is kept as
the old value (600kB) as default.  The reason is the limited number of
BDs: since it doesn't suffice for the useful SG page management that
can fit with the usual page allocator like some other drivers, we have
to still allocate continuous pages, hence we shouldn't take too big
memories there.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/x86/intel_hdmi_audio.c     | 536 ++++++++++++++++-----------------------
 sound/x86/intel_hdmi_audio.h     |  24 +-
 sound/x86/intel_hdmi_lpe_audio.h |  25 +-
 3 files changed, 231 insertions(+), 354 deletions(-)

diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c
index 57042ef3a480..0e30e03404ea 100644
--- a/sound/x86/intel_hdmi_audio.c
+++ b/sound/x86/intel_hdmi_audio.c
@@ -622,82 +622,6 @@ static void had_prog_dip(struct snd_pcm_substream *substream,
 	had_write_register(intelhaddata, AUD_CNTL_ST, ctrl_state.regval);
 }
 
-/*
- * Programs buffer address and length registers
- * This function programs ring buffer address and length into registers.
- */
-static int snd_intelhad_prog_buffer(struct snd_pcm_substream *substream,
-				    struct snd_intelhad *intelhaddata,
-				    int start, int end)
-{
-	u32 ring_buf_addr, ring_buf_size, period_bytes;
-	u8 i, num_periods;
-
-	ring_buf_addr = substream->runtime->dma_addr;
-	ring_buf_size = snd_pcm_lib_buffer_bytes(substream);
-	intelhaddata->stream_info.ring_buf_size = ring_buf_size;
-	period_bytes = frames_to_bytes(substream->runtime,
-				substream->runtime->period_size);
-	num_periods = substream->runtime->periods;
-
-	/*
-	 * buffer addr should  be 64 byte aligned, period bytes
-	 * will be used to calculate addr offset
-	 */
-	period_bytes &= ~0x3F;
-
-	/* Hardware supports MAX_PERIODS buffers */
-	if (end >= HAD_MAX_PERIODS)
-		return -EINVAL;
-
-	for (i = start; i <= end; i++) {
-		/* Program the buf registers with addr and len */
-		intelhaddata->buf_info[i].buf_addr = ring_buf_addr +
-							 (i * period_bytes);
-		if (i < num_periods-1)
-			intelhaddata->buf_info[i].buf_size = period_bytes;
-		else
-			intelhaddata->buf_info[i].buf_size = ring_buf_size -
-							(i * period_bytes);
-
-		had_write_register(intelhaddata,
-				   AUD_BUF_A_ADDR + (i * HAD_REG_WIDTH),
-					intelhaddata->buf_info[i].buf_addr |
-					BIT(0) | BIT(1));
-		had_write_register(intelhaddata,
-				   AUD_BUF_A_LENGTH + (i * HAD_REG_WIDTH),
-					period_bytes);
-		intelhaddata->buf_info[i].is_valid = true;
-	}
-	dev_dbg(intelhaddata->dev, "%s:buf[%d-%d] addr=%#x  and size=%d\n",
-		__func__, start, end,
-		intelhaddata->buf_info[start].buf_addr,
-		intelhaddata->buf_info[start].buf_size);
-	intelhaddata->valid_buf_cnt = num_periods;
-	return 0;
-}
-
-static int snd_intelhad_read_len(struct snd_intelhad *intelhaddata)
-{
-	int i, retval = 0;
-	u32 len[4];
-
-	for (i = 0; i < 4 ; i++) {
-		had_read_register(intelhaddata,
-				  AUD_BUF_A_LENGTH + (i * HAD_REG_WIDTH),
-				  &len[i]);
-		if (!len[i])
-			retval++;
-	}
-	if (retval != 1) {
-		for (i = 0; i < 4 ; i++)
-			dev_dbg(intelhaddata->dev, "buf[%d] size=%d\n",
-				i, len[i]);
-	}
-
-	return retval;
-}
-
 static int had_calculate_maud_value(u32 aud_samp_freq, u32 link_rate)
 {
 	u32 maud_val;
@@ -885,33 +809,217 @@ static int had_prog_n(u32 aud_samp_freq, u32 *n_param,
 	return 0;
 }
 
+/*
+ * PCM ring buffer handling
+ *
+ * The hardware provides a ring buffer with the fixed 4 buffer descriptors
+ * (BDs).  The driver maps these 4 BDs onto the PCM ring buffer.  The mapping
+ * moves at each period elapsed.  The below illustrates how it works:
+ *
+ * At time=0
+ *  PCM | 0 | 1 | 2 | 3 | 4 | 5 | .... |n-1|
+ *  BD  | 0 | 1 | 2 | 3 |
+ *
+ * At time=1 (period elapsed)
+ *  PCM | 0 | 1 | 2 | 3 | 4 | 5 | .... |n-1|
+ *  BD      | 1 | 2 | 3 | 0 |
+ *
+ * At time=2 (second period elapsed)
+ *  PCM | 0 | 1 | 2 | 3 | 4 | 5 | .... |n-1|
+ *  BD          | 2 | 3 | 0 | 1 |
+ *
+ * The bd_head field points to the index of the BD to be read.  It's also the
+ * position to be filled at next.  The pcm_head and the pcm_filled fields
+ * point to the indices of the current position and of the next position to
+ * be filled, respectively.  For PCM buffer there are both _head and _filled
+ * because they may be difference when nperiods > 4.  For example, in the
+ * example above at t=1, bd_head=1 and pcm_head=1 while pcm_filled=5:
+ *
+ * pcm_head (=1) --v               v-- pcm_filled (=5)
+ *       PCM | 0 | 1 | 2 | 3 | 4 | 5 | .... |n-1|
+ *       BD      | 1 | 2 | 3 | 0 |
+ *  bd_head (=1) --^               ^-- next to fill (= bd_head)
+ *
+ * For nperiods < 4, the remaining BDs out of 4 are marked as invalid, so that
+ * the hardware skips those BDs in the loop.
+ */
+
+#define AUD_BUF_ADDR(x)		(AUD_BUF_A_ADDR + (x) * HAD_REG_WIDTH)
+#define AUD_BUF_LEN(x)		(AUD_BUF_A_LENGTH + (x) * HAD_REG_WIDTH)
+
+/* Set up a buffer descriptor at the "filled" position */
+static void had_prog_bd(struct snd_pcm_substream *substream,
+			struct snd_intelhad *intelhaddata)
+{
+	int idx = intelhaddata->bd_head;
+	int ofs = intelhaddata->pcmbuf_filled * intelhaddata->period_bytes;
+	u32 addr = substream->runtime->dma_addr + ofs;
+
+	addr |= AUD_BUF_VALID | AUD_BUF_INTR_EN;
+	had_write_register(intelhaddata, AUD_BUF_ADDR(idx), addr);
+	had_write_register(intelhaddata, AUD_BUF_LEN(idx),
+			   intelhaddata->period_bytes);
+
+	/* advance the indices to the next */
+	intelhaddata->bd_head++;
+	intelhaddata->bd_head %= intelhaddata->num_bds;
+	intelhaddata->pcmbuf_filled++;
+	intelhaddata->pcmbuf_filled %= substream->runtime->periods;
+}
+
+/* invalidate a buffer descriptor with the given index */
+static void had_invalidate_bd(struct snd_intelhad *intelhaddata,
+			      int idx)
+{
+	had_write_register(intelhaddata, AUD_BUF_ADDR(idx), 0);
+	had_write_register(intelhaddata, AUD_BUF_LEN(idx), 0);
+}
+
+/* Initial programming of ring buffer */
+static void had_init_ringbuf(struct snd_pcm_substream *substream,
+			     struct snd_intelhad *intelhaddata)
+{
+	struct snd_pcm_runtime *runtime = substream->runtime;
+	int i, num_periods;
+
+	num_periods = runtime->periods;
+	intelhaddata->num_bds = min(num_periods, HAD_NUM_OF_RING_BUFS);
+	intelhaddata->period_bytes =
+		frames_to_bytes(runtime, runtime->period_size);
+	WARN_ON(intelhaddata->period_bytes & 0x3f);
+
+	intelhaddata->bd_head = 0;
+	intelhaddata->pcmbuf_head = 0;
+	intelhaddata->pcmbuf_filled = 0;
+
+	for (i = 0; i < HAD_NUM_OF_RING_BUFS; i++) {
+		if (i < num_periods)
+			had_prog_bd(substream, intelhaddata);
+		else /* invalidate the rest */
+			had_invalidate_bd(intelhaddata, i);
+	}
+
+	intelhaddata->bd_head = 0; /* reset at head again before starting */
+}
+
+/* process a bd, advance to the next */
+static void had_advance_ringbuf(struct snd_pcm_substream *substream,
+				struct snd_intelhad *intelhaddata)
+{
+	int num_periods = substream->runtime->periods;
+
+	/* reprogram the next buffer */
+	had_prog_bd(substream, intelhaddata);
+
+	/* proceed to next */
+	intelhaddata->pcmbuf_head++;
+	intelhaddata->pcmbuf_head %= num_periods;
+}
+
+/* process the current BD(s);
+ * returns the current PCM buffer byte position, or -EPIPE for underrun.
+ */
+static int had_process_ringbuf(struct snd_pcm_substream *substream,
+			       struct snd_intelhad *intelhaddata)
+{
+	int len, processed;
+	unsigned long flags;
+
+	processed = 0;
+	spin_lock_irqsave(&intelhaddata->had_spinlock, flags);
+	for (;;) {
+		/* get the remaining bytes on the buffer */
+		had_read_register(intelhaddata,
+				  AUD_BUF_LEN(intelhaddata->bd_head),
+				  &len);
+		if (len < 0 || len > intelhaddata->period_bytes) {
+			dev_dbg(intelhaddata->dev, "Invalid buf length %d\n",
+				len);
+			len = -EPIPE;
+			goto out;
+		}
+
+		if (len > 0) /* OK, this is the current buffer */
+			break;
+
+		/* len=0 => already empty, check the next buffer */
+		if (++processed >= intelhaddata->num_bds) {
+			len = -EPIPE; /* all empty? - report underrun */
+			goto out;
+		}
+		had_advance_ringbuf(substream, intelhaddata);
+	}
+
+	len = intelhaddata->period_bytes - len;
+	len += intelhaddata->period_bytes * intelhaddata->pcmbuf_head;
+ out:
+	spin_unlock_irqrestore(&intelhaddata->had_spinlock, flags);
+	return len;
+}
+
+/* called from irq handler */
+static void had_process_buffer_done(struct snd_intelhad *intelhaddata)
+{
+	struct snd_pcm_substream *substream;
+
+	if (!intelhaddata->connected)
+		return; /* disconnected? - bail out */
+
+	substream = had_substream_get(intelhaddata);
+	if (!substream)
+		return; /* no stream? - bail out */
+
+	/* process or stop the stream */
+	if (had_process_ringbuf(substream, intelhaddata) < 0)
+		snd_pcm_stop_xrun(substream);
+	else
+		snd_pcm_period_elapsed(substream);
+
+	had_substream_put(intelhaddata);
+}
+
 #define MAX_CNT			0xFF
 
-static void snd_intelhad_handle_underrun(struct snd_intelhad *intelhaddata)
+/*
+ * The interrupt status 'sticky' bits might not be cleared by
+ * setting '1' to that bit once...
+ */
+static void wait_clear_underrun_bit(struct snd_intelhad *intelhaddata)
+{
+	int i;
+	u32 val;
+
+	for (i = 0; i < MAX_CNT; i++) {
+		/* clear bit30, 31 AUD_HDMI_STATUS */
+		had_read_register(intelhaddata, AUD_HDMI_STATUS, &val);
+		if (!(val & AUD_CONFIG_MASK_UNDERRUN))
+			return;
+		had_write_register(intelhaddata, AUD_HDMI_STATUS, val);
+	}
+	dev_err(intelhaddata->dev, "Unable to clear UNDERRUN bits\n");
+}
+
+/* called from irq handler */
+static void had_process_buffer_underrun(struct snd_intelhad *intelhaddata)
 {
-	u32 hdmi_status = 0, i = 0;
+	struct snd_pcm_substream *substream;
 
 	/* Handle Underrun interrupt within Audio Unit */
 	had_write_register(intelhaddata, AUD_CONFIG, 0);
 	/* Reset buffer pointers */
 	had_reset_audio(intelhaddata);
-	/*
-	 * The interrupt status 'sticky' bits might not be cleared by
-	 * setting '1' to that bit once...
-	 */
-	do { /* clear bit30, 31 AUD_HDMI_STATUS */
-		had_read_register(intelhaddata, AUD_HDMI_STATUS,
-				  &hdmi_status);
-		dev_dbg(intelhaddata->dev, "HDMI status =0x%x\n", hdmi_status);
-		if (hdmi_status & AUD_CONFIG_MASK_UNDERRUN) {
-			i++;
-			had_write_register(intelhaddata,
-					   AUD_HDMI_STATUS, hdmi_status);
-		} else
-			break;
-	} while (i < MAX_CNT);
-	if (i >= MAX_CNT)
-		dev_err(intelhaddata->dev, "Unable to clear UNDERRUN bits\n");
+
+	wait_clear_underrun_bit(intelhaddata);
+
+	if (!intelhaddata->connected)
+		return; /* disconnected? - bail out */
+
+	/* Report UNDERRUN error to above layers */
+	substream = had_substream_get(intelhaddata);
+	if (substream) {
+		snd_pcm_stop_xrun(substream);
+		had_substream_put(intelhaddata);
+	}
 }
 
 /*
@@ -957,11 +1065,6 @@ static int had_pcm_open(struct snd_pcm_substream *substream)
 	intelhaddata->stream_info.substream_refcount++;
 	spin_unlock_irq(&intelhaddata->had_spinlock);
 
-	/* these are cleared in prepare callback, but just to be sure */
-	intelhaddata->curr_buf = 0;
-	intelhaddata->underrun_count = 0;
-	intelhaddata->stream_info.buffer_rendered = 0;
-
 	return retval;
  error:
 	pm_runtime_put(intelhaddata->dev);
@@ -1123,10 +1226,6 @@ static int had_pcm_prepare(struct snd_pcm_substream *substream)
 	dev_dbg(intelhaddata->dev, "rate=%d\n", runtime->rate);
 	dev_dbg(intelhaddata->dev, "channels=%d\n", runtime->channels);
 
-	intelhaddata->curr_buf = 0;
-	intelhaddata->underrun_count = 0;
-	intelhaddata->stream_info.buffer_rendered = 0;
-
 	/* Get N value in KHz */
 	disp_samp_freq = intelhaddata->tmds_clock_speed;
 
@@ -1148,8 +1247,7 @@ static int had_pcm_prepare(struct snd_pcm_substream *substream)
 	retval = had_init_audio_ctrl(substream, intelhaddata);
 
 	/* Prog buffer address */
-	retval = snd_intelhad_prog_buffer(substream, intelhaddata,
-			HAD_BUF_TYPE_A, HAD_BUF_TYPE_D);
+	had_init_ringbuf(substream, intelhaddata);
 
 	/*
 	 * Program channel mapping in following order:
@@ -1168,48 +1266,17 @@ static int had_pcm_prepare(struct snd_pcm_substream *substream)
 static snd_pcm_uframes_t had_pcm_pointer(struct snd_pcm_substream *substream)
 {
 	struct snd_intelhad *intelhaddata;
-	u32 bytes_rendered = 0;
-	u32 t;
-	int buf_id;
+	int len;
 
 	intelhaddata = snd_pcm_substream_chip(substream);
 
 	if (!intelhaddata->connected)
 		return SNDRV_PCM_POS_XRUN;
 
-	/* Use a hw register to calculate sub-period position reports.
-	 * This makes PulseAudio happier.
-	 */
-
-	buf_id = intelhaddata->curr_buf % 4;
-	had_read_register(intelhaddata,
-			  AUD_BUF_A_LENGTH + (buf_id * HAD_REG_WIDTH), &t);
-
-	if ((t == 0) || (t == ((u32)-1L))) {
-		intelhaddata->underrun_count++;
-		dev_dbg(intelhaddata->dev,
-			"discovered buffer done for buf %d, count = %d\n",
-			 buf_id, intelhaddata->underrun_count);
-
-		if (intelhaddata->underrun_count > (HAD_MIN_PERIODS/2)) {
-			dev_dbg(intelhaddata->dev,
-				"assume audio_codec_reset, underrun = %d - do xrun\n",
-				 intelhaddata->underrun_count);
-			return SNDRV_PCM_POS_XRUN;
-		}
-	} else {
-		/* Reset Counter */
-		intelhaddata->underrun_count = 0;
-	}
-
-	t = intelhaddata->buf_info[buf_id].buf_size - t;
-
-	if (intelhaddata->stream_info.buffer_rendered)
-		div_u64_rem(intelhaddata->stream_info.buffer_rendered,
-			intelhaddata->stream_info.ring_buf_size,
-			&(bytes_rendered));
-
-	return bytes_to_frames(substream->runtime, bytes_rendered + t);
+	len = had_process_ringbuf(substream, intelhaddata);
+	if (len < 0)
+		return SNDRV_PCM_POS_XRUN;
+	return bytes_to_frames(substream->runtime, len);
 }
 
 /*
@@ -1278,179 +1345,9 @@ static int had_process_mode_change(struct snd_intelhad *intelhaddata)
 	return retval;
 }
 
-static inline int had_chk_intrmiss(struct snd_intelhad *intelhaddata,
-		enum intel_had_aud_buf_type buf_id)
-{
-	int i, intr_count = 0;
-	enum intel_had_aud_buf_type buff_done;
-	u32 buf_size, buf_addr;
-
-	buff_done = buf_id;
-
-	intr_count = snd_intelhad_read_len(intelhaddata);
-	if (intr_count > 1) {
-		/* In case of active playback */
-		dev_err(intelhaddata->dev,
-			"Driver detected %d missed buffer done interrupt(s)\n",
-			(intr_count - 1));
-		if (intr_count > 3)
-			return intr_count;
-
-		buf_id += (intr_count - 1);
-		/* Reprogram registers*/
-		for (i = buff_done; i < buf_id; i++) {
-			int j = i % 4;
-
-			buf_size = intelhaddata->buf_info[j].buf_size;
-			buf_addr = intelhaddata->buf_info[j].buf_addr;
-			had_write_register(intelhaddata,
-					   AUD_BUF_A_LENGTH +
-					   (j * HAD_REG_WIDTH), buf_size);
-			had_write_register(intelhaddata,
-					   AUD_BUF_A_ADDR+(j * HAD_REG_WIDTH),
-					   (buf_addr | BIT(0) | BIT(1)));
-		}
-		buf_id = buf_id % 4;
-		intelhaddata->buff_done = buf_id;
-	}
-
-	return intr_count;
-}
-
-/* called from irq handler */
-static int had_process_buffer_done(struct snd_intelhad *intelhaddata)
-{
-	u32 len = 1;
-	enum intel_had_aud_buf_type buf_id;
-	enum intel_had_aud_buf_type buff_done;
-	struct pcm_stream_info *stream;
-	struct snd_pcm_substream *substream;
-	u32 buf_size;
-	int intr_count;
-	unsigned long flags;
-
-	stream = &intelhaddata->stream_info;
-	intr_count = 1;
-
-	spin_lock_irqsave(&intelhaddata->had_spinlock, flags);
-	if (!intelhaddata->connected) {
-		spin_unlock_irqrestore(&intelhaddata->had_spinlock, flags);
-		dev_dbg(intelhaddata->dev,
-			"%s:Device already disconnected\n", __func__);
-		return 0;
-	}
-	buf_id = intelhaddata->curr_buf;
-	intelhaddata->buff_done = buf_id;
-	buff_done = intelhaddata->buff_done;
-	buf_size = intelhaddata->buf_info[buf_id].buf_size;
-
-	/* Every debug statement has an implication
-	 * of ~5msec. Thus, avoid having >3 debug statements
-	 * for each buffer_done handling.
-	 */
-
-	/* Check for any intr_miss in case of active playback */
-	if (stream->running) {
-		intr_count = had_chk_intrmiss(intelhaddata, buf_id);
-		if (!intr_count || (intr_count > 3)) {
-			spin_unlock_irqrestore(&intelhaddata->had_spinlock,
-					       flags);
-			dev_err(intelhaddata->dev,
-				"HAD SW state in non-recoverable mode\n");
-			return 0;
-		}
-		buf_id += (intr_count - 1);
-		buf_id = buf_id % 4;
-	}
-
-	intelhaddata->buf_info[buf_id].is_valid = true;
-	if (intelhaddata->valid_buf_cnt-1 == buf_id) {
-		if (stream->running)
-			intelhaddata->curr_buf = HAD_BUF_TYPE_A;
-	} else
-		intelhaddata->curr_buf = buf_id + 1;
-
-	spin_unlock_irqrestore(&intelhaddata->had_spinlock, flags);
-
-	if (!intelhaddata->connected) {
-		dev_dbg(intelhaddata->dev, "HDMI cable plugged-out\n");
-		return 0;
-	}
-
-	/* Reprogram the registers with addr and length */
-	had_write_register(intelhaddata,
-			   AUD_BUF_A_LENGTH + (buf_id * HAD_REG_WIDTH),
-			   buf_size);
-	had_write_register(intelhaddata,
-			   AUD_BUF_A_ADDR + (buf_id * HAD_REG_WIDTH),
-			   intelhaddata->buf_info[buf_id].buf_addr |
-			   BIT(0) | BIT(1));
-
-	had_read_register(intelhaddata,
-			  AUD_BUF_A_LENGTH + (buf_id * HAD_REG_WIDTH),
-			  &len);
-	dev_dbg(intelhaddata->dev, "%s:Enabled buf[%d]\n", __func__, buf_id);
-
-	/* In case of actual data,
-	 * report buffer_done to above ALSA layer
-	 */
-	substream = had_substream_get(intelhaddata);
-	if (substream) {
-		buf_size = intelhaddata->buf_info[buf_id].buf_size;
-		intelhaddata->stream_info.buffer_rendered +=
-			(intr_count * buf_size);
-		snd_pcm_period_elapsed(substream);
-		had_substream_put(intelhaddata);
-	}
-
-	return 0;
-}
-
-/* called from irq handler */
-static int had_process_buffer_underrun(struct snd_intelhad *intelhaddata)
-{
-	enum intel_had_aud_buf_type buf_id;
-	struct pcm_stream_info *stream;
-	struct snd_pcm_substream *substream;
-	unsigned long flags;
-	int connected;
-
-	stream = &intelhaddata->stream_info;
-
-	spin_lock_irqsave(&intelhaddata->had_spinlock, flags);
-	buf_id = intelhaddata->curr_buf;
-	intelhaddata->buff_done = buf_id;
-	connected = intelhaddata->connected;
-	if (stream->running)
-		intelhaddata->curr_buf = HAD_BUF_TYPE_A;
-
-	spin_unlock_irqrestore(&intelhaddata->had_spinlock, flags);
-
-	dev_dbg(intelhaddata->dev, "Enter:%s buf_id=%d, stream_running=%d\n",
-			__func__, buf_id, stream->running);
-
-	snd_intelhad_handle_underrun(intelhaddata);
-
-	if (!connected) {
-		dev_dbg(intelhaddata->dev,
-			"%s:Device already disconnected\n", __func__);
-		return 0;
-	}
-
-	/* Report UNDERRUN error to above layers */
-	substream = had_substream_get(intelhaddata);
-	if (substream) {
-		snd_pcm_stop_xrun(substream);
-		had_substream_put(intelhaddata);
-	}
-
-	return 0;
-}
-
 /* process hot plug, called from wq with mutex locked */
 static void had_process_hot_plug(struct snd_intelhad *intelhaddata)
 {
-	enum intel_had_aud_buf_type buf_id;
 	struct snd_pcm_substream *substream;
 
 	spin_lock_irq(&intelhaddata->had_spinlock);
@@ -1460,17 +1357,12 @@ static void had_process_hot_plug(struct snd_intelhad *intelhaddata)
 		return;
 	}
 
-	buf_id = intelhaddata->curr_buf;
-	intelhaddata->buff_done = buf_id;
 	intelhaddata->connected = true;
 	dev_dbg(intelhaddata->dev,
 		"%s @ %d:DEBUG PLUG/UNPLUG : HAD_DRV_CONNECTED\n",
 			__func__, __LINE__);
 	spin_unlock_irq(&intelhaddata->had_spinlock);
 
-	dev_dbg(intelhaddata->dev, "Processing HOT_PLUG, buf_id = %d\n",
-		buf_id);
-
 	/* Safety check */
 	substream = had_substream_get(intelhaddata);
 	if (substream) {
@@ -1487,11 +1379,8 @@ static void had_process_hot_plug(struct snd_intelhad *intelhaddata)
 /* process hot unplug, called from wq with mutex locked */
 static void had_process_hot_unplug(struct snd_intelhad *intelhaddata)
 {
-	enum intel_had_aud_buf_type buf_id;
 	struct snd_pcm_substream *substream;
 
-	buf_id = intelhaddata->curr_buf;
-
 	substream = had_substream_get(intelhaddata);
 
 	spin_lock_irq(&intelhaddata->had_spinlock);
@@ -1862,13 +1751,12 @@ static int hdmi_lpe_audio_probe(struct platform_device *pdev)
 	dma_set_mask(&pdev->dev, DMA_BIT_MASK(32));
 	dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32));
 
-	/* allocate dma pages for ALSA stream operations
-	 * memory allocated is based on size, not max value
-	 * thus using same argument for max & size
+	/* allocate dma pages;
+	 * try to allocate 600k buffer as default which is large enough
 	 */
 	snd_pcm_lib_preallocate_pages_for_all(pcm,
 			SNDRV_DMA_TYPE_DEV, NULL,
-			HAD_MAX_BUFFER, HAD_MAX_BUFFER);
+			HAD_DEFAULT_BUFFER, HAD_MAX_BUFFER);
 
 	/* create controls */
 	for (i = 0; i < ARRAY_SIZE(had_controls); i++) {
diff --git a/sound/x86/intel_hdmi_audio.h b/sound/x86/intel_hdmi_audio.h
index 9f713a8a88bc..7e2546b853ca 100644
--- a/sound/x86/intel_hdmi_audio.h
+++ b/sound/x86/intel_hdmi_audio.h
@@ -64,24 +64,15 @@
 
 struct pcm_stream_info {
 	struct snd_pcm_substream *substream;
-	u64		buffer_rendered;
-	u32		ring_buf_size;
 	int substream_refcount;
 	bool running;
 };
 
-struct ring_buf_info {
-	u32	buf_addr;
-	u32	buf_size;
-	u8	is_valid;
-};
-
 /*
  * struct snd_intelhad - intelhad driver structure
  *
  * @card: ptr to hold card details
  * @connected: the monitor connection status
- * @buf_info: ring buffer info
  * @stream_info: stream information
  * @eld: holds ELD info
  * @curr_buf: pointer to hold current active ring buf
@@ -91,26 +82,29 @@ struct ring_buf_info {
  * @buff_done: id of current buffer done intr
  * @dev: platoform device handle
  * @chmap: holds channel map info
- * @underrun_count: PCM stream underrun counter
  */
 struct snd_intelhad {
 	struct snd_card	*card;
 	bool		connected;
-	struct		ring_buf_info buf_info[HAD_NUM_OF_RING_BUFS];
 	struct		pcm_stream_info stream_info;
 	unsigned char	eld[HDMI_MAX_ELD_BYTES];
 	bool dp_output;
-	enum		intel_had_aud_buf_type curr_buf;
-	int		valid_buf_cnt;
 	unsigned int	aes_bits;
 	spinlock_t had_spinlock;
-	enum		intel_had_aud_buf_type buff_done;
 	struct device *dev;
 	struct snd_pcm_chmap *chmap;
-	int underrun_count;
 	int tmds_clock_speed;
 	int link_rate;
 
+	/* ring buffer (BD) position index */
+	unsigned int bd_head;
+	/* PCM buffer position indices */
+	unsigned int pcmbuf_head;	/* being processed */
+	unsigned int pcmbuf_filled;	/* to be filled */
+
+	unsigned int num_bds;		/* number of BDs */
+	unsigned int period_bytes;	/* PCM period size in bytes */
+
 	/* internal stuff */
 	int irq;
 	void __iomem *mmio_start;
diff --git a/sound/x86/intel_hdmi_lpe_audio.h b/sound/x86/intel_hdmi_lpe_audio.h
index be9783910a3a..ca4212dca94e 100644
--- a/sound/x86/intel_hdmi_lpe_audio.h
+++ b/sound/x86/intel_hdmi_lpe_audio.h
@@ -28,13 +28,13 @@
 #define HAD_MAX_CHANNEL		8
 #define HAD_NUM_OF_RING_BUFS	4
 
-/* Assume 192KHz, 8channel, 25msec period */
-#define HAD_MAX_BUFFER		(600*1024)
-#define HAD_MIN_BUFFER		(32*1024)
-#define HAD_MAX_PERIODS		4
-#define HAD_MIN_PERIODS		4
-#define HAD_MAX_PERIOD_BYTES	(HAD_MAX_BUFFER/HAD_MIN_PERIODS)
-#define HAD_MIN_PERIOD_BYTES	256
+/* max 20bit address, aligned to 64 */
+#define HAD_MAX_BUFFER		((1024 * 1024 - 1) & ~0x3f)
+#define HAD_DEFAULT_BUFFER	(600 * 1024) /* default prealloc size */
+#define HAD_MAX_PERIODS		256	/* arbitrary, but should suffice */
+#define HAD_MIN_PERIODS		2
+#define HAD_MAX_PERIOD_BYTES	((HAD_MAX_BUFFER / HAD_MIN_PERIODS) & ~0x3f)
+#define HAD_MIN_PERIOD_BYTES	1024	/* might be smaller */
 #define HAD_FIFO_SIZE		0 /* fifo not being used */
 #define MAX_SPEAKERS		8
 
@@ -82,14 +82,6 @@
 /* Naud Value */
 #define DP_NAUD_VAL					32768
 
-/* enum intel_had_aud_buf_type - HDMI controller ring buffer types */
-enum intel_had_aud_buf_type {
-	HAD_BUF_TYPE_A = 0,
-	HAD_BUF_TYPE_B = 1,
-	HAD_BUF_TYPE_C = 2,
-	HAD_BUF_TYPE_D = 3,
-};
-
 /* HDMI Controller register offsets - audio domain common */
 /* Base address for below regs = 0x65000 */
 enum hdmi_ctrl_reg_offset_common {
@@ -274,6 +266,9 @@ union aud_buf_addr {
 	u32 regval;
 };
 
+#define AUD_BUF_VALID		(1U << 0)
+#define AUD_BUF_INTR_EN		(1U << 1)
+
 /* Length of Audio Buffer */
 union aud_buf_len {
 	struct {
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/4] ALSA: x86: Refactor PCM process engine
  2017-02-06 11:22           ` Takashi Iwai
@ 2017-02-06 15:46             ` Pierre-Louis Bossart
  2017-02-06 15:54               ` Takashi Iwai
  0 siblings, 1 reply; 26+ messages in thread
From: Pierre-Louis Bossart @ 2017-02-06 15:46 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Jerome Anand

Looks nice, with one comment below:


> +/* process a bd, advance to the next */
> +static void had_advance_ringbuf(struct snd_pcm_substream *substream,
> +				struct snd_intelhad *intelhaddata)
> +{
> +	int num_periods = substream->runtime->periods;
> +
> +	/* reprogram the next buffer */
> +	had_prog_bd(substream, intelhaddata);
> +
> +	/* proceed to next */
> +	intelhaddata->pcmbuf_head++;
> +	intelhaddata->pcmbuf_head %= num_periods;
> +}
> +
> +/* process the current BD(s);
> + * returns the current PCM buffer byte position, or -EPIPE for underrun.
> + */
> +static int had_process_ringbuf(struct snd_pcm_substream *substream,
> +			       struct snd_intelhad *intelhaddata)
> +{
> +	int len, processed;
> +	unsigned long flags;
> +
> +	processed = 0;
> +	spin_lock_irqsave(&intelhaddata->had_spinlock, flags);
> +	for (;;) {
> +		/* get the remaining bytes on the buffer */
> +		had_read_register(intelhaddata,
> +				  AUD_BUF_LEN(intelhaddata->bd_head),
> +				  &len);
> +		if (len < 0 || len > intelhaddata->period_bytes) {
> +			dev_dbg(intelhaddata->dev, "Invalid buf length %d\n",
> +				len);
> +			len = -EPIPE;
> +			goto out;
> +		}
> +
> +		if (len > 0) /* OK, this is the current buffer */
> +			break;
> +
> +		/* len=0 => already empty, check the next buffer */
> +		if (++processed >= intelhaddata->num_bds) {
> +			len = -EPIPE; /* all empty? - report underrun */
> +			goto out;
> +		}
> +		had_advance_ringbuf(substream, intelhaddata);
> +	}
> +
> +	len = intelhaddata->period_bytes - len;
> +	len += intelhaddata->period_bytes * intelhaddata->pcmbuf_head;
I don't know if this code is completely correct (and I had similar 
concerns with David's).
If the len==0, then the new buffer descriptor will be used in the next 
iteration. If the register is read immediately, there is a risk that the 
DMA position has not moved and len then becomes 
intelhaddata->period_bytes, but the last line will increase the number 
of bytes by a period. I think there should be a test here to handle this 
corner case.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/4] ALSA: x86: Refactor PCM process engine
  2017-02-06 15:46             ` Pierre-Louis Bossart
@ 2017-02-06 15:54               ` Takashi Iwai
  2017-02-06 16:00                 ` Pierre-Louis Bossart
  0 siblings, 1 reply; 26+ messages in thread
From: Takashi Iwai @ 2017-02-06 15:54 UTC (permalink / raw)
  To: Pierre-Louis Bossart; +Cc: alsa-devel, Jerome Anand

On Mon, 06 Feb 2017 16:46:53 +0100,
Pierre-Louis Bossart wrote:
> 
> Looks nice, with one comment below:
> 
> 
> > +/* process a bd, advance to the next */
> > +static void had_advance_ringbuf(struct snd_pcm_substream *substream,
> > +				struct snd_intelhad *intelhaddata)
> > +{
> > +	int num_periods = substream->runtime->periods;
> > +
> > +	/* reprogram the next buffer */
> > +	had_prog_bd(substream, intelhaddata);
> > +
> > +	/* proceed to next */
> > +	intelhaddata->pcmbuf_head++;
> > +	intelhaddata->pcmbuf_head %= num_periods;
> > +}
> > +
> > +/* process the current BD(s);
> > + * returns the current PCM buffer byte position, or -EPIPE for underrun.
> > + */
> > +static int had_process_ringbuf(struct snd_pcm_substream *substream,
> > +			       struct snd_intelhad *intelhaddata)
> > +{
> > +	int len, processed;
> > +	unsigned long flags;
> > +
> > +	processed = 0;
> > +	spin_lock_irqsave(&intelhaddata->had_spinlock, flags);
> > +	for (;;) {
> > +		/* get the remaining bytes on the buffer */
> > +		had_read_register(intelhaddata,
> > +				  AUD_BUF_LEN(intelhaddata->bd_head),
> > +				  &len);
> > +		if (len < 0 || len > intelhaddata->period_bytes) {
> > +			dev_dbg(intelhaddata->dev, "Invalid buf length %d\n",
> > +				len);
> > +			len = -EPIPE;
> > +			goto out;
> > +		}
> > +
> > +		if (len > 0) /* OK, this is the current buffer */
> > +			break;
> > +
> > +		/* len=0 => already empty, check the next buffer */
> > +		if (++processed >= intelhaddata->num_bds) {
> > +			len = -EPIPE; /* all empty? - report underrun */
> > +			goto out;
> > +		}
> > +		had_advance_ringbuf(substream, intelhaddata);
> > +	}
> > +
> > +	len = intelhaddata->period_bytes - len;
> > +	len += intelhaddata->period_bytes * intelhaddata->pcmbuf_head;
> I don't know if this code is completely correct (and I had similar
> concerns with David's).
> If the len==0, then the new buffer descriptor will be used in the next
> iteration. If the register is read immediately, there is a risk that
> the DMA position has not moved and len then becomes
> intelhaddata->period_bytes, but the last line will increase the number
> of bytes by a period. I think there should be a test here to handle
> this corner case.

That's OK.  When len=0, the loop goes to the next buffer -- i.e.
pcm_buf is also increased.  Then it reads len=period_bytes and quits
the loop.  Now len is re-calculated as
  len = period_bytes - len;
	--> len = 0
  len += period_bytes * pcmbuf_head;
	--> len = new head position in bytes

which is exactly the expected position.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/4] ALSA: x86: Refactor PCM process engine
  2017-02-06 15:54               ` Takashi Iwai
@ 2017-02-06 16:00                 ` Pierre-Louis Bossart
  0 siblings, 0 replies; 26+ messages in thread
From: Pierre-Louis Bossart @ 2017-02-06 16:00 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Jerome Anand



On 02/06/2017 09:54 AM, Takashi Iwai wrote:
> On Mon, 06 Feb 2017 16:46:53 +0100,
> Pierre-Louis Bossart wrote:
>> Looks nice, with one comment below:
>>
>>
>>> +/* process a bd, advance to the next */
>>> +static void had_advance_ringbuf(struct snd_pcm_substream *substream,
>>> +				struct snd_intelhad *intelhaddata)
>>> +{
>>> +	int num_periods = substream->runtime->periods;
>>> +
>>> +	/* reprogram the next buffer */
>>> +	had_prog_bd(substream, intelhaddata);
>>> +
>>> +	/* proceed to next */
>>> +	intelhaddata->pcmbuf_head++;
>>> +	intelhaddata->pcmbuf_head %= num_periods;
>>> +}
>>> +
>>> +/* process the current BD(s);
>>> + * returns the current PCM buffer byte position, or -EPIPE for underrun.
>>> + */
>>> +static int had_process_ringbuf(struct snd_pcm_substream *substream,
>>> +			       struct snd_intelhad *intelhaddata)
>>> +{
>>> +	int len, processed;
>>> +	unsigned long flags;
>>> +
>>> +	processed = 0;
>>> +	spin_lock_irqsave(&intelhaddata->had_spinlock, flags);
>>> +	for (;;) {
>>> +		/* get the remaining bytes on the buffer */
>>> +		had_read_register(intelhaddata,
>>> +				  AUD_BUF_LEN(intelhaddata->bd_head),
>>> +				  &len);
>>> +		if (len < 0 || len > intelhaddata->period_bytes) {
>>> +			dev_dbg(intelhaddata->dev, "Invalid buf length %d\n",
>>> +				len);
>>> +			len = -EPIPE;
>>> +			goto out;
>>> +		}
>>> +
>>> +		if (len > 0) /* OK, this is the current buffer */
>>> +			break;
>>> +
>>> +		/* len=0 => already empty, check the next buffer */
>>> +		if (++processed >= intelhaddata->num_bds) {
>>> +			len = -EPIPE; /* all empty? - report underrun */
>>> +			goto out;
>>> +		}
>>> +		had_advance_ringbuf(substream, intelhaddata);
>>> +	}
>>> +
>>> +	len = intelhaddata->period_bytes - len;
>>> +	len += intelhaddata->period_bytes * intelhaddata->pcmbuf_head;
>> I don't know if this code is completely correct (and I had similar
>> concerns with David's).
>> If the len==0, then the new buffer descriptor will be used in the next
>> iteration. If the register is read immediately, there is a risk that
>> the DMA position has not moved and len then becomes
>> intelhaddata->period_bytes, but the last line will increase the number
>> of bytes by a period. I think there should be a test here to handle
>> this corner case.
> That's OK.  When len=0, the loop goes to the next buffer -- i.e.
> pcm_buf is also increased.  Then it reads len=period_bytes and quits
> the loop.  Now len is re-calculated as
>    len = period_bytes - len;
> 	--> len = 0
>    len += period_bytes * pcmbuf_head;
> 	--> len = new head position in bytes
>
> which is exactly the expected position.
>
>
ok, i guess I need more coffee... David's code did not include the 
additional read and I wanted to check this was fine.
Thanks for the precision.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/4] ALSA: x86: Don't pass SNDRV_PCM_INFO_BATCH flag
  2017-02-04  7:57         ` Takashi Iwai
@ 2017-02-06 19:01           ` Takashi Iwai
  2017-02-06 21:27             ` Pierre-Louis Bossart
  0 siblings, 1 reply; 26+ messages in thread
From: Takashi Iwai @ 2017-02-06 19:01 UTC (permalink / raw)
  To: Pierre-Louis Bossart; +Cc: alsa-devel, Jerome Anand

On Sat, 04 Feb 2017 08:57:08 +0100,
Takashi Iwai wrote:
> 
> > > In anyway, it'd be appreciated if you can test on your hardware.
> > > I could test only on a single machine.
> > I can test more but only in 10 days from now so if we could delay this
> > patch a bit it'd be better.
> 
> OK, I can postpone this change and keep BATCH and DOUBLE.

I've tested more intensively, and this seems working well, so far.
Hopefully the DMA FIFO or fetch timing doesn't play a big role.

BTW, with the combination of this and the latest my PCM rewrite patch,
a more interesting experiment can be done: extend to (a kind of)
no-period-wakeup mode.  Of course, it doesn't work like HD-audio, as
we can't the ring buffer persistently on LPE audio but have to refresh
the buffer descriptors.  But the refresh of BDs is also done at PCM
pointer callback, so it would work as is.  For supporting the case
period=1, though, another trick is needed: namely, we set up 4 BDs
pointing to the same address, so that it won't go out to underrun.

A totally untested patch is below.


thanks,

Takashi

---
diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c
index 0bd77acb9689..be9cabc9f58b 100644
--- a/sound/x86/intel_hdmi_audio.c
+++ b/sound/x86/intel_hdmi_audio.c
@@ -133,7 +133,8 @@ static const struct channel_map_table map_tables[] = {
 static const struct snd_pcm_hardware had_pcm_hardware = {
 	.info =	(SNDRV_PCM_INFO_INTERLEAVED |
 		SNDRV_PCM_INFO_MMAP|
-		SNDRV_PCM_INFO_MMAP_VALID),
+		SNDRV_PCM_INFO_MMAP_VALID|
+		SNDRV_PCM_INFO_NO_PERIOD_WAKEUP),
 	.formats = (SNDRV_PCM_FMTBIT_S24 |
 		SNDRV_PCM_FMTBIT_U24),
 	.rates = SNDRV_PCM_RATE_32000 |
@@ -853,7 +854,9 @@ static void had_prog_bd(struct snd_pcm_substream *substream,
 	int ofs = intelhaddata->pcmbuf_filled * intelhaddata->period_bytes;
 	u32 addr = substream->runtime->dma_addr + ofs;
 
-	addr |= AUD_BUF_VALID | AUD_BUF_INTR_EN;
+	addr |= AUD_BUF_VALID;
+	if (!subsream->runtime->no_period_wakeup)
+		addr |= AUD_BUF_INTR_EN;
 	had_write_register(intelhaddata, AUD_BUF_ADDR(idx), addr);
 	had_write_register(intelhaddata, AUD_BUF_LEN(idx),
 			   intelhaddata->period_bytes);
@@ -881,7 +884,10 @@ static void had_init_ringbuf(struct snd_pcm_substream *substream,
 	int i, num_periods;
 
 	num_periods = runtime->periods;
-	intelhaddata->num_bds = min(num_periods, HAD_NUM_OF_RING_BUFS);
+	if (num_periods == 1)
+		intelhaddata->num_bds = HAD_NUM_OF_RING_BUFS;
+	else
+		intelhaddata->num_bds = min(num_periods, HAD_NUM_OF_RING_BUFS);
 	intelhaddata->period_bytes =
 		frames_to_bytes(runtime, runtime->period_size);
 	WARN_ON(intelhaddata->period_bytes & 0x3f);
@@ -891,7 +897,7 @@ static void had_init_ringbuf(struct snd_pcm_substream *substream,
 	intelhaddata->pcmbuf_filled = 0;
 
 	for (i = 0; i < HAD_NUM_OF_RING_BUFS; i++) {
-		if (i < num_periods)
+		if (i < num_periods || num_periods == 1)
 			had_prog_bd(substream, intelhaddata);
 		else /* invalidate the rest */
 			had_invalidate_bd(intelhaddata, i);
diff --git a/sound/x86/intel_hdmi_lpe_audio.h b/sound/x86/intel_hdmi_lpe_audio.h
index ca4212dca94e..3c2befa721a4 100644
--- a/sound/x86/intel_hdmi_lpe_audio.h
+++ b/sound/x86/intel_hdmi_lpe_audio.h
@@ -32,7 +32,7 @@
 #define HAD_MAX_BUFFER		((1024 * 1024 - 1) & ~0x3f)
 #define HAD_DEFAULT_BUFFER	(600 * 1024) /* default prealloc size */
 #define HAD_MAX_PERIODS		256	/* arbitrary, but should suffice */
-#define HAD_MIN_PERIODS		2
+#define HAD_MIN_PERIODS		1
 #define HAD_MAX_PERIOD_BYTES	((HAD_MAX_BUFFER / HAD_MIN_PERIODS) & ~0x3f)
 #define HAD_MIN_PERIOD_BYTES	1024	/* might be smaller */
 #define HAD_FIFO_SIZE		0 /* fifo not being used */

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/4] Yet another patchset for LPE audio refactoring
  2017-02-03 16:43 [PATCH 0/4] Yet another patchset for LPE audio refactoring Takashi Iwai
                   ` (3 preceding siblings ...)
  2017-02-03 16:44 ` [PATCH 4/4] ALSA: x86: Refactor PCM process engine Takashi Iwai
@ 2017-02-06 19:07 ` Takashi Iwai
  2017-02-06 21:16   ` Pierre-Louis Bossart
  4 siblings, 1 reply; 26+ messages in thread
From: Takashi Iwai @ 2017-02-06 19:07 UTC (permalink / raw)
  To: Pierre-Louis Bossart; +Cc: alsa-devel, Jerome Anand

On Fri, 03 Feb 2017 17:43:56 +0100,
Takashi Iwai wrote:
> 
> Hi,
> 
> this is the final part of my cleanup / refactoring patchsets for Intel
> LPE audio.  Now the PCM stream engine has been rewritten.  It supports
> more flexible configuration, not only fixed 4 periods, for example,
> yet with a lot of code reduction.
> 
> As usual, the patches are found in topic/intel-lpe-audio-cleanup
> branch.

So far, from my side, now most of changes have been submitted /
merged.  But I still have some unclear things in the code.

- The PCM format: the code contains the handling of S16_LE, but it
  doesn't seem work when I enabled the info bit.  What's missing?
  Also U24_LE format is ever supported...?  How?

- The behavior in had_process_buffer_underrun() isn't clear enough.
  Or, more exactly, the behavior of AUD_HDMI_STATUS reg and AUD_CONFIG
  reg look complex, and need a bit more description for readers
  (including me).

Can you guys enlighten a bit?


thanks,

Takashi

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/4] Yet another patchset for LPE audio refactoring
  2017-02-06 19:07 ` [PATCH 0/4] Yet another patchset for LPE audio refactoring Takashi Iwai
@ 2017-02-06 21:16   ` Pierre-Louis Bossart
  2017-02-06 21:45     ` Takashi Iwai
  0 siblings, 1 reply; 26+ messages in thread
From: Pierre-Louis Bossart @ 2017-02-06 21:16 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Jerome Anand

On 2/6/17 1:07 PM, Takashi Iwai wrote:
> On Fri, 03 Feb 2017 17:43:56 +0100,
> Takashi Iwai wrote:
>>
>> Hi,
>>
>> this is the final part of my cleanup / refactoring patchsets for Intel
>> LPE audio.  Now the PCM stream engine has been rewritten.  It supports
>> more flexible configuration, not only fixed 4 periods, for example,
>> yet with a lot of code reduction.
>>
>> As usual, the patches are found in topic/intel-lpe-audio-cleanup
>> branch.
>
> So far, from my side, now most of changes have been submitted /
> merged.  But I still have some unclear things in the code.
>
> - The PCM format: the code contains the handling of S16_LE, but it
>   doesn't seem work when I enabled the info bit.  What's missing?
>   Also U24_LE format is ever supported...?  How?

In the initial version the samples were assumed to be using the 24 lsb 
of a 32-bit word. 16-bit data can be supported as long as it was 
left-shifted before being written to the ring buffer.

For stereo the channels are assumed to be interleaved in Layout 0 (left, 
right). For more than 2ch, all channels for the same sample are inserted 
consecutively (ie. no blocks of channels). In the initial version the 
number of channels is assumed to be even and one bogus channel had to be 
inserted for odd number of channels.

in more recent versions the samples could also be represented as using 
the 24 msb and the requirement for the bogus channel was removed but I 
never worked on this personally and the documentation is far from clear.

At any rate, here's what I see in the description of the 
STREAM_X_LPE_AUD_CONFIG registers for CHT:

Bit 15:
1: DP mode
0: HDMI (default)

Bit14:
1: no bogus sample for odd channels
0: bogus sample for off channels (default)

Bit13:
1: MSB is bit 31 of 32-bit container
0: MSB is bit 23 of 32-bit container (default)

Bit 12:
1: 16-bit container
0: 32-bit container (default)

Bit 11:
1: send silence stream
0: send null packets (default)

I see no evidence that unsigned data is supported.

I would really take all these configurations with caution, it's not 
clear to me if the documentation reflects the actual implementation.

Note that the hardware can swap channels, I don't remember if this is 
currently enabled.

>
> - The behavior in had_process_buffer_underrun() isn't clear enough.
>   Or, more exactly, the behavior of AUD_HDMI_STATUS reg and AUD_CONFIG
>   reg look complex, and need a bit more description for readers
>   (including me).
>
> Can you guys enlighten a bit?

For the LPE_AUDIO_Buffer_config register (65020h), there is one 
important field
10:8: DMA Fifo watermark. The audio unit has 8x64 bytes, this register 
define when to fetch data. Bottom line it'd be wise to avoid rewinding 
below that FIFO size...


There are two possible interrupt sources in the 
STREAM_X_LPE_AUD_HDMI_STATUS (65064h)
Bit 31: sample buffer underrun. HDMI audio unit did not have enough 
samples to insert in video frame (cleared by writing a one)
Bit 29: buffer done status, write 1 to clear
Bit 15: sample buffer underrun interrupt enable

Hope this helps
-Pierre


>
>
> thanks,
>
> Takashi
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/4] ALSA: x86: Don't pass SNDRV_PCM_INFO_BATCH flag
  2017-02-06 19:01           ` Takashi Iwai
@ 2017-02-06 21:27             ` Pierre-Louis Bossart
  2017-02-06 21:51               ` Takashi Iwai
  0 siblings, 1 reply; 26+ messages in thread
From: Pierre-Louis Bossart @ 2017-02-06 21:27 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Jerome Anand

On 2/6/17 1:01 PM, Takashi Iwai wrote:
> On Sat, 04 Feb 2017 08:57:08 +0100,
> Takashi Iwai wrote:
>>
>>>> In anyway, it'd be appreciated if you can test on your hardware.
>>>> I could test only on a single machine.
>>> I can test more but only in 10 days from now so if we could delay this
>>> patch a bit it'd be better.
>>
>> OK, I can postpone this change and keep BATCH and DOUBLE.
>
> I've tested more intensively, and this seems working well, so far.
> Hopefully the DMA FIFO or fetch timing doesn't play a big role.

As I mentioned in the other email, the FIFO can be up to 512 bytes so 
beware. And it makes sense to limit the period size to 1024 bytes minimum.

> BTW, with the combination of this and the latest my PCM rewrite patch,
> a more interesting experiment can be done: extend to (a kind of)
> no-period-wakeup mode.  Of course, it doesn't work like HD-audio, as
> we can't the ring buffer persistently on LPE audio but have to refresh
> the buffer descriptors.  But the refresh of BDs is also done at PCM
> pointer callback, so it would work as is.  For supporting the case
> period=1, though, another trick is needed: namely, we set up 4 BDs
> pointing to the same address, so that it won't go out to underrun.'

Both the pseudo-no-period-wakeup and single period should work but I am 
not sure how useful this is.

>
> A totally untested patch is below.
>
>
> thanks,
>
> Takashi
>
> ---
> diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c
> index 0bd77acb9689..be9cabc9f58b 100644
> --- a/sound/x86/intel_hdmi_audio.c
> +++ b/sound/x86/intel_hdmi_audio.c
> @@ -133,7 +133,8 @@ static const struct channel_map_table map_tables[] = {
>  static const struct snd_pcm_hardware had_pcm_hardware = {
>  	.info =	(SNDRV_PCM_INFO_INTERLEAVED |
>  		SNDRV_PCM_INFO_MMAP|
> -		SNDRV_PCM_INFO_MMAP_VALID),
> +		SNDRV_PCM_INFO_MMAP_VALID|
> +		SNDRV_PCM_INFO_NO_PERIOD_WAKEUP),
>  	.formats = (SNDRV_PCM_FMTBIT_S24 |
>  		SNDRV_PCM_FMTBIT_U24),
>  	.rates = SNDRV_PCM_RATE_32000 |
> @@ -853,7 +854,9 @@ static void had_prog_bd(struct snd_pcm_substream *substream,
>  	int ofs = intelhaddata->pcmbuf_filled * intelhaddata->period_bytes;
>  	u32 addr = substream->runtime->dma_addr + ofs;
>
> -	addr |= AUD_BUF_VALID | AUD_BUF_INTR_EN;
> +	addr |= AUD_BUF_VALID;
> +	if (!subsream->runtime->no_period_wakeup)
> +		addr |= AUD_BUF_INTR_EN;
>  	had_write_register(intelhaddata, AUD_BUF_ADDR(idx), addr);
>  	had_write_register(intelhaddata, AUD_BUF_LEN(idx),
>  			   intelhaddata->period_bytes);
> @@ -881,7 +884,10 @@ static void had_init_ringbuf(struct snd_pcm_substream *substream,
>  	int i, num_periods;
>
>  	num_periods = runtime->periods;
> -	intelhaddata->num_bds = min(num_periods, HAD_NUM_OF_RING_BUFS);
> +	if (num_periods == 1)
> +		intelhaddata->num_bds = HAD_NUM_OF_RING_BUFS;
> +	else
> +		intelhaddata->num_bds = min(num_periods, HAD_NUM_OF_RING_BUFS);
>  	intelhaddata->period_bytes =
>  		frames_to_bytes(runtime, runtime->period_size);
>  	WARN_ON(intelhaddata->period_bytes & 0x3f);
> @@ -891,7 +897,7 @@ static void had_init_ringbuf(struct snd_pcm_substream *substream,
>  	intelhaddata->pcmbuf_filled = 0;
>
>  	for (i = 0; i < HAD_NUM_OF_RING_BUFS; i++) {
> -		if (i < num_periods)
> +		if (i < num_periods || num_periods == 1)
>  			had_prog_bd(substream, intelhaddata);
>  		else /* invalidate the rest */
>  			had_invalidate_bd(intelhaddata, i);
> diff --git a/sound/x86/intel_hdmi_lpe_audio.h b/sound/x86/intel_hdmi_lpe_audio.h
> index ca4212dca94e..3c2befa721a4 100644
> --- a/sound/x86/intel_hdmi_lpe_audio.h
> +++ b/sound/x86/intel_hdmi_lpe_audio.h
> @@ -32,7 +32,7 @@
>  #define HAD_MAX_BUFFER		((1024 * 1024 - 1) & ~0x3f)
>  #define HAD_DEFAULT_BUFFER	(600 * 1024) /* default prealloc size */
>  #define HAD_MAX_PERIODS		256	/* arbitrary, but should suffice */
> -#define HAD_MIN_PERIODS		2
> +#define HAD_MIN_PERIODS		1
>  #define HAD_MAX_PERIOD_BYTES	((HAD_MAX_BUFFER / HAD_MIN_PERIODS) & ~0x3f)
>  #define HAD_MIN_PERIOD_BYTES	1024	/* might be smaller */
>  #define HAD_FIFO_SIZE		0 /* fifo not being used */
> _______________________________________________
> Alsa-devel mailing list
> Alsa-devel@alsa-project.org
> http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/4] Yet another patchset for LPE audio refactoring
  2017-02-06 21:16   ` Pierre-Louis Bossart
@ 2017-02-06 21:45     ` Takashi Iwai
  2017-02-06 23:56       ` Pierre-Louis Bossart
  0 siblings, 1 reply; 26+ messages in thread
From: Takashi Iwai @ 2017-02-06 21:45 UTC (permalink / raw)
  To: Pierre-Louis Bossart; +Cc: alsa-devel, Jerome Anand

On Mon, 06 Feb 2017 22:16:22 +0100,
Pierre-Louis Bossart wrote:
> 
> On 2/6/17 1:07 PM, Takashi Iwai wrote:
> > On Fri, 03 Feb 2017 17:43:56 +0100,
> > Takashi Iwai wrote:
> >>
> >> Hi,
> >>
> >> this is the final part of my cleanup / refactoring patchsets for Intel
> >> LPE audio.  Now the PCM stream engine has been rewritten.  It supports
> >> more flexible configuration, not only fixed 4 periods, for example,
> >> yet with a lot of code reduction.
> >>
> >> As usual, the patches are found in topic/intel-lpe-audio-cleanup
> >> branch.
> >
> > So far, from my side, now most of changes have been submitted /
> > merged.  But I still have some unclear things in the code.
> >
> > - The PCM format: the code contains the handling of S16_LE, but it
> >   doesn't seem work when I enabled the info bit.  What's missing?
> >   Also U24_LE format is ever supported...?  How?
> 
> In the initial version the samples were assumed to be using the 24 lsb
> of a 32-bit word. 16-bit data can be supported as long as it was
> left-shifted before being written to the ring buffer.
>
> For stereo the channels are assumed to be interleaved in Layout 0
> (left, right). For more than 2ch, all channels for the same sample are
> inserted consecutively (ie. no blocks of channels). In the initial
> version the number of channels is assumed to be even and one bogus
> channel had to be inserted for odd number of channels.
> 
> in more recent versions the samples could also be represented as using
> the 24 msb and the requirement for the bogus channel was removed but I
> never worked on this personally and the documentation is far from
> clear.

OK, that's interesting.  So we may support even  S32_LE format, too.

The odd channel restriction isn't seen in the current code, indeed.
It's worth to try 3 channels tomorrow :)

> At any rate, here's what I see in the description of the
> STREAM_X_LPE_AUD_CONFIG registers for CHT:
> 
> Bit 15:
> 1: DP mode
> 0: HDMI (default)
> 
> Bit14:
> 1: no bogus sample for odd channels
> 0: bogus sample for off channels (default)
> 
> Bit13:
> 1: MSB is bit 31 of 32-bit container
> 0: MSB is bit 23 of 32-bit container (default)
> 
> Bit 12:
> 1: 16-bit container
> 0: 32-bit container (default)
> 
> Bit 11:
> 1: send silence stream
> 0: send null packets (default)

So this corresponds to the hw_silence stuff you mentioned?


> I see no evidence that unsigned data is supported.

Yeah, and I really doubt whether unsigned format is supported at all.
Better to drop it.

> I would really take all these configurations with caution, it's not
> clear to me if the documentation reflects the actual implementation.

Sure, it's just for fun for now.

> Note that the hardware can swap channels, I don't remember if this is
> currently enabled.

We have the following code:

#define SWAP_LFE_CENTER		0x00fac4c8

	/*
	 * Program channel mapping in following order:
	 * FL, FR, C, LFE, RL, RR
	 */

	had_write_register(intelhaddata, AUD_BUF_CH_SWAP, SWAP_LFE_CENTER);

0x00fac4c8 is octal 74543210, so it's jus a plain order.

But this reminds me that I didn't check the correctness of chmap order
in the current driver.  Another thing to try tomorrow.

> 
> >
> > - The behavior in had_process_buffer_underrun() isn't clear enough.
> >   Or, more exactly, the behavior of AUD_HDMI_STATUS reg and AUD_CONFIG
> >   reg look complex, and need a bit more description for readers
> >   (including me).
> >
> > Can you guys enlighten a bit?
> 
> For the LPE_AUDIO_Buffer_config register (65020h), there is one
> important field
> 10:8: DMA Fifo watermark. The audio unit has 8x64 bytes, this register
> define when to fetch data. Bottom line it'd be wise to avoid rewinding
> below that FIFO size...

Right, this may be needed to be exposed.


> There are two possible interrupt sources in the
> STREAM_X_LPE_AUD_HDMI_STATUS (65064h)
> Bit 31: sample buffer underrun. HDMI audio unit did not have enough
> samples to insert in video frame (cleared by writing a one)
> Bit 29: buffer done status, write 1 to clear
> Bit 15: sample buffer underrun interrupt enable

OK, then the current code seems naming the function wrongly.
It names had_enable_audio_int() but actually this should be named as
had_clear_interrupts() or such.

> 
> Hope this helps

Very appreciated, thank you for prompt answer!


Takashi

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/4] ALSA: x86: Don't pass SNDRV_PCM_INFO_BATCH flag
  2017-02-06 21:27             ` Pierre-Louis Bossart
@ 2017-02-06 21:51               ` Takashi Iwai
  0 siblings, 0 replies; 26+ messages in thread
From: Takashi Iwai @ 2017-02-06 21:51 UTC (permalink / raw)
  To: Pierre-Louis Bossart; +Cc: alsa-devel, Jerome Anand

On Mon, 06 Feb 2017 22:27:28 +0100,
Pierre-Louis Bossart wrote:
> 
> On 2/6/17 1:01 PM, Takashi Iwai wrote:
> > On Sat, 04 Feb 2017 08:57:08 +0100,
> > Takashi Iwai wrote:
> >>
> >>>> In anyway, it'd be appreciated if you can test on your hardware.
> >>>> I could test only on a single machine.
> >>> I can test more but only in 10 days from now so if we could delay this
> >>> patch a bit it'd be better.
> >>
> >> OK, I can postpone this change and keep BATCH and DOUBLE.
> >
> > I've tested more intensively, and this seems working well, so far.
> > Hopefully the DMA FIFO or fetch timing doesn't play a big role.
> 
> As I mentioned in the other email, the FIFO can be up to 512 bytes so
> beware. And it makes sense to limit the period size to 1024 bytes
> minimum.

Yes, I've examined this minimum by some trial-and-error, so it be set
now.

> > BTW, with the combination of this and the latest my PCM rewrite patch,
> > a more interesting experiment can be done: extend to (a kind of)
> > no-period-wakeup mode.  Of course, it doesn't work like HD-audio, as
> > we can't the ring buffer persistently on LPE audio but have to refresh
> > the buffer descriptors.  But the refresh of BDs is also done at PCM
> > pointer callback, so it would work as is.  For supporting the case
> > period=1, though, another trick is needed: namely, we set up 4 BDs
> > pointing to the same address, so that it won't go out to underrun.'
> 
> Both the pseudo-no-period-wakeup and single period should work but I
> am not sure how useful this is.

The practical gain would be fairly small, I suppose.  But it'd be nice
to have such a platform as a reference implementation.


Takashi

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/4] Yet another patchset for LPE audio refactoring
  2017-02-06 21:45     ` Takashi Iwai
@ 2017-02-06 23:56       ` Pierre-Louis Bossart
  2017-02-07  1:22         ` Pierre-Louis Bossart
  2017-02-07  6:39         ` Takashi Iwai
  0 siblings, 2 replies; 26+ messages in thread
From: Pierre-Louis Bossart @ 2017-02-06 23:56 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Jerome Anand

On 2/6/17 3:45 PM, Takashi Iwai wrote:
> On Mon, 06 Feb 2017 22:16:22 +0100,
> Pierre-Louis Bossart wrote:
>>
>> On 2/6/17 1:07 PM, Takashi Iwai wrote:
>>> On Fri, 03 Feb 2017 17:43:56 +0100,
>>> Takashi Iwai wrote:
>>>>
>>>> Hi,
>>>>
>>>> this is the final part of my cleanup / refactoring patchsets for Intel
>>>> LPE audio.  Now the PCM stream engine has been rewritten.  It supports
>>>> more flexible configuration, not only fixed 4 periods, for example,
>>>> yet with a lot of code reduction.
>>>>
>>>> As usual, the patches are found in topic/intel-lpe-audio-cleanup
>>>> branch.
>>>
>>> So far, from my side, now most of changes have been submitted /
>>> merged.  But I still have some unclear things in the code.
>>>
>>> - The PCM format: the code contains the handling of S16_LE, but it
>>>   doesn't seem work when I enabled the info bit.  What's missing?
>>>   Also U24_LE format is ever supported...?  How?
>>
>> In the initial version the samples were assumed to be using the 24 lsb
>> of a 32-bit word. 16-bit data can be supported as long as it was
>> left-shifted before being written to the ring buffer.
>>
>> For stereo the channels are assumed to be interleaved in Layout 0
>> (left, right). For more than 2ch, all channels for the same sample are
>> inserted consecutively (ie. no blocks of channels). In the initial
>> version the number of channels is assumed to be even and one bogus
>> channel had to be inserted for odd number of channels.
>>
>> in more recent versions the samples could also be represented as using
>> the 24 msb and the requirement for the bogus channel was removed but I
>> never worked on this personally and the documentation is far from
>> clear.
>
> OK, that's interesting.  So we may support even  S32_LE format, too.

well, yes but the 8 LSBs would be ignored, you can only transmit 24 bits 
with HDMI (it's 28 bits total per sample with 4 status bits).

>
> The odd channel restriction isn't seen in the current code, indeed.
> It's worth to try 3 channels tomorrow :)
>
>> At any rate, here's what I see in the description of the
>> STREAM_X_LPE_AUD_CONFIG registers for CHT:
>>
>> Bit 15:
>> 1: DP mode
>> 0: HDMI (default)
>>
>> Bit14:
>> 1: no bogus sample for odd channels
>> 0: bogus sample for off channels (default)
>>
>> Bit13:
>> 1: MSB is bit 31 of 32-bit container
>> 0: MSB is bit 23 of 32-bit container (default)
>>
>> Bit 12:
>> 1: 16-bit container
>> 0: 32-bit container (default)
>>
>> Bit 11:
>> 1: send silence stream
>> 0: send null packets (default)
>
> So this corresponds to the hw_silence stuff you mentioned?

yes. I am still trying to figure out how it's supposed to be used (you 
need a valid configuration first).

>
>
>> I see no evidence that unsigned data is supported.
>
> Yeah, and I really doubt whether unsigned format is supported at all.
> Better to drop it.

Agree.

>
>> I would really take all these configurations with caution, it's not
>> clear to me if the documentation reflects the actual implementation.
>
> Sure, it's just for fun for now.
>
>> Note that the hardware can swap channels, I don't remember if this is
>> currently enabled.
>
> We have the following code:
>
> #define SWAP_LFE_CENTER		0x00fac4c8
>
> 	/*
> 	 * Program channel mapping in following order:
> 	 * FL, FR, C, LFE, RL, RR
> 	 */
>
> 	had_write_register(intelhaddata, AUD_BUF_CH_SWAP, SWAP_LFE_CENTER);
>
> 0x00fac4c8 is octal 74543210, so it's jus a plain order.

Yes, the 24 LBSs (23:0) in this register are split in 8 3-bit subfields, 
e.g if you program 7 (111b) in bits 2:0 and 0 in bits 23:21 you would 
swap first and last channels.
This only works for multichannel (layout 1), no support for l/r swap.

>
> But this reminds me that I didn't check the correctness of chmap order
> in the current driver.  Another thing to try tomorrow.
>
>>
>>>
>>> - The behavior in had_process_buffer_underrun() isn't clear enough.
>>>   Or, more exactly, the behavior of AUD_HDMI_STATUS reg and AUD_CONFIG
>>>   reg look complex, and need a bit more description for readers
>>>   (including me).
>>>
>>> Can you guys enlighten a bit?
>>
>> For the LPE_AUDIO_Buffer_config register (65020h), there is one
>> important field
>> 10:8: DMA Fifo watermark. The audio unit has 8x64 bytes, this register
>> define when to fetch data. Bottom line it'd be wise to avoid rewinding
>> below that FIFO size...
>
> Right, this may be needed to be exposed.
>
>
>> There are two possible interrupt sources in the
>> STREAM_X_LPE_AUD_HDMI_STATUS (65064h)
>> Bit 31: sample buffer underrun. HDMI audio unit did not have enough
>> samples to insert in video frame (cleared by writing a one)
>> Bit 29: buffer done status, write 1 to clear
>> Bit 15: sample buffer underrun interrupt enable
>
> OK, then the current code seems naming the function wrongly.
> It names had_enable_audio_int() but actually this should be named as
> had_clear_interrupts() or such.

I think enable and clear are two separate steps, you only clear what 
you've enabled, no?

>
>>
>> Hope this helps
>
> Very appreciated, thank you for prompt answer!
>
>
> Takashi
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/4] Yet another patchset for LPE audio refactoring
  2017-02-06 23:56       ` Pierre-Louis Bossart
@ 2017-02-07  1:22         ` Pierre-Louis Bossart
  2017-02-07  6:39         ` Takashi Iwai
  1 sibling, 0 replies; 26+ messages in thread
From: Pierre-Louis Bossart @ 2017-02-07  1:22 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Jerome Anand


>>>> - The behavior in had_process_buffer_underrun() isn't clear enough.
>>>>   Or, more exactly, the behavior of AUD_HDMI_STATUS reg and AUD_CONFIG
>>>>   reg look complex, and need a bit more description for readers
>>>>   (including me).
>>>>
>>>> Can you guys enlighten a bit?
>>>
>>> For the LPE_AUDIO_Buffer_config register (65020h), there is one
>>> important field
>>> 10:8: DMA Fifo watermark. The audio unit has 8x64 bytes, this register
>>> define when to fetch data. Bottom line it'd be wise to avoid rewinding
>>> below that FIFO size...
>>
>> Right, this may be needed to be exposed.
>>
>>
>>> There are two possible interrupt sources in the
>>> STREAM_X_LPE_AUD_HDMI_STATUS (65064h)
>>> Bit 31: sample buffer underrun. HDMI audio unit did not have enough
>>> samples to insert in video frame (cleared by writing a one)
>>> Bit 29: buffer done status, write 1 to clear
>>> Bit 15: sample buffer underrun interrupt enable
>>
>> OK, then the current code seems naming the function wrongly.
>> It names had_enable_audio_int() but actually this should be named as
>> had_clear_interrupts() or such.
>
> I think enable and clear are two separate steps, you only clear what
> you've enabled, no?

To be crystal-clear, there are two interrupts we care about
1. buffer interrupt. This is enabled in bit1 of 
STREAM_X_LPE_AUD_BUFFER_M_ADDR, the hardware will generate an interrupt 
when the hardware is done reading the data from memory. Bit0 is set 
after a new address is programmed (or just to reactivate the same address).
2. underrun interrupt. This is enabled once in the 
STREAM_X_LPE_AUD_HDMI_STATUS register, you don't need to re-enable it 
for each buffer.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/4] Yet another patchset for LPE audio refactoring
  2017-02-06 23:56       ` Pierre-Louis Bossart
  2017-02-07  1:22         ` Pierre-Louis Bossart
@ 2017-02-07  6:39         ` Takashi Iwai
  1 sibling, 0 replies; 26+ messages in thread
From: Takashi Iwai @ 2017-02-07  6:39 UTC (permalink / raw)
  To: Pierre-Louis Bossart; +Cc: alsa-devel, Jerome Anand

On Tue, 07 Feb 2017 00:56:54 +0100,
Pierre-Louis Bossart wrote:
> 
> On 2/6/17 3:45 PM, Takashi Iwai wrote:
> > On Mon, 06 Feb 2017 22:16:22 +0100,
> > Pierre-Louis Bossart wrote:
> >>
> >> On 2/6/17 1:07 PM, Takashi Iwai wrote:
> >>> On Fri, 03 Feb 2017 17:43:56 +0100,
> >>> Takashi Iwai wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> this is the final part of my cleanup / refactoring patchsets for Intel
> >>>> LPE audio.  Now the PCM stream engine has been rewritten.  It supports
> >>>> more flexible configuration, not only fixed 4 periods, for example,
> >>>> yet with a lot of code reduction.
> >>>>
> >>>> As usual, the patches are found in topic/intel-lpe-audio-cleanup
> >>>> branch.
> >>>
> >>> So far, from my side, now most of changes have been submitted /
> >>> merged.  But I still have some unclear things in the code.
> >>>
> >>> - The PCM format: the code contains the handling of S16_LE, but it
> >>>   doesn't seem work when I enabled the info bit.  What's missing?
> >>>   Also U24_LE format is ever supported...?  How?
> >>
> >> In the initial version the samples were assumed to be using the 24 lsb
> >> of a 32-bit word. 16-bit data can be supported as long as it was
> >> left-shifted before being written to the ring buffer.
> >>
> >> For stereo the channels are assumed to be interleaved in Layout 0
> >> (left, right). For more than 2ch, all channels for the same sample are
> >> inserted consecutively (ie. no blocks of channels). In the initial
> >> version the number of channels is assumed to be even and one bogus
> >> channel had to be inserted for odd number of channels.
> >>
> >> in more recent versions the samples could also be represented as using
> >> the 24 msb and the requirement for the bogus channel was removed but I
> >> never worked on this personally and the documentation is far from
> >> clear.
> >
> > OK, that's interesting.  So we may support even  S32_LE format, too.
> 
> well, yes but the 8 LSBs would be ignored, you can only transmit 24
> bits with HDMI (it's 28 bits total per sample with 4 status bits).

Yes, S32_LE indicates only the format, and the effective bits are
specified in hwparam.msbits field additionally.


> > The odd channel restriction isn't seen in the current code, indeed.
> > It's worth to try 3 channels tomorrow :)
> >
> >> At any rate, here's what I see in the description of the
> >> STREAM_X_LPE_AUD_CONFIG registers for CHT:
> >>
> >> Bit 15:
> >> 1: DP mode
> >> 0: HDMI (default)
> >>
> >> Bit14:
> >> 1: no bogus sample for odd channels
> >> 0: bogus sample for off channels (default)
> >>
> >> Bit13:
> >> 1: MSB is bit 31 of 32-bit container
> >> 0: MSB is bit 23 of 32-bit container (default)
> >>
> >> Bit 12:
> >> 1: 16-bit container
> >> 0: 32-bit container (default)
> >>
> >> Bit 11:
> >> 1: send silence stream
> >> 0: send null packets (default)
> >
> > So this corresponds to the hw_silence stuff you mentioned?
> 
> yes. I am still trying to figure out how it's supposed to be used (you
> need a valid configuration first).

Judging from the field name, this influences on the underrun behavior.
If so, just setting all BDs to invalid (and set AUD_CONFIG enable bit)
should suffice?


> >> Note that the hardware can swap channels, I don't remember if this is
> >> currently enabled.
> >
> > We have the following code:
> >
> > #define SWAP_LFE_CENTER		0x00fac4c8
> >
> > 	/*
> > 	 * Program channel mapping in following order:
> > 	 * FL, FR, C, LFE, RL, RR
> > 	 */
> >
> > 	had_write_register(intelhaddata, AUD_BUF_CH_SWAP, SWAP_LFE_CENTER);
> >
> > 0x00fac4c8 is octal 74543210, so it's jus a plain order.
> 
> Yes, the 24 LBSs (23:0) in this register are split in 8 3-bit
> subfields, e.g if you program 7 (111b) in bits 2:0 and 0 in bits 23:21
> you would swap first and last channels.
> This only works for multichannel (layout 1), no support for l/r swap.

OK, if such a restriction is present, maybe it's no good idea to allow
the flexible mapping via chmap API.  Safer to keep the chmap as
read-only.


> >> There are two possible interrupt sources in the
> >> STREAM_X_LPE_AUD_HDMI_STATUS (65064h)
> >> Bit 31: sample buffer underrun. HDMI audio unit did not have enough
> >> samples to insert in video frame (cleared by writing a one)
> >> Bit 29: buffer done status, write 1 to clear
> >> Bit 15: sample buffer underrun interrupt enable
> >
> > OK, then the current code seems naming the function wrongly.
> > It names had_enable_audio_int() but actually this should be named as
> > had_clear_interrupts() or such.
> 
> I think enable and clear are two separate steps, you only clear what
> you've enabled, no?

Not really.  In the original code, it corresponds to
hdmi_audio_set_caps():

int hdmi_audio_set_caps(enum had_caps_list set_element,
			void *capabilties)
{
.....
	switch (set_element) {

	case HAD_SET_ENABLE_AUDIO_INT:
		{
			u32 status_reg;

			hdmi_audio_read(AUD_HDMI_STATUS_v2 +
				ctx->had_config_offset, &status_reg);
			status_reg |=
				HDMI_AUDIO_BUFFER_DONE | HDMI_AUDIO_UNDERRUN;
			hdmi_audio_write(AUD_HDMI_STATUS_v2 +
				ctx->had_config_offset, status_reg);
			hdmi_audio_read(AUD_HDMI_STATUS_v2 +
				ctx->had_config_offset, &status_reg);

		}

And there is no counterpart, HAD_SET_DISABLE_AUDIO_INT.
I wasn't sure about the code above, but now I understand that these
are simply ACKing both BUFFER_DONE and UNDERRUN bits unconditionally,
and do a pre- and a post-read.  Effectively, it forcibly clears the
interrupts.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2017-02-07  6:39 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-03 16:43 [PATCH 0/4] Yet another patchset for LPE audio refactoring Takashi Iwai
2017-02-03 16:43 ` [PATCH 1/4] ALSA: x86: Don't pass SNDRV_PCM_INFO_BATCH flag Takashi Iwai
2017-02-03 19:39   ` Pierre-Louis Bossart
2017-02-03 20:13     ` Takashi Iwai
2017-02-03 23:13       ` Pierre-Louis Bossart
2017-02-04  7:57         ` Takashi Iwai
2017-02-06 19:01           ` Takashi Iwai
2017-02-06 21:27             ` Pierre-Louis Bossart
2017-02-06 21:51               ` Takashi Iwai
2017-02-03 16:43 ` [PATCH 2/4] ALSA: x86: Explicit specify 32bit DMA Takashi Iwai
2017-02-03 16:43 ` [PATCH 3/4] ALSA: x86: Don't check connection in lowlevel accessors Takashi Iwai
2017-02-03 16:44 ` [PATCH 4/4] ALSA: x86: Refactor PCM process engine Takashi Iwai
2017-02-03 19:47   ` Pierre-Louis Bossart
2017-02-03 20:11     ` Takashi Iwai
2017-02-03 23:22       ` Pierre-Louis Bossart
2017-02-04  7:51         ` Takashi Iwai
2017-02-06 11:22           ` Takashi Iwai
2017-02-06 15:46             ` Pierre-Louis Bossart
2017-02-06 15:54               ` Takashi Iwai
2017-02-06 16:00                 ` Pierre-Louis Bossart
2017-02-06 19:07 ` [PATCH 0/4] Yet another patchset for LPE audio refactoring Takashi Iwai
2017-02-06 21:16   ` Pierre-Louis Bossart
2017-02-06 21:45     ` Takashi Iwai
2017-02-06 23:56       ` Pierre-Louis Bossart
2017-02-07  1:22         ` Pierre-Louis Bossart
2017-02-07  6:39         ` Takashi Iwai

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.