All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] pxa_camera: Redesign DMA handling
@ 2009-03-05 19:45 Robert Jarzmik
  2009-03-05 19:45 ` [PATCH 1/4] pxa_camera: Remove YUV planar formats hole Robert Jarzmik
  0 siblings, 1 reply; 33+ messages in thread
From: Robert Jarzmik @ 2009-03-05 19:45 UTC (permalink / raw)
  To: g.liakhovetski, mike; +Cc: linux-media, Robert Jarzmik

This patchset, formerly known as "pxa_camera: Redesign DMA handling", attempts
so simplify the code for all DMA related parts of pxa_camera host driver.

As asked for by Guennadi and Mike, the original patch was split up into 4
patches :
 - one to address the YUV planar format hole (page alignment)
 - one to redesign the DMA
 - one for code style change
 - one for lately discovered overrun issue

A decision about enforcing a size for pxa_camera_set_fmt() to be a multiple of 8
was not done yet. Meanwhile, the patchset doesn't make any hypothesis about the
image size, and even a weird size like 223 x 111 will work. If such a decision
was to be taken, patch 1 would have to amended.

Powermanagment with suspend to RAM, then resume in the middle of a capture does
work.

As Mike noticed, YUV planar format overlay was not tested after these changes.

Robert Jarzmik (4):
  pxa_camera: remove YUV planar formats hole
  pxa_camera: Redesign DMA handling
  pxa_camera: Coding style sweeping
  pxa_camera: Fix overrun condition on last buffer

 drivers/media/video/pxa_camera.c |  474 ++++++++++++++++++++++----------------
 1 files changed, 277 insertions(+), 197 deletions(-)


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 1/4] pxa_camera: Remove YUV planar formats hole
  2009-03-05 19:45 [PATCH 0/4] pxa_camera: Redesign DMA handling Robert Jarzmik
@ 2009-03-05 19:45 ` Robert Jarzmik
  2009-03-05 19:45   ` [PATCH 2/4] pxa_camera: Redesign DMA handling Robert Jarzmik
                     ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Robert Jarzmik @ 2009-03-05 19:45 UTC (permalink / raw)
  To: g.liakhovetski, mike; +Cc: linux-media, Robert Jarzmik

All planes were PAGE aligned (ie. 4096 bytes aligned). This
is not consistent with YUV422 format, which requires Y, U
and V planes glued together.  The new implementation forces
the alignement on 8 bytes (DMA requirement), which is almost
always the case (granted by width x height being a multiple
of 8).

The test cases include tests in both YUV422 and RGB565 :
 - a picture of size 111 x 111 (cross RAM pages example)
 - a picture of size 1023 x 4 in (under 1 RAM page)
 - a picture of size 1024 x 4 in (exactly 1 RAM page)
 - a picture of size 1025 x 4 in (over 1 RAM page)
 - a picture of size 1280 x 1024 (many RAM pages)

Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
---
 drivers/media/video/pxa_camera.c |  165 ++++++++++++++++++++++++++------------
 1 files changed, 114 insertions(+), 51 deletions(-)

diff --git a/drivers/media/video/pxa_camera.c b/drivers/media/video/pxa_camera.c
index e3e6b29..54df071 100644
--- a/drivers/media/video/pxa_camera.c
+++ b/drivers/media/video/pxa_camera.c
@@ -242,14 +242,13 @@ static int pxa_videobuf_setup(struct videobuf_queue *vq, unsigned int *count,
 	dev_dbg(&icd->dev, "count=%d, size=%d\n", *count, *size);
 
 	/* planar capture requires Y, U and V buffers to be page aligned */
-	if (pcdev->channels == 3) {
-		*size = PAGE_ALIGN(icd->width * icd->height); /* Y pages */
-		*size += PAGE_ALIGN(icd->width * icd->height / 2); /* U pages */
-		*size += PAGE_ALIGN(icd->width * icd->height / 2); /* V pages */
-	} else {
-		*size = icd->width * icd->height *
-			((icd->current_fmt->depth + 7) >> 3);
-	}
+	if (pcdev->channels == 3)
+		*size = roundup(icd->width * icd->height, 8) /* Y pages */
+			+ roundup(icd->width * icd->height / 2, 8) /* U pages */
+			+ roundup(icd->width * icd->height / 2, 8); /* V pages */
+	else
+		*size = roundup(icd->width * icd->height *
+				((icd->current_fmt->depth + 7) >> 3), 8);
 
 	if (0 == *count)
 		*count = 32;
@@ -289,19 +288,63 @@ static void free_buffer(struct videobuf_queue *vq, struct pxa_buffer *buf)
 	buf->vb.state = VIDEOBUF_NEEDS_INIT;
 }
 
+static int calculate_dma_sglen(struct scatterlist *sglist, int sglen,
+			       int sg_first_ofs, int size)
+{
+	int i, offset, dma_len, xfer_len;
+	struct scatterlist *sg;
+
+	offset = sg_first_ofs;
+	for_each_sg(sglist, sg, sglen, i) {
+		dma_len = sg_dma_len(sg);
+
+		/* PXA27x Developer's Manual 27.4.4.1: round up to 8 bytes */
+		xfer_len = roundup(min(dma_len - offset, size), 8);
+
+		size = max(0, size - xfer_len);
+		offset = 0;
+		if (size == 0)
+			break;
+	}
+
+	BUG_ON(size != 0);
+	return i + 1;
+}
+
+/**
+ * pxa_init_dma_channel - init dma descriptors
+ * @pcdev: pxa camera device
+ * @buf: pxa buffer to find pxa dma channel
+ * @dma: dma video buffer
+ * @channel: dma channel (0 => 'Y', 1 => 'U', 2 => 'V')
+ * @cibr: camera read fifo
+ * @size: bytes to transfer
+ * @sg_first: index of first element of sg_list
+ * @sg_first_ofs: offset in first element of sg_list
+ *
+ * Prepares the pxa dma descriptors to transfer one camera channel.
+ * Beware sg_first and sg_first_ofs are both input and output parameters.
+ *
+ * Returns 0
+ */
 static int pxa_init_dma_channel(struct pxa_camera_dev *pcdev,
 				struct pxa_buffer *buf,
 				struct videobuf_dmabuf *dma, int channel,
-				int sglen, int sg_start, int cibr,
-				unsigned int size)
+				int cibr, int size,
+				struct scatterlist **sg_first, int *sg_first_ofs)
 {
 	struct pxa_cam_dma *pxa_dma = &buf->dmas[channel];
-	int i;
+	struct scatterlist *sg;
+	int i, offset, sglen;
+	int dma_len = 0, xfer_len = 0;
 
 	if (pxa_dma->sg_cpu)
 		dma_free_coherent(pcdev->dev, pxa_dma->sg_size,
 				  pxa_dma->sg_cpu, pxa_dma->sg_dma);
 
+	sglen = calculate_dma_sglen(*sg_first, dma->sglen,
+				    *sg_first_ofs, size);
+
 	pxa_dma->sg_size = (sglen + 1) * sizeof(struct pxa_dma_desc);
 	pxa_dma->sg_cpu = dma_alloc_coherent(pcdev->dev, pxa_dma->sg_size,
 					     &pxa_dma->sg_dma, GFP_KERNEL);
@@ -309,27 +352,51 @@ static int pxa_init_dma_channel(struct pxa_camera_dev *pcdev,
 		return -ENOMEM;
 
 	pxa_dma->sglen = sglen;
+	offset = *sg_first_ofs;
 
-	for (i = 0; i < sglen; i++) {
-		int sg_i = sg_start + i;
-		struct scatterlist *sg = dma->sglist;
-		unsigned int dma_len = sg_dma_len(&sg[sg_i]), xfer_len;
+	dev_dbg(pcdev->dev, "DMA: sg_first=%p, sglen=%d, ofs=%d, dma.desc=%x\n",
+		*sg_first, sglen, *sg_first_ofs, pxa_dma->sg_dma);
 
-		pxa_dma->sg_cpu[i].dsadr = pcdev->res->start + cibr;
-		pxa_dma->sg_cpu[i].dtadr = sg_dma_address(&sg[sg_i]);
+
+	for_each_sg(*sg_first, sg, sglen, i) {
+		dma_len = sg_dma_len(sg);
 
 		/* PXA27x Developer's Manual 27.4.4.1: round up to 8 bytes */
-		xfer_len = (min(dma_len, size) + 7) & ~7;
+		xfer_len = roundup(min(dma_len - offset, size), 8);
 
+		size = max(0, size - xfer_len);
+
+		pxa_dma->sg_cpu[i].dsadr = pcdev->res->start + cibr;
+		pxa_dma->sg_cpu[i].dtadr = sg_dma_address(sg) + offset;
 		pxa_dma->sg_cpu[i].dcmd =
 			DCMD_FLOWSRC | DCMD_BURST8 | DCMD_INCTRGADDR | xfer_len;
-		size -= dma_len;
 		pxa_dma->sg_cpu[i].ddadr =
 			pxa_dma->sg_dma + (i + 1) * sizeof(struct pxa_dma_desc);
+
+		dev_vdbg(pcdev->dev, "DMA: desc.%08x->@phys=0x%08x, len=%d\n",
+			 pxa_dma->sg_dma + i * sizeof(struct pxa_dma_desc),
+			 sg_dma_address(sg) + offset, xfer_len);
+		offset = 0;
+
+		if (size == 0)
+			break;
 	}
 
-	pxa_dma->sg_cpu[sglen - 1].ddadr = DDADR_STOP;
-	pxa_dma->sg_cpu[sglen - 1].dcmd |= DCMD_ENDIRQEN;
+	pxa_dma->sg_cpu[sglen].ddadr = DDADR_STOP;
+	pxa_dma->sg_cpu[sglen].dcmd  = DCMD_FLOWSRC | DCMD_BURST8 | DCMD_ENDIRQEN;
+
+	*sg_first_ofs = xfer_len;
+	/*
+	 * Handle 1 special case :
+	 *  - if we finish the DMA transfer in the last 7 bytes of a RAM page
+	 *    then we return the sg element pointing on the next page
+	 */
+	if (*sg_first_ofs >= dma_len) {
+		*sg_first_ofs -= dma_len;
+		*sg_first = sg_next(sg);
+	} else {
+		*sg_first = sg;
+	}
 
 	return 0;
 }
@@ -342,9 +409,7 @@ static int pxa_videobuf_prepare(struct videobuf_queue *vq,
 	struct pxa_camera_dev *pcdev = ici->priv;
 	struct pxa_buffer *buf = container_of(vb, struct pxa_buffer, vb);
 	int ret;
-	int sglen_y,  sglen_yu = 0, sglen_u = 0, sglen_v = 0;
-	int size_y, size_u = 0, size_v = 0;
-
+	int size_y = 0, size_u = 0, size_v = 0;
 	dev_dbg(&icd->dev, "%s (vb=0x%p) 0x%08lx %d\n", __func__,
 		vb, vb->baddr, vb->bsize);
 
@@ -381,53 +446,51 @@ static int pxa_videobuf_prepare(struct videobuf_queue *vq,
 	}
 
 	if (vb->state == VIDEOBUF_NEEDS_INIT) {
-		unsigned int size = vb->size;
+		int size = vb->size;
+		int next_ofs = 0;
 		struct videobuf_dmabuf *dma = videobuf_to_dma(vb);
+		struct scatterlist *sg;
 
 		ret = videobuf_iolock(vq, vb, NULL);
 		if (ret)
 			goto fail;
 
 		if (pcdev->channels == 3) {
-			/* FIXME the calculations should be more precise */
-			sglen_y = dma->sglen / 2;
-			sglen_u = sglen_v = dma->sglen / 4 + 1;
-			sglen_yu = sglen_y + sglen_u;
 			size_y = size / 2;
 			size_u = size_v = size / 4;
 		} else {
-			sglen_y = dma->sglen;
 			size_y = size;
 		}
 
-		/* init DMA for Y channel */
-		ret = pxa_init_dma_channel(pcdev, buf, dma, 0, sglen_y,
-					   0, 0x28, size_y);
+		sg = dma->sglist;
 
+		/* init DMA for Y channel */
+		ret = pxa_init_dma_channel(pcdev, buf, dma, 0, CIBR0, size_y,
+					   &sg, &next_ofs);
 		if (ret) {
 			dev_err(pcdev->dev,
 				"DMA initialization for Y/RGB failed\n");
 			goto fail;
 		}
 
-		if (pcdev->channels == 3) {
-			/* init DMA for U channel */
-			ret = pxa_init_dma_channel(pcdev, buf, dma, 1, sglen_u,
-						   sglen_y, 0x30, size_u);
-			if (ret) {
-				dev_err(pcdev->dev,
-					"DMA initialization for U failed\n");
-				goto fail_u;
-			}
+		/* init DMA for U channel */
+		if (size_u)
+			ret = pxa_init_dma_channel(pcdev, buf, dma, 1, CIBR1,
+						   size_u, &sg, &next_ofs);
+		if (ret) {
+			dev_err(pcdev->dev,
+				"DMA initialization for U failed\n");
+			goto fail_u;
+		}
 
-			/* init DMA for V channel */
-			ret = pxa_init_dma_channel(pcdev, buf, dma, 2, sglen_v,
-						   sglen_yu, 0x38, size_v);
-			if (ret) {
-				dev_err(pcdev->dev,
-					"DMA initialization for V failed\n");
-				goto fail_v;
-			}
+		/* init DMA for V channel */
+		if (size_v)
+			ret = pxa_init_dma_channel(pcdev, buf, dma, 2, CIBR2,
+						   size_u, &sg, &next_ofs);
+		if (ret) {
+			dev_err(pcdev->dev,
+				"DMA initialization for V failed\n");
+			goto fail_v;
 		}
 
 		vb->state = VIDEOBUF_PREPARED;
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 2/4] pxa_camera: Redesign DMA handling
  2009-03-05 19:45 ` [PATCH 1/4] pxa_camera: Remove YUV planar formats hole Robert Jarzmik
@ 2009-03-05 19:45   ` Robert Jarzmik
  2009-03-05 19:45     ` [PATCH 3/4] pxa_camera: Coding style sweeping Robert Jarzmik
  2009-03-09 11:35     ` [PATCH 2/4] pxa_camera: Redesign DMA handling Guennadi Liakhovetski
  2009-03-05 20:29   ` [PATCH 1/4] pxa_camera: Remove YUV planar formats hole Guennadi Liakhovetski
  2009-03-09 10:45   ` Guennadi Liakhovetski
  2 siblings, 2 replies; 33+ messages in thread
From: Robert Jarzmik @ 2009-03-05 19:45 UTC (permalink / raw)
  To: g.liakhovetski, mike; +Cc: linux-media, Robert Jarzmik

The DMA transfers in pxa_camera showed some weaknesses in
multiple queued buffers context :
 - poll/select problem
   The order between list pcdev->capture and DMA chain was
   not the same. This creates a discrepancy between video
   buffers marked as "done" by the IRQ handler, and the
   really finished video buffer.

   The bug shows up with capture_example tool from v4l2 hg
   tree. The process just "stalls" on a "select timeout".

   The key problem is in pxa_videobuf_queue(), where the
   queued buffer is chained before the active buffer, while
   it should have been the active buffer first, and queued
   buffer tailed after.

 - multiple buffers DMA starting
   When multiple buffers were queued, the DMA channels were
   always started right away. This is not optimal, as a
   special case appears when the first EOF was not yet
   reached, and the DMA channels were prematurely started.

 - Maintainability
   DMA code was a bit obfuscated. Rationalize the code to be
   easily maintainable by anyone.

This patch attemps to address these issues.

Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
---
 drivers/media/video/pxa_camera.c |  264 ++++++++++++++++++++------------------
 1 files changed, 139 insertions(+), 125 deletions(-)

diff --git a/drivers/media/video/pxa_camera.c b/drivers/media/video/pxa_camera.c
index 54df071..2d79ded 100644
--- a/drivers/media/video/pxa_camera.c
+++ b/drivers/media/video/pxa_camera.c
@@ -325,7 +325,7 @@ static int calculate_dma_sglen(struct scatterlist *sglist, int sglen,
  * Prepares the pxa dma descriptors to transfer one camera channel.
  * Beware sg_first and sg_first_ofs are both input and output parameters.
  *
- * Returns 0
+ * Returns 0 or -ENOMEM si no coherent memory is available
  */
 static int pxa_init_dma_channel(struct pxa_camera_dev *pcdev,
 				struct pxa_buffer *buf,
@@ -369,7 +369,8 @@ static int pxa_init_dma_channel(struct pxa_camera_dev *pcdev,
 		pxa_dma->sg_cpu[i].dsadr = pcdev->res->start + cibr;
 		pxa_dma->sg_cpu[i].dtadr = sg_dma_address(sg) + offset;
 		pxa_dma->sg_cpu[i].dcmd =
-			DCMD_FLOWSRC | DCMD_BURST8 | DCMD_INCTRGADDR | xfer_len;
+			DCMD_FLOWSRC | DCMD_BURST8 | DCMD_INCTRGADDR | xfer_len
+			| ((i == 0) ? DCMD_STARTIRQEN : 0);
 		pxa_dma->sg_cpu[i].ddadr =
 			pxa_dma->sg_dma + (i + 1) * sizeof(struct pxa_dma_desc);
 
@@ -516,6 +517,97 @@ out:
 	return ret;
 }
 
+/**
+ * pxa_dma_start_channels - start DMA channel for active buffer
+ * @pcdev: pxa camera device
+ *
+ * Initialize DMA channels to the beginning of the active video buffer, and
+ * start these channels.
+ */
+static void pxa_dma_start_channels(struct pxa_camera_dev *pcdev)
+{
+	int i;
+	struct pxa_buffer *active;
+
+	active = pcdev->active;
+
+	for (i = 0; i < pcdev->channels; i++) {
+		dev_dbg(pcdev->dev, "%s (channel=%d) ddadr=%08x\n", __func__,
+			i, active->dmas[i].sg_dma);
+		DDADR(pcdev->dma_chans[i]) = active->dmas[i].sg_dma;
+		DCSR(pcdev->dma_chans[i]) = DCSR_RUN;
+	}
+}
+
+static void pxa_dma_stop_channels(struct pxa_camera_dev *pcdev)
+{
+	int i;
+
+	for (i = 0; i < pcdev->channels; i++) {
+		dev_dbg(pcdev->dev, "%s (channel=%d)\n", __func__, i);
+		DCSR(pcdev->dma_chans[i]) = 0;
+	}
+}
+
+static void pxa_dma_update_sg_tail(struct pxa_camera_dev *pcdev,
+				   struct pxa_buffer *buf)
+{
+	int i;
+
+	for (i = 0; i < pcdev->channels; i++) {
+		pcdev->sg_tail[i] = buf->dmas[i].sg_cpu + buf->dmas[i].sglen;
+		pcdev->sg_tail[i]->ddadr = DDADR_STOP;
+	}
+}
+
+static void pxa_dma_add_tail_buf(struct pxa_camera_dev *pcdev,
+				 struct pxa_buffer *buf)
+{
+	int i;
+
+	for (i = 0; i < pcdev->channels; i++) {
+		if (!pcdev->sg_tail[i])
+			continue;
+		pcdev->sg_tail[i]->ddadr = buf->dmas[i].sg_dma;
+	}
+
+	pxa_dma_update_sg_tail(pcdev, buf);
+}
+
+/**
+ * pxa_camera_start_capture - start video capturing
+ * @pcdev: camera device
+ *
+ * Launch capturing. DMA channels should not be active yet. They should get
+ * activated at the end of frame interrupt, to capture only whole frames, and
+ * never begin the capture of a partial frame.
+ */
+static void pxa_camera_start_capture(struct pxa_camera_dev *pcdev)
+{
+	unsigned long cicr0, cifr;
+
+	dev_dbg(pcdev->dev, "%s\n", __func__);
+	cifr = __raw_readl(pcdev->base + CIFR) | CIFR_RESET_F;
+	__raw_writel(cifr, pcdev->base + CIFR);
+
+	cicr0 = __raw_readl(pcdev->base + CICR0) | CICR0_ENB | CISR_IFO_0
+		| CISR_IFO_1 | CISR_IFO_2;
+	cicr0 &= ~CICR0_EOFM;
+	__raw_writel(cicr0, pcdev->base + CICR0);
+}
+
+static void pxa_camera_stop_capture(struct pxa_camera_dev *pcdev)
+{
+	unsigned long cicr0;
+
+	pxa_dma_stop_channels(pcdev);
+
+	cicr0 = __raw_readl(pcdev->base + CICR0) & ~CICR0_ENB;
+	__raw_writel(cicr0, pcdev->base + CICR0);
+
+	dev_dbg(pcdev->dev, "%s\n", __func__);
+}
+
 static void pxa_videobuf_queue(struct videobuf_queue *vq,
 			       struct videobuf_buffer *vb)
 {
@@ -523,81 +615,23 @@ static void pxa_videobuf_queue(struct videobuf_queue *vq,
 	struct soc_camera_host *ici = to_soc_camera_host(icd->dev.parent);
 	struct pxa_camera_dev *pcdev = ici->priv;
 	struct pxa_buffer *buf = container_of(vb, struct pxa_buffer, vb);
-	struct pxa_buffer *active;
 	unsigned long flags;
-	int i;
 
-	dev_dbg(&icd->dev, "%s (vb=0x%p) 0x%08lx %d\n", __func__,
-		vb, vb->baddr, vb->bsize);
-	spin_lock_irqsave(&pcdev->lock, flags);
+	dev_dbg(&icd->dev, "%s (vb=0x%p) 0x%08lx %d active=%p\n", __func__,
+		vb, vb->baddr, vb->bsize, pcdev->active);
 
+	spin_lock_irqsave(&pcdev->lock, flags);
 	list_add_tail(&vb->queue, &pcdev->capture);
 
 	vb->state = VIDEOBUF_ACTIVE;
-	active = pcdev->active;
-
-	if (!active) {
-		unsigned long cifr, cicr0;
-
-		cifr = __raw_readl(pcdev->base + CIFR) | CIFR_RESET_F;
-		__raw_writel(cifr, pcdev->base + CIFR);
-
-		for (i = 0; i < pcdev->channels; i++) {
-			DDADR(pcdev->dma_chans[i]) = buf->dmas[i].sg_dma;
-			DCSR(pcdev->dma_chans[i]) = DCSR_RUN;
-			pcdev->sg_tail[i] = buf->dmas[i].sg_cpu + buf->dmas[i].sglen - 1;
-		}
 
-		pcdev->active = buf;
+	pxa_dma_stop_channels(pcdev);
+	pxa_dma_add_tail_buf(pcdev, buf);
 
-		cicr0 = __raw_readl(pcdev->base + CICR0) | CICR0_ENB;
-		__raw_writel(cicr0, pcdev->base + CICR0);
-	} else {
-		struct pxa_cam_dma *buf_dma;
-		struct pxa_cam_dma *act_dma;
-		int nents;
-
-		for (i = 0; i < pcdev->channels; i++) {
-			buf_dma = &buf->dmas[i];
-			act_dma = &active->dmas[i];
-			nents = buf_dma->sglen;
-
-			/* Stop DMA engine */
-			DCSR(pcdev->dma_chans[i]) = 0;
-
-			/* Add the descriptors we just initialized to
-			   the currently running chain */
-			pcdev->sg_tail[i]->ddadr = buf_dma->sg_dma;
-			pcdev->sg_tail[i] = buf_dma->sg_cpu + buf_dma->sglen - 1;
-
-			/* Setup a dummy descriptor with the DMA engines current
-			 * state
-			 */
-			buf_dma->sg_cpu[nents].dsadr =
-				pcdev->res->start + 0x28 + i*8; /* CIBRx */
-			buf_dma->sg_cpu[nents].dtadr =
-				DTADR(pcdev->dma_chans[i]);
-			buf_dma->sg_cpu[nents].dcmd =
-				DCMD(pcdev->dma_chans[i]);
-
-			if (DDADR(pcdev->dma_chans[i]) == DDADR_STOP) {
-				/* The DMA engine is on the last
-				   descriptor, set the next descriptors
-				   address to the descriptors we just
-				   initialized */
-				buf_dma->sg_cpu[nents].ddadr = buf_dma->sg_dma;
-			} else {
-				buf_dma->sg_cpu[nents].ddadr =
-					DDADR(pcdev->dma_chans[i]);
-			}
-
-			/* The next descriptor is the dummy descriptor */
-			DDADR(pcdev->dma_chans[i]) = buf_dma->sg_dma + nents *
-				sizeof(struct pxa_dma_desc);
-
-			DCSR(pcdev->dma_chans[i]) = DCSR_RUN;
-		}
-	}
+	if (!pcdev->active)
+		pxa_camera_start_capture(pcdev);
+	else
+		pxa_dma_start_channels(pcdev);
 
 	spin_unlock_irqrestore(&pcdev->lock, flags);
 }
@@ -635,7 +669,7 @@ static void pxa_camera_wakeup(struct pxa_camera_dev *pcdev,
 			      struct videobuf_buffer *vb,
 			      struct pxa_buffer *buf)
 {
-	unsigned long cicr0;
+	int i;
 
 	/* _init is used to debug races, see comment in pxa_camera_reqbufs() */
 	list_del_init(&vb->queue);
@@ -643,15 +677,13 @@ static void pxa_camera_wakeup(struct pxa_camera_dev *pcdev,
 	do_gettimeofday(&vb->ts);
 	vb->field_count++;
 	wake_up(&vb->done);
+	dev_dbg(pcdev->dev, "%s dequeud buffer (vb=0x%p)\n", __func__, vb);
 
 	if (list_empty(&pcdev->capture)) {
+		pxa_camera_stop_capture(pcdev);
 		pcdev->active = NULL;
-		DCSR(pcdev->dma_chans[0]) = 0;
-		DCSR(pcdev->dma_chans[1]) = 0;
-		DCSR(pcdev->dma_chans[2]) = 0;
-
-		cicr0 = __raw_readl(pcdev->base + CICR0) & ~CICR0_ENB;
-		__raw_writel(cicr0, pcdev->base + CICR0);
+		for (i = 0; i < pcdev->channels; i++)
+			pcdev->sg_tail[i] = NULL;
 		return;
 	}
 
@@ -666,19 +698,23 @@ static void pxa_camera_dma_irq(int channel, struct pxa_camera_dev *pcdev,
 	unsigned long flags;
 	u32 status, camera_status, overrun;
 	struct videobuf_buffer *vb;
-	unsigned long cifr, cicr0;
 
 	spin_lock_irqsave(&pcdev->lock, flags);
 
 	status = DCSR(channel);
-	DCSR(channel) = status | DCSR_ENDINTR;
+	DCSR(channel) = status | DCSR_STARTINTR | DCSR_ENDINTR;
+
+	camera_status = __raw_readl(pcdev->base + CISR);
+	overrun = CISR_IFO_0;
+	if (pcdev->channels == 3)
+		overrun |= CISR_IFO_1 | CISR_IFO_2;
 
 	if (status & DCSR_BUSERR) {
 		dev_err(pcdev->dev, "DMA Bus Error IRQ!\n");
 		goto out;
 	}
 
-	if (!(status & DCSR_ENDINTR)) {
+	if (!(status & (DCSR_ENDINTR | DCSR_STARTINTR))) {
 		dev_err(pcdev->dev, "Unknown DMA IRQ source, "
 			"status: 0x%08x\n", status);
 		goto out;
@@ -689,38 +725,27 @@ static void pxa_camera_dma_irq(int channel, struct pxa_camera_dev *pcdev,
 		goto out;
 	}
 
-	camera_status = __raw_readl(pcdev->base + CISR);
-	overrun = CISR_IFO_0;
-	if (pcdev->channels == 3)
-		overrun |= CISR_IFO_1 | CISR_IFO_2;
-	if (camera_status & overrun) {
-		dev_dbg(pcdev->dev, "FIFO overrun! CISR: %x\n", camera_status);
-		/* Stop the Capture Interface */
-		cicr0 = __raw_readl(pcdev->base + CICR0) & ~CICR0_ENB;
-		__raw_writel(cicr0, pcdev->base + CICR0);
-
-		/* Stop DMA */
-		DCSR(channel) = 0;
-		/* Reset the FIFOs */
-		cifr = __raw_readl(pcdev->base + CIFR) | CIFR_RESET_F;
-		__raw_writel(cifr, pcdev->base + CIFR);
-		/* Enable End-Of-Frame Interrupt */
-		cicr0 &= ~CICR0_EOFM;
-		__raw_writel(cicr0, pcdev->base + CICR0);
-		/* Restart the Capture Interface */
-		__raw_writel(cicr0 | CICR0_ENB, pcdev->base + CICR0);
-		goto out;
-	}
-
 	vb = &pcdev->active->vb;
 	buf = container_of(vb, struct pxa_buffer, vb);
 	WARN_ON(buf->inwork || list_empty(&vb->queue));
-	dev_dbg(pcdev->dev, "%s (vb=0x%p) 0x%08lx %d\n", __func__,
-		vb, vb->baddr, vb->bsize);
 
-	buf->active_dma &= ~act_dma;
-	if (!buf->active_dma)
-		pxa_camera_wakeup(pcdev, vb, buf);
+	dev_dbg(pcdev->dev, "%s channel=%d %s%s(vb=0x%p) dma.desc=%x\n",
+		__func__, channel, status & DCSR_STARTINTR ? "SOF " : "",
+		status & DCSR_ENDINTR ? "EOF " : "", vb, DDADR(channel));
+
+	if (status & DCSR_ENDINTR) {
+		if (camera_status & overrun) {
+			dev_dbg(pcdev->dev, "FIFO overrun! CISR: %x\n",
+				camera_status);
+			pxa_camera_stop_capture(pcdev);
+			pxa_camera_start_capture(pcdev);
+			goto out;
+		}
+
+		buf->active_dma &= ~act_dma;
+		if (!buf->active_dma)
+			pxa_camera_wakeup(pcdev, vb, buf);
+	}
 
 out:
 	spin_unlock_irqrestore(&pcdev->lock, flags);
@@ -859,12 +884,11 @@ static irqreturn_t pxa_camera_irq(int irq, void *data)
 	__raw_writel(status, pcdev->base + CISR);
 
 	if (status & CISR_EOF) {
-		int i;
-		for (i = 0; i < pcdev->channels; i++) {
-			DDADR(pcdev->dma_chans[i]) =
-				pcdev->active->dmas[i].sg_dma;
-			DCSR(pcdev->dma_chans[i]) = DCSR_RUN;
-		}
+		pcdev->active = list_first_entry(&pcdev->capture,
+					   struct pxa_buffer, vb.queue);
+
+		pxa_dma_start_channels(pcdev);
+
 		cicr0 = __raw_readl(pcdev->base + CICR0) | CICR0_EOFM;
 		__raw_writel(cicr0, pcdev->base + CICR0);
 	}
@@ -1404,18 +1428,8 @@ static int pxa_camera_resume(struct soc_camera_device *icd)
 		ret = pcdev->icd->ops->resume(pcdev->icd);
 
 	/* Restart frame capture if active buffer exists */
-	if (!ret && pcdev->active) {
-		unsigned long cifr, cicr0;
-
-		/* Reset the FIFOs */
-		cifr = __raw_readl(pcdev->base + CIFR) | CIFR_RESET_F;
-		__raw_writel(cifr, pcdev->base + CIFR);
-
-		cicr0 = __raw_readl(pcdev->base + CICR0);
-		cicr0 &= ~CICR0_EOFM;	/* Enable End-Of-Frame Interrupt */
-		cicr0 |= CICR0_ENB;	/* Restart the Capture Interface */
-		__raw_writel(cicr0, pcdev->base + CICR0);
-	}
+	if (!ret && pcdev->active)
+		pxa_camera_start_capture(pcdev);
 
 	return ret;
 }
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 3/4] pxa_camera: Coding style sweeping
  2009-03-05 19:45   ` [PATCH 2/4] pxa_camera: Redesign DMA handling Robert Jarzmik
@ 2009-03-05 19:45     ` Robert Jarzmik
  2009-03-05 19:45       ` [PATCH 4/4] pxa_camera: Fix overrun condition on last buffer Robert Jarzmik
  2009-03-09 11:35     ` [PATCH 2/4] pxa_camera: Redesign DMA handling Guennadi Liakhovetski
  1 sibling, 1 reply; 33+ messages in thread
From: Robert Jarzmik @ 2009-03-05 19:45 UTC (permalink / raw)
  To: g.liakhovetski, mike; +Cc: linux-media, Robert Jarzmik

Transform sequences of form:
	  foo = val1 | val2 |
	        val3 | val4;
into :
	  foo = val1 | val2
	        | val3 | val4;

Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
---
 drivers/media/video/pxa_camera.c |   43 ++++++++++++++++++-------------------
 1 files changed, 21 insertions(+), 22 deletions(-)

diff --git a/drivers/media/video/pxa_camera.c b/drivers/media/video/pxa_camera.c
index 2d79ded..16bf0a3 100644
--- a/drivers/media/video/pxa_camera.c
+++ b/drivers/media/video/pxa_camera.c
@@ -158,9 +158,9 @@
 #define CICR3_VSW_VAL(x)  (((x) << 11) & CICR3_VSW) /* Vertical sync pulse width */
 #define CICR3_LPF_VAL(x)  (((x) << 0) & CICR3_LPF)  /* Lines per frame */
 
-#define CICR0_IRQ_MASK (CICR0_TOM | CICR0_RDAVM | CICR0_FEM | CICR0_EOLM | \
-			CICR0_PERRM | CICR0_QDM | CICR0_CDM | CICR0_SOFM | \
-			CICR0_EOFM | CICR0_FOM)
+#define CICR0_IRQ_MASK (CICR0_TOM | CICR0_RDAVM | CICR0_FEM | CICR0_EOLM \
+			| CICR0_PERRM | CICR0_QDM | CICR0_CDM | CICR0_SOFM \
+			| CICR0_EOFM | CICR0_FOM)
 
 /*
  * Structures
@@ -429,10 +429,10 @@ static int pxa_videobuf_prepare(struct videobuf_queue *vq,
 	 * the actual buffer is yours */
 	buf->inwork = 1;
 
-	if (buf->fmt	!= icd->current_fmt ||
-	    vb->width	!= icd->width ||
-	    vb->height	!= icd->height ||
-	    vb->field	!= field) {
+	if (buf->fmt		!= icd->current_fmt
+	    || vb->width	!= icd->width
+	    || vb->height	!= icd->height
+	    || vb->field	!= field) {
 		buf->fmt	= icd->current_fmt;
 		vb->width	= icd->width;
 		vb->height	= icd->height;
@@ -960,13 +960,13 @@ static int test_platform_param(struct pxa_camera_dev *pcdev,
 	 * quick capture interface supports both.
 	 */
 	*flags = (pcdev->platform_flags & PXA_CAMERA_MASTER ?
-		  SOCAM_MASTER : SOCAM_SLAVE) |
-		SOCAM_HSYNC_ACTIVE_HIGH |
-		SOCAM_HSYNC_ACTIVE_LOW |
-		SOCAM_VSYNC_ACTIVE_HIGH |
-		SOCAM_VSYNC_ACTIVE_LOW |
-		SOCAM_PCLK_SAMPLE_RISING |
-		SOCAM_PCLK_SAMPLE_FALLING;
+		  SOCAM_MASTER : SOCAM_SLAVE)
+		| SOCAM_HSYNC_ACTIVE_HIGH
+		| SOCAM_HSYNC_ACTIVE_LOW
+		| SOCAM_VSYNC_ACTIVE_HIGH
+		| SOCAM_VSYNC_ACTIVE_LOW
+		| SOCAM_PCLK_SAMPLE_RISING
+		| SOCAM_PCLK_SAMPLE_FALLING;
 
 	/* If requested data width is supported by the platform, use it */
 	switch (buswidth) {
@@ -1094,8 +1094,8 @@ static int pxa_camera_set_bus_param(struct soc_camera_device *icd, __u32 pixfmt)
 		cicr1 |= CICR1_COLOR_SP_VAL(2);
 		break;
 	case V4L2_PIX_FMT_RGB555:
-		cicr1 |= CICR1_RGB_BPP_VAL(1) | CICR1_RGBT_CONV_VAL(2) |
-			CICR1_TBIT | CICR1_COLOR_SP_VAL(1);
+		cicr1 |= CICR1_RGB_BPP_VAL(1) | CICR1_RGBT_CONV_VAL(2)
+			| CICR1_TBIT | CICR1_COLOR_SP_VAL(1);
 		break;
 	case V4L2_PIX_FMT_RGB565:
 		cicr1 |= CICR1_COLOR_SP_VAL(1) | CICR1_RGB_BPP_VAL(2);
@@ -1103,8 +1103,8 @@ static int pxa_camera_set_bus_param(struct soc_camera_device *icd, __u32 pixfmt)
 	}
 
 	cicr2 = 0;
-	cicr3 = CICR3_LPF_VAL(icd->height - 1) |
-		CICR3_BFW_VAL(min((unsigned short)255, icd->y_skip_top));
+	cicr3 = CICR3_LPF_VAL(icd->height - 1)
+		| CICR3_BFW_VAL(min((unsigned short)255, icd->y_skip_top));
 	cicr4 |= pcdev->mclk_divisor;
 
 	__raw_writel(cicr1, pcdev->base + CICR1);
@@ -1372,8 +1372,7 @@ static unsigned int pxa_camera_poll(struct file *file, poll_table *pt)
 
 	poll_wait(file, &buf->vb.done, pt);
 
-	if (buf->vb.state == VIDEOBUF_DONE ||
-	    buf->vb.state == VIDEOBUF_ERROR)
+	if (buf->vb.state == VIDEOBUF_DONE || buf->vb.state == VIDEOBUF_ERROR)
 		return POLLIN|POLLRDNORM;
 
 	return 0;
@@ -1489,8 +1488,8 @@ static int pxa_camera_probe(struct platform_device *pdev)
 
 	pcdev->pdata = pdev->dev.platform_data;
 	pcdev->platform_flags = pcdev->pdata->flags;
-	if (!(pcdev->platform_flags & (PXA_CAMERA_DATAWIDTH_8 |
-			PXA_CAMERA_DATAWIDTH_9 | PXA_CAMERA_DATAWIDTH_10))) {
+	if (!(pcdev->platform_flags & (PXA_CAMERA_DATAWIDTH_8
+			| PXA_CAMERA_DATAWIDTH_9 | PXA_CAMERA_DATAWIDTH_10))) {
 		/* Platform hasn't set available data widths. This is bad.
 		 * Warn and use a default. */
 		dev_warn(&pdev->dev, "WARNING! Platform hasn't set available "
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 4/4] pxa_camera: Fix overrun condition on last buffer
  2009-03-05 19:45     ` [PATCH 3/4] pxa_camera: Coding style sweeping Robert Jarzmik
@ 2009-03-05 19:45       ` Robert Jarzmik
  2009-03-09 11:39         ` Guennadi Liakhovetski
  2009-03-11 18:31         ` Guennadi Liakhovetski
  0 siblings, 2 replies; 33+ messages in thread
From: Robert Jarzmik @ 2009-03-05 19:45 UTC (permalink / raw)
  To: g.liakhovetski, mike; +Cc: linux-media, Robert Jarzmik

The last buffer queued will often overrun, as the DMA chain
is finished, and the time the dma irq handler is activated,
the QIF fifos are filled by the sensor.

The fix is to ignore the overrun condition on the last
queued buffer, and restart the capture only on intermediate
buffers of the chain.

Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
---
 drivers/media/video/pxa_camera.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/media/video/pxa_camera.c b/drivers/media/video/pxa_camera.c
index 16bf0a3..dd56c35 100644
--- a/drivers/media/video/pxa_camera.c
+++ b/drivers/media/video/pxa_camera.c
@@ -734,14 +734,18 @@ static void pxa_camera_dma_irq(int channel, struct pxa_camera_dev *pcdev,
 		status & DCSR_ENDINTR ? "EOF " : "", vb, DDADR(channel));
 
 	if (status & DCSR_ENDINTR) {
-		if (camera_status & overrun) {
+		/*
+		 * It's normal if the last frame creates an overrun, as there
+		 * are no more DMA descriptors to fetch from QIF fifos
+		 */
+		if (camera_status & overrun
+		    && !list_is_last(pcdev->capture.next, &pcdev->capture)) {
 			dev_dbg(pcdev->dev, "FIFO overrun! CISR: %x\n",
 				camera_status);
 			pxa_camera_stop_capture(pcdev);
 			pxa_camera_start_capture(pcdev);
 			goto out;
 		}
-
 		buf->active_dma &= ~act_dma;
 		if (!buf->active_dma)
 			pxa_camera_wakeup(pcdev, vb, buf);
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/4] pxa_camera: Remove YUV planar formats hole
  2009-03-05 19:45 ` [PATCH 1/4] pxa_camera: Remove YUV planar formats hole Robert Jarzmik
  2009-03-05 19:45   ` [PATCH 2/4] pxa_camera: Redesign DMA handling Robert Jarzmik
@ 2009-03-05 20:29   ` Guennadi Liakhovetski
  2009-03-05 21:10     ` Robert Jarzmik
  2009-03-09 10:45   ` Guennadi Liakhovetski
  2 siblings, 1 reply; 33+ messages in thread
From: Guennadi Liakhovetski @ 2009-03-05 20:29 UTC (permalink / raw)
  To: Robert Jarzmik; +Cc: mike, Linux Media Mailing List

On Thu, 5 Mar 2009, Robert Jarzmik wrote:

> All planes were PAGE aligned (ie. 4096 bytes aligned). This
> is not consistent with YUV422 format, which requires Y, U
> and V planes glued together.  The new implementation forces
> the alignement on 8 bytes (DMA requirement), which is almost
> always the case (granted by width x height being a multiple
> of 8).

This is not a review yet - just an explanation why I was suggesting to 
adjust height and width - you say yourself, that YUV422P (I think, this is 
wat you meant, not just YUV422) requires planes to immediately follow one 
another. But you have to align them on 8 byte boundary for DMA, so, you 
violate the standard, right? If so, I would rather suggest to adjust width 
and height for planar formats to comply to the standard. Or have I 
misunderstood you?

Thanks
Guennadi

> 
> The test cases include tests in both YUV422 and RGB565 :
>  - a picture of size 111 x 111 (cross RAM pages example)
>  - a picture of size 1023 x 4 in (under 1 RAM page)
>  - a picture of size 1024 x 4 in (exactly 1 RAM page)
>  - a picture of size 1025 x 4 in (over 1 RAM page)
>  - a picture of size 1280 x 1024 (many RAM pages)
> 
> Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
> ---
>  drivers/media/video/pxa_camera.c |  165 ++++++++++++++++++++++++++------------
>  1 files changed, 114 insertions(+), 51 deletions(-)
> 
> diff --git a/drivers/media/video/pxa_camera.c b/drivers/media/video/pxa_camera.c
> index e3e6b29..54df071 100644
> --- a/drivers/media/video/pxa_camera.c
> +++ b/drivers/media/video/pxa_camera.c
> @@ -242,14 +242,13 @@ static int pxa_videobuf_setup(struct videobuf_queue *vq, unsigned int *count,
>  	dev_dbg(&icd->dev, "count=%d, size=%d\n", *count, *size);
>  
>  	/* planar capture requires Y, U and V buffers to be page aligned */
> -	if (pcdev->channels == 3) {
> -		*size = PAGE_ALIGN(icd->width * icd->height); /* Y pages */
> -		*size += PAGE_ALIGN(icd->width * icd->height / 2); /* U pages */
> -		*size += PAGE_ALIGN(icd->width * icd->height / 2); /* V pages */
> -	} else {
> -		*size = icd->width * icd->height *
> -			((icd->current_fmt->depth + 7) >> 3);
> -	}
> +	if (pcdev->channels == 3)
> +		*size = roundup(icd->width * icd->height, 8) /* Y pages */
> +			+ roundup(icd->width * icd->height / 2, 8) /* U pages */
> +			+ roundup(icd->width * icd->height / 2, 8); /* V pages */
> +	else
> +		*size = roundup(icd->width * icd->height *
> +				((icd->current_fmt->depth + 7) >> 3), 8);
>  
>  	if (0 == *count)
>  		*count = 32;
> @@ -289,19 +288,63 @@ static void free_buffer(struct videobuf_queue *vq, struct pxa_buffer *buf)
>  	buf->vb.state = VIDEOBUF_NEEDS_INIT;
>  }
>  
> +static int calculate_dma_sglen(struct scatterlist *sglist, int sglen,
> +			       int sg_first_ofs, int size)
> +{
> +	int i, offset, dma_len, xfer_len;
> +	struct scatterlist *sg;
> +
> +	offset = sg_first_ofs;
> +	for_each_sg(sglist, sg, sglen, i) {
> +		dma_len = sg_dma_len(sg);
> +
> +		/* PXA27x Developer's Manual 27.4.4.1: round up to 8 bytes */
> +		xfer_len = roundup(min(dma_len - offset, size), 8);
> +
> +		size = max(0, size - xfer_len);
> +		offset = 0;
> +		if (size == 0)
> +			break;
> +	}
> +
> +	BUG_ON(size != 0);
> +	return i + 1;
> +}
> +
> +/**
> + * pxa_init_dma_channel - init dma descriptors
> + * @pcdev: pxa camera device
> + * @buf: pxa buffer to find pxa dma channel
> + * @dma: dma video buffer
> + * @channel: dma channel (0 => 'Y', 1 => 'U', 2 => 'V')
> + * @cibr: camera read fifo
> + * @size: bytes to transfer
> + * @sg_first: index of first element of sg_list
> + * @sg_first_ofs: offset in first element of sg_list
> + *
> + * Prepares the pxa dma descriptors to transfer one camera channel.
> + * Beware sg_first and sg_first_ofs are both input and output parameters.
> + *
> + * Returns 0
> + */
>  static int pxa_init_dma_channel(struct pxa_camera_dev *pcdev,
>  				struct pxa_buffer *buf,
>  				struct videobuf_dmabuf *dma, int channel,
> -				int sglen, int sg_start, int cibr,
> -				unsigned int size)
> +				int cibr, int size,
> +				struct scatterlist **sg_first, int *sg_first_ofs)
>  {
>  	struct pxa_cam_dma *pxa_dma = &buf->dmas[channel];
> -	int i;
> +	struct scatterlist *sg;
> +	int i, offset, sglen;
> +	int dma_len = 0, xfer_len = 0;
>  
>  	if (pxa_dma->sg_cpu)
>  		dma_free_coherent(pcdev->dev, pxa_dma->sg_size,
>  				  pxa_dma->sg_cpu, pxa_dma->sg_dma);
>  
> +	sglen = calculate_dma_sglen(*sg_first, dma->sglen,
> +				    *sg_first_ofs, size);
> +
>  	pxa_dma->sg_size = (sglen + 1) * sizeof(struct pxa_dma_desc);
>  	pxa_dma->sg_cpu = dma_alloc_coherent(pcdev->dev, pxa_dma->sg_size,
>  					     &pxa_dma->sg_dma, GFP_KERNEL);
> @@ -309,27 +352,51 @@ static int pxa_init_dma_channel(struct pxa_camera_dev *pcdev,
>  		return -ENOMEM;
>  
>  	pxa_dma->sglen = sglen;
> +	offset = *sg_first_ofs;
>  
> -	for (i = 0; i < sglen; i++) {
> -		int sg_i = sg_start + i;
> -		struct scatterlist *sg = dma->sglist;
> -		unsigned int dma_len = sg_dma_len(&sg[sg_i]), xfer_len;
> +	dev_dbg(pcdev->dev, "DMA: sg_first=%p, sglen=%d, ofs=%d, dma.desc=%x\n",
> +		*sg_first, sglen, *sg_first_ofs, pxa_dma->sg_dma);
>  
> -		pxa_dma->sg_cpu[i].dsadr = pcdev->res->start + cibr;
> -		pxa_dma->sg_cpu[i].dtadr = sg_dma_address(&sg[sg_i]);
> +
> +	for_each_sg(*sg_first, sg, sglen, i) {
> +		dma_len = sg_dma_len(sg);
>  
>  		/* PXA27x Developer's Manual 27.4.4.1: round up to 8 bytes */
> -		xfer_len = (min(dma_len, size) + 7) & ~7;
> +		xfer_len = roundup(min(dma_len - offset, size), 8);
>  
> +		size = max(0, size - xfer_len);
> +
> +		pxa_dma->sg_cpu[i].dsadr = pcdev->res->start + cibr;
> +		pxa_dma->sg_cpu[i].dtadr = sg_dma_address(sg) + offset;
>  		pxa_dma->sg_cpu[i].dcmd =
>  			DCMD_FLOWSRC | DCMD_BURST8 | DCMD_INCTRGADDR | xfer_len;
> -		size -= dma_len;
>  		pxa_dma->sg_cpu[i].ddadr =
>  			pxa_dma->sg_dma + (i + 1) * sizeof(struct pxa_dma_desc);
> +
> +		dev_vdbg(pcdev->dev, "DMA: desc.%08x->@phys=0x%08x, len=%d\n",
> +			 pxa_dma->sg_dma + i * sizeof(struct pxa_dma_desc),
> +			 sg_dma_address(sg) + offset, xfer_len);
> +		offset = 0;
> +
> +		if (size == 0)
> +			break;
>  	}
>  
> -	pxa_dma->sg_cpu[sglen - 1].ddadr = DDADR_STOP;
> -	pxa_dma->sg_cpu[sglen - 1].dcmd |= DCMD_ENDIRQEN;
> +	pxa_dma->sg_cpu[sglen].ddadr = DDADR_STOP;
> +	pxa_dma->sg_cpu[sglen].dcmd  = DCMD_FLOWSRC | DCMD_BURST8 | DCMD_ENDIRQEN;
> +
> +	*sg_first_ofs = xfer_len;
> +	/*
> +	 * Handle 1 special case :
> +	 *  - if we finish the DMA transfer in the last 7 bytes of a RAM page
> +	 *    then we return the sg element pointing on the next page
> +	 */
> +	if (*sg_first_ofs >= dma_len) {
> +		*sg_first_ofs -= dma_len;
> +		*sg_first = sg_next(sg);
> +	} else {
> +		*sg_first = sg;
> +	}
>  
>  	return 0;
>  }
> @@ -342,9 +409,7 @@ static int pxa_videobuf_prepare(struct videobuf_queue *vq,
>  	struct pxa_camera_dev *pcdev = ici->priv;
>  	struct pxa_buffer *buf = container_of(vb, struct pxa_buffer, vb);
>  	int ret;
> -	int sglen_y,  sglen_yu = 0, sglen_u = 0, sglen_v = 0;
> -	int size_y, size_u = 0, size_v = 0;
> -
> +	int size_y = 0, size_u = 0, size_v = 0;
>  	dev_dbg(&icd->dev, "%s (vb=0x%p) 0x%08lx %d\n", __func__,
>  		vb, vb->baddr, vb->bsize);
>  
> @@ -381,53 +446,51 @@ static int pxa_videobuf_prepare(struct videobuf_queue *vq,
>  	}
>  
>  	if (vb->state == VIDEOBUF_NEEDS_INIT) {
> -		unsigned int size = vb->size;
> +		int size = vb->size;
> +		int next_ofs = 0;
>  		struct videobuf_dmabuf *dma = videobuf_to_dma(vb);
> +		struct scatterlist *sg;
>  
>  		ret = videobuf_iolock(vq, vb, NULL);
>  		if (ret)
>  			goto fail;
>  
>  		if (pcdev->channels == 3) {
> -			/* FIXME the calculations should be more precise */
> -			sglen_y = dma->sglen / 2;
> -			sglen_u = sglen_v = dma->sglen / 4 + 1;
> -			sglen_yu = sglen_y + sglen_u;
>  			size_y = size / 2;
>  			size_u = size_v = size / 4;
>  		} else {
> -			sglen_y = dma->sglen;
>  			size_y = size;
>  		}
>  
> -		/* init DMA for Y channel */
> -		ret = pxa_init_dma_channel(pcdev, buf, dma, 0, sglen_y,
> -					   0, 0x28, size_y);
> +		sg = dma->sglist;
>  
> +		/* init DMA for Y channel */
> +		ret = pxa_init_dma_channel(pcdev, buf, dma, 0, CIBR0, size_y,
> +					   &sg, &next_ofs);
>  		if (ret) {
>  			dev_err(pcdev->dev,
>  				"DMA initialization for Y/RGB failed\n");
>  			goto fail;
>  		}
>  
> -		if (pcdev->channels == 3) {
> -			/* init DMA for U channel */
> -			ret = pxa_init_dma_channel(pcdev, buf, dma, 1, sglen_u,
> -						   sglen_y, 0x30, size_u);
> -			if (ret) {
> -				dev_err(pcdev->dev,
> -					"DMA initialization for U failed\n");
> -				goto fail_u;
> -			}
> +		/* init DMA for U channel */
> +		if (size_u)
> +			ret = pxa_init_dma_channel(pcdev, buf, dma, 1, CIBR1,
> +						   size_u, &sg, &next_ofs);
> +		if (ret) {
> +			dev_err(pcdev->dev,
> +				"DMA initialization for U failed\n");
> +			goto fail_u;
> +		}
>  
> -			/* init DMA for V channel */
> -			ret = pxa_init_dma_channel(pcdev, buf, dma, 2, sglen_v,
> -						   sglen_yu, 0x38, size_v);
> -			if (ret) {
> -				dev_err(pcdev->dev,
> -					"DMA initialization for V failed\n");
> -				goto fail_v;
> -			}
> +		/* init DMA for V channel */
> +		if (size_v)
> +			ret = pxa_init_dma_channel(pcdev, buf, dma, 2, CIBR2,
> +						   size_u, &sg, &next_ofs);
> +		if (ret) {
> +			dev_err(pcdev->dev,
> +				"DMA initialization for V failed\n");
> +			goto fail_v;
>  		}
>  
>  		vb->state = VIDEOBUF_PREPARED;
> -- 
> 1.5.6.5
> 

---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/4] pxa_camera: Remove YUV planar formats hole
  2009-03-05 20:29   ` [PATCH 1/4] pxa_camera: Remove YUV planar formats hole Guennadi Liakhovetski
@ 2009-03-05 21:10     ` Robert Jarzmik
  2009-03-05 21:22       ` Trent Piepho
  0 siblings, 1 reply; 33+ messages in thread
From: Robert Jarzmik @ 2009-03-05 21:10 UTC (permalink / raw)
  To: Guennadi Liakhovetski; +Cc: mike, Linux Media Mailing List

Guennadi Liakhovetski <g.liakhovetski@gmx.de> writes:

> This is not a review yet - just an explanation why I was suggesting to 
> adjust height and width - you say yourself, that YUV422P (I think, this is 
> wat you meant, not just YUV422) requires planes to immediately follow one 
> another. But you have to align them on 8 byte boundary for DMA, so, you 
> violate the standard, right? If so, I would rather suggest to adjust width 
> and height for planar formats to comply to the standard. Or have I 
> misunderstood you?
No, you understand perfectly.

And now, what do we do :
 - adjust height ?
 - adjust height ?
 - adjust both ?

I couldn't decide which one, any hint ?

--
Robert

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/4] pxa_camera: Remove YUV planar formats hole
  2009-03-05 21:10     ` Robert Jarzmik
@ 2009-03-05 21:22       ` Trent Piepho
  2009-03-05 22:15         ` Guennadi Liakhovetski
  0 siblings, 1 reply; 33+ messages in thread
From: Trent Piepho @ 2009-03-05 21:22 UTC (permalink / raw)
  To: Robert Jarzmik; +Cc: Guennadi Liakhovetski, mike, Linux Media Mailing List

On Thu, 5 Mar 2009, Robert Jarzmik wrote:
> Guennadi Liakhovetski <g.liakhovetski@gmx.de> writes:
>
> > This is not a review yet - just an explanation why I was suggesting to
> > adjust height and width - you say yourself, that YUV422P (I think, this is
> > wat you meant, not just YUV422) requires planes to immediately follow one
> > another. But you have to align them on 8 byte boundary for DMA, so, you
> > violate the standard, right? If so, I would rather suggest to adjust width
> > and height for planar formats to comply to the standard. Or have I
> > misunderstood you?
> No, you understand perfectly.
>
> And now, what do we do :
>  - adjust height ?
>  - adjust height ?
>  - adjust both ?
>
> I couldn't decide which one, any hint ?

Shame the planes have to be contiguous.  Software like ffmpeg doesn't
require this and could handle planes with gaps between them without
trouble.  Plans aligned on 8 bytes boundaries would probably be faster in
fact.  Be better if v4l2_buffer gave us offsets for each plane.

If you must adjust, probably better to adjust both.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/4] pxa_camera: Remove YUV planar formats hole
  2009-03-05 21:22       ` Trent Piepho
@ 2009-03-05 22:15         ` Guennadi Liakhovetski
  2009-03-06  9:30           ` Trent Piepho
  0 siblings, 1 reply; 33+ messages in thread
From: Guennadi Liakhovetski @ 2009-03-05 22:15 UTC (permalink / raw)
  To: Trent Piepho; +Cc: Robert Jarzmik, mike, Linux Media Mailing List

On Thu, 5 Mar 2009, Trent Piepho wrote:

> On Thu, 5 Mar 2009, Robert Jarzmik wrote:
> > Guennadi Liakhovetski <g.liakhovetski@gmx.de> writes:
> >
> > > This is not a review yet - just an explanation why I was suggesting to
> > > adjust height and width - you say yourself, that YUV422P (I think, this is
> > > wat you meant, not just YUV422) requires planes to immediately follow one
> > > another. But you have to align them on 8 byte boundary for DMA, so, you
> > > violate the standard, right? If so, I would rather suggest to adjust width
> > > and height for planar formats to comply to the standard. Or have I
> > > misunderstood you?
> > No, you understand perfectly.
> >
> > And now, what do we do :
> >  - adjust height ?
> >  - adjust height ?
> >  - adjust both ?
> >
> > I couldn't decide which one, any hint ?
> 
> Shame the planes have to be contiguous.  Software like ffmpeg doesn't
> require this and could handle planes with gaps between them without
> trouble.  Plans aligned on 8 bytes boundaries would probably be faster in
> fact.  Be better if v4l2_buffer gave us offsets for each plane.
> 
> If you must adjust, probably better to adjust both.

Yes, adjusting both is also what I was suggesting in my original review. 
How about aligning the bigger of the two to 4 bytes and the smaller to 2? 

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/4] pxa_camera: Remove YUV planar formats hole
  2009-03-05 22:15         ` Guennadi Liakhovetski
@ 2009-03-06  9:30           ` Trent Piepho
  0 siblings, 0 replies; 33+ messages in thread
From: Trent Piepho @ 2009-03-06  9:30 UTC (permalink / raw)
  To: Guennadi Liakhovetski; +Cc: Robert Jarzmik, mike, Linux Media Mailing List

On Thu, 5 Mar 2009, Guennadi Liakhovetski wrote:
> On Thu, 5 Mar 2009, Trent Piepho wrote:
> > On Thu, 5 Mar 2009, Robert Jarzmik wrote:
> > > Guennadi Liakhovetski <g.liakhovetski@gmx.de> writes:
> > > > This is not a review yet - just an explanation why I was suggesting to
> > > > adjust height and width - you say yourself, that YUV422P (I think, this is
> > > > wat you meant, not just YUV422) requires planes to immediately follow one
> > > > another. But you have to align them on 8 byte boundary for DMA, so, you
> > > > violate the standard, right? If so, I would rather suggest to adjust width
> > > > and height for planar formats to comply to the standard. Or have I
> > > > misunderstood you?
> > > No, you understand perfectly.
> > >
> > > And now, what do we do :
> > >  - adjust height ?
> > >  - adjust height ?
> > >  - adjust both ?
> > >
> > > I couldn't decide which one, any hint ?
> >
> > Shame the planes have to be contiguous.  Software like ffmpeg doesn't
> > require this and could handle planes with gaps between them without
> > trouble.  Plans aligned on 8 bytes boundaries would probably be faster in
> > fact.  Be better if v4l2_buffer gave us offsets for each plane.
> >
> > If you must adjust, probably better to adjust both.
>
> Yes, adjusting both is also what I was suggesting in my original review.
> How about aligning the bigger of the two to 4 bytes and the smaller to 2?

In order to 8 byte align the end of the first chroma plane, doesn't the
size need to be a multiple of 32, to take into account the chroma
decimation, assuming yuv 4:2:0?

Here is some code I could put this code into v4l that solves this.

In this case one could say:
v4l_bound_align_image(*width, 48, 640, 2, *height, 32, 480, 0, 5);
That will give you width between 48 and 640 that's a multiple of 4, a
height between 32 and 480, and the image size will be a multiple of 32.  It
will try to adjust the image size as little as possible to get it to work.

For example, give it 175x241 and it comes back with 176x242.  As opposed to
something like 192x241 or 172x240, which are also valid but more different
than the requested size.

/* Clamp x to be between min and max, aligned to a multiple of 2^align.  min
 * and max don't have to be aligned, but there must be at least one valid
 * value.  E.g., min=17,max=31,align=4 is not allowed as there are no multiples
 * of 16 between 17 and 31.  */
static unsigned int clamp_align(unsigned int x, unsigned int min,
                                unsigned int max, unsigned int align)
{
        /* Bits that must be zero to be aligned */
        unsigned int mask = (1 << align) - 1;

        /* Round to nearest aligned value */
        if (x & (1 << (align - 1)))
                x += mask; /* make x&=~mask round up */
        x &= ~mask;

        /* Clamp to aligned value of min and max */
        if (x < min)
                x = (min + mask) & ~mask;
        else if (x > max)
                x = max & ~mask;

        return x;
}

/* Bound an image to have a width between wmin and wmax, and height between
 * hmin and hmax, inclusive.  Additionally, the width will be a multiple of
 * 2^walign, the height will be a multiple of 2^halign, and the overall size
 * (width*height) will be a multiple of 2^salign.  May shrink or enlarge the
 * image to fit the alignment constraints.  The width or height maximum must
 * not be smaller than the corresponding minimum.  The alignments must not be
 * so high there are no possible image sizes within the allowed bounds.
 */
void v4l_bound_align_image(unsigned int *width, unsigned int wmin,
                           unsigned int wmax, unsigned int walign,
                           unsigned int *height, unsigned int hmin,
                           unsigned int hmax, unsigned int halign,
                           unsigned int salign)
{
        *width = clamp_align(*width, wmin, wmax, walign);
        *height = clamp_align(*height, hmin, hmax, halign);

        /* How much alignment do we have? */
        walign = __ffs(*width);
        halign = __ffs(*height);
        /* Enough to satisfy the image alignment? */
        if (walign + halign < salign) {
                /* Max walign where there is still a valid width */
                unsigned int wmaxa = __fls(wmax ^ (wmin - 1));

                salign -= walign + halign;
                /* up the smaller alignment until we have enough */
                do {
                        if (walign <= halign && walign < wmaxa)
                                walign++;
                        else
                                halign++;
                } while(--salign);
                *width = clamp_align(*width, wmin, wmax, walign);
                *height = clamp_align(*height, hmin, hmax, halign);
        }
}

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/4] pxa_camera: Remove YUV planar formats hole
  2009-03-05 19:45 ` [PATCH 1/4] pxa_camera: Remove YUV planar formats hole Robert Jarzmik
  2009-03-05 19:45   ` [PATCH 2/4] pxa_camera: Redesign DMA handling Robert Jarzmik
  2009-03-05 20:29   ` [PATCH 1/4] pxa_camera: Remove YUV planar formats hole Guennadi Liakhovetski
@ 2009-03-09 10:45   ` Guennadi Liakhovetski
  2009-03-09 19:13     ` Robert Jarzmik
  2 siblings, 1 reply; 33+ messages in thread
From: Guennadi Liakhovetski @ 2009-03-09 10:45 UTC (permalink / raw)
  To: Robert Jarzmik; +Cc: mike, Linux Media Mailing List

On Thu, 5 Mar 2009, Robert Jarzmik wrote:

> All planes were PAGE aligned (ie. 4096 bytes aligned). This
> is not consistent with YUV422 format, which requires Y, U
> and V planes glued together.  The new implementation forces
> the alignement on 8 bytes (DMA requirement), which is almost
> always the case (granted by width x height being a multiple
> of 8).
> 
> The test cases include tests in both YUV422 and RGB565 :
>  - a picture of size 111 x 111 (cross RAM pages example)
>  - a picture of size 1023 x 4 in (under 1 RAM page)
>  - a picture of size 1024 x 4 in (exactly 1 RAM page)
>  - a picture of size 1025 x 4 in (over 1 RAM page)
>  - a picture of size 1280 x 1024 (many RAM pages)
> 
> Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
> ---
>  drivers/media/video/pxa_camera.c |  165 ++++++++++++++++++++++++++------------
>  1 files changed, 114 insertions(+), 51 deletions(-)
> 
> diff --git a/drivers/media/video/pxa_camera.c b/drivers/media/video/pxa_camera.c
> index e3e6b29..54df071 100644
> --- a/drivers/media/video/pxa_camera.c
> +++ b/drivers/media/video/pxa_camera.c
> @@ -242,14 +242,13 @@ static int pxa_videobuf_setup(struct videobuf_queue *vq, unsigned int *count,
>  	dev_dbg(&icd->dev, "count=%d, size=%d\n", *count, *size);
>  
>  	/* planar capture requires Y, U and V buffers to be page aligned */
> -	if (pcdev->channels == 3) {
> -		*size = PAGE_ALIGN(icd->width * icd->height); /* Y pages */
> -		*size += PAGE_ALIGN(icd->width * icd->height / 2); /* U pages */
> -		*size += PAGE_ALIGN(icd->width * icd->height / 2); /* V pages */
> -	} else {
> -		*size = icd->width * icd->height *
> -			((icd->current_fmt->depth + 7) >> 3);
> -	}
> +	if (pcdev->channels == 3)
> +		*size = roundup(icd->width * icd->height, 8) /* Y pages */
> +			+ roundup(icd->width * icd->height / 2, 8) /* U pages */
> +			+ roundup(icd->width * icd->height / 2, 8); /* V pages */
> +	else
> +		*size = roundup(icd->width * icd->height *
> +				((icd->current_fmt->depth + 7) >> 3), 8);
>  
>  	if (0 == *count)
>  		*count = 32;

Ok, this one will change I presume - new alignment calculations and 
line-breaking. In fact, if you adjust width and height earlier in set_fmt, 
maybe you'll just remove any rounding here completely.

> @@ -289,19 +288,63 @@ static void free_buffer(struct videobuf_queue *vq, struct pxa_buffer *buf)
>  	buf->vb.state = VIDEOBUF_NEEDS_INIT;
>  }
>  
> +static int calculate_dma_sglen(struct scatterlist *sglist, int sglen,
> +			       int sg_first_ofs, int size)
> +{
> +	int i, offset, dma_len, xfer_len;
> +	struct scatterlist *sg;
> +
> +	offset = sg_first_ofs;
> +	for_each_sg(sglist, sg, sglen, i) {
> +		dma_len = sg_dma_len(sg);
> +
> +		/* PXA27x Developer's Manual 27.4.4.1: round up to 8 bytes */
> +		xfer_len = roundup(min(dma_len - offset, size), 8);

Ok, let's see. dma_len is PAGE_SIZE. offset is for Y-plane 0, for further 
planes it will be aligned after we recalculate width and height. size will 
be aligned too, so, roundup will disappear, right? You might want to just 
just add a test for these. The calculation itself gives size >= xfer_len

> +
> +		size = max(0, size - xfer_len);

So, max is useless here, just "size -= xfer_len."

> +		offset = 0;
> +		if (size == 0)
> +			break;
> +	}
> +
> +	BUG_ON(size != 0);
> +	return i + 1;
> +}
> +
> +/**
> + * pxa_init_dma_channel - init dma descriptors
> + * @pcdev: pxa camera device
> + * @buf: pxa buffer to find pxa dma channel
> + * @dma: dma video buffer
> + * @channel: dma channel (0 => 'Y', 1 => 'U', 2 => 'V')
> + * @cibr: camera read fifo
> + * @size: bytes to transfer
> + * @sg_first: index of first element of sg_list
> + * @sg_first_ofs: offset in first element of sg_list
> + *
> + * Prepares the pxa dma descriptors to transfer one camera channel.
> + * Beware sg_first and sg_first_ofs are both input and output parameters.
> + *
> + * Returns 0
> + */
>  static int pxa_init_dma_channel(struct pxa_camera_dev *pcdev,
>  				struct pxa_buffer *buf,
>  				struct videobuf_dmabuf *dma, int channel,
> -				int sglen, int sg_start, int cibr,
> -				unsigned int size)
> +				int cibr, int size,
> +				struct scatterlist **sg_first, int *sg_first_ofs)
>  {
>  	struct pxa_cam_dma *pxa_dma = &buf->dmas[channel];
> -	int i;
> +	struct scatterlist *sg;
> +	int i, offset, sglen;
> +	int dma_len = 0, xfer_len = 0;
>  
>  	if (pxa_dma->sg_cpu)
>  		dma_free_coherent(pcdev->dev, pxa_dma->sg_size,
>  				  pxa_dma->sg_cpu, pxa_dma->sg_dma);
>  
> +	sglen = calculate_dma_sglen(*sg_first, dma->sglen,
> +				    *sg_first_ofs, size);
> +
>  	pxa_dma->sg_size = (sglen + 1) * sizeof(struct pxa_dma_desc);
>  	pxa_dma->sg_cpu = dma_alloc_coherent(pcdev->dev, pxa_dma->sg_size,
>  					     &pxa_dma->sg_dma, GFP_KERNEL);
> @@ -309,27 +352,51 @@ static int pxa_init_dma_channel(struct pxa_camera_dev *pcdev,
>  		return -ENOMEM;
>  
>  	pxa_dma->sglen = sglen;
> +	offset = *sg_first_ofs;
>  
> -	for (i = 0; i < sglen; i++) {
> -		int sg_i = sg_start + i;
> -		struct scatterlist *sg = dma->sglist;
> -		unsigned int dma_len = sg_dma_len(&sg[sg_i]), xfer_len;
> +	dev_dbg(pcdev->dev, "DMA: sg_first=%p, sglen=%d, ofs=%d, dma.desc=%x\n",
> +		*sg_first, sglen, *sg_first_ofs, pxa_dma->sg_dma);
>  
> -		pxa_dma->sg_cpu[i].dsadr = pcdev->res->start + cibr;
> -		pxa_dma->sg_cpu[i].dtadr = sg_dma_address(&sg[sg_i]);
> +
> +	for_each_sg(*sg_first, sg, sglen, i) {
> +		dma_len = sg_dma_len(sg);
>  
>  		/* PXA27x Developer's Manual 27.4.4.1: round up to 8 bytes */
> -		xfer_len = (min(dma_len, size) + 7) & ~7;
> +		xfer_len = roundup(min(dma_len - offset, size), 8);
>  
> +		size = max(0, size - xfer_len);

Same here for roundup() and max().

> +
> +		pxa_dma->sg_cpu[i].dsadr = pcdev->res->start + cibr;
> +		pxa_dma->sg_cpu[i].dtadr = sg_dma_address(sg) + offset;
>  		pxa_dma->sg_cpu[i].dcmd =
>  			DCMD_FLOWSRC | DCMD_BURST8 | DCMD_INCTRGADDR | xfer_len;
> -		size -= dma_len;
>  		pxa_dma->sg_cpu[i].ddadr =
>  			pxa_dma->sg_dma + (i + 1) * sizeof(struct pxa_dma_desc);
> +
> +		dev_vdbg(pcdev->dev, "DMA: desc.%08x->@phys=0x%08x, len=%d\n",
> +			 pxa_dma->sg_dma + i * sizeof(struct pxa_dma_desc),
> +			 sg_dma_address(sg) + offset, xfer_len);
> +		offset = 0;
> +
> +		if (size == 0)
> +			break;
>  	}
>  
> -	pxa_dma->sg_cpu[sglen - 1].ddadr = DDADR_STOP;
> -	pxa_dma->sg_cpu[sglen - 1].dcmd |= DCMD_ENDIRQEN;
> +	pxa_dma->sg_cpu[sglen].ddadr = DDADR_STOP;
> +	pxa_dma->sg_cpu[sglen].dcmd  = DCMD_FLOWSRC | DCMD_BURST8 | DCMD_ENDIRQEN;

Why are you now always using the n+1'th element? Even if it is right, it 
rather belongs to the patch "2/4," not "1/4," right? In your earlier email 
you wrote:

>  - in the former pxa_videobuf_queue(), when a buffer was queued while another
>  was already active, a dummy descriptor was added, and then the new buffer was
>  chained with the actively running buffer. See code below :
> 
> -                       } else {
> -                               buf_dma->sg_cpu[nents].ddadr =
> -                                       DDADR(pcdev->dma_chans[i]);
> -                       }
> -
> -                       /* The next descriptor is the dummy descriptor */
> -                       DDADR(pcdev->dma_chans[i]) = buf_dma->sg_dma + nents *
> -                               sizeof(struct pxa_dma_desc);
> 
>    The fix is in the code refactoring, as now the buffer is always added at the
>    tail of the queue through pxa_dma_add_tail_buf().

I don't understand, what this is fixing. It would make a nice 
simplification, if it worked, but see my review to patch "2/4."

> +
> +	*sg_first_ofs = xfer_len;
> +	/*
> +	 * Handle 1 special case :
> +	 *  - if we finish the DMA transfer in the last 7 bytes of a RAM page
> +	 *    then we return the sg element pointing on the next page
> +	 */
> +	if (*sg_first_ofs >= dma_len) {
> +		*sg_first_ofs -= dma_len;
> +		*sg_first = sg_next(sg);
> +	} else {
> +		*sg_first = sg;
> +	}

As we will not be rounding up any more, this special case shouldn't be 
needed either, right?

>  
>  	return 0;
>  }
> @@ -342,9 +409,7 @@ static int pxa_videobuf_prepare(struct videobuf_queue *vq,
>  	struct pxa_camera_dev *pcdev = ici->priv;
>  	struct pxa_buffer *buf = container_of(vb, struct pxa_buffer, vb);
>  	int ret;
> -	int sglen_y,  sglen_yu = 0, sglen_u = 0, sglen_v = 0;
> -	int size_y, size_u = 0, size_v = 0;
> -
> +	int size_y = 0, size_u = 0, size_v = 0;

Isn't size_y always initialised?

>  	dev_dbg(&icd->dev, "%s (vb=0x%p) 0x%08lx %d\n", __func__,
>  		vb, vb->baddr, vb->bsize);
>  
> @@ -381,53 +446,51 @@ static int pxa_videobuf_prepare(struct videobuf_queue *vq,
>  	}
>  
>  	if (vb->state == VIDEOBUF_NEEDS_INIT) {
> -		unsigned int size = vb->size;
> +		int size = vb->size;
> +		int next_ofs = 0;
>  		struct videobuf_dmabuf *dma = videobuf_to_dma(vb);
> +		struct scatterlist *sg;
>  
>  		ret = videobuf_iolock(vq, vb, NULL);
>  		if (ret)
>  			goto fail;
>  
>  		if (pcdev->channels == 3) {
> -			/* FIXME the calculations should be more precise */
> -			sglen_y = dma->sglen / 2;
> -			sglen_u = sglen_v = dma->sglen / 4 + 1;
> -			sglen_yu = sglen_y + sglen_u;
>  			size_y = size / 2;
>  			size_u = size_v = size / 4;
>  		} else {
> -			sglen_y = dma->sglen;
>  			size_y = size;
>  		}
>  
> -		/* init DMA for Y channel */
> -		ret = pxa_init_dma_channel(pcdev, buf, dma, 0, sglen_y,
> -					   0, 0x28, size_y);
> +		sg = dma->sglist;
>  
> +		/* init DMA for Y channel */
> +		ret = pxa_init_dma_channel(pcdev, buf, dma, 0, CIBR0, size_y,
> +					   &sg, &next_ofs);
>  		if (ret) {
>  			dev_err(pcdev->dev,
>  				"DMA initialization for Y/RGB failed\n");
>  			goto fail;
>  		}
>  
> -		if (pcdev->channels == 3) {
> -			/* init DMA for U channel */
> -			ret = pxa_init_dma_channel(pcdev, buf, dma, 1, sglen_u,
> -						   sglen_y, 0x30, size_u);
> -			if (ret) {
> -				dev_err(pcdev->dev,
> -					"DMA initialization for U failed\n");
> -				goto fail_u;
> -			}
> +		/* init DMA for U channel */
> +		if (size_u)
> +			ret = pxa_init_dma_channel(pcdev, buf, dma, 1, CIBR1,
> +						   size_u, &sg, &next_ofs);
> +		if (ret) {
> +			dev_err(pcdev->dev,
> +				"DMA initialization for U failed\n");
> +			goto fail_u;
> +		}
>  
> -			/* init DMA for V channel */
> -			ret = pxa_init_dma_channel(pcdev, buf, dma, 2, sglen_v,
> -						   sglen_yu, 0x38, size_v);
> -			if (ret) {
> -				dev_err(pcdev->dev,
> -					"DMA initialization for V failed\n");
> -				goto fail_v;
> -			}
> +		/* init DMA for V channel */
> +		if (size_v)
> +			ret = pxa_init_dma_channel(pcdev, buf, dma, 2, CIBR2,
> +						   size_u, &sg, &next_ofs);
> +		if (ret) {
> +			dev_err(pcdev->dev,
> +				"DMA initialization for V failed\n");
> +			goto fail_v;
>  		}
>  
>  		vb->state = VIDEOBUF_PREPARED;
> -- 
> 1.5.6.5
> 

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/4] pxa_camera: Redesign DMA handling
  2009-03-05 19:45   ` [PATCH 2/4] pxa_camera: Redesign DMA handling Robert Jarzmik
  2009-03-05 19:45     ` [PATCH 3/4] pxa_camera: Coding style sweeping Robert Jarzmik
@ 2009-03-09 11:35     ` Guennadi Liakhovetski
  2009-03-09 20:50       ` Robert Jarzmik
  1 sibling, 1 reply; 33+ messages in thread
From: Guennadi Liakhovetski @ 2009-03-09 11:35 UTC (permalink / raw)
  To: Robert Jarzmik; +Cc: g.liakhovetski, mike, Linux Media Mailing List

On Thu, 5 Mar 2009, Robert Jarzmik wrote:

> The DMA transfers in pxa_camera showed some weaknesses in
> multiple queued buffers context :
>  - poll/select problem
>    The order between list pcdev->capture and DMA chain was
>    not the same. This creates a discrepancy between video
>    buffers marked as "done" by the IRQ handler, and the
>    really finished video buffer.
> 
>    The bug shows up with capture_example tool from v4l2 hg
>    tree. The process just "stalls" on a "select timeout".
> 
>    The key problem is in pxa_videobuf_queue(), where the
>    queued buffer is chained before the active buffer, while
>    it should have been the active buffer first, and queued
>    buffer tailed after.
> 
>  - multiple buffers DMA starting
>    When multiple buffers were queued, the DMA channels were
>    always started right away. This is not optimal, as a
>    special case appears when the first EOF was not yet
>    reached, and the DMA channels were prematurely started.
> 
>  - Maintainability
>    DMA code was a bit obfuscated. Rationalize the code to be
>    easily maintainable by anyone.
> 
> This patch attemps to address these issues.
> 
> Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
> ---
>  drivers/media/video/pxa_camera.c |  264 ++++++++++++++++++++------------------
>  1 files changed, 139 insertions(+), 125 deletions(-)
> 
> diff --git a/drivers/media/video/pxa_camera.c b/drivers/media/video/pxa_camera.c
> index 54df071..2d79ded 100644
> --- a/drivers/media/video/pxa_camera.c
> +++ b/drivers/media/video/pxa_camera.c
> @@ -325,7 +325,7 @@ static int calculate_dma_sglen(struct scatterlist *sglist, int sglen,
>   * Prepares the pxa dma descriptors to transfer one camera channel.
>   * Beware sg_first and sg_first_ofs are both input and output parameters.
>   *
> - * Returns 0
> + * Returns 0 or -ENOMEM si no coherent memory is available

Let's stay with English for now:-) s/si/if/

>   */
>  static int pxa_init_dma_channel(struct pxa_camera_dev *pcdev,
>  				struct pxa_buffer *buf,
> @@ -369,7 +369,8 @@ static int pxa_init_dma_channel(struct pxa_camera_dev *pcdev,
>  		pxa_dma->sg_cpu[i].dsadr = pcdev->res->start + cibr;
>  		pxa_dma->sg_cpu[i].dtadr = sg_dma_address(sg) + offset;
>  		pxa_dma->sg_cpu[i].dcmd =
> -			DCMD_FLOWSRC | DCMD_BURST8 | DCMD_INCTRGADDR | xfer_len;
> +			DCMD_FLOWSRC | DCMD_BURST8 | DCMD_INCTRGADDR | xfer_len
> +			| ((i == 0) ? DCMD_STARTIRQEN : 0);

If DCMD_STARTIRQEN is still for debugging only, maybe put it under

#ifdef DEBUG
		if (!i)
			pxa_dma->sg_cpu[i].dcmd |= DCMD_STARTIRQEN;
#endif

you anyway only see any effect of this interrupt with dev_dbg().

>  		pxa_dma->sg_cpu[i].ddadr =
>  			pxa_dma->sg_dma + (i + 1) * sizeof(struct pxa_dma_desc);
>  
> @@ -516,6 +517,97 @@ out:
>  	return ret;
>  }
>  
> +/**
> + * pxa_dma_start_channels - start DMA channel for active buffer
> + * @pcdev: pxa camera device
> + *
> + * Initialize DMA channels to the beginning of the active video buffer, and
> + * start these channels.
> + */
> +static void pxa_dma_start_channels(struct pxa_camera_dev *pcdev)
> +{
> +	int i;
> +	struct pxa_buffer *active;
> +
> +	active = pcdev->active;
> +
> +	for (i = 0; i < pcdev->channels; i++) {
> +		dev_dbg(pcdev->dev, "%s (channel=%d) ddadr=%08x\n", __func__,
> +			i, active->dmas[i].sg_dma);
> +		DDADR(pcdev->dma_chans[i]) = active->dmas[i].sg_dma;
> +		DCSR(pcdev->dma_chans[i]) = DCSR_RUN;
> +	}
> +}
> +
> +static void pxa_dma_stop_channels(struct pxa_camera_dev *pcdev)
> +{
> +	int i;
> +
> +	for (i = 0; i < pcdev->channels; i++) {
> +		dev_dbg(pcdev->dev, "%s (channel=%d)\n", __func__, i);
> +		DCSR(pcdev->dma_chans[i]) = 0;
> +	}
> +}
> +
> +static void pxa_dma_update_sg_tail(struct pxa_camera_dev *pcdev,
> +				   struct pxa_buffer *buf)
> +{
> +	int i;
> +
> +	for (i = 0; i < pcdev->channels; i++) {
> +		pcdev->sg_tail[i] = buf->dmas[i].sg_cpu + buf->dmas[i].sglen;
> +		pcdev->sg_tail[i]->ddadr = DDADR_STOP;

Do I understand it right, assuming capture is running, i.e., active != 
NULL:

before your patch

sg_tail points to the last real DMA descriptor
the last real DMA descriptor has DDADR_STOP
on queuing of the next buffer we
 1. stop DMA
 2. link the last real descriptor to the new first descriptor
 3. allocate an additional dummy descriptor, fill it with DMA engine's 
	current state and use it to
 4. re-start DMA

after your patch

sg_tail points to the additional DMA descriptor
the last valid DMA descriptor points to the additional descriptor
the additional descriptor has DDADR_STOP
on queuing of the next buffer
 1. stop DMA
 2. the additional dummy descriptor at the tail of the current chain is 
	reconfigured to point to the new start
 3. pxa_dma_start_channels() is called, which drops the current partial 
	transfer and re-starts the frame?...

If I am right, this doesn't seem right. If I am wrong, please, explain and 
add explanatory comments, so, the next one (or the same one 2 months 
later) does not have to spend time trying to figure out.

> +	}
> +}
> +
> +static void pxa_dma_add_tail_buf(struct pxa_camera_dev *pcdev,
> +				 struct pxa_buffer *buf)
> +{
> +	int i;
> +
> +	for (i = 0; i < pcdev->channels; i++) {
> +		if (!pcdev->sg_tail[i])
> +			continue;
> +		pcdev->sg_tail[i]->ddadr = buf->dmas[i].sg_dma;
> +	}
> +
> +	pxa_dma_update_sg_tail(pcdev, buf);
> +}
> +
> +/**
> + * pxa_camera_start_capture - start video capturing
> + * @pcdev: camera device
> + *
> + * Launch capturing. DMA channels should not be active yet. They should get
> + * activated at the end of frame interrupt, to capture only whole frames, and
> + * never begin the capture of a partial frame.
> + */
> +static void pxa_camera_start_capture(struct pxa_camera_dev *pcdev)
> +{
> +	unsigned long cicr0, cifr;
> +
> +	dev_dbg(pcdev->dev, "%s\n", __func__);

I originally had a "reset the FIFOs" comment here, wouldn't hurt to add it 
now too.

> +	cifr = __raw_readl(pcdev->base + CIFR) | CIFR_RESET_F;
> +	__raw_writel(cifr, pcdev->base + CIFR);
> +
> +	cicr0 = __raw_readl(pcdev->base + CICR0) | CICR0_ENB | CISR_IFO_0
> +		| CISR_IFO_1 | CISR_IFO_2;

CISR_* flags have nothing to do with the CICR register.

> +	cicr0 &= ~CICR0_EOFM;
> +	__raw_writel(cicr0, pcdev->base + CICR0);
> +}

It is nice to synchronise on a frame start, but you're relying on being 
"fast," i.e., on servicing the End of Frame interrupt between the two 
frames and having enough time to configure DMA. With smaller frames with 
short inter-frame times this can be difficult, I think. But, well, that's 
the best we can do, I guess. And yes, I know, I'm already doing this in 
the overrun case.

> +
> +static void pxa_camera_stop_capture(struct pxa_camera_dev *pcdev)
> +{
> +	unsigned long cicr0;
> +
> +	pxa_dma_stop_channels(pcdev);
> +
> +	cicr0 = __raw_readl(pcdev->base + CICR0) & ~CICR0_ENB;
> +	__raw_writel(cicr0, pcdev->base + CICR0);
> +
> +	dev_dbg(pcdev->dev, "%s\n", __func__);
> +}
> +
>  static void pxa_videobuf_queue(struct videobuf_queue *vq,
>  			       struct videobuf_buffer *vb)
>  {
> @@ -523,81 +615,23 @@ static void pxa_videobuf_queue(struct videobuf_queue *vq,
>  	struct soc_camera_host *ici = to_soc_camera_host(icd->dev.parent);
>  	struct pxa_camera_dev *pcdev = ici->priv;
>  	struct pxa_buffer *buf = container_of(vb, struct pxa_buffer, vb);
> -	struct pxa_buffer *active;
>  	unsigned long flags;
> -	int i;
>  
> -	dev_dbg(&icd->dev, "%s (vb=0x%p) 0x%08lx %d\n", __func__,
> -		vb, vb->baddr, vb->bsize);
> -	spin_lock_irqsave(&pcdev->lock, flags);
> +	dev_dbg(&icd->dev, "%s (vb=0x%p) 0x%08lx %d active=%p\n", __func__,
> +		vb, vb->baddr, vb->bsize, pcdev->active);
>  
> +	spin_lock_irqsave(&pcdev->lock, flags);
>  	list_add_tail(&vb->queue, &pcdev->capture);
>  
>  	vb->state = VIDEOBUF_ACTIVE;
> -	active = pcdev->active;
> -
> -	if (!active) {
> -		unsigned long cifr, cicr0;
> -
> -		cifr = __raw_readl(pcdev->base + CIFR) | CIFR_RESET_F;
> -		__raw_writel(cifr, pcdev->base + CIFR);
> -
> -		for (i = 0; i < pcdev->channels; i++) {
> -			DDADR(pcdev->dma_chans[i]) = buf->dmas[i].sg_dma;
> -			DCSR(pcdev->dma_chans[i]) = DCSR_RUN;
> -			pcdev->sg_tail[i] = buf->dmas[i].sg_cpu + buf->dmas[i].sglen - 1;
> -		}
>  
> -		pcdev->active = buf;
> +	pxa_dma_stop_channels(pcdev);
> +	pxa_dma_add_tail_buf(pcdev, buf);
>  
> -		cicr0 = __raw_readl(pcdev->base + CICR0) | CICR0_ENB;
> -		__raw_writel(cicr0, pcdev->base + CICR0);
> -	} else {
> -		struct pxa_cam_dma *buf_dma;
> -		struct pxa_cam_dma *act_dma;
> -		int nents;
> -
> -		for (i = 0; i < pcdev->channels; i++) {
> -			buf_dma = &buf->dmas[i];
> -			act_dma = &active->dmas[i];
> -			nents = buf_dma->sglen;
> -
> -			/* Stop DMA engine */
> -			DCSR(pcdev->dma_chans[i]) = 0;
> -
> -			/* Add the descriptors we just initialized to
> -			   the currently running chain */
> -			pcdev->sg_tail[i]->ddadr = buf_dma->sg_dma;
> -			pcdev->sg_tail[i] = buf_dma->sg_cpu + buf_dma->sglen - 1;
> -
> -			/* Setup a dummy descriptor with the DMA engines current
> -			 * state
> -			 */
> -			buf_dma->sg_cpu[nents].dsadr =
> -				pcdev->res->start + 0x28 + i*8; /* CIBRx */
> -			buf_dma->sg_cpu[nents].dtadr =
> -				DTADR(pcdev->dma_chans[i]);
> -			buf_dma->sg_cpu[nents].dcmd =
> -				DCMD(pcdev->dma_chans[i]);
> -
> -			if (DDADR(pcdev->dma_chans[i]) == DDADR_STOP) {
> -				/* The DMA engine is on the last
> -				   descriptor, set the next descriptors
> -				   address to the descriptors we just
> -				   initialized */
> -				buf_dma->sg_cpu[nents].ddadr = buf_dma->sg_dma;
> -			} else {
> -				buf_dma->sg_cpu[nents].ddadr =
> -					DDADR(pcdev->dma_chans[i]);
> -			}
> -
> -			/* The next descriptor is the dummy descriptor */
> -			DDADR(pcdev->dma_chans[i]) = buf_dma->sg_dma + nents *
> -				sizeof(struct pxa_dma_desc);
> -
> -			DCSR(pcdev->dma_chans[i]) = DCSR_RUN;
> -		}
> -	}
> +	if (!pcdev->active)
> +		pxa_camera_start_capture(pcdev);
> +	else
> +		pxa_dma_start_channels(pcdev);
>  
>  	spin_unlock_irqrestore(&pcdev->lock, flags);
>  }
> @@ -635,7 +669,7 @@ static void pxa_camera_wakeup(struct pxa_camera_dev *pcdev,
>  			      struct videobuf_buffer *vb,
>  			      struct pxa_buffer *buf)
>  {
> -	unsigned long cicr0;
> +	int i;
>  
>  	/* _init is used to debug races, see comment in pxa_camera_reqbufs() */
>  	list_del_init(&vb->queue);
> @@ -643,15 +677,13 @@ static void pxa_camera_wakeup(struct pxa_camera_dev *pcdev,
>  	do_gettimeofday(&vb->ts);
>  	vb->field_count++;
>  	wake_up(&vb->done);
> +	dev_dbg(pcdev->dev, "%s dequeud buffer (vb=0x%p)\n", __func__, vb);
>  
>  	if (list_empty(&pcdev->capture)) {
> +		pxa_camera_stop_capture(pcdev);
>  		pcdev->active = NULL;
> -		DCSR(pcdev->dma_chans[0]) = 0;
> -		DCSR(pcdev->dma_chans[1]) = 0;
> -		DCSR(pcdev->dma_chans[2]) = 0;
> -
> -		cicr0 = __raw_readl(pcdev->base + CICR0) & ~CICR0_ENB;
> -		__raw_writel(cicr0, pcdev->base + CICR0);
> +		for (i = 0; i < pcdev->channels; i++)
> +			pcdev->sg_tail[i] = NULL;
>  		return;
>  	}
>  

You're now also stopping capture here, should work, yes...

> @@ -666,19 +698,23 @@ static void pxa_camera_dma_irq(int channel, struct pxa_camera_dev *pcdev,
>  	unsigned long flags;
>  	u32 status, camera_status, overrun;
>  	struct videobuf_buffer *vb;
> -	unsigned long cifr, cicr0;
>  
>  	spin_lock_irqsave(&pcdev->lock, flags);
>  
>  	status = DCSR(channel);
> -	DCSR(channel) = status | DCSR_ENDINTR;
> +	DCSR(channel) = status | DCSR_STARTINTR | DCSR_ENDINTR;

Now as I look at it, actually, this is racy. If for whatever reason we 
entered here without ENDINTR set, so status & DCSR_ENDINTR == 0, then it 
got immediately set and we clear it, thus we lose it. I think, there's no 
reason here not to use the standard

	irq_reason = read(IRQ_REASON_REG);
	write(irq_reason, IRQ_REASON_REG);

> +
> +	camera_status = __raw_readl(pcdev->base + CISR);
> +	overrun = CISR_IFO_0;
> +	if (pcdev->channels == 3)
> +		overrun |= CISR_IFO_1 | CISR_IFO_2;
>  
>  	if (status & DCSR_BUSERR) {
>  		dev_err(pcdev->dev, "DMA Bus Error IRQ!\n");
>  		goto out;
>  	}
>  
> -	if (!(status & DCSR_ENDINTR)) {
> +	if (!(status & (DCSR_ENDINTR | DCSR_STARTINTR))) {
>  		dev_err(pcdev->dev, "Unknown DMA IRQ source, "
>  			"status: 0x%08x\n", status);
>  		goto out;
> @@ -689,38 +725,27 @@ static void pxa_camera_dma_irq(int channel, struct pxa_camera_dev *pcdev,
>  		goto out;
>  	}
>  
> -	camera_status = __raw_readl(pcdev->base + CISR);
> -	overrun = CISR_IFO_0;
> -	if (pcdev->channels == 3)
> -		overrun |= CISR_IFO_1 | CISR_IFO_2;
> -	if (camera_status & overrun) {
> -		dev_dbg(pcdev->dev, "FIFO overrun! CISR: %x\n", camera_status);
> -		/* Stop the Capture Interface */
> -		cicr0 = __raw_readl(pcdev->base + CICR0) & ~CICR0_ENB;
> -		__raw_writel(cicr0, pcdev->base + CICR0);
> -
> -		/* Stop DMA */
> -		DCSR(channel) = 0;
> -		/* Reset the FIFOs */
> -		cifr = __raw_readl(pcdev->base + CIFR) | CIFR_RESET_F;
> -		__raw_writel(cifr, pcdev->base + CIFR);
> -		/* Enable End-Of-Frame Interrupt */
> -		cicr0 &= ~CICR0_EOFM;
> -		__raw_writel(cicr0, pcdev->base + CICR0);
> -		/* Restart the Capture Interface */
> -		__raw_writel(cicr0 | CICR0_ENB, pcdev->base + CICR0);
> -		goto out;
> -	}
> -
>  	vb = &pcdev->active->vb;
>  	buf = container_of(vb, struct pxa_buffer, vb);
>  	WARN_ON(buf->inwork || list_empty(&vb->queue));
> -	dev_dbg(pcdev->dev, "%s (vb=0x%p) 0x%08lx %d\n", __func__,
> -		vb, vb->baddr, vb->bsize);
>  
> -	buf->active_dma &= ~act_dma;
> -	if (!buf->active_dma)
> -		pxa_camera_wakeup(pcdev, vb, buf);
> +	dev_dbg(pcdev->dev, "%s channel=%d %s%s(vb=0x%p) dma.desc=%x\n",
> +		__func__, channel, status & DCSR_STARTINTR ? "SOF " : "",
> +		status & DCSR_ENDINTR ? "EOF " : "", vb, DDADR(channel));
> +
> +	if (status & DCSR_ENDINTR) {
> +		if (camera_status & overrun) {
> +			dev_dbg(pcdev->dev, "FIFO overrun! CISR: %x\n",
> +				camera_status);
> +			pxa_camera_stop_capture(pcdev);
> +			pxa_camera_start_capture(pcdev);
> +			goto out;
> +		}
> +
> +		buf->active_dma &= ~act_dma;
> +		if (!buf->active_dma)
> +			pxa_camera_wakeup(pcdev, vb, buf);
> +	}
>  
>  out:
>  	spin_unlock_irqrestore(&pcdev->lock, flags);
> @@ -859,12 +884,11 @@ static irqreturn_t pxa_camera_irq(int irq, void *data)
>  	__raw_writel(status, pcdev->base + CISR);
>  
>  	if (status & CISR_EOF) {
> -		int i;
> -		for (i = 0; i < pcdev->channels; i++) {
> -			DDADR(pcdev->dma_chans[i]) =
> -				pcdev->active->dmas[i].sg_dma;
> -			DCSR(pcdev->dma_chans[i]) = DCSR_RUN;
> -		}
> +		pcdev->active = list_first_entry(&pcdev->capture,
> +					   struct pxa_buffer, vb.queue);
> +
> +		pxa_dma_start_channels(pcdev);
> +
>  		cicr0 = __raw_readl(pcdev->base + CICR0) | CICR0_EOFM;
>  		__raw_writel(cicr0, pcdev->base + CICR0);
>  	}
> @@ -1404,18 +1428,8 @@ static int pxa_camera_resume(struct soc_camera_device *icd)
>  		ret = pcdev->icd->ops->resume(pcdev->icd);
>  
>  	/* Restart frame capture if active buffer exists */
> -	if (!ret && pcdev->active) {
> -		unsigned long cifr, cicr0;
> -
> -		/* Reset the FIFOs */
> -		cifr = __raw_readl(pcdev->base + CIFR) | CIFR_RESET_F;
> -		__raw_writel(cifr, pcdev->base + CIFR);
> -
> -		cicr0 = __raw_readl(pcdev->base + CICR0);
> -		cicr0 &= ~CICR0_EOFM;	/* Enable End-Of-Frame Interrupt */
> -		cicr0 |= CICR0_ENB;	/* Restart the Capture Interface */
> -		__raw_writel(cicr0, pcdev->base + CICR0);
> -	}
> +	if (!ret && pcdev->active)
> +		pxa_camera_start_capture(pcdev);
>  
>  	return ret;
>  }
> -- 
> 1.5.6.5
> 

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/4] pxa_camera: Fix overrun condition on last buffer
  2009-03-05 19:45       ` [PATCH 4/4] pxa_camera: Fix overrun condition on last buffer Robert Jarzmik
@ 2009-03-09 11:39         ` Guennadi Liakhovetski
  2009-03-09 19:16           ` Robert Jarzmik
  2009-03-11 18:31         ` Guennadi Liakhovetski
  1 sibling, 1 reply; 33+ messages in thread
From: Guennadi Liakhovetski @ 2009-03-09 11:39 UTC (permalink / raw)
  To: Robert Jarzmik; +Cc: mike, Linux Media Mailing List

On Thu, 5 Mar 2009, Robert Jarzmik wrote:

> The last buffer queued will often overrun, as the DMA chain
> is finished, and the time the dma irq handler is activated,

s/and the time/and during the time/ ?

> the QIF fifos are filled by the sensor.
> 
> The fix is to ignore the overrun condition on the last
> queued buffer, and restart the capture only on intermediate
> buffers of the chain.
> 
> Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
> ---
>  drivers/media/video/pxa_camera.c |    8 ++++++--
>  1 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/media/video/pxa_camera.c b/drivers/media/video/pxa_camera.c
> index 16bf0a3..dd56c35 100644
> --- a/drivers/media/video/pxa_camera.c
> +++ b/drivers/media/video/pxa_camera.c
> @@ -734,14 +734,18 @@ static void pxa_camera_dma_irq(int channel, struct pxa_camera_dev *pcdev,
>  		status & DCSR_ENDINTR ? "EOF " : "", vb, DDADR(channel));
>  
>  	if (status & DCSR_ENDINTR) {
> -		if (camera_status & overrun) {
> +		/*
> +		 * It's normal if the last frame creates an overrun, as there
> +		 * are no more DMA descriptors to fetch from QIF fifos
> +		 */
> +		if (camera_status & overrun
> +		    && !list_is_last(pcdev->capture.next, &pcdev->capture)) {
>  			dev_dbg(pcdev->dev, "FIFO overrun! CISR: %x\n",
>  				camera_status);
>  			pxa_camera_stop_capture(pcdev);
>  			pxa_camera_start_capture(pcdev);
>  			goto out;
>  		}
> -
>  		buf->active_dma &= ~act_dma;

This empty like removal doesn't belong to the fix, I'll remove it when 
committing, and amend the commit message as above. Please, comment if you 
disagree.

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/4] pxa_camera: Remove YUV planar formats hole
  2009-03-09 10:45   ` Guennadi Liakhovetski
@ 2009-03-09 19:13     ` Robert Jarzmik
  2009-03-10 18:33       ` Trent Piepho
  0 siblings, 1 reply; 33+ messages in thread
From: Robert Jarzmik @ 2009-03-09 19:13 UTC (permalink / raw)
  To: Guennadi Liakhovetski; +Cc: mike, Linux Media Mailing List

Guennadi Liakhovetski <g.liakhovetski@gmx.de> writes:

>> diff --git a/drivers/media/video/pxa_camera.c b/drivers/media/video/pxa_camera.c
>> index e3e6b29..54df071 100644
>> --- a/drivers/media/video/pxa_camera.c
>> +++ b/drivers/media/video/pxa_camera.c
>> @@ -242,14 +242,13 @@ static int pxa_videobuf_setup(struct videobuf_queue *vq, unsigned int *count,
>>  	dev_dbg(&icd->dev, "count=%d, size=%d\n", *count, *size);
>>  
>>  	/* planar capture requires Y, U and V buffers to be page aligned */
>> -	if (pcdev->channels == 3) {
>> -		*size = PAGE_ALIGN(icd->width * icd->height); /* Y pages */
>> -		*size += PAGE_ALIGN(icd->width * icd->height / 2); /* U pages */
>> -		*size += PAGE_ALIGN(icd->width * icd->height / 2); /* V pages */
>> -	} else {
>> -		*size = icd->width * icd->height *
>> -			((icd->current_fmt->depth + 7) >> 3);
>> -	}
>> +	if (pcdev->channels == 3)
>> +		*size = roundup(icd->width * icd->height, 8) /* Y pages */
>> +			+ roundup(icd->width * icd->height / 2, 8) /* U pages */
>> +			+ roundup(icd->width * icd->height / 2, 8); /* V pages */
>> +	else
>> +		*size = roundup(icd->width * icd->height *
>> +				((icd->current_fmt->depth + 7) >> 3), 8);
>>  
>>  	if (0 == *count)
>>  		*count = 32;
>
> Ok, this one will change I presume - new alignment calculations and 
> line-breaking. In fact, if you adjust width and height earlier in set_fmt, 
> maybe you'll just remove any rounding here completely.
Helas, not fully.
The problem is with passthrough and rgb formats, where I don't enforce
width/height. In the newest form of the patch I have this :
	if (pcdev->channels == 3)
		*size = icd->width * icd->height * 2;
	else
		*size = roundup(icd->width * icd->height *
				((icd->current_fmt->depth + 7) >> 3), 8);

>
>> @@ -289,19 +288,63 @@ static void free_buffer(struct videobuf_queue *vq, struct pxa_buffer *buf)
>>  	buf->vb.state = VIDEOBUF_NEEDS_INIT;
>>  }
>>  
>> +static int calculate_dma_sglen(struct scatterlist *sglist, int sglen,
>> +			       int sg_first_ofs, int size)
>> +{
>> +	int i, offset, dma_len, xfer_len;
>> +	struct scatterlist *sg;
>> +
>> +	offset = sg_first_ofs;
>> +	for_each_sg(sglist, sg, sglen, i) {
>> +		dma_len = sg_dma_len(sg);
>> +
>> +		/* PXA27x Developer's Manual 27.4.4.1: round up to 8 bytes */
>> +		xfer_len = roundup(min(dma_len - offset, size), 8);
>
> Ok, let's see. dma_len is PAGE_SIZE. offset is for Y-plane 0, for further 
> planes it will be aligned after we recalculate width and height. size will 
> be aligned too, so, roundup will disappear, right?

No, I don't think so.

Consider the case of a RGB565 image which size is 223*33 = 7359 bytes. This
makes a transfer of 4096 bytes and another of 3263 bytes.

But the QIF fifo will give 4096 + 3264 bytes (the last one beeing 0), and this
last byte has to be read from the fifo. As I understand it, the QIF fifo works
with 8 bytes permutations, and that why it's giving always a multiple of 8
bytes. Please, cross-check PXA developper manual, paragraph 27.4.4.1 to see if
you understand the same as I did.

So the roundup() is to be kept :(

> You might want to just 
> just add a test for these. The calculation itself gives size >= xfer_len
>
>> +
>> +		size = max(0, size - xfer_len);
>
> So, max is useless here, just "size -= xfer_len."
And so the max() still hold I think.

>> +		size = max(0, size - xfer_len);
>
> Same here for roundup() and max().
Yes, same discussion.

>>  
>> -	pxa_dma->sg_cpu[sglen - 1].ddadr = DDADR_STOP;
>> -	pxa_dma->sg_cpu[sglen - 1].dcmd |= DCMD_ENDIRQEN;
>> +	pxa_dma->sg_cpu[sglen].ddadr = DDADR_STOP;
>> +	pxa_dma->sg_cpu[sglen].dcmd  = DCMD_FLOWSRC | DCMD_BURST8 | DCMD_ENDIRQEN;
>
> Why are you now always using the n+1'th element? Even if it is right, it 
> rather belongs to the patch "2/4," not "1/4," right? In your earlier email 
> you wrote:
>
>>  - in the former pxa_videobuf_queue(), when a buffer was queued while another
>>  was already active, a dummy descriptor was added, and then the new buffer was
>>  chained with the actively running buffer. See code below :
>> 
>> -                       } else {
>> -                               buf_dma->sg_cpu[nents].ddadr =
>> -                                       DDADR(pcdev->dma_chans[i]);
>> -                       }
>> -
>> -                       /* The next descriptor is the dummy descriptor */
>> -                       DDADR(pcdev->dma_chans[i]) = buf_dma->sg_dma + nents *
>> -                               sizeof(struct pxa_dma_desc);
>> 
>>    The fix is in the code refactoring, as now the buffer is always added at the
>>    tail of the queue through pxa_dma_add_tail_buf().
>
> I don't understand, what this is fixing. It would make a nice 
> simplification, if it worked, but see my review to patch "2/4."
Let's discuss it in patch 2/4 review, yes.

>> +
>> +	*sg_first_ofs = xfer_len;
>> +	/*
>> +	 * Handle 1 special case :
>> +	 *  - if we finish the DMA transfer in the last 7 bytes of a RAM page
>> +	 *    then we return the sg element pointing on the next page
>> +	 */
>> +	if (*sg_first_ofs >= dma_len) {
>> +		*sg_first_ofs -= dma_len;
>> +		*sg_first = sg_next(sg);
>> +	} else {
>> +		*sg_first = sg;
>> +	}
>
> As we will not be rounding up any more, this special case shouldn't be 
> needed either, right?
Same discussion as before. But here, as sg_first and sf_first_ofs only make
sense for 3 planes output, and the rounding takes care of the corner cases, this
part will be simplified, yes.

>
>>  
>>  	return 0;
>>  }
>> @@ -342,9 +409,7 @@ static int pxa_videobuf_prepare(struct videobuf_queue *vq,
>>  	struct pxa_camera_dev *pcdev = ici->priv;
>>  	struct pxa_buffer *buf = container_of(vb, struct pxa_buffer, vb);
>>  	int ret;
>> -	int sglen_y,  sglen_yu = 0, sglen_u = 0, sglen_v = 0;
>> -	int size_y, size_u = 0, size_v = 0;
>> -
>> +	int size_y = 0, size_u = 0, size_v = 0;
>
> Isn't size_y always initialised?
Yes, I'll remove that.

Thanks for the review, let's keep that going on :)

Cheers.

--
Robert

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/4] pxa_camera: Fix overrun condition on last buffer
  2009-03-09 11:39         ` Guennadi Liakhovetski
@ 2009-03-09 19:16           ` Robert Jarzmik
  0 siblings, 0 replies; 33+ messages in thread
From: Robert Jarzmik @ 2009-03-09 19:16 UTC (permalink / raw)
  To: Guennadi Liakhovetski; +Cc: mike, Linux Media Mailing List

Guennadi Liakhovetski <g.liakhovetski@gmx.de> writes:

> On Thu, 5 Mar 2009, Robert Jarzmik wrote:
>
>> The last buffer queued will often overrun, as the DMA chain
>> is finished, and the time the dma irq handler is activated,
>
> s/and the time/and during the time/ ?
If you wish, or might be simply "and before the dma irq handler is
activated". As you see fit.

>> +		if (camera_status & overrun
>> +		    && !list_is_last(pcdev->capture.next, &pcdev->capture)) {
>>  			dev_dbg(pcdev->dev, "FIFO overrun! CISR: %x\n",
>>  				camera_status);
>>  			pxa_camera_stop_capture(pcdev);
>>  			pxa_camera_start_capture(pcdev);
>>  			goto out;
>>  		}
>> -
>>  		buf->active_dma &= ~act_dma;
>
> This empty like removal doesn't belong to the fix, I'll remove it when 
> committing, and amend the commit message as above. Please, comment if you 
> disagree.
I totally agree with you, remove that "removal" :)

Cheers.

--
Robert

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/4] pxa_camera: Redesign DMA handling
  2009-03-09 11:35     ` [PATCH 2/4] pxa_camera: Redesign DMA handling Guennadi Liakhovetski
@ 2009-03-09 20:50       ` Robert Jarzmik
  2009-03-09 23:14         ` Guennadi Liakhovetski
  0 siblings, 1 reply; 33+ messages in thread
From: Robert Jarzmik @ 2009-03-09 20:50 UTC (permalink / raw)
  To: Guennadi Liakhovetski; +Cc: mike, Linux Media Mailing List

Guennadi Liakhovetski <g.liakhovetski@gmx.de> writes:

>> + * Returns 0 or -ENOMEM si no coherent memory is available
>
> Let's stay with English for now:-) s/si/if/
Oups ... sorry ... the froggish touch is back :)

>
>>   */
>>  static int pxa_init_dma_channel(struct pxa_camera_dev *pcdev,
>>  				struct pxa_buffer *buf,
>> @@ -369,7 +369,8 @@ static int pxa_init_dma_channel(struct pxa_camera_dev *pcdev,
>>  		pxa_dma->sg_cpu[i].dsadr = pcdev->res->start + cibr;
>>  		pxa_dma->sg_cpu[i].dtadr = sg_dma_address(sg) + offset;
>>  		pxa_dma->sg_cpu[i].dcmd =
>> -			DCMD_FLOWSRC | DCMD_BURST8 | DCMD_INCTRGADDR | xfer_len;
>> +			DCMD_FLOWSRC | DCMD_BURST8 | DCMD_INCTRGADDR | xfer_len
>> +			| ((i == 0) ? DCMD_STARTIRQEN : 0);
>
> If DCMD_STARTIRQEN is still for debugging only, maybe put it under
>
> #ifdef DEBUG
> 		if (!i)
> 			pxa_dma->sg_cpu[i].dcmd |= DCMD_STARTIRQEN;
> #endif
OK. Will amend.
>> +static void pxa_dma_update_sg_tail(struct pxa_camera_dev *pcdev,
>> +				   struct pxa_buffer *buf)
>> +{
>> +	int i;
>> +
>> +	for (i = 0; i < pcdev->channels; i++) {
>> +		pcdev->sg_tail[i] = buf->dmas[i].sg_cpu + buf->dmas[i].sglen;
>> +		pcdev->sg_tail[i]->ddadr = DDADR_STOP;
>
> Do I understand it right, assuming capture is running, i.e., active != 
> NULL:
>
> before your patch
>
> sg_tail points to the last real DMA descriptor
> the last real DMA descriptor has DDADR_STOP
> on queuing of the next buffer we
>  1. stop DMA
>  2. link the last real descriptor to the new first descriptor
>  3. allocate an additional dummy descriptor, fill it with DMA engine's 
> 	current state and use it to
>  4. re-start DMA
Yes, but you forget :
   5. link the last new buffer descriptor (the called dummy buffer) to the
   running chain.

I see it that way, after former pxa_video_queue() :

 +----------+-----------+------------+
 | First vb | Second vb | Third vb | |
 +----^-----+-----------+-----------|+
      |                             |      +----------------+
      |                             +----> | New vb | dummy |
      |                                    +------------|---+
      |                                                 |
      +-------------------------------------------------+

This is my understanding. The DMA is restarted at the dummy descriptor, which
re-reads the current DMA descriptor (is that correct, if 16 bytes were already
transfered ?), then comes back to the head of DMA chain.
Then first vb is finished, then second and third, and then new vb is re-filled.

Would you comment to see where I'm wrong please ?

> after your patch
>
> sg_tail points to the additional DMA descriptor
Which additional ? Do you mean "the last DMA descriptor of the last video buffer
queued which never transfers any data" ? (which is what I point it at, yes)

> the last valid DMA descriptor points to the additional descriptor
> the additional descriptor has DDADR_STOP
Yes.

> on queuing of the next buffer
>  1. stop DMA
>  2. the additional dummy descriptor at the tail of the current chain is 
> 	reconfigured to point to the new start
Yes.

>  3. pxa_dma_start_channels() is called, which drops the current partial 
> 	transfer and re-starts the frame?...
Yes, that is wrong.
The trick is, if I restart the DMA channel where it was, I remember having my
"select stalled" message.

I see it that way, after new pxa_video_queue() :

 +----------+-----------+------------+
 | First vb | Second vb | Third vb | |
 +----------+-----------+-----------|+
 ^                                  |      +----------------+
 |                                  +----> | New vb | dummy |
  \restart                                 +----------------+

> If I am right, this doesn't seem right. If I am wrong, please, explain and 
> add explanatory comments, so, the next one (or the same one 2 months 
> later) does not have to spend time trying to figure out.
Well, you've got a point. There is something to dig here. By experiment, it is
working. But I will search why, as my patch does restart the frame :(

I will investigate :
 - if stopping the DMA chain and restarting in the middle of a DMA transfer
   (ie. in the middle of the 4096 bytes, on byte 2040 for example) does work.
 - how my DMA chain does work.

As a matter of fact, before this patch, I had a pxa_dma_restart_channels()
called in pxa_videobuf_queue(), which just "restarted" the DMA channel without
touching the DADR() register. I will search why this wasn't working.

>> +static void pxa_camera_start_capture(struct pxa_camera_dev *pcdev)
>> +{
>> +	unsigned long cicr0, cifr;
>> +
>> +	dev_dbg(pcdev->dev, "%s\n", __func__);
>
> I originally had a "reset the FIFOs" comment here, wouldn't hurt to add it 
> now too.
Sorry, I'll reput it there. Will amend.

>
>> +	cifr = __raw_readl(pcdev->base + CIFR) | CIFR_RESET_F;
>> +	__raw_writel(cifr, pcdev->base + CIFR);
>> +
>> +	cicr0 = __raw_readl(pcdev->base + CICR0) | CICR0_ENB | CISR_IFO_0
>> +		| CISR_IFO_1 | CISR_IFO_2;
>
> CISR_* flags have nothing to do with the CICR register.
Right, good catch. I'll remove all the CISR* stuff. I must have been confused,
it's the CIFR_RESET_F which was meant there (fifo flush).

> It is nice to synchronise on a frame start, but you're relying on being 
> "fast," i.e., on servicing the End of Frame interrupt between the two 
> frames and having enough time to configure DMA. With smaller frames with 
> short inter-frame times this can be difficult, I think. But, well, that's 
> the best we can do, I guess. And yes, I know, I'm already doing this in 
> the overrun case.
Yep. But you're right. I'll expand my testcases to 32x32 frames, and bombard my
PXA with interrupts, at low cpufreq. We'll see what happens then :)

>> @@ -666,19 +698,23 @@ static void pxa_camera_dma_irq(int channel, struct pxa_camera_dev *pcdev,
>>  	unsigned long flags;
>>  	u32 status, camera_status, overrun;
>>  	struct videobuf_buffer *vb;
>> -	unsigned long cifr, cicr0;
>>  
>>  	spin_lock_irqsave(&pcdev->lock, flags);
>>  
>>  	status = DCSR(channel);
>> -	DCSR(channel) = status | DCSR_ENDINTR;
>> +	DCSR(channel) = status | DCSR_STARTINTR | DCSR_ENDINTR;
>
> Now as I look at it, actually, this is racy. If for whatever reason we 
> entered here without ENDINTR set, so status & DCSR_ENDINTR == 0, then it 
> got immediately set and we clear it, thus we lose it. I think, there's no 
> reason here not to use the standard
>
> 	irq_reason = read(IRQ_REASON_REG);
> 	write(irq_reason, IRQ_REASON_REG);
Right. It is racy. Will amend.

OK, I have work to do on that one. Would please just check my understanding of
the chain (the superb ascii-art I draw :)), so that we could speak on the same
ground. That will help me understand better.

Cheers.

--
Robert

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/4] pxa_camera: Redesign DMA handling
  2009-03-09 20:50       ` Robert Jarzmik
@ 2009-03-09 23:14         ` Guennadi Liakhovetski
  2009-03-10 21:46           ` Robert Jarzmik
  0 siblings, 1 reply; 33+ messages in thread
From: Guennadi Liakhovetski @ 2009-03-09 23:14 UTC (permalink / raw)
  To: Robert Jarzmik; +Cc: mike, Linux Media Mailing List

I'll answer all points tomorrow, but so you can start thinking about it 
earlier and get used to it:-), I'll explain the current driver behaviour 
now:

On Mon, 9 Mar 2009, Robert Jarzmik wrote:

> Guennadi Liakhovetski <g.liakhovetski@gmx.de> writes:
> 
> > before your patch
> >
> > sg_tail points to the last real DMA descriptor
> > the last real DMA descriptor has DDADR_STOP
> > on queuing of the next buffer we
> >  1. stop DMA
> >  2. link the last real descriptor to the new first descriptor
> >  3. allocate an additional dummy descriptor, fill it with DMA engine's 
> > 	current state and use it to
> >  4. re-start DMA
> Yes, but you forget :
>    5. link the last new buffer descriptor (the called dummy buffer) to the
>    running chain.
> 
> I see it that way, after former pxa_video_queue() :
> 
>  +----------+-----------+------------+
>  | First vb | Second vb | Third vb | |
>  +----^-----+-----------+-----------|+
>       |                             |      +----------------+
>       |                             +----> | New vb | dummy |
>       |                                    +------------|---+
>       |                                                 |
>       +-------------------------------------------------+
> 
> This is my understanding. The DMA is restarted at the dummy descriptor, which
> re-reads the current DMA descriptor (is that correct, if 16 bytes were already
> transfered ?), then comes back to the head of DMA chain.
> Then first vb is finished, then second and third, and then new vb is re-filled.
> 
> Would you comment to see where I'm wrong please ?

IIUYC, you mean, that the dummy descriptor re-starts the interrupted 
transfer from the beginning. This is wrong:

With the current code, let's say we capture frames 80x60=4800 at 1 byte 
per pixel - monochrome or Bayer. Then we allocate 3 sg-elements:

static int pxa_init_dma_channel()
{
	...
	pxa_dma->sg_size = (sglen + 1) * sizeof(struct pxa_dma_desc);
	pxa_dma->sg_cpu = dma_alloc_coherent(pcdev->dev, pxa_dma->sg_size,
					     &pxa_dma->sg_dma, GFP_KERNEL);
	...

and they are initialised

	pxa_dma->sg_cpu[0].dsadr = pcdev->res->start + cibr;
	pxa_dma->sg_cpu[0].dtadr = sg_dma_address(&sg[0]);

	pxa_dma->sg_cpu[0].dcmd =
		DCMD_FLOWSRC | DCMD_BURST8 | DCMD_INCTRGADDR | 4096;
	pxa_dma->sg_cpu[0].ddadr =
		pxa_dma->sg_dma + sizeof(struct pxa_dma_desc);

	pxa_dma->sg_cpu[1].dsadr = pcdev->res->start + cibr;
	pxa_dma->sg_cpu[1].dtadr = sg_dma_address(&sg[1]);

	pxa_dma->sg_cpu[1].dcmd =
		DCMD_FLOWSRC | DCMD_BURST8 | DCMD_INCTRGADDR | 704;
	pxa_dma->sg_cpu[1].ddadr =
		pxa_dma->sg_dma + 2 * sizeof(struct pxa_dma_desc);

	pxa_dma->sg_cpu[1].ddadr = DDADR_STOP;
	pxa_dma->sg_cpu[1].dcmd |= DCMD_ENDIRQEN;

Notice, sg_cpu[2] (the dummy) is not used yet. So, in normal case the DMA 
engine would process 0, 1, and stop.

}

Now, as this buffer is queued in
let's say, the previou

pxa_videobuf_queue()
{
	...

With locked interrupts we stop the DMA engine, and hope, that there's 
still enough space in the FIFO left and that we won't be getting an 
overrun...

	/* Stop DMA engine */
	DCSR(pcdev->dma_chans[i]) = 0;

>From now on we have to be fast until we re-enable DMA.

	/* Add the descriptors we just initialized to
	   the currently running chain */
	pcdev->sg_tail[i]->ddadr = buf_dma->sg_dma;
	pcdev->sg_tail[i] = buf_dma->sg_cpu + buf_dma->sglen - 1;

See, sg_tail is set to point to the last valid (not dummy) PXA DMA 
descriptor, i.e., to sg_cpu[1] in our example. So, before it also pointed 
to the last valid descriptor from the previous buffer, which now links to 
the beginning of our new buffer.

Now, this is the trick: we use a dummy descriptor (actually, the one from 
the new video buffer, but it doesn't matter) to set up a descriptor to 
finish the interrupted transfer. For this we set dtadr to the _current_ 
DTADR to continue filling the buffer exactly where we stopped.

	/* Setup a dummy descriptor with the DMA engines current
	 * state
	 */
	buf_dma->sg_cpu[nents].dsadr =
		pcdev->res->start + 0x28 + i*8; /* CIBRx */
	buf_dma->sg_cpu[nents].dtadr =
		DTADR(pcdev->dma_chans[i]);
	buf_dma->sg_cpu[nents].dcmd =
		DCMD(pcdev->dma_chans[i]);

Now we just check where we should link this our linking partial transfer 
descriptor - either to the first descriptor in our new buffer, if DMA was 
currently processing the last descriptor currently queued, or to the same 
descriptor to which it used to be linked.

	if (DDADR(pcdev->dma_chans[i]) == DDADR_STOP) {
		/* The DMA engine is on the last
		   descriptor, set the next descriptors
		   address to the descriptors we just
		   initialized */
		buf_dma->sg_cpu[nents].ddadr = buf_dma->sg_dma;
	} else {
		buf_dma->sg_cpu[nents].ddadr =
			DDADR(pcdev->dma_chans[i]);
	}

Now we restart DMA at our "dummy" descriptor. Actually, it is not dummy 
any more, it is "linking," "partial," or whatever you call it.

	/* The next descriptor is the dummy descriptor */
	DDADR(pcdev->dma_chans[i]) = buf_dma->sg_dma + nents *
		sizeof(struct pxa_dma_desc);

	DCSR(pcdev->dma_chans[i]) = DCSR_RUN;

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/4] pxa_camera: Remove YUV planar formats hole
  2009-03-09 19:13     ` Robert Jarzmik
@ 2009-03-10 18:33       ` Trent Piepho
  0 siblings, 0 replies; 33+ messages in thread
From: Trent Piepho @ 2009-03-10 18:33 UTC (permalink / raw)
  To: Robert Jarzmik; +Cc: Guennadi Liakhovetski, mike, Linux Media Mailing List

On Mon, 9 Mar 2009, Robert Jarzmik wrote:
> > Ok, this one will change I presume - new alignment calculations and
> > line-breaking. In fact, if you adjust width and height earlier in set_fmt,
> > maybe you'll just remove any rounding here completely.
> Helas, not fully.
> The problem is with passthrough and rgb formats, where I don't enforce
> width/height. In the newest form of the patch I have this :
> 	if (pcdev->channels == 3)
> 		*size = icd->width * icd->height * 2;
> 	else
> 		*size = roundup(icd->width * icd->height *
> 				((icd->current_fmt->depth + 7) >> 3), 8);

If icd->current_fmt->depth could be set to 16 for planar formats, then you
could get rid of the special case here.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/4] pxa_camera: Redesign DMA handling
  2009-03-09 23:14         ` Guennadi Liakhovetski
@ 2009-03-10 21:46           ` Robert Jarzmik
  2009-03-11 18:25             ` Guennadi Liakhovetski
  2009-03-11 21:21             ` Robert Jarzmik
  0 siblings, 2 replies; 33+ messages in thread
From: Robert Jarzmik @ 2009-03-10 21:46 UTC (permalink / raw)
  To: Guennadi Liakhovetski; +Cc: mike, Linux Media Mailing List

Guennadi Liakhovetski <g.liakhovetski@gmx.de> writes:

> I'll answer all points tomorrow, but so you can start thinking about it 
> earlier and get used to it:-), I'll explain the current driver behaviour 
> now:
OK. Would you take that patch instead and comment on that one ?

 It is the result of our conversation about "hot DMA linking". I tested both
paths (the optimal one and the one where DMA stops while queuing =>
cf. pxa_camera_check_link_miss) for RGB565 format.  I'll test further for
YUV422P ...

Have a nice evening.

--
Robert


>From 7730f71bf94e3ee61550659d923ff776e7481191 Mon Sep 17 00:00:00 2001
From: Robert Jarzmik <robert.jarzmik@free.fr>
Date: Wed, 4 Mar 2009 20:03:13 +0100
Subject: [PATCH v2] pxa_camera: Redesign DMA handling

The DMA transfers in pxa_camera showed some weaknesses in
multiple queued buffers context :
 - poll/select problem
   The order between list pcdev->capture and DMA chain was
   not the same. This creates a discrepancy between video
   buffers marked as "done" by the IRQ handler, and the
   really finished video buffer.

   The bug shows up with capture_example tool from v4l2 hg
   tree. The process just "stalls" on a "select timeout".

   The key problem is in pxa_videobuf_queue(), where the
   queued buffer is chained before the active buffer, while
   it should have been the active buffer first, and queued
   buffer tailed after.

 - multiple buffers DMA starting
   When multiple buffers were queued, the DMA channels were
   always started right away. This is not optimal, as a
   special case appears when the first EOF was not yet
   reached, and the DMA channels were prematurely started.

 - Maintainability
   DMA code was a bit obfuscated. Rationalize the code to be
   easily maintainable by anyone.

This patch attemps to address these issues.

Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
---
 drivers/media/video/pxa_camera.c |  329 +++++++++++++++++++++++--------------
 1 files changed, 204 insertions(+), 125 deletions(-)

diff --git a/drivers/media/video/pxa_camera.c b/drivers/media/video/pxa_camera.c
index aca5374..6ec8135 100644
--- a/drivers/media/video/pxa_camera.c
+++ b/drivers/media/video/pxa_camera.c
@@ -324,7 +324,7 @@ static int calculate_dma_sglen(struct scatterlist *sglist, int sglen,
  * Prepares the pxa dma descriptors to transfer one camera channel.
  * Beware sg_first and sg_first_ofs are both input and output parameters.
  *
- * Returns 0
+ * Returns 0 or -ENOMEM si no coherent memory is available
  */
 static int pxa_init_dma_channel(struct pxa_camera_dev *pcdev,
 				struct pxa_buffer *buf,
@@ -369,6 +369,10 @@ static int pxa_init_dma_channel(struct pxa_camera_dev *pcdev,
 		pxa_dma->sg_cpu[i].dtadr = sg_dma_address(sg) + offset;
 		pxa_dma->sg_cpu[i].dcmd =
 			DCMD_FLOWSRC | DCMD_BURST8 | DCMD_INCTRGADDR | xfer_len;
+#ifdef DEBUG
+		if (!i)
+			pxa_dma->sg_cpu[i].dcmd |= DCMD_STARTIRQEN;
+#endif
 		pxa_dma->sg_cpu[i].ddadr =
 			pxa_dma->sg_dma + (i + 1) * sizeof(struct pxa_dma_desc);
 
@@ -402,6 +406,47 @@ static int pxa_init_dma_channel(struct pxa_camera_dev *pcdev,
 	return 0;
 }
 
+/*
+ * A DMA prepared buffer will have this structure :
+ * +------------+-----+---------------+-----------------+
+ * | desc-sg[0] | ... | desc-sg[last] | finisher/linker |
+ * +------------+-----+---------------+-----------------+
+ *
+ * This structure is pointed by dma->sg_cpu.
+ * The descriptors are used as follows :
+ *  - desc-sg[i]: i-th descriptor, transfering the i-th sg
+ *    element to the video buffer scatter gather
+ *  - finisher: has ddadr=DADDR_STOP, dcmd=ENDIRQEN
+ *  - linker: has ddadr= desc-sg[0] of next video buffer, dcmd=0
+ *
+ * For the next schema, let's assume d0=desc-sg[0] .. dN=desc-sg[N],
+ * "f" stands for finisher and "l" for linker.
+ * A typical running chain is :
+ *
+ *     Videobuffer 1         Videobuffer 2
+ * +---------+----+---+  +----+----+----+---+
+ * | d0 | .. | dN | l |  | d0 | .. | dN | f |
+ * +---------+----+-|-+  ^----+----+----+---+
+ *                  |    |
+ *                  +----+
+ *
+ * After the chaining is finished, the chain looks like :
+ *
+ *     Videobuffer 1         Videobuffer 2         Videobuffer 3
+ * +---------+----+---+  +----+----+----+---+  +----+----+----+---+
+ * | d0 | .. | dN | l |  | d0 | .. | dN | l |  | d0 | .. | dN | f |
+ * +---------+----+-|-+  ^----+----+----+-|-+  ^----+----+----+---+
+ *                  |    |                |    |
+ *                  +----+                +----+
+ *                                       new_link
+ *
+ * Now, the chaining done in pxa_videobuf_queue, creating "new_link" can happen
+ * while the DMA chain as already jumped to Videobuffer 2 "finisher". In this
+ * case, the DMA chain is stopped, and the DMA irq handler is pending.
+ * Then, the DMA irq completion handler will check if the DMA is finished and a
+ * buffer is still on the pcdev->capture list. If that's the case, the capture
+ * will be restarted.
+ */
 static int pxa_videobuf_prepare(struct videobuf_queue *vq,
 		struct videobuf_buffer *vb, enum v4l2_field field)
 {
@@ -517,6 +562,96 @@ out:
 	return ret;
 }
 
+/**
+ * pxa_dma_start_channels - start DMA channel for active buffer
+ * @pcdev: pxa camera device
+ *
+ * Initialize DMA channels to the beginning of the active video buffer, and
+ * start these channels.
+ */
+static void pxa_dma_start_channels(struct pxa_camera_dev *pcdev)
+{
+	int i;
+	struct pxa_buffer *active;
+
+	active = pcdev->active;
+
+	for (i = 0; i < pcdev->channels; i++) {
+		dev_dbg(pcdev->dev, "%s (channel=%d) ddadr=%08x\n", __func__,
+			i, active->dmas[i].sg_dma);
+		DDADR(pcdev->dma_chans[i]) = active->dmas[i].sg_dma;
+		DCSR(pcdev->dma_chans[i]) = DCSR_RUN;
+	}
+}
+
+static void pxa_dma_stop_channels(struct pxa_camera_dev *pcdev)
+{
+	int i;
+
+	for (i = 0; i < pcdev->channels; i++) {
+		dev_dbg(pcdev->dev, "%s (channel=%d)\n", __func__, i);
+		DCSR(pcdev->dma_chans[i]) = 0;
+	}
+}
+
+static void pxa_dma_update_sg_tail(struct pxa_camera_dev *pcdev,
+				   struct pxa_buffer *buf)
+{
+	int i;
+
+	for (i = 0; i < pcdev->channels; i++) {
+		pcdev->sg_tail[i] = buf->dmas[i].sg_cpu + buf->dmas[i].sglen;
+		pcdev->sg_tail[i]->ddadr = DDADR_STOP;
+	}
+}
+
+static void pxa_dma_add_tail_buf(struct pxa_camera_dev *pcdev,
+				 struct pxa_buffer *buf)
+{
+	int i;
+
+	for (i = 0; i < pcdev->channels; i++) {
+		if (!pcdev->sg_tail[i])
+			continue;
+		pcdev->sg_tail[i]->ddadr = buf->dmas[i].sg_dma;
+	}
+
+	pxa_dma_update_sg_tail(pcdev, buf);
+}
+
+/**
+ * pxa_camera_start_capture - start video capturing
+ * @pcdev: camera device
+ *
+ * Launch capturing. DMA channels should not be active yet. They should get
+ * activated at the end of frame interrupt, to capture only whole frames, and
+ * never begin the capture of a partial frame.
+ */
+static void pxa_camera_start_capture(struct pxa_camera_dev *pcdev)
+{
+	unsigned long cicr0, cifr;
+
+	dev_dbg(pcdev->dev, "%s\n", __func__);
+	cifr = __raw_readl(pcdev->base + CIFR) | CIFR_RESET_F;
+	__raw_writel(cifr, pcdev->base + CIFR);
+
+	cicr0 = __raw_readl(pcdev->base + CICR0) | CICR0_ENB;
+	cicr0 &= ~CICR0_EOFM;
+	__raw_writel(cicr0, pcdev->base + CICR0);
+}
+
+static void pxa_camera_stop_capture(struct pxa_camera_dev *pcdev)
+{
+	unsigned long cicr0;
+
+	pxa_dma_stop_channels(pcdev);
+
+	cicr0 = __raw_readl(pcdev->base + CICR0) & ~CICR0_ENB;
+	__raw_writel(cicr0, pcdev->base + CICR0);
+
+	dev_dbg(pcdev->dev, "%s\n", __func__);
+}
+
 static void pxa_videobuf_queue(struct videobuf_queue *vq,
 			       struct videobuf_buffer *vb)
 {
@@ -524,81 +659,19 @@ static void pxa_videobuf_queue(struct videobuf_queue *vq,
 	struct soc_camera_host *ici = to_soc_camera_host(icd->dev.parent);
 	struct pxa_camera_dev *pcdev = ici->priv;
 	struct pxa_buffer *buf = container_of(vb, struct pxa_buffer, vb);
-	struct pxa_buffer *active;
 	unsigned long flags;
-	int i;
 
-	dev_dbg(&icd->dev, "%s (vb=0x%p) 0x%08lx %d\n", __func__,
-		vb, vb->baddr, vb->bsize);
-	spin_lock_irqsave(&pcdev->lock, flags);
+	dev_dbg(&icd->dev, "%s (vb=0x%p) 0x%08lx %d active=%p\n", __func__,
+		vb, vb->baddr, vb->bsize, pcdev->active);
 
+	spin_lock_irqsave(&pcdev->lock, flags);
 	list_add_tail(&vb->queue, &pcdev->capture);
 
 	vb->state = VIDEOBUF_ACTIVE;
-	active = pcdev->active;
-
-	if (!active) {
-		unsigned long cifr, cicr0;
-
-		cifr = __raw_readl(pcdev->base + CIFR) | CIFR_RESET_F;
-		__raw_writel(cifr, pcdev->base + CIFR);
-
-		for (i = 0; i < pcdev->channels; i++) {
-			DDADR(pcdev->dma_chans[i]) = buf->dmas[i].sg_dma;
-			DCSR(pcdev->dma_chans[i]) = DCSR_RUN;
-			pcdev->sg_tail[i] = buf->dmas[i].sg_cpu + buf->dmas[i].sglen - 1;
-		}
+	pxa_dma_add_tail_buf(pcdev, buf);
 
-		pcdev->active = buf;
-
-		cicr0 = __raw_readl(pcdev->base + CICR0) | CICR0_ENB;
-		__raw_writel(cicr0, pcdev->base + CICR0);
-	} else {
-		struct pxa_cam_dma *buf_dma;
-		struct pxa_cam_dma *act_dma;
-		int nents;
-
-		for (i = 0; i < pcdev->channels; i++) {
-			buf_dma = &buf->dmas[i];
-			act_dma = &active->dmas[i];
-			nents = buf_dma->sglen;
-
-			/* Stop DMA engine */
-			DCSR(pcdev->dma_chans[i]) = 0;
-
-			/* Add the descriptors we just initialized to
-			   the currently running chain */
-			pcdev->sg_tail[i]->ddadr = buf_dma->sg_dma;
-			pcdev->sg_tail[i] = buf_dma->sg_cpu + buf_dma->sglen - 1;
-
-			/* Setup a dummy descriptor with the DMA engines current
-			 * state
-			 */
-			buf_dma->sg_cpu[nents].dsadr =
-				pcdev->res->start + 0x28 + i*8; /* CIBRx */
-			buf_dma->sg_cpu[nents].dtadr =
-				DTADR(pcdev->dma_chans[i]);
-			buf_dma->sg_cpu[nents].dcmd =
-				DCMD(pcdev->dma_chans[i]);
-
-			if (DDADR(pcdev->dma_chans[i]) == DDADR_STOP) {
-				/* The DMA engine is on the last
-				   descriptor, set the next descriptors
-				   address to the descriptors we just
-				   initialized */
-				buf_dma->sg_cpu[nents].ddadr = buf_dma->sg_dma;
-			} else {
-				buf_dma->sg_cpu[nents].ddadr =
-					DDADR(pcdev->dma_chans[i]);
-			}
-
-			/* The next descriptor is the dummy descriptor */
-			DDADR(pcdev->dma_chans[i]) = buf_dma->sg_dma + nents *
-				sizeof(struct pxa_dma_desc);
-
-			DCSR(pcdev->dma_chans[i]) = DCSR_RUN;
-		}
-	}
+	if (!pcdev->active)
+		pxa_camera_start_capture(pcdev);
 
 	spin_unlock_irqrestore(&pcdev->lock, flags);
 }
@@ -636,7 +709,7 @@ static void pxa_camera_wakeup(struct pxa_camera_dev *pcdev,
 			      struct videobuf_buffer *vb,
 			      struct pxa_buffer *buf)
 {
-	unsigned long cicr0;
+	int i;
 
 	/* _init is used to debug races, see comment in pxa_camera_reqbufs() */
 	list_del_init(&vb->queue);
@@ -644,15 +717,13 @@ static void pxa_camera_wakeup(struct pxa_camera_dev *pcdev,
 	do_gettimeofday(&vb->ts);
 	vb->field_count++;
 	wake_up(&vb->done);
+	dev_dbg(pcdev->dev, "%s dequeud buffer (vb=0x%p)\n", __func__, vb);
 
 	if (list_empty(&pcdev->capture)) {
+		pxa_camera_stop_capture(pcdev);
 		pcdev->active = NULL;
-		DCSR(pcdev->dma_chans[0]) = 0;
-		DCSR(pcdev->dma_chans[1]) = 0;
-		DCSR(pcdev->dma_chans[2]) = 0;
-
-		cicr0 = __raw_readl(pcdev->base + CICR0) & ~CICR0_ENB;
-		__raw_writel(cicr0, pcdev->base + CICR0);
+		for (i = 0; i < pcdev->channels; i++)
+			pcdev->sg_tail[i] = NULL;
 		return;
 	}
 
@@ -660,6 +731,32 @@ static void pxa_camera_wakeup(struct pxa_camera_dev *pcdev,
 				   struct pxa_buffer, vb.queue);
 }
 
+/**
+ * pxa_camera_check_link_miss - check missed DMA linking
+ * @pcdev: camera device
+ *
+ * The DMA chaining is done with DMA running. This means a tiny temporal window
+ * remains, where a buffer is queued on the chain, while the chain is already
+ * stopped. This means the tailed buffer would never be transfered by DMA.
+ * This function restarts the capture for this corner case, where :
+ *  - DADR() == DADDR_STOP
+ *  - a videobuffer is queued on the pcdev->capture list
+ *
+ * Context: should only be called within the dma irq handler
+ */
+static void pxa_camera_check_link_miss(struct pxa_camera_dev *pcdev)
+{
+	int i, is_dma_stopped = 1;
+
+	for (i = 0; i < pcdev->channels; i++)
+		if (DDADR(pcdev->dma_chans[i]) != DDADR_STOP)
+			is_dma_stopped = 0;
+	dev_dbg(pcdev->dev, "%s : top queued buffer=%p, dma_stopped=%d\n",
+		__func__, pcdev->active, is_dma_stopped);
+	if (pcdev->active && is_dma_stopped)
+		pxa_camera_start_capture(pcdev);
+}
+
 static void pxa_camera_dma_irq(int channel, struct pxa_camera_dev *pcdev,
 			       enum pxa_camera_active_dma act_dma)
 {
@@ -667,19 +764,23 @@ static void pxa_camera_dma_irq(int channel, struct pxa_camera_dev *pcdev,
 	unsigned long flags;
 	u32 status, camera_status, overrun;
 	struct videobuf_buffer *vb;
-	unsigned long cifr, cicr0;
 
 	spin_lock_irqsave(&pcdev->lock, flags);
 
 	status = DCSR(channel);
-	DCSR(channel) = status | DCSR_ENDINTR;
+	DCSR(channel) = status;
+
+	camera_status = __raw_readl(pcdev->base + CISR);
+	overrun = CISR_IFO_0;
+	if (pcdev->channels == 3)
+		overrun |= CISR_IFO_1 | CISR_IFO_2;
 
 	if (status & DCSR_BUSERR) {
 		dev_err(pcdev->dev, "DMA Bus Error IRQ!\n");
 		goto out;
 	}
 
-	if (!(status & DCSR_ENDINTR)) {
+	if (!(status & (DCSR_ENDINTR | DCSR_STARTINTR))) {
 		dev_err(pcdev->dev, "Unknown DMA IRQ source, "
 			"status: 0x%08x\n", status);
 		goto out;
@@ -690,38 +791,27 @@ static void pxa_camera_dma_irq(int channel, struct pxa_camera_dev *pcdev,
 		goto out;
 	}
 
-	camera_status = __raw_readl(pcdev->base + CISR);
-	overrun = CISR_IFO_0;
-	if (pcdev->channels == 3)
-		overrun |= CISR_IFO_1 | CISR_IFO_2;
-	if (camera_status & overrun) {
-		dev_dbg(pcdev->dev, "FIFO overrun! CISR: %x\n", camera_status);
-		/* Stop the Capture Interface */
-		cicr0 = __raw_readl(pcdev->base + CICR0) & ~CICR0_ENB;
-		__raw_writel(cicr0, pcdev->base + CICR0);
-
-		/* Stop DMA */
-		DCSR(channel) = 0;
-		/* Reset the FIFOs */
-		cifr = __raw_readl(pcdev->base + CIFR) | CIFR_RESET_F;
-		__raw_writel(cifr, pcdev->base + CIFR);
-		/* Enable End-Of-Frame Interrupt */
-		cicr0 &= ~CICR0_EOFM;
-		__raw_writel(cicr0, pcdev->base + CICR0);
-		/* Restart the Capture Interface */
-		__raw_writel(cicr0 | CICR0_ENB, pcdev->base + CICR0);
-		goto out;
-	}
-
 	vb = &pcdev->active->vb;
 	buf = container_of(vb, struct pxa_buffer, vb);
 	WARN_ON(buf->inwork || list_empty(&vb->queue));
-	dev_dbg(pcdev->dev, "%s (vb=0x%p) 0x%08lx %d\n", __func__,
-		vb, vb->baddr, vb->bsize);
 
-	buf->active_dma &= ~act_dma;
-	if (!buf->active_dma)
-		pxa_camera_wakeup(pcdev, vb, buf);
+	dev_dbg(pcdev->dev, "%s channel=%d %s%s(vb=0x%p) dma.desc=%x\n",
+		__func__, channel, status & DCSR_STARTINTR ? "SOF " : "",
+		status & DCSR_ENDINTR ? "EOF " : "", vb, DDADR(channel));
+
+	if (status & DCSR_ENDINTR) {
+		if (camera_status & overrun) {
+			dev_dbg(pcdev->dev, "FIFO overrun! CISR: %x\n",
+				camera_status);
+			pxa_camera_stop_capture(pcdev);
+			pxa_camera_start_capture(pcdev);
+			goto out;
+		}
+		buf->active_dma &= ~act_dma;
+		if (!buf->active_dma)
+			pxa_camera_wakeup(pcdev, vb, buf);
+		pxa_camera_check_link_miss(pcdev);
+	}
 
 out:
 	spin_unlock_irqrestore(&pcdev->lock, flags);
@@ -860,12 +950,11 @@ static irqreturn_t pxa_camera_irq(int irq, void *data)
 	__raw_writel(status, pcdev->base + CISR);
 
 	if (status & CISR_EOF) {
-		int i;
-		for (i = 0; i < pcdev->channels; i++) {
-			DDADR(pcdev->dma_chans[i]) =
-				pcdev->active->dmas[i].sg_dma;
-			DCSR(pcdev->dma_chans[i]) = DCSR_RUN;
-		}
+		pcdev->active = list_first_entry(&pcdev->capture,
+					   struct pxa_buffer, vb.queue);
+
+		pxa_dma_start_channels(pcdev);
+
 		cicr0 = __raw_readl(pcdev->base + CICR0) | CICR0_EOFM;
 		__raw_writel(cicr0, pcdev->base + CICR0);
 	}
@@ -1417,18 +1506,8 @@ static int pxa_camera_resume(struct soc_camera_device *icd)
 		ret = pcdev->icd->ops->resume(pcdev->icd);
 
 	/* Restart frame capture if active buffer exists */
-	if (!ret && pcdev->active) {
-		unsigned long cifr, cicr0;
-
-		/* Reset the FIFOs */
-		cifr = __raw_readl(pcdev->base + CIFR) | CIFR_RESET_F;
-		__raw_writel(cifr, pcdev->base + CIFR);
-
-		cicr0 = __raw_readl(pcdev->base + CICR0);
-		cicr0 &= ~CICR0_EOFM;	/* Enable End-Of-Frame Interrupt */
-		cicr0 |= CICR0_ENB;	/* Restart the Capture Interface */
-		__raw_writel(cicr0, pcdev->base + CICR0);
-	}
+	if (!ret && pcdev->active)
+		pxa_camera_start_capture(pcdev);
 
 	return ret;
 }
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/4] pxa_camera: Redesign DMA handling
  2009-03-10 21:46           ` Robert Jarzmik
@ 2009-03-11 18:25             ` Guennadi Liakhovetski
  2009-03-11 19:45               ` Robert Jarzmik
  2009-03-11 21:21             ` Robert Jarzmik
  1 sibling, 1 reply; 33+ messages in thread
From: Guennadi Liakhovetski @ 2009-03-11 18:25 UTC (permalink / raw)
  To: Robert Jarzmik; +Cc: mike, Linux Media Mailing List

On Tue, 10 Mar 2009, Robert Jarzmik wrote:

> The DMA transfers in pxa_camera showed some weaknesses in
> multiple queued buffers context :
>  - poll/select problem
>    The order between list pcdev->capture and DMA chain was
>    not the same. This creates a discrepancy between video
>    buffers marked as "done" by the IRQ handler, and the
>    really finished video buffer.

Double-check. I still do not see how the order can be swapped. But don't 
worry, this doesn't diminish the value of your work, I'm just trying to be 
fair to the existing driver:-)

> 
>    The bug shows up with capture_example tool from v4l2 hg
>    tree. The process just "stalls" on a "select timeout".
> 
>    The key problem is in pxa_videobuf_queue(), where the
>    queued buffer is chained before the active buffer, while
>    it should have been the active buffer first, and queued
>    buffer tailed after.
> 
>  - multiple buffers DMA starting
>    When multiple buffers were queued, the DMA channels were
>    always started right away. This is not optimal, as a
>    special case appears when the first EOF was not yet
>    reached, and the DMA channels were prematurely started.
> 
>  - Maintainability
>    DMA code was a bit obfuscated. Rationalize the code to be
>    easily maintainable by anyone.
> 
> This patch attemps to address these issues.
> 
> Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
> ---
>  drivers/media/video/pxa_camera.c |  329 +++++++++++++++++++++++--------------
>  1 files changed, 204 insertions(+), 125 deletions(-)
> 
> diff --git a/drivers/media/video/pxa_camera.c b/drivers/media/video/pxa_camera.c
> index aca5374..6ec8135 100644
> --- a/drivers/media/video/pxa_camera.c
> +++ b/drivers/media/video/pxa_camera.c
> @@ -324,7 +324,7 @@ static int calculate_dma_sglen(struct scatterlist *sglist, int sglen,
>   * Prepares the pxa dma descriptors to transfer one camera channel.
>   * Beware sg_first and sg_first_ofs are both input and output parameters.
>   *
> - * Returns 0
> + * Returns 0 or -ENOMEM si no coherent memory is available

s/si/if/ s'il vous plait:-)

>   */
>  static int pxa_init_dma_channel(struct pxa_camera_dev *pcdev,
>  				struct pxa_buffer *buf,
> @@ -369,6 +369,10 @@ static int pxa_init_dma_channel(struct pxa_camera_dev *pcdev,
>  		pxa_dma->sg_cpu[i].dtadr = sg_dma_address(sg) + offset;
>  		pxa_dma->sg_cpu[i].dcmd =
>  			DCMD_FLOWSRC | DCMD_BURST8 | DCMD_INCTRGADDR | xfer_len;
> +#ifdef DEBUG
> +		if (!i)
> +			pxa_dma->sg_cpu[i].dcmd |= DCMD_STARTIRQEN;
> +#endif
>  		pxa_dma->sg_cpu[i].ddadr =
>  			pxa_dma->sg_dma + (i + 1) * sizeof(struct pxa_dma_desc);
>  
> @@ -402,6 +406,47 @@ static int pxa_init_dma_channel(struct pxa_camera_dev *pcdev,
>  	return 0;
>  }
>  
> +/*
> + * A DMA prepared buffer will have this structure :
> + * +------------+-----+---------------+-----------------+
> + * | desc-sg[0] | ... | desc-sg[last] | finisher/linker |
> + * +------------+-----+---------------+-----------------+
> + *
> + * This structure is pointed by dma->sg_cpu.
> + * The descriptors are used as follows :
> + *  - desc-sg[i]: i-th descriptor, transfering the i-th sg
> + *    element to the video buffer scatter gather
> + *  - finisher: has ddadr=DADDR_STOP, dcmd=ENDIRQEN
> + *  - linker: has ddadr= desc-sg[0] of next video buffer, dcmd=0
> + *
> + * For the next schema, let's assume d0=desc-sg[0] .. dN=desc-sg[N],
> + * "f" stands for finisher and "l" for linker.
> + * A typical running chain is :
> + *
> + *     Videobuffer 1         Videobuffer 2
> + * +---------+----+---+  +----+----+----+---+
> + * | d0 | .. | dN | l |  | d0 | .. | dN | f |
> + * +---------+----+-|-+  ^----+----+----+---+
> + *                  |    |
> + *                  +----+
> + *
> + * After the chaining is finished, the chain looks like :
> + *
> + *     Videobuffer 1         Videobuffer 2         Videobuffer 3
> + * +---------+----+---+  +----+----+----+---+  +----+----+----+---+
> + * | d0 | .. | dN | l |  | d0 | .. | dN | l |  | d0 | .. | dN | f |
> + * +---------+----+-|-+  ^----+----+----+-|-+  ^----+----+----+---+
> + *                  |    |                |    |
> + *                  +----+                +----+
> + *                                       new_link
> + *
> + * Now, the chaining done in pxa_videobuf_queue, creating "new_link" can happen
> + * while the DMA chain as already jumped to Videobuffer 2 "finisher". In this
> + * case, the DMA chain is stopped, and the DMA irq handler is pending.
> + * Then, the DMA irq completion handler will check if the DMA is finished and a
> + * buffer is still on the pcdev->capture list. If that's the case, the capture
> + * will be restarted.
> + */

Nice, how about putting this in Documentation/video4linux/pxa_camera.txt 
and a reference to it in the comment?

>  static int pxa_videobuf_prepare(struct videobuf_queue *vq,
>  		struct videobuf_buffer *vb, enum v4l2_field field)
>  {
> @@ -517,6 +562,96 @@ out:
>  	return ret;
>  }
>  
> +/**
> + * pxa_dma_start_channels - start DMA channel for active buffer
> + * @pcdev: pxa camera device
> + *
> + * Initialize DMA channels to the beginning of the active video buffer, and
> + * start these channels.
> + */
> +static void pxa_dma_start_channels(struct pxa_camera_dev *pcdev)
> +{
> +	int i;
> +	struct pxa_buffer *active;
> +
> +	active = pcdev->active;
> +
> +	for (i = 0; i < pcdev->channels; i++) {
> +		dev_dbg(pcdev->dev, "%s (channel=%d) ddadr=%08x\n", __func__,
> +			i, active->dmas[i].sg_dma);
> +		DDADR(pcdev->dma_chans[i]) = active->dmas[i].sg_dma;
> +		DCSR(pcdev->dma_chans[i]) = DCSR_RUN;
> +	}
> +}
> +
> +static void pxa_dma_stop_channels(struct pxa_camera_dev *pcdev)
> +{
> +	int i;
> +
> +	for (i = 0; i < pcdev->channels; i++) {
> +		dev_dbg(pcdev->dev, "%s (channel=%d)\n", __func__, i);
> +		DCSR(pcdev->dma_chans[i]) = 0;
> +	}
> +}
> +
> +static void pxa_dma_update_sg_tail(struct pxa_camera_dev *pcdev,
> +				   struct pxa_buffer *buf)
> +{
> +	int i;
> +
> +	for (i = 0; i < pcdev->channels; i++) {
> +		pcdev->sg_tail[i] = buf->dmas[i].sg_cpu + buf->dmas[i].sglen;
> +		pcdev->sg_tail[i]->ddadr = DDADR_STOP;

This function is now called "live" with running DMA, and you first append 
the chain, and only then terminate it... It should be ok because it is 
done with switched off IRQs, and DMA must be still at tail - 1 to 
automatically continue onto the appended chain, so, you should have enough 
time in 100% of cases, still it would look better to first terminate the 
chain and then append it.

> +	}
> +}
> +
> +static void pxa_dma_add_tail_buf(struct pxa_camera_dev *pcdev,
> +				 struct pxa_buffer *buf)
> +{
> +	int i;
> +
> +	for (i = 0; i < pcdev->channels; i++) {
> +		if (!pcdev->sg_tail[i])
> +			continue;
> +		pcdev->sg_tail[i]->ddadr = buf->dmas[i].sg_dma;
> +	}
> +
> +	pxa_dma_update_sg_tail(pcdev, buf);
> +}
> +
> +/**
> + * pxa_camera_start_capture - start video capturing
> + * @pcdev: camera device
> + *
> + * Launch capturing. DMA channels should not be active yet. They should get
> + * activated at the end of frame interrupt, to capture only whole frames, and
> + * never begin the capture of a partial frame.
> + */
> +static void pxa_camera_start_capture(struct pxa_camera_dev *pcdev)
> +{
> +	unsigned long cicr0, cifr;
> +
> +	dev_dbg(pcdev->dev, "%s\n", __func__);

<quote>
I originally had a "reset the FIFOs" comment here, wouldn't hurt to add it 
now too.
</quote>

> +	cifr = __raw_readl(pcdev->base + CIFR) | CIFR_RESET_F;
> +	__raw_writel(cifr, pcdev->base + CIFR);
> +
> +	cicr0 = __raw_readl(pcdev->base + CICR0) | CICR0_ENB;
> +	cicr0 &= ~CICR0_EOFM;
> +	__raw_writel(cicr0, pcdev->base + CICR0);
> +}
> +
> +static void pxa_camera_stop_capture(struct pxa_camera_dev *pcdev)
> +{
> +	unsigned long cicr0;
> +
> +	pxa_dma_stop_channels(pcdev);
> +
> +	cicr0 = __raw_readl(pcdev->base + CICR0) & ~CICR0_ENB;
> +	__raw_writel(cicr0, pcdev->base + CICR0);
> +
> +	dev_dbg(pcdev->dev, "%s\n", __func__);
> +}
> +
>  static void pxa_videobuf_queue(struct videobuf_queue *vq,
>  			       struct videobuf_buffer *vb)
>  {
> @@ -524,81 +659,19 @@ static void pxa_videobuf_queue(struct videobuf_queue *vq,
>  	struct soc_camera_host *ici = to_soc_camera_host(icd->dev.parent);
>  	struct pxa_camera_dev *pcdev = ici->priv;
>  	struct pxa_buffer *buf = container_of(vb, struct pxa_buffer, vb);
> -	struct pxa_buffer *active;
>  	unsigned long flags;
> -	int i;
>  
> -	dev_dbg(&icd->dev, "%s (vb=0x%p) 0x%08lx %d\n", __func__,
> -		vb, vb->baddr, vb->bsize);
> -	spin_lock_irqsave(&pcdev->lock, flags);
> +	dev_dbg(&icd->dev, "%s (vb=0x%p) 0x%08lx %d active=%p\n", __func__,
> +		vb, vb->baddr, vb->bsize, pcdev->active);
>  
> +	spin_lock_irqsave(&pcdev->lock, flags);
>  	list_add_tail(&vb->queue, &pcdev->capture);

I can understand adding an empty line between dev_dbg() and 
spin_lock_irqsave(), but I do not understand why you removed one between 
spin_lock_irqsave() and list_add_tail().

>  
>  	vb->state = VIDEOBUF_ACTIVE;
> -	active = pcdev->active;
> -
> -	if (!active) {
> -		unsigned long cifr, cicr0;
> -
> -		cifr = __raw_readl(pcdev->base + CIFR) | CIFR_RESET_F;
> -		__raw_writel(cifr, pcdev->base + CIFR);
> -
> -		for (i = 0; i < pcdev->channels; i++) {
> -			DDADR(pcdev->dma_chans[i]) = buf->dmas[i].sg_dma;
> -			DCSR(pcdev->dma_chans[i]) = DCSR_RUN;
> -			pcdev->sg_tail[i] = buf->dmas[i].sg_cpu + buf->dmas[i].sglen - 1;
> -		}
> +	pxa_dma_add_tail_buf(pcdev, buf);
>  
> -		pcdev->active = buf;
> -
> -		cicr0 = __raw_readl(pcdev->base + CICR0) | CICR0_ENB;
> -		__raw_writel(cicr0, pcdev->base + CICR0);
> -	} else {
> -		struct pxa_cam_dma *buf_dma;
> -		struct pxa_cam_dma *act_dma;
> -		int nents;
> -
> -		for (i = 0; i < pcdev->channels; i++) {
> -			buf_dma = &buf->dmas[i];
> -			act_dma = &active->dmas[i];
> -			nents = buf_dma->sglen;
> -
> -			/* Stop DMA engine */
> -			DCSR(pcdev->dma_chans[i]) = 0;
> -
> -			/* Add the descriptors we just initialized to
> -			   the currently running chain */
> -			pcdev->sg_tail[i]->ddadr = buf_dma->sg_dma;
> -			pcdev->sg_tail[i] = buf_dma->sg_cpu + buf_dma->sglen - 1;
> -
> -			/* Setup a dummy descriptor with the DMA engines current
> -			 * state
> -			 */
> -			buf_dma->sg_cpu[nents].dsadr =
> -				pcdev->res->start + 0x28 + i*8; /* CIBRx */
> -			buf_dma->sg_cpu[nents].dtadr =
> -				DTADR(pcdev->dma_chans[i]);
> -			buf_dma->sg_cpu[nents].dcmd =
> -				DCMD(pcdev->dma_chans[i]);
> -
> -			if (DDADR(pcdev->dma_chans[i]) == DDADR_STOP) {
> -				/* The DMA engine is on the last
> -				   descriptor, set the next descriptors
> -				   address to the descriptors we just
> -				   initialized */
> -				buf_dma->sg_cpu[nents].ddadr = buf_dma->sg_dma;
> -			} else {
> -				buf_dma->sg_cpu[nents].ddadr =
> -					DDADR(pcdev->dma_chans[i]);
> -			}
> -
> -			/* The next descriptor is the dummy descriptor */
> -			DDADR(pcdev->dma_chans[i]) = buf_dma->sg_dma + nents *
> -				sizeof(struct pxa_dma_desc);
> -
> -			DCSR(pcdev->dma_chans[i]) = DCSR_RUN;
> -		}
> -	}
> +	if (!pcdev->active)
> +		pxa_camera_start_capture(pcdev);
>  
>  	spin_unlock_irqrestore(&pcdev->lock, flags);
>  }
> @@ -636,7 +709,7 @@ static void pxa_camera_wakeup(struct pxa_camera_dev *pcdev,
>  			      struct videobuf_buffer *vb,
>  			      struct pxa_buffer *buf)
>  {
> -	unsigned long cicr0;
> +	int i;
>  
>  	/* _init is used to debug races, see comment in pxa_camera_reqbufs() */
>  	list_del_init(&vb->queue);
> @@ -644,15 +717,13 @@ static void pxa_camera_wakeup(struct pxa_camera_dev *pcdev,
>  	do_gettimeofday(&vb->ts);
>  	vb->field_count++;
>  	wake_up(&vb->done);
> +	dev_dbg(pcdev->dev, "%s dequeud buffer (vb=0x%p)\n", __func__, vb);
>  
>  	if (list_empty(&pcdev->capture)) {
> +		pxa_camera_stop_capture(pcdev);
>  		pcdev->active = NULL;
> -		DCSR(pcdev->dma_chans[0]) = 0;
> -		DCSR(pcdev->dma_chans[1]) = 0;
> -		DCSR(pcdev->dma_chans[2]) = 0;
> -
> -		cicr0 = __raw_readl(pcdev->base + CICR0) & ~CICR0_ENB;
> -		__raw_writel(cicr0, pcdev->base + CICR0);
> +		for (i = 0; i < pcdev->channels; i++)
> +			pcdev->sg_tail[i] = NULL;
>  		return;
>  	}
>  
> @@ -660,6 +731,32 @@ static void pxa_camera_wakeup(struct pxa_camera_dev *pcdev,
>  				   struct pxa_buffer, vb.queue);
>  }
>  
> +/**
> + * pxa_camera_check_link_miss - check missed DMA linking
> + * @pcdev: camera device
> + *
> + * The DMA chaining is done with DMA running. This means a tiny temporal window
> + * remains, where a buffer is queued on the chain, while the chain is already
> + * stopped. This means the tailed buffer would never be transfered by DMA.
> + * This function restarts the capture for this corner case, where :
> + *  - DADR() == DADDR_STOP
> + *  - a videobuffer is queued on the pcdev->capture list
> + *
> + * Context: should only be called within the dma irq handler
> + */
> +static void pxa_camera_check_link_miss(struct pxa_camera_dev *pcdev)
> +{
> +	int i, is_dma_stopped = 1;
> +
> +	for (i = 0; i < pcdev->channels; i++)
> +		if (DDADR(pcdev->dma_chans[i]) != DDADR_STOP)
> +			is_dma_stopped = 0;
> +	dev_dbg(pcdev->dev, "%s : top queued buffer=%p, dma_stopped=%d\n",
> +		__func__, pcdev->active, is_dma_stopped);
> +	if (pcdev->active && is_dma_stopped)
> +		pxa_camera_start_capture(pcdev);
> +}
> +
>  static void pxa_camera_dma_irq(int channel, struct pxa_camera_dev *pcdev,
>  			       enum pxa_camera_active_dma act_dma)
>  {
> @@ -667,19 +764,23 @@ static void pxa_camera_dma_irq(int channel, struct pxa_camera_dev *pcdev,
>  	unsigned long flags;
>  	u32 status, camera_status, overrun;
>  	struct videobuf_buffer *vb;
> -	unsigned long cifr, cicr0;
>  
>  	spin_lock_irqsave(&pcdev->lock, flags);
>  
>  	status = DCSR(channel);
> -	DCSR(channel) = status | DCSR_ENDINTR;
> +	DCSR(channel) = status;
> +
> +	camera_status = __raw_readl(pcdev->base + CISR);
> +	overrun = CISR_IFO_0;
> +	if (pcdev->channels == 3)
> +		overrun |= CISR_IFO_1 | CISR_IFO_2;
>  
>  	if (status & DCSR_BUSERR) {
>  		dev_err(pcdev->dev, "DMA Bus Error IRQ!\n");
>  		goto out;
>  	}
>  
> -	if (!(status & DCSR_ENDINTR)) {
> +	if (!(status & (DCSR_ENDINTR | DCSR_STARTINTR))) {
>  		dev_err(pcdev->dev, "Unknown DMA IRQ source, "
>  			"status: 0x%08x\n", status);
>  		goto out;
> @@ -690,38 +791,27 @@ static void pxa_camera_dma_irq(int channel, struct pxa_camera_dev *pcdev,
>  		goto out;
>  	}
>  
> -	camera_status = __raw_readl(pcdev->base + CISR);
> -	overrun = CISR_IFO_0;
> -	if (pcdev->channels == 3)
> -		overrun |= CISR_IFO_1 | CISR_IFO_2;
> -	if (camera_status & overrun) {
> -		dev_dbg(pcdev->dev, "FIFO overrun! CISR: %x\n", camera_status);
> -		/* Stop the Capture Interface */
> -		cicr0 = __raw_readl(pcdev->base + CICR0) & ~CICR0_ENB;
> -		__raw_writel(cicr0, pcdev->base + CICR0);
> -
> -		/* Stop DMA */
> -		DCSR(channel) = 0;
> -		/* Reset the FIFOs */
> -		cifr = __raw_readl(pcdev->base + CIFR) | CIFR_RESET_F;
> -		__raw_writel(cifr, pcdev->base + CIFR);
> -		/* Enable End-Of-Frame Interrupt */
> -		cicr0 &= ~CICR0_EOFM;
> -		__raw_writel(cicr0, pcdev->base + CICR0);
> -		/* Restart the Capture Interface */
> -		__raw_writel(cicr0 | CICR0_ENB, pcdev->base + CICR0);
> -		goto out;
> -	}
> -
>  	vb = &pcdev->active->vb;
>  	buf = container_of(vb, struct pxa_buffer, vb);
>  	WARN_ON(buf->inwork || list_empty(&vb->queue));
> -	dev_dbg(pcdev->dev, "%s (vb=0x%p) 0x%08lx %d\n", __func__,
> -		vb, vb->baddr, vb->bsize);
>  
> -	buf->active_dma &= ~act_dma;
> -	if (!buf->active_dma)
> -		pxa_camera_wakeup(pcdev, vb, buf);
> +	dev_dbg(pcdev->dev, "%s channel=%d %s%s(vb=0x%p) dma.desc=%x\n",
> +		__func__, channel, status & DCSR_STARTINTR ? "SOF " : "",
> +		status & DCSR_ENDINTR ? "EOF " : "", vb, DDADR(channel));
> +
> +	if (status & DCSR_ENDINTR) {
> +		if (camera_status & overrun) {
> +			dev_dbg(pcdev->dev, "FIFO overrun! CISR: %x\n",
> +				camera_status);
> +			pxa_camera_stop_capture(pcdev);
> +			pxa_camera_start_capture(pcdev);
> +			goto out;
> +		}
> +		buf->active_dma &= ~act_dma;
> +		if (!buf->active_dma)
> +			pxa_camera_wakeup(pcdev, vb, buf);
> +		pxa_camera_check_link_miss(pcdev);
> +	}
>  
>  out:
>  	spin_unlock_irqrestore(&pcdev->lock, flags);
> @@ -860,12 +950,11 @@ static irqreturn_t pxa_camera_irq(int irq, void *data)
>  	__raw_writel(status, pcdev->base + CISR);
>  
>  	if (status & CISR_EOF) {
> -		int i;
> -		for (i = 0; i < pcdev->channels; i++) {
> -			DDADR(pcdev->dma_chans[i]) =
> -				pcdev->active->dmas[i].sg_dma;
> -			DCSR(pcdev->dma_chans[i]) = DCSR_RUN;
> -		}
> +		pcdev->active = list_first_entry(&pcdev->capture,
> +					   struct pxa_buffer, vb.queue);
> +
> +		pxa_dma_start_channels(pcdev);
> +
>  		cicr0 = __raw_readl(pcdev->base + CICR0) | CICR0_EOFM;
>  		__raw_writel(cicr0, pcdev->base + CICR0);
>  	}
> @@ -1417,18 +1506,8 @@ static int pxa_camera_resume(struct soc_camera_device *icd)
>  		ret = pcdev->icd->ops->resume(pcdev->icd);
>  
>  	/* Restart frame capture if active buffer exists */
> -	if (!ret && pcdev->active) {
> -		unsigned long cifr, cicr0;
> -
> -		/* Reset the FIFOs */
> -		cifr = __raw_readl(pcdev->base + CIFR) | CIFR_RESET_F;
> -		__raw_writel(cifr, pcdev->base + CIFR);
> -
> -		cicr0 = __raw_readl(pcdev->base + CICR0);
> -		cicr0 &= ~CICR0_EOFM;	/* Enable End-Of-Frame Interrupt */
> -		cicr0 |= CICR0_ENB;	/* Restart the Capture Interface */
> -		__raw_writel(cicr0, pcdev->base + CICR0);
> -	}
> +	if (!ret && pcdev->active)
> +		pxa_camera_start_capture(pcdev);
>  
>  	return ret;
>  }
> -- 
> 1.5.6.5
> 

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/4] pxa_camera: Fix overrun condition on last buffer
  2009-03-05 19:45       ` [PATCH 4/4] pxa_camera: Fix overrun condition on last buffer Robert Jarzmik
  2009-03-09 11:39         ` Guennadi Liakhovetski
@ 2009-03-11 18:31         ` Guennadi Liakhovetski
  2009-03-12 21:36           ` Robert Jarzmik
  1 sibling, 1 reply; 33+ messages in thread
From: Guennadi Liakhovetski @ 2009-03-11 18:31 UTC (permalink / raw)
  To: Robert Jarzmik; +Cc: mike, Linux Media Mailing List

On Thu, 5 Mar 2009, Robert Jarzmik wrote:

> The last buffer queued will often overrun, as the DMA chain
> is finished, and the time the dma irq handler is activated,
> the QIF fifos are filled by the sensor.
> 
> The fix is to ignore the overrun condition on the last
> queued buffer, and restart the capture only on intermediate
> buffers of the chain.
> 
> Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
> ---
>  drivers/media/video/pxa_camera.c |    8 ++++++--
>  1 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/media/video/pxa_camera.c b/drivers/media/video/pxa_camera.c
> index 16bf0a3..dd56c35 100644
> --- a/drivers/media/video/pxa_camera.c
> +++ b/drivers/media/video/pxa_camera.c
> @@ -734,14 +734,18 @@ static void pxa_camera_dma_irq(int channel, struct pxa_camera_dev *pcdev,
>  		status & DCSR_ENDINTR ? "EOF " : "", vb, DDADR(channel));
>  
>  	if (status & DCSR_ENDINTR) {
> -		if (camera_status & overrun) {
> +		/*
> +		 * It's normal if the last frame creates an overrun, as there
> +		 * are no more DMA descriptors to fetch from QIF fifos
> +		 */
> +		if (camera_status & overrun
> +		    && !list_is_last(pcdev->capture.next, &pcdev->capture)) {

On a second look - didn't you want to test for ->active being the last?

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/4] pxa_camera: Redesign DMA handling
  2009-03-11 18:25             ` Guennadi Liakhovetski
@ 2009-03-11 19:45               ` Robert Jarzmik
  2009-03-11 20:24                 ` Robert Jarzmik
  0 siblings, 1 reply; 33+ messages in thread
From: Robert Jarzmik @ 2009-03-11 19:45 UTC (permalink / raw)
  To: Guennadi Liakhovetski; +Cc: mike, Linux Media Mailing List

Guennadi Liakhovetski <g.liakhovetski@gmx.de> writes:

> On Tue, 10 Mar 2009, Robert Jarzmik wrote:
>
>> The DMA transfers in pxa_camera showed some weaknesses in
>> multiple queued buffers context :
>>  - poll/select problem
>>    The order between list pcdev->capture and DMA chain was
>>    not the same. This creates a discrepancy between video
>>    buffers marked as "done" by the IRQ handler, and the
>>    really finished video buffer.
>
> Double-check. I still do not see how the order can be swapped. But don't 
> worry, this doesn't diminish the value of your work, I'm just trying to be 
> fair to the existing driver:-)
Yes :) I thought I had sent a mail to apologize for the remaining comment ... I
have fully removed the order issue, I only left the poll/select issue.
And yes, it's not fair to the previous code, you're perfectly right.

>> @@ -324,7 +324,7 @@ static int calculate_dma_sglen(struct scatterlist *sglist, int sglen,
>>   * Prepares the pxa dma descriptors to transfer one camera channel.
>>   * Beware sg_first and sg_first_ofs are both input and output parameters.
>>   *
>> - * Returns 0
>> + * Returns 0 or -ENOMEM si no coherent memory is available
>
> s/si/if/ s'il vous plait:-)
Argh. I was sure I had amended that ... urg.

> Nice, how about putting this in Documentation/video4linux/pxa_camera.txt 
> and a reference to it in the comment?
Yes, it will make the code lighter, won't it ? I'll need a bit of help to
correct my english here ...

>> +	for (i = 0; i < pcdev->channels; i++) {
>> +		pcdev->sg_tail[i] = buf->dmas[i].sg_cpu + buf->dmas[i].sglen;
>> +		pcdev->sg_tail[i]->ddadr = DDADR_STOP;

> This function is now called "live" with running DMA, and you first append 
> the chain, and only then terminate it... It should be ok because it is 
> done with switched off IRQs, and DMA must be still at tail - 1 to 
> automatically continue onto the appended chain, so, you should have enough 
> time in 100% of cases, still it would look better to first terminate the 
> chain and then append it.
Correct. I'll invert the 2 assignments.

>
>> +static void pxa_camera_start_capture(struct pxa_camera_dev *pcdev)
>> +{
>> +	unsigned long cicr0, cifr;
>> +
>> +	dev_dbg(pcdev->dev, "%s\n", __func__);
>
> <quote>
> I originally had a "reset the FIFOs" comment here, wouldn't hurt to add it 
> now too.
> </quote>
Yes. I re-added it. I added also your "Enable End-Of-Frame Interrupt" back.

>> @@ -524,81 +659,19 @@ static void pxa_videobuf_queue(struct videobuf_queue *vq,
>>  	struct soc_camera_host *ici = to_soc_camera_host(icd->dev.parent);
>>  	struct pxa_camera_dev *pcdev = ici->priv;
>>  	struct pxa_buffer *buf = container_of(vb, struct pxa_buffer, vb);
>> -	struct pxa_buffer *active;
>>  	unsigned long flags;
>> -	int i;
>>  
>> -	dev_dbg(&icd->dev, "%s (vb=0x%p) 0x%08lx %d\n", __func__,
>> -		vb, vb->baddr, vb->bsize);
>> -	spin_lock_irqsave(&pcdev->lock, flags);
>> +	dev_dbg(&icd->dev, "%s (vb=0x%p) 0x%08lx %d active=%p\n", __func__,
>> +		vb, vb->baddr, vb->bsize, pcdev->active);
>>  
>> +	spin_lock_irqsave(&pcdev->lock, flags);
>>  	list_add_tail(&vb->queue, &pcdev->capture);
>
> I can understand adding an empty line between dev_dbg() and 
> spin_lock_irqsave(), but I do not understand why you removed one between 
> spin_lock_irqsave() and list_add_tail().
My bad. I had a mdelay() around to test the corner case :) I'll amend that.

Cheers.

--
Robert

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/4] pxa_camera: Redesign DMA handling
  2009-03-11 19:45               ` Robert Jarzmik
@ 2009-03-11 20:24                 ` Robert Jarzmik
  0 siblings, 0 replies; 33+ messages in thread
From: Robert Jarzmik @ 2009-03-11 20:24 UTC (permalink / raw)
  To: Guennadi Liakhovetski; +Cc: mike, Linux Media Mailing List

Robert Jarzmik <robert.jarzmik@free.fr> writes:

> Guennadi Liakhovetski <g.liakhovetski@gmx.de> writes:
>
>>> +	for (i = 0; i < pcdev->channels; i++) {
>>> +		pcdev->sg_tail[i] = buf->dmas[i].sg_cpu + buf->dmas[i].sglen;
>>> +		pcdev->sg_tail[i]->ddadr = DDADR_STOP;
>
>> This function is now called "live" with running DMA, and you first append 
>> the chain, and only then terminate it... It should be ok because it is 
>> done with switched off IRQs, and DMA must be still at tail - 1 to 
>> automatically continue onto the appended chain, so, you should have enough 
>> time in 100% of cases, still it would look better to first terminate the 
>> chain and then append it.

> Correct. I'll invert the 2 assignments.

At second thought, I think I'll change this. The first assignment doesn't append
the chain, it just moves where sg_tail points at. The "chain append" was
previously done in "pxa_dma_add_tail_buf".

So the correct thing to do is to displace the "DDADR_STOP" assignment into
"pxa_dma_add_tail_buf", to make the "STOP" on the last descriptor of the newest
tail buffer. This "STOP" should be set up _before_ the buffer is queued.

I'll amend that.

Cheers.

--
Robert

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/4] pxa_camera: Redesign DMA handling
  2009-03-10 21:46           ` Robert Jarzmik
  2009-03-11 18:25             ` Guennadi Liakhovetski
@ 2009-03-11 21:21             ` Robert Jarzmik
  1 sibling, 0 replies; 33+ messages in thread
From: Robert Jarzmik @ 2009-03-11 21:21 UTC (permalink / raw)
  To: Guennadi Liakhovetski; +Cc: mike, Linux Media Mailing List

Robert Jarzmik <robert.jarzmik@free.fr> writes:

> Guennadi Liakhovetski <g.liakhovetski@gmx.de> writes:

>  It is the result of our conversation about "hot DMA linking". I tested both
> paths (the optimal one and the one where DMA stops while queuing =>
> cf. pxa_camera_check_link_miss) for RGB565 format.  I'll test further for
> YUV422P ...

Well, surprise surprise with the YUV422P format. We're not done yet ...

There is a little issue with overrun : buf->active_dma is cleared in dma irq
handler (for example suppose is cleared of DMA_U which finished first). Then an
overrun occurs, and we restart that frame ...

We should have reset buf->active_dma to DMA_Y | DMA_U | DMA_V.
I think same thing applies to the "hot chain" link miss restart.

The non-regression tests are not yet finished ... exciting, isn't it ? :)

Cheers.

--
Robert

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/4] pxa_camera: Fix overrun condition on last buffer
  2009-03-11 18:31         ` Guennadi Liakhovetski
@ 2009-03-12 21:36           ` Robert Jarzmik
  2009-03-12 22:12             ` Guennadi Liakhovetski
  0 siblings, 1 reply; 33+ messages in thread
From: Robert Jarzmik @ 2009-03-12 21:36 UTC (permalink / raw)
  To: Guennadi Liakhovetski; +Cc: mike, Linux Media Mailing List

Guennadi Liakhovetski <g.liakhovetski@gmx.de> writes:

>> diff --git a/drivers/media/video/pxa_camera.c b/drivers/media/video/pxa_camera.c
>> index 16bf0a3..dd56c35 100644
>> --- a/drivers/media/video/pxa_camera.c
>> +++ b/drivers/media/video/pxa_camera.c
>> @@ -734,14 +734,18 @@ static void pxa_camera_dma_irq(int channel, struct pxa_camera_dev *pcdev,
>>  		status & DCSR_ENDINTR ? "EOF " : "", vb, DDADR(channel));
>>  
>>  	if (status & DCSR_ENDINTR) {
>> -		if (camera_status & overrun) {
>> +		/*
>> +		 * It's normal if the last frame creates an overrun, as there
>> +		 * are no more DMA descriptors to fetch from QIF fifos
>> +		 */
>> +		if (camera_status & overrun
>> +		    && !list_is_last(pcdev->capture.next, &pcdev->capture)) {
>
> On a second look - didn't you want to test for ->active being the last?

Mmm, I'm not sure I get you right here. AFAICR pcdev->active has no direct link
with pcdev->capture (it has nothing to do with a list_head *). Of course with a
bit of "container_of" magic (or list_entry equivalent), I'll find it ...

If that list_is_last is not good, would you provide me with a better alternative
?

Cheers.

--
Robert

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/4] pxa_camera: Fix overrun condition on last buffer
  2009-03-12 21:36           ` Robert Jarzmik
@ 2009-03-12 22:12             ` Guennadi Liakhovetski
  0 siblings, 0 replies; 33+ messages in thread
From: Guennadi Liakhovetski @ 2009-03-12 22:12 UTC (permalink / raw)
  To: Robert Jarzmik; +Cc: mike, Linux Media Mailing List

On Thu, 12 Mar 2009, Robert Jarzmik wrote:

> Guennadi Liakhovetski <g.liakhovetski@gmx.de> writes:
> 
> >> diff --git a/drivers/media/video/pxa_camera.c b/drivers/media/video/pxa_camera.c
> >> index 16bf0a3..dd56c35 100644
> >> --- a/drivers/media/video/pxa_camera.c
> >> +++ b/drivers/media/video/pxa_camera.c
> >> @@ -734,14 +734,18 @@ static void pxa_camera_dma_irq(int channel, struct pxa_camera_dev *pcdev,
> >>  		status & DCSR_ENDINTR ? "EOF " : "", vb, DDADR(channel));
> >>  
> >>  	if (status & DCSR_ENDINTR) {
> >> -		if (camera_status & overrun) {
> >> +		/*
> >> +		 * It's normal if the last frame creates an overrun, as there
> >> +		 * are no more DMA descriptors to fetch from QIF fifos
> >> +		 */
> >> +		if (camera_status & overrun
> >> +		    && !list_is_last(pcdev->capture.next, &pcdev->capture)) {
> >
> > On a second look - didn't you want to test for ->active being the last?
> 
> Mmm, I'm not sure I get you right here. AFAICR pcdev->active has no direct link
> with pcdev->capture (it has nothing to do with a list_head *). Of course with a
> bit of "container_of" magic (or list_entry equivalent), I'll find it ...

Ah, sorry, scratch it, I now understand what you're doing here, looks ok.

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/4] pxa_camera: Remove YUV planar formats hole
  2009-03-06 23:12     ` Trent Piepho
@ 2009-03-06 23:30       ` Robert Jarzmik
  0 siblings, 0 replies; 33+ messages in thread
From: Robert Jarzmik @ 2009-03-06 23:30 UTC (permalink / raw)
  To: Trent Piepho; +Cc: Guennadi Liakhovetski, mike, Linux Media Mailing List

Trent Piepho <xyzzy@speakeasy.org> writes:

> I like the algorithm I posted, after another small improvement, better.
So push it toward v4l2, to have wider audience.

If I were you, I'd have a peek at include/linux/kernel.h, which brings you
beautiful functions like ALIGN(), IS_ALIGNED(), and so on ... That could make
your next review easier.

> For instance, if width is aligned by 8 and height by 2, then you have
> already have 16 byte alignment and there is no need to align height by 4.
> E.g., 168x202 will be kept as 168x202 with my method but the rounding down
> method changes it to 168x200.
In the algorithm I posted, I keep 168x202 as well.

> Another example, take 159x243.  My algorithm produces 160x243, which seems
> much better than 156x240, what one gets by rounding each dimention down to
> a multiple of four.
By better you mean "nearer" ? Well, why not. If your patch mades it through v4l2
stack, I'll push an update to use it, deal ?

--
Robert

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/4] pxa_camera: Remove YUV planar formats hole
  2009-03-06 18:55   ` Guennadi Liakhovetski
  2009-03-06 23:12     ` Trent Piepho
@ 2009-03-06 23:17     ` Robert Jarzmik
  1 sibling, 0 replies; 33+ messages in thread
From: Robert Jarzmik @ 2009-03-06 23:17 UTC (permalink / raw)
  To: Guennadi Liakhovetski; +Cc: mike, Linux Media Mailing List, Trent Piepho

Guennadi Liakhovetski <g.liakhovetski@gmx.de> writes:

>> This implies that even if DMA is 8 bytes aligned, width x height should 
>> be a multiple of 16, not 8 as I stated in the first git comment. So that 
>> would align :
>>  - width on 4 bytes (aligning meaning the lowest multiple of 4 below or equal to width)
>>  - and height on 4 bytes (aligning meaning the lowest multiple of 4 below or equal to height)
>> 
>> Do we have an agreement on that specification, so that I can amend the code accordingly ?
>
> Yep, looks good to me.

All right, this should amend patch 1/4. I'll wait for the complete review to
resend the patch as a whole.

Cheers

--
Robert

commit 4d3bd5219dd3ef27f11c7061adf10f8249d2ba26
Author: Robert Jarzmik <robert.jarzmik@free.fr>
Date:   Fri Mar 6 22:39:41 2009 +0100

    pxa_camera: Enforce YUV422P frame sizes to be 16 multiples
    
    Due to DMA constraints, the DMA chain always transfers bytes
    from the QIF fifos to memory in 8 bytes units. In planar
    formats, that could mean 0 padding between Y and U plane
    (and between U and V plane), which is against YUV422P
    standard.
    
    Therefore, a frame size is required to be a multiple of 16
    (so U plane size is a multiple of 8). It is enforced in
    try_fmt() and set_fmt() primitives, be aligning height then
    width on 4 multiples as need be, to reach a 16 multiple.
    
    Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
---
 drivers/media/video/pxa_camera.c |   19 +++++++++++++++----
 1 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/media/video/pxa_camera.c b/drivers/media/video/pxa_camera.c
index 54df071..f736f6b 100644
--- a/drivers/media/video/pxa_camera.c
+++ b/drivers/media/video/pxa_camera.c
@@ -162,6 +162,8 @@
 			CICR0_PERRM | CICR0_QDM | CICR0_CDM | CICR0_SOFM | \
 			CICR0_EOFM | CICR0_FOM)
 
+#define PIX_YUV422P_ALIGN 16	/* YUV422P pix size should be a multiple of 16 */
+
 /*
  * Structures
  */
@@ -241,11 +243,8 @@ static int pxa_videobuf_setup(struct videobuf_queue *vq, unsigned int *count,
 
 	dev_dbg(&icd->dev, "count=%d, size=%d\n", *count, *size);
 
-	/* planar capture requires Y, U and V buffers to be page aligned */
 	if (pcdev->channels == 3)
-		*size = roundup(icd->width * icd->height, 8) /* Y pages */
-			+ roundup(icd->width * icd->height / 2, 8) /* U pages */
-			+ roundup(icd->width * icd->height / 2, 8); /* V pages */
+		*size = icd->width * icd->height * 2;
 	else
 		*size = roundup(icd->width * icd->height *
 				((icd->current_fmt->depth + 7) >> 3), 8);
@@ -1297,6 +1296,18 @@ static int pxa_camera_try_fmt(struct soc_camera_device *icd,
 		pix->width = 2048;
 	pix->width &= ~0x01;
 
+	/*
+	 * YUV422P planar format requires images size to be a 16 bytes
+	 * multiple. If not, zeros will be inserted between Y and U planes, and
+	 * U and V planes, and YUV422P standard would be violated.
+	 */
+	if (xlate->host_fmt->fourcc == V4L2_PIX_FMT_YUV422P) {
+		if ((pix->width * pix->height) & PIX_YUV422P_ALIGN)
+			pix->height = ALIGN(pix->height, PIX_YUV422P_ALIGN / 2);
+		if ((pix->width * pix->height) & PIX_YUV422P_ALIGN)
+			pix->width = ALIGN(pix->width, PIX_YUV422P_ALIGN / 2);
+	}
+
 	pix->bytesperline = pix->width *
 		DIV_ROUND_UP(xlate->host_fmt->depth, 8);
 	pix->sizeimage = pix->height * pix->bytesperline;

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/4] pxa_camera: Remove YUV planar formats hole
  2009-03-06 18:55   ` Guennadi Liakhovetski
@ 2009-03-06 23:12     ` Trent Piepho
  2009-03-06 23:30       ` Robert Jarzmik
  2009-03-06 23:17     ` Robert Jarzmik
  1 sibling, 1 reply; 33+ messages in thread
From: Trent Piepho @ 2009-03-06 23:12 UTC (permalink / raw)
  To: Guennadi Liakhovetski; +Cc: robert.jarzmik, mike, Linux Media Mailing List

On Fri, 6 Mar 2009, Guennadi Liakhovetski wrote:
> On Fri, 6 Mar 2009, robert.jarzmik@free.fr wrote:
> >
> > This implies that even if DMA is 8 bytes aligned, width x height should
> > be a multiple of 16, not 8 as I stated in the first git comment. So that
> > would align :
> >  - width on 4 bytes (aligning meaning the lowest multiple of 4 below or equal to width)
> >  - and height on 4 bytes (aligning meaning the lowest multiple of 4 below or equal to height)
> >
> > Do we have an agreement on that specification, so that I can amend the code accordingly ?
>
> Yep, looks good to me.

I like the algorithm I posted, after another small improvement, better.

For instance, if width is aligned by 8 and height by 2, then you have
already have 16 byte alignment and there is no need to align height by 4.
E.g., 168x202 will be kept as 168x202 with my method but the rounding down
method changes it to 168x200.

Another example, take 159x243.  My algorithm produces 160x243, which seems
much better than 156x240, what one gets by rounding each dimention down to
a multiple of four.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/4] pxa_camera: Remove YUV planar formats hole
  2009-03-06  8:26 ` robert.jarzmik
  2009-03-06  9:56   ` Trent Piepho
@ 2009-03-06 18:55   ` Guennadi Liakhovetski
  2009-03-06 23:12     ` Trent Piepho
  2009-03-06 23:17     ` Robert Jarzmik
  1 sibling, 2 replies; 33+ messages in thread
From: Guennadi Liakhovetski @ 2009-03-06 18:55 UTC (permalink / raw)
  To: robert.jarzmik; +Cc: mike, Linux Media Mailing List, Trent Piepho

On Fri, 6 Mar 2009, robert.jarzmik@free.fr wrote:

> ----- Mail Original -----
> De: "Guennadi Liakhovetski" <g.liakhovetski@gmx.de>

(hm, there's something strange about this and following emails in this 
thread, I didn't get them directly, only over the mailing list... strange)

> À: "Trent Piepho" <xyzzy@speakeasy.org>
> Cc: "Robert Jarzmik" <robert.jarzmik@free.fr>, mike@compulab.co.il, "Linux Media Mailing List" <linux-media@vger.kernel.org>
> Envoyé: Jeudi 5 Mars 2009 23h15:03 GMT +01:00 Amsterdam / Berlin / Berne / Rome / Stockholm / Vienne
> Objet: Re: [PATCH 1/4] pxa_camera: Remove YUV planar formats hole
> 
> > Yes, adjusting both is also what I was suggesting in my original review. 
> > How about aligning the bigger of the two to 4 bytes and the smaller to 2? 
> 
> Yes, sounds good.
> 
> I remade my calculations :
>  - if (width x height % 8 == 0) :
>    => frame size = width x height x 2
>    => U plane size = frame size / 4 = width x height /2
>    => U plane size is a multiple of 4
>       As the last DMA load from QIF fifo will return 8 bytes (and not 4 as 
>       we would expect, cf. PXA Developer's Manual, 27.4.4.1), this is not
>       good.
> 
> This implies that even if DMA is 8 bytes aligned, width x height should 
> be a multiple of 16, not 8 as I stated in the first git comment. So that 
> would align :
>  - width on 4 bytes (aligning meaning the lowest multiple of 4 below or equal to width)
>  - and height on 4 bytes (aligning meaning the lowest multiple of 4 below or equal to height)
> 
> Do we have an agreement on that specification, so that I can amend the code accordingly ?

Yep, looks good to me.

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/4] pxa_camera: Remove YUV planar formats hole
       [not found] <421785551.2131221236346602590.JavaMail.root@zimbra20-e3.priv.proxad.net>
@ 2009-03-06 13:39 ` robert.jarzmik
  0 siblings, 0 replies; 33+ messages in thread
From: robert.jarzmik @ 2009-03-06 13:39 UTC (permalink / raw)
  To: Trent Piepho; +Cc: Guennadi Liakhovetski, mike, Linux Media Mailing List

----- Mail Original -----
De: "Trent Piepho" <xyzzy@speakeasy.org>
À: "robert jarzmik" <robert.jarzmik@free.fr>
Cc: "Guennadi Liakhovetski" <g.liakhovetski@gmx.de>, mike@compulab.co.il, "Linux Media Mailing List" <linux-media@vger.kernel.org>
Envoyé: Vendredi 6 Mars 2009 10h56:39 GMT +01:00 Amsterdam / Berlin / Berne / Rome / Stockholm / Vienne
Objet: Re: [PATCH 1/4] pxa_camera: Remove YUV planar formats hole

> I remade my calculations :
>  - if (width x height % 8 == 0) :
>    => frame size = width x height x 2
>    => U plane size = frame size / 4 = width x height /2
>    => U plane size is a multiple of 4
>       As the last DMA load from QIF fifo will return 8 bytes (and not 4 as
>       we would expect, cf. PXA Developer's Manual, 27.4.4.1), this is not
>       good.

So it's only a requirement that the total size of all three planes put
together is a multiple of 8?  Or is it a requirement that each plane is a
multiple of 8?  And are you using 4:2:2 or 4:2:0?

The requirement is for each plane, as each plane has a QIF fifo associated. If the requirement is granted for plane U, then it is for plane V (same size), and plane Y as well (twice the size of plane U).

And the file format is planar 4:2:2 YCbCr, which gives (assuming 32 bit wide representation, with most significant bit on the left of the schema and least on the right) for example :

 31      23     15     7     0
 +------+------+------+------+
 | Yn+3 | Yn+2 | Yn+1 | Yn   |
 | ..........................|
 + YN   | YN-1 | YN-1 | 0    |
 +------+------+------+------+
 | Cr+3 | Cr+2 | Cr+1 | Cr   |
 | ..........................|
 | CrN  | CrN-1|  0   |  0   |
 +------+------+------+------+
 | Cr+3 | Cr+2 | Cr+1 | Cr   |
 | ..........................|
 | CrN  | CrN-1|  0   |  0   |
 +------+------+------+------+

Everywhere where I put a "0", de QIF interface sends a 0 byte even if it doesn't represent a pixel (padding by 0s).
The same is true for RGB formats, but instead of 0, it is 0 or "transparency bit".

For the height/width calculation, I didn't find your function in kernel tree. As it is very generic, I have no way to put it into pxa_camera, it should go ... elsewhere. So I think I'll use a dumb "ALIGN(x, 8)" ...

Cheers.

--
Robert

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/4] pxa_camera: Remove YUV planar formats hole
  2009-03-06  8:26 ` robert.jarzmik
@ 2009-03-06  9:56   ` Trent Piepho
  2009-03-06 18:55   ` Guennadi Liakhovetski
  1 sibling, 0 replies; 33+ messages in thread
From: Trent Piepho @ 2009-03-06  9:56 UTC (permalink / raw)
  To: robert.jarzmik; +Cc: Guennadi Liakhovetski, mike, Linux Media Mailing List

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: TEXT/PLAIN; charset=X-UNKNOWN, Size: 1220 bytes --]

On Fri, 6 Mar 2009 robert.jarzmik@free.fr wrote:
> ----- Mail Original -----
> De: "Guennadi Liakhovetski" <g.liakhovetski@gmx.de>
> À: "Trent Piepho" <xyzzy@speakeasy.org>
> Cc: "Robert Jarzmik" <robert.jarzmik@free.fr>, mike@compulab.co.il, "Linux Media Mailing List" <linux-media@vger.kernel.org>
> Envoyé: Jeudi 5 Mars 2009 23h15:03 GMT +01:00 Amsterdam / Berlin / Berne / Rome / Stockholm / Vienne
> Objet: Re: [PATCH 1/4] pxa_camera: Remove YUV planar formats hole
>
> > Yes, adjusting both is also what I was suggesting in my original review.
> > How about aligning the bigger of the two to 4 bytes and the smaller to 2?
>
> Yes, sounds good.
>
> I remade my calculations :
>  - if (width x height % 8 == 0) :
>    => frame size = width x height x 2
>    => U plane size = frame size / 4 = width x height /2
>    => U plane size is a multiple of 4
>       As the last DMA load from QIF fifo will return 8 bytes (and not 4 as
>       we would expect, cf. PXA Developer's Manual, 27.4.4.1), this is not
>       good.

So it's only a requirement that the total size of all three planes put
together is a multiple of 8?  Or is it a requirement that each plane is a
multiple of 8?  And are you using 4:2:2 or 4:2:0?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/4] pxa_camera: Remove YUV planar formats hole
       [not found] <1402002204.2045281236327939805.JavaMail.root@zimbra20-e3.priv.proxad.net>
@ 2009-03-06  8:26 ` robert.jarzmik
  2009-03-06  9:56   ` Trent Piepho
  2009-03-06 18:55   ` Guennadi Liakhovetski
  0 siblings, 2 replies; 33+ messages in thread
From: robert.jarzmik @ 2009-03-06  8:26 UTC (permalink / raw)
  To: Guennadi Liakhovetski; +Cc: mike, Linux Media Mailing List, Trent Piepho

----- Mail Original -----
De: "Guennadi Liakhovetski" <g.liakhovetski@gmx.de>
À: "Trent Piepho" <xyzzy@speakeasy.org>
Cc: "Robert Jarzmik" <robert.jarzmik@free.fr>, mike@compulab.co.il, "Linux Media Mailing List" <linux-media@vger.kernel.org>
Envoyé: Jeudi 5 Mars 2009 23h15:03 GMT +01:00 Amsterdam / Berlin / Berne / Rome / Stockholm / Vienne
Objet: Re: [PATCH 1/4] pxa_camera: Remove YUV planar formats hole

> Yes, adjusting both is also what I was suggesting in my original review. 
> How about aligning the bigger of the two to 4 bytes and the smaller to 2? 

Yes, sounds good.

I remade my calculations :
 - if (width x height % 8 == 0) :
   => frame size = width x height x 2
   => U plane size = frame size / 4 = width x height /2
   => U plane size is a multiple of 4
      As the last DMA load from QIF fifo will return 8 bytes (and not 4 as 
      we would expect, cf. PXA Developer's Manual, 27.4.4.1), this is not
      good.

This implies that even if DMA is 8 bytes aligned, width x height should be a multiple of 16, not 8 as I stated in the first git comment. So that would align :
 - width on 4 bytes (aligning meaning the lowest multiple of 4 below or equal to width)
 - and height on 4 bytes (aligning meaning the lowest multiple of 4 below or equal to height)

Do we have an agreement on that specification, so that I can amend the code accordingly ?

Cheers.

--
Robert

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2009-03-12 22:12 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-03-05 19:45 [PATCH 0/4] pxa_camera: Redesign DMA handling Robert Jarzmik
2009-03-05 19:45 ` [PATCH 1/4] pxa_camera: Remove YUV planar formats hole Robert Jarzmik
2009-03-05 19:45   ` [PATCH 2/4] pxa_camera: Redesign DMA handling Robert Jarzmik
2009-03-05 19:45     ` [PATCH 3/4] pxa_camera: Coding style sweeping Robert Jarzmik
2009-03-05 19:45       ` [PATCH 4/4] pxa_camera: Fix overrun condition on last buffer Robert Jarzmik
2009-03-09 11:39         ` Guennadi Liakhovetski
2009-03-09 19:16           ` Robert Jarzmik
2009-03-11 18:31         ` Guennadi Liakhovetski
2009-03-12 21:36           ` Robert Jarzmik
2009-03-12 22:12             ` Guennadi Liakhovetski
2009-03-09 11:35     ` [PATCH 2/4] pxa_camera: Redesign DMA handling Guennadi Liakhovetski
2009-03-09 20:50       ` Robert Jarzmik
2009-03-09 23:14         ` Guennadi Liakhovetski
2009-03-10 21:46           ` Robert Jarzmik
2009-03-11 18:25             ` Guennadi Liakhovetski
2009-03-11 19:45               ` Robert Jarzmik
2009-03-11 20:24                 ` Robert Jarzmik
2009-03-11 21:21             ` Robert Jarzmik
2009-03-05 20:29   ` [PATCH 1/4] pxa_camera: Remove YUV planar formats hole Guennadi Liakhovetski
2009-03-05 21:10     ` Robert Jarzmik
2009-03-05 21:22       ` Trent Piepho
2009-03-05 22:15         ` Guennadi Liakhovetski
2009-03-06  9:30           ` Trent Piepho
2009-03-09 10:45   ` Guennadi Liakhovetski
2009-03-09 19:13     ` Robert Jarzmik
2009-03-10 18:33       ` Trent Piepho
     [not found] <1402002204.2045281236327939805.JavaMail.root@zimbra20-e3.priv.proxad.net>
2009-03-06  8:26 ` robert.jarzmik
2009-03-06  9:56   ` Trent Piepho
2009-03-06 18:55   ` Guennadi Liakhovetski
2009-03-06 23:12     ` Trent Piepho
2009-03-06 23:30       ` Robert Jarzmik
2009-03-06 23:17     ` Robert Jarzmik
     [not found] <421785551.2131221236346602590.JavaMail.root@zimbra20-e3.priv.proxad.net>
2009-03-06 13:39 ` robert.jarzmik

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.