All of lore.kernel.org
 help / color / mirror / Atom feed
From: Philipp Zabel <p.zabel@pengutronix.de>
To: Steve Longerbeam <steve_longerbeam@mentor.com>
Cc: Steve Longerbeam <slongerbeam@gmail.com>,
	plagnioj@jcrosoft.com, tomi.valkeinen@ti.com,
	dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	linux-fbdev@vger.kernel.org
Subject: Re: [PATCH v4 3/4] gpu: ipu-ic: Add complete image conversion support with tiling
Date: Fri, 16 Sep 2016 16:16:02 +0200	[thread overview]
Message-ID: <1474035362.2491.59.camel@pengutronix.de> (raw)
In-Reply-To: <fa6812fb-e3b4-9a39-e956-42b3253142f3@mentor.com>

Hi Steve,

thanks for the update.

Am Mittwoch, den 14.09.2016, 18:45 -0700 schrieb Steve Longerbeam:
> Hi Philipp,
> 
> 
> On 09/06/2016 02:26 AM, Philipp Zabel wrote:
> > Hi Steve,
> >
> > Am Mittwoch, den 17.08.2016, 17:50 -0700 schrieb Steve Longerbeam:
> >> This patch implements complete image conversion support to ipu-ic,
> >> with tiling to support scaling to and from images up to 4096x4096.
> >> Image rotation is also supported.
> >>
> >> The internal API is subsystem agnostic (no V4L2 dependency except
> >> for the use of V4L2 fourcc pixel formats).
> >>
> >> Callers prepare for image conversion by calling
> >> ipu_image_convert_prepare(), which initializes the parameters of
> >> the conversion.
> > ... and possibly allocates intermediate buffers for rotation support.
> > This should be documented somewhere, with a node that v4l2 users should
> > be doing this during REQBUFS.
> 
> I added comment headers for all the image conversion prototypes.
> It caused bloat in imx-ipu-v3.h, so I moved it to a new header:
> include/video/imx-image-convert.h, but let me know if we should put
> this somewhere else and/or under Documentation/ somewhere.

I think that is the right place already. imx-image-convert.h could be
renamed to imx-ipu-image-convert.h, to make clear that this is about the
IPU image converter.

> >>   The caller passes in the ipu_ic task to use for
> >> the conversion, the input and output image formats, a rotation mode,
> >> and a completion callback and completion context pointer:
> >>
> >> struct image_converter_ctx *
> >> ipu_image_convert_prepare(struct ipu_ic *ic,
> >>                            struct ipu_image *in, struct ipu_image *out,
> >>                            enum ipu_rotate_mode rot_mode,
> >>                            image_converter_cb_t complete,
> >>                            void *complete_context);
> > As I commented on the other patch, I think the image_convert functions
> > should use a separate handle for the image conversion queues that sit on
> > top of the ipu_ic task handles.
> 
> Here is a new prototype I came up with:
> 
> struct ipu_image_convert_ctx *
> ipu_image_convert_prepare(struct ipu_soc *ipu, enum ipu_ic_task ic_task,
>                struct ipu_image *in, struct ipu_image *out,
>                enum ipu_rotate_mode rot_mode,
>                ipu_image_convert_cb_t complete,
>                void *complete_context);
> 
> In other words, the ipu_ic handle is replaced by the IPU handle and IC task
> that are requested for carrying out the conversion.

Looks good to me for now.

> The image converter will acquire the ipu_ic handle internally, whenever 
> there
> are queued contexts to that IC task (which I am calling a 'struct 
> ipu_image_convert_chan').
> This way the IC handle can be shared by all contexts using that IC task. 
> After all
> contexts have been freed from the (struct 
> ipu_image_convert_chan)->ctx_list queue,
> the ipu_ic handle is freed.
> 
> The ipu_ic handle is acquired in get_ipu_resources() and freed in 
> release_ipu_resources(),
> along with all the other IPU resources that *could possibly be needed* 
> in that
> ipu_image_convert_chan by future contexts (*all* idmac channels, *all* 
> irqs).

Ok.

[...]
> >> +#define MIN_W     128
> >> +#define MIN_H     128
> > Where does this minimum come from?
> 
> Nowhere really :) This is just some sane minimums, to pass
> to clamp_align() when aligning input/output width/height in
> ipu_image_convert_adjust().

Let's use hardware minimum in the low level code. Sane defaults are for
the V4L2 API. Would that be 8x2 pixels per input tile?

> >> +struct ic_task_channels {
> >> +	int in;
> >> +	int out;
> >> +	int rot_in;
> >> +	int rot_out;
> >> +	int vdi_in_p;
> >> +	int vdi_in;
> >> +	int vdi_in_n;
> > The vdi channels are unused.
> 
> Well, I'd prefer to keep the VDI channels. It's quite possible we
> can add motion compensated deinterlacing support using the
> PRP_VF task to the image converter in the future.

Indeed.

> >> +struct image_converter_ctx {
> >> +	struct image_converter *cvt;
> >> +
> >> +	image_converter_cb_t complete;
> >> +	void *complete_context;
> >> +
> >> +	/* Source/destination image data and rotation mode */
> >> +	struct ipu_ic_image in;
> >> +	struct ipu_ic_image out;
> >> +	enum ipu_rotate_mode rot_mode;
> >> +
> >> +	/* intermediate buffer for rotation */
> >> +	struct ipu_ic_dma_buf rot_intermediate[2];
> > No need to change it now, but I assume these could be per IC task
> > instead of per context.
> 
> Actually no. The rotation intermediate buffers have the dimension
> of a single tile, so they must remain in the context struct.

I see. The per task intermediate buffer would have to be the maximum
size, so this would only ever make sense when rotating multiple large
RGB streams simultaneously. I think we can reasonably ignore this use
case.

[...]
> >> +		.fourcc	= V4L2_PIX_FMT_RGB565,
> >> +		.bpp    = 16,
> > bpp is only ever used in bytes, not bits (always divided by 8).
> > Why not make this bytes_per_pixel or pixel_stride = 2.
> 
> Actually bpp is used to calculate *total* tile sizes and *total* bytes
> per line. For the planar 4:2:0 formats that means it must be specified
> in bits.

Ok for total size of chroma subsampled planar formats.

[...]
> > Most of the following code seems to be running under one big spinlock.
> > Is this really necessary?
> 
> You're right, convert_stop(), convert_start(), and init_idmac_channel() are
> only calling the ipu_ic lower level primitives. So they don't require 
> the irqlock.
> I did remove the "hold irqlock when calling" comment for those. However
> they are called embedded in the irq handling, so it would be cumbersome
> to drop the lock there only because they don't need it. We can revisit the
> lock handling later if you see some room for optimization there.

Alright, let's call it future performance optimization potential.

> >> +static irqreturn_t ipu_ic_norotate_irq(int irq, void *data)
> >> +{
[...]
> >> +	if (ipu_rot_mode_is_irt(ctx->rot_mode)) {
> >> +		/* this is a rotation operation, just ignore */
> >> +		spin_unlock_irqrestore(&cvt->irqlock, flags);
> >> +		return IRQ_HANDLED;
> >> +	}
> > Why enable the out_chan EOF irq at all when using the IRT mode?
> 
> Because (see above), all the IPU resources that might be needed
> for any conversion context that is queued to a image conversion
> channel (IC task) are acquired when the first context is queued,
> including rotation resources. So by acquiring the non-rotation EOF
> irq, it will get fielded even for rotation conversions, so we have to
> handle it.

There is nothing wrong with acquiring the irq. It could still be
disabled while it is not needed.

> >> +/* Adjusts input/output images to IPU restrictions */
> >> +int ipu_image_convert_adjust(struct ipu_image *in, struct ipu_image *out,
> >> +			     enum ipu_rotate_mode rot_mode)
> >> +{
> >> +	const struct ipu_ic_pixfmt *infmt, *outfmt;
> >> +	unsigned int num_in_rows, num_in_cols;
> >> +	unsigned int num_out_rows, num_out_cols;
> >> +	u32 w_align, h_align;
> >> +
> >> +	infmt = ipu_ic_get_format(in->pix.pixelformat);
> >> +	outfmt = ipu_ic_get_format(out->pix.pixelformat);
> >> +
> >> +	/* set some defaults if needed */
> > Is this our task at all?
> 
> ipu_image_convert_adjust() is meant to be called by v4l2 try_format(),
> which should never return EINVAL but should return a supported format
> when the passed format is not supported. So I added this here to return
> some default pixel formats and width/heights if needed.

I'd prefer to move this into the mem2mem driver try_format, then.

The remaining issues are minor and can be fixed later.
I'll apply this as is.

regards
Philipp

WARNING: multiple messages have this Message-ID (diff)
From: Philipp Zabel <p.zabel@pengutronix.de>
To: Steve Longerbeam <steve_longerbeam@mentor.com>
Cc: Steve Longerbeam <slongerbeam@gmail.com>,
	plagnioj@jcrosoft.com, tomi.valkeinen@ti.com,
	dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	linux-fbdev@vger.kernel.org
Subject: Re: [PATCH v4 3/4] gpu: ipu-ic: Add complete image conversion support with tiling
Date: Fri, 16 Sep 2016 14:16:02 +0000	[thread overview]
Message-ID: <1474035362.2491.59.camel@pengutronix.de> (raw)
In-Reply-To: <fa6812fb-e3b4-9a39-e956-42b3253142f3@mentor.com>

Hi Steve,

thanks for the update.

Am Mittwoch, den 14.09.2016, 18:45 -0700 schrieb Steve Longerbeam:
> Hi Philipp,
> 
> 
> On 09/06/2016 02:26 AM, Philipp Zabel wrote:
> > Hi Steve,
> >
> > Am Mittwoch, den 17.08.2016, 17:50 -0700 schrieb Steve Longerbeam:
> >> This patch implements complete image conversion support to ipu-ic,
> >> with tiling to support scaling to and from images up to 4096x4096.
> >> Image rotation is also supported.
> >>
> >> The internal API is subsystem agnostic (no V4L2 dependency except
> >> for the use of V4L2 fourcc pixel formats).
> >>
> >> Callers prepare for image conversion by calling
> >> ipu_image_convert_prepare(), which initializes the parameters of
> >> the conversion.
> > ... and possibly allocates intermediate buffers for rotation support.
> > This should be documented somewhere, with a node that v4l2 users should
> > be doing this during REQBUFS.
> 
> I added comment headers for all the image conversion prototypes.
> It caused bloat in imx-ipu-v3.h, so I moved it to a new header:
> include/video/imx-image-convert.h, but let me know if we should put
> this somewhere else and/or under Documentation/ somewhere.

I think that is the right place already. imx-image-convert.h could be
renamed to imx-ipu-image-convert.h, to make clear that this is about the
IPU image converter.

> >>   The caller passes in the ipu_ic task to use for
> >> the conversion, the input and output image formats, a rotation mode,
> >> and a completion callback and completion context pointer:
> >>
> >> struct image_converter_ctx *
> >> ipu_image_convert_prepare(struct ipu_ic *ic,
> >>                            struct ipu_image *in, struct ipu_image *out,
> >>                            enum ipu_rotate_mode rot_mode,
> >>                            image_converter_cb_t complete,
> >>                            void *complete_context);
> > As I commented on the other patch, I think the image_convert functions
> > should use a separate handle for the image conversion queues that sit on
> > top of the ipu_ic task handles.
> 
> Here is a new prototype I came up with:
> 
> struct ipu_image_convert_ctx *
> ipu_image_convert_prepare(struct ipu_soc *ipu, enum ipu_ic_task ic_task,
>                struct ipu_image *in, struct ipu_image *out,
>                enum ipu_rotate_mode rot_mode,
>                ipu_image_convert_cb_t complete,
>                void *complete_context);
> 
> In other words, the ipu_ic handle is replaced by the IPU handle and IC task
> that are requested for carrying out the conversion.

Looks good to me for now.

> The image converter will acquire the ipu_ic handle internally, whenever 
> there
> are queued contexts to that IC task (which I am calling a 'struct 
> ipu_image_convert_chan').
> This way the IC handle can be shared by all contexts using that IC task. 
> After all
> contexts have been freed from the (struct 
> ipu_image_convert_chan)->ctx_list queue,
> the ipu_ic handle is freed.
> 
> The ipu_ic handle is acquired in get_ipu_resources() and freed in 
> release_ipu_resources(),
> along with all the other IPU resources that *could possibly be needed* 
> in that
> ipu_image_convert_chan by future contexts (*all* idmac channels, *all* 
> irqs).

Ok.

[...]
> >> +#define MIN_W     128
> >> +#define MIN_H     128
> > Where does this minimum come from?
> 
> Nowhere really :) This is just some sane minimums, to pass
> to clamp_align() when aligning input/output width/height in
> ipu_image_convert_adjust().

Let's use hardware minimum in the low level code. Sane defaults are for
the V4L2 API. Would that be 8x2 pixels per input tile?

> >> +struct ic_task_channels {
> >> +	int in;
> >> +	int out;
> >> +	int rot_in;
> >> +	int rot_out;
> >> +	int vdi_in_p;
> >> +	int vdi_in;
> >> +	int vdi_in_n;
> > The vdi channels are unused.
> 
> Well, I'd prefer to keep the VDI channels. It's quite possible we
> can add motion compensated deinterlacing support using the
> PRP_VF task to the image converter in the future.

Indeed.

> >> +struct image_converter_ctx {
> >> +	struct image_converter *cvt;
> >> +
> >> +	image_converter_cb_t complete;
> >> +	void *complete_context;
> >> +
> >> +	/* Source/destination image data and rotation mode */
> >> +	struct ipu_ic_image in;
> >> +	struct ipu_ic_image out;
> >> +	enum ipu_rotate_mode rot_mode;
> >> +
> >> +	/* intermediate buffer for rotation */
> >> +	struct ipu_ic_dma_buf rot_intermediate[2];
> > No need to change it now, but I assume these could be per IC task
> > instead of per context.
> 
> Actually no. The rotation intermediate buffers have the dimension
> of a single tile, so they must remain in the context struct.

I see. The per task intermediate buffer would have to be the maximum
size, so this would only ever make sense when rotating multiple large
RGB streams simultaneously. I think we can reasonably ignore this use
case.

[...]
> >> +		.fourcc	= V4L2_PIX_FMT_RGB565,
> >> +		.bpp    = 16,
> > bpp is only ever used in bytes, not bits (always divided by 8).
> > Why not make this bytes_per_pixel or pixel_stride = 2.
> 
> Actually bpp is used to calculate *total* tile sizes and *total* bytes
> per line. For the planar 4:2:0 formats that means it must be specified
> in bits.

Ok for total size of chroma subsampled planar formats.

[...]
> > Most of the following code seems to be running under one big spinlock.
> > Is this really necessary?
> 
> You're right, convert_stop(), convert_start(), and init_idmac_channel() are
> only calling the ipu_ic lower level primitives. So they don't require 
> the irqlock.
> I did remove the "hold irqlock when calling" comment for those. However
> they are called embedded in the irq handling, so it would be cumbersome
> to drop the lock there only because they don't need it. We can revisit the
> lock handling later if you see some room for optimization there.

Alright, let's call it future performance optimization potential.

> >> +static irqreturn_t ipu_ic_norotate_irq(int irq, void *data)
> >> +{
[...]
> >> +	if (ipu_rot_mode_is_irt(ctx->rot_mode)) {
> >> +		/* this is a rotation operation, just ignore */
> >> +		spin_unlock_irqrestore(&cvt->irqlock, flags);
> >> +		return IRQ_HANDLED;
> >> +	}
> > Why enable the out_chan EOF irq at all when using the IRT mode?
> 
> Because (see above), all the IPU resources that might be needed
> for any conversion context that is queued to a image conversion
> channel (IC task) are acquired when the first context is queued,
> including rotation resources. So by acquiring the non-rotation EOF
> irq, it will get fielded even for rotation conversions, so we have to
> handle it.

There is nothing wrong with acquiring the irq. It could still be
disabled while it is not needed.

> >> +/* Adjusts input/output images to IPU restrictions */
> >> +int ipu_image_convert_adjust(struct ipu_image *in, struct ipu_image *out,
> >> +			     enum ipu_rotate_mode rot_mode)
> >> +{
> >> +	const struct ipu_ic_pixfmt *infmt, *outfmt;
> >> +	unsigned int num_in_rows, num_in_cols;
> >> +	unsigned int num_out_rows, num_out_cols;
> >> +	u32 w_align, h_align;
> >> +
> >> +	infmt = ipu_ic_get_format(in->pix.pixelformat);
> >> +	outfmt = ipu_ic_get_format(out->pix.pixelformat);
> >> +
> >> +	/* set some defaults if needed */
> > Is this our task at all?
> 
> ipu_image_convert_adjust() is meant to be called by v4l2 try_format(),
> which should never return EINVAL but should return a supported format
> when the passed format is not supported. So I added this here to return
> some default pixel formats and width/heights if needed.

I'd prefer to move this into the mem2mem driver try_format, then.

The remaining issues are minor and can be fixed later.
I'll apply this as is.

regards
Philipp


  reply	other threads:[~2016-09-16 14:16 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-18  0:50 [PATCH v4 0/4] IPUv3 prep for i.MX5/6 v4l2 staging drivers, v4 Steve Longerbeam
2016-08-18  0:50 ` Steve Longerbeam
2016-08-18  0:50 ` [PATCH v4 1/4] gpu: ipu-v3: Add Video Deinterlacer unit Steve Longerbeam
2016-08-18  0:50   ` Steve Longerbeam
2016-08-18  0:50 ` [PATCH v4 2/4] gpu: ipu-v3: Add FSU channel linking support Steve Longerbeam
2016-08-18  0:50   ` Steve Longerbeam
2016-08-18  0:50 ` [PATCH v4 3/4] gpu: ipu-ic: Add complete image conversion support with tiling Steve Longerbeam
2016-08-18  0:50   ` Steve Longerbeam
2016-09-06  9:26   ` Philipp Zabel
2016-09-06  9:26     ` Philipp Zabel
2016-09-06  9:26     ` Philipp Zabel
2016-09-15  1:45     ` Steve Longerbeam
2016-09-15  1:45       ` Steve Longerbeam
2016-09-15  1:45       ` Steve Longerbeam
2016-09-16 14:16       ` Philipp Zabel [this message]
2016-09-16 14:16         ` Philipp Zabel
2016-09-17 18:46         ` Steve Longerbeam
2016-09-17 18:46           ` Steve Longerbeam
2016-09-17 18:46           ` Steve Longerbeam
2016-08-18  0:50 ` [PATCH v4 4/4] gpu: ipu-ic: allow multiple handles to ic Steve Longerbeam
2016-08-18  0:50   ` Steve Longerbeam
2016-09-06  9:26   ` Philipp Zabel
2016-09-06  9:26     ` Philipp Zabel
2016-09-06  9:26     ` Philipp Zabel
2016-08-25 14:17 ` [PATCH v4 0/4] IPUv3 prep for i.MX5/6 v4l2 staging drivers, v4 Tim Harvey
2016-08-25 14:17   ` Tim Harvey
2016-09-05 14:41   ` Fabio Estevam
2016-09-05 14:41     ` Fabio Estevam
2016-09-05 14:41     ` Fabio Estevam
2016-09-06  9:26     ` Philipp Zabel
2016-09-06  9:26       ` Philipp Zabel
2016-09-06  9:26       ` Philipp Zabel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1474035362.2491.59.camel@pengutronix.de \
    --to=p.zabel@pengutronix.de \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=linux-fbdev@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=plagnioj@jcrosoft.com \
    --cc=slongerbeam@gmail.com \
    --cc=steve_longerbeam@mentor.com \
    --cc=tomi.valkeinen@ti.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.