From: Laurent Pinchart
To: Nicolas Dufresne
Cc: Mauro Carvalho Chehab, LMML, Wim Taymans, schaller@redhat.com
Subject: Re: [ANN] Meeting to discuss improvements to support MC-based cameras on generic apps
Date: Fri, 18 May 2018 11:15:39 +0300
Message-ID: <3216261.G88TfqiCiH@avalon>
In-Reply-To: <644920d91d1f69d659f233c6a52382d3f919babc.camel@ndufresne.ca>
References: <20180517160708.74811cfb@vento.lan> <644920d91d1f69d659f233c6a52382d3f919babc.camel@ndufresne.ca>

Hi Nicolas,

On Friday, 18 May 2018 00:38:53 EEST Nicolas Dufresne wrote:
> On Thursday, 17 May 2018 at 16:07 -0300, Mauro Carvalho Chehab wrote:
> > Hi all,
> >
> > The goal of this e-mail is to schedule a meeting to discuss
> > improvements to the media subsystem to support complex camera
> > hardware in ordinary applications.
> >
> > The main focus here is to allow supporting devices with MC-based
> > hardware connected to a camera.
> >
> > In short, my proposal is to meet with the interested parties to
> > discuss this issue during the Open Source Summit in Japan, between
> > June 19-22, in Tokyo.
> >
> > I'd like to know who is interested in joining us for such a
> > meeting, and to hear proposals of themes for discussion.
> >
> > I'm enclosing a detailed description of the problem, in order to
> > allow the interested parties to be on the same page.
>
> It's unlikely I'll be able to attend this meeting, but I'd like to
> provide some initial input. Find inline some clarifications on why
> libv4l2 is disabled by default in GStreamer, as it's not just about
> performance.
>
> A major aspect that is totally absent from this mail is PipeWire.
> With the advent of sandboxed applications, there is a need to control
> access to cameras through a daemon. The same daemon is also used to
> control access to screen capture on Wayland (instead of letting any
> random application capture your screen, like on X11). The effort is
> led by the desktop team at Red Hat (folks CCed). PipeWire already has
> native V4L2 support and is integrated into GStreamer in a way that it
> can totally replace the V4L2 capture component there. PipeWire is
> plugin-based, so more types of camera support (including proprietary
> ones) can be added.

One issue that has been worrying me for the past five years or so is how
to ensure that we will continue having open-source camera support in the
future. PipeWire is just a technology and as such can be used in good or
evil ways, but as a community we need to care about the availability of
open solutions.

So far, by pushing the V4L2 API as the proper way to support cameras, we
have tried to resist the natural inclination of vendors to close
everything, as implementing a closed-source kernel driver isn't an option
that most would consider. Of course, the drawback is that some vendors
have simply decided not to care about upstream camera support.
If we move the camera API one level up to userspace (whether the API will
be defined by PipeWire, by libv4l or by something else), we'll make it
easier for vendors not to play along. My big question is how to prevent
that. I think there's still value in mandating V4L2 as the only API for
cameras, and in ensuring that we support multiple userspace multimedia
stacks, not just PipeWire (this is already done in a way, as I don't
foresee Android moving away from their camera HAL in the near future).
That will likely not be enough, and I'd like to hear other people's
opinions on this topic.

I would like to emphasize that I don't expect vendors to open the
implementation of their 3A algorithms, and I'm not actually concerned
about that part. If that's the only part shipped as closed-source, and if
the hardware operation is documented (ideally in a public datasheet, but
at a minimum with proper documentation of the custom ioctls used to
configure the hardware), then the community will have the opportunity to
implement an open-source 3A library. My main concern is thus about all
components other than the 3A library.

> A remote daemon can also provide streams, as is the case for
> compositors and screen casting. An extra benefit is that you can have
> multiple applications reading frames from the same camera. It also
> allows sandboxed applications (which do not have access to /dev) to
> use cameras. PipeWire is much more than that, but let's focus on
> that.
>
> This is the direction we are heading on "generic" / desktop Linux.
> Porting Firefox and Chrome is obviously planned, as these beasts are
> clear candidates for being sandboxed and require the screen sharing
> feature for WebRTC.
>
> In this context, proprietary or HW-specific algorithms could be
> implemented in userspace as PipeWire plugins, and applications will
> then automatically be able to enumerate and use them. I'm not saying
> the libv4l2 stuff is not needed short term, but it's just a short
> term thing in my opinion.
>
> > 1. Introduction
> > ===============
> >
> > 1.1 V4L2 Kernel aspects
> > -----------------------
> >
> > The media subsystem supports two types of devices:
> >
> > - "Traditional" media hardware, supported via the V4L2 API. On such
> >   hardware, opening a single device node (usually /dev/video0) is
> >   enough to control the entire device. We call these devnode-based
> >   devices.
> >
> > - Media-controller-based devices. On those devices, there are
> >   several /dev/video? nodes and several /dev/v4l2-subdev? nodes,
> >   plus a media controller device node (usually /dev/media0). We
> >   call these mc-based devices. Controlling the hardware requires
> >   opening the media device (/dev/media0), setting up the pipeline
> >   and adjusting the sub-devices via /dev/v4l2-subdev?. Only
> >   streaming is done via /dev/video?.
> >
> > All "standard" media applications, including open source ones
> > (Camorama, Cheese, Xawtv, Firefox, Chromium, ...) and closed source
> > ones (Skype, Chrome, ...) support devnode-based devices.
> >
> > Support for mc-based devices currently requires a specialized
> > application to prepare the device for usage (set up pipelines,
> > adjust hardware controls, etc.). Once the pipeline is set, streaming
> > goes via /dev/video?, although usually some /dev/v4l2-subdev?
> > devnodes should also be opened, in order to implement algorithms
> > designed to make video quality reasonable. On such devices, it is
> > not uncommon for the device node used by the application to be a
> > random number (with the OMAP3 driver, it is typically either
> > /dev/video4 or /dev/video6).
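As a side note, for readers who haven't dealt with MC-based devices
before, the pipeline setup described above boils down to something along
the following lines. This is only a sketch: the entity IDs, pad numbers
and media bus code are made up for illustration (real applications
discover them via MEDIA_IOC_ENUM_ENTITIES and MEDIA_IOC_ENUM_LINKS), and
error handling is omitted.

#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/media.h>
#include <linux/v4l2-subdev.h>

int main(void)
{
	/* Enable a link between two entities through the media device. */
	int mfd = open("/dev/media0", O_RDWR);
	struct media_link_desc link;

	memset(&link, 0, sizeof(link));
	link.source.entity = 16;	/* hypothetical sensor entity ID */
	link.source.index = 0;
	link.source.flags = MEDIA_PAD_FL_SOURCE;
	link.sink.entity = 5;		/* hypothetical ISP entity ID */
	link.sink.index = 0;
	link.sink.flags = MEDIA_PAD_FL_SINK;
	link.flags = MEDIA_LNK_FL_ENABLED;
	ioctl(mfd, MEDIA_IOC_SETUP_LINK, &link);

	/* Configure the format on a sub-device pad. */
	int sfd = open("/dev/v4l2-subdev0", O_RDWR);
	struct v4l2_subdev_format fmt;

	memset(&fmt, 0, sizeof(fmt));
	fmt.which = V4L2_SUBDEV_FORMAT_ACTIVE;
	fmt.pad = 0;
	fmt.format.width = 640;
	fmt.format.height = 480;
	fmt.format.code = MEDIA_BUS_FMT_SGRBG10_1X10;
	ioctl(sfd, VIDIOC_SUBDEV_S_FMT, &fmt);

	/* Streaming itself then goes through the /dev/video? node with
	 * the usual VIDIOC_REQBUFS/VIDIOC_STREAMON sequence. */
	return 0;
}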
> > One example of such hardware is the OMAP3-based hardware:
> > http://www.infradead.org/~mchehab/mc-next-gen/omap3-igepv2-with-tvp5150.png
> >
> > In the picture, there's a graph with the hardware blocks in
> > blue/dark blue and the corresponding devnode interfaces in yellow.
> >
> > The mc-based approach was taken when support for the Nokia N9/N900
> > cameras was added (which have an OMAP3 SoC). It is required because
> > the camera hardware on the SoC comes with a media processor (ISP),
> > which does a lot more than just capturing, allowing complex
> > algorithms to enhance image quality at runtime.
> >
> > Those algorithms are known as 3A - an acronym for 3 other acronyms:
> > - AE (Auto Exposure);
> > - AF (Auto Focus);
> > - AWB (Auto White Balance).
> >
> > Setting up a camera with such ISPs is harder because the pipelines
> > to be set actually depend on the requirements of those 3A
> > algorithms. Also, the 3A algorithms usually use some
> > chipset-specific userspace API that exports image properties,
> > calculated by the ISP, to speed up their convergence.
> >
> > Btw, the 3A algorithms are usually IP-protected, provided by
> > vendors as binary-only blobs, although there are a few OSS
> > implementations.
> >
> > 1.2 V4L2 userspace aspects
> > --------------------------
> >
> > Back when USB cameras were introduced, the hardware was really
> > simple: it had a CCD camera sensor and a chip that bridges the data
> > through USB. CCD camera sensors typically provide data using a
> > Bayer format, but they usually have their own proprietary ways to
> > pack the data, in order to reduce the USB bandwidth (the original
> > cameras were USB 1.1).
> >
> > So, V4L2 has a myriad of different formats, in order to match each
> > CCD camera sensor format. At the end of the day, applications were
> > able to use only a subset of the available hardware, since they
> > needed to ship format converters for all the formats the developer
> > chose to support (usually a very small subset of the available
> > ones).
> >
> > To end this mess, a userspace library called libv4l was written. It
> > supports all those proprietary formats, so applications can use an
> > RGB or YUV format without needing to worry about conversions.
> >
> > The way it works is by adding wrappers to system calls: open,
> > close, ioctl, mmap, munmap. So, converting an application to use it
> > is really simple: in the source code, all that is needed is to
> > prefix the existing calls with "v4l2_", e.g. v4l2_open, v4l2_close,
> > etc.
> >
> > All the open source apps we know of now support libv4l. In a few
> > (like GStreamer), support for it is optional.
> >
> > In order to support closed source applications, another wrapper was
> > added, allowing any closed source application to use it via
> > LD_PRELOAD. For example, using Skype with it is as simple as
> > calling it with:
> >
> > $ LD_PRELOAD=/usr/lib/libv4l/v4l1compat.so /usr/bin/skypeforlinux
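For those who haven't seen the wrapper approach in action, a minimal
capture sketch using the libv4l2 entry points could look like this (link
with -lv4l2; the device path, the resolution and the lack of error
handling are simplifications for illustration):

#include <fcntl.h>
#include <string.h>
#include <libv4l2.h>
#include <linux/videodev2.h>

int main(void)
{
	/* Same call sequence as plain V4L2, with the v4l2_ prefix
	 * added. libv4l2 converts the device's native format to the
	 * RGB24 format requested here if needed. */
	int fd = v4l2_open("/dev/video0", O_RDWR);
	struct v4l2_format fmt;
	static unsigned char buffer[640 * 480 * 3];

	memset(&fmt, 0, sizeof(fmt));
	fmt.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
	fmt.fmt.pix.width = 640;
	fmt.fmt.pix.height = 480;
	fmt.fmt.pix.pixelformat = V4L2_PIX_FMT_RGB24;
	v4l2_ioctl(fd, VIDIOC_S_FMT, &fmt);

	v4l2_read(fd, buffer, sizeof(buffer));	/* one converted frame */

	v4l2_close(fd);
	return 0;
}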
> > 2. Current problems
> > ===================
> >
> > 2.1 Libv4l can slow image handling
> > ----------------------------------
> >
> > Nowadays, almost all new "simple" cameras are connected via USB
> > using the UVC class (USB Video Class). UVC standardized the allowed
> > formats, and most apps just implement them. The UVC hardware is
> > more complex, having format converters inside it. So, for most
> > usages, format conversion isn't needed anymore.
> >
> > The need to do format conversion in software makes libv4l slow,
> > requiring lots of CPU usage in order to convert 4K or 8K formats,
> > and it is even worse with 3D cameras.
> >
> > Also, due to the need to support LD_PRELOAD, zero-copy buffer
> > sharing via DMABUF currently doesn't work with libv4l.
> >
> > Right now, GStreamer defaults to not enabling libv4l2, mainly due
> > to those performance issues.
>
> I need to clarify a little bit why we disabled libv4l2 in GStreamer,
> as it's not only for performance reasons; there are a couple of major
> issues in the libv4l2 implementation that get in the way. Just a
> short list:
>
> - Crashes when CREATE_BUFS is being used
> - Crashes in the JPEG decoder (when frames are corrupted)
> - Apps exporting DMABufs need to be aware of the emulation, otherwise
>   the exported DMABufs are in the original format
> - RW emulation only initializes the queue on the first read (causing
>   userspace poll() to fail)
> - The signature of v4l2_mmap does not match mmap() (minor)
> - Colorimetry does not seem to be emulated when converting
> - Sub-optimal locking (at least the deadlocks were fixed)

Do you see any point in that list that couldn't be fixed in libv4l?

> Except for the colorimetry (which causes negotiation failures, as it
> leads to invalid colorimetry / format matches), these issues are
> already worked around in GStreamer, but with a loss of features of
> course. There are other cases where something worked without libv4l2
> but didn't work with it; we haven't tracked down the cause.
>
> For people working in this area: since 1.14, you can enable libv4l2
> at run-time using the environment variable GST_V4L2_USE_LIBV4L2=1.
>
> > 2.2 Modern hardware is starting to come with "complex" camera ISPs
> > ------------------------------------------------------------------
> >
> > While mc-based devices were limited to SoCs, it was easy to
> > "delegate" the task of talking to the hardware to the embedded
> > hardware designers.
> >
> > However, this is changing. The Dell Latitude 5285 laptop is a
> > standard PC with a Core i3, i5 or i7 CPU that comes with the Intel
> > IPU3 ISP hardware[1].
> >
> > [1] https://www.spinics.net/lists/linux-usb/msg167478.html
> >
> > There, instead of a USB camera, the hardware is equipped with an
> > MC-based ISP connected to its camera. Currently, despite having a
> > kernel driver for it, the camera doesn't work with any userspace
> > application.
> >
> > I'm also aware of other projects that are considering the usage of
> > mc-based devices for non-dedicated hardware.
> >
> > 3. How to solve it?
> > ===================
> >
> > That's the main focus of the meeting :-)
> >
> > From a previous discussion I had with the media sub-maintainers,
> > there are at least two actions that seem required. I'm listing them
> > below as a starting point for the discussions, but we may
> > eventually come up with a different approach after the meeting.
> >
> > 3.1 libv4l2 support for mc-based hardware
> > -----------------------------------------
> >
> > In order to support such hardware, we'll need to do some redesign,
> > mainly in libv4l2[2].
> > The idea is to work on a new API for libv4l2 that will allow
> > splitting the format conversion into a separate part, add support
> > for DMABUF, and come up with a way for the library to work
> > transparently with both devnode-based and mc-based hardware.
> >
> > That involves adding the capability in libv4l to set up hardware
> > pipelines and to propagate controls among their sub-devices.
> > Eventually, part of it will be done in the kernel.
> >
> > That should improve the library's performance and would allow
> > GStreamer to use it by default, without compromising performance.
> >
> > [2] I don't rule out that some kernel changes could also be part of
> > the solution, like, for example, doing control propagation along
> > the pipeline in simple use case scenarios.
> >
> > 3.2 libv4l2 support for 3A algorithms
> > -------------------------------------
> >
> > The 3A algorithm handling is highly dependent on the hardware. The
> > idea here is to allow libv4l to have a set of 3A algorithms that
> > will be specific to certain mc-based hardware. Ideally, this should
> > be added in a way that allows external closed-source algorithms to
> > run as well.

-- 
Regards,

Laurent Pinchart