On Wed 2017-05-03 20:05:56, Russell King - ARM Linux wrote:
> On Wed, Apr 26, 2017 at 06:43:54PM +0300, Ivaylo Dimitrov wrote:
> > >+static int get_luminosity_bayer10(uint16_t *buf, const struct v4l2_format *fmt)
> > >+{
> > >+    long long avg_lum = 0;
> > >+    int x, y;
> > >+
> > >+    buf += fmt->fmt.pix.height * fmt->fmt.pix.bytesperline / 4 +
> > >+        fmt->fmt.pix.width / 4;
> > >+
> > >+    for (y = 0; y < fmt->fmt.pix.height / 2; y++) {
> > >+        for (x = 0; x < fmt->fmt.pix.width / 2; x++)
> >
> > That would take some time :). AIUI, we have NEON support in ARM kernels
> > (CONFIG_KERNEL_MODE_NEON), I wonder if it makes sense (me) to convert the
> > above loop to NEON-optimized when it comes to it? Are there any drawbacks in
> > using NEON code in kernel?
>
> Using neon without the VFP state saved and restored corrupts userspace's
> FP state. So, you have to save the entire VFP state to use neon in kernel
> mode. There are helper functions for this: kernel_neon_begin() and
> kernel_neon_end().
...
> Given that, do we really want to be walking over multi-megabytes of image
> data in the kernel with preemption disabled - it sounds like a recipe for
> a very sluggish system. I think this should (and can only sensibly be
> done) in userspace.

The patch was for libv4l2. (And I explained why we don't need to
overoptimize this.)

                                                                Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
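
For readers following along, here is a minimal userspace sketch of the kind of loop under discussion: a plain scalar average over the central half-width by half-height window of a frame with 16-bit samples. It is not the loop body from the patch (that part is cut off in the quote above). The name center_mean_u16() and its parameters are hypothetical, and the window placement (starting a quarter of the way in on each axis) is an assumption made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: mean sample value over
 * the central (width/2) x (height/2) window of a frame that stores one
 * 16-bit sample per pixel. bytesperline is the row pitch in bytes.
 * Returns -1 if the window would be empty.
 */
static int center_mean_u16(const uint16_t *buf, unsigned int width,
                           unsigned int height, unsigned int bytesperline)
{
    size_t stride = bytesperline / sizeof(uint16_t); /* samples per row */
    unsigned int x0 = width / 4, y0 = height / 4;    /* window origin */
    unsigned int w = width / 2, h = height / 2;      /* window size */
    unsigned long long sum = 0;
    unsigned int x, y;

    if (w == 0 || h == 0)
        return -1;

    for (y = 0; y < h; y++) {
        const uint16_t *row = buf + (size_t)(y0 + y) * stride + x0;

        for (x = 0; x < w; x++)
            sum += row[x];
    }

    /* Integer mean; 10-bit samples leave plenty of headroom in sum. */
    return (int)(sum / ((unsigned long long)w * h));
}
```

Since this runs in an ordinary process, it involves no kernel_neon_begin()/kernel_neon_end() bracketing and no preemption-disabled section.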