On Tue, Nov 27, 2018 at 05:30:00PM +0100, Jernej Škrabec wrote:
> > > > +static void _cedrus_write_ref_list(struct cedrus_ctx *ctx,
> > > > +				   struct cedrus_run *run,
> > > > +				   const u8 *ref_list, u8 num_ref,
> > > > +				   enum cedrus_h264_sram_off sram)
> > > > +{
> > > > +	const struct v4l2_ctrl_h264_decode_param *decode = run->h264.decode_param;
> > > > +	struct vb2_queue *cap_q = &ctx->fh.m2m_ctx->cap_q_ctx.q;
> > > > +	struct cedrus_dev *dev = ctx->dev;
> > > > +	u32 sram_array[CEDRUS_MAX_REF_IDX / sizeof(u32)];
> > > > +	unsigned int size, i;
> > > > +
> > > > +	memset(sram_array, 0, sizeof(sram_array));
> > > > +
> > > > +	for (i = 0; i < num_ref; i += 4) {
> > > > +		unsigned int j;
> > > > +
> > > > +		for (j = 0; j < 4; j++) {
> > >
> > > I don't think you have to complicate with two loops here.
> > > cedrus_h264_write_sram() takes void* and it aligns to 4 anyway. So as long
> > > input buffer is multiple of 4 (u8[CEDRUS_MAX_REF_IDX] qualifies for that),
> > > you can use single for loop with "u8 sram_array[CEDRUS_MAX_REF_IDX]".
> > > This should make code much more readable.
> >
> > This wasn't really about the alignment, but in order to get the
> > offsets in the u32 and the array more easily.
> >
> > Breaking out the loop will make that computation less easy on the eye,
> > so I guess it's very subjective.
> >
>
> For some strange reason, code below fixes decoding issue from one of my test
> samples. This is what I actually meant with 1 loop approach:

Do you have that test sample somewhere accessible?

> static void _cedrus_write_ref_list(struct cedrus_ctx *ctx,
> 				   struct cedrus_run *run,
> 				   const u8 *ref_list, u8 num_ref,
> 				   enum cedrus_h264_sram_off sram)
> {
> 	const struct v4l2_ctrl_h264_decode_param *decode = run->h264.decode_param;
> 	struct vb2_queue *cap_q = &ctx->fh.m2m_ctx->cap_q_ctx.q;
> 	struct cedrus_dev *dev = ctx->dev;
> 	u8 sram_array[CEDRUS_MAX_REF_IDX];
> 	unsigned int i;
>
> 	memset(sram_array, 0, sizeof(sram_array));
> 	num_ref = min(num_ref, (u8)CEDRUS_MAX_REF_IDX);
>
> 	for (i = 0; i < num_ref; i++) {
> 		const struct v4l2_h264_dpb_entry *dpb;
> 		const struct cedrus_buffer *cedrus_buf;
> 		const struct vb2_v4l2_buffer *ref_buf;
> 		unsigned int position;
> 		int buf_idx;
> 		u8 dpb_idx;
>
> 		dpb_idx = ref_list[i];
> 		dpb = &decode->dpb[dpb_idx];
>
> 		if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> 			continue;
>
> 		buf_idx = vb2_find_tag(cap_q, dpb->tag, 0);
> 		if (buf_idx < 0)
> 			continue;
>
> 		ref_buf = to_vb2_v4l2_buffer(ctx->dst_bufs[buf_idx]);
> 		cedrus_buf = vb2_v4l2_to_cedrus_buffer(ref_buf);
> 		position = cedrus_buf->codec.h264.position;
>
> 		sram_array[i] |= position << 1;
> 		if (ref_buf->field == V4L2_FIELD_BOTTOM)
> 			sram_array[i] |= BIT(0);
> 	}
>
> 	cedrus_h264_write_sram(dev, sram, &sram_array, num_ref);
> }
>
> IMO this code is easier to read.

Indeed, thanks!

> > > > +			const struct v4l2_h264_dpb_entry *dpb;
> > > > +			const struct cedrus_buffer *cedrus_buf;
> > > > +			const struct vb2_v4l2_buffer *ref_buf;
> > > > +			unsigned int position;
> > > > +			int buf_idx;
> > > > +			u8 ref_idx = i + j;
> > > > +			u8 dpb_idx;
> > > > +
> > > > +			if (ref_idx >= num_ref)
> > > > +				break;
> > > > +
> > > > +			dpb_idx = ref_list[ref_idx];
> > > > +			dpb = &decode->dpb[dpb_idx];
> > > > +
> > > > +			if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> > > > +				continue;
> > > > +
> > > > +			buf_idx = vb2_find_tag(cap_q, dpb->tag, 0);
> > > > +			if (buf_idx < 0)
> > > > +				continue;
> > > > +
> > > > +			ref_buf = to_vb2_v4l2_buffer(ctx->dst_bufs[buf_idx]);
> > > > +			cedrus_buf = vb2_v4l2_to_cedrus_buffer(ref_buf);
> > > > +			position = cedrus_buf->codec.h264.position;
> > > > +
> > > > +			sram_array[i] |= position << (j * 8 + 1);
> > > > +			if (ref_buf->field == V4L2_FIELD_BOTTOM)
> > >
> > > You never set above flag to buffer so this will be always false.
> >
> > As far as I know, the field is supposed to be set by the userspace.
>
> How? I thought that only flags at queueing buffers can be set and there is no
> bottom/top flag.

https://linuxtv.org/downloads/v4l-dvb-apis/uapi/v4l/buffer.html#c.v4l2_buffer

"Indicates the field order of the image in the buffer, see
v4l2_field. This field is not used when the buffer contains VBI
data. Drivers must set it when type refers to a capture stream,
applications when it refers to an output stream."

My understanding is that the application should set it, since we'll
use the output stream's buffer here. But I might very well be wrong
about it :/

Maxime

-- 
Maxime Ripard, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
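
For reference, a minimal userspace sketch of the packing question discussed
above: whether filling a u8 array one entry at a time gives the same SRAM
image as packing four entries per u32 with a (j * 8) shift. Everything here
is an assumption for illustration and not driver code: MAX_REF_IDX stands in
for CEDRUS_MAX_REF_IDX with an assumed value of 32, and ref_entry(),
pack_words(), pack_bytes() and layouts_match() are hypothetical helpers. The
comparison assumes the 32-bit words end up in memory in little-endian byte
order. Note that the quoted two-loop version indexes sram_array[i] with i
advancing in steps of 4; the sketch uses i / 4 as the word index, which is a
guess at the intent and might be related to the decoding difference
mentioned in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_REF_IDX 32 /* assumed value, stands in for CEDRUS_MAX_REF_IDX */

/* Hypothetical helper: position in bits 7..1, bottom-field flag in bit 0. */
static uint8_t ref_entry(uint8_t position, int bottom)
{
	return (uint8_t)((position << 1) | (bottom ? 1 : 0));
}

/* Two-loop variant: four entries per 32-bit word, word index is i / 4. */
static void pack_words(const uint8_t *pos, const uint8_t *bottom,
		       unsigned int num_ref, uint32_t *words)
{
	unsigned int i, j;

	assert(num_ref <= MAX_REF_IDX);
	memset(words, 0, (MAX_REF_IDX / 4) * sizeof(uint32_t));

	for (i = 0; i < num_ref; i += 4)
		for (j = 0; j < 4 && i + j < num_ref; j++)
			words[i / 4] |=
				(uint32_t)ref_entry(pos[i + j], bottom[i + j]) << (j * 8);
}

/* Single-loop variant: one byte per reference entry. */
static void pack_bytes(const uint8_t *pos, const uint8_t *bottom,
		       unsigned int num_ref, uint8_t *bytes)
{
	unsigned int i;

	assert(num_ref <= MAX_REF_IDX);
	memset(bytes, 0, MAX_REF_IDX);

	for (i = 0; i < num_ref; i++)
		bytes[i] = ref_entry(pos[i], bottom[i]);
}

/*
 * Returns 1 when byte i of the single-loop array equals byte lane (i % 4)
 * of word (i / 4), i.e. when both variants describe the same image once
 * the words are stored in little-endian order.
 */
static int layouts_match(const uint8_t *pos, const uint8_t *bottom,
			 unsigned int num_ref)
{
	uint32_t words[MAX_REF_IDX / 4];
	uint8_t bytes[MAX_REF_IDX];
	unsigned int i;

	pack_words(pos, bottom, num_ref, words);
	pack_bytes(pos, bottom, num_ref, bytes);

	for (i = 0; i < MAX_REF_IDX; i++)
		if (bytes[i] != ((words[i / 4] >> ((i % 4) * 8)) & 0xff))
			return 0;

	return 1;
}
```

Calling layouts_match() with two small arrays of positions and bottom-field
flags is enough to compare the variants on any host, since the byte lanes
are extracted with shifts instead of relying on the host's endianness.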