On 09.01.19 at 17:14, Marek Olšák wrote:
On 09.01.19 at 13:36, Marek Olšák wrote:
Looks good, but I'm wondering what the actual improvement is.
No malloc calls and one fewer for loop copying the bo list.
Yeah, but didn't we want to get rid of the bo list completely?
If we have multiple IBs (e.g. gfx + compute) that share a BO list, I think it's faster to send the BO list to the kernel only once.
That's not really faster.
The only thing we save is a single loop over all BOs to look up the pointer for each handle, and that is only a tiny fraction of the overhead.
The majority of the overhead is locking the BOs and reserving space for the submission.
What could really help here is to submit gfx+compute together in just one CS IOCTL. This way we would need the locking and space reservation only once.
It's a bit of work on the kernel side, but certainly doable.
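
To make the argument above concrete, here is a rough toy cost model in Python. It is only a sketch: the two unit costs are made-up illustrative numbers (not measurements), and the function names are hypothetical rather than part of any kernel or libdrm API. It only encodes the claim that the handle lookup is cheap compared to locking the BOs and reserving space.

```python
# Toy cost model of CS submission overhead.
# ASSUMPTION: both unit costs are invented for illustration only.
LOOKUP_COST_PER_BO = 1         # handle -> pointer lookup (assumed cheap)
LOCK_RESERVE_COST_PER_BO = 20  # locking + space reservation (assumed dominant)


def separate_submissions(num_bos, num_ibs, shared_list):
    """One CS ioctl per IB: every ioctl locks and reserves all BOs again.

    With a shared BO list the handle lookup is done once,
    otherwise once per ioctl.
    """
    lookups = num_bos if shared_list else num_bos * num_ibs
    lock_reserve = num_bos * num_ibs
    return (lookups * LOOKUP_COST_PER_BO
            + lock_reserve * LOCK_RESERVE_COST_PER_BO)


def combined_submission(num_bos):
    """All IBs in one CS ioctl: lookup and lock/reserve happen only once."""
    return num_bos * (LOOKUP_COST_PER_BO + LOCK_RESERVE_COST_PER_BO)


if __name__ == "__main__":
    bos, ibs = 1000, 2  # e.g. gfx + compute sharing the same 1000 BOs
    print("per-ioctl list:", separate_submissions(bos, ibs, shared_list=False))
    print("shared list:   ", separate_submissions(bos, ibs, shared_list=True))
    print("one CS ioctl:  ", combined_submission(bos))
```

Under these assumed costs, sharing the BO list saves only the lookup term, while a single combined submission halves the dominant lock/reserve term for two IBs.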