On Fri, Nov 08, 2013 at 08:18:25PM +0100, Martin Sperl wrote:
> On 08.11.2013, at 19:09, Mark Brown wrote:
> > On Fri, Nov 08, 2013 at 06:31:37PM +0100, Martin Sperl wrote:

> > This sounds like an artificial benchmark attempting to saturate the
> > bus rather than a real world use case, as does everything else you
> > mention, and the contributions of the individual changes aren't
> > broken down so it isn't clear what specifically the API change
> > delivers.

> As explained - it is a reasonable use case that you can easily
> trigger. For example: updating the firmware of a 128KB Flash via the
> CAN bus requires about 22528 8-byte CAN packets to transfer just the
> data and some signaling. With 3200 CAN messages/s as the upper limit
> for these 8-byte messages this requires 7.04 seconds to transfer all
> the Flash data.

You mentioned that systems could be constructed but you didn't give
examples of systems that actually do this; in general prolonged bus
saturation tends to be something that people avoid when designing
systems.

> OK, there would be gaps every 44 packets while a flash page gets
> written. But even then, other devices that are currently blocked will
> send their messages while the firmware update is idle. So with more
> nodes in such a situation the bus very likely stays saturated for 10
> seconds.

> So it is IMO a realistic load simulation to take the "automatic
> re-broadcast" as a repeatable scenario.

What I'm missing with this example is the benefit of having an API for
pre-cooked messages - how will it deliver a performance improvement?
Flashing should be an infrequent operation, and I'd expect it to
involve little reuse of messages, reuse being the main thing I'd expect
us to gain from the API change.  I'd also not expect the
controller-specific work to be especially expensive.

> > I'd like to see both a practical use case and specific analysis
> > showing that changing the API is delivering a benefit as opposed to
> > the parts which can be done by improving the implementation of the
> > current API.

> I have already shared that at some point, and it also shows in the
> forum:

You've been doing a bunch of other performance improvements as well, so
it's not clear to me how much of this comes from the API change and how
much comes from changes that can also be made by improving the
implementation, without requiring drivers to be specifically updated to
take advantage of them.

> Does this answer your question and convince you of this being
> realistic?

It's still not clear to me exactly how this works and hence whether the
benefit comes from the API change itself.

> Also my next work is moving to DMA scheduling multiple messages via
> "transfer". This should bring down the CPU utilization even further
> and it should also decrease the context switches as the spi_pump
> thread goes out of the picture... (and that will probably decrease
> the number of overall interrupts as well...)

Right, and simply driving transfers from interrupt rather than task
context probably gets an awfully long way there.  This is the sort of
improvement which will benefit all users of the API - the main reason
I'm so cautious about changing the API is that I don't want to make it
more complex to implement this sort of improvement.
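For reference, taking the quoted figures at face value the arithmetic
behind them is simply (my back-of-the-envelope working):

    22528 frames / 3200 frames per second ~= 7.04 s of saturated bus
    1 / 3200 frames per second            ~= 312 us per frame for the
                                             SPI-attached controller to
                                             be serviced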
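For concreteness, the reuse pattern I understand the pre-cooked message
API to be aimed at looks roughly like the sketch below, written against
the existing client API; the foo_ names, the command byte and the
buffer sizes are invented for illustration:

#include <linux/spi/spi.h>

struct foo_can_priv {
	struct spi_device	*spi;
	struct spi_message	rx_msg;
	struct spi_transfer	rx_xfer;
	u8			tx_buf[14];
	u8			rx_buf[14];
};

/* Build the "read receive buffer" message once, e.g. at probe time. */
static void foo_can_init_rx_msg(struct foo_can_priv *priv)
{
	priv->tx_buf[0] = 0x90;	/* invented "read RX buffer" command */

	spi_message_init(&priv->rx_msg);
	priv->rx_xfer.tx_buf = priv->tx_buf;
	priv->rx_xfer.rx_buf = priv->rx_buf;
	priv->rx_xfer.len = sizeof(priv->tx_buf);
	spi_message_add_tail(&priv->rx_xfer, &priv->rx_msg);
}

/* Submitted unchanged for every received CAN frame. */
static int foo_can_fetch_frame(struct foo_can_priv *priv)
{
	/*
	 * Today the core walks and validates this identical message on
	 * every submission; a prepared message API would let that work
	 * happen once in foo_can_init_rx_msg() instead.
	 */
	return spi_sync(priv->spi, &priv->rx_msg);
}

Whether that per-submission work is expensive enough to justify new API
is exactly the question above.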
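On the interrupt versus task context point, a minimal sketch of what I
mean in a controller driver might be the following; the foo_ names and
register offsets are invented, spi_finalize_current_message() is the
existing core helper:

#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/spi/spi.h>

#define FOO_INT_STATUS	0x08	/* invented register offset */
#define FOO_INT_DONE	0x01	/* invented "transfer done" bit */

struct foo_spi {
	void __iomem *base;
};

static irqreturn_t foo_spi_irq(int irq, void *dev_id)
{
	struct spi_master *master = dev_id;
	struct foo_spi *fs = spi_master_get_devdata(master);

	/* Acknowledge the controller's transfer-done interrupt. */
	writel(FOO_INT_DONE, fs->base + FOO_INT_STATUS);

	/*
	 * Report completion to the core straight from the interrupt
	 * handler; the waiter in spi_sync() is then woken without an
	 * extra bounce through task context just for the completion.
	 */
	spi_finalize_current_message(master);

	return IRQ_HANDLED;
}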