Simon,

Thanks - see below,

On Mon, Mar 30, 2020 at 7:32 AM Simon Marchi <simark@simark.ca> wrote:
On 2020-03-30 12:10 a.m., Rocky Dunlap via lttng-dev wrote:
> A couple of questions on performance considerations when setting up bt2 processing graphs.
>
> 1.  Do parts of the processing graph that can execute concurrently do so?  Does the user have any control over this, e.g., by creating threads in sink components?

The graph execution in itself is single-threaded at the moment.  We could imagine
a design where different parts of the graph execute concurrently, in different
threads, but that's not the case right now.

You could make your components spawn threads to do some processing on the side,
if that helps, but these other threads should not interact directly with the
graph.

In my case I have a CTF trace where some analyses can be performed on a per-stream basis (no need to mux the streams together).  In this case, I was thinking that it would make sense to thread over the streams.  However, I think I can easily do this at a level above the graph, simply by creating multiple graphs, each handling a single stream.  I expect this to be mostly I/O bound, so I'm not sure what kind of payoff the threads will give.  Overall, I just want to make sure that I am not doing anything that would, in the long run, preclude threading/concurrency if it is added to the graph model itself.
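For what it's worth, the "one graph per stream, one thread per graph" pattern I have in mind looks roughly like the sketch below.  The bt2 parts are deliberately left as placeholder comments (the component names and graph calls are assumptions, not the actual bindings API); the point is only that each worker thread owns its graph exclusively and no graph object is ever shared across threads:

```python
# Sketch: run one independent analysis per stream, each in its own worker
# thread.  The body of analyze_stream() is a stand-in: in the real code it
# would create its own source/filter/sink components for this single
# stream, connect them, and run that private graph to completion.
from concurrent.futures import ThreadPoolExecutor

def analyze_stream(stream_path):
    # Placeholder for: build a bt2 graph restricted to this one stream
    # and run it.  Nothing here touches any other thread's graph.
    return "analyzed %s" % stream_path

stream_paths = ["stream-0", "stream-1", "stream-2"]

# Each call to analyze_stream() runs in its own thread; results come back
# in submission order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(analyze_stream, stream_paths))
```

Since each graph is fully private to its thread, this stays within the "other threads should not interact directly with the graph" constraint you described.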
 

> 2.  It looks like you cannot connect one output port to multiple inputs.  Is there a way to create a tee component?

Yes, we have discussed making a tee component; it is on the roadmap but not
concretely planned yet.  It should be possible, it's just not as trivial as it may sound.

One easy way to achieve it is to make each iterator that is created on the tee
component create and maintain its own upstream iterator.  If you have a tee with
two outputs, this will effectively make it so you have two graphs executing in
parallel.  If you have a src.ctf.fs source upstream of the tee, then there will
be two iterators created on that source, so the CTF trace will be opened and decoded
twice.  We'd like to avoid that.
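To make the trade-off concrete, here is a minimal sketch of that first approach in plain Python (no bt2 types; `make_source` is a hypothetical stand-in for creating a fresh upstream message iterator).  Each output iterator the tee hands out owns its own upstream iterator, so correctness is easy but the source does all its work once per output:

```python
# Naive tee: every downstream iterator gets a brand-new, independent
# upstream iterator.  For a src.ctf.fs source, make_source() would mean
# re-opening and re-decoding the whole trace each time.

def make_source():
    # Stand-in for instantiating a fresh upstream iterator.
    return iter(range(5))

class NaiveTee:
    def create_output_iterator(self):
        # No shared state at all: each output reads the source on its own.
        return make_source()

tee = NaiveTee()
out_a = tee.create_output_iterator()
out_b = tee.create_output_iterator()

# Both outputs see the complete message sequence...
assert list(out_a) == [0, 1, 2, 3, 4]
assert list(out_b) == [0, 1, 2, 3, 4]
# ...but make_source() ran twice, which is the duplicated decoding cost.
```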

The other way of doing it is to make the tee buffer messages, and discard a
message once all downstream iterators have consumed it.  This has some more
difficult technical challenges, like what to do when one downstream iterator
consumes but the other does not (we don't want to buffer an unbounded amount
of data).  It also makes seeking a bit tricky.
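A toy sketch of that buffering scheme, again in plain Python with a generic iterator standing in for the upstream component (none of this is bt2 API), might look like this.  The source is consumed lazily, exactly once; a message is dropped from the shared buffer as soon as every output has read it:

```python
from collections import deque

class BufferingTee:
    # Shared-buffer tee: upstream is consumed once; each output tracks its
    # own read position, and messages are discarded once all outputs have
    # passed them.  Note the unbounded-growth problem: if one output stops
    # consuming, the buffer grows without limit -- a real implementation
    # would need a cap or back-pressure policy here.
    def __init__(self, source, n_outputs):
        self._source = source
        self._buf = deque()           # pending messages
        self._base = 0                # absolute index of _buf[0]
        self._pos = [0] * n_outputs   # next absolute index per output

    def next(self, output):
        idx = self._pos[output]
        # Pull from upstream only when this output has drained the buffer.
        while idx - self._base >= len(self._buf):
            self._buf.append(next(self._source))
        msg = self._buf[idx - self._base]
        self._pos[output] = idx + 1
        # Drop messages that every output has consumed.
        while self._buf and min(self._pos) > self._base:
            self._buf.popleft()
            self._base += 1
        return msg

tee = BufferingTee(iter(range(100)), n_outputs=2)
assert tee.next(0) == 0
assert tee.next(0) == 1   # output 1 lags, so messages 0-1 stay buffered
assert tee.next(1) == 0   # message 0 is now discarded from the buffer
```

Seeking is where this gets hairy: once a message has been discarded, a downstream iterator that seeks backward past it forces the tee to re-seek the single shared upstream iterator, which perturbs every other output.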

We could go into more detail, if you are interested in starting to implement it
yourself.

Yeah, I can see how this can get tricky.  This is not critical at this very moment; I just wondered if there was a precedent for how to do this kind of thing.
 

Simon