Hi Jonathan :)

The race is very unlikely to occur, BUT it can happen.

OK run:
1. test_event_tracker starts gen-ust-events
2. test_event_tracker waits for gen-ust-events to create AFTER_FIRST_PATH
3. gen-ust-events creates the first event and then creates AFTER_FIRST_PATH
4. gen-ust-events continues until it has created 99 events
5. gen-ust-events waits for BEFORE_LAST_PATH
6. test_event_tracker starts collecting events (lttng start .....)
7. test_event_tracker calls "lttng track ... --pid <faulty PID>"
8. test_event_tracker touches BEFORE_LAST_PATH
9. gen-ust-events creates the last event
10. test_event_tracker finishes, and since it tracks the faulty PID the
result will be 0 events == OK

Faulty run:
1. test_event_tracker starts gen-ust-events
2. test_event_tracker waits for gen-ust-events to create AFTER_FIRST_PATH
3. gen-ust-events creates the first event and then creates AFTER_FIRST_PATH
4. gen-ust-events gets preempted before it has created 99 events, e.g.
after 9 events
5. test_event_tracker starts collecting events (lttng start .....)
6. gen-ust-events is scheduled again and generates the remaining 90 events
7. lttng collects these 90 events since we have not set up PID tracking
yet
8. test_event_tracker calls "lttng track ... --pid <faulty PID>"
9. test_event_tracker touches BEFORE_LAST_PATH
10. gen-ust-events creates the last event
11. test_event_tracker finishes, and since it tracks the faulty PID the
result should be 0 events, but we got 90 == FAULTY
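
To make the numbers in the two runs concrete, here is a small worst-case
model of the window. It is only arithmetic, not code from the test suite:
leaked_events is a hypothetical helper, and it assumes that every event not
yet generated when "lttng start" runs (except the last one, which is held
back by BEFORE_LAST_PATH) fires before "lttng track" takes effect.

```shell
#!/bin/sh
# Worst-case model of the race window (illustration only).
# leaked_events is a hypothetical helper, not part of lttng-tools.
leaked_events() {
	nr_iter=$1               # total number of events the app generates
	emitted_before_start=$2  # events already generated when tracing starts
	# The last event waits for BEFORE_LAST_PATH, so it can never leak.
	echo $(( nr_iter - 1 - emitted_before_start ))
}

ok_run=$(leaked_events 100 99)     # app already waiting on BEFORE_LAST_PATH
faulty_run=$(leaked_events 100 9)  # app preempted after 9 events
echo "ok=$ok_run faulty=$faulty_run"
```

With the numbers from the two runs above this prints "ok=0 faulty=90".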

We can solve this by either:
A: using NR_ITER=2, so that no events are generated between the first
event and the last one
or
B: adding a flag to gen-ust-events to wait before sending the first event:
1. test_event_tracker starts gen-ust-events
2. gen-ust-events waits for BEFORE_FIRST_PATH before creating any event
3. test_event_tracker starts collecting events (lttng start .....)
4. test_event_tracker calls "lttng track ... --pid <faulty PID>"
5. test_event_tracker creates BEFORE_FIRST_PATH
6. gen-ust-events creates 100 events
7. test_event_tracker waits for gen-ust-events to end
8. test_event_tracker finishes, and since it tracked the faulty PID the
result will be 0 events == OK
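
The ordering in B can be tried out with a self-contained shell sketch. It
does not call lttng or gen-ust-events: fake_gen_events, TRACING_READY and
EVENT_LOG are made-up stand-ins, and BEFORE_FIRST_PATH stands for the file
that the proposed (not yet existing) option would wait on. Each fake event
records whether the tracker setup had already happened when it fired.

```shell
#!/bin/bash
# Sketch only: simulates option B without lttng or gen-ust-events.
tmpdir=$(mktemp -d)
BEFORE_FIRST_PATH="$tmpdir/before_first"  # sync file for the proposed option
TRACING_READY="$tmpdir/tracing_ready"     # hypothetical marker: start + track done
EVENT_LOG="$tmpdir/events"                # hypothetical log of fake events

# Stand-in for gen-ust-events with a "wait before first event" option.
fake_gen_events() {
	# Block until the test script says tracing is fully set up.
	while [ ! -f "$BEFORE_FIRST_PATH" ]; do
		sleep 0.01
	done
	for _ in $(seq 1 100); do
		# An event "leaks" if it fires before the tracker is configured.
		if [ -f "$TRACING_READY" ]; then
			echo "filtered" >> "$EVENT_LOG"
		else
			echo "leaked" >> "$EVENT_LOG"
		fi
	done
}

fake_gen_events &
child_pid=$!

# Stand-in for "lttng start" followed by "lttng track ... --pid <faulty PID>".
sleep 0.1
touch "$TRACING_READY"

# Only now allow the application to create its first event.
touch "$BEFORE_FIRST_PATH"
wait "$child_pid"

total=$(wc -l < "$EVENT_LOG" | tr -d ' ')
leaked=$(grep -c leaked "$EVENT_LOG")
echo "total=$total leaked=$leaked"
rm -rf "$tmpdir"
```

Because the application cannot pass the wait loop until BEFORE_FIRST_PATH
exists, and that file is only created after the tracker setup, the sketch
reports 100 events with none leaked no matter how the two processes are
scheduled.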

This is in principle how gen-kernel-test-events works (but with different
arguments).
I would suggest using B since that will be bulletproof.

Anders Wallin


On Wed, Mar 31, 2021 at 11:33 PM Jonathan Rajotte-Julien <
jonathan.rajotte-julien@efficios.com> wrote:

> Hi,
>
> On Wed, Mar 31, 2021 at 11:09:42PM +0200, Anders Wallin wrote:
> > Hi Julian,
>
> You can use Jonathan. ;)
>
> >
> > Neither my "sleep 0.1" nor your version with "while [ ! -f ............"
> > is free of race conditions.
>
> I might be missing something here, but as far as I understand, the race you
> are trying to mitigate is that the test script executes/continues before the
> backgrounded command (background test app) has had time to execute, right?
>
> If so, at least waiting for the app to create a file is necessary. Now
> gen_kernel_test_events does not have this functionality. The PATH_WAIT_FILE
> is used to control when the testapp can continue. Hence the script still
> cannot know if the app has been scheduled.
>
> Now, depending on the test case, you might need more synchronization.
>
> Note that in the ust cases, trace_ust_app uses `touch "$BEFORE_LAST_PATH"`,
> which effectively unblocks the app and allows it to perform the last
> tracepoint hit, and then we `wait` on the background process.
>
> Note: some test cases are a bit clever and use "trace_"$domain"_app"
> instead of calling trace_ust_app directly.
>
> For these test cases it seems that we are only expecting at least a single
> event matching the event name under test. Here the last tracepoint hit
> should satisfy this criterion.
>
> Am I missing a race?
>
> Cheers
>
>
> > I suggest that we add an option to gen-ust-events to wait before the
> > first event is generated.
> > gen_kernel_test_events already has this functionality to wait before the
> > first event.
> >
> > Something like this
> > static struct option long_options[] =
> > {
> > /* These options set a flag. */
> > {"iter", required_argument, 0, 'i'},
> > {"wait", required_argument, 0, 'w'},
> > {"sync-after-first-event", required_argument, 0, 'a'},
> > {"sync-before-last-event", required_argument, 0, 'b'},
> > {"sync-before-last-event-touch", required_argument, 0, 'c'},
> > {"sync-before-exit", required_argument, 0, 'd'},
> > {"sync-before-exit-touch", required_argument, 0, 'e'},
> > + {"sync-before-first-event", required_argument, 0, 'f'},
> >
> > {0, 0, 0, 0}
> > };
> > ....
> >
> > I will create one or more patches for this tomorrow
> >
> > Anders Wallin
> >
> >
> > On Wed, Mar 31, 2021 at 9:25 PM Jonathan Rajotte-Julien <
> > jonathan.rajotte-julien@efficios.com> wrote:
> >
> > > > #
> > > > # SPDX-License-Identifier: GPL-2.0-only
> > > >
> > > > -TEST_DESC="LTTng - Event traker test"
> > > > +TEST_DESC="LTTng - Event tracker test"
> > > >
> > > > CURDIR=$(dirname "$0")/
> > > > TESTDIR="$CURDIR/../../.."
> > > > @@ -42,6 +42,8 @@ function prepare_ust_app
> > > >
> > > >       $TESTAPP_BIN -i $NR_ITER -w $NR_USEC_WAIT -a "$AFTER_FIRST_PATH" -b "$BEFORE_LAST_PATH" &
> > > >       CHILD_PID=$!
> > > > +     # voluntary context switch to start $TESTAPP_BIN
> > > > +     sleep 0.1
> > >
> > > A wait on the $AFTER_FIRST_PATH file would probably be more
> > > deterministic than a sleep here.
> > >
> > >   while [ ! -f "${AFTER_FIRST_PATH}" ]; do
> > >           sleep 0.1
> > >   done
> > >
> > > I would also expect something similar for the `prepare_kernel_app`
> > > function considering the same race is most probably present and simply
> > > not triggered by a chance of luck.
> > > Seems like gen-kernel-test-events does not expose the same sync
> > > capabilities here, please use gen-ust-events as an example of how it is
> > > done.
> > >
> > > Let us know how your testing goes.
> > >
> > > Thanks
> > >
>
> --
> Jonathan Rajotte-Julien
> EfficiOS
>