Hi, I think you understood what I meant. The issue I have is for this 4 test cases in ./regression/tools/tracker/test_event_tracker - test_event_vpid_tracker ust 0 "${EVENT_NAME}" - test_event_vpid_track_untrack ust 0 "${EVENT_NAME}" - test_event_pid_tracker ust 0 "${EVENT_NAME}" - test_event_pid_track_untrack ust 0 "${EVENT_NAME}" I have 2 patches ready, but I want to run some more tests first before posting them Anders Wallin On Thu, Apr 1, 2021 at 3:45 PM Jonathan Rajotte-Julien < jonathan.rajotte-julien@efficios.com> wrote: > On Thu, Apr 01, 2021 at 03:21:10AM +0200, Anders Wallin wrote: > > Hi Jonathan :) > > > > It's very unlikely that the race could occur, BUT it can happen. > > > > OK run > > 1. test_event_tracker starts gen-ust-events > > 2. test_event_tracker waits for gen-ust-events to create AFTER_FIRST_PATH > > 3. gen-ust-event create first event and create AFTER_FIRST_PATH > > 4. gen-ust-event continue and create 99 events > > 5. gen-ust-event wait for BEFORE_LAST_PATH > > 6. test_event_track start collecting events (lttng start .....) > > > 7. test_event_tracker calls "lttng track ... -pid "Faulty PID" > > 8. test_event_tracker touches BEFORE_LAST_PATH > > 9. gen-ust-events creates the last event > > 10. test_event finishes and since it tracks the faulty pid the result > will > > be 0 events == OK > > The sequence of event for the `test_event_tracker` function is (as of > master): > > create > enable event > start > track > > launch app > wait for app return > stop > destroy > > > The sequence you are describing here is: > > lauch app > start > track > ... > > I'm pretty sure we are not talking about the same things. Please specify > the > test case and all functions involved and make sure to use the proper name > for > each of them. > > I suspect you are talking about test_event_pid_tracker. Still let's make > sure of > it. If it is, I do agree that it seems to have a window where we can gather > event for. > > You might want to look if there is a real reason for this sequence instead > of > mimicking test_event_tracker > > Current code: > enable_"$domain"_lttng_event_ok $SESSION_NAME "$wildcard" "$channel" > prepare_"$domain"_app > start_lttng_tracing_ok > > if [ "$expect_event" -eq 1 ]; then > lttng_track_"$domain"_ok "--vpid ${CHILD_PID}" > else > lttng_track_"$domain"_ok "--vpid $((CHILD_PID+1))" > fi > trace_"$domain"_app > stop_lttng_tracing_ok > destroy_lttng_session_ok $SESSION_NAME > > > We might simply want to move the track command before the start > considering have > all the information to do so. > > enable_"$domain"_lttng_event_ok $SESSION_NAME "$wildcard" "$channel" > > prepare_"$domain"_app > > if [ "$expect_event" -eq 1 ]; then > lttng_track_"$domain"_ok "--vpid ${CHILD_PID}" > else > lttng_track_"$domain"_ok "--vpid $((CHILD_PID+1))" > fi > > start_lttng_tracing_ok > trace_"$domain"_app > > stop_lttng_tracing_ok > destroy_lttng_session_ok $SESSION_NAME > > After testing this, seems like the validate_trace_empty does not handle the > case were simply stream allocation did not take place since no application > was > 'valid' at the moment of the trace start. > > Well we got a good one here. We will wait for your updated patch and go > from > there. > > Cheers > > > > > Faulty run > > 1. test_event_tracker starts gen-ust-events > > 2. test_event_tracker waits for gen-ust-events to create AFTER_FIRST_PATH > > 3. gen-ust-event create first event and create AFTER_FIRST_PATH > > 4. gen-ust-event gets rescheduled before it has created 99 events, e.g > > after 9 events > > 5. test_event_track start collecting events (lttng start .....) > > 6. gen-ust-event is rescheduled and starts generating the remaining > events. > > 90 events > > 7. lttng collects these 90 events since we have not setup "tracking" of > PID > > yet > > 8. test_event_tracker calls "lttng track ... -pid "Faulty PID" > > 9. test_event_tracker touches BEFORE_LAST_PATH > > 10. gen-ust-events creates the last event > > 11. test_event finishes and since it tracks the faulty pid the result > > should be 0 events, but we got 90 == FAULTY > > > > We can solve this by; > > A: using NR_ITER=2 > > or > > B: by adding a flag to gen-ust-events to wait before sending the first > event > > 1. test_event_tracker starts gen-ust-events > > 2. test_event_tracker waits for gen-ust-events waits for > BEFORE_FIRST_PATH > > 3. test_event_track start collecting events (lttng start .....) > > 4. test_event_tracker calls "lttng track ... -pid "Faulty PID" > > 5. test_event_track creates BEFORE_FIRST_PATH > > 6. gen_ust_event creates 100 events > > 7. test_event_tracker waits for gen_event_ust to end > > 8. test_event finishes and since it tracked the faulty pid the result > will > > be 0 events == OK > > > > This is in principle how gen-kernel-test-events works (but with different > > arguments) > > I would suggest to use B since that will be bulletproof > > > > Anders Wallin > > > > > > On Wed, Mar 31, 2021 at 11:33 PM Jonathan Rajotte-Julien < > > jonathan.rajotte-julien@efficios.com> wrote: > > > > > Hi, > > > > > > On Wed, Mar 31, 2021 at 11:09:42PM +0200, Anders Wallin wrote: > > > > Hi Julian, > > > > > > You can use Jonathan. ;) > > > > > > > > > > > Neither mine "sleep 0.1" or your version with "while [! -f > ............" > > > > are race condition free. > > > > > > I might be missing something here but as far as I understand the race > you > > > are > > > trying to mitigate is that the test script execute/continue before the > > > `backgrounded` > > > command (background test app) had time to execute, right? > > > > > > If so at least waiting for the app to create a file is necessary. Now > > > gen_kernel_test_events does not have this functionality. The > > > PATH_WAIT_FILE is > > > used to control when the testapp can continue. Hence the script still > > > cannot > > > know if the app have been scheduled. > > > > > > Now based on the test case you might need more synchronization for the > test > > > cases. > > > > > > Note that in the ust cases, the trace_ust_app uses `touch > > > "$BEFORE_LAST_PATH"` > > > that effectively unblock the app and allows it to perform the last > > > tracepoint > > > hit and the we `wait` on the background process. > > > > > > Note: some tests case are a bit clever and uses "trace_"$domain"_app" > > > instead of > > > calling trace_ust_app directly. > > > > > > For these tests case it seems that we are only expecting at least a > single > > > event > > > matching the event name under test. Here the last tracepoint hit should > > > satisfy > > > this criteria. > > > > > > Am I missing a race? > > > > > > Cheers > > > > > > > > > > I suggest that we add an option to gen-ust-events to wait before the > > > first > > > > event is generated. > > > > gen_kernel_test_events already have this functionality to wait > before the > > > > first event. > > > > > > > > Something like this > > > > static struct option long_options[] = > > > > { > > > > /* These options set a flag. */ > > > > {"iter", required_argument, 0, 'i'}, > > > > {"wait", required_argument, 0, 'w'}, > > > > {"sync-after-first-event", required_argument, 0, 'a'}, > > > > {"sync-before-last-event", required_argument, 0, 'b'}, > > > > {"sync-before-last-event-touch", required_argument, 0, 'c'}, > > > > {"sync-before-exit", required_argument, 0, 'd'}, > > > > {"sync-before-exit-touch", required_argument, 0, 'e'}, > > > > *+ {"sync-before-first-event", required_argument, 0, 'f'},* > > > > > > > > {0, 0, 0, 0} > > > > }; > > > > .... > > > > > > > > I will create one or more patches for this tomorrow > > > > > > > > Anders Wallin > > > > > > > > > > > > On Wed, Mar 31, 2021 at 9:25 PM Jonathan Rajotte-Julien < > > > > jonathan.rajotte-julien@efficios.com> wrote: > > > > > > > > > > # > > > > > > # SPDX-License-Identifier: GPL-2.0-only > > > > > > > > > > > > -TEST_DESC="LTTng - Event traker test" > > > > > > +TEST_DESC="LTTng - Event tracker test" > > > > > > > > > > > > CURDIR=$(dirname "$0")/ > > > > > > TESTDIR="$CURDIR/../../.." > > > > > > @@ -42,6 +42,8 @@ function prepare_ust_app > > > > > > > > > > > > $TESTAPP_BIN -i $NR_ITER -w $NR_USEC_WAIT -a > > > "$AFTER_FIRST_PATH" -b > > > > > > "$BEFORE_LAST_PATH" & > > > > > > CHILD_PID=$! > > > > > > + # voluntary context switch to start $TESTAPP_BIN > > > > > > + sleep 0.1 > > > > > > > > > > A wait on the $AFTER_FIRST_PATH file would be probably more > > > deterministic > > > > > than a sleep here. > > > > > > > > > > while [ ! -f "${AFTER_FIRST_PATH}" ]; do > > > > > sleep 0.1 > > > > > done > > > > > > > > > > I would also expect something similar for the `prepare_kernel_app` > > > > > function considering the same race is most probably present and > simply > > > not > > > > > triggered by a chance of luck. > > > > > Seems like gen-kernel-test-events does not expose the same sync > > > > > capabilities here, please use gen-ust-events as an example of how > it is > > > > > done. > > > > > > > > > > Let us know how your testing goes. > > > > > > > > > > Thanks > > > > > > > > > > > -- > > > Jonathan Rajotte-Julien > > > EfficiOS > > > > > -- > Jonathan Rajotte-Julien > EfficiOS >