lttng-dev.lists.lttng.org archive mirror
 help / color / mirror / Atom feed
From: Jonathan Rajotte-Julien via lttng-dev <lttng-dev@lists.lttng.org>
To: Anders Wallin <wallinux@gmail.com>
Cc: lttng-dev <lttng-dev@lists.lttng.org>
Subject: Re: [lttng-dev] [PATCH lttng-tools] Fix: test code assumes that child process is schedule to run before parent
Date: Thu, 1 Apr 2021 09:45:27 -0400	[thread overview]
Message-ID: <20210401134527.GA79283@joraj-alpa> (raw)
In-Reply-To: <CAF2baFdzoD-wmoO84A83DXdTzBr1WZ7ifoQtah493xH7WE3KhQ@mail.gmail.com>

On Thu, Apr 01, 2021 at 03:21:10AM +0200, Anders Wallin wrote:
> Hi Jonathan :)
> 
> It's very unlikely that the race could occur, BUT it can happen.
> 
> OK run
> 1. test_event_tracker starts gen-ust-events
> 2. test_event_tracker waits for gen-ust-events to create AFTER_FIRST_PATH
> 3. gen-ust-event create first event and create AFTER_FIRST_PATH
> 4. gen-ust-event continue and create 99 events
> 5. gen-ust-event wait for  BEFORE_LAST_PATH
> 6. test_event_track start collecting events (lttng start .....)

> 7. test_event_tracker calls "lttng track ... -pid "Faulty PID"
> 8. test_event_tracker touches BEFORE_LAST_PATH
> 9. gen-ust-events creates the last event
> 10. test_event finishes and since it tracks the faulty pid the result will
> be 0 events == OK

The sequence of event for the `test_event_tracker` function is (as of master):

create
enable event
start
track

launch app
wait for app return
stop
destroy


The sequence you are describing here is:

lauch app
start
track
...

I'm pretty sure we are not talking about the same things. Please specify the
test case and all functions involved and make sure to use the proper name for
each of them.

I suspect you are talking about test_event_pid_tracker. Still let's make sure of
it. If it is, I do agree that it seems to have a window where we can gather
event for.

You might want to look if there is a real reason for this sequence instead of
mimicking test_event_tracker

Current code:
 enable_"$domain"_lttng_event_ok $SESSION_NAME "$wildcard" "$channel"
 prepare_"$domain"_app
 start_lttng_tracing_ok

 if [ "$expect_event" -eq 1 ]; then
         lttng_track_"$domain"_ok "--vpid ${CHILD_PID}"
 else
         lttng_track_"$domain"_ok "--vpid $((CHILD_PID+1))"
 fi
 trace_"$domain"_app
 stop_lttng_tracing_ok
 destroy_lttng_session_ok $SESSION_NAME


We might simply want to move the track command before the start considering have
all the information to do so.

 enable_"$domain"_lttng_event_ok $SESSION_NAME "$wildcard" "$channel"

 prepare_"$domain"_app

 if [ "$expect_event" -eq 1 ]; then
         lttng_track_"$domain"_ok "--vpid ${CHILD_PID}"
 else
         lttng_track_"$domain"_ok "--vpid $((CHILD_PID+1))"
 fi

 start_lttng_tracing_ok
 trace_"$domain"_app

 stop_lttng_tracing_ok
 destroy_lttng_session_ok $SESSION_NAME

After testing this, seems like the validate_trace_empty does not handle the
case were simply stream allocation did not take place since no application was
'valid' at the moment of the trace start.

Well we got a good one here. We will wait for your updated patch and go from
there.

Cheers

> 
> Faulty run
> 1. test_event_tracker starts gen-ust-events
> 2. test_event_tracker waits for gen-ust-events to create AFTER_FIRST_PATH
> 3. gen-ust-event create first event and create AFTER_FIRST_PATH
> 4. gen-ust-event gets rescheduled before it has created 99 events, e.g
> after 9 events
> 5. test_event_track start collecting events (lttng start .....)
> 6. gen-ust-event is rescheduled and starts generating the remaining events.
> 90 events
> 7. lttng collects these 90 events since we have not setup "tracking" of PID
> yet
> 8. test_event_tracker calls "lttng track ... -pid "Faulty PID"
> 9. test_event_tracker touches BEFORE_LAST_PATH
> 10. gen-ust-events creates the last event
> 11. test_event finishes and since it tracks the faulty pid the result
> should  be 0 events, but we got 90 == FAULTY
> 
> We can solve this by;
> A: using NR_ITER=2
> or
> B: by adding a flag to gen-ust-events to wait before sending the first event
> 1. test_event_tracker starts gen-ust-events
> 2. test_event_tracker waits for gen-ust-events waits for BEFORE_FIRST_PATH
> 3. test_event_track start collecting events (lttng start .....)
> 4. test_event_tracker calls "lttng track ... -pid "Faulty PID"
> 5. test_event_track creates BEFORE_FIRST_PATH
> 6. gen_ust_event creates 100 events
> 7. test_event_tracker waits for gen_event_ust to end
> 8. test_event finishes and since it tracked the faulty pid the result will
> be 0 events == OK
> 
> This is in principle how gen-kernel-test-events works (but with different
> arguments)
> I would suggest to use B since that will be bulletproof
> 
> Anders Wallin
> 
> 
> On Wed, Mar 31, 2021 at 11:33 PM Jonathan Rajotte-Julien <
> jonathan.rajotte-julien@efficios.com> wrote:
> 
> > Hi,
> >
> > On Wed, Mar 31, 2021 at 11:09:42PM +0200, Anders Wallin wrote:
> > > Hi Julian,
> >
> > You can use Jonathan. ;)
> >
> > >
> > > Neither mine "sleep 0.1" or your version with "while [! -f ............"
> > > are race condition free.
> >
> > I might be missing something here but as far as I understand the race you
> > are
> > trying to mitigate is that the test script execute/continue before the
> > `backgrounded`
> > command (background test app) had time to execute, right?
> >
> > If so at least waiting for the app to create a file is necessary. Now
> > gen_kernel_test_events does not have this functionality. The
> > PATH_WAIT_FILE is
> > used to control when the testapp can continue. Hence the script still
> > cannot
> > know if the app have been scheduled.
> >
> > Now based on the test case you might need more synchronization for the test
> > cases.
> >
> > Note that in the ust cases, the trace_ust_app uses `touch
> > "$BEFORE_LAST_PATH"`
> > that effectively unblock the app and allows it to perform the last
> > tracepoint
> > hit and the we `wait` on the background process.
> >
> > Note: some tests case are a bit clever and uses "trace_"$domain"_app"
> > instead of
> > calling trace_ust_app directly.
> >
> > For these tests case it seems that we are only expecting at least a single
> > event
> > matching the event name under test. Here the last tracepoint hit should
> > satisfy
> > this criteria.
> >
> > Am I missing a race?
> >
> > Cheers
> >
> >
> > > I suggest that we add an option to gen-ust-events to wait before the
> > first
> > > event is generated.
> > > gen_kernel_test_events already have this functionality to wait before the
> > > first event.
> > >
> > > Something like this
> > > static struct option long_options[] =
> > > {
> > > /* These options set a flag. */
> > > {"iter", required_argument, 0, 'i'},
> > > {"wait", required_argument, 0, 'w'},
> > > {"sync-after-first-event", required_argument, 0, 'a'},
> > > {"sync-before-last-event", required_argument, 0, 'b'},
> > > {"sync-before-last-event-touch", required_argument, 0, 'c'},
> > > {"sync-before-exit", required_argument, 0, 'd'},
> > > {"sync-before-exit-touch", required_argument, 0, 'e'},
> > > *+ {"sync-before-first-event", required_argument, 0, 'f'},*
> > >
> > > {0, 0, 0, 0}
> > > };
> > > ....
> > >
> > > I will create one or more patches for this tomorrow
> > >
> > > Anders Wallin
> > >
> > >
> > > On Wed, Mar 31, 2021 at 9:25 PM Jonathan Rajotte-Julien <
> > > jonathan.rajotte-julien@efficios.com> wrote:
> > >
> > > > > #
> > > > > # SPDX-License-Identifier: GPL-2.0-only
> > > > >
> > > > > -TEST_DESC="LTTng - Event traker test"
> > > > > +TEST_DESC="LTTng - Event tracker test"
> > > > >
> > > > > CURDIR=$(dirname "$0")/
> > > > > TESTDIR="$CURDIR/../../.."
> > > > > @@ -42,6 +42,8 @@ function prepare_ust_app
> > > > >
> > > > >       $TESTAPP_BIN -i $NR_ITER -w $NR_USEC_WAIT -a
> > "$AFTER_FIRST_PATH" -b
> > > > >       "$BEFORE_LAST_PATH" &
> > > > >       CHILD_PID=$!
> > > > > +     # voluntary context switch to start $TESTAPP_BIN
> > > > > +     sleep 0.1
> > > >
> > > > A wait on the $AFTER_FIRST_PATH file would be probably more
> > deterministic
> > > > than a sleep here.
> > > >
> > > >   while [ ! -f "${AFTER_FIRST_PATH}" ]; do
> > > >           sleep 0.1
> > > >   done
> > > >
> > > > I would also expect something similar for the `prepare_kernel_app`
> > > > function considering the same race is most probably present and simply
> > not
> > > > triggered by a chance of luck.
> > > > Seems like gen-kernel-test-events does not expose the same sync
> > > > capabilities here, please use gen-ust-events as an example of how it is
> > > > done.
> > > >
> > > > Let us know how your testing goes.
> > > >
> > > > Thanks
> > > >
> >
> > --
> > Jonathan Rajotte-Julien
> > EfficiOS
> >

-- 
Jonathan Rajotte-Julien
EfficiOS
_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

  reply	other threads:[~2021-04-01 13:45 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-31 18:56 [lttng-dev] [PATCH lttng-tools] Fix: test code assumes that child process is schedule to run before parent Anders Wallin via lttng-dev
2021-03-31 19:25 ` Jonathan Rajotte-Julien via lttng-dev
2021-03-31 21:09   ` Anders Wallin via lttng-dev
2021-03-31 21:33     ` Jonathan Rajotte-Julien via lttng-dev
2021-04-01  1:21       ` Anders Wallin via lttng-dev
2021-04-01 13:45         ` Jonathan Rajotte-Julien via lttng-dev [this message]
2021-04-01 15:02           ` Anders Wallin via lttng-dev
2021-04-01 16:19 ` Mathieu Desnoyers via lttng-dev
2021-04-01 16:21   ` Mathieu Desnoyers via lttng-dev
2021-04-01 16:33   ` Anders Wallin via lttng-dev
2021-04-01 16:37 Anders Wallin via lttng-dev
2021-04-07 15:31 ` Mathieu Desnoyers via lttng-dev
2021-04-08  9:22   ` Anders Wallin via lttng-dev
2021-04-08 12:48     ` Mathieu Desnoyers via lttng-dev
2021-04-08 15:47       ` Anders Wallin via lttng-dev
2021-04-09 14:41 Anders Wallin via lttng-dev
2021-04-09 19:12 ` Mathieu Desnoyers via lttng-dev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210401134527.GA79283@joraj-alpa \
    --to=lttng-dev@lists.lttng.org \
    --cc=jonathan.rajotte-julien@efficios.com \
    --cc=wallinux@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).