call_rcu seems inefficient without futex

* call_rcu seems inefficient without futex
       [not found] <157982514329.691.6168767011604689030.ref@pink>
@ 2020-01-24  0:19 ` Alex Xu via lttng-dev
  2020-01-27 15:38   ` Mathieu Desnoyers
  0 siblings, 1 reply; 5+ messages in thread
From: Alex Xu via lttng-dev @ 2020-01-24  0:19 UTC (permalink / raw)
  To: lttng-dev

Hi,

I recently installed knot dns for a very small FreeBSD server. I noticed
that it uses a surprising amount of CPU, even when there is no load:
about 0.25%. That's not huge, but it seems unnecessarily high when my
QPS is less than 0.01.

After some profiling, I came to the conclusion that this is caused by
call_rcu_wait using futex_async to repeatedly wait. Since there is no
futex on FreeBSD (without the Linux compatibility layer), this
effectively turns into a permanent busy waiting loop.

I think futex_noasync can be used here instead. call_rcu_wait is only
supposed to be called from call_rcu_thread, never from a signal context.
call_rcu calls get_call_rcu_data, which may call
get_default_call_rcu_data, which calls pthread_mutex_lock through
call_rcu_lock. Therefore, call_rcu is not async-signal-safe already.
Also, I think it only makes sense to use call_rcu around a RCU write,
which contradicts the README saying that only RCU reads are allowed in
signal handlers.

I applied "sed -i -e 's/futex_async/futex_noasync/'
src/urcu-call-rcu-impl.h" and knot seems to work correctly with only
0.01% CPU now. I also ran tests/unit and tests/regression with default
and signal backends and all completed successfully.

I think that the other two usages of futex_async are also a little
suspicious, but I didn't look too closely.

Thanks,
Alex.

^ permalink raw reply	[flat|nested] 5+ messages in thread