From: ramesh.thomas@intel.com
To: linux-rt-users@vger.kernel.org
Cc: Ramesh Thomas <ramesh.thomas@intel.com>,
	williams@redhat.com, frederic@kernel.org, bigeasy@linutronix.de
Subject: [PATCH 0/1] nohz_full state entry failure in preempt_rt and proposed fix
Date: Wed, 23 Dec 2020 04:20:34 -0500
Message-ID: <20201223092034.528782-1-ramesh.thomas@intel.com>

From: Ramesh Thomas <ramesh.thomas@intel.com>

Hello,

This addresses an issue we have been facing where preempt_rt kernels are
not able to enter nohz_full state consistently. Below are my debug
findings and details of a tool I developed that can help reproduce the
issue. The following patch has a proposed fix, or at least a pointer to
areas worth looking into. We need preempt_rt for determinism, and being
able to use nohz_full along with it is very valuable.

Problem:
Sometimes nohz_full state is never entered even when all necessary
conditions are met. The issue is easier to reproduce on a preempt_rt
kernel, but it may not be limited to preempt_rt.

Debug findings and proposed fix:
In the failure condition, entry into nohz_full state is repeatedly
aborted because tick_nohz_next_event() detects a pending timer event in
the next period. The issue is not reproducible if tick stoppage does not
bail out at this point. The proposed fix skips the bail-out code only if
CONFIG_NO_HZ_FULL is defined. Since in nohz_full mode the idle state is
not entered when the tick is being stopped, aborting tick stoppage may
not be necessary. It is simpler to let the common code that handles
reprogramming of the timer in tick_nohz_stop_tick() handle the next tick.
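
To make the reasoning above concrete, here is a toy model of the
decision as described in this letter. It is not kernel code: the
function name, its parameters and the skip_bail flag are hypothetical,
and the 1,000,000 ns period in the usage note assumes HZ=1000.

```python
def aborts_tick_stop(next_event_ns, now_ns, tick_period_ns,
                     tick_stopped, skip_bail=False):
    """Toy model: does the 'timer due in the next period' check abort
    the attempt to stop the tick?

    skip_bail stands in for the proposed behaviour of not bailing out
    when the kernel is built with CONFIG_NO_HZ_FULL (hypothetical flag).
    """
    delta = next_event_ns - now_ns
    due_in_next_period = delta <= tick_period_ns
    # Bail only if an event is due soon, the tick is still running,
    # and the bail-out has not been skipped.
    return due_in_next_period and not tick_stopped and not skip_bail
```

With a timer that is always re-armed less than one period ahead, for
example aborts_tick_stop(500_000, 0, 1_000_000, False), the model bails
on every attempt, which matches the observed failure. With
skip_bail=True the attempt proceeds and the next tick is left to the
common reprogramming path.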

Environment to reproduce:
Intel NUCs with 4 cores (Apollo Lake and Tiger Lake). The issue is
easier to reproduce on embedded platforms.

Kernel version: 5.10.1-rt19 (This is not a new issue; I have also seen
it in rt kernel version 4.17.)

Relevant kernel config flags:
- NO_HZ_FULL=y
- PREEMPT_RT=y
- CPU_ISOLATION=y
- RCU_NOCB_CPU=y

Relevant kernel boot parameters:
- isolcpus=nohz,domain,1,3 nohz_full=1,3 rcu_nocbs=1,3 irqaffinity=0
- cpufreq.off=1 idle=poll cpuidle.off=1
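
Before running the test it can help to confirm that the boot parameters
took effect. A minimal sketch, assuming the kernel exposes the CPU lists
in /sys/devices/system/cpu/nohz_full and /sys/devices/system/cpu/isolated;
the helper names are hypothetical.

```python
def parse_cpu_list(text):
    """Parse a kernel cpulist string such as "1,3" or "0-2,5" into a set."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty list (no CPUs configured)
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def read_cpu_list(path):
    """Return the CPU set stored in a sysfs cpulist file, or None if the
    file is missing or does not contain a cpulist."""
    try:
        with open(path) as f:
            return parse_cpu_list(f.read())
    except (OSError, ValueError):
        return None
```

With the parameters above, read_cpu_list("/sys/devices/system/cpu/nohz_full")
would be expected to return {1, 3}.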

Steps to reproduce:
1. Disable rt throttling by assigning 100% of the scheduler period to rt tasks
2. Set cpu affinity to one of the nohz_full cpus
3. Set the scheduling policy to SCHED_FIFO with max priority
4. Wait until "tick_stopped" gets set in /proc/timer_list
5. Return failure if the tick is not stopped within 15 seconds
6. Run the above steps in a loop to stress it. It may take a while to
reproduce.
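
For readers without the tool at hand, steps 1 to 5 can be sketched
roughly as below. This is not the tif implementation. It assumes root
privileges, assumes that writing -1 to
/proc/sys/kernel/sched_rt_runtime_us is how rt throttling is disabled,
and assumes /proc/timer_list prints a "cpu: N" header followed later by
a ".tick_stopped : 0/1" line for each CPU. Function names are
hypothetical.

```python
import os
import re
import time


def tick_stopped(timer_list_text, cpu):
    """Return the tick_stopped flag for `cpu` from /proc/timer_list text,
    or None if no entry for that CPU is found."""
    current = None
    for line in timer_list_text.splitlines():
        m = re.match(r"\s*cpu:\s*(\d+)\s*$", line)
        if m:
            current = int(m.group(1))
            continue
        m = re.match(r"\s*\.tick_stopped\s*:\s*(\d+)", line)
        if m and current == cpu:
            return m.group(1) != "0"
    return None


def wait_for_tick_stop(cpu, timeout_s=15.0):
    """Steps 1-5: returns True if the tick stops on `cpu` in time."""
    # Step 1: disable rt throttling (assumed sysctl, needs root).
    with open("/proc/sys/kernel/sched_rt_runtime_us", "w") as f:
        f.write("-1")
    # Step 2: pin this process to the nohz_full CPU.
    os.sched_setaffinity(0, {cpu})
    # Step 3: SCHED_FIFO at maximum priority.
    prio = os.sched_get_priority_max(os.SCHED_FIFO)
    os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(prio))
    # Steps 4 and 5: busy-poll until tick_stopped is set or we time out.
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        with open("/proc/timer_list") as f:
            if tick_stopped(f.read(), cpu):
                return True
    return False
```

Step 6 is then a loop that calls wait_for_tick_stop(1) from a fresh
process each time and stops at the first False.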

The above can be done using a tool I developed as part of a framework
to assist with setting up CPU thread isolation and measuring jitter. It
can be found at https://github.com/intel/tif

Build the tif_test app and run it in a loop, using the included script,
until the failure is reproduced.

make test
./tif_stress.sh

Example output:
"Test# 818
Successfully entered nohz state in 724us

Error entering nohz state after 15000102us
Reproduced NOHZ_FULL failure after 818 tries!!!
Test elapsed time: 835 seconds"

(PS: The framework has a workaround for the issue, which is not used in
the test. The workaround that helped was switching CPU affinity in and
out of the nohz_full CPU, giving the scheduler a fresh start.)


Ramesh Thomas (1):
  dynticks/preempt_rt: Fix a nohz_full entry failure in preempt_rt

 kernel/time/tick-sched.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

-- 
2.26.2


Thread overview: 3+ messages
2020-12-23  9:20 ramesh.thomas [this message]
2020-12-23  9:20 ` [PATCH 1/1] dynticks/preempt_rt: Fix a nohz_full entry failure in preempt_rt ramesh.thomas
2021-01-14 14:48   ` Sebastian Andrzej Siewior