linux-mm.kvack.org archive mirror
* [patch v2] mm: lru_cache_disable: replace work queue synchronization with synchronize_rcu
@ 2022-02-22 16:01 Marcelo Tosatti
  2022-02-22 16:07 ` [patch v3] " Marcelo Tosatti
  0 siblings, 1 reply; 26+ messages in thread
From: Marcelo Tosatti @ 2022-02-22 16:01 UTC
  To: linux-kernel
  Cc: linux-mm, Minchan Kim, Matthew Wilcox, Mel Gorman,
	Nicolas Saenz Julienne, Juri Lelli, Thomas Gleixner,
	Sebastian Andrzej Siewior, Paul E. McKenney


On systems that run FIFO:1 applications that busy-loop
on isolated CPUs, executing lower-priority tasks on those
CPUs is undesirable: doing so will either hang the system
or lengthen the interruption to the FIFO task, because the
lower-priority task executes with very small sched slices.

Commit d479960e44f27e0e52ba31b21740b703c538027c ("mm: disable LRU 
pagevec during the migration temporarily") relies on 
queueing work items on all online CPUs to ensure visibility
of lru_disable_count.

However, it is possible to use synchronize_rcu(), which provides the same
guarantees:

    * synchronize_rcu() waits for preemption-disabled regions
    * and RCU read-side critical sections to complete.
    * The users of lru_disable_count are protected by:
    *
    * preempt_disable, local_irq_disable() [bh_lru_lock()]
    * rcu_read_lock                        [lru_pvecs CONFIG_PREEMPT_RT]
    * preempt_disable                      [lru_pvecs !CONFIG_PREEMPT_RT]
    *
    * so any call to lru_cache_disabled() wrapped by
    * local_lock+rcu_read_lock or made with preemption disabled
    * is ordered by it.
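
The ordering argument above can be illustrated with a small user-space
toy model. It is only a sketch, not kernel code: every toy_* name is
hypothetical, the model is single-threaded, and toy_synchronize() stands in
for a grace period by letting an in-flight reader run to completion, whereas
the real synchronize_rcu() blocks until readers finish on their own.

```c
/* User-space toy model, NOT kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 2

static int toy_disable_count;		/* stands in for lru_disable_count */

struct toy_cpu {
	bool in_section;	/* inside a preempt-disabled / RCU read-side region */
	bool saw_enabled;	/* counter was zero when sampled inside the region */
	int cached_pages;	/* pages batched in this CPU's pagevec */
};

static struct toy_cpu toy_cpus[TOY_NR_CPUS];

/* Reader, first half: enter the protected region and sample the counter. */
static void toy_reader_enter(struct toy_cpu *cpu)
{
	cpu->in_section = true;
	cpu->saw_enabled = (toy_disable_count == 0);
}

/* Reader, second half: act on the sampled value, then leave the region. */
static void toy_reader_exit(struct toy_cpu *cpu)
{
	if (cpu->saw_enabled)
		cpu->cached_pages++;	/* page is batched in the per-CPU cache */
	cpu->in_section = false;	/* otherwise the page bypassed the cache */
}

/* Models a grace period: regions already in flight run to completion. */
static void toy_synchronize(void)
{
	for (int i = 0; i < TOY_NR_CPUS; i++)
		if (toy_cpus[i].in_section)
			toy_reader_exit(&toy_cpus[i]);
}

/* Flush every per-CPU cache. */
static void toy_drain_all(void)
{
	for (int i = 0; i < TOY_NR_CPUS; i++)
		toy_cpus[i].cached_pages = 0;
}

/*
 * One reader on CPU 0 is mid-region when the cache gets disabled.
 * Returns the number of pages still cached once everything settles.
 */
static int toy_scenario(bool wait_for_readers)
{
	int stale = 0;

	toy_disable_count = 0;
	for (int i = 0; i < TOY_NR_CPUS; i++)
		toy_cpus[i] = (struct toy_cpu){0};

	toy_reader_enter(&toy_cpus[0]);		/* samples count == 0 */
	assert(toy_cpus[0].in_section);

	toy_disable_count++;			/* atomic_inc() in the patch */
	if (wait_for_readers)
		toy_synchronize();		/* synchronize_rcu() in the patch */
	toy_drain_all();

	toy_synchronize();			/* let any straggler finish late */

	for (int i = 0; i < TOY_NR_CPUS; i++)
		stale += toy_cpus[i].cached_pages;
	return stale;
}
```

In this model, toy_scenario(true) follows the patched ordering and leaves no
page behind, while toy_scenario(false) skips the wait and leaves one page
cached by the reader that sampled the counter before the increment.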

Fixes the following hung task warning:

[ 1873.243925] INFO: task kworker/u160:0:9 blocked for more than 622 seconds.
[ 1873.243927]       Tainted: G          I      --------- ---  5.14.0-31.rt21.31.el9.x86_64 #1
[ 1873.243929] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1873.243929] task:kworker/u160:0  state:D stack:    0 pid:    9 ppid:     2 flags:0x00004000
[ 1873.243932] Workqueue: cpuset_migrate_mm cpuset_migrate_mm_workfn
[ 1873.243936] Call Trace:
[ 1873.243938]  __schedule+0x21b/0x5b0
[ 1873.243941]  schedule+0x43/0xe0
[ 1873.243943]  schedule_timeout+0x14d/0x190
[ 1873.243946]  ? resched_curr+0x20/0xe0
[ 1873.243953]  ? __prepare_to_swait+0x4b/0x70
[ 1873.243958]  wait_for_completion+0x84/0xe0
[ 1873.243962]  __flush_work.isra.0+0x146/0x200
[ 1873.243966]  ? flush_workqueue_prep_pwqs+0x130/0x130
[ 1873.243971]  __lru_add_drain_all+0x158/0x1f0
[ 1873.243978]  do_migrate_pages+0x3d/0x2d0
[ 1873.243985]  ? pick_next_task_fair+0x39/0x3b0
[ 1873.243989]  ? put_prev_task_fair+0x1e/0x30
[ 1873.243992]  ? pick_next_task+0xb30/0xbd0
[ 1873.243995]  ? __tick_nohz_task_switch+0x1e/0x70
[ 1873.244000]  ? raw_spin_rq_unlock+0x18/0x60
[ 1873.244002]  ? finish_task_switch.isra.0+0xc1/0x2d0
[ 1873.244005]  ? __switch_to+0x12f/0x510
[ 1873.244013]  cpuset_migrate_mm_workfn+0x22/0x40
[ 1873.244016]  process_one_work+0x1e0/0x410
[ 1873.244019]  worker_thread+0x50/0x3b0
[ 1873.244022]  ? process_one_work+0x410/0x410
[ 1873.244024]  kthread+0x173/0x190
[ 1873.244027]  ? set_kthread_struct+0x40/0x40
[ 1873.244031]  ret_from_fork+0x1f/0x30

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

v2: rt_spin_lock() calls rcu_read_lock(), so there is no need
to add it before local_lock in swap.c		(Nicolas Saenz Julienne)

diff --git a/mm/swap.c b/mm/swap.c
index bcf3ac288b56..48299a125d68 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -831,8 +831,7 @@ inline void __lru_add_drain_all(bool force_all_cpus)
 	for_each_online_cpu(cpu) {
 		struct work_struct *work = &per_cpu(lru_add_drain_work, cpu);
 
-		if (force_all_cpus ||
-		    pagevec_count(&per_cpu(lru_pvecs.lru_add, cpu)) ||
+		if (pagevec_count(&per_cpu(lru_pvecs.lru_add, cpu)) ||
 		    data_race(pagevec_count(&per_cpu(lru_rotate.pvec, cpu))) ||
 		    pagevec_count(&per_cpu(lru_pvecs.lru_deactivate_file, cpu)) ||
 		    pagevec_count(&per_cpu(lru_pvecs.lru_deactivate, cpu)) ||
@@ -876,14 +875,21 @@ atomic_t lru_disable_count = ATOMIC_INIT(0);
 void lru_cache_disable(void)
 {
 	atomic_inc(&lru_disable_count);
+	synchronize_rcu();
 #ifdef CONFIG_SMP
 	/*
-	 * lru_add_drain_all in the force mode will schedule draining on
-	 * all online CPUs so any calls of lru_cache_disabled wrapped by
-	 * local_lock or preemption disabled would be ordered by that.
-	 * The atomic operation doesn't need to have stronger ordering
-	 * requirements because that is enforced by the scheduling
-	 * guarantees.
+	 * synchronize_rcu() waits for preemption disabled
+	 * and RCU read side critical sections.
+	 * For the users of lru_disable_count:
+	 *
+	 * preempt_disable, local_irq_disable  [bh_lru_lock()]
+	 * rcu_read_lock		       [rt_spin_lock CONFIG_PREEMPT_RT]
+	 * preempt_disable		       [local_lock !CONFIG_PREEMPT_RT]
+	 *
+	 * so any calls of lru_cache_disabled wrapped by local_lock or
+	 * preemption disabled would be ordered by that. The atomic
+	 * operation doesn't need to have stronger ordering requirements
+	 * because that is enforced by the scheduling guarantees.
 	 */
 	__lru_add_drain_all(true);
 #else




end of thread, other threads:[~2022-06-22  0:16 UTC | newest]

Thread overview: 26+ messages
2022-02-22 16:01 [patch v2] mm: lru_cache_disable: replace work queue synchronization with synchronize_rcu Marcelo Tosatti
2022-02-22 16:07 ` [patch v3] " Marcelo Tosatti
2022-02-22 16:25   ` Nicolas Saenz Julienne
2022-03-04  1:03   ` Andrew Morton
2022-03-04  1:49     ` Paul E. McKenney
2022-03-04 15:08       ` Marcelo Tosatti
2022-03-04 16:02         ` Paul E. McKenney
2022-03-04 15:11     ` Marcelo Tosatti
2022-03-04 16:29   ` [patch v4] " Marcelo Tosatti
2022-03-05  0:35     ` Andrew Morton
2022-03-07 18:52       ` Marcelo Tosatti
2022-03-10 13:22       ` [patch v5] " Marcelo Tosatti
2022-03-11  2:23         ` Andrew Morton
2022-03-11  8:35           ` Sebastian Andrzej Siewior
2022-03-12  0:40             ` Andrew Morton
2022-03-12 20:39             ` Marcelo Tosatti
2022-03-13  9:23               ` Hillf Danton
2022-03-31 13:52         ` Borislav Petkov
2022-04-28 18:00           ` Marcelo Tosatti
2022-05-28 21:18             ` Andrew Morton
2022-05-28 22:54               ` Michael Larabel
2022-05-29  0:48                 ` Michael Larabel
2022-06-19 12:14                   ` Thorsten Leemhuis
2022-06-22  0:15                     ` Andrew Morton
2022-03-05  4:33     ` [patch v4] " Paul E. McKenney
2022-03-08 17:41     ` Minchan Kim
