All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Huang, Ying" <ying.huang@intel.com>
To: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>,
	 Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	 Vincent Guittot <vincent.guittot@linaro.org>,
	 Dietmar Eggemann <dietmar.eggemann@arm.com>,
	 Steven Rostedt <rostedt@goodmis.org>,
	 Ben Segall <bsegall@google.com>,  Mel Gorman <mgorman@suse.de>,
	 Daniel Bristot de Oliveira <bristot@redhat.com>,
	 Valentin Schneider <vschneid@redhat.com>,
	 linux-kernel@vger.kernel.org,
	 Luca Abeni <luca.abeni@santannapisa.it>,
	 Tommaso Cucinotta <tommaso.cucinotta@santannapisa.it>,
	 Thomas Gleixner <tglx@linutronix.de>,
	 Joel Fernandes <joel@joelfernandes.org>,
	 Vineeth Pillai <vineeth@bitbyteword.org>,
	 Shuah Khan <skhan@linuxfoundation.org>,
	Phil Auld <pauld@redhat.com>, Aaron Lu <aaron.lu@intel.com>,
	Kairui Song <kasong@tencent.com>,
	Guo Ziliang <guo.ziliang@zte.com.cn>
Subject: Re: [PATCH v5 0/7] SCHED_DEADLINE server infrastructure
Date: Mon, 19 Feb 2024 15:33:30 +0800	[thread overview]
Message-ID: <8734tosyb9.fsf@yhuang6-desk2.ccr.corp.intel.com> (raw)
In-Reply-To: <cover.1699095159.git.bristot@kernel.org> (Daniel Bristot de Oliveira's message of "Sat, 4 Nov 2023 11:59:17 +0100")

Hi, Daniel,

Thanks a lot for your great patchset!

We have a similar starvation issue in mm subsystem too.  Details are in
the patch description of the below commit.  In short, task A is busy
looping on some event, while task B will signal the event after some
work.  If the priority of task A is higher than that of task B, task B
may be starved.

IIUC, if task A is RT task while task B is fair task, then your patchset
will solve the issue.  If both task A and task B is RT tasks, is there
some way to solve the issue?

Now, we use a ugly schedule_timeout_uninterruptible(1) in the loop to
resolve the issue, is there something better?

Best Regards,
Huang, Ying

--------------------------8<---------------------------------------
commit 029c4628b2eb2ca969e9bf979b05dc18d8d5575e
Author: Guo Ziliang <guo.ziliang@zte.com.cn>
Date:   Wed Mar 16 16:15:03 2022 -0700

    mm: swap: get rid of livelock in swapin readahead
    
    In our testing, a livelock task was found.  Through sysrq printing, same
    stack was found every time, as follows:
    
      __swap_duplicate+0x58/0x1a0
      swapcache_prepare+0x24/0x30
      __read_swap_cache_async+0xac/0x220
      read_swap_cache_async+0x58/0xa0
      swapin_readahead+0x24c/0x628
      do_swap_page+0x374/0x8a0
      __handle_mm_fault+0x598/0xd60
      handle_mm_fault+0x114/0x200
      do_page_fault+0x148/0x4d0
      do_translation_fault+0xb0/0xd4
      do_mem_abort+0x50/0xb0
    
    The reason for the livelock is that swapcache_prepare() always returns
    EEXIST, indicating that SWAP_HAS_CACHE has not been cleared, so that it
    cannot jump out of the loop.  We suspect that the task that clears the
    SWAP_HAS_CACHE flag never gets a chance to run.  We try to lower the
    priority of the task stuck in a livelock so that the task that clears
    the SWAP_HAS_CACHE flag will run.  The results show that the system
    returns to normal after the priority is lowered.
    
    In our testing, multiple real-time tasks are bound to the same core, and
    the task in the livelock is the highest priority task of the core, so
    the livelocked task cannot be preempted.
    
    Although cond_resched() is used by __read_swap_cache_async, it is an
    empty function in the preemptive system and cannot achieve the purpose
    of releasing the CPU.  A high-priority task cannot release the CPU
    unless preempted by a higher-priority task.  But when this task is
    already the highest priority task on this core, other tasks will not be
    able to be scheduled.  So we think we should replace cond_resched() with
    schedule_timeout_uninterruptible(1), schedule_timeout_interruptible will
    call set_current_state first to set the task state, so the task will be
    removed from the running queue, so as to achieve the purpose of giving
    up the CPU and prevent it from running in kernel mode for too long.
    
    (akpm: ugly hack becomes uglier.  But it fixes the issue in a
    backportable-to-stable fashion while we hopefully work on something
    better)
    
    Link: https://lkml.kernel.org/r/20220221111749.1928222-1-cgel.zte@gmail.com
    Signed-off-by: Guo Ziliang <guo.ziliang@zte.com.cn>
    Reported-by: Zeal Robot <zealci@zte.com.cn>
    Reviewed-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
    Reviewed-by: Jiang Xuexin <jiang.xuexin@zte.com.cn>
    Reviewed-by: Yang Yang <yang.yang29@zte.com.cn>
    Acked-by: Hugh Dickins <hughd@google.com>
    Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Roger Quadros <rogerq@kernel.org>
    Cc: Ziliang Guo <guo.ziliang@zte.com.cn>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

diff --git a/mm/swap_state.c b/mm/swap_state.c
index 8d4104242100..ee67164531c0 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -478,7 +478,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 		 * __read_swap_cache_async(), which has set SWAP_HAS_CACHE
 		 * in swap_map, but not yet added its page to swap cache.
 		 */
-		cond_resched();
+		schedule_timeout_uninterruptible(1);
 	}
 
 	/*

  parent reply	other threads:[~2024-02-19  7:35 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-04 10:59 [PATCH v5 0/7] SCHED_DEADLINE server infrastructure Daniel Bristot de Oliveira
2023-11-04 10:59 ` [PATCH v5 1/7] sched: Unify runtime accounting across classes Daniel Bristot de Oliveira
2023-11-15  9:04   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2023-11-04 10:59 ` [PATCH v5 2/7] sched/deadline: Collect sched_dl_entity initialization Daniel Bristot de Oliveira
2023-11-15  9:04   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2023-11-04 10:59 ` [PATCH v5 3/7] sched/deadline: Move bandwidth accounting into {en,de}queue_dl_entity Daniel Bristot de Oliveira
2023-11-15  9:04   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2023-11-04 10:59 ` [PATCH v5 4/7] sched/deadline: Introduce deadline servers Daniel Bristot de Oliveira
2023-11-15  9:04   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2023-11-04 10:59 ` [PATCH v5 5/7] sched/fair: Add trivial fair server Daniel Bristot de Oliveira
2023-11-06 14:24   ` Peter Zijlstra
2023-11-06 14:26     ` Daniel Bristot de Oliveira
2023-11-04 10:59 ` [PATCH v5 6/7] sched/deadline: Deferrable dl server Daniel Bristot de Oliveira
2023-11-06 14:55   ` Peter Zijlstra
2023-11-06 17:05     ` Daniel Bristot de Oliveira
2023-11-06 19:32   ` Joel Fernandes
2023-11-06 21:32     ` Joel Fernandes
2023-11-06 21:37       ` Joel Fernandes
2023-11-07 11:58         ` Daniel Bristot de Oliveira
2023-11-08  2:42           ` Joel Fernandes
2023-11-07 16:47         ` Steven Rostedt
2023-11-07 17:35           ` Steven Rostedt
2023-11-07 17:46             ` Steven Rostedt
2023-11-07 17:54             ` Steven Rostedt
2023-11-07 19:32               ` Steven Rostedt
2023-11-07 20:07                 ` Steven Rostedt
2023-11-07 17:37           ` Daniel Bristot de Oliveira
2023-11-07 18:50             ` Daniel Bristot de Oliveira
2023-11-08  3:20               ` Joel Fernandes
2023-11-08  8:01                 ` Daniel Bristot de Oliveira
2023-11-08 18:25                   ` Joel Fernandes
2023-11-08 12:44               ` Peter Zijlstra
2023-11-08 12:50                 ` Peter Zijlstra
2023-11-08 14:52                   ` Daniel Bristot de Oliveira
2023-11-08 13:46                 ` Daniel Bristot de Oliveira
2023-11-08 13:58                 ` Daniel Bristot de Oliveira
2023-11-08 15:14                 ` Juri Lelli
2023-11-08 16:57                   ` Peter Zijlstra
2023-11-08  2:37           ` Joel Fernandes
2023-11-07  7:30     ` Daniel Bristot de Oliveira
2023-11-07 16:37   ` Steven Rostedt
2023-11-13 15:05   ` kernel test robot
2024-03-20  0:03   ` Joel Fernandes
2024-03-20 19:24     ` Daniel Bristot de Oliveira
2024-03-21 16:15       ` Joel Fernandes
2024-03-23 14:37         ` Joel Fernandes
2024-04-05 14:35         ` Daniel Bristot de Oliveira
2024-04-08 17:11           ` Steven Rostedt
2023-11-04 10:59 ` [PATCH v5 7/7] sched/fair: Fair server interface Daniel Bristot de Oliveira
2023-11-04 15:18   ` kernel test robot
2023-11-05  0:55   ` kernel test robot
2023-11-06 15:40   ` Peter Zijlstra
2023-11-06 16:29     ` Daniel Bristot de Oliveira
2023-11-07  8:16       ` Peter Zijlstra
2023-11-07 14:06         ` Daniel Bristot de Oliveira
2023-11-07 14:44       ` Peter Zijlstra
2023-11-07 12:38   ` Peter Zijlstra
2023-11-07 13:24     ` Daniel Bristot de Oliveira
2024-01-19  1:49   ` Joel Fernandes
2024-01-19  1:55   ` Joel Fernandes
2024-01-22 14:14     ` Daniel Bristot de Oliveira
2024-01-23 15:39       ` Joel Fernandes
2024-01-23 15:44       ` Joel Fernandes
2024-02-13  2:13   ` Joel Fernandes
2024-02-13  2:21     ` Joel Fernandes
2024-02-14 14:23     ` Daniel Bristot de Oliveira
2024-02-15 13:57       ` Joel Fernandes
2024-02-15 17:27         ` Daniel Bristot de Oliveira
2024-02-15 17:41           ` Joel Fernandes
2024-04-04 17:43             ` Daniel Bristot de Oliveira
2023-12-08 21:47 ` [PATCH v5 0/7] SCHED_DEADLINE server infrastructure Joel Fernandes
2024-02-19  7:33 ` Huang, Ying [this message]
2024-02-19 10:23   ` Daniel Bristot de Oliveira
2024-02-20  3:28     ` Huang, Ying
2024-02-20  8:31       ` Daniel Bristot de Oliveira
2024-02-20  8:41         ` Huang, Ying

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8734tosyb9.fsf@yhuang6-desk2.ccr.corp.intel.com \
    --to=ying.huang@intel.com \
    --cc=aaron.lu@intel.com \
    --cc=bristot@kernel.org \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=guo.ziliang@zte.com.cn \
    --cc=joel@joelfernandes.org \
    --cc=juri.lelli@redhat.com \
    --cc=kasong@tencent.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luca.abeni@santannapisa.it \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=pauld@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=skhan@linuxfoundation.org \
    --cc=tglx@linutronix.de \
    --cc=tommaso.cucinotta@santannapisa.it \
    --cc=vincent.guittot@linaro.org \
    --cc=vineeth@bitbyteword.org \
    --cc=vschneid@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.