From: Qi Zheng <qi.zheng@linux.dev>
To: RCU <rcu@vger.kernel.org>, Yujie Liu <yujie.liu@intel.com>
Cc: oe-lkp@lists.linux.dev, lkp@intel.com,
	linux-kernel@vger.kernel.org,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Vlastimil Babka" <vbabka@suse.cz>, "Kirill Tkhai" <tkhai@ya.ru>,
	"Roman Gushchin" <roman.gushchin@linux.dev>,
	"Christian König" <christian.koenig@amd.com>,
	"David Hildenbrand" <david@redhat.com>,
	"Davidlohr Bueso" <dave@stgolabs.net>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Muchun Song" <muchun.song@linux.dev>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	"Shakeel Butt" <shakeelb@google.com>,
	"Yang Shi" <shy828301@gmail.com>,
	linux-mm@kvack.org, ying.huang@intel.com, feng.tang@intel.com,
	fengwei.yin@intel.com
Subject: Re: [linus:master] [mm] f95bdb700b: stress-ng.ramfs.ops_per_sec -88.8% regression
Date: Thu, 25 May 2023 12:03:16 +0800
Message-ID: <be04dc3e-a671-ec70-6cf6-70dc702f4184@linux.dev>
In-Reply-To: <bfb36563-fac9-4c84-96db-87dd28892088@linux.dev>



On 2023/5/24 19:56, Qi Zheng wrote:
> 
> 
> On 2023/5/24 19:08, Qi Zheng wrote:
> 
> [...]
> 
>>
>> Well, I just ran the following command and reproduced the result:
>>
>> stress-ng --timeout 60 --times --verify --metrics-brief --ramfs 9 &
>>
>> 1) with commit 42c9db3970483:
>>
>> stress-ng: info:  [11023] setting to a 60 second run per stressor
>> stress-ng: info:  [11023] dispatching hogs: 9 ramfs
>> stress-ng: info:  [11023] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s
>> stress-ng: info:  [11023]                           (secs)    (secs)    (secs)   (real time) (usr+sys time)
>> stress-ng: info:  [11023] ramfs            774966     60.00     10.18    169.45     12915.89        4314.26
>> stress-ng: info:  [11023] for a 60.00s run time:
>> stress-ng: info:  [11023]    1920.11s available CPU time
>> stress-ng: info:  [11023]      10.18s user time   (  0.53%)
>> stress-ng: info:  [11023]     169.44s system time (  8.82%)
>> stress-ng: info:  [11023]     179.62s total time  (  9.35%)
>> stress-ng: info:  [11023] load average: 8.99 2.69 0.93
>> stress-ng: info:  [11023] successful run completed in 60.00s (1 min, 0.00 secs)
>>
>> 2) with commit f95bdb700bc6b:
>>
>> stress-ng: info:  [37676] dispatching hogs: 9 ramfs
>> stress-ng: info:  [37676] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s
>> stress-ng: info:  [37676]                           (secs)    (secs)    (secs)   (real time) (usr+sys time)
>> stress-ng: info:  [37676] ramfs            168673     60.00      1.61     39.66      2811.08        4087.47
>> stress-ng: info:  [37676] for a 60.10s run time:
>> stress-ng: info:  [37676]    1923.36s available CPU time
>> stress-ng: info:  [37676]       1.60s user time   (  0.08%)
>> stress-ng: info:  [37676]      39.66s system time (  2.06%)
>> stress-ng: info:  [37676]      41.26s total time  (  2.15%)
>> stress-ng: info:  [37676] load average: 7.69 3.63 2.36
>> stress-ng: info:  [37676] successful run completed in 60.10s (1 min, 0.10 secs)
>>
>> The bogo ops/s (real time) did drop significantly.
>>
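
To put numbers on the drop, here is a rough sketch that computes the
relative change between the two runs quoted above. The helper name
pct_change is made up for illustration; the inputs are copied from the
stress-ng output.

```shell
#!/bin/sh
# pct_change BASE CUR: print the relative change from BASE to CUR in percent.
# (Hypothetical helper, not part of stress-ng or the kernel tree.)
pct_change() {
    awk -v base="$1" -v cur="$2" \
        'BEGIN { printf "%.1f\n", (cur - base) / base * 100 }'
}

# bogo ops/s (real time): 42c9db3970483 vs f95bdb700bc6b
pct_change 12915.89 2811.08    # prints -78.2
# bogo ops/s (usr+sys time): same two runs
pct_change 4314.26 4087.47     # prints -5.3
```

Throughput per unit of CPU time is nearly unchanged while wall-clock
throughput collapses, which is what one would expect if the workers
spend most of the run asleep rather than doing more work per operation.
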
>> Also, memory reclaim was never triggered during the run, so
>> theoretically no one was in a read-side critical section of shrinker_srcu.
>>
>> Then I found that some stress-ng-ramfs processes were in
>> TASK_UNINTERRUPTIBLE state for a long time:
>>
>> root       42313  0.0  0.0  69592  2068 pts/0    S    19:00   0:00 stress-ng-ramfs [run]
>> root       42314  0.0  0.0  69592  2068 pts/0    S    19:00   0:00 stress-ng-ramfs [run]
>> root       42315  0.0  0.0  69592  2068 pts/0    S    19:00   0:00 stress-ng-ramfs [run]
>> root       42316  0.0  0.0  69592  2068 pts/0    S    19:00   0:00 stress-ng-ramfs [run]
>> root       42317  7.8  0.0  69592  1812 pts/0    D    19:00   0:02 stress-ng-ramfs [run]
>> root       42318  0.0  0.0  69592  2068 pts/0    S    19:00   0:00 stress-ng-ramfs [run]
>> root       42319  7.8  0.0  69592  1812 pts/0    D    19:00   0:02 stress-ng-ramfs [run]
>> root       42320  0.0  0.0  69592  2068 pts/0    S    19:00   0:00 stress-ng-ramfs [run]
>> root       42321  7.8  0.0  69592  1812 pts/0    D    19:00   0:02 stress-ng-ramfs [run]
>> root       42322  0.0  0.0  69592  2068 pts/0    S    19:00   0:00 stress-ng-ramfs [run]
>> root       42323  7.8  0.0  69592  1812 pts/0    D    19:00   0:02 stress-ng-ramfs [run]
>> root       42324  0.0  0.0  69592  2068 pts/0    S    19:00   0:00 stress-ng-ramfs [run]
>> root       42325  7.8  0.0  69592  1812 pts/0    D    19:00   0:02 stress-ng-ramfs [run]
>> root       42326  0.0  0.0  69592  2068 pts/0    S    19:00   0:00 stress-ng-ramfs [run]
>> root       42327  7.9  0.0  69592  1812 pts/0    D    19:00   0:02 stress-ng-ramfs [run]
>> root       42328  7.9  0.0  69592  1812 pts/0    D    19:00   0:02 stress-ng-ramfs [run]
>> root       42329  7.9  0.0  69592  1812 pts/0    D    19:00   0:02 stress-ng-ramfs [run]
>> root       42330  7.9  0.0  69592  1556 pts/0    D    19:00   0:02 stress-ng-ramfs [run]
>>
>> Their call stack is as follows:
>>
>> cat /proc/42330/stack
>>
>> [<0>] __synchronize_srcu.part.21+0x83/0xb0
>> [<0>] unregister_shrinker+0x85/0xb0
>> [<0>] deactivate_locked_super+0x27/0x70
>> [<0>] cleanup_mnt+0xb8/0x140
>> [<0>] task_work_run+0x65/0x90
>> [<0>] exit_to_user_mode_prepare+0x1ba/0x1c0
>> [<0>] syscall_exit_to_user_mode+0x1b/0x40
>> [<0>] do_syscall_64+0x44/0x80
>> [<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
>>
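
For anyone who wants to collect the stacks of all blocked workers
rather than one PID at a time, something along these lines may help.
It is only a sketch: dstate_pids is a made-up helper, and it assumes
the task's comm is exactly "stress-ng-ramfs", which I have not checked
against the "stress-ng-ramfs [run]" title shown by ps above.

```shell
#!/bin/sh
# dstate_pids NAME: read "PID STAT COMM" lines on stdin and print the PIDs
# of tasks named NAME whose state starts with D (uninterruptible sleep).
# (Hypothetical helper for illustration.)
dstate_pids() {
    awk -v name="$1" '$2 ~ /^D/ && $3 == name { print $1 }'
}

# Sample input in the shape produced by: ps -eo pid=,stat=,comm=
dstate_pids stress-ng-ramfs <<'EOF'
42316 S stress-ng-ramfs
42317 D stress-ng-ramfs
42330 D stress-ng-ramfs
EOF
# prints 42317 and 42330, one per line
```

On a live system (as root, since /proc/<pid>/stack is not readable by
ordinary users) the PIDs can be fed straight into cat:

	ps -eo pid=,stat=,comm= | dstate_pids stress-ng-ramfs |
		while read -r pid; do echo "== $pid"; cat "/proc/$pid/stack"; done
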
>> + RCU folks: is this result expected? I would have thought that
>> synchronize_srcu() would return quickly if no one is in a read-side
>> critical section. :(
>>
> 
> With the following change, ops/s returns to roughly the previous level:

Or just set rcu_expedited to 1:
	echo 1 > /sys/kernel/rcu_expedited
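
Since that knob is system-wide, it is probably worth restoring the old
value after an A/B run. A minimal sketch of a wrapper that does this
is below; with_rcu_expedited and the RCU_EXPEDITED_KNOB override are
names I made up so the helper can be tried against a scratch file, they
are not kernel interfaces.

```shell
#!/bin/sh
# with_rcu_expedited CMD...: run CMD with rcu_expedited set to 1, then put
# the previous value back and return CMD's exit status.
# RCU_EXPEDITED_KNOB is a hypothetical override used only for dry runs.
with_rcu_expedited() {
    knob="${RCU_EXPEDITED_KNOB:-/sys/kernel/rcu_expedited}"
    old="$(cat "$knob")" || return 1
    echo 1 > "$knob" || return 1
    "$@"
    rc=$?
    echo "$old" > "$knob"
    return "$rc"
}
```

Run as root, the reproducer then becomes:

	with_rcu_expedited stress-ng --timeout 60 --times --verify --metrics-brief --ramfs 9
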

> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index db2ed6e08f67..90f541b07cd1 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -763,7 +763,7 @@ void unregister_shrinker(struct shrinker *shrinker)
>          debugfs_entry = shrinker_debugfs_remove(shrinker);
>          up_write(&shrinker_rwsem);
> 
> -       synchronize_srcu(&shrinker_srcu);
> +       synchronize_srcu_expedited(&shrinker_srcu);
> 
>          debugfs_remove_recursive(debugfs_entry);
> 
> stress-ng: info:  [13159] dispatching hogs: 9 ramfs
> stress-ng: info:  [13159] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s
> stress-ng: info:  [13159]                           (secs)    (secs)    (secs)   (real time) (usr+sys time)
> stress-ng: info:  [13159] ramfs            710062     60.00      9.63    157.26     11834.18        4254.75
> stress-ng: info:  [13159] for a 60.00s run time:
> stress-ng: info:  [13159]    1920.14s available CPU time
> stress-ng: info:  [13159]       9.62s user time   (  0.50%)
> stress-ng: info:  [13159]     157.26s system time (  8.19%)
> stress-ng: info:  [13159]     166.88s total time  (  8.69%)
> stress-ng: info:  [13159] load average: 9.49 4.02 1.65
> stress-ng: info:  [13159] successful run completed in 60.00s (1 min, 0.00 secs)
> 
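
To check how close the patched run gets to the 42c9db3970483 baseline,
the ramfs rows can be compared directly. This is a sketch under the
assumption that the metrics row has the ten whitespace-separated fields
seen in this thread; ramfs_rate is a made-up helper name.

```shell
#!/bin/sh
# ramfs_rate: read unwrapped stress-ng --metrics-brief output on stdin and
# print the "bogo ops/s (real time)" column of the ramfs row.
# Field positions assume the layout quoted in this thread:
#   stress-ng: info: [PID] ramfs OPS REAL USR SYS OPS/S(real) OPS/S(usr+sys)
ramfs_rate() {
    awk '$4 == "ramfs" && NF == 10 { print $9 }'
}

base=$(echo 'stress-ng: info:  [11023] ramfs  774966  60.00  10.18  169.45  12915.89  4314.26' | ramfs_rate)
patched=$(echo 'stress-ng: info:  [13159] ramfs  710062  60.00  9.63  157.26  11834.18  4254.75' | ramfs_rate)

# Relative change of the patched run against the baseline, in percent.
awk -v base="$base" -v cur="$patched" \
    'BEGIN { printf "%.1f\n", (cur - base) / base * 100 }'
# prints -8.4
```

So with synchronize_srcu_expedited() this single run lands within
roughly 8% of the baseline on my machine; I have not measured
run-to-run noise.
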
> Can we make synchronize_srcu() call synchronize_srcu_expedited() when no
> one is in a read-side critical section?
> 
>>
> 

-- 
Thanks,
Qi
