From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: linux-kernel@vger.kernel.org, linux-ia64@vger.kernel.org
Cc: Andy Lutomirski <luto@kernel.org>, Ben Segall <bsegall@google.com>,
    Daniel Bristot de Oliveira <bristot@redhat.com>,
    Dietmar Eggemann <dietmar.eggemann@arm.com>,
    Ingo Molnar <mingo@redhat.com>, Juri Lelli <juri.lelli@redhat.com>,
    Peter Zijlstra <peterz@infradead.org>,
    Steven Rostedt <rostedt@goodmis.org>,
    Thomas Gleixner <tglx@linutronix.de>,
    Vincent Guittot <vincent.guittot@linaro.org>
Subject: [PATCH REPOST 0/8] kernel/fork: Move thread stack free out of the scheduler path.
Date: Tue, 25 Jan 2022 16:26:44 +0100
Message-ID: <20220125152652.1963111-1-bigeasy@linutronix.de>

[ This is a repost of https://lkml.kernel.org/r/20211118143452.136421-1-bigeasy@linutronix.de ]

This is a follow-up on the patch
    sched: Delay task stack freeing on RT
    https://lkml.kernel.org/r/20210928122411.593486363@linutronix.de

It addresses the review feedback:
- Decouple stack accounting from its free invocation. The accounting
  happens in do_exit(), the final free call happens later.

- Add put_task_stack_sched() to finish_task_switch(). Here the VMAP
  stack is cached only. If that fails, or in the !VMAP case, the final
  free happens in delayed_put_task_struct(). This is also an opportunity
  to cache the stack.

From testing I observe the following:

| bash-1715 [006] ..... 124.901510: copy_process: allocC ffffc90007e70000
| sh-cmds.sh-1746 [007] ..... 124.907389: copy_process: allocC ffffc90007dc4000
| <idle>-0 [019] ...1. 124.918126: free_thread_stack: cache ffffc90007dc4000
| sh-cmds.sh-1746 [007] ..... 124.918279: copy_process: allocC ffffc90007de8000
| <idle>-0 [004] ...1. 124.920121: free_thread_stack: delay ffffc90007de8001
| <idle>-0 [007] ...1. 124.920299: free_thread_stack: cache ffffc90007e70000
| <idle>-0 [007] ..s1. 124.945433: free_thread_stack: cache ffffc90007de8000

TS 124.901510, bash started sh-cmds.sh, obtained stack from cache.
TS 124.907389, script invokes its first command, obtained stack from cache.
     As you can see, bash was running on CPU6 but its child was moved
     to CPU7.
TS 124.918126, the first command is done, its stack is cached on CPU19.
TS 124.918279, script's second command, stack from cache.
TS 124.920121, the command is done. The stack cache on CPU4 is full, so
     the free is delayed.
TS 124.920299, the script is done, its stack is cached on CPU7.
TS 124.945433, the RCU callback of the last command now runs on CPU7,
     which is where the command was invoked (but not where it was
     running). Instead of freeing the stack, it was cached since CPU7
     had an empty slot.

If I pin the script to CPU5 and run it with multiple commands then it
works as expected:

| bash-1799 [005] ..... 993.608131: copy_process: allocC ffffc90007fa0000
| sh-cmds.sh-1827 [005] ..... 993.608888: copy_process: allocC ffffc90007fa8000
| sh-cmds.sh-1827 [005] ..... 993.610734: copy_process: allocV ffffc90007ff4000
| sh-cmds.sh-1829 [005] ...1. 993.610757: free_thread_stack: cache ffffc90007fa8000
| sh-cmds.sh-1827 [005] ..... 993.612401: copy_process: allocC ffffc90007fa8000
| <...>-1830 [005] ...1. 993.612416: free_thread_stack: cache ffffc90007ff4000
| sh-cmds.sh-1827 [005] ..... 993.613707: copy_process: allocC ffffc90007ff4000
| sh-cmds.sh-1831 [005] ...1. 993.613723: free_thread_stack: cache ffffc90007fa8000
| sh-cmds.sh-1827 [005] ..... 993.615024: copy_process: allocC ffffc90007fa8000
| <...>-1832 [005] ...1. 993.615040: free_thread_stack: cache ffffc90007ff4000
| sh-cmds.sh-1827 [005] ..... 993.616380: copy_process: allocC ffffc90007ff4000
| <...>-1833 [005] ...1. 993.616397: free_thread_stack: cache ffffc90007fa8000
| bash-1799 [005] ...1. 993.617759: free_thread_stack: cache ffffc90007fa0000
| <idle>-0 [005] ...1. 993.617871: free_thread_stack: delay ffffc90007ff4001
| <idle>-0 [005] ..s1. 993.638311: free_thread_stack: free ffffc90007ff4000

and no new stack is allocated during its runtime; a cached stack is used
instead.

Sebastian
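The two-stage free described in the cover letter can be illustrated with a
toy model. This is only a sketch in Python, not the kernel code: the class
StackCacheModel, its method names, the event names "alloc-cached" and
"alloc-new", the two-slot cache size and the fake stack addresses are all
assumptions made for illustration. Stage 1 stands in for the scheduler path,
which may only cache a stack. Stage 2 stands in for the delayed callback,
which tries to cache once more and frees the stack only if that fails.

```python
SLOTS_PER_CPU = 2  # assumed cache size for this model only


class StackCacheModel:
    """Toy model of a per-CPU stack cache with a two-stage free."""

    def __init__(self, nr_cpus):
        self.cache = {cpu: [] for cpu in range(nr_cpus)}
        self.events = []          # (kind, cpu, stack) tuples, in order
        self.next_addr = 0x1000   # fake addresses, for illustration

    def _try_cache(self, cpu, stack):
        # Caching succeeds only while this CPU still has an empty slot.
        if len(self.cache[cpu]) < SLOTS_PER_CPU:
            self.cache[cpu].append(stack)
            return True
        return False

    def alloc(self, cpu):
        # Prefer a cached stack; otherwise hand out a fresh fake address.
        if self.cache[cpu]:
            stack = self.cache[cpu].pop()
            self.events.append(("alloc-cached", cpu, stack))
        else:
            stack = self.next_addr
            self.next_addr += 0x1000
            self.events.append(("alloc-new", cpu, stack))
        return stack

    def free_in_sched(self, cpu, stack):
        # Stage 1: cache only. Returns True if stage 2 is still needed.
        if self._try_cache(cpu, stack):
            self.events.append(("cache", cpu, stack))
            return False
        self.events.append(("delay", cpu, stack))
        return True

    def free_in_rcu(self, cpu, stack):
        # Stage 2: second chance to cache, otherwise the real free.
        kind = "cache" if self._try_cache(cpu, stack) else "free"
        self.events.append((kind, cpu, stack))


# Scenario 1: a slot opens up between stage 1 and stage 2.
model = StackCacheModel(nr_cpus=1)
a, b, c = model.alloc(0), model.alloc(0), model.alloc(0)
model.free_in_sched(0, a)
model.free_in_sched(0, b)
needs_rcu = model.free_in_sched(0, c)   # cache is full, free is delayed
model.alloc(0)                          # a new task takes a cached stack
if needs_rcu:
    model.free_in_rcu(0, c)             # a slot is empty again: cached

# Scenario 2: the cache is still full when stage 2 runs.
full = StackCacheModel(nr_cpus=1)
x, y, z = full.alloc(0), full.alloc(0), full.alloc(0)
full.free_in_sched(0, x)
full.free_in_sched(0, y)
if full.free_in_sched(0, z):
    full.free_in_rcu(0, z)              # no empty slot: really freed

for kind, cpu, stack in model.events + full.events:
    print(f"{kind:12s} cpu={cpu} stack={stack:#x}")
```

The first scenario corresponds to a delayed free that ends up cached because
a slot became empty in the meantime; the second corresponds to a delayed free
that ends in an actual free.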
Thread overview (40+ messages in the archive, duplicate entries omitted):

2022-01-25 15:26 Sebastian Andrzej Siewior [this message]
2022-01-25 15:26 ` [PATCH 1/8] kernel/fork: Redo ifdefs around task's handling Sebastian Andrzej Siewior
2022-01-25 15:26 ` [PATCH 2/8] kernel/fork: Duplicate task_struct before stack allocation Sebastian Andrzej Siewior
2022-02-11 23:42   ` Andy Lutomirski
2022-02-14 11:39   ` Sebastian Andrzej Siewior
2022-01-25 15:26 ` [PATCH 3/8] kernel/fork, IA64: Provide a alloc_thread_stack_node() for IA64 Sebastian Andrzej Siewior
2022-02-14 18:00   ` Sebastian Andrzej Siewior
2022-01-25 15:26 ` [PATCH 4/8] kernel/fork: Don't assign the stack pointer in dup_task_struct() Sebastian Andrzej Siewior
2022-01-25 15:26 ` [PATCH 5/8] kernel/fork: Move memcg_charge_kernel_stack() into CONFIG_VMAP_STACK Sebastian Andrzej Siewior
2022-01-25 15:26 ` [PATCH 6/8] kernel/fork: Move task stack account to do_exit() Sebastian Andrzej Siewior
2022-02-11 23:43   ` Andy Lutomirski
2022-01-25 15:26 ` [PATCH 7/8] kernel/fork: Only cache the VMAP stack in finish_task_switch() Sebastian Andrzej Siewior
2022-02-11 23:55   ` Andy Lutomirski
2022-02-14 12:10   ` Sebastian Andrzej Siewior
2022-02-14 12:24   ` Sebastian Andrzej Siewior
2022-02-14 16:54   ` Sebastian Andrzej Siewior
2022-02-14 17:48   ` Sebastian Andrzej Siewior
2022-02-14 18:15   ` [PATCH v2 " Sebastian Andrzej Siewior
2022-01-25 15:26 ` [PATCH 8/8] kernel/fork: Use IS_ENABLED() in account_kernel_stack() Sebastian Andrzej Siewior
2022-02-08 17:10 ` [PATCH REPOST 0/8] kernel/fork: Move thread stack free out of the scheduler path Sebastian Andrzej Siewior