* Re: [PATCH v2 1/4] mm/memcg: Revert ("mm/memcg: optimize user context object stock access")
@ 2022-02-12 10:39 kernel test robot
  0 siblings, 0 replies; 7+ messages in thread
From: kernel test robot @ 2022-02-12 10:39 UTC (permalink / raw)
To: kbuild

[-- Attachment #1: Type: text/plain, Size: 6849 bytes --]

CC: kbuild-all@lists.01.org
In-Reply-To: <20220211223537.2175879-2-bigeasy@linutronix.de>
References: <20220211223537.2175879-2-bigeasy@linutronix.de>
TO: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
TO: cgroups@vger.kernel.org
TO: linux-mm@kvack.org
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux Memory Management List <linux-mm@kvack.org>
CC: Johannes Weiner <hannes@cmpxchg.org>
CC: Michal Hocko <mhocko@kernel.org>
CC: "Michal Koutný" <mkoutny@suse.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Vladimir Davydov <vdavydov.dev@gmail.com>
CC: Waiman Long <longman@redhat.com>
CC: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Hi Sebastian,

I love your patch! Perhaps something to improve:

[auto build test WARNING on v5.17-rc3]
[also build test WARNING on next-20220211]
[cannot apply to hnaz-mm/master tj-cgroup/for-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Sebastian-Andrzej-Siewior/mm-memcg-Address-PREEMPT_RT-problems-instead-of-disabling-it/20220212-063701
base:   dfd42facf1e4ada021b939b4e19c935dcdd55566
:::::: branch date: 11 hours ago
:::::: commit date: 11 hours ago
config: arc-randconfig-m031-20220211 (https://download.01.org/0day-ci/archive/20220212/202202121755.VaCaguQX-lkp@intel.com/config)
compiler: arc-elf-gcc (GCC) 11.2.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

New smatch warnings:
mm/memcontrol.c:6826 uncharge_folio() error: uninitialized symbol 'objcg'.

Old smatch warnings:
arch/arc/include/asm/thread_info.h:65 current_thread_info() error: uninitialized symbol 'sp'.

vim +/objcg +6826 mm/memcontrol.c

747db954cab64c Johannes Weiner 2014-08-08  6779  
c4ed6ebfcb0929 Matthew Wilcox (Oracle 2021-06-29  6780) static void uncharge_folio(struct folio *folio, struct uncharge_gather *ug)
747db954cab64c Johannes Weiner 2014-08-08  6781  {
c4ed6ebfcb0929 Matthew Wilcox (Oracle 2021-06-29  6782) 	long nr_pages;
b4e0b68fbd9d1f Muchun Song 2021-04-29  6783  	struct mem_cgroup *memcg;
b4e0b68fbd9d1f Muchun Song 2021-04-29  6784  	struct obj_cgroup *objcg;
9f762dbe19b9f1 Johannes Weiner 2020-06-03  6785  
c4ed6ebfcb0929 Matthew Wilcox (Oracle 2021-06-29  6786) 	VM_BUG_ON_FOLIO(folio_test_lru(folio), folio);
747db954cab64c Johannes Weiner 2014-08-08  6787  
747db954cab64c Johannes Weiner 2014-08-08  6788  	/*
747db954cab64c Johannes Weiner 2014-08-08  6789  	 * Nobody should be changing or seriously looking at
c4ed6ebfcb0929 Matthew Wilcox (Oracle 2021-06-29  6790) 	 * folio memcg or objcg at this point, we have fully
c4ed6ebfcb0929 Matthew Wilcox (Oracle 2021-06-29  6791) 	 * exclusive access to the folio.
747db954cab64c Johannes Weiner 2014-08-08  6792  	 */
a6062be9ef6494 Michal Hocko 2022-02-11  6793  	if (folio_memcg_kmem(folio)) {
1b7e4464d43a48 Matthew Wilcox (Oracle 2021-06-28  6794) 		objcg = __folio_objcg(folio);
b4e0b68fbd9d1f Muchun Song 2021-04-29  6795  		/*
b4e0b68fbd9d1f Muchun Song 2021-04-29  6796  		 * This get matches the put at the end of the function and
b4e0b68fbd9d1f Muchun Song 2021-04-29  6797  		 * kmem pages do not hold memcg references anymore.
b4e0b68fbd9d1f Muchun Song 2021-04-29  6798  		 */
b4e0b68fbd9d1f Muchun Song 2021-04-29  6799  		memcg = get_mem_cgroup_from_objcg(objcg);
b4e0b68fbd9d1f Muchun Song 2021-04-29  6800  	} else {
1b7e4464d43a48 Matthew Wilcox (Oracle 2021-06-28  6801) 		memcg = __folio_memcg(folio);
b4e0b68fbd9d1f Muchun Song 2021-04-29  6802  	}
b4e0b68fbd9d1f Muchun Song 2021-04-29  6803  
b4e0b68fbd9d1f Muchun Song 2021-04-29  6804  	if (!memcg)
b4e0b68fbd9d1f Muchun Song 2021-04-29  6805  		return;
747db954cab64c Johannes Weiner 2014-08-08  6806  
b4e0b68fbd9d1f Muchun Song 2021-04-29  6807  	if (ug->memcg != memcg) {
a9d5adeeb4b2c7 Jérôme Glisse 2017-09-08  6808  		if (ug->memcg) {
a9d5adeeb4b2c7 Jérôme Glisse 2017-09-08  6809  			uncharge_batch(ug);
a9d5adeeb4b2c7 Jérôme Glisse 2017-09-08  6810  			uncharge_gather_clear(ug);
747db954cab64c Johannes Weiner 2014-08-08  6811  		}
b4e0b68fbd9d1f Muchun Song 2021-04-29  6812  		ug->memcg = memcg;
c4ed6ebfcb0929 Matthew Wilcox (Oracle 2021-06-29  6813) 		ug->nid = folio_nid(folio);
f1796544a0ca0f Michal Hocko 2020-09-04  6814  
f1796544a0ca0f Michal Hocko 2020-09-04  6815  		/* pairs with css_put in uncharge_batch */
b4e0b68fbd9d1f Muchun Song 2021-04-29  6816  		css_get(&memcg->css);
747db954cab64c Johannes Weiner 2014-08-08  6817  	}
747db954cab64c Johannes Weiner 2014-08-08  6818  
c4ed6ebfcb0929 Matthew Wilcox (Oracle 2021-06-29  6819) 	nr_pages = folio_nr_pages(folio);
9f762dbe19b9f1 Johannes Weiner 2020-06-03  6820  
a6062be9ef6494 Michal Hocko 2022-02-11  6821  	if (folio_memcg_kmem(folio)) {
b4e0b68fbd9d1f Muchun Song 2021-04-29  6822  		ug->nr_memory += nr_pages;
9f762dbe19b9f1 Johannes Weiner 2020-06-03  6823  		ug->nr_kmem += nr_pages;
b4e0b68fbd9d1f Muchun Song 2021-04-29  6824  
c4ed6ebfcb0929 Matthew Wilcox (Oracle 2021-06-29  6825) 		folio->memcg_data = 0;
b4e0b68fbd9d1f Muchun Song 2021-04-29 @6826 		obj_cgroup_put(objcg);
b4e0b68fbd9d1f Muchun Song 2021-04-29  6827  	} else {
b4e0b68fbd9d1f Muchun Song 2021-04-29  6828  		/* LRU pages aren't accounted at the root level */
b4e0b68fbd9d1f Muchun Song 2021-04-29  6829  		if (!mem_cgroup_is_root(memcg))
b4e0b68fbd9d1f Muchun Song 2021-04-29  6830  			ug->nr_memory += nr_pages;
18b2db3b038522 Roman Gushchin 2020-12-01  6831  		ug->pgpgout++;
747db954cab64c Johannes Weiner 2014-08-08  6832  
c4ed6ebfcb0929 Matthew Wilcox (Oracle 2021-06-29  6833) 		folio->memcg_data = 0;
b4e0b68fbd9d1f Muchun Song 2021-04-29  6834  	}
b4e0b68fbd9d1f Muchun Song 2021-04-29  6835  
b4e0b68fbd9d1f Muchun Song 2021-04-29  6836  	css_put(&memcg->css);
a9d5adeeb4b2c7 Jérôme Glisse 2017-09-08  6837  }
a9d5adeeb4b2c7 Jérôme Glisse 2017-09-08  6838  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply	[flat|nested] 7+ messages in thread
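A note on the warning above: it looks like a false positive that only shows up with the revert below. 'objcg' is assigned solely inside the first if (folio_memcg_kmem(folio)) branch and dereferenced inside a second, textually separate if (folio_memcg_kmem(folio)) branch at line 6826. The two tests always agree for a given folio here, but smatch treats the two calls as independent and so assumes the use may be reached without the assignment. The pre-revert code cached the result in a local bool use_objcg, which is why the report is new. A minimal user-space sketch of the pattern (hypothetical names, not kernel code):

#include <stdio.h>

/* Stand-in for folio_memcg_kmem(): the same input always yields the same answer. */
static int is_kmem(int folio)
{
	return folio & 1;
}

static void uncharge_demo(int folio)
{
	int objcg;			/* mirrors 'objcg' in uncharge_folio() */

	if (is_kmem(folio))
		objcg = 42;		/* the only assignment */

	/* ... unrelated work ... */

	if (is_kmem(folio))		/* always agrees with the first test, */
		printf("%d\n", objcg);	/* but a static analyzer cannot prove it */
}

int main(void)
{
	uncharge_demo(1);	/* "kmem" case: objcg is assigned, then printed */
	uncharge_demo(2);	/* "non-kmem" case: objcg is never touched */
	return 0;
}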
* [PATCH v2 0/4] mm/memcg: Address PREEMPT_RT problems instead of disabling it.
@ 2022-02-11 22:35 Sebastian Andrzej Siewior
  2022-02-11 22:35 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 7+ messages in thread
From: Sebastian Andrzej Siewior @ 2022-02-11 22:35 UTC (permalink / raw)
To: cgroups, linux-mm
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Michal Koutný,
	Peter Zijlstra, Thomas Gleixner, Vladimir Davydov, Waiman Long

Hi,

this series aims to address the memcg-related problems on PREEMPT_RT. I
tested the patches on CONFIG_PREEMPT and CONFIG_PREEMPT_RT with the
tools/testing/selftests/cgroup/* tests and have not observed any
regressions (other than the lockdep report that is already there).

Changes since v1:
- Made a full patch from Michal Hocko's diff to disable the from-IRQ vs
  from-task optimisation.
- Disabling the threshold event handlers now uses IS_ENABLED(PREEMPT_RT)
  instead of #ifdef. The outcome is the same, but there is no need to
  shuffle the code around.

v1: https://lore.kernel.org/all/20220125164337.2071854-1-bigeasy@linutronix.de/

Changes since the RFC:
- cgroup.event_control / memory.soft_limit_in_bytes is disabled on
  PREEMPT_RT. It is a deprecated v1 feature. Fixing the signal path is
  not worth it.
- The updates to per-CPU counters are usually synchronised by disabling
  interrupts. There are a few spots where the assumption about disabled
  interrupts does not hold on PREEMPT_RT, and preemption is disabled
  there instead. This is okay since the counters are never written from
  in_irq() context.

RFC: https://lore.kernel.org/all/20211222114111.2206248-1-bigeasy@linutronix.de/

Sebastian

^ permalink raw reply	[flat|nested] 7+ messages in thread
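The last point in the change log above is the key invariant, and it is easiest to see with a concrete interleaving (an illustration, not code from the series). A non-atomic counter++ on a per-CPU variable is a load, an add and a store; an increment can only be lost if another writer touches the same per-CPU counter between the load and the store:

	CPU0, task A:  load  counter          -> 5
	CPU0, context switch to task B (preemption)
	CPU0, task B:  load 5, add 1, store   -> 6
	CPU0, context switch back to task A
	CPU0, task A:  store                  -> 6   (task B's increment is lost)

On a non-RT kernel the intervening writer can also be an interrupt handler, which is why these paths normally run under local_irq_save(). If the counter is never written from in_irq() context, the only way a second writer can slip in on the same CPU is through a context switch, and preempt_disable() is sufficient to prevent that.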
* [PATCH v2 1/4] mm/memcg: Revert ("mm/memcg: optimize user context object stock access")
@ 2022-02-11 22:35 ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 7+ messages in thread
From: Sebastian Andrzej Siewior @ 2022-02-11 22:35 UTC (permalink / raw)
To: cgroups, linux-mm
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Michal Koutný,
	Peter Zijlstra, Thomas Gleixner, Vladimir Davydov, Waiman Long,
	Michal Hocko, Sebastian Andrzej Siewior

From: Michal Hocko <mhocko@suse.com>

The optimisation is based on a micro benchmark where local_irq_save() is
more expensive than a preempt_disable(). There is no evidence that it is
visible in a real-world workload and there are CPUs where the opposite is
true (local_irq_save() is cheaper than preempt_disable()).

Based on micro benchmarks, the optimisation makes sense on PREEMPT_NONE
where preempt_disable() is optimized away. There is no improvement with
PREEMPT_DYNAMIC since the preemption counter is always available.

The optimization also makes the PREEMPT_RT integration more complicated
since most of the assumptions are not true on PREEMPT_RT.

Revert the optimisation since it complicates the PREEMPT_RT integration
and the improvement is hardly visible.

[ bigeasy: Patch body around Michal's diff ]

Link: https://lore.kernel.org/all/YgOGkXXCrD%2F1k+p4@dhcp22.suse.cz
Link: https://lkml.kernel.org/r/YdX+INO9gQje6d0S@linutronix.de
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 mm/memcontrol.c | 94 ++++++++++++++-----------------------------------
 1 file changed, 27 insertions(+), 67 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 09d342c7cbd0d..4b1572ae990d8 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2085,23 +2085,17 @@ void unlock_page_memcg(struct page *page)
 	folio_memcg_unlock(page_folio(page));
 }
 
-struct obj_stock {
+struct memcg_stock_pcp {
+	struct mem_cgroup *cached; /* this never be root cgroup */
+	unsigned int nr_pages;
+
 #ifdef CONFIG_MEMCG_KMEM
 	struct obj_cgroup *cached_objcg;
 	struct pglist_data *cached_pgdat;
 	unsigned int nr_bytes;
 	int nr_slab_reclaimable_b;
 	int nr_slab_unreclaimable_b;
-#else
-	int dummy[0];
 #endif
-};
-
-struct memcg_stock_pcp {
-	struct mem_cgroup *cached; /* this never be root cgroup */
-	unsigned int nr_pages;
-	struct obj_stock task_obj;
-	struct obj_stock irq_obj;
 
 	struct work_struct work;
 	unsigned long flags;
@@ -2111,12 +2105,12 @@ static DEFINE_PER_CPU(struct memcg_stock_pcp, memcg_stock);
 static DEFINE_MUTEX(percpu_charge_mutex);
 
 #ifdef CONFIG_MEMCG_KMEM
-static void drain_obj_stock(struct obj_stock *stock);
+static void drain_obj_stock(struct memcg_stock_pcp *stock);
 static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 				     struct mem_cgroup *root_memcg);
 
 #else
-static inline void drain_obj_stock(struct obj_stock *stock)
+static inline void drain_obj_stock(struct memcg_stock_pcp *stock)
 {
 }
 static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
@@ -2193,9 +2187,7 @@ static void drain_local_stock(struct work_struct *dummy)
 	local_irq_save(flags);
 
 	stock = this_cpu_ptr(&memcg_stock);
-	drain_obj_stock(&stock->irq_obj);
-	if (in_task())
-		drain_obj_stock(&stock->task_obj);
+	drain_obj_stock(stock);
 	drain_stock(stock);
 	clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);
 
@@ -2770,41 +2762,6 @@ static struct mem_cgroup *get_mem_cgroup_from_objcg(struct obj_cgroup *objcg)
  */
 #define OBJCGS_CLEAR_MASK	(__GFP_DMA | __GFP_RECLAIMABLE | __GFP_ACCOUNT)
 
-/*
- * Most kmem_cache_alloc() calls are from user context. The irq disable/enable
- * sequence used in this case to access content from object stock is slow.
- * To optimize for user context access, there are now two object stocks for
- * task context and interrupt context access respectively.
- *
- * The task context object stock can be accessed by disabling preemption only
- * which is cheap in non-preempt kernel. The interrupt context object stock
- * can only be accessed after disabling interrupt. User context code can
- * access interrupt object stock, but not vice versa.
- */
-static inline struct obj_stock *get_obj_stock(unsigned long *pflags)
-{
-	struct memcg_stock_pcp *stock;
-
-	if (likely(in_task())) {
-		*pflags = 0UL;
-		preempt_disable();
-		stock = this_cpu_ptr(&memcg_stock);
-		return &stock->task_obj;
-	}
-
-	local_irq_save(*pflags);
-	stock = this_cpu_ptr(&memcg_stock);
-	return &stock->irq_obj;
-}
-
-static inline void put_obj_stock(unsigned long flags)
-{
-	if (likely(in_task()))
-		preempt_enable();
-	else
-		local_irq_restore(flags);
-}
-
 /*
  * mod_objcg_mlstate() may be called with irq enabled, so
  * mod_memcg_lruvec_state() should be used.
@@ -3075,10 +3032,13 @@ void __memcg_kmem_uncharge_page(struct page *page, int order)
 void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
 		     enum node_stat_item idx, int nr)
 {
+	struct memcg_stock_pcp *stock;
 	unsigned long flags;
-	struct obj_stock *stock = get_obj_stock(&flags);
 	int *bytes;
 
+	local_irq_save(flags);
+	stock = this_cpu_ptr(&memcg_stock);
+
 	/*
 	 * Save vmstat data in stock and skip vmstat array update unless
 	 * accumulating over a page of vmstat data or when pgdat or idx
@@ -3129,26 +3089,29 @@ void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
 	if (nr)
 		mod_objcg_mlstate(objcg, pgdat, idx, nr);
 
-	put_obj_stock(flags);
+	local_irq_restore(flags);
 }
 
 static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
 {
+	struct memcg_stock_pcp *stock;
 	unsigned long flags;
-	struct obj_stock *stock = get_obj_stock(&flags);
 	bool ret = false;
 
+	local_irq_save(flags);
+
+	stock = this_cpu_ptr(&memcg_stock);
 	if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) {
 		stock->nr_bytes -= nr_bytes;
 		ret = true;
 	}
 
-	put_obj_stock(flags);
+	local_irq_restore(flags);
 
 	return ret;
 }
 
-static void drain_obj_stock(struct obj_stock *stock)
+static void drain_obj_stock(struct memcg_stock_pcp *stock)
 {
 	struct obj_cgroup *old = stock->cached_objcg;
 
@@ -3204,13 +3167,8 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 {
 	struct mem_cgroup *memcg;
 
-	if (in_task() && stock->task_obj.cached_objcg) {
-		memcg = obj_cgroup_memcg(stock->task_obj.cached_objcg);
-		if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
-			return true;
-	}
-	if (stock->irq_obj.cached_objcg) {
-		memcg = obj_cgroup_memcg(stock->irq_obj.cached_objcg);
+	if (stock->cached_objcg) {
+		memcg = obj_cgroup_memcg(stock->cached_objcg);
 		if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
 			return true;
 	}
@@ -3221,10 +3179,13 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
 			     bool allow_uncharge)
 {
+	struct memcg_stock_pcp *stock;
 	unsigned long flags;
-	struct obj_stock *stock = get_obj_stock(&flags);
 	unsigned int nr_pages = 0;
 
+	local_irq_save(flags);
+
+	stock = this_cpu_ptr(&memcg_stock);
 	if (stock->cached_objcg != objcg) { /* reset if necessary */
 		drain_obj_stock(stock);
 		obj_cgroup_get(objcg);
@@ -3240,7 +3201,7 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
 		stock->nr_bytes &= (PAGE_SIZE - 1);
 	}
 
-	put_obj_stock(flags);
+	local_irq_restore(flags);
 
 	if (nr_pages)
 		obj_cgroup_uncharge_pages(objcg, nr_pages);
@@ -6821,7 +6782,6 @@ static void uncharge_folio(struct folio *folio, struct uncharge_gather *ug)
 	long nr_pages;
 	struct mem_cgroup *memcg;
 	struct obj_cgroup *objcg;
-	bool use_objcg = folio_memcg_kmem(folio);
 
 	VM_BUG_ON_FOLIO(folio_test_lru(folio), folio);
 
@@ -6830,7 +6790,7 @@ static void uncharge_folio(struct folio *folio, struct uncharge_gather *ug)
 	 * folio memcg or objcg at this point, we have fully
 	 * exclusive access to the folio.
 	 */
-	if (use_objcg) {
+	if (folio_memcg_kmem(folio)) {
 		objcg = __folio_objcg(folio);
 		/*
 		 * This get matches the put at the end of the function and
@@ -6858,7 +6818,7 @@ static void uncharge_folio(struct folio *folio, struct uncharge_gather *ug)
 
 	nr_pages = folio_nr_pages(folio);
 
-	if (use_objcg) {
+	if (folio_memcg_kmem(folio)) {
 		ug->nr_memory += nr_pages;
 		ug->nr_kmem += nr_pages;
 
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread
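Background for the "optimized away" remark in the change log above: without CONFIG_PREEMPT_COUNT (i.e. plain PREEMPT_NONE, no PREEMPT_DYNAMIC), include/linux/preempt.h reduces the preemption helpers to compiler barriers. Roughly (abridged excerpt, not part of this patch):

/* !CONFIG_PREEMPT_COUNT */
#define preempt_disable()	barrier()
#define preempt_enable()	barrier()

/* CONFIG_PREEMPT_COUNT */
#define preempt_disable() \
do { \
	preempt_count_inc(); \
	barrier(); \
} while (0)

local_irq_save() always has to read and mask the architecture's interrupt state, which is where the micro-benchmark difference comes from. With PREEMPT_DYNAMIC the preemption counter is always compiled in, so the cheap !CONFIG_PREEMPT_COUNT variant does not exist, matching the "no improvement with PREEMPT_DYNAMIC" observation in the change log.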
* Re: [PATCH v2 1/4] mm/memcg: Revert ("mm/memcg: optimize user context object stock access")
@ 2022-02-14 16:23 ` Johannes Weiner
  0 siblings, 0 replies; 7+ messages in thread
From: Johannes Weiner @ 2022-02-14 16:23 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: cgroups, linux-mm, Andrew Morton, Michal Hocko, Michal Koutný,
	Peter Zijlstra, Thomas Gleixner, Vladimir Davydov, Waiman Long,
	Michal Hocko

On Fri, Feb 11, 2022 at 11:35:34PM +0100, Sebastian Andrzej Siewior wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> The optimisation is based on a micro benchmark where local_irq_save() is
> more expensive than a preempt_disable(). There is no evidence that it is
> visible in a real-world workload and there are CPUs where the opposite is
> true (local_irq_save() is cheaper than preempt_disable()).
> 
> Based on micro benchmarks, the optimisation makes sense on PREEMPT_NONE
> where preempt_disable() is optimized away. There is no improvement with
> PREEMPT_DYNAMIC since the preemption counter is always available.
> 
> The optimization also makes the PREEMPT_RT integration more complicated
> since most of the assumptions are not true on PREEMPT_RT.
> 
> Revert the optimisation since it complicates the PREEMPT_RT integration
> and the improvement is hardly visible.
> 
> [ bigeasy: Patch body around Michal's diff ]
> 
> Link: https://lore.kernel.org/all/YgOGkXXCrD%2F1k+p4@dhcp22.suse.cz
> Link: https://lkml.kernel.org/r/YdX+INO9gQje6d0S@linutronix.de
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH v2 1/4] mm/memcg: Revert ("mm/memcg: optimize user context object stock access")
@ 2022-02-14 19:45 ` Roman Gushchin
  0 siblings, 0 replies; 7+ messages in thread
From: Roman Gushchin @ 2022-02-14 19:45 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: cgroups, linux-mm, Andrew Morton, Johannes Weiner, Michal Hocko,
	Michal Koutný, Peter Zijlstra, Thomas Gleixner, Vladimir Davydov,
	Waiman Long, Michal Hocko

On Fri, Feb 11, 2022 at 11:35:34PM +0100, Sebastian Andrzej Siewior wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> The optimisation is based on a micro benchmark where local_irq_save() is
> more expensive than a preempt_disable(). There is no evidence that it is
> visible in a real-world workload and there are CPUs where the opposite is
> true (local_irq_save() is cheaper than preempt_disable()).
> 
> Based on micro benchmarks, the optimisation makes sense on PREEMPT_NONE
> where preempt_disable() is optimized away. There is no improvement with
> PREEMPT_DYNAMIC since the preemption counter is always available.
> 
> The optimization also makes the PREEMPT_RT integration more complicated
> since most of the assumptions are not true on PREEMPT_RT.
> 
> Revert the optimisation since it complicates the PREEMPT_RT integration
> and the improvement is hardly visible.
> 
> [ bigeasy: Patch body around Michal's diff ]
> 
> Link: https://lore.kernel.org/all/YgOGkXXCrD%2F1k+p4@dhcp22.suse.cz
> Link: https://lkml.kernel.org/r/YdX+INO9gQje6d0S@linutronix.de
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Acked-by: Roman Gushchin <guro@fb.com>

Thanks!

^ permalink raw reply	[flat|nested] 7+ messages in thread
end of thread, other threads:[~2022-02-14 19:45 UTC | newest]

Thread overview: 7+ messages
2022-02-12 10:39 [PATCH v2 1/4] mm/memcg: Revert ("mm/memcg: optimize user context object stock access") kernel test robot
  -- strict thread matches above, loose matches on Subject: below --
2022-02-11 22:35 [PATCH v2 0/4] mm/memcg: Address PREEMPT_RT problems instead of disabling it Sebastian Andrzej Siewior
2022-02-11 22:35 ` [PATCH v2 1/4] mm/memcg: Revert ("mm/memcg: optimize user context object stock access") Sebastian Andrzej Siewior
2022-02-11 22:35 ` Sebastian Andrzej Siewior
2022-02-14 16:23 ` Johannes Weiner
2022-02-14 16:23 ` Johannes Weiner
2022-02-14 19:45 ` Roman Gushchin
2022-02-14 19:45 ` Roman Gushchin