All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 5/5] mm/memcg: Optimize user context object stock access
@ 2021-04-10 12:48 kernel test robot
  0 siblings, 0 replies; 11+ messages in thread
From: kernel test robot @ 2021-04-10 12:48 UTC (permalink / raw)
  To: kbuild

[-- Attachment #1: Type: text/plain, Size: 17557 bytes --]

CC: kbuild-all(a)lists.01.org
In-Reply-To: <20210409231842.8840-6-longman@redhat.com>
References: <20210409231842.8840-6-longman@redhat.com>
TO: Waiman Long <longman@redhat.com>
TO: Johannes Weiner <hannes@cmpxchg.org>
TO: Michal Hocko <mhocko@kernel.org>
TO: Vladimir Davydov <vdavydov.dev@gmail.com>
TO: Andrew Morton <akpm@linux-foundation.org>
CC: Linux Memory Management List <linux-mm@kvack.org>
TO: Tejun Heo <tj@kernel.org>
TO: Christoph Lameter <cl@linux-foundation.org>
TO: Pekka Enberg <penberg@kernel.org>
TO: David Rientjes <rientjes@google.com>
TO: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Hi Waiman,

I love your patch! Perhaps something to improve:

[auto build test WARNING on dennis-percpu/for-next]
[also build test WARNING on linus/master v5.12-rc6 next-20210409]
[cannot apply to hnaz-linux-mm/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Waiman-Long/mm-memcg-Reduce-kmemcache-memory-accounting-overhead/20210410-071958
base:   https://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu.git for-next
:::::: branch date: 13 hours ago
:::::: commit date: 13 hours ago
config: x86_64-randconfig-m001-20210409 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

smatch warnings:
mm/memcontrol.c:3234 consume_obj_stock() error: uninitialized symbol 'stock'.
mm/memcontrol.c:3234 consume_obj_stock() error: uninitialized symbol 'stock'.
mm/memcontrol.c:3329 refill_obj_stock() error: uninitialized symbol 'stock'.
mm/memcontrol.c:3329 refill_obj_stock() error: uninitialized symbol 'stock'.
mm/memcontrol.c:3371 mod_obj_stock_state() error: uninitialized symbol 'stock'.
mm/memcontrol.c:3371 mod_obj_stock_state() error: uninitialized symbol 'stock'.
mm/memcontrol.c:3426 obj_cgroup_uncharge_mod_state() error: uninitialized symbol 'stock'.
mm/memcontrol.c:3426 obj_cgroup_uncharge_mod_state() error: uninitialized symbol 'stock'.

vim +/stock +3234 mm/memcontrol.c

e632f602745d79 Waiman Long    2021-04-09  3227  
bf4f059954dcb2 Roman Gushchin 2020-08-06  3228  static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
bf4f059954dcb2 Roman Gushchin 2020-08-06  3229  {
6b724b11a34580 Waiman Long    2021-04-09  3230  	struct obj_stock *stock;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3231  	unsigned long flags;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3232  	bool ret = false;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3233  
1f4e22fce44599 Waiman Long    2021-04-09 @3234  	stock = get_obj_stock(flags);
bf4f059954dcb2 Roman Gushchin 2020-08-06  3235  
6b724b11a34580 Waiman Long    2021-04-09  3236  	stock = current_obj_stock();
bf4f059954dcb2 Roman Gushchin 2020-08-06  3237  	if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) {
bf4f059954dcb2 Roman Gushchin 2020-08-06  3238  		stock->nr_bytes -= nr_bytes;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3239  		ret = true;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3240  	}
bf4f059954dcb2 Roman Gushchin 2020-08-06  3241  
1f4e22fce44599 Waiman Long    2021-04-09  3242  	put_obj_stock(flags);
bf4f059954dcb2 Roman Gushchin 2020-08-06  3243  
bf4f059954dcb2 Roman Gushchin 2020-08-06  3244  	return ret;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3245  }
bf4f059954dcb2 Roman Gushchin 2020-08-06  3246  
6b724b11a34580 Waiman Long    2021-04-09  3247  static void drain_obj_stock(struct obj_stock *stock)
bf4f059954dcb2 Roman Gushchin 2020-08-06  3248  {
bf4f059954dcb2 Roman Gushchin 2020-08-06  3249  	struct obj_cgroup *old = stock->cached_objcg;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3250  
bf4f059954dcb2 Roman Gushchin 2020-08-06  3251  	if (!old)
bf4f059954dcb2 Roman Gushchin 2020-08-06  3252  		return;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3253  
bf4f059954dcb2 Roman Gushchin 2020-08-06  3254  	if (stock->nr_bytes) {
bf4f059954dcb2 Roman Gushchin 2020-08-06  3255  		unsigned int nr_pages = stock->nr_bytes >> PAGE_SHIFT;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3256  		unsigned int nr_bytes = stock->nr_bytes & (PAGE_SIZE - 1);
bf4f059954dcb2 Roman Gushchin 2020-08-06  3257  
bf4f059954dcb2 Roman Gushchin 2020-08-06  3258  		if (nr_pages) {
bf4f059954dcb2 Roman Gushchin 2020-08-06  3259  			rcu_read_lock();
bf4f059954dcb2 Roman Gushchin 2020-08-06  3260  			__memcg_kmem_uncharge(obj_cgroup_memcg(old), nr_pages);
bf4f059954dcb2 Roman Gushchin 2020-08-06  3261  			rcu_read_unlock();
bf4f059954dcb2 Roman Gushchin 2020-08-06  3262  		}
bf4f059954dcb2 Roman Gushchin 2020-08-06  3263  
bf4f059954dcb2 Roman Gushchin 2020-08-06  3264  		/*
bf4f059954dcb2 Roman Gushchin 2020-08-06  3265  		 * The leftover is flushed to the centralized per-memcg value.
bf4f059954dcb2 Roman Gushchin 2020-08-06  3266  		 * On the next attempt to refill obj stock it will be moved
bf4f059954dcb2 Roman Gushchin 2020-08-06  3267  		 * to a per-cpu stock (probably, on an other CPU), see
bf4f059954dcb2 Roman Gushchin 2020-08-06  3268  		 * refill_obj_stock().
bf4f059954dcb2 Roman Gushchin 2020-08-06  3269  		 *
bf4f059954dcb2 Roman Gushchin 2020-08-06  3270  		 * How often it's flushed is a trade-off between the memory
bf4f059954dcb2 Roman Gushchin 2020-08-06  3271  		 * limit enforcement accuracy and potential CPU contention,
bf4f059954dcb2 Roman Gushchin 2020-08-06  3272  		 * so it might be changed in the future.
bf4f059954dcb2 Roman Gushchin 2020-08-06  3273  		 */
bf4f059954dcb2 Roman Gushchin 2020-08-06  3274  		atomic_add(nr_bytes, &old->nr_charged_bytes);
bf4f059954dcb2 Roman Gushchin 2020-08-06  3275  		stock->nr_bytes = 0;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3276  	}
bf4f059954dcb2 Roman Gushchin 2020-08-06  3277  
e632f602745d79 Waiman Long    2021-04-09  3278  	if (stock->vmstat_bytes) {
e632f602745d79 Waiman Long    2021-04-09  3279  		mod_objcg_state(old, stock->cached_pgdat, stock->vmstat_idx,
e632f602745d79 Waiman Long    2021-04-09  3280  				stock->vmstat_bytes);
e632f602745d79 Waiman Long    2021-04-09  3281  		stock->vmstat_bytes = 0;
e632f602745d79 Waiman Long    2021-04-09  3282  		stock->vmstat_idx = 0;
e632f602745d79 Waiman Long    2021-04-09  3283  		stock->cached_pgdat = NULL;
e632f602745d79 Waiman Long    2021-04-09  3284  	}
e632f602745d79 Waiman Long    2021-04-09  3285  
bf4f059954dcb2 Roman Gushchin 2020-08-06  3286  	obj_cgroup_put(old);
bf4f059954dcb2 Roman Gushchin 2020-08-06  3287  	stock->cached_objcg = NULL;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3288  }
bf4f059954dcb2 Roman Gushchin 2020-08-06  3289  
bf4f059954dcb2 Roman Gushchin 2020-08-06  3290  static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
bf4f059954dcb2 Roman Gushchin 2020-08-06  3291  				     struct mem_cgroup *root_memcg)
bf4f059954dcb2 Roman Gushchin 2020-08-06  3292  {
bf4f059954dcb2 Roman Gushchin 2020-08-06  3293  	struct mem_cgroup *memcg;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3294  
1f4e22fce44599 Waiman Long    2021-04-09  3295  	if (in_task() && stock->task_obj.cached_objcg) {
1f4e22fce44599 Waiman Long    2021-04-09  3296  		memcg = obj_cgroup_memcg(stock->task_obj.cached_objcg);
1f4e22fce44599 Waiman Long    2021-04-09  3297  		if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
1f4e22fce44599 Waiman Long    2021-04-09  3298  			return true;
1f4e22fce44599 Waiman Long    2021-04-09  3299  	}
1f4e22fce44599 Waiman Long    2021-04-09  3300  	if (stock->irq_obj.cached_objcg) {
1f4e22fce44599 Waiman Long    2021-04-09  3301  		memcg = obj_cgroup_memcg(stock->irq_obj.cached_objcg);
bf4f059954dcb2 Roman Gushchin 2020-08-06  3302  		if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
bf4f059954dcb2 Roman Gushchin 2020-08-06  3303  			return true;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3304  	}
bf4f059954dcb2 Roman Gushchin 2020-08-06  3305  
bf4f059954dcb2 Roman Gushchin 2020-08-06  3306  	return false;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3307  }
bf4f059954dcb2 Roman Gushchin 2020-08-06  3308  
6510c46c2c5965 Waiman Long    2021-04-09  3309  static void __refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
bf4f059954dcb2 Roman Gushchin 2020-08-06  3310  {
6b724b11a34580 Waiman Long    2021-04-09  3311  	struct obj_stock *stock = current_obj_stock();
bf4f059954dcb2 Roman Gushchin 2020-08-06  3312  
bf4f059954dcb2 Roman Gushchin 2020-08-06  3313  	if (stock->cached_objcg != objcg) { /* reset if necessary */
bf4f059954dcb2 Roman Gushchin 2020-08-06  3314  		drain_obj_stock(stock);
bf4f059954dcb2 Roman Gushchin 2020-08-06  3315  		obj_cgroup_get(objcg);
bf4f059954dcb2 Roman Gushchin 2020-08-06  3316  		stock->cached_objcg = objcg;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3317  		stock->nr_bytes = atomic_xchg(&objcg->nr_charged_bytes, 0);
bf4f059954dcb2 Roman Gushchin 2020-08-06  3318  	}
bf4f059954dcb2 Roman Gushchin 2020-08-06  3319  	stock->nr_bytes += nr_bytes;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3320  
bf4f059954dcb2 Roman Gushchin 2020-08-06  3321  	if (stock->nr_bytes > PAGE_SIZE)
bf4f059954dcb2 Roman Gushchin 2020-08-06  3322  		drain_obj_stock(stock);
6510c46c2c5965 Waiman Long    2021-04-09  3323  }
6510c46c2c5965 Waiman Long    2021-04-09  3324  
6510c46c2c5965 Waiman Long    2021-04-09  3325  static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
6510c46c2c5965 Waiman Long    2021-04-09  3326  {
6510c46c2c5965 Waiman Long    2021-04-09  3327  	unsigned long flags;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3328  
1f4e22fce44599 Waiman Long    2021-04-09 @3329  	get_obj_stock(flags);
6510c46c2c5965 Waiman Long    2021-04-09  3330  	__refill_obj_stock(objcg, nr_bytes);
1f4e22fce44599 Waiman Long    2021-04-09  3331  	put_obj_stock(flags);
bf4f059954dcb2 Roman Gushchin 2020-08-06  3332  }
bf4f059954dcb2 Roman Gushchin 2020-08-06  3333  
e632f602745d79 Waiman Long    2021-04-09  3334  static void __mod_obj_stock_state(struct obj_cgroup *objcg,
e632f602745d79 Waiman Long    2021-04-09  3335  				  struct pglist_data *pgdat, int idx, int nr)
e632f602745d79 Waiman Long    2021-04-09  3336  {
6b724b11a34580 Waiman Long    2021-04-09  3337  	struct obj_stock *stock = current_obj_stock();
e632f602745d79 Waiman Long    2021-04-09  3338  
e632f602745d79 Waiman Long    2021-04-09  3339  	if (stock->cached_objcg != objcg) {
e632f602745d79 Waiman Long    2021-04-09  3340  		/* Output the current data as is */
e632f602745d79 Waiman Long    2021-04-09  3341  	} else if (!stock->vmstat_bytes) {
e632f602745d79 Waiman Long    2021-04-09  3342  		/* Save the current data */
e632f602745d79 Waiman Long    2021-04-09  3343  		stock->vmstat_bytes = nr;
e632f602745d79 Waiman Long    2021-04-09  3344  		stock->vmstat_idx = idx;
e632f602745d79 Waiman Long    2021-04-09  3345  		stock->cached_pgdat = pgdat;
e632f602745d79 Waiman Long    2021-04-09  3346  		nr = 0;
e632f602745d79 Waiman Long    2021-04-09  3347  	} else if ((stock->cached_pgdat != pgdat) ||
e632f602745d79 Waiman Long    2021-04-09  3348  		   (stock->vmstat_idx != idx)) {
e632f602745d79 Waiman Long    2021-04-09  3349  		/* Output the cached data & save the current data */
e632f602745d79 Waiman Long    2021-04-09  3350  		swap(nr, stock->vmstat_bytes);
e632f602745d79 Waiman Long    2021-04-09  3351  		swap(idx, stock->vmstat_idx);
e632f602745d79 Waiman Long    2021-04-09  3352  		swap(pgdat, stock->cached_pgdat);
e632f602745d79 Waiman Long    2021-04-09  3353  	} else {
e632f602745d79 Waiman Long    2021-04-09  3354  		stock->vmstat_bytes += nr;
e632f602745d79 Waiman Long    2021-04-09  3355  		if (abs(nr) > PAGE_SIZE) {
e632f602745d79 Waiman Long    2021-04-09  3356  			nr = stock->vmstat_bytes;
e632f602745d79 Waiman Long    2021-04-09  3357  			stock->vmstat_bytes = 0;
e632f602745d79 Waiman Long    2021-04-09  3358  		} else {
e632f602745d79 Waiman Long    2021-04-09  3359  			nr = 0;
e632f602745d79 Waiman Long    2021-04-09  3360  		}
e632f602745d79 Waiman Long    2021-04-09  3361  	}
e632f602745d79 Waiman Long    2021-04-09  3362  	if (nr)
e632f602745d79 Waiman Long    2021-04-09  3363  		mod_objcg_state(objcg, pgdat, idx, nr);
e632f602745d79 Waiman Long    2021-04-09  3364  }
e632f602745d79 Waiman Long    2021-04-09  3365  
e632f602745d79 Waiman Long    2021-04-09  3366  void mod_obj_stock_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
e632f602745d79 Waiman Long    2021-04-09  3367  			 int idx, int nr)
e632f602745d79 Waiman Long    2021-04-09  3368  {
e632f602745d79 Waiman Long    2021-04-09  3369  	unsigned long flags;
e632f602745d79 Waiman Long    2021-04-09  3370  
1f4e22fce44599 Waiman Long    2021-04-09 @3371  	get_obj_stock(flags);
e632f602745d79 Waiman Long    2021-04-09  3372  	__mod_obj_stock_state(objcg, pgdat, idx, nr);
1f4e22fce44599 Waiman Long    2021-04-09  3373  	put_obj_stock(flags);
e632f602745d79 Waiman Long    2021-04-09  3374  }
e632f602745d79 Waiman Long    2021-04-09  3375  
bf4f059954dcb2 Roman Gushchin 2020-08-06  3376  int obj_cgroup_charge(struct obj_cgroup *objcg, gfp_t gfp, size_t size)
bf4f059954dcb2 Roman Gushchin 2020-08-06  3377  {
bf4f059954dcb2 Roman Gushchin 2020-08-06  3378  	struct mem_cgroup *memcg;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3379  	unsigned int nr_pages, nr_bytes;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3380  	int ret;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3381  
bf4f059954dcb2 Roman Gushchin 2020-08-06  3382  	if (consume_obj_stock(objcg, size))
bf4f059954dcb2 Roman Gushchin 2020-08-06  3383  		return 0;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3384  
bf4f059954dcb2 Roman Gushchin 2020-08-06  3385  	/*
bf4f059954dcb2 Roman Gushchin 2020-08-06  3386  	 * In theory, memcg->nr_charged_bytes can have enough
bf4f059954dcb2 Roman Gushchin 2020-08-06  3387  	 * pre-charged bytes to satisfy the allocation. However,
bf4f059954dcb2 Roman Gushchin 2020-08-06  3388  	 * flushing memcg->nr_charged_bytes requires two atomic
bf4f059954dcb2 Roman Gushchin 2020-08-06  3389  	 * operations, and memcg->nr_charged_bytes can't be big,
bf4f059954dcb2 Roman Gushchin 2020-08-06  3390  	 * so it's better to ignore it and try grab some new pages.
bf4f059954dcb2 Roman Gushchin 2020-08-06  3391  	 * memcg->nr_charged_bytes will be flushed in
bf4f059954dcb2 Roman Gushchin 2020-08-06  3392  	 * refill_obj_stock(), called from this function or
bf4f059954dcb2 Roman Gushchin 2020-08-06  3393  	 * independently later.
bf4f059954dcb2 Roman Gushchin 2020-08-06  3394  	 */
bf4f059954dcb2 Roman Gushchin 2020-08-06  3395  	rcu_read_lock();
eefbfa7fd67880 Muchun Song    2020-12-14  3396  retry:
bf4f059954dcb2 Roman Gushchin 2020-08-06  3397  	memcg = obj_cgroup_memcg(objcg);
eefbfa7fd67880 Muchun Song    2020-12-14  3398  	if (unlikely(!css_tryget(&memcg->css)))
eefbfa7fd67880 Muchun Song    2020-12-14  3399  		goto retry;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3400  	rcu_read_unlock();
bf4f059954dcb2 Roman Gushchin 2020-08-06  3401  
bf4f059954dcb2 Roman Gushchin 2020-08-06  3402  	nr_pages = size >> PAGE_SHIFT;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3403  	nr_bytes = size & (PAGE_SIZE - 1);
bf4f059954dcb2 Roman Gushchin 2020-08-06  3404  
bf4f059954dcb2 Roman Gushchin 2020-08-06  3405  	if (nr_bytes)
bf4f059954dcb2 Roman Gushchin 2020-08-06  3406  		nr_pages += 1;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3407  
bf4f059954dcb2 Roman Gushchin 2020-08-06  3408  	ret = __memcg_kmem_charge(memcg, gfp, nr_pages);
bf4f059954dcb2 Roman Gushchin 2020-08-06  3409  	if (!ret && nr_bytes)
bf4f059954dcb2 Roman Gushchin 2020-08-06  3410  		refill_obj_stock(objcg, PAGE_SIZE - nr_bytes);
bf4f059954dcb2 Roman Gushchin 2020-08-06  3411  
bf4f059954dcb2 Roman Gushchin 2020-08-06  3412  	css_put(&memcg->css);
bf4f059954dcb2 Roman Gushchin 2020-08-06  3413  	return ret;
bf4f059954dcb2 Roman Gushchin 2020-08-06  3414  }
bf4f059954dcb2 Roman Gushchin 2020-08-06  3415  
bf4f059954dcb2 Roman Gushchin 2020-08-06  3416  void obj_cgroup_uncharge(struct obj_cgroup *objcg, size_t size)
bf4f059954dcb2 Roman Gushchin 2020-08-06  3417  {
bf4f059954dcb2 Roman Gushchin 2020-08-06  3418  	refill_obj_stock(objcg, size);
bf4f059954dcb2 Roman Gushchin 2020-08-06  3419  }
bf4f059954dcb2 Roman Gushchin 2020-08-06  3420  
6510c46c2c5965 Waiman Long    2021-04-09  3421  void obj_cgroup_uncharge_mod_state(struct obj_cgroup *objcg, size_t size,
6510c46c2c5965 Waiman Long    2021-04-09  3422  				   struct pglist_data *pgdat, int idx)
6510c46c2c5965 Waiman Long    2021-04-09  3423  {
6510c46c2c5965 Waiman Long    2021-04-09  3424  	unsigned long flags;
6510c46c2c5965 Waiman Long    2021-04-09  3425  
1f4e22fce44599 Waiman Long    2021-04-09 @3426  	get_obj_stock(flags);
6510c46c2c5965 Waiman Long    2021-04-09  3427  	__refill_obj_stock(objcg, size);
e632f602745d79 Waiman Long    2021-04-09  3428  	__mod_obj_stock_state(objcg, pgdat, idx, -(int)size);
1f4e22fce44599 Waiman Long    2021-04-09  3429  	put_obj_stock(flags);
6510c46c2c5965 Waiman Long    2021-04-09  3430  }
6510c46c2c5965 Waiman Long    2021-04-09  3431  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 30052 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 5/5] mm/memcg: Optimize user context object stock access
@ 2021-04-12 19:58       ` Waiman Long
  0 siblings, 0 replies; 11+ messages in thread
From: Waiman Long @ 2021-04-12 19:58 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
	Tejun Heo, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Vlastimil Babka, linux-kernel, cgroups, linux-mm,
	Shakeel Butt, Muchun Song, Alex Shi, Chris Down, Yafang Shao,
	Alexander Duyck, Wei Yang, Masayoshi Mizuma

On 4/12/21 2:55 PM, Roman Gushchin wrote:
> On Fri, Apr 09, 2021 at 07:18:42PM -0400, Waiman Long wrote:
>> Most kmem_cache_alloc() calls are from user context. With instrumentation
>> enabled, the measured amount of kmem_cache_alloc() calls from non-task
>> context was about 0.01% of the total.
>>
>> The irq disable/enable sequence used in this case to access content
>> from object stock is slow.  To optimize for user context access, there
>> are now two object stocks for task context and interrupt context access
>> respectively.
>>
>> The task context object stock can be accessed after disabling preemption
>> which is cheap in non-preempt kernel. The interrupt context object stock
>> can only be accessed after disabling interrupt. User context code can
>> access interrupt object stock, but not vice versa.
>>
>> The mod_objcg_state() function is also modified to make sure that memcg
>> and lruvec stat updates are done with interrupted disabled.
>>
>> The downside of this change is that there are more data stored in local
>> object stocks and not reflected in the charge counter and the vmstat
>> arrays.  However, this is a small price to pay for better performance.
> I agree, the extra memory space is not a significant concern.
> I'd be more worried about the code complexity, but the result looks
> nice to me!
>
> Acked-by: Roman Gushchin <guro@fb.com>
>
> Btw, it seems that the mm tree ran a bit off, so I had to apply this series
> on top of Linus's tree to review. Please, rebase.

This patchset is based on the code in Linus' tree. I had applied the 
patchset to linux-next to see if there was any conflicts. Two of the 
patches had minor fuzzes around the edge but no actual merge conflict 
for now.

Cheers,
Longman



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 5/5] mm/memcg: Optimize user context object stock access
@ 2021-04-12 19:58       ` Waiman Long
  0 siblings, 0 replies; 11+ messages in thread
From: Waiman Long @ 2021-04-12 19:58 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
	Tejun Heo, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Vlastimil Babka,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	Shakeel Butt, Muchun Song, Alex Shi, Chris Down, Yafang Shao,
	Alexander Duyck, Wei Yang, Masayoshi Mizuma

On 4/12/21 2:55 PM, Roman Gushchin wrote:
> On Fri, Apr 09, 2021 at 07:18:42PM -0400, Waiman Long wrote:
>> Most kmem_cache_alloc() calls are from user context. With instrumentation
>> enabled, the measured amount of kmem_cache_alloc() calls from non-task
>> context was about 0.01% of the total.
>>
>> The irq disable/enable sequence used in this case to access content
>> from object stock is slow.  To optimize for user context access, there
>> are now two object stocks for task context and interrupt context access
>> respectively.
>>
>> The task context object stock can be accessed after disabling preemption
>> which is cheap in non-preempt kernel. The interrupt context object stock
>> can only be accessed after disabling interrupt. User context code can
>> access interrupt object stock, but not vice versa.
>>
>> The mod_objcg_state() function is also modified to make sure that memcg
>> and lruvec stat updates are done with interrupted disabled.
>>
>> The downside of this change is that there are more data stored in local
>> object stocks and not reflected in the charge counter and the vmstat
>> arrays.  However, this is a small price to pay for better performance.
> I agree, the extra memory space is not a significant concern.
> I'd be more worried about the code complexity, but the result looks
> nice to me!
>
> Acked-by: Roman Gushchin <guro-b10kYP2dOMg@public.gmane.org>
>
> Btw, it seems that the mm tree ran a bit off, so I had to apply this series
> on top of Linus's tree to review. Please, rebase.

This patchset is based on the code in Linus' tree. I had applied the 
patchset to linux-next to see if there was any conflicts. Two of the 
patches had minor fuzzes around the edge but no actual merge conflict 
for now.

Cheers,
Longman



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 5/5] mm/memcg: Optimize user context object stock access
@ 2021-04-12 18:55     ` Roman Gushchin
  0 siblings, 0 replies; 11+ messages in thread
From: Roman Gushchin @ 2021-04-12 18:55 UTC (permalink / raw)
  To: Waiman Long
  Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
	Tejun Heo, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Vlastimil Babka, linux-kernel, cgroups, linux-mm,
	Shakeel Butt, Muchun Song, Alex Shi, Chris Down, Yafang Shao,
	Alexander Duyck, Wei Yang, Masayoshi Mizuma

On Fri, Apr 09, 2021 at 07:18:42PM -0400, Waiman Long wrote:
> Most kmem_cache_alloc() calls are from user context. With instrumentation
> enabled, the measured amount of kmem_cache_alloc() calls from non-task
> context was about 0.01% of the total.
> 
> The irq disable/enable sequence used in this case to access content
> from object stock is slow.  To optimize for user context access, there
> are now two object stocks for task context and interrupt context access
> respectively.
> 
> The task context object stock can be accessed after disabling preemption
> which is cheap in non-preempt kernel. The interrupt context object stock
> can only be accessed after disabling interrupt. User context code can
> access interrupt object stock, but not vice versa.
> 
> The mod_objcg_state() function is also modified to make sure that memcg
> and lruvec stat updates are done with interrupted disabled.
> 
> The downside of this change is that there are more data stored in local
> object stocks and not reflected in the charge counter and the vmstat
> arrays.  However, this is a small price to pay for better performance.

I agree, the extra memory space is not a significant concern.
I'd be more worried about the code complexity, but the result looks
nice to me!

Acked-by: Roman Gushchin <guro@fb.com>

Btw, it seems that the mm tree ran a bit off, so I had to apply this series
on top of Linus's tree to review. Please, rebase.

Thanks!

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 5/5] mm/memcg: Optimize user context object stock access
@ 2021-04-12 18:55     ` Roman Gushchin
  0 siblings, 0 replies; 11+ messages in thread
From: Roman Gushchin @ 2021-04-12 18:55 UTC (permalink / raw)
  To: Waiman Long
  Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
	Tejun Heo, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Vlastimil Babka,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	Shakeel Butt, Muchun Song, Alex Shi, Chris Down, Yafang Shao,
	Alexander Duyck, Wei Yang, Masayoshi Mizuma

On Fri, Apr 09, 2021 at 07:18:42PM -0400, Waiman Long wrote:
> Most kmem_cache_alloc() calls are from user context. With instrumentation
> enabled, the measured amount of kmem_cache_alloc() calls from non-task
> context was about 0.01% of the total.
> 
> The irq disable/enable sequence used in this case to access content
> from object stock is slow.  To optimize for user context access, there
> are now two object stocks for task context and interrupt context access
> respectively.
> 
> The task context object stock can be accessed after disabling preemption
> which is cheap in non-preempt kernel. The interrupt context object stock
> can only be accessed after disabling interrupt. User context code can
> access interrupt object stock, but not vice versa.
> 
> The mod_objcg_state() function is also modified to make sure that memcg
> and lruvec stat updates are done with interrupted disabled.
> 
> The downside of this change is that there are more data stored in local
> object stocks and not reflected in the charge counter and the vmstat
> arrays.  However, this is a small price to pay for better performance.

I agree, the extra memory space is not a significant concern.
I'd be more worried about the code complexity, but the result looks
nice to me!

Acked-by: Roman Gushchin <guro-b10kYP2dOMg@public.gmane.org>

Btw, it seems that the mm tree ran a bit off, so I had to apply this series
on top of Linus's tree to review. Please, rebase.

Thanks!

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 5/5] mm/memcg: Optimize user context object stock access
  2021-04-10  6:07     ` kernel test robot
@ 2021-04-12 14:07       ` Waiman Long
  -1 siblings, 0 replies; 11+ messages in thread
From: Waiman Long @ 2021-04-12 14:07 UTC (permalink / raw)
  To: kernel test robot, Johannes Weiner, Michal Hocko,
	Vladimir Davydov, Andrew Morton, Tejun Heo, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim
  Cc: kbuild-all, clang-built-linux, Linux Memory Management List

On 4/10/21 2:07 AM, kernel test robot wrote:
> Hi Waiman,
>
> I love your patch! Perhaps something to improve:
>
> [auto build test WARNING on dennis-percpu/for-next]
> [also build test WARNING on linus/master v5.12-rc6 next-20210409]
> [cannot apply to hnaz-linux-mm/master]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url:    https://github.com/0day-ci/linux/commits/Waiman-Long/mm-memcg-Reduce-kmemcache-memory-accounting-overhead/20210410-071958
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu.git for-next
> config: arm64-randconfig-r031-20210409 (attached as .config)
> compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project dd453a1389b6a7e6d9214b449d3c54981b1a89b6)
> reproduce (this is a W=1 build):
>          wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>          chmod +x ~/bin/make.cross
>          # install arm64 cross compiling tool for clang build
>          # apt-get install binutils-aarch64-linux-gnu
>          # https://github.com/0day-ci/linux/commit/1f4e22fce44599095a55535301ca83adc5d3a4fe
>          git remote add linux-review https://github.com/0day-ci/linux
>          git fetch --no-tags linux-review Waiman-Long/mm-memcg-Reduce-kmemcache-memory-accounting-overhead/20210410-071958
>          git checkout 1f4e22fce44599095a55535301ca83adc5d3a4fe
>          # save the attached .config to linux build tree
>          COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
>
> All warnings (new ones prefixed by >>):
>
>>> mm/memcontrol.c:3234:10: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
>             stock = get_obj_stock(flags);
>                     ^~~~~~~~~~~~~~~~~~~~
>     mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
>                     obj_stock = &stock->task_obj;   \
>                                  ^~~~~
>     mm/memcontrol.c:3234:10: note: variable 'stock' is declared here
>     mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
>             struct memcg_stock_pcp *stock;          \
>             ^
>     mm/memcontrol.c:3329:2: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
>             get_obj_stock(flags);
>             ^~~~~~~~~~~~~~~~~~~~
>     mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
>                     obj_stock = &stock->task_obj;   \
>                                  ^~~~~
>     mm/memcontrol.c:3329:2: note: variable 'stock' is declared here
>     mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
>             struct memcg_stock_pcp *stock;          \
>             ^
>     mm/memcontrol.c:3371:2: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
>             get_obj_stock(flags);
>             ^~~~~~~~~~~~~~~~~~~~
>     mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
>                     obj_stock = &stock->task_obj;   \
>                                  ^~~~~
>     mm/memcontrol.c:3371:2: note: variable 'stock' is declared here
>     mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
>             struct memcg_stock_pcp *stock;          \
>             ^
>     mm/memcontrol.c:3426:2: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
>             get_obj_stock(flags);
>             ^~~~~~~~~~~~~~~~~~~~
>     mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
>                     obj_stock = &stock->task_obj;   \
>                                  ^~~~~
>     mm/memcontrol.c:3426:2: note: variable 'stock' is declared here
>     mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
>             struct memcg_stock_pcp *stock;          \
>             ^
>     4 warnings generated.
>
>
> vim +/stock +3234 mm/memcontrol.c
>
>    3227	
>    3228	static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
>    3229	{
>    3230		struct obj_stock *stock;
>    3231		unsigned long flags;
>    3232		bool ret = false;
>    3233	
>> 3234		stock = get_obj_stock(flags);
>    3235	
>    3236		stock = current_obj_stock();
>    3237		if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) {
>    3238			stock->nr_bytes -= nr_bytes;
>    3239			ret = true;
>    3240		}
>    3241	
>    3242		put_obj_stock(flags);
>    3243	
>    3244		return ret;
>    3245	}
>    3246	
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

My bad, I somehow missed it. I will fix that in the version.

Thanks,
Longman



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 5/5] mm/memcg: Optimize user context object stock access
@ 2021-04-12 14:07       ` Waiman Long
  0 siblings, 0 replies; 11+ messages in thread
From: Waiman Long @ 2021-04-12 14:07 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 5027 bytes --]

On 4/10/21 2:07 AM, kernel test robot wrote:
> Hi Waiman,
>
> I love your patch! Perhaps something to improve:
>
> [auto build test WARNING on dennis-percpu/for-next]
> [also build test WARNING on linus/master v5.12-rc6 next-20210409]
> [cannot apply to hnaz-linux-mm/master]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url:    https://github.com/0day-ci/linux/commits/Waiman-Long/mm-memcg-Reduce-kmemcache-memory-accounting-overhead/20210410-071958
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu.git for-next
> config: arm64-randconfig-r031-20210409 (attached as .config)
> compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project dd453a1389b6a7e6d9214b449d3c54981b1a89b6)
> reproduce (this is a W=1 build):
>          wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>          chmod +x ~/bin/make.cross
>          # install arm64 cross compiling tool for clang build
>          # apt-get install binutils-aarch64-linux-gnu
>          # https://github.com/0day-ci/linux/commit/1f4e22fce44599095a55535301ca83adc5d3a4fe
>          git remote add linux-review https://github.com/0day-ci/linux
>          git fetch --no-tags linux-review Waiman-Long/mm-memcg-Reduce-kmemcache-memory-accounting-overhead/20210410-071958
>          git checkout 1f4e22fce44599095a55535301ca83adc5d3a4fe
>          # save the attached .config to linux build tree
>          COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
>
> All warnings (new ones prefixed by >>):
>
>>> mm/memcontrol.c:3234:10: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
>             stock = get_obj_stock(flags);
>                     ^~~~~~~~~~~~~~~~~~~~
>     mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
>                     obj_stock = &stock->task_obj;   \
>                                  ^~~~~
>     mm/memcontrol.c:3234:10: note: variable 'stock' is declared here
>     mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
>             struct memcg_stock_pcp *stock;          \
>             ^
>     mm/memcontrol.c:3329:2: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
>             get_obj_stock(flags);
>             ^~~~~~~~~~~~~~~~~~~~
>     mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
>                     obj_stock = &stock->task_obj;   \
>                                  ^~~~~
>     mm/memcontrol.c:3329:2: note: variable 'stock' is declared here
>     mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
>             struct memcg_stock_pcp *stock;          \
>             ^
>     mm/memcontrol.c:3371:2: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
>             get_obj_stock(flags);
>             ^~~~~~~~~~~~~~~~~~~~
>     mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
>                     obj_stock = &stock->task_obj;   \
>                                  ^~~~~
>     mm/memcontrol.c:3371:2: note: variable 'stock' is declared here
>     mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
>             struct memcg_stock_pcp *stock;          \
>             ^
>     mm/memcontrol.c:3426:2: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
>             get_obj_stock(flags);
>             ^~~~~~~~~~~~~~~~~~~~
>     mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
>                     obj_stock = &stock->task_obj;   \
>                                  ^~~~~
>     mm/memcontrol.c:3426:2: note: variable 'stock' is declared here
>     mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
>             struct memcg_stock_pcp *stock;          \
>             ^
>     4 warnings generated.
>
>
> vim +/stock +3234 mm/memcontrol.c
>
>    3227	
>    3228	static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
>    3229	{
>    3230		struct obj_stock *stock;
>    3231		unsigned long flags;
>    3232		bool ret = false;
>    3233	
>> 3234		stock = get_obj_stock(flags);
>    3235	
>    3236		stock = current_obj_stock();
>    3237		if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) {
>    3238			stock->nr_bytes -= nr_bytes;
>    3239			ret = true;
>    3240		}
>    3241	
>    3242		put_obj_stock(flags);
>    3243	
>    3244		return ret;
>    3245	}
>    3246	
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

My bad, I somehow missed it. I will fix that in the version.

Thanks,
Longman

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 5/5] mm/memcg: Optimize user context object stock access
  2021-04-09 23:18   ` Waiman Long
@ 2021-04-10  6:07     ` kernel test robot
  -1 siblings, 0 replies; 11+ messages in thread
From: kernel test robot @ 2021-04-10  6:07 UTC (permalink / raw)
  To: Waiman Long, Johannes Weiner, Michal Hocko, Vladimir Davydov,
	Andrew Morton, Tejun Heo, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim
  Cc: kbuild-all, clang-built-linux, Linux Memory Management List

[-- Attachment #1: Type: text/plain, Size: 4536 bytes --]

Hi Waiman,

I love your patch! Perhaps something to improve:

[auto build test WARNING on dennis-percpu/for-next]
[also build test WARNING on linus/master v5.12-rc6 next-20210409]
[cannot apply to hnaz-linux-mm/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Waiman-Long/mm-memcg-Reduce-kmemcache-memory-accounting-overhead/20210410-071958
base:   https://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu.git for-next
config: arm64-randconfig-r031-20210409 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project dd453a1389b6a7e6d9214b449d3c54981b1a89b6)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install arm64 cross compiling tool for clang build
        # apt-get install binutils-aarch64-linux-gnu
        # https://github.com/0day-ci/linux/commit/1f4e22fce44599095a55535301ca83adc5d3a4fe
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Waiman-Long/mm-memcg-Reduce-kmemcache-memory-accounting-overhead/20210410-071958
        git checkout 1f4e22fce44599095a55535301ca83adc5d3a4fe
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> mm/memcontrol.c:3234:10: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
           stock = get_obj_stock(flags);
                   ^~~~~~~~~~~~~~~~~~~~
   mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
                   obj_stock = &stock->task_obj;   \
                                ^~~~~
   mm/memcontrol.c:3234:10: note: variable 'stock' is declared here
   mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
           struct memcg_stock_pcp *stock;          \
           ^
   mm/memcontrol.c:3329:2: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
           get_obj_stock(flags);
           ^~~~~~~~~~~~~~~~~~~~
   mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
                   obj_stock = &stock->task_obj;   \
                                ^~~~~
   mm/memcontrol.c:3329:2: note: variable 'stock' is declared here
   mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
           struct memcg_stock_pcp *stock;          \
           ^
   mm/memcontrol.c:3371:2: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
           get_obj_stock(flags);
           ^~~~~~~~~~~~~~~~~~~~
   mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
                   obj_stock = &stock->task_obj;   \
                                ^~~~~
   mm/memcontrol.c:3371:2: note: variable 'stock' is declared here
   mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
           struct memcg_stock_pcp *stock;          \
           ^
   mm/memcontrol.c:3426:2: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
           get_obj_stock(flags);
           ^~~~~~~~~~~~~~~~~~~~
   mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
                   obj_stock = &stock->task_obj;   \
                                ^~~~~
   mm/memcontrol.c:3426:2: note: variable 'stock' is declared here
   mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
           struct memcg_stock_pcp *stock;          \
           ^
   4 warnings generated.


vim +/stock +3234 mm/memcontrol.c

  3227	
  3228	static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
  3229	{
  3230		struct obj_stock *stock;
  3231		unsigned long flags;
  3232		bool ret = false;
  3233	
> 3234		stock = get_obj_stock(flags);
  3235	
  3236		stock = current_obj_stock();
  3237		if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) {
  3238			stock->nr_bytes -= nr_bytes;
  3239			ret = true;
  3240		}
  3241	
  3242		put_obj_stock(flags);
  3243	
  3244		return ret;
  3245	}
  3246	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 28994 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 5/5] mm/memcg: Optimize user context object stock access
@ 2021-04-10  6:07     ` kernel test robot
  0 siblings, 0 replies; 11+ messages in thread
From: kernel test robot @ 2021-04-10  6:07 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 4639 bytes --]

Hi Waiman,

I love your patch! Perhaps something to improve:

[auto build test WARNING on dennis-percpu/for-next]
[also build test WARNING on linus/master v5.12-rc6 next-20210409]
[cannot apply to hnaz-linux-mm/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Waiman-Long/mm-memcg-Reduce-kmemcache-memory-accounting-overhead/20210410-071958
base:   https://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu.git for-next
config: arm64-randconfig-r031-20210409 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project dd453a1389b6a7e6d9214b449d3c54981b1a89b6)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install arm64 cross compiling tool for clang build
        # apt-get install binutils-aarch64-linux-gnu
        # https://github.com/0day-ci/linux/commit/1f4e22fce44599095a55535301ca83adc5d3a4fe
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Waiman-Long/mm-memcg-Reduce-kmemcache-memory-accounting-overhead/20210410-071958
        git checkout 1f4e22fce44599095a55535301ca83adc5d3a4fe
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> mm/memcontrol.c:3234:10: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
           stock = get_obj_stock(flags);
                   ^~~~~~~~~~~~~~~~~~~~
   mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
                   obj_stock = &stock->task_obj;   \
                                ^~~~~
   mm/memcontrol.c:3234:10: note: variable 'stock' is declared here
   mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
           struct memcg_stock_pcp *stock;          \
           ^
   mm/memcontrol.c:3329:2: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
           get_obj_stock(flags);
           ^~~~~~~~~~~~~~~~~~~~
   mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
                   obj_stock = &stock->task_obj;   \
                                ^~~~~
   mm/memcontrol.c:3329:2: note: variable 'stock' is declared here
   mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
           struct memcg_stock_pcp *stock;          \
           ^
   mm/memcontrol.c:3371:2: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
           get_obj_stock(flags);
           ^~~~~~~~~~~~~~~~~~~~
   mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
                   obj_stock = &stock->task_obj;   \
                                ^~~~~
   mm/memcontrol.c:3371:2: note: variable 'stock' is declared here
   mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
           struct memcg_stock_pcp *stock;          \
           ^
   mm/memcontrol.c:3426:2: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
           get_obj_stock(flags);
           ^~~~~~~~~~~~~~~~~~~~
   mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
                   obj_stock = &stock->task_obj;   \
                                ^~~~~
   mm/memcontrol.c:3426:2: note: variable 'stock' is declared here
   mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
           struct memcg_stock_pcp *stock;          \
           ^
   4 warnings generated.


vim +/stock +3234 mm/memcontrol.c

  3227	
  3228	static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
  3229	{
  3230		struct obj_stock *stock;
  3231		unsigned long flags;
  3232		bool ret = false;
  3233	
> 3234		stock = get_obj_stock(flags);
  3235	
  3236		stock = current_obj_stock();
  3237		if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) {
  3238			stock->nr_bytes -= nr_bytes;
  3239			ret = true;
  3240		}
  3241	
  3242		put_obj_stock(flags);
  3243	
  3244		return ret;
  3245	}
  3246	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 28994 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 5/5] mm/memcg: Optimize user context object stock access
@ 2021-04-09 23:18   ` Waiman Long
  0 siblings, 0 replies; 11+ messages in thread
From: Waiman Long @ 2021-04-09 23:18 UTC (permalink / raw)
  To: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
	Tejun Heo, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Vlastimil Babka, Roman Gushchin
  Cc: linux-kernel, cgroups, linux-mm, Shakeel Butt, Muchun Song,
	Alex Shi, Chris Down, Yafang Shao, Alexander Duyck, Wei Yang,
	Masayoshi Mizuma, Waiman Long

Most kmem_cache_alloc() calls are from user context. With instrumentation
enabled, the measured amount of kmem_cache_alloc() calls from non-task
context was about 0.01% of the total.

The irq disable/enable sequence used in this case to access content
from object stock is slow.  To optimize for user context access, there
are now two object stocks for task context and interrupt context access
respectively.

The task context object stock can be accessed after disabling preemption
which is cheap in non-preempt kernel. The interrupt context object stock
can only be accessed after disabling interrupt. User context code can
access interrupt object stock, but not vice versa.

The mod_objcg_state() function is also modified to make sure that memcg
and lruvec stat updates are done with interrupted disabled.

The downside of this change is that there are more data stored in local
object stocks and not reflected in the charge counter and the vmstat
arrays.  However, this is a small price to pay for better performance.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 mm/memcontrol.c | 71 +++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 57 insertions(+), 14 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 69f728383efe..00c9074e42e5 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2229,7 +2229,8 @@ struct obj_stock {
 struct memcg_stock_pcp {
 	struct mem_cgroup *cached; /* this never be root cgroup */
 	unsigned int nr_pages;
-	struct obj_stock obj;
+	struct obj_stock task_obj;
+	struct obj_stock irq_obj;
 
 	struct work_struct work;
 	unsigned long flags;
@@ -2254,11 +2255,46 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 }
 #endif
 
+/*
+ * Most kmem_cache_alloc() calls are from user context. The irq disable/enable
+ * sequence used in this case to access content from object stock is slow.
+ * To optimize for user context access, there are now two object stocks for
+ * task context and interrupt context access respectively.
+ *
+ * The task context object stock can be accessed by disabling preemption only
+ * which is cheap in non-preempt kernel. The interrupt context object stock
+ * can only be accessed after disabling interrupt. User context code can
+ * access interrupt object stock, but not vice versa.
+ */
 static inline struct obj_stock *current_obj_stock(void)
 {
 	struct memcg_stock_pcp *stock = this_cpu_ptr(&memcg_stock);
 
-	return &stock->obj;
+	return in_task() ? &stock->task_obj : &stock->irq_obj;
+}
+
+#define get_obj_stock(flags)			\
+({						\
+	struct memcg_stock_pcp *stock;		\
+	struct obj_stock *obj_stock;		\
+						\
+	if (in_task()) {			\
+		preempt_disable();		\
+		(flags) = -1L;			\
+		obj_stock = &stock->task_obj;	\
+	} else {				\
+		local_irq_save(flags);		\
+		obj_stock = &stock->irq_obj;	\
+	}					\
+	obj_stock;				\
+})
+
+static inline void put_obj_stock(unsigned long flags)
+{
+	if (flags == -1L)
+		preempt_enable();
+	else
+		local_irq_restore(flags);
 }
 
 /**
@@ -2327,7 +2363,9 @@ static void drain_local_stock(struct work_struct *dummy)
 	local_irq_save(flags);
 
 	stock = this_cpu_ptr(&memcg_stock);
-	drain_obj_stock(&stock->obj);
+	drain_obj_stock(&stock->irq_obj);
+	if (in_task())
+		drain_obj_stock(&stock->task_obj);
 	drain_stock(stock);
 	clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);
 
@@ -3183,7 +3221,7 @@ static inline void mod_objcg_state(struct obj_cgroup *objcg,
 	memcg = obj_cgroup_memcg(objcg);
 	if (pgdat)
 		lruvec = mem_cgroup_lruvec(memcg, pgdat);
-	__mod_memcg_lruvec_state(memcg, lruvec, idx, nr);
+	mod_memcg_lruvec_state(memcg, lruvec, idx, nr);
 	rcu_read_unlock();
 }
 
@@ -3193,7 +3231,7 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
 	unsigned long flags;
 	bool ret = false;
 
-	local_irq_save(flags);
+	stock = get_obj_stock(flags);
 
 	stock = current_obj_stock();
 	if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) {
@@ -3201,7 +3239,7 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
 		ret = true;
 	}
 
-	local_irq_restore(flags);
+	put_obj_stock(flags);
 
 	return ret;
 }
@@ -3254,8 +3292,13 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 {
 	struct mem_cgroup *memcg;
 
-	if (stock->obj.cached_objcg) {
-		memcg = obj_cgroup_memcg(stock->obj.cached_objcg);
+	if (in_task() && stock->task_obj.cached_objcg) {
+		memcg = obj_cgroup_memcg(stock->task_obj.cached_objcg);
+		if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
+			return true;
+	}
+	if (stock->irq_obj.cached_objcg) {
+		memcg = obj_cgroup_memcg(stock->irq_obj.cached_objcg);
 		if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
 			return true;
 	}
@@ -3283,9 +3326,9 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
 {
 	unsigned long flags;
 
-	local_irq_save(flags);
+	get_obj_stock(flags);
 	__refill_obj_stock(objcg, nr_bytes);
-	local_irq_restore(flags);
+	put_obj_stock(flags);
 }
 
 static void __mod_obj_stock_state(struct obj_cgroup *objcg,
@@ -3325,9 +3368,9 @@ void mod_obj_stock_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
 {
 	unsigned long flags;
 
-	local_irq_save(flags);
+	get_obj_stock(flags);
 	__mod_obj_stock_state(objcg, pgdat, idx, nr);
-	local_irq_restore(flags);
+	put_obj_stock(flags);
 }
 
 int obj_cgroup_charge(struct obj_cgroup *objcg, gfp_t gfp, size_t size)
@@ -3380,10 +3423,10 @@ void obj_cgroup_uncharge_mod_state(struct obj_cgroup *objcg, size_t size,
 {
 	unsigned long flags;
 
-	local_irq_save(flags);
+	get_obj_stock(flags);
 	__refill_obj_stock(objcg, size);
 	__mod_obj_stock_state(objcg, pgdat, idx, -(int)size);
-	local_irq_restore(flags);
+	put_obj_stock(flags);
 }
 
 #endif /* CONFIG_MEMCG_KMEM */
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 5/5] mm/memcg: Optimize user context object stock access
@ 2021-04-09 23:18   ` Waiman Long
  0 siblings, 0 replies; 11+ messages in thread
From: Waiman Long @ 2021-04-09 23:18 UTC (permalink / raw)
  To: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
	Tejun Heo, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Vlastimil Babka, Roman Gushchin
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	Shakeel Butt, Muchun Song, Alex Shi, Chris Down, Yafang Shao,
	Alexander Duyck, Wei Yang, Masayoshi Mizuma, Waiman Long

Most kmem_cache_alloc() calls are from user context. With instrumentation
enabled, the measured amount of kmem_cache_alloc() calls from non-task
context was about 0.01% of the total.

The irq disable/enable sequence used in this case to access content
from object stock is slow.  To optimize for user context access, there
are now two object stocks for task context and interrupt context access
respectively.

The task context object stock can be accessed after disabling preemption
which is cheap in non-preempt kernel. The interrupt context object stock
can only be accessed after disabling interrupt. User context code can
access interrupt object stock, but not vice versa.

The mod_objcg_state() function is also modified to make sure that memcg
and lruvec stat updates are done with interrupted disabled.

The downside of this change is that there are more data stored in local
object stocks and not reflected in the charge counter and the vmstat
arrays.  However, this is a small price to pay for better performance.

Signed-off-by: Waiman Long <longman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 mm/memcontrol.c | 71 +++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 57 insertions(+), 14 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 69f728383efe..00c9074e42e5 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2229,7 +2229,8 @@ struct obj_stock {
 struct memcg_stock_pcp {
 	struct mem_cgroup *cached; /* this never be root cgroup */
 	unsigned int nr_pages;
-	struct obj_stock obj;
+	struct obj_stock task_obj;
+	struct obj_stock irq_obj;
 
 	struct work_struct work;
 	unsigned long flags;
@@ -2254,11 +2255,46 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 }
 #endif
 
+/*
+ * Most kmem_cache_alloc() calls are from user context. The irq disable/enable
+ * sequence used in this case to access content from object stock is slow.
+ * To optimize for user context access, there are now two object stocks for
+ * task context and interrupt context access respectively.
+ *
+ * The task context object stock can be accessed by disabling preemption only
+ * which is cheap in non-preempt kernel. The interrupt context object stock
+ * can only be accessed after disabling interrupt. User context code can
+ * access interrupt object stock, but not vice versa.
+ */
 static inline struct obj_stock *current_obj_stock(void)
 {
 	struct memcg_stock_pcp *stock = this_cpu_ptr(&memcg_stock);
 
-	return &stock->obj;
+	return in_task() ? &stock->task_obj : &stock->irq_obj;
+}
+
+#define get_obj_stock(flags)			\
+({						\
+	struct memcg_stock_pcp *stock;		\
+	struct obj_stock *obj_stock;		\
+						\
+	if (in_task()) {			\
+		preempt_disable();		\
+		(flags) = -1L;			\
+		obj_stock = &stock->task_obj;	\
+	} else {				\
+		local_irq_save(flags);		\
+		obj_stock = &stock->irq_obj;	\
+	}					\
+	obj_stock;				\
+})
+
+static inline void put_obj_stock(unsigned long flags)
+{
+	if (flags == -1L)
+		preempt_enable();
+	else
+		local_irq_restore(flags);
 }
 
 /**
@@ -2327,7 +2363,9 @@ static void drain_local_stock(struct work_struct *dummy)
 	local_irq_save(flags);
 
 	stock = this_cpu_ptr(&memcg_stock);
-	drain_obj_stock(&stock->obj);
+	drain_obj_stock(&stock->irq_obj);
+	if (in_task())
+		drain_obj_stock(&stock->task_obj);
 	drain_stock(stock);
 	clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);
 
@@ -3183,7 +3221,7 @@ static inline void mod_objcg_state(struct obj_cgroup *objcg,
 	memcg = obj_cgroup_memcg(objcg);
 	if (pgdat)
 		lruvec = mem_cgroup_lruvec(memcg, pgdat);
-	__mod_memcg_lruvec_state(memcg, lruvec, idx, nr);
+	mod_memcg_lruvec_state(memcg, lruvec, idx, nr);
 	rcu_read_unlock();
 }
 
@@ -3193,7 +3231,7 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
 	unsigned long flags;
 	bool ret = false;
 
-	local_irq_save(flags);
+	stock = get_obj_stock(flags);
 
 	stock = current_obj_stock();
 	if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) {
@@ -3201,7 +3239,7 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
 		ret = true;
 	}
 
-	local_irq_restore(flags);
+	put_obj_stock(flags);
 
 	return ret;
 }
@@ -3254,8 +3292,13 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 {
 	struct mem_cgroup *memcg;
 
-	if (stock->obj.cached_objcg) {
-		memcg = obj_cgroup_memcg(stock->obj.cached_objcg);
+	if (in_task() && stock->task_obj.cached_objcg) {
+		memcg = obj_cgroup_memcg(stock->task_obj.cached_objcg);
+		if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
+			return true;
+	}
+	if (stock->irq_obj.cached_objcg) {
+		memcg = obj_cgroup_memcg(stock->irq_obj.cached_objcg);
 		if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
 			return true;
 	}
@@ -3283,9 +3326,9 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
 {
 	unsigned long flags;
 
-	local_irq_save(flags);
+	get_obj_stock(flags);
 	__refill_obj_stock(objcg, nr_bytes);
-	local_irq_restore(flags);
+	put_obj_stock(flags);
 }
 
 static void __mod_obj_stock_state(struct obj_cgroup *objcg,
@@ -3325,9 +3368,9 @@ void mod_obj_stock_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
 {
 	unsigned long flags;
 
-	local_irq_save(flags);
+	get_obj_stock(flags);
 	__mod_obj_stock_state(objcg, pgdat, idx, nr);
-	local_irq_restore(flags);
+	put_obj_stock(flags);
 }
 
 int obj_cgroup_charge(struct obj_cgroup *objcg, gfp_t gfp, size_t size)
@@ -3380,10 +3423,10 @@ void obj_cgroup_uncharge_mod_state(struct obj_cgroup *objcg, size_t size,
 {
 	unsigned long flags;
 
-	local_irq_save(flags);
+	get_obj_stock(flags);
 	__refill_obj_stock(objcg, size);
 	__mod_obj_stock_state(objcg, pgdat, idx, -(int)size);
-	local_irq_restore(flags);
+	put_obj_stock(flags);
 }
 
 #endif /* CONFIG_MEMCG_KMEM */
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-04-12 19:58 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-10 12:48 [PATCH 5/5] mm/memcg: Optimize user context object stock access kernel test robot
  -- strict thread matches above, loose matches on Subject: below --
2021-04-09 23:18 [PATCH 0/5] mm/memcg: Reduce kmemcache memory accounting overhead Waiman Long
2021-04-09 23:18 ` [PATCH 5/5] mm/memcg: Optimize user context object stock access Waiman Long
2021-04-09 23:18   ` Waiman Long
2021-04-10  6:07   ` kernel test robot
2021-04-10  6:07     ` kernel test robot
2021-04-12 14:07     ` Waiman Long
2021-04-12 14:07       ` Waiman Long
2021-04-12 18:55   ` Roman Gushchin
2021-04-12 18:55     ` Roman Gushchin
2021-04-12 19:58     ` Waiman Long
2021-04-12 19:58       ` Waiman Long

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.