* Re: [PATCH 5/5] mm/memcg: Optimize user context object stock access
@ 2021-04-10 12:48 kernel test robot
0 siblings, 0 replies; 11+ messages in thread
From: kernel test robot @ 2021-04-10 12:48 UTC (permalink / raw)
To: kbuild
CC: kbuild-all@lists.01.org
In-Reply-To: <20210409231842.8840-6-longman@redhat.com>
References: <20210409231842.8840-6-longman@redhat.com>
TO: Waiman Long <longman@redhat.com>
TO: Johannes Weiner <hannes@cmpxchg.org>
TO: Michal Hocko <mhocko@kernel.org>
TO: Vladimir Davydov <vdavydov.dev@gmail.com>
TO: Andrew Morton <akpm@linux-foundation.org>
CC: Linux Memory Management List <linux-mm@kvack.org>
TO: Tejun Heo <tj@kernel.org>
TO: Christoph Lameter <cl@linux-foundation.org>
TO: Pekka Enberg <penberg@kernel.org>
TO: David Rientjes <rientjes@google.com>
TO: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Hi Waiman,
I love your patch! Perhaps something to improve:
[auto build test WARNING on dennis-percpu/for-next]
[also build test WARNING on linus/master v5.12-rc6 next-20210409]
[cannot apply to hnaz-linux-mm/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Waiman-Long/mm-memcg-Reduce-kmemcache-memory-accounting-overhead/20210410-071958
base: https://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu.git for-next
:::::: branch date: 13 hours ago
:::::: commit date: 13 hours ago
config: x86_64-randconfig-m001-20210409 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
smatch warnings:
mm/memcontrol.c:3234 consume_obj_stock() error: uninitialized symbol 'stock'.
mm/memcontrol.c:3329 refill_obj_stock() error: uninitialized symbol 'stock'.
mm/memcontrol.c:3371 mod_obj_stock_state() error: uninitialized symbol 'stock'.
mm/memcontrol.c:3426 obj_cgroup_uncharge_mod_state() error: uninitialized symbol 'stock'.
vim +/stock +3234 mm/memcontrol.c
e632f602745d79 Waiman Long 2021-04-09 3227
bf4f059954dcb2 Roman Gushchin 2020-08-06 3228 static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
bf4f059954dcb2 Roman Gushchin 2020-08-06 3229 {
6b724b11a34580 Waiman Long 2021-04-09 3230 struct obj_stock *stock;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3231 unsigned long flags;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3232 bool ret = false;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3233
1f4e22fce44599 Waiman Long 2021-04-09 @3234 stock = get_obj_stock(flags);
bf4f059954dcb2 Roman Gushchin 2020-08-06 3235
6b724b11a34580 Waiman Long 2021-04-09 3236 stock = current_obj_stock();
bf4f059954dcb2 Roman Gushchin 2020-08-06 3237 if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) {
bf4f059954dcb2 Roman Gushchin 2020-08-06 3238 stock->nr_bytes -= nr_bytes;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3239 ret = true;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3240 }
bf4f059954dcb2 Roman Gushchin 2020-08-06 3241
1f4e22fce44599 Waiman Long 2021-04-09 3242 put_obj_stock(flags);
bf4f059954dcb2 Roman Gushchin 2020-08-06 3243
bf4f059954dcb2 Roman Gushchin 2020-08-06 3244 return ret;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3245 }
bf4f059954dcb2 Roman Gushchin 2020-08-06 3246
6b724b11a34580 Waiman Long 2021-04-09 3247 static void drain_obj_stock(struct obj_stock *stock)
bf4f059954dcb2 Roman Gushchin 2020-08-06 3248 {
bf4f059954dcb2 Roman Gushchin 2020-08-06 3249 struct obj_cgroup *old = stock->cached_objcg;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3250
bf4f059954dcb2 Roman Gushchin 2020-08-06 3251 if (!old)
bf4f059954dcb2 Roman Gushchin 2020-08-06 3252 return;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3253
bf4f059954dcb2 Roman Gushchin 2020-08-06 3254 if (stock->nr_bytes) {
bf4f059954dcb2 Roman Gushchin 2020-08-06 3255 unsigned int nr_pages = stock->nr_bytes >> PAGE_SHIFT;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3256 unsigned int nr_bytes = stock->nr_bytes & (PAGE_SIZE - 1);
bf4f059954dcb2 Roman Gushchin 2020-08-06 3257
bf4f059954dcb2 Roman Gushchin 2020-08-06 3258 if (nr_pages) {
bf4f059954dcb2 Roman Gushchin 2020-08-06 3259 rcu_read_lock();
bf4f059954dcb2 Roman Gushchin 2020-08-06 3260 __memcg_kmem_uncharge(obj_cgroup_memcg(old), nr_pages);
bf4f059954dcb2 Roman Gushchin 2020-08-06 3261 rcu_read_unlock();
bf4f059954dcb2 Roman Gushchin 2020-08-06 3262 }
bf4f059954dcb2 Roman Gushchin 2020-08-06 3263
bf4f059954dcb2 Roman Gushchin 2020-08-06 3264 /*
bf4f059954dcb2 Roman Gushchin 2020-08-06 3265 * The leftover is flushed to the centralized per-memcg value.
bf4f059954dcb2 Roman Gushchin 2020-08-06 3266 * On the next attempt to refill obj stock it will be moved
bf4f059954dcb2 Roman Gushchin 2020-08-06 3267 * to a per-cpu stock (probably, on an other CPU), see
bf4f059954dcb2 Roman Gushchin 2020-08-06 3268 * refill_obj_stock().
bf4f059954dcb2 Roman Gushchin 2020-08-06 3269 *
bf4f059954dcb2 Roman Gushchin 2020-08-06 3270 * How often it's flushed is a trade-off between the memory
bf4f059954dcb2 Roman Gushchin 2020-08-06 3271 * limit enforcement accuracy and potential CPU contention,
bf4f059954dcb2 Roman Gushchin 2020-08-06 3272 * so it might be changed in the future.
bf4f059954dcb2 Roman Gushchin 2020-08-06 3273 */
bf4f059954dcb2 Roman Gushchin 2020-08-06 3274 atomic_add(nr_bytes, &old->nr_charged_bytes);
bf4f059954dcb2 Roman Gushchin 2020-08-06 3275 stock->nr_bytes = 0;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3276 }
bf4f059954dcb2 Roman Gushchin 2020-08-06 3277
e632f602745d79 Waiman Long 2021-04-09 3278 if (stock->vmstat_bytes) {
e632f602745d79 Waiman Long 2021-04-09 3279 mod_objcg_state(old, stock->cached_pgdat, stock->vmstat_idx,
e632f602745d79 Waiman Long 2021-04-09 3280 stock->vmstat_bytes);
e632f602745d79 Waiman Long 2021-04-09 3281 stock->vmstat_bytes = 0;
e632f602745d79 Waiman Long 2021-04-09 3282 stock->vmstat_idx = 0;
e632f602745d79 Waiman Long 2021-04-09 3283 stock->cached_pgdat = NULL;
e632f602745d79 Waiman Long 2021-04-09 3284 }
e632f602745d79 Waiman Long 2021-04-09 3285
bf4f059954dcb2 Roman Gushchin 2020-08-06 3286 obj_cgroup_put(old);
bf4f059954dcb2 Roman Gushchin 2020-08-06 3287 stock->cached_objcg = NULL;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3288 }
bf4f059954dcb2 Roman Gushchin 2020-08-06 3289
bf4f059954dcb2 Roman Gushchin 2020-08-06 3290 static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
bf4f059954dcb2 Roman Gushchin 2020-08-06 3291 struct mem_cgroup *root_memcg)
bf4f059954dcb2 Roman Gushchin 2020-08-06 3292 {
bf4f059954dcb2 Roman Gushchin 2020-08-06 3293 struct mem_cgroup *memcg;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3294
1f4e22fce44599 Waiman Long 2021-04-09 3295 if (in_task() && stock->task_obj.cached_objcg) {
1f4e22fce44599 Waiman Long 2021-04-09 3296 memcg = obj_cgroup_memcg(stock->task_obj.cached_objcg);
1f4e22fce44599 Waiman Long 2021-04-09 3297 if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
1f4e22fce44599 Waiman Long 2021-04-09 3298 return true;
1f4e22fce44599 Waiman Long 2021-04-09 3299 }
1f4e22fce44599 Waiman Long 2021-04-09 3300 if (stock->irq_obj.cached_objcg) {
1f4e22fce44599 Waiman Long 2021-04-09 3301 memcg = obj_cgroup_memcg(stock->irq_obj.cached_objcg);
bf4f059954dcb2 Roman Gushchin 2020-08-06 3302 if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
bf4f059954dcb2 Roman Gushchin 2020-08-06 3303 return true;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3304 }
bf4f059954dcb2 Roman Gushchin 2020-08-06 3305
bf4f059954dcb2 Roman Gushchin 2020-08-06 3306 return false;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3307 }
bf4f059954dcb2 Roman Gushchin 2020-08-06 3308
6510c46c2c5965 Waiman Long 2021-04-09 3309 static void __refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
bf4f059954dcb2 Roman Gushchin 2020-08-06 3310 {
6b724b11a34580 Waiman Long 2021-04-09 3311 struct obj_stock *stock = current_obj_stock();
bf4f059954dcb2 Roman Gushchin 2020-08-06 3312
bf4f059954dcb2 Roman Gushchin 2020-08-06 3313 if (stock->cached_objcg != objcg) { /* reset if necessary */
bf4f059954dcb2 Roman Gushchin 2020-08-06 3314 drain_obj_stock(stock);
bf4f059954dcb2 Roman Gushchin 2020-08-06 3315 obj_cgroup_get(objcg);
bf4f059954dcb2 Roman Gushchin 2020-08-06 3316 stock->cached_objcg = objcg;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3317 stock->nr_bytes = atomic_xchg(&objcg->nr_charged_bytes, 0);
bf4f059954dcb2 Roman Gushchin 2020-08-06 3318 }
bf4f059954dcb2 Roman Gushchin 2020-08-06 3319 stock->nr_bytes += nr_bytes;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3320
bf4f059954dcb2 Roman Gushchin 2020-08-06 3321 if (stock->nr_bytes > PAGE_SIZE)
bf4f059954dcb2 Roman Gushchin 2020-08-06 3322 drain_obj_stock(stock);
6510c46c2c5965 Waiman Long 2021-04-09 3323 }
6510c46c2c5965 Waiman Long 2021-04-09 3324
6510c46c2c5965 Waiman Long 2021-04-09 3325 static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
6510c46c2c5965 Waiman Long 2021-04-09 3326 {
6510c46c2c5965 Waiman Long 2021-04-09 3327 unsigned long flags;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3328
1f4e22fce44599 Waiman Long 2021-04-09 @3329 get_obj_stock(flags);
6510c46c2c5965 Waiman Long 2021-04-09 3330 __refill_obj_stock(objcg, nr_bytes);
1f4e22fce44599 Waiman Long 2021-04-09 3331 put_obj_stock(flags);
bf4f059954dcb2 Roman Gushchin 2020-08-06 3332 }
bf4f059954dcb2 Roman Gushchin 2020-08-06 3333
e632f602745d79 Waiman Long 2021-04-09 3334 static void __mod_obj_stock_state(struct obj_cgroup *objcg,
e632f602745d79 Waiman Long 2021-04-09 3335 struct pglist_data *pgdat, int idx, int nr)
e632f602745d79 Waiman Long 2021-04-09 3336 {
6b724b11a34580 Waiman Long 2021-04-09 3337 struct obj_stock *stock = current_obj_stock();
e632f602745d79 Waiman Long 2021-04-09 3338
e632f602745d79 Waiman Long 2021-04-09 3339 if (stock->cached_objcg != objcg) {
e632f602745d79 Waiman Long 2021-04-09 3340 /* Output the current data as is */
e632f602745d79 Waiman Long 2021-04-09 3341 } else if (!stock->vmstat_bytes) {
e632f602745d79 Waiman Long 2021-04-09 3342 /* Save the current data */
e632f602745d79 Waiman Long 2021-04-09 3343 stock->vmstat_bytes = nr;
e632f602745d79 Waiman Long 2021-04-09 3344 stock->vmstat_idx = idx;
e632f602745d79 Waiman Long 2021-04-09 3345 stock->cached_pgdat = pgdat;
e632f602745d79 Waiman Long 2021-04-09 3346 nr = 0;
e632f602745d79 Waiman Long 2021-04-09 3347 } else if ((stock->cached_pgdat != pgdat) ||
e632f602745d79 Waiman Long 2021-04-09 3348 (stock->vmstat_idx != idx)) {
e632f602745d79 Waiman Long 2021-04-09 3349 /* Output the cached data & save the current data */
e632f602745d79 Waiman Long 2021-04-09 3350 swap(nr, stock->vmstat_bytes);
e632f602745d79 Waiman Long 2021-04-09 3351 swap(idx, stock->vmstat_idx);
e632f602745d79 Waiman Long 2021-04-09 3352 swap(pgdat, stock->cached_pgdat);
e632f602745d79 Waiman Long 2021-04-09 3353 } else {
e632f602745d79 Waiman Long 2021-04-09 3354 stock->vmstat_bytes += nr;
e632f602745d79 Waiman Long 2021-04-09 3355 if (abs(nr) > PAGE_SIZE) {
e632f602745d79 Waiman Long 2021-04-09 3356 nr = stock->vmstat_bytes;
e632f602745d79 Waiman Long 2021-04-09 3357 stock->vmstat_bytes = 0;
e632f602745d79 Waiman Long 2021-04-09 3358 } else {
e632f602745d79 Waiman Long 2021-04-09 3359 nr = 0;
e632f602745d79 Waiman Long 2021-04-09 3360 }
e632f602745d79 Waiman Long 2021-04-09 3361 }
e632f602745d79 Waiman Long 2021-04-09 3362 if (nr)
e632f602745d79 Waiman Long 2021-04-09 3363 mod_objcg_state(objcg, pgdat, idx, nr);
e632f602745d79 Waiman Long 2021-04-09 3364 }
e632f602745d79 Waiman Long 2021-04-09 3365
e632f602745d79 Waiman Long 2021-04-09 3366 void mod_obj_stock_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
e632f602745d79 Waiman Long 2021-04-09 3367 int idx, int nr)
e632f602745d79 Waiman Long 2021-04-09 3368 {
e632f602745d79 Waiman Long 2021-04-09 3369 unsigned long flags;
e632f602745d79 Waiman Long 2021-04-09 3370
1f4e22fce44599 Waiman Long 2021-04-09 @3371 get_obj_stock(flags);
e632f602745d79 Waiman Long 2021-04-09 3372 __mod_obj_stock_state(objcg, pgdat, idx, nr);
1f4e22fce44599 Waiman Long 2021-04-09 3373 put_obj_stock(flags);
e632f602745d79 Waiman Long 2021-04-09 3374 }
e632f602745d79 Waiman Long 2021-04-09 3375
bf4f059954dcb2 Roman Gushchin 2020-08-06 3376 int obj_cgroup_charge(struct obj_cgroup *objcg, gfp_t gfp, size_t size)
bf4f059954dcb2 Roman Gushchin 2020-08-06 3377 {
bf4f059954dcb2 Roman Gushchin 2020-08-06 3378 struct mem_cgroup *memcg;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3379 unsigned int nr_pages, nr_bytes;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3380 int ret;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3381
bf4f059954dcb2 Roman Gushchin 2020-08-06 3382 if (consume_obj_stock(objcg, size))
bf4f059954dcb2 Roman Gushchin 2020-08-06 3383 return 0;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3384
bf4f059954dcb2 Roman Gushchin 2020-08-06 3385 /*
bf4f059954dcb2 Roman Gushchin 2020-08-06 3386 * In theory, memcg->nr_charged_bytes can have enough
bf4f059954dcb2 Roman Gushchin 2020-08-06 3387 * pre-charged bytes to satisfy the allocation. However,
bf4f059954dcb2 Roman Gushchin 2020-08-06 3388 * flushing memcg->nr_charged_bytes requires two atomic
bf4f059954dcb2 Roman Gushchin 2020-08-06 3389 * operations, and memcg->nr_charged_bytes can't be big,
bf4f059954dcb2 Roman Gushchin 2020-08-06 3390 * so it's better to ignore it and try grab some new pages.
bf4f059954dcb2 Roman Gushchin 2020-08-06 3391 * memcg->nr_charged_bytes will be flushed in
bf4f059954dcb2 Roman Gushchin 2020-08-06 3392 * refill_obj_stock(), called from this function or
bf4f059954dcb2 Roman Gushchin 2020-08-06 3393 * independently later.
bf4f059954dcb2 Roman Gushchin 2020-08-06 3394 */
bf4f059954dcb2 Roman Gushchin 2020-08-06 3395 rcu_read_lock();
eefbfa7fd67880 Muchun Song 2020-12-14 3396 retry:
bf4f059954dcb2 Roman Gushchin 2020-08-06 3397 memcg = obj_cgroup_memcg(objcg);
eefbfa7fd67880 Muchun Song 2020-12-14 3398 if (unlikely(!css_tryget(&memcg->css)))
eefbfa7fd67880 Muchun Song 2020-12-14 3399 goto retry;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3400 rcu_read_unlock();
bf4f059954dcb2 Roman Gushchin 2020-08-06 3401
bf4f059954dcb2 Roman Gushchin 2020-08-06 3402 nr_pages = size >> PAGE_SHIFT;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3403 nr_bytes = size & (PAGE_SIZE - 1);
bf4f059954dcb2 Roman Gushchin 2020-08-06 3404
bf4f059954dcb2 Roman Gushchin 2020-08-06 3405 if (nr_bytes)
bf4f059954dcb2 Roman Gushchin 2020-08-06 3406 nr_pages += 1;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3407
bf4f059954dcb2 Roman Gushchin 2020-08-06 3408 ret = __memcg_kmem_charge(memcg, gfp, nr_pages);
bf4f059954dcb2 Roman Gushchin 2020-08-06 3409 if (!ret && nr_bytes)
bf4f059954dcb2 Roman Gushchin 2020-08-06 3410 refill_obj_stock(objcg, PAGE_SIZE - nr_bytes);
bf4f059954dcb2 Roman Gushchin 2020-08-06 3411
bf4f059954dcb2 Roman Gushchin 2020-08-06 3412 css_put(&memcg->css);
bf4f059954dcb2 Roman Gushchin 2020-08-06 3413 return ret;
bf4f059954dcb2 Roman Gushchin 2020-08-06 3414 }
bf4f059954dcb2 Roman Gushchin 2020-08-06 3415
bf4f059954dcb2 Roman Gushchin 2020-08-06 3416 void obj_cgroup_uncharge(struct obj_cgroup *objcg, size_t size)
bf4f059954dcb2 Roman Gushchin 2020-08-06 3417 {
bf4f059954dcb2 Roman Gushchin 2020-08-06 3418 refill_obj_stock(objcg, size);
bf4f059954dcb2 Roman Gushchin 2020-08-06 3419 }
bf4f059954dcb2 Roman Gushchin 2020-08-06 3420
6510c46c2c5965 Waiman Long 2021-04-09 3421 void obj_cgroup_uncharge_mod_state(struct obj_cgroup *objcg, size_t size,
6510c46c2c5965 Waiman Long 2021-04-09 3422 struct pglist_data *pgdat, int idx)
6510c46c2c5965 Waiman Long 2021-04-09 3423 {
6510c46c2c5965 Waiman Long 2021-04-09 3424 unsigned long flags;
6510c46c2c5965 Waiman Long 2021-04-09 3425
1f4e22fce44599 Waiman Long 2021-04-09 @3426 get_obj_stock(flags);
6510c46c2c5965 Waiman Long 2021-04-09 3427 __refill_obj_stock(objcg, size);
e632f602745d79 Waiman Long 2021-04-09 3428 __mod_obj_stock_state(objcg, pgdat, idx, -(int)size);
1f4e22fce44599 Waiman Long 2021-04-09 3429 put_obj_stock(flags);
6510c46c2c5965 Waiman Long 2021-04-09 3430 }
6510c46c2c5965 Waiman Long 2021-04-09 3431
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
* Re: [PATCH 5/5] mm/memcg: Optimize user context object stock access
@ 2021-04-12 19:58 ` Waiman Long
0 siblings, 0 replies; 11+ messages in thread
From: Waiman Long @ 2021-04-12 19:58 UTC (permalink / raw)
To: Roman Gushchin
Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
Tejun Heo, Christoph Lameter, Pekka Enberg, David Rientjes,
Joonsoo Kim, Vlastimil Babka, linux-kernel, cgroups, linux-mm,
Shakeel Butt, Muchun Song, Alex Shi, Chris Down, Yafang Shao,
Alexander Duyck, Wei Yang, Masayoshi Mizuma
On 4/12/21 2:55 PM, Roman Gushchin wrote:
> On Fri, Apr 09, 2021 at 07:18:42PM -0400, Waiman Long wrote:
>> Most kmem_cache_alloc() calls are from user context. With instrumentation
>> enabled, the measured share of kmem_cache_alloc() calls from non-task
>> context was about 0.01% of the total.
>>
>> The irq disable/enable sequence used in this case to access content
>> from object stock is slow. To optimize for user context access, there
>> are now two object stocks for task context and interrupt context access
>> respectively.
>>
>> The task context object stock can be accessed after disabling preemption
>> which is cheap in non-preempt kernel. The interrupt context object stock
>> can only be accessed after disabling interrupt. User context code can
>> access interrupt object stock, but not vice versa.
>>
>> The mod_objcg_state() function is also modified to make sure that memcg
>> and lruvec stat updates are done with interrupts disabled.
>>
>> The downside of this change is that there are more data stored in local
>> object stocks and not reflected in the charge counter and the vmstat
>> arrays. However, this is a small price to pay for better performance.
> I agree, the extra memory space is not a significant concern.
> I'd be more worried about the code complexity, but the result looks
> nice to me!
>
> Acked-by: Roman Gushchin <guro@fb.com>
>
> Btw, it seems that the mm tree ran a bit off, so I had to apply this series
> on top of Linus's tree to review. Please, rebase.
This patchset is based on the code in Linus' tree. I had applied the
patchset to linux-next to see if there was any conflicts. Two of the
patches had minor fuzzes around the edge but no actual merge conflict
for now.
Cheers,
Longman
* Re: [PATCH 5/5] mm/memcg: Optimize user context object stock access
@ 2021-04-12 18:55 ` Roman Gushchin
0 siblings, 0 replies; 11+ messages in thread
From: Roman Gushchin @ 2021-04-12 18:55 UTC (permalink / raw)
To: Waiman Long
Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
Tejun Heo, Christoph Lameter, Pekka Enberg, David Rientjes,
Joonsoo Kim, Vlastimil Babka, linux-kernel, cgroups, linux-mm,
Shakeel Butt, Muchun Song, Alex Shi, Chris Down, Yafang Shao,
Alexander Duyck, Wei Yang, Masayoshi Mizuma
On Fri, Apr 09, 2021 at 07:18:42PM -0400, Waiman Long wrote:
> Most kmem_cache_alloc() calls are from user context. With instrumentation
> enabled, the measured share of kmem_cache_alloc() calls from non-task
> context was about 0.01% of the total.
>
> The irq disable/enable sequence used in this case to access content
> from object stock is slow. To optimize for user context access, there
> are now two object stocks for task context and interrupt context access
> respectively.
>
> The task context object stock can be accessed after disabling preemption
> which is cheap in non-preempt kernel. The interrupt context object stock
> can only be accessed after disabling interrupt. User context code can
> access interrupt object stock, but not vice versa.
>
> The mod_objcg_state() function is also modified to make sure that memcg
> and lruvec stat updates are done with interrupts disabled.
>
> The downside of this change is that there are more data stored in local
> object stocks and not reflected in the charge counter and the vmstat
> arrays. However, this is a small price to pay for better performance.
I agree, the extra memory space is not a significant concern.
I'd be more worried about the code complexity, but the result looks
nice to me!
Acked-by: Roman Gushchin <guro@fb.com>
Btw, it seems that the mm tree ran a bit off, so I had to apply this series
on top of Linus's tree to review. Please, rebase.
Thanks!
* Re: [PATCH 5/5] mm/memcg: Optimize user context object stock access
2021-04-10 6:07 ` kernel test robot
@ 2021-04-12 14:07 ` Waiman Long
-1 siblings, 0 replies; 11+ messages in thread
From: Waiman Long @ 2021-04-12 14:07 UTC (permalink / raw)
To: kernel test robot, Johannes Weiner, Michal Hocko,
Vladimir Davydov, Andrew Morton, Tejun Heo, Christoph Lameter,
Pekka Enberg, David Rientjes, Joonsoo Kim
Cc: kbuild-all, clang-built-linux, Linux Memory Management List
On 4/10/21 2:07 AM, kernel test robot wrote:
> Hi Waiman,
>
> I love your patch! Perhaps something to improve:
>
> [auto build test WARNING on dennis-percpu/for-next]
> [also build test WARNING on linus/master v5.12-rc6 next-20210409]
> [cannot apply to hnaz-linux-mm/master]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url: https://github.com/0day-ci/linux/commits/Waiman-Long/mm-memcg-Reduce-kmemcache-memory-accounting-overhead/20210410-071958
> base: https://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu.git for-next
> config: arm64-randconfig-r031-20210409 (attached as .config)
> compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project dd453a1389b6a7e6d9214b449d3c54981b1a89b6)
> reproduce (this is a W=1 build):
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # install arm64 cross compiling tool for clang build
> # apt-get install binutils-aarch64-linux-gnu
> # https://github.com/0day-ci/linux/commit/1f4e22fce44599095a55535301ca83adc5d3a4fe
> git remote add linux-review https://github.com/0day-ci/linux
> git fetch --no-tags linux-review Waiman-Long/mm-memcg-Reduce-kmemcache-memory-accounting-overhead/20210410-071958
> git checkout 1f4e22fce44599095a55535301ca83adc5d3a4fe
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
>
> All warnings (new ones prefixed by >>):
>
>>> mm/memcontrol.c:3234:10: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
> stock = get_obj_stock(flags);
> ^~~~~~~~~~~~~~~~~~~~
> mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
> obj_stock = &stock->task_obj; \
> ^~~~~
> mm/memcontrol.c:3234:10: note: variable 'stock' is declared here
> mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
> struct memcg_stock_pcp *stock; \
> ^
> mm/memcontrol.c:3329:2: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
> get_obj_stock(flags);
> ^~~~~~~~~~~~~~~~~~~~
> mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
> obj_stock = &stock->task_obj; \
> ^~~~~
> mm/memcontrol.c:3329:2: note: variable 'stock' is declared here
> mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
> struct memcg_stock_pcp *stock; \
> ^
> mm/memcontrol.c:3371:2: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
> get_obj_stock(flags);
> ^~~~~~~~~~~~~~~~~~~~
> mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
> obj_stock = &stock->task_obj; \
> ^~~~~
> mm/memcontrol.c:3371:2: note: variable 'stock' is declared here
> mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
> struct memcg_stock_pcp *stock; \
> ^
> mm/memcontrol.c:3426:2: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
> get_obj_stock(flags);
> ^~~~~~~~~~~~~~~~~~~~
> mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
> obj_stock = &stock->task_obj; \
> ^~~~~
> mm/memcontrol.c:3426:2: note: variable 'stock' is declared here
> mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
> struct memcg_stock_pcp *stock; \
> ^
> 4 warnings generated.
>
>
> vim +/stock +3234 mm/memcontrol.c
>
> 3227
> 3228 static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
> 3229 {
> 3230 struct obj_stock *stock;
> 3231 unsigned long flags;
> 3232 bool ret = false;
> 3233
>> 3234 stock = get_obj_stock(flags);
> 3235
> 3236 stock = current_obj_stock();
> 3237 if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) {
> 3238 stock->nr_bytes -= nr_bytes;
> 3239 ret = true;
> 3240 }
> 3241
> 3242 put_obj_stock(flags);
> 3243
> 3244 return ret;
> 3245 }
> 3246
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
My bad, I somehow missed it. I will fix that in the next version.
Thanks,
Longman
> struct memcg_stock_pcp *stock; \
> ^
> 4 warnings generated.
>
>
> vim +/stock +3234 mm/memcontrol.c
>
> 3227
> 3228 static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
> 3229 {
> 3230 struct obj_stock *stock;
> 3231 unsigned long flags;
> 3232 bool ret = false;
> 3233
>> 3234 stock = get_obj_stock(flags);
> 3235
> 3236 stock = current_obj_stock();
> 3237 if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) {
> 3238 stock->nr_bytes -= nr_bytes;
> 3239 ret = true;
> 3240 }
> 3241
> 3242 put_obj_stock(flags);
> 3243
> 3244 return ret;
> 3245 }
> 3246
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
My bad, I somehow missed it. I will fix that in the next version.
Thanks,
Longman
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 5/5] mm/memcg: Optimize user context object stock access
2021-04-09 23:18 ` Waiman Long
@ 2021-04-10 6:07 ` kernel test robot
-1 siblings, 0 replies; 11+ messages in thread
From: kernel test robot @ 2021-04-10 6:07 UTC (permalink / raw)
To: Waiman Long, Johannes Weiner, Michal Hocko, Vladimir Davydov,
Andrew Morton, Tejun Heo, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim
Cc: kbuild-all, clang-built-linux, Linux Memory Management List
[-- Attachment #1: Type: text/plain, Size: 4536 bytes --]
Hi Waiman,
I love your patch! Perhaps something to improve:
[auto build test WARNING on dennis-percpu/for-next]
[also build test WARNING on linus/master v5.12-rc6 next-20210409]
[cannot apply to hnaz-linux-mm/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Waiman-Long/mm-memcg-Reduce-kmemcache-memory-accounting-overhead/20210410-071958
base: https://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu.git for-next
config: arm64-randconfig-r031-20210409 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project dd453a1389b6a7e6d9214b449d3c54981b1a89b6)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install arm64 cross compiling tool for clang build
# apt-get install binutils-aarch64-linux-gnu
# https://github.com/0day-ci/linux/commit/1f4e22fce44599095a55535301ca83adc5d3a4fe
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Waiman-Long/mm-memcg-Reduce-kmemcache-memory-accounting-overhead/20210410-071958
git checkout 1f4e22fce44599095a55535301ca83adc5d3a4fe
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All warnings (new ones prefixed by >>):
>> mm/memcontrol.c:3234:10: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
stock = get_obj_stock(flags);
^~~~~~~~~~~~~~~~~~~~
mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
obj_stock = &stock->task_obj; \
^~~~~
mm/memcontrol.c:3234:10: note: variable 'stock' is declared here
mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
struct memcg_stock_pcp *stock; \
^
mm/memcontrol.c:3329:2: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
get_obj_stock(flags);
^~~~~~~~~~~~~~~~~~~~
mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
obj_stock = &stock->task_obj; \
^~~~~
mm/memcontrol.c:3329:2: note: variable 'stock' is declared here
mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
struct memcg_stock_pcp *stock; \
^
mm/memcontrol.c:3371:2: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
get_obj_stock(flags);
^~~~~~~~~~~~~~~~~~~~
mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
obj_stock = &stock->task_obj; \
^~~~~
mm/memcontrol.c:3371:2: note: variable 'stock' is declared here
mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
struct memcg_stock_pcp *stock; \
^
mm/memcontrol.c:3426:2: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
get_obj_stock(flags);
^~~~~~~~~~~~~~~~~~~~
mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
obj_stock = &stock->task_obj; \
^~~~~
mm/memcontrol.c:3426:2: note: variable 'stock' is declared here
mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
struct memcg_stock_pcp *stock; \
^
4 warnings generated.
vim +/stock +3234 mm/memcontrol.c
3227
3228 static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
3229 {
3230 struct obj_stock *stock;
3231 unsigned long flags;
3232 bool ret = false;
3233
> 3234 stock = get_obj_stock(flags);
3235
3236 stock = current_obj_stock();
3237 if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) {
3238 stock->nr_bytes -= nr_bytes;
3239 ret = true;
3240 }
3241
3242 put_obj_stock(flags);
3243
3244 return ret;
3245 }
3246
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 28994 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 5/5] mm/memcg: Optimize user context object stock access
@ 2021-04-10 6:07 ` kernel test robot
0 siblings, 0 replies; 11+ messages in thread
From: kernel test robot @ 2021-04-10 6:07 UTC (permalink / raw)
To: kbuild-all
[-- Attachment #1: Type: text/plain, Size: 4639 bytes --]
Hi Waiman,
I love your patch! Perhaps something to improve:
[auto build test WARNING on dennis-percpu/for-next]
[also build test WARNING on linus/master v5.12-rc6 next-20210409]
[cannot apply to hnaz-linux-mm/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Waiman-Long/mm-memcg-Reduce-kmemcache-memory-accounting-overhead/20210410-071958
base: https://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu.git for-next
config: arm64-randconfig-r031-20210409 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project dd453a1389b6a7e6d9214b449d3c54981b1a89b6)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install arm64 cross compiling tool for clang build
# apt-get install binutils-aarch64-linux-gnu
# https://github.com/0day-ci/linux/commit/1f4e22fce44599095a55535301ca83adc5d3a4fe
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Waiman-Long/mm-memcg-Reduce-kmemcache-memory-accounting-overhead/20210410-071958
git checkout 1f4e22fce44599095a55535301ca83adc5d3a4fe
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All warnings (new ones prefixed by >>):
>> mm/memcontrol.c:3234:10: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
stock = get_obj_stock(flags);
^~~~~~~~~~~~~~~~~~~~
mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
obj_stock = &stock->task_obj; \
^~~~~
mm/memcontrol.c:3234:10: note: variable 'stock' is declared here
mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
struct memcg_stock_pcp *stock; \
^
mm/memcontrol.c:3329:2: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
get_obj_stock(flags);
^~~~~~~~~~~~~~~~~~~~
mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
obj_stock = &stock->task_obj; \
^~~~~
mm/memcontrol.c:3329:2: note: variable 'stock' is declared here
mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
struct memcg_stock_pcp *stock; \
^
mm/memcontrol.c:3371:2: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
get_obj_stock(flags);
^~~~~~~~~~~~~~~~~~~~
mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
obj_stock = &stock->task_obj; \
^~~~~
mm/memcontrol.c:3371:2: note: variable 'stock' is declared here
mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
struct memcg_stock_pcp *stock; \
^
mm/memcontrol.c:3426:2: warning: variable 'stock' is uninitialized when used here [-Wuninitialized]
get_obj_stock(flags);
^~~~~~~~~~~~~~~~~~~~
mm/memcontrol.c:2284:16: note: expanded from macro 'get_obj_stock'
obj_stock = &stock->task_obj; \
^~~~~
mm/memcontrol.c:3426:2: note: variable 'stock' is declared here
mm/memcontrol.c:2278:2: note: expanded from macro 'get_obj_stock'
struct memcg_stock_pcp *stock; \
^
4 warnings generated.
vim +/stock +3234 mm/memcontrol.c
3227
3228 static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
3229 {
3230 struct obj_stock *stock;
3231 unsigned long flags;
3232 bool ret = false;
3233
> 3234 stock = get_obj_stock(flags);
3235
3236 stock = current_obj_stock();
3237 if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) {
3238 stock->nr_bytes -= nr_bytes;
3239 ret = true;
3240 }
3241
3242 put_obj_stock(flags);
3243
3244 return ret;
3245 }
3246
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 28994 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 5/5] mm/memcg: Optimize user context object stock access
@ 2021-04-09 23:18 ` Waiman Long
0 siblings, 0 replies; 11+ messages in thread
From: Waiman Long @ 2021-04-09 23:18 UTC (permalink / raw)
To: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
Tejun Heo, Christoph Lameter, Pekka Enberg, David Rientjes,
Joonsoo Kim, Vlastimil Babka, Roman Gushchin
Cc: linux-kernel, cgroups, linux-mm, Shakeel Butt, Muchun Song,
Alex Shi, Chris Down, Yafang Shao, Alexander Duyck, Wei Yang,
Masayoshi Mizuma, Waiman Long
Most kmem_cache_alloc() calls are from user context. With instrumentation
enabled, the measured number of kmem_cache_alloc() calls from non-task
context was about 0.01% of the total.
The irq disable/enable sequence used in this case to access content
from object stock is slow. To optimize for user context access, there
are now two object stocks for task context and interrupt context access
respectively.
The task context object stock can be accessed after disabling preemption,
which is cheap in a non-preempt kernel. The interrupt context object stock
can only be accessed after disabling interrupts. User context code can
access the interrupt object stock, but not vice versa.
The mod_objcg_state() function is also modified to make sure that memcg
and lruvec stat updates are done with interrupts disabled.
The downside of this change is that more data may be stored in the local
object stocks without being reflected in the charge counter and the vmstat
arrays. However, this is a small price to pay for better performance.
Signed-off-by: Waiman Long <longman@redhat.com>
---
mm/memcontrol.c | 71 +++++++++++++++++++++++++++++++++++++++----------
1 file changed, 57 insertions(+), 14 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 69f728383efe..00c9074e42e5 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2229,7 +2229,8 @@ struct obj_stock {
struct memcg_stock_pcp {
struct mem_cgroup *cached; /* this never be root cgroup */
unsigned int nr_pages;
- struct obj_stock obj;
+ struct obj_stock task_obj;
+ struct obj_stock irq_obj;
struct work_struct work;
unsigned long flags;
@@ -2254,11 +2255,46 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
}
#endif
+/*
+ * Most kmem_cache_alloc() calls are from user context. The irq disable/enable
+ * sequence used in this case to access content from object stock is slow.
+ * To optimize for user context access, there are now two object stocks for
+ * task context and interrupt context access respectively.
+ *
+ * The task context object stock can be accessed by disabling preemption only
+ * which is cheap in non-preempt kernel. The interrupt context object stock
+ * can only be accessed after disabling interrupt. User context code can
+ * access interrupt object stock, but not vice versa.
+ */
static inline struct obj_stock *current_obj_stock(void)
{
struct memcg_stock_pcp *stock = this_cpu_ptr(&memcg_stock);
- return &stock->obj;
+ return in_task() ? &stock->task_obj : &stock->irq_obj;
+}
+
+#define get_obj_stock(flags) \
+({ \
+ struct memcg_stock_pcp *stock; \
+ struct obj_stock *obj_stock; \
+ \
+ if (in_task()) { \
+ preempt_disable(); \
+ (flags) = -1L; \
+ obj_stock = &stock->task_obj; \
+ } else { \
+ local_irq_save(flags); \
+ obj_stock = &stock->irq_obj; \
+ } \
+ obj_stock; \
+})
+
+static inline void put_obj_stock(unsigned long flags)
+{
+ if (flags == -1L)
+ preempt_enable();
+ else
+ local_irq_restore(flags);
}
/**
@@ -2327,7 +2363,9 @@ static void drain_local_stock(struct work_struct *dummy)
local_irq_save(flags);
stock = this_cpu_ptr(&memcg_stock);
- drain_obj_stock(&stock->obj);
+ drain_obj_stock(&stock->irq_obj);
+ if (in_task())
+ drain_obj_stock(&stock->task_obj);
drain_stock(stock);
clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);
@@ -3183,7 +3221,7 @@ static inline void mod_objcg_state(struct obj_cgroup *objcg,
memcg = obj_cgroup_memcg(objcg);
if (pgdat)
lruvec = mem_cgroup_lruvec(memcg, pgdat);
- __mod_memcg_lruvec_state(memcg, lruvec, idx, nr);
+ mod_memcg_lruvec_state(memcg, lruvec, idx, nr);
rcu_read_unlock();
}
@@ -3193,7 +3231,7 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
unsigned long flags;
bool ret = false;
- local_irq_save(flags);
+ stock = get_obj_stock(flags);
stock = current_obj_stock();
if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) {
@@ -3201,7 +3239,7 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
ret = true;
}
- local_irq_restore(flags);
+ put_obj_stock(flags);
return ret;
}
@@ -3254,8 +3292,13 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
{
struct mem_cgroup *memcg;
- if (stock->obj.cached_objcg) {
- memcg = obj_cgroup_memcg(stock->obj.cached_objcg);
+ if (in_task() && stock->task_obj.cached_objcg) {
+ memcg = obj_cgroup_memcg(stock->task_obj.cached_objcg);
+ if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
+ return true;
+ }
+ if (stock->irq_obj.cached_objcg) {
+ memcg = obj_cgroup_memcg(stock->irq_obj.cached_objcg);
if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
return true;
}
@@ -3283,9 +3326,9 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
{
unsigned long flags;
- local_irq_save(flags);
+ get_obj_stock(flags);
__refill_obj_stock(objcg, nr_bytes);
- local_irq_restore(flags);
+ put_obj_stock(flags);
}
static void __mod_obj_stock_state(struct obj_cgroup *objcg,
@@ -3325,9 +3368,9 @@ void mod_obj_stock_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
{
unsigned long flags;
- local_irq_save(flags);
+ get_obj_stock(flags);
__mod_obj_stock_state(objcg, pgdat, idx, nr);
- local_irq_restore(flags);
+ put_obj_stock(flags);
}
int obj_cgroup_charge(struct obj_cgroup *objcg, gfp_t gfp, size_t size)
@@ -3380,10 +3423,10 @@ void obj_cgroup_uncharge_mod_state(struct obj_cgroup *objcg, size_t size,
{
unsigned long flags;
- local_irq_save(flags);
+ get_obj_stock(flags);
__refill_obj_stock(objcg, size);
__mod_obj_stock_state(objcg, pgdat, idx, -(int)size);
- local_irq_restore(flags);
+ put_obj_stock(flags);
}
#endif /* CONFIG_MEMCG_KMEM */
--
2.18.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 5/5] mm/memcg: Optimize user context object stock access
@ 2021-04-09 23:18 ` Waiman Long
0 siblings, 0 replies; 11+ messages in thread
From: Waiman Long @ 2021-04-09 23:18 UTC (permalink / raw)
To: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
Tejun Heo, Christoph Lameter, Pekka Enberg, David Rientjes,
Joonsoo Kim, Vlastimil Babka, Roman Gushchin
Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
Shakeel Butt, Muchun Song, Alex Shi, Chris Down, Yafang Shao,
Alexander Duyck, Wei Yang, Masayoshi Mizuma, Waiman Long
Most kmem_cache_alloc() calls are from user context. With instrumentation
enabled, the measured number of kmem_cache_alloc() calls from non-task
context was about 0.01% of the total.
The irq disable/enable sequence used in this case to access content
from object stock is slow. To optimize for user context access, there
are now two object stocks for task context and interrupt context access
respectively.
The task context object stock can be accessed after disabling preemption,
which is cheap in a non-preempt kernel. The interrupt context object stock
can only be accessed after disabling interrupts. User context code can
access the interrupt object stock, but not vice versa.
The mod_objcg_state() function is also modified to make sure that memcg
and lruvec stat updates are done with interrupts disabled.
The downside of this change is that more data may be stored in the local
object stocks without being reflected in the charge counter and the vmstat
arrays. However, this is a small price to pay for better performance.
Signed-off-by: Waiman Long <longman@redhat.com>
---
mm/memcontrol.c | 71 +++++++++++++++++++++++++++++++++++++++----------
1 file changed, 57 insertions(+), 14 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 69f728383efe..00c9074e42e5 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2229,7 +2229,8 @@ struct obj_stock {
struct memcg_stock_pcp {
struct mem_cgroup *cached; /* this never be root cgroup */
unsigned int nr_pages;
- struct obj_stock obj;
+ struct obj_stock task_obj;
+ struct obj_stock irq_obj;
struct work_struct work;
unsigned long flags;
@@ -2254,11 +2255,46 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
}
#endif
+/*
+ * Most kmem_cache_alloc() calls are from user context. The irq disable/enable
+ * sequence used in this case to access content from object stock is slow.
+ * To optimize for user context access, there are now two object stocks for
+ * task context and interrupt context access respectively.
+ *
+ * The task context object stock can be accessed by disabling preemption only
+ * which is cheap in non-preempt kernel. The interrupt context object stock
+ * can only be accessed after disabling interrupt. User context code can
+ * access interrupt object stock, but not vice versa.
+ */
static inline struct obj_stock *current_obj_stock(void)
{
struct memcg_stock_pcp *stock = this_cpu_ptr(&memcg_stock);
- return &stock->obj;
+ return in_task() ? &stock->task_obj : &stock->irq_obj;
+}
+
+#define get_obj_stock(flags) \
+({ \
+ struct memcg_stock_pcp *stock; \
+ struct obj_stock *obj_stock; \
+ \
+ if (in_task()) { \
+ preempt_disable(); \
+ (flags) = -1L; \
+ obj_stock = &stock->task_obj; \
+ } else { \
+ local_irq_save(flags); \
+ obj_stock = &stock->irq_obj; \
+ } \
+ obj_stock; \
+})
+
+static inline void put_obj_stock(unsigned long flags)
+{
+ if (flags == -1L)
+ preempt_enable();
+ else
+ local_irq_restore(flags);
}
/**
@@ -2327,7 +2363,9 @@ static void drain_local_stock(struct work_struct *dummy)
local_irq_save(flags);
stock = this_cpu_ptr(&memcg_stock);
- drain_obj_stock(&stock->obj);
+ drain_obj_stock(&stock->irq_obj);
+ if (in_task())
+ drain_obj_stock(&stock->task_obj);
drain_stock(stock);
clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);
@@ -3183,7 +3221,7 @@ static inline void mod_objcg_state(struct obj_cgroup *objcg,
memcg = obj_cgroup_memcg(objcg);
if (pgdat)
lruvec = mem_cgroup_lruvec(memcg, pgdat);
- __mod_memcg_lruvec_state(memcg, lruvec, idx, nr);
+ mod_memcg_lruvec_state(memcg, lruvec, idx, nr);
rcu_read_unlock();
}
@@ -3193,7 +3231,7 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
unsigned long flags;
bool ret = false;
- local_irq_save(flags);
+ stock = get_obj_stock(flags);
stock = current_obj_stock();
if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) {
@@ -3201,7 +3239,7 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
ret = true;
}
- local_irq_restore(flags);
+ put_obj_stock(flags);
return ret;
}
@@ -3254,8 +3292,13 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
{
struct mem_cgroup *memcg;
- if (stock->obj.cached_objcg) {
- memcg = obj_cgroup_memcg(stock->obj.cached_objcg);
+ if (in_task() && stock->task_obj.cached_objcg) {
+ memcg = obj_cgroup_memcg(stock->task_obj.cached_objcg);
+ if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
+ return true;
+ }
+ if (stock->irq_obj.cached_objcg) {
+ memcg = obj_cgroup_memcg(stock->irq_obj.cached_objcg);
if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
return true;
}
@@ -3283,9 +3326,9 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
{
unsigned long flags;
- local_irq_save(flags);
+ get_obj_stock(flags);
__refill_obj_stock(objcg, nr_bytes);
- local_irq_restore(flags);
+ put_obj_stock(flags);
}
static void __mod_obj_stock_state(struct obj_cgroup *objcg,
@@ -3325,9 +3368,9 @@ void mod_obj_stock_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
{
unsigned long flags;
- local_irq_save(flags);
+ get_obj_stock(flags);
__mod_obj_stock_state(objcg, pgdat, idx, nr);
- local_irq_restore(flags);
+ put_obj_stock(flags);
}
int obj_cgroup_charge(struct obj_cgroup *objcg, gfp_t gfp, size_t size)
@@ -3380,10 +3423,10 @@ void obj_cgroup_uncharge_mod_state(struct obj_cgroup *objcg, size_t size,
{
unsigned long flags;
- local_irq_save(flags);
+ get_obj_stock(flags);
__refill_obj_stock(objcg, size);
__mod_obj_stock_state(objcg, pgdat, idx, -(int)size);
- local_irq_restore(flags);
+ put_obj_stock(flags);
}
#endif /* CONFIG_MEMCG_KMEM */
--
2.18.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
end of thread, other threads:[~2021-04-12 19:58 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-10 12:48 [PATCH 5/5] mm/memcg: Optimize user context object stock access kernel test robot
-- strict thread matches above, loose matches on Subject: below --
2021-04-09 23:18 [PATCH 0/5] mm/memcg: Reduce kmemcache memory accounting overhead Waiman Long
2021-04-09 23:18 ` [PATCH 5/5] mm/memcg: Optimize user context object stock access Waiman Long
2021-04-09 23:18 ` Waiman Long
2021-04-10 6:07 ` kernel test robot
2021-04-10 6:07 ` kernel test robot
2021-04-12 14:07 ` Waiman Long
2021-04-12 14:07 ` Waiman Long
2021-04-12 18:55 ` Roman Gushchin
2021-04-12 18:55 ` Roman Gushchin
2021-04-12 19:58 ` Waiman Long
2021-04-12 19:58 ` Waiman Long