All of lore.kernel.org
 help / color / mirror / Atom feed
From: KY Srinivasan <kys@microsoft.com>
To: Jon Stanley <jonstanley@gmail.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	Stephen Hemminger <sthemmin@microsoft.com>,
	"wei.liu@kernel.org" <wei.liu@kernel.org>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>
Subject: RE: [EXTERNAL] hv_balloon issues??
Date: Mon, 25 Jan 2021 20:51:35 +0000	[thread overview]
Message-ID: <MWHPR2101MB0874A5A65BD03A5FB7857668A0BD9@MWHPR2101MB0874.namprd21.prod.outlook.com> (raw)
In-Reply-To: <CALY6xngo6fU7NoEgrmP_qtdz4OMQgKo9CiJno2uhtWie0ze3Rw@mail.gmail.com>

I take it that this is on Windows Server machine. What are the Dynamic memory settings for the VM under question.
K. Y

> -----Original Message-----
> From: Jon Stanley <jonstanley@gmail.com>
> Sent: Monday, January 25, 2021 12:20 PM
> To: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang
> <haiyangz@microsoft.com>; Stephen Hemminger
> <sthemmin@microsoft.com>; wei.liu@kernel.org; linux-
> hyperv@vger.kernel.org
> Subject: [EXTERNAL] hv_balloon issues??
> 
> I'm working to make a method to install bare-metal machines with Packer
> images, and in testing (this isn't going to wind up in production on Hyper-V) I
> think I've found an issue in hv_balloon, but I'm not sure.
> 
> Starting from a RHEL 8 live CD, I make a tmpfs filesystem and download a disk
> image to it. Despite having plenty of memory to do this (I was downloading a
> 5GB image onto a VM with 16GB of RAM), I got paid a visit by the OOM killer.
> 
> If I turn off dynamic memory, then things work as expected. This isn't 100%
> reproducible, I tried immediately after boot and it worked, unmounted the
> tmpfs filesystem and waited for a kernel message that said the balloon floor
> was reached and tried again, and BOOM!
> 
> The actual process that is filling the filesystem (curl) doesn't get killed (which
> makes sense I guess since *it* isn't taking a ton of memory), and also never
> completes presumably due to it's I/O becoming blocked. Does this have to do
> with a sudden, enormous demand for memory perhaps that the hypervisor is
> having difficulty fulfilling?
> The host has plenty of memory available (63GB right now)
> 
> On another note, is there a way that I'm not seeing to tell the current status of
> the balloon driver - i.e. current/max allocations? A quick look through /proc
> and /sys wasn't revealing.
> 
> Also, sorry to be using a distro kernel instead of upstream.
> 
> -Jon
> 
> Jan 25 14:58:43 dhcp-132.rmrf.net kernel: hv_balloon: Balloon request will be
> partially fulfilled. Balloon floor reached.
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: tuned invoked oom-killer:
> gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: CPU: 0 PID: 1165 Comm: tuned Not
> tainted 4.18.0-240.10.1.el8_3.x86_64 #1 Jan 25 14:59:30 dhcp-132.rmrf.net
> kernel: Hardware name: Microsoft Corporation Virtual Machine/Virtual
> Machine, BIOS Hyper-V UEFI Release
> v4.0 11/01/2019
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Call Trace:
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel:  dump_stack+0x5c/0x80 Jan 25
> 14:59:30 dhcp-132.rmrf.net kernel:  dump_header+0x51/0x308 Jan 25
> 14:59:30 dhcp-132.rmrf.net kernel:  oom_kill_process.cold.28+0xb/0x10 Jan
> 25 14:59:30 dhcp-132.rmrf.net kernel:  out_of_memory+0x1c1/0x4b0 Jan 25
> 14:59:30 dhcp-132.rmrf.net kernel:  __alloc_pages_slowpath+0xc24/0xd40
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel:
> __alloc_pages_nodemask+0x245/0x280
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel:  filemap_fault+0x3b8/0x840 Jan 25
> 14:59:30 dhcp-132.rmrf.net kernel:  ? hrtimer_cancel+0x11/0x20 Jan 25
> 14:59:30 dhcp-132.rmrf.net kernel:  ? futex_wait+0x19a/0x210 Jan 25
> 14:59:30 dhcp-132.rmrf.net kernel:  ? xas_load+0x8/0x80 Jan 25 14:59:30
> dhcp-132.rmrf.net kernel:  ? xas_find+0x173/0x1b0 Jan 25 14:59:30 dhcp-
> 132.rmrf.net kernel:  ? filemap_map_pages+0x1a3/0x380 Jan 25 14:59:30
> dhcp-132.rmrf.net kernel:  ext4_filemap_fault+0x2c/0x40 [ext4] Jan 25
> 14:59:30 dhcp-132.rmrf.net kernel:  __do_fault+0x38/0xc0 Jan 25 14:59:30
> dhcp-132.rmrf.net kernel:  do_fault+0x191/0x3c0 Jan 25 14:59:30 dhcp-
> 132.rmrf.net kernel:  __handle_mm_fault+0x3e6/0x7c0 Jan 25 14:59:30 dhcp-
> 132.rmrf.net kernel:  handle_mm_fault+0xc2/0x1d0 Jan 25 14:59:30 dhcp-
> 132.rmrf.net kernel:  __do_page_fault+0x21b/0x4d0 Jan 25 14:59:30 dhcp-
> 132.rmrf.net kernel:  do_page_fault+0x32/0x110 Jan 25 14:59:30 dhcp-
> 132.rmrf.net kernel:  ? page_fault+0x8/0x30 Jan 25 14:59:30 dhcp-
> 132.rmrf.net kernel:  page_fault+0x1e/0x30 Jan 25 14:59:30 dhcp-132.rmrf.net
> kernel: RIP: 0033:0x7faf2f8c5df2 Jan 25 14:59:30 dhcp-132.rmrf.net kernel:
> Code: Bad RIP value.
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: RSP: 002b:00007faf242629a0
> EFLAGS: 00010246
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: RAX: ffffffffffffff92 RBX:
> 00007faf24262a40 RCX: 00007faf2f8c5df2
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: RDX: 0000000000000000 RSI:
> 0000000000000189 RDI: 00007faf1c002490
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: RBP: 00007faf1c002490 R08:
> 0000000000000000 R09: 00000000ffffffff
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: R10: 00007faf24262a40 R11:
> 0000000000000246 R12: 0000000000000000
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: R13: 0000000000000000 R14:
> 00007faf24262a40 R15: 000000003b9aca00
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Mem-Info:
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: active_anon:18180
> inactive_anon:738744 isolated_anon:0
>                                            active_file:18
> inactive_file:337 isolated_file:32
>                                            unevictable:132114 dirty:0
> writeback:0 unstable:0
>                                            slab_reclaimable:6250
> slab_unreclaimable:5966
>                                            mapped:1626 shmem:738916
> pagetables:1396 bounce:0
>                                            free:31759 free_pcp:30 free_cma:0 Jan 25 14:59:30
> dhcp-132.rmrf.net kernel: Node 0 active_anon:72720kB
> inactive_anon:2954976kB active_file:72kB inactive_file:1348kB
> unevictable:528456kB isolated(anon):0kB i> Jan 25 14:59:30 dhcp-
> 132.rmrf.net kernel: Node 0 DMA free:15908kB min:64kB low:80kB high:96kB
> active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB
> unevictabl> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: lowmem_reserve[]: 0
> 3845
> 15960 15960 15960
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Node 0 DMA32 free:64676kB
> min:16264kB low:20328kB high:24392kB active_anon:1424kB
> inactive_anon:2489752kB active_file:28kB inactiv> Jan 25 14:59:30 dhcp-
> 132.rmrf.net kernel: lowmem_reserve[]: 0 0 12114
> 12114 12114
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Node 0 Normal free:46452kB
> min:51248kB low:64060kB high:76872kB active_anon:71296kB
> inactive_anon:465224kB active_file:4kB inactiv> Jan 25 14:59:30 dhcp-
> 132.rmrf.net kernel: lowmem_reserve[]: 0 0 0 0 0 Jan 25 14:59:30 dhcp-
> 132.rmrf.net kernel: Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB
> (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB
> (U) 1*2048kB (M) 3*4096kB (M) = >
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Node 0 DMA32: 29*4kB (UE)
> 36*8kB (UE) 33*16kB (UME) 6*32kB (UE) 3*64kB (UME) 1*128kB (U) 3*256kB
> (UME) 2*512kB (UM) 2*1024kB (U) 3>
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Node 0 Normal: 833*4kB (UME)
> 712*8kB (UME) 305*16kB (UME) 152*32kB (UME) 52*64kB (E) 28*128kB
> (UME) 15*256kB (UME) 11*512kB (UME) > Jan 25 14:59:30 dhcp-132.rmrf.net
> kernel: Node 0 hugepages_total=0
> hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB Jan 25
> 14:59:30 dhcp-132.rmrf.net kernel: Node 0 hugepages_total=0
> hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB Jan 25
> 14:59:30 dhcp-132.rmrf.net kernel: 871413 total pagecache pages Jan 25
> 14:59:30 dhcp-132.rmrf.net kernel: 0 pages in swap cache Jan 25 14:59:30
> dhcp-132.rmrf.net kernel: Swap cache stats: add 0, delete 0, find 0/0 Jan 25
> 14:59:30 dhcp-132.rmrf.net kernel: Free swap  = 0kB Jan 25 14:59:30 dhcp-
> 132.rmrf.net kernel: Total swap = 0kB Jan 25 14:59:30 dhcp-132.rmrf.net
> kernel: 4194027 pages RAM Jan 25 14:59:30 dhcp-132.rmrf.net kernel: 0 pages
> HighMem/MovableOnly Jan 25 14:59:30 dhcp-132.rmrf.net kernel: 91830
> pages reserved Jan 25 14:59:30 dhcp-132.rmrf.net kernel: 0 pages hwpoisoned
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ pid ]   uid  tgid total_vm
>      rss pgtables_bytes swapents oom_score_adj name
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [  762]     0   762    27626
>     1788   290816        0             0 systemd-journal
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [  816]     0   816    25338
>      353   212992        0         -1000 systemd-udevd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [  819]     0   819    15287
>      152   135168        0         -1000 auditd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [  860]    81   860    14087
>      213   155648        0          -900 dbus-daemon
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [  875]   995   875    29968
>      111   147456        0             0 chronyd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [  907]     0   907    48443
>      510   405504        0             0 sssd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [  908]   997   908   404961
>     1915   331776        0             0 polkitd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [  913]     0   913     1085
>       16    53248        0             0 hypervvssd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [  914]   994   914    40028
>      204   208896        0             0 rngd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [  921]     0   921    50484
>      659   421888        0             0 sssd_be
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [  922]     0   922    53956
>      395   462848        0             0 sssd_nss
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [  925]     0   925    74573
>     5478   466944        0             0 firewalld
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [  926]     0   926    24290
>      252   204800        0             0 systemd-logind
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [  940]     0   940   116867
>      614   389120        0             0 NetworkManager
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [  958]     0   958    23072
>      224   212992        0         -1000 sshd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [  968]     0   968     1778
>       30    61440        0             0 hypervkvpd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [  969]     0   969   106589
>     3721   450560        0             0 tuned
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [  972]     0   972     9232
>      221   106496        0             0 crond
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [  973]     0   973    10449
>      135   114688        0             0 rhsmcertd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 1189]     0  1189    56455
>      509   192512        0             0 rsyslogd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 1201]     0  1201    30749
>      215   266240        0             0 login
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 1206]     0  1206    23443
>      331   225280        0             0 systemd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 1210]     0  1210    37531
>      648   299008        0             0 (sd-pam)
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 1216]     0  1216     6554
>      154    86016        0             0 bash
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 1285]     0  1285    20229
>      245   196608        0             0 curl
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel:
> oom-
> kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=
> 0,global_oom,task_memcg=/system.slice/firewalld.service,>
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Out of memory: Killed process 925
> (firewalld) total-vm:298292kB, anon-rss:21912kB, file-rss:0kB, shmem-rss:0kB,
> UID:0 Jan 25 14:59:34 dhcp-132.rmrf.net systemd[1]: firewalld.service: Main
> process exited, code=killed, status=9/KILL Jan 25 14:59:47 dhcp-132.rmrf.net
> systemd[1]: firewalld.service:
> Failed with result 'signal'.

  reply	other threads:[~2021-01-25 20:55 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-25 20:19 hv_balloon issues?? Jon Stanley
2021-01-25 20:51 ` KY Srinivasan [this message]
2021-01-25 21:07   ` [EXTERNAL] " Jon Stanley
2021-01-25 21:45     ` KY Srinivasan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=MWHPR2101MB0874A5A65BD03A5FB7857668A0BD9@MWHPR2101MB0874.namprd21.prod.outlook.com \
    --to=kys@microsoft.com \
    --cc=haiyangz@microsoft.com \
    --cc=jonstanley@gmail.com \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=sthemmin@microsoft.com \
    --cc=wei.liu@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.