> shall I switch it to defer and observe (all hosts are running fine by
> just now) or
> switch to defer while it is in the bad state?

You could do it immediately and see if no problems appear for long
enough, OTOH...

Well,
cat /sys/kernel/mm/transparent_hugepage/defrag
always [defer] defer+madvise madvise never
has been active since your reply; however, I cannot tell whether it helped.
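
In case it helps anyone checking the same thing on several hosts, here is a minimal sketch that reads the active defrag mode (the option shown in brackets) and the free memory. The function names are made up for illustration, and treating "unused RAM" as the MemFree field of /proc/meminfo is an assumption, not necessarily the number I looked at above.

```python
import re
from pathlib import Path

# Paths as they appear on a typical Linux host.
DEFRAG_PATH = Path("/sys/kernel/mm/transparent_hugepage/defrag")
MEMINFO_PATH = Path("/proc/meminfo")


def active_mode(sysfs_line: str) -> str:
    """Return the option shown in [brackets], i.e. the one currently in effect."""
    match = re.search(r"\[([^\]]+)\]", sysfs_line)
    if match is None:
        raise ValueError(f"no bracketed option in {sysfs_line!r}")
    return match.group(1)


def meminfo_kib(meminfo_text: str, field: str) -> int:
    """Return the value of one /proc/meminfo field (e.g. 'MemFree') in kB."""
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if name == field:
            return int(rest.split()[0])
    raise KeyError(field)


if __name__ == "__main__":
    # Only touch the real files if they exist (i.e. on a Linux host with THP).
    if DEFRAG_PATH.exists():
        print("THP defrag mode:", active_mode(DEFRAG_PATH.read_text()))
    if MEMINFO_PATH.exists():
        # Assumption: "unused RAM" is approximated by MemFree here.
        free_kib = meminfo_kib(MEMINFO_PATH.read_text(), "MemFree")
        print(f"MemFree: {free_kib / 1024 / 1024:.1f} GiB")
```

Running this periodically (from cron, for example) would give a record over time instead of a single check.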

I set this on 2 hosts; one of them has 20GB of unused RAM now.
Yesterday the picture was similar for both, with several GB unused, one with up to 10GB.
I only checked once, so this is from memory.

Tell me if someone would like to log in remotely; I can set up TeamViewer or something
similar for this at any time. Just drop a message here and I'll contact you.
I hope things can be investigated even on the host that has 20GB unused. It's just
a matter of time until this gets to the low values, so the problem has surely already kicked in there.

Also, if remote login is not an option, I'm always happy to provide whatever info you need.