On Thu, 14 Apr 2016 19:34:05 +0100
"Dr. David Alan Gilbert" wrote:

> * Thomas Huth (thuth@redhat.com) wrote:
> > On 14.04.2016 13:47, Dr. David Alan Gilbert wrote:
> > > * Thomas Huth (thuth@redhat.com) wrote:
> > >
> > >> That would mean a regression compared to what we have today.
> > >> Currently, the ballooning is working OK for 64k guests on a 64k ppc
> > >> host - rather by chance than on purpose, but it's working. The guest
> > >> is always sending all the 4k fragments of a 64k page, and QEMU is
> > >> trying to call madvise() for every one of them, but the kernel is
> > >> ignoring madvise() on non-64k-aligned addresses, so we end up with a
> > >> situation where the madvise() frees a whole 64k page which is also
> > >> declared as free by the guest.
> > >
> > > I wouldn't worry about migrating your fragment map; but I wonder if it
> > > needs to be that complex - does the guest normally do something more
> > > sane, like doing the 4k pages in order, so that you've just got to
> > > track the last page it tried rather than having a full map?
> >
> > That's maybe a little bit easier and might work for well-known Linux
> > guests, but IMHO it's even more of a hack than my approach: if the Linux
> > driver one day is switched to send the pages in the opposite order, or
> > if somebody tries to run a non-well-known (i.e. non-Linux) guest, this
> > does not work at all anymore.
>
> True.

TBH, I'm not sure that basing it off the last sub-page ballooned will even
be that much easier to implement - or at least to implement in a way that's
convincingly correct, even for the limited cases it's supposed to work in.

> > > A side question is whether the behaviour that's seen by
> > > virtio_balloon_handle_output is always actually the full 64k page; it
> > > calls balloon_page once for each message/element - but if all of those
> > > elements add back up to the full page, perhaps it makes more sense to
> > > reassemble it there?
> >
> > That might work for 64k page size guests ... but for 4k guests, I think
> > you'll have a hard time reassembling a page there more easily than with
> > my current approach. Or do you have a clever algorithm in mind that
> > could do the job well there?
>
> No, I didn't; I just have an ulterior motive, which is trying to do as few
> madvise() calls as possible, and while virtio_balloon_handle_output sees
> potentially quite a few requests at once, balloon_page is stuck down there
> at the bottom without any idea of whether there are any more coming.
>
> Dave
>
> > Thomas
> >
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

--
David Gibson
Senior Software Engineer, Virtualization, Red Hat
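
For illustration, a minimal sketch of the fragment-tracking idea discussed
above might look like the following. It assumes 64k host pages split into 4k
balloon sub-pages; the names (balloon_sub_page, frag_bitmap, and so on) are
hypothetical and are not taken from Thomas's patch or from QEMU's actual code.
It only shows the shape of the bookkeeping, using a single "current page"
slot rather than a full map.

    /*
     * Minimal sketch of the fragment-tracking idea, for illustration only.
     * Assumes 64k host pages and 4k balloon sub-pages; none of these names
     * come from QEMU or from Thomas's patch.
     */
    #define _DEFAULT_SOURCE
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>

    #define HOST_PAGE_SIZE    (64 * 1024)
    #define BALLOON_PAGE_SIZE (4 * 1024)
    #define FRAGS_PER_PAGE    (HOST_PAGE_SIZE / BALLOON_PAGE_SIZE)  /* 16 */
    #define ALL_FRAGS_MASK    ((1u << FRAGS_PER_PAGE) - 1)          /* 0xffff */

    /*
     * One slot tracking which 4k fragments of a single host page have been
     * ballooned so far.  A real fragment map would need a sparse structure
     * keyed by host-page address instead of this single slot.
     */
    static struct {
        void     *host_page;    /* 64k-aligned host address, or NULL */
        uint16_t  frag_bitmap;  /* bit n set => fragment n ballooned */
    } current;

    /* Called once per 4k balloon request coming from the guest. */
    static void balloon_sub_page(void *addr)
    {
        void *aligned = (void *)((uintptr_t)addr &
                                 ~(uintptr_t)(HOST_PAGE_SIZE - 1));
        unsigned frag = ((uintptr_t)addr & (HOST_PAGE_SIZE - 1))
                        / BALLOON_PAGE_SIZE;

        if (current.host_page != aligned) {
            /* Different host page: forget partially collected fragments. */
            current.host_page   = aligned;
            current.frag_bitmap = 0;
        }
        current.frag_bitmap |= 1u << frag;

        if (current.frag_bitmap == ALL_FRAGS_MASK) {
            /*
             * Every 4k fragment of this host page has now been ballooned,
             * so a single, properly aligned madvise() can drop the whole
             * 64k page.
             */
            if (madvise(current.host_page, HOST_PAGE_SIZE,
                        MADV_DONTNEED) != 0) {
                perror("madvise");
            }
            current.host_page   = NULL;
            current.frag_bitmap = 0;
        }
    }

With only one slot this degenerates to the "track the last sub-page" idea
Dave floated; replacing the single slot with a map keyed by host-page address
is essentially what a full fragment map buys - it also copes with guests that
send the 4k sub-pages out of order or interleaved across several host pages.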