From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from [140.186.70.92] (port=52430 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1OuYrL-0003gF-HO for qemu-devel@nongnu.org; Sat, 11 Sep 2010 18:47:48 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69) (envelope-from ) id 1OuYrK-0006P9-72 for qemu-devel@nongnu.org; Sat, 11 Sep 2010 18:47:47 -0400 Received: from cantor2.suse.de ([195.135.220.15]:42645 helo=mx2.suse.de) by eggs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OuYrJ-0006P1-TS for qemu-devel@nongnu.org; Sat, 11 Sep 2010 18:47:46 -0400 Subject: Re: [Qemu-devel] [PATCH] Introduce qemu_madvise() Mime-Version: 1.0 (Apple Message framework v1081) Content-Type: text/plain; charset=iso-8859-1 From: Alexander Graf In-Reply-To: <982D5612-5993-42A1-B8AB-B8AFA1106C97@web.de> Date: Sun, 12 Sep 2010 00:47:40 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: References: <1284224755-11299-1-git-send-email-andreas.faerber@web.de> <982D5612-5993-42A1-B8AB-B8AFA1106C97@web.de> List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: =?iso-8859-1?Q?Andreas_F=E4rber?= Cc: Blue Swirl , QEMU Developers On 12.09.2010, at 00:39, Andreas F=E4rber wrote: > Am 11.09.2010 um 23:37 schrieb Alexander Graf: >=20 >> On 11.09.2010, at 19:05, Andreas F=E4rber wrote: >>=20 >>> Remaining madvise() users: >>> exec.c: limited to __linux__ and/or MADV_MERGEABLE (no POSIX = equivalent) >>> kvm-all.c: limited to MADV_DONTFORK (no POSIX equivalent), >>> otherwise runtime error if !kvm_has_sync_mmu() >>> hw/virtio-balloon.c: limited to __linux__ >=20 >>> diff --git a/arch_init.c b/arch_init.c >>> index e468c0c..a910033 100644 >>> --- a/arch_init.c >>> +++ b/arch_init.c >>> @@ -396,7 +396,7 @@ int ram_load(QEMUFile *f, void *opaque, int = version_id) >>> #ifndef _WIN32 >>> if (ch =3D=3D 0 && >>> (!kvm_enabled() || kvm_has_sync_mmu())) { >>> - madvise(host, TARGET_PAGE_SIZE, MADV_DONTNEED); >>> + qemu_madvise(host, TARGET_PAGE_SIZE, = QEMU_MADV_DONTNEED); >>=20 >> Is this the only occurence in the whole code base? This patch should = convert all users, right? >=20 > It's the only relevant occurrence, cf. description above. >=20 > We could in theory create QEMU_MADV_MERGEABLE and QEMU_MADV_DONTFORK = and convert the callers, too, but what's the point when only madvise = supports it? > Using the current #ifdef mechanism we can detect/handle it at = compile-time; inside qemu_madvise it would be deferred to runtime. Or do = you have a suggestion how to handle that scenario other than returning = an error? Hrm. I'm not sure. Do we have to fail madvise at all? >=20 >>> diff --git a/osdep.c b/osdep.c >>> index 30426ff..8c09597 100644 >>> --- a/osdep.c >>> +++ b/osdep.c >>> @@ -139,6 +140,31 @@ void qemu_vfree(void *ptr) >>>=20 >>> #endif >>>=20 >>> +int qemu_madvise(void *addr, size_t len, int advice) >>> +{ >>> +#ifdef CONFIG_POSIX_MADVISE >>> + switch (advice) { >>> + case QEMU_MADV_DONTNEED: >>> + advice =3D POSIX_MADV_DONTNEED; >>> + break; >>> + default: >>> + fprintf(stderr, "Advice %d unhandled.\n", advice); >>=20 >> Advise :) >=20 > I'd advise against that advice. ;) (*hint hint*) >=20 >> It should probably also be 'madvise' here. Plus, I'd recommend %x as = that makes the bits slightly more obvious. >=20 > You mean, "qemu_madvise: Advice %x ..."? Or "Advice %x for = posix_madvise ..."? I'd remove the "Advice" part: "qemu_madvise: Flag %x unknown". >=20 >>> + abort(); >>> + } >>> + return posix_madvise(addr, len, advice); >>> +#else >>> + switch (advice) { >>> + case QEMU_MADV_DONTNEED: >>> + advice =3D MADV_DONTNEED; >>> + break; >>> + default: >>> + fprintf(stderr, "Advice %d unhandled.\n", advice); >>> + abort(); >>> + } >>> + return madvise(addr, len, advice); >>=20 >> So what do you do on haiku where there's no madvise? >=20 > It should detect posix_madvise and not run into this code path, just = like OpenSolaris. > I was hoping for Michael Lotz to resurface though and haven't retried = myself yet. Oh, so OpenSolaris and Haiku have posix_madvise? Nice. >=20 > Platforms that have neither posix_madvise nor madvise are equally = broken before and after. Sure :). >=20 >>=20 >>> +#endif >>> +} >>> + >>> int qemu_create_pidfile(const char *filename) >>> { >>> char buffer[128]; >>> diff --git a/osdep.h b/osdep.h >>> index 1cdc7e2..9f0da46 100644 >>> --- a/osdep.h >>> +++ b/osdep.h >>> @@ -90,6 +90,10 @@ void *qemu_memalign(size_t alignment, size_t = size); >>> void *qemu_vmalloc(size_t size); >>> void qemu_vfree(void *ptr); >>>=20 >>> +#define QEMU_MADV_DONTNEED 1 >>=20 >> (1 << 0) >>=20 >> There are probably more to follow. I'm also not sure if it wouldn't = make sense to just directly map qemu_madvise and real madvise bits. = Something like >>=20 >> #ifdef MADV_DONTNEED >> #define QEMU_MADV_DONTNEED (1 << 0) >> #else >> #define QEMU_MADV_DONTNEED 0 >> #endif >>=20 >> Unless of course a flag is mandatory - but I don't think any of the = madvise bits are. >=20 > I don't follow. Something like the following? >=20 > #ifdef CONFIG_POSIX_MADVISE > #define QEMU_MADV_DONTNEED POSIX_MADV_DONTNEED > #define qemu_madvise posix_madvise > #else > #ifdef MADV_DONTNEED > #define QEMU_MADV_DONTNEED MADV_DONTNEED > #endif > #define qemu_madvise madvise > #endif This could work, though in general preprocessor magic is ... too magic. = I was thinking: #ifdef CONFIG_POSIX_MADVISE #define QEMU_MADV_DONTNEED POSIX_MADV_DONTNEED #else #define QEMU_MADV_DONTNEED MADV_DONTNEED #endif and keep the qemu_madvise implementation in C. But then there's no need = to convert between QEMU_MADV and real MADV bits. >=20 >>> + >>> +int qemu_madvise(void *addr, size_t len, int advice); >>> + >>> int qemu_create_pidfile(const char *filename); >>>=20 >>> #ifdef _WIN32 >>> diff --git a/vl.c b/vl.c >>> index 3f45aa9..d352d18 100644 >>> --- a/vl.c >>> +++ b/vl.c >>> @@ -80,9 +80,6 @@ >>> #include >>> #include >>> #include >>> -/* See MySQL bug #7156 (http://bugs.mysql.com/bug.php?id=3D7156) = for >>> - discussion about Solaris header problems */ >>> -extern int madvise(caddr_t, size_t, int); >>=20 >> Does Solaris have posix_madvise? Otherwise it would still require = this header hack, no? >=20 > I tested on OpenSolaris 2009.06. Haven't tested on Solaris 10 yet, = don't have access to older ones. > We could wrap it in #ifndef CONFIG_POSIX_MADVISE to be fool-safe. *shrug* I was just afraid this was there for a reason. But maybe the = author of those lines simply didn't know about posix_madvise. > Thanks for the review, You're welcome :) Alex