From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dario Faggioli Subject: Re: [PATCH v3 4/9] mm: Scrub memory from idle loop Date: Thu, 11 May 2017 17:48:57 +0200 Message-ID: <1494517737.7393.7.camel@citrix.com> References: <1492184258-3277-1-git-send-email-boris.ostrovsky@oracle.com> <1492184258-3277-5-git-send-email-boris.ostrovsky@oracle.com> <1494498375.7393.5.camel@citrix.com> <34644cef-567d-b558-c580-5ed7d8116fa5@oracle.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============3047274252044013709==" Return-path: In-Reply-To: <34644cef-567d-b558-c580-5ed7d8116fa5@oracle.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" To: Boris Ostrovsky , xen-devel@lists.xen.org Cc: sstabellini@kernel.org, wei.liu2@citrix.com, George.Dunlap@eu.citrix.com, andrew.cooper3@citrix.com, ian.jackson@eu.citrix.com, tim@xen.org, jbeulich@suse.com List-Id: xen-devel@lists.xenproject.org --===============3047274252044013709== Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="=-MT6yyL7KVdpi4xKgYIbe" --=-MT6yyL7KVdpi4xKgYIbe Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thu, 2017-05-11 at 10:19 -0400, Boris Ostrovsky wrote: > > > diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c > > > index 90e2b1f..a5f62b5 100644 > > > --- a/xen/arch/x86/domain.c > > > +++ b/xen/arch/x86/domain.c > > > @@ -118,7 +118,8 @@ static void idle_loop(void) > > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0{ > > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if ( cpu_is_off= line(smp_processor_id()) ) > > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0play_dead(); > > > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0(*pm_idle)(); > > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if ( !scrub_free_pag= es() ) > > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0(*pm_idle)(); > > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0do_tasklet(); > > >=20 > >=20 > > This means that, if we got here to run a tasklet (as in, if the > > idle > > vCPU has been forced into execution, because there were a vCPU > > context > > tasklet wanting to run), we will (potentially) do some scrubbing > > first. > >=20 > We can move do_tasklet() above scrub_free_pages(). And new tasklet > after > that would result in a softirq being set so we'd do an early exit > from > scrub_free_pages(). >=20 How early? In fact, right now, if there is one tasklet queued, this is what happens: tasklet_schedule(t) tasklet_enqueue(t) test_and_set_bit(_TASKLET_enqueued, tasklet_work_to_do); raise_softirq(SCHEDULE_SOFTIRQ); schedule() set_bit(_TASKLET_scheduled, tasklet_work_to_do) tasklet_work_scheduled =3D 1; do_schedule(tasklet_work_scheduled) csched_schedule(tasklet_work_to_do) snext =3D CSCHED_VCPU(idle_vcpu[cpu]); idle_loop() (*pm_idle)() if ( !cpu_is_haltable() ) return; do_tasklet() /* runs tasklet t */ clear_bit(_TASKLET_enqueued, work_to_do); /* list_empty(list) =3D=3D t= rue */ raise_softirq(SCHEDULE_SOFTIRQ); do_softirq() schedule() clear_bit(_TASKLET_scheduled, tasklet_work); tasklet_work_scheduled =3D 0; =C2=A0=C2=A0=C2=A0do_schedule(tasklet_work_scheduled) =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0csched_schedule(tasklet_work_to_do) =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0snext =3D CSCHED_VCPU(idle_vcpu[c= pu]); idle_loop() (*pm_idle)() if ( !cpu_is_haltable() ) ... If we move do_tasklet up, as you suggest, this is what happens: =C2=A0tasklet_schedule(t) =C2=A0=C2=A0=C2=A0tasklet_enqueue(t) =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0test_and_set_bit(_TASKLET_enqueued, tasklet_w= ork_to_do); =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0raise_softirq(SCHEDULE_SOFTIRQ); =C2=A0schedule() =C2=A0=C2=A0=C2=A0set_bit(_TASKLET_scheduled, tasklet_work_to_do) =C2=A0=C2=A0=C2=A0tasklet_work_scheduled =3D 1; =C2=A0=C2=A0=C2=A0do_schedule(tasklet_work_scheduled) =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0csched_schedule(tasklet_work_to_do) =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0snext =3D CSCHED_VCPU(idle_vcpu[c= pu]); =C2=A0idle_loop() =C2=A0=C2=A0=C2=A0do_tasklet() /* runs tasklet t */ =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0clear_bit(_TASKLET_enqueued, work_to_do); /* = list_empty(list) =3D=3D true */ raise_softirq(SCHEDULE_SOFTIRQ); if ( !scrub_free_pages() ) //do some scrubbing, but softirq_pending() is true, so return 1 =C2=A0=C2=A0=C2=A0do_softirq() =C2=A0schedule() =C2=A0=C2=A0=C2=A0clear_bit(_TASKLET_scheduled, tasklet_work); =C2=A0=C2=A0=C2=A0tasklet_work_scheduled =3D 0; =C2=A0=C2=A0=C2=A0do_schedule(tasklet_work_scheduled) =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0csched_schedule(tasklet_work_to_do) =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0snext =3D CSCHED_VCPU(idle_vcpu[c= pu]); =C2=A0idle_loop() if ( !scrub_free_pages() ) //do the scrubbing, returns 0, so we enter the if =C2=A0=C2=A0=C2=A0(*pm_idle)() =C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0if ( !cpu_is_haltable() ) ... IOW (provided I'm understanding your code right, of course), I still see it happening that we switched to idle *not* because the system was idle, but for running a tasklet, and yet we end up doing at least some scrubbing (like if the system were idle). Which still feels wrong to me. If there's more than one tasklet queued (or another one, or more, is/are queued before the one t is processed), it's even worse, because we go through the whole schedule()->idle_loop()->do_tasklet() again and again, and at each step we do a bit of scrubbing, before going back to schedule(). It probably would be at least a bit better, if scrub_free_pages() would check for softirqs() _before_ starting any scrubbing (which I don't think it does, right now, am I right?). > OTOH since, as you say, we only get to idle loop() if no tasklet is > pending (cpu_is_haltable() test) then would even that be needed? >=20 Err... sorry, not getting. It's the other way round. One of the reasons why we end up executing idle_loop(), is that there is at least a tasklet pending. Where we only get to if there's nothing pending is to calling (*pm_idle)(). Regards, Dario --=20 <> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) --=-MT6yyL7KVdpi4xKgYIbe Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJZFIfpAAoJEBZCeImluHPuKA4QAMFcqZ/oiGsvbQt7E3iI3wfl NfcEIgfoQ0C+WhHxcvTHY7YZ5iqd1BQJrg3pmj7K002ljYLXpL0juyCih/mz5vyL yB0gfTRezHy7b4XE+i4Jp/SzLzhyP+1NYJk5bDdWmRMFF4ZBCVFIXrmZpl2Cwphf Zd1tvpUEO4CJM8zTZ7b4kCtgTNZVupgZ/fnNX/wEsouhm995iGVwaRvCP7NLHw5Z lyjKHzeuIVsNQfVzx7L9UeQAoEA5vOGyVON+2GvBwGs9cURdOUXzBn9rfpnol05l itjei/EkucCDgfHyCz8Ptw6Df9BeVBmNJmEEnXk+Cssw6VbMHJNbMUqCvRiCHKlr X5vIt0dihjDLjhR99E8CF1HAxVTmRWDmGjCEwcZ37FiOmwWoV7zqPHfxsU5AGkwn d2jb4WuTLFeK07D8PhdaAMQgUhKVStSy4Uxbyl5ziPVgjnjAIDoBa5LmjCbDUj3B +G22nyUPuHVgdcbdQ/dOp7lBHZjgpWOzXlcJrAJll6ZjnaMPnsZbgpYq6LK5Wlj+ kxQHa7pMKrjyPZUMaWegef3Cj/KBE2jgC58axl8IautR46bi4Ihhy1kTmj4P+DXp LGRUypc5J+GxBHkEz/4G6Wv/BP3O/x/Hwb26Na46TvHH8eHPjvdjQ2StEnWpt8bp r+IDvftbtjDBiTAa1uGC =O2in -----END PGP SIGNATURE----- --=-MT6yyL7KVdpi4xKgYIbe-- --===============3047274252044013709== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KWGVuLWRldmVs IG1haWxpbmcgbGlzdApYZW4tZGV2ZWxAbGlzdHMueGVuLm9yZwpodHRwczovL2xpc3RzLnhlbi5v cmcveGVuLWRldmVsCg== --===============3047274252044013709==--