From mboxrd@z Thu Jan 1 00:00:00 1970 From: Christoffer Dall Subject: Re: [RFC PATCH 1/6] kvm: add device control API Date: Tue, 19 Feb 2013 18:16:53 -0800 Message-ID: References: <1361304966.29654.8@snotra> Mime-Version: 1.0 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: Alexander Graf , kvm-ppc@vger.kernel.org, kvm@vger.kernel.org To: Scott Wood Return-path: In-Reply-To: <1361304966.29654.8@snotra> Sender: kvm-ppc-owner@vger.kernel.org List-Id: kvm.vger.kernel.org On Tue, Feb 19, 2013 at 12:16 PM, Scott Wood = wrote: > On 02/18/2013 11:50:44 PM, Christoffer Dall wrote: >> >> On Mon, Feb 18, 2013 at 4:53 PM, Scott Wood >> wrote: >> > On 02/18/2013 06:44:20 PM, Christoffer Dall wrote: >> >> >> >> On Wed, Feb 13, 2013 at 9:49 PM, Scott Wood >> >> > +#define KVM_CREATE_DEVICE _IOWR(KVMIO, 0xac, struct >> >> > kvm_create_device) >> >> > +#define KVM_SET_DEVICE_ATTR _IOW(KVMIO, 0xad, struct >> >> > kvm_device_attr) >> >> > +#define KVM_GET_DEVICE_ATTR _IOW(KVMIO, 0xae, struct >> >> > kvm_device_attr) >> >> >> >> _IOWR ? >> > >> > >> > struct kvm_device_attr itself is write-only, though the data point= ed to >> > by >> > the addr field goes the other way for GET. ONE_REG is in the same >> > situation >> > and also uses _IOW for both. >> > >> > >> >> ok. >> >> Btw., what about the size of the attr? implicitly defined through th= e attr >> id? > > > Yes, same as in ONE_REG. > > >> and I also think it's easier to read the >> arch-specific bits of the code that way, instead of some arbitrary >> function that you have to trace through to figure out where it's >> called from. > > > I don't follow. > > I just mean that if you look in arch/XXX/kvm/XXX.c and see a function called kvm_create_xxx_dev(...) then you're like, what context is this being called from again and in which order, etc. Of course we can name the function kvm_ioctl_create_dev_xxx(...), but I still like to be able to follow the flow of things that are really arch specific, but anyhow, this is a weak argument. >> > By doing device recognition here we don't need a separate copy of = this >> > per >> > arch (including some #ifdef or modifying every arch at once -- inc= luding >> > ARM >> > which I can't modify yet because it's not merged), and *if* we sho= uld >> > end up >> > with an in-kernel-emulated device that gets used across multiple >> > architectures, it would be annoying to have to modify all relevant >> > architectures (and worse to deal with per-arch numberspaces). >> >> I would say that's exactly what you're going to need with your appro= ach: >> >> switch (cd->type) { >> case KVM_ARM_VGIC_V2_0: >> kvm_arm_vgic_v2_0_create(...); >> } >> >> >> are you going to ifdef here in this function, or? I think it's clean= er >> to have the single arch-specific hooks and handle the cases there. > > > There's an ifdef, as you can see from the patch that adds MPIC suppor= t. But > it's the same ifdef that gets used to determine whether the device co= de gets > built in. Nothing special needs to be added; no per-architecture hoo= p to > jump through. > > Note that we would still need per-device ifdefs in the arch code, bec= ause > not all PPC KVM builds are going to have MPIC support. yeah, that's the same on ARM. > > What if, instead of a switch statement and ifdefs, it operated on a > registration basis? > > Probably just makes the code more confusing and harder to grep with the limited number of in-kernel devices we support. Between that and your current approach, I prefer the current approach. Anyhow, the whole thing is internal state, as I wrote earlier, and by no means a deal breaker for me, and as long as we don't have to mess around with include/linux/kvm_host.h for changing arch-specific stuff, which I believe is the case even with the current design, I'm ok with it. >> The use case of having a single device which is so central to the >> system that we emulate it inside the kernel and is shared across >> multiple archs is pretty far fetched to me. > > > I don't think it's that far fetched. APIC is shared between x86 and = ia64 -- > even if APIC has no need for anything beyond existing API, it shows t= hat > it's not a crazy possibility. Freescale already has some devices tha= t are > shared between PPC and ARM, and that set will expand (probably not to= irq > controllers, though the probability is non-zero) with Layerscape, abo= ut > which Freeescale says, "The unique, core-agnostic architecture incorp= orates > the optimum core for the given application=97ARM cores or Power Archi= tecture > cores." We already need to go back and non-ppc-ize various drivers, > including their reliance on I/O accessors that are defined in > architecture-specific ways... Making things gratuitiously architectu= re > specific is just a bad idea, even if the "use case" for it actually b= eing > used on multiple architectures is remote. > =46or the record I think this is a simplification: making this generic always comes at the cost of some added complexity exactly due to the loss of being specific. It's a balancing act to figure out which is preferred given the magnitude of the cost. As I indicate above, this case is not too bad. > Normal kernel drivers tend to go in drivers/, not arch/, even if they= 're > only used on one architecture... > I know, but then you don't have #ifdef CONFIG_SOME_WEIRD_DEVICE in kernel/xxx.c which is the only thing I find mildly unpleasing. But let's not waste any more time on this detail. > >> However, this is internal and can always be changed, so if everyone >> agrees on the overall API, whichever way you implement it is fine wi= th >> me. > > > We at least need the numberspace to not be architecture-specific if w= e want > to retain the possibility of changing later -- not to mention what ha= ppens > if architectures merge. I see that "arm" and "arm64" are separate, d= espite > the fact that other architectures that used to be split this way have= since > merged. Maybe "arm64" is too different from "arm" for that to happen= , but > who knows... > =46air point, nobody knows. > ...and if they don't merge, wouldn't that be a likely case for device= s > shared across architectures? Does arm64 use gic/vgic? This post sug= gests > that there is at least something in common (the bit about "once the G= IC code > is shared between > arm and arm64"): > http://lists.infradead.org/pipermail/linux-arm-kernel/2012-December/1= 35836.html > I'm not sure how much of that is public at this point, or even determined. But KVM already shares code between arm64 and arm, so I guess I thought of this as a single architecture from the point of view of virt/kvm/kvm_main.c, but that may be incorrect actually. I really need to find time to play around more with the arm64 code. Thanks for the thoughts. -Christoffer From mboxrd@z Thu Jan 1 00:00:00 1970 From: Christoffer Dall Date: Wed, 20 Feb 2013 02:16:53 +0000 Subject: Re: [RFC PATCH 1/6] kvm: add device control API Message-Id: List-Id: References: <1361304966.29654.8@snotra> In-Reply-To: <1361304966.29654.8@snotra> MIME-Version: 1.0 Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: quoted-printable To: Scott Wood Cc: Alexander Graf , kvm-ppc@vger.kernel.org, kvm@vger.kernel.org On Tue, Feb 19, 2013 at 12:16 PM, Scott Wood wrot= e: > On 02/18/2013 11:50:44 PM, Christoffer Dall wrote: >> >> On Mon, Feb 18, 2013 at 4:53 PM, Scott Wood >> wrote: >> > On 02/18/2013 06:44:20 PM, Christoffer Dall wrote: >> >> >> >> On Wed, Feb 13, 2013 at 9:49 PM, Scott Wood >> >> > +#define KVM_CREATE_DEVICE _IOWR(KVMIO, 0xac, struct >> >> > kvm_create_device) >> >> > +#define KVM_SET_DEVICE_ATTR _IOW(KVMIO, 0xad, struct >> >> > kvm_device_attr) >> >> > +#define KVM_GET_DEVICE_ATTR _IOW(KVMIO, 0xae, struct >> >> > kvm_device_attr) >> >> >> >> _IOWR ? >> > >> > >> > struct kvm_device_attr itself is write-only, though the data pointed to >> > by >> > the addr field goes the other way for GET. ONE_REG is in the same >> > situation >> > and also uses _IOW for both. >> > >> > >> >> ok. >> >> Btw., what about the size of the attr? implicitly defined through the at= tr >> id? > > > Yes, same as in ONE_REG. > > >> and I also think it's easier to read the >> arch-specific bits of the code that way, instead of some arbitrary >> function that you have to trace through to figure out where it's >> called from. > > > I don't follow. > > I just mean that if you look in arch/XXX/kvm/XXX.c and see a function called kvm_create_xxx_dev(...) then you're like, what context is this being called from again and in which order, etc. Of course we can name the function kvm_ioctl_create_dev_xxx(...), but I still like to be able to follow the flow of things that are really arch specific, but anyhow, this is a weak argument. >> > By doing device recognition here we don't need a separate copy of this >> > per >> > arch (including some #ifdef or modifying every arch at once -- includi= ng >> > ARM >> > which I can't modify yet because it's not merged), and *if* we should >> > end up >> > with an in-kernel-emulated device that gets used across multiple >> > architectures, it would be annoying to have to modify all relevant >> > architectures (and worse to deal with per-arch numberspaces). >> >> I would say that's exactly what you're going to need with your approach: >> >> switch (cd->type) { >> case KVM_ARM_VGIC_V2_0: >> kvm_arm_vgic_v2_0_create(...); >> } >> >> >> are you going to ifdef here in this function, or? I think it's cleaner >> to have the single arch-specific hooks and handle the cases there. > > > There's an ifdef, as you can see from the patch that adds MPIC support. = But > it's the same ifdef that gets used to determine whether the device code g= ets > built in. Nothing special needs to be added; no per-architecture hoop to > jump through. > > Note that we would still need per-device ifdefs in the arch code, because > not all PPC KVM builds are going to have MPIC support. yeah, that's the same on ARM. > > What if, instead of a switch statement and ifdefs, it operated on a > registration basis? > > Probably just makes the code more confusing and harder to grep with the limited number of in-kernel devices we support. Between that and your current approach, I prefer the current approach. Anyhow, the whole thing is internal state, as I wrote earlier, and by no means a deal breaker for me, and as long as we don't have to mess around with include/linux/kvm_host.h for changing arch-specific stuff, which I believe is the case even with the current design, I'm ok with it. >> The use case of having a single device which is so central to the >> system that we emulate it inside the kernel and is shared across >> multiple archs is pretty far fetched to me. > > > I don't think it's that far fetched. APIC is shared between x86 and ia64= -- > even if APIC has no need for anything beyond existing API, it shows that > it's not a crazy possibility. Freescale already has some devices that are > shared between PPC and ARM, and that set will expand (probably not to irq > controllers, though the probability is non-zero) with Layerscape, about > which Freeescale says, "The unique, core-agnostic architecture incorporat= es > the optimum core for the given application=97ARM cores or Power Architect= ure > cores." We already need to go back and non-ppc-ize various drivers, > including their reliance on I/O accessors that are defined in > architecture-specific ways... Making things gratuitiously architecture > specific is just a bad idea, even if the "use case" for it actually being > used on multiple architectures is remote. > For the record I think this is a simplification: making this generic always comes at the cost of some added complexity exactly due to the loss of being specific. It's a balancing act to figure out which is preferred given the magnitude of the cost. As I indicate above, this case is not too bad. > Normal kernel drivers tend to go in drivers/, not arch/, even if they're > only used on one architecture... > I know, but then you don't have #ifdef CONFIG_SOME_WEIRD_DEVICE in kernel/xxx.c which is the only thing I find mildly unpleasing. But let's not waste any more time on this detail. > >> However, this is internal and can always be changed, so if everyone >> agrees on the overall API, whichever way you implement it is fine with >> me. > > > We at least need the numberspace to not be architecture-specific if we wa= nt > to retain the possibility of changing later -- not to mention what happens > if architectures merge. I see that "arm" and "arm64" are separate, despi= te > the fact that other architectures that used to be split this way have sin= ce > merged. Maybe "arm64" is too different from "arm" for that to happen, but > who knows... > Fair point, nobody knows. > ...and if they don't merge, wouldn't that be a likely case for devices > shared across architectures? Does arm64 use gic/vgic? This post suggests > that there is at least something in common (the bit about "once the GIC c= ode > is shared between > arm and arm64"): > http://lists.infradead.org/pipermail/linux-arm-kernel/2012-December/13583= 6.html > I'm not sure how much of that is public at this point, or even determined. But KVM already shares code between arm64 and arm, so I guess I thought of this as a single architecture from the point of view of virt/kvm/kvm_main.c, but that may be incorrect actually. I really need to find time to play around more with the arm64 code. Thanks for the thoughts. -Christoffer