From mboxrd@z Thu Jan  1 00:00:00 1970
From: Christoffer Dall <cdall@cs.columbia.edu>
Subject: Re: [RFC PATCH 1/6] kvm: add device control API
Date: Tue, 19 Feb 2013 18:16:53 -0800
Message-ID: <CAEDV+gJxLih65FpXvs=z2Cb+bxbRrL0kNzpDWaQ1x0SABLCGrg@mail.gmail.com>
References: <CAEDV+gKU604RN5uh0=xbFVPMYUfvqdqLbVc0nxJJykO2cCoivA@mail.gmail.com>
	<1361304966.29654.8@snotra>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: Alexander Graf <agraf@suse.de>, kvm-ppc@vger.kernel.org,
	kvm@vger.kernel.org
To: Scott Wood <scottwood@freescale.com>
Return-path: <kvm-ppc-owner@vger.kernel.org>
In-Reply-To: <1361304966.29654.8@snotra>
Sender: kvm-ppc-owner@vger.kernel.org
List-Id: kvm.vger.kernel.org

On Tue, Feb 19, 2013 at 12:16 PM, Scott Wood <scottwood@freescale.com> =
wrote:
> On 02/18/2013 11:50:44 PM, Christoffer Dall wrote:
>>
>> On Mon, Feb 18, 2013 at 4:53 PM, Scott Wood <scottwood@freescale.com=
>
>> wrote:
>> > On 02/18/2013 06:44:20 PM, Christoffer Dall wrote:
>> >>
>> >> On Wed, Feb 13, 2013 at 9:49 PM, Scott Wood <scottwood@freescale.=
com>
>> >> > +#define KVM_CREATE_DEVICE        _IOWR(KVMIO,  0xac, struct
>> >> > kvm_create_device)
>> >> > +#define KVM_SET_DEVICE_ATTR      _IOW(KVMIO,  0xad, struct
>> >> > kvm_device_attr)
>> >> > +#define KVM_GET_DEVICE_ATTR      _IOW(KVMIO,  0xae, struct
>> >> > kvm_device_attr)
>> >>
>> >> _IOWR ?
>> >
>> >
>> > struct kvm_device_attr itself is write-only, though the data point=
ed to
>> > by
>> > the addr field goes the other way for GET.  ONE_REG is in the same
>> > situation
>> > and also uses _IOW for both.
>> >
>> >
>>
>> ok.
>>
>> Btw., what about the size of the attr? implicitly defined through th=
e attr
>> id?
>
>
> Yes, same as in ONE_REG.
>
>
>> and I also think it's easier to read the
>> arch-specific bits of the code that way, instead of some arbitrary
>> function that you have to trace through to figure out where it's
>> called from.
>
>
> I don't follow.
>
>

I just mean that if you look in arch/XXX/kvm/XXX.c and see a function
called kvm_create_xxx_dev(...) then you're like, what context is this
being called from again and in which order, etc. Of course we can name
the function kvm_ioctl_create_dev_xxx(...), but I still like to be
able to follow the flow of things that are really arch specific, but
anyhow, this is a weak argument.

>> > By doing device recognition here we don't need a separate copy of =
this
>> > per
>> > arch (including some #ifdef or modifying every arch at once -- inc=
luding
>> > ARM
>> > which I can't modify yet because it's not merged), and *if* we sho=
uld
>> > end up
>> > with an in-kernel-emulated device that gets used across multiple
>> > architectures, it would be annoying to have to modify all relevant
>> > architectures (and worse to deal with per-arch numberspaces).
>>
>> I would say that's exactly what you're going to need with your appro=
ach:
>>
>> switch (cd->type) {
>> case KVM_ARM_VGIC_V2_0:
>>     kvm_arm_vgic_v2_0_create(...);
>> }
>>
>>
>> are you going to ifdef here in this function, or? I think it's clean=
er
>> to have the single arch-specific hooks and handle the cases there.
>
>
> There's an ifdef, as you can see from the patch that adds MPIC suppor=
t.  But
> it's the same ifdef that gets used to determine whether the device co=
de gets
> built in.  Nothing special needs to be added; no per-architecture hoo=
p to
> jump through.
>
> Note that we would still need per-device ifdefs in the arch code, bec=
ause
> not all PPC KVM builds are going to have MPIC support.

yeah, that's the same on ARM.

>
> What if, instead of a switch statement and ifdefs, it operated on a
> registration basis?
>
>

Probably just makes the code more confusing and harder to grep with
the limited number of in-kernel devices we support. Between that and
your current approach, I prefer the current approach. Anyhow, the
whole thing is internal state, as I wrote earlier, and by no means a
deal breaker for me, and as long as we don't have to mess around with
include/linux/kvm_host.h for changing arch-specific stuff, which I
believe is the case even with the current design, I'm ok with it.

>> The use case of having a single device which is so central to the
>> system that we emulate it inside the kernel and is shared across
>> multiple archs is pretty far fetched to me.
>
>
> I don't think it's that far fetched.  APIC is shared between x86 and =
ia64 --
> even if APIC has no need for anything beyond existing API, it shows t=
hat
> it's not a crazy possibility.  Freescale already has some devices tha=
t are
> shared between PPC and ARM, and that set will expand (probably not to=
 irq
> controllers, though the probability is non-zero) with Layerscape, abo=
ut
> which Freeescale says, "The unique, core-agnostic architecture incorp=
orates
> the optimum core for the given application=97ARM cores or Power Archi=
tecture
> cores."  We already need to go back and non-ppc-ize various drivers,
> including their reliance on I/O accessors that are defined in
> architecture-specific ways...  Making things gratuitiously architectu=
re
> specific is just a bad idea, even if the "use case" for it actually b=
eing
> used on multiple architectures is remote.
>

=46or the record I think this is a simplification: making this generic
always comes at the cost of some added complexity exactly due to the
loss of being specific. It's a balancing act to figure out which is
preferred given the magnitude of the cost. As I indicate above, this
case is not too bad.

> Normal kernel drivers tend to go in drivers/, not arch/, even if they=
're
> only used on one architecture...
>

I know, but then you don't have #ifdef CONFIG_SOME_WEIRD_DEVICE in
kernel/xxx.c which is the only thing I find mildly unpleasing.

But let's not waste any more time on this detail.

>
>> However, this is internal and can always be changed, so if everyone
>> agrees on the overall API, whichever way you implement it is fine wi=
th
>> me.
>
>
> We at least need the numberspace to not be architecture-specific if w=
e want
> to retain the possibility of changing later -- not to mention what ha=
ppens
> if architectures merge.  I see that "arm" and "arm64" are separate, d=
espite
> the fact that other architectures that used to be split this way have=
 since
> merged.  Maybe "arm64" is too different from "arm" for that to happen=
, but
> who knows...
>

=46air point, nobody knows.

> ...and if they don't merge, wouldn't that be a likely case for device=
s
> shared across architectures?  Does arm64 use gic/vgic?  This post sug=
gests
> that there is at least something in common (the bit about "once the G=
IC code
> is shared between
> arm and arm64"):
> http://lists.infradead.org/pipermail/linux-arm-kernel/2012-December/1=
35836.html
>

I'm not sure how much of that is public at this point, or even
determined. But KVM already shares code between arm64 and arm, so I
guess I thought of this as a single architecture from the point of
view of virt/kvm/kvm_main.c, but that may be incorrect actually.

I really need to find time to play around more with the arm64 code.

Thanks for the thoughts.

-Christoffer

From mboxrd@z Thu Jan  1 00:00:00 1970
From: Christoffer Dall <cdall@cs.columbia.edu>
Date: Wed, 20 Feb 2013 02:16:53 +0000
Subject: Re: [RFC PATCH 1/6] kvm: add device control API
Message-Id: <CAEDV+gJxLih65FpXvs=z2Cb+bxbRrL0kNzpDWaQ1x0SABLCGrg@mail.gmail.com>
List-Id: <kvm-ppc.vger.kernel.org>
References: <CAEDV+gKU604RN5uh0=xbFVPMYUfvqdqLbVc0nxJJykO2cCoivA@mail.gmail.com>
	<1361304966.29654.8@snotra>
In-Reply-To: <1361304966.29654.8@snotra>
MIME-Version: 1.0
Content-Type: text/plain; charset="windows-1252"
Content-Transfer-Encoding: quoted-printable
To: Scott Wood <scottwood@freescale.com>
Cc: Alexander Graf <agraf@suse.de>, kvm-ppc@vger.kernel.org, kvm@vger.kernel.org

On Tue, Feb 19, 2013 at 12:16 PM, Scott Wood <scottwood@freescale.com> wrot=
e:
> On 02/18/2013 11:50:44 PM, Christoffer Dall wrote:
>>
>> On Mon, Feb 18, 2013 at 4:53 PM, Scott Wood <scottwood@freescale.com>
>> wrote:
>> > On 02/18/2013 06:44:20 PM, Christoffer Dall wrote:
>> >>
>> >> On Wed, Feb 13, 2013 at 9:49 PM, Scott Wood <scottwood@freescale.com>
>> >> > +#define KVM_CREATE_DEVICE        _IOWR(KVMIO,  0xac, struct
>> >> > kvm_create_device)
>> >> > +#define KVM_SET_DEVICE_ATTR      _IOW(KVMIO,  0xad, struct
>> >> > kvm_device_attr)
>> >> > +#define KVM_GET_DEVICE_ATTR      _IOW(KVMIO,  0xae, struct
>> >> > kvm_device_attr)
>> >>
>> >> _IOWR ?
>> >
>> >
>> > struct kvm_device_attr itself is write-only, though the data pointed to
>> > by
>> > the addr field goes the other way for GET.  ONE_REG is in the same
>> > situation
>> > and also uses _IOW for both.
>> >
>> >
>>
>> ok.
>>
>> Btw., what about the size of the attr? implicitly defined through the at=
tr
>> id?
>
>
> Yes, same as in ONE_REG.
>
>
>> and I also think it's easier to read the
>> arch-specific bits of the code that way, instead of some arbitrary
>> function that you have to trace through to figure out where it's
>> called from.
>
>
> I don't follow.
>
>

I just mean that if you look in arch/XXX/kvm/XXX.c and see a function
called kvm_create_xxx_dev(...) then you're like, what context is this
being called from again and in which order, etc. Of course we can name
the function kvm_ioctl_create_dev_xxx(...), but I still like to be
able to follow the flow of things that are really arch specific, but
anyhow, this is a weak argument.

>> > By doing device recognition here we don't need a separate copy of this
>> > per
>> > arch (including some #ifdef or modifying every arch at once -- includi=
ng
>> > ARM
>> > which I can't modify yet because it's not merged), and *if* we should
>> > end up
>> > with an in-kernel-emulated device that gets used across multiple
>> > architectures, it would be annoying to have to modify all relevant
>> > architectures (and worse to deal with per-arch numberspaces).
>>
>> I would say that's exactly what you're going to need with your approach:
>>
>> switch (cd->type) {
>> case KVM_ARM_VGIC_V2_0:
>>     kvm_arm_vgic_v2_0_create(...);
>> }
>>
>>
>> are you going to ifdef here in this function, or? I think it's cleaner
>> to have the single arch-specific hooks and handle the cases there.
>
>
> There's an ifdef, as you can see from the patch that adds MPIC support.  =
But
> it's the same ifdef that gets used to determine whether the device code g=
ets
> built in.  Nothing special needs to be added; no per-architecture hoop to
> jump through.
>
> Note that we would still need per-device ifdefs in the arch code, because
> not all PPC KVM builds are going to have MPIC support.

yeah, that's the same on ARM.

>
> What if, instead of a switch statement and ifdefs, it operated on a
> registration basis?
>
>

Probably just makes the code more confusing and harder to grep with
the limited number of in-kernel devices we support. Between that and
your current approach, I prefer the current approach. Anyhow, the
whole thing is internal state, as I wrote earlier, and by no means a
deal breaker for me, and as long as we don't have to mess around with
include/linux/kvm_host.h for changing arch-specific stuff, which I
believe is the case even with the current design, I'm ok with it.

>> The use case of having a single device which is so central to the
>> system that we emulate it inside the kernel and is shared across
>> multiple archs is pretty far fetched to me.
>
>
> I don't think it's that far fetched.  APIC is shared between x86 and ia64=
 --
> even if APIC has no need for anything beyond existing API, it shows that
> it's not a crazy possibility.  Freescale already has some devices that are
> shared between PPC and ARM, and that set will expand (probably not to irq
> controllers, though the probability is non-zero) with Layerscape, about
> which Freeescale says, "The unique, core-agnostic architecture incorporat=
es
> the optimum core for the given application=97ARM cores or Power Architect=
ure
> cores."  We already need to go back and non-ppc-ize various drivers,
> including their reliance on I/O accessors that are defined in
> architecture-specific ways...  Making things gratuitiously architecture
> specific is just a bad idea, even if the "use case" for it actually being
> used on multiple architectures is remote.
>

For the record I think this is a simplification: making this generic
always comes at the cost of some added complexity exactly due to the
loss of being specific. It's a balancing act to figure out which is
preferred given the magnitude of the cost. As I indicate above, this
case is not too bad.

> Normal kernel drivers tend to go in drivers/, not arch/, even if they're
> only used on one architecture...
>

I know, but then you don't have #ifdef CONFIG_SOME_WEIRD_DEVICE in
kernel/xxx.c which is the only thing I find mildly unpleasing.

But let's not waste any more time on this detail.

>
>> However, this is internal and can always be changed, so if everyone
>> agrees on the overall API, whichever way you implement it is fine with
>> me.
>
>
> We at least need the numberspace to not be architecture-specific if we wa=
nt
> to retain the possibility of changing later -- not to mention what happens
> if architectures merge.  I see that "arm" and "arm64" are separate, despi=
te
> the fact that other architectures that used to be split this way have sin=
ce
> merged.  Maybe "arm64" is too different from "arm" for that to happen, but
> who knows...
>

Fair point, nobody knows.

> ...and if they don't merge, wouldn't that be a likely case for devices
> shared across architectures?  Does arm64 use gic/vgic?  This post suggests
> that there is at least something in common (the bit about "once the GIC c=
ode
> is shared between
> arm and arm64"):
> http://lists.infradead.org/pipermail/linux-arm-kernel/2012-December/13583=
6.html
>

I'm not sure how much of that is public at this point, or even
determined. But KVM already shares code between arm64 and arm, so I
guess I thought of this as a single architecture from the point of
view of virt/kvm/kvm_main.c, but that may be incorrect actually.

I really need to find time to play around more with the arm64 code.

Thanks for the thoughts.

-Christoffer