All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] Compile files only once: some planning
@ 2010-03-23 21:43 Blue Swirl
  2010-03-24  9:17 ` [Qemu-devel] " Juan Quintela
                   ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Blue Swirl @ 2010-03-23 21:43 UTC (permalink / raw)
  To: qemu-devel

Hi,

Here's some planning for getting most files compiled as few times as
possible. Comments and suggestions are welcome.

I have now converted most of the easy cases which were compiled for
all targets, saving about 200 compiles for full build with default set
of targets (~1500 files compiled). For the easy files, the rules can
be just moved from Makefile.target to Makefile.objs without impact. If
the file was compiled conditionally, the rules may need to be added to
default-configs/*.mak. Some defines, like TARGET_PAGE_SIZE or
TARGET_WORDS_BIGENDIAN can be pushed to board level.

The target dependent cases should be next. On full build, each MIPS
device file gets compiled four times, PPC files three times and x86
twice. The devices for architectures that are compiled only once (ARM,
Cris, Sparc32 etc.) do not need to be touched.

I think it's better to add a new line for each device to
default-configs instead of adding just CONFIG_MIPS for example.

The harder cases are those where the device code depends somehow on
the architecture. Some thoughts follow.

vl.c: a lot of work. Maybe the CPUState stuff should be separated to a new file.

virtio-*.c: push TARGET_PAGE_SIZE to board level, it's not so easy though.

rtl8139.c, e1000.c: need to convert ldl/stl to cpu_physical_memory_read/write.

ide/core.c: win2k flag could be passed from board level, or the code
could just be enabled here because the flag is x86 only in vl.c.

pckbd.c, vmmouse.c, x86 CPU: On x86 only, vmmouse wants to insert key
events to pckbd buffer and pckbd is connected to x86 A20 line. The
solution could be to use qemu_irq signals to handle A20 line changes
and some code reorganization.

dma.c: DMA_schedule needs access to CPUState.

mc146818.c: coalesced IRQs only for x86, also APIC dependency.

fpu/softfloat.c, fpu/softfloat-native.c: may need some #define adjustment.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Qemu-devel] Re: Compile files only once: some planning
  2010-03-23 21:43 [Qemu-devel] Compile files only once: some planning Blue Swirl
@ 2010-03-24  9:17 ` Juan Quintela
  2010-03-24 17:56   ` Blue Swirl
  2010-03-25  2:54   ` Jamie Lokier
  2010-03-24  9:47 ` Paolo Bonzini
  2010-03-24 19:39 ` Michael S. Tsirkin
  2 siblings, 2 replies; 31+ messages in thread
From: Juan Quintela @ 2010-03-24  9:17 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel

Blue Swirl <blauwirbel@gmail.com> wrote:
> Hi,
>
> Here's some planning for getting most files compiled as few times as
> possible. Comments and suggestions are welcome.

I took some thought about this at some point.  Problems here start from
"Recursive Makefile condered Harmful (tm)".

Look at how we jump through hops to be able to compile things in
one/other side.

We have:
Makefile
Makefile.target (really lots of them, one for target)
Makefile.hw
Makefile.user

If we had only a single Makefile, things in this department would be
much, much easier. And no, convert to a single Makefile is not trivial
either, but it would make things easier.

Why do we have several Makefiles?  Because we want to compile each file
with different options.

Why do we need to abuse so much VPATH?  Because we need to bring files
randomly from $(ROOT), $(ROOT)/hw  $(ROOT)/$(TARGET).

Problem here, there isn't a simple way to compile files for several
target just once (no way to put them).

Our main copmile rule is:

$(QEMU_PROG): $(obj-y) $(obj-$(TARGET_BASE_ARCH)-y)
	$(call LINK,$(obj-y) $(obj-$(TARGET_BASE_ARCH)-y))


(notice that things compiled in Makefile are trivial, they are already
compiled just once by definition, problems are for all the qemu's we
compile).

We could change: $(obj-$(TARGET_BASE_ARCH)-y) to something like:

OBJ-TARGET=s/.o/.$(TARGET_BASE_ARCH).o/

(I forgot the subst Makefile syntax), and have the:

%.$(TARGET_BASE_ARCH).o: %.c
   gcc $(TARGET_BASE_ARCH options)

>From there, as you suggested, we need some files that are not compiled
by architecture, they need to be compiled by board, well, we need to add
yet another level obj-$(TARGET_BOARD) or whatever.

Notice that this is a lot of work, but you are needing the audit to be
able to compile only once.  Problem just now is that there is not a
simple way to describe that information,  with my proposal it gets
trivial to express:

obj-$(CONFIG_FOO) += foo.o  # You need this for everything
obj-mips-$(CONFIG_FOO) += foo.o  # You need this for all mips boards
obj-malta-$(CONFIG_FOO) += foo.o  # You need this for all mips malta board

You still need to do some different magic from hw-32/64 but it could be
done this way.  Once you did it this way, you now where the files are
(hw or target) and you can drop the VPATH tricks.

Problem with this proposal is that it is not trivial to do in little
steps, and the real big advantages appear when you switch to a single
Makefile at the end.

> vl.c: a lot of work. Maybe the CPUState stuff should be separated to a new file.

That should just be a rule in Documentation.  You can't but anything
else in vl.c.  If you move anything out of vl.c (see timers work from
bonzini for example), you get a wild card for free commit bypassing
maintainers or some similar price :)

<rest of files>

I haven't really looked at them at depth.

I looked when I cleaned up the build system, I thought how to do the
next step (outlined before), but got sidetracked by other more urgent
things.

Later, Juan.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Qemu-devel] Re: Compile files only once: some planning
  2010-03-23 21:43 [Qemu-devel] Compile files only once: some planning Blue Swirl
  2010-03-24  9:17 ` [Qemu-devel] " Juan Quintela
@ 2010-03-24  9:47 ` Paolo Bonzini
  2010-03-24 11:19   ` Richard Henderson
  2010-03-24 19:39 ` Michael S. Tsirkin
  2 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2010-03-24  9:47 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel


> The harder cases are those where the device code depends somehow on
> the architecture. Some thoughts follow.
>
> vl.c: a lot of work. Maybe the CPUState stuff should be separated to a new file.
>
> dma.c: DMA_schedule needs access to CPUState.

Most users of CPUState (e.g. qemu-timer.c and hw/dma.c) either need it 
as an opaque pointer, or only need access to target-independent stuff. 
So you could:

1) make CPUState define only common fields.  Include CPUState at the 
beginning of each per-target CPUXYZState.

2) Do s/CPUState/CPUXYZState/ on target-*/*.

3) Make it compile, possibly by undoing parts of 2) and changing parts 
of it to DO_UPCAST.

Paolo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24  9:47 ` Paolo Bonzini
@ 2010-03-24 11:19   ` Richard Henderson
  2010-03-24 14:45     ` Paolo Bonzini
  2010-03-24 18:00     ` Blue Swirl
  0 siblings, 2 replies; 31+ messages in thread
From: Richard Henderson @ 2010-03-24 11:19 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Blue Swirl, qemu-devel

On 03/24/2010 02:47 AM, Paolo Bonzini wrote:
> 1) make CPUState define only common fields. Include CPUState at the
> beginning of each per-target CPUXYZState.

Irritatingly, the common fields contain quite big TLBs.  And the
offsets from the start of env affect the compactness of the code
generated from TCG.  We really really want the general registers
to come first to make sure that those offsets fit the host's
reg+offset addressing mode.


r~

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 11:19   ` Richard Henderson
@ 2010-03-24 14:45     ` Paolo Bonzini
  2010-03-24 14:56       ` Paul Brook
  2010-03-24 17:07       ` Richard Henderson
  2010-03-24 18:00     ` Blue Swirl
  1 sibling, 2 replies; 31+ messages in thread
From: Paolo Bonzini @ 2010-03-24 14:45 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Blue Swirl, qemu-devel

On 03/24/2010 12:19 PM, Richard Henderson wrote:
> On 03/24/2010 02:47 AM, Paolo Bonzini wrote:
>> 1) make CPUState define only common fields. Include CPUState at the
>> beginning of each per-target CPUXYZState.
>
> Irritatingly, the common fields contain quite big TLBs. And the
> offsets from the start of env affect the compactness of the code
> generated from TCG. We really really want the general registers
> to come first to make sure that those offsets fit the host's
> reg+offset addressing mode.

What about adding a 512-bytes (or more) block or something like that at 
the beginning of CPUState with a union, so you can put the per-target 
stuff there?

Paolo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 14:45     ` Paolo Bonzini
@ 2010-03-24 14:56       ` Paul Brook
  2010-03-24 16:18         ` Paolo Bonzini
  2010-03-24 17:07       ` Richard Henderson
  1 sibling, 1 reply; 31+ messages in thread
From: Paul Brook @ 2010-03-24 14:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: Blue Swirl, Paolo Bonzini, Richard Henderson

> On 03/24/2010 12:19 PM, Richard Henderson wrote:
> > On 03/24/2010 02:47 AM, Paolo Bonzini wrote:
> >> 1) make CPUState define only common fields. Include CPUState at the
> >> beginning of each per-target CPUXYZState.
> >
> > Irritatingly, the common fields contain quite big TLBs. And the
> > offsets from the start of env affect the compactness of the code
> > generated from TCG. We really really want the general registers
> > to come first to make sure that those offsets fit the host's
> > reg+offset addressing mode.
> 
> What about adding a 512-bytes (or more) block or something like that at
> the beginning of CPUState with a union, so you can put the per-target
> stuff there?

Is it really worth the hassle? Anything touching CPUState is probably going to 
be CPU specific anyway.

Paul

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 14:56       ` Paul Brook
@ 2010-03-24 16:18         ` Paolo Bonzini
  2010-03-24 17:27           ` Paul Brook
  0 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2010-03-24 16:18 UTC (permalink / raw)
  To: Paul Brook; +Cc: Blue Swirl, qemu-devel, Richard Henderson

On 03/24/2010 03:56 PM, Paul Brook wrote:
>> On 03/24/2010 12:19 PM, Richard Henderson wrote:
>>> On 03/24/2010 02:47 AM, Paolo Bonzini wrote:
>>>> 1) make CPUState define only common fields. Include CPUState at the
>>>> beginning of each per-target CPUXYZState.
>>>
>>> Irritatingly, the common fields contain quite big TLBs. And the
>>> offsets from the start of env affect the compactness of the code
>>> generated from TCG. We really really want the general registers
>>> to come first to make sure that those offsets fit the host's
>>> reg+offset addressing mode.
>>
>> What about adding a 512-bytes (or more) block or something like that at
>> the beginning of CPUState with a union, so you can put the per-target
>> stuff there?
>
> Is it really worth the hassle? Anything touching CPUState is probably going to
> be CPU specific anyway.

qemu-timer.c, hw/dma.c is not and these are the first two files I looked 
at.  translate-all.c is the third, and it is except for a trivial cleanup.

Big parts of vl.c are independent too.

As a quick check:

$ git grep -l 'CPUState' | grep -ve ^tcg -e ^target- | wc -l
96
$ git grep -l 'CPUState' | grep -ve ^tcg -e ^target- | \
      xargs grep -l '#if.*TARGET_' | wc -l
36

The ones that remain are pretty much what would you expect, besides 
translate-all.c and some in hw/ which I snipped:

bsd-user/main.c
darwin-user/main.c darwin-user/qemu.h darwin-user/signal.c
linux-user/elfload.c linux-user/main.c linux-user/qemu.h
linux-user/signal.c linux-user/syscall.c

cpu-all.h cpu-defs.h cpu-exec.c
def-helper.h
disas.c
exec-all.h exec.c
gdbstub.c
monitor.c
translate-all.c
vl.c

Of course this doesn't mean that 60 files are target-independent, but 
30-ish probably are or can be made so.

It would also help code clarity to use CPUXYZState more, to understand 
which hw/ files are specific to one model.  For hw/s390-virtio.c that's 
obvious, but slightly less so for hw/sun4m.c and even less so for 
hw/syborg.c.  This is an independent cleanup.

Paolo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 14:45     ` Paolo Bonzini
  2010-03-24 14:56       ` Paul Brook
@ 2010-03-24 17:07       ` Richard Henderson
  2010-03-24 20:12         ` Richard Henderson
  1 sibling, 1 reply; 31+ messages in thread
From: Richard Henderson @ 2010-03-24 17:07 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Blue Swirl, qemu-devel

On 03/24/2010 07:45 AM, Paolo Bonzini wrote:
> On 03/24/2010 12:19 PM, Richard Henderson wrote:
>> On 03/24/2010 02:47 AM, Paolo Bonzini wrote:
>>> 1) make CPUState define only common fields. Include CPUState at the
>>> beginning of each per-target CPUXYZState.
>>
>> Irritatingly, the common fields contain quite big TLBs. And the
>> offsets from the start of env affect the compactness of the code
>> generated from TCG. We really really want the general registers
>> to come first to make sure that those offsets fit the host's
>> reg+offset addressing mode.
> 
> What about adding a 512-bytes (or more) block or something like that at
> the beginning of CPUState with a union, so you can put the per-target
> stuff there?

I think that would be confusing.

What might be just as good (although possibly just as confusing)
is to move the big members into a different structure.  E.g.

struct CPUSmallCommonState
{
    // most of the stuff from CPU_COMMON.
    // sorted for some thought of padding elimination.  ;-)
};

struct CPULargeCommonState
{
    CPUTLBEntry tlb_table[NB_MMU_MODES][CPU_TLB_SIZE];
    target_phys_addr_t iotlb[NB_MMU_MODES][CPU_TLB_SIZE];
    struct TranslationBlock *tb_jmp_cache[TB_JMP_CACHE_SIZE];
    jmp_buf jmp_env;
};

struct CPUXYZSmallState
{
    CPUSmallCommonState common_s;
    // the rest of the cpu-specific stuff.
};

struct CPUXYZLargeState
{
    CPUXYZSmallState s;
    CPUBigCommonState common_l;
};

extern int cpu_large_state_offset = offsetof(CPUXYZLargeState, common_l);

Now.  If you're compiling a file for which cpu-specific code is ok:

register CPUXYZLargeState *env __asm__(AREG0);
#define ENV_SMALL_COMMON_STATE    (&env->s.common_s)
#define ENV_LARGE_COMMON_STATE    (&env->common_l)

If you're compiling a file which is supposed to be independant of cpu:

register CPUSmallCommonState *env __asm__(AREG0);
#define ENV_SMALL_COMMON_STATE    (env)
#define ENV_LARGE_COMMON_STATE    ((CPULargeCommonState *)((char *)env + cpu_large_state_offset))

For the gcc-compiled code, the addition of the cpu_large_state_offset
is probably more or less on par in efficiency with indirection.  But
for TCG generated code, the variable read happens at code generation
time, which means we *still* have a constant in the generated code.


r~

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 16:18         ` Paolo Bonzini
@ 2010-03-24 17:27           ` Paul Brook
  0 siblings, 0 replies; 31+ messages in thread
From: Paul Brook @ 2010-03-24 17:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: Blue Swirl, Paolo Bonzini, Richard Henderson

> >>>> 1) make CPUState define only common fields. Include CPUState at the
> >>>> beginning of each per-target CPUXYZState.
> >>>
> >>> Irritatingly, the common fields contain quite big TLBs. And the
> >>> offsets from the start of env affect the compactness of the code
> >>> generated from TCG. We really really want the general registers
> >>> to come first to make sure that those offsets fit the host's
> >>> reg+offset addressing mode.
> >>
> >> What about adding a 512-bytes (or more) block or something like that at
> >> the beginning of CPUState with a union, so you can put the per-target
> >> stuff there?
> >
> > Is it really worth the hassle? Anything touching CPUState is probably
> > going to be CPU specific anyway.
> 
> qemu-timer.c, hw/dma.c is not and these are the first two files I looked
> at.  translate-all.c is the third, and it is except for a trivial cleanup.

The use in hw/dma.c is incorrect.  See previous discussion about how 
qemu_bh_schedule_idle needs to go away.

I'm also unconvinced by your numbers. My i386-softmmu/ directory contains only 
43 object files, most of are device emulation and don't touch CPU state at 
all. arm-softmmu/ contains a good number more, but that's mostly board init 
(which needs to know which CPU it's creating), and devices that are only used 
by one board so noone's bothered to move them into libhw.

Paul

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24  9:17 ` [Qemu-devel] " Juan Quintela
@ 2010-03-24 17:56   ` Blue Swirl
  2010-03-24 19:28     ` Juan Quintela
  2010-03-25  2:54   ` Jamie Lokier
  1 sibling, 1 reply; 31+ messages in thread
From: Blue Swirl @ 2010-03-24 17:56 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel

On 3/24/10, Juan Quintela <quintela@redhat.com> wrote:
> Blue Swirl <blauwirbel@gmail.com> wrote:
>  > Hi,
>  >
>  > Here's some planning for getting most files compiled as few times as
>  > possible. Comments and suggestions are welcome.
>
>
> I took some thought about this at some point.  Problems here start from
>  "Recursive Makefile condered Harmful (tm)".
>
>  Look at how we jump through hops to be able to compile things in
>  one/other side.
>
>  We have:
>  Makefile
>  Makefile.target (really lots of them, one for target)
>  Makefile.hw
>  Makefile.user
>
>  If we had only a single Makefile, things in this department would be
>  much, much easier. And no, convert to a single Makefile is not trivial
>  either, but it would make things easier.
>
>  Why do we have several Makefiles?  Because we want to compile each file
>  with different options.
>
>  Why do we need to abuse so much VPATH?  Because we need to bring files
>  randomly from $(ROOT), $(ROOT)/hw  $(ROOT)/$(TARGET).

Would it help if the Makefiles were scattered to each directory, for
example instead of Makefile.hw we had hw/Makefile?

>  Problem here, there isn't a simple way to compile files for several
>  target just once (no way to put them).
>
>  Our main copmile rule is:
>
>  $(QEMU_PROG): $(obj-y) $(obj-$(TARGET_BASE_ARCH)-y)
>         $(call LINK,$(obj-y) $(obj-$(TARGET_BASE_ARCH)-y))
>
>
>  (notice that things compiled in Makefile are trivial, they are already
>  compiled just once by definition, problems are for all the qemu's we
>  compile).
>
>  We could change: $(obj-$(TARGET_BASE_ARCH)-y) to something like:
>
>  OBJ-TARGET=s/.o/.$(TARGET_BASE_ARCH).o/
>
>  (I forgot the subst Makefile syntax), and have the:
>
>  %.$(TARGET_BASE_ARCH).o: %.c
>    gcc $(TARGET_BASE_ARCH options)
>
>  From there, as you suggested, we need some files that are not compiled
>  by architecture, they need to be compiled by board, well, we need to add
>  yet another level obj-$(TARGET_BOARD) or whatever.
>
>  Notice that this is a lot of work, but you are needing the audit to be
>  able to compile only once.  Problem just now is that there is not a
>  simple way to describe that information,  with my proposal it gets
>  trivial to express:
>
>  obj-$(CONFIG_FOO) += foo.o  # You need this for everything
>  obj-mips-$(CONFIG_FOO) += foo.o  # You need this for all mips boards
>  obj-malta-$(CONFIG_FOO) += foo.o  # You need this for all mips malta board
>
>  You still need to do some different magic from hw-32/64 but it could be
>  done this way.  Once you did it this way, you now where the files are
>  (hw or target) and you can drop the VPATH tricks.
>
>  Problem with this proposal is that it is not trivial to do in little
>  steps, and the real big advantages appear when you switch to a single
>  Makefile at the end.

I may have missed something, but the compile process doesn't care
about boards, because all boards for some architecture (and therefore
all devices used by all boards) are linked to a single
per-architecture executable. So why introduce the boards concept to
Makefiles?

>  > vl.c: a lot of work. Maybe the CPUState stuff should be separated to a new file.
>
>
> That should just be a rule in Documentation.  You can't but anything
>  else in vl.c.  If you move anything out of vl.c (see timers work from
>  bonzini for example), you get a wild card for free commit bypassing
>  maintainers or some similar price :)

Cleaning up vl.c would be great, but just for purpose of single
compilation, it's enough to put the CPUState pieces to a target
dependent file (cpu-common.c?) and compile the rest once by making
TARGET_xxx conditional code unconditional. This may still be doable.

>  <rest of files>
>
>  I haven't really looked at them at depth.
>
>  I looked when I cleaned up the build system, I thought how to do the
>  next step (outlined before), but got sidetracked by other more urgent
>  things.

Thanks for the comments.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 11:19   ` Richard Henderson
  2010-03-24 14:45     ` Paolo Bonzini
@ 2010-03-24 18:00     ` Blue Swirl
  1 sibling, 0 replies; 31+ messages in thread
From: Blue Swirl @ 2010-03-24 18:00 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Paolo Bonzini, qemu-devel

On 3/24/10, Richard Henderson <rth@twiddle.net> wrote:
> On 03/24/2010 02:47 AM, Paolo Bonzini wrote:
>
> > 1) make CPUState define only common fields. Include CPUState at the
> > beginning of each per-target CPUXYZState.
> >
>
>  Irritatingly, the common fields contain quite big TLBs.  And the
>  offsets from the start of env affect the compactness of the code
>  generated from TCG.  We really really want the general registers
>  to come first to make sure that those offsets fit the host's
>  reg+offset addressing mode.

One trick is to define a fixed offset (about half CPUState size) and
make env point to CPUState + offset. Then the negative part of the
offset space can be used efficiently. But this just doubles the space
that can be accessed fast, so it's not a big win.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 17:56   ` Blue Swirl
@ 2010-03-24 19:28     ` Juan Quintela
  2010-03-24 22:47       ` Paul Brook
  0 siblings, 1 reply; 31+ messages in thread
From: Juan Quintela @ 2010-03-24 19:28 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel

Blue Swirl <blauwirbel@gmail.com> wrote:
> On 3/24/10, Juan Quintela <quintela@redhat.com> wrote:
>> Blue Swirl <blauwirbel@gmail.com> wrote:
>>  > Hi,
>>  >
>>  > Here's some planning for getting most files compiled as few times as
>>  > possible. Comments and suggestions are welcome.
>>
>>
>> I took some thought about this at some point.  Problems here start from
>>  "Recursive Makefile condered Harmful (tm)".
>>
>>  Look at how we jump through hops to be able to compile things in
>>  one/other side.
>>
>>  We have:
>>  Makefile
>>  Makefile.target (really lots of them, one for target)
>>  Makefile.hw
>>  Makefile.user
>>
>>  If we had only a single Makefile, things in this department would be
>>  much, much easier. And no, convert to a single Makefile is not trivial
>>  either, but it would make things easier.
>>
>>  Why do we have several Makefiles?  Because we want to compile each file
>>  with different options.
>>
>>  Why do we need to abuse so much VPATH?  Because we need to bring files
>>  randomly from $(ROOT), $(ROOT)/hw  $(ROOT)/$(TARGET).
>
> Would it help if the Makefiles were scattered to each directory, for
> example instead of Makefile.hw we had hw/Makefile?

The interesting one is Makefile.target, hw/Makefile could help, but that
is far away from where the action is.

If you look at Makefile.target from far enough, you will see that it
basically has:

libobj-$(CONFIG_FOO) = ...

ifdef CONFIG_LINUX_USER
....
endif

ifdef CONFIG_DARWIN_USER
...
endif

ifdef CONFIG_BSD_USER
...
endif

ifdef CONFIG_SOFTMMU
...
endif


The shared bits are very small (out of the libobj-y stuff).
Spliting the others in different Makefiles (or whatever is easy).  How
to get this ones compiled only once per BASE_ARCH/whatever should put us
near the goal of a single Makefile (and compiling each thing just the
number of times required).  Some thought is needed to know how to work here.

Actually, Anthony suggested at some point to just use 64 bits for
TARGET_PHYS_ADDR_BITS and remove the need for hw32/64.

I think that people emulationg 32bits on 32bits would suffer, but have
no clue how much.  Anthony, what was the idea?

>>  Problem with this proposal is that it is not trivial to do in little
>>  steps, and the real big advantages appear when you switch to a single
>>  Makefile at the end.
>
> I may have missed something, but the compile process doesn't care
> about boards, because all boards for some architecture (and therefore
> all devices used by all boards) are linked to a single
> per-architecture executable. So why introduce the boards concept to
> Makefiles?

I missunderstood this bit of your previous message:

> The target dependent cases should be next. On full build, each MIPS
> device file gets compiled four times, PPC files three times and x86
> twice. The devices for architectures that are compiled only once (ARM,
> Cris, Sparc32 etc.) do not need to be touched.

I was refering to this ones, but somehow got confused with boards :(

>
>>  > vl.c: a lot of work. Maybe the CPUState stuff should be separated to a new file.
>>
>>
>> That should just be a rule in Documentation.  You can't but anything
>>  else in vl.c.  If you move anything out of vl.c (see timers work from
>>  bonzini for example), you get a wild card for free commit bypassing
>>  maintainers or some similar price :)
>
> Cleaning up vl.c would be great, but just for purpose of single
> compilation, it's enough to put the CPUState pieces to a target
> dependent file (cpu-common.c?) and compile the rest once by making
> TARGET_xxx conditional code unconditional. This may still be doable.

I haven't looked at detail at this :(

>>  <rest of files>
>>
>>  I haven't really looked at them at depth.
>>
>>  I looked when I cleaned up the build system, I thought how to do the
>>  next step (outlined before), but got sidetracked by other more urgent
>>  things.
>
> Thanks for the comments.

You are welcome.

Later, Juan.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Qemu-devel] Re: Compile files only once: some planning
  2010-03-23 21:43 [Qemu-devel] Compile files only once: some planning Blue Swirl
  2010-03-24  9:17 ` [Qemu-devel] " Juan Quintela
  2010-03-24  9:47 ` Paolo Bonzini
@ 2010-03-24 19:39 ` Michael S. Tsirkin
  2010-03-24 20:05   ` Blue Swirl
  2 siblings, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2010-03-24 19:39 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel

On Tue, Mar 23, 2010 at 11:43:51PM +0200, Blue Swirl wrote:
> rtl8139.c, e1000.c: need to convert ldl/stl to cpu_physical_memory_read/write.

I don't see how it would help. These still get target_phys_addr_t which
is per-target. Further, a ton of devices do
cpu_register_physical_memory/qemu_register_coalesced_mmio.
These are also per target.

A simple solution would be to change all of cpu_XX functions to
get a 64 bit address. This is a lot of churn, if we do this
anyway we should also pass length to callbacks, this way
rwhandler will get very small or go away completely.

-- 
MST

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 19:39 ` Michael S. Tsirkin
@ 2010-03-24 20:05   ` Blue Swirl
  2010-03-24 20:08     ` Michael S. Tsirkin
  2010-03-24 22:33     ` Paul Brook
  0 siblings, 2 replies; 31+ messages in thread
From: Blue Swirl @ 2010-03-24 20:05 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1089 bytes --]

On 3/24/10, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Tue, Mar 23, 2010 at 11:43:51PM +0200, Blue Swirl wrote:
>  > rtl8139.c, e1000.c: need to convert ldl/stl to cpu_physical_memory_read/write.
>
>
> I don't see how it would help. These still get target_phys_addr_t which
>  is per-target. Further, a ton of devices do
>  cpu_register_physical_memory/qemu_register_coalesced_mmio.
>  These are also per target.

I don't know what I was eating yesterday: there are no references to
ldl/stl in either rtl8139 or e1000. In fact, the conversion is simple
for the device itself, just add a property "be". The attached patch
performs this part.

But now there is a bigger problem, how to pass the property to the
device. It's not fair to require the user to remember to set it.

>  A simple solution would be to change all of cpu_XX functions to
>  get a 64 bit address. This is a lot of churn, if we do this
>  anyway we should also pass length to callbacks, this way
>  rwhandler will get very small or go away completely.

It's not too much effort to keep the target_phys_addr_t type.

[-- Attachment #2: 0001-Compile-rtl8139-and-e1000-only-once.patch --]
[-- Type: text/x-diff, Size: 12304 bytes --]

From e0ab5cc41c68207be558ccb330f4fb83fba4ee6f Mon Sep 17 00:00:00 2001
From: Blue Swirl <blauwirbel@gmail.com>
Date: Wed, 24 Mar 2010 19:54:05 +0000
Subject: [PATCH] Compile rtl8139 and e1000 only once

WIP

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
---
 Makefile.objs   |    2 +
 Makefile.target |    4 --
 hw/e1000.c      |  108 ++++++++++++++++++++++++++++++++++++++++++------------
 hw/rtl8139.c    |   82 +++++++++++++++++++++++++++++++-----------
 4 files changed, 147 insertions(+), 49 deletions(-)

diff --git a/Makefile.objs b/Makefile.objs
index 281f7a6..54895f8 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -155,6 +155,8 @@ hw-obj-y += msix.o
 hw-obj-y += ne2000.o
 hw-obj-y += eepro100.o
 hw-obj-y += pcnet.o
+hw-obj-y += rtl8139.o
+hw-obj-y += e1000.o
 
 hw-obj-$(CONFIG_SMC91C111) += smc91c111.o
 hw-obj-$(CONFIG_LAN9118) += lan9118.o
diff --git a/Makefile.target b/Makefile.target
index eb4d010..1a86fc4 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -176,10 +176,6 @@ QEMU_CFLAGS += $(VNC_SASL_CFLAGS)
 # xen backend driver support
 obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o
 
-# PCI network cards
-obj-y += rtl8139.o
-obj-y += e1000.o
-
 # Hardware support
 obj-i386-y = ide/core.o
 obj-i386-y += pckbd.o dma.o
diff --git a/hw/e1000.c b/hw/e1000.c
index fd3059a..0f72db8 100644
--- a/hw/e1000.c
+++ b/hw/e1000.c
@@ -121,6 +121,7 @@ typedef struct E1000State_st {
         uint16_t reading;
         uint32_t old_eecd;
     } eecd_state;
+    uint32_t be;
 } E1000State;
 
 #define	defreg(x)	x = (E1000_##x>>2)
@@ -825,14 +826,11 @@ static void (*macreg_writeops[])(E1000State *, int, uint32_t) = {
 enum { NWRITEOPS = ARRAY_SIZE(macreg_writeops) };
 
 static void
-e1000_mmio_writel(void *opaque, target_phys_addr_t addr, uint32_t val)
+e1000_mmio_writel_le(void *opaque, target_phys_addr_t addr, uint32_t val)
 {
     E1000State *s = opaque;
     unsigned int index = (addr & 0x1ffff) >> 2;
 
-#ifdef TARGET_WORDS_BIGENDIAN
-    val = bswap32(val);
-#endif
     if (index < NWRITEOPS && macreg_writeops[index])
         macreg_writeops[index](s, index, val);
     else if (index < NREADOPS && macreg_readops[index])
@@ -841,25 +839,47 @@ e1000_mmio_writel(void *opaque, target_phys_addr_t addr, uint32_t val)
         DBGOUT(UNKNOWN, "MMIO unknown write addr=0x%08x,val=0x%08x\n",
                index<<2, val);
 }
+static void
+e1000_mmio_writel_be(void *opaque, target_phys_addr_t addr, uint32_t val)
+{
+    val = bswap32(val);
+    e1000_mmio_writel_le(opaque, addr, val);
+}
 
 static void
-e1000_mmio_writew(void *opaque, target_phys_addr_t addr, uint32_t val)
+e1000_mmio_writew_le(void *opaque, target_phys_addr_t addr, uint32_t val)
 {
     // emulate hw without byte enables: no RMW
-    e1000_mmio_writel(opaque, addr & ~3,
-                      (val & 0xffff) << (8*(addr & 3)));
+    e1000_mmio_writel_le(opaque, addr & ~3,
+                         (val & 0xffff) << (8*(addr & 3)));
 }
 
 static void
-e1000_mmio_writeb(void *opaque, target_phys_addr_t addr, uint32_t val)
+e1000_mmio_writew_be(void *opaque, target_phys_addr_t addr, uint32_t val)
 {
     // emulate hw without byte enables: no RMW
-    e1000_mmio_writel(opaque, addr & ~3,
-                      (val & 0xff) << (8*(addr & 3)));
+    e1000_mmio_writel_be(opaque, addr & ~3,
+                         (val & 0xffff) << (8*(addr & 3)));
+}
+
+static void
+e1000_mmio_writeb_be(void *opaque, target_phys_addr_t addr, uint32_t val)
+{
+    // emulate hw without byte enables: no RMW
+    e1000_mmio_writel_be(opaque, addr & ~3,
+                         (val & 0xff) << (8*(addr & 3)));
+}
+
+static void
+e1000_mmio_writeb_le(void *opaque, target_phys_addr_t addr, uint32_t val)
+{
+    // emulate hw without byte enables: no RMW
+    e1000_mmio_writel_le(opaque, addr & ~3,
+                         (val & 0xff) << (8*(addr & 3)));
 }
 
 static uint32_t
-e1000_mmio_readl(void *opaque, target_phys_addr_t addr)
+e1000_mmio_readl_le(void *opaque, target_phys_addr_t addr)
 {
     E1000State *s = opaque;
     unsigned int index = (addr & 0x1ffff) >> 2;
@@ -867,9 +887,6 @@ e1000_mmio_readl(void *opaque, target_phys_addr_t addr)
     if (index < NREADOPS && macreg_readops[index])
     {
         uint32_t val = macreg_readops[index](s, index);
-#ifdef TARGET_WORDS_BIGENDIAN
-        val = bswap32(val);
-#endif
         return val;
     }
     DBGOUT(UNKNOWN, "MMIO unknown read addr=0x%08x\n", index<<2);
@@ -877,16 +894,38 @@ e1000_mmio_readl(void *opaque, target_phys_addr_t addr)
 }
 
 static uint32_t
-e1000_mmio_readb(void *opaque, target_phys_addr_t addr)
+e1000_mmio_readl_be(void *opaque, target_phys_addr_t addr)
 {
-    return ((e1000_mmio_readl(opaque, addr & ~3)) >>
+    uint32_t val = e1000_mmio_readl_le(opaque, addr);
+    val = bswap32(val);
+    return val;
+}
+
+static uint32_t
+e1000_mmio_readb_be(void *opaque, target_phys_addr_t addr)
+{
+    return ((e1000_mmio_readl_be(opaque, addr & ~3)) >>
             (8 * (addr & 3))) & 0xff;
 }
 
 static uint32_t
-e1000_mmio_readw(void *opaque, target_phys_addr_t addr)
+e1000_mmio_readb_le(void *opaque, target_phys_addr_t addr)
+{
+    return ((e1000_mmio_readl_le(opaque, addr & ~3)) >>
+            (8 * (addr & 3))) & 0xff;
+}
+
+static uint32_t
+e1000_mmio_readw_le(void *opaque, target_phys_addr_t addr)
+{
+    return ((e1000_mmio_readl_le(opaque, addr & ~3)) >>
+            (8 * (addr & 3))) & 0xffff;
+}
+
+static uint32_t
+e1000_mmio_readw_be(void *opaque, target_phys_addr_t addr)
 {
-    return ((e1000_mmio_readl(opaque, addr & ~3)) >>
+    return ((e1000_mmio_readl_be(opaque, addr & ~3)) >>
             (8 * (addr & 3))) & 0xffff;
 }
 
@@ -1008,13 +1047,28 @@ static const uint32_t mac_reg_init[] = {
 };
 
 /* PCI interface */
+static CPUWriteMemoryFunc * const e1000_mmio_write_be[] = {
+    e1000_mmio_writeb_be,
+    e1000_mmio_writew_be,
+    e1000_mmio_writel_be
+};
 
-static CPUWriteMemoryFunc * const e1000_mmio_write[] = {
-    e1000_mmio_writeb,	e1000_mmio_writew,	e1000_mmio_writel
+static CPUReadMemoryFunc * const e1000_mmio_read_be[] = {
+    e1000_mmio_readb_be,
+    e1000_mmio_readw_be,
+    e1000_mmio_readl_be
 };
 
-static CPUReadMemoryFunc * const e1000_mmio_read[] = {
-    e1000_mmio_readb,	e1000_mmio_readw,	e1000_mmio_readl
+static CPUWriteMemoryFunc * const e1000_mmio_write_le[] = {
+    e1000_mmio_writeb_le,
+    e1000_mmio_writew_le,
+    e1000_mmio_writel_le
+};
+
+static CPUReadMemoryFunc * const e1000_mmio_read_le[] = {
+    e1000_mmio_readb_le,
+    e1000_mmio_readw_le,
+    e1000_mmio_readl_le
 };
 
 static void
@@ -1102,8 +1156,13 @@ static int pci_e1000_init(PCIDevice *pci_dev)
     /* TODO: RST# value should be 0 if programmable, PCI spec 6.2.4 */
     pci_conf[PCI_INTERRUPT_PIN] = 1; // interrupt pin 0
 
-    d->mmio_index = cpu_register_io_memory(e1000_mmio_read,
-            e1000_mmio_write, d);
+    if (d->be) {
+        d->mmio_index = cpu_register_io_memory(e1000_mmio_read_be,
+                                               e1000_mmio_write_be, d);
+    } else {
+        d->mmio_index = cpu_register_io_memory(e1000_mmio_read_le,
+                                               e1000_mmio_write_le, d);
+    }
 
     pci_register_bar((PCIDevice *)d, 0, PNPMMIO_SIZE,
                            PCI_BASE_ADDRESS_SPACE_MEMORY, e1000_mmio_map);
@@ -1146,6 +1205,7 @@ static PCIDeviceInfo e1000_info = {
     .romfile    = "pxe-e1000.bin",
     .qdev.props = (Property[]) {
         DEFINE_NIC_PROPERTIES(E1000State, conf),
+        DEFINE_PROP_UINT32("be", E1000State, be, 0),
         DEFINE_PROP_END_OF_LIST(),
     }
 };
diff --git a/hw/rtl8139.c b/hw/rtl8139.c
index 72e2242..ef5f1fd 100644
--- a/hw/rtl8139.c
+++ b/hw/rtl8139.c
@@ -493,7 +493,7 @@ typedef struct RTL8139State {
     /* PCI interrupt timer */
     QEMUTimer *timer;
     int64_t TimerExpire;
-
+    uint32_t be;
 } RTL8139State;
 
 static void rtl8139_set_next_tctr_time(RTL8139State *s, int64_t current_time);
@@ -3123,19 +3123,29 @@ static void rtl8139_mmio_writeb(void *opaque, target_phys_addr_t addr, uint32_t
     rtl8139_io_writeb(opaque, addr & 0xFF, val);
 }
 
-static void rtl8139_mmio_writew(void *opaque, target_phys_addr_t addr, uint32_t val)
+static void rtl8139_mmio_writew_be(void *opaque, target_phys_addr_t addr,
+                                   uint32_t val)
 {
-#ifdef TARGET_WORDS_BIGENDIAN
     val = bswap16(val);
-#endif
     rtl8139_io_writew(opaque, addr & 0xFF, val);
 }
 
-static void rtl8139_mmio_writel(void *opaque, target_phys_addr_t addr, uint32_t val)
+static void rtl8139_mmio_writew_le(void *opaque, target_phys_addr_t addr,
+                                   uint32_t val)
+{
+    rtl8139_io_writew(opaque, addr & 0xFF, val);
+}
+
+static void rtl8139_mmio_writel_be(void *opaque, target_phys_addr_t addr,
+                                   uint32_t val)
 {
-#ifdef TARGET_WORDS_BIGENDIAN
     val = bswap32(val);
-#endif
+    rtl8139_io_writel(opaque, addr & 0xFF, val);
+}
+
+static void rtl8139_mmio_writel_le(void *opaque, target_phys_addr_t addr,
+                                   uint32_t val)
+{
     rtl8139_io_writel(opaque, addr & 0xFF, val);
 }
 
@@ -3144,21 +3154,31 @@ static uint32_t rtl8139_mmio_readb(void *opaque, target_phys_addr_t addr)
     return rtl8139_io_readb(opaque, addr & 0xFF);
 }
 
-static uint32_t rtl8139_mmio_readw(void *opaque, target_phys_addr_t addr)
+static uint32_t rtl8139_mmio_readw_be(void *opaque, target_phys_addr_t addr)
 {
     uint32_t val = rtl8139_io_readw(opaque, addr & 0xFF);
-#ifdef TARGET_WORDS_BIGENDIAN
     val = bswap16(val);
-#endif
     return val;
 }
 
-static uint32_t rtl8139_mmio_readl(void *opaque, target_phys_addr_t addr)
+static uint32_t rtl8139_mmio_readw_le(void *opaque, target_phys_addr_t addr)
+{
+    uint32_t val = rtl8139_io_readw(opaque, addr & 0xFF);
+
+    return val;
+}
+
+static uint32_t rtl8139_mmio_readl_be(void *opaque, target_phys_addr_t addr)
 {
     uint32_t val = rtl8139_io_readl(opaque, addr & 0xFF);
-#ifdef TARGET_WORDS_BIGENDIAN
     val = bswap32(val);
-#endif
+    return val;
+}
+
+static uint32_t rtl8139_mmio_readl_le(void *opaque, target_phys_addr_t addr)
+{
+    uint32_t val = rtl8139_io_readl(opaque, addr & 0xFF);
+
     return val;
 }
 
@@ -3292,16 +3312,28 @@ static void rtl8139_ioport_map(PCIDevice *pci_dev, int region_num,
     register_ioport_read( addr, 0x100, 4, rtl8139_ioport_readl,  s);
 }
 
-static CPUReadMemoryFunc * const rtl8139_mmio_read[3] = {
+static CPUReadMemoryFunc * const rtl8139_mmio_read_be[3] = {
     rtl8139_mmio_readb,
-    rtl8139_mmio_readw,
-    rtl8139_mmio_readl,
+    rtl8139_mmio_readw_be,
+    rtl8139_mmio_readl_be,
 };
 
-static CPUWriteMemoryFunc * const rtl8139_mmio_write[3] = {
+static CPUWriteMemoryFunc * const rtl8139_mmio_write_be[3] = {
     rtl8139_mmio_writeb,
-    rtl8139_mmio_writew,
-    rtl8139_mmio_writel,
+    rtl8139_mmio_writew_be,
+    rtl8139_mmio_writel_be,
+};
+
+static CPUReadMemoryFunc * const rtl8139_mmio_read_le[3] = {
+    rtl8139_mmio_readb,
+    rtl8139_mmio_readw_le,
+    rtl8139_mmio_readl_le,
+};
+
+static CPUWriteMemoryFunc * const rtl8139_mmio_write_le[3] = {
+    rtl8139_mmio_writeb,
+    rtl8139_mmio_writew_le,
+    rtl8139_mmio_writel_le,
 };
 
 static void rtl8139_timer(void *opaque)
@@ -3369,8 +3401,15 @@ static int pci_rtl8139_init(PCIDevice *dev)
     pci_conf[PCI_CAPABILITY_LIST] = 0xdc;
 
     /* I/O handler for memory-mapped I/O */
-    s->rtl8139_mmio_io_addr =
-        cpu_register_io_memory(rtl8139_mmio_read, rtl8139_mmio_write, s);
+    if (s->be) {
+        s->rtl8139_mmio_io_addr =
+            cpu_register_io_memory(rtl8139_mmio_read_be, rtl8139_mmio_write_be,
+                                   s);
+    } else {
+        s->rtl8139_mmio_io_addr =
+            cpu_register_io_memory(rtl8139_mmio_read_le, rtl8139_mmio_write_le,
+                                   s);
+    }
 
     pci_register_bar(&s->dev, 0, 0x100,
                            PCI_BASE_ADDRESS_SPACE_IO,  rtl8139_ioport_map);
@@ -3404,6 +3443,7 @@ static PCIDeviceInfo rtl8139_info = {
     .romfile    = "pxe-rtl8139.bin",
     .qdev.props = (Property[]) {
         DEFINE_NIC_PROPERTIES(RTL8139State, conf),
+        DEFINE_PROP_UINT32("be", RTL8139State, be, 0),
         DEFINE_PROP_END_OF_LIST(),
     }
 };
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 20:05   ` Blue Swirl
@ 2010-03-24 20:08     ` Michael S. Tsirkin
  2010-03-24 20:24       ` Blue Swirl
  2010-03-24 22:33     ` Paul Brook
  1 sibling, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2010-03-24 20:08 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel

On Wed, Mar 24, 2010 at 10:05:10PM +0200, Blue Swirl wrote:
> On 3/24/10, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Tue, Mar 23, 2010 at 11:43:51PM +0200, Blue Swirl wrote:
> >  > rtl8139.c, e1000.c: need to convert ldl/stl to cpu_physical_memory_read/write.
> >
> >
> > I don't see how it would help. These still get target_phys_addr_t which
> >  is per-target. Further, a ton of devices do
> >  cpu_register_physical_memory/qemu_register_coalesced_mmio.
> >  These are also per target.
> 
> I don't know what I was eating yesterday: there are no references to
> ldl/stl in either rtl8139 or e1000. In fact, the conversion is simple
> for the device itself, just add a property "be". The attached patch
> performs this part.
> 
> But now there is a bigger problem, how to pass the property to the
> device. It's not fair to require the user to remember to set it.

I still don't fully understand how come e1000 cares about
target endianness.

> >  A simple solution would be to change all of cpu_XX functions to
> >  get a 64 bit address. This is a lot of churn, if we do this
> >  anyway we should also pass length to callbacks, this way
> >  rwhandler will get very small or go away completely.
> 
> It's not too much effort to keep the target_phys_addr_t type.

I don't understand - target_phys_addr_t is different for different
targets to we will need to recompile the code for each target.
What am I missing?


> From e0ab5cc41c68207be558ccb330f4fb83fba4ee6f Mon Sep 17 00:00:00 2001
> From: Blue Swirl <blauwirbel@gmail.com>
> Date: Wed, 24 Mar 2010 19:54:05 +0000
> Subject: [PATCH] Compile rtl8139 and e1000 only once
> 
> WIP
> 
> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
> ---
>  Makefile.objs   |    2 +
>  Makefile.target |    4 --
>  hw/e1000.c      |  108 ++++++++++++++++++++++++++++++++++++++++++------------
>  hw/rtl8139.c    |   82 +++++++++++++++++++++++++++++++-----------
>  4 files changed, 147 insertions(+), 49 deletions(-)
> 
> diff --git a/Makefile.objs b/Makefile.objs
> index 281f7a6..54895f8 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -155,6 +155,8 @@ hw-obj-y += msix.o
>  hw-obj-y += ne2000.o
>  hw-obj-y += eepro100.o
>  hw-obj-y += pcnet.o
> +hw-obj-y += rtl8139.o
> +hw-obj-y += e1000.o
>  
>  hw-obj-$(CONFIG_SMC91C111) += smc91c111.o
>  hw-obj-$(CONFIG_LAN9118) += lan9118.o
> diff --git a/Makefile.target b/Makefile.target
> index eb4d010..1a86fc4 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -176,10 +176,6 @@ QEMU_CFLAGS += $(VNC_SASL_CFLAGS)
>  # xen backend driver support
>  obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o
>  
> -# PCI network cards
> -obj-y += rtl8139.o
> -obj-y += e1000.o
> -
>  # Hardware support
>  obj-i386-y = ide/core.o
>  obj-i386-y += pckbd.o dma.o
> diff --git a/hw/e1000.c b/hw/e1000.c
> index fd3059a..0f72db8 100644
> --- a/hw/e1000.c
> +++ b/hw/e1000.c
> @@ -121,6 +121,7 @@ typedef struct E1000State_st {
>          uint16_t reading;
>          uint32_t old_eecd;
>      } eecd_state;
> +    uint32_t be;
>  } E1000State;
>  
>  #define	defreg(x)	x = (E1000_##x>>2)
> @@ -825,14 +826,11 @@ static void (*macreg_writeops[])(E1000State *, int, uint32_t) = {
>  enum { NWRITEOPS = ARRAY_SIZE(macreg_writeops) };
>  
>  static void
> -e1000_mmio_writel(void *opaque, target_phys_addr_t addr, uint32_t val)
> +e1000_mmio_writel_le(void *opaque, target_phys_addr_t addr, uint32_t val)
>  {
>      E1000State *s = opaque;
>      unsigned int index = (addr & 0x1ffff) >> 2;
>  
> -#ifdef TARGET_WORDS_BIGENDIAN
> -    val = bswap32(val);
> -#endif
>      if (index < NWRITEOPS && macreg_writeops[index])
>          macreg_writeops[index](s, index, val);
>      else if (index < NREADOPS && macreg_readops[index])
> @@ -841,25 +839,47 @@ e1000_mmio_writel(void *opaque, target_phys_addr_t addr, uint32_t val)
>          DBGOUT(UNKNOWN, "MMIO unknown write addr=0x%08x,val=0x%08x\n",
>                 index<<2, val);
>  }
> +static void
> +e1000_mmio_writel_be(void *opaque, target_phys_addr_t addr, uint32_t val)
> +{
> +    val = bswap32(val);
> +    e1000_mmio_writel_le(opaque, addr, val);
> +}
>  
>  static void
> -e1000_mmio_writew(void *opaque, target_phys_addr_t addr, uint32_t val)
> +e1000_mmio_writew_le(void *opaque, target_phys_addr_t addr, uint32_t val)
>  {
>      // emulate hw without byte enables: no RMW
> -    e1000_mmio_writel(opaque, addr & ~3,
> -                      (val & 0xffff) << (8*(addr & 3)));
> +    e1000_mmio_writel_le(opaque, addr & ~3,
> +                         (val & 0xffff) << (8*(addr & 3)));
>  }
>  
>  static void
> -e1000_mmio_writeb(void *opaque, target_phys_addr_t addr, uint32_t val)
> +e1000_mmio_writew_be(void *opaque, target_phys_addr_t addr, uint32_t val)
>  {
>      // emulate hw without byte enables: no RMW
> -    e1000_mmio_writel(opaque, addr & ~3,
> -                      (val & 0xff) << (8*(addr & 3)));
> +    e1000_mmio_writel_be(opaque, addr & ~3,
> +                         (val & 0xffff) << (8*(addr & 3)));
> +}
> +
> +static void
> +e1000_mmio_writeb_be(void *opaque, target_phys_addr_t addr, uint32_t val)
> +{
> +    // emulate hw without byte enables: no RMW
> +    e1000_mmio_writel_be(opaque, addr & ~3,
> +                         (val & 0xff) << (8*(addr & 3)));
> +}
> +
> +static void
> +e1000_mmio_writeb_le(void *opaque, target_phys_addr_t addr, uint32_t val)
> +{
> +    // emulate hw without byte enables: no RMW
> +    e1000_mmio_writel_le(opaque, addr & ~3,
> +                         (val & 0xff) << (8*(addr & 3)));
>  }
>  
>  static uint32_t
> -e1000_mmio_readl(void *opaque, target_phys_addr_t addr)
> +e1000_mmio_readl_le(void *opaque, target_phys_addr_t addr)
>  {
>      E1000State *s = opaque;
>      unsigned int index = (addr & 0x1ffff) >> 2;
> @@ -867,9 +887,6 @@ e1000_mmio_readl(void *opaque, target_phys_addr_t addr)
>      if (index < NREADOPS && macreg_readops[index])
>      {
>          uint32_t val = macreg_readops[index](s, index);
> -#ifdef TARGET_WORDS_BIGENDIAN
> -        val = bswap32(val);
> -#endif
>          return val;
>      }
>      DBGOUT(UNKNOWN, "MMIO unknown read addr=0x%08x\n", index<<2);
> @@ -877,16 +894,38 @@ e1000_mmio_readl(void *opaque, target_phys_addr_t addr)
>  }
>  
>  static uint32_t
> -e1000_mmio_readb(void *opaque, target_phys_addr_t addr)
> +e1000_mmio_readl_be(void *opaque, target_phys_addr_t addr)
>  {
> -    return ((e1000_mmio_readl(opaque, addr & ~3)) >>
> +    uint32_t val = e1000_mmio_readl_le(opaque, addr);
> +    val = bswap32(val);
> +    return val;
> +}
> +
> +static uint32_t
> +e1000_mmio_readb_be(void *opaque, target_phys_addr_t addr)
> +{
> +    return ((e1000_mmio_readl_be(opaque, addr & ~3)) >>
>              (8 * (addr & 3))) & 0xff;
>  }
>  
>  static uint32_t
> -e1000_mmio_readw(void *opaque, target_phys_addr_t addr)
> +e1000_mmio_readb_le(void *opaque, target_phys_addr_t addr)
> +{
> +    return ((e1000_mmio_readl_le(opaque, addr & ~3)) >>
> +            (8 * (addr & 3))) & 0xff;
> +}
> +
> +static uint32_t
> +e1000_mmio_readw_le(void *opaque, target_phys_addr_t addr)
> +{
> +    return ((e1000_mmio_readl_le(opaque, addr & ~3)) >>
> +            (8 * (addr & 3))) & 0xffff;
> +}
> +
> +static uint32_t
> +e1000_mmio_readw_be(void *opaque, target_phys_addr_t addr)
>  {
> -    return ((e1000_mmio_readl(opaque, addr & ~3)) >>
> +    return ((e1000_mmio_readl_be(opaque, addr & ~3)) >>
>              (8 * (addr & 3))) & 0xffff;
>  }
>  
> @@ -1008,13 +1047,28 @@ static const uint32_t mac_reg_init[] = {
>  };
>  
>  /* PCI interface */
> +static CPUWriteMemoryFunc * const e1000_mmio_write_be[] = {
> +    e1000_mmio_writeb_be,
> +    e1000_mmio_writew_be,
> +    e1000_mmio_writel_be
> +};
>  
> -static CPUWriteMemoryFunc * const e1000_mmio_write[] = {
> -    e1000_mmio_writeb,	e1000_mmio_writew,	e1000_mmio_writel
> +static CPUReadMemoryFunc * const e1000_mmio_read_be[] = {
> +    e1000_mmio_readb_be,
> +    e1000_mmio_readw_be,
> +    e1000_mmio_readl_be
>  };
>  
> -static CPUReadMemoryFunc * const e1000_mmio_read[] = {
> -    e1000_mmio_readb,	e1000_mmio_readw,	e1000_mmio_readl
> +static CPUWriteMemoryFunc * const e1000_mmio_write_le[] = {
> +    e1000_mmio_writeb_le,
> +    e1000_mmio_writew_le,
> +    e1000_mmio_writel_le
> +};
> +
> +static CPUReadMemoryFunc * const e1000_mmio_read_le[] = {
> +    e1000_mmio_readb_le,
> +    e1000_mmio_readw_le,
> +    e1000_mmio_readl_le
>  };
>  
>  static void
> @@ -1102,8 +1156,13 @@ static int pci_e1000_init(PCIDevice *pci_dev)
>      /* TODO: RST# value should be 0 if programmable, PCI spec 6.2.4 */
>      pci_conf[PCI_INTERRUPT_PIN] = 1; // interrupt pin 0
>  
> -    d->mmio_index = cpu_register_io_memory(e1000_mmio_read,
> -            e1000_mmio_write, d);
> +    if (d->be) {
> +        d->mmio_index = cpu_register_io_memory(e1000_mmio_read_be,
> +                                               e1000_mmio_write_be, d);
> +    } else {
> +        d->mmio_index = cpu_register_io_memory(e1000_mmio_read_le,
> +                                               e1000_mmio_write_le, d);
> +    }
>  
>      pci_register_bar((PCIDevice *)d, 0, PNPMMIO_SIZE,
>                             PCI_BASE_ADDRESS_SPACE_MEMORY, e1000_mmio_map);
> @@ -1146,6 +1205,7 @@ static PCIDeviceInfo e1000_info = {
>      .romfile    = "pxe-e1000.bin",
>      .qdev.props = (Property[]) {
>          DEFINE_NIC_PROPERTIES(E1000State, conf),
> +        DEFINE_PROP_UINT32("be", E1000State, be, 0),
>          DEFINE_PROP_END_OF_LIST(),
>      }
>  };
> diff --git a/hw/rtl8139.c b/hw/rtl8139.c
> index 72e2242..ef5f1fd 100644
> --- a/hw/rtl8139.c
> +++ b/hw/rtl8139.c
> @@ -493,7 +493,7 @@ typedef struct RTL8139State {
>      /* PCI interrupt timer */
>      QEMUTimer *timer;
>      int64_t TimerExpire;
> -
> +    uint32_t be;
>  } RTL8139State;
>  
>  static void rtl8139_set_next_tctr_time(RTL8139State *s, int64_t current_time);
> @@ -3123,19 +3123,29 @@ static void rtl8139_mmio_writeb(void *opaque, target_phys_addr_t addr, uint32_t
>      rtl8139_io_writeb(opaque, addr & 0xFF, val);
>  }
>  
> -static void rtl8139_mmio_writew(void *opaque, target_phys_addr_t addr, uint32_t val)
> +static void rtl8139_mmio_writew_be(void *opaque, target_phys_addr_t addr,
> +                                   uint32_t val)
>  {
> -#ifdef TARGET_WORDS_BIGENDIAN
>      val = bswap16(val);
> -#endif
>      rtl8139_io_writew(opaque, addr & 0xFF, val);
>  }
>  
> -static void rtl8139_mmio_writel(void *opaque, target_phys_addr_t addr, uint32_t val)
> +static void rtl8139_mmio_writew_le(void *opaque, target_phys_addr_t addr,
> +                                   uint32_t val)
> +{
> +    rtl8139_io_writew(opaque, addr & 0xFF, val);
> +}
> +
> +static void rtl8139_mmio_writel_be(void *opaque, target_phys_addr_t addr,
> +                                   uint32_t val)
>  {
> -#ifdef TARGET_WORDS_BIGENDIAN
>      val = bswap32(val);
> -#endif
> +    rtl8139_io_writel(opaque, addr & 0xFF, val);
> +}
> +
> +static void rtl8139_mmio_writel_le(void *opaque, target_phys_addr_t addr,
> +                                   uint32_t val)
> +{
>      rtl8139_io_writel(opaque, addr & 0xFF, val);
>  }
>  
> @@ -3144,21 +3154,31 @@ static uint32_t rtl8139_mmio_readb(void *opaque, target_phys_addr_t addr)
>      return rtl8139_io_readb(opaque, addr & 0xFF);
>  }
>  
> -static uint32_t rtl8139_mmio_readw(void *opaque, target_phys_addr_t addr)
> +static uint32_t rtl8139_mmio_readw_be(void *opaque, target_phys_addr_t addr)
>  {
>      uint32_t val = rtl8139_io_readw(opaque, addr & 0xFF);
> -#ifdef TARGET_WORDS_BIGENDIAN
>      val = bswap16(val);
> -#endif
>      return val;
>  }
>  
> -static uint32_t rtl8139_mmio_readl(void *opaque, target_phys_addr_t addr)
> +static uint32_t rtl8139_mmio_readw_le(void *opaque, target_phys_addr_t addr)
> +{
> +    uint32_t val = rtl8139_io_readw(opaque, addr & 0xFF);
> +
> +    return val;
> +}
> +
> +static uint32_t rtl8139_mmio_readl_be(void *opaque, target_phys_addr_t addr)
>  {
>      uint32_t val = rtl8139_io_readl(opaque, addr & 0xFF);
> -#ifdef TARGET_WORDS_BIGENDIAN
>      val = bswap32(val);
> -#endif
> +    return val;
> +}
> +
> +static uint32_t rtl8139_mmio_readl_le(void *opaque, target_phys_addr_t addr)
> +{
> +    uint32_t val = rtl8139_io_readl(opaque, addr & 0xFF);
> +
>      return val;
>  }
>  
> @@ -3292,16 +3312,28 @@ static void rtl8139_ioport_map(PCIDevice *pci_dev, int region_num,
>      register_ioport_read( addr, 0x100, 4, rtl8139_ioport_readl,  s);
>  }
>  
> -static CPUReadMemoryFunc * const rtl8139_mmio_read[3] = {
> +static CPUReadMemoryFunc * const rtl8139_mmio_read_be[3] = {
>      rtl8139_mmio_readb,
> -    rtl8139_mmio_readw,
> -    rtl8139_mmio_readl,
> +    rtl8139_mmio_readw_be,
> +    rtl8139_mmio_readl_be,
>  };
>  
> -static CPUWriteMemoryFunc * const rtl8139_mmio_write[3] = {
> +static CPUWriteMemoryFunc * const rtl8139_mmio_write_be[3] = {
>      rtl8139_mmio_writeb,
> -    rtl8139_mmio_writew,
> -    rtl8139_mmio_writel,
> +    rtl8139_mmio_writew_be,
> +    rtl8139_mmio_writel_be,
> +};
> +
> +static CPUReadMemoryFunc * const rtl8139_mmio_read_le[3] = {
> +    rtl8139_mmio_readb,
> +    rtl8139_mmio_readw_le,
> +    rtl8139_mmio_readl_le,
> +};
> +
> +static CPUWriteMemoryFunc * const rtl8139_mmio_write_le[3] = {
> +    rtl8139_mmio_writeb,
> +    rtl8139_mmio_writew_le,
> +    rtl8139_mmio_writel_le,
>  };
>  
>  static void rtl8139_timer(void *opaque)
> @@ -3369,8 +3401,15 @@ static int pci_rtl8139_init(PCIDevice *dev)
>      pci_conf[PCI_CAPABILITY_LIST] = 0xdc;
>  
>      /* I/O handler for memory-mapped I/O */
> -    s->rtl8139_mmio_io_addr =
> -        cpu_register_io_memory(rtl8139_mmio_read, rtl8139_mmio_write, s);
> +    if (s->be) {
> +        s->rtl8139_mmio_io_addr =
> +            cpu_register_io_memory(rtl8139_mmio_read_be, rtl8139_mmio_write_be,
> +                                   s);
> +    } else {
> +        s->rtl8139_mmio_io_addr =
> +            cpu_register_io_memory(rtl8139_mmio_read_le, rtl8139_mmio_write_le,
> +                                   s);
> +    }
>  
>      pci_register_bar(&s->dev, 0, 0x100,
>                             PCI_BASE_ADDRESS_SPACE_IO,  rtl8139_ioport_map);
> @@ -3404,6 +3443,7 @@ static PCIDeviceInfo rtl8139_info = {
>      .romfile    = "pxe-rtl8139.bin",
>      .qdev.props = (Property[]) {
>          DEFINE_NIC_PROPERTIES(RTL8139State, conf),
> +        DEFINE_PROP_UINT32("be", RTL8139State, be, 0),
>          DEFINE_PROP_END_OF_LIST(),
>      }
>  };
> -- 
> 1.5.6.5
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 17:07       ` Richard Henderson
@ 2010-03-24 20:12         ` Richard Henderson
  0 siblings, 0 replies; 31+ messages in thread
From: Richard Henderson @ 2010-03-24 20:12 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Blue Swirl, qemu-devel

On 03/24/2010 10:07 AM, Richard Henderson wrote:
> struct CPUSmallCommonState
> {
>     // most of the stuff from CPU_COMMON.
>     // sorted for some thought of padding elimination.  ;-)
> };
> 
> struct CPULargeCommonState
> {
>     CPUTLBEntry tlb_table[NB_MMU_MODES][CPU_TLB_SIZE];
>     target_phys_addr_t iotlb[NB_MMU_MODES][CPU_TLB_SIZE];
>     struct TranslationBlock *tb_jmp_cache[TB_JMP_CACHE_SIZE];
>     jmp_buf jmp_env;
> };
...
> Now.  If you're compiling a file for which cpu-specific code is ok:
> 
> register CPUXYZLargeState *env __asm__(AREG0);
> #define ENV_SMALL_COMMON_STATE    (&env->s.common_s)
> #define ENV_LARGE_COMMON_STATE    (&env->common_l)
> 
> If you're compiling a file which is supposed to be independant of cpu:
> 
> register CPUSmallCommonState *env __asm__(AREG0);
> #define ENV_SMALL_COMMON_STATE    (env)
> #define ENV_LARGE_COMMON_STATE    ((CPULargeCommonState *)((char *)env + cpu_large_state_offset))

On 03/24/2010 11:00 AM, Blue Swirl wrote:
> One trick is to define a fixed offset (about half CPUState size) and
> make env point to CPUState + offset. Then the negative part of the
> offset space can be used efficiently. But this just doubles the space
> that can be accessed fast, so it's not a big win.

A good idea.  We can eliminate the cpu_large_state_offset from above via:

struct CPUSmallCommonState
{
    // most of the stuff from CPU_COMMON.
} __attribute__((aligned));

struct CPUXYZPrivateState
{
    // all the cpu-specific stuff
};

struct CPUXYZSmallState
{
    CPUXYZPrivateState p;
    CPUSmallCommonState s;
};

struct CPUXYZAllState
{
    CPUXYZSmallState s;
    CPULargeCommonState l;  // ARG0 register points here.
};

register void *biased_env __asm__(AREG0);

static inline CPUXYZPrivateState *get_env_cpu_private(void)
{
    return &((CPUXYZSmallState *)biased_env - 1)->p;
}

static inline CPUSmallCommonState *get_env_common_small(void)
{
    return (CPUSmallCommonState *)biased_env - 1;
}

static inline CPULargeCommonState *get_env_common_large(void)
{
    return (CPULargeCommonState *)biased_env;
}


r~

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 20:08     ` Michael S. Tsirkin
@ 2010-03-24 20:24       ` Blue Swirl
  2010-03-24 20:24         ` Michael S. Tsirkin
                           ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Blue Swirl @ 2010-03-24 20:24 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: qemu-devel

On 3/24/10, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Wed, Mar 24, 2010 at 10:05:10PM +0200, Blue Swirl wrote:
>  > On 3/24/10, Michael S. Tsirkin <mst@redhat.com> wrote:
>  > > On Tue, Mar 23, 2010 at 11:43:51PM +0200, Blue Swirl wrote:
>  > >  > rtl8139.c, e1000.c: need to convert ldl/stl to cpu_physical_memory_read/write.
>  > >
>  > >
>  > > I don't see how it would help. These still get target_phys_addr_t which
>  > >  is per-target. Further, a ton of devices do
>  > >  cpu_register_physical_memory/qemu_register_coalesced_mmio.
>  > >  These are also per target.
>  >
>  > I don't know what I was eating yesterday: there are no references to
>  > ldl/stl in either rtl8139 or e1000. In fact, the conversion is simple
>  > for the device itself, just add a property "be". The attached patch
>  > performs this part.
>  >
>  > But now there is a bigger problem, how to pass the property to the
>  > device. It's not fair to require the user to remember to set it.
>
>
> I still don't fully understand how come e1000 cares about
>  target endianness.

It shouldn't. Maybe the real fix is to remove the byte swapping.

>  > >  A simple solution would be to change all of cpu_XX functions to
>  > >  get a 64 bit address. This is a lot of churn, if we do this
>  > >  anyway we should also pass length to callbacks, this way
>  > >  rwhandler will get very small or go away completely.
>  >
>  > It's not too much effort to keep the target_phys_addr_t type.
>
>
> I don't understand - target_phys_addr_t is different for different
>  targets to we will need to recompile the code for each target.
>  What am I missing?

target_phys_addr_t is 64 bit on a 64 bit host, on a 32 bit host it's
size will be either 64 or 32 bits depending on the target. So the
files are compiled once on 64 bit host, twice on 32 bit host if both
32 and 64 bits targets are selected.

>  > From e0ab5cc41c68207be558ccb330f4fb83fba4ee6f Mon Sep 17 00:00:00 2001
>  > From: Blue Swirl <blauwirbel@gmail.com>
>  > Date: Wed, 24 Mar 2010 19:54:05 +0000
>  > Subject: [PATCH] Compile rtl8139 and e1000 only once
>  >
>  > WIP
>  >
>  > Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
>  > ---
>  >  Makefile.objs   |    2 +
>  >  Makefile.target |    4 --
>  >  hw/e1000.c      |  108 ++++++++++++++++++++++++++++++++++++++++++------------
>  >  hw/rtl8139.c    |   82 +++++++++++++++++++++++++++++++-----------
>  >  4 files changed, 147 insertions(+), 49 deletions(-)
>  >
>  > diff --git a/Makefile.objs b/Makefile.objs
>  > index 281f7a6..54895f8 100644
>  > --- a/Makefile.objs
>  > +++ b/Makefile.objs
>  > @@ -155,6 +155,8 @@ hw-obj-y += msix.o
>  >  hw-obj-y += ne2000.o
>  >  hw-obj-y += eepro100.o
>  >  hw-obj-y += pcnet.o
>  > +hw-obj-y += rtl8139.o
>  > +hw-obj-y += e1000.o
>  >
>  >  hw-obj-$(CONFIG_SMC91C111) += smc91c111.o
>  >  hw-obj-$(CONFIG_LAN9118) += lan9118.o
>  > diff --git a/Makefile.target b/Makefile.target
>  > index eb4d010..1a86fc4 100644
>  > --- a/Makefile.target
>  > +++ b/Makefile.target
>  > @@ -176,10 +176,6 @@ QEMU_CFLAGS += $(VNC_SASL_CFLAGS)
>  >  # xen backend driver support
>  >  obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o
>  >
>  > -# PCI network cards
>  > -obj-y += rtl8139.o
>  > -obj-y += e1000.o
>  > -
>  >  # Hardware support
>  >  obj-i386-y = ide/core.o
>  >  obj-i386-y += pckbd.o dma.o
>  > diff --git a/hw/e1000.c b/hw/e1000.c
>  > index fd3059a..0f72db8 100644
>  > --- a/hw/e1000.c
>  > +++ b/hw/e1000.c
>  > @@ -121,6 +121,7 @@ typedef struct E1000State_st {
>  >          uint16_t reading;
>  >          uint32_t old_eecd;
>  >      } eecd_state;
>  > +    uint32_t be;
>  >  } E1000State;
>  >
>  >  #define      defreg(x)       x = (E1000_##x>>2)
>  > @@ -825,14 +826,11 @@ static void (*macreg_writeops[])(E1000State *, int, uint32_t) = {
>  >  enum { NWRITEOPS = ARRAY_SIZE(macreg_writeops) };
>  >
>  >  static void
>  > -e1000_mmio_writel(void *opaque, target_phys_addr_t addr, uint32_t val)
>  > +e1000_mmio_writel_le(void *opaque, target_phys_addr_t addr, uint32_t val)
>  >  {
>  >      E1000State *s = opaque;
>  >      unsigned int index = (addr & 0x1ffff) >> 2;
>  >
>  > -#ifdef TARGET_WORDS_BIGENDIAN
>  > -    val = bswap32(val);
>  > -#endif
>  >      if (index < NWRITEOPS && macreg_writeops[index])
>  >          macreg_writeops[index](s, index, val);
>  >      else if (index < NREADOPS && macreg_readops[index])
>  > @@ -841,25 +839,47 @@ e1000_mmio_writel(void *opaque, target_phys_addr_t addr, uint32_t val)
>  >          DBGOUT(UNKNOWN, "MMIO unknown write addr=0x%08x,val=0x%08x\n",
>  >                 index<<2, val);
>  >  }
>  > +static void
>  > +e1000_mmio_writel_be(void *opaque, target_phys_addr_t addr, uint32_t val)
>  > +{
>  > +    val = bswap32(val);
>  > +    e1000_mmio_writel_le(opaque, addr, val);
>  > +}
>  >
>  >  static void
>  > -e1000_mmio_writew(void *opaque, target_phys_addr_t addr, uint32_t val)
>  > +e1000_mmio_writew_le(void *opaque, target_phys_addr_t addr, uint32_t val)
>  >  {
>  >      // emulate hw without byte enables: no RMW
>  > -    e1000_mmio_writel(opaque, addr & ~3,
>  > -                      (val & 0xffff) << (8*(addr & 3)));
>  > +    e1000_mmio_writel_le(opaque, addr & ~3,
>  > +                         (val & 0xffff) << (8*(addr & 3)));
>  >  }
>  >
>  >  static void
>  > -e1000_mmio_writeb(void *opaque, target_phys_addr_t addr, uint32_t val)
>  > +e1000_mmio_writew_be(void *opaque, target_phys_addr_t addr, uint32_t val)
>  >  {
>  >      // emulate hw without byte enables: no RMW
>  > -    e1000_mmio_writel(opaque, addr & ~3,
>  > -                      (val & 0xff) << (8*(addr & 3)));
>  > +    e1000_mmio_writel_be(opaque, addr & ~3,
>  > +                         (val & 0xffff) << (8*(addr & 3)));
>  > +}
>  > +
>  > +static void
>  > +e1000_mmio_writeb_be(void *opaque, target_phys_addr_t addr, uint32_t val)
>  > +{
>  > +    // emulate hw without byte enables: no RMW
>  > +    e1000_mmio_writel_be(opaque, addr & ~3,
>  > +                         (val & 0xff) << (8*(addr & 3)));
>  > +}
>  > +
>  > +static void
>  > +e1000_mmio_writeb_le(void *opaque, target_phys_addr_t addr, uint32_t val)
>  > +{
>  > +    // emulate hw without byte enables: no RMW
>  > +    e1000_mmio_writel_le(opaque, addr & ~3,
>  > +                         (val & 0xff) << (8*(addr & 3)));
>  >  }
>  >
>  >  static uint32_t
>  > -e1000_mmio_readl(void *opaque, target_phys_addr_t addr)
>  > +e1000_mmio_readl_le(void *opaque, target_phys_addr_t addr)
>  >  {
>  >      E1000State *s = opaque;
>  >      unsigned int index = (addr & 0x1ffff) >> 2;
>  > @@ -867,9 +887,6 @@ e1000_mmio_readl(void *opaque, target_phys_addr_t addr)
>  >      if (index < NREADOPS && macreg_readops[index])
>  >      {
>  >          uint32_t val = macreg_readops[index](s, index);
>  > -#ifdef TARGET_WORDS_BIGENDIAN
>  > -        val = bswap32(val);
>  > -#endif
>  >          return val;
>  >      }
>  >      DBGOUT(UNKNOWN, "MMIO unknown read addr=0x%08x\n", index<<2);
>  > @@ -877,16 +894,38 @@ e1000_mmio_readl(void *opaque, target_phys_addr_t addr)
>  >  }
>  >
>  >  static uint32_t
>  > -e1000_mmio_readb(void *opaque, target_phys_addr_t addr)
>  > +e1000_mmio_readl_be(void *opaque, target_phys_addr_t addr)
>  >  {
>  > -    return ((e1000_mmio_readl(opaque, addr & ~3)) >>
>  > +    uint32_t val = e1000_mmio_readl_le(opaque, addr);
>  > +    val = bswap32(val);
>  > +    return val;
>  > +}
>  > +
>  > +static uint32_t
>  > +e1000_mmio_readb_be(void *opaque, target_phys_addr_t addr)
>  > +{
>  > +    return ((e1000_mmio_readl_be(opaque, addr & ~3)) >>
>  >              (8 * (addr & 3))) & 0xff;
>  >  }
>  >
>  >  static uint32_t
>  > -e1000_mmio_readw(void *opaque, target_phys_addr_t addr)
>  > +e1000_mmio_readb_le(void *opaque, target_phys_addr_t addr)
>  > +{
>  > +    return ((e1000_mmio_readl_le(opaque, addr & ~3)) >>
>  > +            (8 * (addr & 3))) & 0xff;
>  > +}
>  > +
>  > +static uint32_t
>  > +e1000_mmio_readw_le(void *opaque, target_phys_addr_t addr)
>  > +{
>  > +    return ((e1000_mmio_readl_le(opaque, addr & ~3)) >>
>  > +            (8 * (addr & 3))) & 0xffff;
>  > +}
>  > +
>  > +static uint32_t
>  > +e1000_mmio_readw_be(void *opaque, target_phys_addr_t addr)
>  >  {
>  > -    return ((e1000_mmio_readl(opaque, addr & ~3)) >>
>  > +    return ((e1000_mmio_readl_be(opaque, addr & ~3)) >>
>  >              (8 * (addr & 3))) & 0xffff;
>  >  }
>  >
>  > @@ -1008,13 +1047,28 @@ static const uint32_t mac_reg_init[] = {
>  >  };
>  >
>  >  /* PCI interface */
>  > +static CPUWriteMemoryFunc * const e1000_mmio_write_be[] = {
>  > +    e1000_mmio_writeb_be,
>  > +    e1000_mmio_writew_be,
>  > +    e1000_mmio_writel_be
>  > +};
>  >
>  > -static CPUWriteMemoryFunc * const e1000_mmio_write[] = {
>  > -    e1000_mmio_writeb,       e1000_mmio_writew,      e1000_mmio_writel
>  > +static CPUReadMemoryFunc * const e1000_mmio_read_be[] = {
>  > +    e1000_mmio_readb_be,
>  > +    e1000_mmio_readw_be,
>  > +    e1000_mmio_readl_be
>  >  };
>  >
>  > -static CPUReadMemoryFunc * const e1000_mmio_read[] = {
>  > -    e1000_mmio_readb,        e1000_mmio_readw,       e1000_mmio_readl
>  > +static CPUWriteMemoryFunc * const e1000_mmio_write_le[] = {
>  > +    e1000_mmio_writeb_le,
>  > +    e1000_mmio_writew_le,
>  > +    e1000_mmio_writel_le
>  > +};
>  > +
>  > +static CPUReadMemoryFunc * const e1000_mmio_read_le[] = {
>  > +    e1000_mmio_readb_le,
>  > +    e1000_mmio_readw_le,
>  > +    e1000_mmio_readl_le
>  >  };
>  >
>  >  static void
>  > @@ -1102,8 +1156,13 @@ static int pci_e1000_init(PCIDevice *pci_dev)
>  >      /* TODO: RST# value should be 0 if programmable, PCI spec 6.2.4 */
>  >      pci_conf[PCI_INTERRUPT_PIN] = 1; // interrupt pin 0
>  >
>  > -    d->mmio_index = cpu_register_io_memory(e1000_mmio_read,
>  > -            e1000_mmio_write, d);
>  > +    if (d->be) {
>  > +        d->mmio_index = cpu_register_io_memory(e1000_mmio_read_be,
>  > +                                               e1000_mmio_write_be, d);
>  > +    } else {
>  > +        d->mmio_index = cpu_register_io_memory(e1000_mmio_read_le,
>  > +                                               e1000_mmio_write_le, d);
>  > +    }
>  >
>  >      pci_register_bar((PCIDevice *)d, 0, PNPMMIO_SIZE,
>  >                             PCI_BASE_ADDRESS_SPACE_MEMORY, e1000_mmio_map);
>  > @@ -1146,6 +1205,7 @@ static PCIDeviceInfo e1000_info = {
>  >      .romfile    = "pxe-e1000.bin",
>  >      .qdev.props = (Property[]) {
>  >          DEFINE_NIC_PROPERTIES(E1000State, conf),
>  > +        DEFINE_PROP_UINT32("be", E1000State, be, 0),
>  >          DEFINE_PROP_END_OF_LIST(),
>  >      }
>  >  };
>  > diff --git a/hw/rtl8139.c b/hw/rtl8139.c
>  > index 72e2242..ef5f1fd 100644
>  > --- a/hw/rtl8139.c
>  > +++ b/hw/rtl8139.c
>  > @@ -493,7 +493,7 @@ typedef struct RTL8139State {
>  >      /* PCI interrupt timer */
>  >      QEMUTimer *timer;
>  >      int64_t TimerExpire;
>  > -
>  > +    uint32_t be;
>  >  } RTL8139State;
>  >
>  >  static void rtl8139_set_next_tctr_time(RTL8139State *s, int64_t current_time);
>  > @@ -3123,19 +3123,29 @@ static void rtl8139_mmio_writeb(void *opaque, target_phys_addr_t addr, uint32_t
>  >      rtl8139_io_writeb(opaque, addr & 0xFF, val);
>  >  }
>  >
>  > -static void rtl8139_mmio_writew(void *opaque, target_phys_addr_t addr, uint32_t val)
>  > +static void rtl8139_mmio_writew_be(void *opaque, target_phys_addr_t addr,
>  > +                                   uint32_t val)
>  >  {
>  > -#ifdef TARGET_WORDS_BIGENDIAN
>  >      val = bswap16(val);
>  > -#endif
>  >      rtl8139_io_writew(opaque, addr & 0xFF, val);
>  >  }
>  >
>  > -static void rtl8139_mmio_writel(void *opaque, target_phys_addr_t addr, uint32_t val)
>  > +static void rtl8139_mmio_writew_le(void *opaque, target_phys_addr_t addr,
>  > +                                   uint32_t val)
>  > +{
>  > +    rtl8139_io_writew(opaque, addr & 0xFF, val);
>  > +}
>  > +
>  > +static void rtl8139_mmio_writel_be(void *opaque, target_phys_addr_t addr,
>  > +                                   uint32_t val)
>  >  {
>  > -#ifdef TARGET_WORDS_BIGENDIAN
>  >      val = bswap32(val);
>  > -#endif
>  > +    rtl8139_io_writel(opaque, addr & 0xFF, val);
>  > +}
>  > +
>  > +static void rtl8139_mmio_writel_le(void *opaque, target_phys_addr_t addr,
>  > +                                   uint32_t val)
>  > +{
>  >      rtl8139_io_writel(opaque, addr & 0xFF, val);
>  >  }
>  >
>  > @@ -3144,21 +3154,31 @@ static uint32_t rtl8139_mmio_readb(void *opaque, target_phys_addr_t addr)
>  >      return rtl8139_io_readb(opaque, addr & 0xFF);
>  >  }
>  >
>  > -static uint32_t rtl8139_mmio_readw(void *opaque, target_phys_addr_t addr)
>  > +static uint32_t rtl8139_mmio_readw_be(void *opaque, target_phys_addr_t addr)
>  >  {
>  >      uint32_t val = rtl8139_io_readw(opaque, addr & 0xFF);
>  > -#ifdef TARGET_WORDS_BIGENDIAN
>  >      val = bswap16(val);
>  > -#endif
>  >      return val;
>  >  }
>  >
>  > -static uint32_t rtl8139_mmio_readl(void *opaque, target_phys_addr_t addr)
>  > +static uint32_t rtl8139_mmio_readw_le(void *opaque, target_phys_addr_t addr)
>  > +{
>  > +    uint32_t val = rtl8139_io_readw(opaque, addr & 0xFF);
>  > +
>  > +    return val;
>  > +}
>  > +
>  > +static uint32_t rtl8139_mmio_readl_be(void *opaque, target_phys_addr_t addr)
>  >  {
>  >      uint32_t val = rtl8139_io_readl(opaque, addr & 0xFF);
>  > -#ifdef TARGET_WORDS_BIGENDIAN
>  >      val = bswap32(val);
>  > -#endif
>  > +    return val;
>  > +}
>  > +
>  > +static uint32_t rtl8139_mmio_readl_le(void *opaque, target_phys_addr_t addr)
>  > +{
>  > +    uint32_t val = rtl8139_io_readl(opaque, addr & 0xFF);
>  > +
>  >      return val;
>  >  }
>  >
>  > @@ -3292,16 +3312,28 @@ static void rtl8139_ioport_map(PCIDevice *pci_dev, int region_num,
>  >      register_ioport_read( addr, 0x100, 4, rtl8139_ioport_readl,  s);
>  >  }
>  >
>  > -static CPUReadMemoryFunc * const rtl8139_mmio_read[3] = {
>  > +static CPUReadMemoryFunc * const rtl8139_mmio_read_be[3] = {
>  >      rtl8139_mmio_readb,
>  > -    rtl8139_mmio_readw,
>  > -    rtl8139_mmio_readl,
>  > +    rtl8139_mmio_readw_be,
>  > +    rtl8139_mmio_readl_be,
>  >  };
>  >
>  > -static CPUWriteMemoryFunc * const rtl8139_mmio_write[3] = {
>  > +static CPUWriteMemoryFunc * const rtl8139_mmio_write_be[3] = {
>  >      rtl8139_mmio_writeb,
>  > -    rtl8139_mmio_writew,
>  > -    rtl8139_mmio_writel,
>  > +    rtl8139_mmio_writew_be,
>  > +    rtl8139_mmio_writel_be,
>  > +};
>  > +
>  > +static CPUReadMemoryFunc * const rtl8139_mmio_read_le[3] = {
>  > +    rtl8139_mmio_readb,
>  > +    rtl8139_mmio_readw_le,
>  > +    rtl8139_mmio_readl_le,
>  > +};
>  > +
>  > +static CPUWriteMemoryFunc * const rtl8139_mmio_write_le[3] = {
>  > +    rtl8139_mmio_writeb,
>  > +    rtl8139_mmio_writew_le,
>  > +    rtl8139_mmio_writel_le,
>  >  };
>  >
>  >  static void rtl8139_timer(void *opaque)
>  > @@ -3369,8 +3401,15 @@ static int pci_rtl8139_init(PCIDevice *dev)
>  >      pci_conf[PCI_CAPABILITY_LIST] = 0xdc;
>  >
>  >      /* I/O handler for memory-mapped I/O */
>  > -    s->rtl8139_mmio_io_addr =
>  > -        cpu_register_io_memory(rtl8139_mmio_read, rtl8139_mmio_write, s);
>  > +    if (s->be) {
>  > +        s->rtl8139_mmio_io_addr =
>  > +            cpu_register_io_memory(rtl8139_mmio_read_be, rtl8139_mmio_write_be,
>  > +                                   s);
>  > +    } else {
>  > +        s->rtl8139_mmio_io_addr =
>  > +            cpu_register_io_memory(rtl8139_mmio_read_le, rtl8139_mmio_write_le,
>  > +                                   s);
>  > +    }
>  >
>  >      pci_register_bar(&s->dev, 0, 0x100,
>  >                             PCI_BASE_ADDRESS_SPACE_IO,  rtl8139_ioport_map);
>  > @@ -3404,6 +3443,7 @@ static PCIDeviceInfo rtl8139_info = {
>  >      .romfile    = "pxe-rtl8139.bin",
>  >      .qdev.props = (Property[]) {
>  >          DEFINE_NIC_PROPERTIES(RTL8139State, conf),
>  > +        DEFINE_PROP_UINT32("be", RTL8139State, be, 0),
>  >          DEFINE_PROP_END_OF_LIST(),
>  >      }
>  >  };
>  > --
>  > 1.5.6.5
>  >
>
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 20:24       ` Blue Swirl
@ 2010-03-24 20:24         ` Michael S. Tsirkin
  2010-03-24 20:32           ` Blue Swirl
  2010-03-24 20:28         ` Anthony Liguori
  2010-03-24 21:00         ` Aurelien Jarno
  2 siblings, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2010-03-24 20:24 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel

On Wed, Mar 24, 2010 at 10:24:12PM +0200, Blue Swirl wrote:
> On 3/24/10, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Wed, Mar 24, 2010 at 10:05:10PM +0200, Blue Swirl wrote:
> >  > On 3/24/10, Michael S. Tsirkin <mst@redhat.com> wrote:
> >  > > On Tue, Mar 23, 2010 at 11:43:51PM +0200, Blue Swirl wrote:
> >  > >  > rtl8139.c, e1000.c: need to convert ldl/stl to cpu_physical_memory_read/write.
> >  > >
> >  > >
> >  > > I don't see how it would help. These still get target_phys_addr_t which
> >  > >  is per-target. Further, a ton of devices do
> >  > >  cpu_register_physical_memory/qemu_register_coalesced_mmio.
> >  > >  These are also per target.
> >  >
> >  > I don't know what I was eating yesterday: there are no references to
> >  > ldl/stl in either rtl8139 or e1000. In fact, the conversion is simple
> >  > for the device itself, just add a property "be". The attached patch
> >  > performs this part.
> >  >
> >  > But now there is a bigger problem, how to pass the property to the
> >  > device. It's not fair to require the user to remember to set it.
> >
> >
> > I still don't fully understand how come e1000 cares about
> >  target endianness.
> 
> It shouldn't. Maybe the real fix is to remove the byte swapping.

Presumably it's there for a reason?

> >  > >  A simple solution would be to change all of cpu_XX functions to
> >  > >  get a 64 bit address. This is a lot of churn, if we do this
> >  > >  anyway we should also pass length to callbacks, this way
> >  > >  rwhandler will get very small or go away completely.
> >  >
> >  > It's not too much effort to keep the target_phys_addr_t type.
> >
> >
> > I don't understand - target_phys_addr_t is different for different
> >  targets to we will need to recompile the code for each target.
> >  What am I missing?
> 
> target_phys_addr_t is 64 bit on a 64 bit host, on a 32 bit host it's
> size will be either 64 or 32 bits depending on the target. So the
> files are compiled once on 64 bit host, twice on 32 bit host if both
> 32 and 64 bits targets are selected.

How about just making it 64 bit unconditionally?
How much do we save by using a 32 bit address value?

-- 
MST

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 20:24       ` Blue Swirl
  2010-03-24 20:24         ` Michael S. Tsirkin
@ 2010-03-24 20:28         ` Anthony Liguori
  2010-03-24 21:00         ` Aurelien Jarno
  2 siblings, 0 replies; 31+ messages in thread
From: Anthony Liguori @ 2010-03-24 20:28 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel, Michael S. Tsirkin

On 03/24/2010 03:24 PM, Blue Swirl wrote:
> On 3/24/10, Michael S. Tsirkin<mst@redhat.com>  wrote:
>    
>> On Wed, Mar 24, 2010 at 10:05:10PM +0200, Blue Swirl wrote:
>>   >  On 3/24/10, Michael S. Tsirkin<mst@redhat.com>  wrote:
>>   >  >  On Tue, Mar 23, 2010 at 11:43:51PM +0200, Blue Swirl wrote:
>>   >  >   >  rtl8139.c, e1000.c: need to convert ldl/stl to cpu_physical_memory_read/write.
>>   >  >
>>   >  >
>>   >  >  I don't see how it would help. These still get target_phys_addr_t which
>>   >  >   is per-target. Further, a ton of devices do
>>   >  >   cpu_register_physical_memory/qemu_register_coalesced_mmio.
>>   >  >   These are also per target.
>>   >
>>   >  I don't know what I was eating yesterday: there are no references to
>>   >  ldl/stl in either rtl8139 or e1000. In fact, the conversion is simple
>>   >  for the device itself, just add a property "be". The attached patch
>>   >  performs this part.
>>   >
>>   >  But now there is a bigger problem, how to pass the property to the
>>   >  device. It's not fair to require the user to remember to set it.
>>
>>
>> I still don't fully understand how come e1000 cares about
>>   target endianness.
>>      
> It shouldn't. Maybe the real fix is to remove the byte swapping.
>    

My previous pci memory functions patches removed the byte swapping.

The problem is that PCI devices are going to operate in little endian 
mode (usually) whereas the CPU may be acting in big endian mode.  We 
need to do a byte swap somewhere but the better place to do it is in the 
PCI bus layer.

Regards,

Anthony Liguori

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 20:24         ` Michael S. Tsirkin
@ 2010-03-24 20:32           ` Blue Swirl
  0 siblings, 0 replies; 31+ messages in thread
From: Blue Swirl @ 2010-03-24 20:32 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: qemu-devel

On 3/24/10, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Wed, Mar 24, 2010 at 10:24:12PM +0200, Blue Swirl wrote:
>  > On 3/24/10, Michael S. Tsirkin <mst@redhat.com> wrote:
>  > > On Wed, Mar 24, 2010 at 10:05:10PM +0200, Blue Swirl wrote:
>  > >  > On 3/24/10, Michael S. Tsirkin <mst@redhat.com> wrote:
>  > >  > > On Tue, Mar 23, 2010 at 11:43:51PM +0200, Blue Swirl wrote:
>  > >  > >  > rtl8139.c, e1000.c: need to convert ldl/stl to cpu_physical_memory_read/write.
>  > >  > >
>  > >  > >
>  > >  > > I don't see how it would help. These still get target_phys_addr_t which
>  > >  > >  is per-target. Further, a ton of devices do
>  > >  > >  cpu_register_physical_memory/qemu_register_coalesced_mmio.
>  > >  > >  These are also per target.
>  > >  >
>  > >  > I don't know what I was eating yesterday: there are no references to
>  > >  > ldl/stl in either rtl8139 or e1000. In fact, the conversion is simple
>  > >  > for the device itself, just add a property "be". The attached patch
>  > >  > performs this part.
>  > >  >
>  > >  > But now there is a bigger problem, how to pass the property to the
>  > >  > device. It's not fair to require the user to remember to set it.
>  > >
>  > >
>  > > I still don't fully understand how come e1000 cares about
>  > >  target endianness.
>  >
>  > It shouldn't. Maybe the real fix is to remove the byte swapping.
>
>
> Presumably it's there for a reason?
>
>
>  > >  > >  A simple solution would be to change all of cpu_XX functions to
>  > >  > >  get a 64 bit address. This is a lot of churn, if we do this
>  > >  > >  anyway we should also pass length to callbacks, this way
>  > >  > >  rwhandler will get very small or go away completely.
>  > >  >
>  > >  > It's not too much effort to keep the target_phys_addr_t type.
>  > >
>  > >
>  > > I don't understand - target_phys_addr_t is different for different
>  > >  targets to we will need to recompile the code for each target.
>  > >  What am I missing?
>  >
>  > target_phys_addr_t is 64 bit on a 64 bit host, on a 32 bit host it's
>  > size will be either 64 or 32 bits depending on the target. So the
>  > files are compiled once on 64 bit host, twice on 32 bit host if both
>  > 32 and 64 bits targets are selected.
>
>
> How about just making it 64 bit unconditionally?
>  How much do we save by using a 32 bit address value?

On a 32 bit host, probably a lot because of register pressure. And
it's not too much effort to keep the target_phys_addr_t type logic.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 20:24       ` Blue Swirl
  2010-03-24 20:24         ` Michael S. Tsirkin
  2010-03-24 20:28         ` Anthony Liguori
@ 2010-03-24 21:00         ` Aurelien Jarno
  2 siblings, 0 replies; 31+ messages in thread
From: Aurelien Jarno @ 2010-03-24 21:00 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel, Michael S. Tsirkin

On Wed, Mar 24, 2010 at 10:24:12PM +0200, Blue Swirl wrote:
> On 3/24/10, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Wed, Mar 24, 2010 at 10:05:10PM +0200, Blue Swirl wrote:
> >  > On 3/24/10, Michael S. Tsirkin <mst@redhat.com> wrote:
> >  > > On Tue, Mar 23, 2010 at 11:43:51PM +0200, Blue Swirl wrote:
> >  > >  > rtl8139.c, e1000.c: need to convert ldl/stl to cpu_physical_memory_read/write.
> >  > >
> >  > >
> >  > > I don't see how it would help. These still get target_phys_addr_t which
> >  > >  is per-target. Further, a ton of devices do
> >  > >  cpu_register_physical_memory/qemu_register_coalesced_mmio.
> >  > >  These are also per target.
> >  >
> >  > I don't know what I was eating yesterday: there are no references to
> >  > ldl/stl in either rtl8139 or e1000. In fact, the conversion is simple
> >  > for the device itself, just add a property "be". The attached patch
> >  > performs this part.
> >  >
> >  > But now there is a bigger problem, how to pass the property to the
> >  > device. It's not fair to require the user to remember to set it.
> >
> >
> > I still don't fully understand how come e1000 cares about
> >  target endianness.
> 
> It shouldn't. Maybe the real fix is to remove the byte swapping.
> 

The real fix is actually to add a layer handling bus byte swapping
depending on how bus are connected.

Currently it only works because all big endian boards QEMU emulates
need to byteswap bus access, and none of the little endian boards 
need to do that.

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 20:05   ` Blue Swirl
  2010-03-24 20:08     ` Michael S. Tsirkin
@ 2010-03-24 22:33     ` Paul Brook
  2010-03-24 22:50       ` Anthony Liguori
  1 sibling, 1 reply; 31+ messages in thread
From: Paul Brook @ 2010-03-24 22:33 UTC (permalink / raw)
  To: qemu-devel; +Cc: Blue Swirl, Michael S. Tsirkin

> But now there is a bigger problem, how to pass the property to the
> device. It's not fair to require the user to remember to set it.

It should not be a property of the device. All devices have a native 
endianness (for PCI this is little-endian), and the intermediate 
busses/interconnects should determine whether byteswapping occurs.

Paul

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 19:28     ` Juan Quintela
@ 2010-03-24 22:47       ` Paul Brook
  2010-03-24 22:57         ` Anthony Liguori
  0 siblings, 1 reply; 31+ messages in thread
From: Paul Brook @ 2010-03-24 22:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Blue Swirl, Juan Quintela

> Actually, Anthony suggested at some point to just use 64 bits for
> TARGET_PHYS_ADDR_BITS and remove the need for hw32/64.
> 
> I think that people emulationg 32bits on 32bits would suffer, but have
> no clue how much.  Anthony, what was the idea?

Sacrificing runtime performance to avoid rebuilding a few files is not 
acceptable. I consider the fact that TARGET_PHYS_ADDR_BITS is always 64 on 64-
bit hosts is a bug. It's just hard to fix, and probably even less of a 
performance hit, so I haven't bothered yet.

Paul

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 22:33     ` Paul Brook
@ 2010-03-24 22:50       ` Anthony Liguori
  2010-03-24 23:05         ` Paul Brook
  0 siblings, 1 reply; 31+ messages in thread
From: Anthony Liguori @ 2010-03-24 22:50 UTC (permalink / raw)
  To: Paul Brook; +Cc: Blue Swirl, qemu-devel, Michael S. Tsirkin

On 03/24/2010 05:33 PM, Paul Brook wrote:
>> But now there is a bigger problem, how to pass the property to the
>> device. It's not fair to require the user to remember to set it.
>>      
> It should not be a property of the device. All devices have a native
> endianness (for PCI this is little-endian), and the intermediate
> busses/interconnects should determine whether byteswapping occurs.
>    

Right, the byte swapping needs to happen at the bus level which requires 
that the PCI regions use a different callback mechanism (and don't 
register directly with the cpu).

Regards,

Anthony Liguori

> Paul
>
>
>    

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 22:47       ` Paul Brook
@ 2010-03-24 22:57         ` Anthony Liguori
  2010-03-25  3:08           ` Jamie Lokier
  0 siblings, 1 reply; 31+ messages in thread
From: Anthony Liguori @ 2010-03-24 22:57 UTC (permalink / raw)
  To: Paul Brook; +Cc: Blue Swirl, qemu-devel, Juan Quintela

On 03/24/2010 05:47 PM, Paul Brook wrote:
>> Actually, Anthony suggested at some point to just use 64 bits for
>> TARGET_PHYS_ADDR_BITS and remove the need for hw32/64.
>>
>> I think that people emulationg 32bits on 32bits would suffer, but have
>> no clue how much.  Anthony, what was the idea?
>>      
> Sacrificing runtime performance to avoid rebuilding a few files is not
> acceptable. I consider the fact that TARGET_PHYS_ADDR_BITS is always 64 on 64-
> bit hosts is a bug. It's just hard to fix, and probably even less of a
> performance hit, so I haven't bothered yet.
>    

It's a statement of correctness really.  Devices should never deal with 
target_phys_addr_t's.

The question is, should a pci_addr_t or a sysbus_addr_t be 64 bit or 
should it be 32-bit on 32-bit platforms.  Honestly, I am extremely 
sceptical that there would be any measurable performance difference.

Regards,

Anthony Liguori

> Paul
>
>
>    

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 22:50       ` Anthony Liguori
@ 2010-03-24 23:05         ` Paul Brook
  2010-03-24 23:07           ` Anthony Liguori
  0 siblings, 1 reply; 31+ messages in thread
From: Paul Brook @ 2010-03-24 23:05 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Michael S. Tsirkin

> On 03/24/2010 05:33 PM, Paul Brook wrote:
> >> But now there is a bigger problem, how to pass the property to the
> >> device. It's not fair to require the user to remember to set it.
> >
> > It should not be a property of the device. All devices have a native
> > endianness (for PCI this is little-endian), and the intermediate
> > busses/interconnects should determine whether byteswapping occurs.
> 
> Right, the byte swapping needs to happen at the bus level which requires
> that the PCI regions use a different callback mechanism (and don't
> register directly with the cpu).

Not necessarily a different callback mechanism, but probably a different 
registration mechanism.

Paul

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 23:05         ` Paul Brook
@ 2010-03-24 23:07           ` Anthony Liguori
  2010-03-25  8:21             ` Michael S. Tsirkin
  0 siblings, 1 reply; 31+ messages in thread
From: Anthony Liguori @ 2010-03-24 23:07 UTC (permalink / raw)
  To: Paul Brook; +Cc: Blue Swirl, qemu-devel, Michael S. Tsirkin

On 03/24/2010 06:05 PM, Paul Brook wrote:
>> On 03/24/2010 05:33 PM, Paul Brook wrote:
>>      
>>>> But now there is a bigger problem, how to pass the property to the
>>>> device. It's not fair to require the user to remember to set it.
>>>>          
>>> It should not be a property of the device. All devices have a native
>>> endianness (for PCI this is little-endian), and the intermediate
>>> busses/interconnects should determine whether byteswapping occurs.
>>>        
>> Right, the byte swapping needs to happen at the bus level which requires
>> that the PCI regions use a different callback mechanism (and don't
>> register directly with the cpu).
>>      
> Not necessarily a different callback mechanism, but probably a different
> registration mechanism.
>    

Problem with the current scheme is that it's tied to 
target_phys_addr_t.  It's not a target_phys_addr_t and cannot be used 
with functions that take target_phys_addr_ts.

We can either introduce a generic address type or we can introduce bus 
specific addresses and have different callbacks.  The later has better 
type safety and since this isn't an obvious issue to most developers, 
that's probably an important feature.

Regards,

Anthony Liguori

> Paul
>    

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24  9:17 ` [Qemu-devel] " Juan Quintela
  2010-03-24 17:56   ` Blue Swirl
@ 2010-03-25  2:54   ` Jamie Lokier
  1 sibling, 0 replies; 31+ messages in thread
From: Jamie Lokier @ 2010-03-25  2:54 UTC (permalink / raw)
  To: Juan Quintela; +Cc: Blue Swirl, qemu-devel

Juan Quintela wrote:
> Blue Swirl <blauwirbel@gmail.com> wrote:
> > Hi,
> >
> > Here's some planning for getting most files compiled as few times as
> > possible. Comments and suggestions are welcome.
> 
> I took some thought about this at some point.  Problems here start from
> "Recursive Makefile condered Harmful (tm)".

Anyone who remembers the old Linux kernel recursive make compared
with the current "dancing makefile" magic will appreciate the big leap
forward in usability.

-- Jamie

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 22:57         ` Anthony Liguori
@ 2010-03-25  3:08           ` Jamie Lokier
  0 siblings, 0 replies; 31+ messages in thread
From: Jamie Lokier @ 2010-03-25  3:08 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, Juan Quintela, Paul Brook, qemu-devel

Anthony Liguori wrote:
> It's a statement of correctness really.  Devices should never deal with 
> target_phys_addr_t's.

Shouldn't they?  On real hardware, 64-bit PCI devices switch to being
32-bit PCI when plugged into a 32-bit motherboard slot.

> The question is, should a pci_addr_t or a sysbus_addr_t be 64 bit or 
> should it be 32-bit on 32-bit platforms.

Depends what you want to emulate.  It's not an accurate emulation if
all the old PCI devices provide 64-bit BARs; that could conceivably
bite some old OS, which expects NE2000-PCI to be a 32-bit device, for example.

And it's not a repeatable emulation if switching beteen 32-bit and
64-bit hosts means the guest sees a change in the PCI devices.  It
should be possible to change hosts with qemu willy nilly with zero
change seen be the guest.

So perhaps the width of pci_addr_t should be a per-device property,
not a host property?

> Honestly, I am extremely sceptical that there would be any
> measurable performance difference.

You may be right, but surely the way to find out is to have a way to
build either, and then compare them.  Not have it dictated by the build system.

64-bit ops on 32-bit hosts, especially x86 due to register pressure,
are noticably more expensive than 32-bit ops.  The question is whether
they are sparse enough among all the other logic that it doesn't matter.

With a bit of make cleverness it should be quite easy to support a mix
of build-once files and build-few-times files according to the minimal
compile time variations those files depend on - and express those
dependencies is a simple, one-liner way.  GNU Make is good for that
sort of thing.

- Jamie

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] Re: Compile files only once: some planning
  2010-03-24 23:07           ` Anthony Liguori
@ 2010-03-25  8:21             ` Michael S. Tsirkin
  2010-03-25 12:01               ` Paul Brook
  0 siblings, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2010-03-25  8:21 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, Paul Brook, qemu-devel

On Wed, Mar 24, 2010 at 06:07:35PM -0500, Anthony Liguori wrote:
> On 03/24/2010 06:05 PM, Paul Brook wrote:
>>> On 03/24/2010 05:33 PM, Paul Brook wrote:
>>>      
>>>>> But now there is a bigger problem, how to pass the property to the
>>>>> device. It's not fair to require the user to remember to set it.
>>>>>          
>>>> It should not be a property of the device. All devices have a native
>>>> endianness (for PCI this is little-endian), and the intermediate
>>>> busses/interconnects should determine whether byteswapping occurs.
>>>>        
>>> Right, the byte swapping needs to happen at the bus level which requires
>>> that the PCI regions use a different callback mechanism (and don't
>>> register directly with the cpu).
>>>      
>> Not necessarily a different callback mechanism, but probably a different
>> registration mechanism.
>>    
>
> Problem with the current scheme is that it's tied to target_phys_addr_t.  
> It's not a target_phys_addr_t and cannot be used with functions that take 
> target_phys_addr_ts.
>
> We can either introduce a generic address type or we can introduce bus  
> specific addresses and have different callbacks.  The later has better  
> type safety and since this isn't an obvious issue to most developers,  
> that's probably an important feature.
>
> Regards,
>
> Anthony Liguori
>
>> Paul
>>    


I'd prefer a generic address type since this makes it easier to
share code: for example virtio devices can use different busses,
it's common for pci host bridges to have common code for i/o
and memory, etc.

-- 
MST

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] Re: Compile files only once: some planning
  2010-03-25  8:21             ` Michael S. Tsirkin
@ 2010-03-25 12:01               ` Paul Brook
  0 siblings, 0 replies; 31+ messages in thread
From: Paul Brook @ 2010-03-25 12:01 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Blue Swirl, qemu-devel

> I'd prefer a generic address type since this makes it easier to
> share code: for example virtio devices can use different busses,

This is exactly why virtio devices should not be accessing memory. They should 
be doing everything via a virtqueue, which can interface with the host 
bindings appropriately.

Paul

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2010-03-25 12:01 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-23 21:43 [Qemu-devel] Compile files only once: some planning Blue Swirl
2010-03-24  9:17 ` [Qemu-devel] " Juan Quintela
2010-03-24 17:56   ` Blue Swirl
2010-03-24 19:28     ` Juan Quintela
2010-03-24 22:47       ` Paul Brook
2010-03-24 22:57         ` Anthony Liguori
2010-03-25  3:08           ` Jamie Lokier
2010-03-25  2:54   ` Jamie Lokier
2010-03-24  9:47 ` Paolo Bonzini
2010-03-24 11:19   ` Richard Henderson
2010-03-24 14:45     ` Paolo Bonzini
2010-03-24 14:56       ` Paul Brook
2010-03-24 16:18         ` Paolo Bonzini
2010-03-24 17:27           ` Paul Brook
2010-03-24 17:07       ` Richard Henderson
2010-03-24 20:12         ` Richard Henderson
2010-03-24 18:00     ` Blue Swirl
2010-03-24 19:39 ` Michael S. Tsirkin
2010-03-24 20:05   ` Blue Swirl
2010-03-24 20:08     ` Michael S. Tsirkin
2010-03-24 20:24       ` Blue Swirl
2010-03-24 20:24         ` Michael S. Tsirkin
2010-03-24 20:32           ` Blue Swirl
2010-03-24 20:28         ` Anthony Liguori
2010-03-24 21:00         ` Aurelien Jarno
2010-03-24 22:33     ` Paul Brook
2010-03-24 22:50       ` Anthony Liguori
2010-03-24 23:05         ` Paul Brook
2010-03-24 23:07           ` Anthony Liguori
2010-03-25  8:21             ` Michael S. Tsirkin
2010-03-25 12:01               ` Paul Brook

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.