All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC] Various questions about TCG implementation, DRM patches dealing with pointers over guest-host barrier.
@ 2020-05-17 23:05 Catherine A. Frederick
  2020-05-18 10:21 ` Peter Maydell
  2020-05-19 14:51 ` Richard Henderson
  0 siblings, 2 replies; 4+ messages in thread
From: Catherine A. Frederick @ 2020-05-17 23:05 UTC (permalink / raw)
  To: qemu-devel

Hi, I've been patching TCG for my own purposes recently and I was 
wondering a few things. That being:

- Is the TCG backend expected to handle bad cases for instructions? I 
was wondering as I found a situation where a very large shift constant 
reaches the backend and causes an illegal instruction to be generated. 
Is the frontend expected to clean this up, or is the backend supposed to 
be able to deal with these? I currently patched the bug via clipping the 
shift constant between 0 and 64.

- I've been implementing an instruction scheduler(list scheduler, with 
priority given to most successors) for TCG and currently if I replace 
instructions in s->ops(the TCG context) I get a crash later in 
tcg_reg_alloc_op, even if the instruction stream is identical. Is there 
anything else I need to move when I do this?

- Is insn_start necessary to have in order(and what does it do?)? These 
currently are serializing instructions in my scheduler and significantly 
limit my reordering as they create lots of dependencies every few 
instructions.

- Is it "okay" to use g2h and h2g directly in code in syscall.c? 
Currently it seems like TYPE_PTRVOID doesn't do this conversion, and as 
such, most of the calls made over the guest-host barrier made by DRM 
seem to fail spectacularly across bittedness lines. I think a more ideal 
solution would be implementing types that do this automatically, so I 
don't have to deal with the difference in struct size using macros, but 
in the short term I don't really have another option.

My last email didn't seem to reach you all, but here's to hoping this 
one does. Thanks!



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC] Various questions about TCG implementation, DRM patches dealing with pointers over guest-host barrier.
  2020-05-17 23:05 [RFC] Various questions about TCG implementation, DRM patches dealing with pointers over guest-host barrier Catherine A. Frederick
@ 2020-05-18 10:21 ` Peter Maydell
  2020-05-18 10:52   ` Alex Bennée
  2020-05-19 14:51 ` Richard Henderson
  1 sibling, 1 reply; 4+ messages in thread
From: Peter Maydell @ 2020-05-18 10:21 UTC (permalink / raw)
  To: Catherine A. Frederick; +Cc: Richard Henderson, QEMU Developers

On Mon, 18 May 2020 at 00:23, Catherine A. Frederick
<agrecascino123@gmail.com> wrote:
> Hi, I've been patching TCG for my own purposes recently and I was
> wondering a few things. That being:
>
> - Is the TCG backend expected to handle bad cases for instructions? I
> was wondering as I found a situation where a very large shift constant
> reaches the backend and causes an illegal instruction to be generated.
> Is the frontend expected to clean this up, or is the backend supposed to
> be able to deal with these? I currently patched the bug via clipping the
> shift constant between 0 and 64.

The semantics of TCG IR ops are described in tcg/README. If the
behaviour described there isn't what the guest requires then
the frontend needs to emit code to handle special cases the way
the guest architecture says they're handled. If a backend doesn't
behave as the IR op spec requires then the backend is buggy. For shifts
in particular (shl/shr/sar/rotl/rotr) the IR op has "unspecified
behaviour" for out of range shift values: this is required to
not crash, but the result value is not specified and could be
anything. (The IR spec uses "undefined behaviour" to mean "could
do anything, including crashing. The reason shifts are made to
be only unspecified-behaviour is that it allows a frontend to
emit code that unconditionally does a shift and then uses movcond
or similar to only use the result in the case where the shift is
in range.)

> - I've been implementing an instruction scheduler(list scheduler, with
> priority given to most successors) for TCG and currently if I replace
> instructions in s->ops(the TCG context) I get a crash later in
> tcg_reg_alloc_op, even if the instruction stream is identical. Is there
> anything else I need to move when I do this?

This one's out of my field of knowledge; Richard might know.

> - Is insn_start necessary to have in order(and what does it do?)? These
> currently are serializing instructions in my scheduler and significantly
> limit my reordering as they create lots of dependencies every few
> instructions.

The primary purpose of insn_start is to save information about the
current instruction in a metadata area associated with the generated
TB, so that if the guest takes an unexpected exception (typically,
because a guest load or store faults) then we can restore the CPU
state to what it should be at the point of the fault. This is more
efficient than if we had to emit a "write new PC value to CPU state"
operation at the end of every guest insn. At runtime, at the point
when we determine that we need to generate a guest exception, we know
the host PC (inside the generated code) where the fault occurred.
We look up that PC value in the metadata to determine which guest
PC value that corresponds to [in accel/tcg/translate-all.c
cpu_restore_state_from_tb()], and use that to correctly set the
guest PC value and sometimes other state [by calling the frontend's
restore_state_to_opc() function, passing it the data that the
frontend handed to the insn_start opcode. (Eg on 32-bit arm we use
this mechanism to fix up the condexec bits, and to provide correct
values for the exception syndrome register.) If you wanted to be
able to reorder TCG ops across guest insn boundaries you'd need to
make the "guest-PC-to-insn-start-data" lookup handle that. You'd
also need to ensure that you don't reorder anything that updates
information visible to the guest (a visible effect of the insn)
before anything that could generate an exception in a preceding insn.

insn_start is also handy simply for debugging purposes -- in debug
disassembly output we indicate where instruction boundaries are in
the host generated code and in the TCG IR opcode dump, which makes
it easier to figure out what the generated code was trying to do.

Finally, I haven't checked, but I suspect the new TCG plugin APIs
implicitly assume that code for each insn is generated serially,
ie that a plugin can do "for each instruction" type work on a
callback that hangs off the insn_start op.

> - Is it "okay" to use g2h and h2g directly in code in syscall.c?
> Currently it seems like TYPE_PTRVOID doesn't do this conversion, and as
> such, most of the calls made over the guest-host barrier made by DRM
> seem to fail spectacularly across bittedness lines. I think a more ideal
> solution would be implementing types that do this automatically, so I
> don't have to deal with the difference in struct size using macros, but
> in the short term I don't really have another option.

Usually syscall.c code wants to use lock_user/lock_user_string/
lock_user_struct rather than directly using g2h/h2g. This is
important because the lock_user functions will check whether the
guest can actually access the memory, so that you can return a
suitable error (usually TARGET_EFAULT). Otherwise if the guest
hands you a buffer it doesn't have access to then your g2h-then-access
will probably make QEMU crash.

I think that the reason TYPE_PTRVOID is not converted within a struct
is that almost always the syscall.c wrapper for the ioctl will
need to actively do something to convert the data being pointed
to, which means it will need to call lock_user() on it, which means
it wants the guest address, not the host address. So an IOCTL
definition that uses TYPE_PTRVOID will be an IOCTL_SPECIAL()
which provides a C function to handle that conversion.
If the buffer being pointed to is literally just a byte buffer
of data that needs no conversion, I think you can specify that
with MK_PTR(TYPE_CHAR). If it's anything more complicated then
it's going to need an IOCTL_SPECIAL and a conversion function.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC] Various questions about TCG implementation, DRM patches dealing with pointers over guest-host barrier.
  2020-05-18 10:21 ` Peter Maydell
@ 2020-05-18 10:52   ` Alex Bennée
  0 siblings, 0 replies; 4+ messages in thread
From: Alex Bennée @ 2020-05-18 10:52 UTC (permalink / raw)
  To: Peter Maydell; +Cc: Richard Henderson, qemu-devel, Catherine A. Frederick


Peter Maydell <peter.maydell@linaro.org> writes:

> On Mon, 18 May 2020 at 00:23, Catherine A. Frederick
> <agrecascino123@gmail.com> wrote:
>> Hi, I've been patching TCG for my own purposes recently and I was
>> wondering a few things. That being:
>>
<snip>
>> - I've been implementing an instruction scheduler(list scheduler, with
>> priority given to most successors) for TCG and currently if I replace
>> instructions in s->ops(the TCG context) I get a crash later in
>> tcg_reg_alloc_op, even if the instruction stream is identical. Is there
>> anything else I need to move when I do this?
>
> This one's out of my field of knowledge; Richard might know.

I'm a little unclear in what is happening here. For TCG plugins we
insert dummy ops into the stream so they can be replaced or removed at a
later phase in the translation.

>> - Is insn_start necessary to have in order(and what does it do?)? These
>> currently are serializing instructions in my scheduler and significantly
>> limit my reordering as they create lots of dependencies every few
>> instructions.
>
> The primary purpose of insn_start is to save information about the
> current instruction in a metadata
<snip>
> Finally, I haven't checked, but I suspect the new TCG plugin APIs
> implicitly assume that code for each insn is generated serially,
> ie that a plugin can do "for each instruction" type work on a
> callback that hangs off the insn_start op.

Spoiler alert: Yes it does ;-)

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC] Various questions about TCG implementation, DRM patches dealing with pointers over guest-host barrier.
  2020-05-17 23:05 [RFC] Various questions about TCG implementation, DRM patches dealing with pointers over guest-host barrier Catherine A. Frederick
  2020-05-18 10:21 ` Peter Maydell
@ 2020-05-19 14:51 ` Richard Henderson
  1 sibling, 0 replies; 4+ messages in thread
From: Richard Henderson @ 2020-05-19 14:51 UTC (permalink / raw)
  To: Catherine A. Frederick, qemu-devel

On 5/17/20 4:05 PM, Catherine A. Frederick wrote:
> - I've been implementing an instruction scheduler(list scheduler, with priority
> given to most successors) for TCG and currently if I replace instructions in
> s->ops(the TCG context) I get a crash later in tcg_reg_alloc_op, even if the
> instruction stream is identical. Is there anything else I need to move when I
> do this?

If you're getting a crash in tcg_reg_alloc_op, are you updating liveness info?
 Where are you inserting your scheduler in the list of passes?

> - Is insn_start necessary to have in order(and what does it do?)? These
> currently are serializing instructions in my scheduler and significantly limit
> my reordering as they create lots of dependencies every few instructions.

insn_start is required for unwinding from exceptions.  You can consider it a
barrier for call and qemu_ld/st.  Anything else cannot raise an exception and
can be moved.


r~


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-05-19 14:56 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-17 23:05 [RFC] Various questions about TCG implementation, DRM patches dealing with pointers over guest-host barrier Catherine A. Frederick
2020-05-18 10:21 ` Peter Maydell
2020-05-18 10:52   ` Alex Bennée
2020-05-19 14:51 ` Richard Henderson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.