All of lore.kernel.org
 help / color / mirror / Atom feed
From: Masami Hiramatsu <mhiramat@kernel.org>
To: Borislav Petkov <bp@alien8.de>
Cc: x86-ml <x86@kernel.org>, Joerg Roedel <jroedel@suse.de>,
	lkml <linux-kernel@vger.kernel.org>
Subject: Re: [RFC] Have insn decoder functions return success/failure
Date: Thu, 22 Oct 2020 16:31:00 +0900	[thread overview]
Message-ID: <20201022163100.1139b28220da4eafb5e70fcc@kernel.org> (raw)
In-Reply-To: <20201021164558.GB4050@zn.tnic>

On Wed, 21 Oct 2020 18:45:58 +0200
Borislav Petkov <bp@alien8.de> wrote:

> On Wed, Oct 21, 2020 at 11:26:13PM +0900, Masami Hiramatsu wrote:
> > Hmm, I meant someone might think it can be used for filtering the
> > instruction something like,
> > 
> > insn_init(insn, buf, buflen, 1);
> > ret = insn_get_length(insn);
> > if (!ret) {
> > 	/* OK, this is safe */
> > 	patch_text(buf, trampoline);
> > }
> > 
> > No, we need another validator for such usage.
> 
> Well, I think calling insn_get_length() should give you only the
> *length* of the insn and nothing else - I mean that is what the function
> is called. And I believe current use is wrong.
> 
> Examples:
> 
> arch/x86/kernel/kprobes/core.c:
>                 insn_get_length(&insn);
> 
>                 /*
>                  * Another debugging subsystem might insert this breakpoint.
>                  * In that case, we can't recover it.
>                  */
>                 if (insn.opcode.bytes[0] == INT3_INSN_OPCODE)
> 
> So this has called get_length but it is far from clear that after that
> call, the opcode bytes in insn.opcode.bytes are there.

No, insn_get_length() implies it decodes whole of the instruction.
(yeah, we need an alias of that, something like insn_get_complete())

> 
> What that should do instead IMO is this:
> 
> 	insn_get_opcode(&insn);
> 

No, you've cut the last lines of that loop.

                /*
                 * Another debugging subsystem might insert this breakpoint.
                 * In that case, we can't recover it.
                 */
                if (insn.opcode.bytes[0] == INT3_INSN_OPCODE)
                        return 0;
                addr += insn.length;
        }

I need insn.length too. Of course we can split it into 2 calls. But
as I said, since the insn_get_length() implies it decodes all other
parts, I just called it once.

> and *then* the return value can tell you whether the opcode bytes were
> parsed properly or not. See what I mean?

I agreed to check the return value of insn_get_length() at that point
only for checking whether the instruction parsing was failed or not.

> 
> That's even documented that way:
> 
> /**
>  * insn_get_opcode - collect opcode(s)
>  * @insn:       &struct insn containing instruction
>  *
>  * Populates @insn->opcode, updates @insn->next_byte to point past the
>  * opcode byte(s), and set @insn->attr (except for groups).
> 
> 
> Similarly here:
> 
> static enum es_result vc_decode_insn(struct es_em_ctxt *ctxt)
> 
> 	...
> 
>         insn_get_length(&ctxt->insn);
> 
>         ret = ctxt->insn.immediate.got ? ES_OK : ES_DECODE_FAILED;
> 
> that thing wants to decode the insn but it is looking whether it parsed
> an *immediate*?!

Hm, it is better to call insn_get_immediate() if it doesn't use length later.

> 
> I'm not saying this is necessarily wrong - just the naming nomenclature
> and the API should be properly defined when you call a function of the
> insn decoder, what you are guaranteed to get and what a caller can
> assume after that. And then the proper functions be called.

Would you mean we'd better have something like insn_get_until_immediate() ? 

Since the x86 instruction is CISC, we can not decode intermediate
parts. The APIs follows that. If you are confused, I'm sorry about that.

> 
> In the kprobes/core.c example above, it does a little further:
> 
> 	ddr += insn.length;	
> 
> which, IMO, it should be either preceeded by a call to insn_get_length()
> - yes, this time we want the insn length or, the code should call a
> decoding function which gives you *both* length* and opcode bytes.
> insn_decode_insn() or whatever. And *that* should be documented in that
> function's kernel-doc section. And so on...

Actually, there is a historical reason too. INT3 check was added afterwards.
At first, I just calculated the instruction length in the loop...

Thank you,

> 
> Does that make more sense?
> 
> Thx.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette


-- 
Masami Hiramatsu <mhiramat@kernel.org>

  reply	other threads:[~2020-10-22  7:31 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-20 12:02 [RFC] Have insn decoder functions return success/failure Borislav Petkov
2020-10-20 14:27 ` Masami Hiramatsu
2020-10-20 14:37   ` Borislav Petkov
2020-10-21  0:50     ` Masami Hiramatsu
2020-10-21  9:27       ` Borislav Petkov
2020-10-21 14:26         ` Masami Hiramatsu
2020-10-21 16:45           ` Borislav Petkov
2020-10-22  7:31             ` Masami Hiramatsu [this message]
2020-10-22  9:30               ` Borislav Petkov
2020-10-22 13:21                 ` Masami Hiramatsu
2020-10-22 17:58                   ` Andy Lutomirski
2020-10-23  9:20                     ` Borislav Petkov
2020-10-23  9:28                     ` Masami Hiramatsu
2020-10-23  9:32                       ` Borislav Petkov
2020-10-23 10:47                         ` Masami Hiramatsu
2020-10-23 23:27                           ` Borislav Petkov
2020-10-24  0:12                             ` Andy Lutomirski
2020-10-24  7:21                               ` Masami Hiramatsu
2020-10-24  8:23                               ` Borislav Petkov
2020-10-24 16:10                                 ` Andy Lutomirski
2020-10-27 13:42                                   ` Borislav Petkov
2020-10-28 11:36                                     ` Masami Hiramatsu
2020-10-24  7:13                             ` Masami Hiramatsu
2020-10-24  8:24                               ` Borislav Petkov
2020-10-29 12:42                             ` Borislav Petkov
2020-10-30  1:24                               ` Masami Hiramatsu
2020-10-30 13:07                                 ` Borislav Petkov
2020-10-23  9:17                   ` Borislav Petkov
2020-10-22  8:04             ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201022163100.1139b28220da4eafb5e70fcc@kernel.org \
    --to=mhiramat@kernel.org \
    --cc=bp@alien8.de \
    --cc=jroedel@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.