Subject: Re: [PATCH v7 16/26] x86/insn-eval: Support both signed 32-bit and 64-bit effective addresses
From: Ricardo Neri
To: Borislav Petkov
Cc: Ingo Molnar, Thomas Gleixner, "H. Peter Anvin", Andy Lutomirski,
	Peter Zijlstra, Andrew Morton, Brian Gerst, Chris Metcalf,
	Dave Hansen, Paolo Bonzini, Masami Hiramatsu, Huang Rui,
	Jiri Slaby, Jonathan Corbet, "Michael S. Tsirkin",
	Paul Gortmaker, Vlastimil Babka, Chen Yucong,
	Alexandre Julliard, Stas Sergeev, Fenghua Yu,
	"Ravi V. Shankar", Shuah Khan, linux-kernel@vger.kernel.org,
	x86@kernel.org, linux-msdos@vger.kernel.org,
	wine-devel@winehq.org, Adam Buchbinder, Colin Ian King,
	Lorenzo Stoakes, Qiaowei Ren, Arnaldo Carvalho de Melo,
	Adrian Hunter, Kees Cook, Thomas Garnier, Dmitry Vyukov
Date: Tue, 25 Jul 2017 16:48:13 -0700
Message-ID: <1501026493.22603.48.camel@ranerica-desktop>
In-Reply-To: <20170607154819.xkbxp3hg7lwjdxd6@pd.tnic>
References: <20170505181724.55000-1-ricardo.neri-calderon@linux.intel.com>
	 <20170505181724.55000-17-ricardo.neri-calderon@linux.intel.com>
	 <20170607154819.xkbxp3hg7lwjdxd6@pd.tnic>

I am sorry, Boris; while working on this series I missed a few of your
feedback comments.

On Wed, 2017-06-07 at 17:48 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:14AM -0700, Ricardo Neri wrote:
> > The 32-bit and 64-bit address encodings are identical. This means that we
> > can use the same function in both cases. In order to reuse the function
> > for 32-bit address encodings, we must sign-extend our 32-bit signed
> > operands to 64-bit signed variables (only for 64-bit builds). To decide on
> > whether sign extension is needed, we rely on the address size as given by
> > the instruction structure.
> > 
> > Once the effective address has been computed, a special verification is
> > needed for 32-bit processes. If running on a 64-bit kernel, such processes
> > can address up to 4GB of memory. Hence, for instance, an effective
> > address of 0xffff1234 would be misinterpreted as 0xffffffffffff1234 due to
> > the sign extension mentioned above. For this reason, the 4 must be
> 
> Which 4?

I meant to say the 4 most significant bytes. In this case, the 64-bit
address 0xffffffffffff1234 would lie in the kernel memory while
0xffff1234 would correctly be in the user space memory.

> > truncated to obtain the true effective address.
> > 
> > Lastly, before computing the linear address, we verify that the effective
> > address is within the limits of the segment. The check is kept for long
> > mode because in such a case the limit is set to -1L. This is the largest
> > unsigned number possible. This is equivalent to a limit-less segment.
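A minimal user-space sketch of the sign-extend-then-truncate scheme
described above; illustrative only, not the patch code, and the
variable names are made up:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint32_t eff_addr = 0xffff1234;	/* 32-bit effective address */

	/* Sign-extend to 64 bits, as done on 64-bit builds. */
	int64_t extended = (int64_t)(int32_t)eff_addr;
	printf("sign-extended: 0x%016llx\n", (unsigned long long)extended);
	/* prints 0xffffffffffff1234: looks like a kernel address */

	/* Truncate the 4 most significant bytes for a 32-bit process. */
	uint64_t true_addr = (uint64_t)extended & 0xffffffffull;
	printf("truncated:     0x%016llx\n", (unsigned long long)true_addr);
	/* prints 0x00000000ffff1234: the true effective address */

	return 0;
}

The truncation step is what the commit message's 0xffff1234 example is
about.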
> > 
> > Cc: Dave Hansen
> > Cc: Adam Buchbinder
> > Cc: Colin Ian King
> > Cc: Lorenzo Stoakes
> > Cc: Qiaowei Ren
> > Cc: Arnaldo Carvalho de Melo
> > Cc: Masami Hiramatsu
> > Cc: Adrian Hunter
> > Cc: Kees Cook
> > Cc: Thomas Garnier
> > Cc: Peter Zijlstra
> > Cc: Borislav Petkov
> > Cc: Dmitry Vyukov
> > Cc: Ravi V. Shankar
> > Cc: x86@kernel.org
> > Signed-off-by: Ricardo Neri
> > ---
> >  arch/x86/lib/insn-eval.c | 99 ++++++++++++++++++++++++++++++++++++++++++------
> >  1 file changed, 88 insertions(+), 11 deletions(-)
> > 
> > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> > index 1a5f5a6..c7c1239 100644
> > --- a/arch/x86/lib/insn-eval.c
> > +++ b/arch/x86/lib/insn-eval.c
> > @@ -688,6 +688,62 @@ int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
> >  	return get_reg_offset(insn, regs, REG_TYPE_RM);
> >  }
> >  
> > +/**
> > + * _to_signed_long() - Cast an unsigned long into signed long
> > + * @val		A 32-bit or 64-bit unsigned long
> > + * @long_bytes	The number of bytes used to represent a long number
> > + * @out		The casted signed long
> > + *
> > + * Return: A signed long of either 32 or 64 bits, as per the build
> > + * configuration of the kernel.
> > + */
> > +static int _to_signed_long(unsigned long val, int long_bytes, long *out)
> > +{
> > +	if (!out)
> > +		return -EINVAL;
> > +
> > +#ifdef CONFIG_X86_64
> > +	if (long_bytes == 4) {
> > +		/* higher bytes should all be zero */
> > +		if (val & ~0xffffffff)
> > +			return -EINVAL;
> > +
> > +		/* sign-extend to a 64-bit long */
> 
> So this is a 32-bit userspace on a 64-bit kernel, right?

Yes.

> If so, how can a memory offset be > 32-bits and we have to extend it to
> a 64-bit long?!?

Yes, perhaps the check above is not needed. I included that check as
part of my argument validation. In a 64-bit kernel, this function could
be called with a val whose most significant bytes are non-zero.

> I *think* you want to say that you want to convert it to long so that
> you can do the calculation in longs.

That is exactly what I meant. More specifically, I want to convert my
32-bit variables into 64-bit signed longs; this is the reason I need
the sign extension.

> However!
> 
> If you're a 64-bit kernel running a 32-bit userspace, you need to do
> the calculation in 32-bits only so that it overflows, as it would do
> on 32-bit hardware. IOW, the clamping to 32-bits at the end is not
> something you wanna do but actually let it wrap if it overflows.

I have looked into this closely and, as far as I can see, the 4 least
significant bytes wrap around when using 64-bit signed numbers just as
they would when using 32-bit signed numbers. For instance, for two
positive numbers we have:

	7fff:ffff + 7000:0000 = efff:ffff

The addition above overflows. When sign-extended to 64-bit numbers we
would have:

	0000:0000:7fff:ffff + 0000:0000:7000:0000 = 0000:0000:efff:ffff

The addition above does not overflow. However, the 4 least significant
bytes overflow as we expect. We can clamp the 4 most significant bytes.

For two's-complement negative numbers we can have:

	ffff:ffff + 8000:0000 = 7fff:ffff, with a carry flag.

The addition above overflows. When sign-extending to 64-bit numbers we
would have:

	ffff:ffff:ffff:ffff + ffff:ffff:8000:0000 = ffff:ffff:7fff:ffff,
	with a carry flag.

The addition above does not overflow. However, the 4 least significant
bytes overflowed and wrapped around as they would when using 32-bit
signed numbers.

> Or am I missing something?

Now, am I missing something?

Thanks and BR,
Ricardo
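P.S. The wrap-around argument above can be checked with a small
user-space program; this is illustrative only and not part of the
patch:

#include <stdint.h>
#include <stdio.h>

static void check(uint32_t a, uint32_t b)
{
	uint32_t sum32 = a + b;	/* unsigned 32-bit add: wraps modulo 2^32 */

	/* Sign-extend both operands and add in 64 bits: no wrap here. */
	int64_t sum64 = (int64_t)(int32_t)a + (int64_t)(int32_t)b;

	printf("32-bit: %08x  64-bit: %016llx  low 4 bytes match: %s\n",
	       sum32, (unsigned long long)(uint64_t)sum64,
	       (uint32_t)sum64 == sum32 ? "yes" : "no");
}

int main(void)
{
	check(0x7fffffff, 0x70000000);	/* 7fff:ffff + 7000:0000 */
	check(0xffffffff, 0x80000000);	/* ffff:ffff + 8000:0000 */
	return 0;
}

Both lines print "low 4 bytes match: yes": clamping the 4 most
significant bytes after a 64-bit addition gives the same result as
letting a 32-bit addition wrap.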