From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751215AbdE2Vsh (ORCPT ); Mon, 29 May 2017 17:48:37 -0400
Received: from mx2.suse.de ([195.135.220.15]:52656 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751097AbdE2Vsg (ORCPT ); Mon, 29 May 2017 17:48:36 -0400
Date: Mon, 29 May 2017 23:48:16 +0200
From: Borislav Petkov
To: Ricardo Neri
Cc: Ingo Molnar , Thomas Gleixner , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , Andrew Morton , Brian Gerst , Chris Metcalf , Dave Hansen , Paolo Bonzini , Masami Hiramatsu , Huang Rui , Jiri Slaby , Jonathan Corbet , "Michael S. Tsirkin" , Paul Gortmaker , Vlastimil Babka , Chen Yucong , Alexandre Julliard , Stas Sergeev , Fenghua Yu , "Ravi V. Shankar" , Shuah Khan , linux-kernel@vger.kernel.org, x86@kernel.org, linux-msdos@vger.kernel.org, wine-devel@winehq.org, Adam Buchbinder , Colin Ian King , Lorenzo Stoakes , Qiaowei Ren , Arnaldo Carvalho de Melo , Adrian Hunter , Kees Cook , Thomas Garnier , Dmitry Vyukov
Subject: Re: [PATCH v7 09/26] x86/insn-eval: Add utility function to identify string instructions
Message-ID: <20170529214816.7qqle6tis3kjlifx@pd.tnic>
References: <20170505181724.55000-1-ricardo.neri-calderon@linux.intel.com> <20170505181724.55000-10-ricardo.neri-calderon@linux.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <20170505181724.55000-10-ricardo.neri-calderon@linux.intel.com>
User-Agent: NeoMutt/20170113 (1.7.2)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, May 05, 2017 at 11:17:07AM -0700, Ricardo Neri wrote:
> String instructions are special because in protected mode, the linear
> address is always obtained via the ES segment register in operands that
> use the (E)DI register.

... and DS for rSI. If we're going to account for both operands of
string instructions with two operands.

Btw, LODS and OUTS use only DS:rSI as a source operand. So we have to be
careful with the generalization here. So if ES:rDI is the only seg. reg
we want, then we don't need to look at those insns... (we assume DS by
default).

...

> +/**
> + * is_string_instruction - Determine if instruction is a string instruction
> + * @insn:	Instruction structure containing the opcode
> + *
> + * Return: true if the instruction, determined by the opcode, is any of the
> + * string instructions as defined in the Intel Software Development manual.
> + * False otherwise.
> + */
> +static bool is_string_instruction(struct insn *insn)
> +{
> +	insn_get_opcode(insn);
> +
> +	/* all string instructions have a 1-byte opcode */
> +	if (insn->opcode.nbytes != 1)
> +		return false;
> +
> +	switch (insn->opcode.bytes[0]) {
> +	case INSB:
> +		/* fall through */
> +	case INSW_INSD:
> +		/* fall through */
> +	case OUTSB:
> +		/* fall through */
> +	case OUTSW_OUTSD:
> +		/* fall through */
> +	case MOVSB:
> +		/* fall through */
> +	case MOVSW_MOVSD:
> +		/* fall through */
> +	case CMPSB:
> +		/* fall through */
> +	case CMPSW_CMPSD:
> +		/* fall through */
> +	case STOSB:
> +		/* fall through */
> +	case STOSW_STOSD:
> +		/* fall through */
> +	case LODSB:
> +		/* fall through */
> +	case LODSW_LODSD:
> +		/* fall through */
> +	case SCASB:
> +		/* fall through */

That "fall through" for every opcode is just too much. Also, you can use
the regularity of the x86 opcode space and do:

	case 0x6c ... 0x6f:	/* INS/OUTS */
	case 0xa4 ... 0xa7:	/* MOVS/CMPS */
	case 0xaa ... 0xaf:	/* STOS/LODS/SCAS */
		return true;
	default:
		return false;
	}

And voila, there's your compact is_string_insn() function! :^)

(Modulo the exact list, as I mentioned above).

Thanks.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--
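
For reference, here is a minimal standalone sketch of the range-based check
suggested above, plus a narrower variant for the "ES:rDI only" case. It is an
illustration under assumptions, not the patch's final code: it takes a raw
opcode byte instead of a struct insn, the names is_string_opcode() and
uses_es_rdi() are hypothetical, and the case ranges are a GNU C extension
(accepted by GCC and Clang), not ISO C. It also assumes the caller has already
confirmed a 1-byte opcode.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true for any 1-byte string-instruction opcode.
 * 0xa8/0xa9 (TEST AL/eAX, imm) sit between CMPS and STOS and are
 * deliberately left out of the ranges.
 */
static bool is_string_opcode(uint8_t opcode)
{
	switch (opcode) {
	case 0x6c ... 0x6f:	/* INS/OUTS */
	case 0xa4 ... 0xa7:	/* MOVS/CMPS */
	case 0xaa ... 0xaf:	/* STOS/LODS/SCAS */
		return true;
	default:
		return false;
	}
}

/*
 * Hypothetical helper: true only for string opcodes that have an
 * ES:rDI operand. OUTS (0x6e-0x6f) and LODS (0xac-0xad) use only
 * DS:rSI, so they are excluded.
 */
static bool uses_es_rdi(uint8_t opcode)
{
	switch (opcode) {
	case 0x6c ... 0x6d:	/* INS */
	case 0xa4 ... 0xa7:	/* MOVS/CMPS */
	case 0xaa ... 0xab:	/* STOS */
	case 0xae ... 0xaf:	/* SCAS */
		return true;
	default:
		return false;
	}
}
```

Which of the two lists is the right one depends on whether the caller wants
both operands of the two-operand string instructions or only the ES:rDI one.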