Date: Fri, 25 Apr 2003 13:44:34 +0200
From: Antonio Vargas
To: John Bradford
Cc: Timothy Miller, Steven Augart, Kernel Mailing List
Subject: Re: Simple x86 Simulator (was: Re: Flame Linus to a crisp!)
Message-ID: <20030425114434.GG6009@wind.cocodriloo.com>
References: <3EA9565B.9020905@techsource.com> <200304251610.h3PGAiri001509@81-2-122-30.bradfords.org.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <200304251610.h3PGAiri001509@81-2-122-30.bradfords.org.uk>
User-Agent: Mutt/1.3.28i
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Apr 25, 2003 at 05:10:44PM +0100, John Bradford wrote:
> > > We could not. Consider just the 8 32-bit-wide legacy x86 registers,
> > > excluding the MMX and FPU registers:
> > > (AX, BX, CX, DX, BP, SI, DI, SP). 32 bits x 8 = 256 bits of state,
> > > i.e. 2^256 independent states to look up in the table, each state
> > > having 256 bits of information. 2^264 total bits of information
> > > needed. Assume 1 GB DIMMs (2^30 * 8 bits each = 2^33 bits of info),
> > > with a volume of 10 cm^3 per DIMM (including a tiny amount of space
> > > for air circulation). You would need
> > > 34508731733952818937173779311385127262255544860851932776 cubic
> > > kilometers of space.
> > >
> > > Considerably larger than the volume of the earth, although admittedly
> > > smaller than the total volume of the universe.
> > > --Steven Augart
> >
> > If this could be done, someone would have done it already.
>
> It's certainly possible to implement most of the functionality of a
> very simple processor this way, but applying the idea to an x86
> compatible processor was a joke.
>
> What interests me now is whether we could cache the results of certain
> opcode strings in a separate memory area.
>
> Say, for example, you have a complicated routine that runs into
> hundreds of opcodes and is being applied to large amounts of data,
> word by word. If one calculation doesn't depend on another, you could
> cache the results and then merely fetch them from the results cache
> when the input data repeats itself.
>
> I.e. the processor dynamically makes its own look-up tables.

This is called dynamic programming, and it is done by keeping a cache of
previous results. Take, for example, a simple Fibonacci function:

/* naive version: recomputes every subproblem */
int fib(int n){
        if(n < 2)
                return n;
        return fib(n-2) + fib(n-1);
}

Now hook up a direct-mapped cache:

/* memoized version: remember results of previous calls */
int cache[65536];

int fib(int n){
        int x;

        if(n < 2)
                return n;
        x = cache[n];
        if(x)                   /* cache hit */
                return x;
        x = fib(n-2) + fib(n-1);
        cache[n] = x;           /* store for next time */
        return x;
}

Of course, this limits the range too much, so coding an LRU or some
other cache scheme would consume less memory and give better speed.

Your idea of applying this to general hardware execution could be
_really_ nice, in fact :)

Greets, Antonio.

ps. I recall that today's hardware already does this for some things:
the TLB cache replaces an expensive and sometimes complex
page-table-tree walk (the m68030 MMU docs come to mind).
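
[A minimal sketch (not part of the original mail) of the "LRU or some
other cache scheme" idea above: a tiny fully-associative result cache
in front of an expensive pure function. The names SLOTS, slow_f and
cached_f are invented for illustration only.]

#include <stdio.h>
#include <string.h>

#define SLOTS 8

struct entry {
        int valid;
        unsigned key;
        unsigned val;
};

static struct entry slots[SLOTS];       /* slots[0] is the most recently used */

/* stand-in for an expensive computation whose result depends only on x */
static unsigned slow_f(unsigned x)
{
        unsigned r = x;
        int i;

        for (i = 0; i < 1000000; i++)
                r = r * 1664525u + 1013904223u;
        return r;
}

static unsigned cached_f(unsigned x)
{
        struct entry hit;
        int i;

        for (i = 0; i < SLOTS; i++) {
                if (slots[i].valid && slots[i].key == x) {
                        /* hit: move the entry to the front (most recent) */
                        hit = slots[i];
                        memmove(&slots[1], &slots[0], i * sizeof(slots[0]));
                        slots[0] = hit;
                        return hit.val;
                }
        }

        /* miss: evict the least recently used entry (the last slot) */
        memmove(&slots[1], &slots[0], (SLOTS - 1) * sizeof(slots[0]));
        slots[0].valid = 1;
        slots[0].key = x;
        slots[0].val = slow_f(x);
        return slots[0].val;
}

int main(void)
{
        printf("%u\n", cached_f(42));   /* first call: computed the slow way */
        printf("%u\n", cached_f(42));   /* second call: served from the cache */
        return 0;
}

[With only a handful of slots, a linear scan plus move-to-front is enough
to keep LRU order; a larger cache would index by a hash of the key
instead of scanning, much like a hardware TLB does.]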