* Some words of encouragement
@ 2012-02-24 17:26 Brad Normand
  2012-02-24 18:22 ` Jody Bruchon
  0 siblings, 1 reply; 13+ messages in thread
From: Brad Normand @ 2012-02-24 17:26 UTC
  To: linux-8086

I've been watching this list for a while and I'm happy development is
continuing!  The last time I tried playing around with ELKS, I
couldn't even get things to compile, and had I succeeded, I got the
feeling I wouldn't be too impressed.

In case anyone's interested, my wishlist:

Support for 100% BIOS I/O (eventually I'd like to run this on
non-standard hardware, Sanyo MBC-55x, so any in/out opcode and
interrupt hooking will fail to work as intended).  Where necessary,
BIOS RAM can be probed and this shouldn't be a big problem.

Support for reserving some low memory area (again due to hardware constraints).

Some method to transfer data to/from FAT12.

And, a question... how feasible would it be to run anything with 256KB
RAM?  I was tempted to reply to the EMS code message because to make a
useful ELKS machine out of these, I may have to design a memory
expansion board, and I've got lots of 4MB SIMMs lying around.
Otherwise without kernel bankswitching support (or through EMS calls)
I believe I can do 960KB.


I also have a strange memory management idea that may be
interesting...  Since 8086 is a 64KB segment oriented machine and you
are already bumping up against the limitations of that, and there is
talk of changing compilers...  how crazy would it be to design a new
segmentation ABI to allow relocation of segments and still allow
programs to use more than 64KB?

I'm thinking something along the lines of, instead of using mov DS,AX
type instructions, these are changed to an OS call, so that the kernel
knows what segments the userspace is using, and when it has to
relocate data, it can simply rewrite the stored CS,DS,ES,SS of the
userspace, and next time the userspace needs to switch segments, it
asks the OS to do the dirty work.  Managing >64KB allocations is
possible too but basically the program needs to request a 64KB window
into the data at a time.  This would probably be pretty directly
applicable to a bankswitching scheme as a poor man's MMU.

Sorry for the craziness but I've spent many hours pondering this stuff
in the past...  if there's interest I'd be happy to put together a
specification of sorts.


* Re: Some words of encouragement
  2012-02-24 17:26 Some words of encouragement Brad Normand
@ 2012-02-24 18:22 ` Jody Bruchon
  2012-02-25  0:54   ` David Given
  2012-02-25  5:12   ` Brad Normand
  0 siblings, 2 replies; 13+ messages in thread
From: Jody Bruchon @ 2012-02-24 18:22 UTC
  To: ELKS (linux-8086)

On 02/24/12 12:26, Brad Normand wrote:
> I've been watching this list for a while and I'm happy development is
> continuing!  The last time I tried playing around with ELKS, I
> couldn't even get things to compile, and had I succeeded, I got the
> feeling I wouldn't be too impressed.
You probably still won't be impressed. ELKS in my QEMU frequently goes 
into an endless loop or otherwise stops working properly. I'll work out 
the problem eventually, but in the meantime I'm trying to clean up the 
code. ELKS really needs to be more comment-heavy in the code. I'm tired 
of seeing one-letter variables everywhere and not being fully sure what 
they're up to.
> In case anyone's interested, my wishlist:
>
> Support for 100% BIOS I/O (eventually I'd like to run this on
> non-standard hardware, Sanyo MBC-55x, so any in/out opcode and
> interrupt hooking will fail to work as intended).  Where necessary,
> BIOS RAM can be probed and this shouldn't be a big problem.
I think there's some degree of this already. I know ELKS can be built to 
not disable IRQs at boot, and uses BIOS calls for most disk I/O anyway.
> Support for reserving some low memory area (again due to hardware constraints).
That shouldn't be hard.
> Some method to transfer data to/from FAT12.
Planned for the future, for sure. The FAT filesystem options in ELKS 
were "unimplemented features" so I stripped them out of the build 
process entirely until a driver actually exists.
> And, a question... how feasible would it be to run anything with 256KB
> RAM?  I was tempted to reply to the EMS code message because to make a
> useful ELKS machine out of these, I may have to design a memory
> expansion board, and I've got lots of 4MB SIMMs lying around.
> Otherwise without kernel bankswitching support (or through EMS calls)
> I believe I can do 960KB.
The problem with EMS code is that ALL EMS boards are different, and the 
LIM EMS specification is useless since it relies on the vendor-supplied 
DOS driver to work. We can write an EMS driver if we have the hardware 
and can figure out how it works, or if you make the hardware yourself as 
you suggest. EMS is extremely hard to find (particularly at a sane 
price) these days so I blew the whole idea off.
> I also have a strange memory management idea that may be
> interesting...  Since 8086 is a 64KB segment oriented machine and you
> are already bumping up against the limitations of that, and there is
> talk of changing compilers...  how crazy would it be to design a new
> segmentation ABI to allow relocation of segments and still allow
> programs to use more than 64KB?
Manpower is always the issue. My thoughts (which I have not shared until 
now) were the same as what I was doing with my 6502 operating system 
that I've written: each program has a set of relocation tables that are 
just a list of offsets to absolute pointers (far pointers on 8086, long 
addresses on other platforms we might support later like 65816) and when 
the application image is loaded into memory, the relocation tables are 
retained in memory as well. If we needed to shuffle code elsewhere in 
memory for whatever reason (EMS, swap, defragmenting memory, loading a 
distinctly separate instance of a program already being executed, 
something to do while setting the building on fire) it would be a simple 
matter of enhancing the memory manager to intelligently rewrite 
relocations and tweak the stored task processor context to match the 
changes. This could even potentially be used to move data that is 
malloc()ed around if it was done carefully. But, of course, we come back 
to the issue of manpower once more. Who is going to write the compiler 
code that generates these new executables, the ELKS code that handles 
the new executable format, etc.? From what I understand, a lot of the 
people with sufficient compiler knowledge bailed on ELKS some time ago, 
so making this change would be an uphill battle.
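
For illustration, here is a minimal C sketch of the relocation-table idea described above. It is a sketch under stated assumptions, not ELKS code: the names far_ptr and relocate_image are hypothetical, the program image is modelled as a plain byte array, and the saved task context (which would also need its segment registers updated) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical far pointer as stored inside a program image:
 * 16-bit offset followed by 16-bit segment (the 8086 in-memory layout). */
typedef struct {
    uint16_t offset;
    uint16_t segment;
} far_ptr;

/* Patch every far pointer named by the relocation table after the image
 * has been moved from old_seg to new_seg. reloc[] holds byte offsets into
 * the image at which far pointers are stored. Only pointers that referred
 * to the moved segment are rewritten. Returns the number patched. */
size_t relocate_image(uint8_t *image, size_t image_len,
                      const uint16_t *reloc, size_t nreloc,
                      uint16_t old_seg, uint16_t new_seg)
{
    size_t patched = 0;
    for (size_t i = 0; i < nreloc; i++) {
        far_ptr p;
        assert((size_t)reloc[i] + sizeof p <= image_len);
        memcpy(&p, image + reloc[i], sizeof p);  /* avoids unaligned access */
        if (p.segment == old_seg) {
            p.segment = new_seg;
            memcpy(image + reloc[i], &p, sizeof p);
            patched++;
        }
    }
    return patched;
}
```

The table has to stay in memory for as long as the program runs, which is the memory cost of this approach.
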
> I'm thinking something along the lines of, instead of using mov DS,AX
> type instructions, these are changed to an OS call, so that the kernel
> knows what segments the userspace is using, and when it has to
> relocate data, it can simply rewrite the stored CS,DS,ES,SS of the
> userspace, and next time the userspace needs to switch segments, it
> asks the OS to do the dirty work.  Managing >64KB allocations is
> possible too but basically the program needs to request a 64KB window
> into the data at a time.  This would probably be pretty directly
> applicable to a bankswitching scheme as a poor man's MMU.
That sounds plausible, but I'm guessing bcc doesn't generate MOV DS,AX 
anyway, since it apparently restricts everything to single segments...
> Sorry for the craziness but I've spent many hours pondering this stuff
> in the past...  if there's interest I'd be happy to put together a
> specification of sorts.
Right now, we need to fix the most egregious bugs in the code, 
specifically the ones that cause the system to hang, lock up, get stuck 
at a looping "login:" or other crashy behavior. I suspect that bcc is 
partially responsible for some of these problems, given that it 
generates lackluster code and the bcc source itself is very poorly 
documented, making changes to it dicey. Once again, I would love to 
switch compilers, but while SDCC is apparently the most promising 
option, it also doesn't support the 8086 (unless I missed something when 
I looked at it), so that's a dead end unless someone wants to port SDCC 
to 8086 first. I'm thinking that the only real solution is going to be 
to either massively retool bcc, or to write a compiler. I'm starting to 
see why ELKS was abandoned; few people exist that want to do these big 
jobs AND possess the requisite experience to actually do it.

Jody Bruchon


* Re: Some words of encouragement
  2012-02-24 18:22 ` Jody Bruchon
@ 2012-02-25  0:54   ` David Given
  2012-02-25  6:27     ` Brad Normand
  2012-02-25  5:12   ` Brad Normand
  1 sibling, 1 reply; 13+ messages in thread
From: David Given @ 2012-02-25  0:54 UTC
  To: ELKS (linux-8086)

On 24/02/12 18:22, Jody Bruchon wrote:
> On 02/24/12 12:26, Brad Normand wrote:
[...]
>> In case anyone's interested, my wishlist:

FWIW, I believe Minix 2 meets at least some of your requirements. If
you're interested in tiny Unixalikes, it's well worth checking out:

http://www.minix3.org/previous-versions/Intel-2.0.4/

(Although if you really want to run it on an XT, you'll need 2.0.3.
640kB isn't quite enough for 2.0.4.)

[...]
> If we needed to shuffle code elsewhere in
> memory for whatever reason (EMS, swap, defragmenting memory, loading a
> distinctly separate instance of a program already being executed,
> something to do while setting the building on fire) it would be a simple
> matter of enhancing the memory manager to intelligently rewrite
> relocations and tweak the stored task processor context to match the
> changes.

As it stands this will only work on the text segment, of course. We
don't know the layout of the data segment, so are unable to relocate
pointers in it. (Because we don't know what addresses contain pointers.)
This is where the half-baked 8086 'MMU' actually comes in handy; all
memory accesses are relative to SS, DS or CS, so moving stuff around in
memory is trivial.

If you don't have segments, and are on a flat memory architecture, then
there are two approaches:

(a) don't relocate data, just relocate text.
(b) fake segments by reserving a register to point at the data segment
base, and then do indirect accesses from that.

I have a faint memory that Minix 68k does (b). But it's slow, of
course, as you need to make sure that only relative addresses are stored
in memory, because storing absolute addresses would break relocation,
and you need some way to distinguish between absolute and relative
addresses in registers...

My opinion is that the 8086 is so limited and so intrinsically slow and,
well, crap, that trying to handle anything bigger than 64kB code + 64kB
data isn't worth the hassle.

>> Managing >64KB allocations is
>> possible too but basically the program needs to request a 64KB window
>> into the data at a time.  This would probably be pretty directly
>> applicable to a bankswitching scheme as a poor man's MMU.

It would be relatively trivial to add the ability to allocate extra
chunks of data which are then accessed via read/write-like functions.

extern int mmapalloc(int size); // returns handle to new block
extern void mmapfree(int handle); // releases the block
extern void mmapread(int handle, void* buffer, int offset, int length);
extern void mmapwrite(int handle, void* buffer, int offset, int length);

mmapalloc() and mmapfree() would allocate and free a segment. mmapread() and
mmapwrite() would be tiny little pieces of code that would just load the
segment descriptor into ES and do a bulk copy with REP MOVS or something.
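
A rough host-side sketch of how such an interface could behave, for illustration only. The assumptions: each handle indexes a small table, ordinary heap blocks stand in for the extra segments, memcpy() stands in for the REP-prefixed copy, mmapfree() takes the handle, and the MAX_BLOCKS limit is arbitrary. None of this is existing ELKS code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_BLOCKS 8   /* arbitrary limit for this sketch */

/* On a real 8086 each entry would hold a segment value, not a host pointer. */
static struct { unsigned char *base; int size; } blocks[MAX_BLOCKS];

int mmapalloc(int size)            /* returns a handle, or -1 on failure */
{
    if (size <= 0)
        return -1;
    for (int h = 0; h < MAX_BLOCKS; h++) {
        if (blocks[h].base == NULL) {
            blocks[h].base = calloc(1, (size_t)size);
            if (blocks[h].base == NULL)
                return -1;
            blocks[h].size = size;
            return h;
        }
    }
    return -1;                     /* table full */
}

void mmapfree(int handle)
{
    assert(handle >= 0 && handle < MAX_BLOCKS);
    free(blocks[handle].base);
    blocks[handle].base = NULL;
    blocks[handle].size = 0;
}

/* Copy length bytes out of the block, starting at offset. */
void mmapread(int handle, void *buffer, int offset, int length)
{
    assert(handle >= 0 && handle < MAX_BLOCKS && blocks[handle].base != NULL);
    assert(offset >= 0 && length >= 0 && offset + length <= blocks[handle].size);
    memcpy(buffer, blocks[handle].base + offset, (size_t)length);
}

/* Copy length bytes into the block, starting at offset. */
void mmapwrite(int handle, const void *buffer, int offset, int length)
{
    assert(handle >= 0 && handle < MAX_BLOCKS && blocks[handle].base != NULL);
    assert(offset >= 0 && length >= 0 && offset + length <= blocks[handle].size);
    memcpy(blocks[handle].base + offset, buffer, (size_t)length);
}
```

The caller never holds a pointer into the extra block, only a handle, so the kernel would stay free to move or swap the block between calls.
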

That might be useful for certain specialist operations but, again, if
you need that much memory, you probably shouldn't be using an 8086...

[...]
> Once again, I would love to
> switch compilers, but while SDCC is apparently the most promising
> option, it also doesn't support the 8086 (unless I missed something when
> I looked at it), so that's a dead end unless someone wants to port SDCC
> to 8086 first.

I do actually have Watcom *almost* generating ELKS executables. No libc,
of course, I'm simply fighting the linker (which is deeply bizarre), but
it is getting there.

-- 
┌─── dg@cowlark.com ───── http://www.cowlark.com ─────
│
│ "Never attribute to malice what can be adequately explained by
│ stupidity." --- Nick Diamos (Hanlon's Razor)



* Re: Some words of encouragement
  2012-02-24 18:22 ` Jody Bruchon
  2012-02-25  0:54   ` David Given
@ 2012-02-25  5:12   ` Brad Normand
  1 sibling, 0 replies; 13+ messages in thread
From: Brad Normand @ 2012-02-25  5:12 UTC
  To: ELKS (linux-8086)

>> Some method to transfer data to/from FAT12.
>
> Planned for the future, for sure. The FAT filesystem options in ELKS were
> "unimplemented features" so I stripped them out of the build process
> entirely until a driver actually exists.

It was mostly an idea for me to get around modifying my fat12 boot
sector into a minixfs bootsector.  But if minixfs is better for a root
filesystem then maybe that's what I should do anyway.

> The problem with EMS code is that ALL EMS boards are different, and the LIM
> EMS specification is useless since it relies on the vendor-supplied DOS
> driver to work.

Well, it could be BIOS supplied (perhaps an option ROM), or otherwise
loaded before ELKS (eg, into high conventional memory).  The API
doesn't rely on DOS to my knowledge.  My system uses a mostly RAM
resident BIOS right now so all those scenarios would be about
equivalent.  Anyway, that's a way down the road thing, but in case
there were design decisions now that could impact it, I figured I'd
throw in my vote.

> Right now, we need to fix the most egregious bugs in the code, specifically
> the ones that cause the system to hang, lock up, get stuck at a looping
> "login:" or other crashy behavior. I suspect that bcc is partially
> responsible for some of these problems, given that it generates lackluster
> code and the bcc source itself is very poorly documented, making changes to
> it dicey. Once again, I would love to switch compilers, but while SDCC is
> apparently the most promising option, it also doesn't support the 8086
> (unless I missed something when I looked at it), so that's a dead end unless
> someone wants to port SDCC to 8086 first. I'm thinking that the only real
> solution is going to be to either massively retool bcc, or to write a
> compiler. I'm starting to see why ELKS was abandoned; few people exist that
> want to do these big jobs AND possess the requisite experience to actually
> do it.

Maybe I should start looking at SDCC and see how hard it is to get
into.  I'd be a little out of my comfort zone though, I'm mostly just
used to java and 8086/PIC assembly.  But maybe it won't be as bad as I
expect.  If I want to experiment with bizarre segmentation schemes,
there will probably be a lot of do-it-yourself involved.


* Re: Some words of encouragement
  2012-02-25  0:54   ` David Given
@ 2012-02-25  6:27     ` Brad Normand
  2012-02-25 12:53       ` David Given
  0 siblings, 1 reply; 13+ messages in thread
From: Brad Normand @ 2012-02-25  6:27 UTC
  To: ELKS (linux-8086)

> FWIW, I believe Minix 2 meets at least some of your requirements. If
> you're interested in tiny Unixalikes, it's well worth checking out:
>
> http://www.minix3.org/previous-versions/Intel-2.0.4/
>
> (Although if you really want to run it on an XT, you'll need 2.0.3.
> 640kB isn't quite enough for 2.0.4.)

Available memory was one issue (256KB is all you can easily count on,
512KB isn't stock, and more than that was generally a custom card).
But another problem is part of video memory needs to be located
somewhere in the first 256KB.  Maybe it's possible to make work but it
just doesn't feel right.

FreeDOS was pretty trivially easy to make run - all it needed was a
base/load address modification and it was good to go.  Well, that and
writing a BIOS (or ripping Sanyo's from the DOS 2.11 that came with
it, but that's not my style, except for the horrible, horrible little
floppy data transfer engine that arises from the lack of DMA).

> As it stands this will only work on the text segment, of course. We
> don't know the layout of the data segment, so are unable to relocate
> pointers in it. (Because we don't know what addresses contain pointers.)
> This is where the half-baked 8086 'MMU' actually comes in handy; all
> memory accesses are relative to SS, DS or CS, so moving stuff around in
> memory is trivial.

I keep forgetting this stuff is coming from C that assumes it can do
sane pointer arithmetic.  Some parts of my scheme were pretty much
assuming hand coded assembly with procedures written around the
windowing scheme.  (I was envisioning a sort of java-like programming
language where segments are selected when crossing "class" boundaries,
so it would be unlikely for a 64KB segment to be filled, except by
really bad coding techniques).  But I feel my chances of implementing
something like that are slim.  But if I change it to the idea of
implementing a target for a "plain" compiler...

Anything that targets "normal" x86-16 with bigger than 64KB data
chunks needs to understand arithmetic on far addresses, and I wonder
how hard it would be to do some sort of virtual linearization here,
basically instead of a far pointer being effectively 20 bits of
address inside 32 bits of storage, just make them flat 32 bits.  Then,
calculating the segment address is either shifting around for the
"normal" case of 1MB space where the address really just is 20 bits,
or could be converted to an ABI call (some range of interrupt numbers?)
for the OS to prepare memory and provide a segment address.  The
pointer arithmetic would reduce to plain 32 bit operations, but
dereferencing an address would be significantly more complex.  It
would be easy to make a compiler generate code, but would probably be
challenging to make a compiler generate fast code.
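
The "shifting around" for the plain 1MB case can be sketched in C as follows. The names seg_off, linear_to_far and far_to_linear are made up for this example, and the ABI-call path for addresses beyond 1MB is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical segment:offset pair as it would sit in DS:SI or ES:DI. */
typedef struct {
    uint16_t segment;
    uint16_t offset;
} seg_off;

/* Flat address below 1MB -> normalized segment:offset (offset in 0..15). */
seg_off linear_to_far(uint32_t linear)
{
    seg_off p;
    assert(linear < 0x100000UL);          /* only the real-mode case here */
    p.segment = (uint16_t)(linear >> 4);
    p.offset  = (uint16_t)(linear & 0xF);
    return p;
}

/* segment:offset -> flat address. The 8086 has 20 address lines, so the
 * sum wraps at 1MB. */
uint32_t far_to_linear(seg_off p)
{
    return (((uint32_t)p.segment << 4) + p.offset) & 0xFFFFFUL;
}
```

With flat pointers stored this way, adding to a pointer is an ordinary 32-bit add, and the conversion cost is paid only when the pointer is dereferenced.
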

> My opinion is that the 8086 is so limited and so intrinsically slow and,
> well, crap, that trying to handle anything bigger than 64kB code + 64kB
> data isn't worth the hassle.

Definitely a fair point.  My best counterpoint is that lots of these
earlier machines were built like tanks and I suspect they will still
be operable many decades from now.  But the less they're capable of,
the less chance they will be operating.

> It would be relatively trivial to add the ability to allocate extra
> chunks of data which are then accessed via read/write-like functions.
>
> extern int mmapalloc(int size); // returns handle to new block
> extern void mmapfree(int handle); // releases the block
> extern void mmapread(int handle, void* buffer, int offset, int length);
> extern void mmapwrite(int handle, void* buffer, int offset, int length);
>
> mmapalloc() and mmapfree() would allocate and free a segment. mmapread() and
> mmapwrite() would be tiny little pieces of code that would just load the
> segment descriptor into ES and do a bulk copy with REP MOVS or something.
>

There are lots of options for special applications to get access to
extra memory, but lots of programs of interest wouldn't be compatible
with this.  Porting things would be (even more of) a nightmare.

> I do actually have Watcom *almost* generating ELKS executables. No libc,
> of course, I'm simply fighting the linker (which is deeply bizarre), but
> it is getting there.

This sounds like it'd be a good way to help ferret out bcc compiler
bugs or bypass them entirely, plus from what I've read, Watcom is one
of the better compilers to target 16 bit x86 in general.  There may be
some hints in FreeDOS, as openwatcom is one of the supported compilers
for the kernel there.


* Re: Some words of encouragement
  2012-02-25  6:27     ` Brad Normand
@ 2012-02-25 12:53       ` David Given
  2012-02-25 13:13         ` David Given
  0 siblings, 1 reply; 13+ messages in thread
From: David Given @ 2012-02-25 12:53 UTC
  To: ELKS (linux-8086)

On 25/02/12 06:27, Brad Normand wrote:
[...]
> Anything that targets "normal" x86-16 with bigger than 64KB data
> chunks needs to understand arithmetic on far addresses, and I wonder
> how hard it would be to do some sort of virtual linearization here,
> basically instead of a far pointer being effectively 20 bits of
> address inside 32 bits of storage, just make them flat 32 bits.

Well, Watcom does support this --- it's called HUGE model.

Here's a simple memcpy implementation:

void copy(char* dest, const char* src, int length)
{
	while (length--)
		*dest++ = *src++;
}

Here's the small model version:

0000                          @copy:
0000    56                        push        si
0001    57                        push        di
0002    89 C7                     mov         di,ax
0004    89 D6                     mov         si,dx
0006                          L$1:
0006    4B                        dec         bx
0007    83 FB FF                  cmp         bx,0xffff
000A    74 08                     je          L$2
000C    8A 04                     mov         al,byte ptr [si]
000E    88 05                     mov         byte ptr [di],al
0010    46                        inc         si
0011    47                        inc         di
0012    EB F2                     jmp         L$1
0014                          L$2:
0014    5F                        pop         di
0015    5E                        pop         si
0016    C3                        ret

And here's the huge model version:

0000                          @copy:
0000    56                        push        si
0001    57                        push        di
0002    55                        push        bp
0003    89 E5                     mov         bp,sp
0005    83 EC 02                  sub         sp,0x0002
0008    C4 7E 0A                  les         di,dword ptr 0xa[bp]
000B    C5 76 0E                  lds         si,dword ptr 0xe[bp]
000E    89 46 FE                  mov         word ptr -0x2[bp],ax
0011                          L$1:
0011    FF 4E FE                  dec         word ptr -0x2[bp]
0014    83 7E FE FF               cmp         word ptr -0x2[bp],0xffff
0018    74 2B                     je          L$2
001A    8A 04                     mov         al,byte ptr [si]
001C    26 88 05                  mov         byte ptr es:[di],al
001F    89 F0                     mov         ax,si
0021    8C DA                     mov         dx,ds
0023    BB 01 00                  mov         bx,0x0001
0026    31 C9                     xor         cx,cx
0028    9A 00 00 00 00            call        __PIA
002D    89 C6                     mov         si,ax
002F    8E DA                     mov         ds,dx
0031    89 F8                     mov         ax,di
0033    8C C2                     mov         dx,es
0035    BB 01 00                  mov         bx,0x0001
0038    31 C9                     xor         cx,cx
003A    9A 00 00 00 00            call        __PIA
003F    89 C7                     mov         di,ax
0041    8E C2                     mov         es,dx
0043    EB CC                     jmp         L$1
0045                          L$2:
0045    89 EC                     mov         sp,bp
0047    5D                        pop         bp
0048    5F                        pop         di
0049    5E                        pop         si
004A    CA 08 00                  retf        0x0008

So more than three times the size (77 bytes against 23) *and* it's
having to call off to an external routine to do pointer arithmetic. But
you do get standard 32-bit pointer semantics with arbitrary sized data
structures.

There's a compromise, large model, where the programmer promises that no
single data structure is bigger than 64kB. This means that it can
represent any pointer as a segment+offset pair, and do sane pointer
arithmetic with just the offset, which is much cheaper; the large model
version of the above is only 30 bytes.
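
The difference between the two kinds of pointer arithmetic can be sketched in C. This is not Watcom's __PIA routine, whose internals are not shown here; it only illustrates the sort of work a huge-model helper has to do. The names huge_ptr, large_add and huge_add are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical segment:offset pointer representation. */
typedef struct {
    uint16_t segment;
    uint16_t offset;
} huge_ptr;

/* Large-model style: only the offset moves. Cheap, but it wraps silently
 * at 64kB, hence the promise that no object is bigger than that. */
huge_ptr large_add(huge_ptr p, uint16_t delta)
{
    p.offset = (uint16_t)(p.offset + delta);
    return p;
}

/* Huge-model style: go through the 20-bit linear address and renormalize,
 * so a carry out of the offset moves into the segment. */
huge_ptr huge_add(huge_ptr p, int32_t delta)
{
    uint32_t linear = ((uint32_t)p.segment << 4) + p.offset;
    linear = (linear + (uint32_t)delta) & 0xFFFFFUL;   /* wrap at 1MB */
    p.segment = (uint16_t)(linear >> 4);
    p.offset  = (uint16_t)(linear & 0xF);
    assert(p.offset < 16);                             /* normalized form */
    return p;
}
```

Stepping one byte past 1000:FFFF gives 1000:0000 with large_add() but 2000:0000 with huge_add(), which is the behaviour the extra code in the huge listing pays for.
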

[...]
> This sounds like it'd be a good way to help ferret out bcc compiler
> bugs or bypass them entirely, plus from what I've read, Watcom is one
> of the better compilers to target 16 bit x86 in general.  There may be
> some hints in FreeDOS, as openwatcom is one of the supported compilers
> for the kernel there.

Unfortunately it seems that OMF object files can't represent pointer
differences. Which means I can't do this to emit the ELKS executable header:

	dw __tend, 0           ; size of text segment in bytes
	dw _edata, 0           ; size of data segment in bytes
	dw _end - _edata, 0    ; size of bss segment in bytes
	dw _cstart_, 0         ; entry point
	dw 65535, 0            ; chmem
	dw 0, 0                ; size of symbol table

The '_end - _edata' is silently accepted by wasm but evaluates to 0.
Which is nice. nasm was more informative (and has a saner syntax; I'd
forgotten how loathsome masm syntax is).

I'm now thinking that the sanest way to go here is (a) hack Watcom to
support ELKS executables directly; (b) write a tool to disassemble the
OMF output and convert it to as86-compatible format; (c) give up and go
to the pub...

-- 
┌─── dg@cowlark.com ───── http://www.cowlark.com ─────
│
│ "Never attribute to malice what can be adequately explained by
│ stupidity." --- Nick Diamos (Hanlon's Razor)



* Re: Some words of encouragement
  2012-02-25 12:53       ` David Given
@ 2012-02-25 13:13         ` David Given
  2012-02-25 18:04           ` Brad Normand
  0 siblings, 1 reply; 13+ messages in thread
From: David Given @ 2012-02-25 13:13 UTC
  To: ELKS (linux-8086)

On 25/02/12 12:53, David Given wrote:
[...]
> I'm now thinking that the sanest way to go here is (a) hack Watcom to
> support ELKS executables directly; (b) write a tool to disassemble the
> OMF output and convert it to as86-compatible format; (c) give up and go
> to the pub...

Actually, I forgot option (d): change ELKS. It occurs to me that the
kernel doesn't actually need to know the BSS size. All it's doing is
allocating a data segment the size of the chmem field, loading the
preinit data from the file, and zero-initialising the rest. So all it
needs is the preinit data size, which we have, and the chmem size, which
we have. So if we're willing to decree that the kernel ignores the BSS
size field, we're good to go...

...in fact, looking at fs/exec.c, I reckon that it should work as is.
The only place the BSS size field is used in an old-style executable is
during the call to memset, where 0 is safe.
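
As a sketch of why the BSS size is redundant, here is the loading step described above written as a standalone C function. It is not the code in fs/exec.c; load_data_segment is a hypothetical name and malloc() stands in for the kernel's segment allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Build a data segment of chmem bytes: copy the preinitialised data from
 * the executable, then zero everything after it. The BSS size is never
 * consulted. Returns NULL if the data does not fit or allocation fails. */
unsigned char *load_data_segment(const unsigned char *file_data,
                                 size_t data_size, size_t chmem)
{
    unsigned char *seg;

    assert(file_data != NULL || data_size == 0);
    if (chmem == 0 || data_size > chmem)
        return NULL;
    seg = malloc(chmem);
    if (seg == NULL)
        return NULL;
    if (data_size > 0)
        memcpy(seg, file_data, data_size);            /* preinit data */
    memset(seg + data_size, 0, chmem - data_size);    /* bss, heap, stack */
    return seg;
}
```

Everything past the preinitialised data is zeroed in one go, so a BSS field of 0 in the header changes nothing in this scheme.
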

-- 
┌─── dg@cowlark.com ───── http://www.cowlark.com ─────
│
│ "Never attribute to malice what can be adequately explained by
│ stupidity." --- Nick Diamos (Hanlon's Razor)



* Re: Some words of encouragement
  2012-02-25 13:13         ` David Given
@ 2012-02-25 18:04           ` Brad Normand
  2012-02-25 19:09             ` David Given
  0 siblings, 1 reply; 13+ messages in thread
From: Brad Normand @ 2012-02-25 18:04 UTC
  To: linux-8086

I'm with you on the small model, and I did a little bit of programming
working with an equivalent of large model (user data storage
structures on an x86 graphing calculator), basically you normalize
your address registers every time you move to the next object.

This huge model code seems a little different than I was thinking of.
It's still using basic 20 bit effective far pointers (20+12i bit?) and
not 32 bit flat pointers in the code.  Here's where it prevents
MMU-like behavior: if there's no pointer arithmetic at all, no
external calls have to be made, even when accessing data.

What I was thinking of, is pointers are stored as flat 32 bit (4GB)
pointers, and when code uses a pointer, it can do its own arithmetic
in 32 bits, or it can make the kernel translate the address into a
DS:SI or ES:DI combination (behind the scenes, swapping in the 64KB
this makes visible if needed).  Now the program has a pointer loaded
in a "register" (DS/SI) and it can perform 16 bit arithmetic on it,
until SI overflows/underflows, then it has to ask the OS to fix the
pointer for it again.  For bigger than 16 bit operations on a pointer,
there would be calls like the huge code above used.  It wouldn't be
legal to ever load or store from segment registers directly, but once
they're loaded, anything you can logically reference by changing the
index is fair game.

That covers sequential memcpy-like situations, another situation is
random pointer access (eg, following a linked list or binary tree).
The last pointer's 32 bit representation could be compared against the
current 32 bit pointer to find the offset from the last DS:SI, then
either add the offset onto SI or go to the OS provided pointer
manipulation if it's outside the limits.  This is much less ideal once
the data size gets larger than 64KB because the window "hit rate" will
drop drastically.  But it's better than the situation for >~512KB of
not running at all.

Stack > 64KB could be done with the same logic if it's assumed that no
single stack frame is bigger than 64KB (in reality probably only
recursive calls have to care to check), and similar for code if no
single function (or linked object?) is larger than 64KB, but these
would add performance overheads as well.

Finally, there are some "difficulty of working implementation"
upsides:  using the ABI "for real" has a lower barrier to entry if
every pointer is kernel-dereferenced on every use.  Slow as heck, but
it should be easy to implement if the compiler is trained to do it.
It's just a problem of optimizing after that (this was probably the
most loaded statement in the whole message).  Plus, it degrades
gracefully into the small model (or some kind of "medium" with
separate 64KB data, code, and stack) for simple programs, and probably
existing compilers will do a good job here, maybe with some init
hacks.

tl;dr:  I'd love me some nethack running on 8088.


* Re: Some words of encouragement
  2012-02-25 18:04           ` Brad Normand
@ 2012-02-25 19:09             ` David Given
  2012-02-25 20:29               ` Brad Normand
  0 siblings, 1 reply; 13+ messages in thread
From: David Given @ 2012-02-25 19:09 UTC
  To: linux-8086

On 25/02/12 18:04, Brad Normand wrote:
[...]
> What I was thinking of, is pointers are stored as flat 32 bit (4GB)
> pointers, and when code uses a pointer, it can do its own arithmetic
> in 32 bits, or it can make the kernel translate the address into a
> DS:SI or ES:DI combination (behind the scenes, swapping in the 64KB
> this makes visible if needed).

When you say 4GB pointers, you mean physical addresses? It sounds
feasible, but I've never heard of anybody doing it.

This would give you fast pointer arithmetic at the expense of much
slower indirection, due to having to frequently reload ds or es; I'd
imagine that there's more indirection than pointer arithmetic in the
average C program.

Hmm... the compiler would be able to optimise constant indirections
(a->b), but not non-constant indirections (a[i]). The first case would
turn into a ds:[si+$n] indirection, but the latter case would have to do
explicit pointer arithmetic on the physical address a and then reload ds.
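
A minimal sketch in C of the translation discussed above, assuming the flat pointer is a 20-bit physical address on a plain 8086 (the 4GB virtual space floated in the thread is not modelled). The names `far_ptr`, `flat_to_far`, `field_address` and `element_address` are made up for illustration and are not part of ELKS or any compiler. It shows why a constant field offset can ride on one segment load while a run-time index forces the segment to be recomputed:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical segment:offset pair, as it would sit in DS:SI or ES:DI. */
typedef struct {
    uint16_t seg;
    uint16_t off;
} far_ptr;

/* Split a flat address into a normalised segment:offset pair.
 * Assumes a 20-bit physical address (plain 8086, no bank switching). */
far_ptr flat_to_far(uint32_t flat)
{
    far_ptr p;
    assert(flat <= 0xFFFFFUL);        /* must fit in the 1MB real-mode space */
    p.seg = (uint16_t)(flat >> 4);    /* segment = address / 16 */
    p.off = (uint16_t)(flat & 0xF);   /* offset  = address % 16, so 0..15 */
    return p;
}

/* Recombine the way the CPU does: seg * 16 + off. */
uint32_t far_to_flat(far_ptr p)
{
    return ((uint32_t)p.seg << 4) + p.off;
}

/* a->b: the field offset n is a compile-time constant, so one segment
 * load covers it (n <= 0xFFF0 assumed, so off + n cannot wrap). */
uint32_t field_address(uint32_t a, uint16_t n)
{
    far_ptr p = flat_to_far(a);       /* one segment reload */
    p.off = (uint16_t)(p.off + n);    /* ds:[si+n], no further reload */
    return far_to_flat(p);
}

/* a[i]: the index is only known at run time, so the arithmetic is done
 * on the flat address first and the segment is recomputed afterwards. */
uint32_t element_address(uint32_t a, uint32_t i, uint32_t elem_size)
{
    uint32_t flat = a + i * elem_size;   /* 32-bit pointer arithmetic */
    far_ptr p = flat_to_far(flat);       /* segment reload on every access */
    return far_to_flat(p);
}
```

Normalising so that the offset is always 0..15 is just one possible choice; it maximises the room left for a constant displacement at the cost of changing the segment value on almost every pointer update.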

As you say, compiler support is the biggest issue. (You know what would
be nice? 8086 support for LLVM.)

Aha. I see that Watcom allows you to *mix* large and huge mode. This
would let you use large mode for most of your data, i.e. anything that's
referring to objects that are smaller than 64kB, and huge pointers for
anything bigger. Of course, this does require you to annotate your
pointers, but it gives you the speed of large mode plus limited
flexibility when you need large data objects.

> tl;dr:  I'd love me some nethack running on 8088.

Get the overlaid real-mode version from here:

http://www.nethack.org/v331/ports/download-msdos.html




PS. Please don't cc me if you're replying to the list!

-- 
┌─── dg@cowlark.com ───── http://www.cowlark.com ─────
│
│ "Never attribute to malice what can be adequately explained by
│ stupidity." --- Nick Diamos (Hanlon's Razor)



* Re: Some words of encouragement
  2012-02-25 19:09             ` David Given
@ 2012-02-25 20:29               ` Brad Normand
  2012-02-25 21:04                 ` David Given
  0 siblings, 1 reply; 13+ messages in thread
From: Brad Normand @ 2012-02-25 20:29 UTC (permalink / raw)
  To: linux-8086

> When you say 4GB pointers, you mean physical addresses? It sounds
> feasible, but I've never heard of anybody doing it.

Basically, yes.  It'd be a 4GB virtual address space, but could be
backed by a real 4GB RAM if there was appropriate hardware.  Otherwise
it could be backed by swap.

> This would give you fast pointer arithmetic at the expense of much
> slower indirection, due to having to frequently reload ds or es; I'd
> imagine that there's more indirection than pointer arithmetic in the
> average C program.

Yeah, the trick to getting good performance would be compiler
optimizations on the access strategies, with a harder limit of
really only having 2x64KB data windows in "view" at a time (without
counting CS and SS).  There wouldn't be many ways around that without
the compiler performing caching strategies which would be on a whole
new plane of insanity.  The real killer for large applications
wouldn't be the time for the kernel to adjust pointers, it'd be the
data moving if there isn't RAM available to back the used memory.

> Hmm... the compiler would be able to optimise constant indirections
> (a->b), but not non-constant indirections (a[i]). The first case would
> turn into a ds:[si+$n] indirection, but the latter case would have to do
> explicit pointer arithmetic on the physical address a and then reload ds.

Right, the optimization possible for a[i] type stuff would degrade
quickly after 64KB size is reached.  The same window could be reused
only if the next data is still within it, but if the compiler doesn't
know this ahead of time, there would be a penalty in checking if it's
still ok.  Some arithmetic could be done on the SI register but
external code would have to be invoked to correct an arithmetic
borrow/carry and fix the segment register/prepare data.
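
A rough sketch in C of the window-reuse check described above, under the assumption that addresses are flat 32-bit values and one 64KB window is tracked at a time. The `window` struct and the `remap` and `window_offset` functions are hypothetical; `remap` stands in for the external code that would fix up the segment register and prepare the data:

```c
#include <assert.h>
#include <stdint.h>

#define WINDOW_SIZE 0x10000UL   /* one 64KB segment */

/* Hypothetical state for one data window (whatever DS currently maps). */
typedef struct {
    uint32_t base;      /* flat address the window starts at */
    int      valid;     /* 0 until the first mapping */
    unsigned remaps;    /* how many times the slow path ran */
} window;

/* Stand-in for the external call that fixes the segment register. */
void remap(window *w, uint32_t flat)
{
    w->base = flat & ~(uint32_t)0xF;   /* segments start on 16-byte paragraphs */
    w->valid = 1;
    w->remaps++;
}

/* Return the 16-bit offset (what would go in SI) for a flat address,
 * reusing the current window when the address still falls inside it. */
uint16_t window_offset(window *w, uint32_t flat)
{
    /* This check is the penalty paid when the compiler cannot prove
     * ahead of time that the access stays inside the window. */
    if (!w->valid || flat < w->base || flat - w->base >= WINDOW_SIZE)
        remap(w, flat);
    assert(flat - w->base < WINDOW_SIZE);
    return (uint16_t)(flat - w->base);
}
```

Walking an array upward through this function only hits `remap` once per 64KB crossed, whereas an access just below the window base (the borrow case) remaps immediately.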

>> tl;dr:  I'd love me some nethack running on 8088.
>
> Get the overlaid real-mode version from here:
>
> http://www.nethack.org/v331/ports/download-msdos.html

[totally off topic]:

Well, either way I'd have to build a RAM board to do nethack on this
hardware.  I'm mostly trying to find a good reason to build a big RAM
board and have it be useful for more than a ramdisk or task-switching DOS
instances.  On this machine there is well over 512KB of address space
unused, so any chunk of the expansion memory could be mapped to each
of the 4 segments by only having the hardware decode upper addresses
based on 64KB mapping windows.

x86-16 address -> ram board address
mapping 0x4**** -> 0x0013****
mapping 0x5**** -> 0x0014****
DS=0x4367 would give a window into a 64KB that crosses ram board
mapping boundaries

The mapping hardware would just be a 4 address bit, 8 data bit RAM,
for up to 24 bit hardware addressing.  It starts to sound like a
non-standard 286 without memory protection.
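
The mapping scheme above can be modelled in a few lines of C. This is only a sketch of the address arithmetic, assuming the top 4 bits of the 20-bit x86 address index a 16-entry table whose 8-bit entries become bits 16..23 of the board address; `bank_map`, `board_address` and `x86_physical` are invented names, not a description of real hardware:

```c
#include <assert.h>
#include <stdint.h>

/* 16 entries (4 address bits in), 8 bits each: the upper byte of a
 * 24-bit RAM board address. Models the small mapping RAM. */
typedef struct {
    uint8_t page[16];
} bank_map;

/* Translate a 20-bit x86 physical address into a 24-bit board address. */
uint32_t board_address(const bank_map *m, uint32_t x86_addr)
{
    assert(x86_addr <= 0xFFFFFUL);
    uint32_t entry = x86_addr >> 16;        /* top 4 bits pick the table entry */
    uint32_t low   = x86_addr & 0xFFFFUL;   /* low 16 bits pass straight through */
    return ((uint32_t)m->page[entry] << 16) | low;
}

/* Physical address the CPU forms from segment:offset. */
uint32_t x86_physical(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + off) & 0xFFFFFUL;   /* 8086 wraps at 1MB */
}
```

With `page[4] = 0x13` and `page[5] = 0x14` as in the example, DS=0x4367 starts at board address 0x133670 and crosses into the next mapping entry at offset 0xC990. Because those two entries happen to be consecutive, the window still looks contiguous to the program; with non-adjacent entries it would not.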

I have an unexplainable thing for imagining up huge projects like this...

Sorry if this is sending you CC messages, it shouldn't be, but I have
to manually rewrite the address each time I reply.  I might have
forgotten.
--
To unsubscribe from this list: send the line "unsubscribe linux-8086" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Some words of encouragement
  2012-02-25 20:29               ` Brad Normand
@ 2012-02-25 21:04                 ` David Given
  2012-02-25 23:05                   ` Brad Normand
  0 siblings, 1 reply; 13+ messages in thread
From: David Given @ 2012-02-25 21:04 UTC (permalink / raw)
  To: linux-8086


On 25/02/12 20:29, Brad Normand wrote:
[...]
> Basically, yes.  It'd be a 4GB virtual address space, but could be
> backed by a real 4GB RAM if there was appropriate hardware.  Otherwise
> it could be backed by swap.

If you're going to simulate virtual memory you'll need to tell the OS
when you've *finished* using a segment, so that it can save it to disk
again.

Both GEOS and PalmOS did this. You'd allocate a block of memory from the
system and get a handle; to actually use the handle you'd have to lock
it, and then unlock it again when you were finished. If you changed the
contents you would have to dirty it.

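int handle = Allocate(100);   /* get a handle to a 100-byte block */
char* ptr = Lock(handle);     /* lock it so it can actually be used */
*ptr = 0;                     /* change the contents */
Dirty(handle);                /* tell the system the contents changed */
Unlock(handle);               /* finished with it */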

It worked very well. GEOS used it for both persistent storage and
virtual memory simulation. You could open a multimegabyte document and
the data would be invisibly loaded on demand; from the API point of view
you got a persistent heap in a file. Lovely system, died due to being
impossible to write code for and lousy marketing.

http://www.breadbox.com/

(Free evaluation version, requires DOS.)

Of course, they both used object handles rather than simulating a flat
address space.

-- 
┌─── dg@cowlark.com ───── http://www.cowlark.com ─────
│
│ "Never attribute to malice what can be adequately explained by
│ stupidity." --- Nick Diamos (Hanlon's Razor)



* Re: Some words of encouragement
  2012-02-25 21:04                 ` David Given
@ 2012-02-25 23:05                   ` Brad Normand
  2012-02-26  1:10                     ` zkry
  0 siblings, 1 reply; 13+ messages in thread
From: Brad Normand @ 2012-02-25 23:05 UTC (permalink / raw)
  To: linux-8086

>> Basically, yes.  It'd be a 4GB virtual address space, but could be
>> backed by a real 4GB RAM if there was appropriate hardware.  Otherwise
>> it could be backed by swap.
>
> If you're going to simulate virtual memory you'll need to tell the OS
> when you've *finished* using a segment, so that it can save it to disk
> again.

Not necessarily - it would help the OS make better decisions about
what to keep in memory and what needs to be written back to
swap/whatever, but an assumption could be made that the 4 mapped
segments are still in use until something else is mapped in, and any
RW segment that is mapped is automatically dirty.

I suppose the ABI design decision would be whether segments are
assumed dirty-by-default or clean-by-default.

Dirty-by-default: Easy to make functional - it's transparent to the
app.  Little added code bloat.  But it's harder to make swapping
perform optimally because the OS must always assume something has
changed unless the app explicitly tells it "all done with DS, and it's
the same as before".  Bugs arising from incorrect use of that hint
could be very frustrating, but debuggable with an OS setting to ignore
the hinting.

Clean-by-default: The compiler must atomically flag the segment as
dirty with each write it does (this is where a real MMU kicks ass
because the OS can track it itself).   If writes aren't atomically
done along with flagging dirty, the OS cannot mark it clean again if
it is forced to swap out and back in due to memory pressure.  In a
cooperative multitasking environment, this could work very well
because the OS won't swap it out during execution.  It's just that
cooperative multitasking sucks in other ways.

To help cut down on extraneous write-backs in dirty-by-default, some
hinting could be provided by using different calls for
"map-for-read-only" and "map-for-read-write".  Again, debugging bad
usage of these would need an OS switch.  But, it would help catch
improper segment usage in some cases (mapping a RO segment for
read-write).
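
To make the dirty-by-default variant concrete, here is a small C model of the bookkeeping, assuming four mapping slots and integer segment ids. Everything here (`mapper`, `map_for_read_write`, `map_for_read_only`, `hint_unchanged`) is a hypothetical illustration of the idea, not an existing ELKS interface, and the write to swap is reduced to a counter:

```c
#include <assert.h>
#include <stdbool.h>

enum { NUM_SLOTS = 4 };   /* the 4 mapped segments: CS, DS, ES, SS */

typedef struct {
    int  segment;   /* hypothetical segment id, -1 when the slot is empty */
    bool dirty;     /* must be written back before the slot is reused */
} slot;

typedef struct {
    slot     slots[NUM_SLOTS];
    unsigned writebacks;   /* counts simulated writes to swap */
} mapper;

void mapper_init(mapper *m)
{
    for (int i = 0; i < NUM_SLOTS; i++) {
        m->slots[i].segment = -1;
        m->slots[i].dirty = false;
    }
    m->writebacks = 0;
}

/* Replace whatever is in the slot; the old segment is written back
 * only if it is still marked dirty. */
static void map_common(mapper *m, int which, int segment, bool writable)
{
    assert(which >= 0 && which < NUM_SLOTS);
    slot *s = &m->slots[which];
    if (s->segment != -1 && s->dirty)
        m->writebacks++;        /* stand-in for the real swap write */
    s->segment = segment;
    s->dirty = writable;        /* dirty-by-default: any RW mapping counts as dirty */
}

void map_for_read_write(mapper *m, int which, int segment)
{
    map_common(m, which, segment, true);
}

void map_for_read_only(mapper *m, int which, int segment)
{
    map_common(m, which, segment, false);
}

/* The optional hint: "all done, and it's the same as before". */
void hint_unchanged(mapper *m, int which)
{
    assert(which >= 0 && which < NUM_SLOTS);
    m->slots[which].dirty = false;
}
```

In this model a debug switch that ignores the hints would amount to making `hint_unchanged` a no-op and treating every mapping as read-write, which trades extra write-backs for certainty that nothing is lost.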

By the way, thanks for your input - it's valuable to have someone to
bounce ideas off even if this never makes it into ELKS, because it
will help me if I decide to implement some of this stuff in another
project.

* Re: Some words of encouragement
  2012-02-25 23:05                   ` Brad Normand
@ 2012-02-26  1:10                     ` zkry
  0 siblings, 0 replies; 13+ messages in thread
From: zkry @ 2012-02-26  1:10 UTC (permalink / raw)
  To: linux-8086, Olle Zkry



I'd say I'm quite the opposite of a power user of ELKS - I've
been lurking silently on this list for years. A friend of mine
gave me his Toshiba T110+, a nice 8086 machine. And as I found 
out that it could boot ELKS.... "Embeddable Linux Kernel 
Subset." Far out!

Downloaded a disk file. Sure, it booted from a 720K floppy. 
Nice!

But, sadly, I never had much success beyond that. BUT I managed
to edit a text file and save it back to the floppy. No mean feat 
in itself.

This may be badly embarrassing, but I felt MUCHO MACHO. Linux on
an 8086! And I got it to work!

- - - 

So I've been silently subscribed to this mailing list for years.
This last surge of interest in ELKS really gladdens my heart. 
The exchanges about the possibly available memory models "LARGE" 
etc.

I'll never be a power ELKS user, but I sure like the fact that 
there are still people who care.

/Olle (by oath committed to the command line)


Thread overview: 13+ messages
2012-02-24 17:26 Some words of encouragement Brad Normand
2012-02-24 18:22 ` Jody Bruchon
2012-02-25  0:54   ` David Given
2012-02-25  6:27     ` Brad Normand
2012-02-25 12:53       ` David Given
2012-02-25 13:13         ` David Given
2012-02-25 18:04           ` Brad Normand
2012-02-25 19:09             ` David Given
2012-02-25 20:29               ` Brad Normand
2012-02-25 21:04                 ` David Given
2012-02-25 23:05                   ` Brad Normand
2012-02-26  1:10                     ` zkry
2012-02-25  5:12   ` Brad Normand
