* SDCC porting feasibility study, part 1: the assembler
@ 2012-02-27  0:18 Brad Normand
  2012-02-27 10:05 ` Gábor Lénárt
  0 siblings, 1 reply; 12+ messages in thread
From: Brad Normand @ 2012-02-27  0:18 UTC (permalink / raw)
  To: linux-8086

I've started looking at SDCC to try and get an idea how easy it is to
port this to target 8086.

Assuming one starts with the 8051 assembler it already has, I think
these are the main areas needed to make the assembler work:

i8086.h:
-Define numeric IDs for each class of opcode
-Define numeric IDs for each addressing mode
-Define numeric IDs for each register
-Prototypes for i86adr.c and i86mch.c

i86adr.c:
-Assign string names to registers
-Write code to classify arguments to address modes

i86ext.c:
-Define processor name, endianness, and asm file extension

i86mch.c:
-Write code to emit machine instructions

i86pst.c:
-Fill in table describing keywords and mnemonics
-Fill in table assigning numbers to registers and register bits

in global assembler code:
-Investigate what, if any, global changes are needed to support 8086.
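
To make the list above a little more concrete, here is a minimal sketch of the kind of tables involved: numeric IDs for registers and addressing modes, a name table, and a coarse operand classifier. Every identifier in it (reg_id, addr_mode, lookup_reg, classify_operand) is hypothetical and chosen for illustration; none of it is taken from the SDCC assembler sources, whose real structures may look quite different.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical numeric IDs; values follow the 8086's 3-bit register encoding. */
enum reg_id { R_AX, R_CX, R_DX, R_BX, R_SP, R_BP, R_SI, R_DI, R_NONE };

/* Hypothetical, deliberately coarse addressing-mode classes. */
enum addr_mode { M_REG, M_IMM, M_MEM, M_BAD };

struct reg_name {
    const char *name;
    enum reg_id id;
};

/* String names assigned to the 16-bit general registers. */
static const struct reg_name reg_names[] = {
    { "ax", R_AX }, { "cx", R_CX }, { "dx", R_DX }, { "bx", R_BX },
    { "sp", R_SP }, { "bp", R_BP }, { "si", R_SI }, { "di", R_DI },
};

/* Look up a register by name; returns R_NONE when the name is unknown. */
enum reg_id lookup_reg(const char *s)
{
    size_t i;

    for (i = 0; i < sizeof reg_names / sizeof reg_names[0]; i++) {
        if (strcmp(reg_names[i].name, s) == 0)
            return reg_names[i].id;
    }
    return R_NONE;
}

/* Classify one operand string: register, [memory], or decimal immediate. */
enum addr_mode classify_operand(const char *s)
{
    if (s == NULL || *s == '\0')
        return M_BAD;
    if (lookup_reg(s) != R_NONE)
        return M_REG;
    if (*s == '[')
        return M_MEM;
    if (*s >= '0' && *s <= '9')
        return M_IMM;
    return M_BAD;
}
```

A real port would also need the 8-bit and segment registers, and the memory case would have to parse the base/index/displacement combinations the 8086 allows; the sketch only shows the shape of the data.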

The 8051 is very close to the 8086 in terms of flags.  The registers are
different, and the 8086 is perhaps less orthogonal when it comes to some
operations and addressing modes.  Segmentation doesn't exist natively on
the 8051, but there is already some code present to support a weird 8051
with a 24-bit address space.
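
For readers less familiar with 8086 segmentation, the address arithmetic the assembler and linker would eventually have to respect can be sketched as follows. The function name phys_addr is made up for this example; the formula itself (segment times 16 plus offset, wrapping at 20 bits) is the standard real-mode 8086 behaviour.

```c
#include <stdint.h>

/* Physical address on an 8086: segment * 16 + offset, wrapped to 20 bits. */
uint32_t phys_addr(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}
```

Note that many segment:offset pairs alias the same byte: 0x1234:0x0010 and 0x1235:0x0000 both resolve to 0x12350.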

Another thought: it isn't strictly necessary to support every
8086 opcode and addressing mode in the beginning, just what the
compiler emits.


* Re: SDCC porting feasibility study, part 1: the assembler
  2012-02-27  0:18 SDCC porting feasibility study, part 1: the assembler Brad Normand
@ 2012-02-27 10:05 ` Gábor Lénárt
  2012-02-27 15:43   ` Brad Normand
  2012-02-27 15:46   ` Jody Bruchon
  0 siblings, 2 replies; 12+ messages in thread
From: Gábor Lénárt @ 2012-02-27 10:05 UTC (permalink / raw)
  To: Brad Normand; +Cc: linux-8086

Re,

On Sun, Feb 26, 2012 at 06:18:55PM -0600, Brad Normand wrote:
> I've started looking at SDCC to try and get an idea how easy it is to
> port this to target 8086.

Maybe a bit off-topic, but:

Hmm, I had a bad experience with SDCC with Z80 as target. Maybe I was
not so smart, but I couldn't make it emit RODATA like stuff, it just generated
Z80 code (!) to store data, instead of just the data.

I just mention it because it could be an interesting problem for other targets too.

Since the problem is quite "funny", I assume it was only my mistake, and that I
left out some option which would have told SDCC not to do that, though ... etc.
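
One possible explanation, offered only as a guess since the original source is not shown: on ROM-oriented targets a non-const initialised global lives in RAM, so the toolchain has to arrange for it to be filled in at startup, whereas a const object can be emitted as plain read-only data. The sketch below contrasts the two declarations; the names ram_table, rom_table and sum_table are invented for the example, and where each object actually ends up depends on the compiler and its options.

```c
/* Non-const: lives in RAM, so something must initialise it at startup. */
unsigned char ram_table[4] = { 1, 2, 3, 4 };

/* const: a ROM-oriented compiler is free to emit this as plain data. */
const unsigned char rom_table[4] = { 1, 2, 3, 4 };

/* Sum n bytes of a table; works the same for either placement. */
unsigned char sum_table(const unsigned char *t, unsigned char n)
{
    unsigned char sum = 0;
    unsigned char i;

    for (i = 0; i < n; i++)
        sum = (unsigned char)(sum + t[i]);
    return sum;
}
```

Inspecting the generated assembly for each declaration would show whether the compiler emits data directives or initialisation code.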

---

- Gábor
--
To unsubscribe from this list: send the line "unsubscribe linux-8086" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: SDCC porting feasibility study, part 1: the assembler
  2012-02-27 10:05 ` Gábor Lénárt
@ 2012-02-27 15:43   ` Brad Normand
  2012-02-27 15:46   ` Jody Bruchon
  1 sibling, 0 replies; 12+ messages in thread
From: Brad Normand @ 2012-02-27 15:43 UTC (permalink / raw)
  To: linux-8086

> Maybe a bit off-topic, but:
>
> Hmm, I had a bad experience with SDCC with Z80 as target. Maybe I was
> not so smart, but I couldn't make it emit RODATA like stuff, it just generated
> Z80 code (!) to store data, instead of just the data.
>
> I just mention it because it could be an interesting problem for other targets too.
>
> Since the problem is quite "funny", I assume it was only my mistake, and that I
> left out some option which would have told SDCC not to do that, though ... etc.

Hmm, maybe I should have started by looking at ELKS code and seeing
what kinds of features it needs and seeing if SDCC can deliver in
general.

Anyone else know any practical compatibility issues or other thoughts
on this?  I tried googling the old standby "sdcc sucks" and only found
one guy that didn't know how to use the volatile keyword.


* Re: SDCC porting feasibility study, part 1: the assembler
  2012-02-27 10:05 ` Gábor Lénárt
  2012-02-27 15:43   ` Brad Normand
@ 2012-02-27 15:46   ` Jody Bruchon
  2012-02-27 17:53     ` Brad Normand
  1 sibling, 1 reply; 12+ messages in thread
From: Jody Bruchon @ 2012-02-27 15:46 UTC (permalink / raw)
  To: ELKS (linux-8086)

On 02/27/12 05:05, Gábor Lénárt wrote:
> Re,
>
> On Sun, Feb 26, 2012 at 06:18:55PM -0600, Brad Normand wrote:
>> I've started looking at SDCC to try and get an idea how easy it is to
>> port this to target 8086.
>
> Maybe a bit off-topic, but:
>
> Hmm, I had a bad experience with SDCC with Z80 as target. Maybe I was
> not so smart, but I couldn't make it emit RODATA like stuff, it just generated
> Z80 code (!) to store data, instead of just the data.

I am contemplating writing a new toolchain from the ground up at this 
point. I'm rapidly learning that open source C toolchains are in short 
supply, and the ones that exist either (A) don't target 8086 at all, (B) 
are not documented well enough (or clearly enough) for a newcomer to add 
support, (C) output things in ways that are undesired, or (D) are so 
complex that all ye who enter there abandon all hope, specifically 
thinking of gcc. Particularly with smaller CPUs than 8086, it seems C 
compilers are ill-suited. I am thinking of the 6502, of which many 
systems exist with massive (for a 64K address space) amounts of 
bank-switched RAM; its minimal amount of 8-bit registers and interesting 
addressing modes make it hard to compile good code for from a language 
like C. The 65816 being the 16-bit variant makes it highly desirable to 
port ELKS to (wouldn't it be nice to have ELKS working on the Apple IIgs?)

Maybe what we need to be doing is making a list of the features that we 
need a compiler to support, rather than taking each one in turn and 
trying to jam the pegs in the holes?

Jody Bruchon

* Re: SDCC porting feasibility study, part 1: the assembler
  2012-02-27 15:46   ` Jody Bruchon
@ 2012-02-27 17:53     ` Brad Normand
  2012-02-27 18:52       ` David Given
  0 siblings, 1 reply; 12+ messages in thread
From: Brad Normand @ 2012-02-27 17:53 UTC (permalink / raw)
  To: ELKS (linux-8086)

> I am contemplating writing a new toolchain from the ground up at this point.
> I'm rapidly learning that open source C toolchains are in short supply, and
> the ones that exist either (A) don't target 8086 at all, (B) are not
> documented well enough (or clearly enough) for a newcomer to add support,
> (C) output things in ways that are undesired, or (D) are so complex that all
> ye who enter there abandon all hope, specifically thinking of gcc.
> Particularly with smaller CPUs than 8086, it seems C compilers are
> ill-suited. I am thinking of the 6502, of which many systems exist with
> massive (for a 64K address space) amounts of bank-switched RAM; its minimal
> amount of 8-bit registers and interesting addressing modes make it hard to
> compile good code for from a language like C. The 65816 being the 16-bit
> variant makes it highly desirable to port ELKS to (wouldn't it be nice to
> have ELKS working on the Apple IIgs?)
>
> Maybe what we need to be doing is making a list of the features that we need
> a compiler to support, rather than taking each one in turn and trying to jam
> the pegs in the holes?

What scares me there is that writing a whole toolchain isn't trivial.  In
college, I took a class where we wrote the most basic of basic
compilers in Java (using a nice grammar parsing library), reading in a
simplified ALGOL, targeting MIPS, and doing no optimization or real
register allocation at all.  It barely supported functions with
parameters and return values.  No strings.  No floating point.  Just
32-bit ints.  We used an existing assembler that did all the dirty
jump calculations for us.  It was a lot of work just to get code to
execute, and without a linker, it only runs in a setup where
loading is performed by hand (e.g., a simulator).  The entire semester
was spent just getting it to this point.

I think even if we have to severely re-engineer things or even do a
ground-up rewrite, starting by digging into something like SDCC and
using it as a reference point would help considerably.  Especially if
aiming for a multi-target compiler, you almost have to design your
targets in at the beginning to make sure you're not missing the
infrastructure to support them.

Other questions...

Should the toolchain be able to self compile?  (how close is SDCC to
self compiling?  This probably isn't a bad indication of how good it
is in general)
Should it be able to self compile on a real target?  If yes, are gobs
of RAM a safe assumption?
Language features supported?  (I think SDCC might have limitations
here, I saw some stuff that it won't copy entire structs, which sounds
like kinda a big deal but probably not terribly hard to overcome)
Optimizations?
Data types to support?
Standard library?
How much architecture specific code is appropriate to leave to the
applications or libraries?
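
On the struct-copy question above, the feature at stake and the usual workaround can be sketched like this. It assumes nothing about what SDCC actually accepts today; struct point and both function names are made up for the example.

```c
#include <string.h>

struct point {
    int x;
    int y;
};

/* Whole-struct assignment: standard C, reportedly the missing feature. */
void copy_by_assignment(struct point *dst, const struct point *src)
{
    *dst = *src;
}

/* Workaround for a compiler without struct assignment: copy the bytes. */
void copy_by_memcpy(struct point *dst, const struct point *src)
{
    memcpy(dst, src, sizeof *dst);
}
```

The memcpy form is mechanical enough that existing code could probably be adapted with a search through the sources, which supports the guess that this limitation is not terribly hard to overcome. Passing and returning structs by value would need separate handling.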

"Software ecosystem" questions...

How much existing software would be easy to adapt to this toolchain?
How useful is the work done on the toolchain to other free software
communities or hobbyists, etc?


* Re: SDCC porting feasibility study, part 1: the assembler
  2012-02-27 17:53     ` Brad Normand
@ 2012-02-27 18:52       ` David Given
  2012-02-27 19:04         ` Chad
                           ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: David Given @ 2012-02-27 18:52 UTC (permalink / raw)
  To: ELKS (linux-8086)


Brad Normand wrote:
[...]
> What scares me there is that writing a whole toolchain isn't trivial.  In
> college, I took a class where we wrote the most basic of basic
> compilers in Java (using a nice grammar parsing library), reading in a
> simplified ALGOL, targeting MIPS, and doing no optimization or real
> register allocation at all.

I've been doing stuff with compilers for years (ask me about my C to
Perl compiler!), and this is so true. C compilation is painful. Doing C
compilation *well* is a life's work. You *seriously* don't want to write
one from scratch.

If you want to do an 8086 C compiler, I'd strongly recommend looking at
LLVM --- it's apparently got a really nice backend model, although the
documentation isn't brilliant. Better still, it may be possible to start
with the excellent 386 backend and cut it down, which should be a much
easier job than building one from scratch.

Other compiler backends I know about:

- gcc: unspeakable.

- Sparse: the engine behind the Linux kernel linter, which I used for
the above C to scripting engine compiler. It's not really intended as a
code generator but actually does a reasonable job.

- vbcc: I did a Z-machine backend for this; it's got a very nice
architecture, and is simple to work with and produces reasonable code,
but has a painful source-available-but-not-open-source license.

- the ACK: has the advantage of being a complete turnkey toolchain and
compiler, including assembler, linker, librarian, libc, etc, *and* it
already supports the 8086, but doesn't produce great code and is tough
to work on.

- tcc: very very very fast. Produces very very very crap code. It
started life as an IOCCC entry, and boy does it show.

Then there's TenDRA, which I've never had anything to do with but which
seems to have vanished. And there are C and C++ parsing libraries like
Elkhound which I've tried to look at but have been unable to get started
with.

-- 
┌─── dg@cowlark.com ───── http://www.cowlark.com ─────
│ "I have always wished for my computer to be as easy to use as my
│ telephone; my wish has come true because I can no longer figure out
│ how to use my telephone." --- Bjarne Stroustrup



* Re: SDCC porting feasibility study, part 1: the assembler
  2012-02-27 18:52       ` David Given
@ 2012-02-27 19:04         ` Chad
  2012-02-27 19:33         ` Brad Normand
  2012-02-27 19:37         ` Harley Laue
  2 siblings, 0 replies; 12+ messages in thread
From: Chad @ 2012-02-27 19:04 UTC (permalink / raw)
  To: Linux-8086

There's also pcc, which is actually actively maintained again (it's
fast and BSD-licensed), and has some 16-bit targets (although x86[-64]
is the primary one at this point)


* Re: SDCC porting feasibility study, part 1: the assembler
  2012-02-27 18:52       ` David Given
  2012-02-27 19:04         ` Chad
@ 2012-02-27 19:33         ` Brad Normand
  2012-02-27 19:37         ` Harley Laue
  2 siblings, 0 replies; 12+ messages in thread
From: Brad Normand @ 2012-02-27 19:33 UTC (permalink / raw)
  To: ELKS (linux-8086)

> If you want to do an 8086 C compiler, I'd strongly recommend looking at
> LLVM --- it's apparently got a really nice backend model, although the
> documentation isn't brilliant. Better still, it may be possible to start
> with the excellent 386 backend and cut it down, which should be a much
> easier job than building one from scratch.

It also has going for it that it's got lots of optimization muscle and
opens up other languages besides C (and does C pretty well).  The
project is active.  It should be able to compile itself, but it is a
complex piece of software, and running it natively on a low-speed
target would probably require an uninterruptible power supply, to say
nothing of compiling itself on such a target or the memory that would
entail.

> - the ACK: has the advantage of being a complete turnkey toolchain and
> compiler, including assembler, linker, librarian, libc, etc, *and* it
> already supports the 8086, but doesn't produce great code and is tough
> to work on.

Minix uses this; at first glance it is probably more complete than SDCC.

> - tcc: very very very fast. Produces very very very crap code. It
> started life as an IOCCC entry, and boy does it show.

Interestingly enough, it's complete enough to compile a Linux kernel,
and the blazing speed (and probably low memory requirements) might
make it suitable for running within small targets.


* Re: SDCC porting feasibility study, part 1: the assembler
  2012-02-27 18:52       ` David Given
  2012-02-27 19:04         ` Chad
  2012-02-27 19:33         ` Brad Normand
@ 2012-02-27 19:37         ` Harley Laue
  2012-02-27 20:42           ` Brad Normand
  2012-02-27 22:26           ` David Given
  2 siblings, 2 replies; 12+ messages in thread
From: Harley Laue @ 2012-02-27 19:37 UTC (permalink / raw)
  To: ELKS (linux-8086)

On 02/27/2012 12:52 PM, David Given wrote:
> Brad Normand wrote:
> [...]
>> What scares me there is that writing a whole toolchain isn't trivial.  In
>> college, I took a class where we wrote the most basic of basic
>> compilers in Java (using a nice grammar parsing library), reading in a
>> simplified ALGOL, targeting MIPS, and doing no optimization or real
>> register allocation at all.
> I've been doing stuff with compilers for years (ask me about my C to
> Perl compiler!), and this is so true. C compilation is painful. Doing C
> compilation *well* is a life's work. You *seriously* don't want to write
> one from scratch.
My thoughts exactly. I have to admit that I actually laughed a bit when 
it was suggested.
> If you want to do an 8086 C compiler, I'd strongly recommend looking at
> LLVM --- it's apparently got a really nice backend model, although the
> documentation isn't brilliant. Better still, it may be possible to start
> with the excellent 386 backend and cut it down, which should be a much
> easier job than building one from scratch.
I said this before and I'll say it again: I think SDCC or LLVM will
likely be the best options to look into. The advantage SDCC has is that
the retarget would likely be accepted upstream, whereas the 8086 target
for LLVM would live on its own (not necessarily a bad thing).
> Other compiler backends I know about:
>
> - gcc: unspeakable.
>
> - Sparse: the engine behind the Linux kernel linter, which I used for
> the above C to scripting engine compiler. It's not really intended as a
> code generator but actually does a reasonable job.
>
> - vbcc: I did a Z-machine backend for this; it's got a very nice
> architecture, and is simple to work with and produces reasonable code,
> but has a painful source-available-but-not-open-source license.
>
> - the ACK: has the advantage of being a complete turnkey toolchain and
> compiler, including assembler, linker, librarian, libc, etc, *and* it
> already supports the 8086, but doesn't produce great code and is tough
> to work on.
I was barely able to even get this one to compile on my host machine. It 
seems they don't have very good support for 64-bit Linux. Once it was 
compiled the ack command (for instance) had no usage help message as far 
as I could tell. Overall, it didn't leave a very good taste in my mouth.
>
> - tcc: very very very fast. Produces very very very crap code. It
> started life as an IOCCC entry, and boy does it show.
>
> Then there's TenDRA, which I've never had anything to do with but which
> seems to have vanished. And there are C and C++ parsing libraries like
> Elkhound which I've tried to look at but have been unable to get started
> with.
>



* Re: SDCC porting feasibility study, part 1: the assembler
  2012-02-27 19:37         ` Harley Laue
@ 2012-02-27 20:42           ` Brad Normand
  2012-02-27 21:34             ` Jody Bruchon
  2012-02-27 22:26           ` David Given
  1 sibling, 1 reply; 12+ messages in thread
From: Brad Normand @ 2012-02-27 20:42 UTC (permalink / raw)
  To: Harley Laue; +Cc: ELKS (linux-8086)

So I guess the take-away I'm getting from all this is:

- Having the best applications running on this kind of hardware would
  lend itself towards using LLVM.
- The best bang for the buck, in terms of coding and contributing to an
  existing project (with regard to the intrinsic usefulness of the
  code), is probably SDCC, but in some ways this will treat the system
  like a glorified microcontroller.
- The most practical and demonstrable self-supporting system would
  probably lead towards TCC.  LLVM would also satisfy this, very slowly,
  on a system with enough memory, giving at least the conceptual
  assurance but without an easy way to show it off besides "come back
  in a few weeks".

ACK is kind of a wildcard that might fit somewhere in-between any of
this depending on what exactly it is, but we already know it has some
issues.  It does presumably work though.

Writing our own toolchain... eech.  Might have been a neat idea back
in the 80's.

I suppose without having really dug into any of the options yet, my
tentative next move would be to look at what makes an LLVM backend.
Does such a backend include a linker?  What would be a good C library
to use with it?


Gawd, this is why I try to stick to assembler, at least then every
problem is my own fault.  Except assembler bugs (I've hit them already
with NASM running on 16 bit, rather bloated too).


* Re: SDCC porting feasibility study, part 1: the assembler
  2012-02-27 20:42           ` Brad Normand
@ 2012-02-27 21:34             ` Jody Bruchon
  0 siblings, 0 replies; 12+ messages in thread
From: Jody Bruchon @ 2012-02-27 21:34 UTC (permalink / raw)
  To: ELKS (linux-8086)

On 02/27/12 15:42, Brad Normand wrote:
---snip---
> ACK is kind of a wildcard that might fit somewhere in-between any of
> this depending on what exactly it is, but we already know it has some
> issues.  It does presumably work though.

I have read up on how ACK works, from some of the whitepapers on the 
site. The 6502 code generator's output is absolutely nightmarish. ACK 
apparently uses a not-so-grand intermediate representation that is 
responsible for it not being that good at generating code.

> Writing our own toolchain... eech.  Might have been a neat idea back
> in the 80's.

Where did all these other toolchains come from, anyway? As far as I am 
aware (trust me, bcc is pretty hard to find solid information about), 
the Dev86 toolchain was pretty much just Bruce Evans' work up until the 
point that the Linux-8086 crew grabbed it and beefed it up further. It 
might be yucky, but even if yucky, it's an option that would work.

Jody Bruchon


* Re: SDCC porting feasibility study, part 1: the assembler
  2012-02-27 19:37         ` Harley Laue
  2012-02-27 20:42           ` Brad Normand
@ 2012-02-27 22:26           ` David Given
  1 sibling, 0 replies; 12+ messages in thread
From: David Given @ 2012-02-27 22:26 UTC (permalink / raw)
  To: ELKS (linux-8086)


On 27/02/12 19:37, Harley Laue wrote:
[...]
> I was barely able to even get this one to compile on my host machine. It
> seems they don't have very good support for 64-bit Linux. Once it was
> compiled the ack command (for instance) had no usage help message as far
> as I could tell. Overall, it didn't leave a very good taste in my mouth.

Yeah, I got fairly burnt out working on it --- it's a massive chunk of
code and simply coming to terms with the build system was incredibly
hard work. (I eventually had to *write my own build tool* to make it
build. Sigh.) The 64 bit issues basically stem from the fact that it
comes from another era and it doesn't understand longs that are 64 bits
wide. I have a fix that seems to work. I just don't understand *why* it
works, which is why it's not checked in.
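
The general class of problem can be illustrated without reference to the ACK sources: code written when long was always 32 bits tends to rely on arithmetic wrapping at 2^32, which silently stops happening on an LP64 host. The sketch below is not the ACK fix mentioned above; the function names are invented, and it only shows the usual remedy of switching to fixed-width types from <stdint.h>.

```c
#include <limits.h>
#include <stdint.h>

/* Width of long in bits on the host: 32 on ILP32, typically 64 on LP64. */
unsigned long_bits(void)
{
    return (unsigned)(sizeof(long) * CHAR_BIT);
}

/* Fragile: wraps at 2^32 only on hosts where long happens to be 32 bits. */
unsigned long add_wrap_fragile(unsigned long a, unsigned long b)
{
    return a + b;
}

/* Portable: wraps at 2^32 regardless of how wide long is on the host. */
uint32_t add_wrap32(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}
```

On a 64-bit Linux host, add_wrap_fragile(0xFFFFFFFF, 1) yields 0x100000000 rather than 0, which is exactly the kind of difference that breaks code emulating a 32-bit target.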

...also, I've just discovered this piece of code:

vprint(fmt,p1,p2,p3,p4,p5,p6,p7) char *fmt ; {
        /* Diagnostic print, no auto NL */
        fprintf(STDOUT,fmt,p1,p2,p3,p4,p5,p6,p7);
}

*shudder* As I said, another era.
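
For comparison, a present-day equivalent would normally use <stdarg.h> rather than seven untyped parameters. The following is a sketch of that idiom, not a patch to ACK; the name vprint_to and the explicit stream argument are choices made for this example.

```c
#include <stdarg.h>
#include <stdio.h>

/* Diagnostic print to the given stream, no automatic newline.
 * Returns whatever vfprintf returns (characters written, or negative). */
int vprint_to(FILE *out, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vfprintf(out, fmt, ap);
    va_end(ap);
    return n;
}
```

A call such as vprint_to(stderr, "pass %d: %s", 2, "ok") accepts any number of arguments, and the prototype lets the compiler check the fixed parameters.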

As for documentation --- there should be man pages for pretty much
everything, although I am aware that some of the more exotic stuff (like
the assemblers) hasn't made it to the release package.

If anyone's interested, I can have another go at making it work on
64-bit machines. I think the output code is better than bcc's, but it's
been a while since I've seen the output. (The 8086 backend has had a
fair bit of work done to it.)

-- 
┌─── dg@cowlark.com ───── http://www.cowlark.com ─────
│
│ "Never attribute to malice what can be adequately explained by
│ stupidity." --- Nick Diamos (Hanlon's Razor)

