> >>But I don't think the Pentium (IIRC) has many "new" instructions
> > useful for random compiles (RDTSC, CPUID).
>
> Did the 5V Pentium 60/90 have those? IIRC they were a bit funky too.
Even the (late-model DX) 486 had CPUID, but RDTSC was new with the Pentium, I think. Not all clones have CPUID either (perhaps it needed to be enabled on some??, dunno), hence you should test for CPUID before using it (in theory, though I admit most CPUs in use these days have it). You may?? be thinking of the fact that RDTSC wrapped around / "ran out" after a month or whatever, unlike PPro on up, where it lasted much longer (can't remember exactly, sorry).
I'm no hardware expert or professional programmer. I've never written my own compiler, and my x86 asm experience is pretty weak. I'm just saying that, in my limited experience, I have never seen a compiler generate true 586-only code. Similarly with 486, though I guess it's possible in theory. GCC for sure I've never seen use either, but I guess it could?? be hidden in one of the built-ins (with a test or otherwise, dunno). It's only the 686 CMOVxx bullcrap that has bitten me a few times (on my old 586 or 486 SX or DOSBox [486 DX2]).
> > Can't remember, perhaps you meant something odd like CMPXCHG8B ???
>
> That one _is_ important; to create CAS iirc, for threadsafe queues.
I don't know, can't remember, but the additional instructions in the 486 and 586 are minimal and (almost?) useless for general-purpose compilers. Again, I don't know of any that have ever generated such instructions, but again, my experience is limited (isn't everyone's??). I've seen a few hand-coded 486-only apps (stupid BSWAP, as if that helps anything significantly), but usually when people say "486 only" it means "needs FPU" (though not all 486s had one, natch) or even sometimes "needs fast 486-like speeds".
I had always read that a 486 (pipelined) was faster "clock for clock" than any 386, even at the same MHz, because the fastest 386 instruction took 2 clocks vs. 1 clock on the 486. But I don't know about the ultra-fast 386 clones (AMD, etc.). Anyway, the 486 finally had (quite small) on-chip cache, while most (but not all??) 386s had slower external off-chip motherboard cache. But the 486 was more RISC-y, so while "lodsb" and other short string instructions were typically said to run faster on a 386, they were slower on a 486 than the equivalent plain "mov" sequence. Hence, like I said, I don't think GCC ever (!) had any true explicit support for the 486 beyond adding extra alignment (since it was allegedly very alignment-sensitive, much more so than the Pentium).
> PPro's were cool. Much faster, gigantic full speed cache (P-II was only
> half speed, but in the beginning when they competed against P-I and the
> difference was much bigger), and most importantly, the first true
> multiprocessor.
Like I said, I've read they were expensive, hence most home users didn't use them. Servers? Dunno. I don't think it was until the PII era that Intel finally introduced the budget Celeron line to compete with the likes of AMD (and others, who mostly disappeared after that). Besides, as even Wikipedia admits, the 32-bit-only software market back then was much smaller, so the PPro running 16-bit code slowly was much more painful then than it would be now (most people these days don't run DOS-like systems, unlike back in the day, *sniff*). Omitting MMX is another surprising oversight for the PPro, but (IMHO) not as huge a loss (I can't name a lot of MMX-enabled apps, and it's deprecated these days anyway).
> Yup, afaik Via C3 doesn't either, and they are actually not that old. (as
> in were sold after 2000, specially in all kinds of embedded configurations.
> Afaik we got ours in 2003)
IIRC, there's some weak VIA support in GCC, but I don't think 99% of people ever used a VIA chip. I'm not sure if they target end users or just embedded (low-power) businesses. I'm honestly not sure if most Windows etc. OSes will run on VIA (the latest versions are probably fixed, but not older ones) due to hard-coded CPUID checks or similar dumb stuff.
PGCC used to have weak support for Cyrix, but I don't know the details (and apparently that was never folded back into EGCS, hence not into GCC 2.95.x either).
> We had one as firewall till a few months back, and early this year the
> firewall distro (smoothwall) changed to 686, and got into trouble.
IIRC, that (VIA) was why Ubuntu targeted the 486 for a while there, dunno about lately. Stupid CMOVxx (see the comment by Linus Torvalds) isn't that helpful in general use. So it should be avoided for general compilations, IMHO, unless it can be proved to help (and providing a separate 586-only binary isn't that painful, is it?).
> true, most instructions of 486,pentium, pentiummmx are not commonly
> generated by compilers)
MMX is integer-only, 3dnow! and SSE1 are single-precision, and SSE2 (much bigger) does everything, including double precision. I think?? I read that the x87 FPU is mostly ignored by ultra-modern tools and OSes these days. Well, I mean, it's still there and still works, but new compilers don't generate code for it; they target SSE2 instead. Yes, I know the FPU can do 80 bits, but I think even the C standard (C89??) only requires "long double" to be at least as precise as "double", not to surpass it (though some compilers do support the 80-bit format). So maybe that's their justification for trying to cram SSE2 down everyone's throat. I mean, considering how C/C++ is treated as "good enough for everything" by a lot of people, I wouldn't be surprised.
> Netburst wise I've only a Pentium-D, and while it has an XP,
> it is used mostly in 64-bit mode.
GCC isn't a bad compiler by any stretch. I don't think anybody can claim that. But it's definitely not optimized very well for some (sub)architectures. It does honestly seem to do better on the Pentium 4 than on others, surprisingly. In other words, it's kinda sad that even GCC developers don't have more machines to test on. I don't know what they use, but I hope it's more than just a handful of ultra-modern CPUs.
> Some like AES,popcnt,lzcount,crc32 are mostly for specialistic (hand coded
> assembler) code.
So much for the traditional saying that "C code is as fast as asm code". Not anymore! (Though in fairness, how could it be??)
> Stuff like AVX is nearly always useful in memcpy/move() routines (but maybe
> not on AMD's upcoming bulldozer, since it eats two core's vector units
> there), and even compiler generated in expanded (inline) versions of that
> routine for low bytecounts.
It only came out on Intel chips earlier this year, right? So it's extremely new, hence most people don't have it. Sure, some compilers support it now (eh? why?), but I think it's less useful overall. Hopefully future coders will be wise enough not to be "AVX-only", as that would be short-sighted, IMHO.
> My experience is that most normal (non benchmark) programs are limited by
>
> 1) the heap manager
> 2) move/memcpy
> 3) other similar primitives like "search for a byte in block of memory", on
> C probably a special "search for byte 0 in block of memory" variant.
>
> But move definitely matters. In 2003..2006, a team reprogrammed the
> lowlevel assembler of Delphi (which was pretty much at the level of 486
> P-I) of such routines to optimized MMX/SSE(2) using, and that could matter
> 40% on memory copy easily.
MMX and SSE registers are both wider and faster to move, and that is the (only?) reason to use them. In particular, SSE being able to work at the same time as the FPU helps too (its registers are separate, unlike MMX's, which alias the x87 stack). Newer CPUs have even faster SSE units, so it's even more beneficial than before. It's no secret that SSE is considered "the future" (or maybe "the present", dunno, I guess I'm old-fashioned, obviously, heh).
But it doesn't always help, sadly. People say better algorithms bring bigger speedups than endless micro-tweaking, but it just depends. It's all trial and error, sadly, esp. with so many CPU variants.
> > > BTW do you know if LLVM/Clang support runtime array bound check?
> >
> > I don't think C (language proper) ever wants to support that because of
> > speed reasons.
>
> That is nonsense. If you can turn it off, it doesn't matter. Works the same
> in pascal.
Yes, I know that, but their mentality is "don't add extra ballast that will slow things down". I think their rationale even says "make it fast, even if it is not guaranteed to be portable". And obviously strict safety isn't their goal either (witness raw pointers). So speed takes a big priority.
> > But for C, I dunno. It might even be patented there (read a
> > rumor to that effect somewhere), which sounds wrong but anyways .... GCC
> > doesn't use patented technology for obvious reasons.
>
> Please. This is sixties technology, there are libraries full of prior art.
> IMHO FUD.
I know it "sounds" dumb, but there are lots of dumb patents. Lots! And I can't remember exactly, but I did read something similar. Maybe I'm remembering incorrectly, but GCC has indeed had to intentionally avoid some patents in its history.
> I would search the problem more in the standards. Doing unique things is
> simply discouraged. GCC has some more stuff, (Wirthian stuff like said
> range checks, nested procedures) but that is probably because the backend
> is/was also used for wirthian languages, and the concepts were therefore
> already supported by the middle layer.
Dunno, their support for non-C/C++/Fortran languages isn't really that great, even these days. It's clear where their priorities are. Ada is, surprisingly, still supported, but obviously not as well. Everything else is secondary (Java/GCJ too). More likely they introduced such features to let them code everything in (extended) C, their favorite, instead of having to rely on other languages.