> >>But I don't think the Pentium (IIRC) has many "new" instructions
> > useful for random compiles (RDTSC, CPUID).
>
> Did the 5V Pentium 60/90 have those? IIRC they were a bit funky too.
Even the (late-model DX) 486 had CPUID, but RDTSC was new with the Pentium, I think. Not all clones have CPUID either (perhaps it needed to be enabled on some??, dunno), hence you should test for CPUID before using it (in theory, though I admit most CPUs in use these days have it). You may?? be thinking of the fact that RDTSC wrapped around / "ran out" after a month or whatever, unlike PPro on up, where it lasted much longer (can't remember exactly, sorry).
I'm no hardware expert or professional programmer. I've never written my own compiler, and my x86 asm experience is pretty weak. I'm just saying that, in my limited experience, I have never seen a compiler generate true 586-only code. Similarly with 486, though I guess it's possible in theory. GCC for sure I've never seen use either, but I guess it could?? be hidden in one of the built-ins (with a test or otherwise, dunno). It's only the 686 CMOVxx bullcrap that has bitten me a few times (on my old 586 or 486 SX or DOSBox [486 DX2]).
> > Can't remember, perhaps you meant something odd like CMPXCHG8B ???
>
> That one _is_ important; to create CAS iirc, for threadsafe queues.
I don't know, can't remember, but the additional instructions in the 486 and 586 are minimal and (almost?) useless for general-purpose compilers. Again, I don't know of any that have ever generated such instructions, but again, my experience is limited (isn't everyone's??). I've seen a few hand-coded 486-only apps (stupid BSWAP, as if that helps anything significantly), but usually when people say "486 only" it means "needs FPU" (though not all 486s had one, natch) or even sometimes "needs fast 486-like speeds".
I had always read that a 486 (pipelined) was faster "clock for clock" than any 386, even at the same MHz, because the fastest 386 instruction took 2 clocks vs. 1 clock on the 486. But I don't know about the ultra-fast 386 clones (AMD, etc.). Anyway, the 486 finally had (quite small) on-chip cache, while most (but not all??) 386s had slower external off-chip motherboard cache. But the 486 was more RISC-y, so while "lodsb" and other short string instructions were typically said to run faster on a 386, they were slower on a 486 than the equivalent plain "mov" sequence. Hence, like I said, I don't think GCC ever (!) had any true explicit support for the 486 beyond adding extra alignment (since it was allegedly very alignment-sensitive, much more so than the Pentium).
> PPro's were cool. Much faster, gigantic full speed cache (P-II was only
> half speed, but in the beginning when they competed against P-I and the
> difference was much bigger), and most importantly, the first true
> multiprocessor.
Like I said, I've read they were expensive, hence most home users didn't use them. Servers? Dunno. I don't think it was until the PII era that Intel finally introduced the budget Celeron line to compete with the likes of AMD (and others, who mostly disappeared after that). Besides, as even Wikipedia admits, the 32-bit-only software market back then was much smaller, so the PPro running 16-bit code slowly was much more painful then than it would be now (most people these days don't run DOS-like systems, unlike back in the day, *sniff*). Omitting MMX is another surprising oversight for the PPro, but (IMHO) not as huge a loss (I can't name a lot of MMX-enabled apps, and it's deprecated these days anyway).
> Yup, afaik Via C3 doesn't either, and they are actually not that old. (as
> in were sold after 2000, specially in all kinds of embedded configurations.
> Afaik we got ours in 2003)
IIRC, there's some weak VIA support in GCC, but I don't think 99% of people ever used a VIA chip. I'm not sure if they target end users or just embedded (low-power) businesses. I'm honestly not sure if most Windows etc. OSes will run on VIA (the latest versions are probably fixed, but not older ones) due to hard-coded CPUID checks or similar dumb stuff.
PGCC used to have weak support for Cyrix, but I don't know the details (and apparently that was never folded back into EGCS, hence not into GCC 2.95.x either).
> We had one as firewall till a few months back, and early this year the
> firewall distro (smoothwall) changed to 686, and got into trouble.
IIRC, that (VIA) was why Ubuntu targeted the 486 for a while there, dunno about lately. Stupid CMOVxx (see the comment by Linus Torvalds) isn't that helpful in general use. So it should be avoided for general compilations, IMHO, unless it can be proved to help (and providing a separate 586-only binary isn't that painful, is it?).
> true, most instructions of 486,pentium, pentiummmx are not commonly
> generated by compilers)
MMX is integer-only, 3dnow! and SSE1 are single-precision, and SSE2 (much bigger) does everything, including double precision. I think?? I read that the x87 FPU is mostly ignored by ultra-modern tools and OSes these days. Well, I mean, it's still there and still works, but new compilers don't generate code for it; they target SSE2 instead. Yes, I know the FPU can do 80 bits, but I think even the C standard (C89??) only requires "long double" to be at least as precise as "double", not to surpass it (though some compilers do support the 80-bit format). So maybe that's their justification for trying to cram SSE2 down everyone's throat. I mean, considering how C/C++ is treated as "good enough for everything" by a lot of people, I wouldn't be surprised.
> Netburst wise I've only a Pentium-D, and while it has an XP,
> it is used mostly in 64-bit mode.
GCC isn't a bad compiler by any stretch. I don't think anybody can claim that. But it's definitely not optimized very well for some (sub)architectures. It does honestly seem to do better on the Pentium 4 than on others, surprisingly. In other words, it's kinda sad that even GCC developers don't have more machines to test on. I don't know what they use, but I hope it's more than just a handful of ultra-modern CPUs.
> Some like AES,popcnt,lzcount,crc32 are mostly for specialistic (hand coded
> assembler) code.
So much for the traditional saying that "C code is as fast as asm code". Not anymore! (Though in fairness, how could it be??)
> Stuff like AVX is nearly always useful in memcpy/move() routines (but maybe
> not on AMD's upcoming bulldozer, since it eats two core's vector units
> there), and even compiler generated in expanded (inline) versions of that
> routine for low bytecounts.
It only came out on Intel chips earlier this year, right? So it's extremely new, hence most people don't have it. Sure, some compilers support it now (eh? why?), but I think it's less useful overall. Hopefully future coders will be wise enough not to be "AVX-only", as that would be short-sighted, IMHO.
> My experience is that most normal (non benchmark) programs are limited by
>
> 1) the heap manager
> 2) move/memcpy
> 3) other similar primitives like "search for a byte in block of memory", on
> C probably a special "search for byte 0 in block of memory" variant.
>
> But move definitely matters. In 2003..2006, a team reprogrammed the
> lowlevel assembler of Delphi (which was pretty much at the level of 486
> P-I) of such routines to optimized MMX/SSE(2) using, and that could matter
> 40% on memory copy easily.
MMX and SSE registers are both wider and faster to move, and that is the (only?) reason to use them. In particular, SSE being able to work at the same time as the FPU helps too (its registers are separate, unlike MMX's, which alias the x87 stack). Newer CPUs have even faster SSE units, so it's even more beneficial than before. It's no secret that SSE is considered "the future" (or maybe "the present", dunno, I guess I'm old-fashioned, obviously, heh).
But it doesn't always help, sadly. People say better algorithms bring bigger speedups than endless micro-tweaking, but it just depends. It's all trial and error, sadly, esp. with so many CPU variants.
> > > BTW do you know if LLVM/Clang support runtime array bound check?
> >
> > I don't think C (language proper) ever wants to support that because of
> > speed reasons.
>
> That is nonsense. If you can turn it off, it doesn't matter. Works the same
> in pascal.
Yes, I know that, but their mentality is "don't add extra ballast that will slow things down". I think their rationale even says "make it fast, even if it is not guaranteed to be portable". And obviously strict safety isn't their goal either (witness raw pointers). So speed takes a big priority.
> > But for C, I dunno. It might even be patented there (read a
> > rumor to that effect somewhere), which sounds wrong but anyways .... GCC
> > doesn't use patented technology for obvious reasons.
>
> Please. This is sixties technology, there are libraries full of prior art.
> IMHO FUD.
I know it "sounds" dumb, but there are lots of dumb patents. Lots! And I can't remember exactly, but I did read something similar. Maybe I'm remembering incorrectly, but GCC has indeed had to intentionally avoid some patents in its history.
> I would search the problem more in the standards. Doing unique things is
> simply discouraged. GCC has some more stuff, (Wirthian stuff like said
> range checks, nested procedures) but that is probably because the backend
> is/was also used for wirthian languages, and the concepts were therefore
> already supported by the middle layer.
Dunno, their support for non-C/C++/Fortran languages isn't really that great, even these days. It's clear where their priorities are. Ada is, surprisingly, still supported, but obviously not as well. Everything else is secondary (Java/GCJ too). More likely they introduced such features to let them code everything in (extended) C, their favorite, instead of having to rely on other languages.