DOS ain't dead

segin

Springfield, MO, USA,
08.08.2023, 18:08
 

readexe version 0.1 (Announce)

Hello DOS ain't dead forums!

I'm a hobbyist who started his computer experience as a toddler on Windows 3.1 and MS-DOS 6.22 (because those were the bleeding edge at the time).

I'm writing a diagnostic/system/developer tool that inspects executable files. It's called readexe and it's licensed under the ISC license, an Open Source Initiative approved open source license. It's legally equivalent to the MIT license, except it says the exact same thing in fewer words, and thus I prefer it for that efficiency.

readexe can display the values of the MZ/ZM DOS EXE header. I'm aware that some very early DOS assemblers ran on big-endian Unix minicomputers, used a hex value that was meant for little-endian PC linkers, and so emitted EXEs with "ZM" magic. MS-DOS 2.0 and 2.11 recognize and accept this alternative magic value as the "old" magic, and their source code on GitHub confirms this.
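As a rough illustration of the two magic values described above, a check could look like the following. This is a minimal sketch, not code from readexe; the enum and function names are made up for the example.

```c
#include <stddef.h>

/* Hypothetical names for illustration; not taken from readexe. */
enum mz_magic_kind { MAGIC_NONE = 0, MAGIC_MZ, MAGIC_ZM };

/* Classify the first two bytes of a file image.
 * Comparing byte by byte avoids depending on host byte order. */
static enum mz_magic_kind classify_mz_magic(const unsigned char *buf, size_t len)
{
    if (buf == NULL || len < 2)
        return MAGIC_NONE;
    if (buf[0] == 'M' && buf[1] == 'Z')
        return MAGIC_MZ;   /* the usual signature */
    if (buf[0] == 'Z' && buf[1] == 'M')
        return MAGIC_ZM;   /* the alternative signature described above */
    return MAGIC_NONE;
}
```

A caller would read at least the first two bytes of the file and pass them in; anything that is neither "MZ" nor "ZM" comes back as MAGIC_NONE.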

readexe is also capable of recognizing and dumping header information for NE "New Executable" images. It has OS values for OS/2, Windows, the European multitasking MS-DOS 4.00/4.10, special Windows/386 NE images, Borland's DOS extender, both 16-bit and 32-bit DPMI values used by HX (the 16-bit value is shared with Borland), plus the two values I've been told were used by the Phar-Lap 286 extender for both OS/2 and Win16 programs. It can read out NE segment tables as well.
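For readers unfamiliar with how a tool gets from the MZ header to an NE image, here is a minimal sketch. It assumes the commonly documented layout: a 32-bit little-endian offset to the next header at 0x3c of the MZ header, a two-byte signature at that offset, and the NE target OS byte at 0x36 within the NE header. The function names are hypothetical and this is not readexe's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* Read the 32-bit little-endian offset stored at 0x3c of the MZ header.
 * Returns 0 if the image is too short or the offset is implausible. */
static uint32_t next_header_offset(const unsigned char *img, size_t len)
{
    uint32_t off;

    if (img == NULL || len < 0x40)
        return 0;
    off = (uint32_t)img[0x3c]
        | ((uint32_t)img[0x3d] << 8)
        | ((uint32_t)img[0x3e] << 16)
        | ((uint32_t)img[0x3f] << 24);
    if (off < 0x40 || off > len - 2)
        return 0;
    return off;
}

/* Nonzero if the two bytes at off match a signature such as "NE". */
static int has_signature(const unsigned char *img, size_t len,
                         uint32_t off, const char *sig)
{
    if (img == NULL || off == 0 || len < 2 || off > len - 2)
        return 0;
    return img[off] == (unsigned char)sig[0]
        && img[off + 1] == (unsigned char)sig[1];
}

/* Raw target OS byte of an NE header, or -1 if there is no NE header
 * at off or the image is too short. Offset 0x36 is an assumption based
 * on the commonly published NE layout. */
static int ne_target_os(const unsigned char *img, size_t len, uint32_t off)
{
    if (!has_signature(img, len, off, "NE") || len - off <= 0x36)
        return -1;
    return img[off + 0x36];
}
```

The raw byte is returned as-is on purpose: mapping it to names such as OS/2 or Windows is left to a table that should be checked against a trusted reference.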

readexe is very much in development, as indicated by its 0.1 release version. It currently builds for all modern Unix systems; real-mode PC-DOS using both gcc-ia16 and OpenWatcom 2.0; DOS32 using OpenWatcom 2.0; OS/2 1.x using OpenWatcom 2.0; Win32 using OpenWatcom and gcc; and Win64 using gcc.

Binaries are provided for OS/2, DOS, and Windows. Unix users are expected to run ./autogen.sh, then ./configure, then make. Unixes without autotools can use Makefile.unix, setting CC to point to your local C compiler as appropriate. Those building for real-mode DOS with gcc-ia16 can use Makefile.dos.

Pull requests are accepted as long as they follow the general style of the existing code. Header definitions should each go in their own separate header, but support code for any new formats should go into readexe.c. Importantly, the classical C way of handling bitfields by OR/XOR/ANDing together preprocessor macros onto a single integer value holding all the flags is NOT allowed; C99 is the rule and bitfields are how we do. Any code using preprocessor macros with hex values to be OR/XOR/ANDed to a raw integer shall be rejected for this reason. Use C99 bitfields if you want to contribute.

You may find readexe on GitHub at http://github.com/segin/readexe.

DosWorld

09.08.2023, 23:30

@ segin
 

readexe version 0.1

> readexe is capable of displaying the values of the MZ/ZM DOS

Hello!

(Unsolicited) advice.

You can find some example code for LE, PE, and ADAM here. (Sorry, the code is not perfectly clean.)

You can also find LE and PE examples in the Sphinx C-- source code.

PS: I looked for this information myself and found very few sources. This will save you time.

---
Make DOS great again!

Carthago delenda est, Ceterum censeo Carthaginem delendam esse.

segin

Springfield, MO, USA,
11.08.2023, 12:49
(edited by segin, 11.08.2023, 13:03)

@ DosWorld
 

readexe version 0.1

> > readexe is capable of displaying the values of the MZ/ZM DOS
>
> Hello!
>
> (Unsolicited) advice.

Very useful advice.

>
> You can find some example code for LE, PE, and ADAM here.
> (Sorry, the code is not perfectly clean.)

LE is not a clean format; any handling code will either be less than ideal or handle the format incompletely.

>
> You can also find LE and PE examples in the
> Sphinx C-- source code.

LE code is good. PE? Well, implementations of PE parsers are a dime a dozen now. That's why I'm saving that for last.

>
> PS: I looked for this information myself and found very few sources. This
> will save you time.

I'm aware that the legacy formats aren't well-represented in developer materials. That's one of the reasons I started this project.

tkchia

10.08.2023, 21:08

@ segin
 

readexe version 0.1

Hello segin,

Welcome!

> bitfields by OR/XOR/ANDing together preprocessor macros onto a single
> integer value holding all the flags is NOT allowed; C99 is the rule and
> bitfields are how we do. Any code using preprocessor macros with hex values
> to be OR/XOR/ANDed to a raw integer shall be rejected for this reason. Use
> C99 bitfields if you want to contribute.

Well, I like me some bit-fields myself, when I can use them. The fact though is that they are famously unportable, and therefore less than useful when dealing with binary formats over a network. N1570 (the final draft of the C11 standard) says under 6.7.2.1,

> The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.

I would say that it is best to keep that in mind.
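To make the quoted passage concrete, here is a small probe, offered as a sketch rather than a recommendation, that reports how the compiler at hand allocates bit-fields. The struct and function names are invented for the example.

```c
#include <string.h>

/* Hypothetical probe type: one single-bit field declared first. */
struct bitfield_probe {
    unsigned int first : 1;
    unsigned int rest  : 7;
};

/* Returns 1 if the first declared bit-field lands in bit 0 of the first
 * byte of the struct, 0 otherwise. The result is a property of the
 * compiler and target, which is exactly what the standard leaves open. */
static int first_bitfield_is_lowest_bit(void)
{
    struct bitfield_probe p;
    unsigned char raw[sizeof p];

    memset(&p, 0, sizeof p);
    p.first = 1;
    memcpy(raw, &p, sizeof p);
    return (raw[0] & 0x01) != 0;
}
```

A project that overlays bit-fields on file data could call such a probe once at startup, or in a test, and refuse to continue when the layout is not the one it assumes.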

Thank you!

---
https://gitlab.com/tkchia · https://codeberg.org/tkchia · 😴 "MOV AX,0D500H+CMOS_REG_D+NMI"

segin

Springfield, MO, USA,
11.08.2023, 12:46

@ tkchia
 

readexe version 0.1

> Hello segin,
>
> Welcome!
>
> > bitfields by OR/XOR/ANDing together preprocessor macros onto a single
> > integer value holding all the flags is NOT allowed; C99 is the rule and
> > bitfields are how we do. Any code using preprocessor macros with hex
> > values to be OR/XOR/ANDed to a raw integer shall be rejected for this
> > reason. Use C99 bitfields if you want to contribute.
>
> Well, I like me some bit-fields myself, when I can use them. The fact
> though is that they are famously unportable, and therefore less than
> useful when dealing with binary formats over a network. N1570 (the final
> draft of the C11 standard) says under 6.7.2.1,
>
> > The order of allocation of bit-fields within a unit (high-order to
> > low-order or low-order to high-order) is implementation-defined. The
> > alignment of the addressable storage unit is unspecified.

The byte order of multi-byte integers is implementation-defined. Any brief review of the code would find that there's no checking of host byte order, and even a code comment stating a hard assumption on little-endianness.
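For comparison, reading little-endian fields without relying on the host byte order can be done by assembling the value from individual bytes. The sketch below is not taken from readexe (which, as stated above, assumes a little-endian host); the helper names are hypothetical.

```c
#include <stdint.h>

/* Assemble a 16-bit little-endian value from two bytes. */
static uint16_t rd16le(const unsigned char *p)
{
    return (uint16_t)((unsigned)p[0] | ((unsigned)p[1] << 8));
}

/* Assemble a 32-bit little-endian value from four bytes. */
static uint32_t rd32le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

With these, the bytes 'M', 'Z' (0x4d, 0x5a) read as 0x5a4d on any host, at the cost of going through a helper for every multi-byte field.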

>
> I would say that it is best to keep that in mind.

In practice, every x86 C99 compiler I've encountered does exactly what you'd expect.

If it doesn't work for someone, they can file a ticket if and when we cross that bridge. If OpenWatcom and gcc never change how they handle C99 bitfields, then this is a bridge we may never cross - I don't expect end users to compile source code on whatever random compiler they happen to not have (because regular end-users don't keep C compilers installed for the off-chance they find random C source to compile). I expect end users to consume prebuilt binaries. As long as the builds I make work, that'll be good enough for basically everyone interested.

If it breaks on AIX (for example), I don't care. I don't have an AIX machine to test on, and I'm not going to start pedantically writing as perfectly as possible portable C with the ISO standards open at all times next to me because of this. Doing just that is likely to still result in buggy code that I don't know anything is wrong with because of just how much of the C language - standard library included - is "implementation-defined" - no matter how perfectly I adhere to ISO 9899:1999. It'll lead to overly verbose code doing a lot of oft-unnecessary (for most users - x86 users) runtime checks (or worse, overly verbose code with more logic in preprocessor macros than in actual compiled code), increasing code size. How well does your ia16 GNU toolchain fork handle medium code model again?

So I'm just not going to worry about it until I have an actual practical reason to worry about it, in an application of the KISS principle.

>
> Thank you!

tkchia

11.08.2023, 12:59

@ segin
 

readexe version 0.1

Hello segin,

I do not know. If you do not actually care about honoring C language standards, maybe do not bring them up in the first place?

OK, I grant that you have very strong opinions on how to write code. But then again, a lot of people also have very strong (and different) opinions (myself included). It will help if you provide us a good reason to pay particular attention to your opinions. :-D

Last but not least: bit-fields already existed in C89.

Thank you!

---
https://gitlab.com/tkchia · https://codeberg.org/tkchia · 😴 "MOV AX,0D500H+CMOS_REG_D+NMI"

segin

Springfield, MO, USA,
11.08.2023, 13:50

@ tkchia
 

readexe version 0.1

> Hello segin,
>
> I do not know. If you do not actually care about honoring C language
> standards, maybe do not bring them up in the first place?

Because otherwise you get people submitting PRs with state-of-the-1980s things like #define SOME_NYBBLE 0x00F0 and val = (((field & SOME_NYBBLE) >> 4) & 0xF). I'd rather have zero instances of this sort of thing.
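To show the two styles side by side, here is a minimal sketch that extracts the same nybble both ways. The field layout is made up for the example, and the bit-field version only matches the mask version on compilers that allocate bit-fields from the low-order bit on a little-endian target (which the compilers named in this thread reportedly do).

```c
#include <stdint.h>
#include <string.h>

#define SOME_NYBBLE 0x00F0u

/* The mask-and-shift style. */
static unsigned nybble_by_mask(uint16_t field)
{
    return (field & SOME_NYBBLE) >> 4;
}

/* The bit-field style. The layout is hypothetical and assumes low-order
 * allocation on a little-endian host. */
struct flags16 {
    unsigned int low    : 4;
    unsigned int nybble : 4;
    unsigned int high   : 8;
};

static unsigned nybble_by_bitfield(uint16_t field)
{
    struct flags16 f;

    memset(&f, 0, sizeof f);
    memcpy(&f, &field, sizeof field);  /* overlay the raw 16-bit value */
    return f.nybble;
}
```

The bit-field version reads more like the format description; the mask version behaves the same on every conforming compiler. Which trade-off matters more is the disagreement in this thread.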

Bitfields may be "nonportable" but that actually doesn't matter - they work where they need to work for this particular project. Again, the compilers in use for development handle them in the "correct" manner, and I'm not concerning myself with other devsuites out of KISS reasons (if it comes up, I'll worry then.) They're also a bit more readable than the classic way of doing things.

While nothing per se ties this code to x86 (more so, it's tied to little-endianness than anything else), I also don't concern myself with other CPU architectures at this time. Make the code work first, make the code work elsewhere later (and only if needed.) Walk before you run. It's a tool for retrocomputing on IBM-compatible PCs, and reverse-engineering thereof, and everything is designed to that end. Keep that in mind.

Your points aren't invalid, just less valid than you're thinking. There's no networking here, and the compilers I've found capable of making the builds I want (mingw64-gcc, host clang and gcc on Linux, gcc-ia16, and OpenWatcom) all work in the same, predictable manner on this matter.

It's also important to avoid overengineering. It eats up time and bloats code size with ever-diminishing returns. That is why there's an issue tracker: if there's a problem, anyone with an (easily acquired) GitHub account can submit a bug report, and if it's actually necessary to worry about, someone will definitely tell me in an embarrassing fashion. (I guess I could build and run readexe on my PowerMac G5 and just watch what happens :-D)

>
> OK, I grant that you have very strong opinions on how to write code. But
> then again, a lot of people also have very strong (and different) opinions
> (myself included). It will help if you provide us a good reason to pay
> particular attention to your opinions. :-D

If you felt so badly about bitfields, why didn't you pester ISO to drop them from C17 or C23?

>
> Last but not least: bit-fields already existed in C89.

C99 bitfields aren't quite the same beast as C89 bitfields. For one, you can make assumptions about the in-memory ordering of individual elements with respect to the bitfield's definition in the source, as ISO 9899:1999 has certain requirements around this. Most portability issues you encounter with C89 bitfields are eliminated, as the standard leaves far fewer things up to the implementation to do with as it pleases. The remaining portability issues can be addressed if they come up, when they come up.

>
> Thank you!

No, thank you, one, for the ia16 GNU toolchain, and two, for making me think about all of this. Also, if my commentary seems oddly jarring, it's because I keep editing it in random order before posting.

RayeR

CZ,
15.08.2023, 19:01

@ segin
 

readexe version 0.1

Hi, many years ago I coded a simple tool, EXeinfo, that probably does similar things to yours. Feel free to try it out and compare results if you want. I just wrote it while learning a bit about EXE files myself...

http://rayer.g6.cz/programm/programe.htm#EXEINFO

Btw, about C and bitfields: I don't hesitate to use bitfields and unions and to manipulate them with memcpy, fread, and typecasts, and it works fine for me, at least in gcc, which I mostly use. I just need to use attribute packed to avoid field alignment padding (memory holes); when the structure is packed there are no holes and everything works as expected. I usually compile my tools for DOS/Windows/Linux with the gcc toolchains DJGPP/MinGW/native 32/64-bit Linux gcc and I haven't had a problem. Some code was also ported to 8-bit Atmel AVR and 32-bit ARM and runs fine there too. I don't suggest anybody break/abuse the C rules in general, but in some cases it makes life easier... :)
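The effect of the packed attribute mentioned above can be seen with a pair of otherwise identical structs. This is a sketch assuming gcc or clang, which accept the `__attribute__((packed))` syntax; other compilers use different spellings. The struct and function names are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Default layout: the compiler may insert padding after 'tag'. */
struct record_padded {
    uint8_t  tag;
    uint32_t value;
};

/* Packed layout (gcc/clang extension): no padding between members. */
struct record_packed {
    uint8_t  tag;
    uint32_t value;
} __attribute__((packed));

static size_t padded_value_offset(void)
{
    return offsetof(struct record_padded, value);
}

static size_t packed_value_offset(void)
{
    return offsetof(struct record_packed, value);
}
```

In the packed struct `value` sits directly after `tag`, so the struct can be filled with a single fread or memcpy from an on-disk record of the same layout. The padded offset depends on the target's alignment rules.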

---
DOS gives me freedom to unlimited HW access.

segin

Springfield, MO, USA,
15.08.2023, 19:53

@ RayeR
 

readexe version 0.1

> Hi, many years ago I coded a simple tool, EXeinfo, that probably does
> similar things to yours. Feel free to try it out and compare results if you
> want. I just wrote it while learning a bit about EXE files myself...
>
> http://rayer.g6.cz/programm/programe.htm#EXEINFO
>

Oddball chance the source is available?

> Btw, about C and bitfields: I don't hesitate to use bitfields and unions
> and to manipulate them with memcpy, fread, and typecasts, and it works fine
> for me, at least in gcc, which I mostly use. I just need to use attribute
> packed to avoid field alignment padding (memory holes); when the structure
> is packed there are no holes and everything works as expected. I usually
> compile my tools for DOS/Windows/Linux with the gcc toolchains
> DJGPP/MinGW/native 32/64-bit Linux gcc and I haven't had a problem. Some
> code was also ported to 8-bit Atmel AVR and 32-bit ARM and runs fine there
> too. I don't suggest anybody break/abuse the C rules in general, but in
> some cases it makes life easier... :)

I mostly use gcc, clang, and OpenWatcom. I'm considering trying my hand with TinyCC (last time I used it, it choked on anonymous unions) and a few other compilers that claim C99.

Also, maybe see if the OS/2 builds run on SIZZLE, FOOTBALL, or the CP/DOS boot disks :)

segin

Springfield, MO, USA,
22.08.2023, 00:49

@ segin
 

readexe version 0.1

Announcement: 0.1.2 is now out with support for the "W3" EXE format, that is, the VxD table in WIN386.EXE.

segin@Draetheus-V[17:48]:{~\readexe}% .\readexe-win64.exe ..\OneDrive\C\WINDOWS\SYSTEM\WIN386.EXE
..\OneDrive\C\WINDOWS\SYSTEM\WIN386.EXE:
DOS executable with magic:      MZ (0x5a4d)
Number of executable pages:     0x0036 (27136+ bytes)
Size of final page:             0x00000000 (0 bytes)
Total code size:                0x00006a00 (27136 bytes)
Total relocation entries:       0x000f
Header size in paragraphs:      0x0020 (512 bytes)
Minimum heap in paragraphs:     0x1400 (81920 bytes)
Maximum heap in paragraphs:     0xffff (1048560 bytes)
Minimum memory to load:         109056 bytes
Initial CS:IP (entrypoint):     0000:10ef
Initial SS:SP (stack):          06a0:0400
Checksum:                       0x2f24
Relocation table offset:        0x0040
Overlay:                        0x0000

MZ EXE relocaton table
Number of relocations: 15
  [0] 0000:08cb
  [1] 0000:0f97
  [2] 0000:0fb3
  [3] 0000:10f9
  [4] 0000:19ff
  [5] 0000:1acd
  [6] 0000:2262
  [7] 0000:24c8
  [8] 0000:260d
  [9] 0000:279f
  [10] 0000:2827
  [11] 0000:2f93
  [12] 0348:1def
  [13] 0348:1df3
  [14] 0348:1df7
Offset to next header:          0x00006c00


W3 Executable header found at offset 0x00006c00
VxD Module Table:
   ID   Name          Offset      Size       (dec)
------------------------------------------------------
  [00] "WIN386  "     0x00007000  0x00003b8b (15243 bytes)
  [01] "INT13   "     0x00022400  0x0000021b (539 bytes)
  [02] "WDCTRL  "     0x00023c00  0x000002aa (682 bytes)
  [03] "VMD     "     0x00027400  0x00000308 (776 bytes)
  [04] "VPD     "     0x00029c00  0x0000029a (666 bytes)
  [05] "VWC     "     0x0002c400  0x00000220 (544 bytes)
  [06] "DOSNET  "     0x0002ec00  0x0000023f (575 bytes)
  [07] "VNETBIOS"     0x00031400  0x00000410 (1040 bytes)
  [08] "EBIOS   "     0x00035400  0x000001be (446 bytes)
  [09] "VDDVGA  "     0x00037c00  0x00000dbf (3519 bytes)
  [0a] "VKD     "     0x00042800  0x0000090c (2316 bytes)
  [0b] "VPICD   "     0x00047400  0x00000833 (2099 bytes)
  [0c] "VTD     "     0x0004a400  0x0000052d (1325 bytes)
  [0d] "REBOOT  "     0x0004c800  0x00000257 (599 bytes)
  [0e] "VDMAD   "     0x0004e400  0x00000790 (1936 bytes)
  [0f] "VSD     "     0x00051800  0x0000019b (411 bytes)
  [10] "V86MMGR "     0x00053400  0x00000f99 (3993 bytes)
  [11] "PAGESWAP"     0x0005fc00  0x0000038f (911 bytes)
  [12] "DOSMGR  "     0x00062400  0x000017e8 (6120 bytes)
  [13] "VMPOLL  "     0x0006e400  0x00000224 (548 bytes)
  [14] "WSHELL  "     0x0006fc00  0x00000daa (3498 bytes)
  [15] "BLOCKDEV"     0x00076800  0x0000028e (654 bytes)
  [16] "PAGEFILE"     0x00079400  0x0000048b (1163 bytes)
  [17] "VFD     "     0x0007c400  0x0000018a (394 bytes)
  [18] "PARITY  "     0x0007dc00  0x00000168 (360 bytes)
  [19] "BIOSXLAT"     0x0007f400  0x000001b4 (436 bytes)
  [1a] "VCD     "     0x00080c00  0x00000507 (1287 bytes)
  [1b] "VMCPD   "     0x00084400  0x0000021e (542 bytes)
  [1c] "COMBUFF "     0x00086c00  0x00000211 (529 bytes)
  [1d] "CDPSCSI "     0x00088400  0x00000155 (341 bytes)
  [1e] "QEMMFIX "     0x0008ac00  0x000001ac (428 bytes)
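The size figures in the MZ part of the dump above can be reproduced from the raw header fields with the usual page and paragraph arithmetic (512-byte pages, 16-byte paragraphs). The sketch below is an assumption about how such numbers are derived, not readexe's actual code, and the function names are hypothetical.

```c
#include <stdint.h>

/* Bytes occupied by the MZ image in the file, header included.
 * A last-page size of 0 means the final page is completely used. */
static uint32_t mz_image_bytes(uint16_t pages, uint16_t last_page_bytes)
{
    if (pages == 0)
        return 0;
    if (last_page_bytes == 0)
        return (uint32_t)pages * 512u;
    return (uint32_t)(pages - 1) * 512u + last_page_bytes;
}

/* Image size minus the header, which is given in 16-byte paragraphs. */
static uint32_t mz_code_bytes(uint16_t pages, uint16_t last_page_bytes,
                              uint16_t header_paragraphs)
{
    uint32_t total  = mz_image_bytes(pages, last_page_bytes);
    uint32_t header = (uint32_t)header_paragraphs * 16u;

    return total > header ? total - header : 0;
}

/* Code size plus the minimum extra allocation in paragraphs. */
static uint32_t mz_min_load_bytes(uint16_t pages, uint16_t last_page_bytes,
                                  uint16_t header_paragraphs,
                                  uint16_t min_alloc_paragraphs)
{
    return mz_code_bytes(pages, last_page_bytes, header_paragraphs)
         + (uint32_t)min_alloc_paragraphs * 16u;
}
```

With the values from the dump (0x0036 pages, final page size 0, header 0x0020 paragraphs, minimum heap 0x1400 paragraphs) this gives 27136 bytes of code and 109056 bytes minimum to load, matching the output above.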

Oso2k

22.08.2023, 08:11

@ segin
 

readexe version 0.1

> I mostly use gcc, clang, and OpenWatcom. I'm
> considering trying my hand with TinyCC (last time I used it, it choked on
> anonymous unions) and a few other compilers that claim C99.


Anonymous unions are an MS extension.

https://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html

Rugxulo

Usono,
22.08.2023, 15:51

@ Oso2k
 

readexe version 0.1

> > I mostly use gcc, clang, and OpenWatcom. I'm
> > considering trying my hand with TinyCC (last time I used it, it choked
> > on anonymous unions) and a few other compilers that claim C99.
>
> Anonymous unions are an MS extension.
>
> https://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html

Which part exactly? Even that link says that C11 supports it (if not ambiguous).

> C11 Changes from C99:
> Anonymous structures and unions, useful when unions and structures are nested,
> e.g. in struct T { int tag; union { float x; int n; }; };

> TinyCC Changelog
> version 0.9.24:
> - anonymous union/struct support (Filip Navara)
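For reference, the struct from the quoted C11 change list can be used as follows. This is a minimal sketch assuming a C11 compiler (or one that accepts anonymous unions as an extension); the helper function is made up for the example.

```c
/* The struct from the quoted example: the union has no member name,
 * so x and n are accessed directly on struct T. */
struct T {
    int tag;
    union {
        float x;
        int   n;
    };
};

/* Hypothetical helper: build a T holding an integer. */
static struct T make_int(int n)
{
    struct T t;

    t.tag = 1;   /* 1 = "n is valid" in this example */
    t.n = n;     /* no intermediate union member name needed */
    return t;
}
```

Without anonymous unions the union would need a name, say `u`, and every access would read `t.u.n` instead of `t.n`.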
