Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attempting to decompile DOS Commander Keen 4 #1287

Open
juj opened this issue Sep 4, 2023 · 10 comments
Open

Attempting to decompile DOS Commander Keen 4 #1287

juj opened this issue Sep 4, 2023 · 10 comments

Comments

@juj
Copy link

juj commented Sep 4, 2023

I wanted to decompile a couple of functions from old Commander Keen 4 game.

  1. Downloaded and installed Reko
  2. Attempted to install the prerequisites

The following prerequisite software must be installed on your machine first:
.NET 6.0 (https://www.microsoft.com/net/download/dotnet-framework-runtime)

but the link did not point to a .NET 6.0 download. Instead searched with google to get https://dotnet.microsoft.com/en-us/download/dotnet/thank-you/sdk-6.0.413-windows-x64-installer

  1. Loaded Commander Keen 4 shareware from https://keenwiki.shikadi.net/wiki/Keen_4_Versions - (namely http://cd.textfiles.com/nightowl/pdsi006/003A/_1GALAXY.ZIP )

Used 7-zip to decompress the archive to get to KEEN4E.exe.

Noticed Reko was not able to decompile that exe directly. Although there is a LZ uncompressor tool at https://keenwiki.shikadi.net/wiki/UNLZEXE (namely https://files.shikadi.net/keenwiki/tools/t.unlzexe_v.0.81_win_32-bit.gerstrong.2010-07-04.zip ) to produce an uncompressed KEEN4E.exe.

  1. I was able to then disassemble the file into the IDE.

  2. I then wanted to find all locations in the binary that contain an assembly instruction with an immediate 3DAh as operand, (e.g. mov cx, 3DAh. Though I was not quite able to figure out how to do that. So I downloaded IDA Freeware 5.0 from https://www.scummvm.org/news/20180331/ where I was able to find a function entry point 0x27D8 that contains an interesting immediate 3DAh that I'd like to examine. Unfortunately IDA does not provide a decompiler, so I switched back to looking at the function at 0x27D8 with Reko.

  3. I then proceeded to disassemble KEEN4E.exe with Reko, and in the function list that appeared, I double clicked on "fn0800_27d8" to view its contents.

  4. I was met with an exception Could not load file or assembly 'DynamicData, Version=7.1.0.0':

image

Not sure if I am missing some DLL installation. Manually installing https://download.visualstudio.microsoft.com/download/pr/8d1443fd-a5e1-438d-8cb8-6ccb9849a54a/4f89f2b74a9c272789dfac8658a87673/dotnet-sdk-6.0.413-win-x64.exe did not work to resolve that. I was thinking maybe version 7.1.0.0 might have been referring to .NET 4.7.1, but attempting to install that came back with an error that it was already installed.

Is the missing assembly due to me missing some installation, or I wonder if my use case is too dumb to work with a modern tool?

Thanks!

@juj
Copy link
Author

juj commented Sep 4, 2023

Oh, I now downloaded https://github.com/uxmal/reko/releases/download/version-0.11.4/WindowsDecompiler-0.11.4-x64-2b56adbc68.zip instead of https://github.com/uxmal/reko/releases/download/version-0.11.4/Reko-0.11.4-x64-2b56adbc68.msi and the .zip version worked, whereas the .msi version didn't. I was now able to view the decompiled function.

Although something feels wrong there. I have a function

image

where I believe the orange block on the left generates the decompiled code on the right. (btw, it would be cool to get highlights on the other side when clicking on a line on one side!)

The decompilation of that assembly reads

                    do
                        ;
                    while (Test(ULT,(__in<byte>(0x03DA) & 0x02) != 0x00));

but I don't think that is correct? Shouldn't it be

                    do
                        ;
                    while (__in<byte>(0x03DA) & 0x01);

?

@juj
Copy link
Author

juj commented Sep 4, 2023

Ops, then I accidentally clicked on the Restart button in the toolbar, and got

image

@juj
Copy link
Author

juj commented Sep 4, 2023

An interesting feature request might be to have the full path + filename of the executable being disassembled in the title of the main window, to be able to distinguish between multiple simultaneously disassembled binaries:

image

@uxmal
Copy link
Owner

uxmal commented Sep 4, 2023

Wow, a lot to unpack there! Thanks for taking the time to report this so thoroughly. In order:

  1. There does seem to be a problem with the MSI installer. Multiple users are reporting issues. "Of course" I cannot reproduce it on my rig. I will take note of it and see if I can find an enviroment where I can get the MSI installer problems to happen at will.

  2. When I try to open the EXE on my end, Reko's built-in unpacker crashes. Not sure what is causing it. I'm working on a fix.

  3. Matching highlights on both sides is a frequently expressed wish. There is some difficulty in making that wish happen, as it involves mapping many different lines of C code to many different addresses. I'm going to read up on how Godbolt does it and see if I can approximate what that project is doing.

  4. You're correct; that statement should indeed be decompiled to:

do
    ;
while (__in<byte>(0x03DA) & 0x01);

I will investigate after I fix the loading issue.

  1. I was unable to reproduce the crash you are reporting when re-opening the binary. If you do get it to happen again, please select the text inside the crash window and attach that text to this issue.

  2. Adding the full path of the currently decompiled binary to the main window title is eminently doable.

@juj
Copy link
Author

juj commented Sep 4, 2023

Thanks for the detailed reply.

In the end I was able to get the decompilation done, and I ended up with manually cleaning up the result for documentation:

Reko provided decompilation:

Eq_6 fn238C_07C5(Eq_6 ds, Eq_6 wArg04, Eq_6 bArg06, union Eq_6 & cxOut)
{
    __cli();
    do
        ;
    while ((__in<byte>(0x03DA) & 0x01) == 0x00);
    do
        ;
    while ((__in<byte>(0x03DA) & 0x01) != 0x00);
    __out<byte>(0x03D4, 0x0C);
    __out<byte>(0x03D5, SLICE(wArg04, byte, 8));
    __out<byte>(0x03D4, 0x0D);
    __out<byte>(0x03D5, (byte) wArg04);
    Mem43[ds:0xE314:word16] = 0x00;
    while (Mem43[ds:0xE314:word16] <u 0x09)
    {
        __sti();
        __cli();
        if ((__in<byte>(0x03DA) & 0x08) != 0x00)
            break;
    }
    __out<byte>(0x03C0, 0x33);
    __out<byte>(0x03C0, bArg06);
    __sti();
    cxOut = wArg04;
    return bArg06;
}

My manual cleanup with comments:

extern volatile int global_interrupt_timer;

void scroll_screen(int start_address, int pixel_shift)
{
    disable();
    while (!(inp(0x3DA) & 1)) /*nop*/ ; // Skip visible regions, i.e. wait until we are in start of blank
    while (inp(0x3DA) & 1) /*nop*/ ; // Skip blank regions, i.e. wait until we are in visible picture area
    // We are now in start of a visible scanline
    outp(0x3D4, 0xC);
    outp(0x3D5, start_address >> 8);
    outp(0x3D4, 0xD);
    outp(0x3D5, start_address);
    // Wait until start of vertical retrace
    global_interrupt_timer = 0;
    while(global_interrupt_timer < 9)
    {
        enable();
        NOP();
        disable();
        if (inp(0x3DA) & 8)
            break;
    }
    outp(0x3C0, 0x33);
    outp(0x3C0, pixel_shift);
    enable();
}

This is for investigating https://www.vogons.org/viewtopic.php?f=63&t=96028

I wonder if some of the transformations above might be something that could be interesting to clean up automatically? E.g.

do
        ;
    while ((__in<byte>(0x03DA) & 0x01) == 0x00);

to

while ((__in<byte>(0x03DA) & 0x01) == 0x00) /*nop*/;

also maybe ((__in<byte>(0x03DA) & 0x01) != 0x00) might clean up to just ((__in<byte>(0x03DA) & 0x01)) automatically (although some people do prefer seeing the explicit != 0 checks.

The construct SLICE(wArg04, byte, 8) seemed odd to me. Maybe displaying (byte)(wArg04 >> 8) might be more C-like?

Last, the line while (Mem43[ds:0xE314:word16] <u 0x09) seemed odd. What is <u? Smaller than compared as unsigned?

I don't know if it is interesting to be "proper C" out of the box, since the pattern ds:0xE314:word16 would not be valid C. In Borland Turbo C++ the construct used to be MK_FP(ds, 0xE314). Although not sure how the :word16 part is intended to be interpreted - that seems a bit redundant there.

Something that I struggled for quite a while is that I had two custom functions that I had located using IDA: wait_for_vblank and scroll_screen, and I wanted to see a full graph of how the two functions relate to each other. (e.g. does one call the other, or what are the earliest common ancestors that calls both, or is called by both?). I was trying to view a xref graph in IDA 5, though I was not able to make that work for the purpose either.

Another thing I was struggling was to figure out when I saw that Mem43[ds:0xE314:word16], how to find all memory load/store references to that global memory variable to understand what data that variable holds. Though then by random chance by clicking on functions I stumbled on the right interrupt handler function that incremented the variable, and I was able to figure it out.

Great tool overall - it is hard to find good disassembler + decompilers out there.

@juj
Copy link
Author

juj commented Sep 4, 2023

Attached a crash log of the Restart/Open crash. (I got this both when clicking on the Restart button, but also when clicking on Open... button when I had a previous decompile open)
neko_open_crash.txt

uxmal added a commit that referenced this issue Sep 4, 2023
* New unpacking script for LzExe 0.91 (addresses part of #1287)

* New `reko.dasm` OllyScript command.

* Change CMakeLists.txt to use correct version number.
@uxmal
Copy link
Owner

uxmal commented Sep 4, 2023

Attached a crash log of the Restart/Open crash. (I got this both when clicking on the Restart button, but also when clicking on Open... button when I had a previous decompile open)

How strange. I'm unable to reproduce this on my development rig.

uxmal added a commit that referenced this issue Sep 5, 2023
uxmal added a commit that referenced this issue Sep 18, 2023
@NancyAurum
Copy link

NancyAurum commented Jul 21, 2024

I have the same exception, while trying to open any procedure in the program I compiled with gcc/x86_64-w64-mingw32/12.2.0/
It also takes ages to initially load that 500kb exe for some reason, which both IDA and Ghidra do instantly.

Reko decompiler version 0.11.5.0 (git:a2c8f84b9c)

symta.zip

@uxmal
Copy link
Owner

uxmal commented Jul 22, 2024

@NancyAurum: your binary uncovered some juicy bugs in Reko that I've fixed in commit c7ac7f3. However, I cannot reproduce the exception you are reporting. What exactly are you doing when installing Reko? Are you using the MSI files provided, or building from source?

@NancyAurum
Copy link

NancyAurum commented Jul 22, 2024

@NancyAurum: your binary uncovered some juicy bugs in Reko that I've fixed in commit c7ac7f3. However, I cannot reproduce the exception you are reporting. What exactly are you doing when installing Reko? Are you using the MSI files provided, or building from source?

I installed the Reko on Windows 11 using the pre-compiled MSI you provide. I also have Visual Studio 2022, which I believe could be overriding some CLR stuff. Generally I wanted to use Reko to check if the compiler generated code good enough.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants