Decompiling Challenge Scripts

Ah, thanks for the help Willy. Knowing the number of instructions should help map them on output, and I think the auto-incrementing number is the script ID used in the run script commands. And the LHVMA being arrays is good to know too, thanks. C:

Also, I've documented engine calls up to 3.5 as referred to in Lionhead's documentation, that is:

  • 3.1 Statements using generic objects
  • 3.2 Statements involving state driven objects
  • 3.2.3 Controlling Villagers
  • 3.3 Statements involving visual effects
  • 3.4 Statements involving migrations
  • 3.5 Statements affecting players

I've also figured out that the section after the functions section is a data section. Strings are stored in here in null-terminated form, and whenever you use a string in the scripting language it's inserted into the data section and an integer with an address relative to the data section is put in its place.

Added (Unix timestamp 1393044445): Oh, and I'm looking into the model files (BWM) too. I think I've identified the face list and vertex list.
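
The string lookup described above (an integer offset into the data section pointing at a null-terminated string) could be sketched roughly as follows. This is only an illustration: `DataSectionDemo`, `ReadStringAt` and the sample bytes are made-up names and data, and ASCII encoding is an assumption, since the thread doesn't say which encoding the strings use.

```csharp
using System;
using System.Text;

static class DataSectionDemo
{
    // Returns the null-terminated string starting at `offset` inside the data section.
    // The offset is relative to the start of the data section, not the file.
    // ASCII is an assumption; the encoding is not stated in the thread.
    public static string ReadStringAt(byte[] dataSection, int offset)
    {
        if (offset < 0 || offset >= dataSection.Length)
            throw new ArgumentOutOfRangeException(nameof(offset));

        // Find the terminating 0 byte, searching from the offset onwards.
        int end = Array.IndexOf(dataSection, (byte)0, offset);
        if (end < 0) end = dataSection.Length; // tolerate a missing final terminator

        return Encoding.ASCII.GetString(dataSection, offset, end - offset);
    }

    static void Main()
    {
        // Hypothetical data section holding two strings back to back.
        byte[] data = Encoding.ASCII.GetBytes("Hello\0World\0");
        Console.WriteLine(ReadStringAt(data, 0)); // Hello
        Console.WriteLine(ReadStringAt(data, 6)); // World
    }
}
```

A decompiler would then replace each such integer operand with the string it resolves to.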
 
In the last section, it reprints the global variables. It also assigns the values to them.
If I declare
global Ypos = 0
global Xpos = 1

in the last section it says
Null variable.........Ypos.......€?Xpos

The number in hex is 3F80 = 16256 = 0011 1111 1000 0000 (base 2).
I only have a basic understanding of floating point arithmetic, so I am wondering if you understand it?

 
You're right Willy, the last section seems to be assigning initial values to global variables. Below is how I've read them. The unknown value is always 2; it might indicate the data type of the value, so 2 might mean to read a float, for instance.

Code:
            // The table starts with a 32-bit count of saved globals.
            chlFile.SavedGlobalTable = new SavedGlobal[reader.ReadInt32()];

            Console.Write("Reading {0} saved globals... ", chlFile.SavedGlobalTable.Length);

            for (int i = 0; i < chlFile.SavedGlobalTable.Length; i++)
            {
                // Each entry: int32 (always 2 so far), float value, null-terminated name.
                SavedGlobal global = new SavedGlobal();
                global.Unknown = reader.ReadInt32();   // possibly a data-type tag
                global.Value = reader.ReadSingle();    // initial value as a 32-bit float
                global.Name = Helper.ReadNulledString(reader.BaseStream);

                chlFile.SavedGlobalTable[i] = global;
            }

            Console.WriteLine("Done.");
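
To try the record layout without a real CHL file, here is a small self-contained sketch that builds a sample table in memory and reads it back. Everything in it is a stand-in: the `SavedGlobal` class, `ReadNulledString` and `BuildSample` are guesses at the shape of the real types and helpers, not copies of them, and ASCII names are an assumption. The sample uses the globals from the earlier post. The layout also seems to line up with that dump: the 7 dots between "Ypos" and "€?" would be the name terminator, the four bytes of the 2, and the first two zero bytes of the float 1.0.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical record type mirroring the fields read in the snippet above.
class SavedGlobal
{
    public int Unknown;      // always 2 in the files examined so far
    public float Value;      // initial value of the global
    public string Name = "";
}

static class SavedGlobalDemo
{
    // Stand-in for Helper.ReadNulledString: reads bytes until a 0 terminator.
    public static string ReadNulledString(Stream stream)
    {
        var bytes = new List<byte>();
        int b;
        while ((b = stream.ReadByte()) > 0) // stops on 0 (terminator) or -1 (end of stream)
            bytes.Add((byte)b);
        return Encoding.ASCII.GetString(bytes.ToArray());
    }

    // Reads: int32 count, then per entry int32 tag, float value, null-terminated name.
    public static SavedGlobal[] ReadTable(BinaryReader reader)
    {
        var table = new SavedGlobal[reader.ReadInt32()];
        for (int i = 0; i < table.Length; i++)
        {
            var g = new SavedGlobal();
            g.Unknown = reader.ReadInt32();
            g.Value = reader.ReadSingle();
            g.Name = ReadNulledString(reader.BaseStream);
            table[i] = g;
        }
        return table;
    }

    // Builds sample bytes for "Null variable" = 0, "Ypos" = 0 and "Xpos" = 1.
    public static byte[] BuildSample()
    {
        var ms = new MemoryStream();
        var w = new BinaryWriter(ms);
        string[] names = { "Null variable", "Ypos", "Xpos" };
        float[] values = { 0f, 0f, 1f };
        w.Write(names.Length);
        for (int i = 0; i < names.Length; i++)
        {
            w.Write(2);                                        // the unknown field
            w.Write(values[i]);                                // 32-bit float
            w.Write(Encoding.ASCII.GetBytes(names[i] + "\0")); // null-terminated name
        }
        w.Flush();
        return ms.ToArray();
    }

    static void Main()
    {
        var reader = new BinaryReader(new MemoryStream(BuildSample()));
        foreach (var g in ReadTable(reader))
            Console.WriteLine("{0} = {1} (unknown = {2})", g.Name, g.Value, g.Unknown);
    }
}
```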

Between the data section and this final global data section there seem to be 4096 null bytes. This is consistent in every challenge file; I'm unsure why.

Added (Unix timestamp 1393162545): But other than that, I think that's the whole file format documented! :D
 
I spent some time yesterday trying to come up with an equation which could be used to go from the hexadecimal number to the floating point number. The attached image is the equation I came up with.
For the equation to work we have to convert the hexadecimal to binary, then it's very simple.

de = the decimal number of the exponent
b = the fraction part of the binary number
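
The attached image isn't reproduced in this text, so as a rough stand-in here is the standard IEEE 754 single-precision decoding written with the same two symbols: value = (-1)^sign × 2^(de − 127) × (1 + b). It may be arranged differently from the equation in the image. The sketch assumes that 3F80 is the upper half of the 32-bit pattern 3F800000 (which would fit Xpos = 1), and `FloatDecodeDemo` and `Decode` are made-up names.

```csharp
using System;

static class FloatDecodeDemo
{
    // Decodes a 32-bit IEEE 754 pattern by hand. Normal numbers only:
    // zero, subnormals, infinities and NaN are not handled in this sketch.
    public static double Decode(uint bits)
    {
        int sign = (int)(bits >> 31);              // 1 sign bit
        int de = (int)((bits >> 23) & 0xFF);       // 8 exponent bits as a decimal number
        double b = (bits & 0x7FFFFF) / 8388608.0;  // 23 fraction bits divided by 2^23
        return (sign == 0 ? 1 : -1) * Math.Pow(2, de - 127) * (1 + b);
    }

    static void Main()
    {
        // 3F800000: sign = 0, de = 127, b = 0, so the value is 2^0 * (1 + 0).
        Console.WriteLine(Decode(0x3F800000)); // 1

        // Cross-check against the runtime's own conversion of the same bits.
        float check = BitConverter.ToSingle(BitConverter.GetBytes(0x3F800000u), 0);
        Console.WriteLine(check); // 1
    }
}
```

In practice the reading code never needs this, since `ReadSingle` does the conversion, but it shows where the 1 comes from.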

Are you planning on creating a decompiler?
 
Willy said:
Are you planning on creating a decompiler?

That's the end goal. Although getting something that compiles perfectly into Lionhead's pseudo scripting language will be very tough. I should be able to get something translatable into it. We'll finally be able to replicate the vanilla wonders and AI.
 
What will you translate into if not the scripting language?
I think we could fix a lot of the problems that the game has.

By the way I am heading back to school tomorrow, so I won't be able to help much for a while.
 
Easter break is coming up, so I'll have 2 weeks to try and finish this then. The only issue right now is that a lot of engine calls aren't documented in Lionhead's documentation but are still used in the retail scripts.
 
Binary: https://dl.dropboxusercontent.com/u/3659637/bw2tools.zip
Source: https://github.com/HandsomeMatt/bw2-tools

Right now it outputs into a very simple pseudo-asm format. Hopefully this'll allow me to make a parser that'll then allow compilation back into CHL files, a lot like the CHLEX and CHASM used for BW Eruption.

My remaining todo list for the decompiler is pretty small though:
  • Identify instructions of types 30, 32 and 47.
  • Figure out how LHVM arrays work.
  • Identifying variables is weird; globals seem to affect locals... sometimes.
  • Change call instructions to use function names so they can be recompiled.
 