8-Bit Warrior Games + Design + Tutorials + Random


Bug Swarm Thingy…



I got distracted and experimented with a basic swarm mechanic. Why not, eh?

The "bugs" are attracted to a target position and apply relative image blending based on distance to the target. I also added a background from OpenGameArt.org to help things out a bit.
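For the curious, here is a rough C++ sketch of one way such a mechanic could work. It is not the demo's actual code: the `Bug` struct, the function names, and the idea of mapping distance linearly onto a 0..1 blend amount are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical bug state: a position in room coordinates.
struct Bug {
    double x;
    double y;
};

// Straight-line distance from a bug to the target position.
double distanceTo(const Bug& bug, double targetX, double targetY) {
    return std::hypot(targetX - bug.x, targetY - bug.y);
}

// Move the bug toward the target by at most `speed` units, without overshooting.
void stepToward(Bug& bug, double targetX, double targetY, double speed) {
    const double dist = distanceTo(bug, targetX, targetY);
    if (dist <= speed) {  // close enough: snap onto the target
        bug.x = targetX;
        bug.y = targetY;
        return;
    }
    bug.x += (targetX - bug.x) / dist * speed;
    bug.y += (targetY - bug.y) / dist * speed;
}

// Blend amount in [0, 1]: 0 at the target, 1 at `maxDistance` or beyond.
// Assumes maxDistance > 0.
double blendAmount(double dist, double maxDistance) {
    return std::max(0.0, std::min(1.0, dist / maxDistance));
}
```

Each step, every bug would call `stepToward` and then feed `blendAmount(distanceTo(...), maxDistance)` into whatever colour blend is being drawn.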

If super bored, you can check out the super-dandy HTML5 Demo.



[GameMaker] YYC Optimisation – Direct Value Access

(This does not benefit standard VM exports or HTML5/JS exports; it will actually make them run slower.)

Recently, I stumbled across a nifty optimisation for GameMaker's YYC (C++) compiler. In my tweening engine, I had noticed that some "simpler" easing scripts were performing much slower than more complicated ones.

For example...

/// EaseLinear()
return argument2 * argument0 / argument3 + argument1;

...was performing much slower than...

/// EaseInOutQuad()
var _arg0 = argument0/(argument3 * 0.5);
if (_arg0 < 1){ return argument2 * 0.5 * _arg0 * _arg0 + argument1; }
return argument2 * -0.5 * (--_arg0 * (_arg0 - 2) - 1) + argument1;

This had me baffled. So, I tweaked various parameters, attempting to find what was driving the nonsensical performance difference. Eventually, I noticed the unique thing EaseInOutQuad had which EaseLinear didn't:

operations involving a numerical constant

What I learned is that, under the hood, GameMaker seems to handle variables as a general data type which can be safely passed around to various expressions, functions, and scripts. I assume that these general types will then fetch and return their actual data type when needed, which may be a real number, string, array, or some other type. Apparently, this can lead to extra overhead when checking for a variable's data type at runtime.

Now, what does this have to do with numerical constants helping speed things up? Well!

When you write an operation explicitly involving a numerical constant, it can be assumed that other variables interacting with the numerical constant are (should be) real numbers, or at least know how to deal with them. In the right places, this can enable GameMaker to optimise things by directly accessing a variable's real value instead of accessing its general "packaged" data type first.

For example, in its simplest form:

x = a;      // What is 'a'?

x = 0+a; // Compiler can assume 'a' is a real number

In the example above, GameMaker will first access (a) as a general type before assigning its actual value to (x). With (0+a), however, the real value from (a) will be directly accessed since the numerical constant (0) makes it safe to assume the intention of the operation. This can lead to speed gains, even with the (possible) slight overhead from the "+0" operation. Stripped down, the compiled C++ output would basically look like this (but much messier)...

x = a; // Access 'a' and find which type to assign x

x = 0 + a.val; // Directly assign the real value of 'a' to x

Despite appearing more complex, the second line is faster as its real value is being directly accessed by the C++ code. The operation involving a numerical constant allows this to occur.
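To make the idea a bit more concrete, here is a small standalone C++ sketch of a "general" value with a type tag. This is NOT the real YYC runtime type. The `Value` struct and both function names are hypothetical, and only the `.val` member name is borrowed from the stripped-down lines above.

```cpp
#include <string>

// Hypothetical stand-in for a "general" GameMaker value. The real YYC
// runtime type is not shown in this post; this only illustrates the idea.
struct Value {
    enum class Kind { Real, String };
    Kind kind = Kind::Real;
    double val = 0.0;   // payload used when kind == Real
    std::string str;    // payload used when kind == String
};

// Generic path, like `x = a;`: inspect the type tag at runtime, then copy
// whichever payload applies.
Value assignGeneric(const Value& a) {
    Value x;
    x.kind = a.kind;
    if (a.kind == Value::Kind::Real) {
        x.val = a.val;
    } else {
        x.str = a.str;
    }
    return x;
}

// Direct path, like `x = 0 + a;`: the constant implies a real-number
// operation, so the real payload is read without checking the tag.
double assignDirect(const Value& a) {
    return 0 + a.val;
}
```

The generic path has to branch on the tag before it can copy anything, while the direct path is a single read of a double. That is the kind of difference the trick appears to exploit.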

Now! To get the most out of this technique, we need to utilize the Order of Precedence.

x = a + 0 + b * c; // We can do better!

In the above example, I have attempted to directly access the real values of all variables in the expression by adding (+ 0). However! Because (b * c) has a higher precedence and will be executed first, b and c will fail to get the intended optimisation of having their values directly accessed. Instead, we need to wrap brackets around the first executed variable and add a constant zero to it.

x = a + (0+b) * c; // That's better!

Now, (0 + b) will be executed first, with (b) directly accessing its value. Because (b) is now assumed to be a real value, (c) can also assume it is real. And because ( (0+b) * c ) is assumed to be real, (a) can ALSO assume it is real when it is accessed last. As a result, all values will be directly accessed by the C++ code:

x = a.val + (0 + b.val) * c.val; // YAY!

Note that this trick also works when multiplying or dividing by constant values:

x = a + 0.5 * b * c; // Divide and conquer!

Note that using this technique directly with script/function parameters can sometimes do more harm than good. Script and function parameters may require general data types to be passed as arguments. Remember that the general type is safer to pass around?

x = AddValues(a+0, b+0, c+0); // Probably BAD

x = ShowNumber(a + (0+b) / c); // Probably GOOD

In the first example above, because of how the YYC works, the numerical constant (+0) would force all 3 parameter values to first be pre-calculated and assigned to 3 temporary "general" variables. This creates extra overhead and can slow things down. However, with the second example, the optimisation benefits for the single argument would likely speed things up, as the single parameter involves a more complex calculation, allowing direct access to 'a', 'b', and 'c' in a single expression. The cost of the "extra overhead" would likely be outweighed.

As for the easing algorithms I mentioned at the start, placing (0 + argument2) at the start of EaseLinear was all that was needed to greatly boost its speed!

return (0+argument2) * argument0 / argument3 + argument1; // Huzzah!

Anyhow! There's no sure way to know how this could help speed up your own code until you try. Check to see where it helps and where it doesn't. Experiment and benchmark the results!

Also, be sure to check out the outputted C++ code for your project in the Asset Cache Directory. You can find the directory by going to File -> Preferences.


TweenGMS Dev History (Part 1)

As of writing this, I just completed a large update (v0.9.70) for TweenGMS. It adds a lot to the engine, while stripping away many redundant scripts which cluttered previous versions. I hope to have version 1.0 *fingers crossed* released by the end of the year.

I have now been working on the engine for nearly 3 years... crazy. Creating the tool has been an interesting adventure. At the beginning, I hardly knew what tweening was. Before TweenGMS, my previous experience included dabbling with a tweening engine in Unity, for about 30 minutes, a year before. I still hadn't "got it".

Back in 2012, a couple of weeks before Halloween, I decided that I wanted to make a "stupid little game" for the season. And, inspired by this video, I wanted to use the project as a way to learn and apply tweening. The result was Sugar Crash, a "stupid little game" made in just a few days which would later become the base for TweenGMS. (The game was originally called "Candy Crash" and was out before Candy Crush was on mobile platforms... I had never heard of Candy Crush)

When I set out to create my own tweening engine in GameMaker, I decided from the start to not reference other engines. I used Robert Penner's easing algorithms as a base and went from there. I didn't want to mold the engine around common standards found in other engines. Instead, I wanted to build it in a way that made sense for my own needs and worked best with GameMaker's GML language. I didn't know if I was "doing it right", but I started to form a system which was relatively simple to use and did what I needed it to. That's got to count for something!

To be continued...



Harry High Dive

I recently finished up work on a new mobile game, "Harry High Dive". I have been working on it part time as the programmer since last September(?) and have finally managed to put the finishing touches on it. Even "simple" projects can take much longer than expected when adding time for polish.

Anyhow, feel free to check it out! It is currently available for Android devices on Google Play, but will also be available for iOS soon enough!



TweenGMS – YYC Performance Boost

With TweenGMS, I have made use of script_execute() functions instead of switch statements to enhance performance. However, I have since found that this optimisation, while improving the standard VM and JS exports, greatly limits the YYC (C++ compiler). So, I experimented with changing out script_execute() for switch statements when executing property setters and easing algorithms. The performance gains for YYC were much higher than I originally expected. Below are benchmark results from my test.

The benchmark used patrolling tweens with the EaseInOutSine algorithm. The numbers below represent the number of tweens running before the frame rate drops below 30. Run on an AMD 2.2 GHz quad-core CPU (only one core used).

3500 - VM Runner using switch statement

5220 - VM Runner using script_execute() -> 49% gain over previous

12720 - YYC using script_execute() -> 144% gain over previous

28240 - YYC using switch statement -> 122% gain over previous

As you can see, there is a huge difference between these numbers. It is easy to see why I originally used script_execute, as it greatly improves performance for the standard VM Runner by nearly 50%. When using script_execute, there are still massive gains over the VM Runner when using the YYC. However, when reverting to switch statements for the YYC, gains are through the roof. YYC performance more than doubles and ends up more than 5X faster than the optimised VM Runner version (8X faster than VM-switch).
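The gains quoted above can be checked with a little arithmetic. Here is a small C++ sketch that does so. The helper names `percentGain` and `speedup` are made up for illustration, and the inputs are simply the tween counts from the benchmark list.

```cpp
#include <cmath>

// Percentage gain of `current` over `previous`, rounded to the nearest
// whole percent.
long percentGain(double previous, double current) {
    return std::lround((current / previous - 1.0) * 100.0);
}

// How many times larger `current` is than `baseline`.
double speedup(double baseline, double current) {
    return current / baseline;
}
```

For example, `percentGain(3500, 5220)` gives 49, `percentGain(5220, 12720)` gives 144, and `percentGain(12720, 28240)` gives 122. `speedup(5220, 28240)` works out to roughly 5.4 and `speedup(3500, 28240)` to roughly 8.1.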

In a future TweenGMS update, I will be adding an optional optimisation path which will allow you to get these gains when using the YYC. By default, the system will continue to use script_execute, but with a tiny bit of extra work, you can easily switch on (or off) the YYC optimisations provided by managed switch statements for easing algorithms and property setters.
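To illustrate the two dispatch styles being compared, here is a hedged C++ sketch. It is not TweenGMS code and does not reproduce the benchmark. The `Ease` enum, `easeIndirect`, and `easeSwitch` are hypothetical names, with a function pointer standing in loosely for script_execute(). The easing maths follows the Robert Penner style formulas used in EaseLinear and EaseInOutQuad.

```cpp
// Penner-style easing signature: t = time, b = start, c = change, d = duration.
using EaseFn = double (*)(double t, double b, double c, double d);

double easeLinear(double t, double b, double c, double d) {
    return c * t / d + b;
}

double easeInOutQuad(double t, double b, double c, double d) {
    t /= d * 0.5;
    if (t < 1.0) { return c * 0.5 * t * t + b; }
    t -= 1.0;  // decrement first, then use the new value below
    return c * -0.5 * (t * (t - 2.0) - 1.0) + b;
}

// Hypothetical identifiers for the easing algorithms in this sketch.
enum class Ease { Linear, InOutQuad };

// Indirect dispatch: the target is only known at runtime, loosely
// analogous to calling script_execute() with a stored script.
double easeIndirect(EaseFn fn, double t, double b, double c, double d) {
    return fn(t, b, c, d);
}

// Switch dispatch: each case names its target function explicitly.
double easeSwitch(Ease id, double t, double b, double c, double d) {
    switch (id) {
        case Ease::Linear:    return easeLinear(t, b, c, d);
        case Ease::InOutQuad: return easeInOutQuad(t, b, c, d);
    }
    return b;  // not reached for valid ids
}
```

Both paths return the same values, so `easeSwitch(Ease::Linear, 0.5, 10, 20, 1)` and `easeIndirect(easeLinear, 0.5, 10, 20, 1)` both give 20. Which one is faster depends on the export target, as the numbers above show, so benchmark before committing to either.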