Oolite Bulletins

 Post subject: Re: OXP Performance tips
PostPosted: Sun May 06, 2018 3:42 am 

Joined: Fri Mar 17, 2017 1:49 am
Posts: 83
The new version of Telescope is about 35% faster than 1.13 (14% faster than 1.15) and garbage production was decreased by over 70% vs 1.15 (1.13 & 1.15 have similar rates). Here are some of the techniques I used to improve performance:

Take out the garbage
We've talked recently in this thread about reducing garbage by reusing objects & arrays and performing some vector functions locally. That got me much of the 70%. I also got rid of most calls that return arrays. For example, I was using splice to remove an item from a list. Splice returns an array of what was deleted, which I didn't need.
Code:
    //mapping.splice( found, 1 );
    for( var i = found, len = mapping.length; i < len - 1; i++ ) {
        mapping[ i ] = mapping[ i + 1 ];
    }
    mapping.length = --maplen;
This is a wee bit slower, by about 1 microsec (0.001 ms), which I think is a fair trade-off for less garbage.
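The same idea can be wrapped in a small helper. This is only a sketch: `removeAt` is a hypothetical name, not something from Telescope or Oolite. It shifts the tail down one slot and shortens the array, so no throwaway array is created.

```javascript
// Hypothetical helper: remove list[ index ] in place, without the array that splice() would allocate.
function removeAt( list, index ) {
    var last = list.length - 1;
    if( index < 0 || index > last ) return false;   // nothing to remove
    for( var i = index; i < last; i++ ) {
        list[ i ] = list[ i + 1 ];                  // shift the tail down one slot
    }
    list.length = last;                             // drop the duplicated final element
    return true;
}

var mapping = [ 'a', 'b', 'c', 'd' ];
removeAt( mapping, 1 );     // mapping is now [ 'a', 'c', 'd' ]
```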

Replace unnecessary function calls
A number of Math functions are convenient during development but can be replaced with inline expressions:
Code:
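    Math.abs( x ):      x >= 0 ? x : -x
    Math.max( x, y ):   x > y ? x : y
    Math.min( x, y ):   x < y ? x : y
    Math.floor( x ):    ~~x                         // x >= 0 only
    Math.round( x ):    ~~( x + 0.5 )               // x >= 0 only
    Math.ceil( x ):     ~~x === x ? x : ~~x + 1     // x >= 0 only; plain ~~( x + 1 ) is off by one for whole numbers
[~ is the bitwise not, so x gets converted to a 32-bit integer w/ all its bits flipped,
 the 2nd ~ flips them back, leaving x without its fractional part. That truncates toward zero,
 so the ~~ forms only match the Math versions for non-negative x below 2^31]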
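One thing to watch with the ~~ forms is where they stop agreeing with the Math functions. A quick check in plain JS (nothing Oolite-specific, sample values chosen only for illustration):

```javascript
// ~~ truncates toward zero; Math.floor rounds toward negative infinity.
var samples = [ 0, 2.7, 41.999, -2.7 ];
var agree = [];
for( var i = 0; i < samples.length; i++ ) {
    var x = samples[ i ];
    agree.push( ~~x === Math.floor( x ) );
}
// agree is [ true, true, true, false ]: only the negative sample differs
// (~~-2.7 is -2, Math.floor( -2.7 ) is -3)

// ~~ also wraps around outside the signed 32-bit range
var big = 3000000000.5;
var wrapped = ~~big !== Math.floor( big );      // true
```

So the substitution is safe for things like list indexes and positive distances, less so for arbitrary coordinates.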
We usually don't want duplicate code (maintenance, readability, duplicate bugs) but exceptions can be made where speed is essential. Telescope has a function for calculating distance which gets called in 14 places. But the loop that repositions effects has some common vector calculations, so moving some of the distance code inline eliminated duplicated work. Just be sure to document it well in both functions.

Prioritize compound conditions
As many of you know, JS stops evaluating a conditional expression early (short-circuits) in some situations:
Code:
    if( a && b && c ) ...   // If a is false, neither b nor c will be evaluated, and
    if( x || y || z ) ...   // if x is true, neither y nor z will be evaluated
                            // To do so would be pointless, as they wouldn't change the final value.
We can use this to enhance speed by putting the faster ones first.
Code:
    var ws = worldScripts.someOxp;
    var allowed = someFunction();
    var ps = player.ship;
    if( allowed && ps.docked && ps.bounty === 0 && ps.mass < 130000 && ws.$isSpecial )
This is the most efficient ordering, all things being equal, as 'allowed' is local, 'docked' is a property of PlayerShip, 'bounty' is a property of ship, 'mass' is a property of entity and '$isSpecial' is a worldScripts property. We may re-order them if we know one is almost always true (shift right for &&, left for ||) or almost always false (shift left for &&, right for ||). The goal is to get it to abort quickly when it does abort.

If the 'allowed' variable is only used to determine entrance into the if block, move it into the expression so it's only executed when absolutely necessary.
Code:
    if( ps.docked && ps.bounty === 0 && ps.mass < 130000 && someFunction() && ws.$isSpecial )
[whether to place it before or after the worldScripts property is determined by profiling to see which takes longer]

Profiling shows that WorldScriptsGetProperty is about 18 times slower than EntityGetProperty. Such comparisons are relative, as WorldScriptsGetProperty varies with the number of oxp's you load, and MissionVariablesGetProperty (about 3 times slower) varies with the size of your save file.

A rare example came up recently involving the && of two .indexOf calls, where one list was always much shorter than the other. Putting the short one first saves time.
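A minimal sketch of that last case, with made-up lists and a counter to show how often the expensive search actually runs (all names here are hypothetical):

```javascript
// Hypothetical lists: a short one (cheap to search) and a long one (expensive to search).
var shortList = [ 'adder', 'cobra' ];
var longList = [];
for( var i = 0; i < 1000; i++ ) longList.push( 'ship' + i );

var longSearches = 0;
function inLong( name ) {
    longSearches++;                         // count how often the expensive search runs
    return longList.indexOf( name ) >= 0;
}

function inBoth( name ) {
    // cheap test first: && skips inLong() whenever the short list misses
    return shortList.indexOf( name ) >= 0 && inLong( name );
}

inBoth( 'ship500' );    // not in shortList, so longList is never searched
inBoth( 'cobra' );      // in shortList, so longList is searched (and misses)
// longSearches is now 1
```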

Spread the load over several frames
Some jobs need to repeat faster than a Timer allows (is interval still a min. 0.25 sec?) but take longer than we'd like for a frame callback. I use a scheme that allows a job to be spread over several frames. For example, updating the MFD entails selection, distance & heading calculations and formatting. I've been shooting for 2 ms operations and updating an MFD list of 10 ships takes triple that. The user will never notice a delay of a couple frames (not so for visual effects, that will be noticed), so I process a few ships and suspend the rest until the next frame. To do this, I maintain an array of pending functions and at the end of my frame callback, if any are present, I do one. Thus the MFD text is prepared over 3 frames.

Strictly speaking, this example doesn't save any time, it just levels out the load (MFD is updated 1/sec, so whether I do it all at once or across frames, the same work is done). I also use this for creating a new scan or updating an existing one. Time is saved though when pending function calls get purged. A new scan purges everything pending, as there's no point updating the MFD, for example, when a new list is being created.
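A rough sketch of such a pending queue follows. It is not Telescope's code: `setPending`, `purgePending`, `formatShips` and the chunk size are hypothetical, and `frameCallback` is a stand-in for whatever function the game calls each frame.

```javascript
// Sketch of a pending-job queue; all names are hypothetical, not Telescope's.
var pending = [];                           // jobs waiting for a later frame

function setPending( fn, arg ) {
    pending.push( { fn: fn, arg: arg } );
}
function purgePending() {
    pending.length = 0;                     // e.g. a new scan makes queued work pointless
}

var ships = [ 's0', 's1', 's2', 's3', 's4', 's5', 's6', 's7', 's8', 's9' ];
var report = [];
var CHUNK = 4;                              // ships formatted per frame (assumed budget)

function formatShips( start ) {
    var end = start + CHUNK < ships.length ? start + CHUNK : ships.length;
    for( var i = start; i < end; i++ ) {
        report.push( ships[ i ] + ' formatted' );
    }
    if( end < ships.length ) setPending( formatShips, end );   // resume next frame
}

// Stand-in for the frame callback the game would invoke each frame.
function frameCallback( delta ) {
    // ... the usual per-frame work goes here ...
    if( pending.length > 0 ) {
        var job = pending.shift();          // do one pending job per frame
        job.fn( job.arg );
    }
}

setPending( formatShips, 0 );
var frames = 0;
while( pending.length > 0 ) {               // simulate frames until the job is finished
    frameCallback( 1 / 60 );
    frames++;
}
// frames is 3: chunks of 4, 4 and 2 ships
```

Calling `purgePending()` at the start of a new scan is what drops the stale work.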

Significant time is saved when you spread high frequency jobs across frames. For updating the position of visual effects, rather than update them all in one frame, I can do half in one frame, the rest in the next frame, with no noticeable difference. This effectively cuts the per-frame time for updating effect positions in half.

Vary Effects updates by distance
Visual effects are tricky, as the human eye can detect oddities even at 60 Hz. It may not be clear what's up; it just looks wrong. Telescope deals with objects both near and far. The near ones, within 2 * scannerRange, get updated every frame. But those outside that limit can be updated less frequently, like every other frame. And those beyond 4 * scannerRange get updated every 4th frame. This is only done when flying at normal speeds. With the torus drive, everything gets done each frame, as at those speeds, it does become noticeable.
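The distance tiers could be expressed along these lines. This is a sketch only: the function names are hypothetical and the scanner range constant is an assumed value for illustration (in game you would read the real one).

```javascript
// Hypothetical sketch: how often to reposition an effect, given its distance.
var SCANNER_RANGE = 25600;                  // assumed value for illustration only

function updateInterval( distance, usingTorus ) {
    if( usingTorus ) return 1;                      // at torus speeds, update everything every frame
    if( distance <= 2 * SCANNER_RANGE ) return 1;   // near: every frame
    if( distance <= 4 * SCANNER_RANGE ) return 2;   // middle: every other frame
    return 4;                                       // far: every 4th frame
}

function dueThisFrame( frameCount, distance, usingTorus ) {
    return frameCount % updateInterval( distance, usingTorus ) === 0;
}
```

With a running frame counter, `dueThisFrame( frameCount, distance, false )` tells you whether a given effect needs repositioning on this frame.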

Cache, Cache, Cache
Whenever you've performed an expensive op, you never want to repeat it if you can avoid it. Say you support a number of other oxp's planetary naming conventions. Once you figure out what that planet's name is, save it until you leave the system. I know this is obvious but there are often cases that you can miss (I know I can & do!). When your profiling points to a problem function, always check if there is something that doesn't need doing on every call.

Just recently, I ran into this. I thought my MFD formatting was quick enough until I added a name shortening feature (to avoid really squished text when using randomshipnames). Suddenly, that function rose to the top of the list of speed hogs. But, like the planet, a ship's name doesn't change and a cache solved the problem:
Code:
    var lastShipReports = {};               // cleared when entering witchspace
    function ShowShipReport( map ) {
        var name = '', cached = false,
            key = ent.entityPersonality;    // orbs don't have one, so are not cached (PlanetName has its own cache)
        if( key && lastShipReports.hasOwnProperty( key ) ) {
            name = lastShipReports[ key ];
            cached = true;
        }
        ...
        if( key && !cached ) lastShipReports[ key ] = name;    // skip entities without a key
    }
Frame Rate customization
Telescope has a lot going on and its performance varies considerably from machine to machine. One way to combat this is to monitor its frame rate and adjust accordingly; sometimes even that's not enough, so you'll have to reduce functionality.

I wrote an oxp a while back to monitor frame rate (fps_monitor, clever title, no?) that collects lots of data but that may be overkill. If you have a frame callback, just sum the delta values it gets passed and increment a counter on each call. When the summed delta hits 1, your counter has the # of frames in the last second. What my utility adds is average fps for set intervals, high and low values, and different methods of calculating what the 'average' is.
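The simple counting scheme might look like this. It is a sketch with hypothetical names, not fps_monitor's code; `countFrame` is what you would call from your frame callback with the delta it receives.

```javascript
// Sketch of the counting scheme; names are hypothetical and this is not fps_monitor's code.
var elapsed = 0;        // seconds accumulated since the last reading
var frameCount = 0;     // frames seen since the last reading
var lastFps = -1;       // -1 until the first full second has passed

function countFrame( delta ) {              // delta: seconds since the previous frame
    elapsed += delta;
    frameCount++;
    if( elapsed >= 1 ) {
        lastFps = frameCount / elapsed;     // divide by elapsed, as the total rarely lands exactly on 1
        elapsed = 0;
        frameCount = 0;
    }
}

// Simulate 2 seconds of frames arriving every 0.02 s (50 fps).
for( var i = 0; i < 100; i++ ) countFrame( 0.02 );
// lastFps is close to 50
```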

I regularly zip through my list of sightings, checking their status to see if they need deleting. I've scaled this to be a function of the player's PC. The function has 2 modes, quickly vs full check.
Code:
        ...
        if( parm === true ) {
            quickly = that.quickly = true;
            fps = that.fps = current_fps();                         // quickly is fast, so check fps ships/frame
            if( fps < 0 ) fps = that.fps = 30;                      // current_fps returns -1 until 1st min. has passed
            starting = i = maplen;
        } else if( parm === false || parm === undefined ) {
            quickly = that.quickly = false;
            fps = current_fps();
            if( fps < 0 ) fps = 30;                                 // current_fps returns -1 until 1st min. has passed
            fps = that.fps = ~~(fps / 5) || 1;                      // store as fn prop for next frames' execution (min. 1, so i % fps stays valid)
            starting = i = maplen;
        } else {                                                    // parm is an index # to resume
            quickly = that.quickly === undefined ? true : that.quickly;     // '|| true' would always yield true
            fps = that.fps || 6;
            starting = maplen;
            i = parm;
        }
        ...
        while( ...
            ...
            if( i > 0 && i % fps === 0 ) {                          // checking list can take more time than we'd like in a frame
                set_fn_pending( check_Sightings, i );               //   so suspend the work until next frame
                return;                                             // so we do a chunk each frame, its size a fn of fps
            }
            ...
I let the PC's frame rate dictate how many ships to check each frame, dividing by 5 for a full check as it takes much longer.

When creating a new list (the telescope 'scan'), I have to sort through all ships in the system, which is usually over 150. Doing it all at once will cause the frame rate to crater, so I use the PC's frame rate to split the job across frames and monitor the effect. If the impact is too high, I increase the number of frames in the spread.

Telescope has a variable, MaxTargets, limiting the size of the list of sightings. Players may specify a value but if their hardware cannot handle it, it gets reduced. The adjustment is not always down, as a temporary dip in the frame rate would cause its adjustment to be too low. Using a long term average (5 min) as a baseline, I compare the relative effect on frame rate and increase/decrease accordingly (see function init_growing if you're intrigued).

I also use frame rate to estimate the distance travelled in one frame, to accurately position effects when travelling at high speed. This can only ever be an estimate, as the frame rate can fluctuate a lot from one frame to the next, so I err on the side of caution.

Ensure gets are not repeated
[spoiler: involves closures]

Long or complex scripts must be broken into smaller functions, if for no other reason than our sanity. Function calls themselves are not very expensive (about 1.2 microsec) and smaller code chunks are easier to reason about, to test in isolation and for others to understand. The problem with many smaller functions goes back to property gets. Not all can be cached, only the constant ones. Function references are fine as they never change, as are some objects, especially if they're in your control. But many object references cannot be cached, so each function must perform its own lookup. Writing one humungous function to realize the saving of only doing a lookup once is not my preferred solution. Another way is using a closure.

A closure in JS is a function bundled with the variables of the scope it was created in; the usual way to get one is a function that returns a reference to an inner function. One thing this feature supports is independent features on a web page. Imagine a field that takes user input. It has to remember that input for when the user returns to that field. A closure is not required to do this but it makes it a lot easier. Without one, the value would have to be stored somewhere external to the function, as a normal function's variables get tossed when the function exits. By returning a reference to an internal function, JS must preserve its variables for when that referenced function is called. Think of it like a 'Do Not Disturb' sign on a hotel room door; the JS maid stays out and leaves everything as it is.

Closures have gotten a bad rep for causing memory leaks, among other things. This was due to programmer error, often generating these references in loops or creating many copies of the closure.

For our purposes, we only need a single instance of the closure. This can be done either by calling it at start up or by having it self-initiate, so it's created when the script loads. Once created, we can cache distant lookups in local variables, so they are available to all. And we hardly ever need to type the word 'this' again :)
Code:
this.startUpComplete = function() {         // closure is created here, as the towbar script may not exist when we exit startUp
                                            // could be done in startUp if closure does not reference other oxp's
    if( !this._towedMass ) {
        let mc = this._myClosure();         // create closure by calling it
        this._setTowed = mc.setTowed;       // cache function references in script variables
        this._clearTowed = mc.clearTowed;
        this._towedMass = mc.towedMass;
        // to get the mass of the towed ship, use this._towedMass( ship )
    }
}
this._myClosure = function() {
    var wt = worldScripts.towbar;   // caches reference to towbar script
    var towed = null;               // reference to ship in question
    var mass = 0;                   // persistent local variable
    // private function that's only available inside _myClosure
    function isTowed( ship ) {
        if( towed === null )
            setTowed();
        return ship === towed;
    }
    // public functions because they are returned
    function setTowed() {           
        var newShip = wt && wt.$TowbarShip;
        if( newShip && newShip !== towed ) {
            towed = newShip;
            mass = towed.mass;          // property get only when ship changes
        }
    }
    function clearTowed() {
        towed = null;
        mass = 0; 
    }
    function towedMass( ship ) {
        if( !ship ) return 0;
        if( !isTowed( ship ) ) return 0;
        return mass;
    }
    return { setTowed : setTowed,
             clearTowed: clearTowed,
             towedMass: towedMass
           };
}
The lookup of the towbar script is done once, when the closure is created. The lookup of $TowbarShip is limited to calls to setTowed and the mass property get is only done when the towed ship changes. 'towedMass' can be called repeatedly without repeating a property get.

This is a trivial example but in telescope, with 100+ functions and 70+ variables, the savings can really add up. The scheme I used involves setting all the local global variables (glocals?) to -1 when I start processing a new sighting. When a function needs a property:
Code:
    if( mass < 0 ) mass = ship.mass;
    switch( mass ) {
    ...
It's longer code but fast. You can do hundreds of '< 0' tests before you come close to the expense of the property get. Another perk of this scheme is you don't need to keep track of when & where a property was gotten. Assume it wasn't, add the test and you're free to concentrate on your oxp!
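To see the sentinel scheme at work outside the game, here is a sketch with a stand-in ship object whose `mass` getter counts how often it is read. Everything in it (`massClosure`, `newSighting`, `isHeavy`, `massClass`, the mass values) is hypothetical.

```javascript
// Stand-in for an Oolite ship whose property get is expensive; `gets` counts the lookups.
var gets = 0;
var ship = {};
Object.defineProperty( ship, 'mass', {
    get: function() { gets++; return 130000; }
} );

var massClosure = ( function() {
    var mass = -1;                          // -1 means "not fetched yet" (a real mass is never negative)

    function newSighting() { mass = -1; }   // reset when starting on a new sighting
    function isHeavy() {
        if( mass < 0 ) mass = ship.mass;    // property get only on first use
        return mass > 100000;
    }
    function massClass() {
        if( mass < 0 ) mass = ship.mass;    // same test; no get needed if isHeavy() ran first
        return mass < 50000 ? 'light' : 'heavy';
    }
    return { newSighting: newSighting, isHeavy: isHeavy, massClass: massClass };
} )();

massClosure.isHeavy();
massClosure.massClass();
massClosure.isHeavy();
// gets is 1: three calls, one property get
```

Neither function needs to know whether the other ran first, which is the point of the test.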

FYI, fps_monitor is a self-initiating closure and I wrote Station Options as a closure too, so they are much shorter examples to check out.
(800 lines for fps_monitor vs 2300 for Station Options vs 5500 for Telescope)
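For readers who haven't met the term, a self-initiating closure is a function expression that is called as soon as it is defined. A bare-bones sketch (the `sightings` counter is made up for illustration and has nothing to do with the oxp's named above):

```javascript
// Hypothetical counter module; the closure is created once, when this statement runs.
var sightings = ( function() {
    var count = 0;                          // private, survives between calls

    function add() { return ++count; }
    function reset() { count = 0; }
    function total() { return count; }

    return { add: add, reset: reset, total: total };   // only these are visible outside
} )();                                      // the trailing () is what makes it self-initiating

sightings.add();
sightings.add();
// sightings.total() is 2, and count itself cannot be reached from out here
```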

_________________
"Better to be thought a fool, boy, than to open your trap and remove all doubt." - Grandma [over time, just "Shut your trap... fool"]
"The only stupid questions are the ones you fail to ask." - Dad
How do I...? Nevermind.


 Post subject: Re: OXP Performance tips
PostPosted: Tue May 08, 2018 4:04 am 

Joined: Thu Apr 05, 2018 5:31 am
Posts: 123
Location: Vladivostok, Russia
In some scripts I find a construction like
Code:
this.self = player.ship;
I have a question (maybe a bit stupid):
Does this construction have any advantage in time efficiency?


 Post subject: Re: OXP Performance tips
PostPosted: Thu May 10, 2018 12:04 am 

Joined: Fri Mar 17, 2017 1:49 am
Posts: 83
By itself, there's no real difference (~40 nanosec). It stores a 1 hop property reference in another 1 hop reference. If you're going to reference the player's ship more than once in a function, use

var ps = player.ship;

so you save one hop every time you use it (ps.target, ps.speed, etc.)

The savings really come about when saving a many hop lookup locally, esp. in loops in frame callbacks, where fractions of ms count.

Compare 2 (admittedly extreme) versions of a function that answers "Is my target a member of my group?"
Code:
this.$targetInMyGroup = function() {
	for( let i = 0; i < player.ship.group.ships.length; i++ ) {
		if( player.ship.target === player.ship.group.ships[ i ] )
			return true;
	}
	return false;
}
Code:
this.$fastTargetInMyGroup = function() {
	var ps = player.ship;
	var pst = ps.target;
	var groupShips = ps.group.ships;
	
	for( let i = 0, len = groupShips.length; i < len; i++ ) {
		if( pst === groupShips[ i ] )
			return true;
	}
	return false;
}
and their profiles of being run 60 times, to simulate a PC running the game at 60 frames per sec
Code:
[log] $targetInMyGroup:
Total time: 7.849 ms
JavaScript: 2.483 ms, native: 5.357 ms
Counted towards limit: 5.97458 ms, excluded: 1.87442 ms
Profiler overhead: 2.138 ms
                                                        NAME  T  COUNT    TOTAL     SELF  TOTAL%   SELF%  SELFMAX
                                        ShipGroupGetProperty  N    420     3.96     2.60    50.5    33.1     0.16
                           (<console input>) targetInMyGroup  J     60     7.52     2.17    95.9    27.6     0.33
                                             ShipGetProperty  N    600     1.40     1.12    17.8    14.3     0.10
                                           JSNewNSArrayValue  N    420     1.21     0.65    15.4     8.3     0.06
                                          JSArrayFromNSArray  N    420     0.55     0.55     7.1     7.1     0.19
                     (<console input>) do_60_targetInMyGroup  J      1     7.84     0.32    99.9     4.0     0.32
                                         JSShipGetShipEntity  N    600     0.27     0.27     3.5     3.5     0.01
                                     JSShipGroupGetShipGroup  N    420     0.15     0.15     2.0     2.0     0.01
[log] $fastTargetInMyGroup:
Total time: 1.805 ms
JavaScript: 0.972 ms, native: 0.826 ms
Counted towards limit: 1.47693 ms, excluded: 0.32807 ms
Profiler overhead: 0.572 ms
                                                        NAME  T  COUNT    TOTAL     SELF  TOTAL%   SELF%  SELFMAX
                       (<console input>) fastTargetInMyGroup  J     60     1.49     0.67    82.7    37.0     0.22
                                        ShipGroupGetProperty  N     60     0.52     0.36    29.0    19.9     0.01
                 (<console input>) do_60_fastTargetInMyGroup  J      1     1.80     0.31    99.6    16.9     0.31
                                             ShipGetProperty  N    120     0.30     0.25    16.7    14.1     0.04
                                           JSNewNSArrayValue  N     60     0.14     0.08     7.9     4.7     0.00
                                          JSArrayFromNSArray  N     60     0.06     0.06     3.2     3.2     0.01
                                         JSShipGetShipEntity  N    120     0.05     0.05     2.7     2.7     0.00
                                     JSShipGroupGetShipGroup  N     60     0.02     0.02     1.2     1.2     0.00

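The first has lots of unnecessary property gets. Compare their Total time: 7.849/1.805 = 4.35 times faster. At 60 fps, there is 16.67 ms between each frame. Those totals cover all 60 calls, so a single call costs about 0.13 ms vs 0.03 ms; the fps figures here take the pessimistic case of the whole total landing in every frame. On that basis, using $targetInMyGroup, the rate drops to 40.8 fps but with $fastTargetInMyGroup, it only drops to 54.1.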

Before everyone jumps on my case, I did say it's an extreme example. This check really belongs in a timer, not a frame callback. And of course, everyone caches their loop's length, right? :wink:

In a 0.25 sec timer, the same per-call costs (7.849 & 1.805 ms) give frame rates of 58.12 & 59.57 respectively. But add that up over all the timers & all the frame callbacks in all the oxp's you've loaded ... :shock: cache everything worthwhile! Besides, it makes your code easier to read.
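For anyone wanting to redo the arithmetic, here is a small sketch. The function names are hypothetical, and it assumes the profiled totals (7.849 and 1.805 ms) are charged in full to each frame or each timer call, which is how the fps figures quoted in this post work out; the profile itself covers 60 calls, so real per-call costs would be smaller.

```javascript
// Frame rate when extraMs of script time is added to every frame.
function fpsWithFrameCost( baseFps, extraMs ) {
    return 1000 / ( 1000 / baseFps + extraMs );
}
// Frame rate when a timer spends extraMs per call, callsPerSec times a second.
function fpsWithTimerCost( baseFps, extraMs, callsPerSec ) {
    return ( 1000 - extraMs * callsPerSec ) / ( 1000 / baseFps );
}

var slowFrame = fpsWithFrameCost( 60, 7.849 );      // about 40.8
var fastFrame = fpsWithFrameCost( 60, 1.805 );      // about 54.1
var slowTimer = fpsWithTimerCost( 60, 7.849, 4 );   // about 58.12
var fastTimer = fpsWithTimerCost( 60, 1.805, 4 );   // about 59.57
```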

 Post subject: Re: OXP Performance tips
PostPosted: Thu May 10, 2018 10:49 am 

Joined: Thu Apr 05, 2018 5:31 am
Posts: 123
Location: Vladivostok, Russia
Thank you for such a detailed answer, cag!


 Post subject: Re: OXP Performance tips
PostPosted: Fri Jun 29, 2018 1:35 am 

Joined: Fri Mar 17, 2017 1:49 am
Posts: 83
Quote:
This is the tip I gave phkb. I realized then that, although it is implicit in what was already covered, it is not obvious:

From:
Code:
this.$rand = function(max) {
	return Math.floor((Math.random() * max) + 1);
}
To:
Code:
this.$rand = function $rand(max) {
	var that = $rand;
	var floor = (that.floor = that.floor || Math.floor);
	var random = (that.random = that.random || Math.random);
	return floor((random() * max) + 1);
}
This way, rather than using 2 accesses outside the function (and quite far away, in a library), after the first time, you only use 2 accesses to an immediate property of the function.
It seems that was enough to make a big difference in phkb's code.

I always do this, each time I access an external library or script.
Quote:
Quote:
So you're basically storing a (far) reference as a function's property. Would it be safe to say, for clarity, that
Code:
var floor = (that.floor = that.floor || Math.floor);
is logically equivalent to
Code:
if( that.floor === undefined ) that.floor = Math.floor;
var floor = that.floor;
Exactly!
This technique works well for objects/arrays too (functions are just a type of object).
And while it *can* be used for simple variables, you'd be well advised to use the following if said variable can ever be falsey (null, 0, undefined in addition to false)
Code:
var myVar = that.myVar = that.myVar === undefined ? -1 : that.myVar;
When using the || version, if the myVar property ever becomes falsey, it will get re-initialized on the next call, not just on the 1st time it's ever called.

It's a handy way to remember values from one call to the next but if in doubt, use the second form.

When I think of the hours lost chasing bugs (yes, plural, I have a thick skull) from using the first ... I need a beer :cry:
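A small demonstration of the difference, using two made-up countdown functions that each remember a value on themselves (the names and the starting value 3 are purely illustrative):

```javascript
// Hypothetical functions that remember a counter on themselves between calls.
function buggyCountdown() {
    var that = buggyCountdown;
    var n = that.n = that.n || 3;           // re-initialises whenever n reaches 0 (falsy)
    that.n = n - 1;
    return n;
}
function safeCountdown() {
    var that = safeCountdown;
    var n = that.n = that.n === undefined ? 3 : that.n;    // initialises once only
    that.n = n - 1;
    return n;
}

var buggy = [], safe = [];
for( var i = 0; i < 5; i++ ) {
    buggy.push( buggyCountdown() );
    safe.push( safeCountdown() );
}
// buggy is [ 3, 2, 1, 3, 2 ]   (0 was mistaken for "not set yet")
// safe  is [ 3, 2, 1, 0, -1 ]
```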

 Post subject: Re: OXP Performance tips
PostPosted: Thu Oct 18, 2018 10:51 pm 

Joined: Sat Sep 12, 2009 11:58 pm
Posts: 1071
Location: Essex (mainly industrial and occasionally anarchic)
To make a very late reply to cag:

1) that seems involved (!)
2) shouldn't compilers take care of this sort of thing? Except, hold on, it's an interpreted language. But perhaps there is some sort of JIT (as I believe it is called) . .


 Post subject: Re: OXP Performance tips
PostPosted: Fri Oct 19, 2018 5:00 am 

Joined: Fri Mar 17, 2017 1:49 am
Posts: 83
Quote:
1) that seems involved (!)
Could you be more specific?
Quote:
2) shouldn't compilers take care of this sort of thing? Except, hold on, it's an interpreted language. But perhaps there is some sort of JIT (as I believe it is called) . .
Perhaps, maybe; I don't really know. We're not using the most up to date interpreter, that would be too much work for too little gain. (And may break older code).

I just go by the profiling numbers (it's such a great tool). Shaving a ms off a frame callback is worth it, IMHO. I usually play with a couple hundred oxp's (mostly eye candy) but if 1% 5% are using FCBs, those ms can add up fast.

I'd like to remind all the readers that much of this only applies to that small portion of code that's gobbling up resources. Outside of FCBs (& some timers), forgeddabatit. Your time would be better spent elsewhere (anywhere else!)
