
A plight for performance improvements


Recommended Posts

17 hours ago, EnTro said:

So I have uploaded the save game here: https://filebin.net/rhmmnrl7fjsc1dtm

I've re-run your experiment three times now, under as controlled conditions as I was able to set up. No other applications running, camera in the same spot (using a camera hotkey, focused on the geothermal plant in the bottom right of Sapona), starting from your fresh save every time. I get a baseline (1x speed) of 27-30 FPS, and an end state (only the two dupes in the rockets still alive) of 37-42 FPS. FPS increases more or less linearly with the number of dupes killed, with those on Florino being worth a bit more than those on the other two planetoids. So far, so good; seems to support my theory.

But! FPS occasionally varies wildly while doing that, and also while just waiting for the numbers to settle. Partly, those swings look like GC stops (1 FPS for a split-second, then ramping up to average again), but I've seen drops to ~35 for 30 seconds without any input. Seems like a very noisy data source. It also doesn't match what you saw on your machine. I'm using the Steam overlay FPS counter; maybe you used something else?

Also but! Your save has more dupes, fewer critters, more errands/cycle, less automation, and far less of the map revealed than mine, yet the dupe count reduction impact seems about equal, tending to be more pronounced on my base. I can't explain that using either of our theories. Can you? (That's not a rhetorical question.) I also attached my own save, if you want to play around with it - strictly optional, of course. (There are tons of mods enabled on this, but all pure UI/QoL, no buildings, no modified mechanics. Should run identically on vanilla.)

Anyway - it's the weekend, so I'll try and set up a modding environment and see if I can get some sort of instrumentation injected into the game, to properly settle this. No promises, though; C# is not exactly my home turf, but should be close enough (C/C++ native, lived in Javaland and Lispia for many years).

Totally Swamped.sav

18 hours ago, EnTro said:

So I have uploaded the save game here: https://filebin.net/rhmmnrl7fjsc1dtm

Here is my experience with Linux and an i5-12400F

 

  • To my eyes, performance seems nearly flawless.
  • FPS counter says around 33 fps. I consider this to be very acceptable for a game which only updates at 5 ups, or 15 ups at 3x speed (technically some things update twice as frequently, and some things even update with the graphical frames). Like there's only so much interpolation which can be done.
  • The simulation pace at 3x seems to be very close to 3x, with it taking about 105 seconds to simulate 300 seconds of in-game time, that is within a few percent of true 3x speed.

I think it is probably a good idea to include the "simulation rate" (ups - updates per second) when benchmarking. With ONI, the ups and fps are decoupled - for me it'll run at like 90 fps early in the game. This is unlike Factorio, where they are coupled. So in ONI, if the developers decided to, they could make the game do everything it can to maintain 60 fps, sacrificing ups as required; it might not even be a large sacrifice, since I don't believe the graphical side of things is very demanding.

The game must have some kind of strategy for balancing the resources going to ups vs fps, and I'm guessing it sacrifices fps quite aggressively until it's down to like 30 fps, then starts sacrificing ups. I could be wrong; I haven't examined it closely. As I noted earlier, I don't get bothered by 30 fps in a game with far less than 30 ups, because the benefits from higher fps are only very small: smoother interpolation is a little nice to have, but the gameplay benefit is negligible (compare this with shooter games, where higher fps literally lets you be more accurate). I know other people may prefer smoother, but I'd understand why the ONI devs would let the game sacrifice fps while fps is well above ups.

Anyway, I do get the wishful thinking for better performance (and it is wishful thinking; the best we can hope for is for Peter Han's improvements to make it into patches, though many already have), but I do think it's wishful thinking just because of the reality of software development. The game runs well, IMO, on what these days is a $100 CPU; if you love the game enough, it's quite affordable to upgrade. The Factorio dev team is a rare breed, and they were focused on performance from the very beginning. It's enormously impractical to do the required full overhaul for a game as complex as ONI; time and time again, similar attempts to overhaul a game to make it way better have just ended in disappointment.

There might be some low-hanging fruit in terms of increasing "perceptual smoothness", though probably not that low-hanging, because the way graphical updating is done probably leans heavily on Unity, so it'd be a massive overhaul situation to decouple rendering better than Unity already does.

13 hours ago, Gurgel said:

Your estimations are really meaningless. You seem to assume everybody codes things in assembler by hand. 

Assembly is irrelevant; most code is compiled or translated to machine instructions anyway. What matters is the efficiency of the language you use, and how you code in it. The problem lies with the high-level focus of game engines: think of a game engine as a toolbox for woodworking, while running simulations is metalworking. That is very likely why ONI is so slow.

 

7 hours ago, pnambic said:

I've re-run your experiment three times now, under as controlled conditions as I was able to set up. […]

That is a much more precise testing methodology than I used; I only tested at 3x speed, which I always play at. However, with a quick test at 1x speed (debug-deleting the dupes this time, instead of their earlier fates in the name of performance science - sorry, dupes), I can confirm that I also see 40-42 FPS with the occasional drop to 37 or 38. Odd how closely our numbers match. Florino definitely has the most chore-heavy duty roster for the dupes, and a wilder maze-like base layout, so I can see how the impact of removing those dupes is larger.

But yes, I do indeed see the regular FPS slowdowns too, I didn't report them to prevent info overload. I'm using the Steam overlay FPS counter.

I did notice that occasionally my dupes' behavior appears to stutter. Instead of building a pipe and continuing with the next one almost immediately, they idly stand by for a few seconds. I always assumed that they constructed so quickly (due to leveling their construction skill) that the game couldn't keep up, but this may have been some kind of behind-the-scenes optimization where dupe task updates are performed less frequently.

I haven't tried your save yet; I will at a later point. Curious how your file is 8 MB smaller than mine, even though you have revealed more of the map. I wonder if that extra 'stuff' makes a large difference. And I can't explain why we observe such big differences in performance with dupes in our bases. For me, the still-poor 3x performance with almost no dupes is a sign that the simulation itself must be quite heavy, in addition to the task path-finding load. It is too bad the game doesn't provide metrics into what it is doing. If it stored timestamps before and after each heat simulation step, fluid flow step, pathfinding step, task assignment step and render step (or however else the game is structured) in a buffer that could be saved on demand, that would be a wealth of data.

 

6 hours ago, blakemw said:

Here is my experience with Linux and an i5-12400F

 

[…]

That is very interesting. My game takes 227 seconds to complete a cycle at 3x. Just to confirm: your FPS is taken at 3x or at 1x?

I compare CPU performance using cpubenchmark.net. The generic wisdom is that ONI is multithreaded, but still single-thread heavy. Your i5-12400F has a single-core performance score of 3507, while my 8700K scores 2730 (CPU power improvements come mainly from more cores nowadays; individual cores are not that much faster). While obviously faster, your 28% higher single-core performance would not explain why you can run my base 116% faster. I wonder what else could be the issue.
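A quick arithmetic check of those two ratios, using only the numbers quoted in this thread:

```python
# Single-core benchmark scores quoted above (cpubenchmark.net)
score_12400f = 3507
score_8700k = 2730
core_gain = score_12400f / score_8700k - 1
print(f"single-core advantage: {core_gain:.0%}")  # 28%

# Cycle times at 3x reported for the same save: 227 s vs 105 s
cycle_gain = 227 / 105 - 1
print(f"observed speed difference: {cycle_gain:.0%}")  # 116%
```

So the single-core score gap really does account for only a fraction of the observed difference.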

And indeed, any FPS 30+ in ONI is fine for me. This is more about UPS than FPS but measuring UPS is difficult and FPS is easy ;).

1 hour ago, EnTro said:

That is very interesting. My game takes 227 seconds to complete a cycle at 3x. Just to confirm: your FPS is taken at 3x or at 1x?

My fps was taken at 3x.

1 hour ago, EnTro said:

While obviously faster, your 28% faster CPU core performance would not explain why you can run my base 116% faster. I wonder what else could be the issue.

The "single threaded performance" may only be benchmarking relatively simple number crunching which isn't particularly memory bandwidth constrained. In particular, ONI does A LOT of memory writes, because all sorts of stuff is getting updated every tick (temperatures and such - and more significantly the updates of plants, critters and buildings rather than cells, because the memory for cells would be a lot more localized). Hence architectural details around memory bandwidth and latency and caching efficiency become important.

A useful though not totally perfect resource is Factorio benchmarks, for example in terms of averages, the 8700K gets 157.1 ups, while the 12400F gets 208.2 ups, 32% faster. However averages are just averages, different systems have different mobos, memory, operating systems etc (in the case of Factorio, some users go as far as setting up custom memory allocation settings under Linux to optimize performance), in terms of the lowest benchmarks, the 12400F is 82% faster than the 8700K. And we should also note, for the 8700K, the best benchmark is fully 2.6x faster than the worst. We can only speculate as to why that would be (overclocking, ram speeds and dual channel or not are likely significant factors), but it certainly means there's a lot more to real world performance than just the CPU.

And Factorio is not ONI, in particular Factorio is quite heavily optimized to maximize cache hits, while ONI probably isn't. Hence ONI's memory writes probably aren't caching as well. 

I should note I upgraded my system with maximizing ONI performance specifically in mind, within a certain budget of course, such as using low latency ram. Whether those choices actually mattered, I have no idea, but I can say it worked out well for me.

11 minutes ago, blakemw said:

in terms of the lowest benchmarks, the 12400F is 82% faster than the 8700K. And we should also note, for the 8700K, the best benchmark is fully 2.6x faster than the worst.

That is interesting, although I find it difficult to see how to correctly interpret results from a different kind of game altogether, which can differ a lot in the details (even though it is simulation-heavy, the underlying logic and math may be terribly different).

 

19 minutes ago, blakemw said:

but it certainly means there's a lot more to real world performance than just the CPU.

I agree completely; that's why I hesitate to post specs, as people will be eager to blame something hardware-wise. I bring up the CPU mainly because the common wisdom seems to dictate that single-core CPU speed is king for ONI. I can hardly imagine that memory latency would be the main bottleneck for ONI, assuming calculations are optimized to work with blocks of data (like when using matrix operations in appropriately optimized libraries). ONI should not be constantly pulling random small data sections from memory the way a database server handling user requests would be.

 

9 hours ago, EnTro said:

That is interesting, although I find it difficult to see how to correctly interpret results from a different kind of game altogether, which can differ a lot in the details (even though it is simulation-heavy, the underlying logic and math may be terribly different).

Yes, and it's not ideal. But the distinction is that formal and detailed benchmarking is actually done for Factorio, and Factorio is definitely a lot closer than, say, a shooter. Far less good benchmarking has been done for ONI, so Factorio numbers are the best proxy available, and there is at least reason to believe there is some correlation.

9 hours ago, EnTro said:

I can hardly imagine that memory latency would be the main bottleneck for ONI, assuming calculations are optimized to work with blocks of data (like when using matrix operations in appropriately optimized libraries).

Excuse me a moment while I laugh uproariously at that assumption.

It's probably a fair assumption for the C++ physics core, but most stuff which isn't cells (dupes, critters, plants, buildings etc.) is simulated in C# land, from what I understand pretty much just using Unity constructs (we have fairly clean decompiled C#), which is fairly horrible in terms of memory optimization. And C#, as a managed language, gives very limited scope for optimizing how memory is used and accessed. And Unity's garbage collector isn't exactly good, which is why GC stutters are such an issue in Unity games, including ONI.

9 hours ago, EnTro said:

ONI should not be constantly pulling random small data sections from memory the way a database server handling user requests would be.

Excuse me, what?! SIMdll shouldn't, and probably doesn't, but everything on top of it - and that's all conduits, all machines, critters, dupes, errands, not to mention the entire visualization and UI - is built in the context of a generalized game engine. It's all entity-component and events percolating through callbacks within callbacks. In C#.

There are more than 40 different duplicants and there are unique log entries for cycles 4000 and 4001.

According to the most recent Steam survey, a typical PC running Steam currently has a 6-core 2.3-2.69 GHz Intel CPU and 16 GB of RAM.

Though that is more than the system requirements listed on the Oxygen Not Included store page, such a PC can't run a save with 40+ dupes (and everything needed to sustain them) at that cycle count without encountering serious performance and stability issues.

That means players won't be able to experience everything the game has to offer, and even with a better PC, that number of dupes still seems like a lot, mainly because of their impact on game performance (there's enough space and resources to sustain 40+ dupes indefinitely on any unmodded map, and you can get there in a Spaced Out! moonlet cluster with just 4 dupes on each asteroid).

Though I enjoyed playing the Frosty Planet beta, I found the amount of new content underwhelming (just look at the research tree, and compare the number of items with yellow marks to blue). However, it'll be worth it if they work on performance in parallel with adding new content; otherwise it won't make sense to buy DLCs for new content, because the game can't support the content it already has.

On 6/30/2024 at 9:29 AM, blakemw said:

Excuse me a moment while I laugh uproariously at that assumption.

[…]

On 6/30/2024 at 9:36 AM, pnambic said:

Excuse me, what?! […]

I meant the assumption rhetorically: 'Assume it were implemented using a library like Eigen; it would greatly improve performance.' I do indeed assume that currently all buildings, critters, dupes and co. live in Unity land, and we currently have to deal with the unoptimized consequences of that.

Allow me to suggest an architecture to integrate the simulations more efficiently with the game. Let's start with heat exchange. The game consists of a main layer which contains all tiles and fluids (liquids + gases), which I'll call the tile layer. We have a few key physics parameters for heat exchange: temperature (T), heat capacity (c), thermal conductivity (k), and mass (m). Each set of data can be stored in a 256x384 matrix of single-precision floats (about 393 kB per matrix). Now we need to calculate the change in temperature (deltaT) with all neighbors. Matrix libraries can easily offset a matrix, so we can obtain copies of T and k shifted by one position in x or y. Alternatively, it's quite cheap to simply keep offset copies of the data in memory if those values are not updated often, as for k.

Now we can calculate

deltaT = 0.5 * (k + k_neighbor) * (T_neighbor - T) / (c * m).       (eq. 1)

(ONI may use a different equation, but with similar complexity). This can run extremely fast, see spoiler.

Spoiler

I spun up a Matlab online environment, which is a heavily matrix-optimized scientific toolbox (and otherwise slow). It typically runs on a single core, unless instructed otherwise. I initialized a bunch of 256x384 (vanilla map size) random single-precision matrices for T, k, c and m, ran the operation 1000 times, and timed the execution speed, which is reported per full calculation. I used a circshift with a random offset and summed all results, simply to prevent any interpreter shenanigans from optimizing the calculations away.

>> T = rand(256,384, 'single');
>> k = rand(256,384, 'single');
>> m = rand(256,384, 'single');
>> c = rand(256,384, 'single');
>> deltaT = zeros(256,384, 'single');
>> tic; for i = 1:1000; deltaT = deltaT + 0.5 * (k + circshift(k, [randi([-1 1], 1, 1) 0])) .* (circshift(T, [-1 0]) - T) ./ (c .* m); end; toc / 1000

Which shows an elapsed time of 5.1531e-05 s per update. In other words this allows for 17625 updates per second (of each individual neighbor pair). So roughly 4400+ updates per second of the heat exchange physics in the tile layer.
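For readers without Matlab, here is a hedged NumPy sketch of the same single-neighbor update from eq. 1, on random stand-in data (variable names are mine, not the game's):

```python
import numpy as np

rng = np.random.default_rng(0)
shape = (256, 384)  # vanilla map size

# Random stand-in data for the per-cell physics parameters of eq. 1
T = rng.random(shape, dtype=np.float32)  # temperature
k = rng.random(shape, dtype=np.float32)  # thermal conductivity
c = rng.random(shape, dtype=np.float32)  # heat capacity
m = rng.random(shape, dtype=np.float32)  # mass

def delta_t(T, k, c, m, axis=0):
    """One neighbor-pair update of eq. 1, vectorized over the whole grid."""
    T_n = np.roll(T, 1, axis=axis)  # neighbor temperatures
    k_n = np.roll(k, 1, axis=axis)  # neighbor conductivities
    return 0.5 * (k + k_n) * (T_n - T) / (c * m)

dT = delta_t(T, k, c, m)
print(dT.shape, dT.dtype)  # (256, 384) float32
```

Timing this kind of expression with `timeit` on a desktop CPU gives numbers in the same ballpark as the Matlab run, since both end up in the same vectorized C loops.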

Now 17625 updates per second is quite a few, but we need to do more work. For heat exchange between the tiles we need to calculate four times, once for each neighbor pair. Still, 4400 updates per second for heat exchange between the tiles is plenty, but of course ONI does much more.

Let's look at heat exchange of pipes, wires and (background) buildings. Many buildings interact with just a few tiles. This is easy to model using a separate set of matrices containing the k, c, T and m of those buildings in those tiles, letting them exchange heat with the tile layer using an equation similar to eq. 1, but exchanging with the building layers instead of with neighboring tiles. Heat flow should be calculated both ways, so that adds 2 calculations for buildings and 2 for background buildings. Pipes exchange heat with the tiles and with their contents, and there is gas and liquid, so that is 8 extra updates. Add 2 more for wires and 2 for automation wires, and that makes 20 updates per simulation step. I probably overlooked something, and buildings like heat exchange tiles and radiators are a bit funky and probably need a slightly custom approach, so let's double that: 40 runs of eq. 1 per simulation step. Still, that leaves us at 440 updates per second. We need just 5.

Something I will leave out for simplicity, but what will help a lot with performance is that buildings, pipes and such typically only cover a small part of the map. That makes the matrices describing them sparse, which can speed up calculations a lot in optimized libraries. This means that my estimation here is valid for a worst case environment where every single grid location contains a tile, building, background building, gas pipe, liquid pipe, and automation.
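To illustrate the sparsity point, a small NumPy sketch (hypothetical layout, not the game's): a pipe layer occupying only 500 of the ~98k cells stores and updates just those entries:

```python
import numpy as np

rng = np.random.default_rng(1)
n_cells = 256 * 384  # full map

# Gas pipes occupy only, say, 500 cells: store just those indices and values
pipe_idx = rng.choice(n_cells, size=500, replace=False)
pipe_T = rng.random(500, dtype=np.float32)   # pipe temperatures
pipe_k = rng.random(500, dtype=np.float32)   # pipe conductivities

# Dense tile layer (random stand-in data)
tile_T = rng.random(n_cells, dtype=np.float32)
tile_k = rng.random(n_cells, dtype=np.float32)

# Heat exchange touches only the 500 occupied cells, not all ~98k
# (the c*m denominator of eq. 1 is omitted here for brevity)
dT = 0.5 * (pipe_k + tile_k[pipe_idx]) * (tile_T[pipe_idx] - pipe_T)
print(dT.shape)  # (500,)
```

The work scales with the number of occupied cells, which is why the worst-case estimate above (every cell occupied in every layer) is so pessimistic.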

Next, fluid flow in pipes. Good point. This is tricky, because it involves moving the pipe contents along the pipe. Luckily, shifting the contents along a mostly linear pipe is a straightforward offset in the matrix. This will barely add any simulation cost, and is only done every second.
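The shift idea can be sketched like this (a toy NumPy example; the unrolled-pipe representation is my assumption, not ONI's actual data layout):

```python
import numpy as np

# Contents of one pipe run, unrolled in flow order (mass per segment)
contents = np.array([1.0, 2.0, 3.0, 0.0, 0.0, 0.0])

# One flow tick: every packet advances one segment; a new packet (5.0)
# enters at the inlet. This is a pure index shift, not per-segment logic.
advanced = np.empty_like(contents)
advanced[1:] = contents[:-1]
advanced[0] = 5.0
print(advanced)  # [5. 1. 2. 3. 0. 0.]
```

Branches and bridges would need index bookkeeping at the junctions, but the long straight runs stay a single shift.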

Whoa, but don't forget about fluid flow and gas movement and phase changes. Absolutely! I don't know the exact model behind ONI fluid flow, but for liquid I assume there is a parameter that describes the 'viscosity' v (how far the liquid will run before the flow stops) and a state that describes whether the flow is bulk or a 'blade', so we should include that. Gas seems to, in addition to flowing, exchange places with other gas, unless it is surrounded on top, bottom and sides by the same gas. While I don't know how that is simulated, it's probably some optimized automaton-style behavior based on simple neighbor rules. Let's make a very conservative estimate and assume this cost is the equivalent of 200 executions of eq. 1 (or 5x more costly than all heat exchange). Still, this allows for 74 updates per second, 15x more than we need.

Next we have the unpredictable factors: the dupes, critters and debris. They should probably just update individually. Perhaps the debris can be placed in several heat exchange layers of its own. Typically there are not 100 piles of debris on a single tile, so let's add 5 layers for that, with a total cost of 10 executions of eq. 1. Any further debris per tile can be handled individually.

It is probably also useful for the simulation to keep a ledger of updated tiles (fluid flow, phase changes) so that the game engine can check these.
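The ledger could be as simple as a set of coordinates that the engine drains each frame; a minimal sketch with illustrative names:

```python
# A minimal "ledger": the sim records coordinates of tiles it changed,
# and the game engine drains the set once per frame (names illustrative).
dirty_tiles = set()

def sim_touch(x, y):
    # ...physics update for tile (x, y) would happen here...
    dirty_tiles.add((x, y))

def engine_drain():
    changed = sorted(dirty_tiles)
    dirty_tiles.clear()
    return changed

sim_touch(3, 7)
sim_touch(3, 7)    # duplicate updates collapse automatically
sim_touch(10, 2)
print(engine_drain())  # [(3, 7), (10, 2)]
```

The set makes repeated updates to the same tile free on the engine side, which matters for hot tiles that change every tick.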

Spoiler

This means that the current advice to sweep debris to one location for performance becomes irrelevant. No more sweeping to save your CPU!!

In total this yields an easily obtainable rate of 50 updates per second, 10x more than we need.

Now how to tie this in with the game engine

I'll admit that I have less experience with this, as I've never tied a performant simulation into a game engine. Obviously the game engine needs to consume the simulation results in a form it can digest, but this information is ready to be collected.

The good thing is that the simulation and the rest of the game + UI need little communication. The simulation only needs to be informed of significant changes (tiles dug up, buildings built, critters moved). This means the simulation can run on a thread entirely separate from the game. The simulation data can live in a memory space that is accessible to both. The game can feed the simulation the required updates, fetch the changed tiles, advance the simulation one step, and repeat. Mass, temperatures and such only need to be read when inspecting a building or tile, and the precise timing of that read for the UI doesn't matter, so race conditions are not a worry. I suppose Unity game objects can be extended to read from the simulation memory using something like a pointer, so keeping all data on the Unity side up to date is not needed.
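A minimal sketch of that split, with the sim ticking on its own thread and the game thread queuing change events under a lock (all names hypothetical; a real implementation would live in the engine's job system):

```python
import threading
import numpy as np

# Shared state: the sim owns these arrays; the game thread reads them.
T = np.zeros((256, 384), dtype=np.float32)   # temperature grid
commands = []                                # game -> sim change events
lock = threading.Lock()
steps_done = 0

def sim_loop(n_steps):
    """Run n_steps simulation ticks on the sim thread."""
    global steps_done
    for _ in range(n_steps):
        with lock:                    # drain queued game-side changes
            pending, commands[:] = commands[:], []
        # ...apply `pending` (digs, new buildings) to the grid here...
        T[...] += 0.001               # stand-in for one physics tick
        steps_done += 1

sim = threading.Thread(target=sim_loop, args=(5,))
sim.start()
with lock:                            # game thread queues a change
    commands.append(("dig", (10, 20)))
sim.join()
print(steps_done)  # 5
```

Only the command queue needs locking; read-mostly inspection of T can tolerate slightly stale values, as argued above.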

I hope this long read has made a good demonstration of how light ONI's simulations can be if optimized, and how easily they can be separated from the game engine thread. Even in a worst-case scenario, 50 updates per second seems reachable, giving a 10x margin to account for unexpected things. Such an implementation would not only be amazing for game performance, it would also create space for the devs to implement bigger worlds or new mechanics without worrying about performance.

1 hour ago, EnTro said:

The good thing is that the game and UI need little communication.

Sorry to pour water on that, but no. Not by a long shot. You seem unaware of the basic architecture of ONI. The lower-level sim layer is already written in C++, but it does, at the same time, more and less than what you expect. Communication needs between this and the next-higher layer (already in C#) are extensive. Think for example about machines that emit heat, but only while they're running, which depends on whether the automation grid has turned them on and a dupe is there to operate them. At the same time, those machines can overheat from hot hydrogen gas encroaching into their space from a vent.

May I suggest that you follow the short guide on decompiling the C# layers and take a look for yourself? Start with Game.cs, Sim.cs, SimMessages.cs, and Grid.cs. I do not want to short-change either your competence or the power of modern LA and constraint solver systems, but you're still massively underestimating the complexity of your proposition.

(Note also that all gains achieved by the Fast Track mod originate in the C# layers. There is no modding support for the sim proper, since it's native code. On the flip side: preserving moddability and extensibility limits what can be moved into lower-level parts.)

20 hours ago, pnambic said:

Communication needs between this and the next-higher layer (already in C#) are extensive. Think for example about machines that emit heat, but only while they're running, which depends on whether the automation grid has turned them on and a dupe is there to operate them. At the same time, those machines can overheat from hot hydrogen gas encroaching into their space from a vent.

These communication needs are quite low with a few changes. Heat damage is taken only every so many seconds, and coordinates can be easily extracted from the simulation layer using T_building > T_overheat; keep a map (dictionary, or whatever it is in your language of choice) of coordinates to C# objects and you don't even need to loop over the buildings at all. Buildings realistically turn on and off only once in a while, after which the corresponding entry in the simulation layer can simply be updated (off -> heat production = 0; on -> heat production = 10 kDTU) and the effects are taken into account. Hot gas effects are already in the simulation layer. Automation circuits do not need to be simulated on the grid; they can be treated as segments with inputs and outputs. If any input is 'on', the segment is on, the list of output objects can be informed and the effects carried out.

Realistically these are very low-frequency actions in the grand scheme of things, and a simple value-update communication between the game layer and the simulation layer suffices.
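The mask-plus-dictionary idea can be demonstrated in a few lines of NumPy (illustrative names and thresholds, not the game's):

```python
import numpy as np

# Hypothetical building-layer data: temperatures and overheat thresholds
T_building = np.array([[310.0, 400.0],
                       [350.0, 500.0]], dtype=np.float32)
T_overheat = np.array([[348.0, 348.0],
                       [348.0, 525.0]], dtype=np.float32)

# One vectorized comparison yields the coordinates of every overheating
# building; no per-building loop on the game side.
ys, xs = np.nonzero(T_building > T_overheat)
overheating = list(zip(ys.tolist(), xs.tolist()))
print(overheating)  # [(0, 1), (1, 0)]

# The game layer keeps a dict from coordinates to its C#-side objects and
# notifies only these few entries (names here are illustrative).
buildings = {(0, 1): "metal refinery", (1, 0): "battery"}
damaged = [buildings[c] for c in overheating]
print(damaged)  # ['metal refinery', 'battery']
```

The comparison cost is one pass over the building layer per damage interval, regardless of how many buildings exist.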

20 hours ago, pnambic said:

The lower-level sim layer is already written in C++, but it does at the same time more and less than what you expect.

C++ is fast, but matrix libraries are typically faster, as they are optimized to fully utilize the CPU's single-instruction-multiple-data (SIMD) features in a cache-friendly way. Plus, my hypothesis is that much more of the simulation can be moved into dedicated simulation code.

20 hours ago, pnambic said:

all gains achieved by the Fast Track mod originate in the C# layers

That just as well shows that the C# layer is unnecessarily bloated, and that work should be offloaded to much more optimized simulation code as much as possible.

In addition, I don't think that my proposed architecture restricts modding support at all. As long as modders have a way to customize the values communicated to and from the simulation layers, and under what conditions, no functionality is lost. In fact, it might become easier for a dedicated modder to slap on a new and custom simulation layer.

I understand the (somewhat dismissive) reaction that I just don't understand how ONI works. However, I think most programmers are stuck in their (game design) paradigm and simply can't see past optimizing C# to get rid of some of the language overhead. Matrix operations are quite simple to implement, but they are a paradigm shift. I'm not advocating anything new; this is how large-scale scientific simulations are optimized when you pay tens of thousands for an hour of compute time on a supercomputer and time is literally money.

While scientific-simulation levels of optimization are not needed for ONI, there is some low-hanging optimization fruit to be plucked. Hopefully Klei gives a small team of optimization-minded programmers the freedom to play with some of these ideas, and we can all play a completely refreshed and optimized game.

 

On 7/1/2024 at 2:45 AM, hydragyro said:

There are more than 40 different duplicants and there are unique log entries for cycles 4000 and 4001.

According to the most recent Steam survey, a typical PC running Steam currently has a 6-core 2.3-2.69 Ghz Intel CPU and 16GB of RAM.

Though that is more than the system requirements listed on the Oxygen Not Included store page, such a PC can't run a save with 40+ dupes (and everything needed to sustain them) at that cycle count without encountering serious performance and stability issues.

That means players won't be able to experience everything the game has to offer, and even with a better PC, that number of dupes still seems like a lot, mainly because of their impact on game performance (there's enough space and resources to sustain 40+ dupes indefinitely on any unmodded map, and you can get there in a Spaced Out! moonlet cluster with just 4 dupes on each asteroid).

Though I enjoyed playing the Frosty Planet beta, I found the amount of new content underwhelming (just look at the research tree, and compare the number of items with yellow marks to blue), however, it'll be worth it if they work on the performance in parallel with adding new content. It won't make sense to buy DLCs for new content otherwise because the game can't support the content it already has.

Absolutely great point! I couldn't agree more.

13 minutes ago, EnTro said:

my proposed architecture

So you're no longer proposing that the Klei devs should finally make some obvious changes to increase performance, but agree that this is a change of architecture, i.e. you're proposing a rewrite. That's a step in the right direction.

14 minutes ago, EnTro said:

there is some low hanging optimization fruit to be plucked. Hopefully Klei gives a small team of optimization interested programmers the freedom to play with some of these ideas, and we can all play a completely refreshed and optimized game.

Oops. Seems I was overly optimistic.

19 minutes ago, EnTro said:

That just as well shows that the C# layer is unnecessarily bloaty and work should be offloaded to a much more optimized simulation code as much as possible.

Initially, you were very much convinced that it was incompetent implementation of the sim layer that slowed ONI down. You've changed your tune quite drastically over the course of this discussion, to the point where you're now incoherent.

So. Are you proposing a rewrite, or are you indeed still convinced that this is a simple matter of optimization for the right programmer?

If it's the former, we would have been done with this a page ago. I agree that ONI on a bespoke engine could indeed be faster than ONI is now. It will take several years to implement, and it's doubtful that it could recoup the costs. If it's the latter, how on earth do you think this would work, given your own statements?

43 minutes ago, pnambic said:

Initially, you were very much convinced that it was incompetent implementation of the sim layer that slowed ONI down. You've changed your tune quite drastically over the course of this discussion, to the point where you're now incoherent.

The air of an opinionated programmer dismissing other opinions is tangible. Call it a rewrite, an architecture change, or whatever (you are reading too deeply into words); the wording is irrelevant. I haven't changed my tune: the simulation slows ONI down, as I illustrated in my suggested-architecture post.

My suggested architecture is nothing more than a practical demonstration of the strength of matrix-optimized simulations, a worst-case estimate that even a busy base could easily be simulated at 50 updates per second, and a pencil sketch of how that simulation could potentially be structured. See it as supporting evidence for my claim that the current ONI simulation is horribly inefficient. How they implement performance upgrades, and to what degree, is not up to me, as long as the performance becomes reasonable.
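A back-of-envelope calculation of the kind referenced here can be sketched in a few lines. All numbers below are illustrative assumptions for the sake of the arithmetic, not measurements of ONI or figures from the original post:

```python
# Illustrative worst-case throughput estimate. Every number here is an
# assumption chosen for the sketch, not a measured ONI value.
cells = 256 * 384          # hypothetical map size in sim cells
flops_per_cell = 20        # a few stencil operations per cell per update
updates_per_sec = 50
required = cells * flops_per_cell * updates_per_sec
budget = 5e9               # ~5 GFLOP/s, a conservative slice of one SIMD core
print(f"{required / 1e6:.0f} MFLOP/s needed, {required / budget:.1%} of budget")
# prints: 98 MFLOP/s needed, 2.0% of budget
```

Under these (hypothetical) numbers, the grid arithmetic itself is a small fraction of one core's vector throughput, which is the general shape of the argument that a matrix-style sim has headroom to spare.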

 

I believe that if ONI still wants to develop further as an active game (DLCs and core mechanics changes are considered after all), performance needs to be addressed to levels where a player can experience the full game using recommended hardware (as hydragyro explained quite elegantly).

10 hours ago, EnTro said:

Call it a rewrite, an architecture change or whatever (you are reading too deeply into words), the wording is irrelevant. I haven't changed my tune; the simulation slows ONI down, as I illustrated in my suggested architecture post.

Calm down. Re-read the thread.

You went from "With a bit of developer focus and the right programmer, the performance thing can be solved." to a proposal to move most of the higher-level simulation (above what's today in simdll) into a new core based on algebraic abstractions. That is a nontrivial modelling effort on top of a rewrite of large portions of the system.

This is after I showed you that, with some measurement uncertainty, reducing dupe count substantially improves performance. There is no mention of that - our original point of contention, the costs of pathfinding and errand assignment - in your entire proposal.

I stand by my claim that your stance has mutated sufficiently to be now incoherent.

 

10 hours ago, EnTro said:

I believe that if ONI still wants to develop further as an active game (DLCs and core mechanics changes are considered after all), performance needs to be addressed to levels where a player can experience the full game using recommended hardware (as hydragyro explained quite elegantly).

There is no question that we'd all like better performance. We were discussing whether that's feasible.

 

10 hours ago, EnTro said:

The air of opinionated programmer dismisses other opinion is tangible.

"He threw me with sand! He's a doodyhead!"

You're not stupid. That sort of behaviour is beneath you, and disappointing.

38 minutes ago, pnambic said:

Calm down. Re-read the thread.

I stayed on my original topic: "performance is not good enough, should be addressed, and is realistically addressable".
All your discussion of pathfinding, C#, sim.dll, FPS, and decompiling code is fun and all, but a distraction from the core topic of this thread.

38 minutes ago, pnambic said:

That is a nontrivial modelling effort on top of a rewrite of large portions of the system.

Sure, but very doable for a skilled programmer. I never said it would be a trivial change; I don't know why you keep bashing me as if I did.

38 minutes ago, pnambic said:

This is after I showed you that, with some measurement uncertainty, reducing dupe count substantially improves performance

I don't think anyone in this thread ever claimed that pathfinding isn't involved. But even with just 2 dupes in rockets, and critters penned in, performance is bad. You can't chalk it up to pathfinding alone. Just being able to point at pathfinding as a factor doesn't invalidate the other arguments.

38 minutes ago, pnambic said:

I stand by my claim that your stance has mutated sufficiently to be now incoherent.

You may stand by that. My original stance really didn't change. I never meant to propose a definitive simulation architecture; I sketched one just to demonstrate that the complexity of ONI isn't the bottleneck, the underperforming implementation is, after you brought in barely supported generic statements about C#, sim.dll already being C++, and whatnot.

38 minutes ago, pnambic said:

There is no question that we'd all like better performance. We were discussing whether that's feasible.

Yeah, it absolutely is feasible; it just takes a decision to do so from Klei, who stand to gain a lot with the upcoming DLCs. The only ones who can truly say it isn't feasible, for practical or business reasons, are Klei. I can only demonstrate that the simulations are quite light if properly optimized.


Plus we don't need the 10x simulation performance boost I estimated is possible in my earlier post. Just a 50% increase would already matter a lot.

Spoiler

It is a pity that people are so focused on blaming computer hardware, 'it seems difficult for Klei', or 'you just don't get it' instead of simply recognizing the value of improved performance for ONI.

 

 

1 hour ago, EnTro said:

I stayed on my original topic: "performance is not good enough, should be addressed and is a realistic addressable".
All your discussion on pathfinding, C#, sim.dll, FPS and decompiling code are fun and all, but distractions from the core topic in this thread.

You stayed with your original claim. This discussion was about the effort required. That necessarily involves questions like "where are the performance bottlenecks?", "how complex are the required modifications?", and "what are the potential gains?". We're not in a land of spherical cows. This is a for-profit project with over 400,000 LOC, not including the existing low-level sim or the relevant parts of Unity.

1 hour ago, EnTro said:

I don't think anyone in this thread ever claimed that pathfinding isn't involved. But even with just 2 dupes in rockets, and critters penned in, performance is bad. You can't chalk it up to pathfinding alone. Just being able to point at pathfinding as a factor doesn't invalidate the other arguments.

On 6/27/2024 at 11:00 PM, EnTro said:

Clearly the observation is that dupe pathfinding is not the major bottleneck for me, and some other part of the simulation must be.

That was before I ran tests on your base and mine and got ~30% more FPS out of both just by killing dupes.

1 hour ago, EnTro said:

I never meant to suggest a possible simulation architecture. I did that just to demonstrate that the complexity of ONI isn't the bottleneck, but the underperformant implementation is

14 hours ago, EnTro said:

In addition I don't think that my proposed architecture restricts modding support at all

Right.

1 hour ago, EnTro said:

after you brought in barely supported generic statements about C#, sim.dll already being C++ and whatnot.

I'm not a doofus. I wasn't saying "but it's already C++, so it's fast". I was trying to explain the existing architecture to you, in the context of your claims about required communication bandwidth between the sim and the rest of the game. I am in a position to do so because I have read the existing code, which you should do, too, because it would give you a more realistic appreciation of what you're dealing with here. Putting your fingers into your ears and shouting "I can't hear you! It's all completely feasible!" is not reasonable.

Go make a business case to Klei. I'd be super happy to be proven wrong. In the real world.

17 hours ago, EnTro said:

I believe that if ONI still wants to develop further as an active game (DLCs and core mechanics changes are considered after all), performance needs to be addressed to levels where a player can experience the full game using recommended hardware (as hydragyro explained quite elegantly).

I agree. What does it help if any colony's actual "end" is the eventual end of performance? 

I am constantly forced to stop playing my colonies at some point because the performance has dropped too significantly (even on high-end PCs).

Better performance will also positively affect sales, as seen in other games' cases. 

Sometimes, I wonder how certain people "know" the details of ONI's code and where they get their alleged information from. 

Regardless, Klei is responsible for improving the performance, and I am sure they already have ideas on how to do it. Maybe it was not greenlit before because of other reasons, perhaps business- or management-related. 

As a general note for life: better not to argue with narcissists. Their responses are irrational (disguised as rational), miss the point, evade, read in meanings that are not even there, and make unverifiable factual claims presented as "truth."

In short, ONI needs heavy performance optimization to succeed in the future, especially if regular DLCs are planned. 

16 hours ago, Henlikuoth said:

Sometimes, I wonder how certain people "know" the details of ONI's code and where they get their alleged information from. 

ONI's C# codebase is very straightforward to decompile, and it's not obfuscated; the decompiled code is very readable (it's not precisely the same as what the Klei developers write, but it's still very easy to understand). Also, I don't know how you think modders make mods for ONI, but modders literally have to understand the code to modify it. Compared with many games, ONI doesn't have much "data": the world definitions are data (they are plain-text YAML files), but things like buildings and plants are defined in code. So a world mod (like Minibase, Baator or 100K) doesn't intrinsically require understanding the code, but most mods do.

The C++ physics core (simDLL) is compiled down to assembly that cannot be decompiled back to anything like the original C++ code; it has to be disassembled. This is one reason why there are no mods to the simDLL side of the physics (things like changing how temperature or gas flow work). It has been probed to an extent (for example, someone pulled out the details of heat exchange), but it's really not fun. However, it's still very straightforward to see the "interface" which simDLL presents (the functions and variables it exports), and also how the C# codebase interacts with it. So even though it's a bit of a black box, we still know what it does and how the rest of the code interacts with it.

Archived

This topic is now archived and is closed to further replies.

