
What is causing AI lag in late game?



There are lots of topics out there looking at late-game lag, but most of them are about FPS drops and how to avoid them. I usually don't have any problem with FPS as long as I don't use too many rocket suits. My late game is spoiled because critters and dupes start to behave slowly, taking their time between tasks: the AI lag. How many dupes can I sustain with the given resources? I will never know, because AI lag breaks the game before that.
Did anybody ever do some testing on what has major impacts on AI lag? The only FPS drop that is immediately obvious to me is with rocket suits, but does that also cause AI lag?

I run an i7-7700K @ 4.8 GHz with 32 GB of 3200 RAM. My current base is at cycle 3000, with 125 dupes, 1100 tame and 200 wild critters. Processor usage is about 25 to 30% overall, with the peak on CPU 2 between 65 and 70%. It does not really matter whether the game is paused or running. Since new processors are not really faster in single core, it would be interesting to know what impact more cores have on AI performance.

AI lag with dupes usually starts around 70-80 dupes; they start to take breaks between tasks. For critters it usually starts when I hit 500+; with 800-1000+ critters overall, the tame pufts and slicksters frequently drop to 0 calories due to lag. When I pause the game for 5-10 seconds, every critter inhales immediately on resume.

Before I start investing time in testing possible reasons for AI lag, I would like to gather forum knowledge about what is already known.

Some things that would be interesting to see with regard to AI lag:

- Hardware, with how many dupes / critters does it start on what hardware? Is it a single core issue or do more cores help?

- Are dupe and critter AI lag 2 separate things or is there a correlation?

- Do the FPS droppers like rails and rocket suits and material / regolith scatter influence it?

- Anything you can think of

My idea is to find a save in my recent megabase where the critters start to hit 0 calories, change a single variable, and then see what impact it has. I could change the processor speed, kill off critters and/or dupes, gather excess regolith, or destroy my multi-material storage bins. Any ideas for a better test setup?

 


13 hours ago, Stoned said:

There are lots of topics out there looking at late-game lag, but most of them are about FPS drops and how to avoid them. I usually don't have any problem with FPS as long as I don't use too many rocket suits. My late game is spoiled because critters and dupes start to behave slowly, taking their time between tasks: the AI lag.

FPS is often posted about because it's an easily measured part of the game that generally relates to how well the game is performing. 60 fps? Probably doing quite well. 5 fps? Well... at least the dupes are still moving, right? I'm not Klei and I haven't looked at any of this game's code, but I suspect that some of the recent optimizations dealt with the whole "FPS drops to zero" problem. One way they possibly dealt with this was by prioritizing some of the computations. Once you get to a certain amount of processor load, tasks will have to wait on others to finish. I'm guessing that the priority is "Evaluate map --> Re-draw map --> evaluate AI." I suspect the issue you're describing is caused by the code delaying some of the more time-consuming tasks such as dupe AI. *shrug* I could be wrong, but it's a theory. Another possibility is that you're running low on memory. ONI gets ready to do dupe calculations and asks for some memory to work in. The OS says "No can do, bud! Free something up first." But, since we don't know anything about your particular ONI map and nothing about your computer, all we can do is speculate.

 

13 hours ago, Stoned said:

How many dupes can I sustain with the given resources?

Welllllll... you can do some math and figure it out.  Make some flow charts.  Start with an individual dupe's needs: 100g/s of oxygen and 1000kcal/cycle of food, then work back from there.  Where does the oxygen come from? What resources are necessary to produce it?  What food are my dupes going to eat? What resources do I need?   Of course, it won't help if you can't actually play the seed, but...
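As a starting point for that kind of flow chart, here is a minimal sketch that scales the two per-dupe figures quoted above to a whole colony. The 600-second cycle length is an assumption added here, and the function name is purely illustrative.

```python
# Back-of-envelope colony demand from the per-dupe figures in the post.
OXYGEN_G_PER_S = 100         # per dupe, from the post
FOOD_KCAL_PER_CYCLE = 1000   # per dupe, from the post
SECONDS_PER_CYCLE = 600      # assumption: one ONI cycle lasts 600 seconds


def colony_demand(dupes: int) -> dict:
    """Return total oxygen and food demand for a colony of `dupes`."""
    oxygen_g_per_s = dupes * OXYGEN_G_PER_S
    return {
        "oxygen_kg_per_s": oxygen_g_per_s / 1000,
        "oxygen_kg_per_cycle": oxygen_g_per_s * SECONDS_PER_CYCLE / 1000,
        "food_kcal_per_cycle": dupes * FOOD_KCAL_PER_CYCLE,
    }


print(colony_demand(125))
```

For the 125-dupe base in question this gives 12.5 kg/s of oxygen and 125,000 kcal of food per cycle, which is only the first box of the chart; the production chains behind those numbers still have to be worked out by hand.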

 

13 hours ago, Stoned said:

Did anybody ever do some testing on what has major impacts on AI lag? The only FPS drop that is immediately obvious to me is with rocket suits, but does that also cause AI lag?

I can continue to speculate here, but again, without knowing anything about your map, it's hard to say for certain. Each dupe adds a certain amount of work on the AI system. More dupes, more work. It is probably not linear -- 10 dupes will require more work than 10 times 1 dupe. For example, let's say that you have 1 dupe and a new task pops up. The check goes something like this: "For the first (and ONLY!) dupe, what priority will this task be? Is it higher than the next task in the list? What about the one after that? OK, let's stick it in at the end of all the tasks with the same priority." OK, now let's say we've got 10 dupes. For each dupe, there will be a check, "What priority will this be if Dupe X does it?" Then, for each dupe, the task is inserted where it belongs in the list of tasks. At this point we're at 10 times the work, but we're not done yet. Now we have to decide which dupe does this task. This first check will be "Which of these 10 dupes has the task as the next job?" If there's more than one, then it will check and see, "OK, who is going to be available first?" Worst case you have 10 dupes that are all idle and all have the same priority. Best case you have 1 idle dupe that will take the job at high priority. Either way, you're doing several more checks beyond what would be necessary for a single dupe.
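The speculation above can be turned into a toy model that simply counts checks. Everything here is hypothetical (the queue layout, the `assign_task` name, the way a dupe is chosen); it is not Klei's scheduler, only an illustration of why ten dupes cost more than ten times one dupe.

```python
import bisect


def assign_task(priority, dupe_queues):
    """Queue a task for every dupe, then pick one dupe to do it.

    Each queue is a list of negated priorities, so ascending order means
    highest priority first. Returns (index of chosen dupe, checks made).
    """
    checks = 0
    backlog = []
    for queue in dupe_queues:
        # bisect_right places the task behind existing tasks of equal priority.
        position = bisect.bisect_right(queue, -priority)
        queue.insert(position, -priority)
        backlog.append(position)  # number of tasks this dupe would do first
        checks += 1

    if len(dupe_queues) == 1:
        return 0, checks  # a lone dupe needs no selection pass

    chosen = 0
    for index, ahead in enumerate(backlog):
        checks += 1
        if ahead < backlog[chosen]:
            chosen = index
    return chosen, checks


print(assign_task(5, [[]]))                     # one dupe: (0, 1)
print(assign_task(5, [[] for _ in range(10)]))  # ten idle dupes: (0, 20)
```

In this sketch one dupe needs a single check, while ten dupes need twenty: ten insertions plus ten comparisons to decide who goes.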

As you mentioned, the rocket suit has a big impact.  This is because its pathfinding is "any square with gas is an appropriate vector."  Pathfinding algorithms work best when you have few choices.  The more choices, the more decisions that have to be made.  This is why, in the other thread, I specifically said that one should use permissions on doors to limit dupe movements.  If you have 100 dupes that can go anywhere, you'll be waiting forever for the pathfinding algorithm to finish.  If you have 100 dupes, but only 10 can get to any particular area, then suddenly there's a lot less work for pathfinding as well as for job allocation.  "Hey!  We've got a new job!  Who's available?  OK, of those that raised their hands, who can get to where it is?  Ahh, there goes 90% of you.  Great!  Of the 10 that are left, who LIKES doing this job?  Only one? Great! Here's the ticket for the job.  Go have fun, Nails!"
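To illustrate why restricting access shrinks the search, here is a small flood-fill sketch on a made-up grid. The grid, the `#` column standing in for a locked door, and the function name are all assumptions for illustration; the game's real pathfinder is not known here.

```python
from collections import deque


def reachable_cells(grid, start):
    """Breadth-first flood fill; returns how many walkable cells are visited."""
    rows, cols = len(grid), len(grid[0])
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == "." and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return len(seen)


# A 10 x 20 base where every cell is walkable.
open_base = ["." * 20 for _ in range(10)]
# The same base with column 5 blocked, as if a door were locked for this dupe.
split_base = [row[:5] + "#" + row[6:] for row in open_base]

print(reachable_cells(open_base, (0, 0)))   # 200
print(reachable_cells(split_base, (0, 0)))  # 50
```

With the door closed, a search starting in the left section only ever touches a quarter of the cells, and the same reduction applies to every dupe confined there.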

 

14 hours ago, Stoned said:

I run an i7-7700K @ 4.8 GHz with 32 GB of 3200 RAM. My current base is at cycle 3000, with 125 dupes, 1100 tame and 200 wild critters. Processor usage is about 25 to 30% overall, with the peak on CPU 2 between 65 and 70%. It does not really matter whether the game is paused or running. Since new processors are not really faster in single core, it would be interesting to know what impact more cores have on AI performance.

OK, now that we've got some specific information, let's get into some more specific speculation! Your RAM is likely not running out, so we can drop that theory. You're showing that most of your CPU cores are running at 25 to 30% and one is at 65 to 70%. Your i7 has 4 physical cores, which (if we evaluate the ceiling) means you're using 160% of one core's processing power. This is similar to my own experiences with my i5. Some of that is going to be the overhead of your operating system, but we'll ignore it for now. 160% shows that ONI has some multithreading, which agrees with the patch notes a while back that stated that this was done with some of the algorithms. Modern systems won't tie a thread to a processor core anymore (they used to). Instead, the scheduler will rotate a thread around the various cores, which helps distribute the heat load. So one program running at 100% of a single core will make your computer's performance screen show 25% across all 4 cores. You also observe that the CPU use doesn't change between when it's paused and when it isn't. That's normal. The game isn't suspended, it just isn't actually doing anything during its share of the CPU time. You'll observe the same behavior with other games, but it may be more difficult to tell because unlike ONI, most games aren't CPU hogs. You also gave some useful information about your current base, the number of dupes, and the number of critters.
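The 160% and 25% figures come from simple arithmetic, sketched here with the upper-end readings from the post (three cores at 30% and one at 70%). The function names are illustrative only.

```python
def total_core_load(per_core_percent):
    """Sum per-core readings into 'percent of one core' units."""
    return sum(per_core_percent)


def spread_over_cores(single_core_percent, cores):
    """Average reading per core when one thread's load is rotated evenly."""
    return single_core_percent / cores


print(total_core_load([30, 30, 30, 70]))  # 160
print(spread_over_cores(100, 4))          # 25.0
```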

Alright, let's speculate: You specifically state that you're showing AI lag. I suspect that AI calculations are on one thread and physics calculations are on another, but as I'm not familiar with the actual code Klei has written, it's just a guess. That means that your CPU is managing to handle the map calculations just fine, but 100% of one core's processing time can't keep up with the AI. With as many dupes and critters as you have, that wouldn't surprise me!

Some things you can do I've posted about in other threads. Limit where your dupes can go. Limit what jobs your dupes can do. With 125 dupes, if all 125 can do digging jobs, and all 125 can get to all digging jobs, then there's going to be a lot of AI shuffling every time you assign a new dig job! You can probably also improve things if you limit the travel abilities of your critters. 500 shine bugs with a free run of your entire base are likely going to cause some pathfinding troubles.

Finally, you're on cycle 3000. Many individuals have posted about performance issues on long-running maps, regardless of how much of the map they've revealed and how many dupes they have active. Personally, I haven't encountered this -- but it's a possibility to consider. You can test it by making a copy of your save file, then killing all but, say, 100 critters and 10 dupes, and seeing if the AI continues to lag. If it does, then it may be related to the "aging base" syndrome and should be posted about on the bug forum.

 

14 hours ago, Stoned said:

AI lag with dupes usually starts around 70-80 dupes; they start to take breaks between tasks. For critters it usually starts when I hit 500+; with 800-1000+ critters overall, the tame pufts and slicksters frequently drop to 0 calories due to lag. When I pause the game for 5-10 seconds, every critter inhales immediately on resume.

Yeah, this agrees with my theory that 100% processor time isn't enough to handle that many AI calculations in a reasonable amount of time.

 

14 hours ago, Stoned said:

Some things that would be interesting to see with regard to AI lag:

- Hardware, with how many dupes / critters does it start on what hardware? Is it a single core issue or do more cores help?

- Are dupe and critter AI lag 2 separate things or is there a correlation?

- Do the FPS droppers like rails and rocket suits and material / regolith scatter influence it?

- Anything you can think of

For the hardware-per-dupe question... I don't think anyone has started an analysis of that. There is a thread somewhere in which someone shared a 1000+ cycle save and had people evaluate it on their hardware, sometime around when the multithreading optimizations were added.

Again, I can't speak for certain because I haven't looked at the code, but my observations suggest that environment (your map/physics/etc.) calculations and dupe/critter AI are two separate processes. Your map might provide a method of testing that out.

I haven't personally run enough rails to see them cause performance hits.  I have seen problems with very complex piping setups.  The regolith impact will be with your FPS -- before ONI can re-draw the map, it first has to determine if there are any changes to the map.  The more debris per cell, the more time it takes to analyze each cell.  Again, I posted about this just yesterday in yet ANOTHER thread where someone complained about performance. 

 

Anyway, I hope that helps. 


Thanks, Kitten, for your speculations on the topic.

8 hours ago, KittenIsAGeek said:

Again, I posted about this just yesterday in yet ANOTHER thread where someone complained about performance

I know, and that again went quickly in the direction of FPS problems and the usual fixes. That is why I started this thread. AI lag seems to be something different, and I have a feeling that these problems are only loosely correlated. I am trying to find out if there is a method to reduce or avoid AI lag, just like with FPS drops, or if it happens because the code gives the AI a fixed time frame (a depressing thesis). I have never read anything about that.

 


9 hours ago, KittenIsAGeek said:

You also observe that the CPU use doesn't change between when it's paused and when it isn't.  That's normal.  The game isn't suspended, it just isn't actually doing anything during its share of the CPU time. 

I would think there is no reason for pathfinding, temperature, liquid, or rail calculations if game time does not advance. As you said, we don't know anything about the code, but it makes me think my AI problems cannot be cured with the usual pathfinding tricks.

 

9 hours ago, KittenIsAGeek said:

Welllllll... you can do some math and figure it out.  Make some flow charts.  Start with an individual dupe's needs: 100g/s of oxygen and 1000kcal/cycle of food, then work back from there.  Where does the oxygen come from? What resources are necessary to produce it?  What food are my dupes going to eat? What resources do I need?   Of course, it won't help if you can't actually play the seed, but...

Believe me, there is a huuuge difference between spreadsheeting a colony with a 3-digit number of dupes and actually building a thing of that scale that works and stands the test of time, especially if you avoid exploits. I have done it several times, and I know it takes tons of work to connect all the systems on a large scale. I always smile when somebody claims 'this design can easily be scaled up for 200 dupes'. I love that part of the game, and with my hardware and a little thoughtful colony design I can avoid most of the FPS drop, but not the AI lag.

 

Anyways thank you for sharing your thoughts.

I will start testing later today and keep you all posted on any results.


After some testing I have some first results on critters. I used the time a tamed squeaky puft took between 2 inhalations as a basis.

Critter AI lag is linear with FPS


I set the focus on different areas of the base to have different FPS. It was linear, with half the FPS I had to wait twice as long.

The number of critters does not really affect FPS after you exceed a critical mass


I started with 1050 tame and 195 wild critters total. After my dupes had killed them down to 800 tame and 100 wild, there was no noticeable change in FPS. That makes total sense in regard to the next point.

All critters share a fixed maximum amount of AI time


The time between inhalations improved greatly after I had decimated the critters. It was difficult to get similar FPS situations, since a lot of geysers and associated machinery had started to work by the time I had killed the beasts.
The timings went from 95 seconds with 1250 critters at 13 FPS to 70 seconds with 900 critters at 13 FPS. Allowing for some measurement error, these figures are pretty linear as well.
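A quick check of that linearity, using only the two measurements quoted above: if all critters share a fixed amount of AI time, the wait per critter should stay roughly constant. The helper name is illustrative.

```python
def ms_per_critter(critters, seconds_between_inhalations):
    """Waiting time attributable to each critter, in milliseconds."""
    return seconds_between_inhalations * 1000 / critters


# (critter count, seconds between inhalations), both measured at 13 FPS
measurements = [(1250, 95), (900, 70)]

for critters, seconds in measurements:
    print(critters, round(ms_per_critter(critters, seconds), 1))
```

The two results are 76.0 ms and 77.8 ms per critter, which differ by only a couple of percent.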

It is getting late here and I have to stop for today. I will try to see if dupe AI is affected by critter amount tomorrow, but for some reason I am unable to enter debug mode since I moved the installation via steam. I littered debug_enable everywhere, but no joy so far.

It feels like my dupes take tasks quicker now; that would be thrilling.


Finding all the secrets to a fast base isn't as straightforward as pure FPS. For example, the process that dupes use for thinking is asynchronous with the world. If that process is overloaded, then dupes will simply take longer to choose and perform tasks. Remember grooming station problems? The decision process can slow down quite a bit before players notice anything wrong.

One of the biggest CPU drains for dupes is the path finder. Every job requires a path sooner or later. Early game bases are small, and there are few places to go. Late game bases tend to have access everywhere, with a high number of dupes. The pathing requirements shoot up massively as a result. Jet suit pathing is even more demanding, since all the air spaces open up as viable paths.

One of the simplest ways to cut down pathing is to cut down on dupes. One less dupe is one less map to feed. Pathing demands can be further reduced by restricting dupe access; it is not necessary for every dupe to go everywhere. Look for ways to cut your map in half or into smaller areas where a few dupes can stay busy, instead of letting every dupe do everything. Critter pathing can be cut down with tiny rooms, though they already seem to prefer small paths. Hundreds of piles of debris each present a new possible path for dupes to perform jobs. Limiting access to storage, destroying it, or locking it out entirely can give potential gains.

Each of the game layers places a burden on the world sim. Excessive amounts of pipes (and especially conveyors) have been reported to cause considerable drops in FPS. Oversized objects like tempshift plates run extra temperature calculations across tiles. So be careful of building a gigantic tangled mess.

It's hard to say anything more definitive without dedicated dev tools. 


On 3/2/2020 at 8:10 AM, Stoned said:

Since new processors are not really faster in single core

They are, but the difference is not enormous. Upgrading from an i7-2600k to a 2700x was noticeable though, but that was a bigger leap xD

I would think new Ryzen CPUs would be a good upgrade since they usually behave better on multithreaded tasks. But nobody does benchmarks on this game to confirm that.


2 hours ago, melquiades said:

I would think new Ryzen CPUs would be a good upgrade since they usually behave better on multithreaded tasks. But nobody does benchmarks on this game to confirm that.

There's a thread where someone shared a 1000+ cycle base and people did FPS and load time benchmarks on a variety of systems.  The main factors were (In order): Memory bandwidth, CPU speed, core generation (i.e. how new a processor it is).  So newer CPUs often did better because they generally had lower latency and higher bandwidth when talking to the memory -- especially when going from DDR3 to DDR4.  However, an older machine could do better if it had triple-channel memory compared to a newer machine with slightly higher CPU speed and more cores that only had dual-channel memory.
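To show how a triple-channel older machine can out-run a dual-channel newer one on paper, here is a sketch of theoretical peak bandwidth. It assumes the standard 64-bit (8-byte) width per memory channel; the two configurations are hypothetical examples, not systems from the benchmark thread.

```python
BYTES_PER_TRANSFER = 8  # one 64-bit memory channel moves 8 bytes per transfer


def peak_bandwidth_gb_s(mega_transfers_per_s, channels):
    """Theoretical peak memory bandwidth in GB/s (real throughput is lower)."""
    return mega_transfers_per_s * BYTES_PER_TRANSFER * channels / 1000


# Hypothetical examples only.
print(peak_bandwidth_gb_s(1600, 3))  # DDR3-1600, triple channel
print(peak_bandwidth_gb_s(2133, 2))  # DDR4-2133, dual channel
```

On these numbers the older triple-channel setup has the higher ceiling (38.4 GB/s against about 34.1 GB/s), although latency and actual access patterns also matter.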


8 hours ago, KittenIsAGeek said:

There's a thread where someone shared a 1000+ cycle base and people did FPS and load time benchmarks on a variety of systems. 

Cool! Would you mind sharing the thread? I didn't know how to find it.

It fits my situation: what gave me the final push to upgrade my whole system was my memory rather than the CPU.


4 hours ago, melquiades said:

Cool! Would you mind sharing the thread? I didn't know how to find it.

You would ask the tough questions.   I'll see if I can track it down.

I think this is the thread. I didn't realize that it's almost a year old at this point. Wow. Anyway... hope this helps.

 

 


On 2/5/2020 at 1:06 PM, KittenIsAGeek said:

There's a thread where someone shared a 1000+ cycle base and people did FPS and load time benchmarks on a variety of systems.  The main factors were (In order): Memory bandwidth, CPU speed, core generation

Interesting.  I have always been a fan of low latency ram so maybe that's why I'm not feeling any lag.


I did some more testing today with different hardware settings.

The standard for measurements is the time the same puft, with the same game load, takes between two inhalations.

Reference time is 111 s. Hardware:

i7-7700K @ 4.8 GHz

32 GB DDR4-3000

Nvidia GTX 1060 6 GB

Switching to ddr4-1500 ram 111s -> 140s (79%)

I don't know if that actually halves the RAM performance, but it should give an idea.

Switching 4 logical cores off 111s -> 135s (82%)

I switched 4 cores off via msconfig. I don't know which ones are switched off by that action; it might be running on 2 physical and 2 logical cores, or it might be running on 4 physical cores with no logical cores.

Switching cpu speed to half, 2.4 GHz 111s -> 293s (38%)

I expected a huge impact. Looks like the overhead is killing the game now.

All the changes caused an FPS drop of about the same magnitude.

So RAM speed seems to have a similar effect to multithreading: buy more cores or faster RAM, it doesn't matter. It might be different when you switch from 8 to 16+ logical cores, but I can't test that. Good old CPU overclocking always helps :)
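The percentages in these tests are the reference time divided by the measured time. A short sketch that recomputes them from the timings given above (the dictionary labels are just descriptive):

```python
REFERENCE_S = 111  # seconds between inhalations on the reference setup

measured_s = {
    "DDR4-1500 RAM": 140,
    "4 logical cores off": 135,
    "CPU at 2.4 GHz": 293,
}


def relative_speed_percent(reference_s, test_s):
    """AI speed relative to the reference run, in percent."""
    return reference_s / test_s * 100


for label, seconds in measured_s.items():
    print(label, round(relative_speed_percent(REFERENCE_S, seconds)))
```

This reproduces 79%, 82% and 38%. Halving the clock made the wait about 2.6 times longer (293 / 111), so more than the factor of 2 a purely clock-bound task would show.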

 

 


"AI lag" is an intentional feature built into the game to keep your FPS up by allowing agents to delay finding new tasks. There are a variety of such calculations that are thrown out or delayed intentionally, and they appear to start ramping up when you hit 30 fps, to keep you at 30 fps.

If you want to avoid it, keep your fps at 45-60. With 3 dupes, few critters, half width maps, minimal rail usage, you too can have 60fps at 20x speed.


1 hour ago, nakomaru said:

"AI lag" is an intentional feature built into the game to keep your FPS up by allowing agents to delay finding new tasks.

That is true, there is a fixed amount of computation time in each frame reserved for AI. Once your FPS drops, the AI will start to lag. I found this to be the perfect spot to benchmark some assumptions made in earlier threads. I can play with 45-60 FPS with up to 40 dupes in single speed, would always prefer that to 3 dupes in any speed...
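A toy model of such a fixed per-frame budget shows why the lag scales with both agent count and FPS. The `agents_per_frame` parameter is a made-up stand-in for whatever the real budget is; nothing here is taken from the game's code.

```python
def seconds_between_turns(agents, fps, agents_per_frame):
    """Wait until the same agent is serviced again under a fixed per-frame budget.

    Assumes agents are handled round-robin and that a hypothetical fixed
    number of them (agents_per_frame) gets an AI update each frame.
    """
    frames_for_full_round = agents / agents_per_frame
    return frames_for_full_round / fps


# Halving the FPS doubles the wait; fewer agents shorten it proportionally.
print(round(seconds_between_turns(1250, 13, 1), 1))   # 96.2
print(round(seconds_between_turns(1250, 6.5, 1), 1))  # 192.3
print(round(seconds_between_turns(900, 13, 1), 1))    # 69.2
```

With the purely illustrative value of one agent per frame, the model lands close to the 95 s and 70 s measured at 13 FPS earlier in the thread. That is a sanity check of the shape of the relationship, not evidence of how the game actually schedules AI.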

I hope for that quantum computer that can handle all the sand in the sandbox, maybe in a few years I can fill the whole map with dupes and critters and run a decent FPS. As mentioned earlier, that is where the fun is for me. Or maybe there still is room for optimization by Klei, my processor sure has tons of computation power left.


2 hours ago, nakomaru said:

"AI lag" is an intentional feature built into the game to keep your FPS up by allowing agents to delay finding new tasks. There are a variety of such calculations that are thrown out or delayed intentionally, and they appear to start ramping up when you hit 30 fps, to keep you at 30 fps.

That's really stupid.  I'd rather the game slow down than break.

7 hours ago, Stoned said:

Switching to ddr4-1500 ram 111s -> 140s (79%)

I don't know if that actually halves the RAM performance, but it should give an idea.

Typically if you drop the frequency you can also get away with lowering the other timing parameters to mostly make up for it, but if you kept those the same, then yes, halving the frequency halves the ram performance.
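The latency side of that trade-off can be sketched with the usual CAS latency formula: the memory clock is half the transfer rate, so latency in nanoseconds is CL × 2000 / (MT/s). The CL values below are hypothetical; the timings actually used in the test are not given in the thread.

```python
def cas_latency_ns(mega_transfers_per_s, cas_cycles):
    """CAS latency in nanoseconds for a DDR module (clock = transfer rate / 2)."""
    return cas_cycles * 2000 / mega_transfers_per_s


# Hypothetical timings for illustration.
print(cas_latency_ns(3000, 15))  # DDR4-3000 at CL15 -> 10.0 ns
print(cas_latency_ns(1500, 15))  # same CL at half the frequency -> 20.0 ns
print(cas_latency_ns(1500, 8))   # tightened timing recovers most of it
```

Keeping the same cycle counts at half the frequency doubles the latency in nanoseconds, while lowering CL at the slower speed brings it back toward the original figure. Peak bandwidth is still halved either way.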


Archived

This topic is now archived and is closed to further replies.

Please be aware that the content of this thread may be outdated and no longer applicable.
