

So I stumbled on a thread today where tzionut shared his +/- 2000 cycle save file:

https://forums.kleientertainment.com/applications/core/interface/file/attachment.php?id=213765

People were using it to compare performance, with interestingly varied results.

So I was thinking: why not gather the data and have as many people as possible try that save file.

Rules of the setup:

- First of all, post your hardware specs.
- Note down the time needed to load the save file: the time between clicking the load button and the simulation actually running, i.e. after the loading screen, once the game has changed from black and white to full color.
- For comparability, we all use the highest vanilla speed, i.e. 3x speed.
- Take the average FPS over 60 seconds. A good program for this is Fraps.
- No complaining, no swearing, no idiocy. This is purely a benchmark thread to discuss interesting finds and gather performance data we can learn from. No rants about performance by any means!

 

================================================================================================

 

So for me, this is:

Hardware:

CPU: Xeon X5675 overclocked to 3.7 GHz
GPU: GTX 960 4GB
RAM: 14 GB
Storage: Samsung Evo 850 500GB SSD
Motherboard: Asus Rampage III Extreme

Loading time:

1min 23s

Average FPS:

28 FPS (focused on the most FPS-constrained point of the map)

Note:

My system is rather old spec, but its parts were top of the line for their generation. Especially regarding the CPU, mine is not doing much worse than newer ones, with the mild overclock bridging the gap.

================================================================================================

i7-7820X @ 3.6 GHz

GTX 1660 Ti

64GB RAM

EVO 970 SSD

2560x1440 screen

 

1min 6sec loading time

31 fps

 

CPU load around 15%, utilizing two cores at 60% to 80%. The remaining cores are almost idle.

GPU load is at 30%, but in the P3 power-saving state, so the actual load is lower.

About 4 GB of RAM used; the disk is idle.

Looks like single-core CPU performance is the bottleneck.

================================================================================================

Win10
i5-2310 @ 2.9 GHz
8 GB RAM
Radeon RX 560 4 GB GDDR5
90 sec loading
24.6 average FPS (the benchmark started with an FPS drop to 4, but most of the time it was about 26-28 FPS)



System and ONI are on an SSD, but the savegame loads from an HDD.
Loaded the savegame.
Paused/unpaused the game a few times to load the available debris.
Ran a 63-second benchmark.
Highest CPU1 usage was at the moment of debris loading with pause/unpause.
 

================================================================================================

Hardware:

CPU: i5-6600K at 3.50 GHz
GPU: GTX 970 4GB
RAM: 16 GB
Storage: Samsung Evo 850 250GB SSD
Motherboard: Can't recall lol

Loading time:

56 seconds

Average FPS:

31 while panning around, but up to 40-42 for most of the map.


================================================================================================

FX8350 (4GHz), 16GB Ram, Radeon R9 290, SATA Samsung 850 PRO 1TB

1920x1080 windowed.

Load time: 95 sec, per-cycle save: 10 sec.

FPS pretty much 20-22 (Steam display), but completely playable without problems; animations work nicely, scrolling is smooth.

 

================================================================================================

Hardware:

CPU: i7-3930K, 3.2 GHz stock clock, overclocked to 4.2 GHz on all cores (tested with the OC settings).

The results are similar to ToiDiaeRaRIsuOy's. He has a server CPU of the same generation. 500 MHz more results in 2 FPS more on average... hmmmm

GPU: 2x 7970 Lightning
RAM: 32 GB
Storage: savegame on an SSD, game files on an SSHD
Motherboard: Asus Rampage IV Formula

Loading time:

1min 12s

Average FPS with 8 Dupes:

30 FPS

Min: 14 FPS (lag spikes)

Most of the time between 29 and 30 FPS

Max: 34 FPS


================================================================================================

OS: Win7 x64
CPU: i5-2320 @ 3.1 GHz
GPU: GT 730, 2 GB
RAM: DDR3 16GB
Storage: SATA SanDisk 120GB SSD
Motherboard: Gigabyte GA-P67-DS3-B3

Loading time: 1min 20s
Per-cycle save: ~10s
Fast Speed: ~20 FPS, Slow Speed: ~30 FPS

Note:
Very old spec, but no problems with the save file. The game is playable for me.
Screenshot with MSI Afterburner monitoring
 


 

================================================================================================

Frames, Time (ms), Min, Max, Avg
  3311,     60000,  50,  59, 55.183

 

I don't have the newest computer. Based on the specs posted so far, mine should give me around 30 FPS. There is something going on here that doesn't seem to be related to either CPU or GPU speed.

 

After thinking for quite a while, I think I managed to figure out what is going on. Back when I built this computer, I realized the primary task I planned for would be bottlenecked by cache misses, hence memory latency. As a result I paid a lot of attention to memory and found some really low-latency memory. While its throughput isn't bad, latency is what I really aimed for.

Now think about the game drawing a frame. It asks what is on a cell (read 1). It's something which hasn't been dug out yet. It reads the element type (read 2), the element data (read 3), the element color (read 4), then the anim string (read 5), and draws the cell. Then it reads whether there is something buried (read 6) and, if there is, it reads the buried icon (read 7) and draws it on top of the cell.

Notice how it read from memory 7 times in that example alone, and it's likely not even done with the cell yet. There is very little for the CPU to do, since it's just comparing data, often just numbers. Work like that depends on how fast the CPU can access memory rather than how fast it can do math. Having recently looked at the screen-update code for ports (pipe input/output, power input, etc.), I can say the computer appears to be doing a fair number of tasks like that every frame.

I get a freeze of around a second when I switch overlays, which makes sense in this context. One of the tasks is to go through all buildings and hide everything from the old overlay; next it shows the objects for the new overlay. There is a significant amount of random memory access going on whenever the overlay differs from the one drawn last time.

 

If the bottleneck is indeed memory access time for converting the game map into something for the GPU to work on, then CPU, GPU and resolution shouldn't really matter; RAM and motherboard are what really matter here. Given how big the impact is, it seems that moving budget from the GPU to RAM could yield a higher framerate in some games without making the computer any more expensive.

================================================================================================

34 minutes ago, Nightinggale said:

After thinking for quite a while, I think I managed to figure out what is going on. Back when I built this computer, I realized the primary task I planned for would be bottlenecked by cache misses, hence memory latency. As a result I paid a lot of attention to memory and found some really low-latency memory. While its throughput isn't bad, latency is what I really aimed for.

It is certainly a factor, and it could be a large factor for this game. After all, it does a lot of non-local memory access, where caches are basically worthless and the CPU is waiting for data most of the time. It cannot be the only factor though, or I would not get a perfectly playable game with no lockups or freezes at 20 FPS while some other people find it "unplayable" at 30 FPS. And I assure you, if I had problems gaming, I would have something faster; it is not a question of money. One thing to note is that Intel has a worse memory interface than AMD, and worse core-to-core communication and coordination; they make up for that in most situations with faster single-core speeds. Maybe the FX8350 just reacts better to higher load on the memory interface, which would tie in nicely with your theory.

================================================================================================

So I did this test before I left for a weekend away but forgot to post it earlier. 

I actually forgot that I had a mod installed. It is a speed control mod which simply extends the speed controls to 1x, 3x, and 10x.

Here is the mod for more details if needed. https://steamcommunity.com/sharedfiles/filedetails/?id=1713359495

 

GeForce GTX 1080 Ti

Intel i7-8700K @ 3.70 GHz (no overclocking has been done)

32 GB DDR4 RAM (I can't recall my exact RAM model, sorry; if it ends up being important I can add it to the post later). Though I do think the RAM timings were 15-15-15-35.

Samsung 970 EVO M.2 2280 PCIe G3 2TB

Motherboard: ASRock Z390 Taichi Ultimate LGA 1151 (300 Series)

Resolution: 3440 x 1440 

The results below are the average of 10 test runs, each with 60 seconds of gameplay recorded while moving my camera around to look at different areas. After each session, I closed the game completely and did a full computer shutdown, then restarted. This is the result of 10** rounds of tests.

~78 FPS at low speed (1x)

~50 FPS at medium speed (3x)

~29 FPS at 10x speed

~4-second daily save

~45-second load

 

**Load times were actually only averaged over 9 runs, as I had one extreme outlier. I am not really sure how or why, but I had ONE instance with a 15-second load time.

 

EDIT: Windows 10 Pro, if that matters.

================================================================================================

I'm running Linux Ubuntu 18.04.2 LTS with kernel 4.15.0-48 x86_64 on an Intel Core i5-4690S @ 3.2 GHz.  My graphics is a GeForce GTX 750 Ti, so I'm a bit behind the times.  I run ONI in 1920x1080 resolution.

It takes me 1 minute, 8.51 seconds to load the game.  The FPS is about 47, but sometimes goes up past 50 and sometimes drops as low as 40.

 

================================================================================================

Windows 10
i5-3230M @ 2.60 GHz
Intel HD 4000
8 GB DDR3 RAM
Toshiba HDD

Yeah, close to minimum requirements (even below, graphics-wise)

Loading time: 1 min 40 s
Save time: 9 s
Average FPS: 12

Note
In spite of the low FPS, it was playable without problems. It was choppy, but still enjoyable.

================================================================================================

[Chart: single-core CPU performance vs. average FPS for the systems posted in this thread]

I plotted single-core performance vs. FPS from all the posts in this thread. We don't get a clear line, indicating that the bottleneck isn't CPU core speed. Instead it is fairly clear that most systems hit a wall around 30 FPS. If memory latency is the real bottleneck, then the 30 FPS wall is likely down to memory latency being broadly similar across systems. Systems under 30 FPS also have low per-core performance and are generally old, meaning they likely have old memory as well.

The most interesting part is actually at the far right: 3 systems are significantly faster. Not just a little, but significantly faster. It can't be explained by them being new CPUs, because two of them are 4th-gen Intels from 2014.

2 hours ago, ToiDiaeRaRIsuOy said:

There is no huge issue at the moment regarding lag IF you are using an Intel processor. We saw that in the benchmark thread. Low-end CPUs performed relatively close to high-end CPUs. However, there was one AMD processor in it, and that one got hit quite hard.

The fact that the AMD is also the one at the far left shouldn't be left out. It's from 2012 and, on top of that, a Bulldozer design, meaning it made significant sacrifices in per-core performance in order to have many cores. For tasks like ONI, Bulldozer wasn't even the best choice AMD had in 2012, and Intel had better offers, all at the same or lower price.

Modern AMD CPUs will give a completely different kind of performance. We have no fair comparison between AMD and Intel in this thread.

================================================================================================

The devs just know that 30 FPS is a playable level and scale calculations down in the late game to provide that performance. You can check how much time a dupe needs in the late game to decide what to do next, how fast available resources refresh, and all that stuff. 30 FPS is not a ceiling, it is the goal: the lowest acceptable framerate.

================================================================================================

55 minutes ago, Nightinggale said:

Modern AMD CPUs will give a completely different kind of performance. We have no fair comparison between AMD and Intel in this thread.

I'm on an AMD Ryzen 5 1400 @ 3.20 GHz. If I can remember this weekend, I'll get you numbers.

================================================================================================

1 hour ago, CDoroFF said:

30 FPS is not a ceiling, it is the goal: the lowest acceptable framerate.

On AMD it is more like 20 FPS. And seriously, FPS is not the issue. The issue is uneven performance, and Intel seems to be much more affected than AMD.

================================================================================================

21 minutes ago, Gurgel said:

On AMD it is more like 20 FPS. And seriously, FPS is not the issue. The issue is uneven performance, and Intel seems to be much more affected than AMD.

It depends on what we'd call uneven performance. When I did my benchmark, I focused on the middle of the base, where the FPS hit was biggest, and it averaged around 28 FPS. I don't have numbers, but outside that spot it would easily shoot up to 40+ FPS on average in most places. Do I feel 28 average FPS is the bare minimum enjoyable? Yeah, certainly. Do I find it annoying that the FPS shoots up to 40 and thereby creates uneven performance? Not really; it would only annoy me if the low point were that low on average. I think you need roughly 24 FPS (it depends on who you ask, really) to keep your eyes from distinguishing individual frames instead of seeing the illusion of fluid movement. 20 FPS can still pass for the latter if you are used to it, but if you ask someone used to 60 FPS to watch something below 24 FPS, they may well see the actual individual frames.

1 hour ago, Nightinggale said:

The most interesting part is actually at the far right: 3 systems are significantly faster. Not just a little, but significantly faster. It can't be explained by them being new CPUs, because two of them are 4th-gen Intels from 2014.

 

5 hours ago, ToiDiaeRaRIsuOy said:

Maybe we should actually try to post our single-core performance as well? I mean, some stuff has moved to multiple threads, and that's why we saw significant improvements over the last few months, but it looks like single-core performance is still very valuable.

 

For the record, I did overclock my CPU, so it's not the standard value! That's why I am asking people to post their single-core performance as well. The 3 on the right might have been significantly overclocked.

================================================================================================

We need more AMD users here. Any 1st or 2nd generation Ryzen users? Would be nice. :)

The results so far are not really good. If I remember correctly, the savegame has only 8 Dupes. The real issue is the bad scaling with more dupes in the game. I'm testing the game with 24 Dupes now. I think the result will be horrible :p I'm curious.

EDIT://

Interesting. More Dupes scale better than in several versions before.

  • 24 Dupes = 25 FPS
  • 30 Dupes = 22 FPS
  • 40 Dupes = 18 FPS
  • 50 Dupes = 16 FPS

If they optimize the game a little, so that 40 Dupes always stay above 30 FPS, all would be good. :)

 

================================================================================================

[Chart: FPS matched against CPU transistor technology (Intel only)]

I tried matching the chart up against transistor technology (Intel only). It's clear that going from 32 nm to 22 nm is a significant upgrade; going from 22 nm to 14 nm, not so much in this case. However, it should be noted that 14 nm means smaller cores, which leaves room for more cores on the chip. For instance, the 2 CPUs at the far right seem fairly similar, but the 14 nm one has twice the number of cores.

I tried to look at bus speed, cache size and other factors like that and it seems completely random. There is no pattern there whatsoever.

1 hour ago, ToiDiaeRaRIsuOy said:

Maybe we should actually try to post our single-core performance as well? I mean, some stuff has moved to multiple threads, and that's why we saw significant improvements over the last few months, but it looks like single-core performance is still very valuable.

Regardless of how much you push tasks onto other threads, you will always be stuck with tasks you can only run on a single core. Most games rely heavily on a single core even when they are made very multithreaded, so single-core performance will always be very important for any game. Take pipes, for instance. Each pipe system can run independently of the other pipe systems because they aren't connected, which allows them to run in parallel. However, there is still a single core that tells all the pipe systems a new tick has started so they can run their per-tick event. Games are always full of such single-threaded parts within the multithreaded code, and as such single-threaded performance will always matter.

 

It's not really necessary to post single-core performance, considering it's so easy to look up from just the CPU name. I already did so.

26 minutes ago, DustFireSky said:

We need more AMD users here. Any 1st or 2nd generation Ryzen users? Would be nice. :)

The results so far are not really good. If I remember correctly, the savegame has only 8 Dupes. The real issue is the bad scaling with more dupes in the game. I'm testing the game with 24 Dupes now. I think the result will be horrible :p I'm curious.

 

It would be interesting to try different games to stress-test different parts. I wrote a spreadsheet for making the graphs, and it would be easy to make one for each tested game, particularly if the CPUs remain the same, meaning I'd just plug in new FPS values.

================================================================================================

Archived

This topic is now archived and is closed to further replies.

Please be aware that the content of this thread may be outdated and no longer applicable.
