[HELP] pls with Server Lag.


Hello,

I've followed the guide posted here: 

I'm running on the AWS free tier (t2.micro, 1 vCPU, 1 GB RAM), and I've also added 500 MB of swap. I've disabled the caves and I'm running the server without the shard system, on Ubuntu 16.04.

uname -a
Linux ip-172-31-44-224 4.4.0-1043-aws #52-Ubuntu SMP Tue Dec 5 10:49:06 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

/home/ubuntu# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                1
On-line CPU(s) list:   0
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 63
Model name:            Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz
Stepping:              2
CPU MHz:               2400.068
BogoMIPS:              4800.13
Hypervisor vendor:     Xen
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              30720K
NUMA node0 CPU(s):     0
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm fsgsbase bmi1 avx2 smep bmi2 erms invpcid xsaveopt

The reason I want to host this server is for the lolz, and to give the community a free server for a year on AWS.

The first time I opened the server we had 5 players on it and played straight through to day 20 without any issues, and I was running tick_rate 30 as well. After we restarted the server (killed the screen session and started it again in detached mode), the lag started.

Now I get lag at regular intervals. For example, right before posting there were 6 players on the server and it was running normally; we played until day 5, when all of a sudden the lag started. The lag now happens even with only 2-3 players connected, and I just can't trace it to anything abnormal.

The server's CPU load is around 0.7 (5-minute average), and about 700 MB of RAM is in use.

Does anyone have any ideas why this might be happening? Or which features should I look to disable to get a stable server running?

Folder tree:
.
├── cluster.ini
├── cluster_token.txt
├── Master
│   ├── server_chat_log.txt
│   ├── server.ini
│   └── server_log.txt
└── whitelist.txt

 

My cluster.ini:


[GAMEPLAY]
game_mode = survival
max_players = 6
pvp = false
pause_when_empty = true
vote_kick = true


[NETWORK]
cluster_description = xxxx
cluster_name = xxxx
cluster_intention = cooperative
cluster_password =
; tick_rate = 30 -- commented out; at first I was running at tick rate 30 and it was acting normally.
whitelist_slots = 1
connection_timeout = 800
autosaver_enabled = false

[MISC]
console_enabled = true


[SHARD]
shard_enabled = false
bind_ip = 127.0.0.1
master_ip = 127.0.0.1
master_port = 10889
cluster_key = xxxx

[STEAM]
steam_group_id = xxxx
steam_group_admins = true
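
To double-check which settings a file like this actually puts into effect, here is a minimal sketch using Python's standard `configparser`. The function name `summarize_cluster` and the trimmed `SAMPLE` text are made up for illustration, and only a few of the keys above are read. A line starting with `;` is treated as a comment, so `tick_rate` comes back as unset, which fits the `Network tick rate: U=15(2)` line in the server log.

```python
import configparser

# Trimmed, hypothetical copy of the cluster.ini above (illustration only).
SAMPLE = """\
[GAMEPLAY]
max_players = 6
pause_when_empty = true

[NETWORK]
; tick_rate = 30
autosaver_enabled = false

[SHARD]
shard_enabled = false
"""


def summarize_cluster(ini_text):
    """Return a few settings from cluster.ini text; None means 'not set'."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return {
        "max_players": parser.getint("GAMEPLAY", "max_players", fallback=None),
        # A ';'-prefixed line is a comment, so this falls back to None.
        "tick_rate": parser.getint("NETWORK", "tick_rate", fallback=None),
        "shard_enabled": parser.getboolean("SHARD", "shard_enabled", fallback=None),
    }
```

Calling `summarize_cluster(open("cluster.ini").read())` on the real file would report the same three keys.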

 

 

server_log.txt:

[00:00:00]: PersistRootStorage is now /home/steam/.klei//DoNotStarveTogether/MyDediServer/Master/
[00:00:00]: Starting Up
[00:00:00]: Version: 247691
[00:00:00]: Current time: Wed Dec 27 13:14:57 2017

[00:00:00]: System Name: Linux
[00:00:00]: Host Name: ip-172-31-44-224
[00:00:00]: Release(Kernel) Version: 4.4.0-1043-aws
[00:00:00]: Kernel Build Timestamp: #52-Ubuntu SMP Tue Dec 5 10:49:06 UTC 2017
[00:00:00]: Machine Arch: x86_64
[00:00:00]: Don't Starve Together: 247691 LINUX
[00:00:00]: Build Date: 3184
[00:00:00]: Parsing command line
[00:00:00]: Command Line Arguments: -console -cluster MyDediServer -monitor_parent_process 26452 -shard Master
[00:00:00]: [WARNING] -console has been deprecated: Use the [MISC] / console_enabled setting instead.
[00:00:00]: Initializing distribution platform
[00:00:00]: ....Done
[00:00:00]: THREAD - started 'GAClient' (4133575488)
[00:00:00]: CurlRequestManager::ClientThread::Main()
[00:00:00]: Mounting file system databundles/shaders.zip successful.
[00:00:00]: Mounting file system databundles/fonts.zip successful.
[00:00:00]: Mounting file system databundles/anim_dynamic.zip successful.
[00:00:00]: Mounting file system databundles/bigportraits.zip successful.
[00:00:00]: Mounting file system databundles/images.zip successful.
[00:00:00]: Mounting file system databundles/scripts.zip successful.
[00:00:00]: ProfileIndex:5.85
[00:00:00]: [Connect] PendingConnection::Reset(true)
[00:00:00]: Network tick rate: U=15(2), D=0
[00:00:00]: Network tick rate: U=15(2), D=0
[00:00:00]: OnLoadPermissionList: /home/steam/.klei//DoNotStarveTogether/MyDediServer/blocklist.txt (Failure)
[00:00:00]: OnLoadPermissionList: /home/steam/.klei//DoNotStarveTogether/MyDediServer/adminlist.txt (Failure)
[00:00:00]: THREAD - started 'StreamInput' (4126145344)
[00:00:00]: OnLoadUserIdList: /home/steam/.klei//DoNotStarveTogether/MyDediServer/whitelist.txt (Success)
[00:00:00]: Token retrieved from: /home/steam/.klei//DoNotStarveTogether/MyDediServer/cluster_token.txt
[00:00:00]: Token retrieved from: /home/steam/.klei//DoNotStarveTogether/MyDediServer/cluster_token.txt
[00:00:00]: cGame::InitializeOnMainThread
[00:00:00]: Renderer initialize: Okay
[00:00:00]: AnimManager initialize: Okay
[00:00:00]: Buffers initialize: Okay
[00:00:00]: cDontStarveGame::DoGameSpecificInitialize()
[00:00:00]: GameSpecific initialize: Okay
[00:00:00]: cGame::StartPlaying
[00:00:00]: LOADING LUA
[00:00:00]: DoLuaFile scripts/main.lua
[00:00:00]: DoLuaFile loading buffer scripts/main.lua
[00:00:00]:   taskgrouplist:    default Together
[00:00:00]:   taskgrouplist:    classic Classic
[00:00:00]:   taskgrouplist:    cave_default    Underground
[00:00:00]:   taskgrouplist:    lavaarena_taskset       The Forge
[00:00:00]: running main.lua

[00:00:00]: loaded modindex
[00:00:00]: ModIndex: Beginning normal load sequence for dedicated server.

[00:00:00]: DownloadMods(0)
[00:00:02]: LOADING LUA SUCCESS
[00:00:02]: PlayerDeaths could not load morgue
[00:00:02]: PlayerHistory could not load player_history
[00:00:02]: bloom_enabled       false
[00:00:02]: loaded saveindex
[00:00:02]: OnFilesLoaded()
[00:00:02]: OnUpdatePurchaseStateComplete
[00:00:02]:     Load FE
[00:00:02]:     Load FE: done
[00:00:02]: ModIndex: Load sequence finished successfully.
[00:00:02]: Reset() returning
[00:00:02]: THREAD - started 'FilesExistAsyncThread' (4066597696)
[00:00:02]: FilesExistAsyncThread started (12982 files)...
[00:00:03]: ... FilesExistAsyncThread complete
[00:00:03]: [200] Account Communication Success (6)
[00:00:03]: Received (xxxxxxxxxxx) from TokenPurpose
[00:00:03]: Starting Dedicated Server Game
[00:00:03]: Network tick rate: U=15(2), D=0
[00:00:03]: About to start a server with the following settings:
[00:00:03]:   Dedicated: true
[00:00:03]:   Online: true
[00:00:03]:   Passworded: false
[00:00:03]:   ServerPort: 11000
[00:00:03]:   SteamAuthPort: 8768
[00:00:03]:   SteamMasterServerPort: 27018
[00:00:03]:   ClanID: true
[00:00:03]:   ClanOnly: false
[00:00:03]:   ClanAdmin: true
[00:00:03]:   LanOnly: false
[00:00:03]:   FriendsOnly: false
[00:00:03]:   EnableAutosaver: false
[00:00:03]:   EncodeUserPath: true
[00:00:03]:   PVP: false
[00:00:03]:   MaxPlayers: 6
[00:00:03]:   GameMode: survival
[00:00:03]:   OverridenDNS:
[00:00:03]:   PauseWhenEmpty: true
[00:00:03]:   IdleTimeout: 1800s
[00:00:03]:   VoteEnabled: true
[00:00:03]:   InternetBroadcasting: true
[00:00:03]:   Intent: cooperative
[00:00:03]: [Shard] Shard server mode disabled by configuration file
[00:00:03]: Online Server Started on port: 11000
[00:00:03]: Collecting garbage...
[00:00:03]: lua_gc took 0.02 seconds
[00:00:03]: ~ShardLuaProxy()
[00:00:03]: ~ItemServerLuaProxy()
[00:00:03]: ~InventoryLuaProxy()
[00:00:03]: ~NetworkLuaProxy()
[00:00:03]: ~SimLuaProxy()
[00:00:04]: lua_close took 0.03 seconds
[00:00:04]: ReleaseAll
[00:00:04]: ReleaseAll Finished
[00:00:04]: cGame::StartPlaying
[00:00:04]: LOADING LUA
[00:00:04]: DoLuaFile scripts/main.lua
[00:00:04]: DoLuaFile loading buffer scripts/main.lua
[00:00:04]:   taskgrouplist:    default Together
[00:00:04]:   taskgrouplist:    classic Classic
[00:00:04]:   taskgrouplist:    cave_default    Underground
[00:00:04]:   taskgrouplist:    lavaarena_taskset       The Forge
[00:00:04]: running main.lua

[00:00:04]: loaded modindex
[00:00:04]: ModIndex: Beginning normal load sequence for dedicated server.

[00:00:04]: LOADING LUA SUCCESS
[00:00:04]: PlayerDeaths could not load morgue
[00:00:04]: PlayerHistory could not load player_history
[00:00:04]: bloom_enabled       false
[00:00:04]: loaded saveindex
[00:00:04]: OnFilesLoaded()
[00:00:04]: OnUpdatePurchaseStateComplete
[00:00:04]: Loading world: session/xxxxxxx/xxxxxx
[00:00:04]: Save file is at version 4.77
[00:00:04]:     Unload FE
[00:00:04]:     Unload FE done
[00:00:04]:     LOAD BE
[00:00:07]:     LOAD BE: done
[00:00:07]: Begin Session: xxxxxxxxxx
[00:00:07]: saving to server_temp/server_save
[00:00:07]: MiniMapComponent::AddAtlas( minimap/minimap_data.xml )
[00:00:08]: Loading Nav Grid
[00:00:11]: Reconstructing topology
[00:00:11]:     ...Sorting points
[00:00:11]:     ...Sorting edges
[00:00:11]:     ...Connecting nodes
[00:00:11]:     ...Validating connections
[00:00:11]:     ...Housekeeping
[00:00:11]:     ...Done!
[00:00:11]: Truncating to snapshot #2...
[00:00:11]:  - session/xxxxxxx/xxxxxx/0000000003
[00:00:11]:  - session/xxxxxxx/xxxxxx/0000000003
[00:00:11]:    2 file(s) removed
[00:00:11]: 1 uploads added to server. From server_temp
[00:00:11]: Telling Client our new session identifier: xxxxxxxxx
[00:00:11]: ModIndex: Load sequence finished successfully.
[00:00:11]: Reset() returning
[00:00:11]: [Steam] SteamGameServer_Init(8768, 11000, 27018)
[00:00:12]: [Steam] SteamGameServer_Init success
[00:00:12]: Validating portal[1] <-> <nil>[1] (inactive)
[00:00:12]: Validating portal[2] <-> <nil>[2] (inactive)
[00:00:12]: Validating portal[3] <-> <nil>[3] (inactive)
[00:00:12]: Validating portal[4] <-> <nil>[4] (inactive)
[00:00:12]: Validating portal[5] <-> <nil>[5] (inactive)
[00:00:12]: Validating portal[6] <-> <nil>[6] (inactive)
[00:00:12]: Validating portal[7] <-> <nil>[7] (inactive)
[00:00:12]: Validating portal[8] <-> <nil>[8] (inactive)
[00:00:12]: Validating portal[9] <-> <nil>[9] (inactive)
[00:00:12]: Validating portal[10] <-> <nil>[10] (inactive)
[00:00:12]: Sim paused

 

 


18 hours ago, Denchick12345 said:

This is why i dont host a dedicated server.:?

:D Thanks for replying. 

 

As an update: I've set up node_exporter for Prometheus to monitor this server and see if there are any 'anomalies' that might be eluding me. Looking at the dashboard for the last hour, I see some spikes in CPU usage. When the lag started (the server was on day 7, but I had only played 3 days), the 1-minute load average went to 80% while I was fighting a treeguard (if that has any relevance).

The node_exporter uses around 1.7 MB of RAM and 0.1% CPU on average, so it won't get in the way.

This is a snapshot covering the time when the lag started and when it dropped. Only 2 people remained on the server:

https://snapshot.raintank.io/dashboard/snapshot/1fAbbdsWTjeNg7VXDYz9xE9kn24HWWm6?orgId=2

I guess overall it looks like a CPU issue. But this lag also starts randomly when I'm alone on the server.

I know AWS t2 instances rely on burst CPU, but the minimum requirements specify 1 CPU per game instance. The system looks normal up to a point.

Are there any map settings I can configure to reduce the CPU usage, or maybe the number of slots?

The snapshot I've linked should provide more details. Hopefully it gives you a general idea of what is happening on the server during a lag spike.
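
One number worth watching for a CPU issue on a virtual machine is steal time: CPU time the guest wanted but the hypervisor gave to something else. Here is a rough sketch that computes it from two readings of the aggregate `cpu` line in `/proc/stat`. The function names are made up for illustration, and whether T2 throttling shows up as steal on this particular instance is an assumption to verify, not something established in this thread.

```python
import time


def parse_cpu_line(line):
    """Return (total_jiffies, steal_jiffies) from the aggregate 'cpu' line."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    values = [int(v) for v in fields[1:]]
    # Column order: user nice system idle iowait irq softirq steal guest guest_nice.
    # Guest time is already counted inside user/nice, so only sum the first 8.
    total = sum(values[:8])
    steal = values[7] if len(values) > 7 else 0
    return total, steal


def steal_percent(before_line, after_line):
    """Percentage of elapsed CPU time reported as steal between two readings."""
    total0, steal0 = parse_cpu_line(before_line)
    total1, steal1 = parse_cpu_line(after_line)
    elapsed = total1 - total0
    return 100.0 * (steal1 - steal0) / elapsed if elapsed > 0 else 0.0


def sample_steal(interval_seconds=5.0, path="/proc/stat"):
    """Read /proc/stat twice, interval_seconds apart, and return steal %."""
    with open(path) as f:
        before = f.readline()
    time.sleep(interval_seconds)
    with open(path) as f:
        after = f.readline()
    return steal_percent(before, after)
```

Running `sample_steal()` during a lag spike and again while the game feels smooth would show whether the two situations differ; the `st` column in `top` reports the same counter.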


What you have there is a FREE VPS, which means the hosting company shares the resources you are using with other users to reduce its costs. First, a free VPS doesn't have many resources to begin with; second, it gets lower priority on those resources than other (paid) VPSes running on the same machine. That means: while other users aren't using their servers, your server runs fine, but as soon as they put some load on theirs, your server doesn't get much processing power, disk speed, etc., which causes LAG. Also, a 1 GB, 1-core machine isn't much to run an OS and a game server on to begin with.

If you could magically reduce the CPU power needed to run a server any further, the devs would've done that by now.

Also, minimum specs mean that you can run the server, not that it'll run flawlessly. It starts and people can join, so the minimum specs are correct. (And that's not even considering that you aren't using a dedicated CPU core.)

Either invest in a higher priced VPS, or go straight for a server with a dedicated CPU core or two.


Hey Daniel,

Thanks for your reply.

Indeed the specs aren't great, and AWS's t2.micro is not the best to host a real gameserver.

After investing this small amount of time in the project, I'm actually considering buying a VPS from OVH. What really bugs me is that 1 CPU core should technically be enough to handle 1 forest world with 3-4 active connections.

Is there a place where we could read more about the server's requirements? CPU processing power, RAM speed, disk I/O, etc.

OVH offers 2x 3.1 GHz cores with 4 GB of RAM for 16 euro per month; in my head that means I could run a forest + cave server. But I need to know precisely before I plunge into paying for something.


There's no full reference or calculation for the performance you actually need, since it always varies with the world gen, the number of items lying around, the number of mobs running around, and how the players are distributed.

The thing with that one CPU is the following: technically that one CPU core is enough, BUT you probably share it with 3 other people (as an example). That means: if those other 3 put almost no load on it, you get almost the full processing power (minus the usual overhead, probably around 80% of the max processing power), which is enough to run the server smoothly. But if the others are pulling processing power as well, everyone gets about 1/4 of the total, and minus the usual overhead etc. you end up with 20% of the max processing power, which is effectively 0.2 cores, which is not enough. (Very simplified example for easy understanding.)

At 16€ you are going straight for quite an investment. I can't guarantee that it'll run flawlessly, but if a server costs 16€ there's a high chance that you aren't sharing the resources with many people, if at all, so I'd expect a lot better performance.
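
The arithmetic in that simplified example can be written out as a tiny sketch. The function name `effective_cores` is made up for illustration, and the 80% efficiency figure is just the rough overhead guess from the example, not a measured value.

```python
def effective_cores(active_tenants, efficiency=0.8):
    """Share of one physical core each tenant gets in the simplified model.

    active_tenants: how many VPSes are pulling CPU at the same time (you included).
    efficiency: fraction of the core left after overhead (assumed ~80%).
    """
    if active_tenants < 1:
        raise ValueError("there is always at least one active tenant: you")
    return efficiency / active_tenants


alone = effective_cores(1)       # neighbours idle: about 0.8 of a core
contended = effective_cores(4)   # all four busy: about 0.2 of a core
```

Real hypervisor schedulers are more complicated than an even split, so this only illustrates why the same server can feel fine at one moment and lag at another.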


Hey Daniel.

Thanks again for your reply and your insight into the VPS space.

Sadly this is not the case here. While you would normally be given a hyper-threaded core, in this case it's a bit different: you are given clock cycles from a core, and so is someone else. (This is in reply to your post.)

Along the way, you did make me dig deeper into how Amazon's T2 instances work, and this blog post explains it about as elegantly as it can be explained: http://www.dlt.com/blog/2016/10/19/misunderstood-misfits-t2-instance/ along with their manual.

Basically, the tl;dr of this issue is that the baseline performance of the vCPU is defined as a percentage of CPU usage. A DST server on a t2.micro usually runs at 20%+, which is above the baseline, even with only 1 player in it. This means I am spending the CPU credits that I've accumulated, and once they run out I am artificially bottlenecked, something that does not appear in the logs.

The coincidence for me was that when I first launched the server (as I said at the start of the post), I was using the 30 CPU credits that I got with the fresh install.

It's funny: I actually work with AWS on a daily basis, but for web applications in staging/testing environments this has not been a noticeable issue.
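
A back-of-the-envelope sketch of how long a credit balance lasts. The function name is made up for illustration. It assumes one CPU credit equals one vCPU-minute at 100% and that a t2.micro earns 6 credits per hour (a 10% baseline), which is how I remember the AWS documentation; please check the current docs before relying on these numbers. It also ignores the cap on the credit balance and any special handling of launch credits.

```python
def minutes_until_throttled(start_credits, cpu_utilization, credits_per_hour=6.0):
    """Estimate minutes until the credit balance reaches zero.

    start_credits: credits available now (e.g. the 30 launch credits).
    cpu_utilization: average vCPU usage as a fraction, e.g. 0.2 for 20%.
    credits_per_hour: earn rate (assumed 6/hour for t2.micro).
    Returns None when usage is at or below the earn rate (balance never drains).
    """
    earn_per_minute = credits_per_hour / 60.0
    burn_per_minute = cpu_utilization  # 1 credit = 1 vCPU-minute at 100%
    net_drain = burn_per_minute - earn_per_minute
    if net_drain <= 0:
        return None
    return start_credits / net_drain
```

Under these assumptions, 30 credits at a steady 20% usage last roughly 300 minutes, and usage at or below 10% never drains the balance. The real balance is visible as the `CPUCreditBalance` metric in CloudWatch.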

I guess all that remains at this point is to buy a VPS (1 core should be enough for a forest world) if I wish to give something back to the community. 

Thanks again Daniel for bearing with me.

It's nice to see that you can learn something every day.


I learned about the AWS CPU "cap" the hard way not long ago. A bug caused a process to max out the CPU for hours, silently eating up CPU credits. That left a production server pretty much down until we figured out how to fix it. :oops:

As you figured, a VPS is a good choice as you won't be capped or have to worry about credits / usage. Usually, while people are playing, the dedicated server's CPU usage stays between 90~100%, constantly. If you intend to host a forest-only world, one CPU core should suffice, so you can look for a more affordable option.

Also check out this post where we added our insights on what's important for a dedicated server to perform well, might be interesting for you :)


Archived

This topic is now archived and is closed to further replies.

Please be aware that the content of this thread may be outdated and no longer applicable.
