Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

monsters bug when running #1650

Open
Schiffers opened this issue Dec 19, 2015 · 33 comments
Open

monsters bug when running #1650

Schiffers opened this issue Dec 19, 2015 · 33 comments
Labels
bug An issue describing unexpected behavior of code

Comments

@Schiffers
Copy link

well, how tittle says, when monsters is running low life, then walk 1 sqm and stop, walk another and stop, for exemple, draptor when is running he walk 1 sqm and stop and turn to the player, and continue doing it until he did not view any player and stop

@l04d
Copy link

l04d commented Dec 20, 2015

tested right now, everything is okay with that, must be your server.

@WibbenZ
Copy link
Member

WibbenZ commented Dec 20, 2015

They have pretty much always done that(atleast on OTs), you can just put follow on ex a necro and it will walk one step at a time insted of running.
No ide how it's on rl atm, do they actually run away or do they walk step by step?

Would be better IMO if they actually ran away insted of walking step by step, should not be easier just because you caught up with the monster.

@Schiffers
Copy link
Author

@WibbenZ in the rl they run away without stop or turn to the player, I think that the problem is with the flag "runonhealth" because the monster try to run, but attack aswell o.O

@ranisalt
Copy link
Member

Maybe monsters could have an internal state machine with ATTACK and FLEE states, the former searching a player and launching attacks and the latter just running away and checking if it's possible to attack again.

@marksamman marksamman added the needs-triage Needs testing with screenshot/video confirmation label Dec 21, 2015
@slavidodo
Copy link
Contributor

slavidodo commented Apr 27, 2016

confirmed, bug.
Simply as monster attacking a player, if the player stops until monster start to move again then the player moves, the monster will reach the first place player was in, tries to attack then tries to move again which makes some delay.

@Schiffers
Copy link
Author

Schiffers commented Jul 24, 2016

bump, also monsters when running stay in crazy dancing

@nekiro
Copy link
Member

nekiro commented Apr 21, 2020

Sorry for refreshing this issue, but that "bug" was reported some time ago by someone on forum and @marksamman stance on this is:
https://otland.net/threads/weird-monster-behaviour-in-tfs-1-1.228021/#post-2200365
Not sure if there is any better way of solving that.
(it will affect performance)

@DSpeichert
Copy link
Member

@nekiro It does look like a weird behavior. I'm inclined to tag it as a bug and look for a more performant solution than the one you linked.

@ghost
Copy link

ghost commented Aug 24, 2021

@nekiro It does look like a weird behavior. I'm inclined to tag it as a bug and look for a more performant solution than the one you linked.

yeah this is indeed a bug, make monsters so "dumb" that hunting anything isn't even hard

@DSpeichert DSpeichert added bug An issue describing unexpected behavior of code and removed needs-triage Needs testing with screenshot/video confirmation labels Aug 28, 2021
@gesior
Copy link
Contributor

gesior commented Sep 6, 2021

Maybe these monsters following player look dumb, but try to spawn Rabbit. It runs away doing 1 step per second!

Changing line described by Mark fixed Rabbits too.

@ghost
Copy link

ghost commented Sep 6, 2021

Maybe these monsters following player look dumb, but try to spawn Rabbit. It runs away doing 1 step per second!

Changing line described by Mark fixed Rabbits too.

exactly, my proposal for the time being, is to add mark code with a configurable switch to turn it on/off and warn about performance just like we do with classic attack speed and other settings

@gesior
Copy link
Contributor

gesior commented Sep 7, 2021

I did some test:

  • made Demon with no spells and no summons
  • spawned 4 Demons in big room
  • start running around them on 1000 lvl character
  • test it for minute with/without Mark creature.cpp change

Results were same over minute, so I post just 5 seconds part of each test.

CPU usage on dispatcher thread (5 seconds stats):

  1. Normal TFS:
Cpu usage: 2.24194%
 Time (ms)     Calls     Rel usage %    Real usage % Description
        50        20       45.41795%        1.01824% std::bind(&Game::checkDecay, this)
        26       487       23.48220%        0.52646% [&]() { sendAll(bufferedProtocols); }
        13       108       12.23709%        0.27435% std::bind(&Game::checkCreatureWalk, &g_game, getID())
        11        50       10.38291%        0.23278% std::bind(&Game::checkCreatures, this, (index + 1) % EVENT_CREATURECOUNT)
         6        33        6.01629%        0.13488% &Game::playerMove
         2       174        2.26108%        0.05069% std::bind(&Game::checkCreatureAttack, &g_game, getID())
         0         2        0.10168%        0.00228% std::bind(&Game::updateWorldTime, this)
  1. With Mark change in creature.cpp:
Cpu usage: 3.21697%
 Time (ms)     Calls     Rel usage %    Real usage % Description
        56        20       35.03417%        1.12704% std::bind(&Game::checkDecay, this)
        49       171       30.97982%        0.99661% std::bind(&Game::updateCreatureWalk, &g_game, getID())   <<<< -----
        24       484       15.39205%        0.49516% [&]() { sendAll(bufferedProtocols); }
        13       105        8.42253%        0.27095% std::bind(&Game::checkCreatureWalk, &g_game, getID())
         7        31        4.35771%        0.14019% &Game::playerMove
         6        50        4.27815%        0.13763% std::bind(&Game::checkCreatures, this, (index + 1) % EVENT_CREATURECOUNT)
         2       171        1.31935%        0.04244% std::bind(&Game::checkCreatureAttack, &g_game, getID())
         0         5        0.10145%        0.00326% &Game::playerReceivePingBack

As we can see, there is only one change in stats:

        49       171       30.97982%        0.99661% std::bind(&Game::updateCreatureWalk, &g_game, getID())

It made 171 calls to Game::updateCreatureWalk in 5 seconds and these calls took 49ms to calculate. Which gives 10 ms per 1 second (= 1% CPU!) to calculate 4 Demons and 1 player on i9-9900K.

IT MUST BE configurable on/off in config.lua, as in case of popular server, it may hit 100% CPU easily.

EDIT:
with 5 demons it uses extra 1.6% CPU (i9-9900K):

        81       222       41.00515%        1.63432% std::bind(&Game::updateCreatureWalk, &g_game, getID())

with 10 demons it uses extra 3% CPU:

       150       370       56.97033%        3.01295% std::bind(&Game::updateCreatureWalk, &g_game, getID())

with 15 demons it uses extra 10% CPU! 👀

       506       752       72.62174%       10.12373% std::bind(&Game::updateCreatureWalk, &g_game, getID())

There must be some way, to make them move smooth and react fast, but not use 123% CPU.

@Zbizu
Copy link
Contributor

Zbizu commented Sep 12, 2021

I suggest splitting the tickrate of monster actions to allow setting different values. Currently there's only one value for everything, the EVENT_CREATURE_THINK_INTERVAL which is set to 1000ms. If think had 1000 ms, but path updates had 100, the monster would move more smoothly for sure without costing that much performance.

link to variable:

static constexpr int32_t EVENT_CREATURE_THINK_INTERVAL = 1000;

also, why is it hardcoded?

@gesior
Copy link
Contributor

gesior commented Sep 13, 2021

I suggest splitting the tickrate of monster actions to allow setting different values. Currently there's only one value for everything, the EVENT_CREATURE_THINK_INTERVAL which is set to 1000ms. If think had 1000 ms, but path updates had 100, the monster would move more smoothly for sure without costing that much performance.

link to variable:

static constexpr int32_t EVENT_CREATURE_THINK_INTERVAL = 1000;

also, why is it hardcoded?

I will benchmark idea with 100 ms 'find path' interval.

I will also test algorithm, where monster can do 'extra step calculation' once every XXX ms. I will test values 100, 200 and 500 ms.
So it can react in 0 ms, when player make step (start following player), but it won't calculate path every player step.

@Zbizu
Copy link
Contributor

Zbizu commented Sep 13, 2021

On a second thought, won't it cause a data race?

@gesior
Copy link
Contributor

gesior commented Sep 21, 2021

I did tests with 4 reaction algorithms:

	if (creature == followCreature || (creature == this && followCreature)) {
		if (hasFollowPath) {
			const uint32_t algo = g_config.getNumber(ConfigManager::PZ_LOCKED);
			if (algo == 1) { // max cpu, 0 ms reaction
				isUpdatingPath = false;
				g_dispatcher.addTask(createTask(std::bind(&Game::updateCreatureWalk, &g_game, getID())));
			} else if (algo == 2) { // try to reduce cpu, fast reaction
				if (getWalkDelay() <= 0) {
					isUpdatingPath = false;
					g_dispatcher.addTask(createTask(std::bind(&Game::updateCreatureWalk, &g_game, getID())));
				} else {
					isUpdatingPath = true;
				}
			} else if (algo == 3) { // try to reduce cpu, fast reaction v2
				if (getWalkDelay() <= 0) {
					g_dispatcher.addTask(createTask(std::bind(&Game::updateCreatureWalk, &g_game, getID())));
				}
				isUpdatingPath = true;
			} else { // normal tfs
				isUpdatingPath = true;
			}
		}

Fighting vs 5 demons (on [email protected]):
Algo nr 0 - current TFS. Super slow reaction. 0% CPU
Algo nr 1 - instant reaction - 3.19% CPU
Algo nr 2 - fast first step - 0.97% CPU
Algo nr 3 - fast first step, faster next step - 1.1% CPU

Fighting vs 15 demons (on [email protected]):
Algo nr 0 - current TFS. Super slow reaction. 0% CPU
Algo nr 1 - instant reaction - 5.88% CPU
Algo nr 2 - fast first step - 1.1% CPU
Algo nr 3 - fast first step, faster next step - 1.6% CPU

Testing Algo nr 1 - instant reaction was hard with 15 demons, because they surround me often. With other algorithms I could easily do circles around them.

EDIT:
Now it updates walk path once a second in Creature::onThink. Tommorow I will test how much CPU it takes, if I add extra function that updates walk path 10 times per second.
I will also try to make it calculate in 8 threads in same time, if it won't be complicated.

@gesior
Copy link
Contributor

gesior commented Sep 23, 2021

Algorithm 4 by Mark. Monsters react like with 'instant reaction' algorithm, but CPU usage is lower, as we schedule updateCreatureWalk to moment when monster can make next step.

			} else if (algo == 4) { // try to reduce cpu, fast reaction v3
				int64_t walkDelay = getWalkDelay();
				if (walkDelay <= 0) {
					g_dispatcher.addTask(createTask(std::bind(&Game::updateCreatureWalk, &g_game, getID())));
				} else if(!isUpdatePathScheduled) {
					isUpdatePathScheduled = true;
					g_scheduler.addEvent(createSchedulerTask(walkDelay, std::bind(&Game::updateCreatureWalk, &g_game, getID())));
				}
				isUpdatingPath = true;
			}

(isUpdatePathScheduled resets in Game::updateCreatureWalk)
Fighting vs 15 demons (on [email protected]):
Algo nr 0 - current TFS. Super slow reaction. 0% CPU
Algo nr 1 - instant reaction - 5.88% CPU
Algo nr 2 - fast first step - 1.1% CPU
Algo nr 3 - fast first step, faster next step - 1.6% CPU
Algo nr 3 - instant reaction, but with Scheduler for 'walkDelay' - 1.9% CPU

Total CPU usage comparison:
IDLE with 15 demons: 1.87% CPU
Algo nr 0 with 15 demons: 3.46% CPU
Algo nr 1 with 15 demons: 10.20% CPU
Algo nr 2 with 15 demons: 3.50% CPU
Algo nr 3 with 15 demons: 4.22% CPU
Algo nr 4 with 15 demons: 5.12% CPU

@Zbizu
Copy link
Contributor

Zbizu commented Sep 23, 2021

That cpu usage is still quite big even for 15 demons. If we assume 200 players online of which 70% are hunting and keeping 5 monsters active on average (data completely made up but you get the point), will the server be playable?

On the other hand, there were also issues with pathfinding performance from what I remember. Also there's a huge bottleneck in area spells. Every tile of spell area calls isSightClear, every magic effect calls getSpectators so try benchmarking without area spells.

edit: nvm, the thing with magic effects doesnt happen

@gesior
Copy link
Contributor

gesior commented Sep 23, 2021

That cpu usage is still quite big even for 15 demons. If we assume 200 players online of which 70% are hunting and keeping 5 monsters active on average (data completely made up but you get the point), will the server be playable?

On the other hand, there were also issues with pathfinding performance from what I remember. Also there's a huge bottleneck in area spells. Every tile calls isSightClear, every magic effect calls getSpectators so try benchmarking without area spells.

I made Demon with 0 spells/attacks and 0 summons, to make results stable.
I will try to make that 'path finding' work in few threads in same time. Every popular server runs on 4 core machine, often even on 8/16 cores.

EDIT:
I made it run in 8 threads. Tested with 20/40 path findings per second. CPU usage on dispatcher is higher (6.4%) and total CPU usage of TFS jumped to 20% :(
On tuesday I will test it with extra IF to 'not calculate path few times per step'.

@soul4soul
Copy link
Contributor

That cpu usage is still quite big even for 15 demons. If we assume 200 players online of which 70% are hunting and keeping 5 monsters active on average (data completely made up but you get the point), will the server be playable?

On the other hand, there were also issues with pathfinding performance from what I remember. Also there's a huge bottleneck in area spells. Every tile of spell area calls isSightClear, every magic effect calls getSpectators so try benchmarking without area spells.

edit: nvm, the thing with magic effects doesnt happen

Didn't your line of sight PRs fix improve the logic to not check every tile on area spells?

@Zbizu
Copy link
Contributor

Zbizu commented Sep 25, 2021

No, I planned to work on that after two things get merged, but I've been busy with several other things. I'm not really good at optimizations so I'll stick with working on protocol. If anyone is interested in working on that, I can give you hints though.

@gesior
Copy link
Contributor

gesior commented Oct 11, 2021

I did tests with multithread pathfinding with searches limited to one calculation per step. It's hard to say, if it reduces dispatcher thread CPU usage, because it generates constant load related to multithread synchronization. Cost of 40 synchronizations of 16 threads per second is higher than calculation of 15 demons steps (around 1% cpu).

I made version for TFS, but someone need to test it on server with 50+ players online:
https://github.com/gesior/forgottenserver/commit/48cb292d56d36e87ab25e2bc00bcc5c61c4cc842

Someone tried to install it on otservbr and said it crashes, but maybe he did not apply it in all required places.

@github-project-automation github-project-automation bot moved this to Backlog in TFS 1.8 Jun 2, 2024
@gesior
Copy link
Contributor

gesior commented Sep 4, 2024

I've tested again TFS algorithm vs Mark fast reaction vs Gesior fast reaction vs Kondra multi-thread path finding.

Benchmark:

  • spawn Demon every 6 tiles on 6000x6000 empty map; disable all attacks/spells of Demon
  • add 7 SQM of walls every 3 tiles with 1 SQM between them (as on screenshot)
  • login 400 players; drop network connection to do not waste CPU on XTEA encryption
  • make all players teleport to new location 30 sqm away every 10 seconds; each player goes to new never visited location, so Demons never seen any player before
  • after each teleport each player makes 14 steps up-down within 10 seconds
    image

Results (avg CPU usage by Dispatcher thread):

  • Mark/Gesior instant reaction cannot handle over 300 online - CPU goes 100+
  • normal TFS - 1 second reaction time - with 400 online uses 50.47% CPU and lags 592 times per minute with avg lag time 30 ms
  • Kondra multi-thread path finding with 400 online and 50 ms reaction time uses 47.57% CPU and lags 523 times per minute with avg lag time 15 ms
  • Kondra multi-thread path finding with 400 online and 200 ms reaction time uses 41.65% CPU and lags 246 times per minute with avg lag time 31 ms

So you can reduce CPU usage on Dispatcher by 3% (50 ms reaction) or 9% (200 ms reaction) using 8 cores, but lower CPU usage accumulates more monster to process at once making it lag for 31 ms, not 15 ms (normal TFS lags for 30 ms).

Full results after 4 hours of benchmarking:
https://docs.google.com/spreadsheets/d/19uSEjxdpXQGrdheusutf7ndru5D2OS2c7XNCEJ6PFjg

Tested on Hetzner CCX33 Cloud:

  • 8 dedicated AMD cores each 2.4 GHz
  • 32 GB RAM
  • 7zip/sysbench benchmarks done between OTS benchmarks showed that CPU/RAM speed is pretty constant (+/- 3%), so OTS benchmarks results should be reproducible

Path Finding code is temporarily on branch (with Kondra permission):
https://github.com/gesior/forgottenserver-gesior/commits/pathfd/
Code to spawn monsters, walls and teleport players around:
https://github.com/gesior/forgottenserver-gesior/blob/pathfd/data/talkactions/scripts/pathfinding.lua

Raw results from OTS Stats and PHP code to process them in CSV:
raw_results.zip

EDIT:
Benchmarks from 16 core machine with 20 'find path calculations per second':

  • 8 threads: 44.94% CPU, 387 lags per minute (on 8 core machine: 48.49%, 548 lags per minute)
  • 16 threads: 41.13% CPU, 206 lags per minute
  • 32 threads: 40.24% CPU, 210 lags per minute

@Zbizu
Copy link
Contributor

Zbizu commented Sep 8, 2024

What if entire monster behavior was on multiple threads instead of game one and only dealing damage/dropping loot was on game thread?

@gesior
Copy link
Contributor

gesior commented Sep 8, 2024

What if entire monster behavior was on multiple threads instead of game one and only dealing damage/dropping loot was on game thread?

You can run on multiple threads only things that do not modify anything.
Ex. multi-thread path finding works, because it splits calculation of path and walking into 2 functions. First it calculates path on multiple threads (it does not modify anything), then it walks (move monsters/players) on dispatcher thread.

@Codinablack
Copy link
Contributor

According to pretty much every extensive resource on the internet, all the top minds in programmers hangout discord, as well as others I have spoken with in real life... multi-threading A* pathfinding is pointless.

Pretty much everyone everywhere agree's, if there is at all any issues of performance in an A* pathfinding algorithm, its the implementation or some other issue that needs addressed, and multi-threading is not the solution.

image

Even when using a dedicated thread for pathfinding, pretty much all agree that this would not do much as the main thread is still waiting on the results before moving.

@gesior
Copy link
Contributor

gesior commented Sep 17, 2024

According to pretty much every extensive resource on the internet, all the top minds in programmers hangout discord, as well as others I have spoken with in real life... multi-threading A* pathfinding is pointless.

That's right answer to wrong question.
You can't run A* between positions X-Y on multiple threads, but you can run 100x A* between 100 different X-Y positions on different maps with 100 cores.

There are often 30+ monsters waiting for path calculation every 0.05 sec with 400 players online. You can calculate all their paths to target using multiple threads.

@Codinablack
Copy link
Contributor

Codinablack commented Sep 17, 2024

According to pretty much every extensive resource on the internet, all the top minds in programmers hangout discord, as well as others I have spoken with in real life... multi-threading A* pathfinding is pointless.

That's right answer to wrong question. You can't run A* between positions X-Y on multiple threads, but you can run 100x A* between 100 different X-Y positions on different maps with 100 cores.

There are often 30+ monsters waiting for path calculation every 0.05 sec with 400 players online. You can calculate all their paths to target using multiple threads.

And then every one of those monsters would still have to wait to receive their path on the main thread.
Players can move, items can move, other things can change in that time, so now the paths are stale too.

@gesior
Copy link
Contributor

gesior commented Sep 17, 2024

According to pretty much every extensive resource on the internet, all the top minds in programmers hangout discord, as well as others I have spoken with in real life... multi-threading A* pathfinding is pointless.

That's right answer to wrong question. You can't run A* between positions X-Y on multiple threads, but you can run 100x A* between 100 different X-Y positions on different maps with 100 cores.
There are often 30+ monsters waiting for path calculation every 0.05 sec with 400 players online. You can calculate all their paths to target using multiple threads.

And then every one of those monsters would still have to wait to receive their path on the main thread. Players can move, items can move, other things can change in that time, so now the paths are stale too.

Idea is that you process multiple monsters at once, not each monster as different Dispatcher Task. TFS already does it in Game::checkCreatures.

It's most basic multi-thread concept possible. You have 10 task to do (in this case run A*). You can run them on main thread and they will take 10x 'task execution time' or you can run them on 10 threads and wait on main thread for results, which will reduce time to 1x 'task execution time', if you have 10 core CPU. By using 10 cores, you can reduce main thread processing time 10 times.

Only communication between threads is that Dispatcher lists 'monsters that require new path' and go sleep (nothing can move/change in that time). Path-finding threads process that list and then wake up Dispatcher thread.
All threads modify memory of Monster* that is shared between threads and reads Map* that is shared between threads (dispatcher thread and all path-finding threads).

Normal path-finding: 10 times per second 10% of Monsters recalculate their path (if required) and start walking (Game::checkCreatures / onThink)
Multi-thread path-finding: 20 times per second 100% of Monsters that require new path recalculate their path on multiple CPU cores (Dispatcher thread goes sleep until all paths are recalculated) and then start walking on Dispatcher thread

Result: faster reaction (0.05 sec, not 1.0 sec); lower CPU usage on Dispatcher, if you have enough CPU cores.

EDIT:
Math says that 20x 100% is 20x more than 10x 10%, but when you calculate path that often, list of 'monsters that require new path' is often much smaller than 100% of monsters. Most of monsters do not make step in 0.05 sec, so they don't need new path every "Multi-thread path-finding" call.
Even with 8 cores, CPU usage on dispatcher is 3% lower.

@Codinablack
Copy link
Contributor

Codinablack commented Sep 17, 2024

nothing can move/change in that time

Listen to what you are saying. If nothing can change, then its not multi-threaded. If there are indeed multiple threads running, everything has changed by the time they are done... this is the entire point and complexity that IS multi-threading!

If nothing can change, then that means things are waiting (locked/frozen), that is a lag. If things are changing, then its like I said previously, the paths are stale. A player could have moved in the way, a fire field created... these things can happen while the secondary thread is running, and if they can't, then everything is waiting, and that's a lag. Even if you are getting slightly better results than what is already in place, its only barely slightly better and for a high cost, and I don't mean CPU usage, so much as the resource that is a thread.

I don't want to argue the matter. If you have good results with multi-threading, or what you think is good results, more power to you. I am only saying, even if the results are better than whatever is going on right now, its not because multi-threading is the solution to a bottleneck for this case, its that there is another issue that needs addressed which when done so, would allow greater response times AND waaaaay less CPU for the same.

@gesior
Copy link
Contributor

gesior commented Sep 17, 2024

. If nothing can change, then its not multi-threaded.

There is one thing that can change. It's Monster* list of 'steps to target position' - thing that is calculated and modified by path-finding algorithm (A*).

If there are indeed multiple threads running, everything has changed by the time they are done

Nothing has changed, because only thread that can modify game (players/monsters/map) is Dispatcher. It's sleeping and waiting for path-finding results.
The idea is to stop Dispatcher, run 100 A* tasks on 10 CPU cores and start Dispatcher again. Instead of running 100 A* in Dispatcher thread one by one.

I am only saying, even if the results are better than whatever is going on right now, its not because multi-threading is the solution to a bottleneck for this case

Multi-threading is solution to any problem that CAN be processed in parallel.
Right now path-finding takes 100% CPU with 200 online, you can run it on 10 cores and reduce it to 10% CPU.
Even, if you rewrite path-finding to do not use A* so often and CPU usage will drop to 10%, if this new algorithm can run on multiple threads, you can reduce CPU usage on Dispatcher to 1% by running it on 10 cores.

Path-finding is only part of OTS engine that can be processed in parallel.

@Codinablack
Copy link
Contributor

Who exactly is getting 100 percent CPU usage with 200 players online?

Now you make me question where these players and this host with these willing players come from...

@ramon-bernardo
Copy link
Contributor

It seems that #4637 resolved this; initially, we don't experience the notorious delay

@gesior you run some tests with this PR to check the performance diff?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug An issue describing unexpected behavior of code
Projects
Status: Backlog
Development

No branches or pull requests