Support::Windows Servers Support forum for Windows EQEMu users.

  #1  
Old 03-08-2019, 12:51 AM
Drakiyth
Dragon
 
Join Date: Apr 2012
Posts: 545
Default

I would wait until Akkadius posts an official fix before touching the maps and navmesh files. That sounds like a band-aid fix that could lead to zone crashes or NPCs misbehaving, not a full repair of the problem.
  #2  
Old 03-08-2019, 10:57 AM
Rekka
Fire Beetle
 
Join Date: Jan 2019
Location: North Carolina
Posts: 2
Default Possible help

This may not solve your problem, but it may help, especially if you use InnoDB as your storage engine.

I took a look at how the Save method in client.cpp worked and saw some issues in the way locking happens and transactions are used (or not used).

From what I can tell, between the lock in front of the DB connection and the lack of bulking up statements into a transaction, this could cause some serious issues for anyone not using a very good SSD and a fast network connection to their database of choice. You can easily get 'fsync' choked. The symptoms are low IO/CPU usage while everything starts lagging out, waiting for the locks to be released so that the fsync for each transaction can happen on the database.

I've created a pull request.
https://github.com/EQEmu/Server/pull/827

**WARNING** My testing has been minimal, so use at your own risk, but I am encouraged by the results (over 2x faster with zero load on the system).

**Note** You will also need to add two indexes on tables; they are in the pull request. This is important, because without them we are doing table scans during the save.
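
To illustrate why the indexes matter, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL. The table layout is simplified and the index name idx_pet_buffs_char_id is hypothetical; the actual index definitions are the ones in the pull request. It asks the query planner how it would find one character's pet buffs before and after the index exists.

```python
import sqlite3

def query_plan(conn, sql):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row is the human-readable detail string.
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the real table (assumed columns, not the full schema).
conn.execute(
    "CREATE TABLE character_pet_buffs "
    "(char_id INTEGER, pet INTEGER, slot INTEGER, spell_id INTEGER)"
)
conn.executemany(
    "INSERT INTO character_pet_buffs VALUES (?, 0, ?, 632)",
    [(char_id, slot) for char_id in range(1000) for slot in range(5)],
)

lookup = "SELECT * FROM character_pet_buffs WHERE char_id = 685273"
plan_without_index = query_plan(conn, lookup)

# Hypothetical index name; the real definitions are in the pull request.
conn.execute("CREATE INDEX idx_pet_buffs_char_id ON character_pet_buffs (char_id)")
plan_with_index = query_plan(conn, lookup)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

On SQLite the first plan reports a scan of the whole table and the second a search through the index. MySQL's EXPLAIN shows the same distinction in its own format.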

I hope this can help some of you. Right now it's a stopgap, and hopefully I can come up with a better solution in the near future. I would like feedback on whether it helps resolve the lag issues with pets out.

This is an example of bulking the statements of a simple save into one transaction.

START TRANSACTION;
REPLACE INTO `character_currency` (id, platinum, gold, silver, copper,platinum_bank, gold_bank, silver_bank, copper_bank,platinum_cursor, gold_cursor, silver_cursor, copper_cursor, radiant_crystals, career_radiant_crystals, ebon_crystals, career_ebon_crystals)VALUES (685273, 184, 153, 149, 101, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
REPLACE INTO `character_bind` (id, zone_id, instance_id, x, y, z, heading, slot) VALUES (685273, 189, 0, 18.000000, -147.000000, 20.000000, 64.000000, 0);
REPLACE INTO `character_bind` (id, zone_id, instance_id, x, y, z, heading, slot) VALUES (685273, 41, 0, -980.000000, 148.000000, -38.000000, 64.000000, 1);
REPLACE INTO `character_bind` (id, zone_id, instance_id, x, y, z, heading, slot) VALUES (685273, 41, 0, -980.000000, 148.000000, -38.000000, 64.000000, 2);
REPLACE INTO `character_bind` (id, zone_id, instance_id, x, y, z, heading, slot) VALUES (685273, 41, 0, -980.000000, 148.000000, -38.000000, 64.000000, 3);
REPLACE INTO `character_bind` (id, zone_id, instance_id, x, y, z, heading, slot) VALUES (685273, 41, 0, -980.000000, 148.000000, -38.000000, 64.000000, 4);
DELETE FROM `character_buffs` WHERE `character_id` = '685273';
DELETE FROM `character_pet_buffs` WHERE `char_id` = 685273;
DELETE FROM `character_pet_inventory` WHERE `char_id` = 685273;
INSERT INTO `character_pet_info` (`char_id`, `pet`, `petname`, `petpower`, `spell_id`, `hp`, `mana`, `size`) VALUES (685273, 0, 'Labann000', 0, 632, 3150, 0, 5.000000) ON DUPLICATE KEY UPDATE `petname` = 'Labann000', `petpower` = 0, `spell_id` = 632, `hp` = 3150, `mana` = 0, `size` = 5.000000;
DELETE FROM `character_tribute` WHERE `id` = 685273;
REPLACE INTO character_activities (charid, taskid, activityid, donecount, completed) VALUES (685273, 22, 1, 0, 0), (685273, 22, 2, 0, 0), (685273, 22, 3, 0, 0), (685273, 22, 4, 0, 0), (685273, 22, 5, 0, 0);
REPLACE INTO character_activities (charid, taskid, activityid, donecount, completed) VALUES (685273, 23, 0, 0, 0);
REPLACE INTO character_activities (charid, taskid, activityid, donecount, completed) VALUES (685273, 138, 0, 0, 0);
REPLACE INTO `character_data` ( id,account_id,`name`, last_name, gender, race, class, `level`, deity,birthday,last_login,time_played,pvp_status,level2, anon, gm, intoxication,hair_color,beard_color,eye_color_1,eye_color_2,hair_style,beard,ability_time_seconds,ability_number,ability_time_minutes,ability_time_hours, title,suffix, exp, points, mana, cur_hp, str, sta, cha, dex, `int`,agi, wis, face, y, x, z, heading, pvp2, pvp_type,autosplit_enabled, zone_change_count, drakkin_heritage, drakkin_tattoo,drakkin_details, toxicity,hunger_level,thirst_level,ability_up,zone_id, zone_instance,leadership_exp_on, ldon_points_guk, ldon_points_mir, ldon_points_mmc, ldon_points_ruj, ldon_points_tak, ldon_points_available,tribute_time_remaining, show_helm, career_tribute_points,tribute_points,tribute_active,endurance, group_leadership_exp,raid_leadership_exp, group_leadership_points, raid_leadership_points, air_remaining,pvp_kills, pvp_deaths,pvp_current_points, pvp_career_points, pvp_best_kill_streak,pvp_worst_death_streak, pvp_current_kill_streak, aa_points_spent, aa_exp, aa_points, group_auto_consent, raid_auto_consent, guild_auto_consent, RestTimer, e_aa_effects, e_percent_to_aa, e_expended_aa_spent, e_last_invsnapshot, mailkey ) VALUES (685273,90536,'Rekka','',0,6,13,50,396,1550636815, 1552018689,22507,0,70,0,1,0,17,255,4,4,2,255,0,0,0,0,'','',164708608,345,2299,1589,60,80,60,75,134,90,83,3,-1831.625000,-225.750000,3.127999,37.500000,0,0,0,0,0,0,0,0,4480,4480,0,22,0,0,0,0,0,0,0,0,4294967295,0,0,0,0,1291,0,0,0,0,60,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,'BF01A8C0586571A2');
COMMIT;
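
The same START TRANSACTION ... COMMIT pattern can be tried out in a few lines. This is a minimal sketch using Python's built-in sqlite3 module, not the server's C++ code or MySQL; the helper name save_batched and the cut-down table definitions are assumptions made for illustration. It wraps a list of statements in one transaction, so there is a single commit and the save is all-or-nothing.

```python
import sqlite3

def save_batched(conn, statements):
    """Run all statements inside one transaction: one commit, all-or-nothing."""
    conn.execute("BEGIN")
    try:
        for sql, params in statements:
            conn.execute(sql, params)
    except sqlite3.Error:
        conn.execute("ROLLBACK")  # undo every statement of this save
        raise
    conn.execute("COMMIT")

# isolation_level=None puts the sqlite3 module in autocommit mode, so the
# only transaction boundaries are the explicit BEGIN/COMMIT/ROLLBACK above.
conn = sqlite3.connect(":memory:", isolation_level=None)
# Cut-down stand-ins for the real tables (assumed columns only).
conn.execute(
    "CREATE TABLE character_currency (id INTEGER PRIMARY KEY, platinum INTEGER, gold INTEGER)"
)
conn.execute(
    "CREATE TABLE character_bind (id INTEGER, slot INTEGER, zone_id INTEGER, PRIMARY KEY (id, slot))"
)

statements = [
    ("REPLACE INTO character_currency (id, platinum, gold) VALUES (?, ?, ?)", (685273, 184, 153)),
    ("REPLACE INTO character_bind (id, slot, zone_id) VALUES (?, ?, ?)", (685273, 0, 189)),
    ("REPLACE INTO character_bind (id, slot, zone_id) VALUES (?, ?, ?)", (685273, 1, 41)),
]
save_batched(conn, statements)

bind_rows = conn.execute(
    "SELECT COUNT(*) FROM character_bind WHERE id = 685273"
).fetchone()[0]
print("bind rows saved:", bind_rows)
```

If any statement in the list fails, the rollback leaves the character's previously saved rows untouched, which a statement-by-statement save cannot guarantee.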

If this doesn't help, I'm sorry if I have muddied the waters a bit.
  #3  
Old 03-08-2019, 11:22 AM
Straps
Fire Beetle
 
Join Date: Jan 2019
Posts: 4
Default

Is there data that makes you think DB is acting as a bottleneck?

I don't want to undermine the helpful suggestions, but I did try a lot of DB performance tuning and switching storage engines. I also set up fairly robust DB monitoring to make sure it wasn't DB performance.

When I was having massive lag spikes, there was virtually no load on the DB, and no locks or waits to speak of. The DB was on the same box and on an SSD. I didn't save the reporting, but nothing in my case pointed to DB performance issues.

As others said, a more methodical approach is probably in order to make sure the situation doesn't get more complex.
  #4  
Old 03-08-2019, 12:48 PM
Rekka
Fire Beetle
 
Join Date: Jan 2019
Location: North Carolina
Posts: 2
Default

From what I can tell from just looking at the code, the locking isn't at the database layer per se; it's at the MySQL connection per zone, in zonedb.cpp. There is only one connection per zone, at least from what I see.

Fsync is a delay or latency issue on the DB when dealing with transactions. Every single query in the save is its own transaction (all 13+ of them). You can have a system that can do 500 transactions per second but 100,000 inserts per second if you bulk up statements.
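
The arithmetic behind that claim, as a rough sketch. The 500 commits per second and the 13 statements per save are the figures quoted in this post; treating every commit as costing the same and ignoring everything else the database does are simplifying assumptions.

```python
def saves_per_second(commits_per_second, commits_per_save):
    """Upper bound on character saves when every commit waits on an fsync."""
    return commits_per_second // commits_per_save

COMMITS_PER_SECOND = 500   # figure quoted in the post
STATEMENTS_PER_SAVE = 13   # "13+" statements per save, per the post

# One commit per statement versus one commit for the whole save.
unbatched = saves_per_second(COMMITS_PER_SECOND, STATEMENTS_PER_SAVE)
batched = saves_per_second(COMMITS_PER_SECOND, 1)
print(unbatched, batched)  # prints "38 500"
```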

A small latency can have a massive impact when locks are involved.

If there is latency at the DB, requests queue up in the zone, and how badly depends on how many players are in that zone. The fewer people in the zone, the less this affects them.

Lowering the latency by limiting the fsyncs per save can ease the pressure on the connection lock, which prevents character saves from stalling. Or that is at least the idea.

**Note** This lock blocks all queries in the zone, not just those made during saves.
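
A toy model of that blocking, assuming a single lock guards the zone's one connection as described above. The names connection_lock, save_character and unrelated_query are hypothetical and are not taken from the server source. An unrelated query that arrives while a save holds the lock cannot run until all 13 of the save's commits are done.

```python
import threading

connection_lock = threading.Lock()   # stands in for the zone's single DB connection lock
save_holds_lock = threading.Event()
log = []

def save_character(commits):
    with connection_lock:
        save_holds_lock.set()        # signal that the save now owns the connection
        for i in range(commits):
            log.append(f"save commit {i + 1}")

def unrelated_query():
    save_holds_lock.wait()           # start only once the save is in progress
    with connection_lock:            # waits here while the save still holds the lock
        log.append("unrelated query")

reader = threading.Thread(target=unrelated_query)
saver = threading.Thread(target=save_character, args=(13,))
reader.start()
saver.start()
saver.join()
reader.join()

print(log[-1])  # the unrelated query always lands after every save commit
```

With one commit per save instead of 13, the time the lock is held per save shrinks accordingly, and so does the wait for everything queued behind it.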

Also note that removing the table scans on the pet tables (by adding indexes) helps lower the latency of the call as well.

It can be easy to confuse work with latency/locks. You can have a slow system doing no work.

Honestly, I would like to do more work on this and I know it's a stopgap, but I figured doing 13x fewer transactions per save was a win, since someone in this thread noted that commenting out some of the saves improved their latency. (Mine improved 2-3x.)

I know it can be many issues and I may be barking up the wrong tree, but this is simply another option that does give a very clear improvement in performance around the zone locks.

Side note: a single MySQL connection for a process is generally a less than ideal situation. It is too much of a choke point for network IO. Locks should be held for nanoseconds or microseconds, not milliseconds. One possibility is separate connections for reads and writes, depending on how the threading is set up in the zone process. (Note that I have not really looked at the threading model of the zone yet, so this may be moot and may be my misunderstanding.)

Note: I'm on a phone, so sorry for the formatting and bad grammar. Very small window.
  #5  
Old 04-03-2019, 06:54 PM
eldarian
Fire Beetle
 
Join Date: May 2017
Posts: 25
Default

Thank you kindly for the very detailed update. I believe this was a side effect of the Thanos snap.
  #6  
Old 04-15-2019, 07:39 PM
eldarian
Fire Beetle
 
Join Date: May 2017
Posts: 25
Default

I for one am very pleased with the current build that was released to me (shared with the Varlydra server). Akkadius, KLS, and anyone else involved whom I may not know worked very hard; they kept their promise to find a fix, and they delivered. I tested this with 24 clients in one zone. Keep in mind your MS bar may say one thing, but in reality you can cast spells with normal refresh time. My server is a 2-box server, but I permitted 3 for the purpose of our testing. Once more, thank you for taking this problem seriously and investing time and resources to see it fixed.
  #7  
Old 04-15-2019, 07:54 PM
Akkadius
Administrator
 
Join Date: Feb 2009
Location: MN
Posts: 2,072
Default

Keep in mind this is not merged into mainline yet, but we currently have a general fix in a working branch. We have a handful of things we need to take care of before merging into mainline.

If you're interested in the build for your server, download it from the link below and report back.

https://www.dropbox.com/s/2s2mput1q4...aries.zip?dl=0

Also, keep in mind you will need to run this update manually: https://github.com/EQEmu/Server/blob...date_range.sql

Last edited by Akkadius; 04-16-2019 at 08:09 PM..
  #8  
Old 04-15-2019, 09:30 PM
Drakiyth
Dragon
 
Join Date: Apr 2012
Posts: 545
Default

Quote:
Originally Posted by Akkadius View Post
Keep in mind this is not merged into mainline yet, but we currently have a general fix in a working branch. We have a handful of things we need to take care of before merging into mainline.

If you're interested in the build for your server, download it from the link below and report back.

https://www.dropbox.com/s/2s2mput1q4...aries.zip?dl=0
Stellar work all around. I can't wait to fire this up on Varlyndria for everybody in tomorrow's update. Thank you very much for all you guys do to keep this place the best emulator project in the world.
  #9  
Old 05-24-2019, 04:04 PM
almightie
Fire Beetle
 
Join Date: Aug 2010
Posts: 1
Default

Hey guys just checking in to see if there is any update to this issue and if the fix will be pushed out.

Thank you
  #10  
Old 05-24-2019, 04:14 PM
Akkadius
Administrator
 
Join Date: Feb 2009
Location: MN
Posts: 2,072
Default

Quote:
Originally Posted by almightie View Post
Hey guys just checking in to see if there is any update to this issue and if the fix will be pushed out.

Thank you
Fix is mainline and on master
  #11  
Old 05-26-2019, 04:17 PM
Huppy
Demi-God
 
Join Date: Oct 2010
Posts: 1,332
Default

I meant to ask you, Akka: was that fix the "compression level" update that was committed? The only reason I ask is that one of my "toy boxes" is sticking to slightly older code, and I'm picking away at manually applying feasible updates when I can get away with it.
  #12  
Old 12-06-2019, 03:25 PM
peterigz
Fire Beetle
 
Join Date: Dec 2009
Posts: 4
Default

Hey Folks,

I posted this in Discord, but things can drift out of view there, so I'm posting here as well.

We've just completed a server update and merged in the latest changes from the eqemu master branch. Everything is great, with the exception that we now seem to have run into the dreaded Windows server lag bug as reported here: http://www.eqemulator.org/forums/sho...t=42311&page=4 According to that thread it was fixed, but it is still happening on our server for some reason. The symptoms are exactly the same: a resend cascade leading to big lag spikes (going by netstats).

We have Windows Server 2019, two Xeon 2.4 processors, 32 GB of RAM, and more bandwidth than you can shake a stick at. Any ideas about settings to tweak, or anything else, are highly welcome! Meanwhile I'll see if I can use the metrics in that thread to get more insight.
  #13  
Old 12-06-2019, 04:58 PM
Uleat
Developer
 
Join Date: Apr 2012
Location: North Carolina
Posts: 2,815
Default

I haven't been on discord yet..

..but, I would start with ensuring that you have the correct zlib dll.


If it's not from 2019, I wouldn't trust it to be correct.

Make sure that you're using the one acquired from the eqemu_server.pl download option.


The new vcpkg method seems to install a zlib dll from 2018 into the build directory that seems to cause this issue.

If you do a select all -> copy -> paste from build to server install, and this dll is present in build, it will overwrite your current server copy.


Any older dependency-related copy will do it as well.

The issue is (mostly) related to build flags, which force the compression to operate in single-thread mode.

The copy obtained through the eqemu_server download is known to be correctly flagged (as of my commit to that repo).
__________________
Uleat of Bertoxxulous

Compilin' Dirty
  #14  
Old 12-07-2019, 12:04 PM
peterigz
Fire Beetle
 
Join Date: Dec 2009
Posts: 4
Default

Thanks again Uleat.

So after a bit of experimenting here's some notes:

I've been compiling with the Build Zlib flag set in CMake, so I didn't actually need or use a zlib1.dll in the folder. I'm not sure what that means: shouldn't it compile zlib in multithreaded mode in that case, or is it not configured for that in CMake?

I can untick the build-with-zlib option, in which case I do need zlib1.dll in the folder to run the server. However, I've been compiling in 64-bit mode, so the x86 zlib1.dll from the installer doesn't work with it. I can compile an x86 version instead and use that zlib1.dll, which is what I'll try next; at least then I'll know for sure it is using the right zlib.
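
For anyone unsure whether a given zlib1.dll is an x86 or x64 build, here is a small sketch that reads the machine field from the DLL's PE header. The function names and the example path are hypothetical; the header layout (the MZ signature, the PE offset stored at 0x3C, and the machine codes 0x014C for x86 and 0x8664 for x64) is the standard Windows PE format. It assumes the PE header sits within the first 4096 bytes of the file, which is typical but not guaranteed.

```python
import struct

MACHINE_NAMES = {0x014C: "x86", 0x8664: "x64"}

def pe_machine(data):
    """Return 'x86', 'x64', or a hex code for the PE image in `data` (bytes)."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # The 4-byte little-endian offset of the PE header is stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The 2-byte machine field follows the 4-byte PE signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_NAMES.get(machine, hex(machine))

def dll_machine(path):
    """Read the start of the file at `path` and report its target machine."""
    with open(path, "rb") as handle:
        return pe_machine(handle.read(4096))

# Example call (hypothetical path, adjust to your server folder):
# print(dll_machine(r"C:\eqemu\zlib1.dll"))
```

The result has to match the build of the server binaries: an x64 zone.exe cannot load an x86 zlib1.dll, and vice versa.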
All times are GMT -4.

Everquest is a registered trademark of Daybreak Game Company LLC.
EQEmulator is not associated or affiliated in any way with Daybreak Game Company LLC.
Except where otherwise noted, this site is licensed under a Creative Commons License.