Clustering, load balancing and asynchronous protocol

This forum deals with administrative topics such as monitoring and system setup. It is dedicated to system administrators who do not have to deal with the functions themselves but who have to establish and maintain proper environments.
Public since 15.2.2017


Post by biro.daniel » Fri Feb 02, 2018 11:34 am

Dear Bernd,

We have some questions regarding clustering and the asynchronous protocol. They concern the exact procedure of load balancing.

If our understanding is correct, when scaling out, the jobs end up in one common place (the database) regardless of the load balancing, which, according to the documentation, results in a more evenly balanced load.

“In order to use job requests, the PTV xServer has to persist responses until they are fetched. This is done with the help of a JDBC database and works out of the box for a single PTV xServer. For a cluster of PTV xServer you need a central database. If you need a highly available solution you have to set up replication to your backup systems as well.”[1]


However, it is not clear whether this means that an incoming job goes to the database and one of the servers with free capacity eventually fetches it for processing, or whether the very server that received it will process it. What is the exact role of the queue in this process?

“The request queue is a temporary store for incoming requests that cannot be processed immediately. This queueing mechanism buffers allows to process temporary spikes that would otherwise overload the system. Technically, the queue is implemented as a highly efficient LIFO data structure.”[2]


A side note: is it really a LIFO (stack) rather than a FIFO (queue) data structure?

For an asynchronous request, does this mean that the request is written to the database when it is removed from the queue, rather than being processed? Does asynchronous operation differ from synchronous operation in this regard?

The following sentence about the load balancing strategies suggests the opposite:
“PTV xServer cannot distribute transactions between themselves, they need an external component to do it for them, a web proxy acting as load balancer.”[3]


What we would like to know is whether it is possible to retrieve more information about the total capacity of the cluster, and whether the servers take over each other's calculations and, if so, how.

As we understand it, if multiple xServer instances (xRoute) are running behind a load balancer, an instance replies with an overload error if it cannot process more requests. This makes it hard to implement a backpressure algorithm to throttle the workload: response times are not usable for throttling because they vary greatly with request complexity and cannot be estimated efficiently. As a result, one server may get its queue filled while others are almost idle if the workload and the load balancing correlate unfavourably. Does assigning the instances to a common database solve this, or is there any means to have a common task queue for all instances and to receive an overload signal only when the common queue is full? Alternatively, could you suggest easily available information about cluster node capacity that we could use to assign weights in the front-end load balancer?

[1] http://xserver.ptvgroup.com/fileadmin/f ... s%7C_____9

[2] https://xtour-eu-n-test.cloud.ptvgroup. ... ecture.htm

[3] https://xtour-eu-n-test.cloud.ptvgroup. ... Tuning.htm

Re: Clustering, load balancing and asynchronous protocol

Post by Joost » Fri Feb 02, 2018 12:09 pm

Moved topic to administration forum since the questions are not directly related to clustering.

I asked our dev team to take a look at your questions (XSERS-905).
Joost Claessen
Senior Technical Consultant
PTV Benelux

Re: Clustering, load balancing and asynchronous protocol

Post by Bernd Welter » Tue Feb 06, 2018 12:35 pm

Hi there,

here are some statements that should help you understand the general picture behind the built-in asynchronous protocol:

  • Each server has its own queue. When a server receives a start-transaction, it checks whether the queue has free space. If not, the server is overloaded and the client gets an exception. There is no internal forwarding to another server.
  • If a server accepts a start-transaction, it processes it and writes updated status values to the database every once in a while.
  • When the executing server has finished, it writes the result plan to the database.
  • If a server that is configured to use the same shared database gets a "watch" or "fetch" call, it checks the database for the status and returns the result (if available). In other words: you can get the status, progress KPI and result from any server in the cluster.
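
The four statements above can be pictured with a small toy model. This is only a sketch in Python, not xServer code: the class names (`SharedDatabase`, `Node`, `Overloaded`), the status strings and the moment at which the job record is first written are all assumptions made for illustration, and the "calculation" is a placeholder.

```python
from collections import deque


class Overloaded(Exception):
    """Hypothetical error for a node whose local queue is full."""


class SharedDatabase:
    """Stand-in for the central database: job id -> status and result."""

    def __init__(self):
        self.jobs = {}
        self._next_id = 0

    def new_job(self):
        self._next_id += 1
        job_id = f"job-{self._next_id}"
        # Assumption: the record is created as soon as the job is accepted.
        self.jobs[job_id] = {"status": "queued", "result": None}
        return job_id


class Node:
    """One server of the cluster with its own, non-shared queue."""

    def __init__(self, name, db, queue_size):
        self.name = name
        self.db = db
        self.queue_size = queue_size
        self.queue = deque()

    def start(self, payload):
        # A full local queue means an error for the client;
        # the request is not forwarded to another node.
        if len(self.queue) >= self.queue_size:
            raise Overloaded(f"{self.name} is overloaded")
        job_id = self.db.new_job()
        self.queue.append((job_id, payload))
        return job_id

    def work(self):
        # Process the oldest queued job and persist status and result centrally.
        if not self.queue:
            return None
        job_id, payload = self.queue.popleft()
        record = self.db.jobs[job_id]
        record["status"] = "running"
        record["result"] = payload.upper()  # placeholder for the real calculation
        record["status"] = "done"
        return job_id

    def watch(self, job_id):
        # Any node can answer, because it only reads the shared database.
        return self.db.jobs[job_id]["status"]

    def fetch(self, job_id):
        return self.db.jobs[job_id]["result"]


db = SharedDatabase()
node_a = Node("A", db, queue_size=1)
node_b = Node("B", db, queue_size=1)

job = node_a.start("plan tour")
try:
    node_a.start("second request")
    rejected = False
except Overloaded:
    rejected = True  # B is idle, but A does not hand the request over

status_before = node_b.watch(job)  # asked on B although A holds the job
node_a.work()
status_after = node_b.watch(job)
result = node_b.fetch(job)
```

In this model node A rejects the second request while node B is idle, which is exactly the imbalance a load balancer in front of the cluster has to avoid; the shared database only makes status and result visible on every node.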
This description indeed covers the main facts of how jobs are handled by the xServer framework. There is no optimization of workload distribution between servers; this has to be done by a separate load balancer. How to optimize load balancing is a complex issue. The Technical Concepts include recommendations and guidelines on this.
And of course the queue is a FIFO data structure. The documentation is wrong at this point.
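
To make the FIFO/LIFO difference concrete, here is a generic Python illustration of the two orderings. It says nothing about how the xServer queue is implemented internally; the request names are made up.

```python
from collections import deque

arrivals = ["req-1", "req-2", "req-3"]

# FIFO (queue): the request that arrived first is processed first.
fifo = deque(arrivals)
fifo_order = [fifo.popleft() for _ in range(len(arrivals))]

# LIFO (stack): the request that arrived last is processed first.
lifo = list(arrivals)
lifo_order = [lifo.pop() for _ in range(len(arrivals))]

print(fifo_order)  # ['req-1', 'req-2', 'req-3']
print(lifo_order)  # ['req-3', 'req-2', 'req-1']
```

With LIFO ordering, the oldest waiting request could be delayed for as long as new requests keep arriving, which is why FIFO is the expected behaviour for a request queue.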

Thanks to Ralf for the hints.

Best regards,
Bernd
Bernd Welter
Manager Technical Consulting & Requirement Engineering
Senior Technical Consultant Developer Components
PTV GROUP - Germany

https://www.youtube.com/channel/UCgkUli9yGf0gwTDdxbMZ-Kg

