Comparison of bulk operations

Post by Bernd Welter »

Hi there,

In a recent project I've been asked about the impact of "not having a bulk operation" for a specific task.
You don't seem to give much importance to the issue of not having batch geocoding in PTV Developer, even though this is a requirement. Is it simply not possible? Does this rule out Developer?
No, it does not. (And by the way, we will add bulk geocoding to PTV Developer - but once you have read the statements below, you will probably understand why other tasks seem to be more important.)

Let me start by looking at this from a "logical" perspective on a meta level - not a technical one…

A user provides N pieces of information (e.g. addresses) and requires a function to be applied N times (e.g. geocoding each address). Doing this via a single batch call does not generate more information, it simply gathers it in a different way. Especially when it comes to geocoding large volumes, there are different approaches to gathering the final information you need:
  • Approach 1 - Reference approach: send the N elemental geocoding requests in a single-threaded sequence. This requires 100% of the time until the complete information is available on the client side.
  • Approach 2 - Apply one BULK operation (if possible): This might reduce the calculation time from 100% to 9x%. The bigger N is, the more can be saved, but I wouldn't expect "wonders", because the calculation inside the service is still a sequential one. Only the network traffic is reduced compared to the reference approach.
  • Approach 3 - Send the elemental requests in parallel (without exceeding a certain "degree of parallelism"): Here's huge potential, because the reference time can simply be divided by the degree (a small code sketch follows below this list):
    • degree == 2: reduces the client's waiting time for 100% of the info being available to roughly 50%
    • degree == 3: reduces the client's waiting time for 100% of the info being available to roughly 33%
    • degree == N: reduces the client's waiting time to roughly 100% / N ;-)
    Sidenote for PTV Developer: check the rate limits to ensure that you do not exceed any server-side limitations.
  • Approach 4 - The biggest potential lies in applying both strategies at the same time:
    • Cut the workload into chunks and send them through parallel bulk operations
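
Here is a minimal Java sketch of Approach 3, just to illustrate the pattern. The endpoint URL, the searchText parameter and the apiKey header are assumptions on my side - please check the PTV Developer API reference for the exact request format.

[code]
// Minimal sketch of Approach 3 (bounded parallel elemental geocoding).
// ASSUMPTIONS: the endpoint URL, the "searchText" parameter and the "apiKey"
// header are assumptions - verify them against the official API reference.
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelGeocoding {

    private static final HttpClient CLIENT = HttpClient.newHttpClient();
    private static final String API_KEY = "YOUR_API_KEY";   // assumption: key-based auth
    private static final int DEGREE_OF_PARALLELISM = 4;     // keep an eye on the rate limits!

    // One elemental geocoding request (the building block of Approach 1).
    static String geocode(String address) throws Exception {
        String url = "https://api.myptv.com/geocoding/v1/locations/by-text?searchText="
                + URLEncoder.encode(address, StandardCharsets.UTF_8);
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("apiKey", API_KEY)
                .GET()
                .build();
        HttpResponse<String> response =
                CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();  // real code would parse the JSON and handle errors / HTTP 429
    }

    public static void main(String[] args) throws Exception {
        List<String> addresses = List.of(
                "Haid-und-Neu-Str. 15, 76131 Karlsruhe",
                "Stumpfstr. 1, 76131 Karlsruhe");

        // The pool size is the "degree of parallelism" mentioned above.
        ExecutorService pool = Executors.newFixedThreadPool(DEGREE_OF_PARALLELISM);
        List<Future<String>> results = new ArrayList<>();
        for (String address : addresses) {
            results.add(pool.submit(() -> geocode(address)));
        }

        // Collect the results; total waiting time is roughly 100% / DEGREE_OF_PARALLELISM.
        for (Future<String> result : results) {
            System.out.println(result.get());
        }
        pool.shutdown();
    }
}
[/code]

For Approach 4 you would replace the elemental geocode() call inside each task with one bulk request per chunk of addresses - the thread pool pattern stays the same.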
:!: Important
  • In all these approaches the transaction volume relevant for billing is equal.
  • This holds as long as the bulk/batch operation is a genuine 1:1 replacement of the elemental calls.
  • Sidenote: this is NOT the same if you compare a distance matrix [N:M] with NxM elemental routings! In this case the temporary data structures of the two approaches are NOT equal, which can lead to gaps in the output information!
Native approach | Shortcut approach | Quality | Performance gain | Supported by API
N times single geocoding | single-threaded bulk geocoding | 100% comparable | small | xLocate 1, xLocate 2
N times single geocoding | multi-threaded single geocodings | 100% comparable | huge | xLocate 1, xLocate 2
N times single geocoding | multi-threaded bulk geocodings | 100% comparable | hugest ;-) | xLocate 1, xLocate 2
N times route info through 2 or more waypoints | xroute1.calculateBulkRouteInfo | 100% comparable | small | xRoute 1
N times route info through 2 waypoints sharing start, destination or both | xroute1.calculateMatrixInfo / xDima2.calculateDistanceMatrix | not comparable | huge | xRoute 1, xDima 2
Bernd Welter
Technical Partner Manager Developer Components
PTV Logistics - Germany

Bernd at... The Forum, LinkedIn, YouTube, StackOverflow
I like the smell of PTV Developer in the morning... :twisted: