# Unreal Engine 4 Parallel Processing School of Fish

Published:03/17/2018   Last Updated:03/16/2018

Nikolay Lazarev

Integrated Computer Solutions, Inc.

## General Description of the Flocking Algorithm

The implemented flocking algorithm simulates the behavior of a school, or flock, of fish. The algorithm contains four basic behaviors:

• Cohesion: Fish search for their neighbors in a radius defined as the Radius of Cohesion. The current positions of all neighbors are summed. The result is divided by the number of neighbors. Thus, the center of mass of the neighbors is obtained. This is the point to which the fish strive for cohesion. To determine the direction of movement of the fish, the current position of the fish is subtracted from the result obtained earlier, and then the resulting vector is normalized.
• Separation: Fish search for their neighbors in a radius defined as the Separation Radius. To calculate the motion vector of an individual fish in a specific separation direction from a school, the difference in the positions of the neighbors and its own position is summed. The result is divided by the number of neighbors and then normalized and multiplied by -1 to change the initial direction of the fish to swim in the opposite direction of the neighbors.
• Alignment: Fish search for their neighbors in a radius defined as the Radius of Alignment. The current speeds of all neighbors are summed, then divided by the number of neighbors. The resulting vector is normalized.
• Reversal: All of the fish can only swim in a given space, the boundaries of which can be specified. The moment a fish crosses a boundary must be identified. If a fish hits a boundary, then the direction of the fish is changed to the opposite vector (thereby keeping the fish within the defined space).

These four basic principles of behavior for each fish in a school are combined to calculate the total position values, speed, and acceleration of each fish. In the proposed algorithm, the concept of weight coefficients was introduced to increase or decrease the influence of each of these three modes of behavior (cohesion, separation, and alignment). The weight coefficient was not applied to the behavior of reversal, because fish were not permitted to swim outside of the defined boundaries. For this reason, reversal had the highest priority. Also, the algorithm provided for maximum speed and acceleration.

According to the algorithm described above, the parameters of each fish were calculated (position, velocity, and acceleration). These parameters were calculated for each frame.

## Source Code of the Flocking Algorithm with Comments

To calculate the state of fish in a school, double buffering is used. Fish states are stored in an array of size N x 2, where N is the number of fish, and 2 is the number of copies of states.

The algorithm is implemented using two nested loops. In the internal nested loop, the direction vectors are calculated for the three types of behavior (cohesion, separation, and alignment). In the external nested loop, the final calculation of the new state of the fish is made based on calculations in the internal nested loop. These calculations are also based on the values ​​of the weight coefficients of each type of behavior and the maximum values ​​of speed and acceleration.

External loop: At each iteration of a cycle, a new value for the position of each fish is calculated. As arguments to the lambda function, references are passed to:

 agents Array of fish states currentStatesIndex Index of array where the current states of each fish are stored previousStatesIndex Index of array where the previous states of each fish are stored kCoh Weighting factor for cohesion behavior kSep Weighting factor for separation behavior kAlign Weighting factor for alignment behavior rCohesion Radius in which neighbors are sought for cohesion rSeparation Radius in which neighbors are sought for separation rAlignment Radius in which the neighbors are sought for alignment maxAccel Maximum acceleration of fish maxVel Maximum speed of fish mapSz Boundaries of the area in which fish are allowed to move DeltaTime Elapsed time since the last calculation isSingleThread Parameter that indicates in which mode the loop will run

ParllelFor can be used in either of two modes, depending on the state of the isSingleThread Boolean variable:

``````     ParallelFor(cnt, [&agents, currentStatesIndex, previousStatesIndex, kCoh, kSep, kAlign, rCohesion, rSeparation,
rAlignment, maxAccel, maxVel, mapSz, DeltaTime, isSingleThread](int32 fishNum) {``````

Initializing directions with a zero vector to calculate each of the three behaviors:

``     FVector cohesion(FVector::ZeroVector), separation(FVector::ZeroVector), alignment(FVector::ZeroVector);``

Initializing neighbor counters for each type of behavior:

``     int32 cohesionCnt = 0, separationCnt = 0, alignmentCnt = 0;``

Internal nested loop. Calculates the direction vectors for the three types of behavior:

``     for (int i = 0; i < cnt; i++) {``

Each fish should ignore (not calculate) itself:

``     if (i != fishNum) {``

Calculate the distance between the position of a current fish and the position of each other fish in the array:

``     float distance = FVector::Distance(agents[i][previousStatesIndex].position, agents[fishNum][previousStatesIndex].position);``

If the distance is less than the cohesion radius:

``     if (distance < rCohesion) {``

Then the neighbor position is added to the cohesion vector:

``     cohesion += agents[i][previousStatesIndex].position;``

The value of the neighbor counter is increased:

``````     cohesionCnt++;
}``````

If the distance is less than the separation radius:

``     if (distance < rSeparation) {``

The difference between the position of the neighbor and the position of the current fish is added to the separation vector:

``     separation += agents[i][previousStatesIndex].position - agents[fishNum][previousStatesIndex].position;``

The value of the neighbor counter is increased:

``````     separationCnt++;
}``````

If the distance is less than the radius of alignment:

``     if (distance < rAlignment) {``

Then the velocity of the neighbor is added to the alignment vector:

``     alignment += agents[i][previousStatesIndex].velocity;``

The value of the neighbor counter is increased:

``````     alignmentCnt++;
}
}``````

If neighbors were found for cohesion:

``     if (cohesionCnt != 0) {``

Then the cohesion vector is divided by the number of neighbors and its own position is subtracted:

``````     cohesion /= cohesionCnt;
cohesion -= agents[fishNum][previousStatesIndex].position;``````

The cohesion vector is normalized:

``````     cohesion.Normalize();
}``````

If neighbors were found for separation:

``     if (separationCnt != 0) {``

The separation vector is divided by the number of neighbors and multiplied by -1 to change the direction:

``````            separation /= separationCnt;
separation *= -1.f;``````

The separation vector is normalized:

``````              separation.Normalize();
}``````

If neighbors were found for alignment:

``     if (alignmentCnt != 0) {``

The alignment vector is divided by the number of neighbors:

``            alignment /= alignmentCnt;``

The alignment vector is normalized:

``````            alignment.Normalize();
}``````

Based on the weight coefficients of each of the possible types of behavior, a new acceleration vector is determined, limited by the value of the maximum acceleration:

``agents[fishNum][currentStatesIndex].acceleration = (cohesion * kCoh + separation * kSep + alignment * kAlign).GetClampedToMaxSize(maxAccel);``

To limit the acceleration vector along the Z-axis:

``   agents[fishNum][currentStatesIndex].acceleration.Z = 0;``

Add to the previous position of the fish the result of the multiplication of the new velocity vector and the time elapsed since the last calculation:

``     agents[fishNum][currentStatesIndex].velocity += agents[fishNum][currentStatesIndex].acceleration * DeltaTime;``

The velocity vector is limited to the maximum value:

``````     agents[fishNum][currentStatesIndex].velocity =
agents[fishNum][currentStatesIndex].velocity.GetClampedToMaxSize(maxVel);``````

To the previous position of a fish, the multiplication of the new velocity vector and the time elapsed since the last calculation is added:

``     agents[fishNum][currentStatesIndex].position += agents[fishNum][currentStatesIndex].velocity * DeltaTime;``

The current fish is checked to be within the specified boundaries. If yes, the calculated speed and position values are saved. If the fish has moved beyond the boundaries of the region along one of the axes, then the value of the velocity vector along this axis is multiplied by -1 to change the direction of motion:

``````     agents[fishNum][currentStatesIndex].velocity = checkMapRange(mapSz,
agents[fishNum][currentStatesIndex].position, agents[fishNum][currentStatesIndex].velocity);

For each fish, collisions with world-static objects, like underwater rocks, should be detected, before new states are applied:

``     for (int i = 0; i < cnt; i++) {``

То detect collisions between fish and world-statiс objects:

``````            FHitResult hit(ForceInit);
if (collisionDetected(agents[i][previousStatesIndex].position, agents[i][currentStatesIndex].position, hit)) {``````

If a collision is detected, then the previously calculated position should be undone. The velocity vector should be changed to the opposite direction and the position recalculated:

``````                   agents[i][currentStatesIndex].position -= agents[i]  [currentStatesIndex].velocity * DeltaTime;
agents[i][currentStatesIndex].velocity *= -1.0;
agents[i][currentStatesIndex].position += agents[i][currentStatesIndex].velocity * DeltaTime;
}
}``````

Having calculated the new states of all fish, these updated states will be applied, and all fish will be moved to a new position:

``````     for (int i = 0; i < cnt; i++) {
FTransform transform;
m_instancedStaticMeshComponent->GetInstanceTransform(agents[i]->instanceId, transform);``````

Set up a new position of the fish instance:

``     transform.SetLocation(agents[i]->position);``

Turn the fish head forward in the direction of movement:

``````     FVector direction = agents[i].velocity;
direction.Normalize();

Update instance transform:

``````            m_instancedStaticMeshComponent->UpdateInstanceTransform(agents[i].instanceId, transform, false, false);
}``````

Redraw all the fish:

``````     m_instancedStaticMeshComponent->ReleasePerInstanceRenderData();

m_instancedStaticMeshComponent->MarkRenderStateDirty();``````

Swap indexed fish states:

``      swapFishStatesIndexes();``

## Complexity of the Algorithm: How Increasing the Number of Fish Affects Productivity

Suppose that the number of fish participating in the algorithm is N. To determine the new state of each fish, the distance to all the other fish must be calculated (not counting additional operations for determining the direction vectors for the three types of behavior). The initial complexity of the algorithm will be O(N2). For example, 1,000 fish will require 1,000,000 operations. Figure 1: Computational operations for calculating the positions of all fish in a scene.

Structure describing the state of each fish:

``````     struct TInfo{
int instanceId;
float3 position;
float3 velocity;
float3 acceleration;
};``````

Function for calculating the distance between two vectors:

``````     float getDistance(float3 v1, float3 v2) {
return sqrt((v2-v1)*(v2-v1) + (v2-v1)*(v2-v1) + (v2-v1)*(v2-v1));
}

RWStructuredBuffer<TInfo> data;

{``````

Total number of fish:

``     int fishCount = constants.fishCount;``

This variable, created and initialized in C++, determines the number of fish calculated in each graphics processing unit (GPU) thread (by default:1):

``     int calculationsPerThread = constants.calculationsPerThread;``

Loop for calculating fish states that must be computed in this thread:

``     for (int iteration = 0; iteration < calculationsPerThread; iteration++) {``

Thread identifier. Corresponds to the fish index in the state array:

``     int currentThreadId = calculationsPerThread * ThreadId.y + iteration;``

The current index is checked to ensure it does not exceed the total number of fish (this is possible, since more threads can be started than there are fish):

``````     if (currentThreadId >= fishCount)
return;``````

To calculate the state of fish, a single double-length array is used. The first N elements of this array are the new states of fish to be calculated; the second N elements are the older states of fish that were previously calculated.

Current fish index:

``    int currentId = fishCount + currentThreadId;``

Copy of the structure of the current state of fish:

``     TInfo currentState = data[currentThreadId + fishCount];``

Copy of the structure of the new state of fish:

``     TInfo newState = data[currentThreadId];``

Initialize direction vectors for the three types of behavior:

``````     float3 steerCohesion = {0.0f, 0.0f, 0.0f};
float3 steerSeparation = {0.0f, 0.0f, 0.0f};
float3 steerAlignment = {0.0f, 0.0f, 0.0f};``````

Initialize neighbors counters for each type of behavior:

``````     float steerCohesionCnt = 0.0f;
float steerSeparationCnt = 0.0f;
float steerAlignmentCnt = 0.0f;``````

Based on the current state of each fish, direction vectors are calculated for each of the three types of behaviors. The cycle begins with the middle of the input array, which is where the older states are stored:

``     for (int i = fishCount; i < 2 * fishCount; i++) {``

Each fish should ignore (not calculate) itself:

``     if (i != currentId) {``

Calculate the distance between the position of current fish and the position of each other fish in the array:

``     float d = getDistance(data[i].position, currentState.position);``

If the distance is less than the cohesion radius:

``     if (d < constants.radiusCohesion) {``

Then the neighbor’s position is added to the cohesion vector:

``     steerCohesion += data[i].position;``

And the counter of neighbors for cohesion is increased:

``````            steerCohesionCnt++;
}``````

If the distance is less than the separation radius:

``     if (d < constants.radiusSeparation) {``

Then the separation vector is added to the difference between the position of the neighbor and the position of the current fish:

``     steerSeparation += data[i].position - currentState.position;``

The counter of the number of neighbors for separation increases:

``````            steerSeparationCnt++;
}``````

If the distance is less than the alignment radius:

``     if (d < constants.radiusAlignment) {``

Then the velocity of the neighbor is added to the alignment vector:

``     steerAlignment += data[i].velocity;``

The counter of the number of neighbors for alignment increases:

``````                          steerAlignmentCnt++;
}
}
}``````

If neighbors were found for cohesion:

``     if (steerCohesionCnt != 0) {``

The cohesion vector is divided by the number of neighbors and its own position is subtracted:

``     steerCohesion = (steerCohesion / steerCohesionCnt - currentState.position);``

The cohesion vector is normalized:

``````            steerCohesion = normalize(steerCohesion);
}``````

If neighbors were found for separation:

``     if (steerSeparationCnt != 0) {``

Then the separation vector is divided by the number of neighbors and multiplied by -1 to change the direction:

``     steerSeparation = -1.f * (steerSeparation / steerSeparationCnt);``

The separation vector is normalized:

``````            steerSeparation = normalize(steerSeparation);
}``````

If neighbors were found for alignment:

``     if (steerAlignmentCnt != 0) {``

Then the alignment vector is divided by the number of neighbors:

``     steerAlignment /= steerAlignmentCnt;``

The alignment vector is normalized:

``````           steerAlignment = normalize(steerAlignment);
}``````

Based on the weight coefficients of each of the three possible types of behaviors, a new acceleration vector is determined, limited by the value of the maximum acceleration:

``````     newState.acceleration = (steerCohesion * constants.kCohesion + steerSeparation * constants.kSeparation
+ steerAlignment * constants.kAlignment);
newState.acceleration = clamp(newState.acceleration, -1.0f * constants.maxAcceleration,
constants.maxAcceleration);``````

To limit the acceleration vector along the Z-axis:

``     newState.acceleration = 0.0f;``

To the previous velocity vector, the product of the new acceleration vector and the time elapsed since the last calculation is added. The velocity vector is limited to the maximum value:

``````     newState.velocity += newState.acceleration * variables.DeltaTime;
newState.velocity = clamp(newState.velocity, -1.0f * constants.maxVelocity, constants.maxVelocity);``````

Add to the previous position of the fish the result of the multiplication of the new velocity vector and the time elapsed since the last calculation:

``     newState.position += newState.velocity * variables.DeltaTime;``

The current fish is checked to be within the specified boundaries. If yes, the calculated speed and position values are saved. If the fish has moved beyond the boundaries of the region along one of the axes, then the value of the velocity vector along this axis is multiplied by -1 to change the direction of motion:

``````                   float3 newVelocity = newState.velocity;
if (newState.position > constants.mapRangeX || newState.position < -constants.mapRangeX) {
newVelocity *= -1.f;
}

if (newState.position > constants.mapRangeY || newState.position < -constants.mapRangeY) {
newVelocity *= -1.f;
}
if (newState.position > constants.mapRangeZ || newState.position < -3000.f) {
newVelocity *= -1.f;
}
newState.velocity = newVelocity;

}
}         ``````

Table 1: Comparison of algorithms.

Fish

Algorithm (FPS)

Computing Operations

CPU SINGLE

CPU MULTI

GPU MULTI

100

62

62

62

10000

500

62

62

62

250000

1000

62

62

62

1000000

1500

49

61

62

2250000

2000

28

55

62

4000000

2500

18

42

62

6250000

3000

14

30

62

9000000

3500

10

23

56

12250000

4000

8

20

53

16000000

4500

6

17

50

20250000

5000

5

14

47

25000000

5500

4

12

35

30250000

6000

3

10

31

36000000

6500

2

8

30

42250000

7000

2

7

29

49000000

7500

1

7

27

56250000

8000

1

6

24

64000000

8500

0

5

21

72250000

9000

0

5

20

81000000

9500

0

4

19

90250000

10000

0

3

18

100000000

10500

0

3

17

110250000

11000

0

2

15

121000000

11500

0

2

15

132250000

12000

0

1

14

144000000

13000

0

0

12

169000000

14000

0

0

11

196000000

15000

0

0

10

225000000

16000

0

0

9

256000000

17000

0

0

8

289000000

18000

0

0

3

324000000

19000

0

0

2

361000000

20000

0

0

1

400000000 Figure 2: Comparison of algorithms.

Laptop Hardware:
CPU – Intel® Core i7-3632QM processor 2.2 GHz with turbo boost up to 3.2 GHz
GPU - NVIDIA GeForce* GT 730M
RAM - 8 GB DDR3*

#### Product and Performance Information

1

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804