The ROme OpTimistic Simulator
2.0.0
A General-Purpose Multithreaded Parallel/Distributed Simulation Platform
Distributed GVT Support module.
#include <communication/communication.h>
#include <communication/mpi.h>
#include <communication/gvt.h>
Functions | |
void | gvt_comm_init (void) |
Initialize the MPI-based distributed GVT reduction submodule. | |
void | gvt_comm_finalize (void) |
Shut down the MPI-based distributed GVT reduction submodule. | |
void | enter_red_phase (void) |
Make a thread enter into red phase. | |
void | exit_red_phase (void) |
Make a thread exit from red phase. | |
void | join_white_msg_redux (void) |
Join the white message reduction collective operation. | |
bool | white_msg_redux_completed (void) |
Test completion of white message reduction collective operation. | |
void | wait_white_msg_redux (void) |
Wait for the completion of white message reduction. | |
bool | all_white_msg_received (void) |
Check if white messages are all received. | |
void | flush_white_msg_recv (void) |
Reset received white messages. | |
void | flush_white_msg_sent (void) |
Reset sent white messages. | |
void | broadcast_gvt_init (unsigned int round) |
Initiate a distributed GVT. | |
bool | gvt_init_pending (void) |
Check if there are pending GVT-init messages around. | |
void | gvt_init_clear (void) |
Forcibly extract GVT-init messages from MPI. | |
void | join_gvt_redux (simtime_t local_vt) |
Reduce the GVT value. | |
bool | gvt_redux_completed (void) |
Check if final GVT reduction is complete. | |
simtime_t | last_reduced_gvt (void) |
Return the last GVT value. | |
void | register_outgoing_msg (const msg_t *msg) |
Register an outgoing message, if necessary. | |
void | register_incoming_msg (const msg_t *msg) |
Register an incoming message, if necessary. | |
Variables | |
phase_colour * | threads_phase_colour |
simtime_t * | min_outgoing_red_msg |
Minimum time among all the outgoing red messages for each thread. | |
volatile atomic_t * | white_msg_recv |
static atomic_t | white_0_msg_recv |
static atomic_t | white_1_msg_recv |
static atomic_t * | white_msg_sent |
static int * | white_msg_sent_buff |
Temporary structure used for the MPI collective primitives. | |
static int | expected_white_msg |
static MPI_Request | white_count_req |
static MPI_Comm | white_count_comm |
static spinlock_t | white_count_lock |
static MPI_Request | gvt_reduction_req |
static MPI_Comm | gvt_reduction_comm |
static spinlock_t | gvt_reduction_lock |
static simtime_t | local_vt_buff |
static simtime_t | reduced_gvt |
static MPI_Request * | gvt_init_reqs |
static unsigned int | gvt_init_round |
Distributed GVT Support module.
This module provides the communication primitives used by the asynchronous distributed GVT reduction algorithm described in:
T. Tocci, A. Pellegrini, F. Quaglia, J. Casanovas-García, and T. Suzumura,
“ORCHESTRA: An Asynchronous Wait-Free Distributed GVT Algorithm,”
in Proceedings of the 21st IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications, 2017.
For a full understanding of the code, we encourage reading that paper.
This file is part of ROOT-Sim (ROme OpTimistic Simulator).
ROOT-Sim is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; only version 3 of the License applies.
ROOT-Sim is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with ROOT-Sim; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
Definition in file gvt.c.
bool all_white_msg_received | ( | void | ) |
Check if white messages are all received.
Check if the number of white messages received is equal to the expected one (retrieved through the white message reduction).
Returns true if the expected number of white messages has already been received, false otherwise.
Definition at line 364 of file gvt.c.
void broadcast_gvt_init | ( | unsigned int | round | ) |
Initiate a distributed GVT.
A call to this function sends an asynchronous request to all kernel instances to initiate the distributed protocol for GVT reduction.
It must be called by a single thread, and a consistent round value should be passed as an argument.
round | A counter identifying the GVT round to initiate. round must be strictly monotonic. |
Definition at line 447 of file gvt.c.
void enter_red_phase | ( | void | ) |
Make a thread enter into red phase.
Once this function returns, the calling thread will be turned red. The minimum timestamp of red messages sent by it will be reset.
Definition at line 259 of file gvt.c.
void exit_red_phase | ( | void | ) |
Make a thread exit from red phase.
Once this function returns, the calling thread will be turned white.
Definition at line 279 of file gvt.c.
void flush_white_msg_recv | ( | void | ) |
Reset received white messages.
This function is used to prepare for the next white phase of the execution. In particular, once the expected number of white messages has been received, a call to this function switches the white phase counter pointed to by white_msg_recv to either white_0_msg_recv or white_1_msg_recv, depending on the phase being left.
Furthermore, the white message counter is reset to zero, so that it can be used in the next GVT round.
Definition at line 386 of file gvt.c.
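The counter switch described above can be illustrated with a minimal sketch based on C11 atomics. The names white_0, white_1, white_recv, count_white_msg() and flush_white_recv_sketch() are hypothetical stand-ins, and the order of operations (reset the counter being left, then switch the pointer) is an assumption rather than a transcription of gvt.c.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for white_0_msg_recv and white_1_msg_recv. */
static atomic_int white_0 = 0;
static atomic_int white_1 = 0;

/* Points to the counter of the white phase currently being tracked. */
static atomic_int *white_recv = &white_0;

/* Count one received white message for the current white phase. */
static void count_white_msg(void)
{
	atomic_fetch_add(white_recv, 1);
}

/* Leave the current white phase: reset its counter, then switch to the other one. */
static void flush_white_recv_sketch(int expected)
{
	/* Assumed precondition: every expected white message has arrived. */
	assert(atomic_load(white_recv) == expected);

	/* The counter being left is zeroed, so it is ready for a later round. */
	atomic_store(white_recv, 0);

	/* Alternate between the two incarnations of the white phase. */
	white_recv = (white_recv == &white_0) ? &white_1 : &white_0;
}
```

Two counters are enough in this sketch because a counter is only reused after the other incarnation has been flushed in between.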
void flush_white_msg_sent | ( | void | ) |
Reset sent white messages.
This function is used to prepare for the next white phase of the execution. In particular, once all the threads have entered the red phase (which ensures that the white message counters can be safely reset for all threads), the counters tracking the number of white messages sent to other simulation kernel instances are reset.
Definition at line 417 of file gvt.c.
void gvt_comm_finalize | ( | void | ) |
Shut down the MPI-based distributed GVT reduction submodule.
This function is called just before shutting down the MPI subsystem. It frees all the data structures used to carry out the distributed GVT reduction protocol, and releases the MPI resources that are no longer needed. Right after this, MPI is shut down as well.
Definition at line 221 of file gvt.c.
void gvt_comm_init | ( | void | ) |
Initialize the MPI-based distributed GVT reduction submodule.
This function is called after the MPI subsystem has been activated. Its goal is to initialize all the variable-sized data structures and MPI-related data structures used while running the distributed GVT reduction protocol.
Definition at line 174 of file gvt.c.
void gvt_init_clear | ( | void | ) |
Forcibly extract GVT-init messages from MPI.
This function synchronously extracts messages related to GVT initiation from the MPI library. It should be called as a fallback strategy if gvt_init_pending() determined that a clean shutdown of the communication subsystem is not possible at the present time.
Definition at line 491 of file gvt.c.
bool gvt_init_pending | ( | void | ) |
Check if there are pending GVT-init messages around.
This function tells whether some kernel is still waiting for a message related to GVT initiation to be delivered. It is used when shutting down the communication subsystem, to avoid a deadlock in the main loop of some simulation kernel.
Returns true if there is some GVT-related message still pending, false otherwise.
Definition at line 477 of file gvt.c.
bool gvt_redux_completed | ( | void | ) |
Check if final GVT reduction is complete.
A call to this function, issued after a call to join_gvt_redux(), tells whether the final phase of the GVT reduction is complete, so that the simulation kernel instance can safely access reduced_gvt to pick the newly-reduced value of the GVT.
This function is thread safe, so it can be called by multiple threads at once. A spinlock guard ensures that only one thread at a time performs this operation.
Returns true if the GVT reduction operation associated with the last round is completed, false otherwise.
Definition at line 539 of file gvt.c.
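The lock-guarded completion test can be sketched as follows. This is only an illustration under stated assumptions: redux_lock stands in for gvt_reduction_lock, poll_request() is a hypothetical replacement for testing the pending non-blocking MPI request, and caching the outcome in redux_done is a choice of this sketch, not something the documentation states.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for gvt_reduction_lock. */
static atomic_flag redux_lock = ATOMIC_FLAG_INIT;

/* Fake pending request: it reports completion at the third poll. */
static int polls_left = 2;
static bool redux_done = false;

/* Hypothetical stand-in for testing the non-blocking request. */
static bool poll_request(void)
{
	if (polls_left > 0) {
		polls_left--;
		return false;
	}
	return true;
}

/* Thread-safe completion check: one thread at a time polls the request. */
static bool redux_completed_sketch(void)
{
	bool done;

	/* Spin until the guard is acquired. */
	while (atomic_flag_test_and_set_explicit(&redux_lock, memory_order_acquire))
		;

	/* Poll only while completion has not been observed yet. */
	if (!redux_done)
		redux_done = poll_request();
	done = redux_done;

	atomic_flag_clear_explicit(&redux_lock, memory_order_release);
	return done;
}
```

A worker thread would typically call such a check from its main loop and keep processing events until it returns true.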
void join_gvt_redux | ( | simtime_t | local_vt | ) |
Reduce the GVT value.
A call to this function can be issued only after all threads on all simulation kernels have agreed upon a local candidate for the GVT value, and this value has been posted by some thread in local_vt_buff.
The goal of this function is to use a non-blocking All Reduce primitive to compute the minimum among the values posted by all kernel instances in local_vt_buff.
Eventually, the minimum is stored in reduced_gvt on every kernel instance.
Definition at line 515 of file gvt.c.
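What the reduction computes can be shown without MPI by emulating it over an array holding one proposal per kernel instance. The function allreduce_min_sketch() and the type simtime_sketch_t are hypothetical, and representing simulation time as a double is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: simulation time is represented as a floating-point value. */
typedef double simtime_sketch_t;

/*
 * Emulate a MIN all-reduce: local_vt[i] is the proposal of rank i,
 * and every entry of reduced[] receives the global minimum.
 */
static void allreduce_min_sketch(const simtime_sketch_t *local_vt,
				 simtime_sketch_t *reduced, size_t n_ranks)
{
	assert(n_ranks > 0);

	/* Reduction step: find the smallest proposal. */
	simtime_sketch_t min = local_vt[0];
	for (size_t i = 1; i < n_ranks; i++) {
		if (local_vt[i] < min)
			min = local_vt[i];
	}

	/* "All" step: every rank gets the same result. */
	for (size_t i = 0; i < n_ranks; i++)
		reduced[i] = min;
}
```

With proposals 12.5, 7.0 and 9.25, every rank ends up with 7.0, which is the only value that no kernel instance can still roll back below.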
void join_white_msg_redux | ( | void | ) |
Join the white message reduction collective operation.
All kernels exchange with each other the number of white messages sent during the last white phase.
This is an asynchronous function: the actual completion of the collective reduction can be tested through the white_msg_redux_completed() function.
Once the collective reduction has completed, expected_white_msg holds the number of white messages sent by all the other kernels to this one during the last white phase.
Definition at line 301 of file gvt.c.
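The result of this collective can be illustrated by a small MPI-free emulation: if sent[i][j] is the number of white messages kernel i sent to kernel j, then the value each kernel j ends up expecting is the sum of column j. The function white_count_redux_sketch() and the fixed N_KERNELS are hypothetical; which MPI primitive gvt.c actually uses is not stated here, so the sketch only shows the arithmetic.

```c
#include <assert.h>
#include <stddef.h>

#define N_KERNELS 3 /* illustrative number of kernel instances */

/*
 * sent[i][j]: white messages kernel i sent to kernel j in the last white phase.
 * expected[j]: what kernel j must receive, i.e. the sum over all senders.
 */
static void white_count_redux_sketch(int sent[N_KERNELS][N_KERNELS],
				     int expected[N_KERNELS])
{
	for (size_t j = 0; j < N_KERNELS; j++) {
		expected[j] = 0;
		for (size_t i = 0; i < N_KERNELS; i++)
			expected[j] += sent[i][j];
	}
}
```

For example, with rows {0, 2, 1}, {3, 0, 0} and {4, 5, 0}, kernel 0 expects 7 white messages, kernel 1 expects 7, and kernel 2 expects 1.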
simtime_t last_reduced_gvt | ( | void | ) |
Return the last GVT value.
This function returns the last GVT value which all the threads from all simulation kernel instances have agreed upon.
It is also safe to call this function while a GVT reduction operation is taking place.
Definition at line 563 of file gvt.c.
void register_incoming_msg | ( | const msg_t * | msg | ) |
Register an incoming message, if necessary.
Any message being received by a simulation kernel instance should be passed to this function, right before being registered into any queue.
In this way, if the destination kernel is in a white phase, the total number of white messages received can be incremented. Consistency is ensured by the fact that atomic counters are used, making this function inherently thread-safe.
msg | The message to register as an incoming message. |
Definition at line 623 of file gvt.c.
void register_outgoing_msg | ( | const msg_t * | msg | ) |
Register an outgoing message, if necessary.
Any time that a simulation kernel is sending a message towards a remote simulation kernel instance, the message should be passed to this function beforehand.
In this way, if the thread which sends the message is in a red phase, the minimum timestamp of the red messages sent around is updated.
On the other hand, if the thread which sends the message is in a white phase, the counter of white messages sent towards the destination kernel is increased.
All this is fundamental information to ensure that the GVT reduction is consistent.
msg | The message to register as an outgoing message. |
Definition at line 589 of file gvt.c.
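The two cases described above can be sketched as follows. All names are hypothetical: msg_sketch_t and its fields stand in for msg_t (whose layout is not shown here), thread_colour models a single thread's phase, min_red_ts plays the role of one entry of min_outgoing_red_msg, and white_sent plays the role of white_msg_sent.

```c
#include <assert.h>
#include <math.h>
#include <stdatomic.h>

#define N_KERNELS 2 /* illustrative number of kernel instances */

typedef enum { SKETCH_WHITE, SKETCH_RED } colour_sketch_t;

/* Assumed message layout: only the fields this sketch needs. */
typedef struct {
	double timestamp;
	unsigned int dest_kernel;
} msg_sketch_t;

/* State of one sending thread (a single thread is modelled for brevity). */
static colour_sketch_t thread_colour = SKETCH_WHITE;
static double min_red_ts = INFINITY;

/* Per-destination counters of white messages sent, zero-initialized. */
static atomic_int white_sent[N_KERNELS];

static void register_outgoing_sketch(const msg_sketch_t *msg)
{
	assert(msg->dest_kernel < N_KERNELS);

	if (thread_colour == SKETCH_RED) {
		/* Red phase: track the minimum timestamp of red messages. */
		if (msg->timestamp < min_red_ts)
			min_red_ts = msg->timestamp;
	} else {
		/* White phase: count the message towards its destination kernel. */
		atomic_fetch_add(&white_sent[msg->dest_kernel], 1);
	}
}
```

The white counters are atomic because several threads may send to the same destination kernel, while the red minimum is kept per thread and therefore needs no synchronization in this sketch.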
void wait_white_msg_redux | ( | void | )
Wait for the completion of white message reduction.
bool white_msg_redux_completed | ( | void | ) |
Test completion of white message reduction collective operation.
This function checks whether the asynchronous collective operation which computes the expected number of white messages is completed.
This can be safely invoked concurrently by multiple worker threads, as a lock guard is used to ensure that only one thread at a time performs the check.
Returns true if the reduction operation is completed, false otherwise.
Definition at line 328 of file gvt.c.
static simtime_t local_vt_buff |
The final GVT reduction is implemented using an MPI All Reduce. This primitive requires a stable buffer in memory to hold the value to be reduced across all ranks. This is the global variable in which each kernel instance places its proposal for the GVT before the reduction.
phase_colour* threads_phase_colour |
static atomic_t white_0_msg_recv |
Counter of received white messages for the first white phase in a period of three. This is pointed to by white_msg_recv when relating to the current white phase.
static atomic_t white_1_msg_recv |
Counter of received white messages for the second white phase in a period of three. This is pointed to by white_msg_recv when relating to the current white phase.
volatile atomic_t* white_msg_recv |
There are multiple incarnations of a white phase: the white phase before a red phase is different from the white phase after that same red phase. In the end, the multiple incarnations can be reduced to only two (the third white phase appearing in the execution can be merged with the first, as the second one in between ensures that no pending action on the first phase is present).
This aspect is implemented using two different atomic counters of white messages, so that white_msg_recv always points to the counter of white messages, received by the current kernel, that were sent during the last global white phase. It points either to white_0_msg_recv or to white_1_msg_recv.