Before beginning the explanation I would like to point out that throughout this file I will generally talk about robots and processes really freely, they are the same thing, threads are of course light-processes that are sons of the
Main
class that is associated to any Robot.
In the project we were required to simulate a smart city's cleaning system, we had a set of processes roaming around and collecting pollution data that were autonomously sent to the server, each of the robots had a 10 percent probability of having problems and needing to go to a mechanic for repairs. The mechanic was able to handle one repair process at a time thus the project needed to handle different requests arriving at the mechanic in parallel. The full project description can be found in ./project_assignement.
The following is the list of all the classes that I chose to implement for this project
- Main
- Admin Client
- Admin Server
- Server Rest Interface
- MQTT Subscriber
- Bot Positions
- Average List
- Cleaning Bot
- Bot Services
- Service Comparator
- Waiting Thread
- Bot Thread
- Eliminator Thread
- Fix Helper Thread
- GRPC Services Thread (GRPC server)
- Input Thread
- Maintenance Thread
- Measurement Gathering Thread
- Mutual Exclusion Thread
- Pollution Sensor Thread
- Quit Helper Thread
- Bot Identity Comparator
- Bot Utilities
- Comm Pair
- Measurement Buffer
- Position
- Extra
- Atomic Counter
- Atomic Flag
- Logger
- Thread Safe data structure wrapper
- Simulator Code
- Variables
My implementation is quite thread heavy, but I wanted to really experiment as much as possible with it, to get to know and understand how threads work and how to control them while accessing information together.
In the following I will discuss my implementation for the different operations. When a new process is created it starts the initiator thread, which is simply a thread that instanciates and starts all of the necessary services (e.g. the GRPC Services Thread). in the meanwhile it requests the server to join the network, the join operation currently consists of two parts:- Addition of the new robot to the Admin Server's local structure
- Communication of personal data (i.e. an instance of the BotIdentity class) to all of the robots in the network
The GRPC communication is asynchronous, and I decided to do two different HTTP requests (check ID availability, communicate my existence to the network) to highlight the difference between the two tasks. The position in the city is chosen by the Admin Server simply by checking in the system which one is the district with the lowest current density. Originally it was supposed to pick a random district uniformly, while the solution would tend to a stable distribution (if enough processes were initialized), it would not be stable for the majority of the low density cases.
TheCommPair
class is a simple toy I came up with, it's nothing more than a pair containing a ManagedChannel
and a GRPCStub
. The idea behind it is simply to keep on recycling open stubs and channels; thus saving the resources necessary to open a new one each and every time a communication has to take place. Since the project is required to rely heavily on GRPC communication.
Comm Pairs are clustered into ThreadSafeHashMaps
, each of the pairs is associated to the hash of the BotIdentity
the ManagedChannel
connects to.
The maintenance process i s handled by two different threads, one deals with checking whether the robot should go under maintenance or not, the other built and initialized when the robot undergoes maintenance and handles communication and the maintenance procedure.The maintenance process uses a new thread to carry out the function, something that I didn't mention above was that, to handle potential concurrent removals and insertions, any operation involving the network to its full extent would be done by taking a snapshot of the network.
The GRPC is asynchronous and carried out in parallel, any removals necessary are postponed until the time for maintenance is up, thus the removal and stabilization (which do not come cheap) are done at most once per process. Pollution measurements (more about this later) stop during maintenance, originally they would still be sent, but I decided against it becasue it would not make much sense. The solution I implemented to handle removals is overengineered and overkill for the job and the setup of the project.
The reason behind it is that it was originally thought to straighten even the worst distributions (e.g. 1-6, 2-0, 3-0, 4-0), keep in mind that I originally placed robots in the city randomly.
The elimination works this way:
- I remove mentions of the dead robot from the local machine
- I contact the Admin Server to let him know that there are dead robots in the city
- Contact the other robots in the network to remove the references to dead robots from their systems
- Stabilize the data structure
It's important to mention a couple of details
For the elimination procedure, two different GRPC procedures have been called:moveRequest
-> which is used by the robot dealing with the elimination process to tell another robot to move away from a districtpositionModificationRequest
-> which is more like a notification, when a robot moves to another district computes a random position inside that district and lets every robot in the network know that its position is changed.
Whenever a robot receives a moveRequest
the thread waits until the position is changed and the robots in the network have been notified to perform the onComplete
and onNext
operations.
I decided to use a list of Priority Queues with a specific Comparator that allows me to do sorting on the robots by ID. The stabilization is later carried out just by moving around the top of the queues. The pollution sensor is composed by four different entities:
- Simulator
- Measurement Buffer
- Measurement Gathering Thread
- Pollution Sensor Thread
The Measurement Buffer implements a read/write cycle depending on the available size of the buffer. The cycle works unless the robot is on maintenance.
The Measurement Gathering Thread keeps averages in memory until the Pollution Sensor Thread comes and takes them from him. This Thread, the same way the other did, stays on hold until the maintenance process is done.
The Pollution Sensor Thread is the timer of the pipeline and handles MQTT broadcasting. Each MQTT broadcast will be at most 25 seconds apart from the one before, that is because I wanted to simulate the fact that the robot being broken down. To conclude the Admin Server will receive an MQTT message every: REMAINING_MAINTENANCE_TIME
- Cleaning the repository
- Lower the project complexity, find ways to merge functionalities together and lower the amount of threads per process
- Do code refactoring to better define responsibilities inside the network
- Change a couple of asynchronous GRPCs to better suited synchronous remote procedure calls
- Create a GUI to better understand what is going on
- Write good code documentation