Java implementation for DBSCANSD, a trajectory clustering algorithm.
DBSCANSD (Density-Based Spatial Clustering of Applications with Noise considering Speed and Direction) [1] is a clustering algorithm that extends DBSCAN [2]. It takes speed and direction into account, which is essential for extracting maritime lanes. The output of the algorithm is a set of Gravity Vectors (GV) and Sampled Stopping Points (SSP).
The current version of the implementation does not yet generate SSP; I plan to add this part later.
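To give a rough idea of how speed and direction might enter a density-based neighborhood test, here is a minimal sketch. It is not the code from this repository: the class `NeighborSketch`, its `Point` type, the use of plain Euclidean distance, and the wrap-around handling of headings are all assumptions made for illustration. Only the parameter names (eps, minPts, maxSpd, maxDir) are taken from the usage described in this README.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the repository's implementation.
final class NeighborSketch {

    /** A trajectory point: position, speed over ground (SOG) and course over ground (COG, in degrees). */
    static final class Point {
        final double x, y, sog, cog;

        Point(double x, double y, double sog, double cog) {
            this.x = x;
            this.y = y;
            this.sog = sog;
            this.cog = cog;
        }
    }

    /** Smallest absolute difference between two headings in degrees, always in [0, 180]. */
    static double angleDiff(double a, double b) {
        double d = Math.abs(a - b) % 360.0;
        return d > 180.0 ? 360.0 - d : d;
    }

    /** q is a neighbor of p if it is close in space AND similar in speed and direction. */
    static boolean isNeighbor(Point p, Point q, double eps, double maxSpd, double maxDir) {
        double dist = Math.hypot(p.x - q.x, p.y - q.y); // assumption: Euclidean distance
        return dist <= eps
                && Math.abs(p.sog - q.sog) <= maxSpd
                && angleDiff(p.cog, q.cog) <= maxDir;
    }

    /** As in DBSCAN, p is a core point if its neighborhood (p itself included) has at least minPts points. */
    static boolean isCore(Point p, List<Point> all, double eps, int minPts, double maxSpd, double maxDir) {
        int count = 0;
        for (Point q : all) {
            if (isNeighbor(p, q, eps, maxSpd, maxDir)) {
                count++;
            }
        }
        return count >= minPts;
    }

    public static void main(String[] args) {
        Point a = new Point(0.0, 0.0, 10.0, 359.0);
        Point b = new Point(0.01, 0.0, 11.0, 1.0);   // nearby, similar speed, heading differs by 2 degrees
        Point c = new Point(0.01, 0.0, 10.0, 180.0); // nearby, but sailing the opposite way
        System.out.println(isNeighbor(a, b, 0.03, 2.0, 2.5)); // true
        System.out.println(isNeighbor(a, c, 0.03, 2.0, 2.5)); // false
        System.out.println(isCore(a, Arrays.asList(a, b, c), 0.03, 2, 2.0, 2.5)); // true
    }
}
```

In this sketch, two vessels that pass close to each other in opposite directions are not counted as neighbors, which is the kind of separation a purely spatial DBSCAN test cannot make.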
Since the AIS data provided for this project is confidential, I cannot upload it to GitHub as an example. Instead, I generated a toy data set and put it in the src folder, so the program can be tested with it. It would be great if you used this algorithm for problems in other domains, such as tracking data of vehicles, pedestrians, hurricanes or animals.
More details about this algorithm can be found in [1], which is available at the following link:
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=7004281
After downloading the repository to your local machine:
- cd to the folder src/boliu/dbscansd/
- compile all the .java files using:
javac *.java
- cd to the folder src/
- execute the program using the following command:
java boliu.dbscansd.Main inputfile outputfile lineNum eps minPts maxSpd maxDir isStop
* @param inputfile the input file path
* @param outputfile the output file path
* @param lineNum the designated number of trajectory points for clustering (if the size of the input file is less than lineNum, it will extract all the points)
* @param eps 1st parameter of DBSCANSD, the radius
* @param minPts 2nd parameter of DBSCANSD, the minimum number of points
* @param maxSpd 3rd parameter of DBSCANSD, the maximum SOG difference
* @param maxDir 4th parameter of DBSCANSD, the maximum COG difference
* @param isStop boolean value (0/1), whether you would like to cluster stopping points (1) or moving points (0)
For example:
java boliu.dbscansd.Main toy_data.csv output 70000 0.03 50 2 2.5 0
With this command, the program processes the toy_data.csv file: it extracts the first 70,000 moving points from the data and then runs DBSCANSD on them. The final output consists of two files: output_gv.csv (gravity vectors) and output_movingclusters.csv (the original clustering results, with more rows).
- wait for the result :) The running time varies with the size of the input data and the other input parameters.
- Star it if it helps *-*
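For readers wiring the eight positional arguments of the command above into their own code, the following sketch shows one way they could be read and type-checked. It is only an illustration under stated assumptions: the class `ArgsSketch` and its `Params` holder are hypothetical and are not the repository's `Main`; the types (integers for lineNum and minPts, doubles for eps, maxSpd and maxDir, 0/1 for isStop) are inferred from the parameter descriptions and the example command.

```java
// Hypothetical sketch, not the repository's Main class.
final class ArgsSketch {

    /** Simple holder for the eight command-line parameters described in this README. */
    static final class Params {
        final String inputFile;
        final String outputFile;
        final int lineNum;
        final double eps;
        final int minPts;
        final double maxSpd;
        final double maxDir;
        final boolean isStop;

        Params(String inputFile, String outputFile, int lineNum, double eps,
               int minPts, double maxSpd, double maxDir, boolean isStop) {
            this.inputFile = inputFile;
            this.outputFile = outputFile;
            this.lineNum = lineNum;
            this.eps = eps;
            this.minPts = minPts;
            this.maxSpd = maxSpd;
            this.maxDir = maxDir;
            this.isStop = isStop;
        }
    }

    /** Parses: inputfile outputfile lineNum eps minPts maxSpd maxDir isStop */
    static Params parse(String[] args) {
        if (args.length != 8) {
            throw new IllegalArgumentException(
                "expected 8 arguments: inputfile outputfile lineNum eps minPts maxSpd maxDir isStop");
        }
        // isStop is documented as 0/1; anything else is rejected in this sketch.
        if (!"0".equals(args[7]) && !"1".equals(args[7])) {
            throw new IllegalArgumentException("isStop must be 0 or 1");
        }
        return new Params(
            args[0],
            args[1],
            Integer.parseInt(args[2]),
            Double.parseDouble(args[3]),
            Integer.parseInt(args[4]),
            Double.parseDouble(args[5]),
            Double.parseDouble(args[6]),
            "1".equals(args[7]));
    }

    public static void main(String[] args) {
        // The example command from this README, split into its arguments.
        Params p = parse("toy_data.csv output 70000 0.03 50 2 2.5 0".split(" "));
        System.out.println(p.inputFile + " eps=" + p.eps + " minPts=" + p.minPts + " isStop=" + p.isStop);
    }
}
```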
[1] Liu, Bo, et al. "Knowledge-based clustering of ship trajectories using density-based approach." Big Data (Big Data), 2014 IEEE International Conference on. IEEE, 2014.
[2] Ester, Martin, et al. "A density-based algorithm for discovering clusters in large spatial databases with noise." Kdd. Vol. 96. No. 34. 1996.