Contains implementations of algorithms that estimate the geographic location of media content based on its content and metadata. It also covers the project's participation in the MediaEval Placing Task 2014. The project's paper can be found here.
This is a tag-based method in which a complex geographical-tag model is built from the tags, titles and locations of the training-set images, in order to estimate the location of each query image in the test set. The baseline approach comprises three steps.
A. Filtering: Remove all punctuation and symbols (e.g. “.%!&”) from the training and test data, transform all characters to lower case, and then remove from the training set all images with empty tags and title.
B. Grid of Cells & Language Model: Divide the earth's surface into cells with a side length of 0.01° in both latitude and longitude (approximately 1 km near the equator). Then, for each cell and each tag, calculate the tag-cell probability.
C. Assignment in Cells: For a query image, the probability of every cell is computed by summing up the contributions of the individual tags and title words.
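The three baseline steps can be sketched as follows. This is a minimal illustration, not the project's code: the function names (clean, cell_id, build_language_model, assign_cell), the sample records, and the count-based estimate of the tag-cell probability are all assumptions made for the example, and the actual implementation may count occurrences differently.

```python
import math
import re
from collections import defaultdict

def clean(text):
    """Step A: lower-case the text and drop punctuation/symbols; return its terms."""
    return re.sub(r"[^\w\s]", " ", text.lower()).split()

def cell_id(lon, lat, side=0.01):
    """Step B: name a cell after the coordinates of its centre (3 decimals suit side=0.01)."""
    c_lon = (math.floor(lon / side) + 0.5) * side
    c_lat = (math.floor(lat / side) + 0.5) * side
    return f"{c_lon:.3f}_{c_lat:.3f}"

def build_language_model(training):
    """Step B: estimate a tag-cell probability from (terms, lon, lat) records.

    Assumption: the probability is the share of a term's images that fall in a cell.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for terms, lon, lat in training:
        cell = cell_id(lon, lat)
        for term in set(terms):
            counts[term][cell] += 1
    model = {}
    for term, cells in counts.items():
        total = sum(cells.values())
        model[term] = {cell: n / total for cell, n in cells.items()}
    return model

def assign_cell(terms, model):
    """Step C: sum the per-term cell probabilities; return the best cell or None."""
    scores = defaultdict(float)
    for term in set(terms):
        for cell, prob in model.get(term, {}).items():
            scores[cell] += prob
    return max(scores, key=scores.get) if scores else None

# Made-up (title, tags, lon, lat) records, for illustration only.
raw = [
    ("Eiffel Tower!", "paris france", 2.2945, 48.8584),
    ("", "paris louvre", 2.3376, 48.8606),
    ("Acropolis", "athens greece", 23.7257, 37.9715),
    ("", "", 0.0, 0.0),  # empty tags and title: dropped by the filter below
]
training = []
for title, tags, lon, lat in raw:
    terms = clean(title + " " + tags)
    if terms:
        training.append((terms, lon, lat))

model = build_language_model(training)
best_cell = assign_cell(clean("Tower in Paris"), model)
```

A query whose terms never appear in the training set gets no cell (None) in this sketch; how the project handles that case is not described here.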
With the implementation described above as the baseline, the following extensions are applied.
- Similarity Search: Determine the k most similar training images (using Jaccard similarity on the corresponding sets of tags) within the estimated cell, and use their center of gravity as the estimated location.
- Internal Grid: Build a language model using a finer grid (cell side of 0.001°) and make the following assumption: if the estimated cell of finer granularity falls inside the borders of the estimated cell of coarser granularity, apply similarity search inside the former cell; otherwise, apply it inside the latter cell.
- Spatial Entropy: Build a Gaussian weight function based on the values of the spatial tag entropy. The spatial tag entropy is calculated using the Shannon entropy formula on the tag-cell probabilities.
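The Similarity Search and Spatial Entropy extensions can be illustrated with the short sketch below. It rests on several assumptions: the center of gravity is taken as the unweighted mean of the coordinates, the entropy uses the natural logarithm, and the Gaussian parameters mu and sigma are hypothetical inputs whose actual values are not given here. All function names are illustrative.

```python
import math

def jaccard(a, b):
    """Jaccard similarity |a ∩ b| / |a ∪ b| of two tag collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similarity_search(query_tags, cell_images, k=5):
    """Return the mean (lon, lat) of the k images most similar to the query.

    cell_images: list of (tags, lon, lat) for the training images of one cell.
    Returns None when the cell holds no images.
    """
    ranked = sorted(cell_images,
                    key=lambda img: jaccard(query_tags, img[0]),
                    reverse=True)
    top = ranked[:k]
    if not top:
        return None
    lon = sum(img[1] for img in top) / len(top)
    lat = sum(img[2] for img in top) / len(top)
    return lon, lat

def spatial_entropy(cell_probs):
    """Shannon entropy of one tag's probability distribution over cells."""
    return -sum(p * math.log(p) for p in cell_probs.values() if p > 0)

def gaussian_weight(entropy, mu, sigma):
    """Gaussian weight of a tag given its entropy (mu, sigma are assumed inputs)."""
    return math.exp(-((entropy - mu) ** 2) / (2 * sigma ** 2))
```

For example, a tag spread evenly over two cells has entropy ln 2, while a tag confined to a single cell has entropy 0; with this weight function, tags whose entropy lies close to mu contribute most.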
To run the project, set all the necessary arguments in the file config.properties.
Input File Format
The records of the datasets given as training and test sets must be in the following format.
imageID imageHashID userID title tags machineTags lon lat description
imageID: the ID of the image.
imageHashID: the hash ID of the image that was provided by the organizers.
userID: the ID of the user who uploaded the image.
title: the image's title.
tags: the image's tags.
machineTags: the image's machine tags.
lon: the image's longitude.
lat: the image's latitude.
description: the image's description, if one is provided.
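A record in this format could be read as in the sketch below. The field separator is not stated above, so a tab is assumed here and can be changed through the sep parameter; ImageRecord and parse_record are illustrative names, not part of the project.

```python
from typing import NamedTuple

class ImageRecord(NamedTuple):
    image_id: str
    image_hash_id: str
    user_id: str
    title: str
    tags: str
    machine_tags: str
    lon: float
    lat: float
    description: str

def parse_record(line, sep="\t"):
    """Parse one dataset line; the separator is an assumption (tab by default)."""
    fields = line.rstrip("\n").split(sep)
    if len(fields) < 8:
        raise ValueError(f"expected at least 8 fields, got {len(fields)}")
    # The description is optional and may itself contain the separator.
    description = sep.join(fields[8:])
    return ImageRecord(*fields[:6], float(fields[6]), float(fields[7]), description)
```

Note that the longitude comes before the latitude in the input records.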
Output File Format
At the end of the training process, the algorithm creates a folder named CellProbsForAllTags and, inside it, a file named cell_tag_prob_scale(s)_entropy.txt, where s is the value of the scale that was given as an argument. The format of this file is the following.
tag ent-rank_ent-value cell1-lon_cell1-lat>cell1-prob cell2-lon_cell2-lat>cell2-prob...
tag: the actual name of the tag.
ent-value: the value of the tag's entropy.
ent-rank: the rank of the tag based on its entropy.
cellx: the x-th most probable cell.
cellx-lon_cellx-lat: the longitude and latitude of the center of cellx, which are also used as the cell's ID.
cellx-prob: the probability of cellx for the specific tag.
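A line of this file could be parsed along the following lines. The sketch assumes that the fields are separated by whitespace and that the rank is an integer; neither detail is confirmed above. The function name parse_tag_line is illustrative.

```python
def parse_tag_line(line):
    """Parse 'tag ent-rank_ent-value lon_lat>prob ...' into its components.

    Assumes whitespace-separated fields and an integer rank.
    """
    parts = line.split()
    if len(parts) < 2:
        raise ValueError("expected a tag and an entropy field")
    tag = parts[0]
    rank_str, value_str = parts[1].split("_", 1)
    cells = {}
    for item in parts[2:]:
        # The cell ID is 'lon_lat'; the probability follows the '>' sign.
        cell, prob = item.rsplit(">", 1)
        cells[cell] = float(prob)
    return tag, int(rank_str), float(value_str), cells
```

The cell IDs are kept as strings so that they can be matched directly against the IDs used elsewhere in the pipeline.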
The file described above is given as input to the Language Model process. During this process, a folder named resultsLM is created and, inside it, a file named resultsLM_scale(s).txt. Each row of this file contains the ID of the most probable cell for a query image, and corresponds to the test set image in the same row.
Finally, the file created by the Language Model is used in the last process of the algorithm, the Internal Grid and Similarity Search. The final results are saved in the file specified in the arguments; each row contains the ID of the query image, the estimated latitude and the estimated longitude, separated by the symbol ;.
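Reading the final results back is straightforward; the sketch below assumes exactly three fields per row with no spaces around the separator. The function name parse_result is illustrative.

```python
def parse_result(line):
    """Split an 'imageID;lat;lon' row into (image_id, lat, lon)."""
    image_id, lat, lon = line.strip().split(";")
    return image_id, float(lat), float(lon)
```

Note that the latitude comes first here, whereas the input records list the longitude first.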
Giorgos Kordopatis-Zilos ([email protected])
Symeon Papadopoulos ([email protected])