Multimedia Geotagging

This repository contains the implementation of algorithms that estimate the geographic location of multimedia items based on their textual content. The approach is described in the paper Geotagging Text Content With Language Models and Feature Mining.

Main Method

The approach is a refined language model, including feature selection and weighting schemes and heuristic techniques that improves the accuracy in finer granularities. It is a text-based method, in which a complex geographical-tag model is built from the tags, titles and the locations of a massive amount of geotagged images that are included in a training set, in order to estimate the location of each query image included in a test set.

The main approach comprises two major processing steps, an offline and an online.

Offline Processing Step

Pre-processing
- apply URL decoding, lowercase transformation, tokenization
- remove accents, punctuations and symbols (e.g. “.%!&”)
- discard terms consisting of numerics or less than three characters
Language Model
- divide earth surface in rectangular cells with a side length of 0.01°
- calculate term-cell probabilities based on the users that used the term inside the cell
Feature selection
- calculate locality score of every term in the dataset
- locality is based on the term frequency and the neighbor users that have used it in the cell distribution
- the final set of selected terms is formed from the terms with locality score greater than zero
Feature weighting using spatial entropy
- calculate spatial entropy values of every term applying the Shannon entropy formula in the term-cell probabilities
- spatial entropy weights derives from a Gaussian weight function over the spatial entropy of terms
- locality weights derives from the relative position in the rank of terms based on their locality score
- combine locality and spatial entropy weight to generate the final weights

Online Processing Step

Language Model based estimation (prior-estimation)
- the probability of each cell is calculated
- Most Likely Cell (MLC) considered the cell with the highest probability and used to produce the estimation
Multiple Resolution Grids
- build different language models for multiple resolution grids (side length 0.01° and 0.001°)
- estimate the MLC combining the result of the individual language models
Similarity Search
- determine the most similar training images within the MLC
- their center-of-gravity is the final location estimation

Instructions

In order to make possible to run the project you have to set all necessary argument in configurations, following the instruction for every argument. The default values may be used.

Input File
The imput files must be in the same format as YFCC100M dataset.

Output Files
At the end of the training process, the algorithm creates a folder named TermCellProbs and inside the folder another folder named scale_(s), named appropriately based on the scale s of the language model's cells. The format of this file is the following.

term	cell1-lon_cell1-lat>cell1-prob>cell1-users  cell2-lon_cell2-lat>cell2-prob>cell2-users...

term: the actual name of the term
cellx: the x most probable cell.
cellx-lon_cellx-lat: the longitude and latitude of center of the cellx, which is used as cell ID
cellx-prob: the probability of the cellx for the specific tag
cellx-users: the number of users that used the specific term in the cellx

The output of the feature weighting scheme is a folder with name weights containing two files, one for locality weights and one for spatial entropy weights, namely locality_weights and spatial_entropy_weights, respectively. Each row contains a term and its corresponding weight, separated by a tab character.

The files that are described above are given as input in the Language Model estimation process. During this process, a folder named resultsLM and inside that folder two files named resultsLM_scale(s)are created, where are included the MLCs of the query images. Every row contains the imageID and the MLC (tab-separated) of the image that corresponds in the respective line in the test set. Also, a file named resultsLM_scale(s)_conf_evid is created in the same folder, containing the confidence and evidences that lead to estimated MLC, for every query image.

Having estimated the MLCs for both granularity grids, the files are fed to the Multiple Resolution Grids technique, which produce a file named resultsLM_mg(cs)-(fs), where (cs) and (fs) stands for coarser and finer granularity grid, respectively. Every row of this file contains the image id, the MLC of the coarser language model and the result of the Multiple Resolution Grids technique, separated with a :.

In conclusion, the file that is created by the Multiple Resolution Grids technique is used for the final processes of the algorithm, Similarity Search. During this process, a folder named similar_images is created in the temp folder, containing the similarity values and the location of the images that containing in the MLG of every image in the test set. The final results are saved in the file specified in the arguments, and the records in each row are the ID of the query image, the real longitude and latitude, the estimated longitude and latitude, and they are tab-separated.

Evaluation Framework

This package contains the implemetations of the sampling strategies described in the MMCommons 2016 paper. In order to run the evaluation framework you have to set all necessary argument in configuration file, following the instruction for every argument. To run the code, the Evaluation class have to be executed.

Additionally, in this folder, the zip file that contains the generated collections from the different sampling strategies and the file of the building concepts can be found. Keep in mind that the geographical uniform sampling, the user uniform sampling and text diversity sampling generates different files in every code execution because they involve random selections and permutations.

Demo Version

There have been developed a demo version and a storm module of the approach.

Citations

If you use this code for your research, please cite our paper.

@article{kordopatis2017geotagging,
  title={Geotagging Text Content With Language Models and Feature Mining},
  author={Kordopatis-Zilos, Giorgos and Papadopoulos, Symeon and Kompatsiaris, Ioannis},
  journal={Proceedings of the IEEE},
  year={2017}
}

If you use the evaluation framework, please cite our paper.

@inproceedings{kordopatis2016depth,
  title={In-depth exploration of geotagging performance using sampling strategies on YFCC100M},
  author={Kordopatis-Zilos, Giorgos and Papadopoulos, Symeon and Kompatsiaris, Yiannis},
  booktitle={Proceedings of the 2016 ACM Workshop on Multimedia COMMONS},
  year={2016},
}

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details

Contact for further details about the project

Giorgos Kordopatis-Zilos ([email protected])
Symeon Papadopoulos ([email protected])

Name		Name	Last commit message	Last commit date
Latest commit History 130 Commits
properties		properties
samples		samples
src		src
weights		weights
LICENSE		LICENSE
README.md		README.md
pom.xml		pom.xml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Multimedia Geotagging

Main Method

Offline Processing Step

Online Processing Step

Instructions

Evaluation Framework

Demo Version

Citations

License

Contact for further details about the project

About

Releases

Packages

Languages

License

MKLab-ITI/multimedia-geotagging

Folders and files

Latest commit

History

Repository files navigation

Multimedia Geotagging

Main Method

Offline Processing Step

Online Processing Step

Instructions

Evaluation Framework

Demo Version

Citations

License

Contact for further details about the project

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages