This project presents a pipeline for robot navigation that uses the ClipSeg and Segment Anything models to generate masks of traversable paths in images. The approach is well suited to paths that are already visible and have high contrast against their surroundings. To retrieve a new path where none is visible, a second pipeline prompting ClipSeg and Stable Diffusion is implemented.
*(Example results: original images alongside the final masks produced by the pipeline.)*
- Clone the repository locally and pip install `navigate-with-image-language-model` with:

```
git clone https://github.com/DmblnNicole/Navigation-with-image-language-model.git
pip install -e .
```
- Install dependencies:

```
pip install git+https://github.com/openai/CLIP.git
pip install git+https://github.com/facebookresearch/segment-anything.git
```
- Download the checkpoint for Segment Anything model type `vit_h` (ViT-H SAM model) and save it in the root folder of the repository.
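Once the checkpoint is in place, loading it follows the standard `segment_anything` API. This is a minimal sketch, not the repository's own code; the checkpoint filename is an assumption:

```python
def load_sam_predictor(checkpoint_path='sam_vit_h_4b8939.pth', model_type='vit_h'):
    """Load a SAM checkpoint and wrap it in a predictor.

    The import is done lazily so the function can be defined (and this
    sketch read) without segment-anything installed.
    """
    from segment_anything import sam_model_registry, SamPredictor
    sam = sam_model_registry[model_type](checkpoint=checkpoint_path)
    return SamPredictor(sam)
```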
The file `pipeline/eval.py` runs the pipeline and contains all adjustable settings, such as text prompts, paths to the image data, and the model types.
- Choose your model type and specify if output masks should be visualized.

```python
if __name__ == '__main__':
    main('sam', visualize=False)
```
If `visualize=True`, the masks are saved in a new folder called `output`.
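The saving behaviour can be sketched as follows. `save_mask` is a hypothetical helper, not the repository's code; it writes each binary mask as a plain-text PGM image so the sketch needs only the standard library:

```python
import os

def save_mask(mask, name, visualize=True, out_dir='output'):
    """Write a binary mask (2-D list of 0/1) as an ASCII PGM file.

    Mirrors the pipeline's behaviour: masks are only written when
    visualize is True, into an output folder created on first use.
    """
    if not visualize:
        return None
    os.makedirs(out_dir, exist_ok=True)
    h, w = len(mask), len(mask[0])
    path = os.path.join(out_dir, name + '.pgm')
    with open(path, 'w') as f:
        f.write(f'P2\n{w} {h}\n255\n')  # PGM header: magic, size, max value
        for row in mask:
            f.write(' '.join('255' if v else '0' for v in row) + '\n')
    return path
```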
Change text prompts and upload your own dataset.
- Upload image data and specify the path:

```python
data_path = '../data/images/hike/edge'
```

- Upload ground truth masks and specify the path:

```python
GT_dir = '../data/GT/GT_hike'
```

- Choose your text prompt:

```python
word_mask = 'A bright photo of a road to walk on'
```
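ClipSeg scores every pixel for similarity to the text prompt, and turning those scores into a binary mask is a simple threshold. A minimal, dependency-free sketch of that step; the threshold value is an assumption, not the pipeline's setting:

```python
def prompt_scores_to_mask(scores, threshold=0.5):
    """Binarize per-pixel prompt-similarity scores into a 0/1 mask.

    scores: 2-D list of floats in [0, 1], e.g. ClipSeg logits after a
    sigmoid. Pixels at or above the threshold count as path.
    """
    return [[1 if s >= threshold else 0 for s in row] for row in scores]
```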
The pipeline comprising ClipSeg and Segment Anything was evaluated on a dataset extracted from YouTube videos, shown above. The dataset consists of images with visible, high-contrast paths. While the primary objective is to segment paths that are already visible, the method also produces results on images of forest terrain where no clear path is visible.
*(Example results on forest terrain: final masks produced by the pipeline.)*
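With ground-truth masks available, the pipeline's output can be scored; intersection-over-union (IoU) is the usual metric for this kind of segmentation evaluation. A minimal sketch (the repository may compute its scores differently):

```python
def iou(pred, gt):
    """Intersection-over-union between two binary masks (2-D lists of 0/1)."""
    inter = union = 0
    for prow, grow in zip(pred, gt):
        for p, g in zip(prow, grow):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Two empty masks agree perfectly by convention.
    return inter / union if union else 1.0
```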