apavlidi/WebCrawler

Web-Crawler

This is a Java web crawler that crawls a given URL and returns the URLs it visited, together with all the links found on each of them. The released version of the application is available at: http://web-crawl-env.eba-upp2ihyt.eu-west-2.elasticbeanstalk.com/
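To illustrate the idea of visiting pages and recording the links found on each one, here is a minimal sketch. It is not the project's actual implementation: the class name `CrawlSketch`, the `maxPages` limit, the regex-based link extraction and the pluggable `fetcher` function are all assumptions made for this example. A real crawler would normally use an HTML parser and an HTTP client instead.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of a breadth-first crawl; not the project's real code. */
final class CrawlSketch {

  // Naive matcher for absolute http(s) links in double-quoted href attributes.
  // Assumption for brevity: a production crawler would use a proper HTML parser.
  private static final Pattern HREF =
      Pattern.compile("href=\"(https?://[^\"#]+)\"", Pattern.CASE_INSENSITIVE);

  private CrawlSketch() {}

  /** Returns the distinct absolute links found in the HTML, in document order. */
  static List<String> extractLinks(String html) {
    Set<String> links = new LinkedHashSet<>();
    Matcher matcher = HREF.matcher(html);
    while (matcher.find()) {
      links.add(matcher.group(1));
    }
    return new ArrayList<>(links);
  }

  /**
   * Crawls breadth-first from {@code start}, visiting at most {@code maxPages} pages.
   * The {@code fetcher} maps a URL to its HTML body, which keeps the sketch testable
   * without network access. Returns each visited URL mapped to the links found on it.
   */
  static Map<String, List<String>> crawl(
      String start, int maxPages, Function<String, String> fetcher) {
    Map<String, List<String>> visited = new LinkedHashMap<>();
    Deque<String> queue = new ArrayDeque<>();
    queue.add(start);
    while (!queue.isEmpty() && visited.size() < maxPages) {
      String url = queue.poll();
      if (visited.containsKey(url)) {
        continue; // already crawled via another path
      }
      List<String> links = extractLinks(fetcher.apply(url));
      visited.put(url, links);
      for (String link : links) {
        if (!visited.containsKey(link)) {
          queue.add(link);
        }
      }
    }
    return visited;
  }
}
```

Calling `CrawlSketch.crawl(startUrl, 10, fetcher)` with a `fetcher` backed by an in-memory map of pages is enough to try it out. Passing the fetcher as a function is a design choice of this sketch so that the traversal logic can be exercised without real HTTP requests.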

Tech Stack

The project is built with Spring and Java 17, and it uses JUnit for testing. It follows the Google Java Style Guide by using the spotless plugin, and it provides a code coverage report generated with jacoco.

It uses CircleCI for CI/CD and AWS Elastic Beanstalk for deployment, and the application is packaged with Docker.

It is also integrated with Snyk to check for security vulnerabilities.

The CI/CD includes the following steps:

  1. Check the code style
  2. Run the tests
  3. Create the code coverage report and publish it to Codecov
  4. Deploy the application to AWS Elastic Beanstalk

You can find the full list of tech in the Tech & Tools Documentation.

Run it locally

  1. Clone the project on your local machine.
    $ git clone https://github.com/apavlidi/WebCrawler.git

  2. Navigate to the project folder and install the dependencies with the following command.
    $ mvn install

  3. Run the application locally (the application can be accessed from localhost:8080)
    $ mvn spring-boot:run

Docker

You can also run the application using Docker:

  1. $ docker build -t app .

  2. $ docker run -p 8080:8080 app

Run tests ✅

You can run the tests by using $ mvn test.

Generate coverage report 📊

You can produce a code coverage report with the jacoco plugin by running $ mvn jacoco:report. The code coverage report is also published to Codecov.

Lint code 💅

You can format the code with the spotless plugin by running $ mvn spotless:apply. Spotless has been configured to use the Google Java Style.

Documentation 📕

Web-Crawl documentation is available here. The API is also exposed via OpenAPI, and it's accessible here: /v3/api-docs
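As a rough illustration of reading the OpenAPI description programmatically, the sketch below fetches /v3/api-docs with the JDK's built-in HTTP client. It assumes the application is running locally on localhost:8080 as described in the "Run it locally" section. The class name `ApiDocsSketch`, its method names and the 10 second timeout are made up for this example and are not part of the project.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Hypothetical helper for downloading the OpenAPI document; not part of the project. */
final class ApiDocsSketch {

  private ApiDocsSketch() {}

  /** Builds the URI of the OpenAPI endpoint for a base URL such as http://localhost:8080. */
  static URI apiDocsUri(String baseUrl) {
    return URI.create(baseUrl).resolve("/v3/api-docs");
  }

  /** Fetches the OpenAPI document as a JSON string; requires the application to be running. */
  static String fetchApiDocs(String baseUrl) throws IOException, InterruptedException {
    HttpClient client = HttpClient.newHttpClient();
    HttpRequest request =
        HttpRequest.newBuilder(apiDocsUri(baseUrl))
            .timeout(Duration.ofSeconds(10)) // assumed timeout, adjust as needed
            .GET()
            .build();
    HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
    if (response.statusCode() != 200) {
      throw new IOException("Unexpected status code: " + response.statusCode());
    }
    return response.body();
  }
}
```

With the application started via $ mvn spring-boot:run, calling `ApiDocsSketch.fetchApiDocs("http://localhost:8080")` should return the JSON description of the API.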

Project Kanban 👨‍🏫

Web-Crawl project kanban is available here.
