This app helps you keep track of GitHub users in cities by saving their profile data to disk as .edn, so you can easily `grep` by language or keyword. It interacts with the GitHub API using Clojure/babashka.

simonneutert/git-hire

git hire! hire on 🔥!

Ever wondered who is coding what in your city and how to keep track of it, maybe grep projects by keyword?

This is your tool! (babashka >= 1.0.171 mandatory)

What is .edn?

EDN (Extensible Data Notation) is a data format similar to JSON, but based on Clojure's native data literals.

A result can look like this:

{:name "Simon Neutert"
 :hireable true
 :languages ["HTML"]
 :bio "I'm an HTML hacker."
 :location "Area 50++"
 :public-repos 123
 :repos-url "https://api.github.com/users/simonneutert/repos"
 :type "User"}

Need JSON?

I highly recommend jet for that.




Features

  • up to 1000 users per city + language combination (sorted by each user's public repository count)
  • if a city has fewer than 1000 users in total, you can download by location only
  • concurrency built-in 🚀

planned features

  • get all users (not just 1000)
    • implement automatic bucketing, sliding through the limits
    • PROBLEM: GitHub sets the limit here 🥴
  • tests?! 🧌
  • sort by active last week? OR created in year?
  • speed isn't crucial, but utilizing some of clojure.core.async magic could speed things up 10x maybe 🤔 pmap ftw 🎉

Prerequisites

Make sure the GITHUB_HIRE_TOKEN environment variable is set.
I do it like this: in a terminal, enter
$ export GITHUB_HIRE_TOKEN="<my-token-here>"
then, from that same terminal, open your IDE of choice, e.g.
$ code .

Or put the export in your .zshrc 🤗 or whatever file your shell loads at startup.

🥳 happy times in the REPL
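
A forgotten export is easy to miss, so a small guard can help. The following is only a sketch: `require_token` is a hypothetical helper and not part of git-hire. It assumes `printenv` is available and checks that the variable is both exported and non-empty.

```shell
# Hypothetical helper (not part of git-hire): fail early when the token
# is missing from the environment that child processes such as bb inherit.
require_token() {
  # printenv only sees exported variables, unlike a plain "$VAR" test.
  if [ -z "$(printenv GITHUB_HIRE_TOKEN)" ]; then
    echo "GITHUB_HIRE_TOKEN is not exported; run: export GITHUB_HIRE_TOKEN=..." >&2
    return 1
  fi
}
```

It could be used as `require_token && bb scrape mainz`.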

Run

Here's what you need to get the thing running.

  • babashka or Docker/Podman
  • Project Configuration (optional)

Configuration

Currently, the only configurable setting is the sleep time between request cycles.

Sleep time

The default sleep time is 30 seconds.

Increase the sleep time to avoid hitting the GitHub API rate limit.

You can customise the sleep time between cycles by setting the SLEEP_TIME_SECONDS environment variable.

$ SLEEP_TIME_SECONDS=15 bb scrape <location-like-city-or-country> <language>

Run in Docker

All of the following should work in Docker, too.

The simplest way for you is to use the given Dockerfile.

$ docker build --build-arg github_hire_token=${GITHUB_HIRE_TOKEN} -t git-hire .
$ docker run -it --rm git-hire

If you need to store the profiles, you can mount a docker volume, but this goes beyond the scope of this README.

Run locally

$ bb scrape <location-like-city-or-country>

This saves the GitHub profiles as .edn files in the profiles directory.

You can narrow the search by adding a language. Keep in mind what GitHub Support let me know:

When using the language qualifier when searching for users, it will only return users where the majority of their repositories use the specified language. (please, see documentation)

To add a language:

$ bb scrape <location-like-city-or-country> <language>

Be warned! This might not find a PHP dev who switched to Rust recently, as described by GitHub's Support.

Or, if the city is too crowded, try scraping it once per mainstream language.
Watch your rate limits ⚠️
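
Scraping a crowded city language by language can be scripted. This is a minimal sketch, not part of git-hire: `scrape_languages` and the `BB_CMD` variable are hypothetical names, and the only assumption about the tool is the `bb scrape <location> <language>` form shown above.

```shell
# Hypothetical wrapper (not part of git-hire): scrape one location
# once per language, stopping at the first failing run.
# BB_CMD defaults to bb; set BB_CMD=echo for a dry run.
scrape_languages() {
  location="$1"
  shift
  for language in "$@"; do
    ${BB_CMD:-bb} scrape "$location" "$language" || return 1
  done
}
```

For example, `scrape_languages wiesbaden java php javascript` runs three scrapes in a row. Every run counts against the same rate limit, so a longer SLEEP_TIME_SECONDS may be sensible here.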

Once you have built a pool of profiles, use
$ bb search-keyword "rust" or see the examples below.

examples

$ bb scrape mainz
$ bb scrape "Bad Kreuznach"
$ bb scrape wiesbaden java
$ bb scrape wiesbaden php
$ bb scrape mainz javascript

Search in result files (saved profiles)

$ bb search-keyword <search term skill framework else>

examples

$ bb search-keyword android
$ bb search-keyword "ruby on rails"
$ bb search-keyword nuxt

You can go further, e.g. by copying every profile that mentions a keyword into its own directory:

$ mkdir rails; grep -ril --null rails profiles | xargs -0 -I{} cp {} rails/

or by piping to bb again, unimaginable possibilities...

$ bb search-keyword "ios" | bb -e '(map #(str/upper-case %) *input*)'

Inspect Profiles (with examples! 🤯)

$ bb read-profile.clj simonneutert

go further, by piping:

$ bb read-profile.clj simonneutert | bb -e '(:languages *input*)'

then read many profiles

$ bb search-keyword ruby | bb -e '(mapv #(edn/read-string (slurp %)) *input*)'

map out name and bio, where bio is provided

$ bb search-keyword ruby |\
    bb -e '(mapv #(edn/read-string (slurp %)) *input*)' |\
    bb -e '(mapv #(select-keys % [:name :bio]) *input*)' |\
    bb -e '(remove #(nil? (:bio %)) *input*)'

map out name and bio, where bio is provided, filter by bio containing "apple"

$ bb search-keyword ruby |\
    bb -e '(mapv #(edn/read-string (slurp %)) *input*)' |\
    bb -e '(mapv #(select-keys % [:name :bio]) *input*)' |\
    bb -e '(remove #(nil? (:bio %)) *input*)' |\
    bb -e '(filter #(clojure.string/includes? (clojure.string/lower-case (:bio %)) "apple") *input*)' |\
    bb -e '(clojure.pprint/pprint *input*)'

what you came here for 🔥 find all hireable

Searching for the keyword git is a bit of a hack: it matches every profile you have downloaded so far (presumably because each profile contains GitHub URLs).

$ bb search-keyword git |\
    bb -e '(mapv #(edn/read-string (slurp %)) *input*)' |\
    bb -e '(remove #(nil? (:hireable %)) *input*)'
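
Because the profiles are plain text, plain grep can answer the same question without the keyword hack. This is a sketch under one assumption: the saved files contain the literal text `:hireable true`, as in the sample result near the top. `hireable_profiles` is a hypothetical helper name.

```shell
# Hypothetical helper (not part of git-hire): list profile files
# that contain ":hireable true". The directory defaults to ./profiles.
hireable_profiles() {
  grep -rlE ':hireable[[:space:]]+true' "${1:-profiles}"
}
```

Calling `hireable_profiles` prints one matching file path per line, which can then be passed to other tools.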

Find juniors/new-joiners

# using httpie
GITHUB_HIRE_SINCE_YEAR=2019;
GITHUB_HIRE_LOCATION=wiesbaden;
https -A bearer -a ${GITHUB_HIRE_TOKEN} \
  "https://api.github.com/search/users?q=created%3A%3E${GITHUB_HIRE_SINCE_YEAR}-01-01+location%3A${GITHUB_HIRE_LOCATION}+repos%3A%3E1&type=Users" \
  "Accept":"application/vnd.github.v3+json"
# using httpie and jq
GITHUB_HIRE_SINCE_YEAR=2019;
GITHUB_HIRE_LOCATION=wiesbaden;
https -A bearer -a ${GITHUB_HIRE_TOKEN} \
  "https://api.github.com/search/users?q=created%3A%3E${GITHUB_HIRE_SINCE_YEAR}-01-01+location%3A${GITHUB_HIRE_LOCATION}+repos%3A%3E1&type=Users" \
  "Accept":"application/vnd.github.v3+json" |\
  jq '.items | map(select(.type == "User")) | .[] |.repos_url'
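
The same query can be sent with curl, which handles the URL encoding itself. This is a minimal sketch: `new_joiner_query` and `search_new_joiners` are hypothetical helper names, and the `type=Users` parameter from the httpie example is left out on the assumption that the user search endpoint does not need it.

```shell
# Hypothetical helpers (not part of git-hire).
# Build the raw, not yet URL-encoded, search query.
new_joiner_query() {
  printf 'created:>%s-01-01 location:%s repos:>1' "$1" "$2"
}

# Send the query. curl -G appends the --data-urlencode value to the URL
# as a query string and percent-encodes it.
search_new_joiners() {
  curl -sS -G "https://api.github.com/search/users" \
    -H "Authorization: Bearer ${GITHUB_HIRE_TOKEN}" \
    -H "Accept: application/vnd.github.v3+json" \
    --data-urlencode "q=$(new_joiner_query "$1" "$2")"
}
```

For example, `search_new_joiners 2019 wiesbaden | jq '.items[].repos_url'` should give results comparable to the httpie and jq variant.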

FAQ

Some stuff you would want to know/read as a beginner.

Errors

  • REPL fails and outputs
    ; : Can't set!: *current-length* from non-binding thread user

pmap and curl don't play well with each other in the REPL (I guess).
Don't worry, run the tool from the shell:
bb scrape berlin ruby
it will fire up some threads 🔥

CookBook Babashka

https://book.babashka.org/

How to Clojure in VS Code

https://clojure.org/guides/editors#_vs_code_rapidly_evolving_beginner_friendly

"github-username.edn" what am I supposed to do with that? JSON would be much nicer!

CLI to transform between JSON, EDN and Transit, powered with a minimal query language.

https://github.com/borkdude/jet

transform to JSON

$ bb search-keyword ruby |\
    bb -e '(mapv #(edn/read-string (slurp %)) *input*)' |\
    jet --to json
