thesephist/modelexicon
This AI Does Not Exist 🤖

This AI Does Not Exist generates realistic descriptions of made-up machine learning models. Modelexicon is what it was called before I bought the domain.

A screenshot of the app running on the web

This AI Does Not Exist is built with Oak and Torus. EleutherAI's GPT-J-6B is used for text and code generation at the time of writing, though this may change as state-of-the-art models improve.

How it works

At the core of This AI Does Not Exist are two text generation pipelines:

  • One takes a "model name" and generates a brief description in academic-sounding prose
  • Another takes that model description and writes a Python code snippet that demonstrates how to "use" the described model

These are both generated using a language model called GPT-J-6B, which sits somewhere between the well-known GPT-2 and GPT-3 models in terms of performance.
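The chaining of the two pipelines can be sketched roughly like this. This is a Python sketch with a stand-in generate function; the real app is written in Oak and calls a hosted GPT-J-6B endpoint, and the prompt templates here are illustrative guesses, not the ones the app actually uses:

```python
# Illustrative sketch of the two-stage pipeline: model name -> description -> code.
# `generate` stands in for a call to a hosted language model (e.g. GPT-J-6B);
# it is stubbed out here so the control flow is visible without a network call.

def generate(prompt: str) -> str:
    """Stand-in for a text-generation API call."""
    return f" <completion of: {prompt[:40]}...>"

def describe_model(model_name: str) -> str:
    # Stage 1: turn a model name into academic-sounding prose
    # by completing a leading sentence fragment.
    prompt = f"{model_name} is a machine learning model that"
    return prompt + generate(prompt)

def snippet_for(description: str) -> str:
    # Stage 2: turn the generated description into a Python usage snippet
    # by asking the model to continue past a comment and an import.
    prompt = f"# {description}\nimport torch\n"
    return prompt + generate(prompt)

description = describe_model("DeepHaiku-XL")  # "DeepHaiku-XL" is a made-up example name
code = snippet_for(description)
```

The key point is that stage 2 is conditioned on the output of stage 1, so the generated code snippet stays consistent with the generated description.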

When you first open thisaidoesnotexist.com, the model names you'll see are hand-curated and pre-generated by me. There are a few reasons I chose to pre-generate a set of model names:

  • Most importantly, it saves compute costs. Statistically, most users will click through the first few pre-generated sample models and try one or two model ideas of their own, and some visitors may bounce after seeing only pre-generated examples. I expect the pre-generated model data to cut my API bills by 2-4x, and language models are expensive to run!
  • Manually curating the first few samples ensures that a visitor's first encounter with this project meets a baseline of quality and fun.

The script in scripts/pregenerate_models.oak pre-generates this dataset into models.json, which the server round-robins through at runtime on each request. Any user-entered model names are routed to the text generation APIs as usual.
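The round-robin over the pre-generated dataset can be sketched as follows. This is Python for illustration (the server is written in Oak), and the entry shape shown for models.json is invented, not the file's real schema:

```python
import itertools

# Pre-generated entries; in the real app these would come from models.json,
# produced by scripts/pregenerate_models.oak. The entry shape here is made up.
pregenerated = [
    {"name": "model-a"},
    {"name": "model-b"},
    {"name": "model-c"},
]

# itertools.cycle yields the entries forever, wrapping around at the end:
# a simple round-robin, so each request gets the next pre-generated model.
rotation = itertools.cycle(pregenerated)

def next_pregenerated():
    return next(rotation)
```

Because the rotation wraps around, repeated requests cycle through the whole curated set before repeating any entry.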

Development

Running Modelexicon requires a config.oak configuration file, which currently includes API access information for text generation. There are two text generation backends supported:

  • Huggingface Inference, which requires HuggingfaceURL set to the right model and HuggingfaceToken set to your API key.
  • My personal private language model API, which you probably can't use because you are not me. This requires setting CalamityURL to the API endpoint and CalamityToken to the app-specific API token I generate for my projects.

With these defined in config.oak, running oak src/main.oak should start the app.

Like many of my projects, Modelexicon is built and managed with Oak. There's a short Makefile that wraps common oak commands:

  • make runs the web server, and is equivalent to oak src/main.oak mentioned above
  • make fmt or make f auto-formats any tracked changes in the repository
  • make build or make b builds the client JavaScript bundle from src/app.js.oak
  • make watch or make w watches for file changes and runs make build on any change