The data consists of website data provided by Indent Labs' NovelGens, which generates fictional creatures using AI.
Indent Labs used artificial intelligence to create 365 creatures in 2022 — one new, unique creature every day. Each creature was assigned a commonality, biome, tag, and paragraph of content. Users then had the ability to claim the creature, like the creature, create artwork of the creature, and/or write a story about the creature.
The website collected anonymous feedback from users about each creature, asking: Is this creature memorable? Scary? Sad? Powerful? Unique? Comical? Comforting? Dangerous?
After producing creatures and collecting data for a year, Indent Labs seeks to better understand the trends between the qualitative data and the user-submitted feedback. They want to understand:
- Which creatures claim superlatives in each of the feedback categories? What factors contribute to this?
- What trends exist between creatures that share types, tags, and/or environments? What is the most common feedback associated with each of these data points?
- What factors influence users to interact with each creature, either by claiming it, liking it, or submitting new content about it?
I received 20 .csv files from Indent Labs. I cleaned them using Google Sheets, eliminating unnecessary data, filling empty data, and reorganizing their structure before importing 16 tables to PostgreSQL.
- content_favorites: The "likes" given to each creature, consisting of 5 columns and 408 rows.
- content_feedback: User-submitted feedback about each creature, consisting of 18 columns and 4,652 rows.
- creature_environment_pairings: The environments selected for each creature, consisting of 5 columns and 655 rows.
- creature_story_pairings: The user-submitted stories and their respective creature(s), consisting of 5 columns and 20 rows.
- creature_tag_pairings: The tags selected for each creature, consisting of 3 columns and 527 rows.
- creatures: Published and drafted creatures, consisting of 8 columns and 369 rows.
- environment_story_pairings: The environments selected for user-submitted stories, consisting of 5 columns and 2 rows.
- environment_submitted_image_pairings: The environments selected for user-submitted images, consisting of 5 columns and 38 rows.
- environments: All possible environments for creatures, consisting of 5 columns and 18 rows.
- ownership_contracts: Creatures that have been claimed by users, consisting of 6 columns and 3,050 rows.
- profiles: Profile pages for website users, consisting of 7 columns and 154 rows.
- submitted_content_kudos: Kudos given by users to submitted stories and images by other users, consisting of 7 columns and 135 rows.
- submitted_images: Images submitted by users about the creatures, consisting of 9 columns and 53 rows.
- submitted_stories: Stories submitted by users about the creatures, consisting of 7 columns and 20 rows.
- tags: All possible tags for creatures, consisting of 5 columns and 98 rows.
- users: Encyrpted data on website users, consisting of 9 columns and 154 rows.
Please reach out to me if you have any questions or feedback.