An analytics engine for HADO AR sports using Python and Scikit-learn to balance gameplay, classify player demographics via telemetry, and generate real-time performance rankings.

Hypurl/hado-telemetry-report

HADO AR Sport Analytics Engine

Role: Machine Learning Engineer (Intern) | Company: meleap Inc. (Shanghai)

Tech Stack: Python, Scikit-learn, Pandas, NumPy, Matplotlib, Seaborn

⚠️ NDA Notice

Due to Non-Disclosure Agreements (NDA), the raw source code, datasets, and specific internal balancing parameters cannot be shared publicly. This repository serves as a high-level overview of the system architecture, methodologies, and outcomes achieved during my tenure.

📖 Project Overview

This portfolio entry outlines the analytics engine developed during my internship at meleap Inc. for HADO, a global Augmented Reality (AR) techno-sport.

The project analyzed high-dimensional telemetry data to solve three critical business challenges:

  1. Game Balancing: Identifying "meta" hero archetypes and ensuring competitive fairness.
  2. Player Categorization: Algorithmic classification of user demographics (e.g., "Pro Child" vs. "Coach") using gameplay metrics.
  3. Performance Benchmarking: Real-time generation of "S+ through F" rankings to provide players with actionable feedback.

The system processed data from a global player base, using unsupervised learning models to benchmark player agility and support consistent competitive standards.


πŸ—οΈ System Architecture & Modules

1. Unsupervised Learning & Meta Analysis

Script Reference: hero_meta_clustering.py, hero_3d_visualization.py

To understand player behavior, I engineered an unsupervised learning model using K-Means clustering and Principal Component Analysis (PCA). This module analyzed 4D hero statistics (Attack, Defense, Healing, Control) to identify playstyle clusters.

  • Dimensionality Reduction: Applied PCA to visualize high-dimensional telemetry, revealing distinct hero archetypes.
  • Meta Visualization: Created 3D/4D visualizations to audit the game for "overpowered" strategies.
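
The clustering workflow described above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the NDA-protected code: the `cluster_heroes` function, the three archetype centers, and the choice of three clusters are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

FEATURES = ["attack", "defense", "healing", "control"]  # the four stats named above


def cluster_heroes(stats, n_clusters=3, seed=0):
    """Standardize 4D hero stats, cluster with K-Means, and project to 3D with PCA."""
    scaled = StandardScaler().fit_transform(stats)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(scaled)
    pca = PCA(n_components=3, random_state=seed)
    coords = pca.fit_transform(scaled)  # 3D coordinates suitable for a scatter plot
    return labels, coords, pca.explained_variance_ratio_


# Synthetic stand-in data: three hypothetical archetypes, 20 heroes each.
rng = np.random.default_rng(0)
centers = np.array([
    [9.0, 2.0, 1.0, 3.0],  # attack-heavy (assumed)
    [2.0, 9.0, 3.0, 2.0],  # defense-heavy (assumed)
    [3.0, 3.0, 9.0, 5.0],  # support-style (assumed)
])
stats = np.vstack([c + rng.normal(scale=0.5, size=(20, len(FEATURES))) for c in centers])

labels, coords, variance_ratio = cluster_heroes(stats)
print("cluster sizes:", np.bincount(labels))
print("variance explained by 3 components:", round(float(variance_ratio.sum()), 3))
```

Standardizing before K-Means keeps any one stat from dominating the distance metric. A cluster that sits far from the others in the projected space would be a candidate for a closer balance review.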

2. Player Categorization Engine (K-Value Analysis)

Script Reference: demographic_classification_model.py, statistical_fairness_validator.py

This module validated the use of "K-Value" (a proprietary performance metric) as a predictor for player demographics.

  • Statistical Validation: Performed ANOVA (F=8.1788, p<0.0001) and Kruskal-Wallis tests to confirm that hero selection significantly impacts K-Value performance.
  • Demographic Classification: Developed decision boundaries using Random Forest and Decision Tree classifiers to categorize users into groups based on telemetry ranges.
  • Key Findings:
      • Coaches: Highest mean K-Value (~0.040), significantly outperforming other groups.
      • Pro Children: Distinct performance tier (~0.030) compared to general users.
      • General Users: Baseline K-Value (~0.022).
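
As a rough sketch of this kind of analysis, the snippet below runs a one-way ANOVA and a Kruskal-Wallis test across groups, then fits a shallow decision tree to obtain K-Value cut-offs. The data is synthetic: the group means follow the approximate figures listed above, while the spread, sample sizes, and tree depth are assumptions. It will not reproduce the reported F statistic.

```python
import numpy as np
from scipy.stats import f_oneway, kruskal
from sklearn.tree import DecisionTreeClassifier

# Synthetic K-Values: means follow the approximate figures listed above;
# the spread (0.005) and sample size (50) are assumptions for illustration.
rng = np.random.default_rng(42)
GROUP_MEANS = {"general": 0.022, "pro_child": 0.030, "coach": 0.040}
samples = {name: rng.normal(mean, 0.005, size=50) for name, mean in GROUP_MEANS.items()}

# Parametric (ANOVA) and rank-based (Kruskal-Wallis) tests for a group difference.
f_stat, anova_p = f_oneway(*samples.values())
h_stat, kruskal_p = kruskal(*samples.values())

# A shallow tree on the single K-Value feature yields readable range cut-offs.
X = np.concatenate(list(samples.values())).reshape(-1, 1)
y = np.repeat(list(samples.keys()), 50)
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
is_split = tree.tree_.feature >= 0  # leaf nodes carry a negative feature index
thresholds = np.sort(tree.tree_.threshold[is_split])

print(f"ANOVA: F={f_stat:.2f}, p={anova_p:.2e}")
print(f"Kruskal-Wallis: H={h_stat:.2f}, p={kruskal_p:.2e}")
print("K-Value cut-offs:", np.round(thresholds, 4))
```

Running both tests is a common safeguard: Kruskal-Wallis does not assume normally distributed K-Values, so agreement between the two makes the conclusion less dependent on that assumption.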

3. Real-Time Performance Analytics

Script Reference: realtime_scoring_engine.py, baseline_etl_processor.py

A production-ready evaluation engine that provides real-time feedback.

  • ETL Pipeline: Cleans raw CSV logs and generates JSON lookup tables for fast retrieval.
  • Scoring System: Benchmarks individual matches against global percentiles, assigning roles (e.g., "Apex Predator", "Guardian") and ranks (S+ to F).
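
One way the percentile lookup could work is sketched below. The rank ladder between S+ and F, the percentile cut-offs, and the `build_lookup` and `rank_match` helpers are hypothetical; only the "S+ to F" range and the JSON lookup idea come from the description above. Role assignment is omitted.

```python
import json
from bisect import bisect_right

import numpy as np

# Hypothetical rank ladder and percentile cut-offs; the real tiers and
# thresholds are not public.
RANKS = ["F", "D", "C", "B", "A", "S", "S+"]
CUT_PERCENTILES = [10, 30, 50, 70, 90, 99]


def build_lookup(baseline_scores):
    """Precompute the score threshold at each cut percentile as a JSON string."""
    thresholds = np.percentile(baseline_scores, CUT_PERCENTILES)
    return json.dumps({"ranks": RANKS, "thresholds": thresholds.tolist()})


def rank_match(score, lookup_json):
    """Map one match score to a rank using the precomputed thresholds."""
    lookup = json.loads(lookup_json)
    # bisect_right counts how many thresholds the score meets or exceeds.
    return lookup["ranks"][bisect_right(lookup["thresholds"], score)]


baseline = np.arange(1, 101)  # stand-in for cleaned historical match scores
lookup_json = build_lookup(baseline)

for score in (5, 50, 95, 100):
    print(score, rank_match(score, lookup_json))
```

Precomputing the thresholds offline means a live match only needs a binary search over a handful of numbers, which is what keeps the real-time path cheap.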

4. Data Sanitation & Integrity

Script Reference: data_sanitation_pipeline.py

Implemented rigorous cleaning protocols to handle noise in the accelerometer and gameplay logs.

  • Outlier Detection: Utilized Interquartile Range (IQR) and Isolation Forest methods to remove anomalous data points caused by sensor noise or bugged matches.
  • Logic Checks: Filtered impossible values to ensure training data quality.
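
The cleaning steps above might be combined along these lines. This is a sketch under assumptions: the `iqr_mask` and `clean_matches` helpers, the two synthetic feature columns, the non-negativity rule, and the 5% contamination rate are all illustrative choices, not the production settings.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def iqr_mask(values, k=1.5):
    """True where a value lies within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values >= q1 - k * iqr) & (values <= q3 + k * iqr)


def clean_matches(features, contamination=0.05, seed=0):
    """Return a boolean mask of rows passing logic, IQR, and Isolation Forest checks."""
    features = np.asarray(features, dtype=float)
    plausible = np.all(features >= 0, axis=1)  # hypothetical logic check: no negatives
    within_iqr = np.all(
        [iqr_mask(features[:, j]) for j in range(features.shape[1])], axis=0
    )
    forest = IsolationForest(contamination=contamination, random_state=seed)
    inlier = forest.fit_predict(features) == 1  # fit_predict returns -1 for outliers
    return plausible & within_iqr & inlier


# Synthetic matches with two made-up metrics, plus three obviously bugged rows.
rng = np.random.default_rng(0)
normal = rng.normal(loc=[50.0, 10.0], scale=[5.0, 2.0], size=(200, 2))
bugged = np.array([[500.0, 10.0], [50.0, 200.0], [-20.0, 10.0]])
mask = clean_matches(np.vstack([normal, bugged]))
print("kept", int(mask.sum()), "of", mask.size, "rows")
```

The IQR rule catches extreme values in a single column, while Isolation Forest can flag rows whose combination of values is unusual even when each column looks normal on its own.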

🚀 Impact

  • Production Deployment: The clustering and scoring models were integrated into the backend, enabling automated skill assessment for users.
  • Data-Driven Balancing: Provided the development team with quantitative evidence (ANOVA results) to adjust hero parameters, ensuring a fair experience for the player base.
  • Automated Coaching: The system successfully distinguishes between "Pro Child" and "Amateur" playstyles, allowing for targeted algorithmic coaching.
