JengaAI: Multi-Task NLP Framework for Sustainable Development & Security

Tagline: "Making Advanced NLP Accessible to Everyone, Everywhere"

JengaAI democratizes advanced NLP across Kenya and the wider African continent through a multi-task learning framework built with African contexts in mind. Using mathematical fusion techniques, it enables a single model to handle multiple critical tasks simultaneously, from Swahili sentiment analysis and agricultural disease detection to cybersecurity threat identification and NLP for governance.

This framework directly addresses Kenya's national priorities through four transformative applications:

National Security & Governance: Unified threat detection (cyber attacks, hate speech, misinformation), multi-task document processing, and real-time Swahili-English security monitoring.
Sustainable Economic Development: Agricultural NLP (crop disease prediction, market analysis), financial inclusion (M-Pesa transaction analysis, fraud detection), and small business AI empowerment.
Public Policy & Political Stability: Multi-dimensional public sentiment analysis, policy document classification, and unified monitoring of development goals.
Homegrown Generative AI: African-context aware language models that understand local idioms, culturally relevant content generation, and Swahili-English code-switching comprehension.

Core Features

Multi-Task Learning: Train one model, or several, on multiple NLP tasks at once, saving time and resources.
High Efficiency: Sharing one encoder across tasks reduces cost and speeds up inference compared with running a separate model per task.
Extensible Task Library: A growing collection of pre-built tasks for classification, NER, QA, and more.
Fusion Mechanisms: State-of-the-art techniques for combining and sharing information between tasks.
African Context-Awareness: Models designed to understand the nuances of African languages and contexts.
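
To make the fusion idea above more concrete, here is a minimal sketch of attention-style fusion in plain Python. It is an illustration under assumptions, not the framework's implementation: the function `attention_fuse`, the per-task query vectors, and the toy token vectors are all hypothetical. Each task scores the shared token vectors against its own query and pools them with the resulting softmax weights.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_fuse(token_vectors, task_query):
    """Pool shared token vectors into one task-specific vector.

    Hypothetical helper: uses scaled dot-product scores between a
    per-task query and each shared token vector.
    """
    dim = len(task_query)
    scores = [dot(vec, task_query) / math.sqrt(dim) for vec in token_vectors]
    weights = softmax(scores)
    fused = [
        sum(w * vec[i] for w, vec in zip(weights, token_vectors))
        for i in range(dim)
    ]
    return fused, weights


# Toy shared-encoder output (3 tokens, 2 dimensions) and two task queries.
shared_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
task_queries = {"sentiment": [2.0, 0.0], "ner": [0.0, 2.0]}

fused = {
    name: attention_fuse(shared_tokens, query)
    for name, query in task_queries.items()
}
```

Both tasks read the same token vectors, but each one attends to the dimensions its query emphasizes, so the encoder is shared while the pooled representation stays task-specific.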

Technical Innovation

We build on attention fusion mechanisms and uncertainty-weighted multi-task learning, two research techniques that automatically balance task importance while preserving individual task expertise. Unlike imported Western models, JengaAI is designed to understand that "sukuma wiki" refers to both kale and economic resilience.
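
As a rough illustration of uncertainty weighting, the sketch below uses a commonly used simplified form of the objective: the total loss is the sum over tasks of `exp(-s_i) * L_i + s_i`, where `s_i` is a learned log-variance for task `i`. The function names and the fixed toy losses are assumptions made for this example. In real training the task losses change every step and `s_i` is updated together with the model weights.

```python
import math


def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses as sum_i exp(-s_i) * L_i + s_i."""
    return sum(
        math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars)
    )


def log_var_gradients(task_losses, log_vars):
    """Gradient of the combined loss with respect to each s_i."""
    return [1.0 - math.exp(-s) * loss for loss, s in zip(task_losses, log_vars)]


# Toy setting: task losses held fixed so only the weights are learned.
task_losses = [4.0, 0.5]
log_vars = [0.0, 0.0]

for _ in range(500):
    grads = log_var_gradients(task_losses, log_vars)
    log_vars = [s - 0.1 * g for s, g in zip(log_vars, grads)]

# Effective weight applied to each task loss.
task_weights = [math.exp(-s) for s in log_vars]
```

With the losses held fixed, each weight `exp(-s_i)` settles near `1 / L_i`, so the task with the larger loss is down-weighted and no single task dominates the update.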

Project Vision & Future Roadmap

Our Vision: To build the leading ecosystem for developing, deploying, and sharing AI solutions tailored for African languages and contexts. We aim to empower a new generation of African developers, researchers, and entrepreneurs by providing powerful, accessible tools.

Our Roadmap: The JengaAI framework is designed to evolve. Our future plans focus on integrating Large Language Models (LLMs) to unlock new capabilities.

Phase 1: LLMs as "Reasoning Heads"
- Concept: Enhance existing tasks by replacing simple output layers with small, fine-tuned LLMs, dramatically improving their reasoning and few-shot learning capabilities.
Phase 2: Multi-Task Fine-Tuning of a Core LLM
- Concept: Adapt the framework to fine-tune a single, powerful, open-source LLM on multiple tasks at once, creating a versatile model that is an expert in specific African domains.
Phase 3: JengaAI as an Agentic Framework
- Concept: Evolve the multi-task model into the "brain" of an AI agent. This will enable complex workflows where one task's output (e.g., threat detection) triggers another (e.g., entity extraction), which then feeds a generative LLM to produce a summary or an alert.
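
The Phase 3 workflow could look roughly like the sketch below. Everything here is a hypothetical stand-in: `detect_threat`, `extract_entities`, and `write_alert` are simple stubs where a trained classification head, an NER head, and a generative LLM would sit, and `AgentResult` is an illustrative container, not part of the framework.

```python
from dataclasses import dataclass, field


@dataclass
class AgentResult:
    is_threat: bool
    entities: list = field(default_factory=list)
    alert: str = ""


def detect_threat(text):
    """Stand-in for a threat classification head (keyword stub)."""
    keywords = ("phishing", "malware", "ransomware")
    return any(word in text.lower() for word in keywords)


def extract_entities(text):
    """Stand-in for an NER head: capitalized tokens after the first word."""
    tokens = [tok.strip(".,!?") for tok in text.split()]
    return [tok for tok in tokens[1:] if tok[:1].isupper()]


def write_alert(entities):
    """Stand-in for a generative LLM call (template stub)."""
    named = ", ".join(entities) if entities else "no named entities"
    return f"ALERT: possible threat involving {named}."


def run_agent(text):
    """Chain the tasks: detection gates extraction, which feeds the alert."""
    if not detect_threat(text):
        return AgentResult(is_threat=False)
    entities = extract_entities(text)
    return AgentResult(True, entities, write_alert(entities))


result = run_agent("A phishing campaign is targeting M-Pesa users in Nairobi.")
print(result.alert)
```

The design point is the gating: entity extraction and alert generation run only when the first task fires, which keeps the more expensive steps off the common path.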

This roadmap transforms JengaAI from a powerful training framework into a comprehensive, end-to-end platform for building the next generation of context-aware AI in Africa.

Phase 4: Behavioral Cloning for Agentic AI
- Concept: Extend the agentic framework to incorporate Behavioral Cloning, enabling the multi-task model to learn complex action sequences by observing expert demonstrations. This would involve:
  - Textual Observations: Representing environmental states or agent perceptions as textual input for the shared encoder.
  - Action Prediction Heads: Developing new task-specific heads for predicting discrete (classification) or continuous (regression) actions.
  - Integration with Agentic Workflows: Allowing the cloned behaviors to drive actions within the JengaAI agent, enabling it to perform tasks like navigation, interaction, or complex decision-making based on learned policies.
- Benefit: This phase would allow JengaAI to move beyond purely analytical NLP tasks into interactive and autonomous agent capabilities, particularly useful for applications requiring automated responses or control in simulated or real-world environments, driven by natural language understanding.
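
A minimal sketch of the behavioral cloning idea, assuming a discrete action head: textual observations are turned into bag-of-words features, and a softmax classifier is fit to expert (observation, action) pairs with cross-entropy. The vocabulary, action names, demonstrations, and function names are all invented for illustration. In the framework the features would presumably come from the shared encoder rather than word counts.

```python
import math


def featurize(observation, vocab):
    """Bag-of-words counts for a textual observation."""
    tokens = observation.lower().split()
    return [float(tokens.count(word)) for word in vocab]


def action_probs(weights, features):
    """Softmax over one linear score per action."""
    logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def clone_policy(demos, vocab, actions, epochs=200, lr=0.5):
    """Fit the action head to expert demonstrations with cross-entropy SGD."""
    weights = [[0.0] * len(vocab) for _ in actions]
    for _ in range(epochs):
        for observation, expert_action in demos:
            x = featurize(observation, vocab)
            probs = action_probs(weights, x)
            target = actions.index(expert_action)
            for a in range(len(actions)):
                grad = probs[a] - (1.0 if a == target else 0.0)
                for j in range(len(vocab)):
                    weights[a][j] -= lr * grad * x[j]
    return weights


def act(weights, observation, vocab, actions):
    """Pick the most probable action for a new observation."""
    probs = action_probs(weights, featurize(observation, vocab))
    return actions[probs.index(max(probs))]


vocab = ["door", "closed", "open", "wall"]
actions = ["open_door", "move_forward", "turn_left"]
demos = [
    ("door closed ahead", "open_door"),
    ("door open ahead", "move_forward"),
    ("wall ahead", "turn_left"),
]

policy = clone_policy(demos, vocab, actions)
chosen = act(policy, "the door is closed", vocab, actions)
```

A continuous action head would follow the same pattern with a regression loss in place of cross-entropy.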

Project Architecture

The framework is organized as follows:

```
multitask_bert/
├── analysis/         # Attention visualization and model analysis
├── core/             # Core components: model, fusion, config
├── data/             # Data processing and loading pipelines
├── deployment/       # Inference and model exportation
├── tasks/            # Definitions for all NLP tasks
├── training/         # Trainer, callbacks, and data modules
└── utils/            # Utility functions
```

Getting Started

Prerequisites

Python 3.9+
pip

Installation

Clone the repository:

```
git clone https://github.com/your-repo/JengaAI.git
cd JengaAI
```

Create and activate a virtual environment:

```
python -m venv venv
source venv/bin/activate
```

Install the dependencies (note: requirements.txt is currently empty and will be populated with the necessary packages):
```
pip install -r requirements.txt
pip install -e .
```

Quickstart

Here is a conceptual example of how to use the framework to train a model on two different tasks:

```python
from multitask_bert.data import DataProcessor
from multitask_bert.tasks import ClassificationTask, NERTask
from multitask_bert.training import Trainer

# 1. Define your tasks
classification_task = ClassificationTask(
    name="SwahiliSentiment",
    label_map={0: "Negative", 1: "Positive"}
)
ner_task = NERTask(
    name="SecurityThreats",
    label_map={0: "O", 1: "B-Threat", 2: "I-Threat"}
)

# 2. Load and process your data
data_processor = DataProcessor()
train_data = data_processor.load_data(
    tasks=[classification_task, ner_task],
    train_files=["sentiment_data.csv", "threat_data.jsonl"]
)

# 3. Configure and run the trainer
trainer = Trainer(
    tasks=[classification_task, ner_task],
    model_name_or_path="bert-base-multilingual-cased",
    train_data=train_data
)

trainer.train()
```

Community & Contribution

JengaAI is an open-source project, and we welcome contributions from the community. Whether you're a developer, a researcher, or just passionate about our mission, there are many ways to get involved:

Contribute Code: Help us build new features, fix bugs, or improve the framework.
Add a Task: Introduce a new NLP task to the framework.
Improve Documentation: Help us make our documentation clearer and more comprehensive.
Share Models: Train and share models for different African languages and contexts.

Please read our CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.

License

This project is licensed under the MIT License. See the LICENSE file for details.
