HowBootcamps Project

Data Engineering Bootcamp Project

📝 Summary

Objective
Scope
Architecture
Next Steps

Objective

This project corresponds to the third and final challenge of the How Bootcamps data engineering bootcamp.

The objective is to provide the media analyst team with quick and unified access to data related to paid media operations. Including, primarily: investment, results, and history of internal KPIs.

In the current scope, the project provides two main benefits:

On-demand Data: Analysts can request campaign data via Slack, without permission restrictions or the need to deal with interfaces from different platforms. The data obtained through this method is current and consolidated;
Historical Operation Data: Managers can track the history of internal control KPIs in a consolidated view with data obtained from the platforms. This creates the opportunity to estimate the possibility of Churn based on unmet goals and low performance;

Scope

The project scope includes extracting campaign data from the Google Ads and Meta Ads media platforms. Operational data, such as internal KPIs, are obtained from Monday.com (project management tool). The data will be made available for consumption through "Slash Commands" in Slack (internal communication tool) and in a dashboard developed with Streamlit.

Repository for the Streamlit dashboard

Architecture

The architecture consists of Python scripts in Google Cloud Functions, with two main endpoints:

"raiox": responsible for responding to requests generated by Slack (via Slash Commands), triggered by Cloud Pub/Sub "topics";
"adsops_snapshot": responsible for the daily data ingestion into the Bucket that feeds the Dashboard, triggered by a Cloud Scheduler job (Daily);

The use of Pub/Sub was necessary to conform to the response model expected by the Slack API.

Flow

Operational data, such as client KPIs or the IDs of ad accounts to be requested, are obtained via "Monday.com". This integrates with the agency's onboarding process flow. With the obtained IDs, it's possible to request campaign data from each client's Meta and Google accounts. Depending on the triggered endpoint, this data is stored in a Bucket to be later consumed by the dashboard, or it is structured into a response message to be published on Slack.

Class Architecture

For both flows, a class architecture was designed to structure data from different media platforms according to the standard required by the frontend.

classDiagram
    MetaBusinessmanager *-- MetaAdaccount
    MetaAdaccount *-- MetaCampaign
    MetaCampaign *-- MetaAdset
    GoogleAccount *-- GoogleCampaign

    class MetaBusinessmanager{
      meta_businessmanager_name :str
      adaccounts : array
      meta_businessmanager_id:str
      set_meta_businessmanager_name()
      set_meta_adaccounts()
      calculate_total_spend()
      calculate_total_budget()
    }
    class MetaAdaccount{
      id:str
      name:str
      owned:boolean
      campaigns:array
    }
    class MetaCampaign{
      daily_results:array
      spend:float
      adsets:array
      id:str
      name:str
      cbo:boolean
      budget:float
      effectiveStatus:str
      setEffectiveStatus()
      add_spend()
      set_budget()
      get_budget()
      add_investment()
      get_total_spend_from_daily()
      get_total_spend()
      get_linear_projection()
      get_total_budget()
    }
    class MetaAdset{
        city  :  str
        set_budget()
        add_spend()
        setEffectiveStatus()
    }
    class GoogleAccount{
      account_id:str
      campaigns:array
      set_google_campaigns()
      get_total_budget()
      get_total_cost()
      calculate_spend_projection()
    }
    class GoogleCampaign{
      campaign_id:str
      campaign_name:str
      cost:float
      budget:float
      effectiveStatus:str
      get_total_budget()
      get_total_cost()
    }

    class MessageBuilder{
      client_name:str
      channel_id:str
      businessmanagers:array
      google_accounts:array
      CANAL_SLACK:str
      count_meta_adaccounts()
      count_meta_campaigns()
      count_google_campaigns()
      set_businessmanagers()
      set_google_accounts()
      calculate_total_projection()
      build_message()
    }

Interfaces

Slack Message

Streamlit Dashboard

Next Steps

Next steps include extending the obtained operational data to also encompass contract information (signing date, commitment period, negotiated values, etc.).
Perform ingestion of historical data prior to the start of daily scheduled extraction (Apr/24);
Expand the data extracted from platforms to also include Impressions, Clicks, and Cost per mille (CPM);
Utilize the longer history and broader dataset to create a model that identifies clients at higher risk of churn;

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
classes		classes
gateways		gateways
platforms		platforms
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

HowBootcamps Project

Data Engineering Bootcamp Project

📝 Summary

Objective

Scope

Architecture

Flow

Class Architecture

Interfaces

Slack Message

Streamlit Dashboard

Next Steps

About

Uh oh!

Releases

Packages

Languages

jgdemattos/raiox

Folders and files

Latest commit

History

Repository files navigation

HowBootcamps Project

Data Engineering Bootcamp Project

📝 Summary

Objective

Scope

Architecture

Flow

Class Architecture

Interfaces

Slack Message

Streamlit Dashboard

Next Steps

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages