This project corresponds to the third and final challenge of the How Bootcamps data engineering bootcamp.
The objective is to provide the media analyst team with quick and unified access to data related to paid media operations. Including, primarily: investment, results, and history of internal KPIs.
In the current scope, the project provides two main benefits:
- On-demand Data: Analysts can request campaign data via Slack, without permission restrictions or the need to deal with interfaces from different platforms. The data obtained through this method is current and consolidated;
- Historical Operation Data: Managers can track the history of internal control KPIs in a consolidated view with data obtained from the platforms. This creates the opportunity to estimate the possibility of Churn based on unmet goals and low performance;
The project scope includes extracting campaign data from the Google Ads and Meta Ads media platforms. Operational data, such as internal KPIs, are obtained from Monday.com (project management tool). The data will be made available for consumption through "Slash Commands" in Slack (internal communication tool) and in a dashboard developed with Streamlit.
Repository for the Streamlit dashboard
The architecture consists of Python scripts in Google Cloud Functions, with two main endpoints:
- "raiox": responsible for responding to requests generated by Slack (via Slash Commands), triggered by Cloud Pub/Sub "topics";
- "adsops_snapshot": responsible for the daily data ingestion into the Bucket that feeds the Dashboard, triggered by a Cloud Scheduler job (Daily);
The use of Pub/Sub was necessary to conform to the response model expected by the Slack API.
Operational data, such as client KPIs or the IDs of ad accounts to be requested, are obtained via "Monday.com". This integrates with the agency's onboarding process flow. With the obtained IDs, it's possible to request campaign data from each client's Meta and Google accounts. Depending on the triggered endpoint, this data is stored in a Bucket to be later consumed by the dashboard, or it is structured into a response message to be published on Slack.
For both flows, a class architecture was designed to structure data from different media platforms according to the standard required by the frontend.
classDiagram
MetaBusinessmanager *-- MetaAdaccount
MetaAdaccount *-- MetaCampaign
MetaCampaign *-- MetaAdset
GoogleAccount *-- GoogleCampaign
class MetaBusinessmanager{
meta_businessmanager_name :str
adaccounts : array
meta_businessmanager_id:str
set_meta_businessmanager_name()
set_meta_adaccounts()
calculate_total_spend()
calculate_total_budget()
}
class MetaAdaccount{
id:str
name:str
owned:boolean
campaigns:array
}
class MetaCampaign{
daily_results:array
spend:float
adsets:array
id:str
name:str
cbo:boolean
budget:float
effectiveStatus:str
setEffectiveStatus()
add_spend()
set_budget()
get_budget()
add_investment()
get_total_spend_from_daily()
get_total_spend()
get_linear_projection()
get_total_budget()
}
class MetaAdset{
city : str
set_budget()
add_spend()
setEffectiveStatus()
}
class GoogleAccount{
account_id:str
campaigns:array
set_google_campaigns()
get_total_budget()
get_total_cost()
calculate_spend_projection()
}
class GoogleCampaign{
campaign_id:str
campaign_name:str
cost:float
budget:float
effectiveStatus:str
get_total_budget()
get_total_cost()
}
class MessageBuilder{
client_name:str
channel_id:str
businessmanagers:array
google_accounts:array
CANAL_SLACK:str
count_meta_adaccounts()
count_meta_campaigns()
count_google_campaigns()
set_businessmanagers()
set_google_accounts()
calculate_total_projection()
build_message()
}
- Next steps include extending the obtained operational data to also encompass contract information (signing date, commitment period, negotiated values, etc.).
- Perform ingestion of historical data prior to the start of daily scheduled extraction (Apr/24);
- Expand the data extracted from platforms to also include Impressions, Clicks, and Cost per mille (CPM);
- Utilize the longer history and broader dataset to create a model that identifies clients at higher risk of churn;


