Competition Summary - Pseudo Sonic Log Generation
Place | Winner Team | Contact |
---|---|---|
1st Place | UTFE | Wen Pan ([email protected]), Tianqi Deng ([email protected]), Honggeun Jo ([email protected]), Javier Santos ([email protected]) |
2nd Place | iwave | Lei Fu ([email protected]) |
3rd Place | RockAbusers | Arkhat Kalbekov ([email protected]), Valeria Suarez ([email protected]) |
4th Place | StuckAtHome | |
5th Place | SedStrat | Epo Prasetya Kusumah ([email protected]), Mohammad Aviandito ([email protected]), Yogi Pamadya ([email protected]) |
The root mean squared error (RMSE) is calculated from the DTC and DTS values of the hidden dataset.
Rank | Team Name | Best Score | Best Solution | Notebook |
---|---|---|---|---|
1 | UTFE | 12.35942 | Neural Network | Notebook |
2 | iwave | 12.55189 | LSTM | Notebook |
3 | RockAbusers | 13.2136 | Random Forest | Notebook |
4 | StuckAtHome | 13.43166 | Tree Ensemble | Notebook |
5 | SedStrat | 13.84585 | Ensemble | Notebook |
6 | RocketTeam | 14.83064 | LSTM | Notebook |
7 | iPetro | 15.38718 | Neural Network | Notebook |
8 | Oilers | 15.75537 | XGBoost | Notebook |
9 | DataDrivenPancakes | 16.31731 | Ensemble | Notebook |
10 | TeamTriumphant | 16.41215 | LGBM | Notebook |
11 | TheMeanSquares | 16.60382 | LGBM | Notebook |
12 | Explorum | 16.70458 | Random Forest | Notebook |
13 | MSArchie | 16.9674 | Ensemble | Notebook |
14 | MLogging | 16.98075 | Ensemble | Notebook |
15 | TrashPandas | 17.27522 | Tree Ensemble | Notebook |
16 | TeamTF | 17.47539 | Tree Ensemble | Notebook |
17 | PDDA | 17.92553 | Random Forest | Starter_Yu.ipyb |
18 | UNDFightingHawks | 20.23271 | Random Forest | Notebook |
19 | DoaIbu | 20.34702 | Ensemble | Notebook |
20 | TensorITB | 23.92497 | MultiOutputRegressor | Notebook |
- | Synergy | 14.28895 | - | - |
- | LACrew | 15.61239 | - | - |
- | DATUM | 15.93848 | - | - |
- | Curiosity | 15.96676 | - | - |
- | Diagenesis | 16.58438 | - | - |
- | SubsurfaceIntelligence | 16.92818 | - | - |
- | Colonels | 17.22655 | - | - |
- | HoustonEnergyTeam | 17.30373 | - | - |
- | TeamCGG | 17.38406 | - | - |
- | IIT Roorkee | 19.12469 | - | - |
- | GUCoders | 22.91161 | - | - |
Well logs are interpreted and processed to estimate in-situ petrophysical and geomechanical properties, which are essential for subsurface characterization. Various types of logs exist, and each provides distinct information about subsurface properties. Certain well logs, like gamma ray (GR), resistivity, density, and neutron logs, are considered “easy-to-acquire” conventional well logs and are run in most wells. Other well logs, like nuclear magnetic resonance, dielectric dispersion, elemental spectroscopy, and sometimes sonic logs, are run in only a limited number of wells.
Sonic travel-time logs contain critical geomechanical information for subsurface characterization around the wellbore. Sonic logs are often required to complete the well-seismic tie workflow or to predict geomechanical properties. When sonic logs are absent in a well or an interval, a common practice is to synthesize them from neighboring wells that have sonic logs. This is referred to as sonic log synthesis or pseudo sonic log generation.
Compressional travel-time (DTC) and shear travel-time (DTS) logs are not acquired in all the wells drilled in a field due to financial or operational constraints. Under such circumstances, machine learning techniques can be used to predict DTC and DTS logs to improve subsurface characterization. The goal of “SPWLA’s 1st Petrophysical Data-Driven Analytics Contest” is to develop data-driven models by processing “easy-to-acquire” conventional logs from Well #1 and to use those models to generate synthetic compressional and shear travel-time logs (DTC and DTS, respectively) in Well #2. A robust data-driven model for sonic-log synthesis will yield low prediction errors, quantified as the Root Mean Squared Error (RMSE) between the synthesized and the original DTC and DTS logs.
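The RMSE metric described above can be sketched as follows. This is a minimal illustration, not the organizers' scoring script: it assumes the errors of both logs are pooled into a single score, i.e. RMSE = sqrt( sum of squared DTC and DTS errors / (2N) ) for N depth samples. The function name `rmse` and the sample values are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Pooled RMSE over paired (DTC, DTS) rows.

    Assumption: DTC and DTS errors are pooled into one score; the
    official scoring may combine the two logs differently.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    squared_errors = [
        (t - p) ** 2
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Placeholder (DTC, DTS) pairs, NOT taken from the competition data.
measured = [(80.0, 150.0), (90.0, 170.0)]
synthesized = [(83.0, 154.0), (90.0, 170.0)]
score = rmse(measured, synthesized)
print(score)
```

A lower value means the synthesized logs track the measured logs more closely.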
You are provided with two datasets: the Well #1 dataset and the Well #2 dataset. You need to build generalizable data-driven models using the Well #1 dataset. You will then deploy these models on the Well #2 dataset to synthesize DTS and DTC logs. The models should use feature sets derived from the following seven logs: Caliper, Neutron, Gamma Ray, Deep Resistivity, Medium Resistivity, Photoelectric Factor, and Density. The models should synthesize two target logs: DTC and DTS.
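The train-on-Well-#1, predict-on-Well-#2 workflow can be sketched with a small k-nearest-neighbors regressor. This is only an illustrative baseline under stated assumptions, not any team's solution: the feature names in `FEATURES`, the helper functions, and all numeric values are hypothetical placeholders, and a real entry would load the competition files instead.

```python
import math

# Hypothetical names for the seven input logs listed in the task description.
FEATURES = ["caliper", "neutron", "gamma_ray", "deep_resistivity",
            "medium_resistivity", "photoelectric_factor", "density"]

def fit_scaler(rows):
    """Return per-feature means and standard deviations of the training rows."""
    n = len(rows)
    means = [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]
    stds = []
    for j, m in enumerate(means):
        variance = sum((r[j] - m) ** 2 for r in rows) / n
        stds.append(math.sqrt(variance) or 1.0)  # guard against constant logs
    return means, stds

def scale(row, means, stds):
    """Standardize one row using statistics learned from Well #1 only."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]

def knn_predict(train_X, train_y, query, k=2):
    """Average the (DTC, DTS) targets of the k nearest training samples."""
    nearest = sorted((math.dist(x, query), i) for i, x in enumerate(train_X))[:k]
    targets = [train_y[i] for _, i in nearest]
    return tuple(sum(t[j] for t in targets) / len(targets) for j in range(2))

# Placeholder values only, NOT taken from the competition data.
well1_logs = [
    [8.5, 0.20, 60.0, 2.0, 2.1, 3.0, 2.45],
    [8.6, 0.25, 80.0, 1.5, 1.6, 3.2, 2.40],
    [8.7, 0.10, 30.0, 10.0, 9.0, 4.5, 2.60],
    [8.5, 0.12, 35.0, 8.0, 8.5, 4.4, 2.58],
]
well1_targets = [(85.0, 160.0), (95.0, 190.0), (65.0, 120.0), (67.0, 124.0)]
well2_logs = [[8.6, 0.11, 32.0, 9.0, 8.8, 4.5, 2.59]]

# Fit on Well #1, then synthesize (DTC, DTS) for each Well #2 sample.
means, stds = fit_scaler(well1_logs)
train_X = [scale(r, means, stds) for r in well1_logs]
well2_pred = [knn_predict(train_X, well1_targets, scale(r, means, stds))
              for r in well2_logs]
print(well2_pred)
```

The scaler is fitted on Well #1 alone and then reused for Well #2, so no information from the target well leaks into training.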
Petrophysical Data-Driven Analytics (PDDA), a special interest group under the Society of Petrophysicists and Well Log Analysts (SPWLA), is announcing its first machine learning contest in 2020! The contest is open to all SPWLA members (including student members) and to anyone interested in machine learning applications in petrophysics.
Start Date: March 1, 2020
Team Registration Deadline: March 31, 2020 11:59 PM CST
Entry Deadline: April 30, 2020 11:59 PM CST
End Date (Final Submission of Code Deadline): May 7, 2020 11:59 PM CST
Please send your team name, team members, contact info, and affiliation to [email protected]. The official competition website is https://github.com/pddasig/Machine-Learning-Competition-2020.
You cannot register from multiple accounts and therefore you cannot submit from multiple accounts.
The maximum team size is 5.
Your submission needs to follow the same format as the ‘sample_submission.csv’ file provided on the competition website. The final ranking is based on the RMSE score on the hidden dataset.
A blind test dataset, drawn from 20% of the hidden dataset, will be released after the registration deadline for your own evaluation. You may check your model's performance against it as many times as you want.
Please note that the purpose of the released dataset is to provide a validation tool for checking the performance of your model. In a real application no such data would exist, since we would not have access to the new well's data. Therefore, please do not use this data to train your model.
You may select up to 3 submissions for judging before the entry deadline; the best score (lowest RMSE) will be used for your rank. You must submit your runnable code in Notebook/Jupyter Notebook format before the end date. Any code submission that has severe bugs or produces results different from your data entry will not be ranked or awarded.
Please make sure to use "random_state" or "SEED" for all the steps that involve randomization in your model; this ensures that the judges obtain the same result when they run your code.
Privately sharing code or data outside of teams is not permitted. It's okay to share code if made available to all participants on the competition Github repository.
You should NOT use any dataset during the training other than the one provided by the committee.
Any violation of the above will be regarded as cheating and not ranked or awarded.
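The reproducibility rule above (a fixed "random_state" or "SEED" for every randomized step) can be illustrated with a seeded train/validation split. This is a minimal sketch using only the Python standard library; the function name `split_indices`, the seed value, and the 20% validation fraction are assumptions chosen for illustration. Libraries such as scikit-learn expose the same idea through a `random_state` argument.

```python
import random

SEED = 42  # any fixed integer works; the value itself is arbitrary

def split_indices(n_samples, validation_fraction=0.2, seed=SEED):
    """Shuffle sample indices with a seeded generator and split them."""
    rng = random.Random(seed)  # private generator, unaffected by global state
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_val = int(n_samples * validation_fraction)
    return indices[n_val:], indices[:n_val]

# Two independent runs with the same seed yield identical splits.
train_a, val_a = split_indices(100)
train_b, val_b = split_indices(100)
print(train_a == train_b and val_a == val_b)
```

Passing the seed explicitly into each randomized step, rather than relying on global state, keeps the result stable even if other code draws random numbers in between.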
COMPETITION TITLE: Pseudo Sonic Log Generation
COMPETITION ORGANIZER: SPWLA – PDDA SIG
COMPETITION WEBSITE: https://github.com/pddasig/Machine-Learning-Competition-2020
You can submit an "Issues" ticket to the repository if you find any problem with the competition or would like to raise a discussion topic.
Total award: $1500
Rank | Prize |
---|---|
1st Place | $500 |
2nd Place | $400 |
3rd Place | $300 |
4th Place | $200 |
5th Place | $100 |
The top 5 winning teams will be awarded prizes (NOT in cash).
Novel and practical algorithms will be recommended for a submission to the next SPWLA special issue by PDDA.
The data comes from the VOLVE dataset owned by Equinor.
DATA ACCESS AND USE: Creative Commons Attribution-NonCommercial-ShareAlike license.
ENTRY IN THIS COMPETITION CONSTITUTES YOUR ACCEPTANCE OF THESE OFFICIAL COMPETITION RULES.
The Competition named above is a skills-based competition to promote and further the field of data science. You must submit your registration to [email protected] to enter. Your competition submissions ("Submissions") must conform to the requirements stated on the Competition Website. Your Submissions will be scored based on the evaluation metric described on the Competition Website. Subject to compliance with the Competition Rules, Prizes, if any, will be awarded to participants with the best scores, based on the merits of the data science models submitted. Check the competition website for the complete Competition Rules.
Yanxiang Yu, Chicheng Xu, Siddharth Misra, Weichang Li, Michael Ashby, Brendon Hall, Yan Xu, Oghenekaro Osogba