Data Engineering with Marcelo Guarido
Mon, Apr 26 | Zoom Webinar
Time & Location
Apr 26, 2021, 5:30 p.m. – 6:30 p.m. MDT
Zoom Webinar
About the Event
Your Zoom link will be sent in the confirmation email after you RSVP.
Slack Channel: #bootcamp_data_engineering_marcelo_guarido
https://gtx2021geothe-t3d1213.slack.com/archives/C01T9MDSVUN
Github Link: https://github.com/gtxdatathon
Data Engineering
In a data science project, data preparation and data engineering account for around 80% of the execution time. A data scientist needs a thorough command of the data, along with the insights that can help improve prediction accuracy, and that is the focus of this course. Knowing the data guides decisions about data visualization, data cleaning, data imputation, and feature engineering.
During the data engineering course, we will work through a Python notebook that uses well logs as an example, focusing on the following steps:
- Visualization: how to plot well logs
- Data cleaning: filtering bad data
- Data imputation: fill in missing data using mean or median values
- Feature engineering: create new logs based on the existing ones
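As a rough illustration of the cleaning, imputation, and feature engineering steps listed above, here is a minimal pandas sketch. It is not the course notebook: the small table is synthetic, the column names ("Well Name", "Depth", "GR", "PE") are assumed to resemble typical well-log headers, the -999.0 null marker is an assumed convention, and the helper functions clean, impute_median, and add_features are hypothetical names chosen for this example.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a well-log table; all values are made up.
logs = pd.DataFrame({
    "Well Name": ["A"] * 5 + ["B"] * 5,
    "Depth": [1000.0, 1000.5, 1001.0, 1001.5, 1002.0] * 2,
    "GR": [75.0, 80.0, -999.0, 90.0, 85.0, 60.0, 62.0, 64.0, 66.0, 68.0],
    "PE": [3.0, np.nan, 3.4, 3.6, 3.8, 4.0, 4.2, np.nan, 4.6, 4.8],
})


def clean(df, null_value=-999.0):
    """Data cleaning: turn the assumed null marker into NaN, then drop
    rows where the gamma-ray log (GR) is missing."""
    out = df.replace(null_value, np.nan)
    return out.dropna(subset=["GR"]).reset_index(drop=True)


def impute_median(df, column):
    """Data imputation: fill missing values with the median of the same well."""
    out = df.copy()
    well_median = out.groupby("Well Name")[column].transform("median")
    out[column] = out[column].fillna(well_median)
    return out


def add_features(df):
    """Feature engineering: add the depth gradient of GR, computed per well.
    The first sample of each well has no previous sample, so it stays NaN."""
    out = df.copy()
    by_well = out.groupby("Well Name")
    out["GR_grad"] = by_well["GR"].diff() / by_well["Depth"].diff()
    return out


prepared = add_features(impute_median(clean(logs), "PE"))
print(prepared)
```

Imputing per well rather than over the whole table is one design choice among several; a global mean or median, as mentioned in the list above, would be a one-line change to impute_median.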
The example data comes from the 2016 SEG ML Contest (learn more about the contest at https://github.com/seg/2016-ml-contest).
Data and scripts will soon be available in this Google Drive folder:
https://drive.google.com/drive/folders/1-kRulkJSTKOesTfeBfm2uoO6Rk6QZw86?usp=sharing
Marcelo Guarido is a data scientist with a background in physics and geophysics and years of experience researching and applying machine learning algorithms to different types of datasets, including images, time series, and geophysical and petrophysical data, among others. Marcelo worked for 8+ years at oil and gas companies, mostly in seismic acquisition and processing at PGS and Schlumberger. During the last five years, he has focused on machine learning and data science research. He worked at Verdazo Analytics, researching machine learning applied to drilling and petrophysical data. Marcelo is currently the head of the CREWES Data Science Initiative and is involved in the research, training, consulting, and mentoring of oil and gas projects and professionals. He holds a BS degree in Physics and an MSc degree in Geophysics from the University of São Paulo (Brazil), and a Ph.D. degree from the University of Calgary.