A gentle introduction to data science: variables that affect student success in California public schools (K-12)

Betza and Angel will walk you step by step through applying data analysis, data science, and data engineering to public datasets, in order to understand how financial investment and other factors affected students’ academic outcomes in California in 2016.

We’ll go over the ins and outs of this study and touch on each of these phases:

  • Data Analysis and Data Science process
      • Data gathering
      • Cleaning best practices
      • EDA (exploratory data analysis) - basic and extensive
      • Statistical evaluation - p-value, pairwise comparisons, chi-squared
      • Predictive modeling - is simple linear regression or multiple regression valid for this data?
  • Data Engineering process
      • Containerization
      • Cleaning methodology
      • Presenting the data
      • Engineering practices

Some of the tools we’ll be showcasing: Scientific Python, Docker, Jupyter Notebooks, Streamlit, Git, and AWS EC2/DigitalOcean.