Statistical learning, also known as machine learning, plays a pivotal role in the modern world: forecasting models are routinely used in politics, economics, biology, and popular culture. Existing materials for teaching statistical learning often reduce it to a set of recipes, without explaining its fundamentals and limitations and without using active learning strategies. As a consequence of this gap, practitioners such as statistics instructors and students use statistical learning models without a clear understanding of how they work, which can lead to erroneous models.
Various types of statistical learning models can be implemented in R, a free statistical programming language, in several different ways. Each approach produces a substantially different R program, and the lack of a unified technique across model types leads to code that is difficult to understand and hard to maintain. The tidymodels framework in R encourages the use of sound statistical principles and a standard syntax, facilitating the creation and maintenance of correct statistical learning models.
We propose to create and disseminate active learning materials that introduce college students to the principles of statistical learning using the R programming language and the tidymodels framework.