Introduction to Statistics and Applications within Data Science
Basic information
Course Title |
Introduction to Statistics and Applications within Data Science |
Instructor |
Manuel González Canché, Associate Professor, Quantitative Methods and Policy Analysis, University of Pennsylvania, USA |
Prerequisites |
No specific preparation is required before the start of the programme, although students are encouraged to familiarize themselves with the basic concepts of data science analytic tools. |
Required Text & Tools |
Textbook: Wickham, H., & Grolemund, G. (2016). R for data science: Import, tidy, transform, visualize, and model data. O'Reilly Media. https://r4ds.had.co.nz/index.html
Silge, J., & Robinson, D. (2017). Text mining with R: A tidy approach. O'Reilly Media. https://www.tidytextmining.com/
Beck, M. W. (2018). NeuralNetTools: Visualization and analysis tools for neural networks. Journal of Statistical Software, 85(11), 1–20. https://doi.org/10.18637/jss.v085.i11
Hvitfeldt, E., & Silge, J. (2022). Supervised machine learning for text analysis in R. Chapman and Hall/CRC. https://smltar.com/
Tools: Students will use R (or Python) statistical software. |
Grading Criteria |
Class participation and attendance: 60%; two exercises: 20%; final exam: 20% |
Course Key Words |
Data science, statistical modeling, data analytics, data mining, data retrieval, advanced data visualizations |
Schedule
Session | Topic | Date | Beijing Time |
Day 1 | Introduction: Relevance of data science | 6/26/2023 | 19:20-20:55 |
Day 2 | Introduction to R | 6/27/2023 | 19:20-20:55 |
Day 3 | Data visualization and processing. Chapter 3, Wickham & Grolemund: https://r4ds.had.co.nz/data-visualisation.html | 6/28/2023 | 19:20-20:55 |
Day 4 | Feature engineering. Chapter 5, Wickham & Grolemund: https://r4ds.had.co.nz/transform.html | 6/29/2023 | 19:20-20:55 |
Day 5 | Text mining and natural language processing. Chapter 1, Silge & Robinson: https://www.tidytextmining.com/tidytext.html | 7/3/2023 | 19:20-20:55 |
Day 6 | Machine learning and unsupervised learning. Chapter 6, Silge & Robinson: https://www.tidytextmining.com/topicmodeling.html | 7/4/2023 | 19:20-20:55 |
Day 7 | Supervised learning. Chapter 7, Hvitfeldt & Silge: https://smltar.com/mlclassification.html#classfirstattemptlookatdata | 7/5/2023 | 19:20-20:55 |
Day 8 | Deep learning and neural networks. Beck (2018): https://www.jstatsoft.org/article/view/v085i11 | 7/6/2023 | 19:20-20:55 |
Course description
This course is an introduction to statistics and data science applications. It is highly applied: in each session we will first work through the topic under discussion and then show how to implement the analyses in R. All data and code will be provided to class participants, and our class discussions will highlight best practices used by data scientists and academics alike.
Objectives
1. This course will offer detailed explanations and examples of statistical modeling with data science and visualization.
2. This course will give students an opportunity to develop research skills, including different methods for collecting, exploring, and analyzing data.
3. This course will engage students in applying what they learn in practice, so that they can accomplish the learning goals and develop research proposals that use the concepts and methods discussed in the course.
4. Participants are expected to become well positioned as competitive applicants to undergraduate or graduate programs where data science tools are valued and sought after. They may also seek employment as data scientists.