2018/2019, Semester 1

School of Business (Analytics & Operations)

Modular Credits: 4

The module starts with basic Python programming. It then walks through statistics topics, from visualizing and summarizing data, to estimating model parameters and testing hypotheses, to linear regression analysis. Python illustrations and experiments are interwoven into each topic to help students better appreciate statistical theory and understand how it works in practice. The module finishes with the practical issues of acquiring, cleaning, and organizing data using Python. This completes the data analysis cycle, so that students are able to carry out a basic data analytics project independently.

**Basic Python Programming**

- Data Structures
- Flow Control
- Functions and Packages
- Object-Oriented Programming

**Understanding Data and Visualization**

- Summary Statistics and Empirical Cumulative Distribution Function
- Data Visualization: Histogram, Scatterplot, Boxplot, Line plot
- Python Implementation: Matplotlib, Seaborn and NumPy packages
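
As an illustration of the kind of exercise this topic covers, the following is a minimal sketch of computing summary statistics and an empirical cumulative distribution function with NumPy. The function name `ecdf` and the sample values are assumptions made up for this example; they are not taken from the module materials.

```python
import numpy as np


def ecdf(values):
    """Return the sorted sample and the ECDF value at each sorted point."""
    x = np.sort(np.asarray(values, dtype=float))
    # Assuming no tied values, the i-th smallest observation has a
    # fraction i/n of the sample at or below it.
    y = np.arange(1, x.size + 1) / x.size
    return x, y


# Hypothetical sample, invented for illustration only.
scores = [81, 62, 90, 75, 78]

x, y = ecdf(scores)
summary = {
    "mean": float(np.mean(scores)),
    "median": float(np.median(scores)),
    "std": float(np.std(scores, ddof=1)),  # sample standard deviation
}
```

Plotting `x` against `y` as a step plot (for example with Matplotlib) would show the ECDF graphically.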

**Statistical Concepts and Inference Techniques**

- Sampling and Population
- Parameter Estimation
- Confidence Intervals
- Hypothesis Testing
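
To give a flavour of how parameter estimation and confidence intervals can be explored in Python, here is a minimal sketch that estimates a population mean and builds an approximate confidence interval around it. It assumes a normal approximation (a z-interval rather than a t-interval) and uses only the standard library; the function name `mean_confidence_interval` and the data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, level=0.95):
    """Approximate confidence interval for the mean (normal approximation)."""
    n = len(sample)
    centre = mean(sample)
    # Standard error of the mean, using the sample standard deviation.
    std_err = stdev(sample) / math.sqrt(n)
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return centre - z * std_err, centre + z * std_err


# Hypothetical sample, invented for illustration only.
waiting_times = [4.1, 5.3, 3.8, 6.0, 4.7, 5.1, 4.4, 5.6]

low, high = mean_confidence_interval(waiting_times)
low99, high99 = mean_confidence_interval(waiting_times, level=0.99)
```

For a sample this small, a t-based interval would be somewhat wider; the normal approximation is used here only to keep the sketch dependency-free.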

**Linear Regression Analysis**

- Model Assumptions and Interpretations
- Categorical Variables and Interaction Effects
- Model Selection

**Advanced Python**

- Pandas Package
- Obtaining data from the Internet
- Data Cleaning

In addition, several ethical issues will be discussed throughout the semester. The specific topics are as follows:

- Ethics for Data Visualization
- Ethics for Data Collection and Analysis
- Ethics for Making Generalizations Based on Sample Data

Class Participation 10%

Group Project 20%

Take-home Quiz 10%

Assignments 20%