The statistical package for the module is Stata version 10 (version 11 is now available, but version 10 is the one installed in the computer lab).
Stata is a powerful statistical package for managing, analyzing, and graphing data. It has three major strengths: data manipulation, statistics, and graphics. It is an excellent tool for data manipulation: moving data from external sources into the program, cleaning it up, generating new variables, generating summary data sets, merging data sets and checking for merge errors, collapsing cross-section time-series data on either of its dimensions, and reshaping data sets from “long” to “wide” or back. This makes Stata well suited to answering ad hoc questions about any aspect of the data.
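The data-manipulation steps described above can be sketched in a few commands. This is only an illustration: it uses the auto data set that ships with Stata, and the variable name price_k and the small income panel typed in with input are made up for the example.

```stata
* Load an example data set that ships with Stata
sysuse auto, clear

* Clean up: inspect the data and drop observations with a missing repair record
describe
drop if missing(rep78)

* Generate a new variable (price_k is a made-up name): price in thousands of dollars
generate price_k = price / 1000

* Build a summary data set: mean price and mileage by origin,
* then return to the full data
preserve
collapse (mean) price_k mpg, by(foreign)
list
restore

* Reshape a small, made-up panel from "long" to "wide" and back
clear
input id year income
1 2008 50
1 2009 55
2 2008 40
2 2009 42
end
reshape wide income, i(id) j(year)
list
reshape long income, i(id) j(year)
list
```

After reshape wide, each id occupies one row with the variables income2008 and income2009; reshape long restores one row per id and year.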
This statistical package has some advantages that make it very practical and useful:
(i) It is eminently portable, and its developers are committed to cross-platform compatibility. Stata is a full-featured statistical programming language for Windows, Macintosh, Unix, and Linux.
(ii) It is an excellent tool for data manipulation: moving data from external sources into the program, cleaning it up, generating new variables, generating summary data sets, reshaping data sets, etc.
(iii) In terms of statistics, Stata provides all of the standard univariate, bivariate and multivariate statistical tools, from descriptive statistics and t-tests through one-, two- and N-way ANOVA, regression, principal components, and the like.
(iv) Stata’s regression capabilities are full-featured, including regression diagnostics, prediction, robust estimation of standard errors, instrumental variables / two-stage least squares, seemingly unrelated regressions / three-stage least squares, longitudinal models, and the like.
(v) Stata has commands for any type of analysis we may perform (cross section, time series, survival data, etc.). We must use the command family that matches the type of data we are dealing with. For instance, the “xt” commands are for cross-section/time-series or panel (longitudinal) data; the “svy” commands handle survey data with complex sampling designs; and the “st” commands are for survival-time data and duration models, among others.
(vi) Stata supports reproducibility. You can create log files that keep a record of your Stata sessions: they contain the commands you use and their output. If you want to save a procedure you applied to a data set in order to run it again, or to share it with colleagues, you can store the commands in another type of file, the do-file.
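To make points (iii) to (vi) concrete, here is a minimal sketch of a do-file that opens a log, runs a few standard analyses, and uses one command family. The file names session.do and session.log and the variable name price_hat are hypothetical, and the example assumes the auto and cancer data sets that ship with Stata are available through sysuse.

```stata
* session.do: a hypothetical do-file; run it with "do session.do"

* Start a plain-text log that records every command and its output
capture log close
log using session.log, replace text

sysuse auto, clear

* Descriptive statistics and a two-sample t-test
summarize price mpg weight
ttest mpg, by(foreign)

* Linear regression with robust standard errors, then fitted values
regress price mpg weight foreign, robust
predict price_hat, xb

* A command family: declare survival-time data first,
* then use the "st" commands on it
sysuse cancer, clear
stset studytime, failure(died)
stcox age

log close
```

The same pattern applies to the other families: panel data are declared with xtset before the “xt” commands are used, and survey designs are declared with svyset before the “svy” commands.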
In addition to the lab sessions and handouts available on IVLE, the Stata resource page at the University of California Los Angeles (UCLA) website is excellent for learning Stata:
If you want your own copy of Stata, you can buy a student version (GradPlan) for as little as US$49:
Please contact/email Ruth for details.
Useful text on Stata:
Baum, Christopher (2006). An Introduction to Modern Econometrics Using Stata. Stata Press. [Available at Law Library RBR.]