Introductory Guide to Using Stata in Empirical Financial Accounting Research

Posted by David Veenman - May 02, 2017

Most academics use either SAS or Stata for their data processing and analysis. The big advantage of SAS is that it can connect to the WRDS server to access databases and can even run all data processing and calculations on the WRDS cloud. The disadvantage of SAS is that it is like a dinosaur: it’s old and big (12 gigs on my computer, which took hours to install…), and it has a relatively steep learning curve. I only use SAS to download some datasets which are not (easily) available through the WRDS web requests, such as CRSP value-weighted size decile returns (CRSP file “erdport1” or “ermport1”). Nowadays the WRDS servers are accessible through R or Python shells as well, which reduces the need for a SAS installation.

I’m a big fan of Stata. Thus far, it has helped me solve virtually every research problem I have run into. Beyond that, the big advantage of Stata is that the learning curve is not so steep. Programming in Stata is extremely intuitive and does not require any prior knowledge of programming (unlike SAS), just common sense and careful thought.

In 2010, I decided to write down my knowledge of Stata for archival accounting research settings in a document. At that point, the main purpose of the document was to help MSc students at the University of Amsterdam get acquainted with the program and get a better understanding of how to merge, reshape, and clean data before running statistical analyses for their MSc thesis. Over the years, the guide has successfully helped hundreds of MSc students at UvA and Erasmus get the empirical work done for their theses. These students are generally scared of data/statistics (and really do not want to become academics!), but many of them ultimately get the hang of it when using Stata. In addition, the guide has been successfully used in several PhD programmes and conferences around the world.

Below is a link to the Stata guide. Note that although I am sharing the guide with the community, it’s far from perfect. It’s just a summary of my experience with some specific examples. Also note that while the latest version of the guide dates back to December 2013, the plan is to update the guide some time later this calendar year. In this regard, any suggestions for additions/corrections are very welcome (email me). Lastly, you are welcome to use this guide in your MSc/PhD programmes to help your students figure out how to use Stata in their projects. Please let me know if you do so, so I can keep track of where and how it’s used. Please do not post this guide on a public server. Thanks in advance!

Please find the guide on this page:
Introductory Guide to Using Stata in Empirical Financial Accounting Research

This guide serves to assist MSc and PhD students in using Stata for empirical financial accounting research. Stata is a powerful program that can be used to analyze any research question in “empirical archival financial accounting” research. While there are many statistical packages out there (e.g., SPSS, or even Excel), the major advantage of Stata is that it allows you to manage your data, compute key variables needed in empirical research (e.g., discretionary accruals), and easily merge large sets of data (e.g., combining financial statement data from Compustat with analyst forecast data from I/B/E/S). The purpose of this guide is to give students a head start in using Stata in empirical financial accounting research settings.

