Psychology 840: Computational Statistics---Fall, 2014
Note: While the syllabus here appears complete, that is because it has been "brought over" from previous years. Looking ahead any more than a couple of weeks may lead to files that have not yet been revised for 2014.
Last revised 10/14/14
Time, Place: 9:00-11:30 Mondays, 347 Davie
Instructor: David Thissen
Tentative Schedule:
|
Date |
Topic/Readings |
Other Materials |
|||
|
August 25 |
Introduction
A bit on early computing in the Davie Hall Thurstone Lab is here.
Leland Wilkinson on The Future of Statistical Computing.
Here is the glossary on C++, and here is a tutorial.
Horton, N.J., Brown, E.R., & Qian, L. (2004). Use of R as a toolbox for mathematical statistical exploration. The American Statistician, 58, 343-357. |
The class introduction and Computing History presentations, and the notes on VS installation, are clickable links to .pdf files.
For future classes: R can be downloaded from mirrors (top entry, left side navigation bar; choose one in the USA!).
Windows users: Install Visual Studio Professional 2013 (which is free, with registration, for students), or the free Visual Studio 2013 Express for Windows Desktop, or Visual C++ Express (VS 2010) if you don't meet the system requirements for VS 2013. Various kinds of registration with MS will be involved.
Mac users: If you run OS X 10.8 or 10.9, acquire Xcode from the App Store (free).
|
|||
|
September 8 September 15 |
Regression: Data Manipulation, Matrix Operations---R
Bock, R.D. (1975). Chapter 4 from Multivariate statistical methods for the behavioral sciences. New York: McGraw-Hill.
Readings that may be useful anytime: Bock, R.D. (1975). Chapter 2 from Multivariate statistical methods for the behavioral sciences. New York: McGraw-Hill. Bock, R.D. (1993). Chapter 2 from the unpublished drafts of Item Response Theory.
Two books that have interesting sections on matrix differentiation and the derivatives for the least squares solution to regression are Searle (1982) Matrix Algebra Useful for Statistics (section here) and Schott (1997) Matrix Analysis for Statistics (section here).
Feel free to suggest additional books (or the appendices common in introductory graduate statistics texts) as alternative presentations of matrix algebra. |
Exercise 4.1-3 (the green bean problem) on pp. 207-208 of Chapter 4 is required. Use any software you like for the computation (but modifying my R code is recommended). Due Monday, September 22.
The Keynote presentation is here (in .pdf format).
The canned regression, matrix regression, and graphics R files, and the data, are clickable here.
Optional homework exercises on regression are here. |
|||
|
September 15 September 22 |
Regression: Data Manipulation, Matrix Operations---C++
Documentation for the NewMat10 matrix library from Robert Davies.
The .zip archive of my Mac OS X Xcode folder for the NewMat library is here. |
A downloadable pre-created .zip archive of a Visual Studio 2013 project to build the (modified) newmat10 library is here. For older versions of VS, a downloadable .zip archive of the entire VS 2005 project for the NewMat10D library is still available and will probably update upon opening.
The .zip file containing the classed regression .h and .cp files is here.
The collection of trial C++ that works up to the classed regression is here.
More optional homework exercises for regression are here.
We might consider (largely optionally) the Scythe Statistical Library as well. The source for that, as a collection of .h files, is here, as a .zip file with DOS line-ends. |
|||
|
September 29 |
IRT: Estimating Theta
Thissen, D., & Orlando, M. (2001). Item response theory for items scored in two categories. In D. Thissen & H. Wainer (Eds.), Test Scoring. Hillsdale, NJ: Lawrence Erlbaum Associates. (Ch. 3) |
The Keynote presentation is here (in .pdf format).
The IRT R file is clickable here.
Optional homework exercises for IRT scoring are here. |
|||
|
September 29 (?) October 6 |
IRT: Estimating Theta Using C++. |
The Keynote presentation is here (in .pdf format).
The .zip file of the C++ source files and item parameter files is clickable here.
Documentation for a somewhat more elaborate version of IRTScore is here.
More optional homework exercises for scoring are here. |
|||
|
October 13 October 20 (?) |
Fechner/Thurstone Scaling
Bock, R.D. & Jones, L.V. (1968). Chapter 2 and part of 3 from The measurement and prediction of judgment and choice. San Francisco, CA: Holden-Day. |
One required (and some optional) homework exercises for scaling / probit / logit analysis.
The Keynote presentation is here (in .pdf format).
The Bock & Jones class R file is clickable here.
The .zip file of the C++ source files for the NormalProb function development is clickable here.
The .zip file of the C++ source files for the probit regression program is clickable here.
A web-obtained image of Abramowitz & Stegun Page 932 is clickable here. |
|||
October 20 (part 2) |
Probit MCMC
Johnson, V.E. & Albert, J.H. (1999). Chapter 1, Chapter 2, and Chapter 3 from Ordinal Data Modeling. New York, NY: Springer.
Specifically, pp. 53 and 58-62 of Chapter 2, and pp. 75-86 and 90-92 describe our topics. Chapter 1 and sections 2.1-2.3 are excellent background on Bayesian inference, using likelihood topics we have discussed. |
The Keynote presentation is here (in .pdf format).
The MCMC R file is clickable here.
Optional homework exercises for probit MCMC are here. |
|
|||
October 27 |
IRT item parameter estimation I
Cai, L., & Thissen, D. (in press). Modern approaches to parameter estimation in item response theory. In S.P. Reise & D.A. Revicki (Eds.), Handbook of item response theory modeling: Applications to typical performance assessment. New York: Taylor & Francis (Routledge).
Bock, R.D. & Lieberman, M. (1970). Fitting a response model for n dichotomously scored items. Psychometrika, 35, 179-197.
Bock, R.D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: An application of the EM algorithm. Psychometrika, 46, 443-459.
Thissen, D. (1982). Marginal maximum likelihood estimation for the one-parameter logistic model. Psychometrika, 47, 175-186. |
The Keynote presentation is here (in .pdf format).
The R files for Bock & Lieberman with numerical derivatives, Bock & Lieberman with analytical derivatives and Fisher scoring Hessian (roll our own Newton-Raphson), a Bock & Lieberman-type algorithm for the 2PL, and Bock & Aitkin 2PL with numerical derivatives are clickable here. |
|
|||
November 3 (part 1) |
IRT item parameter estimation II |
The Keynote presentation is here (in .pdf format).
The R files for Bock & Aitkin 2PL with the empirical Hessian, and Bock & Aitkin 2PL with the expected value of the Hessian, are clickable here.
The zipped source files for the C++ implementation of the Bock-Aitkin algorithm are clickable here. |
|
|||
November 3 (part 2) |
IRT item parameter estimation III
Albert, J.H. (1992). Bayesian estimation of normal ogive item response curves using Gibbs sampling. Journal of Educational Statistics, 17, 251-269.
Patz, R.J. & Junker, B.W. (1999a). A straightforward approach to Markov chain Monte Carlo methods for item response theory. Journal of Educational and Behavioral Statistics, 24, 146-178.
Cowles, M.K. (2004). Review of WinBUGS 1.4. The American Statistician, 58, 330-336. |
The Keynote presentation is here (in .pdf format). The R file for
Albert's algorithm is clickable here. Patz & Junker's S-Plus code is "mcmcirt.zip" on this page. |
|
|||
November 10 November 17 |
Estimation for exploratory and confirmatory factor analysis
Bock, R.D., & Bargmann, R. (1966). Analysis of covariance structures. Psychometrika, 31, 507-534. The 1965 L.L. Thurstone Psychometric Laboratory Research Memorandum version of this article is here.
Jennrich, R.I. & Robinson, S.M. (1969). A Newton-Raphson algorithm for maximum likelihood factor analysis. Psychometrika, 34, 111-123.
Joreskog, K.G. (1969). A general approach to confirmatory maximum likelihood factor analysis. Psychometrika, 34, 183-202.
Joreskog, K.G. (1971). Statistical analysis of sets of congeneric tests. Psychometrika, 36, 109-133. (Also an excerpt from a LISREL manual.)
Rubin, D.B. & Thayer, D.T. (1982). EM algorithms for ML factor analysis. Psychometrika, 47, 69-76.
Some reading that may be useful, especially for Rubin & Thayer: Bock, R.D. (1975). Chapter 3 sections 3-4-5 from Multivariate statistical methods for the behavioral sciences. New York: McGraw-Hill. |
The Keynote presentation is here (in .pdf format).
The zip file containing the R code examples is clickable here.
The Bock & Bargmann presentation is a clickable link to a .pdf file.
The derivative-free and derivative-based R files are clickable links to text files, as are the links to the .h and .cpp files for the C++.
Optional homework exercises for this topic.
The source files for a minimal start on a C++ implementation of confirmatory factor analysis inspired by Joreskog (1969, 1971) are here. |
|
|||
November 24 December 1 |
Your presentations |
|
|
|||
Requirements, grading, and stuff: There will be no tests. There will be homework assignments, of two kinds: required and optional. There will be two or three assignments required of everyone. In addition, at many classes there will be a list of optional assignments provided. Over the course of the semester, each student will be required to do (at least) three of the optional assignments. This will yield a total of five (5) or six (6) homework assignments: two or three required plus three optional. Assignments will involve some level of computer programming, and a 2-4 page written (typed, please, thank you) report. The report must describe the programming in readable English, and include the results. The programming may be done collaboratively (indeed, that is encouraged); however, each student must complete a unique individual report.
A report on a programming project of your own choosing is also required. These projects may be done individually or in pairs (teams of two are recommended), with brief oral presentations on November 24 or December 1. We will discuss this aspect of the course in more detail sometime in October.
Class participation: For each week, the readings listed above will serve as the topical focus. This semester this class is a work in progress, representing a departure from previous incarnations of the course. As such, we will be open to suggestions and alternative reformulations as we proceed.