File: notation.tex

package info (click to toggle)
matchit 2.4-13-2
  • links: PTS
  • area: main
  • in suites: squeeze
  • size: 2,040 kB
  • ctags: 51
  • sloc: makefile: 24; csh: 13
file content (151 lines) | stat: -rw-r--r-- 7,927 bytes parent folder | download | duplicates (3)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
\documentclass[oneside,letterpaper,titlepage,12pt]{article}
%\usepackage[ae,hyper]{/usr/lib/R/share/texmf/Rd}
\usepackage{makeidx}
\usepackage{graphicx}
\usepackage{natbib}
\usepackage[reqno]{amsmath}
\usepackage{amssymb}
\usepackage{verbatim}
\usepackage{epsf}
\usepackage{url}
\usepackage{html}
\usepackage{dcolumn}
\usepackage{longtable}
\usepackage{vmargin}
\setpapersize{USletter}
\newcolumntype{.}{D{.}{.}{-1}}
\newcolumntype{d}[1]{D{.}{.}{#1}}
%\pagestyle{myheadings}
\htmladdtonavigation{
  \htmladdnormallink{%
    \htmladdimg{http://gking.harvard.edu/pics/home.gif}}
  {http://gking.harvard.edu/}}
\newcommand{\hlink}{\htmladdnormallink}

\bodytext{ BACKGROUND="http://gking.harvard.edu/pics/temple.setcounter"}
\setcounter{tocdepth}{3}

\parindent=0cm
\newcommand{\MatchIt}{\textsc{MatchIt}}

\begin{document}

\begin{center}
Notation for \MatchIt \\
Elizabeth Stuart \\
\end{center}

This document details the notation to be used in the matching paper to accompany \MatchIt,
as discussed in conference call on April 27, 2004 and emails on April 27-29, 2004. \\

We first provide a general idea of the notation and ideas.  Formal notation follows.

\begin{enumerate}
\item  There exists in nature fixed values, $\theta_{1i}$ and $\theta_{0i}$, the
potential outcomes under treatment and control, respectively.  They, or
more usually their difference, are our quantities of interest and hence
the inferential target in this exercise.  They are not generally known.
There also exist in nature a vector of fixed values $X_i$ that are known and
will play the role of covariates to condition on.  Note that the use of $X_i$ in matching assumes that they are not 
affected by treatment assignment: $X_{1i}=X_{0i}=X_i$.  If a researcher is interested in adjusting for a variable potentially affected by treatment
assignment, then methods such as principal stratification should be used. [We can write a paragraph on that somewhere.]

\item  Then (i.e., only after step 1) $T_i$ is created, preferably by the
experimenter, who assigns its values randomly, but in some cases $T_i$ is
created by the world and we hope that it was created effectively randomly
(or randomly conditional on the observed $X_i$).  This is what is known as the assumption of 
unconfounded treatment assignment: That treatment assignment is independent of
the potential outcomes given the covariates $X_i$. 

\item  Finally, the observed outcome variable $Y_i$ is calculated
deterministically as $Y_i = \theta_{1i}T_i + \theta_{0i}(1-T_i)$.  This is
an accounting identity, not a regression equation with an error term;
i.e., it is just true, not really an assumption.
\end{enumerate}

The accounting identity in 3 implies that we observe $\theta_{1i}=Y_i$ (but
not $\theta_{0i}$) when $T_i=1$, and we observe $\theta_{0i}=Y_i$ (but not
$\theta_{1i}$) when $T_i=0$.  Since the fundamental problem of causal
inference indicates that either $\theta_{0i}$ or $\theta_{1i}$ will be
unobserved for each $i$, we will need to infer their values, for which we
define $\tilde\theta_{1i} \sim p(\theta_{1i})$  and $\tilde\theta_{0i} \sim
p(\theta_{0i})$ as draws of the potential outcomes under treatment and
control, respectively, from their respective predictive posterior
distributions.\\

We now detail the notation utilized.  Fixed, but sometimes unknown values are represented by Greek letters.  Draws from the posterior
distribution of these unknown values are represented by a tilde over the variable of interest.  Fixed, but always known values are represented
by capital Roman letters.
We first define notation for individual $i$:
\begin{itemize}
\item $\theta_{1i}=$ individual $i$'s potential outcome under treatment 
\item $\theta_{0i}=$ individual $i$'s potential outcome under control
\item $T_i=$ individual $i$'s observed treatment assignment 
\begin{itemize} \item $T_i=1$ means individual $i$ receives treatment
		\item $T_i=0$ means individual $i$ receives control
\end{itemize}
\item $\theta_{0i}$ and $\theta_{1i}$ are considered fixed quantities.  Which one is observed depends on the random variable $T_i$.
\item $Y_i=T_i \theta_{1i} + (1-T_i) \theta_{0i}$ is individual $i$'s observed outcome
\item Let $Y_{1i}=(\theta_{1i}|T_i=1)$.  Similarly, $Y_{0i}=(\theta_{0i}|T_i=0)$.  One of these is observed for individual $i$.  [Do we actually want this
notation?  It's not quite right since the capital Roman letter would imply that $Y_{1i}$ is always observed, but the notation may be handy (see below).]
\item Let $\tilde{\theta}_{0i}$ be a draw from the posterior distribution of $\theta_{0i}$: $\tilde{\theta}_{0i} \sim p(\theta_{0i})$. 
Similarly, let  $\tilde{\theta}_{1i}$ be a draw from the posterior distribution of $\theta_{1i}$: $\tilde{\theta}_{1i} \sim p(\theta_{1i})$.
\item Consider variables $X_i$.  If $X_{0i}=X_{1i}=X_i$ then $X$ is a ``proper covariate'' in that it is not affected by treatment assignment.  Only proper
covariates should be used in the matching procedure.  That $X_i$ is not affected by treatment assignment 
is always an assumption (sometimes more reasonable than other times)--we never observe $X_{0i}$ and $X_{1i}$ for individual $i$.
\end{itemize}

For individual $i$, the potentially observed values can be represented by the following 2x2 table:

\begin{center}
\begin{tabular}{cc|c|c|}
\multicolumn{4}{c}{\hspace*{4cm} Actual treatment assignment} \\
\\
\multicolumn{2}{c}{} & \multicolumn{2}{c}{Control \hfill Treatment} \\
\cline{3-4}
& & & \phantom{abcd} \\
& Control & $\theta_{0i}$, $X_i$ & $\tilde{\theta}_{0i}$, $X_i$ \\
Potential & & &  \phantom{abcd} \\
\cline{3-4}
outcomes under & & & \phantom{abcd} \\
& Treatment  & $\tilde{\theta}_{1i}$, $X_i$ & $\theta_{1i}$, $X_i$ \\
& & & \phantom{abcd} \\
\cline{3-4}
\end{tabular}
\end{center}
 
For each individual, we are in either Column 1 or Column 2 (depending on treatment assignment).  Within each column, 1 number will be observed and 1 will be a drawn
value from the posterior distribution (i.e., the true value for that cell is missing). Thus, for each individual we are in the situation of one of the 
following two vectors of potential outcomes:

\begin{center}
\begin{tabular}{rl}
If $T_i=1$: & $(\tilde\theta_{0i}, \theta_{1i})$ \\
If $T_i=0$: & $(\theta_{0i}, \tilde\theta_{1i})$ \\
\end{tabular}
\end{center}

Now consider $n$ individuals observed, with $n_1$ in the treated group and $n_0$ in the control group ($n=n_0+n_1$).  Then we have the following notation:
\begin{itemize}
\item $n_1=\sum_{i=1}^n T_i$, $n_0=\sum_{i=1}^n (1-T_i)$
\item ${\bf Y_0} = \{Y_{i}|T_i=0\}$.  ${\bf Y_0}$ is of length $n_0$.
\item ${\bf Y_1} = \{Y_{i}|T_i=1\}$.  ${\bf Y_1}$ is of length $n_1$.
\item ${\bf Y} = \{ {\bf Y_0}, {\bf Y_1}\}$.  ${\bf Y}$ is of length $n$.
\item $\overline{Y}_1 = \frac{\sum_{i=1}^n T_i \theta_{1i}}{\sum_{i=1}^n T_i}=\frac{\sum_{i=1}^{n_1} Y_{1i}}{n_1}$
\item $\overline{Y}_0 = \frac{\sum_{i=1}^n (1-T_i) \theta_{0i}}{\sum_{i=1}^n (1-T_i)}=\frac{\sum_{i=1}^{n_0} Y_{0i}}{n_0}$
\item Observed sample variances would be calculated in a similar way (using ${\bf Y_0}$ and ${\bf Y_1}$).
\end{itemize}

Throughout, we will use $p()$ to represent pdf's and $g()$ for functional forms of models.

One point to make sure we mention in the write-up is to say what OLS assumes.  OLS with a treatment indicator does:
$$Y_i|X_i, T_i \sim N(\alpha + \beta_0 T_i + {\boldsymbol \beta} {\bf X_i}, \sigma^2)$$
which assumes that the models of the potential outcomes follow parallel lines with equal residual variance:
$$\theta_{1i}|X_i \sim N(\alpha + \beta_0 + {\boldsymbol \beta} {\bf X_i}, \sigma^2)$$
$$\theta_{0i}|X_i \sim N(\alpha + {\boldsymbol \beta} {\bf X_i}, \sigma^2).$$
We can also have a graphical representation of this, perhaps with some examples of outcomes that clearly don't have that nice parallel linear relationship,
and show that this would be a very special case of what we're advocating.



\end{document}