## File: VisualizationSignificance.htd

Package: cain 1.10+dfsg-2 (Debian stretch, main)

Significance Testing

Open the file examples/cain/ImmigrationDeath.xml. The immigration-death process has a single species X and two reactions: immigration and death. The immigration reaction is 0→X with a propensity factor k1 = 1. The death reaction is X→0 and has the propensity factor k2 = 0.1. Since both reactions use mass-action kinetic laws, the propensities are 1 and 0.1 X, respectively.
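The model described above can be sketched with a small direct-method (Gillespie) simulation. This is an illustrative stand-alone sketch, not Cain's implementation; the function name `simulate_immigration_death` and its parameters are hypothetical, and the start time is assumed to be 0.

```python
import random


def simulate_immigration_death(x0, t_end, k1=1.0, k2=0.1, rng=random):
    """Return the population X at time t_end for one trajectory.

    Illustrative direct-method sketch; starts at time 0 with population x0.
    """
    t, x = 0.0, x0
    while True:
        a_immigration = k1       # propensity of 0 -> X (mass action, no reactants)
        a_death = k2 * x         # propensity of X -> 0 (mass action, one reactant)
        a_total = a_immigration + a_death
        # The time to the next reaction is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            return x
        # Pick the reaction in proportion to its propensity.
        if rng.random() * a_total < a_immigration:
            x += 1
        else:
            x -= 1


if __name__ == "__main__":
    rng = random.Random(0)
    finals = [simulate_immigration_death(0, 50.0, rng=rng) for _ in range(1000)]
    print("empirical mean at t = 50:", sum(finals) / len(finals))
```

Because the death propensity is 0 when X = 0, the population can never become negative in this sketch.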

The immigration-death model is presented in section 6.3.D of Markov Processes. There it is called the payroll process; hiring and leaving correspond to immigration and death. The text shows how to derive the mean and variance of the population by solving first-order differential equations. These are listed below for a starting time of t0 and an initial population of X0.

mean(X(t)) = k1/k2 + (X0-k1/k2) exp(-k2(t-t0))

var(X(t)) = (k1/k2) (1-exp(-k2(t-t0))) (1+(k2X0/k1) exp(-k2(t-t0)))
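As a quick check of the two formulas above, the following sketch evaluates them directly. The function names are hypothetical and the default parameters are the values used in this example (k1 = 1, k2 = 0.1). At t = t0 the mean equals X0 and the variance is 0; for large t both approach k1/k2, which is the steady state value.

```python
import math


def reference_mean(t, x0, k1=1.0, k2=0.1, t0=0.0):
    """mean(X(t)) = k1/k2 + (X0 - k1/k2) exp(-k2 (t - t0))"""
    decay = math.exp(-k2 * (t - t0))
    return k1 / k2 + (x0 - k1 / k2) * decay


def reference_variance(t, x0, k1=1.0, k2=0.1, t0=0.0):
    """var(X(t)) = (k1/k2) (1 - exp(-k2 (t - t0))) (1 + (k2 X0 / k1) exp(-k2 (t - t0)))"""
    decay = math.exp(-k2 * (t - t0))
    return (k1 / k2) * (1.0 - decay) * (1.0 + (k2 * x0 / k1) * decay)


if __name__ == "__main__":
    # Tabulate the reference solution for an initial population of 0.
    for t in (0.0, 5.0, 10.0, 50.0):
        mean = reference_mean(t, x0=0)
        std_dev = math.sqrt(reference_variance(t, x0=0))
        print(f"t = {t:5.1f}  mean = {mean:7.4f}  std dev = {std_dev:7.4f}")
```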

The analytical solution for the transient behavior is associated with the "Ref., Transient" method. The steady state solution is associated with the "Ref., Steady St." method. Note that the solutions are listed in the simulation output panel. Select the "ImmigrationDeath" model and the "Direct" method and then generate 10,000 trajectories.

Now we will compare the empirical solution and the reference solution. Click the plot button in the simulation output panel and then go to the "Time Series" tab in the plot configuration window. Deselect the "Std Dev" field in the grid by right clicking on the column header. Set the line style to "dot" and the marker style to "Circle". Match the face and edge color to the line color by left clicking on the column headers. (See the Plotting Time Series Data section for information on configuring plots.) Change the title to "Immigration-Death, Mean". Then click the "New plot" button to plot the empirical mean.

Next select the "Statistics" tab. Right click on the "Std Dev" field in the grid to turn off the plotting of the standard deviation as error bars. Clear the title and axes labels. Finally, click the plot button to show the reference solution (plotted with a solid line) along with the empirical solution. The result is shown below.

One may do the same for the standard deviation by selecting the "Std. Dev." radio button and following the same procedure. The result is shown below.

From the plots above we see that the mean and standard deviation of the empirical solution are very close to those of the analytical solution. While using the "eyeball norm" is useful, it is better still to use statistical tools to analyze the solutions. Specifically, we use the one-sample version of Student's t-test to test the null hypotheses that the empirical means are equal to the analytical means. (We use the plural because we will apply the test for each time frame.) The test yields a p-value, which is the probability of obtaining a test statistic at least as extreme as the observed one, assuming that the null hypothesis is true.

For instance, suppose that the empirical mean and analytical mean differ by an amount d. A p-value of 0.25 would mean that, if the null hypothesis is true and the two means are equal, there is a 25% chance that the empirical mean would differ from the analytical mean by at least d. Put another way, if you generated many empirical solutions, using the same number of trajectories in each, you would expect about a quarter of them to differ by at least d from the analytical mean.

A small p-value is evidence against the null hypothesis. For example, a p-value of 0.001 would mean that, under the null hypothesis, there is only a one in a thousand chance of observing an empirical mean that differs from the analytical mean by at least the calculated amount. In this context, this would lead us to suspect that the simulation method used to generate the empirical solution is incorrect. Of course, such an extreme difference is still possible with a correct method. If we were to repeat the experiment with another, independent empirical solution and obtain another small p-value, we could state with much more confidence that the method is incorrect.
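The one-sample test described above can be sketched in a few lines. This is a minimal illustration, not Cain's code: the function name `one_sample_p_value` is hypothetical, and the two-sided p-value is computed with the normal approximation to Student's t distribution, which is an assumption that is reasonable for thousands of trajectories but not for small samples. A library routine such as `scipy.stats.ttest_1samp` uses the t distribution itself.

```python
import math
import statistics


def one_sample_p_value(samples, reference_mean):
    """Return (t statistic, approximate two-sided p-value).

    Null hypothesis: the samples come from a distribution whose mean
    is reference_mean.
    """
    n = len(samples)
    sample_mean = statistics.fmean(samples)
    # Standard error of the mean, using the sample standard deviation.
    std_error = statistics.stdev(samples) / math.sqrt(n)
    t_stat = (sample_mean - reference_mean) / std_error
    # Normal approximation (assumption): P(|Z| >= |t|) = erfc(|t| / sqrt(2)).
    p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))
    return t_stat, p_value


if __name__ == "__main__":
    # Made-up populations at one frame, compared with a reference mean of 10.
    populations = [8, 12, 9, 11, 10, 13, 7, 10, 9, 11] * 100
    t_stat, p_value = one_sample_p_value(populations, 10.0)
    print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```

Here `samples` would hold the population of one species at one frame, with one entry per trajectory, and `reference_mean` would be the analytical mean at that frame.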

Click the p-value button in the simulation output panel to open the p-value analysis window shown below. In the left column select the "ImmigrationDeath, Direct" output. In the right column select the reference solution "ImmigrationDeath, Ref., Transient". Click the "Calculate" button to compute the p-value for all of the species and all of the frames. The row headers list the frame times. The column headers list the species.

Click the "Plot" button to show a plot of p-value versus frame number. We see that the p-values are consistent with a correct stochastic simulation method.
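To make the per-frame calculation concrete, here is a self-contained sketch that simulates the immigration-death process, records the population at each frame, and computes one p-value per frame against the transient reference mean. It is not Cain's implementation. The function names, the frame times, the trajectory count, and the choices X0 = 0 and t0 = 0 are all assumptions made for illustration, and the p-values use the normal approximation to the t distribution.

```python
import math
import random
import statistics

K1, K2 = 1.0, 0.1  # propensity factors from the text


def populations_at_frames(frame_times, rng, x0=0):
    """Simulate one trajectory (direct method); return X at each frame time."""
    t, x, recorded = 0.0, x0, []
    for frame in frame_times:
        while True:
            a_total = K1 + K2 * x
            dt = rng.expovariate(a_total)
            if t + dt > frame:
                # No reaction fires before this frame. The exponential
                # distribution is memoryless, so restart the clock here.
                t = frame
                break
            t += dt
            x += 1 if rng.random() * a_total < K1 else -1
        recorded.append(x)
    return recorded


def p_values_by_frame(frame_times, n_trajectories, rng):
    """Return one approximate two-sided p-value per frame."""
    runs = [populations_at_frames(frame_times, rng) for _ in range(n_trajectories)]
    p_values = []
    for i, t in enumerate(frame_times):
        column = [run[i] for run in runs]
        # Reference mean for X0 = 0 and t0 = 0.
        expected = (K1 / K2) * (1.0 - math.exp(-K2 * t))
        std_error = statistics.stdev(column) / math.sqrt(n_trajectories)
        t_stat = (statistics.fmean(column) - expected) / std_error
        p_values.append(math.erfc(abs(t_stat) / math.sqrt(2.0)))
    return p_values


if __name__ == "__main__":
    frames = [5.0 * i for i in range(1, 11)]
    for t, p in zip(frames, p_values_by_frame(frames, 2000, random.Random(1))):
        print(f"t = {t:5.1f}  p = {p:.3f}")
```

With a correct simulation method the p-values should be spread over the interval from 0 to 1, with no systematic run of very small values. Note that frames taken from the same trajectories are correlated, so the per-frame tests are not independent of each other.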
