Data from experiment

Storing and loading data

Data are stored in files. Unfortunately, data files come in various formats. Usually they are text files in which one line corresponds to one data point. Every line should contain at least two numbers: the x and y coordinates of the point. It can also contain the standard deviation of the y coordinate. Numbers can be separated by whitespace, commas, colons or semicolons. Some lines can contain comments or extra information. If such a line has a hash (#) in the first column, it is ignored. Otherwise, it is usually also ignored. Other file types can also be read: .rit and .mca. In the future, the way these special file formats are handled will be changed.
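The line format described above can be illustrated with a short Python sketch. It is not the program's own parser: the function name parse_line is hypothetical, and the choice to skip every line that does not consist of at least two numbers is an assumption made for the example.

```python
import re

# Separators named in the text: whitespace, commas, colons, semicolons.
_SEPARATORS = re.compile(r"[\s,:;]+")

def parse_line(line):
    """Return the numbers found on one line, or None if the line is skipped."""
    if line.startswith("#"):
        return None  # hash in the first column: the line is ignored
    fields = [f for f in _SEPARATORS.split(line.strip()) if f]
    try:
        numbers = [float(f) for f in fields]
    except ValueError:
        return None  # assumption: other non-numeric lines are skipped too
    # A data point needs at least x and y; a third number may be std. dev. of y.
    return numbers if len(numbers) >= 2 else None
```

For example, parse_line("1.0, 2.5; 0.1") yields three numbers that could be read as x, y and the standard deviation of y.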

Points are loaded from files using the command

d.load [xcol:ycol [:scol] ] [from [-to]/of ] [[+]*merge] 'filename'

where xcol, ycol, scol, from, to, of and merge are unsigned integers. Note that the name of the file is enclosed in single quotation marks. If the file is in a normal format (columns of numbers), you can specify which columns contain x, y and, optionally, the std. dev. of y. It is also possible to read only selected points from the file, using the from-to/of syntax, e.g. 1/2 - odd points (first, third, etc.), 2/2 - even points, 5-10/50 - from every 50 points only the 5th, 6th, ... 10th are loaded. A few neighboring points can be merged (loaded as one point). The resulting point has x equal to the average x of the merged points, and y equal either to the sum of the y's of the merged points (+*) or to their average (*).
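The point selection and merging rules can be sketched in Python as follows. The function names select_points and merge_points are hypothetical, and dropping an incomplete group at the end of the data when merging is an assumption; the text does not say how leftover points are treated.

```python
def select_points(points, start, stop, period):
    """Keep points whose 1-based position within each block of `period`
    points lies between `start` and `stop` inclusive (the from-to/of rule)."""
    return [p for i, p in enumerate(points)
            if start <= i % period + 1 <= stop]

def merge_points(points, n, sum_y):
    """Merge each run of `n` neighboring (x, y) points into one point.

    x is the average of the merged x values; y is their sum when `sum_y`
    is true (the +* form) and their average otherwise (the * form).
    """
    merged = []
    # Assumption: a trailing group with fewer than n points is dropped.
    for i in range(0, len(points) - n + 1, n):
        group = points[i:i + n]
        x = sum(p[0] for p in group) / n
        y_total = sum(p[1] for p in group)
        merged.append((x, y_total if sum_y else y_total / n))
    return merged
```

With this sketch, select_points(points, 1, 1, 2) corresponds to 1/2 and select_points(points, 5, 10, 50) to 5-10/50.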

Data can be saved to a file either as points or as a simple script that contains all information about the currently loaded data, its background, calibration, settings etc. The command

d.export [[d] | [s] | [b]] 'filename' [+]

exports data as points, data as a script, or background as points if the letter before the filename is, respectively, d, s or b. Only active points are exported (see the next chapter to learn about active and inactive points). If the specified file exists, + causes the output to be appended to the file rather than overwriting it.

Some information about the current data can be obtained using the command:

d.info

Active and inactive points

Often only part of the data from a file is of interest. We should be able to exclude selected points from fitting and all computations. This can be done with the command d.range or with a mouse click in the GUI. The idea of active and inactive points is simple: only the active ones are subject to all computations, the inactive ones are neglected. The command d.range has the following syntax:

d.range [[+] | [-]] [xmin] : [xmax]

If no sign is specified, only points with x coordinate between xmin and xmax are marked as active. An omitted xmin or xmax is equivalent to -inf or +inf respectively. When the command contains + or -, the given range is added to the set of currently active points or subtracted from it.
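The effect of the three forms of d.range on the set of active points can be sketched in Python. The function name apply_range is hypothetical, and treating the range limits as inclusive is an assumption.

```python
def apply_range(xs, active, sign, xmin=None, xmax=None):
    """Return the new list of active flags for points with x values `xs`.

    sign is "" (replace the active set), "+" (add the range to it)
    or "-" (subtract the range from it).
    """
    lo = float("-inf") if xmin is None else xmin  # omitted xmin means -inf
    hi = float("inf") if xmax is None else xmax   # omitted xmax means +inf
    # Assumption: points lying exactly on a limit count as inside the range.
    in_range = [lo <= x <= hi for x in xs]
    if sign == "+":
        return [a or r for a, r in zip(active, in_range)]
    if sign == "-":
        return [a and not r for a, r in zip(active, in_range)]
    return in_range
```

Calling apply_range(xs, active, "", 2, 4) followed by apply_range(xs, result, "+", 5) mimics d.range 2 : 4 followed by d.range + 5 : in this sketch.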

If we care only about the segments of data in which there are peaks, we can use the command

d.range * level margin

It marks as active only those points that have a y coordinate greater than level, together with the points around them - points whose distance from a point above level is less than margin.
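A minimal Python sketch of this rule is given below. The function name activate_peaks is hypothetical, and measuring the distance along the x axis is an assumption.

```python
def activate_peaks(xs, ys, level, margin):
    """Return active flags: points above `level` plus their neighborhood."""
    # x positions of the points whose y exceeds the level
    above = [x for x, y in zip(xs, ys) if y > level]
    # Assumption: "distance" means the distance along the x axis.
    return [y > level or any(abs(x - a) < margin for a in above)
            for x, y in zip(xs, ys)]
```

For a single point above level at x=3 and margin 1.5, the points at x=2, 3 and 4 become active and all others inactive.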

Background and calibration

Data background can be defined by interpolation between points chosen by the user. These chosen points are called background points, to distinguish them from data points. Alternatively, the background can be defined as a function, e.g. a polynomial, and the parameters of this function can be fitted. You can read about the second way in the section called “Sum of fitting functions”.

The command

d.background X Y

adds the point (X,Y) to the background points. If there is only one background point, the background is equal to Y for all data points. If there are two background points, the background is a straight line that passes through both of them. If there are more background points, the background is either a cubic spline, if the option spline-background is set, or a polyline otherwise. If the distance along the x axis from the new point to an existing point is smaller than the value of the option min-background-points-distance, the existing point is deleted. You can remove background points with the command

d.background ! [X]

If X is specified, it deletes all background points with x in the range (X - dx, X + dx), where dx is the value of the option min-background-points-distance. Without X, it deletes all background points.

The command

d.background [.]

displays information about the background (if the dot is omitted) and recomputes the background. Explicit recomputation is required only after changing the value of the option spline-background; after other background-related commands it is done automatically. If a dot follows the command, the only effect is the recomputation.
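The handling of background points can be sketched in Python for the polyline case (the spline case is omitted). The function names are hypothetical. Extending the outermost segments in a straight line beyond the first and last background point is an assumption, as is the requirement that the background points have distinct x values.

```python
from bisect import bisect_right

def add_background_point(bg_points, x, y, min_distance):
    """Add (x, y); existing points closer than `min_distance` in x are deleted."""
    kept = [(px, py) for px, py in bg_points if abs(px - x) >= min_distance]
    return sorted(kept + [(x, y)])

def polyline_background(bg_points, x):
    """Background value at `x`: constant for one point, piecewise linear otherwise."""
    pts = sorted(bg_points)
    if not pts:
        raise ValueError("at least one background point is required")
    if len(pts) == 1:
        return pts[0][1]
    xs = [p[0] for p in pts]
    # Index of the right end of the segment that contains x.
    # Assumption: outside the covered range the nearest end segment is extended.
    i = min(max(bisect_right(xs, x), 1), len(pts) - 1)
    (x0, y0), (x1, y1) = pts[i - 1], pts[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
```

Subtracting polyline_background(bg_points, x) from the y value of each data point would then give background-corrected data in this sketch.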

Sometimes the instrument introduces errors in the x coordinate of the data points, which can be determined by using a reference sample. Correcting these errors is called calibration here. (See also the description of zero-shift.) Calibration commands are analogous to the background commands described above. Again there is a set of points, here called calibration points. If there is only one calibration point (X, D), calibration changes the x coordinate of all data points: x -> x-D. If there are two calibration points, the calibration curve d(x) is a straight line that passes through both of them. If there are more calibration points, the calibration curve is either a cubic spline, if the option spline-calibration is set, or a polyline otherwise. Every point is transformed using the calibration curve d(x) in the following way: (x, y) -> (x-d(x), y).

The following commands are analogous to the d.background commands:

d.calibrate X D

d.calibrate ! [X]

d.calibrate [.]

To delete a calibration point with the second command listed, the exact value of D must be given; there is no equivalent of min-background-points-distance.
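The calibration transform (x, y) -> (x-d(x), y) can be sketched in Python for the cases of one and two calibration points. The function names are hypothetical, and the spline and polyline cases for more points are left out of this sketch.

```python
def calibration_curve(cal_points):
    """Return d(x) for one or two calibration points given as (X, D) pairs."""
    pts = sorted(cal_points)
    if len(pts) == 1:
        shift = pts[0][1]
        return lambda x: shift  # one point: the same shift D everywhere
    if len(pts) == 2:
        (x0, d0), (x1, d1) = pts
        slope = (d1 - d0) / (x1 - x0)
        return lambda x: d0 + slope * (x - x0)  # straight line through both
    raise NotImplementedError("spline or polyline curves are not sketched here")

def calibrate(points, d):
    """Apply (x, y) -> (x - d(x), y) to every data point."""
    return [(x - d(x), y) for x, y in points]
```

With a single calibration point (0, 0.5), every data point is shifted by 0.5 towards smaller x.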

Standard deviation or weight

When fitting data, we assume that only the y coordinate of the data is subject to statistical error in measurement. This is quite a common assumption. To see how the y standard deviation sigma influences the minimized function, look at the weighted sum of squared residuals formula in the section called “Nonlinear optimization”. We can also think in terms of weights of points - every point has a weight assigned, equal to w_i = 1/sigma_i^2.
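The relation between standard deviations and weights can be sketched in Python. The weighted sum of squared residuals is written here in its usual least-squares form, which is an assumption; the formula actually used is the one given in the section referenced above. The function names and the model_ys argument (values of the fitted function at each x) are hypothetical.

```python
def weights(sigmas):
    """Weight of each point: w_i = 1 / sigma_i^2."""
    return [1.0 / s ** 2 for s in sigmas]

def wssr(ys, model_ys, sigmas):
    """Weighted sum of squared residuals (assumed form):
    sum over i of w_i * (y_i - model_i)^2."""
    return sum(w * (y - m) ** 2
               for w, y, m in zip(weights(sigmas), ys, model_ys))
```

A point with twice the standard deviation thus contributes a quarter as much to the sum for the same residual.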

The standard deviation of points can be read from a file together with the x and y coordinates, or set with the command:

d.deviation {[u] | [r]} [min]

This command sets the y standard deviation according to the specified letter:

  • u - equal min for all points (default: min=1),

  • r - max (sqrt(y), min); setting the std. dev. to the square root of the value is common when y is the number of counted independent events (default: min=1).

If the option background-influences-std-dev is set, the y standard deviation is computed after subtracting the background - this matters only in the second case.
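The two ways of setting the standard deviation can be sketched in Python. The function name std_dev is hypothetical, and using min for y values that are not positive (for which the square root is not usable) is an assumption.

```python
import math

def std_dev(ys, mode, minimum=1.0):
    """Return the y standard deviation of each point.

    mode "u": equal to `minimum` for all points.
    mode "r": max(sqrt(y), minimum).
    """
    if mode == "u":
        return [minimum for _ in ys]
    if mode == "r":
        # Assumption: for y <= 0 the result falls back to `minimum`.
        return [max(math.sqrt(y), minimum) if y > 0 else minimum for y in ys]
    raise ValueError("mode must be 'u' or 'r'")
```

In the second mode the ys passed in would be the background-subtracted values if the background is meant to influence the result.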

The command

d.deviation

displays information about the y standard deviations of points.

Working with many datasets

Let us call a set of data that usually comes from one file a dataset. All operations described above concern one dataset, but you can also work with many datasets. Datasets can be grouped in plots. There can be any number of plots, and any number of datasets in each plot. There is always one active plot and one active dataset in each plot. Inactive datasets are drawn, but do not influence any operations. So there are only two reasons to have more than one dataset in one plot: it is possible to compare them visually and to switch between them easily. To repeat: all commands apply only to the active dataset.

A so-called plot contains not only datasets, but also a so-called sum, described below. Only the datasets and the sum in the active plot are drawn, and all commands except fitting apply to the active plot. For f.xxx commands, however, it does not matter whether a plot is active - all plots are fitted simultaneously. This is why one may want to have many plots: to simultaneously fit datasets with functions that can share their parameters.

You can manage plots and datasets using the data pane in the GUI or the d.activate command. The following syntax:

d.activate [p] :: [d]

is used to switch to plot p and dataset d. It is possible to change only the active plot or only the active dataset. To create (and activate) a new plot or a new dataset, replace p or d with an asterisk (*).