Tech Note 106: Summary of Steven Bickerton's TASS Reduction

Steven Bickerton
Aug 23, 2007
Sep 5, 2007
Keywords: photometry computation techniques

You can find the code Steve used in his analysis by going to Steve's analysis software page.


Recalibration

Before performing any statistical analysis of the photometry, the measurements were recalibrated. The mean magnitude was calculated for each target star and the deviation from the mean was computed for each measurement. The median deviation was then found for each set of observations (ie. the systematic amount by which the magnitudes in a given frame deviate from their means). This differential photometry was performed in 0.25deg x 0.25deg fields in an effort to account for systematics introduced by position on the detector. Some observations were systematically bright or faint by as much as 0.1 mag. The magnitude of each star was corrected by subtracting the median deviation of the magnitudes in the 0.25deg square in the observation.

The program is run as follows:

./recalibrate.pl photometry_file sky_patch_size(degrees)

eg.

./recalibrate.pl i_010 0.25

This will generate a file called i_010.rcal which contains recalibrated photometry.

The photometry file must have data columns as follows:

1 ID
2 N
3 RA
4 Dec
5 JD
6 Filter1_name
7 Magnitude1
8 Mag_err1
9 Quality_index1
10 Filter2_name
11 Magnitude2
12 Mag_err2
13 Quality_index2

eg.
85572 6 9.9967 50.1725 2452903.76381 V 13.692 0.100 0 I 12.572 0.041 0
85572 6 10.0001 50.1699 2452900.75813 V 14.431 0.161 0 I 12.648 0.041 0
85572 6 10.0002 50.1726 2453277.72325 V 14.385 0.121 0 I 12.733 0.045 0
85572 6 10.0003 50.1722 2453259.78293 V 14.423 0.130 0 I 12.829 0.052 0
85572 6 10.0009 50.1716 2453257.78453 V 14.499 0.140 0 I 12.829 0.051 0
85572 6 10.0029 50.1739 2453267.76052 V 14.336 0.130 0 I 12.830 0.053 0
85580 32 9.9982 49.5382 2452964.58823 V 13.230 0.036 0 I 12.600 0.015 0
85580 32 9.9982 49.5387 2453322.61128 V 13.233 0.047 0 I 12.395 0.036 0
...

The deviations for some of the observations in the i_010 field are shown before (top panel) and after (bottom panel) recalibration in Figure 1. Note that in the upper panel, the points are frequently systematically above or below 0, while they are well centered in the lower panel.

Figure 1: the magnitude deviations from their means before (top) and after (bottom) recalibration.

The Welch-Stetson Index

The determine whether or not a given star's photometric measurements were consistent with random functuations, or were more likely to be representative of non-random variability, the Welch-Stetson variability index was computed (Welch and Stetson 1993)

QWS = √(1.0/(n*(n-1))) ∑ ( (V- ⟨V&rang)/σV ) * ( (I-⟨I&rang)/σI )

where angle-brackets denote a mean value, and σV/I are the standard errors in the measurements of the V/I magnitudes.

The code is executed as follows:

./welchStetson.pl photometry_file N_minimum

where the photometry file is as defined above (and is output by the recalibrate.pl program), and N_minimum is the minimum number of measurements to accept when computing WS for a given star. The suffix .ws is appended to the photometry filename to form the output filename.

The .ws file has the form:

# 01 ID
# 02 N
# 03 RA
# 04 Dec
# 05 Vmean
# 06 Imean
# 07 Iws
# 08 Iws (v only)
# 09 Iws (i only)
# 10 K (stetson)

Columns 8,9, and 10 are unused at this point.

The Welch-Stetson values are shown versus mean V-magnitude in Figure 2. They are shown both before (top panel) and after (bottom panel) recalibration. When computed with the 'raw' photometry, the index values grow for brighter targets as the uncertainties (σV/I) are smaller. This also occurs in the recalibrated data, but to a far lesser extent. A red circle is used to mark a known variable star from the GCVS.


Figure 2: The Welch-Stetson variability index values as a function of mean V magnitude before (above) and after (below) recalibration.

An upper-limit was established as a detection threshold for variability candidates. As the scatter in the WS values is a function of magnitude, the upper limit was found by smoothing the data with a Gaussian smoothing kernel and computing the RMS uncertainty at each magnitude. The WS values are shown with the smoothed curve and fitted threshold in Figure 3.


Figure 3: the detection threshold for candidate variable stars.

Values above the threshold were deemed to be candidate variable stars and their time series were written to individual files:

./writeTimeseries.pl photometry_file WS_output_file threshold(sigma)

The photometry file used here should contain the recalibrated photometry. The output files are named with the ID number of the star and suffix ".ts" appended. The program assumes the photometry file will have the form 'i_NNN.something' where NNN is a 3-digit number representing the right ascension of the target field. The time series are written in a directory called tsNNN.
Period Finding

A modification of the Lomb-Scargle (LS) periodogram was used to determine the periods of the candidate variables (see Gregory 2005). The method is essentially a discrete Fourier transform which can be performed on unevenly-sampled data. However, the LS periodogram carries the assumption that the periodic function being sought is sinusoidal in nature, which is not generally the case for a variety of variable stars (most notably eclipsing binaries). To work around this, I phased the data for each trial frequency and broke it into 5 bins. The mean for each bin was computed and the 5 means were used as a model for the variability. This model was used in place of the sinusoids for the LS method. The program is run as follows:

./pdm time_series_file c5:7 > output_file

where c5:7 indicates that time,magnitude are in columns 5 and 7, respectively. The output file has the form:

1 frequency
2 period
3 Phase dispersion minimization PDM value
4 Probability of the PDM value in column 3
5 Whittaker-Robinson max variance of means value
6 The modified lomb-scargle periodogram value (for arbitrary waveform)
7 The Probability density for the value in column 6
8 The amplitude of the 5-bin model
9 The maximum squared deviation in the 5-bin model divided by 1/5th variance

Of greatest interest at this point, are columns 2 and 7. Plotting 7 against 2, we see the probability density as a function of the period. The highest peak is the most probable period.

Column 3 is Stellingwerf's (1978) phase dispersion minimisation value. This is just the ratio of the model-subtracted variance to the variance of the un-modeled magnitude values. Column 4 is the probability that the value in column 3 would be observed.

Column 5 is the Whittaker-Robinson variance of means test (also described in Stellingwerf, 1978).

Columns 8 is the amplitude of the 5-bin model, and column 9 is the maximum squared deviation in the 5-bin model divided by 1/5th variance. In other words, column 9 is the most deviant value in the model normalized by its proportion of the total variance of the model. Columns 8 and 9 are unused, but it is my hope that they may be useful in classifying the variable types. I suspect (but have not yet tested) that eclipsing binaries will have large values in column 9, while intrinsic variables will not (the single faint dip in the time series will cause one bin in the model to be responsible for a disproportionately large amount of the total variance. But, as I mentioned, I haven't yet examined this.

A few plots of the probability density (column 7) versus the period (column 2) are shown in Figure 4, with the corresponding phased lightcurves in Figure 5.


Figure 4: The modified Lomb-Scargle periodograms for 3 candidates in the RA=330 field.

Figure 5: The phased lightcurves for the candidate variable stars whose periods were determined in Figure 4. The colour-coding is used to denote the order of the phase, darker colours are from earlier cycles, while lighter colours are more recent. There shouldn't be any pattern present in the colours.

Admittedly, I've chosen to show you some of the best examples from the data. Most of the candidates are not so easily sorted-out. In this case, however, it's clear that the top panel is an eclipsing binary, and the second is a Cepheid.

This is where the work stands so far. The next step, as I see it, is to begin classifying the candidate objects by eye and to try to identify some parameters which will allow them to be classified automatically. I've also been looking at the CLEAN algorithm as a way to remove sampling effects such as aliasing from the periodograms. This would allow the periods to be more reliably determined.
References

Gregory, P. 2005. Bayesian Logical Data Analysis for the Physical Sciences, Cambridge University Press, Cambridge, UK

Stellingwerf, R.F. 1978. ApJ 224, 953

Welch, D.L., and Stetson, P., 1993, AJ 105, 1813