Tech Note 106: Summary of Steven Bickerton's TASS Reduction
Steven Bickerton
Aug 23, 2007
Sep 5, 2007
Keywords: photometry computation techniques
You can find the code Steve used in his
analysis by going to
Steve's analysis software page.
Recalibration
Before performing any statistical analysis of the photometry, the
measurements were recalibrated. The mean magnitude was calculated for
each target star and the deviation from the mean was computed for each
measurement. The median deviation was then found for each set of
observations (ie. the systematic amount by which the magnitudes in a given
frame deviate from their means). This differential photometry was
performed in 0.25deg x 0.25deg fields in an effort to account for
systematics introduced by position on the detector. Some observations were
systematically bright or faint by as much as 0.1 mag. The magnitude
of each star was corrected by subtracting the median deviation of the
magnitudes in the 0.25deg square in the observation.
The program is run as follows:
./recalibrate.pl photometry_file sky_patch_size(degrees)
eg.
./recalibrate.pl i_010 0.25
This will generate a file called i_010.rcal which contains
recalibrated photometry.
The photometry file must have data columns as follows:
1 ID
2 N
3 RA
4 Dec
5 JD
6 Filter1_name
7 Magnitude1
8 Mag_err1
9 Quality_index1
10 Filter2_name
11 Magnitude2
12 Mag_err2
13 Quality_index2
eg.
85572 6 9.9967 50.1725 2452903.76381 V 13.692 0.100 0 I 12.572 0.041 0
85572 6 10.0001 50.1699 2452900.75813 V 14.431 0.161 0 I 12.648 0.041 0
85572 6 10.0002 50.1726 2453277.72325 V 14.385 0.121 0 I 12.733 0.045 0
85572 6 10.0003 50.1722 2453259.78293 V 14.423 0.130 0 I 12.829 0.052 0
85572 6 10.0009 50.1716 2453257.78453 V 14.499 0.140 0 I 12.829 0.051 0
85572 6 10.0029 50.1739 2453267.76052 V 14.336 0.130 0 I 12.830 0.053 0
85580 32 9.9982 49.5382 2452964.58823 V 13.230 0.036 0 I 12.600 0.015 0
85580 32 9.9982 49.5387 2453322.61128 V 13.233 0.047 0 I 12.395 0.036 0
...
The deviations for some of the observations in the i_010 field are
shown before (top panel) and after (bottom panel) recalibration in
Figure 1. Note that in the upper panel, the points are frequently
systematically above or below 0, while they are well centered in the
lower panel.
Figure 1: the magnitude deviations from their means before (top) and
after (bottom) recalibration.
The Welch-Stetson Index
The determine whether or not a given star's photometric measurements
were consistent with random functuations, or were more likely to be
representative of non-random variability, the Welch-Stetson
variability index was computed (Welch and Stetson 1993)
QWS = √(1.0/(n*(n-1))) ∑ ( (V- 〈V&rang)/σV ) * ( (I-〈I&rang)/σI )
where angle-brackets denote a mean value, and σV/I are
the standard errors in the measurements of the V/I magnitudes.
The code is executed as follows:
./welchStetson.pl photometry_file N_minimum
where the photometry file is as defined above (and is output by the
recalibrate.pl program), and N_minimum is the minimum number of
measurements to accept when computing WS for a given star. The suffix
.ws is appended to the photometry filename to form the output filename.
The .ws file has the form:
# 01 ID
# 02 N
# 03 RA
# 04 Dec
# 05 Vmean
# 06 Imean
# 07 Iws
# 08 Iws (v only)
# 09 Iws (i only)
# 10 K (stetson)
Columns 8,9, and 10 are unused at this point.
The Welch-Stetson values are shown versus mean V-magnitude in Figure
2. They are shown both before (top panel) and after (bottom panel)
recalibration. When computed with the 'raw' photometry, the index
values grow for brighter targets as the uncertainties
(σV/I) are smaller. This also occurs in the
recalibrated data, but to a far lesser extent. A red circle is used
to mark a known variable star from the GCVS.
Figure 2: The Welch-Stetson variability index values as a function of
mean V magnitude before (above) and after (below) recalibration.
An upper-limit was established as a detection threshold for
variability candidates. As the scatter in the WS values is a function
of magnitude, the upper limit was found by smoothing the data with a
Gaussian smoothing kernel and computing the RMS uncertainty at each
magnitude. The WS values are shown with the smoothed curve and fitted
threshold in Figure 3.
Figure 3: the detection threshold for candidate variable stars.
Values above the threshold were deemed to be candidate variable stars
and their time series were written to individual files:
./writeTimeseries.pl photometry_file WS_output_file
threshold(sigma)
The photometry file used here should contain the recalibrated
photometry. The output files are named with the ID number of the star
and suffix ".ts" appended. The program assumes the photometry file
will have the form 'i_NNN.something' where NNN is a 3-digit number
representing the right ascension of the target field. The time series
are written in a directory called tsNNN.
Period Finding
A modification of the Lomb-Scargle (LS) periodogram was used to
determine the periods of the candidate variables (see Gregory
2005). The method is essentially a discrete Fourier transform
which can be performed on unevenly-sampled data. However, the LS
periodogram carries the assumption that the periodic function being
sought is sinusoidal in nature, which is not generally the case for a
variety of variable stars (most notably eclipsing binaries). To work
around this, I phased the data for each trial frequency and broke it
into 5 bins. The mean for each bin was computed and the 5 means were
used as a model for the variability. This model was used in place of
the sinusoids for the LS method. The program is run as follows:
./pdm time_series_file c5:7 > output_file
where c5:7 indicates that time,magnitude are in columns 5 and 7,
respectively. The output file has the form:
1 frequency
2 period
3 Phase dispersion minimization PDM value
4 Probability of the PDM value in column 3
5 Whittaker-Robinson max variance of means value
6 The modified lomb-scargle periodogram value (for arbitrary waveform)
7 The Probability density for the value in column 6
8 The amplitude of the 5-bin model
9 The maximum squared deviation in the 5-bin model divided by 1/5th
variance
Of greatest interest at this point, are columns 2 and 7. Plotting 7
against 2, we see the probability density as a function of the
period. The highest peak is the most probable period.
Column 3 is Stellingwerf's (1978) phase dispersion minimisation
value. This is just the ratio of the model-subtracted variance to the
variance of the un-modeled magnitude values. Column 4 is the
probability that the value in column 3 would be observed.
Column 5 is the Whittaker-Robinson variance of means test (also
described in Stellingwerf, 1978).
Columns 8 is the amplitude of the 5-bin model, and column 9 is the
maximum squared deviation in the 5-bin model divided by 1/5th
variance. In other words, column 9 is the most deviant value in the model
normalized by its proportion of the total variance of the model.
Columns 8 and 9 are unused, but it is my hope that they may be useful
in classifying the variable types. I suspect (but have not yet
tested) that eclipsing binaries will have large values in column 9,
while intrinsic variables will not (the single faint dip in the time
series will cause one bin in the model to be responsible for a
disproportionately large amount of the total variance. But, as I
mentioned, I haven't yet examined this.
A few plots of the probability density (column 7) versus the period
(column 2) are shown in Figure 4, with the corresponding phased lightcurves
in Figure 5.
Figure 4: The modified Lomb-Scargle periodograms for 3 candidates in
the RA=330 field.
Figure 5: The phased lightcurves for the candidate variable stars
whose periods were determined in Figure 4. The colour-coding is used
to denote the order of the phase, darker colours are from earlier
cycles, while lighter colours are more recent. There shouldn't be any
pattern present in the colours.
Admittedly, I've chosen to show you some of the best examples from the
data. Most of the candidates are not so easily sorted-out. In this
case, however, it's clear that the top panel is an eclipsing binary,
and the second is a Cepheid.
This is where the work stands so far. The next step, as I see it, is
to begin classifying the candidate objects by eye and to try to identify
some parameters which will allow them to be classified automatically.
I've also been looking at the CLEAN algorithm as a way to remove
sampling effects such as aliasing from the periodograms. This would
allow the periods to be more reliably determined.
References
Gregory, P. 2005. Bayesian Logical Data Analysis for the Physical
Sciences, Cambridge University Press, Cambridge, UK
Stellingwerf, R.F. 1978. ApJ 224, 953
Welch, D.L., and Stetson, P., 1993, AJ 105, 1813