
Re: A New Record



Tom,

> You (I think) are saying "I know bad data when I see it".  The problem
> is the same as with pornography.  If you can't define "bad" data then
> you can't pass a law against it.  (Make a cut to remove it)

Of course there is no strict cut.  One assumes some limit, some people
are more liberal than others, and the cut may also depend on the
purpose.

I have selected an area of about 2.5 x 2.5 degrees around NSV 8866.
This contains data from 339 images.  I did what you suggested, that is:
calculate averages for the more than 2000 stars, and then calculate for
each image how much the stars on that image differ from their averages
(expressed in standard deviations).  The attached plot shows the
average squared deviation (ASD) per image for V and Ic.  I think one
can say that if the ASD is larger than 5, the image falls outside the
"normal" range.  What I suggest (this has probably been suggested
before already) is to add this number (or a "quality label" such as
A: ASD < 3, B: 3 < ASD < 5, etc.) to the image.
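
A minimal sketch of the calculation described above, in Python.  The
function names, the (image, star, magnitude) input layout, the use of
the population standard deviation, and the "outside" label for ASD >= 5
are assumptions made for illustration only; it is not the code used to
produce the attached plot.  It would be run once per band (V, Ic).

```python
from collections import defaultdict
from statistics import mean, pstdev


def image_asd(measurements):
    """Average squared deviation (ASD) per image.

    measurements: iterable of (image_id, star_id, magnitude) tuples for
    one band (hypothetical layout).  Each deviation is
    (magnitude - star mean) / star standard deviation.
    """
    measurements = list(measurements)

    # Collect all magnitudes of each star across the images.
    by_star = defaultdict(list)
    for _image, star, mag in measurements:
        by_star[star].append(mag)

    # Per-star mean and standard deviation; stars with fewer than two
    # measurements or zero scatter cannot be normalised and are skipped.
    stats = {}
    for star, mags in by_star.items():
        if len(mags) < 2:
            continue
        sd = pstdev(mags)  # assumption: population standard deviation
        if sd > 0:
            stats[star] = (mean(mags), sd)

    # Squared normalised deviations, grouped by image.
    squared = defaultdict(list)
    for image, star, mag in measurements:
        if star in stats:
            mu, sd = stats[star]
            squared[image].append(((mag - mu) / sd) ** 2)

    return {image: mean(vals) for image, vals in squared.items()}


def quality_label(asd):
    """Quality label using the limits proposed in the text."""
    if asd < 3:
        return "A"
    if asd < 5:
        return "B"
    return "outside"  # assumption: anything at or above 5 gets one label
```

Note that in this sketch the deviant images also enter the star
averages and standard deviations, so with only a few images of a field
the ASD of a bad image is pulled down.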
 
> The risk in removing data is that you create a bias in the remaining
> data.  I believe that the way we are processing data is bias free.

I am not sure that setting cuts to throw away complete images, before
analysing the data for a particular star, creates a bias.

> You should note that we have gradually improved the apparatus over
> time. 

I am not questioning the overall quality of the data at all, otherwise
I wouldn't be so keen to use it.

> > On the other hand, when you look at the individual data points of a
> > star, these bad points do matter, because they distort the light
> > curve.
> 
> Yep, that is just life with real data.  However images that contain
> "obvious" bad points may also contain "good" points that allow
> determination of something about some other star.  So one has to have
> a method that makes sense to throw out whole images.  

If you have an image with a lot of stars deviating from the normal, the
only "good" points you can be sure of are those which are close to the
normal.  All others would be suspicious.  So there is no real value in
keeping such an image.  And this will certainly be the case if you have
a large number of images of the area.

One must look at the possible uses of the database.  If it is to be a
"standard" catalogue only, there is no need for the individual data,
and you might as well keep "bad" images in.  They will not change
anything.  On the other hand, if one wants to study variable stars,
people will need something to assess the quality of a single data
point, without calculating averages and so on for all the images.

Patrick



[Attachment: ASD.gif (GIF image)]