Re: A New Record



Patrick,

Thank you very much for the nice plot.  It gives me something to think
about.  

My present emphasis is to take a large and well documented data set.  I
don't want to leave anything out until the cut has been thoroughly studied
and documented.  It is, I believe, a subject for a nice paper.  (Hint to
those of you who supervise academic work: note there is room for several
flags, so several papers, e.g. "An improved algorithm for validating
massive sky surveys".)

I have no objection to adding one or more quality factors.  As I keep
saying, in TASS you get to just do it.  There is presently a V and an I
flag in all the tables that could be used.  At the present time I am
deleting all the flagged measurements before doing my analysis, as these
are mostly saturated stars.  Patrick, did you do this for your analysis?
The flags are there in all the .cal files; I just delete these stars
before making up the collected file and searching it for variables.  These
flag positions are available.  Each one could hold any number of bits, I
think.
The lower bit values are already assigned.  Their use can be found in
Michael's documentation.

I would welcome someone taking on this problem and proposing how to use
quality fields to make the TASS data more useful.  What I would hope for is
some discussion of this on the list and a few TNs that propose schemes,
try them out, and present results.  Then we can pick one or more and do the
necessary processing to include them in the database.  I can then add the
processing to the pipeline (if possible; one may need the whole data set)
so that the newer data contains quality factors.

I much prefer flags to cutting the data.  That way, if we get the flags
wrong it does not affect the data.  One can always ignore a flag if the
data is there. 
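
The flag-rather-than-cut idea above can be sketched roughly as follows.
This is only an illustration: the bit values, the dictionary layout and
the names SATURATED, POOR_IMAGE, set_flag, is_clean and select are all
hypothetical, and are not the assignments in Michael's documentation or
the actual .cal file format.

```python
# Hypothetical bit assignments for illustration only; the real TASS
# flag bits are defined in the project documentation, not here.
SATURATED = 1 << 0    # stand-in for an already assigned low bit
POOR_IMAGE = 1 << 4   # stand-in for a newly proposed quality bit


def set_flag(flags, bit):
    """Return the flag word with one more bit set; the data is untouched."""
    return flags | bit


def is_clean(flags, ignore=0):
    """True if no flag bit is set, apart from bits the user chose to ignore."""
    return (flags & ~ignore) == 0


def select(measurements, ignore=0):
    """Pick measurements for one analysis without deleting anything.

    Each measurement is assumed to be a dict with a "flags" entry.
    """
    return [m for m in measurements if is_clean(m["flags"], ignore)]
```

Because nothing is deleted, someone who distrusts a given flag can pass
it in `ignore` and get the flagged measurements back.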

Tom Droege


> [Original Message]
> From: Patrick Wils <patrickwils@yahoo.com>
> To: <tass@listserv.wwa.com>
> Date: 10/21/2003 5:46:16 PM
> Subject: Re: A New Record
>
> Tom,
>
> > You (I think) are saying "I know bad data when I see it".  The
> > problem is the same as with pornography.  If you can't define "bad"
> > data then you can't pass a law against it.  (Make a cut to remove it)
>
> Of course there is no strict cut, one assumes some limit, and some
> people are more liberal than others, and the cut may also depend on the
> purpose.
> I have selected an area of about 2.5 x 2.5 degrees around NSV 8866. 
> This contains data from 339 images.  I did what you suggested, that is:
> calculate averages for the more than 2000 stars, and then calculate for
> each image how much the stars on that image differ from their averages
> (expressed in standard deviations).  The attached plot shows the
> average squared deviations (ASD) for V and Ic.  I think one can say
> that if the ASD is larger than 5, the image falls outside the "normal"
> range.  What I suggest (this has probably been suggested before
> already) is to add this number (or a "quality label" such as A: ASD <
> 3, B: 3 < ASD < 5, etc.) to the image.
>  
> > The risk in removing data is that you create a bias in the remaining
> > data.  I believe that the way we are processing data is bias free.
>
> I am not sure that setting cuts to throw away complete images before
> analysing data for a particular star creates a bias.
>
> > You should note that we have gradually improved the apparatus over
> > time. 
>
> I am not questioning the overall quality of the data at all, otherwise
> I wouldn't be so keen to use it.
>
> > > On the other hand, when you look at the individual data points of a
> > > star, these bad points do matter, because they distort the light
> > > curve.
> > 
> > Yep, that is just life with real data.  However images that contain
> > "obvious" bad points may also contain "good" points that allow
> > determination of something about some other star.  So one has to have
> > a method that makes sense to throw out whole images.  
>
> If you have an image with a lot of stars deviating from the normal, the
> only "good" points you can be sure of are those which are close to the
> normal.  All others would be suspicious.  So, there is no real value in
> it.  And this will certainly be the case, if you have a large number of
> images of the area.
>
> One must look at the possible uses of the database.  If it is to be a
> "standard" catalogue only, there is no need for the individual data,
> and you might as well keep "bad" images in.  They will not change
> anything.  On the other hand, if one wants to study variable stars,
> people will need something to assess the quality of a single data
> point, without calculating averages and so on for all the images.
>
> Patrick
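
The per-image check Patrick describes in the quoted message (star
averages, then an average squared deviation per image) might be sketched
as below.  This is a minimal sketch under stated assumptions, not
Patrick's actual procedure: it assumes the deviation of each point is
divided by that star's own sample standard deviation over all images,
and the function names, the tuple layout and the catch-all label "C" are
hypothetical (the message gives only labels A and B, followed by "etc.").

```python
from collections import defaultdict
from statistics import mean, stdev


def image_asd(measurements):
    """Average squared deviation (ASD) per image.

    measurements: iterable of (image_id, star_id, magnitude) tuples
    (a hypothetical layout, one passband at a time).
    """
    measurements = list(measurements)

    # Step 1: mean and scatter of every star over all images.
    by_star = defaultdict(list)
    for _, star, mag in measurements:
        by_star[star].append(mag)
    stats = {}
    for star, mags in by_star.items():
        if len(mags) >= 2:            # stdev needs at least two points
            sd = stdev(mags)          # assumption: sample standard deviation
            if sd > 0:
                stats[star] = (mean(mags), sd)

    # Step 2: squared deviation of each point, in units of the star's sigma,
    # collected per image.
    squared = defaultdict(list)
    for image, star, mag in measurements:
        if star in stats:
            mu, sd = stats[star]
            squared[image].append(((mag - mu) / sd) ** 2)

    # Step 3: average over the stars on each image.
    return {image: mean(values) for image, values in squared.items()}


def quality_label(asd):
    """Map an ASD value to a label: A below 3, B from 3 up to 5."""
    if asd < 3:
        return "A"
    if asd < 5:
        return "B"
    return "C"   # hypothetical catch-all for the "etc." in the message
```

Note that truly variable stars inflate their own standard deviation in
this scheme, so they contribute less to the ASD than constant stars do;
whether that is desirable is one of the things a proposed scheme would
have to examine.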