
Re: dB timings - RE: A Data Reduction Proposal



I was about to reply to Rob's post, but you raise the same issues.

Not to speak for him, but it looks like he merged 90 frames.
For timing purposes it doesn't matter whether they are all "i"
or mixed i and v; they are just 264K points.  So I think it would
represent 1/2 of a baseline night's data, which puts him
at about 14 hrs per night.

A few comments:

1) This is _much_ faster than I was able to process
Mark III data. He is not doing quite the same function, but it is
"close".  He also has much newer and improved DBMS software and
a faster computer.

2) His processing rate will slow down as the size of the database
grows.  A flat-out guess is that time will grow as about
N log(M), where N = one night's data and M = all the data in the database.
By this formula N will be double Rob's sample size and M will
grow to 1000 times his sample size.

So let's say a typical production run would be 42 hours?  This is
not as bad as it looks, because I think I see a 4x or better
speedup in the code, and a 2x faster computer is not too
costly either.   Also, this is the kind of thing that one could
use multiple computers for.  You'd split the processing by RA, with
a few degrees (or hours) assigned to each computer.  All in all,
I think he has shown that the approach is not completely unreasonable.
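
For anyone who wants to check the arithmetic, here is a rough sketch of
one reading that gives 42 hours: Rob's 7 hrs, times 2 for N, times
log10(1000) = 3 for M.  Using log base 10 of the growth in M is an
assumption on the reader's part, and the function names and the 4-machine
split are made up for illustration.

```python
import math

def scaled_hours(sample_hours, n_factor, m_factor):
    """Scale a measured run time by the N log(M) guess.

    Assumption: the log term is log10 of the growth factor in M,
    which is the reading that lands near the 42-hour figure.
    """
    return sample_hours * n_factor * math.log10(m_factor)

def ra_band(ra_hours, n_machines):
    """Map a right ascension in hours to one of n equal-width RA bands."""
    frac = (ra_hours % 24.0) / 24.0
    # min() guards against a floating-point result equal to n_machines
    return min(int(frac * n_machines), n_machines - 1)

baseline = scaled_hours(7.0, 2, 1000)   # N doubles, M grows 1000x
tuned = baseline / (4 * 2)              # 4x code speedup, 2x faster computer
per_machine = tuned / 4                 # hypothetical: 4 machines, split by RA

print(f"baseline {baseline:.1f} h, tuned {tuned:.2f} h, "
      f"per machine {per_machine:.2f} h")
```

The per-machine number assumes the stars are spread evenly over the RA
bands, so treat it as a best case.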

What's saving us here is PostgreSQL's "box" type, which uses a
very good two-dimensional index function called "quad trees".
What it does is divide a plane into fourths such that an equal
number of data points are in each quadrant.  It then repeats the
procedure for each quadrant and so on.  
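
A toy sketch of the subdivision described above, for illustration only.
This is not PostgreSQL's index code.  To get equal counts it splits at
the median x and then at the median y of each half (a k-d-tree-style
median split), which is an assumption about how "equal number of points"
would be achieved.  The names split_equal, depth and leaf_size are
hypothetical.

```python
def split_equal(points, leaf_size=4):
    """Recursively split (x, y) points into four equal-count groups.

    Returns a plain list of points for a leaf, or a dict holding
    four child nodes for an interior node.
    """
    leaf_size = max(leaf_size, 1)          # avoid endless recursion
    if len(points) <= leaf_size:
        return list(points)
    by_x = sorted(points)                  # order by x (ties broken by y)
    half = len(by_x) // 2
    children = []
    for side in (by_x[:half], by_x[half:]):
        by_y = sorted(side, key=lambda p: p[1])
        mid = len(by_y) // 2
        children.append(split_equal(by_y[:mid], leaf_size))
        children.append(split_equal(by_y[mid:], leaf_size))
    return {"children": children}

def depth(node):
    """Number of subdivision levels above the deepest leaf."""
    if isinstance(node, list):
        return 0
    return 1 + max(depth(child) for child in node["children"])

# 64 points on an 8x8 grid, at most 4 points per leaf
grid = [(x, y) for x in range(8) for y in range(8)]
tree = split_equal(grid)
print(depth(tree))
```

Each level cuts the point count per cell by about four, so the depth
grows roughly as log base 4 of the number of points.  That is the kind
of behavior behind a log(M) guess for lookup cost.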


--- aah@nofs.navy.mil wrote:
> Rob,
>   For your timings, and I apologize ahead of time for
> not following the entire discussion, was the import and merging
> of 200K "stars" basically a single-filter detection of 200K
> separate stars that you then add to the database?  The
> reason I ask is that the baseline that was discussed many
> moons ago for a single night was the following:
>      [200 2kx2k images (100 each V and I), each with 5K stars]
> that is 1M entries per night to add to the database if I've done my
> math right.  If your estimate of 7hrs for 200K stars is
> right, then 35 hours would be required to import/merge one night
> of data from one site with your computer.  This doesn't depend on the
> size of the database?  My estimate is about 60M real stars will be
> present in the Mark IV database, along with an approximately equal
> number of nonreal objects.
> Arne
> 


=====
Chris Albertson 
  Home:   310-376-1029  chrisalbertson90278@yahoo.com
  Cell:   310-990-7550
  Office: 310-336-5189  Christopher.J.Albertson@aero.org
