- To: firstname.lastname@example.org
- Subject: Database notes
- From: Chris Albertson <email@example.com>
- Date: Wed, 14 Jan 1998 18:11:58 -0800
- Old-Return-Path: <firstname.lastname@example.org>
- Organization: Logicon RDA
- Resent-Date: Wed, 14 Jan 1998 22:17:18 -0500
- Resent-From: email@example.com
- Resent-Message-ID: <"33L1YB.A.OWG.ilXv0"@kani.wwa.com>
- Resent-Sender: firstname.lastname@example.org
- Sender: email@example.com
1) I think the merge program will run at a rate of about
one million observations per hour. I processed three
of Glenn's ZIP files /pub/incoming/B*.zip in about an
hour into a merged (collated) table. This is using a
P100 with 96MB RAM. Note that just these three files
alone contain ~950,000 observations. Production code
will be different (likely slower) but this is +much+
better than the previous merge algorithm I was using.
Still, it will take days to process the existing backlog.
2) Running large data sets has turned up some problems in
my software and in Postgres too. No show stoppers. There
are always eight ways to do everything. It will take a
while to make this process robust and crash proof.
3) I ran into what is arguably a bug in Postgres. I was
impressed when, only hours after reporting it, I got
a fix e-mailed to me from one of the developers.
As shipped the Postgres server does not free memory allocated
until the end of a "transaction". A transaction is a single
SQL statement or a set of statements bounded by BEGIN...END
brackets. This is no big deal unless you do thousands of
operations within BEGIN...END. The new merge does this
unavoidably. The result is that the Postgres server process
gets very large and sometimes crashes when it tries to grow
larger than the available space. The merge program can cause a
server crash if run with the "wrong" options on large data sets.
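To make the growth concrete, here is a rough model of the effect. It is only a sketch and not Postgres code: the function name, its parameters, and the idea of a fixed number of bytes per operation are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical model, NOT the Postgres allocator.
 * Each operation inside a transaction allocates bytes_per_op bytes.
 * If free_per_op is nonzero the memory is released after every
 * operation; otherwise it is held until the transaction ends.
 * Returns the peak number of bytes held at any one time. */
size_t peak_bytes(size_t n_ops, size_t bytes_per_op, int free_per_op)
{
    size_t held = 0;
    size_t peak = 0;

    for (size_t i = 0; i < n_ops; i++) {
        held += bytes_per_op;          /* allocation for this operation */
        if (held > peak)
            peak = held;
        if (free_per_op)
            held = 0;                  /* released right after the operation */
    }
    /* At the end of the transaction everything still held is released,
     * but by then the peak has already been reached. */
    return peak;
}
```

In this model, holding memory until END makes the peak grow in proportion to the number of operations, while freeing after each operation keeps it at the size of a single operation no matter how long the transaction is.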
There is a fix but it requires a re-compile of the server.
In file /src/backend/access/transam/xact.c there is an #ifdef.
Be sure that TBL_FREE_CMD_MEMORY is defined. Or you can do what I
did and just remove the #ifdef ... #endif lines from the source
file. This change, I am told, will appear in version 6.3 due
out of Beta test in March 98. Unless you have 500MB of RAM you
need this fix.
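For anyone not used to conditional compilation, here is a minimal sketch of how such a guard behaves. Only the macro name TBL_FREE_CMD_MEMORY comes from the note above; the function is made up for illustration and is not the code in xact.c.

```c
/* Hypothetical illustration of an #ifdef guard; not taken from xact.c.
 * The macro can be defined in the source, as here, or on the compiler
 * command line (most C compilers accept -DTBL_FREE_CMD_MEMORY). */
#define TBL_FREE_CMD_MEMORY

/* Returns 1 if the guarded branch was compiled in, 0 otherwise. */
int frees_memory_per_command(void)
{
#ifdef TBL_FREE_CMD_MEMORY
    return 1;   /* macro defined: the guarded code is compiled */
#else
    return 0;   /* macro undefined: the preprocessor drops the guarded code */
#endif
}
```

Deleting the #ifdef and #endif lines, while keeping the code between them, has the same effect as defining the macro: the guarded code is always compiled.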
firstname.lastname@example.org Voice: 818-351-0089 X127
Logicon RDA, Pasadena California Fax: 818-351-0699