gnumed-devel
[Gnumed-devel] Re: dbgen import csv scripts for gnumed.


From: sjtan
Subject: [Gnumed-devel] Re: dbgen import csv scripts for gnumed.
Date: Fri, 27 Aug 2004 00:59:55 +1000
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040616



> # run in the dbgen directory
> csv='../dbgen/dataset1.csv'
> So that seems to be FEBRL stuff?

Yes. The aim was to see whether gnumed can interface with febrl for the dedup (deduplication) use case.
The good news is that it wasn't too hard.

The steps to run a febrl dbgen test dataset with gnumed were:

1. Write script(s) to import a dbgen csv file into gnumed (correct_postcodes.py, csv_to_gnumed.py).

2. Write scripts to read gnumed data into febrl:
   a) write a convenient sql view (v_febrl_demo_read_au.sql)
   b) either
      i) read the view contents into a csv file (read_v_febrl_demo.py), then change the csv filename in a copy of febrl/project-deduplicate.py
      or
      ii) write a pgdb adaptation of DataSetSQL in febrl/dataset.py (DataSetPGSQL), then, in a copy of febrl/project-deduplicate.py, modify the indata structure to read from the modified DataSet
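
For anyone who wants to try option i) without the scripts at hand, here is a rough sketch of dumping a view to csv through a DB-API cursor. It is not read_v_febrl_demo.py itself. The function name dump_view_to_csv, the person table and its columns are made up for illustration, and sqlite3 stands in for pgdb only so the example runs without a PostgreSQL server. Only the view name comes from the steps above.

```python
import csv
import io
import sqlite3


def dump_view_to_csv(conn, view_name, out):
    """Write every row of view_name to out as CSV, header row first.

    conn is any DB-API 2.0 connection. Returns the number of data rows written.
    """
    cur = conn.cursor()
    # view_name is pasted into the SQL text, so it must come from trusted code.
    cur.execute("SELECT * FROM " + view_name)
    writer = csv.writer(out)
    # cursor.description holds one tuple per column; item 0 is the column name.
    writer.writerow([col[0] for col in cur.description])
    count = 0
    for row in cur.fetchall():
        writer.writerow(row)
        count += 1
    return count


def demo():
    # Hypothetical stand-in schema; the real gnumed tables and view differ.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person (id INTEGER, given_name TEXT, surname TEXT, postcode TEXT)"
    )
    conn.executemany(
        "INSERT INTO person VALUES (?, ?, ?, ?)",
        [(1, "anna", "smith", "2000"), (2, "bob", "jones", "3000")],
    )
    conn.execute(
        "CREATE VIEW v_febrl_demo_read_au AS "
        "SELECT id, given_name, surname, postcode FROM person ORDER BY id"
    )
    buf = io.StringIO(newline="")
    n = dump_view_to_csv(conn, "v_febrl_demo_read_au", buf)
    conn.close()
    return n, buf.getvalue()


if __name__ == "__main__":
    rows_written, text = demo()
    print(rows_written)
    print(text)
```

With a real database you would pass a pgdb connection and an open file instead of the in-memory stand-ins, then point the copy of febrl/project-deduplicate.py at the resulting file.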

Both worked, although ii) uncovered a bug in the read_records(self, start, number) method in febrl/dataset.py: self.next_record_num is incremented by number (the batch size, i.e. how many records to read) outside the batch processing loop, when it should be incremented with self.next_record_num += 1 inside the loop.
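
To show why the placement of the increment matters, here is a toy reader. It is not the febrl code: ToyDataSet, its fixed flag and read_all are invented for illustration, and the only things borrowed from febrl are the read_records(start, number) signature and the next_record_num attribute. It shows one way the two counters can diverge, namely when a batch comes back shorter than requested.

```python
class ToyDataSet:
    """Toy batch reader over an in-memory list (illustration only)."""

    def __init__(self, rows, fixed=True):
        self.rows = list(rows)
        self.fixed = fixed          # True: count per record; False: count per batch
        self.next_record_num = 0

    def read_records(self, start, number):
        batch = []
        # Slicing past the end simply yields fewer (or zero) rows.
        for row in self.rows[start:start + number]:
            batch.append(row)
            if self.fixed:
                # Corrected behaviour: advance once per record actually read.
                self.next_record_num += 1
        if not self.fixed:
            # Behaviour described as the bug: advance by the requested batch
            # size after the loop, even if fewer records were read.
            self.next_record_num += number
        return batch


def read_all(dataset, batch_size):
    """Read batches until an empty one comes back; return the final counter."""
    while True:
        batch = dataset.read_records(dataset.next_record_num, batch_size)
        if not batch:
            break
    return dataset.next_record_num


if __name__ == "__main__":
    rows = ["r0", "r1", "r2", "r3", "r4"]
    print(read_all(ToyDataSet(rows, fixed=True), 3))
    print(read_all(ToyDataSet(rows, fixed=False), 3))
```

With five rows and a batch size of three, the per-record counter ends at 5, matching the number of rows, while the per-batch counter ends at 9.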

I've put the modified dataset.py and datasetTest.py from febrl, as well as the gnumed scripts for febrl input and output, in test-area/febrl.









