I am currently trying to import GeoNames into our Virtuoso instance. To do so, we downloaded the GeoNames data (http://download.geonames.org/all-geonames-rdf.zip) and used the following script to import it:
Code:
create procedure DB.DBA.import_geonames_dataset(in rdfdump_path varchar, in graph_uri varchar)
{
  declare rdfdump_str, line, nb_processed, process_records, msg_str any;

  -- On a '2200*' error, jump to znext so the offending record is skipped
  declare exit handler for sqlstate '2200*'
  {
    goto znext;
  };

  rdfdump_str := file_to_string_output(rdfdump_path);
  nb_processed := 0;
  line := ses_read_line(rdfdump_str); -- URL line
  while(isstring(line))
  {
    nb_processed := nb_processed + 1;

    -- Progress report: print the current URL every 1000 records
    if(mod(nb_processed, 1000) = 0)
    {
      result_names(line);
      result(line);
    }

    -- Checkpoint every 50000 records
    if(mod(nb_processed, 50000) = 0)
    {
      result_names(msg_str);
      msg_str := 'Checkpoint in progress...';
      result(msg_str);
      exec('checkpoint');
      msg_str := 'Checkpoint finished';
      result_names(msg_str);
      result(msg_str);
    }

    line := ses_read_line(rdfdump_str); -- RDF document line
    DB.DBA.RDF_LOAD_RDFXML(line, graph_uri, graph_uri);
znext:;
    line := ses_read_line(rdfdump_str); -- skip: URL line
  }
};

DB.DBA.import_geonames_dataset('/data/geonames/all-geonames-rdf.txt', 'http://geonames.org');
We ran into a transaction-related problem, which we were able to fix by setting log_enable(2). Now the import has crashed again, reporting:
Code:
...
http://sws.geonames.org/8209796/
http://sws.geonames.org/8210796/
http://sws.geonames.org/8211796/
http://sws.geonames.org/8212796/
http://sws.geonames.org/8213796/
http://sws.geonames.org/8214796/
http://sws.geonames.org/8215796/
http://sws.geonames.org/8216796/
http://sws.geonames.org/8217796/
http://sws.geonames.org/8218796/
http://sws.geonames.org/8219796/
http://sws.geonames.org/8220796/
http://sws.geonames.org/8221796/
http://sws.geonames.org/8222796/
http://sws.geonames.org/8223796/
*** Error 08003: [Virtuoso Driver][Virtuoso Server]HT033: cannot read from session
at line 3 of Top-Level:
DB.DBA.import_geonames_dataset('/data/geonames/all-geonames-rdf.txt', 'http://geonames.org')
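For reference, a minimal sketch of how the log_enable(2) setting mentioned above would be applied. It assumes the call is issued in the same isql session, directly before the import; the exact placement is an assumption, not a record of the session that produced the error.

```sql
-- Assumption: run in the same isql session, right before the import.
-- log_enable(2) switches the connection to row-by-row autocommit
-- with transaction logging turned off.
log_enable(2);

DB.DBA.import_geonames_dataset('/data/geonames/all-geonames-rdf.txt', 'http://geonames.org');
```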
Searching the web, I found that Jens Lehmann also ran into this problem 5 years ago: http://sourceforge.net/mailarchive/foru ... nth=200810, but I can't find a solution to the problem in that thread. Can you tell me what to do to get the import running?
I don't know if it helps you, but doing a count on the GeoNames data
Code:
SELECT count(*) from <http://geonames.org> WHERE {?s ?p ?o}
says: 82170767
Joshua Bacher

