I've been pondering this for a fair amount of time now. I'm trying to download the data from Yahoo!'s Stock API. When you use the API, it gives you a .csv file. I've been looking at opencsv, which seems perfect, except I want to avoid downloading and saving the file, if at all possible.
OpenCSV, according to the examples, can only read from a FileReader. According to Oracle's docs on FileReader, the file needs to be local.
Is it possible to read from a remote file using OpenCSV without downloading?
According to the documentation, CSVReader takes a Reader argument, so it isn't limited to a FileReader.
To use a CSVReader without saving the file first, you could use a BufferedReader around a stream loading the data:
URL stockURL = new URL("http://example.com/stock.csv");
BufferedReader in = new BufferedReader(new InputStreamReader(stockURL.openStream()));
CSVReader reader = new CSVReader(in);
// use reader
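From there the reader behaves just as it would for a local file. For example, a minimal loop over the rows (readNext() returns null at the end of the stream):
String[] row;
while ((row = reader.readNext()) != null) {
    System.out.println(row[0]); // first column of each record
}
reader.close();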
Here is an implementation using opencsv to read a CSV file and save its contents to a database.
import com.opencsv.CSVParser;
import com.opencsv.CSVParserBuilder;
import com.opencsv.CSVReader;
import com.opencsv.CSVReaderBuilder;
import com.opencsv.bean.CsvBindByPosition;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import javax.persistence.Column;
import java.io.*;
import java.lang.annotation.Annotation;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
@Service
@Slf4j
public class FileUploadService {
@Autowired
private InsertCSVContentToDB csvContentToDB;
/**
* @param csvFileName location of the physical file.
* @param type Employee.class
* @param delimiter can be , | # etc.
* @param obj a new instance of the bean, e.g. new Employee()
*
* import com.opencsv.bean.CsvBindByPosition;
* import lombok.Data;
*
* import javax.persistence.Column;
*
* @Data
* public class Employee {
*
* @CsvBindByPosition(position = 0, required = true)
* @Column(name = "EMPLOYEE_NAME")
* private String employeeName;
*
* @CsvBindByPosition(position = 1)
* @Column(name = "EMPLOYEE_ADDRESS_1")
* private String employeeAddress1;
* }
*
* @param sqlQuery query used to save the data to the DB
* @param noOfLineSkip set to 0 (zero) so that no lines are skipped
* @param auditId an extra column, beyond the regular CSV columns, used for tracking (e.g. a file id or audit id)
*/
public <T> void readCSVContentInArray(String csvFileName, Class<? extends T> type, char delimiter, Object obj,
String sqlQuery, int noOfLineSkip, Long auditId) {
List<T> lstCsvContent = new ArrayList<>();
Reader reader = null;
CSVReader csv = null;
try {
reader = new BufferedReader(new InputStreamReader(new FileInputStream(csvFileName), "utf-8"));
// note: do not read from the reader here (e.g. for logging), or the first CSV row is silently consumed
CSVParser parser = new CSVParserBuilder().withSeparator(delimiter).withIgnoreQuotations(true).build();
csv = new CSVReaderBuilder(reader).withSkipLines(noOfLineSkip).withCSVParser(parser).build();
String[] nextLine;
int size = 0;
int chunkSize = 10000;
Class params[] = { Long.class };
Object paramsObj[] = { auditId };
long rowNumber = 0;
Field field[] = type.getDeclaredFields();
while ((nextLine = csv.readNext()) != null) {
rowNumber++;
try {
obj = type.newInstance();
for (Field f : field) {
if(!f.isSynthetic()){
f.setAccessible(true);
Annotation ann[] = f.getDeclaredAnnotations();
// assumes @CsvBindByPosition is declared before @Column on every field
CsvBindByPosition csv1 = (CsvBindByPosition) ann[0];
Column c = (Column) ann[1];
try {
if (csv1.position() < nextLine.length) {
if (csv1.required() && (nextLine[csv1.position()] == null
|| nextLine[csv1.position()].trim().isEmpty())) {
String message = "Mandatory field is missing in row: " + rowNumber;
log.info("null value in " + rowNumber + ", " + csv1.position());
System.out.println(message);
}
if (f.getType().equals(String.class)) {
f.set(obj, nextLine[csv1.position()]);
}
if (f.getType().equals(Boolean.class)) {
f.set(obj, Boolean.parseBoolean(nextLine[csv1.position()]));
}
if (f.getType().equals(Integer.class)) {
f.set(obj, Integer.parseInt(nextLine[csv1.position()]));
}
if (f.getType().equals(Long.class)) {
f.set(obj, Long.parseLong(nextLine[csv1.position()]));
}
if (f.getType().equals(Double.class) && null != nextLine[csv1.position()] && !nextLine[csv1.position()].trim().isEmpty()) {
f.set(obj, Double.parseDouble(nextLine[csv1.position()]));
}
if (f.getType().equals(Double.class) && (nextLine[csv1.position()] == null || nextLine[csv1.position()].isEmpty())) {
f.set(obj, new Double("0.0"));
}
if (f.getType().equals(Date.class)) {
// a String cannot be assigned to a Date field directly; parse it (adjust the pattern to your CSV)
f.set(obj, new java.text.SimpleDateFormat("yyyy-MM-dd").parse(nextLine[csv1.position()]));
}
}
} catch (Exception fttEx) {
log.info("Exception when parsing the file: " + fttEx.getMessage());
System.out.println(fttEx.getMessage());
}
}
}
lstCsvContent.add((T) obj);
if (lstCsvContent.size() > chunkSize) {
size = size + lstCsvContent.size();
// write code here to save this chunk to the database or file system
lstCsvContent = new ArrayList<>();
}
} catch (Exception ex) {
log.info("Exception: " + ex.getMessage());
}
}
// write code here to save the remaining records to the database or file system
System.out.println(lstCsvContent);
} catch (Exception ex) {
log.info("Exception:::::::: " + ex.getMessage());
} finally {
try {
if (csv != null) {
csv.close();
}
if (reader != null) {
reader.close();
}
} catch (IOException ioe) {
log.info("Exception when closing the file: " + ioe.getMessage());
}
}
log.info("File Processed successfully: ");
}
}
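For illustration, a hypothetical call using the Employee bean documented above (the injected fileUploadService instance, the path, the query, and the audit id are all placeholders):
fileUploadService.readCSVContentInArray(
    "/data/employees.csv",   // csvFileName (placeholder)
    Employee.class,          // type
    ',',                     // delimiter
    new Employee(),          // obj
    "INSERT INTO EMPLOYEE (EMPLOYEE_NAME, EMPLOYEE_ADDRESS_1, AUDIT_ID) VALUES (?, ?, ?)", // sqlQuery (placeholder)
    0,                       // noOfLineSkip
    123L);                   // auditId (placeholder)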
I am using the spark-excel (com.crealytics.spark.excel) library to read an Excel file. If there is no duplicate column in the Excel file, the library works fine. If any duplicate column name occurs, it throws the exception below.
How can I overcome this error? Is there any workaround?
Exception in thread "main" org.apache.spark.sql.AnalysisException: Found duplicate column(s) in the data schema: `net territory`;
at org.apache.spark.sql.util.SchemaUtils$.checkColumnNameDuplication(SchemaUtils.scala:85)
The exception is thrown when reading through the spark-excel API with the following code:
StructType schema = DataTypes.createStructType(new StructField[]{DataTypes.createStructField("CGISAI", DataTypes.StringType, true), DataTypes.createStructField("SALES TERRITORY", DataTypes.StringType, true)});
SQLContext sqlcxt = new SQLContext(jsc);
Dataset<Row> df = sqlcxt.read()
.format("com.crealytics.spark.excel")
.option("path", "file:///"+siteinfofile)
.option("useHeader", "true")
.option("spark.read.simpleMode", "true")
.option("treatEmptyValuesAsNulls", "true")
.option("inferSchema", "false")
.option("addColorColumns", "False")
.option("sheetName", "sheet1")
.option("startColumn", 22)
.option("endColumn", 23)
//.schema(schema)
.load();
return df;
I want a solution that identifies whether the Excel file has duplicate columns and, if so, renames or eliminates the duplicates.
The workaround is as follows: convert the .xlsx file into .csv, then read it with Spark's default CSV API, which handles duplicate column names by renaming them automatically.
Below is the code to convert an xlsx file to a csv file; a sketch of reading the result back follows it.
package com.huawei.java.tools;
/**
*
* @author Nanaji Jonnadula
*/
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.openxml4j.opc.PackageAccess;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.util.CellAddress;
import org.apache.poi.ss.util.CellReference;
import org.apache.poi.util.SAXHelper;
import org.apache.poi.xssf.eventusermodel.ReadOnlySharedStringsTable;
import org.apache.poi.xssf.eventusermodel.XSSFReader;
import org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler;
import org.apache.poi.xssf.model.StylesTable;
import org.apache.poi.xssf.usermodel.XSSFComment;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import javax.xml.parsers.ParserConfigurationException;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import static org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler.SheetContentsHandler;
public class ExcelXlsx2Csv {
private static class SheetToCSV implements SheetContentsHandler {
private boolean firstCellOfRow = false;
private int currentRow = -1;
private int currentCol = -1;
private StringBuffer lineBuffer = new StringBuffer();
/** * Destination for data */
private FileOutputStream outputStream;
public SheetToCSV(FileOutputStream outputStream) {
this.outputStream = outputStream;
}
@Override
public void startRow(int rowNum) {
/** * If there were gaps, output the missing rows: * outputMissingRows(rowNum - currentRow - 1); */
// Prepare for this row
firstCellOfRow = true;
currentRow = rowNum;
currentCol = -1;
lineBuffer.delete(0, lineBuffer.length()); //clear lineBuffer
}
@Override
public void endRow(int rowNum) {
lineBuffer.append('\n');
try {
outputStream.write(lineBuffer.substring(0).getBytes());
} catch (IOException e) {
System.out.println("save date to file error at row number: {}"+ currentCol);
throw new RuntimeException("save date to file error at row number: " + currentCol);
}
}
@Override
public void cell(String cellReference, String formattedValue, XSSFComment comment) {
if (firstCellOfRow) {
firstCellOfRow = false;
} else {
lineBuffer.append(',');
}
// gracefully handle missing CellRef here in a similar way as XSSFCell does
if (cellReference == null) {
cellReference = new CellAddress(currentRow, currentCol).formatAsString();
}
int thisCol = (new CellReference(cellReference)).getCol();
int missedCols = thisCol - currentCol - 1;
// pad one comma per skipped (empty) cell
for (int i = 0; i < missedCols; i++) {
lineBuffer.append(',');
}
currentCol = thisCol;
if (formattedValue.contains("\n")) {
formattedValue = formattedValue.replace("\n", "");
}
formattedValue = "\"" + formattedValue + "\"";
lineBuffer.append(formattedValue);
}
@Override
public void headerFooter(String text, boolean isHeader, String tagName) {
// Skip, no headers or footers in CSV
}
}
private static void processSheet(StylesTable styles, ReadOnlySharedStringsTable strings,
SheetContentsHandler sheetHandler, InputStream sheetInputStream) throws Exception {
DataFormatter formatter = new DataFormatter();
InputSource sheetSource = new InputSource(sheetInputStream);
try {
XMLReader sheetParser = SAXHelper.newXMLReader();
ContentHandler handler = new XSSFSheetXMLHandler(
styles, null, strings, sheetHandler, formatter, false);
sheetParser.setContentHandler(handler);
sheetParser.parse(sheetSource);
} catch (ParserConfigurationException e) {
throw new RuntimeException("SAX parser appears to be broken - " + e.getMessage());
}
}
public static void process(String srcFile, String destFile,String sheetname_) throws Exception {
File xlsxFile = new File(srcFile);
OPCPackage xlsxPackage = OPCPackage.open(xlsxFile.getPath(), PackageAccess.READ);
ReadOnlySharedStringsTable strings = new ReadOnlySharedStringsTable(xlsxPackage);
XSSFReader xssfReader = new XSSFReader(xlsxPackage);
StylesTable styles = xssfReader.getStylesTable();
XSSFReader.SheetIterator iter = (XSSFReader.SheetIterator) xssfReader.getSheetsData();
int index = 0;
while (iter.hasNext()) {
InputStream stream = iter.next();
String sheetName = iter.getSheetName();
System.out.println(sheetName + " [index=" + index + "]");
if(sheetName.equals(sheetname_)){
FileOutputStream fileOutputStream = new FileOutputStream(destFile);
processSheet(styles, strings, new SheetToCSV(fileOutputStream), stream);
fileOutputStream.flush();
fileOutputStream.close();
}
stream.close();
++index;
}
xlsxPackage.close();
}
public static void main(String[] args) throws Exception {
ExcelXlsx2Csv.process("D:\\data\\latest.xlsx", "D:\\data\\latest.csv","sheet1"); //source , destination, sheetname
}
}
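Once the CSV exists, it can be read back with Spark's built-in CSV source, which (as noted above) renames duplicate header columns automatically. A sketch reusing the sqlcxt SQLContext from the earlier snippet; the path and the csvDf variable name are placeholders:
Dataset<Row> csvDf = sqlcxt.read()
    .format("csv")
    .option("header", "true")
    .option("inferSchema", "false")
    .load("file:///D:/data/latest.csv");
csvDf.printSchema(); // duplicate header names come back deduplicated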
I want to load a gzipped rdf file into a org.eclipse.rdf4j.repository.Repository. During the upload, status messages must be logged to the console. The size of my rdf file is ~1GB of uncompressed or ~50MB of compressed data.
Actually, an RDF4J repository will already handle a compressed (zip/gzip) file correctly on its own. So you can simply do this:
RepositoryConnection conn = ... ; // your store connection
conn.add(new File("file.zip"), null, RDFFormat.NTRIPLES);
If you want to include reporting, a different (somewhat simpler) approach is to use an org.eclipse.rdf4j.repository.util.RDFLoader class in combination with an RDFInserter:
RepositoryConnection conn = ... ; // your store connection
RDFInserter inserter = new RDFInserter(conn);
RDFLoader loader = new RDFLoader(conn.getParserConfig(), conn.getValueFactory());
loader.load(new File("file.zip"), null, RDFFormat.NTRIPLES, inserter);
The RDFLoader takes care of properly uncompressing the (zip or gzip) file.
To get intermediate reporting you can wrap your RDFInserter in your own custom AbstractRDFHandler that does the counting and reporting (before passing each statement on to the wrapped inserter).
Variant 1
The following sample will load an InputStream with gzipped data into an in-memory rdf repository. The zipped format is supported directly by rdf4j.
Every 100000th statement will be printed to stdout using the RepositoryConnectionListenerAdapter.
import java.io.InputStream;
import org.eclipse.rdf4j.model.IRI;
import org.eclipse.rdf4j.model.Resource;
import org.eclipse.rdf4j.model.Value;
import org.eclipse.rdf4j.repository.Repository;
import org.eclipse.rdf4j.repository.RepositoryConnection;
import org.eclipse.rdf4j.repository.event.base.NotifyingRepositoryConnectionWrapper;
import org.eclipse.rdf4j.repository.event.base.RepositoryConnectionListenerAdapter;
import org.eclipse.rdf4j.repository.sail.SailRepository;
import org.eclipse.rdf4j.rio.RDFFormat;
import org.eclipse.rdf4j.sail.memory.MemoryStore;
public class MyTripleStore {
Repository repo;
/**
* Creates an in-memory triple store
*/
public MyTripleStore() {
repo = new SailRepository(new MemoryStore());
repo.initialize();
}
/**
* @param in gzip-compressed data on an InputStream
* @param format the format of the streamed data
*/
public void loadZippedFile(InputStream in, RDFFormat format) {
System.out.println("Load zip file of format " + format);
try (NotifyingRepositoryConnectionWrapper con =
new NotifyingRepositoryConnectionWrapper(repo, repo.getConnection());) {
RepositoryConnectionListenerAdapter myListener =
new RepositoryConnectionListenerAdapter() {
private long count = 0;
@Override
public void add(RepositoryConnection arg0, Resource arg1, IRI arg2,
Value arg3, Resource... arg4) {
count++;
if (count % 100000 == 0)
System.out.println("Add statement number " + count + "\n"
+ arg1+ " " + arg2 + " " + arg3);
}
};
con.addRepositoryConnectionListener(myListener);
con.add(in, "", format);
} catch (Exception e) {
throw new RuntimeException(e);
}
}
}
Variant 2
This variant implements an AbstractRDFHandler to provide the reporting.
import java.io.InputStream;
import org.eclipse.rdf4j.model.Statement;
import org.eclipse.rdf4j.repository.Repository;
import org.eclipse.rdf4j.repository.RepositoryConnection;
import org.eclipse.rdf4j.repository.sail.SailRepository;
import org.eclipse.rdf4j.repository.util.RDFInserter;
import org.eclipse.rdf4j.repository.util.RDFLoader;
import org.eclipse.rdf4j.rio.RDFFormat;
import org.eclipse.rdf4j.rio.helpers.AbstractRDFHandler;
import org.eclipse.rdf4j.sail.memory.MemoryStore;
public class MyTripleStore {
Repository repo;
/**
* Creates an in-memory triple store
*/
public MyTripleStore() {
repo = new SailRepository(new MemoryStore());
repo.initialize();
}
/**
* @param in gzip-compressed data on an InputStream
* @param format the format of the streamed data
*/
public void loadZippedFile1(InputStream in, RDFFormat format) {
try (RepositoryConnection con = repo.getConnection()) {
MyRdfInserter inserter = new MyRdfInserter(con);
RDFLoader loader =
new RDFLoader(con.getParserConfig(), con.getValueFactory());
loader.load(in, "", RDFFormat.NTRIPLES, inserter);
} catch (Exception e) {
throw new RuntimeException(e);
}
}
class MyRdfInserter extends AbstractRDFHandler {
RDFInserter rdfInserter;
int count = 0;
public MyRdfInserter(RepositoryConnection con) {
rdfInserter = new RDFInserter(con);
}
@Override
public void handleStatement(Statement st) {
count++;
if (count % 100000 == 0)
System.out.println("Add statement number " + count + "\n"
+ st.getSubject().stringValue() + " "
+ st.getPredicate().stringValue() + " "
+ st.getObject().stringValue());
rdfInserter.handleStatement(st);
}
}
}
Here is how to call the code (this calls Variant 1; for Variant 2, call loadZippedFile1 instead):
MyTripleStore ts = new MyTripleStore();
ts.loadZippedFile(new FileInputStream("your-ntriples-zipped.gz"),
RDFFormat.NTRIPLES);
In my quest to create a simple Java program to extract tweets from Twitter's streaming API, I have modified this (http://cotdp.com/dl/TwitterConsumer.java) code snippet to work with the OAuth method. The result is the below code, which when executed, throws a Connection Refused Exception.
I am aware of Twitter4J; however, I want to create a program that relies as little as possible on other APIs.
I have done my research and it looks like the oauth.signpost library is suitable for Twitter's streaming API. I have also ensured my authentication details are correct. My Twitter Access level is 'Read-only'.
I couldn't find a simple Java example that shows how to use the streaming API without relying on e.g. Twitter4j.
import java.io.BufferedReader;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import org.apache.http.HttpResponse;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import oauth.signpost.OAuthConsumer;
import oauth.signpost.commonshttp.CommonsHttpOAuthConsumer;
/**
* A hacky little class illustrating how to receive and store Twitter streams
* for later analysis, requires Apache Commons HTTP Client 4+. Stores the data
* in 64MB long JSON files.
*
* Usage:
*
* TwitterConsumer t = new TwitterConsumer("username", "password",
* "http://stream.twitter.com/1/statuses/sample.json", "sample");
* t.start();
*/
public class TwitterConsumer extends Thread {
//
static String STORAGE_DIR = "/tmp";
static long BYTES_PER_FILE = 64 * 1024 * 1024;
//
public long Messages = 0;
public long Bytes = 0;
public long Timestamp = 0;
private String accessToken = "";
private String accessSecret = "";
private String consumerKey = "";
private String consumerSecret = "";
private String feedUrl;
private String filePrefix;
boolean isRunning = true;
File file = null;
FileWriter fw = null;
long bytesWritten = 0;
public static void main(String[] args) {
TwitterConsumer t = new TwitterConsumer(
"XXX",
"XXX",
"XXX",
"XXX",
"http://stream.twitter.com/1/statuses/sample.json", "sample");
t.start();
}
public TwitterConsumer(String accessToken, String accessSecret, String consumerKey, String consumerSecret, String url, String prefix) {
this.accessToken = accessToken;
this.accessSecret = accessSecret;
this.consumerKey = consumerKey;
this.consumerSecret = consumerSecret;
feedUrl = url;
filePrefix = prefix;
Timestamp = System.currentTimeMillis();
}
/**
* #throws IOException
*/
private void rotateFile() throws IOException {
// Handle the existing file
if (fw != null)
fw.close();
// Create the next file
file = new File(STORAGE_DIR, filePrefix + "-"
+ System.currentTimeMillis() + ".json");
bytesWritten = 0;
fw = new FileWriter(file);
System.out.println("Writing to " + file.getAbsolutePath());
}
/**
* #see java.lang.Thread#run()
*/
public void run() {
// Open the initial file
try { rotateFile(); } catch (IOException e) { e.printStackTrace(); return; }
// Run loop
while (isRunning) {
try {
OAuthConsumer consumer = new CommonsHttpOAuthConsumer(consumerKey, consumerSecret);
consumer.setTokenWithSecret(accessToken, accessSecret);
HttpGet request = new HttpGet(feedUrl);
consumer.sign(request);
DefaultHttpClient client = new DefaultHttpClient();
HttpResponse response = client.execute(request);
BufferedReader reader = new BufferedReader(
new InputStreamReader(response.getEntity().getContent()));
while (true) {
String line = reader.readLine();
if (line == null)
break;
if (line.length() > 0) {
if (bytesWritten + line.length() + 1 > BYTES_PER_FILE)
rotateFile();
fw.write(line + "\n");
bytesWritten += line.length() + 1;
Messages++;
Bytes += line.length() + 1;
}
}
} catch (Exception e) {
e.printStackTrace();
}
System.out.println("Sleeping before reconnect...");
try { Thread.sleep(15000); } catch (Exception e) { }
}
}
}
I tried running the code and found that the error was very simple: you should use https instead of http in the URL :)
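With that fix, the call in main becomes:
TwitterConsumer t = new TwitterConsumer(
    "XXX", "XXX", "XXX", "XXX",
    "https://stream.twitter.com/1/statuses/sample.json", "sample");
t.start();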
I have 3 methods:
one to open the file,
one to read the file,
and one to return what was read by the read method.
This is my code:
package javaapplication56;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.rmi.RemoteException;
import java.util.logging.Level;
import java.util.logging.Logger;
/**
*
* @author x
*/
public class RemoteFileObjectImpl extends java.rmi.server.UnicastRemoteObject implements RemoteFileObject
{
public RemoteFileObjectImpl() throws java.rmi.RemoteException {
super();
}
File f = null;
FileReader r = null;
BufferedReader bfr = null;
String output = "";
public void open(String fileName) {
//To read file passWord
f = new File(fileName);
}
public String readLine() {
try {
String temp = "";
String newLine = System.getProperty("line.separator");
r = new FileReader(f);
while ((temp = bfr.readLine()) != null) {
output += temp + newLine;
bfr.close();
}
}
catch (IOException ex) {
ex.printStackTrace();
}
return output;
}
public void close() {
try {
bfr.close();
} catch (IOException ex) {
}
}
public static void main(String[]args) throws RemoteException{
RemoteFileObjectImpl m = new RemoteFileObjectImpl();
m.open("C:\\Users\\x\\Documents\\txt.txt");
m.readLine();
m.close();
}
}
But it does not work.
What do you expect it to do? You are not doing anything with the line you read; you just call
m.readLine();
Instead:
String result = m.readLine();
or use the output variable that you saved.
Do you want to save it to a variable, print it, or write it to another file?
Update: after your update in the comments:
Your variable bfr is never created/initialized. You are only doing this:
r = new FileReader(f);
so bfr is still null.
You should do something like this instead:
bfr = new BufferedReader(new FileReader(f));
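Putting it together, a minimal corrected readLine() (also moving the close() out of the loop, so the whole file is read) might look like this:
public String readLine() {
    try {
        String temp;
        String newLine = System.getProperty("line.separator");
        bfr = new BufferedReader(new FileReader(f)); // initialize bfr, not just r
        while ((temp = bfr.readLine()) != null) {
            output += temp + newLine; // accumulate every line
        }
        bfr.close(); // close once, after the loop
    } catch (IOException ex) {
        ex.printStackTrace();
    }
    return output;
}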
I can't seem to find sample code for constructing a Berkeley DB in Java and inserting records into it. Any samples? And I do not mean the Berkeley DB Java Edition either.
http://www.oracle.com/technology/documentation/berkeley-db/db/programmer_reference/BDB_Prog_Reference.pdf
Chapter 5
If you download db-5.0.21.NC.zip you will see plenty of samples.
Here is one that seems to do what you want
/*-
* See the file LICENSE for redistribution information.
*
* Copyright (c) 2004, 2010 Oracle and/or its affiliates. All rights reserved.
*
* $Id$
*/
// File: ExampleDatabaseLoad.java
package db.GettingStarted;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;
import java.util.Vector;
import com.sleepycat.bind.EntryBinding;
import com.sleepycat.bind.serial.SerialBinding;
import com.sleepycat.bind.tuple.TupleBinding;
import com.sleepycat.db.DatabaseEntry;
import com.sleepycat.db.DatabaseException;
public class ExampleDatabaseLoad {
private static String myDbsPath = "./";
private static File inventoryFile = new File("./inventory.txt");
private static File vendorsFile = new File("./vendors.txt");
// DatabaseEntries used for loading records
private static DatabaseEntry theKey = new DatabaseEntry();
private static DatabaseEntry theData = new DatabaseEntry();
// Encapsulates the databases.
private static MyDbs myDbs = new MyDbs();
private static void usage() {
System.out.println("ExampleDatabaseLoad [-h <database home>]");
System.out.println(" [-i <inventory file>] [-v <vendors file>]");
System.exit(-1);
}
public static void main(String args[]) {
ExampleDatabaseLoad edl = new ExampleDatabaseLoad();
try {
edl.run(args);
} catch (DatabaseException dbe) {
System.err.println("ExampleDatabaseLoad: " + dbe.toString());
dbe.printStackTrace();
} catch (Exception e) {
System.out.println("Exception: " + e.toString());
e.printStackTrace();
} finally {
myDbs.close();
}
System.out.println("All done.");
}
private void run(String args[])
throws DatabaseException {
// Parse the arguments list
parseArgs(args);
myDbs.setup(myDbsPath);
System.out.println("loading vendors db....");
loadVendorsDb();
System.out.println("loading inventory db....");
loadInventoryDb();
}
private void loadVendorsDb()
throws DatabaseException {
// loadFile opens a flat-text file that contains our data
// and loads it into a list for us to work with. The integer
// parameter represents the number of fields expected in the
// file.
List vendors = loadFile(vendorsFile, 8);
// Now load the data into the database. The vendor's name is the
// key, and the data is a Vendor class object.
// Need a serial binding for the data
EntryBinding dataBinding =
new SerialBinding(myDbs.getClassCatalog(), Vendor.class);
for (int i = 0; i < vendors.size(); i++) {
String[] sArray = (String[])vendors.get(i);
Vendor theVendor = new Vendor();
theVendor.setVendorName(sArray[0]);
theVendor.setAddress(sArray[1]);
theVendor.setCity(sArray[2]);
theVendor.setState(sArray[3]);
theVendor.setZipcode(sArray[4]);
theVendor.setBusinessPhoneNumber(sArray[5]);
theVendor.setRepName(sArray[6]);
theVendor.setRepPhoneNumber(sArray[7]);
// The key is the vendor's name.
// ASSUMES THE VENDOR'S NAME IS UNIQUE!
String vendorName = theVendor.getVendorName();
try {
theKey = new DatabaseEntry(vendorName.getBytes("UTF-8"));
} catch (IOException willNeverOccur) {}
// Convert the Vendor object to a DatabaseEntry object
// using our SerialBinding
dataBinding.objectToEntry(theVendor, theData);
// Put it in the database.
myDbs.getVendorDB().put(null, theKey, theData);
}
}
private void loadInventoryDb()
throws DatabaseException {
// loadFile opens a flat-text file that contains our data
// and loads it into a list for us to work with. The integer
// parameter represents the number of fields expected in the
// file.
List inventoryArray = loadFile(inventoryFile, 6);
// Now load the data into the database. The item's sku is the
// key, and the data is an Inventory class object.
// Need a tuple binding for the Inventory class.
TupleBinding inventoryBinding = new InventoryBinding();
for (int i = 0; i < inventoryArray.size(); i++) {
String[] sArray = (String[])inventoryArray.get(i);
String sku = sArray[1];
try {
theKey = new DatabaseEntry(sku.getBytes("UTF-8"));
} catch (IOException willNeverOccur) {}
Inventory theInventory = new Inventory();
theInventory.setItemName(sArray[0]);
theInventory.setSku(sArray[1]);
theInventory.setVendorPrice((new Float(sArray[2])).floatValue());
theInventory.setVendorInventory((new Integer(sArray[3])).intValue());
theInventory.setCategory(sArray[4]);
theInventory.setVendor(sArray[5]);
// Place the Vendor object on the DatabaseEntry object using our
// the tuple binding we implemented in InventoryBinding.java
inventoryBinding.objectToEntry(theInventory, theData);
// Put it in the database. Note that this causes our secondary database
// to be automatically updated for us.
myDbs.getInventoryDB().put(null, theKey, theData);
}
}
private static void parseArgs(String args[]) {
for(int i = 0; i < args.length; ++i) {
if (args[i].startsWith("-")) {
switch(args[i].charAt(1)) {
case 'h':
myDbsPath = new String(args[++i]);
break;
case 'i':
inventoryFile = new File(args[++i]);
break;
case 'v':
vendorsFile = new File(args[++i]);
break;
default:
usage();
}
}
}
}
private List loadFile(File theFile, int numFields) {
List records = new ArrayList();
try {
String theLine = null;
FileInputStream fis = new FileInputStream(theFile);
BufferedReader br = new BufferedReader(new InputStreamReader(fis));
while((theLine=br.readLine()) != null) {
String[] theLineArray = splitString(theLine, "#");
if (theLineArray.length != numFields) {
System.out.println("Malformed line found in " + theFile.getPath());
System.out.println("Line was: '" + theLine);
System.out.println("length found was: " + theLineArray.length);
System.exit(-1);
}
records.add(theLineArray);
}
fis.close();
} catch (FileNotFoundException e) {
System.err.println(theFile.getPath() + " does not exist.");
e.printStackTrace();
usage();
} catch (IOException e) {
System.err.println("IO Exception: " + e.toString());
e.printStackTrace();
System.exit(-1);
}
return records;
}
private static String[] splitString(String s, String delimiter) {
Vector resultVector = new Vector();
StringTokenizer tokenizer = new StringTokenizer(s, delimiter);
while (tokenizer.hasMoreTokens())
resultVector.add(tokenizer.nextToken());
String[] resultArray = new String[resultVector.size()];
resultVector.copyInto(resultArray);
return resultArray;
}
protected ExampleDatabaseLoad() {}
}
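If you only need the bare minimum (open a database file and insert one record) without the guide's MyDbs helper, a sketch along these lines should work against the core BDB Java API; the file name, key, and value are placeholders:
import com.sleepycat.db.Database;
import com.sleepycat.db.DatabaseConfig;
import com.sleepycat.db.DatabaseEntry;
import com.sleepycat.db.DatabaseType;

public class MinimalBdbExample {
    public static void main(String[] args) throws Exception {
        DatabaseConfig config = new DatabaseConfig();
        config.setAllowCreate(true);        // create the file if it does not exist
        config.setType(DatabaseType.BTREE); // standard access method

        // opens (or creates) ./sample.db without an environment
        Database db = new Database("sample.db", null, config);

        DatabaseEntry key = new DatabaseEntry("myKey".getBytes("UTF-8"));
        DatabaseEntry data = new DatabaseEntry("myValue".getBytes("UTF-8"));
        db.put(null, key, data); // null = no explicit transaction

        db.close();
    }
}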
There are a number of good Getting Started Guides for Oracle Berkeley DB Java Edition. They are included in the documentation set. You'll find example code in the documentation. If you're new to Oracle Berkeley DB Java Edition that's the right place to start. There are other examples in the download package as well.
I'm the product manager for Oracle Berkeley DB, I hope this addressed your question. If not please let me know how else I can help you.