Performance Tuning While Loading CSV - java

I have attached the code below.
Functionality
It reads a CSV file and inserts the rows into the database after substituting values with WebMacro.
The first line of the CSV holds the header fields (NO, NAME). Each following line is read one at a time and its values are put into the WebMacro context, e.g. context.put("NO","1") and context.put("NAME","RAJARAJAN"). WebMacro then replaces $(NO) with 1 and $(NAME) with RAJARAJAN, and the resulting statement is added to a statement batch. Once the batch reaches 1000 statements, it is executed.
The code works as intended, but it takes 4 minutes to process 50,000 records. I need to improve the performance or change the logic. Let me know if anything is unclear.
Any change that drastically improves performance is welcome.
Note: I use WebMacro to replace $(NO) and $(NAME) in the merge query with the values read from the CSV.
Bala.csv
NO?NAME
1?RAJARAJAN
2?ARUN
3?ARUNKUMAR
Connection con=null;
Statement stmt=null;
Connection con1=null;
int counter=0;
try{
WebMacro wm = new WM();
Context context = wm.getContext();
String strFilePath = "/home/vbalamurugan/3A/email-1822820895/Bala.csv";
String msg="merge into temp2 A using
(select '$(NO)' NO,'$(NAME)' NAME from dual)B on(A.NO=B.NO)
when not matched then insert (NO,NAME)
values(B.NO,B.NAME) when matched then
update set A.NAME='Attai' where A.NO='$(NO)'";
String[]rowsAsTokens;
con=getOracleConnection("localhost","raymedi_hq","raymedi_hq","XE");
con.setAutoCommit(false);
stmt=con.createStatement();
File file = new File(strFilePath);
Scanner scanner = new Scanner(file);
try {
String headerField;
String header[];
headerField=scanner.nextLine();
header=headerField.split("\\?");
long start=System.currentTimeMillis();
while(scanner.hasNext()) {
String scan[]=scanner.nextLine().split("\\?");
for(int i=0;i<scan.length;i++){
context.put(header[i],scan[i]);
}
if(context.size()>0){
String m=replacingWebMacroStatement(msg,wm,context);
stmt.addBatch(m);
counter++;
if(counter>=1000){
stmt.executeBatch();
stmt.clearBatch();
counter=0;
}
}
}
long b=System.currentTimeMillis()-start;
System.out.println("=======Total Time Taken"+b);
}catch(Exception e){
e.printStackTrace();
}
finally {
scanner.close();
}
stmt.executeBatch();
stmt.clearBatch();
stmt.close();
}catch(Exception e){
e.printStackTrace();
con.rollback();
}finally{
con.commit();
}
// Method For replace webmacro with $
public static String replacingWebMacroStatement(String Query, WebMacro wm,Context context) throws Exception {
Template template = new StringTemplate(wm.getBroker(), Query);
template.parse();
String macro_replaced = template.evaluateAsString(context);
return macro_replaced;
}
// for getting oracle connection
public static Connection getOracleConnection(String IPaddress,String username,String password,String Tns)throws SQLException{
Connection connection = null;
try{
String baseconnectionurl ="jdbc:oracle:thin:@"+IPaddress+":1521:"+Tns;
String driver = "oracle.jdbc.driver.OracleDriver";
String user = username;
String pass = password;
Class.forName(driver);
connection=DriverManager.getConnection(baseconnectionurl,user,pass);
}catch(Exception e){
e.printStackTrace();
}
return connection;
}

I can tell you that this code takes on average about 150ms on my machine:
StrTokenizer tokenizer = StrTokenizer.getCSVInstance();
for (int i=0;i<50000;i++) {
tokenizer.reset("a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z");
String toks[] = tokenizer.getTokenArray();
}
You'll find StrTokenizer in the Apache commons-lang package, but I doubt that String.split(), StringTokenizer or Scanner.nextLine() would be your bottleneck in any case. I would assume it's the time spent on your database inserts.
If that's the case, you can do one of two things:
1) Tune your batch size.
2) Multithread the inserts.
And as suggested, a profiler will help determine where your time is spent.
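As a rough sketch of the first option (not the asker's code, and not tested against a live Oracle instance): the WebMacro substitution can be dropped in favour of a single PreparedStatement with bind parameters, so the MERGE text stays the same for every row and each batch is sent together. The class name CsvMergeLoader, its method names and the batchSize parameter are hypothetical; the table and columns are taken from the question.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class CsvMergeLoader {
    // Same MERGE as in the question, with bind parameters instead of $(NO)/$(NAME).
    static final String MERGE_SQL =
        "merge into temp2 A using (select ? NO, ? NAME from dual) B on (A.NO = B.NO) "
      + "when not matched then insert (NO, NAME) values (B.NO, B.NAME) "
      + "when matched then update set A.NAME = 'Attai' where A.NO = ?";

    // Splits one data line on the '?' delimiter used in Bala.csv, keeping empty trailing fields.
    static String[] splitFields(String line) {
        return line.split("\\?", -1);
    }

    // Queues one MERGE per line and executes the batch every batchSize rows.
    // Returns the number of rows queued. The caller is expected to commit.
    static int load(Connection con, Iterable<String> dataLines, int batchSize) throws SQLException {
        int queued = 0;
        int total = 0;
        try (PreparedStatement ps = con.prepareStatement(MERGE_SQL)) {
            for (String line : dataLines) {
                String[] f = splitFields(line);
                if (f.length < 2) {
                    throw new IllegalArgumentException("Expected NO?NAME but got: " + line);
                }
                ps.setString(1, f[0]); // NO
                ps.setString(2, f[1]); // NAME
                ps.setString(3, f[0]); // NO again, for the update's where clause
                ps.addBatch();
                total++;
                if (++queued == batchSize) {
                    ps.executeBatch();
                    queued = 0;
                }
            }
            if (queued > 0) {
                ps.executeBatch(); // flush the final partial batch
            }
        }
        return total;
    }
}
```

Calling load(con, lines, 1000) followed by con.commit() mirrors the batch size used in the question; trying a few other sizes and timing each run is one way to tune it.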

Related

Append row in Sql without replacing previous row java

So I have a database with values like this...
I'm trying to append values using INSERT INTO without replacing the existing rows; the data comes from this txt file...
But when I reload/refresh the database, no new data has been appended.
Here is my code:
public static void importDatabase(String fileData){
try{
File database = new File(fileData);
FileReader fileInput = new FileReader(database);
BufferedReader in = new BufferedReader(fileInput);
String line = in.readLine();
line = in.readLine();
String[] data;
while (line != null){
data = line.split(",");
int ID = Integer.parseInt(data[0]);
String Nama = data[1];
int Gaji = Integer.parseInt(data[2]);
int Absensi = Integer.parseInt(data[3]);
int cuti = Integer.parseInt(data[4]);
String Status = data[5];
String query = "insert into list_karyawan values(?,?,?,?,?,?)";
ps = getConn().prepareStatement(query);
ps.setInt(1,ID);
ps.setString(2,Nama);
ps.setInt(3,Gaji);
ps.setInt(4,Absensi);
ps.setInt(5,cuti);
ps.setString(6,Status);
line = in.readLine();
}
ps.executeUpdate();
ps.close();
con.close();
System.out.println("Database Updated");
in.close();
}catch (Exception e){
System.out.println(e);
}
}
When I run it, it shows no error, but the data never gets into the database. Where did I go wrong?
Auto-commit mode is enabled by default.
The JDBC driver throws a SQLException when a commit or rollback operation is performed on a connection that has auto-commit set to true.
Symptoms of the problem can be unexpected application behavior.
Update the JVM configuration for the ActiveMatrix BPM node to use the following Oracle connection property:
autoCommitSpecCompliant=false
Try it once.
Note: I am not able to post this as a comment, so I posted it as an answer.
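Independently of the auto-commit setting, one thing visible in the question's code is that a new PreparedStatement is prepared for every line, but executeUpdate() is called only once, after the loop, so at most the last row is ever sent. A minimal sketch of the loop with one statement that is executed for each row follows. KaryawanImporter, parseLine and importRows are hypothetical names, and the column types are assumed from the question's setInt/setString calls.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class KaryawanImporter {
    // Splits one CSV line into exactly six trimmed fields, or fails loudly.
    static String[] parseLine(String line) {
        String[] data = line.split(",", -1);
        if (data.length != 6) {
            throw new IllegalArgumentException("Expected 6 fields but got " + data.length + ": " + line);
        }
        for (int i = 0; i < data.length; i++) {
            data[i] = data[i].trim();
        }
        return data;
    }

    // Inserts every data line and returns the number of rows the driver reports as inserted.
    static int importRows(Connection con, Reader source) throws IOException, SQLException {
        String query = "insert into list_karyawan values(?,?,?,?,?,?)";
        int rows = 0;
        try (BufferedReader in = new BufferedReader(source);
             PreparedStatement ps = con.prepareStatement(query)) {
            in.readLine(); // skip the first line, as the question's code does
            String line;
            while ((line = in.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // assumption: blank lines are ignored
                }
                String[] d = parseLine(line);
                ps.setInt(1, Integer.parseInt(d[0]));
                ps.setString(2, d[1]);
                ps.setInt(3, Integer.parseInt(d[2]));
                ps.setInt(4, Integer.parseInt(d[3]));
                ps.setInt(5, Integer.parseInt(d[4]));
                ps.setString(6, d[5]);
                rows += ps.executeUpdate(); // executed inside the loop, once per row
            }
        }
        return rows;
    }
}
```

If the connection has auto-commit turned off, a con.commit() after importRows returns is still needed.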

java - Jar file can't be executed on another pc

I have an executable jar file that I built from my program and ran on my PC. It works perfectly fine when I run it from the command prompt using java -jar [nameofjar.jar]
However, when I tested it on another PC, running the same jar file from the command prompt throws an error:
D:\QA06122018_2>java -jar Indexing.jar
java.lang.NullPointerException
at IndexDriver.processText(IndexDriver.java:81)
at IndexDriver.index(IndexDriver.java:140)
at Main.main(Main.java:44)
Both PCs use the same operating system and settings.
I even looked at the code the error points to and there doesn't seem to be any problem with it. It ran fine in my IDE.
Is there anything I might have overlooked?
EDIT:
The code :
public PreparedStatement preparedStatement = null;
MysqlAccessIndex con = new MysqlAccessIndex();
public Connection con1 = con.connect();
String path1;
public void index() throws Exception {
// Connection con1 = con.connect();
try {
Statement statement = con1.createStatement();
ResultSet rs = statement.executeQuery("select * from filequeue where Status='Active' LIMIT 5");
while (rs.next()) {
// get the filepath of the PDF document
path1 = rs.getString(2);
int getNum = rs.getInt(1);
Statement test = con1.createStatement();
test.executeUpdate("update filequeue SET STATUS ='Processing' where UniqueID="+getNum);
try {
// call the index function
PDDocument document = PDDocument.load(new File(path1),MemoryUsageSetting.setupTempFileOnly());
if (!document.isEncrypted()) {
PDFTextStripper tStripper = new PDFTextStripper();
for(int p=1; p<=document.getNumberOfPages();++p) {
tStripper.setStartPage(p);
tStripper.setEndPage(p);
try {
String pdfFileInText = tStripper.getText(document);
processText(pdfFileInText);
System.out.println("Page "+p+" done");
}catch (Exception e){
e.printStackTrace();
Statement statement1 = con1.createStatement();
statement1.executeUpdate("update filequeue SET Error ='E0003' where UniqueID="+getNum);
statement1.executeUpdate("update filequeue SET Status ='Error' where UniqueID="+getNum);
con1.commit();
con1.close();
}
}
}
// After completing the process, update status: Complete
Statement pre= con1.createStatement();
pre.executeUpdate("update filequeue SET STATUS ='Complete' where UniqueID="+getNum);
// con1.commit();
preparedStatement.close();
document.close();
System.out.println("Successfully commited changes to the database!");
con1.commit();
// con1.close();
// updateComplete_DB(getNum);
} catch (Exception e) {
try {
System.err.println(e);
Statement statement1 = con1.createStatement();
statement1.executeUpdate("update filequeue SET STATUS ='Error' where UniqueID="+getNum);
statement1.executeUpdate("update filequeue SET Error ='E0002' where UniqueID="+getNum);
con1.commit();
// add rollback function
rollbackEntries();
}catch (Exception e1){
System.out.println("Could not rollback updates :" + e1.getMessage());
}
}
// con1.close();
}
}catch(Exception e){
e.printStackTrace();
//System.out.println("lalala");
}
//con1.commit();
con1.close();
}
Calling the method:
public void processText(String text) throws SQLException {
String lines[] = text.split("\\r?\\n");
for (String line : lines) {
String[] words = line.split(" ");
String sql="insert IGNORE into test.indextable values (?,?);";
preparedStatement = con1.prepareStatement(sql);
int i=0;
for (String word : words) {
// check if one or more special characters at end of string then remove OR
// check special characters in beginning of the string then remove
// insert every word directly to table db
word=word.replaceAll("([\\W]+$)|(^[\\W]+)", "");
preparedStatement.setString(1, path1);
preparedStatement.setString(2, word);
preparedStatement.executeUpdate();
}
}
preparedStatement.close();
}
The root cause is that there were no lines to process.
You appear to only create prepared statements inside the for (String line : lines) { loop. But you only close the last statement you created (outside that loop).
When you don't have any lines, preparedStatement is null, because you never created one.
Even when you have lines to process, you are creating lots of prepared statements but only closing the last one.
You should probably create one prepared statement at the start of the method and reuse it for the whole method, closing it at the end.
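A minimal sketch of that suggestion, assuming the same table and regex as in the question. WordIndexer and its method names are hypothetical, and skipping words that become empty after cleaning is an added assumption, not something the question's code does.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class WordIndexer {
    // Strips non-word characters from the start and end of a word (regex from the question).
    static String cleanWord(String word) {
        return word.replaceAll("([\\W]+$)|(^[\\W]+)", "");
    }

    // Prepares the statement once, reuses it for every word, and closes it even when
    // the text contains no lines at all (try-with-resources handles that case).
    static int processText(Connection con, String path, String text) throws SQLException {
        String sql = "insert IGNORE into test.indextable values (?,?)";
        int inserted = 0;
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            for (String line : text.split("\\r?\\n")) {
                for (String word : line.split(" ")) {
                    String cleaned = cleanWord(word);
                    if (cleaned.isEmpty()) {
                        continue; // assumption: nothing useful to index
                    }
                    ps.setString(1, path);
                    ps.setString(2, cleaned);
                    inserted += ps.executeUpdate();
                }
            }
        }
        return inserted;
    }
}
```

Because the statement is a local variable scoped to the try block, there is no shared field left to be null when close() is called.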

Java Insert multi row data from file.txt to table of database [duplicate]

I have a text file consisting of several lines.
I want to add all of the lines to a database table.
Before a line is inserted into the table, it has to be split with substring to get the field values for the table. I think my code (query) is not good for big data. I know there are other ways to do this.
public class ReaderFilesData {
LinkedList<String> listFiles = new LinkedList<String>();
private Path path = Paths.get("src/FilesDownloaded/");
DataTRX dataTRX = new DataTRX();
public void readFiles() {
File[] listFile = new File(path.toString()).listFiles();
for (File file : listFile) {
if (file.isFile()) {
listFiles.add(file.getName());
}
}
System.out.println("Total Files : " +listFiles.size());
}
public void readData() {
Path pathsourceFile;
String line;
BufferedReader reader;
for (int i=0; i<listFiles.size(); i++) {
try {
String fileName = listFiles.get(i);
System.out.println("FileName : " +fileName);
pathsourceFile = Paths.get("src/FilesDownloaded/"+fileName+"");
reader = new BufferedReader(new FileReader(pathsourceFile.toString()));
while ((line = reader.readLine())!=null) {
int startPoint = line.lastIndexOf(';')+1;
String valueLine = new String(line.substring(startPoint));
System.out.println("Transaction data : " +valueLine);
dataTRX.setId(valueLine.substring(0,2));
dataTRX.setAmount(Integer.parseInt(valueLine.substring(2, 10)));
dataTRX.setDesc(valueLine.substring(10, 18));
System.out.println("getId : " + dataTRX.getId());
System.out.println("getAmount : " + dataTRX.getAmount());
System.out.println("getDesc : " + dataTRX.getDesc());
importData(dataTRX.getId(),
dataTRX.getAmount(),
dataTRX.getDesc());
}
reader.close();
} catch (Exception e) {
e.getMessage();
}
}
}
public void importData(String id, int amount, String discount ) {
String insertData = "INSERT INTO tbl_trx (id, amount, desc) "
+ "VALUES (?,?,?)";
try {
try (PreparedStatement ps = GeneralRules.conn.prepareStatement(insertData)) {
ps.setString(1, id);
ps.setInt(2, amount);
ps.setString(3, discount);
ps.executeUpdate();
System.out.println("Data successfully update to database!!!\n");
ps.close();
}
} catch (Exception e) {
e.getMessage();
}
}
}
This is example data of file.txt
320000000200000001
2G0000000500000002
AB0000001500000001
I take substrings based on the lines above:
substring id,amount,discount (32,00000002,00000001)
substring id,amount,discount (2G,00000005,00000002)
substring id,amount,discount (AB,00000015,00000001)
Your code seems good to me. But if I had written it, I would have made the following optimizations/replacements:
1) Declare the variable as List instead of LinkedList, and drop the explicit String type argument on the right-hand side by using the diamond operator. Something like
List<String> listFiles = new LinkedList<>();
2) Just as you used try-with-resources for the PreparedStatement, I would do the same for the BufferedReader. This removes the need to close the BufferedReader at the end:
try (BufferedReader reader = new BufferedReader(new FileReader(pathsourceFile.toString())))
3) Because you have used try-with-resources for the PreparedStatement, there is no need to call ps.close(): PreparedStatement implements AutoCloseable, so try-with-resources takes care of it.
4) Instead of e.getMessage(), I would have used e.printStackTrace(), because it gives more information about the error.
As for your use of substring, I would have used it too, or used a regex to split the string.
If the number of rows to insert is large, which I think is your case, then instead of calling executeUpdate() every time, use batch mode: add rows to the PreparedStatement batch with addBatch() and execute them in one go with executeBatch().
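Batch mode could look roughly like this. TrxBatchLoader, Trx and batchSize are hypothetical names; the field offsets follow the substring calls in the question, and the SQL text is copied from it unchanged.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class TrxBatchLoader {
    // Immutable holder for one parsed record.
    static final class Trx {
        final String id;
        final int amount;
        final String desc;

        Trx(String id, int amount, String desc) {
            this.id = id;
            this.amount = amount;
            this.desc = desc;
        }
    }

    // Parses the fixed-width payload after the last ';' : 2 chars id, 8 digits amount, 8 chars desc.
    static Trx parse(String line) {
        String value = line.substring(line.lastIndexOf(';') + 1);
        if (value.length() < 18) {
            throw new IllegalArgumentException("Record too short: " + line);
        }
        return new Trx(value.substring(0, 2),
                       Integer.parseInt(value.substring(2, 10)),
                       value.substring(10, 18));
    }

    // Adds each row to the batch and executes every batchSize rows; returns the rows submitted.
    static int insertAll(Connection con, Iterable<String> lines, int batchSize) throws SQLException {
        String sql = "INSERT INTO tbl_trx (id, amount, desc) VALUES (?,?,?)";
        int pending = 0;
        int submitted = 0;
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            for (String line : lines) {
                Trx t = parse(line);
                ps.setString(1, t.id);
                ps.setInt(2, t.amount);
                ps.setString(3, t.desc);
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch();
                    submitted += pending;
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch(); // flush the remainder
                submitted += pending;
            }
        }
        return submitted;
    }
}
```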

Multiple threads not inserting everything into MySQL

Excuse any bad practices, as I am very new to threading. I have a program that calls my API and gets data back in JSON format. Each request returns one row of data. Altogether I need to retrieve about 2,000,000 rows a day, which means 2,000,000 requests (I understand that this is bad design, but the system was not designed for this purpose; it is just what I need to do for the next couple of weeks). When I ran it on a single thread I was processing about 200 requests a minute, which is much too slow. So I created 12 threads and was then processing 5,500 rows a minute, which was a great improvement.
The problem is that on average only about 90% of the rows were inserted into the database; I ran it a few times to make sure. Before each insert I printed each URL that was sent to a file, and I checked whether each insert statement was successful (returned 1 when executed), and it all seems fine. Every time I run it, it inserts about 90%, but the figure varies and has never been a consistent number.
Am I doing something wrong in my Java code? Essentially, the code starts in main by creating 12 threads. Each thread's run method creates a new instance of MySQLPopulateHistData and passes it a start and an end integer, which are used as the range in the SQL statements. I have done a lot of System.out.println testing and can see that all the threads start and that all 12 instances (one per thread) are executing. Does anyone have any idea what it could be?
MAIN:
import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
public class MainClass {
public static void main(String[] args) {
try {
//create a pool of threads
Thread[] threads = new Thread[12];
// submit jobs to be executing by the pool
for (int i = 0; i <12; i++) {
threads[i] = new Thread(new Runnable() {
public void run() {
try {
new MySQLPopulateHistData(RangeClass.IdStart, RangeClass.IdEnd);
} catch (Throwable e) {
//TODO Auto-generated catch block
e.printStackTrace();
}
}
});
threads[i].start();
Thread.sleep(1000);
RangeClass.IdStart = RangeClass.IdEnd + 1;
RangeClass.IdEnd = RangeClass.IdEnd + 170000;
}
} catch (Throwable e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
MyDataSourceFactory.class
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;
import javax.sql.DataSource;
import com.mysql.jdbc.jdbc2.optional.MysqlDataSource;
public class MyDataSourceFactory {
static String url = "jdbc:mysql://localhost:3306/my_schema";
static String userName = "root";
static String password = "password";
public synchronized static DataSource getMySQLDataSource() {
MysqlDataSource mysqlDS = null;
mysqlDS = new MysqlDataSource();
mysqlDS.setURL(url);
mysqlDS.setUser(userName);
mysqlDS.setPassword(password);
return mysqlDS;
}
}
MySQLPopulateHistData.class
public class MySQLPopulateHistData {
public MySQLPopulateHistData(int s, int e ) throws IOException, Throwable{
getHistory(s,e);
}
public synchronized void getHistory(int start, int end){
DataSource ds = MyDataSourceFactory.getMySQLDataSource();
Connection con = null;
Connection con2 = null;
Statement stmt = null;
Statement stmt2 = null;
ResultSet rs = null;
try {
con = ds.getConnection();
con2 = ds.getConnection();
stmt = con.createStatement();
stmt2 = con.createStatement();
rs = stmt.executeQuery("SELECT s FROM sp_t where s_id BETWEEN "+ start +" AND "+ end + " ORDER BY s;");
String s = "";
while(rs.next()){
s = rs.getString("s");
if( s == ""){
}
else{
try{
URL fullUrl = new URL(//My Url to my api with password with start and end range);
InputStream is = fullUrl.openStream();
String jsonStr = getStringFromInputStream(is);
JSONObject j = new JSONObject(jsonStr);
JSONArray arr = j.getJSONObject("query").getJSONObject("results").getJSONArray("quote");
for(int i=0; i<arr.length(); i++){
JSONObject obj = arr.getJSONObject(i);
String symbol = obj.getString("s");
stmt2.executeUpdate("INSERT into sp2_t(s) VALUES ('"+ s +"') BETWEEN "+start+" AND "+ end +";");
}
}
catch(Exception e){
}
}
s = "";
}
} catch (Exception e) {
e.printStackTrace();
}finally{
try {
if(rs != null) rs.close();
if(stmt != null) stmt.close();
if(con != null) con.close();
if(stmt2 != null) stmt.close();
if(con2 != null) con.close();
} catch (Exception e) {
e.printStackTrace();
}
}
}
}
UPDATE:
So I put:
if (s.equals("")) {
System.out.println("EMPTY");
}
and it never printed out EMPTY. After the JSON response gets converted to the JSONArray I added:
if(arr.length()>0){
StaticClassHolder.cntResponses++;
}
This is just a static variable in another class that gets incremented every time there is a valid JSON response. It ended up at exactly the number it was supposed to be. So it seems as if the URL gets all the responses properly and parses them properly, but they are not being inserted properly into the database. I can't figure out why.
I also faced a similar issue while inserting records into Oracle. Since I didn't find any concrete solution, I tried with a single thread and all went fine.
There are several reasons why this does not work:
A normal computer can only run about 4-8 threads at the same time per CPU. As the system uses some of these threads itself, only some of your threads can run simultaneously. The computer handles this by pausing some threads and then running others.
If you try to send several queries through the socket to the MySQL server at the same time, chances are that some of the requests will not work and you lose some of your data.
For now I do not have any solution for faster updates of the table.
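The following sketch does not settle why roughly 10% of the rows go missing. It only removes two things in the question's code that make this hard to diagnose: the ranges are handed to the threads through mutable static fields, and the empty catch block around the insert discards any exception. Here each task receives its own fixed range, and a failure in any worker is rethrown to the caller. RangeWorkers, ranges and runAll are hypothetical names, and the worker function stands in for the fetch-and-insert logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToIntFunction;

final class RangeWorkers {
    // Builds `count` consecutive inclusive ranges {start, end}, each covering rangeSize ids.
    static List<int[]> ranges(int firstId, int rangeSize, int count) {
        List<int[]> out = new ArrayList<>();
        int start = firstId;
        for (int i = 0; i < count; i++) {
            out.add(new int[] { start, start + rangeSize - 1 });
            start += rangeSize;
        }
        return out;
    }

    // Runs the worker once per range on a fixed pool and sums the row counts it returns.
    // An exception thrown by any worker surfaces here instead of being swallowed.
    static int runAll(List<int[]> ranges, int threads, ToIntFunction<int[]> worker) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int[] range : ranges) {
                Callable<Integer> task = () -> worker.applyAsInt(range);
                futures.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                total += f.get(); // blocks until this task has finished
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for workers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A worker failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With the numbers from the question this would be called as runAll(ranges(firstId, 170000, 12), 12, worker), where the worker returns how many rows it inserted; comparing that total with the expected count shows whether rows are lost before or after the insert call.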

Insert into Database using form in Netbeans 7.01

I'm doing an individual project in Java and want to insert data into my database. The program runs without any error, but when I enter data and submit it, I get this error: java.sql.SQLException: Can not issue data manipulation statements with executeQuery(). This is my code:
What can I do to solve this problem?
private void jButton1ActionPerformed(java.awt.event.ActionEvent evt) {
if (evt.getSource() == jButton1)
{
int x = 0;
String s1 = jTextField1.getText().trim();
String s2 = jTextField2.getText();
char[] s3 = jPasswordField1.getPassword();
char[] s4 = jPasswordField2.getPassword();
String s8 = new String(s3);
String s9 = new String(s4);
String s5 = jTextField5.getText();
String s6 = jTextField6.getText();
String s7 = jTextField7.getText();
if(s8.equals(s9))
{
try{
File image = new File(filename);
FileInputStream fis = new FileInputStream(image);
ByteArrayOutputStream bos = new ByteArrayOutputStream();
byte buf[] = new byte[1024];
for (int readNum; (readNum = fis.read(buf)) != -1;) {
bos.write(buf, 0, readNum);
}
cat_image = bos.toByteArray();
PreparedStatement ps = conn.prepareStatement("insert into reg values(?,?,?,?,?,?,?)");
ps.setString(1,s1);
ps.setString(2,s2);
ps.setString(3,s8);
ps.setString(4,s5);
ps.setString(5,s6);
ps.setString(6,s7);
ps.setBytes(7,cat_image);
rs = ps.executeQuery();
if(rs.next())
{
JOptionPane.showMessageDialog(null,"Data insert Succesfully");
}else
{
JOptionPane.showMessageDialog(null,"Your Password Dosn't match" ,"Acces dinied",JOptionPane.ERROR_MESSAGE);
}
}catch(Exception e)
{
System.out.println(e);
}
}
Use ps.executeUpdate() or ps.execute().
From executeUpdate
Executes the SQL statement in this PreparedStatement object, which must be an SQL Data Manipulation Language (DML) statement, such as
INSERT, UPDATE or DELETE; or an SQL statement that returns nothing,
such as a DDL statement.
From execute
Executes the SQL statement in this PreparedStatement object, which
may be any kind of SQL statement. Some prepared statements return
multiple results; the execute method handles these complex statements
as well as the simpler form of statements handled by the methods
executeQuery and executeUpdate. The execute method returns a boolean to
indicate the form of the first result. You must call either the method
getResultSet or getUpdateCount to retrieve the result; you must call
getMoreResults to move to any subsequent result(s).
Then modify your code accordingly:
int rowsAffected = ps.executeUpdate();
JOptionPane.showMessageDialog(null,"Data Rows Inserted "+ rowsAffected);
Also you have to close your streams and connections in a finally block.
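With Java 7 or later, try-with-resources does the closing without an explicit finally block. A sketch under that assumption follows; RegInsert, readAll and insert are hypothetical names, and the seven columns mirror the question's insert statement.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class RegInsert {
    // Reads the whole stream into a byte array; both streams are closed automatically.
    static byte[] readAll(InputStream source) {
        try (InputStream in = source;
             ByteArrayOutputStream bos = new ByteArrayOutputStream()) {
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                bos.write(buf, 0, n);
            }
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Inserts one row with executeUpdate() and returns the number of rows affected.
    // fields must hold the six text columns in table order.
    static int insert(Connection conn, String[] fields, byte[] image) throws SQLException {
        String sql = "insert into reg values(?,?,?,?,?,?,?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (int i = 0; i < 6; i++) {
                ps.setString(i + 1, fields[i]);
            }
            ps.setBytes(7, image);
            return ps.executeUpdate();
        }
    }
}
```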
The SQLException is thrown because of a wrong SQL statement. You may have a syntax error when inserting string and integer values. Check your SQL statement: after VALUES there should be "1-0" around integer elements and '"some value"' around string elements.
