How to read file contents in batches using java code - java

I am very new to Java programming. I need to read a huge file in smaller chunks in Java. For example,
if I have the following file:
a
b
c
d
e
f
g
h
I have a batch size of 2. For the file above I need to create 4 batches and then process them. I don't need multithreading for this task.
Here is what I have tried. I know it is simple, and I have come close to what I want to achieve.
Any suggestions on the code would be helpful.
public class testing {
public static void main(String[] args) throws IOException {
System.out.println("This is for testing");
FileReader fr = null;
try {
fr = new FileReader("C:\\Users\\me\\Desktop\\Files.txt");
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
int batchSize=2;
int batchCount=0;
int lineIncr=0;
BufferedReader bfr = new BufferedReader(fr);
String line;
int nextBatch=0;
int i=0;
while((line=bfr.readLine())!= null) {
if (lineIncr <=nextBatch ) {
System.out.println(line);
int b=0;
i=i+1;
if (i==2) {
b=b+1;
System.out.println("batchSize : "+b);
System.out.println("batchSize : "+b);
}
}
}
bfr.close();
}
}

Try this:
final int batchSize = 2;
Path file = Paths.get("C:\\Users\\me\\Desktop\\Files.txt");
try (BufferedReader bfr = Files.newBufferedReader(file)) {
List<String> batch = new ArrayList<>(batchSize);
for (String line; (line = bfr.readLine()) != null; ) {
batch.add(line);
if (batch.size() == batchSize) {
process(batch);
batch = new ArrayList<>(batchSize); // or: batch.clear()
}
}
if (! batch.isEmpty()) {
process(batch);
}
}
Notable features:
Uses the NIO.2 Path API instead of the old File API.
Uses try-with-resources to ensure the Reader is always closed correctly.
Collects the batch of lines in a List<String>.
Calls a process(List<String> batch) method to do the processing.
Calls process() with a partial batch if the last batch is incomplete.
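The features listed above can be put together in a small self-contained sketch. The class name BatchDemo, the method forEachBatch and the Consumer standing in for process() are illustrative assumptions, not part of the answer, and the input comes from an in-memory string so the example runs without the file on the Desktop.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class BatchDemo {

    // Reads lines from the reader and passes them to 'process' in lists of at most
    // batchSize lines. Returns the number of batches handed over.
    // (Hypothetical helper; 'process' plays the role of process(List<String>) above.)
    static int forEachBatch(BufferedReader reader, int batchSize,
                            Consumer<List<String>> process) throws IOException {
        int batches = 0;
        List<String> batch = new ArrayList<>(batchSize);
        for (String line; (line = reader.readLine()) != null; ) {
            batch.add(line);
            if (batch.size() == batchSize) {
                process.accept(batch);
                batches++;
                // A fresh list, so the consumer may safely keep the previous one.
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) { // trailing partial batch
            process.accept(batch);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) throws IOException {
        String sample = "a\nb\nc\nd\ne\nf\ng\nh\n"; // the eight lines from the question
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            int count = forEachBatch(reader, 2, batch -> System.out.println("Processing " + batch));
            System.out.println(count + " batches");
        }
    }
}
```

To read the real file, the StringReader could be swapped for Files.newBufferedReader(Paths.get(...)) as in the answer. Only one batch is held in memory at a time, which is the point of batching a huge file.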

Related

Changing the first line in a file

I'm having an issue with changing a line in a file. The purpose of this code is to change the first number in the file to itself + 1. For some reason the code doesn't seem to work at all. Any help would be appreciated!
public static void changenumber(String fileName)
{
ArrayList<String> list = new ArrayList<String>();
File temp = new File(fileName);
Scanner sc;
try {
sc = new Scanner(temp);
while (sc.hasNextLine())
{
list.add(sc.nextLine());
}
sc.close();
}
catch (FileNotFoundException e)
{
e.printStackTrace();
}
String first = list.get(0);
int i = Integer.parseInt(first);
i = i+1;
first = Integer.toString(i);
list.set(0, first);
writenumber(list,fileName);
}
public static void writenumber(ArrayList<String> list, String fileName)
{
PrintWriter write;
try {
write = new PrintWriter(new FileWriter(fileName, true));
for(int i = 0; i<list.size();i++)
{
write.append(list.get(i));
}
}
catch(IOException err)
{
err.printStackTrace();
}
}
Your problem is that you never closed the FileWriter.
Use try-with-resources to ensure that file streams are closed correctly.
A few other improvements to your code:
Do not ignore exceptions. Continuing execution as-if nothing bad happened will cause lots of problems. Let the exception bounce back to caller, and let caller decide what to do if the file cannot be updated.
Scanner is slow. Since all you're doing is reading lines, use BufferedReader instead.
The lines in memory don't end in newline characters, so you need to use the println() method when writing the lines back out, otherwise the result is a file with all the lines concatenated into a single line.
Variables renamed to be more descriptive.
public static void changenumber(String fileName) throws IOException {
ArrayList<String> lines = new ArrayList<>();
try (BufferedReader in = new BufferedReader(new FileReader(fileName))) {
for (String line; (line = in.readLine()) != null; ) {
lines.add(line);
}
}
int i = Integer.parseInt(lines.get(0));
i++;
lines.set(0, Integer.toString(i));
writenumber(lines, fileName);
}
public static void writenumber(List<String> lines, String fileName) throws IOException {
try (PrintWriter out = new PrintWriter(new FileWriter(fileName))) { // no append flag, so the file is overwritten rather than appended to
for (String line : lines) {
out.println(line);
}
}
}
Of course, you could simplify the code immensely by using the newer NIO.2 classes added in Java 7, in particular the java.nio.file.Files class.
public static void changenumber(String fileName) throws IOException {
Path filePath = Paths.get(fileName);
List<String> lines = Files.readAllLines(filePath);
lines.set(0, Integer.toString(Integer.parseInt(lines.get(0)) + 1));
Files.write(filePath, lines);
}

Java - Read large .txt data file in batch size of 10

I have a large data file say dataset.txt where data is in the format -
1683492079 kyra maharashtra 18/04/2017 10:16:17
1644073389 pam delhi 18/04/2017 10:16:17
.......
The fields are id, name, state, and timestamp.
I have around 50,000 lines of data in the .txt data file.
My requirement is to read the data from this data file in batch size of 10.
So in first batch I need to read from 0 to 9th elements. Next batch from 10th to 19th elements and so on...
Using BufferedReader I have managed to read the whole file:
import java.io.*;
public class ReadDataFile {
public static void main(String args[]) throws IOException {
BufferedReader br = new BufferedReader(new FileReader("dataset.txt"));
String line;
while((line = br.readLine())!= null)
{
System.out.println(line);
}
br.close();
}
}
But my requirement is to read the file in batches of 10. I am new to Java, so I would really appreciate it if someone could help me in simple terms.
Following @GhostCat's answer, this is what I have got:
public class ReadDataFile {
public static void main(String args[]) throws IOException {
BufferedReader br = new BufferedReader(new FileReader("dataSetExample.txt"));
readBatch(br,10);
}
public static void readBatch(BufferedReader reader, int batchSize) throws IOException {
List<String> result = new ArrayList<>();
for (int i = 0; i < batchSize; i++) {
String line = reader.readLine();
if (line != null) {
// result.add(line);
System.out.println(line);
}
}
// return result;
return ;
}
}
The file is read in the readBatch method so how do I know in the main method that the end of file is reached to call the next 10 records? Kindly help.
Your requirements aren't really clear; but something simple to get you started:
A) Your main method shouldn't do any reading; it just prepares that BufferedReader object.
B) you use that reader with a method like:
private static List<String> readBatch(BufferedReader reader, int batchSize) throws IOException {
List<String> result = new ArrayList<>();
for (int i = 0; i < batchSize; i++) {
String line = reader.readLine();
if (line != null) {
result.add(line);
} else {
return result;
}
}
return result;
}
To be used in your main:
BufferedReader reader = ...
int batchSize = 10;
boolean moreLines = true;
while (moreLines) {
List<String> batch = readBatch(reader, batchSize);
// ... do something with that list
if (batch.size() < batchSize) {
moreLines = false;
}
}
This is meant as a suggestion for how you could approach this. Things missing from my answer: you should probably use a distinct class and do the parsing right there, returning a List<DataClass> instead of passing around raw line strings.
And of course: 50,000 lines isn't really much data. Unless we are talking about an embedded device, there is not much point in reading it batch-style.
And finally: the term batch processing has a very distinct meaning; also in Java, and if you intend to go there, see here for further reading.
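To address the follow-up question (how main can tell that the end of the file has been reached), here is a minimal runnable sketch. It is a variation on the answer above rather than the answerer's exact code: it stops when readBatch returns an empty list, which avoids special-casing a file whose line count is an exact multiple of the batch size. The class name ReadDataFileSketch and the helper countBatches are made up for illustration, and the data is generated in memory instead of being read from dataset.txt.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class ReadDataFileSketch {

    // Returns up to batchSize lines; an empty list means the end of input was reached.
    static List<String> readBatch(BufferedReader reader, int batchSize) throws IOException {
        List<String> result = new ArrayList<>(batchSize);
        String line;
        while (result.size() < batchSize && (line = reader.readLine()) != null) {
            result.add(line);
        }
        return result;
    }

    // Keeps asking for batches until an empty one comes back; returns the batch count.
    static int countBatches(BufferedReader reader, int batchSize) throws IOException {
        int batches = 0;
        List<String> batch;
        while (!(batch = readBatch(reader, batchSize)).isEmpty()) {
            batches++;
            System.out.println("Batch " + batches + " has " + batch.size() + " lines");
        }
        return batches;
    }

    public static void main(String[] args) throws IOException {
        // 25 made-up records stand in for dataset.txt.
        StringBuilder data = new StringBuilder();
        for (int i = 0; i < 25; i++) {
            data.append("id").append(i).append(" name state timestamp\n");
        }
        try (BufferedReader reader = new BufferedReader(new StringReader(data.toString()))) {
            System.out.println("Total batches: " + countBatches(reader, 10));
        }
    }
}
```

With 25 lines and a batch size of 10, the loop sees batches of 10, 10 and 5 lines and then an empty list, which ends it.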
For anybody in need of a working example:
// Create a method to read lines (using buffreader) and should accept the batchsize as argument
private static List<String> readBatch(BufferedReader br, int batchSize) throws IOException {
// Create a List object which will contain your Batch Sized lines
List<String> result = new ArrayList<>();
for (int i = 1; i < batchSize; i++) { // starts at 1: main() has already read the first line of this batch
String line = br.readLine();
if (line != null) {
result.add(line); // add your lines to your (List) result
} else {
return result; // Return your (List) result
}
}
return result; // Return your (List) result
}
public static void main(String[] args) throws IOException {
//input file
BufferedReader br = new BufferedReader(new FileReader("c://ldap//buffreadstream2.csv"));
//output file
BufferedWriter bw = new BufferedWriter(new FileWriter("c://ldap//buffreadstream3.csv"));
// Your Batch size i.e. how many lines you want in your batch
int batchSize = 5; // Define your batchsize here
String line = null;
long batchNumber = 1;
try {
List<String> mylist = null;
while ((line = br.readLine()) != null) { // Do it for your all line in your csv file
bw.write("Batch Number # " + batchNumber + "\n");
System.out.println("Batch Number # " + batchNumber);
bw.write(line + "\n"); // Since br.readLine() reads the next line you have to catch your first line here itself
System.out.println(line); // else you will miss every batchsize number line
// process your First Line here...
mylist = readBatch(br, batchSize); // get/catch your (List) result here as returned from readBatch() method
for (int i = 0; i < mylist.size(); i++) {
System.out.println(mylist.get(i));
// process your lines here...
bw.write(mylist.get(i) + "\n"); // write/process your returned lines
}
batchNumber++;
}
System.out.println("Lines are Successfully copied!");
br.close(); // once you are done, don't forget to close the reader
br = null;
bw.flush(); // flush and close the writer as well
bw.close();
bw = null;
} catch (Exception e) {
System.out.println("Exception caught: " + e.getMessage()); // Catch any exception here
}
}

Save a reader of a file in a database in Java

I have a Reader in Java (Reader read) that comes from a file with 1,000,000 lines.
I need to save each line in my database. I am reading the Reader like this:
int data = read.read();
String line = "";
while (data != -1) {
char dataChar = (char) data;
data = read.read();
if (dataChar != '\n') {
line = line + dataChar;
} else {
i++;
showline(line);
line = "";
}
}
Then I call my DAO for each line:
private static void showline(String line) {
try {
if (line.startsWith(prefix)) {
line = line.substring(prefix.length());
}
ms = new Msisdn(Long.parseLong(line, 10), idList);
ListDAO.createMsisdn(ms);
} catch (Exception e) {
}
}
And my DAO is:
public static void createMsisdn(Msisdn msisdn) {
EntityManager e = DBManager.createEM();
try {
createMsisdn(msisdn, e);
} finally {
if (e != null) {
e.close();
}
}
}
public static void createMsisdn(Msisdn msisdn, EntityManager em) {
em.getTransaction().begin();
em.persist(msisdn);
em.getTransaction().commit();
}
But my problem is that with a file of 1,000,000 lines it takes about 1 hour 30 minutes to complete. How can I make it faster?
(My main problem is calling the DAO 1,000,000 times, which is very slow. The while loop itself is fast: without the DAO call the time is less than 1 minute, but with the DAO call it is 2 hours.)
Reading characters and appending them into a String one by one is incredibly inefficient. Using a BufferedReader to read lines of text is much better:
String line;
BufferedReader reader = new BufferedReader(read);
while ((line = reader.readLine()) != null) {
showline(line);
}
This won't have a big effect in your case though: you are inserting each line in a separate transaction, and each transaction can take hundreds of milliseconds to complete. You should structure your code in a way that several lines could be inserted in a single transaction. For example you can read blocks of lines like this, but you'll have to change the showlines and createMsisdn methods so that they accept several at a time and process them in a single batch:
final int TRANSACTION_SIZE = 500;
int i = 0;
String[] lines = new String[TRANSACTION_SIZE];
BufferedReader reader = new BufferedReader(read);
String line;
while ((line = reader.readLine()) != null) {
lines[i++] = line;
if (i == lines.length) { // buffer is full: hand it over and start again
showlines(lines, i);
i = 0;
}
}
if (i > 0) showlines(lines, i); // remaining partial block
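The idea of grouping many lines into one transaction can be sketched without a database. In the example below the Consumer named persistBatch is a stand-in for whatever commits one batch; in the question's setup it would presumably begin a transaction, call em.persist for each entity and then commit once, but that part is an assumption and is not shown. The class name BatchedImport, the method importLines and the "tel:" prefix are likewise illustrative.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class BatchedImport {

    // Parses one number per line and hands the numbers to persistBatch in groups of
    // at most transactionSize. Returns how many times persistBatch was called,
    // i.e. how many transactions a real DAO would have to commit.
    static int importLines(BufferedReader reader, String prefix, int transactionSize,
                           Consumer<List<Long>> persistBatch) throws IOException {
        List<Long> pending = new ArrayList<>(transactionSize);
        int transactions = 0;
        for (String line; (line = reader.readLine()) != null; ) {
            if (line.startsWith(prefix)) {
                line = line.substring(prefix.length());
            }
            try {
                pending.add(Long.parseLong(line.trim()));
            } catch (NumberFormatException e) {
                // Report bad input instead of silently swallowing it.
                System.err.println("Skipping unparsable line: " + line);
                continue;
            }
            if (pending.size() == transactionSize) {
                persistBatch.accept(new ArrayList<>(pending)); // pass a copy, then reuse the buffer
                pending.clear();
                transactions++;
            }
        }
        if (!pending.isEmpty()) { // final partial batch
            persistBatch.accept(new ArrayList<>(pending));
            transactions++;
        }
        return transactions;
    }

    public static void main(String[] args) throws IOException {
        String sample = "tel:100\ntel:200\ntel:300\ntel:400\ntel:500\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            int count = importLines(reader, "tel:", 2,
                    batch -> System.out.println("One transaction for " + batch));
            System.out.println(count + " transactions instead of 5");
        }
    }
}
```

With a batch size of 500, one million lines would need 2,000 commits rather than 1,000,000. How much time that saves depends on the database and the JPA provider, so it is worth measuring.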

Read in N Lines of an Input Stream and print in reverse order without using array or list type structure?

Using the readLine() method of BufferedReader, can you print the first N lines of a stream in reverse order without using a list or an array?
I think you can do it through recursion with something like:
void printReversed(int n) throws IOException
{
if (n <= 0)
return;
String line = reader.readLine();
if (line == null) // fewer than n lines available
return;
printReversed(n-1);
System.out.println(line);
}
How about recursion to reverse the order?
Pseudo code:
reverse(int linesLeft)
if (linesLeft == 0)
return;
String line = readLine();
reverse(linesLeft - 1);
System.out.println(line);
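The pseudocode above translates to Java roughly as follows. This is a sketch under a few assumptions: the reader and the output stream are passed in as parameters, the class name ReverseLines is made up, and the method stops early if the stream has fewer than N lines. The call stack acts as the implicit storage here, so a very large N could end in a StackOverflowError.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintStream;
import java.io.StringReader;

class ReverseLines {

    // Prints the first linesLeft lines of the reader in reverse order.
    // Each stack frame holds one line until the deeper calls have printed theirs.
    static void printReversed(BufferedReader reader, int linesLeft, PrintStream out)
            throws IOException {
        if (linesLeft <= 0) {
            return;
        }
        String line = reader.readLine();
        if (line == null) { // stream ended before N lines were read
            return;
        }
        printReversed(reader, linesLeft - 1, out);
        out.println(line); // runs only after all later lines have been printed
    }

    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(new StringReader("one\ntwo\nthree\nfour\n"));
        printReversed(reader, 3, System.out); // prints three, two, one (one per line)
    }
}
```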
Nice question. Here is one solution based on coordinated threads. Although it's heavy on resources (one thread per line of the buffer), it solves your problem within the given constraints. I'm curious to see other solutions.
public class ReversedBufferPrinter {
class Worker implements Runnable {
private final CountDownLatch trigger;
private final CountDownLatch release;
private final String line;
Worker(String line, CountDownLatch release) {
this.trigger = new CountDownLatch(1);
this.release = release;
this.line = line;
}
public CountDownLatch getTriggerLatch() {
return trigger;
}
public void run() {
try {
trigger.await();
} catch (InterruptedException ex) { } // handle
work();
release.countDown();
}
void work() {
System.out.println(line);
}
}
public void reversePrint(BufferedReader reader, int lines) throws IOException {
CountDownLatch initialLatch = new CountDownLatch(1);
CountDownLatch triggerLatch = initialLatch;
int count=0;
String line;
while (count++<lines && (line = reader.readLine())!=null) {
Worker worker = new Worker(line, triggerLatch);
triggerLatch = worker.getTriggerLatch();
new Thread(worker).start();
}
triggerLatch.countDown();
try {
initialLatch.await();
} catch (InterruptedException iex) {
// handle
}
}
public static void main(String [] params) throws Exception {
if (params.length<2) {
System.out.println("usage: ReversedBufferPrinter <file to reverse> <#lines>");
return;
}
String filename = params[0];
int lines = Integer.parseInt(params[1]);
File file = new File(filename);
BufferedReader reader = new BufferedReader(new FileReader(file));
ReversedBufferPrinter printer = new ReversedBufferPrinter();
printer.reversePrint(reader, lines);
}
}
Here you have another alternative, based on BufferedReader & StringBuilder manipulations. More manageable in terms of computer resources needed.
public void reversePrint(BufferedReader bufReader, int lines) throws IOException {
BufferedReader resultBufferReader = null;
{
String line;
StringBuilder sb = new StringBuilder();
int count = 0;
while (count++<lines && (line = bufReader.readLine())!=null) {
sb.append('\n'); // restore new line marker for BufferedReader to consume.
sb.append(new StringBuilder(line).reverse());
}
resultBufferReader = new BufferedReader(new StringReader(sb.reverse().toString()));
}
{
String line;
while ((line = resultBufferReader.readLine())!=null) {
System.out.println(line);
}
}
}
This will also require implicit data structures, but you can spawn threads, run them in order, and make each thread read a line and wait a decreasing amount of time. The result: the last thread will run first and the first one will run last, each printing its line. (The interval between them has to be large enough to provide wide safety margins.)
I have no idea how, if at all, this can be done with no explicit or implicit data storage.
Prepend each line you read to a string, and print the string. If you run out of lines to read, you just print what you have.
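A minimal sketch of the prepend idea, assuming a StringBuilder counts as "a string" for the purposes of the puzzle (internally it is still a character buffer). The class and method names are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class PrependReverse {

    // Reads at most n lines and returns them in reverse order, one per line.
    // Every new line is inserted at the front, so the last line read ends up first.
    static String reversedFirstLines(BufferedReader reader, int n) throws IOException {
        StringBuilder text = new StringBuilder();
        String line;
        for (int read = 0; read < n && (line = reader.readLine()) != null; read++) {
            text.insert(0, line + "\n");
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(new StringReader("one\ntwo\nthree\n"));
        System.out.print(reversedFirstLines(reader, 2)); // prints "two" and then "one"
    }
}
```

Inserting at the front shifts the existing characters each time, so this is quadratic in the total text length. That is fine for small N but not for large inputs.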
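Alternatively, if you are certain of the number of lines you have, and you do not wish to use a string:
void printReversed(int n, BufferedReader reader) throws IOException
{
LineNumberReader lineReader = new LineNumberReader(reader);
// Caution: setLineNumber() only changes the counter returned by getLineNumber();
// it does not move the read position, so readLine() still returns lines in forward order.
while (--n >= 0)
{
lineReader.setLineNumber(n);
System.out.println(lineReader.readLine());
}
}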

Java Compiler - Load Method

I have been working on a Java project where the goal is to create a virtual computer. I am basically done, but with one problem. I have created a compiler which translates a txt document containing assembly code, and it writes a new file with this code as machine-executable ints. Now I need to write a load method that reads these ints and runs the program, but I am having difficulty doing this. Any help is much appreciated. Also, this is not homework, in case you are wondering: the project was simply to make a compiler, and now I am trying to complete it for my own interest. Thanks.
Here is what I have so far for load:
public void load(String newfile) throws FileNotFoundException
{
try{
File file = new File(newfile);
FileInputStream fs = new FileInputStream(file);
DataInputStream dos = new DataInputStream(fs);
dos.readInt();
dos.close();
}
catch (IOException e)
{
e.printStackTrace();
}
}
Ok here is the part of the Compiler that does the writeInts:
public void SecondPass(SymbolList symbolTable, String filename){
try {
int dc = 99;
//Open file for reading
File file = new File(filename);
Scanner scan = new Scanner(file);
//Make filename of new executable file
String newfile = makeFilename(filename);
//Open Output Stream for writing new file.
FileOutputStream fs = new FileOutputStream(newfile);
DataOutputStream dos = new DataOutputStream(fs);
//Read First line. Split line by Spaces into linearray.
String line = scan.nextLine();
String[] linearray = line.split(" ");
while(line!=null){
if(!linearray[0].equals("REM")){
int inst = 0, opcode, loc;
if(isInstruction(linearray[0])){
opcode = getOpcode(linearray[0]);
loc = symbolTable.searchName(linearray[1]).getMemloc();
inst = (opcode*100)+loc;
} else if(!isInstruction(linearray[0])){
if(isInstruction(linearray[1])){
opcode = getOpcode(linearray[1]);
if(linearray[1].equals("STOP"))
inst=0000;
else {
loc = symbolTable.searchName(linearray[2]).getMemloc();
inst = (opcode*100)+loc;
}
}
if(linearray[1].equals("DC"))
dc--;
}
dos.writeInt(inst);
System.out.println(" inst is being written as:" + inst);
}
try{
line = scan.nextLine();
}
catch(NoSuchElementException e){
line = null;
break;
}
linearray = line.split(" ");
}
scan.close();
for(int i=lc; i<=dc; i++){
dos.writeInt(0);
}
for(int i = dc+1; i < 100; i++)
{
dos.writeInt(symbolTable.searchLocation(i).getValue());
}
dos.close();
fs.close();
}
catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
So what I have done is write a file in txt like:
IN X
In Y
SUB X
STO Y
OUT Y
DC: X 0
DC: Y 0
I wrote a compiler that converts this file into machine code, producing a file called, for example, program.txt.ex. It contains a bunch of ####### (machine code), written by the SecondPass code above. Now I need to write a load method that will let me load and run this file.
Here is my Run method
public void run(String filename) throws IOException
{
if (mem == null)
System.out.println("mem null");
if (filename == null)
System.out.println("filename null");
mem.loadFromFile(filename);
cpu.reset();
cpu.setMDR(mem.read(cpu.getMAR()));
cpu.fetch2();
while (!cpu.stop())
{
cpu.decode();
if (cpu.OutFlag())
OutPut.display(mem.read(cpu.getMAR()));
if (cpu.InFlag())
mem.write(cpu.getMDR(),in.getInt());
if (cpu.StoreFlag())
{
mem.write(cpu.getMAR(),in.getInt());
cpu.getMDR();
}
else
{
cpu.setMDR(mem.read(cpu.getMAR()));
cpu.execute();
cpu.fetch();
cpu.setMDR(mem.read(cpu.getMAR()));
cpu.fetch2();
}
}
}
The Run Method:
public void run(int mem)
{
cpu.reset();
cpu.setMDR(mem.read(cpu.getMAR()));
cpu.fetch2();
while (!cpu.stop())
{
cpu.decode();
if (cpu.OutFlag())
OutPut.display(mem.read(cpu.getMAR()));
if (cpu.InFlag())
mem.write(cpu.getMDR(),in.getInt());
if (cpu.StoreFlag())
{
mem.write(cpu.getMAR(),in.getInt());
cpu.getMDR();
}
else
{
cpu.setMDR(mem.read(cpu.getMAR()));
cpu.execute();
cpu.fetch();
cpu.setMDR(mem.read(cpu.getMAR()));
cpu.fetch2();
}
}
}
I notice that your loader does a single
dos.readInt();
...which will read a single integer value from your file. What you probably want to do is create a loop that reads ints until you hit the end-of-file on dos (which might more aptly be named dis, no?). You could add those ints to a dynamic container like an ArrayList, which will grow with every element you stuff into it. Once done loading, you can use toArray to copy all those ints to an array of the appropriate size.
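The read-until-end-of-file loop described above might look like the sketch below. DataInputStream.readInt() throws EOFException when the input runs out, which is used here as the loop's exit. The class name ProgramLoader and the temporary file in main are illustrative assumptions. Note that List.toArray would produce an Integer[] rather than an int[], so the values are copied by hand.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ProgramLoader {

    // Reads every 4-byte int in the file, in order, into an array sized to fit.
    static int[] load(String fileName) throws IOException {
        List<Integer> words = new ArrayList<>();
        try (DataInputStream dis = new DataInputStream(
                new BufferedInputStream(new FileInputStream(fileName)))) {
            while (true) {
                words.add(dis.readInt()); // throws EOFException at end of file
            }
        } catch (EOFException endOfFile) {
            // Expected: this is how readInt() reports that there is nothing left.
        }
        int[] mem = new int[words.size()];
        for (int i = 0; i < mem.length; i++) {
            mem[i] = words.get(i); // unbox Integer to int
        }
        return mem;
    }

    public static void main(String[] args) throws IOException {
        // Write a tiny made-up "program" to a temp file, then load it back.
        File file = File.createTempFile("program", ".ex");
        file.deleteOnExit();
        try (DataOutputStream dos = new DataOutputStream(new FileOutputStream(file))) {
            dos.writeInt(1098);
            dos.writeInt(1099);
            dos.writeInt(0);
        }
        System.out.println(Arrays.toString(load(file.getPath())));
    }
}
```

If the file length is not a multiple of four bytes, the trailing bytes are silently dropped by this sketch. A stricter loader could compare the file length against the number of ints read.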
It seems that you need to load the whole file into memory before starting execution, so it would go:
public int[] load(String newfile) throws FileNotFoundException
{
int mem[] = new int[100];
try {
File file = new File(newfile);
FileInputStream fs = new FileInputStream(file);
DataInputStream dis = new DataInputStream(fs);
for (int i = 0; i < mem.length; ++i) {
mem[i] = dis.readInt();
}
dis.close();
} catch (IOException e) {
e.printStackTrace();
}
return mem;
}
void run(int mem[]) {
// now execute code
int pc = 0;
loop: while (true) {
int inst = mem[pc++];
int opcode = inst/100;
int loc = inst%100;
switch (opcode) {
case OpCode.STOP:
break loop;
case OpCode.IN:
...
}
}
}
