Java - Basic IO trouble

I'm trying to code a sieve of Eratosthenes which I intend to use to find the largest prime factor of 13195. If this works, I intend to use it on the number 600851475143.
Since creating a list of numbers ranging from 2 to 600851475143 would be nearly impossible due to memory issues, I have decided to store the numbers in a text file instead.
The problem I'm running into, though, is that instead of getting a text file filled with numbers, the code only produces a file with one number (this is my first time working with IO-related stuff in Java):
long number = 13195;
long limit = (long) Math.sqrt(number);
for (long i = 2; i < limit + 1; i++)
{
    try
    {
        Writer output = null;
        File file = new File("Primes.txt");
        output = new BufferedWriter(new FileWriter(file));
        output.write(Long.toString(i) + "\n");
        output.close();
    }
    catch (IOException e)
    {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}
Here's the output contained in the text file:
114
What am I doing wrong?

Don't use Eratosthenes - it's too slow unless you need all the primes in the range.
Here is a better way to factorize a given number. The function returns a map where the keys are the prime factors of n and the values are their powers, e.g. for 13195 it returns {5:1, 7:1, 13:1, 29:1}.
Its complexity is O(sqrt(n)):
public static Map<Integer, Integer> Factorize(int n){
    HashMap<Integer, Integer> ret = new HashMap<Integer, Integer>();
    int origN = n;
    for(int p = 2; p*p <= origN && n > 1; p += (p == 2 ? 1 : 2)){
        int power = 0;
        while (n % p == 0){
            ++power;
            n /= p;
        }
        if(power > 0)
            ret.put(p, power);
    }
    if (n > 1)
        ret.put(n, 1); // whatever remains is a prime factor larger than sqrt(origN)
    return ret;
}
Of course, if you need only the largest prime factor, you can return just the last prime found instead of the whole map; the complexity is the same.
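If you only need the largest prime factor of 600851475143, note that it does not fit in an int, so here is a minimal sketch of the same trial-division idea using long; the method name largestPrimeFactor is only an illustration:
// Sketch: same O(sqrt(n)) trial division, but with long so 600851475143 fits.
public static long largestPrimeFactor(long n) {
    long largest = 1;
    for (long p = 2; p * p <= n; p += (p == 2 ? 1 : 2)) {
        while (n % p == 0) {   // strip every occurrence of p
            largest = p;
            n /= p;
        }
    }
    // whatever is left above 1 is itself prime and larger than any factor found so far
    return (n > 1) ? n : largest;
}
// e.g. largestPrimeFactor(13195L) == 29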

Your code keeps re-opening, overwriting, and closing the same file. You should do something like this:
long number = 13195;
long limit = (long) Math.sqrt(number);
try
{
    File file = new File("Primes.txt");
    Writer output = new BufferedWriter(new FileWriter(file));
    for (long i = 2; i < limit + 1; i++)
    {
        output.write(Long.toString(i) + "\n");
    }
    output.close();
}
catch (IOException e)
{
    // TODO Auto-generated catch block
    e.printStackTrace();
}

You need to take the file instantiation out of the loop.

You are overwriting your file on every pass through the loop.
You need to open your file outside the main loop.
long number = 13195;
long limit = (long) Math.sqrt(number);
Writer output = null;
try
{
    File file = new File("Primes.txt");
    output = new BufferedWriter(new FileWriter(file));
}
catch (IOException e)
{
    // Cannot open file
    e.printStackTrace();
}
for (long i = 2; i < limit + 1; i++)
{
    try
    {
        output.write(Long.toString(i) + "\n");
    }
    catch (IOException e)
    {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}
try
{
    output.close();
}
catch (IOException e)
{
    e.printStackTrace();
}

You are recreating your FileWriter in every iteration of the for-loop without specifying append mode, so you overwrite the file on every iteration.
Create the FileWriter before the loop and close it after the loop. Something like this:
long number = 13195;
long limit = (long) Math.sqrt(number);
Writer output = null;
try
{
    File file = new File("/var/tmp/Primes.txt");
    output = new BufferedWriter(new FileWriter(file));
    for (long i = 2; i < limit + 1; i++) {
        output.write(Long.toString(i) + "\n");
    }
} catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
} finally {
    if (output != null) {
        try {
            output.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Writing directly to disk for every number is slow; have you considered doing the work in pieces in memory and then saving to disk? It could also give a smaller file, since you could write only the primes you have found instead of every candidate number.
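For example, a minimal sketch of that idea, assuming a plain boolean-array sieve that fits in memory and writing only the primes in a single pass at the end (sieveAndSave and its parameters are illustrative names, not anything from the question):
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

// Sketch: sieve in memory, then write only the primes found.
static void sieveAndSave(int limit, String path) throws IOException {
    boolean[] composite = new boolean[limit + 1];
    try (BufferedWriter out = new BufferedWriter(new FileWriter(path))) {
        for (int i = 2; i <= limit; i++) {
            if (composite[i]) continue;        // i was marked as a multiple of a smaller prime
            out.write(Integer.toString(i));    // i is prime, record it
            out.newLine();
            for (long j = (long) i * i; j <= limit; j += i) {
                composite[(int) j] = true;     // mark multiples of i as composite
            }
        }
    }
}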

Related

Is there a faster way to use csv reader in Java?

I need to open a CSV file in parts of 5,000 samples each and then plot them. To move back and forward through the signal, each time I click a button I have to instantiate a new reader and then skip to the point I need. My signal is big, about 135,000 samples, so the csvReader.skip() method is very slow when I work with the last samples. To go back I can't delete lines, so each time my iterator needs to be re-instantiated. I noticed that skip uses a for loop. Is there a better way to get around this problem? Here is my code:
public void updateSign(int segmento) {
    Log.d("segmento", Integer.toString(segmento));
    // check that I am within the signal length
    if (segmento > 0 && (float) (segmento - 1) <= (float) TOTAL / normaLen)
    {
        try {
            reader = new CSVReader(new FileReader(new File(patty)));
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
        List<Integer> sign = new ArrayList<>();
        // this is the point of the signal where I finish
        int len = segmento * normaLen;
        // check if I am at the end of the signal
        if (len >= TOTAL) {
            len = TOTAL;
            segmento = 0;
            avanti.setValue(false);
            System.out.println(avanti.getValue());
        } else {
            lines = TOTAL - len;
            avanti.setValue(true);
            System.out.println(avanti.getValue());
        }
        // the number of lines I need to skip
        int skipper = (segmento - 1) * normaLen;
        try {
            System.out.println("pre skip");
            reader.skip(skipper);
            System.out.println("post skip");
        } catch (IOException e) {
            e.printStackTrace();
        }
        // my iterator
        it = reader.iterator();
        System.out.println("iterator ready");
        // loop to build the mini-signal to plot
        // with only 5,000 samples it is fast enough
        for (int i = skipper; i < len - 1; i++) {
            if (i >= (segmento - 1) * normaLen) {
                sign.add(Integer.parseInt(it.next()[0]));
            }
            else
            {
                it.next();
                System.out.println("the skip did not work");
            }
        }
        System.out.println("for loop: too much work?");
        // set sign to be plotted by my fragment
        liveSign.setValue(sign);
    }
}
Thanks in advance!
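One possible workaround, offered only as a sketch under the assumption that the file holds one integer sample per line: with roughly 135,000 samples the whole signal fits easily in memory, so you could parse the file once into an int array and slice 5,000-sample windows from it instead of re-opening the reader and skipping. The names loadSignal and window are illustrative:
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: read the whole single-column CSV once, then slice windows in memory.
static int[] loadSignal(File csv) throws IOException {
    List<Integer> samples = new ArrayList<>();
    try (BufferedReader br = new BufferedReader(new FileReader(csv))) {
        String line;
        while ((line = br.readLine()) != null) {
            samples.add(Integer.parseInt(line.split(",")[0].trim()));
        }
    }
    int[] signal = new int[samples.size()];
    for (int i = 0; i < signal.length; i++) signal[i] = samples.get(i);
    return signal;
}

// Return one segment of normaLen samples without any skipping.
static int[] window(int[] signal, int segmento, int normaLen) {
    int from = (segmento - 1) * normaLen;
    int to = Math.min(from + normaLen, signal.length);
    return Arrays.copyOfRange(signal, from, to);
}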

Is this a good way of reading a binary file full of doubles?

I have a list of binary files that I need to read and then store in a variable. Each file is a collection of a huge number of doubles. The files were saved by a C program under Linux using the C double type. Now I want to read all these files in Java. Is this the fastest approach achievable? On my PC it takes 24 seconds to read 10 files (1.5 MB/file with 194,672 doubles per file) and store them into an array. I was thinking of using some type of buffer, but I am not sure if I should skip some bytes from the beginning...
int i;
int num_f = 10;
int num_d = 194672;
File folder = new File(route);
File[] listOfFiles = folder.listFiles();
float double_list[][] = new float[num_f][num_d];
DataInputStream br = null;
for (int file = 0; file < listOfFiles.length; file++) {
    if (listOfFiles[file].isFile()) {
        try {
            br = new DataInputStream(new FileInputStream(listOfFiles[file].getAbsolutePath()));
            // We read the whole file
            i = 0;
            while (br.available() > 0) {
                // I know that float != double, but I don't think I will lose much precision,
                // as the stored doubles are in the range [-5, 5], and this way I reduce
                // the amount of memory needed. The (float) cast is not CPU consuming (<1 s).
                double_list[file][i++] = (float) br.readDouble();
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                // Close file
                br.close();
            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }
    }
}
Finally, I got it working with the help of Andreas and this website (http://pulasthisupun.blogspot.com.es/2016/06/reading-and-writing-binary-files-in.html) (check there for other formats!). Regarding endianness, the default is BIG_ENDIAN, but with that I was getting nonsense values such as infinities. With LITTLE_ENDIAN I get the correct numbers. I will still have to do some tests in the future just to be sure that I don't have to skip some extra bytes at the beginning...
BTW, time spent: 0.160048575 s, not bad ;)
int i;
int num_f = 10;
int num_d = 194672;
File folder = new File(route);
File[] listOfFiles = folder.listFiles();
// DoubleBuffer.get(...) fills a double[], so this version stores doubles
double double_list[][] = new double[num_f][num_d];
FileChannel fc = null;
for (int file = 0; file < listOfFiles.length; file++) {
    if (listOfFiles[file].isFile()) {
        try {
            fc = (FileChannel) Files.newByteChannel(Paths.get(listOfFiles[file].getAbsolutePath()), StandardOpenOption.READ);
            ByteBuffer byteBuffer = ByteBuffer.allocate((int) fc.size());
            byteBuffer.order(ByteOrder.LITTLE_ENDIAN);
            fc.read(byteBuffer);
            byteBuffer.flip();
            DoubleBuffer buffer = byteBuffer.asDoubleBuffer();
            buffer.get(double_list[file]);
            byteBuffer.clear();
            fc.close();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                // Close the channel if it is still open
                if (fc != null && fc.isOpen()) {
                    fc.close();
                }
            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }
    }
}
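For what it's worth, the same per-file read can be expressed with try-with-resources so the channel is closed even when an exception is thrown; this is only a sketch using the same standard NIO calls (try-with-resources needs Java 7+, Double.BYTES needs Java 8):
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch: read one little-endian binary file of doubles into an array.
static double[] readDoubles(Path path) throws IOException {
    try (FileChannel fc = FileChannel.open(path, StandardOpenOption.READ)) {
        ByteBuffer bytes = ByteBuffer.allocate((int) fc.size());
        bytes.order(ByteOrder.LITTLE_ENDIAN);  // the files were written on Linux/x86, i.e. little-endian
        fc.read(bytes);
        bytes.flip();
        double[] values = new double[bytes.remaining() / Double.BYTES];
        bytes.asDoubleBuffer().get(values);
        return values;
    }
}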

Why does my code become slow after processing a large dataset?

I have a Java program which basically reads a file line by line and stores the lines into a set. The file contains more than 30,000,000 lines. My program runs fast at the beginning but slows down after processing 20,000,000 lines, and eventually becomes too slow to wait for. Can somebody explain why this happens and how I can speed the program up again?
Thanks.
public void returnTop100Phases() {
    Set<Phase> phaseTreeSet = new TreeSet<>(new Comparator<Phase>() {
        @Override
        public int compare(Phase o1, Phase o2) {
            int diff = o2.count - o1.count;
            if (diff == 0) {
                return o1.phase.compareTo(o2.phase);
            } else {
                return diff > 0 ? 1 : -1;
            }
        }
    });
    try {
        int lineCount = 0;
        BufferedReader br = new BufferedReader(
                new InputStreamReader(new FileInputStream(new File("output")), StandardCharsets.UTF_8));
        String line = null;
        while ((line = br.readLine()) != null) {
            lineCount++;
            if (lineCount % 10000 == 0) {
                System.out.println(lineCount);
            }
            String[] tokens = line.split("\\t");
            phaseTreeSet.add(new Phase(tokens[0], Integer.parseInt(tokens[1])));
        }
        br.close();
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        Iterator<Phase> iterator = phaseTreeSet.iterator();
        int n = 100;
        while (n > 0 && iterator.hasNext()) {
            Phase phase = iterator.next();
            out.print(phase.phase + "\t" + phase.count + "\n");
            n--;
        }
        out.close();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
Looking at the runtime behaviour, this is clearly a memory issue. My tests even broke after around 5M lines with 'GC overhead limit exceeded' on Java 8. If I limit the size of phaseTreeSet by adding
if (phaseTreeSet.size() > 100) { phaseTreeSet.pollLast(); }
it runs through quickly. The reason it gets so slow is that it uses more and more memory, so garbage collection takes longer; and every time it needs more memory, it first has to run another big garbage collection. There is still some memory left to claim each time, so the program keeps going, just a bit slower every round...
To get faster, you need to get the data out of memory: either keep only the top phases, as I did, or use some kind of database.
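An alternative sketch of the same bounded idea, using a min-heap (PriorityQueue) so that at most 100 Phase objects are ever retained; the comparator here orders by count ascending, and the method name offerPhase is mine, not from the code above:
// Sketch: keep only the 100 highest-count phases while streaming the file.
PriorityQueue<Phase> top = new PriorityQueue<>(100, (a, b) -> Integer.compare(a.count, b.count));

void offerPhase(Phase p) {
    if (top.size() < 100) {
        top.add(p);
    } else if (p.count > top.peek().count) {
        top.poll();   // drop the current smallest of the kept 100
        top.add(p);
    }
}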

How to get HashSet limit Size?

I want to find the limit, in bytes, of a HashSet on my development system, so I wrote some code that just keeps adding dummy data.
Look at my source code:
String DUMP = "llllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll";
void testSetLimitByte(){
    File f = new File("d:/test.txt");
    BufferedWriter bw = null;
    HashSet<String> set = new HashSet<String>();
    int cnt = 0;
    try {
        bw = new BufferedWriter(new OutputStreamWriter(new FileOutputStream("d:/test.txt", false), "UTF-8"));
        for (int i = 0; i < 100000000; i++) {
            String dumpData = DUMP + i;
            bw.write(dumpData);
            bw.newLine();
            if (i == 0)
                continue;
            set.add(dumpData);
            if (i % 10000 == 0)
                System.out.print(".");
            if (i % 100000 == 0)
                System.out.print(" ");
            if (i % 1000000 == 0) {
                cnt++;
                System.out.println(cnt + " (size 1billion)");
            }
        }
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        try {
            if (bw != null)
                bw.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println("HashSet Limit Memory : " + f.length() + " bytes");
    }
}
Does this code give something like the HashSet's limit in bytes?
The HashSet is not limited in bytes; only the total heap size is limited.
Note: to reach the Integer.MAX_VALUE size of a HashSet you need a heap of ~64 GB (and trivial keys/values).
How many bytes can be added to a HashSet on a given system before an OutOfMemoryError occurs?
In this case you will find that the JVM just runs slower and slower as it tries to use the last portions of memory. Newer JVMs detect that you are approaching this condition and die a little earlier, but it can still take a long time to reach that point.
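If you want a rough number before the JVM dies, one option is to watch heap usage with the standard Runtime API while adding entries; this is only a sketch, and the 90% threshold and the 100000-entry check interval are arbitrary choices:
// Sketch: add entries until roughly 90% of the maximum heap is in use.
Runtime rt = Runtime.getRuntime();
HashSet<String> set = new HashSet<String>();
long count = 0;
while (true) {
    set.add(DUMP + count++);
    if (count % 100000 == 0) {
        long used = rt.totalMemory() - rt.freeMemory();
        if (used > rt.maxMemory() * 0.9) {   // stop near 90% of the maximum heap (-Xmx)
            System.out.println(count + " entries, roughly " + used + " bytes of heap used");
            break;
        }
    }
}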

Delete last line in text file

I am trying to erase the last line in a text file using Java; however, the code below deletes everything.
public void eraseLast()
{
    while (reader.hasNextLine()) {
        reader.nextLine();
        if (!reader.hasNextLine()) {
            try {
                fWriter = new FileWriter("config/lastWindow.txt");
                writer = new BufferedWriter(fWriter);
                writer.write("");
                writer.close();
            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }
    }
}
If you wanted to delete the last line from the file without creating a new file, you could do something like this:
RandomAccessFile f = new RandomAccessFile(fileName, "rw");
long length = f.length() - 1;
byte b;
do {
    length -= 1;
    f.seek(length);
    b = f.readByte();
} while (b != 10);
f.setLength(length + 1);
f.close();
Start off at the second last byte, looking for a linefeed character, and keep seeking backwards until you find one. Then truncate the file after that linefeed.
You start at the second last byte rather than the last in case the last character is a linefeed (i.e. the end of the last line).
You are creating a new file that replaces the old one; you want something like this:
public void eraseLast() {
    StringBuilder s = new StringBuilder();
    while (reader.hasNextLine()) {
        String line = reader.nextLine();
        if (reader.hasNextLine()) {
            s.append(line).append(System.lineSeparator());
        }
    }
    try {
        fWriter = new FileWriter("config/lastWindow.txt");
        writer = new BufferedWriter(fWriter);
        writer.write(s.toString());
        writer.close();
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}
The answer above needs to be slightly modified to deal with the case where there is only 1 line left in the file (otherwise you get an IOException for negative seek offset):
RandomAccessFile f = new RandomAccessFile(fileName, "rw");
long length = f.length() - 1;
byte b;
do {
    length -= 1;
    f.seek(length);
    b = f.readByte();
} while (b != 10 && length > 0);
if (length == 0) {
    f.setLength(length);
} else {
    f.setLength(length + 1);
}
You're opening the file in overwrite mode (hence a single write operation will wipe the entire contents of the file); to open it in append mode it should be:
fWriter = new FileWriter("config/lastWindow.txt", true);
And besides, it's not going to delete the last line: although the reader has reached the current last line of the file, the writer will write after the last line, because we specified above that append mode should be used.
Take a look at this answer to get an idea of what you'll have to do.
I benefited from the other answers, but their code was not working for me. Here is my working code in Android Studio.
File file = new File(getFilesDir(), "mytextfile.txt");
RandomAccessFile randomAccessFile = new RandomAccessFile(file, "rw");
byte b;
long length = randomAccessFile.length();
if (length != 0) {
    do {
        length -= 1;
        randomAccessFile.seek(length);
        b = randomAccessFile.readByte();
    } while (b != 10 && length > 0);
    randomAccessFile.setLength(length);
    randomAccessFile.close();
}
This is my solution
private fun removeLastSegment() {
    val reader = BufferedReader(FileReader(segmentsFile))
    val segments = ArrayList<Segment>()
    var line: String?
    while (reader.readLine().also { line = it } != null) {
        segments.add(Gson().fromJson(line, Segment::class.java))
    }
    reader.close()
    segments.remove(segments.last())
    var writer = BufferedWriter(FileWriter(segmentsFile))
    writer.write("")
    writer = BufferedWriter(FileWriter(segmentsFile, true))
    for (segment in segments) {
        writer.appendLine(Gson().toJson(segment))
    }
    writer.flush()
    writer.close()
    lastAction--
    lastSegment--
}
