I am trying to make a Java program that downloads a lot of images from a website. However, once I run the class, it exits instantly, and I can't figure out why. Here is my code:
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.util.HashMap;

public class Main {

    static HashMap<Integer, String> hmap = new HashMap<Integer, String>();

    public static void main(String[] args) throws IOException {
        for (int i = 1; i > 151; i++) {
            for (int i1 = 1; i1 > 151; i1++) {
                if (i == i1) {
                    continue;
                }
                String imageUrl1 = "http://images.alexonsager.net/pokemon/fused/" + i + "/" + i + "." + i1 + ".png";
                String destinationFile1 = hmap.get(i) + " and " + hmap.get(i1);
                saveImage(imageUrl1, destinationFile1);
                System.out.println("Downloaded " + destinationFile1);
            }
        }
    }

    public static void saveImage(String imageUrl, String destinationFile) throws IOException {
        URL url = new URL(imageUrl);
        InputStream is = url.openStream();
        OutputStream os = new FileOutputStream(destinationFile);
        byte[] b = new byte[2048];
        int length;
        while ((length = is.read(b)) != -1) {
            os.write(b, 0, length);
        }
        is.close();
        os.close();
    }

    public static void createHash() {
        //hmap.put(int, string) times 151
    }
}
What I want it to do is download, say, i=1 and i1=2, then i=1 and i1=3, and so on until both hit 151 (they can never be equal). In all, this will download 22,650 files totaling roughly 27.6 MB. So, that being said, is it a memory issue with the Java settings themselves (I have 32 GB of RAM, so running out is not really a possibility), or is it a problem with the code?
I would greatly appreciate it if someone could help me with this.
Thanks!
i is never greater than 151, so you never enter your loops: i starts at 1, the condition i > 151 is false on the very first check, and main() returns immediately. The same applies to the inner loop over i1.
Solution
for (int i = 1; i <= 151; i++)
and likewise for the inner loop. Use <= rather than < so that 151 itself is included, as you described.
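Note that even with the loops fixed, createHash() is never called, so hmap stays empty and every destinationFile1 would come out as "null and null". A minimal sketch of a corrected main, assuming createHash() fills hmap with a name for each ID from 1 to 151:

    public static void main(String[] args) throws IOException {
        createHash(); // populate hmap first; otherwise hmap.get(i) returns null
        for (int i = 1; i <= 151; i++) {
            for (int i1 = 1; i1 <= 151; i1++) {
                if (i == i1) {
                    continue;
                }
                String imageUrl = "http://images.alexonsager.net/pokemon/fused/" + i + "/" + i + "." + i1 + ".png";
                String destinationFile = hmap.get(i) + " and " + hmap.get(i1);
                saveImage(imageUrl, destinationFile);
                System.out.println("Downloaded " + destinationFile);
            }
        }
    }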
Related
I am doing an assignment for my class where I have a file that is 7 MB. Essentially I am supposed to process it in 2 phases.
Phase 1: I add each word from the file into an ArrayList and sort it in alphabetical order. I then write every 100,000 words into 1 file, so I have 12 files in total, with the naming convention shown in the code below.
Phase 2: For every 2 files, I read one word from each file and write whichever comes first in alphabetical order into a new file (basically a merge step), until the 2 files are merged into 1 sorted file. I do this in a loop, so the number of files is halved on each pass; essentially the whole 7 MB ends up sorted in one file.
What I am having trouble with: Phase 1 works, and phase 2 reads the phase 1 files successfully, but the contents seem to be copied repeatedly into multiple files rather than sorted and merged. I appreciate any help given, thank you.
File: It seems I cannot upload the .txt file, but the code should work for any file with any number of lines; only the num_lines variable needs to be changed.
Summary: one big unsorted file is turned into multiple sorted files (i.e. 12); the first sort-and-merge pass turns them into 6 files, the second into 3, the third into 2, and the fourth back into 1 big file.
Code:
package Assignment11;

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Scanner;

public class FileSorter_1
{
    public static ArrayList<String> storyline = new ArrayList<String>();
    public static int num_lines = 100000; //this number can be changed
    public static int num_files_initial;
    public static int num_files_sec;

    public static void main(String[] args) throws IOException
    {
        phase1();
        phase2();
    }

    public static void phase1() throws IOException
    {
        Scanner story = new Scanner(new File("Aesop_Shakespeare_Shelley_Twain.txt")); //file name
        int f = 0;
        while(story.hasNext())
        {
            int i = 0;
            while(story.hasNext())
            {
                String temp = story.next();
                storyline.add(temp);
                i++;
                if(i > num_lines)
                {
                    break;
                }
            }
            Collections.sort(storyline, String.CASE_INSENSITIVE_ORDER);
            BufferedWriter write2file = new BufferedWriter(new FileWriter("temp_0_" + f + ".txt")); //initialize new file
            for(int x = 0; x < num_lines; x++)
            {
                write2file.write(storyline.get(x));
                write2file.newLine();
            }
            write2file.close();
            f++;
        }
        num_files_initial = f;
    }

    public static void phase2() throws IOException
    {
        int file_n = 1;
        int prev_fn = 0;
        int t = 0;
        int g = 0;
        while(g < 5)
        {
            System.out.println(num_files_initial);
            if(t+1 > num_files_initial-1)
            {
                if(num_files_initial % 2 != 0)
                {
                    BufferedWriter w = new BufferedWriter(new FileWriter("temp_" + file_n + "_" + g + ".txt"));
                    Scanner file1 = new Scanner(new File("temp_" + prev_fn + "_" + t + ".txt"));
                    String word1 = file1.next();
                    while(file1.hasNext())
                    {
                        w.write(word1);
                        w.newLine();
                    }
                    g++;
                    break;
                }
                num_files_initial = num_files_initial / 2 + num_files_initial % 2;
                g = 0;
                t = 0;
                file_n++;
                prev_fn++;
            }
            String s1 = "temp_" + file_n + "_" + g + ".txt";
            String s2 = "temp_" + prev_fn + "_" + t + ".txt";
            String s3 = "temp_" + prev_fn + "_" + (t+1) + ".txt";
            System.out.println(s2);
            System.out.println(s3);
            BufferedWriter w = new BufferedWriter(new FileWriter(s1));
            Scanner file1 = new Scanner(new File(s2));
            Scanner file2 = new Scanner(new File(s3));
            String word1 = file1.next();
            String word2 = file2.next();
            System.out.println(num_files_initial);
            //System.out.println(t);
            //System.out.println(g);
            while(file1.hasNext() && file2.hasNext())
            {
                if(word1.compareTo(word2) == 1) //if word 1 comes first = 1
                {
                    w.write(word1);
                    w.newLine();
                    file1.next();
                }
                if(word1.compareTo(word2) == 0) //if word 1 comes second = 0
                {
                    w.write(word2);
                    w.newLine();
                    file2.next();
                }
            }
            while(file1.hasNext())
            {
                w.write(word1);
                w.newLine();
                break;
            }
            while(file2.hasNext())
            {
                w.write(word2);
                w.newLine();
                break;
            }
            g++;
            t += 2;
            w.close();
            file1.close();
            file2.close();
        }
    }
}
After writing each chunk to its file you never clear the storyline list, so all of the previously sorted words are copied into every subsequent file as well. Here are some fixes:
...
int f = 0;
while(story.hasNext())
{
    // initialize the array here
    storyline = new ArrayList<>();
    int i = 0;
    while(story.hasNext())
    {
        String temp = story.next();
        storyline.add(temp);
        i++;
        if(i > num_lines)
        {
            break;
        }
    }
    Collections.sort(storyline, String.CASE_INSENSITIVE_ORDER);
    BufferedWriter write2file = new BufferedWriter(new FileWriter("temp_0_" + f + ".txt")); //initialize new file
    // instead of num_lines, use i
    for(int x = 0; x < i; x++)
    {
        write2file.write(storyline.get(x));
        write2file.newLine();
    }
    write2file.close();
    f++;
}
num_files_initial = f;
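One more thing for phase 2: compareTo() does not return exactly 1 or 0; it returns some negative, zero, or positive int, so checks like == 1 will skip most words, and word1/word2 are never re-read after the initial next() calls, so the same word can be written repeatedly. A minimal sketch of a corrected two-way merge loop, assuming the same w, file1, and file2 from your phase2():

String word1 = file1.hasNext() ? file1.next() : null;
String word2 = file2.hasNext() ? file2.next() : null;
while(word1 != null && word2 != null)
{
    if(word1.compareTo(word2) <= 0) // negative or zero: word1 sorts first (or ties)
    {
        w.write(word1);
        w.newLine();
        word1 = file1.hasNext() ? file1.next() : null; // advance only the file we consumed
    }
    else // positive: word2 sorts first
    {
        w.write(word2);
        w.newLine();
        word2 = file2.hasNext() ? file2.next() : null;
    }
}
while(word1 != null) // drain whatever is left in file1
{
    w.write(word1);
    w.newLine();
    word1 = file1.hasNext() ? file1.next() : null;
}
while(word2 != null) // ...and in file2
{
    w.write(word2);
    w.newLine();
    word2 = file2.hasNext() ? file2.next() : null;
}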
Hope this helps.
I have a project where I am to write data (strings and ints) into a binary random-access file, and read the data back in a separate class. The problem I have is that I'm trying to iterate through the file and read the data in a specific order (int, String, String, int), however the Strings vary in byte size.
I am getting an EOFException but cannot figure out why.
Here is the class which writes the data. Part of the requirements is to limit the number of bytes for the Strings and throw a user-defined exception if the limit is exceeded.
import java.io.RandomAccessFile;
import java.io.IOException;
import java.io.FileNotFoundException;
import java.util.Arrays;
import java.util.ArrayList;
import java.io.File;

public class QuestionBank {

    private RandomAccessFile file;
    private ArrayList<Questions> listQuestions;

    public QuestionBank() {
        file = null;
        listQuestions = new ArrayList<Questions>();
    }

    public void storeQuestion(Questions ques) throws IOException {
        ques = new Questions(ques.getQuesIDNum(), ques.getQuestion(), ques.getAnswer(), ques.getValue());
        listQuestions.add(ques);
        byte[] quesBytes = ques.getQuestion().getBytes("UTF-8");
        byte[] ansBytes = ques.getAnswer().getBytes("UTF-8");
        try {
            file = new RandomAccessFile(new File("Question.bin"), "rw");
            long fileSize = file.length();
            file.seek(fileSize);
            file.writeInt(ques.getQuesIDNum());
            file.writeUTF(ques.getQuestion());
            for (int i = 0; i <= 50 - ques.getQuestion().length(); i++) {
                file.writeByte(50);
            }
            if (quesBytes.length > 50) {
                throw new ByteSizeException("Question has too many bytes");
            }
            file.writeUTF(ques.getAnswer());
            for (int i = 0; i <= 20 - ques.getAnswer().length(); i++) {
                file.writeByte(20);
            }
            if (ansBytes.length > 20) {
                throw new ByteSizeException("Answer has too many bytes");
            }
            file.writeInt(ques.getValue());
            file.close();
        } catch (IOException e) {
            System.out.println("I/O Exception Found");
        } catch (ByteSizeException eb) {
            System.out.println("String has too many bytes");
        }
    }
}
Here is the class which reads the file.
import java.util.ArrayList;
import java.util.Random;
import java.io.RandomAccessFile;
import java.io.IOException;
import java.io.FileNotFoundException;
import java.io.File;

public class TriviaGame {

    public static final int RECORD = 78;
    private ArrayList<Questions> quesList;
    private int IDNum;
    private String question;
    private String answer;
    private int points;

    public TriviaGame() {
        quesList = new ArrayList<Questions>();
        IDNum = 0;
        question = "";
        answer = "";
        points = 0;
    }

    public void read() {
        try {
            RandomAccessFile file;
            file = new RandomAccessFile(new File("Question.bin"), "r");
            long fileSize = file.length();
            long numRecords = fileSize / RECORD;
            file.seek(0);
            for (int i = 0; i < numRecords; i++) {
                IDNum = file.readInt();
                question = file.readUTF();
                answer = file.readUTF();
                points = file.readInt();
                System.out.println("ID: " + IDNum + " Question: " + question + " Answer: " + answer + " Points: " + points);
            }
            file.close();
        } catch (IOException e) {
            System.out.println(e.getClass());
            System.out.println("I/O Exception found");
        }
    }
}
Thanks
file.writeUTF(ques.getQuestion());
Here you have written the question.
for (int i = 0; i <= 50 - ques.getQuestion().length(); i++) {
    file.writeByte(50);
}
if (quesBytes.length > 50) {
    throw new ByteSizeException("Question has too many bytes");
}
Here, for some unknown reason, you are padding the question out to 50 bytes. Remove that, and the same for the answer. You are using readUTF() to read both of these fields, so all you need is writeUTF() to write them. No padding is required, and with the padding the writer and reader disagree about the record layout, which is what misaligns the reads and eventually produces the EOFException.
Or, if you insist on this padding, you have to skip over it when reading: after the first readUTF(), skip past the padding bytes before reading the next field.
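Without the padding the records become variable length, so the reader's fileSize/RECORD arithmetic no longer applies either. A minimal sketch of matched write and read logic under that assumption, keeping your field order (int, UTF, UTF, int):

    // writer side: one record, no padding
    file.writeInt(ques.getQuesIDNum());
    file.writeUTF(ques.getQuestion());
    file.writeUTF(ques.getAnswer());
    file.writeInt(ques.getValue());

    // reader side: loop on the file pointer instead of a fixed record size
    while (file.getFilePointer() < file.length()) {
        IDNum = file.readInt();
        question = file.readUTF();
        answer = file.readUTF();
        points = file.readInt();
        System.out.println("ID: " + IDNum + " Question: " + question + " Answer: " + answer + " Points: " + points);
    }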
I'm trying to find a way to use
copyInputStreamToFile(InputStream source, File destination)
to make a small progress bar in the console, based on the file size. Is there a way to do this?
The short answer: you can't. Look at the source code of this method; I tracked its execution path, and it ends up in this method of the IOUtils class:
public static long copyLarge(final InputStream input, final OutputStream output, final byte[] buffer)
        throws IOException {
    long count = 0;
    int n = 0;
    while (EOF != (n = input.read(buffer))) {
        output.write(buffer, 0, n);
        count += n;
    }
    return count;
}
So the copy loop is fully encapsulated by the API; there is no hook for reporting progress.
The long answer: you can implement the download method yourself, using the relevant parts of IOUtils and FileUtils, and add the functionality to print the percentage of the downloaded file to the console.
This is a working kick-off example:
package apache.utils.custom;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import org.apache.commons.io.FileUtils;
import org.apache.commons.io.IOUtils;

public class Downloader {

    private static final int EOF = -1;
    private static final int DEFAULT_BUFFER_SIZE = 1024 * 4;

    public static void copyInputStreamToFileNew(final InputStream source, final File destination, int fileSize) throws IOException {
        try {
            final FileOutputStream output = FileUtils.openOutputStream(destination);
            try {
                final byte[] buffer = new byte[DEFAULT_BUFFER_SIZE];
                long count = 0;
                int n = 0;
                while (EOF != (n = source.read(buffer))) {
                    output.write(buffer, 0, n);
                    count += n;
                    System.out.println("Completed " + count * 100 / fileSize + "%");
                }
                output.close(); // don't swallow close Exception if copy completes normally
            } finally {
                IOUtils.closeQuietly(output);
            }
        } finally {
            IOUtils.closeQuietly(source);
        }
    }
}
You should provide the expected file size to this method, which you can obtain using this code (note that getContentLength() returns -1 if the server does not send a Content-Length header):
URL url = new URL(urlString);
URLConnection urlConnection = url.openConnection();
urlConnection.connect();
int file_size = urlConnection.getContentLength();
Of course the better idea is to encapsulate the whole functionality in a single method.
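For instance, a hypothetical wrapper along those lines, reusing copyInputStreamToFileNew from above (the method name and parameters are illustrative):

    public static void downloadWithProgress(String urlString, File destination) throws IOException {
        URL url = new URL(urlString);
        URLConnection urlConnection = url.openConnection();
        urlConnection.connect();
        int fileSize = urlConnection.getContentLength(); // -1 if the server omits Content-Length
        copyInputStreamToFileNew(urlConnection.getInputStream(), destination, fileSize);
    }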
Hope it will help you.
I was trying to read from one file and write the bytes read to another file using the classes specified in the title. I did that successfully, but while experimenting with a few variations I came across a problem which I do not understand.
Here is the code:
import java.io.*;

public class FileInputStreamDemo {

    public static void main(String[] args) throws Exception {
        int size;
        InputStream f = new FileInputStream("G:/Eclipse Workspace/FileInputStream Demo/src/FileInputStreamDemo.java");
        System.out.println("Total available bytes: " + (size = f.available()));
        /*int n = size/40;
        System.out.println("first " + n + " bytes of file one read() at a time");
        for (int i = 0; i < n; i++)
        {
            System.out.print((char) f.read());
        }
        System.out.println("\n Still available: " + f.available());
        System.out.println("reading the next" + n + "with one read(b[])");
        byte b[] = new byte[n]; */
        /*for(int i = 0; i < size; i++)
        {
            System.out.print((char) f.read());
        }*/
        OutputStream f1 = new FileOutputStream("G:/Eclipse Workspace/FileInputStream Demo/test.txt");
        for (int count = 0; count < size; count++) {
            f1.write(f.read());
        }
        for (int i = 0; i < size; i++) {
            System.out.print(f.read());
        }
        f.close();
        f1.close();
    }
}
The problem I am talking about is this: the first loop reads from the FileInputStream object f (f.read()) and writes the bytes to the FileOutputStream object f1, and that works as intended; but when I try to read from f again in the second loop, read() returns -1. Why so?
A FileInputStream only reads forward: once your copy loop has consumed the whole file, the stream is at end-of-file and every further read() returns -1. Use RandomAccessFile and its seek(0) method to go back to the beginning.
RandomAccessFile file = new RandomAccessFile(new File("G:/Eclipse Workspace/FileInputStream Demo/src/FileInputStreamDemo.java"), "r");
Here is sample code:
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.RandomAccessFile;

public class FileInputStreamDemo {

    public static void main(String[] args) throws Exception {
        long size;
        File file = new File("D:/Workspace/JavaProject/src/com/test/FileInputStreamDemo.java");
        RandomAccessFile f = new RandomAccessFile(file, "r");
        System.out.println("Total available bytes: " + (size = file.length()));
        OutputStream f1 = new FileOutputStream(new File(
                "D:/Workspace/JavaProject/resources/test.txt"));
        for (int count = 0; count < size; count++) {
            f1.write(f.read());
        }
        f.seek(0); // jump back to the start of the file for the second pass
        for (int i = 0; i < size; i++) {
            System.out.print((char) f.read());
        }
        f.close();
        f1.close();
    }
}
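If you would rather keep the stream classes from the question: a FileInputStream cannot be rewound (its markSupported() returns false), but you can simply close and reopen it for a second pass. A sketch, with path standing in for the source file name:

    String path = "G:/Eclipse Workspace/FileInputStream Demo/src/FileInputStreamDemo.java";
    InputStream f = new FileInputStream(path);
    // ... first pass: copy the bytes into the FileOutputStream ...
    f.close();
    f = new FileInputStream(path); // a fresh stream starts at byte 0 again
    // ... second pass: print the bytes ...
    f.close();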
I'm trying to do a few performance enhancements and am looking to use memory-mapped files for writing data. I did a few tests and, surprisingly, MappedByteBuffer seems slower than allocating direct buffers. I'm not able to clearly understand why this would be the case. Can someone please hint at what could be going on behind the scenes? Below are my test results:
I'm allocating 32 KB buffers. I had already created the files at 3 GB each before starting the tests, so growing the file isn't the issue.
I'm adding the code that I used for this performance test. Any input / explanation about this behavior is much appreciated.
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;

public class MemoryMapFileTest {

    /**
     * @param args
     * @throws IOException
     */
    public static void main(String[] args) throws IOException {
        for (int i = 0; i < 10; i++) {
            runTest();
        }
    }

    private static void runTest() throws IOException {
        FileChannel ch1 = null;
        FileChannel ch2 = null;
        ch1 = new RandomAccessFile(new File("S:\\MMapTest1.txt"), "rw").getChannel();
        ch2 = new RandomAccessFile(new File("S:\\MMapTest2.txt"), "rw").getChannel();
        FileWriter fstream = new FileWriter("S:\\output.csv", true);
        BufferedWriter out = new BufferedWriter(fstream);
        int[] numberofwrites = {1, 10, 100, 1000, 10000, 100000};
        //int n = 10000;
        try {
            for (int j = 0; j < numberofwrites.length; j++) {
                int n = numberofwrites[j];
                long estimatedTime = 0;
                long mappedEstimatedTime = 0;
                for (int i = 0; i < n; i++) {
                    byte b = (byte) Math.random();
                    long allocSize = 1024 * 32;
                    estimatedTime += directAllocationWrite(allocSize, b, ch1);
                    mappedEstimatedTime += mappedAllocationWrite(allocSize, b, i, ch2);
                }
                double avgDirectEstTime = (double) estimatedTime / n;
                double avgMapEstTime = (double) mappedEstimatedTime / n;
                out.write(n + "," + avgDirectEstTime / 1000000 + "," + avgMapEstTime / 1000000);
                out.write("," + ((double) estimatedTime / 1000000) + "," + ((double) mappedEstimatedTime / 1000000));
                out.write("\n");
                System.out.println("Avg Direct alloc and write: " + estimatedTime);
                System.out.println("Avg Mapped alloc and write: " + mappedEstimatedTime);
            }
        } finally {
            out.write("\n\n");
            if (out != null) {
                out.flush();
                out.close();
            }
            if (ch1 != null) {
                ch1.close();
            } else {
                System.out.println("ch1 is null");
            }
            if (ch2 != null) {
                ch2.close();
            } else {
                System.out.println("ch2 is null");
            }
        }
    }

    private static long directAllocationWrite(long allocSize, byte b, FileChannel ch1) throws IOException {
        long directStartTime = System.nanoTime();
        ByteBuffer byteBuf = ByteBuffer.allocateDirect((int) allocSize);
        byteBuf.put(b);
        ch1.write(byteBuf);
        return System.nanoTime() - directStartTime;
    }

    private static long mappedAllocationWrite(long allocSize, byte b, int iteration, FileChannel ch2) throws IOException {
        long mappedStartTime = System.nanoTime();
        MappedByteBuffer mapBuf = ch2.map(MapMode.READ_WRITE, iteration * allocSize, allocSize);
        mapBuf.put(b);
        return System.nanoTime() - mappedStartTime;
    }
}
You're testing the wrong thing. This is not how to write the code in either case. You should allocate the buffer once, and just keep updating its contents. You're including allocation time in the write time. Not valid.
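A minimal sketch of what that means for the direct-buffer case, assuming the same ch1 and n as in the question; the buffer is created once outside the timed region, so only put() and write() are measured:

    ByteBuffer buf = ByteBuffer.allocateDirect(32 * 1024); // allocate once, outside the loop
    long elapsed = 0;
    for (int i = 0; i < n; i++) {
        buf.clear();                    // reset position and limit; no reallocation
        buf.put((byte) Math.random());  // almost always 0, as in the original code
        buf.rewind();                   // rewind so write() sends the whole 32 KB buffer
        long start = System.nanoTime();
        ch1.write(buf);
        elapsed += System.nanoTime() - start;
    }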
Swapping data to disk is the main reason for MappedByteBuffer to be slower than DirectByteBuffer.
The cost of allocation and deallocation is high with direct buffers, including MappedByteBuffer, and this cost is accrued in both examples. The only remaining difference is writing to disk, which happens in the MappedByteBuffer case but not with the direct ByteBuffer.