How to download a PDF from a given URL in Java? [duplicate]

This question already has answers here:
How can I download and save a file from the Internet using Java?
(23 answers)
Closed 4 years ago.
I want to make a Java application that when executed downloads a file from a URL. Is there any function that I can use in order to do this?
This piece of code worked only for a .txt file:
URL url= new URL("http://cgi.di.uoa.gr/~std10108/a.txt");
BufferedReader in = new BufferedReader(
new InputStreamReader(url.openStream()));
PrintWriter writer = new PrintWriter("file.txt", "UTF-8");
String inputLine;
while ((inputLine = in.readLine()) != null){
writer.write(inputLine+ System.getProperty( "line.separator" ));
System.out.println(inputLine);
}
writer.close();
in.close();

Don't use Readers and Writers here: they are designed for text, and a PDF is not plain text (it also contains binary data such as embedded fonts and images). Use streams instead and copy the raw bytes.
Open a connection with the URL class, then read from its InputStream and write the raw bytes to your file.
(This is a simplified example; you still need to handle exceptions and make sure the streams are closed in the right places.)
System.out.println("opening connection");
URL url = new URL("https://upload.wikimedia.org/wikipedia/en/8/87/Example.JPG");
InputStream in = url.openStream();
FileOutputStream fos = new FileOutputStream(new File("yourFile.jpg"));
System.out.println("reading from resource and writing to file...");
int length = -1;
byte[] buffer = new byte[1024];// buffer for portion of data from connection
while ((length = in.read(buffer)) > -1) {
    fos.write(buffer, 0, length);
}
fos.close();
in.close();
System.out.println("File downloaded");
Since Java 7 we can also use Files.copy together with try-with-resources, which closes the InputStream automatically (the stream doesn't have to be closed manually in this case):
URL url = new URL("https://upload.wikimedia.org/wikipedia/en/8/87/Example.JPG");
try (InputStream in = url.openStream()) {
    Files.copy(in, Paths.get("someFile.jpg"), StandardCopyOption.REPLACE_EXISTING);
} catch (IOException e) {
    // handle exception
}
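For reference, the same idea can be wrapped in a small reusable helper. This is only a sketch: the class name BinaryDownload and the method name download are made up for illustration, and error handling is left to the caller.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, not part of any library
final class BinaryDownload {

    /** Copies the raw bytes behind source to target and returns the number of bytes written. */
    static long download(URL source, Path target) throws IOException {
        // try-with-resources closes the stream even if the copy fails
        try (InputStream in = source.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

Because no Reader or Writer is involved, the same call should work for a PDF, a JPG or a ZIP, for example BinaryDownload.download(url, Paths.get("someFile.jpg")).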

Related

Blank pages in pdf after downloading it from web

I am trying to download a PDF file with HttpClient. The file downloads, but its pages are blank. I can see the bytes from the response if I print them to the console, but when I write them to a file the result is a blank document.
FileUtils.writeByteArrayToFile(new File(outputFilePath), bytes);
The files have the expected sizes (103KB and 297KB), but they are just blank!
I tried with an output stream as well:
FileOutputStream fileOutputStream = new FileOutputStream(outFile);
fileOutputStream.write(bytes);
I also tried writing the data with UTF-8 encoding:
Writer out = new BufferedWriter( new OutputStreamWriter(
new FileOutputStream(outFile), "UTF-8"));
String str = new String(bytes, StandardCharsets.UTF_8);
try {
out.write(str);
} finally {
out.close();
}
None of this works for me. Any suggestions are highly appreciated.
Update: I am using DefaultHttpClient.
HttpGet httpget = new HttpGet(targetURI);
HttpResponse response = null;
String htmlContents = null;
try {
httpget = new HttpGet(url);
response = httpclient.execute(httpget);
InputStreamReader dataStream=new InputStreamReader(response.getEntity().getContent());
byte[] bytes = IOUtils.toByteArray(dataStream);
...

Your code does this:
InputStreamReader dataStream=new InputStreamReader(response.getEntity().getContent());
byte[] bytes = IOUtils.toByteArray(dataStream);
As has already been mentioned in comments, using a Reader class can damage binary data, e.g. PDF files. Thus, you should not wrap your content in an InputStreamReader.
As your content can be used to construct an InputStreamReader, though, I assume response.getEntity().getContent() returns an InputStream. Such an InputStream usually can be directly used as IOUtils.toByteArray argument.
So:
InputStream dataStream=response.getEntity().getContent();
byte[] bytes = IOUtils.toByteArray(dataStream);
should already work for you!
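To see why a Reader damages binary data, here is a small self-contained sketch. It assumes UTF-8 as the charset (the actual default depends on the platform), and the sample bytes are made up for illustration; they are not taken from a real PDF.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class
final class ReaderCorruptionDemo {

    /** Decodes the bytes as UTF-8 text and encodes them again, as a Reader/Writer pair would. */
    static byte[] roundTripThroughString(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Made-up sample: two ASCII bytes followed by bytes that are not valid UTF-8
        byte[] binary = {(byte) 0x25, (byte) 0x50, (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};
        byte[] after = roundTripThroughString(binary);
        // prints "unchanged: false": the invalid sequences were replaced during decoding
        System.out.println("unchanged: " + Arrays.equals(binary, after));
    }
}
```

Pure ASCII input survives this round trip, which would explain why a Reader-based approach can appear to work for a .txt file and still break a PDF.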

Here is a method I use to download a PDF file from a specific URL. The method requires two string arguments: a URL string (example: "https://www.ibm.com/support/knowledgecenter/SSWRCJ_4.1.0/com.ibm.safos.doc_4.1/Planning_and_Installation.pdf") and a destination folder path to download the PDF file (or whatever) into. If the destination path does not exist in the local file system, it is created automatically:
public boolean downloadFile(String urlString, String destinationFolderPath) {
    boolean result = false; // will turn to true if download is successful
    if (!destinationFolderPath.endsWith("/") && !destinationFolderPath.endsWith("\\")) {
        destinationFolderPath+= "/";
    }
    // If the destination path does not exist then create it.
    File foldersToMake = new File(destinationFolderPath);
    if (!foldersToMake.exists()) {
        foldersToMake.mkdirs();
    }
    try {
        // Parse the URL
        URL url = new URL(urlString);
        // Get just the file Name from URL
        String fileName = new File(url.getPath()).getName();
        // Try with Resources....
        try (InputStream in = url.openStream(); FileOutputStream outStream =
                new FileOutputStream(new File(destinationFolderPath + fileName))) {
            // Read from resource and write to file...
            int length = -1;
            byte[] buffer = new byte[1024]; // buffer for portion of data from connection
            while ((length = in.read(buffer)) > -1) {
                outStream.write(buffer, 0, length);
            }
        }
        // File successfully downloaded
        result = true;
    }
    catch (MalformedURLException ex) { ex.printStackTrace(); }
    catch (IOException ex) { ex.printStackTrace(); }
    return result;
}

I have stored image to inputstream from html,so i want to store the inputstream to folder in my drive [duplicate]

This question already has answers here:
Easy way to write contents of a Java InputStream to an OutputStream
(24 answers)
Closed 6 years ago.
Part filePart = request.getPart("barcodePhtotoVen");
inputStream = filePart.getInputStream();
How to write this using outputstream to folder/file in my computer drive?

I hope the following code snippet helps.
Update:
OutputStream out = null;
InputStream filecontent = null;
try {
    out = new FileOutputStream(new File("destination_file_path"));
    filecontent = filePart.getInputStream();
    int read = 0;
    final byte[] bytes = new byte[1024];
    while ((read = filecontent.read(bytes)) != -1) {
        out.write(bytes, 0, read);
    }
} catch (FileNotFoundException f) {
    // the destination path could not be opened for writing
    f.printStackTrace();
} finally {
    if (out != null) {
        out.close();
    }
    if (filecontent != null) {
        filecontent.close();
    }
}
I tested both the earlier code and this updated code on my system, and both worked fine. I hope this helps.

Do not use an InputStream and do not use an OutputStream. Part has a write(String) method which writes the part directly to a file.

I have stored image to inputstream
No you haven't. You haven't stored the image anywhere, let alone to an input stream, which is a contradiction in terms. You need to read from the input stream and write to a disk file.
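Reading from the input stream and writing to a disk file can be sketched with Files.copy as follows. The class name StreamSaver and the method save are hypothetical, and the folder and file name are supplied by the caller.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class
final class StreamSaver {

    /** Writes everything readable from in to folder/fileName and returns the file's path. */
    static Path save(InputStream in, Path folder, String fileName) throws IOException {
        Files.createDirectories(folder);          // does nothing if the folder already exists
        Path target = folder.resolve(fileName);
        try (InputStream source = in) {           // closes the caller's stream when done
            Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }
}
```

With the upload from the question this might be called as StreamSaver.save(filePart.getInputStream(), folder, name), where folder and name are values you choose. A file name taken from the upload itself should be checked before it is used to build a path.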

When reading and writing a file the output changes

I'm trying to read a file (the extension doesn't matter) and then write it back out, but the output file is different from the input.
My code is the following:
OutputStream outputStream = null;
FileReader fr = new FileReader("rute\\inputfile.PNG");
BufferedReader br = new BufferedReader(fr);
String line;
while ((line= br.readLine()) != null) {
content += line;
}
byte[] toBytes= content.getBytes();
InputStream inputStream = new ByteArrayInputStream(toBytes);
try {
outputStream = new FileOutputStream(new File("rute\\output.PNG"));
int read = 0;
byte[] bytes = new byte[1024];
while ((read = inputStream.read(bytes)) != -1) {
outputStream.write(bytes, 0, read);
}
outputStream.close();
} catch (Exception e) {
e.printStackTrace();
}
inputStream.close();
If you ask why I convert to bytes and write from that form: I need to do something with the data, and I need this conversion for that.
If you tell me that I can't load an image into a String, I know I could do something like this instead:
File fil = ~~~~;
FileInputStream fis = null;
fis = new FileInputStream(fil);
byte[] bytess = IOUtils.toByteArray(fis);
But I don't want to do it this way, because with big files the heap is not large enough; I thought reading "line by line" would solve that.
Thanks for your answers.

I recommend reading this question first. Because you are reading binary data into a String, the bytes are decoded and re-encoded as text, so the output will be different.
The best approach is to read binary files as byte arrays, but it will depend on what kind of transformation or editing you need to do with them.
UPDATE
And, of course, you are modifying the content before writing it (readLine() drops the original line terminators):
while ((line= br.readLine()) != null) {
    content += line + "\n";
}
so your output file will always be different.
UPDATE 2
Since the real problem is how to read a big binary file, Google is usually your friend.
Or you can check this other question
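A big binary file does not need to fit into the heap if it is processed one buffer at a time. The sketch below copies a file in 8 KiB chunks; the CRC-32 update is only a stand-in for whatever "do something with the data" means in your case, and the names ChunkedCopy and copyWithChecksum are made up.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

// Hypothetical helper class
final class ChunkedCopy {

    /** Copies source to target in 8 KiB chunks and returns the CRC-32 of the bytes seen. */
    static long copyWithChecksum(Path source, Path target) throws IOException {
        CRC32 crc = new CRC32();                 // placeholder for the real per-chunk processing
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = Files.newOutputStream(target)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                crc.update(buffer, 0, read);     // handle only the bytes actually read
                out.write(buffer, 0, read);
            }
        }
        return crc.getValue();
    }
}
```

Memory use stays at the size of the buffer regardless of the file size, and because the data never passes through a String, the output is byte-for-byte identical to the input.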

ZipFile corrupted after downloading using Java [duplicate]

This question already has answers here:
How can I download and save a file from the Internet using Java?
(23 answers)
Closed 8 years ago.
I downloaded zip files using Java, but when I try to open them I get an error saying the file can't be opened.
What is the problem?
Is it caused by too little memory?
Here is the code for downloading the zip files:
try {
for(int i=0;i<URL_LOCATION.length;i++) {
url = new URL(URL_LOCATION[i]);
connection = url.openConnection();
stream = new BufferedInputStream(connection.getInputStream());
int available = stream.available();
b = new byte[available];
stream.read(b);
File file = new File(LOCAL_FILE[i]);
OutputStream out = new FileOutputStream(file);
out.write(b);
}
} catch (Exception e) {
System.err.println(e.toString());
}
Solution: the code below comes from the linked question, How to download and save a file from Internet using Java?
BufferedInputStream in = null;
FileOutputStream fout = null;
try
{
in = new BufferedInputStream(new URL(urlString).openStream());
fout = new FileOutputStream(filename);
byte data[] = new byte[1024];
int count;
while ((count = in.read(data, 0, 1024)) != -1)
{
fout.write(data, 0, count);
}
}
finally
{
if (in != null)
in.close();
if (fout != null)
fout.close();
}

You are using the available() call to determine how many bytes to read. That's plainly wrong (see the Javadoc of InputStream for details): available() only tells you how much data can be read immediately without blocking, not the total length of the stream.
You need a loop that reads from the stream until read() returns -1 (end of stream) as the number of bytes read.
I recommend you review the tutorial on streams: http://docs.oracle.com/javase/tutorial/essential/io/bytestreams.html
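A minimal sketch of such a loop, independent of what available() reports (the class and method names are illustrative only):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper class
final class ReadUntilEof {

    /** Reads the stream to its end, regardless of what available() reports. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int count;
        while ((count = in.read(buffer)) != -1) {   // -1 signals end of stream
            collected.write(buffer, 0, count);      // a read may return fewer bytes than the buffer holds
        }
        return collected.toByteArray();
    }
}
```

On Java 9 and later, InputStream.readAllBytes() and InputStream.transferTo(OutputStream) do the same job. For large downloads it is better to write each chunk straight to a FileOutputStream than to collect everything in memory first.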

Read Image from Socket [duplicate]

This question already has an answer here:
Closed 11 years ago.
Possible Duplicate:
Read Image File Through Java Socket
void readImage() throws IOException
{
socket = new Socket("upload.wikimedia.org", 80);
DataOutputStream bw = new DataOutputStream(new DataOutputStream(socket.getOutputStream()));
bw.writeBytes("GET /wikipedia/commons/8/80/Knut_IMG_8095.jpg HTTP/1.1\n");
bw.writeBytes("Host: wlab.cs.bilkent.edu.tr:80\n\n");
DataInputStream in = new DataInputStream(socket.getInputStream());
File file = new File("imgg.jpg");
file.createNewFile();
DataOutputStream dos = new DataOutputStream(new FileOutputStream(file));
int count;
byte[] buffer = new byte[8192];
while ((count = in.read(buffer)) > 0)
{
dos.write(buffer, 0, count);
dos.flush();
}
dos.close();
System.out.println("image transfer done");
socket.close();
}
-Create a socket
-Create output stream
-Request the page that includes image
-Read socket to an input stream
-Write to file
I am trying to read an image from a socket, but it is not working.
The data seems to be read and the image file can be opened, but the picture is not displayed.
Where is the problem?

You need to skip the HTTP headers to get a correct image.
I've already answered this question today; see: Read Image File Through Java Socket
The second problem is that you are trying to fetch an image from Wikipedia without a Referer header, and Wikipedia restricts that (you receive "access denied" every time). Try another image URL (a Google image, for example).
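Skipping the headers on a raw socket stream could look like the sketch below. It rests on several assumptions: the server closes the connection after the response (for example because the request sent "Connection: close"), the body is neither chunked nor compressed, and the status code does not need checking. The class HttpBodyReader is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper class
final class HttpBodyReader {

    /** Consumes the status line and headers, leaving the stream positioned at the body. */
    static void skipHeaders(InputStream in) throws IOException {
        int matched = 0;                          // how many bytes of "\r\n\r\n" have matched so far
        int b;
        while (matched < 4 && (b = in.read()) != -1) {
            boolean expectCr = (matched % 2 == 0);
            if (b == (expectCr ? '\r' : '\n')) {
                matched++;
            } else {
                matched = (b == '\r') ? 1 : 0;    // a stray CR may start a new match
            }
        }
        if (matched < 4) {
            throw new IOException("stream ended before the header block finished");
        }
    }

    /** Skips the headers and returns the remaining bytes up to end of stream. */
    static byte[] readBody(InputStream in) throws IOException {
        skipHeaders(in);
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int count;
        while ((count = in.read(buffer)) != -1) {
            body.write(buffer, 0, count);
        }
        return body.toByteArray();
    }
}
```

When reading from a socket, wrapping socket.getInputStream() in a BufferedInputStream avoids one system call per header byte. Note also that HTTP request lines end with "\r\n" rather than a bare "\n". In practice it is simpler to let URL or an HTTP client library handle the protocol.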

You can use URL objects directly to fetch HTTP content. The input stream returned by the URL object will only contain content at the URL. The example method below takes a URL, fetches its content and writes the content to a given file.
public static void createImageFile(URL url, File file) throws IOException{
    FileOutputStream fos = null;
    InputStream is = null;
    byte[] b = new byte[1024]; // 1 kB read blocks.
    URLConnection conn;
    try{
        conn = url.openConnection();
        /* Set some connection options here
           before opening the stream
           (i.e. connect and read timeouts) */
        is = conn.getInputStream();
        fos = new FileOutputStream(file);
        int i = 0;
        do{
            i = is.read(b);
            if(i != -1)
                fos.write(b, 0, i);
        }while(i != -1);
    }finally{
        /* Don't forget to clean up. */
        if(is != null){
            try{
                is.close();
            }catch(Exception e){
                /* Don't care */
            }
        }
        if(fos != null){
            try{
                fos.close();
            }catch(Exception e){
                /* Don't care */
            }
        }
    }
}
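The connection options mentioned in the comment of the method above could be set as in this sketch. The class name TimedDownload and the timeout parameters are made up for illustration; suitable timeout values depend on your network.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class
final class TimedDownload {

    /** Downloads url to target, failing if connecting or a single read exceeds the given limits. */
    static long download(URL url, Path target, int connectMillis, int readMillis) throws IOException {
        URLConnection conn = url.openConnection();
        conn.setConnectTimeout(connectMillis);   // must be set before the stream is opened
        conn.setReadTimeout(readMillis);         // 0 would mean "wait forever"
        try (InputStream in = conn.getInputStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

With timeouts set, a stalled server should cause a SocketTimeoutException (a subclass of IOException) instead of blocking the download indefinitely.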
