How to download a file by URL with jsoup - Java

I have a website from which I need to download an Excel file. I need to send parameters to the site URL with jsoup to download the file. When I call bodyStream() I get an error, and I do not know why it happens or how to fix it.
Connection con = Jsoup.connect(url);
File downloadFile = File.createTempFile("TMP", ".xlsx");
con=con.timeout(300000);
con = con.header("Connection", "keep-alive")
.header("Cache-Control", "max-age=0");
con=con.data(parameters);
con=con.cookies(cookie);
Connection.Response res = con.ignoreContentType(true).method(POST).execute();
FileUtils.copyInputStreamToFile(res.bodyStream(), downloadFile);
But I get java.lang.IllegalArgumentException: Request has already been read
Note: sometimes the download succeeds with the same code and parameters.
Can you tell me how to fix this and download the file this way?

The following worked for me (with some changes to specify a URL; but you can include your other changes such as setting the cookies and to POST).
This just uses the inbuilt Java helper utility to read the input stream and save it to a file.
Given the error message you mentioned, I wonder if the FileUtils method you're using (what dependency is that from?) is sometimes re-reading the file.
String url = "https://jsoup.org/rez/html5-logo.svg";
File downloadFile = File.createTempFile("TMP", ".svg");
Connection con = Jsoup.connect(url)
.timeout(300000)
.header("Cache-Control", "max-age=0")
.ignoreContentType(true);
Connection.Response res = con.execute();
BufferedInputStream body = res.bodyStream();
Files.copy(body, downloadFile.toPath(), StandardCopyOption.REPLACE_EXISTING);
System.out.println("Saved URL to " + downloadFile.getAbsolutePath());
Alternatively, if you still get the same error, you could try reading the whole body into a byte array before saving:
Connection.Response res = con.execute();
byte[] bytes = res.bodyAsBytes();
Files.write(downloadFile.toPath(), bytes);
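To illustrate why the byte-array alternative helps, here is a minimal sketch that uses only the JDK (no jsoup). A body stream can be consumed only once, so draining it into a byte[] first gives you data you can write, or retry writing, as often as needed. The class and method names (BufferOnce, readOnce, save) are hypothetical, and the InputStream parameter merely stands in for res.bodyStream().

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: buffer a one-shot stream, then write the bytes to disk.
class BufferOnce {

    /** Drains the stream exactly once, closes it, and returns everything it contained. */
    static byte[] readOnce(InputStream in) throws IOException {
        try (InputStream body = in) {
            return body.readAllBytes(); // Java 9+
        }
    }

    /** Writes the buffered bytes; safe to call repeatedly because no stream is involved. */
    static Path save(byte[] bytes, Path target) throws IOException {
        return Files.write(target, bytes);
    }
}
```

Holding the whole file in memory is fine for a typical spreadsheet; for very large downloads, copying the stream straight to the file (as in the first snippet) uses less memory.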

Related

(spring boot or java) I have a problem opening URL PDF

Spring Boot or Java: read/open a PDF from a URL and return it as a ResponseEntity attachment (.pdf file).
1. Call the URL https://xxxxx.xxx/file.pdf
2. Read the file from step 1 and display it, setting the response headers as follows:
Content-Type: application/pdf
Content-Transfer-Encoding: binary
Content-Disposition: attachment; filename=filename.pdf
Content-Length: xxxx
URL url = new URL(apiReportDomain
+ "/rest_v2/reports/reports/cms/loan_emergency/v1_0/RTP0003_02.pdf?i_ref_code=" + documentId);
System.out.println(url);
String encoding = Base64.getEncoder().encodeToString(
(apiReportUsername + ":" + apiReportPassword).getBytes(StandardCharsets.UTF_8));
HttpURLConnection connectionApi = (HttpURLConnection) url.openConnection();
connectionApi.setRequestMethod("GET");
connectionApi.setDoOutput(true);
connectionApi.setRequestProperty("Authorization", "Basic " + encoding);
connectionApi.setRequestProperty("Content-Type", "application/pdf");
InputStream content = connectionApi.getInputStream();
BufferedReader in = new BufferedReader(
new InputStreamReader(content));
StringBuilder sb = new StringBuilder();
int cp;
while ((cp = in.read()) != -1) {
sb.append((char) cp);
}
byte[] output = sb.toString().getBytes();
HttpHeaders responseHeaders = new HttpHeaders();
responseHeaders.set("charset", "utf-8");
responseHeaders.setContentType(MediaType.valueOf("application/pdf"));
responseHeaders.setContentLength(output.length);
responseHeaders.set("Content-disposition", "attachment; filename=filename.pdf");
return new ResponseEntity<byte[]>(output, responseHeaders, HttpStatus.OK);
The result I get is a blank page, but the PDF actually contains a full page of text.
Please update the question to say whether this works. I think the problem may be HTTPS and certificate verification on your original connection when the client downloads the file.
You need the certificate to decrypt the PDF, and you must formally accept the certificate. See the JCA cryptography API.
Also, the following is the best MIME type for sending a binary download:
Content-Type: application/octet-stream
https://docs.oracle.com/javase/7/docs/api/javax/net/ssl/HttpsURLConnection.html
The issue is that the server needs to fetch the file from the internet and then pass it on, instead of sending a redirect (which would look like cross-site traffic).
First, write code that fetches the PDF in a local test application.
You may need to use the Java SE HttpClient.
You may also need to imitate a browser (User-Agent), accept cookies, and follow a redirect. All of that can be checked in a browser's developer tools by looking at the network traffic in detail.
Then test that you can store the PDF response in a file.
Finally, wire the code into the Spring application, which produces the response in a very similar way. You could start with a dummy response that just writes some hard-coded bytes.
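For the first step (a standalone test that fetches the PDF), a minimal sketch with the Java SE HttpClient (java.net.http, Java 11+) could look like the following. It only builds the client and the request. The names PdfFetchSketch, newClient and pdfRequest are made up for illustration, the Basic authentication mirrors the question's code, and whether the real server also needs cookies or redirects is an assumption you would confirm in the browser's network tab.

```java
import java.net.CookieManager;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.Base64;

// Hypothetical helper for a local test application.
class PdfFetchSketch {

    /** Client that follows redirects and keeps cookies between requests. */
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .cookieHandler(new CookieManager())
                .connectTimeout(Duration.ofSeconds(30))
                .build();
    }

    /** GET request with HTTP Basic credentials, as in the question. */
    static HttpRequest pdfRequest(String url, String user, String password) {
        String token = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(URI.create(url))
                .header("Authorization", "Basic " + token)
                .header("Accept", "application/pdf")
                .GET()
                .build();
    }
}
```

Sending it with newClient().send(request, HttpResponse.BodyHandlers.ofFile(Path.of("report.pdf"))) writes the raw bytes straight to disk, so you can open the file and check that it is a valid PDF before touching the Spring code.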
After the info added to the question
You go wrong on two points:
1. PDFs are binary data, whereas a String holds Unicode text at 2 bytes per char, so a conversion back and forth is required: the data gets corrupted, memory usage doubles, and it is slow.
(For real text, String.getBytes(Charset) and new String(byte[], Charset) avoid depending on the default Charset of the executing machine.)
2. Keeping the whole PDF in memory first is not strictly needed, but if you stream it instead you can no longer set the Content-Length header.
// Copy the raw bytes; no Reader or String is involved, so nothing is re-encoded.
InputStream content = connectionApi.getInputStream();
ByteArrayOutputStream baos = new ByteArrayOutputStream();
content.transferTo(baos); // Java 9+
byte[] output = baos.toByteArray();
HttpHeaders responseHeaders = new HttpHeaders();
responseHeaders.set("charset", "utf-8");
responseHeaders.setContentType(MediaType.valueOf("application/pdf"));
responseHeaders.setContentLength(output.length);
responseHeaders.set("Content-disposition",
"attachment; filename=filename.pdf");
// Same return statement as in the question.
return new ResponseEntity<byte[]>(output, responseHeaders, HttpStatus.OK);

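The corruption described in the answer to the PDF question can be reproduced without any network access. The sketch below (hypothetical class BinaryCopyDemo) compares a round trip through a String with a plain byte copy. The sample bytes in the usage note are an assumption chosen to look like the start of a PDF; any byte sequence that is not valid UTF-8 behaves the same way.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: text round trip versus raw byte copy.
class BinaryCopyDemo {

    /** Lossy: decodes the bytes as UTF-8 text and encodes them again. */
    static byte[] viaString(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    /** Lossless: copies the stream byte for byte into memory. */
    static byte[] viaStream(InputStream in) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        in.transferTo(baos); // Java 9+
        return baos.toByteArray();
    }
}
```

For an input such as { 0x25, 0x50, 0x44, 0x46, (byte) 0xE2, (byte) 0xE3 }, viaString no longer returns the original bytes because the invalid sequences are replaced during decoding, while viaStream returns them unchanged.
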
Java Jsoup downloading torrent file

I have a problem: I want to connect to this website (https://ww2.yggtorrent.is) to download a torrent file. I wrote a method that connects to the website with Jsoup and it works well, but when I try to use it to download the torrent file, the website returns "You must be connected to download file".
Here is my code to connect:
Response res = Jsoup.connect("https://ww2.yggtorrent.is/user/login")
.data("id", "<MyLogin>", "pass", "<MyPassword>")
.method(Method.POST)
.execute();
And here is my code to download the file:
Response resultImageResponse = Jsoup.connect("https://ww2.yggtorrent.is/engine/download_torrent?id=285633").cookies(cookies)
.ignoreContentType(true).execute();
FileOutputStream out = (new FileOutputStream(new java.io.File("toto.torrent")));
out.write(resultImageResponse.bodyAsBytes());
out.close();
I've tried a lot of things, but now I have no clue.
The only thing you didn't show us in your code is how you get the cookies from the response. I hope you do this correctly, because you use them to make the second request.
This code looks like yours, but with an example of how I get the cookies. I also add a referer header. It successfully downloads that file for me and uTorrent recognizes it correctly:
// logging in
System.out.println("logging in...");
Response res = Jsoup.connect("https://ww2.yggtorrent.is/user/login")
.timeout(10000)
.data("id", "<MyLogin>", "pass", "<MyPassword>")
.method(Method.POST)
.execute();
// getting cookies from response
Map<String, String> cookies = res.cookies();
System.out.println("got cookies: " + cookies);
// optional verification if logged in
System.out.println(Jsoup.connect("https://ww2.yggtorrent.is").cookies(cookies).get()
.select("#panel-btn").first().text());
// connecting with cookies, it may be useful to provide referer as some servers expect it
Response resultImageResponse = Jsoup.connect("https://ww2.yggtorrent.is/engine/download_torrent?id=285633")
.referrer("https://ww2.yggtorrent.is/engine/download_torrent?id=285633")
.cookies(cookies)
.ignoreContentType(true)
.execute();
// saving file
FileOutputStream out = (new FileOutputStream(new java.io.File("C:/toto.torrent")));
out.write(resultImageResponse.bodyAsBytes());
out.close();
System.out.println("done");

Why does my HTTPS file download corrupt .zip files?

I'm trying to download zip files from the internet using the following code:
public void getFile(String updateURL) throws Exception {
URL url = new URL(updateURL);
HttpURLConnection httpsConn = (HttpURLConnection) url.openConnection();
httpsConn.setRequestMethod("GET");
TrustModifier.relaxHostChecking(httpsConn);
int responseCode = httpsConn.getResponseCode();
if (responseCode == HttpsURLConnection.HTTP_OK) {
String fileName = "fileFromNet";
try (FileOutputStream outputStream = new FileOutputStream(fileName)) {
ReadableByteChannel rbc = Channels.newChannel(httpsConn.getInputStream());
outputStream.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
}
}
httpsConn.disconnect();
}
TrustModifier is a class used to solve the "trust issue": http://www.obsidianscheduler.com/blog/ignoring-self-signed-certificates-in-java/
The code above works well for zip files available via plain HTTP and for uncompressed files exposed via HTTPS, but if I try to download a zip file exposed via an HTTPS endpoint, only a small fragment of the original file is downloaded. I have tested with different download links from the internet and always got the same result.
Does anybody have an idea what I've been doing wrong here?
Thank you.
transferFrom() must be called in a loop until the transfer is complete, and in this case the only way you can know that is by adding up the return values of transferFrom() until they equal the Content-length of the HTTP response.
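A minimal sketch of such a loop, assuming the expected length is known in advance (for example from httpsConn.getContentLengthLong(), which returns -1 when the server does not send the header). The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: repeat transferFrom() until the expected byte count is reached.
class ChannelDownload {

    /** Returns the number of bytes actually written to the target file. */
    static long copyFully(InputStream in, Path target, long expectedLength) throws IOException {
        try (ReadableByteChannel src = Channels.newChannel(in);
             FileChannel dst = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;
            while (position < expectedLength) {
                // A single call may transfer fewer bytes than requested.
                long n = dst.transferFrom(src, position, expectedLength - position);
                if (n <= 0) {
                    break; // source ended early; the caller can compare the result with expectedLength
                }
                position += n;
            }
            return position;
        }
    }
}
```

If the returned count is smaller than the expected length, the download was truncated and the file should not be treated as a valid archive.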
Actually, the problem was in the TrustModifier class I was using to switch off the server certificate check. Once I removed it because I no longer needed it (I took the certificate from the server and put it in a local trust store), my problem was solved.

Download a PDF file via REST service with “Content-Disposition” Header in java

I am trying to download a PDF file from a response of Java REST call after custom authentication check.
I can see the downloaded file, but it is empty.
Below is my code snippet.
// Custom HTTP client
HTTPAuthClient client = new HTTPAuthClient(url, username, password);
Request request = new Request(downloadURL); // I'm downloading the file content of a URL.
Response response = client.executeGet(request);
String response1 = response.getResponseBody();
InputStream is = new ByteArrayInputStream(response.getBytes());
response.setContentType("Content-type",application/pdf); //here response is //javax.servlet.HttpServletResponse
response.setHeader("Content-Disposition", "attachment;filename=\"myfile.pdf\"");
IOUtils.copy(is, response.getOutputStream());
response.flushBuffer();
With this code the file downloads, but when I open it there is no data in it.
I can see some data in the response body, though.
Could you please help me find my mistake? I have tried many options but did not find a solution.
How can you call setContentType like this?
response.setContentType("Content-type",application/pdf);
The method takes only a single String parameter, void setContentType(String type), so your call should be:
response.setContentType("application/pdf");
See the Javadoc to be sure.

java - Download and then write image as servlet response

How do I download an image from a server and then write it as the response in my servlet?
What is the best way to do it keeping good performance?
Here's my code:
JSONObject imageJson;
... //getting my JSON
String imgUrl = imageJson.get("img");
If you don't need to hide the image source and the remote server is also accessible from the client, I'd just point the response at the remote server (you already have the URL). You then don't need to download the image to your server first; the client can fetch it directly and you don't waste your own resources.
However, if you still need to download it to your server first, the following post might help: Writing image to servlet response with best performance
It's important to avoid intermediate buffering of image in servlet. Instead just stream whatever was received to the servlet response:
InputStream is = new URL(imgUrl).openStream();
OutputStream os = servletResponse.getOutputStream();
IOUtils.copy(is, os);
is.close();
I'm using IOUtils from Apache Commons (not necessary, but useful).
The complete solution: download a map and save it to a file.
String imgUrl = "http://maps.googleapis.com/maps/api/staticmap?center=-15.800513,-47.91378&zoom=11&size=200x200&sensor=false";
InputStream is = new URL(imgUrl).openStream();
File archivo = new File("c://temp//mapa.png");
archivo.setWritable(true);
OutputStream output = new FileOutputStream(archivo);
IOUtils.copy(is, output);
IOUtils.closeQuietly(output);
is.close();
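If you prefer not to depend on Apache Commons, the same streaming copy can be sketched with the JDK alone. This is an illustration under assumptions, not the code from the answers above: StreamRelay and relay are hypothetical names, and the OutputStream parameter stands in for either servletResponse.getOutputStream() or a FileOutputStream. try-with-resources closes the source even if the copy fails.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: stream bytes from source to sink without holding the whole image in memory.
class StreamRelay {

    /** Copies everything from source to sink, closes the source, and returns the byte count. */
    static long relay(InputStream source, OutputStream sink) throws IOException {
        try (InputStream in = source) {
            long copied = in.transferTo(sink); // Java 9+
            sink.flush(); // the sink is left open; its owner (e.g. the servlet container) closes it
            return copied;
        }
    }
}
```

In a servlet this could be called as relay(new URL(imgUrl).openStream(), servletResponse.getOutputStream()).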
