I'm trying to download zip files from the internet using the following code:
public void getFile(String updateURL) throws Exception {
    URL url = new URL(updateURL);
    HttpURLConnection httpsConn = (HttpURLConnection) url.openConnection();
    httpsConn.setRequestMethod("GET");
    TrustModifier.relaxHostChecking(httpsConn);
    int responseCode = httpsConn.getResponseCode();
    if (responseCode == HttpsURLConnection.HTTP_OK) {
        String fileName = "fileFromNet";
        try (FileOutputStream outputStream = new FileOutputStream(fileName)) {
            ReadableByteChannel rbc = Channels.newChannel(httpsConn.getInputStream());
            outputStream.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
        }
    }
    httpsConn.disconnect();
}
TrustModifier is a class used to solve the "trust issue": http://www.obsidianscheduler.com/blog/ignoring-self-signed-certificates-in-java/
The code above works well for zip files available via plain HTTP, or for uncompressed files exposed via HTTPS, but if I try to download a zip file from an HTTPS endpoint, only a small fragment of the original file is downloaded. I have tested with different download links from the internet and always got the same result.
Does anybody have an idea what I've been doing wrong here?
Thank you.
transferFrom() must be called in a loop until the transfer is complete; in this case the only way you can know that is by adding up the return values of transferFrom() until they equal the Content-Length of the HTTP response.
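A minimal sketch of that loop, reusing httpsConn and fileName from the question's code (getContentLengthLong() requires Java 7+):

long contentLength = httpsConn.getContentLengthLong(); // -1 if the header is absent
long transferred = 0;
try (FileOutputStream outputStream = new FileOutputStream(fileName);
     ReadableByteChannel rbc = Channels.newChannel(httpsConn.getInputStream())) {
    while (transferred < contentLength) {
        // transferFrom() may move fewer bytes than requested, so keep looping
        long count = outputStream.getChannel()
                .transferFrom(rbc, transferred, contentLength - transferred);
        if (count <= 0) {
            break; // the source channel hit end-of-stream early
        }
        transferred += count;
    }
}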
Actually the problem was in the TrustModifier class I was using to switch off the server certificate check. Once I removed it because I didn't need it any longer (I took the certificate from the server and put it in a local trust store), my problem was solved.
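For reference, importing a server certificate into a local trust store can be done with the JDK's keytool; a sketch in which the alias, certificate file, and store name are illustrative:

keytool -importcert -alias myserver -file server.crt -keystore truststore.jks

The resulting store can then be handed to the JVM with -Djavax.net.ssl.trustStore=truststore.jks.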
So, I have a server application that returns ZIP files, and I'm working with huge files (>= 5 GB). I am using the Jersey client to issue a GET request against this application, after which I want to extract the ZIP and save it as a folder. This is the client configuration:
Client client = ClientBuilder.newClient();
client.register(JacksonJaxbJsonProvider.class);
client.register(MultiPartFeature.class);
return client;
And here's the code fetching the response from the server:
client.target(subMediumResponseLocation).path("download?delete=true").request()
.get().readEntity(InputStream.class)
My code then goes through a bunch of (unimportant for this question) steps and finally gets to the writing of data.
try (ZipInputStream zis = new ZipInputStream(inputStream)) {
    ZipEntry ze = zis.getNextEntry();
    while (ze != null) {
        String fileName = ze.getName();
        if (fileName.contains(".")) {
            size += saveDataInDirectory(folder, zis, fileName);
        }
        zis.closeEntry();
        ze = zis.getNextEntry();
    }
} finally {
    inputStream.close();
}
Now the issue I'm getting is that the ZipInputStream refuses to work. I can debug the application and see that there are bytes in the InputStream, but when it gets to the while (ze != null) check, getNextEntry() returns null on the first entry, resulting in an empty directory.
I have also tried writing the InputStream from the client to a ByteArrayOutputStream using the transferTo method, but I get a Java heap space error saying the array length is too big (even though my heap settings are -Xmx16g and -Xms12g).
My thought was that maybe, since the InputStream is lazily loaded by Jersey using the UrlConnector directly, it doesn't interact well with the ZipInputStream. Another possible issue is that I'm not using a ByteArrayInputStream for the ZipInputStream.
What would a proper solution for this be (keeping in mind the heap issues)?
OK, so I solved it. Apparently my request was getting a 404 for adding the query param in the path: .path("download?delete=true").
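If that is the diagnosis, a sketch of the corrected request would let the JAX-RS client encode the query parameter instead of embedding it in the path segment:

InputStream inputStream = client.target(subMediumResponseLocation)
        .path("download")
        .queryParam("delete", "true")
        .request()
        .get()
        .readEntity(InputStream.class);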
I use this code snippet to download some MP3 files:
File target = /*...*/;
InputStream in = new URL(link).openStream();
Files.copy(in, target.toPath(), StandardCopyOption.REPLACE_EXISTING);
It usually works fine, but now I have a series of files that are way too small and don't work. For example, https://kritisches-denken-podcast.de/wp-content/uploads/2019/01/KDP-Episode-17-Selbsterhaltungstherapie.mp3 should be about 46 MB (when I download it via browser), but it is only 315 bytes when I download it with the code above on my Android.
The URL has a redirect built into it. Such redirects, especially for URLs targeted at non-browsers (which an MP3 URL clearly is), are usually served up as an HTTP 301 'Moved Permanently' (or sometimes a 302 'Moved Temporarily'), with the right URL sent along in the Location header. The text you see (the 315 bytes you download) is merely 'fallback' HTML that also states that the content has moved. There is no need to parse this, fortunately.
The HTTP client behind URL's openStream() is very basic and does not follow redirects. You need an API that does. URLConnection (also from the core libraries) can do it, but it does not follow redirects that switch from HTTP to HTTPS or vice versa, so you might not want to use that. Just in case you do:
File target = /*...*/;
HttpURLConnection con = (HttpURLConnection) new URL(link).openConnection();
con.setInstanceFollowRedirects(true);
try (InputStream in = con.getInputStream()) {
    Files.copy(in, target.toPath(), StandardCopyOption.REPLACE_EXISTING);
}
If the above is no good (presumably due to the HTTP/HTTPS redirect issue), I suggest picking up a real HTTP client, which the standard API does not provide. I suggest OkHttp.
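A rough sketch with OkHttp, which follows redirects (including HTTP-to-HTTPS ones) by default; link and target are the variables from the snippet above:

import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;

OkHttpClient client = new OkHttpClient(); // follows redirects out of the box
Request request = new Request.Builder().url(link).build();
try (Response response = client.newCall(request).execute();
     InputStream in = response.body().byteStream()) {
    Files.copy(in, target.toPath(), StandardCopyOption.REPLACE_EXISTING);
}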
I'm using a function called UploadFFGS and this is its content:
URL url = new URL("http://linkedme.com/filebet.txt");
URLConnection ucn = url.openConnection();
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
connection = (HttpURLConnection) url.openConnection();
FileInputStream is = new FileInputStream("filebet.txt"); //before I download the same file because I must edit it and upload the new version
OutputStream ostream = connection.getOutputStream();
PrintWriter pwriter = new PrintWriter(ostream);
pwriter.print(jTextArea1.getText());
pwriter.close();
This program never uploads the file filebet I have on my desktop to my link (http://linkedme.com/filebet.txt). Any ideas? I call it this way:
try {
    UploadFFGS();
} catch (MalformedURLException ex) {
    Logger.getLogger(xGrep.class.getName()).log(Level.SEVERE, null, ex);
} catch (IOException ex) {
    Logger.getLogger(xGrep.class.getName()).log(Level.SEVERE, null, ex);
}
Also, NetBeans gives me this error: "java.net.ProtocolException: cannot write to a URLConnection if doOutput=false - call setDoOutput(true)".
Your approach won't work because your API endpoint (most likely) is a regular file rather than an interpreted script. The endpoint must provide an API by means of which you upload a file (POST/PUT etc.).
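As a sketch only: if the server did expose such an endpoint (the upload URL below is hypothetical), the client side would also need setDoOutput(true) before writing, which is exactly what the ProtocolException in the question is complaining about:

// Hypothetical endpoint; the server must actually accept POSTed uploads.
URL url = new URL("http://linkedme.com/upload");
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
connection.setRequestMethod("POST");
connection.setDoOutput(true); // without this, getOutputStream() throws ProtocolException
try (PrintWriter pwriter = new PrintWriter(connection.getOutputStream())) {
    pwriter.print(jTextArea1.getText());
}
int status = connection.getResponseCode(); // forces the request to be sent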
I have a different solution. Maybe this will be useful for someone.
Just have a look at your advanced proxy settings in your web browser.
System engineers in our company had changed the proxy settings but I was not aware of it.
This error cost me three work-days. I got this doOutput error while writing an FTP upload project at my company. I tried everything, like adding conn.setDoOutput(true) and 'fifty shades' of similar solutions, but none of them saved me. But after I changed my proxy settings to the correct ones, the error disappeared, and now I am able to upload my files through FTP using URLConnection in Java.
I used the code in the link below to make an upload process, and did not add anything except host, port, user and password.
http://www.ajaxapp.com/2009/02/21/a-simple-java-ftp-connection-file-download-and-upload/
How can I determine whether a URL refers to a file or a directory? The link http://example.com/test.txt should return that it is a file, and http://example.com/dir/ that it is a directory.
I know you can do this with the Uri class, but that object's IsFile function only works with the file:/// scheme, and I am working with the http:// scheme. Any ideas?
Thanks
I'm not sure why this question was left unanswered/neglected for a long time. I faced the same situation in server-side Java (I reckon it would be similar for the Android flavour). The only input information is a URL to the resource, and we need to tell whether the resource is a directory or a file. So here's my solution:
if ("file".equals(resourceUrl.getProtocol())
&& new File(resourceUrl.toURI()).isDirectory()) {
// it's a directory
}
Hope this helps the next reader.
Note: please see @awwsmm's comment. What follows was my assumption when I provided the answer above. Basically, it doesn't make sense to test whether a remote resource is a directory or anything else; it is entirely up to the site to decide what to return for each request.
It won't work because the protocol would be http://, not file://.
class TestURL {

    public static boolean isDirectory(URL resourceUrl) throws URISyntaxException {
        if ("file".equals(resourceUrl.getProtocol())
                && new File(resourceUrl.toURI()).isDirectory()) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        // prints "false": the protocol is http, not file
        System.out.println(TestURL.isDirectory(new URL("http://example.com/mydir/")));
    }
}
I think a directory http://example.com/some/dir will redirect to http://example.com/some/dir/, while a file will not. That is, one can examine the HTTP Location field in the HEAD response:
$ curl -I http://example.com/some/dir | grep Location
Location: http://example.com/some/dir/
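The same check in Java, as a sketch (redirect following is disabled so the 301/302 and its Location header stay visible):

HttpURLConnection con = (HttpURLConnection) new URL("http://example.com/some/dir").openConnection();
con.setRequestMethod("HEAD");
con.setInstanceFollowRedirects(false);
int code = con.getResponseCode();                  // 301/302 suggests a directory under this heuristic
String location = con.getHeaderField("Location");  // e.g. http://example.com/some/dir/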
In my case I needed to download a file over HTTPS, and there were occasions when the server was misconfigured to redirect the request, so that only HTTP data was downloaded instead of the file. I was able to "decide" whether the requested resource was a file or not by inspecting the content type (aka MIME type):
String[] allowedMimeTypes = new String[] { "application/octet-stream", "image/gif",
        "text/css", "text/csv", "text/plain", "text/xml" };
URL website = new URL("https://localhost/public/" + file);
HttpsURLConnection huc = (HttpsURLConnection) website.openConnection();
if (!Arrays.asList(allowedMimeTypes).contains(huc.getContentType())) {
    throw new Exception("Not a file...");
}
When the response contained text/html;charset=utf-8 or similar, I was able to determine that it was indeed not a file. Also note that these MIME types are usually quite configurable on the server side.
I am sending a request to a server URL, but I am getting a FileNotFoundException, even though when I browse the same file in a web browser it seems fine.
URL url = new URL(serverUrl);
connection = getSecureConnection(url);
// Connect to server
connection.connect();
// Send parameters to server
writer = new BufferedWriter(new OutputStreamWriter(connection.getOutputStream(), "UTF-8"));
writer.write(parseParameters(CoreConstants.ACTION_PREFIX + actionName, parameters));
writer.flush();
// Read server's response
reader = new BufferedReader(new InputStreamReader(connection.getInputStream()));
When I try to call getInputStream(), it throws a "file not found" error. The URL is an .aspx controller page.
If the request works fine in a browser but not in code, and you've verified that the URL is the same, then the problem probably has something to do with how you are sending your parameters to the server. Specifically, this part:
writer.write(parseParameters(CoreConstants.ACTION_PREFIX + actionName, parameters));
Perhaps there is a bug in the parseParameters() function?
But more generally, I would recommend using something a bit higher-level than a raw URLConnection. HtmlUnit and HttpClient are both fine choices, particularly since it seems like your request is a fairly simple one. I've used both to perform similar client/server interaction in a number of apps. I suggest revising your code to use one of these libraries, and then see if it still produces the error.
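For example, a sketch of the same POST with Apache HttpClient 4.x; the "action" parameter name here is a placeholder for whatever parseParameters() actually produces:

import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import org.apache.http.NameValuePair;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.util.EntityUtils;

CloseableHttpClient httpClient = HttpClients.createDefault();
HttpPost post = new HttpPost(serverUrl);
List<NameValuePair> params = new ArrayList<>();
params.add(new BasicNameValuePair("action", CoreConstants.ACTION_PREFIX + actionName));
post.setEntity(new UrlEncodedFormEntity(params, StandardCharsets.UTF_8));
try (CloseableHttpResponse response = httpClient.execute(post)) {
    String body = EntityUtils.toString(response.getEntity()); // the server's response
}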
OK, finally I have found that the problem was on the IIS side; it has been resolved in .NET 4.0. For previous versions, go to your web.config and specify validateRequest="false".
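For reference, a minimal web.config sketch for that setting; note that on .NET 4.0 the page-level flag only takes effect together with requestValidationMode="2.0":

<system.web>
  <pages validateRequest="false" />
  <httpRuntime requestValidationMode="2.0" />
</system.web>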