Java - Download file from URL with matching file name pattern - java

I want to download a few files from a URL. I know how each file name starts, but the rest of the name varies. It is mostly a date, which can differ from file to file. Is there any way to download files matching a pattern from Java code?
If I open the URL below in Chrome, all the files are listed and I have to download the required ones manually.
http://<ip_address>:<port>/MR/build/report/scan/daily/2021-12-13_120/data/
File names look like the ones below: a known prefix followed by a date. The date can be the same as the one in the URL or an older one.
scan_report_2021_12_13_120.txt
build_report_2021_12_10_110.txt
my_reportdata_2021_11_30_110.txt
As of now, my Java code is as shown below. I have to pass the complete URL with the exact file name to download a file. In most cases the date and number in the file name match those in the URL, so the program takes the date part from the URL, appends it to the file name, and passes the result as the URL. For some files the date differs, though, and those I have to download manually.
private static void downloadFile(String remoteURLPath, String localPath) {
System.out.println("DownloadFileTest.downloadFile() Downloading from " + remoteURLPath + " to = " + localPath);
FileOutputStream fos = null;
try {
URL website = new URL(remoteURLPath);
ReadableByteChannel rbc = Channels.newChannel(website.openStream());
fos = new FileOutputStream(localPath);
fos.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
} catch (MalformedURLException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} finally {
if (fos != null) {
try {
fos.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
}
The argument remoteURLPath is passed like http://<ip_address>:<port>/MR/build/report/scan/daily/2021-12-13_120/data/scan_report_2021_12_13_120.txt
And localPath is passed like C:\\MyDir\\MyData\\scan_report_2021_12_13_120.txt
Other files are passed similarly, with 2021_12_13_120 as the date part. Files whose names carry a different date are not downloaded, but an empty file is still created in the same directory, which I delete later since its size is 0.
Is there any way we can pass pattern here?
Like http://<ip_address>:<port>/MR/build/report/scan/daily/2021-12-13_120/data/scan_report_*.txt
And instead of passing the complete local path, is there any way to pass only the directory, so that the file is downloaded with exactly the same name as on the remote system?
On Linux I can use wget with pattern matching, but I am looking for a Java way that works on all platforms.
wget -r -np -nH --cut-dirs=10 -A "scan_report*.txt" "http://<ip_address>:<port>/MR/build/report/scan/daily/2021-12-13_120/data/"

Thanks to the comment from @FedericoklezCulloca, I modified my code using this answer.
My solution reads the whole HTML page and collects all href values, since they contain only the file names with extensions. From that list I build a second list of the matching files, and I download those using the code from my question.
Here is the method that gets the list of all href values from the URL. It could probably be optimised. I did not use any extra library.
private static List<String> getAllHREFListFromURL(String downloadURL) {
URL url;
InputStream is = null;
List<String> hrefListFromURL = new ArrayList<>();
try {
url = new URL(downloadURL);
is = url.openStream();
byte[] buffer = new byte[1024];
int bytesRead = -1;
StringBuilder page = new StringBuilder(1024);
while ((bytesRead = is.read(buffer)) != -1) {
String str = new String(buffer, 0, bytesRead);
page.append(str);
}
StringBuilder htmlPage = new StringBuilder(page);
String search_start = "href=\"";
String search_end = "\"";
while (!htmlPage.isEmpty()) {
int indexOf = htmlPage.indexOf(search_start);
if (indexOf != -1) {
String substring = htmlPage.substring(indexOf + search_start.length());
String linkName = substring.substring(0, substring.indexOf(search_end));
hrefListFromURL.add(linkName);
htmlPage = new StringBuilder(substring);
} else {
htmlPage = new StringBuilder();
}
}
} catch (MalformedURLException e1) {
e1.printStackTrace();
} catch (IOException ex) {
ex.printStackTrace();
} finally {
try {
is.close();
} catch (Exception e) {
}
}
return hrefListFromURL;
}
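As a possible simplification of the href scan above, the page text can be searched with a regular expression instead of repeatedly copying the StringBuilder. This is only a sketch: the class name HrefExtractor and the method extractHrefs are made up for illustration, and it assumes, like the original, that every link is written as href="..." with double quotes. Reading the page into a string is left out here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
final class HrefExtractor {
    // Matches href="..." and captures the text between the double quotes.
    private static final Pattern HREF =
            Pattern.compile("href=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Returns every href value found in the given HTML, in document order.
    static List<String> extractHrefs(String html) {
        List<String> hrefs = new ArrayList<>();
        Matcher matcher = HREF.matcher(html);
        while (matcher.find()) {
            hrefs.add(matcher.group(1));
        }
        return hrefs;
    }
}
```

A regular expression is enough for a simple directory listing like this one; for arbitrary HTML a real parser would be the safer choice.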
Method to get list of files that I needed.
private static List<String> getDownloadList(List<String> allHREFListFromURL) {
List<String> filesList = getMyFilesList();
List<String> downloadList = new ArrayList<>();
for (String fileName : filesList) {
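// Anchored, quoted prefix match. The original fileName + "*" only made the last character of the name repeatable.
Predicate<String> fileFilter = Pattern.compile("^" + Pattern.quote(fileName)).asPredicate();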
List<String> collect = allHREFListFromURL.stream().filter(fileFilter).collect(Collectors.toList());
downloadList.addAll(collect);
}
return downloadList;
}
private static List<String> getMyFilesList() {
List<String> filesList = new ArrayList<>();
filesList.add("scan_report");
filesList.add("build_report");
filesList.add("my_reportdata");
return filesList;
}
I then iterate over downloadList and use my original download method to download each file.
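The question also asked whether only a target directory could be passed, keeping the remote file name. One way this might look is sketched below. The names PrefixDownloader, filterByPrefix and downloadInto are assumptions made for this example, the ".txt" check mirrors the wget pattern from the question, and baseUrl is assumed to end with "/" so that the file name resolves against it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the original answer.
final class PrefixDownloader {
    // Keeps the .txt names that start with one of the known prefixes.
    static List<String> filterByPrefix(List<String> hrefs, List<String> prefixes) {
        return hrefs.stream()
                .filter(name -> name.endsWith(".txt"))
                .filter(name -> prefixes.stream().anyMatch(name::startsWith))
                .collect(Collectors.toList());
    }

    // Downloads baseUrl + fileName into localDir under the same file name.
    // baseUrl is assumed to end with "/".
    static Path downloadInto(String baseUrl, String fileName, Path localDir) throws IOException {
        Files.createDirectories(localDir);
        Path target = localDir.resolve(fileName);
        URI source = URI.create(baseUrl).resolve(fileName);
        try (InputStream in = source.toURL().openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }
}
```

With this, the caller passes the listing URL, each name returned by filterByPrefix, and a local directory such as C:\MyDir\MyData, and no longer builds the full local path by hand.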

Related

Downloading an image in java

I have to download an image from the NASA website. The problem is that my code sometimes works, successfully downloading an image, and sometimes saves only 186 B (I don't know why exactly 186).
The problem is surely connected with the way NASA shares those photos. For instance, the image at https://mars.jpl.nasa.gov/msl-raw-images/msss/00001/mcam/0001ML0000001000I1_DXXX.jpg is saved successfully, while the one at https://mars.nasa.gov/mer/gallery/all/2/f/001/2F126468064EDN0000P1001L0M1-BR.JPG fails.
Here is my code
public static void saveImage(String imageUrl, String destinationFile){
URL url;
try {
url = new URL(imageUrl);
System.out.println(url);
InputStream is = url.openStream();
OutputStream os = new FileOutputStream(destinationFile);
byte[] b = new byte[2048];
int length;
while ((length = is.read(b)) != -1) {
os.write(b, 0, length);
}
is.close();
os.close();
} catch (MalformedURLException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
Does anyone have an idea why it doesn't work?
public boolean downloadPhotosSol(int i) throws JSONException, IOException {
String url0 = "https://api.nasa.gov/mars-photos/api/v1/rovers/spirit/photos?sol=" + this.chosenMarsDate + "&camera=" + this.chosenCamera + "&page=" + i + "&api_key=###";
JSONObject json = JsonReader.readJsonFromUrl(url0);
if(json.getJSONArray("photos").length() == 0) return true;
String workspace = new File(".").getCanonicalPath();
String pathToFolder = workspace+File.separator+this.getManifest().getName() + this.chosenMarsDate + this.chosenCamera +"Strona"+i;
new File(pathToFolder).mkdirs();
for(int j = 0;j<json.getJSONArray("photos").length();j++) {
String url = ((JSONObject) json.getJSONArray("photos").get(j)).getString("img_src");
SaveImage.saveImage(url, pathToFolder+File.separator+"img"+j+".jpg");
}
return false;
}
When you get a 186 byte file, open it with a text editor and see what is inside. It could contain an HTTP error message in HTML format. If instead you see the first 186 bytes of your image file, then something is not working right with your program.
EDIT: From your comments it looks like you are getting an HTTP 301 response, which is a redirect to another location. A web browser handles this automatically without you noticing. However, your Java program is not following the redirect to the new location. You need to use an HTTP Java library that handles redirects.
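If adding a library is not an option, the redirect can also be followed by hand with HttpURLConnection. As far as I know, HttpURLConnection follows redirects on its own only when the protocol stays the same, so a 301 from http to https (or back) is left to the caller. The sketch below is one way to do that; the class name RedirectingDownloader, its method names and the limit of 5 redirects are assumptions made for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the original answer.
final class RedirectingDownloader {
    private static final int MAX_REDIRECTS = 5; // arbitrary safety limit

    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
    }

    // Resolves the Location header (absolute or relative) against the current URL.
    static String resolveLocation(String currentUrl, String location) {
        return URI.create(currentUrl).resolve(location).toString();
    }

    static void download(String imageUrl, Path target) throws IOException {
        String current = imageUrl;
        for (int hop = 0; hop <= MAX_REDIRECTS; hop++) {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(current).toURL().openConnection();
            conn.setInstanceFollowRedirects(false); // redirects are handled below
            int status = conn.getResponseCode();
            if (isRedirect(status)) {
                String location = conn.getHeaderField("Location");
                conn.disconnect();
                if (location == null) {
                    throw new IOException("Redirect without Location header from " + current);
                }
                current = resolveLocation(current, location);
                continue;
            }
            if (status != HttpURLConnection.HTTP_OK) {
                conn.disconnect();
                throw new IOException("HTTP " + status + " for " + current);
            }
            // Only a 200 response body is written, so no error page ends up in the file.
            try (InputStream in = conn.getInputStream()) {
                Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            }
            return;
        }
        throw new IOException("Too many redirects starting at " + imageUrl);
    }
}
```

Checking the status code before writing also avoids saving a small HTML error page under an image file name, which is what the 186-byte file most likely was.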
Best and short way of doing it:
try(InputStream in = new URL("http://example.com/image.jpg").openStream()){
Files.copy(in, Paths.get("C:/File/To/Save/To/image.jpg"));
}

Javafx save textField to a text file

I have this code for saving to a text file; however, I can't find a way to make it save to a folder on my hard drive other than the user.home folder. I searched in many places but couldn't find anything that helped.
It works with the user.home setting, but not if I try to change it. When executed, the program reports Source not found.
saveBtn.setOnAction(new EventHandler<ActionEvent>()
{
public void handle(ActionEvent event)
{
Object source = event.getSource();
String s = null;
//Variable to display text read from file
if (_clickMeMode) {
FileOutputStream out = null;
try {
//Code to write to file
String text = titleField.getText();
byte b[] = text.getBytes();
String outputFileName = System.getProperty("user.home"
+ File.separatorChar+"home")
+ File.separatorChar + "Movies2.txt";
out = new FileOutputStream(outputFileName);
out.write(b);
out.close();
//Clear text field
titleField.setText("");
}catch (java.io.IOException e) {
System.out.println("Cannotss text.txt");
} finally {
try {
out.close();
} catch (java.io.IOException e) {
System.out.println("Cannote");
}
}
}
else
{
//Save text to file
_clickMeMode = true;
}
window.setTitle("Main Screen");
window.setScene(mainScreen);
}
});
Your file name is incorrectly assigned:
String outputFileName = System.getProperty("user.home"
+ File.separatorChar+"home")
+ File.separatorChar + "Movies2.txt";
You are passing a string of the form "user.home/home" to System.getProperty().
Since there is no such property, this will return null.
Then you concatenate this with /Movies2.txt, so outputFileName will be something like null/Movies2.txt.
(A simple System.out.println(outputFileName) will confirm this.)
Instead of building the filename by hand like this, you should use a higher-level API to do it. E.g.:
Path outputFile = Paths.get(System.getProperty("user.home"), "home", "Movies2.txt");
OutputStream out = Files.newOutputStream(outputFile);
out.write(b);
If you also need (or might need) to create the directory, you can do
Path outputDir = Paths.get(System.getProperty("user.home"), "home");
Files.createDirectories(outputDir);
Path outputFile = outputDir.resolve("Movies2.txt");
OutputStream out = Files.newOutputStream(outputFile);
out.write(b);
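Note that the snippets above never close the stream. A minimal variant that closes it automatically could use Files.write, as sketched here. The class name TextFileSaver and the method saveText are assumptions made for illustration, and UTF-8 is chosen explicitly, whereas the original getBytes() call uses the platform default charset.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the original answer.
final class TextFileSaver {
    // Creates the directory if needed and writes the text to fileName inside it.
    // Files.write opens, writes and closes the file in one call.
    static Path saveText(Path directory, String fileName, String text) throws IOException {
        Files.createDirectories(directory);
        Path outputFile = directory.resolve(fileName);
        Files.write(outputFile, text.getBytes(StandardCharsets.UTF_8));
        return outputFile;
    }
}
```

In the button handler this might be called as saveText(Paths.get(System.getProperty("user.home"), "home"), "Movies2.txt", titleField.getText()), with any other directory substituted for the first argument.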

Gdx.files.internal(...) wrapper not working correctly

I made a wrapper ConfigurationFile class to help handle Gdx.files stuff, and it worked fine for a long time, but now it's not working, and I don't know why.
I have two methods, internal(...) and local(...), each with two overloads. The only difference between the overloads is whether they load from the arguments (File folder, String name) or (String path).
-Snip Now Unnecessary Information-
UPDATE
After more configuring, I came to find out that they're not behaving the same. I have an assets/files/ folder that Gdx.files.internal(...) will access fine, but ConfigurationFile.internal(...) will access files/, and they're set up the same way. I'll give you the two pieces of code that I used for testing.
Using Gdx.files.internal(...) directly (works as expected):
FileHandle handle = Gdx.files.internal("files/virus_data");
BufferedReader reader = null;
try {
reader = new BufferedReader(handle.reader());
String c = "";
while ((c = reader.readLine()) != null) {
System.out.println(c); // prints out all 5 lines on the file.
}
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (reader != null) reader.close();
} catch (IOException e) {
e.printStackTrace();
}
}
Using ConfigurationFile.internal(...):
// First part, calls ConfigurationFile#internal(String path)
ConfigurationFile config = ConfigurationFile.internal("files/virus_data");
// ConfigurationFile#internal(String path)
public static ConfigurationFile internal(String path) {
ConfigurationFile config = new ConfigurationFile();
// This is literally calling Gdx.files.internal("files/virus_data");
config.handle = Gdx.files.internal(path);
config.file = config.handle.file();
config.folder = config.file.getParentFile();
config.init();
return config;
}
// ConfigurationFile#init()
protected void init() {
// File not found.
// Creates a new folder as a sibling of "assets"
// Creates a new file called "virus_data"
if (!folder.exists()) folder.mkdirs();
if (!file.exists()) {
try {
file.createNewFile();
} catch (IOException e) {
e.printStackTrace();
}
} else loadFile();
}
// ConfigurationFile#loadFile()
protected void loadFile() {
BufferedReader reader = null;
try {
reader = new BufferedReader(handle.reader());
String c = "";
while ((c = reader.readLine()) != null) {
System.out.println(c);
if (!c.contains(":")) continue;
String[] values = c.split(":");
String key = values[0];
String value = values[1];
if (values.length > 2) {
for (int i = 2; i < values.length; i++) {
value += ":" + values[i];
}
}
key = key.trim();
value = value.trim();
mapValues.put(key, value);
}
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (reader != null) reader.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
What I'm having trouble understanding is which difference between these two approaches causes my ConfigurationFile to create a new File in a folder that is a sibling of assets. Could someone tell me why this is happening?
My suggestion is not to use
Gdx.files.internal(folder + "/" + name);
If you have to use the File api, do it this way:
Gdx.files.internal(new File(folder, name).toString());
This way you avoid weird things that could be happening with path separators.
If Gdx maybe needs relative paths for some reason (perhaps relative to some Gdx internal home directory), you could use NIO to do something like
final Path gdxHome = Paths.get("path/to/gdx/home");
//...
File combined = new File(folder, name);
String relativePath = gdxHome.relativize(combined.toPath()).toString();
Okay, so after intense testing, I found out the problem, which I found to be ridiculous.
Since the file is internal, a new File(...) reference can't properly be made to it; it is read through an InputStream instead (if I'm correct). Calling FileHandle#file() on an internal file causes some kind of path conversion, so removing everything that dealt with FileHandle#file() for internal files fixed it.

better regular expression for matching strings?

I have a "moreinfo" directory that contains some HTML files and other folders. I am searching the moreinfo directory (not its subdirectories) for files that match toolId*. The file names are the same as the toolId.
Below is a snippet showing how I wrote it. If my toolId is delegatedAccess, the list returns 2 files (delegatedAccess.html and delegatedAccess.shopping.html) because of the wildcard filter (toolId*).
Is there a better way of writing the regular expression so that it checks up to the last period and returns only the file that exactly matches my toolId?
infoDir =/Users/moreinfo
private String getMoreInfoUrl(File infoDir, String toolId) {
String moreInfoUrl = null;
try {
Collection<File> files = FileUtils.listFiles(infoDir, new WildcardFileFilter(toolId+"*"), null);
if (files.isEmpty()==false) {
File mFile = files.iterator().next();
moreInfoUrl = libraryPath + mFile.getName(); // toolId;
}
} catch (Exception e) {
M_log.info("unable to read moreinfo" + e.getMessage());
}
return moreInfoUrl;
}
This is what I ended up doing, thanks to all the great comments. I used string manipulation to solve my problem, as a regex was not the right solution for it.
private String getMoreInfoUrl(File infoDir, String toolId) {
String moreInfoUrl = null;
try {
Collection<File> files = FileUtils.listFiles(infoDir, new WildcardFileFilter(toolId+"*"), null);
if (files.isEmpty()==false) {
for (File mFile : files) {
int lastIndexOfPeriod = mFile.getName().lastIndexOf('.');
String fNameWithOutExtension = mFile.getName().substring(0,lastIndexOfPeriod);
if(fNameWithOutExtension.equals(toolId)) {
moreInfoUrl = libraryPath + mFile.getName();
break;
}
}
}
} catch (Exception e) {
M_log.info("unable to read moreinfo" + e.getMessage());
}
return moreInfoUrl;
}
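The same exact-name check can also be written without Commons IO, using java.nio only. The sketch below is an alternative under assumptions, not the author's code: the names MoreInfoLookup, baseName and findExact are invented here. It also handles file names without any period, for which lastIndexOf('.') returns -1 and the substring call in the version above would throw.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical helper, not part of the original answer.
final class MoreInfoLookup {
    // Strips everything from the last period on; names without a period are returned as-is.
    static String baseName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? fileName : fileName.substring(0, dot);
    }

    // Looks only at the direct children of infoDir (Files.list does not recurse).
    static Optional<Path> findExact(Path infoDir, String toolId) throws IOException {
        try (Stream<Path> entries = Files.list(infoDir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .filter(p -> baseName(p.getFileName().toString()).equals(toolId))
                    .findFirst();
        }
    }
}
```

For toolId delegatedAccess, delegatedAccess.shopping.html has the base name delegatedAccess.shopping and is therefore skipped, while delegatedAccess.html matches.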

filtering files

I want to check the file type of a file. I thought about magic numbers, but how do I use them with Java?
I want to allow only text files in my program and filter out files like JPGs.
Any ideas on what I can do?
private String path;
private String fileText;
private String textLine;
public LoadModel(String path) {
this.path = path;
this.fileText = "";
FileReader read = null;
BufferedReader bufRead = null;
if (path != null && new File(path).exists()
&& !(new File(path).isDirectory())) {
try {
read = new FileReader(path);
bufRead = new BufferedReader(read);
do {
try {
this.textLine = bufRead.readLine();
} catch (IOException ex) {
Logger.getLogger(LoadModel.class.getName()).log(Level.SEVERE, null, ex);
}
if (this.textLine != null) {
this.fileText = this.fileText + this.textLine + "\n";
}
} while (this.textLine != null);
} catch (FileNotFoundException ex) {
Logger.getLogger(LoadModel.class.getName()).log(Level.SEVERE, null, ex);
}
} else {
// Message text: "The specified file does not exist"
HinweisDialogController.hinweisDialogOK("Die angegebene Datei existiert nicht");
}
}
Here you can find a list of the APIs available for identifying a MIME type in Java, with code samples.
Java 7 also has an option:
Files.probeContentType(path)
You can try java.nio.file.Files.probeContentType which is designed to determine a file content type. For example this test
System.out.println(Files.probeContentType(Paths.get("1.xml")));
System.out.println(Files.probeContentType(Paths.get("1.txt")));
prints
text/xml
text/plain
See the API documentation for more details.
If you need your code to work on JDK versions earlier than JDK 7, you can use Apache Tika's MimeType detector, which has a MimeType#detect() method.
More information here
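To turn the probeContentType suggestion into the filter the question asks for, the returned type can be checked for a "text/" prefix, as in the sketch below. The class TextFileCheck and its methods are hypothetical names. Keep in mind that probeContentType is implementation specific: it may return null when the type cannot be determined, and it may decide from the file name rather than from magic numbers, so this is a rough filter rather than a content check.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the original answers.
final class TextFileCheck {
    // True for MIME types such as text/plain or text/xml; false for null or other types.
    static boolean isTextMimeType(String mimeType) {
        return mimeType != null && mimeType.startsWith("text/");
    }

    // Asks the platform for the content type and accepts only text types.
    // An undetermined type (null) is treated as "not text".
    static boolean looksLikeText(Path file) throws IOException {
        return isTextMimeType(Files.probeContentType(file));
    }
}
```

In the LoadModel constructor, this check could run before the FileReader is opened, showing the dialog when looksLikeText(Paths.get(path)) returns false.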
