Android reading pdf metadata - memory issue - java

I'm currently building an Android application that displays a set of PDF files in a ListView. Instead of just displaying the filename, I want to read the Title from each PDF's metadata and display that in the list; if the file doesn't have a Title set, I fall back to the filename. I'm using iText at the moment. Here is what I have:
    File[] filteredFiles = root.listFiles(filter);
    for (int i = 0; i < filteredFiles.length; i++) {
        try {
            File f = filteredFiles[i];
            PdfReader reader = new PdfReader(f.getAbsolutePath());
            String title = reader.getInfo().get("Title");
            reader.close();
            //Do other stuff here...
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
This works and gets the data I want, but it is slow. I also sometimes get memory crashes if the file is over 2MB. Is there a better way of doing this? Maybe a way of getting the metadata without having to actually open the PDF file?
Any help is much appreciated, Thanks.

You can try the PDFParse library, which is optimized for performance and low memory consumption.
    File[] filteredFiles = root.listFiles(filter);
    for (int i = 0; i < filteredFiles.length; i++) {
        try {
            File f = filteredFiles[i];
            PDFDocument reader = new PDFDocument(f.getAbsolutePath());
            String title = reader.getDocumentInfo().getTitle();
            reader.close();
            //Do other stuff here...
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

Related

Android Changing The Modified Date For Files Exported To Local Storage

I am creating an Android app and have added an option to save media such as images and videos. I need to be able to change the modified date of the saved file to the current date and time. I have been using the File class to save files. I have seen that there is a known issue with the .setLastModified() method on Android, but I have been unable to find any other solution. The method doesn't seem to do anything, at least on my device (Google Pixel 2): the file just keeps its original date.
I have even tried a "dirty method" using the RandomAccessFile class (code below), but with no luck.
    File rootPath = new File(Environment.getExternalStorageDirectory().getAbsoluteFile(), "imageAlbum");
    if (!rootPath.exists()) {
        rootPath.mkdirs();
    }
    File localFile = new File(rootPath, filename.toLowerCase());
Then I use firebase and the .getFile() method.
This is what I mean by a "dirty" method.
    RandomAccessFile raf = null;
    try {
        raf = new RandomAccessFile(rootPath + "/" + finalFilename.toLowerCase(), "rw");
        // Grow the file by one byte and shrink it back to force a write
        long length = raf.length();
        raf.setLength(length + 1);
        raf.setLength(length);
        raf.close();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    localFile.setLastModified(System.currentTimeMillis());
But again with no luck.
EDIT:
Here is the code I am using for firebase.
    String finalFilename = filename;
    storageReference.getFile(localFile).addOnCompleteListener(new OnCompleteListener<FileDownloadTask.TaskSnapshot>() {
        @Override
        public void onComplete(@NonNull Task<FileDownloadTask.TaskSnapshot> task) {
            if (task.isSuccessful()) {
                localFile.setLastModified(System.currentTimeMillis());
{
I have tried using the ".setLastModified" method before and after calling the firebase methods.

PDF Box - Unable to renameTo or Delete files

I'm fairly new to programming and I've been trying to use PDFBox for a personal project. I want to check whether a PDF contains specific keywords and, if it does, move the file to an "approved" folder.
I know the code below is poorly written, but I can't get it to move or delete the file correctly:
    try (Stream<Path> filePathStream = Files.walk(Paths.get("C://pdfbox_teste"))) {
        filePathStream.forEach(filePath -> {
            if (Files.isRegularFile(filePath)) {
                String arquivo = filePath.toString();
                File file = new File(arquivo);
                try {
                    // Loading an existing document
                    PDDocument document = PDDocument.load(file);
                    // Instantiate PDFTextStripper class
                    PDFTextStripper pdfStripper = new PDFTextStripper();
                    String text = pdfStripper.getText(document);
                    String[] words = text.split("\\.|,|\\s");
                    for (String word : words) {
                        // System.out.println(word);
                        if (word.equals("Revisão") || word.equals("Desenvolvimento")) {
                            // System.out.println(word);
                            if(file.renameTo(new File("C://pdfbox_teste//Aprovados//" + file.getName()))){
                                document.close();
                                System.out.println("Arquivo transferido corretamente");
                                file.delete();
                            };
                        }
                    }
                    System.out.println("Fim do documento: " + arquivo);
                    System.out.println("----------------------------");
                    document.close();
                } catch (Exception e) {
                    // TODO Auto-generated catch block
                    e.printStackTrace();
                }
            }
        });
I wanted the files to be moved into the new folder. Instead, sometimes they only get deleted and sometimes nothing happens. I imagine the error is in the forEach, but I can't seem to find a way to fix it.
You try to rename the file while it is still open, and only close it afterwards:
    // your code, does not work
    if(file.renameTo(new File("C://pdfbox_teste//Aprovados//" + file.getName()))){
        document.close();
        System.out.println("Arquivo transferido corretamente");
        file.delete();
    };
Try to close the document first, so the file is no longer accessed by your process, and then it should be possible to rename it:
    // fixed code:
    document.close();
    if(file.renameTo(new File("C://pdfbox_teste//Aprovados//" + file.getName()))){
        System.out.println("Arquivo transferido corretamente");
    };
And as Mahesh K pointed out, you don't have to delete the (original) file after you renamed it. Rename does not make a duplicate where the original file would need to be deleted, it just renames it.
After calling renameTo, you shouldn't call delete. As far as I understand, renameTo works like a move command.
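The close-first-then-move order from the answers above can be sketched with only the standard library. This is a minimal illustration, not the asker's PDFBox code: a plain InputStream stands in for the PDDocument, the class and method names (CloseThenMove, moveIfMatches) are made up for the example, and Files.move is swapped in for File.renameTo because it throws an exception with a reason instead of returning false. With PDFBox, the same shape would presumably put the document inside the try-with-resources block.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class CloseThenMove {
    // Hypothetical helper: checks the file for a keyword, releases the handle,
    // and only then moves the file into targetDir. Returns the file's final path.
    static Path moveIfMatches(Path source, Path targetDir, String keyword) throws IOException {
        boolean matches;
        // try-with-resources closes the stream before the move below runs
        try (InputStream in = Files.newInputStream(source)) {
            String text = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            matches = text.contains(keyword);
        }
        if (!matches) {
            return source; // nothing to do, file stays where it is
        }
        Files.createDirectories(targetDir);
        // A move already removes the original, so no delete() is needed afterwards
        return Files.move(source, targetDir.resolve(source.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Deciding whether to move inside the block but performing the move after it also avoids renaming the same file once per matching word.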

How to Map files to the Memory in Java nio?

I have a text file which I downloaded from the internet. The file is large, around 77MB, and I need to map it into memory so I can read it quickly. Here is my code:
    public class MapRead {
        public MapRead()
        {
            try {
                File file = new File("E:/Amazon HashFile/Hash.txt");
                FileChannel c = new RandomAccessFile(file,"r").getChannel();
                MappedByteBuffer buffer = c.map(FileChannel.MapMode.READ_ONLY, 0,c.size()).load();
                System.out.println(buffer.isLoaded());
                System.out.println(buffer.capacity());
            } catch (IOException ex) {
                Logger.getLogger(MapRead.class.getName()).log(Level.SEVERE, null, ex);
            }
        }
    }
No good, this generated the below output.
false
81022554
This is my first time trying this. I have no idea what went wrong, or what to do next, to read the file.
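For what it's worth, the output above suggests the mapping itself succeeded: the capacity of 81022554 bytes is the size of the mapped region, and the Javadoc for MappedByteBuffer.isLoaded() describes its result as a hint, so false does not necessarily mean the content is unavailable. A minimal sketch of actually reading from the mapped buffer follows. The class and method names are made up for the example, and UTF-8 is an assumption about the file's encoding.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedRead {
    // Hypothetical helper: maps the whole file read-only and decodes it as text.
    // Assumes the file is UTF-8 and smaller than 2 GB (the limit of a single mapping).
    static String readMapped(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            // The mapping stays valid after the channel is closed; decode reads
            // the bytes through the mapping rather than through the channel.
            return StandardCharsets.UTF_8.decode(buffer).toString();
        }
    }
}
```

For line-by-line access, the decoded string could be split, or individual bytes read with buffer.get() instead of decoding everything at once.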

How to write images, swf's, videos and anything else that is stored on a website to a file on my computer using streams

I'm trying to write a program that copies a website to my hard drive. This is easy enough to do by copying the source and saving it as an HTML file, but then you can't access any of the pictures, videos, etc. offline. I was wondering if there is a way to do this using an input/output stream and, if so, how exactly to do it...
Thanks so much in advance
If you have the URL of the file to be downloaded, you can simply do it using Apache commons-io:
org.apache.commons.io.FileUtils.copyURLToFile(URL, File);
EDIT :
This code will download a zip file to your desktop.

    import java.io.File;
    import java.net.URL;
    import static org.apache.commons.io.FileUtils.copyURLToFile;

    public static void Download() {
        URL dl = null;
        File fl = null;
        try {
            // Target file on the current user's desktop
            fl = new File(System.getProperty("user.home").replace("\\", "/") + "/Desktop/Screenshots.zip");
            dl = new URL("http://example.com/uploads/Screenshots.zip");
            copyURLToFile(dl, fl);
        } catch (Exception e) {
            System.out.println(e);
        }
    }
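Since the question asked about plain streams, here is a rough equivalent using only the standard library. It is a sketch rather than a drop-in replacement for commons-io: the class and method names are made up, and it has none of the timeout handling you would want for real downloads.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class StreamDownload {
    // Hypothetical helper: streams the bytes behind a URL into a local file
    // and returns the number of bytes written. Works for any content type
    // (images, videos, zip files) because the bytes are copied as-is.
    static long download(URL source, Path target) throws IOException {
        try (InputStream in = source.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

A call would look like StreamDownload.download(new URL("http://example.com/uploads/Screenshots.zip"), Paths.get("Screenshots.zip")), reusing the example URL from the answer above.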

Java: Read in text files from a directory, from the internet

Does anybody know how to recursively read in files from a specific directory on the internet, in Java?
I want to read in all the text files from this web directory: http://www.cs.ucdavis.edu/~davidson/courses/170-S11/Female/
I know how to read in multiple files that are in a folder on my computer, and I know how to read in a single file from the internet. But how can I read in multiple files from the internet without hardcoding the URLs?
Stuff I tried:
    // List the files on my Desktop
    final File folder = new File("/Users/crystal/Desktop");
    File[] listOfFiles = folder.listFiles();
    for (int i = 0; i < listOfFiles.length; i++) {
        File fileEntry = listOfFiles[i];
        if (!fileEntry.isDirectory()) {
            System.out.println(fileEntry.getName());
        }
    }
Another thing I tried:
    // Reading data from the web
    try
    {
        // Create a URL object
        URL url = new URL("http://www.cs.ucdavis.edu/~davidson/courses/170-S11/Female/5_1_1.txt");
        // Read all of the text returned by the HTTP server
        BufferedReader in = new BufferedReader (new InputStreamReader(url.openStream()));
        String htmlText; // String that holds current file line
        // Read through file one line at a time. Print line
        while ((htmlText = in.readLine()) != null)
        {
            System.out.println(htmlText);
        }
        in.close();
    } catch (MalformedURLException e) {
        e.printStackTrace();
    } catch (IOException e) {
        // If another exception is generated, print a stack trace
        e.printStackTrace();
    }
Thanks!
Since the URL you mentioned has indexes enabled, you're in luck.
You've got a few options here.
Parse the HTML to find the href attribute of the a tags, using SAX2 or any other XML parser. htmlunit would also work, I think.
Use a little regexp magic to match every string between <a href=" and "> and use those as the URLs to read from.
Once you've got a list of all the URLs you need, then the second piece of code should work just fine. Just iterate over your list, and construct your URL from that list.
Here's a sample regex that should match what you want. It does catch a few extra links, but you should be able to filter those out.
<a\ href="(.+?)">
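As a rough sketch of the regex option in Java: the class and method names below are made up, the pattern uses \s+ instead of the escaped space so it tolerates extra whitespace, and the code assumes the index page lists files with hrefs relative to the directory URL. Keeping only links that end in .txt is one way to filter out the extra matches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HrefExtractor {
    // Same idea as the regex above, written as a Java string literal
    private static final Pattern HREF =
            Pattern.compile("<a\\s+href=\"(.+?)\">", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: returns absolute URLs for every .txt link in the
    // index page. Assumes hrefs are relative and baseUrl ends with a slash.
    static List<String> textFileLinks(String html, String baseUrl) {
        List<String> urls = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String href = m.group(1);
            // Drops the sort links and the parent-directory link
            if (href.endsWith(".txt")) {
                urls.add(baseUrl + href);
            }
        }
        return urls;
    }
}
```

The html argument would be the index page read with the BufferedReader code from the question, and each returned URL can then be fed back into that same code.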
