I want to know whether it's possible to add or remove code in a .jar file.
Here's my case:
I have a program that organizes PDF files by company, type and date into a given directory. But when I parse some of the PDF files, the company name may come out wrong, mostly because of the way the PDF was generated.
The company name in the PDF file is:
COMPANY & MAN'S
but when converted, the output might be:
CO MPA NY & MAN S
Knowing this, I have a block for every file type to handle this kind of exception.
It looks like this:
static String DSN(String EDIT_DSN)
{
    EDIT_DSN = EDIT_DSN.replaceAll("CO MPA NY & MAN S", "COMPANY & MAN'S");
    return EDIT_DSN;
}
What I'm trying to do is create another piece of code that can add, edit or remove lines in these blocks. Is it possible? If it is, how should I do it?
Simple: you don't hard code these strings in your code.
Instead, you put the strings into a text file (for example a Java properties file).
Then your "mapper" code simply reads the mappings from such text files. So you don't have to change your Java classes, just feed different text files to it.
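A minimal sketch of that idea, assuming the mappings live in a properties file whose keys are the garbled names and whose values are the corrected ones. The class name CompanyNameFixer is made up for illustration, and the demo feeds the file content from a string; in real use you would pass a reader for your own file. Note that spaces inside a properties key have to be escaped with a backslash, and that replace() is used here instead of replaceAll() so the keys are not treated as regular expressions.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper: loads "garbled=corrected" pairs and applies them to text.
class CompanyNameFixer {
    private final Properties mappings = new Properties();

    CompanyNameFixer(Reader source) throws IOException {
        mappings.load(source);
    }

    // Replaces every known garbled name with its corrected form (literal match, no regex).
    String fix(String text) {
        String result = text;
        for (String garbled : mappings.stringPropertyNames()) {
            result = result.replace(garbled, mappings.getProperty(garbled));
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the file content; spaces in the key are escaped with a backslash.
        String fileContent = "CO\\ MPA\\ NY\\ &\\ MAN\\ S=COMPANY & MAN'S\n";
        CompanyNameFixer fixer = new CompanyNameFixer(new StringReader(fileContent));
        System.out.println(fixer.fix("Invoice for CO MPA NY & MAN S"));
        // prints: Invoice for COMPANY & MAN'S
    }
}
```

Adding, editing or removing a mapping then means editing the text file, not recompiling the jar.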
If I have 100 short XML files in a folder and I'd like to know which of them contain the text aaabbbccc (so I can parse those properly later), is it a good idea to read them as Strings one after another and use the contains function to rule out the files that do not contain this text?
As far as I know, the contains function is very fast.
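For what it's worth, the approach described could be sketched roughly like this. It assumes the files end in .xml, are small enough to read whole, and are UTF-8 encoded; the class and method names are invented for the example.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: finds the XML files in a folder that contain a given text.
class XmlTextFinder {
    static List<Path> filesContaining(Path folder, String needle) throws IOException {
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> xmlFiles = Files.newDirectoryStream(folder, "*.xml")) {
            for (Path file : xmlFiles) {
                // Reads the whole file as UTF-8 (an assumption about the encoding).
                String content = Files.readString(file);
                if (content.contains(needle)) {
                    matches.add(file);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        Path folder = Path.of(args.length > 0 ? args[0] : ".");
        System.out.println(filesContaining(folder, "aaabbbccc"));
    }
}
```

With 100 short files, reading from disk will probably take longer than the contains calls themselves.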
I've done a lot of internet searching to find some information, to no avail. Hopefully you can help me.
I want to be able to take a flat file with normal content (i.e. full English sentences, paragraphs etc.), extract each word and store each word individually, one word per row, in a SQL database (it doesn't matter if there are spaces, but characters such as apostrophes can be kept in).
I then want to have an HTML page with code to access this DB and output the text to the user one word at a time, essentially 'writing' the input file's text word by word on the web page.
This is just a coding exercise, but I am frustrated as I know the what but not the how. I am not sure where to start. Please note some of these files can be quite big (~20,000 words), so there may be a performance element to consider in any solution.
TL;DR: I want to extract individual words from a text file with normal everyday sentences into a SQL DB that I can retrieve from an HTML page.
Simple read & split exercise:
filename = "input.txt"  # placeholder: path to your text file
dd = {}
with open(filename) as f:
    for ln in f:
        wds = ln.strip().split()
        for word in wds:
            dd[word] = 1  # need something for value
for wkey in dd:
    pass  # insert wkey into the db here
Well, before you start you should choose just one programming language. Since you seem like you are a beginner I would highly recommend Python over Java, but it depends on if you're required to use any particular language by an employer/professor/etc.
Also just to point out, this is also a very BIG task that you've chosen. I'll try to break it down into parts for you, but I recommend starting with just one of these parts before you move on, and make sure it works on your local machine before you try putting it on the web.
First you need to use something to read in your file, preferably line by line. FileReader/BufferedReader in Java or the open() and readlines() functions in Python will do this. I would also check out the tutorials online on file handling for whichever of these two languages you're going to use. The Python one is here. Practice this with a test file or a small section of your real file before you start working on your real input files.
When you start processing the lines from the file, I would recommend splitting them into individual words using a string split function on spaces or on any punctuation, such as ,.!?". This way you'll pull out the individual words from each line in the file.
Next, you'll want to choose a database API for the appropriate programming language. I used PyMySQL, but there is also MySQLdb for Python. In Java there is JDBC.
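If you go the Java route, the read-and-split step could look something like the sketch below. The class name and the exact set of delimiters are my own choices for illustration; the delimiter set follows the punctuation listed above and leaves apostrophes inside words.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads text line by line and splits it into words.
class WordSplitter {
    static List<String> words(BufferedReader reader) throws IOException {
        List<String> words = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            // Split on runs of whitespace or the punctuation , . ! ? "
            for (String token : line.split("[\\s,.!?\"]+")) {
                if (!token.isEmpty()) {
                    words.add(token); // keeps order and repeated words
                }
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // In real use, wrap a FileReader for your input file instead of a StringReader.
        BufferedReader sample = new BufferedReader(new StringReader("Hello, world! It's fine.\nNext line?"));
        System.out.println(words(sample));
        // prints: [Hello, world, It's, fine, Next, line]
    }
}
```

A list keeps the words in their original order, which matters if you later want to write the text back out word by word.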
You'll need to then build your database on a server somewhere, preferably on the same server as your HTML page for ease of connection. You'll want to practice connecting to your database and adding sample rows before you start trying to process your real input files.
You can't have normal HTML access the database directly - you'll need to use a coding language like Python for that. I've never used Java for webpages, but with Python you'll simply output text and tell the server to display it as the webpage. This will do the trick:
#!/usr/bin/python
# -*- coding: utf-8 -*-
import otherstuffhere
## Must have this header to tell browser how to handle this output
## and must be printed first
print ("Content-Type: text/html\n\n")
## Connect to database here
## Your code to display words from the database goes below here
print (myfield1)
Also remember that when you output your text, you'll need to add all the HTML tags to the normal text output. For example, when printing each word, you'll need to add <p> or <br> to end each line, because although the Python print() function will automatically add a line break, this doesn't translate to a line break in HTML. For example:
print ("My word list is: <br>")
for word in dbOutputList:
    print (word)
    print ("<br>")
After that the REAL fun/crying begins, but you should work on the above before you move on.
I am a teacher and would like my students to have in front of them pretty printouts of the source code to 4 short Java files. I don't want to waste paper (or have them shuffling papers around), so I would like to have the four files on a single page. I don't want to print (from Eclipse) each to a separate PDF, then combine them 4-up, since that would make the text tiny. I tried concatenating the four files into a single .java file in Eclipse, but, despite reading this question, I found no way of suppressing the display of errors (namely defining multiple public classes in a single file).
Update: I don't just want to print the code as text. I would like it pretty-printed, i.e., with syntax highlighting.
You can copy them from Eclipse to OpenOffice/LibreOffice or Word to keep the formatting, especially the colors.
You could use a program like Highlight: http://www.andre-simon.de/ (the results can be copied with the highlighting intact)
You could use LaTeX to handle the formatting
You could put the source together in one file; you then have to change the code so that there is only one public class:
public class A {}
class B {}
Copy them all into a text editor and print from there.
Following on from my previous question, my program doesn't detect the 300 images that have just been created in a particular directory; instead, it only detects desktop.ini. That cannot be right, as I can see for myself that the files have been created in that directory and do exist.
Can somebody please explain why this happens, given that the program seems to work just fine the next time I run it?
The only way that something is detected within the directory on the first run is when there is at least one file which exists in the directory before the program is compiled and executed.
Many thanks.
UPDATE: Files are detected as follows:
//Default greyscale image directory (to convert from greyscale to binary).
static File dirGrey = new File("test_images\\Greyscale");
//Array of greyscale image filenames.
static File imgListGrey[] = dirGrey.listFiles();
Without knowing how you create the images, this question is akin to 'How many kittens are under my desk right now?'
Are you creating the files yourself? If so, are you closing any file handles referring to those files once they are created?
You're creating the file list in a static array, and it's created when the class containing the array is loaded by the Java class loader, which is probably before you create the image files. That's why the array contains an outdated list.
static is rarely needed, mostly useful for constants (things that never change, such as 42), for pure functions (Math.sqrt()) and a few other special cases. When you use it, you have to learn all the tricky initialization order stuff. Otherwise, just stick with non-static variables.
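A rough sketch of the fix, assuming the rest of the program stays as it is: keep only the directory as a constant and ask for the file list at the moment you need it, after the images have been written. The class and method names are made up for the example.

```java
import java.io.File;

// Hypothetical rewrite: the listing happens on each call, not at class-load time.
class GreyscaleImages {
    // The directory itself never changes, so a constant is fine here.
    static final File DIR_GREY = new File("test_images", "Greyscale");

    // Returns the files present in the directory right now (empty if it does not exist).
    static File[] listImages(File dir) {
        File[] files = dir.listFiles();
        return files != null ? files : new File[0];
    }

    public static void main(String[] args) {
        // ... create the greyscale images first, then ask for the list ...
        File[] imgListGrey = listImages(DIR_GREY);
        System.out.println(imgListGrey.length + " file(s) found");
    }
}
```

listFiles() returns null when the directory does not exist or cannot be read, which is why the sketch falls back to an empty array.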
Using a Java servlet, is it possible to detect the true file type of a file, regardless of its extension?
Scenario: You only allow plain text file uploads (.txt and .csv). The user takes the file mypicture.jpg, renames it to mypicture.txt and proceeds to upload the file. Your servlet expects only text files and blows up trying to read the jpg.
Obviously this is user error, but is there a way to detect that it's not plain text and not proceed?
You can do this using the built-in URLConnection#guessContentTypeFromStream() API. However, it is pretty limited in the content types it can detect, so you may be better off using a third-party library like jMimeMagic.
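A small sketch of the built-in route. The method only peeks at the first few bytes, so the stream has to support mark/reset (a BufferedInputStream or ByteArrayInputStream does); the class name UploadSniffer and the sample bytes are made up for the demo. It returns null when it does not recognise the content, which is what you would expect for ordinary text.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper around URLConnection.guessContentTypeFromStream().
class UploadSniffer {
    // Returns a MIME type such as "image/jpeg", or null if the content is not recognised.
    static String guessType(InputStream upload) throws IOException {
        if (!upload.markSupported()) {
            throw new IllegalArgumentException("wrap the stream in a BufferedInputStream first");
        }
        return URLConnection.guessContentTypeFromStream(upload);
    }

    public static void main(String[] args) throws IOException {
        // First bytes of a typical JPEG file, standing in for mypicture.txt.
        byte[] jpegHeader = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0, 0, 0x10, 'J', 'F', 'I', 'F', 0};
        byte[] csv = "name,amount\n".getBytes(StandardCharsets.US_ASCII);

        System.out.println(guessType(new ByteArrayInputStream(jpegHeader))); // image/jpeg
        System.out.println(guessType(new ByteArrayInputStream(csv)));        // null
    }
}
```

In a servlet you could reject the upload whenever the guess is non-null and not a text type, keeping in mind that a null result does not prove the file is text.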
See also:
Best way to determine file type in Java
When do browsers send application/octet-stream as Content-Type?
No. There is no way to know in advance what type of file is being uploaded to you. You must do all verification on the server before taking any action with the file.
I think you should consider why your program might blow up when given a JPEG (say) and make it defensive against this. For example, a JPEG file is likely to have apparently very long lines (any LF or CR LF will be somewhat randomly spread). But a so-called text file could equally have long lines that might kill your program.
What exactly do you mean by "plain text file"? Would a file consisting of Chinese text be a plain text file? If you assume English text in ASCII or ANSI encoding, you would have to read the full file as a binary file and check that, e.g., all byte values are between, say, 32 and 127, plus 13, 10 and 9, maybe.
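That check could be sketched as below, under the ASCII assumption just described. The accepted range (32 to 127, plus tab, LF and CR) is taken straight from the suggestion above and would need widening for anything beyond plain English text; the class and method names are invented.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: accepts a stream only if every byte looks like plain ASCII text.
class PlainTextCheck {
    static boolean looksLikeAsciiText(InputStream in) throws IOException {
        int b;
        while ((b = in.read()) != -1) {
            boolean inRange = b >= 32 && b <= 127;               // range suggested above
            boolean lineControl = b == 9 || b == 10 || b == 13;  // tab, LF, CR
            if (!inRange && !lineControl) {
                return false; // first suspicious byte ends the scan
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        byte[] text = "name,amount\r\nACME,42\r\n".getBytes(StandardCharsets.US_ASCII);
        byte[] jpegStart = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};
        System.out.println(looksLikeAsciiText(new ByteArrayInputStream(text)));      // true
        System.out.println(looksLikeAsciiText(new ByteArrayInputStream(jpegStart))); // false
    }
}
```

For real uploads, pass a BufferedInputStream so the byte-by-byte reads stay cheap.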