parse java string in java - java

I want to parse Java code using Java.
The problem is that when I pass the Java code to the parse method, it is not accepted as a string. How do I escape the code to be parsed?
public class JavaParser {
private int noOfLines;
public void parse(String javaCode){
String[] lines = javaCode.split("[\r\n]+");
for(String line : lines)
System.out.println(line);
}
public static void main(String[] args){
JavaParser a = new JavaParser();
a.parse("java code;");
}
}

You need to read the Java source file as a text file, either line by line or all at once, for example:
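// IOUtils comes from Apache Commons IO (org.apache.commons.io.IOUtils).
FileInputStream inputStream = new FileInputStream("foo.java");
String everything; // declared outside the try block so it stays in scope afterwards
try {
everything = IOUtils.toString(inputStream, StandardCharsets.UTF_8);
} finally {
inputStream.close();
}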
Then you can parse the everything string.
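If you would rather avoid a third-party dependency, the same idea can be sketched with the standard library alone. The class and method names below (SourceReader, readAll, lines) are made up for illustration, and UTF-8 is an assumption about how the source file is encoded.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the original answer.
class SourceReader {

    // Reads the whole file into one String, decoding it as UTF-8 (an assumption).
    static String readAll(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Splits text into lines; \R matches \n, \r\n and \r (Java 8 and later).
    static List<String> lines(String text) {
        return Arrays.asList(text.split("\\R"));
    }
}
```

Reading from a file means no escaping is needed at all. Escaping only matters when the code is written inline as a Java string literal, where quotes and backslashes must be escaped, for example SourceReader.lines("String s = \"hi\";\nint x = 1;").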

Maybe you can describe what you're trying to achieve?
In general, parsing Java source files is the job of a Java compiler such as javac.
Quick googling revealed this project that can suit your needs
As of Java 6 you can invoke the compiler from your own code (Java exposes a compiler API). This can be helpful if you're trying to compile the code after you read it. In general you can read this article; maybe you'll find it helpful.
Hope this helps
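As a rough sketch of the compiler API mentioned above (the javax.tools package), the following checks whether a string of Java source compiles. SourceChecker and StringSource are hypothetical names, and the class files are written to a temporary directory purely as an illustrative choice. It needs a JDK at run time, since a plain JRE has no system compiler.

```java
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collections;

// Hypothetical helper: checks whether a string of Java source compiles.
class SourceChecker {

    // Wraps source text held in memory so the compiler can read it like a file.
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className.replace('.', '/') + Kind.SOURCE.extension),
                  Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    // Returns true when the source compiles without errors.
    static boolean compiles(String className, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system compiler; run on a JDK, not a JRE");
        }
        Path outDir;
        try {
            outDir = Files.createTempDirectory("compiled"); // keeps .class files out of the way
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Collecting diagnostics stops the compiler from printing errors to stderr.
        DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<>();
        JavaCompiler.CompilationTask task = compiler.getTask(
                null, null, diagnostics,
                Arrays.asList("-d", outDir.toString()),
                null,
                Collections.singletonList(new StringSource(className, code)));
        return task.call();
    }
}
```

The DiagnosticCollector also holds the individual error messages and their line numbers, which is often what you want if the goal is to analyse the code and not to run it.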

Related

easiest way to read a java file - is there a simpler alternative to JSON

I am writing a small Java method that needs to read test data from a file on my Windows 10 laptop.
The test data has not been created yet, but it will be text based.
I need to write a method that reads the data and analyses it character by character.
My questions are:
What is the simplest format in which to create and read the file? I was looking at JSON, which does not look particularly complex, but is it the best choice for a very simple application?
My second question (and I am a novice): if the data is in a text file on my laptop, how do I tell my Java code where to find it? How do I ask Java to navigate the Windows 10 file system?
You can also map the text file into Java objects (depending on your text file).
For example, suppose we have a text file that contains a person's name and family name on each line:
Foo,bar
John,doe
To parse this text file and map it into Java objects we can:
1- Create a Person Object
2- Read and parse the file (line by line)
Create Person Class
public class Person {
private String name;
private String family;
//setters and getters
}
Read The File and Parse line by line
public static void main(String[] args) throws IOException {
//Read file
//Parse line by line
//Map into person object
List<Person> personList = Files
.lines(Paths
.get("D:\\Project\\Code\\src\\main\\resources\\person.txt"))
.map(line -> {
//Split each line of text on ","
//The words of the line go into a list of strings, e.g. "John,Doe" -> [John, Doe]
//Splitter comes from Guava (com.google.common.base.Splitter)
List<String> nameAndFamily = Splitter.on(",").trimResults().omitEmptyStrings().splitToList(line);
//Create a new Person from the words above
Person person = new Person();
person.setName(nameAndFamily.get(0));
person.setFamily(nameAndFamily.get(1));
return person;
}
).collect(Collectors.toList());
//Process the person list
personList.forEach(person -> {
//You can do whatever you want with each person
//Print
System.out.println(person.getName());
System.out.println(person.getFamily());
});
}
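For comparison, here is a minimal sketch of the same mapping that relies only on the standard library instead of Guava. PersonParser and its nested Person class are illustrative names, and skipping malformed lines is an assumption about how bad input should be handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stdlib-only variant of the Person mapping.
class PersonParser {

    static final class Person {
        final String name;
        final String family;

        Person(String name, String family) {
            this.name = name;
            this.family = family;
        }
    }

    // Turns lines such as "John,doe" into Person objects, skipping malformed lines.
    static List<Person> parse(List<String> lines) {
        List<Person> people = new ArrayList<>();
        for (String line : lines) {
            String[] parts = line.split(",");
            if (parts.length < 2) {
                continue; // not a "name,family" line
            }
            people.add(new Person(parts[0].trim(), parts[1].trim()));
        }
        return people;
    }
}
```

The lines can come from Files.readAllLines(Paths.get(...)), so the parsing step stays independent of where the file lives.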
Regarding your first question, I can't say much without knowing anything about the data you would like to write and read.
For your second question, you would normally do something like this:
String pathToFile = "C:/Users/SomeUser/Documents/testdata.txt";
InputStream in = new FileInputStream(pathToFile);
As your data gains more complexity you should probably think about using a defined format, if that is possible, something like JSON, YAML or similar for example.
Hope this helps a bit. Good luck with your project.
As for the format the text file needs to take, you should elaborate a bit on the kind of data, so I can't say much there.
But to navigate the file system, you just need to write the path a bit differently:
Keep the drive letter and its colon ":" at the beginning of the path
Replace each backslash with a forward slash (or escape it as "\\" inside a Java string literal)
Then you should be set.
So for example...
C:\users\johndoe\documents\projectfiles\mydatafile.txt
becomes
C:/users/johndoe/documents/projectfiles/mydatafile.txt
With this path, you can use all the IO classes for file manipulation.
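Since the question asks about analysing the data character by character, here is a small sketch of that step. CharScanner and countLetters are hypothetical names, counting letters is just a stand-in for whatever analysis is needed, and UTF-8 is assumed as the file encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example of reading a text file one character at a time.
class CharScanner {

    // Counts the letters in a text file.
    static int countLetters(Path file) {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            int letters = 0;
            int c;
            while ((c = reader.read()) != -1) { // -1 signals end of file
                if (Character.isLetter(c)) {
                    letters++;
                }
            }
            return letters;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

On Windows it could be called as CharScanner.countLetters(Paths.get("C:/Users/SomeUser/Documents/testdata.txt")), where the path is only a placeholder.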

Java Read Text file with condition

I'm just wondering if there's a way to read a text file and skip the lines that contain a specific string.
For example (test1.txt):
test0,orig,valid,nice
test1,input,of,ol,[www]
test2,[eee],oa,oq
test3,wa,eee,string,int
test4,asfd,eee,[tsddas],wwww
Expected output :
test0,orig,valid,nice
test3,wa,eee,string,int
I already have this code :
String line;
String[] test;
try {
LineIterator it = FileUtils.lineIterator(file2, "UTF-8");
while (it.hasNext()) {
line = it.nextLine();
test = StringUtils.split(line, ",");
}
} catch (IOException e) {
e.printStackTrace();
}
Thanks in advance guys!
Something like:
line = it.nextLine();
if (line.contains("specific string")) continue;
test = StringUtils.split(line,(","));
Why are you using StringUtils to split a line? The String class supports a split(...) method.
I suggest you read the String API for basic functionality.
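Putting the two suggestions together (skip with continue, split with String.split), a standard-library sketch could look like this. LineFilter and readRows are made-up names, and taking a Reader as input is a choice made here so the method works with both files and in-memory text.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper combining the skip and the split.
class LineFilter {

    // Reads all lines, skips those containing the marker, splits the rest on commas.
    static List<String[]> readRows(Reader source, String marker) {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(marker)) {
                    continue; // skip this line entirely
                }
                rows.add(line.split(","));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }
}
```

With the sample file from the question and "[" as the marker, only the test0 and test3 lines remain.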
You could also make use of the Java 8 Stream API, which makes all of this rather easy. Assuming file2 is a java.io.File:
final List<String[]> rows = Files.lines(file2.toPath(), StandardCharsets.UTF_8)
.filter(l -> !l.contains("["))
.map(l -> l.split(","))
.collect(Collectors.toList());
If you want to do it fast and don't want to run into any issues, you should also think about using an existing CSV library. A nice example is the Apache Commons CSV library (https://commons.apache.org/proper/commons-csv/).

What is the proper way to write/read a file with different IO streams

I have a file that contains bytes, chars, and an object, all of which need to be written then read. What would be the best way to utilize Java's different IO streams for writing and reading these data types? More specifically, is there a proper way to add delimiters and recognize those delimiters, then triggering what stream should be used? I believe I need some clarification on using multiple streams in the same file, something I have never studied before. A thorough explanation would be a sufficient answer. Thanks!
As EJP already suggested, use ObjectOutputStream and ObjectInputStream and wrap your other elements in one or more objects. I'm posting this as an answer so I can show an example (it's hard to do in a comment). EJP, if you want to embed it in your answer, please do and I'll delete mine.
class MyWrappedData implements Serializable {
private String string1;
private String string2;
private char char1;
// constructors
// getters setters
}
Write to file:
ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(fileName));
out.writeObject(myWrappedDataInstance);
out.flush();
out.close();
Read from file:
ObjectInputStream in = new ObjectInputStream(new FileInputStream(fileName));
Object obj = in.readObject();
in.close();
MyWrappedData wrapped = null;
// instanceof is false for null, so no separate null check is needed
if (obj instanceof MyWrappedData)
wrapped = (MyWrappedData) obj;
// get the specific elements from the wrapped object
See a very clear example here: Read and Write
Redesign the file. There is no sensible way of implementing it as presently designed. For example the object presupposes an ObjectOutputStream, which has a header - where's that going to go? And how are you going to know where to switch from bytes to chars?
I would probably use an ObjectOutputStream for the whole thing and write everything as objects. Then Serialization solves all those problems for you. After all you don't actually care what's in the file, only how to read and write it.
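To illustrate the single-stream idea: ObjectOutputStream also implements DataOutput, so primitives and objects can go through the same stream as long as they are read back in the order they were written. The sketch below uses an in-memory buffer instead of a file, and MixedStreamDemo and Payload are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical demo: one stream for a byte, a char and an object.
class MixedStreamDemo {

    // Stand-in for "an object" from the question.
    static final class Payload implements Serializable {
        private static final long serialVersionUID = 1L;
        final String label;

        Payload(String label) {
            this.label = label;
        }
    }

    // Writes a byte, a char and an object through a single ObjectOutputStream.
    static byte[] write(byte b, char c, Payload p) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeByte(b);
            out.writeChar(c);
            out.writeObject(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the values back in exactly the order they were written.
    static String read(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            byte b = in.readByte();
            char c = in.readChar();
            Payload p = (Payload) in.readObject();
            return b + "," + c + "," + p.label;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

No delimiters are needed because the stream records its own structure; the reader only has to know the order of the fields.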
Can you change the structure of the file? It is unclear, because the first sentence of your question contradicts being able to add delimiters. If you can change the file structure, you could write the different data types into separate files. I would consider this the 'proper' way to separate the data streams.
If you are stuck with the file the way it is, then you will need to write an interface to the file's structure, which in practice is a shopping list of read operations and a lot of exception handling. It is a hackish way to program, because it requires a hex editor and a lot of trial and error, but it works in certain cases.
Why not write the file as XML, possibly with a nice simple library like XStream? If you are concerned about space, wrap it in gzip compression.
If you have control over the file format, and it's not an exceptionally large file (i.e. < 1 GiB), have you thought about using Google's Protocol Buffers?
They generate code that parses (and serializes) file/byte[] content. Protocol buffers use a tagging approach on every value that includes (1) a field number and (2) a type, so they have nice properties such as forward/backward compatibility with optional fields etc. They are fairly well optimized for both speed and file size, adding only ~2 bytes of overhead for a short byte[], with ~2-4 additional bytes to encode the length on larger byte[] fields (VarInt encoded lengths).
This could be overkill, but if you have a bunch of different fields & types, protobuf is really helpful. See: http://code.google.com/p/protobuf/.
An alternative is Thrift (originally from Facebook, now Apache Thrift), with support for a few more languages although possibly less use in the wild last I checked.
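To make the tagging and VarInt remarks more concrete, here is a hand-written sketch of base-128 varint encoding and of how a field key combines field number and wire type. This is not the protobuf library's API; VarInt and its methods are illustrative names only.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration of the varint idea, not the protobuf library itself.
class VarInt {

    // Encodes a value as a base-128 varint, lowest 7 bits first.
    static byte[] encode(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80)); // high bit set: more bytes follow
            value >>>= 7;
        }
        out.write((int) value); // last byte, continuation bit clear
        return out.toByteArray();
    }

    // Decodes a varint that starts at index 0 of the array.
    static long decode(byte[] bytes) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break; // continuation bit clear: this was the last byte
            }
            shift += 7;
        }
        return result;
    }

    // A field key packs the field number and the wire type into one integer.
    static int tag(int fieldNumber, int wireType) {
        return (fieldNumber << 3) | wireType;
    }
}
```

Small numbers take a single byte, which is where the low per-field overhead comes from: the value 300 needs two bytes (0xAC 0x02), and the key for field 1 with the length-delimited wire type 2 is the single byte 0x0A.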
If the structure of your file is not fixed, consider using a wrapper per type. First you need to create the interface of your wrapper classes….
interface MyWrapper extends Serializable {
void accept(MyWrapperVisitor visitor);
}
Then you create the MyWrapperVisitor interface…
interface MyWrapperVisitor {
void visit(MyString wrapper);
void visit(MyChar wrapper);
void visit(MyLong wrapper);
void visit(MyCustomObject wrapper);
}
Then you create your wrapper classes…
class MyString implements MyWrapper {
public final String value;
public MyString(String value) {
super();
this.value = value;
}
@Override
public void accept(MyWrapperVisitor visitor) {
visitor.visit(this);
}
}
.
.
.
And finally you read your objects…
final InputStream in = new FileInputStream(myfile);
final ObjectInputStream objIn = new ObjectInputStream(in);
final MyWrapperVisitor visitor = new MyWrapperVisitor() {
@Override
public void visit(MyString wrapper) {
//your logic here
}
.
.
.
};
//loop over all your objects here
final MyWrapper wrapper = (MyWrapper) objIn.readObject();
wrapper.accept(visitor);

Are there any Java Frameworks for binary file parsing?

My problem is that I want to parse binary files of different types with a generic parser implemented in Java. Maybe by describing the file format in a configuration file that the parser reads, or by creating Java classes that parse the files according to some sort of parsing rules.
I have searched quite a bit on the internet but found almost nothing on this topic.
What I have found are just things which deal with compiler-generators (Jay, Cojen, etc.) but I don't think that I can use them to generate something for parsing binary files. But I could be wrong on that assumption.
Are there any frameworks which deal especially with easy parsing of binary files or can anyone give me a hint how I could use parser/compiler-generators to do so?
Update:
I'm looking for something where I can write a config-file like
file:
header: FIXED("MAGIC")
body: content(10)
content:
value1: BYTE
value2: LONG
value3: STRING(10)
and it automatically generates something that parses files which start with "MAGIC", followed by ten occurrences of the content package (which itself consists of a byte, a long and a 10-byte string).
Update2:
I found something comparable to what I'm looking for, "Construct", but sadly it is a Python framework. Maybe this helps someone get an idea of what I'm looking for.
Using Preon:
public class File {
@BoundString(match="MAGIC")
private String header;
@BoundList(size="10", type=Body.class)
private List<Body> body;
private static class Body {
@Bound
byte value1;
@Bound
long value2;
@BoundString(size="10")
String value3;
}
}
Decoding data:
Codec<File> codec = Codecs.create(File.class);
File file = Codecs.decode(codec, buffer);
Let me know if you are running into problems.
Give Preon a try.
I have used DataInputStream for reading binary files and I write the rules in Java. ;) Binary files can have just about any format so there is no general rule for how to read them.
Frameworks don't always make things simpler. In your case, the description file is longer than the code to just read the data using a DataInputStream.
public static void parse(DataInput in) throws IOException {
// file:
// header: FIXED("MAGIC")
String header = readAsString(in, 5);
assert header.equals("MAGIC");
// body: content(10)
// ?? not sure what this means
// content:
for(int i=0;i<10;i++) {
// value1: BYTE
byte value1 = in.readByte();
// value2: LONG
long value2 = in.readLong();
// value3: STRING(10)
String value3 = readAsString(in, 10);
}
}
public static String readAsString(DataInput in, int len) throws IOException {
byte[] bytes = new byte[len];
in.readFully(bytes);
return new String(bytes);
}
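To try a DataInput-based parser like this without a real file, you can produce matching test data with DataOutputStream. The sketch below is one possible round trip under stated assumptions: MagicFormat, write and sumLongs are made-up names, the strings are assumed to be ASCII, and summing the long fields is only there to have something to check.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical round trip for the "MAGIC" format described in the question.
class MagicFormat {

    // Writes "MAGIC" followed by `count` records of byte, long and 10-byte ASCII string.
    static byte[] write(int count) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.write("MAGIC".getBytes(StandardCharsets.US_ASCII));
            for (int i = 0; i < count; i++) {
                out.writeByte(i);
                out.writeLong(i * 1000L);
                // "record" plus four digits is exactly 10 characters
                String text = String.format(Locale.ROOT, "record%04d", i);
                out.write(text.getBytes(StandardCharsets.US_ASCII));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the records back and returns the sum of the long fields.
    static long sumLongs(byte[] data, int count) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            byte[] magic = new byte[5];
            in.readFully(magic);
            if (!"MAGIC".equals(new String(magic, StandardCharsets.US_ASCII))) {
                throw new IllegalArgumentException("bad header");
            }
            long sum = 0;
            byte[] text = new byte[10];
            for (int i = 0; i < count; i++) {
                in.readByte();        // value1
                sum += in.readLong(); // value2
                in.readFully(text);   // value3
            }
            return sum;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Naming the charset explicitly when converting bytes to a String avoids depending on the platform default, which can differ between machines.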
If you want to have a configuration file you could use a Java Configuration File. http://www.google.co.uk/search?q=java+configuration+file
Google's Protocol Buffers
A parser combinator library is an option. JParsec works fine, although it could be slow.
I have been developing a framework for Java which allows you to parse binary data: https://github.com/raydac/java-binary-block-parser
With it, you just describe the structure of your binary file in a pseudo-language.
You can parse binary files with parsers like JavaCC. Here you can find a simple example. Probably it's a bit more difficult than parsing text files.
Have you looked into the world of parsers? A good parser generator is yacc, and there may be a port of it for Java.

Convert HTML Character Back to Text Using Java Standard Library

I would like to convert some HTML characters back to text using Java Standard Library. I was wondering whether any library would achieve my purpose?
/**
* @param args the command line arguments
*/
public static void main(String[] args) {
// TODO code application logic here
// "Happy & Sad" in HTML form.
String s = "Happy &amp; Sad";
System.out.println(s);
try {
// Change to "Happy & Sad". DOESN'T WORK!
s = java.net.URLDecoder.decode(s, "UTF-8");
System.out.println(s);
} catch (UnsupportedEncodingException ex) {
}
}
I think the Apache Commons Lang library's StringEscapeUtils.unescapeHtml3() and unescapeHtml4() methods are what you are looking for. See https://commons.apache.org/proper/commons-text/javadocs/api-release/org/apache/commons/text/StringEscapeUtils.html.
Just add the jsoup jar to your application's libraries and then use this code.
import org.jsoup.Jsoup;
public class Encoder {
public static void main(String args[]) {
String s = Jsoup.parse("&lt;Fran&ccedil;ais&gt;").text();
System.out.print(s);
}
}
Link to download jsoup: http://jsoup.org/download
java.net.URLDecoder deals only with the application/x-www-form-urlencoded MIME format (e.g. "%20" represents space), not with HTML character entities. I don't think there's anything on the Java platform for that. You could write your own utility class to do the conversion, like this one.
The URL decoder should only be used for decoding strings from the urls generated by html forms which are in the "application/x-www-form-urlencoded" mime type. This does not support html characters.
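The difference between the two encodings can be seen in a short sketch. EntityDemo and its methods are hypothetical names, and the hand-written unescapeBasic covers only a handful of common entities; it is not a replacement for a full HTML entity table.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo: URL decoding versus HTML entity unescaping.
class EntityDemo {

    // Decodes application/x-www-form-urlencoded text such as "a+b%26c".
    static String urlDecode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Unescapes a few common entities; &amp; goes last so "&amp;lt;" becomes "&lt;".
    static String unescapeBasic(String s) {
        return s.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&#39;", "'")
                .replace("&amp;", "&");
    }
}
```

Here urlDecode("Happy+%26+Sad") gives "Happy & Sad", while urlDecode("Happy &amp; Sad") returns its input unchanged, because "&amp;" contains neither "%" nor "+". Only unescapeBasic("Happy &amp; Sad") turns the entity back into "&".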
After a search I found a Translate class within the HTML Parser library.
You can use the class org.apache.commons.lang.StringEscapeUtils:
String s = StringEscapeUtils.unescapeHtml("Happy &amp; Sad");
It is working.
I'm not aware of any way to do it using the standard library. But I do know and use this class that deals with html entities.
"HTMLEntities is an Open Source Java class that contains a collection of static methods (htmlentities, unhtmlentities, ...) to convert special and extended characters into HTML entitities and vice versa."
http://www.tecnick.com/public/code/cp_dpage.php?aiocp_dp=htmlentities
Or you can use unescapeHtml4:
String miCadena="GU&Iacute;A TELEF&Oacute;NICA";
System.out.println(StringEscapeUtils.unescapeHtml4(miCadena));
This code prints the line:
GUÍA TELEFÓNICA
As @jem suggested, it is possible to use jsoup.
With jsoup 1.8.3 it is possible to use the method Parser.unescapeEntities, which retains the original HTML.
import org.jsoup.parser.Parser;
...
String html = Parser.unescapeEntities(original_html, false);
It seems that this method is not present in some earlier releases.
