There are several resources available for training and running the grammatical dependency parser MaltParser; most notable is the project's user guide: http://www.maltparser.org/userguide.html#startusing. Looking at the NLTK code that wraps MaltParser (http://nltk.org/_modules/nltk/parse/malt.html), I can see how to write equivalent Java code that starts MaltParser in a separate child process. What I am looking for, however, is code that clearly and cleanly shows how to integrate MaltParser as a library into a Java program.
To be specific, I want to write Java code that does the following:
1. Train a parsing model.
2. Load a trained model and parse sentences in an online fashion (i.e. stream sentences and use a MaltParser object to parse each one).
To whoever has the knowledge, patience, and willingness: please help me answer 1 and 2!
I found a rudimentary solution to 2. The API section of the user guide (http://www.maltparser.org/userguide.html#api) points to a listing of example files, and I took this snippet out of one of those files:
/**
 * @author Johan Hall
 */
public static void main(String[] args) {
    try {
        MaltParserService service = new MaltParserService();
        // Initialize the parser model 'model0', set the working directory to '.'
        // and set the logging file to 'parser.log'
        service.initializeParserModel("-c model0 -m parse -w . -lfi parser.log");
        // Create an array of tokens containing the Swedish sentence
        // 'Grundavdraget upphör alltså vid en taxerad inkomst på 52500 kr.'
        // in the CoNLL data format.
        String[] tokens = new String[11];
        tokens[0] = "1\tGrundavdraget\t_\tN\tNN\tDD|SS";
        tokens[1] = "2\tupphör\t_\tV\tVV\tPS|SM";
        tokens[2] = "3\talltså\t_\tAB\tAB\tKS";
        tokens[3] = "4\tvid\t_\tPR\tPR\t_";
        tokens[4] = "5\ten\t_\tN\tEN\t_";
        tokens[5] = "6\ttaxerad\t_\tP\tTP\tPA";
        tokens[6] = "7\tinkomst\t_\tN\tNN\t_";
        tokens[7] = "8\tpå\t_\tPR\tPR\t_";
        tokens[8] = "9\t52500\t_\tR\tRO\t_";
        tokens[9] = "10\tkr\t_\tN\tNN\t_";
        tokens[10] = "11\t.\t_\tP\tIP\t_";
        // Parse the Swedish sentence above
        DependencyStructure graph = service.parse(tokens);
        // Output the dependency graph created by MaltParser
        System.out.println(graph);
        // Terminate the parser model
        service.terminateParserModel();
    } catch (MaltChainedException e) {
        System.err.println("MaltParser exception: " + e.getMessage());
    }
}
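That snippet covers the core of 2, but only for one hard-coded sentence. Below is a minimal sketch of the online variant: initialize the model once, call parse() for each incoming sentence, and terminate at the end. The readSentences() helper is a hypothetical stand-in for whatever produces your CoNLL token arrays, and the import paths are the org.maltparser packages as I recall them from MaltParser 1.x; only the three MaltParserService calls shown in the snippet above are taken from the example.

import org.maltparser.MaltParserService;
import org.maltparser.core.exception.MaltChainedException;
import org.maltparser.core.syntaxgraph.DependencyStructure;

public class StreamingParseSketch {

    public static void main(String[] args) {
        try {
            MaltParserService service = new MaltParserService();
            // Load the trained model once, up front.
            service.initializeParserModel("-c model0 -m parse -w . -lfi parser.log");
            // readSentences() is a hypothetical source of sentences,
            // each one already tokenized and tagged in CoNLL format.
            for (String[] conllTokens : readSentences()) {
                DependencyStructure graph = service.parse(conllTokens);
                System.out.println(graph);
            }
            // Release the model once the stream is exhausted.
            service.terminateParserModel();
        } catch (MaltChainedException e) {
            System.err.println("MaltParser exception: " + e.getMessage());
        }
    }

    // Hypothetical placeholder; replace with your own sentence source
    // (a file reader, a queue, a network stream, etc.).
    private static Iterable<String[]> readSentences() {
        return java.util.Collections.emptyList();
    }
}

For 1 (training), MaltParser's -m learn mode (e.g. -c model0 -i train.conll -m learn) produces the .mco model file that initializeParserModel later loads.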
I am trying to convert a .bib file that can contain any number of articles into an ACM file with the same number of ACM-style citations. I've written a bit of code but would like help with the rest.
The .bib file is formatted this way:
@ARTICLE{
8249726,
author={N. Khlif and A. Masmoudi and F. Kammoun and N. Masmoudi},
journal={IET Image Processing},
title={Secure chaotic dual encryption scheme for H.264/AVC video conferencing protection},
number={1},
year={2018},
volume={12},
pages={42-52},
keywords={adaptive codes;chaotic communication;cryptography;data compression;data protection;variable length codes;video coding;H.264/AVC video conferencing protection;advanced video coding protection;chaos-based crypto-compression scheme;compression ratio;context adaptive variable length coding;decision module;format compliance;inter-prediction encryption;intra-prediction encryption;piecewise linear chaotic maps;pseudorandom bit generators;secure chaotic dual encryption scheme;selective encryption approach;video compression standards},
doi={10.1049/iet-ipr.2017.0022},
ISSN={1751-9659},
month={Dec},
}
@ARTICLE{
8093611,
author={W. Wu and H. Mao and Y. Wang and J. Wang and W. Wang and C. Tian},
journal={IEEE Access},
title={CoolConferencing: Enabling Robust Peer-to-Peer Multi-Party Video Conferencing},
year={2017},
pages={25474-25486},
number={2},
volume={5},
keywords={Internet;peer-to-peer computing;teleconferencing;video communication;CoolConferencing design;MPVC approach;MPVC platform;any-view support;multirate support;optimal video transmission performance;overlay network;realistic network environments;resilient data-driven principle;robust MPVC system;robust peer-to-peer multiparty video conferencing;robust system;state-of-the-art video;Bandwidth;Delays;Internet;Peer-to-peer computing;Receivers;Robustness;Streaming media;Computer networks;peer to peer computing;streaming media},
doi={10.1109/ACCESS.2017.2768798},
ISSN={1751-9659},
month={Dec},
}
Note that the .bib file can have any number of articles (two in this case, so the .acm file should contain two ACM citations). Also, the fields of an article do not appear in any fixed line order.
I can't use any library that does the conversion automatically.
Here is the code I have so far. It reads each line of one .bib file and prints everything found between { and }; now I need to store each piece of information and then write a method that returns it in ACM format.
public static void main(String[] args) {
    // Matches the value between '={' and the next '}'
    Pattern pattern = Pattern.compile("=\\{([^}]*)");
    try {
        File myFile = new File("Latex1.bib");
        Scanner reader = new Scanner(myFile);
        while (reader.hasNextLine()) {
            Matcher matcher = pattern.matcher(reader.nextLine());
            if (matcher.find()) {
                System.out.println(matcher.group(1));
            }
        }
        reader.close();
    } catch (FileNotFoundException e) {
        System.err.println(e.getMessage());
    }
}
Could you please complete it?
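A minimal sketch of one way to finish this (not the asker's code): collect key/value pairs into a map while scanning, start a new map whenever an @ARTICLE line appears, and format each map afterwards. The formatACM() helper and the exact citation layout are illustrative assumptions, not a definitive ACM template.

import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BibToAcmSketch {

    // Captures the field name and its value from lines like: author={...},
    private static final Pattern FIELD = Pattern.compile("\\s*(\\w+)\\s*=\\s*\\{([^}]*)\\}");

    public static void main(String[] args) throws FileNotFoundException {
        List<Map<String, String>> articles = new ArrayList<>();
        Map<String, String> current = null;
        try (Scanner reader = new Scanner(new File("Latex1.bib"))) {
            while (reader.hasNextLine()) {
                String line = reader.nextLine();
                if (line.trim().toUpperCase().startsWith("@ARTICLE")) {
                    current = new HashMap<>();   // a new entry begins
                    articles.add(current);
                } else if (current != null) {
                    Matcher m = FIELD.matcher(line);
                    if (m.find()) {
                        current.put(m.group(1).toLowerCase(), m.group(2));
                    }
                }
            }
        }
        for (Map<String, String> a : articles) {
            System.out.println(formatACM(a));
        }
    }

    // Illustrative layout only; adjust to the exact ACM reference style required.
    private static String formatACM(Map<String, String> a) {
        return a.getOrDefault("author", "") + ". "
                + a.getOrDefault("year", "") + ". "
                + a.getOrDefault("title", "") + ". "
                + a.getOrDefault("journal", "") + " "
                + a.getOrDefault("volume", "") + ", "
                + a.getOrDefault("number", "")
                + " (" + a.getOrDefault("month", "") + " " + a.getOrDefault("year", "") + "), "
                + a.getOrDefault("pages", "") + ". DOI: "
                + a.getOrDefault("doi", "");
    }
}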
I need to figure out a way to load the contents of a file containing a list of ids in a preprocessing step in JMeter. This needs to happen only once, not once for every request. So the flow should be:
Load the list of static ids from the file once.
For every request, pick one id randomly from this list.
POST the request.
I have been exploring the JSR223 PreProcessor but without much luck so far. I am also not sure whether the preprocessor executes for every request, which I do not want.
My current JSR223 PreProcessor looks something like this:
import java.util.*;
import java.io.*;

try {
    Random generator = new Random();
    List<String> uuids = new ArrayList<String>();
    // Read every uuid from the file into the list
    try (BufferedReader br = new BufferedReader(new FileReader("/uuids.txt"))) {
        String line = br.readLine();
        while (line != null) {
            uuids.add(line);
            line = br.readLine();
        }
    }
    // Pick one uuid at random and expose it as a JMeter variable
    int rn = generator.nextInt(uuids.size());
    vars.put("some_file", "/files/" + uuids.get(rn) + ".json.gz");
} catch (Throwable ex) {
    log.error("Something went wrong", ex);
    throw ex;
}
Your approach is a little bit wrong because:
JSR223 PreProcessor is executed before each request in its scope
JSR223 PreProcessor is executed by each thread (virtual user)
So I would recommend the following enhancement:
Add setUp Thread Group to your test plan
Add JSR223 Sampler to it with the following code:
SampleResult.setIgnore()
props.put('uuids', new File('uuids.txt').readLines())
This will let you read the file only once and with only one thread.
Whenever you want to access a random uuid you can use the following __groovy() function:
${__groovy(props.get('uuids').get(org.apache.commons.lang3.RandomUtils.nextInt(0\,props.get('uuids').size())),)}
More information on Groovy scripting in JMeter: Apache Groovy - Why and How You Should Use It
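As a usage illustration (the ".json.gz" suffix below simply mirrors the variable built in the original PreProcessor), the function can be dropped straight into the HTTP Request sampler, e.g. in its Path field:

/files/${__groovy(props.get('uuids').get(org.apache.commons.lang3.RandomUtils.nextInt(0\,props.get('uuids').size())),)}.json.gz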
You can instead use JMeter's bzm - Random CSV Data Set Config plugin: just give it the CSV file name and it will pick a random uuid every time.
I am currently working on a word search program for my Java class, and I thought I would do some research on other similar programs before I get started. (See how they work before I write my own version.) I have stumbled across this program:
“Java Word Search Solver” by tmck-code via Code Review, Feb 2015.
It looks very well written, but I cannot figure out how to input my file name for the puzzle and word-list methods.
Example:
/**
 * A method that loads a dictionary text file into a list
 * @param filename The dictionary file to load
 * @return The list containing the dictionary words
 */
private static ArrayList<String> loadDict(String filename) {
    ArrayList<String> dict = new ArrayList<String>();
    try {
        BufferedReader in = new BufferedReader(new FileReader(filename));
        String word;
        while ((word = in.readLine()) != null) {
            dict.add(word);
        }
        in.close();
    } catch (IOException e) {
        System.err.println("A file error occurred: " + filename);
        System.exit(1);
    }
    return dict;
}
Where in this selection of code do I put my file name (wordlist.txt)?
You do not put the file name anywhere in the code. I read a little through the code in your link; the path of your file gets passed to the program when you call it.
If you call it via a console (e.g. cmd.exe), it should look something like this:
C:\Users\yourName> java WordSearch.java "/path/to/your/file.txt"
You do not put any of that information in the code itself; the program in your link just uses the console arguments.
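To make the hand-off concrete, here is a minimal sketch (not the linked project's actual main method) of how a command-line argument typically reaches a method like loadDict:

import java.util.ArrayList;

public class WordSearch {

    public static void main(String[] args) {
        if (args.length < 1) {
            System.err.println("Usage: java WordSearch <wordlist.txt>");
            System.exit(1);
        }
        // args[0] is whatever path you typed after the class name on the command line
        ArrayList<String> dict = loadDict(args[0]);
        System.out.println("Loaded " + dict.size() + " words.");
    }

    // The loadDict(...) method shown in the question goes here.
    private static ArrayList<String> loadDict(String filename) {
        // ... (see the method above)
        return new ArrayList<String>();
    }
}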
I am attempting to do what I would have guessed would be pretty simple, but as it turns out is not. I have an ACR122 NFC reader and a bunch of Mifare Classic and Mifare Ultralight tags, and all I want to do is read and write a mime-type and a short text string to each card from a Java application. Here's what I've got working so far:
I can connect to my reader and listen for tags
I can detect which type of tag is on the reader
On the Mifare Classic tags I can loop through all of the data on the tag (after programming the tag from my phone) and build an ASCII string, but most of the data is "junk" data.
I can determine whether or not there is an Application directory on the tag.
Here's my code for doing that:
Main:
public static void main(String[] args) {
    TerminalFactory factory = TerminalFactory.getDefault();
    List<CardTerminal> terminals;
    try {
        TerminalHandler handler = new TerminalHandler();
        terminals = factory.terminals().list();
        CardTerminal cardTerminal = terminals.get(0);
        AcsTerminal terminal = new AcsTerminal();
        terminal.setCardTerminal(cardTerminal);
        handler.addTerminal(terminal);
        NfcAdapter adapter = new NfcAdapter(handler.getAvailableTerminal(), TerminalMode.INITIATOR);
        adapter.registerTagListener(new CustomNDEFListener());
        adapter.startListening();
        System.in.read();
        adapter.stopListening();
    } catch (IOException e) {
        System.err.println("IOException: " + e.getMessage());
    } catch (CardException e) {
        System.out.println("CardException: " + e.getMessage());
    }
}
CustomNDEFListener:
public class CustomNDEFListener extends AbstractCardTool {

    @Override
    public void doWithReaderWriter(MfClassicReaderWriter readerWriter) throws IOException {
        NdefMessageDecoder decoder = NdefContext.getNdefMessageDecoder();
        MadKeyConfig config = MfConstants.NDEF_KEY_CONFIG;
        if (readerWriter.hasApplicationDirectory()) {
            System.out.println("Application Directory Found!");
            ApplicationDirectory directory = readerWriter.getApplicationDirectory();
        } else {
            System.out.println("No Application Directory Found, creating one.");
            readerWriter.createApplicationDirectory(config);
        }
    }
}
From here, I seem to be at a loss as to how to actually create and interact with an application. Once I can create the application and write Record objects to it, I should be able to write the data I need using the TextMimeRecord type; I just don't know how to get there. Any thoughts?
::Addendum::
Apparently there is no nfc-tools tag, and there probably should be. Would someone with enough rep be kind enough to create one and retag my question to include it?
::Second Addendum::
Also, I am willing to ditch NFC-Tools if someone can point me in the direction of a library that works for what I need, is well documented, and will run in a Windows environment.
Did you check this library? It is well written; however, it has poor documentation, actually nothing more than the JavaDoc.
I have been trying to use the Stanford Parser in my Java program to parse some sentences in Chinese. Since I am quite new to both Java and the Stanford Parser, I used 'ParserDemo.java' to practice. The code works fine with sentences in English and outputs the right result. However, when I changed the model to 'chinesePCFG.ser.gz' and tried to parse some segmented Chinese sentences, things went wrong.
Here is my Java code:
class ParserDemo {

    public static void main(String[] args) {
        LexicalizedParser lp = LexicalizedParser.loadModel("edu/stanford/nlp/models/lexparser/chinesePCFG.ser.gz");
        if (args.length > 0) {
            demoDP(lp, args[0]);
        } else {
            demoAPI(lp);
        }
    }

    public static void demoDP(LexicalizedParser lp, String filename) {
        // This option shows loading, sentence-segmenting and tokenizing
        // a file using DocumentPreprocessor
        TreebankLanguagePack tlp = new PennTreebankLanguagePack();
        GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
        // You could also create a tokenizer here (as below) and pass it
        // to DocumentPreprocessor
        for (List<HasWord> sentence : new DocumentPreprocessor(filename)) {
            Tree parse = lp.apply(sentence);
            parse.pennPrint();
            System.out.println();
            GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
            Collection tdl = gs.typedDependenciesCCprocessed(true);
            System.out.println(tdl);
            System.out.println();
        }
    }

    public static void demoAPI(LexicalizedParser lp) {
        // This option shows parsing a list of correctly tokenized words
        String[] sent = { "我", "是", "一名", "学生" };
        List<CoreLabel> rawWords = Sentence.toCoreLabelList(sent);
        Tree parse = lp.apply(rawWords);
        parse.pennPrint();
        System.out.println();
        TreebankLanguagePack tlp = new PennTreebankLanguagePack();
        GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
        GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
        List<TypedDependency> tdl = gs.typedDependenciesCCprocessed();
        System.out.println(tdl);
        System.out.println();
        TreePrint tp = new TreePrint("penn,typedDependenciesCollapsed");
        tp.printTree(parse);
    }

    private ParserDemo() {} // static methods only
}
It's basically the same as ParserDemo.java, but when I run it I get the following result:
Loading parser from serialized file edu/stanford/nlp/models/lexparser/chinesePCFG.ser.gz ... done [2.2 sec].
(ROOT
  (IP
    (NP (PN 我))
    (VP (VC 是)
      (NP
        (QP (CD 一名))
        (NP (NN 学生))))))
Exception in thread "main" java.lang.RuntimeException: Failed to invoke public edu.stanford.nlp.trees.EnglishGrammaticalStructure(edu.stanford.nlp.trees.Tree)
    at edu.stanford.nlp.trees.GrammaticalStructureFactory.newGrammaticalStructure(GrammaticalStructureFactory.java:104)
    at parserdemo.ParserDemo.demoAPI(ParserDemo.java:65)
    at parserdemo.ParserDemo.main(ParserDemo.java:23)
The code on line 65 is:
GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
My guess is that chinesePCFG.ser.gz is missing something related to 'edu.stanford.nlp.trees.EnglishGrammaticalStructure'. Since the parser handles Chinese correctly from the command line, there must be something wrong with my own code. I have been searching, but found only a few similar cases, some of which mentioned using the right model; I don't really know how to modify my code to use the 'right model', though. I hope someone can help me with this. I am a newbie with both Java and the Stanford Parser, so please be specific. Thank you!
The problem is that the GrammaticalStructureFactory is constructed from a PennTreebankLanguagePack, which is for the English Penn Treebank. You need to use (in two places)
TreebankLanguagePack tlp = new ChineseTreebankLanguagePack();
and to import this appropriately
import edu.stanford.nlp.trees.international.pennchinese.ChineseTreebankLanguagePack;
But we also generally recommend using the factored parser for Chinese (since it works considerably better, unlike for English, although at the cost of more memory and time usage):
LexicalizedParser lp = LexicalizedParser.loadModel("edu/stanford/nlp/models/lexparser/chineseFactored.ser.gz");
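Putting the two changes together, the demoAPI method from the question would look roughly like this (a sketch assembled only from the question's code plus the substitutions above, not official Stanford demo code):

import edu.stanford.nlp.trees.international.pennchinese.ChineseTreebankLanguagePack;

public static void demoAPI(LexicalizedParser lp) {
    // Parse a list of correctly segmented Chinese words
    String[] sent = { "我", "是", "一名", "学生" };
    List<CoreLabel> rawWords = Sentence.toCoreLabelList(sent);
    Tree parse = lp.apply(rawWords);
    parse.pennPrint();
    System.out.println();
    // Use the Chinese language pack instead of the English Penn Treebank one
    TreebankLanguagePack tlp = new ChineseTreebankLanguagePack();
    GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
    GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
    List<TypedDependency> tdl = gs.typedDependenciesCCprocessed();
    System.out.println(tdl);
    System.out.println();
}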