I am trying to integrate the Java-based NLP tool "RFTagger" into a Processing sketch in order to analyze tweets.
Using Twitter4j as described here http://blog.blprnt.com/blog/blprnt/updated-quick-tutorial-processing-twitter
Using RFTagger to analyze Tweets: http://sifnos.sfs.uni-tuebingen.de/resource/A4/rftj/
After I filtered out all retweets, hashtags and profile names in order to have clear sentences to work with, the words of one sentence are stored in an ArrayList:
ArrayList<String> sentsTweet = new ArrayList<String>();
Now I'd like to have the sentence analyzed by RFTagger. I call the library as described on the RFTagger website:
List <String> tags = rft.getTags(sentsTweet);
Unfortunately, within Processing the class List seems to be unknown or unavailable. Error message: Cannot find a class or type named "List"
I know I could transform the data into some other, manageable format. Like this:
Object[] tags = (rft.getTags(sentsTweet)).toArray();
But I need to keep the data as a List in order to pass it to RFTagger a second time and use its tagset converter:
TagsetConverter conv = ConverterFactory.getConverter("stts");
List<String> sttsTags = new LinkedList<String>();
for ( String tag : tags ) {
sttsTags.add(conv.rftag2tag(tag));
}
Since List<String> doesn't work in Processing, do you have an idea how I could handle the data and/or the communication with RFTagger?
Kind regards,
Marv
This has nothing to do with the Processing library.
RFTagger.getTags() returns a java.util.List, which is part of the JDK and JRE. You need to add the import for the List class:
import java.util.List;
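A minimal sketch of the fix in plain Java, which makes no assumptions about RFTagger itself: tagWords and convertTag below are hypothetical stand-ins for rft.getTags(...) and conv.rftag2tag(...), used only so the example runs without the library. The point is the imports at the top. Note that LinkedList, which the converter loop in the question uses, needs its own import as well.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

public class ListImportDemo {

    // Hypothetical stand-in for rft.getTags(...): one dummy tag per word.
    public static List<String> tagWords(List<String> words) {
        List<String> tags = new ArrayList<String>();
        for (String word : words) {
            tags.add("TAG." + word.length());
        }
        return tags;
    }

    // Hypothetical stand-in for conv.rftag2tag(...): keeps the part before the first dot.
    public static String convertTag(String tag) {
        int dot = tag.indexOf('.');
        return dot < 0 ? tag : tag.substring(0, dot);
    }

    // Same loop shape as in the question; it compiles once List and LinkedList are imported.
    public static List<String> convertAll(List<String> tags) {
        List<String> converted = new LinkedList<String>();
        for (String tag : tags) {
            converted.add(convertTag(tag));
        }
        return converted;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Das", "ist", "gut");
        List<String> tags = tagWords(words);
        System.out.println(tags);             // prints [TAG.3, TAG.3, TAG.3]
        System.out.println(convertAll(tags)); // prints [TAG, TAG, TAG]
    }
}
```

In a Processing sketch the import lines would go at the very top of the sketch, next to the other library imports.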
I have a class Person that holds a set of Contact objects. I want to get a stream of Contacts from a stream of Persons.
public class Person {
    private Set<Contact> contacts;

    public Set<Contact> getContacts() {
        return contacts;
    }
}
persons.stream().map(Person::getContacts);
gives me a Stream<Set<Contact>> rather than a Stream<Contact>.
Any suggestion or help would be appreciated as I am quite new to Java 8 and Streams.
You may try this:
Stream<Contact> contacts = persons.stream().flatMap(p -> p.getContacts().stream());
or that:
Stream<Contact> contacts = persons.stream().map(Person::getContacts).flatMap(Set::stream);
Check this excellent thread so that you may understand the difference between map and flatMap.
You can achieve this by using Stream#flatMap instead of Stream#map. The JavaDoc shows an example of flattening a list of lines from a file to a list of words within each line. You can adapt the same technique to your domain model of Person and Contact.
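As a runnable illustration of the flatMap approach, here is a small self-contained sketch. The Person and Contact classes are simplified stand-ins for the ones in the question (the name field and the constructors are assumptions added so the example has something to print).

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class FlatMapDemo {

    // Simplified stand-in for the question's Contact class.
    public static class Contact {
        private final String name;
        public Contact(String name) { this.name = name; }
        public String getName() { return name; }
    }

    // Simplified stand-in for the question's Person class.
    public static class Person {
        private final Set<Contact> contacts;
        public Person(Contact... contacts) {
            // LinkedHashSet keeps insertion order, so the output is predictable.
            this.contacts = new LinkedHashSet<>(Arrays.asList(contacts));
        }
        public Set<Contact> getContacts() { return contacts; }
    }

    // flatMap turns each person's Set<Contact> into a stream and concatenates them,
    // yielding a Stream<Contact> instead of a Stream<Set<Contact>>.
    public static List<String> contactNames(List<Person> persons) {
        return persons.stream()
                .flatMap(p -> p.getContacts().stream())
                .map(Contact::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person(new Contact("Ann"), new Contact("Bob")),
                new Person(), // a person without contacts contributes nothing
                new Person(new Contact("Cy")));
        System.out.println(contactNames(persons)); // prints [Ann, Bob, Cy]
    }
}
```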
How can I filter a list/collection of streams based on URL parameters? For example:
?filter=(type=="audio"&&systemBitrate<100000)||(type=="video"&&systemBitrate<1024000)
I know this can be done statically:
// equals(), not ==, compares String contents
List<StreamItem> results = streamList.stream().filter(s -> "audio".equals(s.type) && s.systemBitrate < 100000).collect(Collectors.toList());
Simple object:
public class StreamItem {
String name;
String type;
int systemBitrate;
}
The idea is to dynamically filter a playback manifest in a similar way to the one below and play only the selected tracks:
curl -v 'http://demo.unified-streaming.com/video/tears-of-steel/tears-of-steel.ism/Manifest?filter=(type=="audio"%26%26systemBitrate<100000)||(type=="video"%26%26systemBitrate<1024000)'
One way to do it is to use one of the "Expression Language" libraries,
to compile your filter expression and then apply it to the elements of your stream.
Below is a short example using MVEL:
pom.xml
<dependency>
<groupId>org.mvel</groupId>
<artifactId>mvel2</artifactId>
<version>2.3.1.Final</version>
</dependency>
java code
Serializable expr = MVEL.compileExpression(
"(type==\"audio\"&&systemBitrate<100000)||(type==\"video\"&&systemBitrate<1024000)"
);
Arrays.asList(
new StreamItem("audio",10000),
new StreamItem("audio",200000),
new StreamItem("video",200000),
new StreamItem("video",2000000)
)
.stream()
.filter(e->MVEL.executeExpression(expr,e,boolean.class))
.forEach(System.out::println);
output
Element{type='audio', systemBitrate=10000}
Element{type='video', systemBitrate=200000}
Please note that your StreamItem class must have getters defined for both the type and systemBitrate properties for MVEL to be able to resolve them.
Don't expect this to be blazing fast, but it should still be fast enough for most practical tasks, given that the expression is compiled before use.
Filtering a list of 1000000 (one million) StreamItems, using the expression above, takes ~150ms on my laptop, on average.
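For comparison, when the set of filterable fields is small and known in advance, the same filter can be assembled from composed java.util.function.Predicate objects without an expression library. This is a sketch of a different technique than the MVEL approach above. It does not parse the filter URL parameter, and the typeBelow helper and the three-argument StreamItem constructor are assumptions added for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class PredicateFilterDemo {

    // The question's StreamItem, with a constructor added for the example.
    public static class StreamItem {
        final String name;
        final String type;
        final int systemBitrate;

        public StreamItem(String name, String type, int systemBitrate) {
            this.name = name;
            this.type = type;
            this.systemBitrate = systemBitrate;
        }
    }

    // Hypothetical helper: matches items of the given type below a bitrate limit.
    public static Predicate<StreamItem> typeBelow(String type, int limit) {
        return s -> type.equals(s.type) && s.systemBitrate < limit;
    }

    // Applies any predicate to the list and returns the names of the matches.
    public static List<String> filterNames(List<StreamItem> items, Predicate<StreamItem> filter) {
        return items.stream()
                .filter(filter)
                .map(s -> s.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // (type=="audio"&&systemBitrate<100000)||(type=="video"&&systemBitrate<1024000)
        Predicate<StreamItem> filter =
                typeBelow("audio", 100000).or(typeBelow("video", 1024000));

        List<StreamItem> items = Arrays.asList(
                new StreamItem("a1", "audio", 10000),
                new StreamItem("a2", "audio", 200000),
                new StreamItem("v1", "video", 200000),
                new StreamItem("v2", "video", 2000000));

        System.out.println(filterNames(items, filter)); // prints [a1, v1]
    }
}
```

The trade-off is flexibility: predicates built this way only cover conditions you have written code for, whereas an expression language accepts arbitrary expressions from the request.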
protected static void attSelection_w(Instances data) throws Exception {
AttributeSelection fs = new AttributeSelection();
WrapperSubsetEval wrapper = new WrapperSubsetEval();
wrapper.buildEvaluator(data);
wrapper.setClassifier(new RandomForest());
wrapper.setFolds(10);
wrapper.setThreshold(0.001);
fs.SelectAttributes(data);
fs.setEvaluator(wrapper);
fs.setSearch(new BestFirst());
System.out.println(fs.toResultsString());
}
Above is my code for wrapper-based attribute selection using random forest + best-first search. However, this somehow produces a result using CFS, like below.
Search Method:
Greedy Stepwise (forwards).
Start set: no attributes
Merit of best subset found: 0.287
Attribute Subset Evaluator (supervised, Class (nominal): 9 class):
CFS Subset Evaluator
Including locally predictive attributes
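There is no other code using CFS in the whole class, and I'm pretty much stuck. I would appreciate any help. Thanks!
You just inverted the order, so you got the defaults: SelectAttributes ran before the evaluator and search were set, which is why the output shows the CFS subset evaluator with greedy stepwise search. The correct order is to set the parameters first, then call the selection: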
//first
fs.setEvaluator(wrapper);
fs.setSearch(new BestFirst());
//then
fs.SelectAttributes(data);
Just set the class index by adding this line after creating the Instances object data:
data.setClassIndex(data.numAttributes() - 1);
I checked and it worked fine.
This is a continuation of my work on a gradebook program. I have been posting my questions related to JSON and connecting two applications to StackOverflow because I've been having a really difficult time with that part.
I have been attempting to create an HTTP POST request that uses JSON for the purpose of sending information from a Java gradebook application to a Rails web-based application that displays those grades in the form of a report to students.
Ultimately, I want to send more than just one student's information. Furthermore, each student might have anywhere from 0 to 50 assignments, descriptions of the assignments, as well as grades for those assignments. On top of that there will be multiple classes/courses of students. All this information needs to be "read in" to the JSON object. Does anyone have any suggestions about how I could modify this code so that I could send all that data?
The farthest that I was able to take the JSON-related part of code is shown below. However, that code needs to be modified as the following questions suggest.
1. How do I create the array of JSON objects dynamically rather than how it is shown below (since the courses, students, and grades will vary and be read in from the Java program)?
2. How do I synthesize/combine the three JSON arrays of objects below to make it work? My idea is that I write the array of course objects then somehow embed the array of student objects as part of each course object, then somehow embed the array of grade objects as part of each student object.
{"JSONArrayOfCourseObjects" : [{"courseID" : "Botany101FallSemester", "courseInstructor" :
"Mr. Smith"}, {"courseID" : "Physics101FallSemester", "courseInstructor" : "Mrs. Newton"},
etc.]}
{"JSONArrayOfStudentObjects" : [{"Name" : "John Doe", "StudentID" : "12345678", "Address" :
"1234 Main Street"}, {"Name" : "Don Corleone", "StudentID" : "87654321", "Address" :
"121 Walberry Ave"}, etc.]}
{"JSONArrayOfGradeObjects" : [{"nameOfAssignment" : "Irrigation Homework 1",
"dateOfAssignment" : "Sept 1, 2014", "categoryOfAssignment" : "Homework"},
{"nameOfAssignment" : "Test 1", "dateOfAssignment" : "Sept 14, 2014", "categoryOfAssignment" :
"Test"}, etc.]}
Json-lib is the simplest Java API out there for generating quick and dirty JSON. It has everything you need to build up the object and convert it to text. If you need something more powerful, there's Gson and Jackson.
Here is a sample. This example is in Groovy, so it can't be copied and pasted into Java, but it shows you how to use it:
def array = new JSONArray()
new File("/path/to/grades/files").eachFile { file ->
    // Parse each file's text into a JSONObject and append it to the array
    String rawJson = file.text
    JSONObject obj = (JSONObject) JSONSerializer.toJSON(rawJson)
    array = array.element(obj)
}
println array.toString(5) // Use 5-character indentation
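To make the nesting idea from the question concrete (courses containing students, students containing grades), here is a minimal Java sketch that uses only the standard library. The toJson method is a deliberately small hand-rolled serializer written for this example, not part of Json-lib, Gson or Jackson, and the "students" and "grades" key names are assumptions. In a real application one of those libraries would normally do the serialization.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NestedJsonDemo {

    // Minimal serializer for Map, List, String, Number and Boolean values.
    // It escapes only backslashes and double quotes, which is enough for this sketch.
    public static String toJson(Object value) {
        if (value instanceof Map) {
            StringBuilder sb = new StringBuilder("{");
            boolean first = true;
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                if (!first) sb.append(',');
                sb.append(quote(String.valueOf(e.getKey()))).append(':').append(toJson(e.getValue()));
                first = false;
            }
            return sb.append('}').toString();
        }
        if (value instanceof List) {
            StringBuilder sb = new StringBuilder("[");
            boolean first = true;
            for (Object item : (List<?>) value) {
                if (!first) sb.append(',');
                sb.append(toJson(item));
                first = false;
            }
            return sb.append(']').toString();
        }
        if (value instanceof String) return quote((String) value);
        return String.valueOf(value); // numbers, booleans, null
    }

    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static Map<String, Object> grade(String name, String date, String category) {
        Map<String, Object> g = new LinkedHashMap<>();
        g.put("nameOfAssignment", name);
        g.put("dateOfAssignment", date);
        g.put("categoryOfAssignment", category);
        return g;
    }

    // Each student object embeds its own list of grade objects.
    public static Map<String, Object> student(String name, String id, List<Map<String, Object>> grades) {
        Map<String, Object> s = new LinkedHashMap<>();
        s.put("Name", name);
        s.put("StudentID", id);
        s.put("grades", grades);
        return s;
    }

    // Each course object embeds its own list of student objects.
    public static Map<String, Object> course(String id, String instructor, List<Map<String, Object>> students) {
        Map<String, Object> c = new LinkedHashMap<>();
        c.put("courseID", id);
        c.put("courseInstructor", instructor);
        c.put("students", students);
        return c;
    }

    public static void main(String[] args) {
        // In the gradebook these lists would be filled in loops over the program's own data.
        List<Map<String, Object>> grades = new ArrayList<>();
        grades.add(grade("Irrigation Homework 1", "Sept 1, 2014", "Homework"));
        grades.add(grade("Test 1", "Sept 14, 2014", "Test"));

        List<Map<String, Object>> students = new ArrayList<>();
        students.add(student("John Doe", "12345678", grades));

        List<Map<String, Object>> courses = new ArrayList<>();
        courses.add(course("Botany101FallSemester", "Mr. Smith", students));

        System.out.println(toJson(courses));
    }
}
```

Because the lists are ordinary Java collections, a student with 0 or 50 assignments is handled the same way: the loop simply adds as many grade objects as there are.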
I am absolutely new to Java development.
Can someone please explain how to obtain "grammatical relations" using the Stanford Natural Language Processing Lexical Parser (open-source Java code)?
Thanks!
See line 88 of the first file in my code for how to run the Stanford Parser programmatically:
GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
Collection<TypedDependency> tdl = gs.typedDependenciesCollapsed();
System.out.println("words: "+words);
System.out.println("POStags: "+tags);
System.out.println("stemmedWordsAndTags: "+stems);
System.out.println("typedDependencies: "+tdl);
The collection tdl is a list of these typed dependencies. If you look at the Javadoc for TypedDependency, you'll see that calling the .reln() method gets you the grammatical relation.
Lines 311-318 of the third file in my code show how to use that list of typed dependencies. I happen to get the name of the relation, but you could get the relation itself, which would be of the class GrammaticalRelation.
for (Iterator<TypedDependency> iter = tdl.iterator(); iter.hasNext(); ) {
    TypedDependency var = iter.next();
    TreeGraphNode dep = var.dep();
    TreeGraphNode gov = var.gov();
    // All useful information for a node in the tree
    String reln = var.reln().getShortName();
}
Don't feel bad, I spent a miserable day or two trying to figure out how to use the parser. I don't know if the docs have improved, but when I used it they were pretty damn awful.