I am trying to do sentiment analysis and project the values on Google Visualization.
I am calling this python script using my java program
Code Snippet (for AlchemyAPI)
https://github.com/AlchemyAPI/alchemyapi-twitter-python
I wrote a java program to call the python script.
import java.io.*;
public class twitmain {
public String twittersentiment(String[] args) throws IOException {
// set up the command and parameter
String pythonScriptPath = "/twitter/analyze.py"; // I'm calling AlchemyAPI
String[] cmd = new String[2 + args.length];
cmd[0] = "C:\\Python27\\python.exe";
cmd[1] = pythonScriptPath;
for (int i = 0; i < args.length; i++) {
cmd[i + 2] = args[i];
}
// create runtime to execute external command
Runtime rt = Runtime.getRuntime();
Process pr = rt.exec(cmd);
// retrieve output from python script
BufferedReader bfr = new BufferedReader(new InputStreamReader(
pr.getInputStream()));
String line = ""; int i=0;
while ((line = bfr.readLine()) != null) {
System.out.println(line);
}
return line;
}
}
Output:
I am getting tweets and final statistics as follows:
##########################################################
# The Tweets #
##########################################################
#uDiZnoGouD
Date: Mon Apr 07 05:07:19 +0000 2014
To enjoy in case you win!
To help you sulk in case you loose!
#IndiavsSriLanka #T20final http://t.co/hRAsIa19zD
Document Sentiment: positive (Score: 0.261738)
##########################################################
# The Stats #
##########################################################
Document-Level Sentiment:
Positive: 3 (60.00%)
Negative: 1 (20.00%)
Neutral: 1 (20.00%)
Total: 5 (100.00%)
Problem (Question):
How do I just scrape Positive, Negitive, Neutral and send it to Google Visualization? (I.e make a JSON?)
Any help would be really appreciated.
Shoot, I just realized you were asking the other way around. To write the parsing app in Java.
Anyway the idea would be the same, but the language would differ.
But that would also mean that you have access to the sources of the python app, so you can maybe dig around there, and you could dump the result object into the console as a JSON object.
Original answer in python:
You should identify the types of lines and parse them and construct the JSON object yourself.
Like for every line:
import re
json_obj = {}
pattern = "^(\w+): (\d) \((\d{2}\.\d{2}%)\)$"
match = re.match(pattern, line)
if match:
prop_obj = { "value": match[2], "percent": match[3] }
json_obj[match[1]] = prop_obj
This would transform the line:
Positive: 3 (60.00%)
into
{
Positive: {
value: "3"
percent: "60.00%"
}
}
Taking this idea further, the parsing rules should be a dictionary of pattern-extractor_methods as key-values
var parse_rules = {
"^(\w+): (\d) \((\d{2}\.\d{2}%)\)$":
def (matches):
return { match[1]: { "value": match[2], "percent": match[3] }}
, ...
}
And for each line you would test against the parse rules and execute the methods if a match is found, and the result of the method is merged in the JSON result object
This is a lot of work (depending on the complexity of the java app, but I'd go this way if the Java app cannot be modified.
Regex explanation & example
Related
I am having a problem when trying to use the Z3Str3 from z3-4.8.9-x64-ubuntu-16.04 notably if I substitute the com.microsoft.z3.jar to the one in z3-4.8.8-x64-ubuntu-16.04 I no longer have that issue. The problem is that the Z3 process never comes back with a result, despite the simplicity of the query. I noticed though that it returns the valid answer when I kill my program. I am not noticing that behavior when I am trying to run the same query on the executable, so I am guessing there is something about using the jar file that I might need to tweak one way or the other.
Here is my code. I am using Ubuntu 16.04 LTS, and IntelliJ version ultimate 2020.3.
Many thanks!
import com.microsoft.z3.*;
public class Z3String3Processor_reduced {
public static void main(String[] args) {
StringBuilder currentQuery = new StringBuilder("\n" +
"(declare-const string0 String)\n" +
"(assert (= (str.indexof string0 \"a\" 1) 6))\n" +
"(check-sat)\n" +
"(get-model)\n" +
"\n");
Context context1 = new Context();
Solver solver1 = context1.mkSolver();
Params params = context1.mkParams();
params.add("smt.string_solver", "z3str3");
solver1.setParameters(params);
StringBuilder finalQuery = new StringBuilder(currentQuery.toString());
// attempt to parse the query, if successful continue with checking satisfiability
try {
// throws z3 exception if malformed or unknown constant/operation
BoolExpr[] assertions = context1.parseSMTLIB2String(finalQuery.toString(), null, null, null, null);
solver1.add(assertions);
// check sat, if so we can go ahead and get the model....
if (solver1.check() == Status.SATISFIABLE) {
System.out.println("sat");
} else
System.out.println("not sat");
context1.close();
} catch (Z3Exception e) {
System.out.println("Z3 exception: " + e.getMessage());
}
}
}
I don't think this has anything to do with Java. Let's extract your query and put it in a file named a.smt2:
$ cat a.smt2
(declare-const string0 String)
(assert (= (str.indexof string0 "a" 1) 6))
(check-sat)
(get-model)
Now, if I run:
$ z3 a.smt2
sat
(
(define-fun string0 () String
"FBCADEaGaa")
)
That's good. But if I run:
$ z3 smt.string_solver=z3str3 a.smt2
... does not terminate ..
So, bottom line, your query (as simple as it looks), gives hard time to the z3str3 solver.
I see that you already reported this as a bug at https://github.com/Z3Prover/z3/issues/5673
Given that the default string-solver can handle the query just fine, why not just use that one? If you have to use z3str3 for some other reason, then you've found a case where it doesn't handle this query well; I'm not sure how inclined the z3 folks will be to fix this given the query is handled by the default solver rather quickly. Please report what you find out!
I am trying to export repeat grid data to excel. To do this, I have provided a button which runs "MyCustomActivity" activity via clicking. The button is placed above the grid in the same layout. It also worth pointing out that I am utulizing an article as a guide to configure. According to the guide my "MyCustomActivity" activity contains two steps:
Method: Property-Set, Method Parameters: Param.exportmode = "excel"
Method: Call pzRDExportWrapper. And I pass current parameters (There is only one from the 1st step).
But after I had got an issue I have changed the 2nd step by Call Rule-Obj-Report-Definition.pzRDExportWrapper
But as you have already understood the solution doesn't work. I have checked the log files and found interesting error:
2017-04-11 21:08:27,992 [ WebContainer : 4] [OpenPortal] [ ] [ MyFW:01.01.02] (ctionWrapper._baseclass.Action) ERROR as1|172.22.254.110 bar - Activity 'MyCustomActivity' failed to execute; Failed to find a 'RULE-OBJ-ACTIVITY' with the name 'PZRESOLVECOPYFILTERS' that applies to 'COM-FW-MyFW-Work'. There were 3 rules with this name in the rulebase, but none matched this request. The 3 rules named 'PZRESOLVECOPYFILTERS' defined in the rulebase are:
2017-04-11 21:08:42,807 [ WebContainer : 4] [TABTHREAD1] [ ] [ MyFW:01.01.02] (fileSetup.Code_Security.Action) ERROR as1|172.22.254.110 bar - External authentication failed:
If someone have any suggestions and share some, I will appreciate it.
Thank you.
I wanted to provide a functionality of exporting retrieved works to a CSV file. The functionality should has a feature to choose fields to retrieve, all results should be in Ukrainian and be able to use any SearchFilter Pages and Report Definition rules.
At a User Portal I have two sections: the first section contains text fields and a Search button, and a section with a Repeat Grid to display results. The textfields are used to filter results and they use a page Org-Div-Work-SearchFilter.
I made a custom parser to csv. I created two activities and wrote some Java code. I should mention that I took some code from the pzPDExportWrapper.
The activities are:
ExportToCSV - takes parameters from users, gets data, invokes the ConvertResultsToCSV;
ConvertResultsToCSV - converts retrieved data to a .CSV file.
Configurations of the ExportToCSV activity:
The Pages And Classes tab:
ReportDefinition is an object of a certain Report Definition.
SearchFilter is a Page with values inputted by user.
ReportDefinitionResults is a list of retrieved works to export.
ReportDefinitionResults.pxResults denotes a type of a certain work.
The Parameters tab:
FileName is a name of a generated file
ColumnsNames names of columns separated by comma. If the parameter is empty then CSVProperties is exported.
CSVProperties is a props to display in a spreadsheet separated by comma.
SearchPageName is a name of a page to filter results.
ReportDefinitionName is a RD's name used to retrieve results.
ReportDefinitionClass is a class of utilized report definition.
The Step tab:
Lets look through the steps:
1. Get an SearchFilte Page with a name from a Parameter with populated fields:
2. If SearchFilter is not Empty, call a Data Transform to convert SearchFilter's properties to Paramemer properties:
A fragment of the data Transform:
3. Gets an object of a Report Definition
4. Set parameters for the Report Definition
5. Invoke the Report Definition and save results to ReportDefinitionResults:
6. Invoke the ConvertResultsToCSV activity:
7. Delete the result page:
The overview of the ConvertResultsToCSV activity.
The Parameters tab if the ConvertResultsToCSV activity:
CSVProperties are the properties to retrieve and export.
ColumnsNames are names of columns to display.
PageListProperty a name of the property to be read in the primay page
FileName the name of generated file. Can be empty.
AppendTimeStampToFileName - if true, a time of the file generation.
CSVString a string of generated CSV to be saved to a file.
FileName a name of a file.
listSeperator is always a semicolon to separate fields.
Lets skim all the steps in the activity:
Get a localization from user settings (commented):
In theory it is able to support a localization in many languages.
Set always "uk" (Ukrainian) localization.
Get a separator according to localization. It is always a semicolon in Ukrainian, English and Russian. It is required to check in other languages.
The step contains Java code, which form a CSV string:
StringBuffer csvContent = new StringBuffer(); // a content of buffer
String pageListProp = tools.getParamValue("PageListProperty");
ClipboardProperty resultsProp = myStepPage.getProperty(pageListProp);
// fill the properties names list
java.util.List<String> propertiesNames = new java.util.LinkedList<String>(); // names of properties which values display in csv
String csvProps = tools.getParamValue("CSVProperties");
propertiesNames = java.util.Arrays.asList(csvProps.split(","));
// get user's colums names
java.util.List<String> columnsNames = new java.util.LinkedList<String>();
String CSVDisplayProps = tools.getParamValue("ColumnsNames");
if (!CSVDisplayProps.isEmpty()) {
columnsNames = java.util.Arrays.asList(CSVDisplayProps.split(","));
} else {
columnsNames.addAll(propertiesNames);
}
// add columns to csv file
Iterator columnsIter = columnsNames.iterator();
while (columnsIter.hasNext()) {
csvContent.append(columnsIter.next().toString());
if (columnsIter.hasNext()){
csvContent.append(listSeperator); // listSeperator - local variable
}
}
csvContent.append("\r");
for (int i = 1; i <= resultsProp.size(); i++) {
ClipboardPage propPage = resultsProp.getPageValue(i);
Iterator iterator = propertiesNames.iterator();
int propTypeIndex = 0;
while (iterator.hasNext()) {
ClipboardProperty clipProp = propPage.getIfPresent((iterator.next()).toString());
String propValue = "";
if(clipProp != null && !clipProp.isEmpty()) {
char propType = clipProp.getType();
propValue = clipProp.getStringValue();
if (propType == ImmutablePropertyInfo.TYPE_DATE) {
DateTimeUtils dtu = ThreadContainer.get().getDateTimeUtils();
long mills = dtu.parseDateString(propValue);
java.util.Date date = new Date(mills);
String sdate = dtu.formatDateTimeStamp(date);
propValue = dtu.formatDateTime(sdate, "dd.MM.yyyy", "", "");
}
else if (propType == ImmutablePropertyInfo.TYPE_DATETIME) {
DateTimeUtils dtu = ThreadContainer.get().getDateTimeUtils();
propValue = dtu.formatDateTime(propValue, "dd.MM.yyyy HH:mm", "", "");
}
else if ((propType == ImmutablePropertyInfo.TYPE_DECIMAL)) {
propValue = PRNumberFormat.format(localeCode,PRNumberFormat.DEFAULT_DECIMAL, false, null, new BigDecimal(propValue));
}
else if (propType == ImmutablePropertyInfo.TYPE_DOUBLE) {
propValue = PRNumberFormat.format(localeCode,PRNumberFormat.DEFAULT_DECIMAL, false, null, Double.parseDouble(propValue));
}
else if (propType == ImmutablePropertyInfo.TYPE_TEXT) {
propValue = clipProp.getLocalizedText();
}
else if (propType == ImmutablePropertyInfo.TYPE_INTEGER) {
Integer intPropValue = Integer.parseInt(propValue);
if (intPropValue < 0) {
propValue = new String();
}
}
}
if(propValue.contains(listSeperator)){
csvContent.append("\""+propValue+"\"");
} else {
csvContent.append(propValue);
}
if(iterator.hasNext()){
csvContent.append(listSeperator);
}
propTypeIndex++;
}
csvContent.append("\r");
}
CSVString = csvContent.toString();
5. This step forms and save a file in server's catalog tree
char sep = PRFile.separatorChar;
String exportPath= tools.getProperty("pxProcess.pxServiceExportPath").getStringValue();
DateTimeUtils dtu = ThreadContainer.get().getDateTimeUtils();
String fileNameParam = tools.getParamValue("FileName");
if(fileNameParam.equals("")){
fileNameParam = "RecordsToCSV";
}
//append a time stamp
Boolean appendTimeStamp = tools.getParamAsBoolean(ImmutablePropertyInfo.TYPE_TRUEFALSE,"AppendTimeStampToFileName");
FileName += fileNameParam;
if(appendTimeStamp) {
FileName += "_";
String currentDateTime = dtu.getCurrentTimeStamp();
currentDateTime = dtu.formatDateTime(currentDateTime, "HH-mm-ss_dd.MM.yyyy", "", "");
FileName += currentDateTime;
}
//append a file format
FileName += ".csv";
String strSQLfullPath = exportPath + sep + FileName;
PRFile f = new PRFile(strSQLfullPath);
PROutputStream stream = null;
PRWriter out = null;
try {
// Create file
stream = new PROutputStream(f);
out = new PRWriter(stream, "UTF-8");
// Bug with Excel reading a file starting with 'ID' as SYLK file. If CSV starts with ID, prepend an empty space.
if(CSVString.startsWith("ID")){
CSVString=" "+CSVString;
}
out.write(CSVString);
} catch (Exception e) {
oLog.error("Error writing csv file: " + e.getMessage());
} finally {
try {
// Close the output stream
out.close();
} catch (Exception e) {
oLog.error("Error of closing a file stream: " + e.getMessage());
}
}
The last step calls #baseclass.DownloadFile to download the file:
Finally, we can post a button on some section or somewhere else and set up an Actions tab like this:
It also works fine inside "Refresh Section" action.
A possible result could be
Thanks for reading.
I am using RapidMiner 5. I want to make a text preprocessing module to use with a categorization system. I created a process in RapidMiner with these steps.
Tokenize
Transform Case
Stemming
Filtering stopwords
Generating n-grams
I want to write a script to do spell correction for these words. So, I used 'Execute Script' operator and wrote a groovy script for doing this (from here- raelcunha). This is the code ( helped by RapidMiner community) I wrote in execute Script operator of rapid miner.
Document doc=input[0]
List<Token> newTokens = new LinkedList<Token>();
nWords=train("set2.txt")
for (Token token : doc.getTokenSequence()) {
//String output=correct((String)token.getToken(),nWords)
println token.getToken();
Token nToken = new Token(correct("garbge",nWords), token);
newTokens.add(nToken);
}
doc.setTokenSequence(newTokens);
return doc;
This is the code for spell correction. ( Thanks to Norvig.)
import com.rapidminer.operator.text.Document;
import com.rapidminer.operator.text.Token;
import java.util.List;
import java.util.LinkedList;
def train(f){
def n = [:]
new File(f).eachLine{it.toLowerCase().eachMatch(/\w+/){n[it]=n[it]?n[it]+1:1}}
n
}
def edits(word) {
def result = [], n = word.length()-1
for(i in 0..n) result.add(word[0..<i] + word.substring(i+1))
for(i in 0..n-1) result.add(word[0..<i] + word[i+1] + word[i, i+1] + word.substring(i+2))
for(i in 0..n) for(c in 'a'..'z') result.add(word[0..<i] + c + word.substring(i+1))
for(i in 0..n) for(c in 'a'..'z') result.add(word[0..<i] + c + word.substring(i))
result
}
def correct(word, nWords) {
if(nWords[word]) return word
def list = edits(word), candidates = [:]
for(s in list) if(nWords[s]) candidates[nWords[s]] = s
if(candidates.size() > 0) return candidates[candidates.keySet().max()]
for(s in list) for(w in edits(s)) if(nWords[w]) candidates[nWords[w]] = w
return candidates.size() > 0 ? candidates[candidates.keySet().max()] : word
}
I am getting String index out of bounds exception while calling edits method.
And, I do not know how to debug this because rapidminer just tells me that there is an issue in the Execute Script operator and not saying which line of script caused this issue.
So, I am planning to do the same thing by creating an operator in Java as mentioned here-How to extend RapidMiner
The things I did:
Included all jar files from RapidMiner Lib folder , (C:\Program Files (x86)\Rapid-I\RapidMiner5\lib ) into the build path of my java project.
Started coding using the same guide the link to which is given above.
Input for my operator is a Document ( com.rapidminer.operator.text.Document)
as in the script.
But, I am not able to use this Document object in this code. Can you tell me why? Where are the text processing jars located?
For using the plugin jars, should we add some other locations to the BuildPath?
I want to fetch the app category from play store through its unique identifier i.e. package name, I am using the following code but does not return any data. I also tried to use this AppsRequest.newBuilder().setAppId(query) still no help.
Thanks.
String AndroidId = "dead000beef";
MarketSession session = new MarketSession();
session.login("email", "passwd");
session.getContext().setAndroidId(AndroidId);
String query = "package:com.king.candycrushsaga";
AppsRequest appsRequest = AppsRequest.newBuilder().setQuery(query).setStartIndex(0)
.setEntriesCount(10).setWithExtendedInfo(true).build();
session.append(appsRequest, new Callback<AppsResponse>() {
#Override
public void onResult(ResponseContext context, AppsResponse response) {
String response1 = response.toString();
Log.e("reponse", response1);
}
});
session.flush();
Use this script:
######## Fetch App names and genre of apps from playstore url, using pakage names #############
"""
Reuirements for running this script:
1. requests library
Note: Run this command to avoid insecureplatform warning pip install --upgrade ndg-httpsclient
2. bs4
pip install requests
pip install bs4
"""
import requests
import csv
from bs4 import BeautifulSoup
# url to be used for package
APP_LINK = "https://play.google.com/store/apps/details?id="
output_list = []; input_list = []
# get input file path
print "Need input CSV file (absolute) path \nEnsure csv is of format: <package_name>, <id>\n\nEnter Path:"
input_file_path = str(raw_input())
# store package names and ids in list of tuples
with open(input_file_path, 'rb') as csvfile:
for line in csvfile.readlines():
(p, i) = line.strip().split(',')
input_list.append((p, i))
print "\n\nSit back and relax, this might take a while!\n\n"
for package in input_list:
# generate url, get html
url = APP_LINK + package[0]
r = requests.get(url)
if not (r.status_code==404):
data = r.text
soup = BeautifulSoup(data, 'html.parser')
# parse result
x = ""; y = "";
try:
x = soup.find('div', {'class': 'id-app-title'})
x = x.text
except:
print "Package name not found for: %s" %package[0]
try:
y = soup.find('span', {'itemprop': 'genre'})
y = y.text
except:
print "ID not found for: %s" %package[0]
output_list.append([x,y])
else:
print "App not found: %s" %package[0]
# write to csv file
with open('results.csv', 'w') as fp:
a = csv.writer(fp, delimiter=",")
a.writerows(output_list)
This is what i did, best and easy solution
https://androidquery.appspot.com/api/market?app=your.unique.package.name
Or otherwise you can get source html and get the string out of it ...
https://play.google.com/store/apps/details?id=your.unique.package.name
Get this string out of it - use split or substring methods
<span itemprop="genre">Sports</span>
In this case sports is your category
use android-market-api it will gives all information of application
I am using Weka to create a classifier via the Java API.
The instances are created using java code.
The classifier is being created from code as well via passing following
String args[]=" -x 10 -s 1 -W weka.classifiers.functions.Logistic".split(" ");
String classname;
String[] tmpOptions = Utils.splitOptions(Utils.getOption("W", args));
classname = tmpOptions[0];
System.out.println(classname);
Classifier cls = (Classifier) Utils.forName(Classifier.class, classname, tmpOptions);
It works fine and does cross validation.
After that I once again load my training instances and label their output as ?
and pass it to classifier using
for (int index = 0; index < postDatas.size(); index++) {
Instance instance = nominal.instance(index);
double label = classifier.classifyInstance(instance);
System.out.println(label);
}
classifier.classifyInstance(instance); gives me following exception:
java.lang.NullPointerException
at weka.classifiers.functions.Logistic.distributionForInstance(Logistic.java:710)
any clues to where am I going wrong?
Since you didn't provide all relevant information, I'll take a shot in the dark:
I'm assuming that you're using Weka version 3.7.5 and I found the following source code for Logistic.java online
public double [] distributionForInstance(Instance instance) throws Exception {
// line 710
m_ReplaceMissingValues.input(instance);
instance = m_ReplaceMissingValues.output();
...
}
Assuming you didn't pass null for instance, this only leaves m_ReplaceMissingValues. That member is initialized when the method Logistic.buildClassifier(Instances train)is called:
public void buildClassifier(Instances train) throws Exception {
...
// missing values
m_ReplaceMissingValues = new ReplaceMissingValues();
m_ReplaceMissingValues.setInputFormat(train);
train = Filter.useFilter(train, m_ReplaceMissingValues);
...
}
It looks like you've never trained your classifier Logistic on any data after you created the object in the line
Classifier cls = (Classifier) Utils.forName(Classifier.class, classname, tmpOptions);