I want to use the HeidelTime tool in my Java code, so I downloaded heideltime-standalone and imported de.unihd.dbs.heideltime.standalone.jar as well as stanford-postagger.jar.
Here is the code:
String textFile ="مدى اسبوع";
HeidelTimeStandalone H = new HeidelTimeStandalone(Language.ARABIC,
DocumentType.NEWS,
OutputType.TIMEML,
"/heideltime-standalone/config.props",
POSTagger.STANFORDPOSTAGGER,true);
String result = H.process(textFile,resultFormatter );
System.out.print(result);
And here is the output:
mai 01, 2016 5:09:54 PM de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone initialize
INFOS: HeidelTimeStandalone initialized with language arabic
mai 01, 2016 5:09:54 PM de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone readConfigFile
INFOS: trying to read in file /heideltime-standalone/config.props
May 01, 2016 5:09:56 PM de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone initialize
INFO: HeidelTime initialized
May 01, 2016 5:09:56 PM de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone initialize
INFO: JCas factory initialized
May 01, 2016 5:09:56 PM de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone process
INFO: Processing started
de.unihd.dbs.heideltime.standalone.exceptions.DocumentCreationTimeMissingException
at de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone.provideDocumentCreationTime(HeidelTimeStandalone.java:304)
at de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone.process(HeidelTimeStandalone.java:493)
at de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone.process(HeidelTimeStandalone.java:427)
at Arabic_Parser.main(Arabic_Parser.java:54)
May 01, 2016 5:09:56 PM de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone process
WARNING: Processing aborted due to errors
May 01, 2016 5:09:56 PM de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone process
INFO: Result formatted
<?xml version="1.0"?>
<!DOCTYPE TimeML SYSTEM "TimeML.dtd">
<TimeML>
مدى اسبوع
</TimeML>
As you can see, processing was aborted due to errors. Could you please help me fix them?
The HeidelTime function you're calling
String result = H.process(textFile,resultFormatter );
says in its doc comment:
/**
 * Processes document with HeidelTime
 *
 * @param document
 * @return Annotated document
 * @throws DocumentCreationTimeMissingException
 *             If document creation time is missing when processing a
 *             document of type {@link DocumentType#NEWS}. Use
 *             {@link #process(String, Date)} instead to provide document
 *             creation time!
 */
public String process(String document, ResultFormatter resultFormatter)
throws DocumentCreationTimeMissingException {
If you want to call this with NEWS then you'll have to specify a time too, as an extra parameter between the document and result formatter.
Related
I'm using HtmlUnit [http://htmlunit.sourceforge.net/] and it spews out a bunch of warnings/errors:
Mar 24, 2017 6:37:30 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'text/javascript'.
Mar 24, 2017 6:37:31 PM com.gargoylesoftware.htmlunit.html.InputElementFactory createElementNS
INFO: Bad input type: "datetime", creating a text input
Mar 24, 2017 6:37:31 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'application/x-javascript'.
Mar 24, 2017 6:37:32 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'application/x-javascript'.
Mar 24, 2017 6:37:34 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'text/javascript'.
Mar 24, 2017 6:37:34 PM com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter runtimeError
SEVERE: runtimeError: message=[An invalid or illegal selector was specified (selector: '*,:x' error: Invalid selector: :x).] sourceName=[https://www.example.com/bundles/jquery?v=u8J3xxyrazUhSJl-OWRJ6I82HpC6Fs7PQ0-l8XzoZXY1] line=[1] lineSource=[null] lineOffset=[0]
Mar 24, 2017 6:37:35 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'text/javascript'.
Mar 24, 2017 6:37:35 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'text/javascript'.
Mar 24, 2017 6:37:35 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'text/javascript'.
Mar 24, 2017 6:37:36 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'text/javascript'.
Mar 24, 2017 6:37:43 PM com.gargoylesoftware.htmlunit.javascript.host.dom.Document createElement
INFO: createElement: Provided string 'iframe name="rufous-frame-29_-1">https://platform.twitter.com/widgets.js] line=[9] lineSource=[null] lineOffset=[0]
I've looked at other resources and tried to turn it off by:
Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(Level.OFF);
Logger.getLogger("org.apache.http").setLevel(Level.OFF);
and:
final WebClient webClient = new WebClient(BrowserVersion.EDGE);
but it does not work.
What else can be done to suppress these warning/error messages?
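If you are not going to set the loggers to OFF in the 'logging.properties' file, then you need to pin the loggers with a hard reference before you set the level to OFF. java.util.logging keeps only weak references to loggers, so a logger that nothing else references can be garbage-collected, and the replacement created later will not carry the OFF level.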
Here is a corrected example based off your code:
private static final Logger[] PINNED_LOGGERS;
static {
    System.setProperty("org.apache.commons.logging.simplelog.defaultlog", "fatal");
    PINNED_LOGGERS = new Logger[]{
        Logger.getLogger("com.gargoylesoftware.htmlunit"),
        Logger.getLogger("org.apache.http")
    };
    for (Logger l : PINNED_LOGGERS) {
        l.setLevel(Level.OFF);
    }
}
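As a rough check of why pinning the two package-level loggers is enough, the sketch below uses only java.util.logging. The class name QuietLoggers and its methods are made up for illustration; the logger names are the ones from the question. It assumes the noisy classes log through child loggers named after their class, which matches the log lines shown above but is not verified against HtmlUnit itself.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper: pins package-level loggers and switches them off.
final class QuietLoggers {
    // Logger names taken from the question.
    private static final String[] NAMES = {
        "com.gargoylesoftware.htmlunit",
        "org.apache.http"
    };

    // The static array is the "pin": it keeps strong references to the loggers.
    private static final Logger[] PINNED = new Logger[NAMES.length];

    static void silence() {
        for (int i = 0; i < NAMES.length; i++) {
            PINNED[i] = Logger.getLogger(NAMES[i]);
            PINNED[i].setLevel(Level.OFF);
        }
    }

    // True if a logger with this name would still emit SEVERE records.
    static boolean wouldLog(String loggerName) {
        return Logger.getLogger(loggerName).isLoggable(Level.SEVERE);
    }

    public static void main(String[] args) {
        silence();
        System.gc(); // a collection must not undo the configuration
        // Child loggers inherit OFF from the pinned parent.
        System.out.println(wouldLog("com.gargoylesoftware.htmlunit.html.InputElementFactory")); // false
    }
}
```

Child loggers that have no level of their own inherit the effective level of the nearest configured parent, so one pinned logger per package prefix should cover every class below it.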
I put the following at the start of the application and the errors/warnings stopped:
System.getProperties().put("org.apache.commons.logging.simplelog.defaultlog", "fatal");
Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(Level.OFF);
Logger.getLogger("org.apache.http").setLevel(Level.OFF);
I am attempting to use HtmlUnit. I want to input a username (pin) and password (password) and get to the next page. The output is a long list of errors.
I am unable to post all the errors. However, the majority are duplicates. I've posted below each unique error message I've received:
Sep 10, 2015 7:00:35 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'application/x-javascript'.
Sep 10, 2015 7:00:41 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: 'https://app.applyyourself.com/common/stylesheet.css' [141:2] Error in style rule. (Invalid token ".2". Was expecting one of: <S>, <LBRACE>, ".", ":", "[", <COMMA>, <HASH>, <S>.)
Sep 10, 2015 7:00:44 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: 'https://app.applyyourself.com/common/bootstrap.css' [2590:17] Error in expression; ':' found after identifier "progid".
Sep 10, 2015 7:00:45 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: 'https://app.applyyourself.com/common/style.css' [130:25] Error in expression; ':' found after identifier "data".
Sep 10, 2015 7:00:45 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: 'https://app.applyyourself.com/common/style.css' [131:8] Error in declaration. (Invalid token ",". Was expecting one of: <S>, ":".)
Sep 10, 2015 7:00:45 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: 'https://app.applyyourself.com/common/style.css' [142:17] Invalid color "#7d7e7d\0".
Sep 10, 2015 7:00:45 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: 'https://app.applyyourself.com/common/style.css' [166:23] Invalid color "#cccccc\0".
Sep 10, 2015 7:00:45 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: 'https://app.applyyourself.com/common/style.css' [459:17] Invalid color "#b5b5b5\0".
Sep 10, 2015 7:00:45 PM com.gargoylesoftware.htmlunit.html.InputElementFactory createElementNS
INFO: Bad input type: "tel", creating a text input
Sep 10, 2015 7:00:45 PM com.gargoylesoftware.htmlunit.html.InputElementFactory createElementNS
INFO: Bad input type: "url", creating a text input
Sep 10, 2015 7:00:45 PM com.gargoylesoftware.htmlunit.html.InputElementFactory createElementNS
INFO: Bad input type: "email", creating a text input
Sep 10, 2015 7:00:45 PM com.gargoylesoftware.htmlunit.html.InputElementFactory createElementNS
INFO: Bad input type: "datetime", creating a text input
Sep 10, 2015 7:00:47 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: 'https://app.applyyourself.com/AYApplicantLogin/fl_ApplicantLogin.asp?id=dukegrad' [1:141] Error in declaration. '*' is not allowed as first char of a property.
I've tried using these links:
How to use HtmlUnit in Java?
Groovy htmlunit getFirstByXPath returning null + OCR Question
Here is the code:
public ApplicationLogin(String pin, String password) throws Exception {
    WebClient webClient = new WebClient();
    HtmlPage webPage = webClient.getPage("https://app.applyyourself.com/AYApplicantLogin/fl_ApplicantLogin.asp?id=dukegrad");
    HtmlForm form = webPage.getFirstByXPath("//form[@id='frmApplicantLogin']");
    HtmlTextInput username = form.getInputByName("UserID");
    HtmlPasswordInput pass = form.getInputByName("Password");
    HtmlSubmitInput b = form.getInputByValue("login");
    username.setValueAttribute(pin);
    pass.setValueAttribute(password);
    HtmlPage webPage2 = b.click();
}
Use another browser version instead of the default. For instance, it could be CHROME (you can choose between different browsers such as CHROME, FIREFOX_31, INTERNET_EXPLORER_8, and INTERNET_EXPLORER_11).
WebClient webClient = new WebClient(BrowserVersion.CHROME);
I suppose it should fix your problem.
How do I turn off the logging in my Console output? I don't want to see the output of my Java program:
Nov 01, 2013 12:01:29 PM org.glassfish.tyrus.server.ServerContainerFactory create
INFO: Provider class loaded: org.glassfish.tyrus.container.grizzly.GrizzlyEngine
Nov 01, 2013 12:01:29 PM org.glassfish.grizzly.http.server.NetworkListener start
INFO: Started listener bound to [0.0.0.0:8029]
Nov 01, 2013 12:01:29 PM org.glassfish.grizzly.http.server.HttpServer start
INFO: [HttpServer] Started.
Nov 01, 2013 12:01:29 PM org.glassfish.tyrus.server.Server start
INFO: WebSocket Registered apps: URLs all start with ws://localhost:8029
Nov 01, 2013 12:01:29 PM org.glassfish.tyrus.server.Server start
INFO: WebSocket server started.
I am surprised that Java does not have this functionality baked in (or that I couldn't find it which is kind of the same thing nowadays)...
Assign System.out and System.err to a stream that does not write anything (to log). In .NET there is Stream.Null, which is a stream with no backing store, i.e. a null stream or a void stream.
In Java, such a Stream is easily built by extending the java.io.OutputStream:
class VoidStream extends OutputStream {
    @Override
    public void write(int b) throws IOException {
        //
        // --------------
        // | void space |
        // --------------
        //
    }
}
Now, set the VoidStream as System.out and System.err using System.setOut() and System.setErr(), resulting in an empty console:
PrintStream ps = new PrintStream(new VoidStream());
System.setOut(ps);
System.setErr(ps);
In theory, one caveat is System.console(), because its output stream cannot be replaced, yet functions such as printf() and format() can still be called on it. Fortunately, System.console() is only available when the program is run in a terminal (otherwise it returns null) and who is still working with a terminal? ;-)
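On Java 11 or later, the hand-written stream can probably be replaced by OutputStream.nullOutputStream(). The sketch below is one way to scope the redirection so the console comes back afterwards. The class name QuietConsole and the method runSilently are invented for this example. Note also that java.util.logging's ConsoleHandler captures System.err when it is created, so for log output like that in the question the redirect likely has to happen before the first logger is used.

```java
import java.io.OutputStream;
import java.io.PrintStream;

// Hypothetical helper: discards console output only while a task runs.
final class QuietConsole {
    static void runSilently(Runnable task) {
        PrintStream originalOut = System.out;
        PrintStream originalErr = System.err;
        // Built-in sink (Java 11+) that ignores everything written to it.
        PrintStream sink = new PrintStream(OutputStream.nullOutputStream());
        try {
            System.setOut(sink);
            System.setErr(sink);
            task.run();
        } finally {
            // Restore the original streams even if the task throws.
            System.setOut(originalOut);
            System.setErr(originalErr);
        }
    }

    public static void main(String[] args) {
        runSilently(() -> System.out.println("never shown"));
        System.out.println("visible again");
    }
}
```

Restoring the streams in a finally block keeps later diagnostics visible, which a permanent redirect would hide.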
In Java, the following regular expression
To: a@b\.com.*Subject: Please verify your email address
somehow doesn't find the match in this text:
Dez 21, 2012 10:29:58 AM com.google.appengine.api.datastore.dev.LocalDatastoreService init
INFO: Local Datastore initialized:
Type: High Replication
Storage: In-memory
Dez 21, 2012 10:29:58 AM com.google.appengine.api.mail.dev.LocalMailService log
INFO: MailService.send
Dez 21, 2012 10:29:58 AM com.google.appengine.api.mail.dev.LocalMailService log
INFO: From:
Dez 21, 2012 10:29:58 AM com.google.appengine.api.mail.dev.LocalMailService log
INFO: To: a@b.com
Dez 21, 2012 10:29:58 AM com.google.appengine.api.mail.dev.LocalMailService log
INFO: Subject: Please verify your email address
Dez 21, 2012 10:29:58 AM com.google.appengine.api.mail.dev.LocalMailService log
INFO: Body:
Dez 21, 2012 10:29:58 AM com.google.appengine.api.mail.dev.LocalMailService log
INFO: Content-type: text/plain
Dez 21, 2012 10:29:58 AM com.google.appengine.api.mail.dev.LocalMailService log
INFO: Data length: 4
My Java code looks like this:
Matcher matcher = Pattern.compile(regex, Pattern.MULTILINE).matcher(input);
if (matcher.find()) {
...
}
This is a bit strange, since the pattern seems to work when I test it online with this tool: http://regexpal.com
So, Java must be interpreting the pattern a bit differently. Is there any way to get error messages of the Matcher?
Update: It should find:
To: a@b.com
Dez 21, 2012 10:29:58 AM com.google.appengine.api.mail.dev.LocalMailService log
INFO: Subject: Please verify your email address
You'll want to use Pattern.DOTALL instead of Pattern.MULTILINE.
DOTALL makes the . match newlines as well (which is what you want).
MULTILINE makes ^ and $ work on a per-line basis.
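The difference can be checked with a small standalone sketch. The input is a shortened stand-in for the log in the question, and the class name DotallDemo and method finds are made up for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo comparing the two flags on a multi-line input.
final class DotallDemo {
    static final String REGEX = "To: a@b\\.com.*Subject: Please verify your email address";

    // Shortened stand-in for the log output: the two parts sit on different lines.
    static final String INPUT =
        "INFO: To: a@b.com\n"
        + "Dez 21, 2012 10:29:58 AM com.google.appengine.api.mail.dev.LocalMailService log\n"
        + "INFO: Subject: Please verify your email address\n";

    static boolean finds(int flags) {
        return Pattern.compile(REGEX, flags).matcher(INPUT).find();
    }

    public static void main(String[] args) {
        // Without DOTALL, ".*" stops at the first line break.
        System.out.println("MULTILINE: " + finds(Pattern.MULTILINE)); // MULTILINE: false
        // With DOTALL, ".*" spans the line breaks between "To:" and "Subject:".
        System.out.println("DOTALL: " + finds(Pattern.DOTALL)); // DOTALL: true
    }
}
```

A failed find() does not produce an error message. The pattern is valid, it just does not match, which is why nothing is reported.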
I'm running the following code, which throws an error when I try to commit my changes using Cayenne as my ORM. The code is pasted below and fails on the context.commitChanges(); line. The output messages are pasted below the code. Any help on figuring this out would be appreciated.
import org.apache.cayenne.access.DataContext;
import java.util.*;
import com.jared.*;
public class Main {
    public static void main(String[] args) {
        DataContext context = DataContext.createDataContext();
        Stocks theStock = (Stocks) context.createAndRegisterNewObject(Stocks.class);
        theStock.setAsk(3.4);
        theStock.setAvgdailyvolume(323849);
        theStock.setBid(5.29);
        theStock.setChange(-1.22);
        theStock.setDayhigh(9.21);
        theStock.setDaylow(2.11);
        theStock.setLasttradeprice(5.11);
        theStock.setLasttradesize(3827);
        theStock.setOpen(6.21);
        theStock.setPriorclose(4.21);
        theStock.setShortratio(1.1);
        theStock.setSymbol("^SP%");
        theStock.setVolume(28193);
        theStock.setLasttradedate(new Date());
        context.commitChanges();
        System.out.println("Done with the database");
    }
}
Nov 20, 2008 11:20:37 PM org.apache.cayenne.conf.RuntimeLoadDelegate startedLoading
INFO: started configuration loading.
Nov 20, 2008 11:20:37 PM org.apache.cayenne.conf.RuntimeLoadDelegate shouldLoadDataDomain
INFO: loaded domain: stocks
Nov 20, 2008 11:20:37 PM org.apache.cayenne.conf.RuntimeLoadDelegate loadDataMap
INFO: loaded .
Nov 20, 2008 11:20:37 PM org.apache.cayenne.conf.RuntimeLoadDelegate shouldLoadDataNode
INFO: loading .
Nov 20, 2008 11:20:37 PM org.apache.cayenne.conf.RuntimeLoadDelegate shouldLoadDataNode
INFO: using factory: org.apache.cayenne.conf.DriverDataSourceFactory
Nov 20, 2008 11:20:37 PM org.apache.cayenne.conf.DriverDataSourceFactory load
INFO: loading driver information from 'stocksNode.driver.xml'.
Nov 20, 2008 11:20:37 PM org.apache.cayenne.conf.DriverDataSourceFactory$DriverHandler init
INFO: loading driver org.hsqldb.jdbcDriver
Nov 20, 2008 11:20:37 PM org.apache.cayenne.conf.DriverDataSourceFactory$LoginHandler init
INFO: loading user name and password.
Nov 20, 2008 11:20:37 PM org.apache.cayenne.access.QueryLogger logPoolCreated
INFO: Created connection pool: jdbc:hsqldb:file:/hsqldb/data/stocks
Driver class: org.hsqldb.jdbcDriver
Min. connections in the pool: 1
Max. connections in the pool: 1
Nov 20, 2008 11:20:37 PM org.apache.cayenne.conf.RuntimeLoadDelegate shouldLoadDataNode
INFO: loaded datasource.
Nov 20, 2008 11:20:37 PM org.apache.cayenne.conf.RuntimeLoadDelegate initAdapter
INFO: no adapter set, using automatic adapter.
Nov 20, 2008 11:20:37 PM org.apache.cayenne.conf.RuntimeLoadDelegate finishedLoading
INFO: finished configuration loading in 203 ms.
Exception in thread "main" org.apache.cayenne.CayenneRuntimeException: [v.3.0M4 May 18 2008 16:32:02] Commit Exception
at org.apache.cayenne.access.DataContext.flushToParent(DataContext.java:1192)
at org.apache.cayenne.access.DataContext.commitChanges(DataContext.java:1066)
at Main.main(Main.java:24)
Caused by: java.lang.NullPointerException
at org.apache.cayenne.access.DataDomainInsertBucket.createPermIds(DataDomainInsertBucket.java:101)
at org.apache.cayenne.access.DataDomainInsertBucket.appendQueriesInternal(DataDomainInsertBucket.java:76)
at org.apache.cayenne.access.DataDomainSyncBucket.appendQueries(DataDomainSyncBucket.java:80)
at org.apache.cayenne.access.DataDomainFlushAction.preprocess(DataDomainFlushAction.java:183)
at org.apache.cayenne.access.DataDomainFlushAction.flush(DataDomainFlushAction.java:135)
at org.apache.cayenne.access.DataDomain.onSyncFlush(DataDomain.java:821)
at org.apache.cayenne.access.DataDomain$2.transform(DataDomain.java:788)
at org.apache.cayenne.access.DataDomain.runInTransaction(DataDomain.java:847)
at org.apache.cayenne.access.DataDomain.onSync(DataDomain.java:785)
at org.apache.cayenne.access.DataContext.flushToParent(DataContext.java:1164)
... 2 more
Have you supplied the user name and password?
ClientConnection connection = new HessianConnection(
        "http://localhost:8080/cayenne-service",
        "cayenne-user", "secret",
        null);