Java Goose not extracting content on Android

I'm trying to set up a small Android application which extracts content from a web page using the Goose library. Since the library is written in Scala, I'm using the .jar I found here. The problem is that when I try to extract content from a page, it returns nothing. I successfully create an Article object using the URL I need, but the values of the object (title, domain, topImage etc.) are all null. I tried different URLs to see whether the problem was isolated to a single website, but it doesn't appear to be.
The code I use to set up the Goose instance is this:
gooseDir = context.getCacheDir();
Configuration config = new Configuration();
config.setLocalStoragePath(gooseDir.getAbsolutePath());
Goose goose = new Goose(config);
And then I just create the Article instance like so:
Article article = goose.extractContent(url);
Any advice?

Actually you can't use the Goose library on Android due to incompatibilities, but you can use my Android version: https://github.com/milosmns/goose
It does almost the same thing as Goose, only works well on Android.


LibGDX: Cannot load a json file from assets folder

I have a JSON file with data for all the tiles in my game that I store in the assets folder. I try to access and parse it using TileList dataList = json.fromJson(TileList.class, Gdx.files.internal("map-elements/tiles/tiles.json")). This works fine for the desktop version, but on the HTML version, after converting with GWT, I get these errors:
GwtApplication: exception: Error reading file: map-elements/tiles/tiles.json
Error reading file: map-elements/tiles/tiles.json
Couldn't find Type for class 'net.vediogames.archipelo.world.tiles.TileList'
TileList is a simple object that contains an array of TileData which can then be converted into Tile objects. I did it this way to make the json parsing easy.
The solution to the json error is simple. Instead of passing the FileHandle into the json parser, pass the string from the file like this:
TileList dataList = json.fromJson(TileList.class, Gdx.files.internal("map-elements/tiles/tiles.json").readString());
In the end, all I needed to do to solve that issue is add .readString(). As for the Couldn't find Type for class 'net.vediogames.archipelo.world.tiles.TileList' error, I also found a solution but it is more complicated.
JavaScript handles class references differently than Java. So I was not able to use TileList.class without first registering it so LibGDX can generate a Reflection. What I needed to do was add this line into my *.gwt.xml files:
<extend-configuration-property name="gdx.reflect.include" value="net.vediogames.archipelo.world.tiles.TileList" />
If you want a full tutorial about how reflection works and how to include packages or exclude, please view the official LibGDX tutorial here.
Your solution was not working for me, I still got the same error. After some hours of testing I got it to work, using your suggestions and by using the ClassReflection instead of referencing the class itself.
Your example:
TileList dataList = json.fromJson(TileList.class, Gdx.files.internal("map-elements/tiles/tiles.json").readString());
looks like this in my working code:
TileList dataList = (TileList) json.fromJson(ClassReflection.forName(TileList.class.getName()), Gdx.files.internal("map-elements/tiles/tiles.json").readString());
This is quite a pain in the a.. but I'm glad it is finally working now.

Programmatically embed a video in a slideshow using Apache Open Office API

I want to create a plugin that adds a video on the current slide in an open instance of Open Office Impress by specifying the location of the video automatically. I have successfully added shapes to the slide. But I cannot find a way to embed a video.
Using .uno:InsertAVMedia I can take user input to choose a file, and it works. How do I specify the location of the file programmatically?
CONCLUSION:
This is not supported by the API. Images and audio can be inserted without user intervention but videos cannot be done this way. Hope this feature is released in subsequent versions.
You requested information about an extension, even though the code you are using is quite different, using a file stream reader and POI.
If you really do want to develop an extension, then start with one of the Java samples. An example that uses Impress is https://wiki.openoffice.org/wiki/File:SDraw.zip.
Inserting videos into an Impress presentation can be difficult. First be sure you can get it to work manually. The most obvious way to do that seems to be Insert -> Media -> Audio or Video. However many people use links to files instead of actually embedding the file. See also https://ask.libreoffice.org/en/question/1898/how-to-embed-video-into-impress-presentation/.
If embedding is working for your needs and you want to automate the embedding by using an extension (which seems to be what your question is asking), then there is a dispatcher method called InsertAVMedia that does this.
I do not know offhand what the parameters are for the call. See https://forum.openoffice.org/en/forum/viewtopic.php?f=20&t=61127 for how to look up parameters for dispatcher calls.
EDIT
Here is some Basic code that inserts a video.
sub insert_video
    dim document as object
    dim dispatcher as object
    document = ThisComponent.CurrentController.Frame
    dispatcher = createUnoService("com.sun.star.frame.DispatchHelper")
    dispatcher.executeDispatch(document, ".uno:InsertAVMedia", "", 0, Array())
end sub
From looking at InsertAVMedia in sfx.sdi, it seems that this call does not take any parameters.
EDIT 2
Sorry but InsertVideo and InsertImage do not take parameters either. From svx.sdi it looks like the following calls take parameters of some sort: InsertGalleryPic, InsertGraphic, InsertObject, InsertPlugin, AVMediaToolBox.
However according to https://wiki.openoffice.org/wiki/Documentation/OOoAuthors_User_Manual/Getting_Started/Sometimes_the_macro_recorder_fails, it is not possible to specify a file for InsertObject. That documentation also mentions that you never know what will work until you try it.
InsertGraphic takes a FileName parameter, so I would think that should work.
It is possible to add an XPlayer on the current slide. It looks like this will allow you to play a video, and you can specify the file's URL automatically.
Here is an example using createPlayer: https://forum.openoffice.org/en/forum/viewtopic.php?f=20&t=57699.
EDIT:
This Basic code works on my system. To play the video, simply call the routine.
sub play_video
    If Video_flag = 0 Then
        video = converttoURL( _
            "C:\Users\JimStandard\Downloads\H264_test1_Talkinghead_avi_480x360.avi")
        Video_flag = 1
        ' for Windows:
        oManager = CreateUnoService("com.sun.star.media.Manager_DirectX")
        ' for Linux:
        ' oManager = CreateUnoService("com.sun.star.media.Manager_GStreamer")
        oPlayer = oManager.createPlayer( video )
        ' oPlayer.CreatePlayerwindow(array()) ' crashes?
        ' oPlayer.setRate(1.1)
        oPlayer.setPlaybackLoop(False)
        oPlayer.setMediaTime(0.0)
        oPlayer.setVolumeDB(GetSoundVolume())
        oPlayer.start() ' Lecture
        Player_flag = 1
    Else
        oPlayer.start() ' Lecture
        Player_flag = 1
    End If
End Sub

How to get last modified date of a file in Google Cloud Storage?

This guide on the Python library says cloudstorage.stat() returns
... last-modified time, header-length, content type for the specified file.
The Java guide however, seems to have no counterpart. At least, I cannot find it in GcsFileMetadata.
Is it possible to get this information using the Java library?
It seems the Java library doesn't have this feature (yet).
You can set this info manually, though:
GSFileOptionsBuilder optionsBuilder = new GSFileOptionsBuilder()
.setBucket("my_bucket")
.setKey("my_object")
.setAcl("public-read")
.setMimeType("text/html") //etc etc
.setUserMetadata("date-created", "01/09/2011", "owner", "Jon");
And retrieve this piece of metadata with
GcsFileMetadata.getOptions().getUserMetadata().get("date-created")
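If you go the user-metadata route, it may help to store the date in an unambiguous format such as ISO-8601 rather than "01/09/2011". The following is only a minimal sketch of that round trip: the plain Map stands in for the user-metadata map, the class and method names are hypothetical, and no Cloud Storage call is made.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: not part of any Google library.
class MetadataTimestampSketch {
    // Same key as in the answer above.
    static final String KEY = "date-created";

    // Build a metadata map that carries the timestamp as an ISO-8601 string.
    static Map<String, String> withTimestamp(Instant when) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put(KEY, when.toString());
        return metadata;
    }

    // Read the timestamp back; returns null if the key was never set.
    static Instant readTimestamp(Map<String, String> metadata) {
        String raw = metadata.get(KEY);
        return raw == null ? null : Instant.parse(raw);
    }

    public static void main(String[] args) {
        Map<String, String> metadata = withTimestamp(Instant.now());
        System.out.println(readTimestamp(metadata));
    }
}
```

The map produced by withTimestamp is what you would hand to the options builder, and readTimestamp is what you would apply to the map returned by getUserMetadata().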

Generating BPEL files programmatically?

Is there a way to generate BPEL programmatically in Java?
I tried using the BPEL Eclipse Designer API to write this code:
Process process = null;
try {
    Resource.Factory.Registry reg = Resource.Factory.Registry.INSTANCE;
    Map<String, Object> m = reg.getExtensionToFactoryMap();
    m.put("bpel", new BPELResourceFactoryImpl()); // it works with XMLResourceFactoryImpl()
    // create resource
    URI uri = URI.createFileURI("myBPEL2.bpel");
    ResourceSet rSet = new ResourceSetImpl();
    Resource bpelResource = rSet.createResource(uri);
    // create/populate process
    process = BPELFactory.eINSTANCE.createProcess();
    process.setName("myBPEL");
    Sequence mySeq = BPELFactory.eINSTANCE.createSequence();
    mySeq.setName("mainSequence");
    process.setActivity(mySeq);
    // save resource
    bpelResource.getContents().add(process);
    Map<String, String> map = new HashMap<String, String>();
    map.put("bpel", "http://docs.oasis-open.org/wsbpel/2.0/process/executable");
    map.put("tns", "http://matrix.bpelprocess");
    map.put("xsd", "http://www.w3.org/2001/XMLSchema");
    bpelResource.save(map);
} catch (Exception e) {
    e.printStackTrace();
}
but I received an error:
INamespaceMap cannot be attached to an eObject ...
I read this message by Simon:
I understand that using the BPEL model outside of eclipse might be desirable, but it was never intended by us. Thus, this isn't supported
Is there any other API that can help?
You might want to give JAXB a try. It helps you to transform the official BPEL XSD into Java classes. You use those classes to construct your BPEL document and output it.
I had exactly the same problem with BPELUnit [1], so I started a module in BPELUnit that covers the first things necessary for generating and reading BPEL models [2], although it is far from complete. Only BPEL 2.0 is supported (1.1 will follow later), and handlers are not supported yet (but will be added). It is under active development because BPELUnit's code coverage component will be based on it, so it will become BPEL-feature complete over time. You are welcome to contribute if you need to close gaps earlier.
You can check it out from GitHub or grab the Maven artifact.
As of now there is no documentation, but you can have a look at the JUnit tests that read and write processes.
If this is not suitable for you, I'd like to share some experiences with you:
Do not use JAXB: You will need to read and write XML Namespaces which are not preserved with JAXB. That's why I have chosen XMLBeans. DOM would be the other alternative that I can think of.
The inheritance in the XML Schema is not really developer friendly. That's why there are own interface structures and wrappers around the XMLBeans generated classes.
Daniel
[1] http://www.bpelunit.net
[2] https://github.com/bpelunit/bpelunit/tree/master/net.bpelunit.model.bpel
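The DOM alternative mentioned above can be sketched with the JDK alone. This is only an illustration, not BPELUnit's or Eclipse's API: the class and method names are hypothetical, the process name and namespaces are taken from the question, and the output is a bare process with one empty sequence rather than a deployable BPEL file.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: builds a minimal BPEL 2.0 skeleton with plain DOM.
class BpelDomSketch {
    static final String BPEL_NS =
        "http://docs.oasis-open.org/wsbpel/2.0/process/executable";
    // Namespace that xmlns:* declaration attributes belong to.
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";

    static String buildProcess(String name, String targetNamespace) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            // Root <bpel:process> with explicit prefix declarations,
            // so the prefixes are under our control when serializing.
            Element process = doc.createElementNS(BPEL_NS, "bpel:process");
            process.setAttributeNS(XMLNS_NS, "xmlns:bpel", BPEL_NS);
            process.setAttributeNS(XMLNS_NS, "xmlns:tns", targetNamespace);
            process.setAttribute("name", name);
            process.setAttribute("targetNamespace", targetNamespace);
            doc.appendChild(process);

            // Single top-level activity, mirroring the question's mainSequence.
            Element sequence = doc.createElementNS(BPEL_NS, "bpel:sequence");
            sequence.setAttribute("name", "mainSequence");
            process.appendChild(sequence);

            // Serialize the DOM tree to a string.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not build BPEL document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildProcess("myBPEL", "http://matrix.bpelprocess"));
    }
}
```

Because every element is created with createElementNS and the prefix declarations are set explicitly, the namespace information stays in the serialized document, which is the concern raised against JAXB above.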
This has been solved using the unify framework API after adding the necessary classes to handle correlation. BPELUnit, mentioned by @Daniel, seems to be another alternative.
The Eclipse BPEL API is based on an EMF Model. So you could generate your own artifacts using JET or Xpand based on that. This way there is no requirement to run inside Eclipse.
Although you may not be able to use BPEL outside of Eclipse, have you considered moving parts of your application inside it?
The BPEL XML Schemas are listed in the appendix of the spec. So you could also base your work on that and integrate with existing BPEL applications where necessary.
In case anyone is looking to solve the above problem while still running inside the Eclipse environment:
The problem can be resolved as stated by Luca Pino here by adding:
AdapterRegistry.INSTANCE.registerAdapterFactory( BPELPackage.eINSTANCE, BasicBPELAdapterFactory.INSTANCE );
before the resource creation line i.e.
Resource bpelResource = rSet.createResource(uri);
Note: Another solution, to the same problem, also stating how to resolve the dependencies to make this code work, can be found in my other answer here.

$wnd.google.visualization is undefined

I'm currently building a SmartGWT-based web application (using the Portlet Layout). So I have several "Portlet", which basically extend GWT Window with different content. Now I want a Portlet to display Dygraphs. So I've created an RPC Service implementation which returns a JSON String (based on a DataTable object).
Since I cannot directly serialize a DataTable object I use
String json = JsonRenderer.renderDataTable(data, true, true).toString();
where "data" is of type DataTable.
Now this String gets correctly passed to the client side where I want to create the Dygraph. In this thread , someone suggested to use
public static native DataTable toDataTable(String json)
/*-{ return new $wnd.google.visualization.DataTable(eval("(" + json + ")")); }-*/;
If I use this in my GWT client code, I get an error saying
com.google.gwt.core.client.JavaScriptException: (TypeError): $wnd.google.visualization is undefined
Am I missing some "import" of the visualization API? Where do I have to instantiate it?
Or is there another way to get the JSON datastring into the Dygraph? I can't find any examples...
Thank you for any hint!
I assume you have included the visualization.jar and the visualization namespace in your module's XML
<inherits name="com.google.gwt.visualization.Visualization"/>
This will give you the Classes. You probably have done this otherwise you would have gotten a compiler error.
However you also have to include the actual visualization javascript file from the google servers (the visualization.jar is only a wrapper). This can be done in two different ways:
1.) Include it in the host page:
<script type="text/javascript">
google.load("visualization", "1", {'packages' : ["corechart"] });
</script>
or
2.) Load it dynamically where you need it:
VisualizationUtils.loadVisualizationApi(onLoadCallback, MotionChart.PACKAGE);
see http://code.google.com/docreader/#p=gwt-google-apis&s=gwt-google-apis&t=VisualizationGettingStarted
Btw. I have forked the Dygraphs Project and changed the GWT wrapper to more like the other visualization wrappers. You can check it out here: https://github.com/timeu/dygraphs
Edit: I have a new GWT wrapper for dygraphs that uses the GWT 2.8's new JsInterop: https://github.com/timeu/dygraphs-gwt
Note: I changed some behaviour in dygraphs and added some features which weren't available in the upstream code.
