I'd like to embed Tablesaw interactive graphs in Jupyter Notebook using the IJava kernel. I realize that Tablesaw may not be able to do this out of the box, but I'm willing to put in a little effort to make this happen. I'm a Java expert, but I'm new to Jupyter Notebook and to the IJava kernel, so I'm not sure where to start. Is there some API for Jupyter Notebook or for IJava for embedding objects?
First of all I installed Anaconda and then installed the IJava kernel in Jupyter Notebook with no problem. So far it's working without a hitch using OpenJDK 11 on Windows 10! Next I tried to use Tablesaw. I was able to add its Maven dependencies, load a CSV file, and create a plot. Very nice!
However, to produce a graph, Tablesaw generates a temporary HTML file using Plotly and invokes the browser to show the interactive plot. In other words, the graph does not appear inside the Jupyter Notebook.
Tablesaw has an example with embedded graphs using the BeakerX kernel (not IJava), and as you can see (scroll down to "Play (Money)ball with Linear Regression"), they are embedding a Tablesaw Plot directly within Jupyter Notebook. So I know that conceptually embedding an interactive Tablesaw graph in Jupyter Notebook with a Java kernel is possible.
Is this capability something specific to BeakerX? I would switch to BeakerX, but I didn't see anything in the documentation about BeakerX supporting Java 9+. In addition, IJava seemed like a leaner implementation, built directly on top of JShell.
Where do I start to figure out how to embed a Tablesaw Plot object as an interactive graph in Jupyter Notebook using the IJava kernel, the way they are doing in BeakerX?
The IJava kernel has two main functions for displaying data: display and render. Both hook into the Renderer from the base kernel, so we can register a render function for Tablesaw's Figure type.
Add the tablesaw-jsplot dependency (and all required transitive dependencies) via the %%loadFromPOM cell magic:
%%loadFromPOM
<dependency>
    <groupId>tech.tablesaw</groupId>
    <artifactId>tablesaw-jsplot</artifactId>
    <version>0.30.4</version>
</dependency>
Next, register a render function with IJava's renderer:
We create a registration for tech.tablesaw.plotly.components.Figure.
If no output type is specified during the render, we want it to default to text/html (the preferring call).
If an HTML render is requested, we build the target <div> as well as the JavaScript that invokes a Plotly render into that <div>. Most of this logic is handled by Tablesaw's asJavascript method, hooked into the Jupyter notebook's require AMD module setup.
import io.github.spencerpark.ijava.IJava;

IJava.getKernelInstance().getRenderer()
    .createRegistration(tech.tablesaw.plotly.components.Figure.class)
    .preferring(io.github.spencerpark.jupyter.kernel.display.mime.MIMEType.TEXT_HTML)
    .register((figure, ctx) -> {
        ctx.renderIfRequested(io.github.spencerpark.jupyter.kernel.display.mime.MIMEType.TEXT_HTML, () -> {
            String id = UUID.randomUUID().toString().replace("-", "");
            figure.asJavascript(id); // populates the figure's context for the div with this id
            Map<String, Object> context = figure.getContext();

            StringBuilder html = new StringBuilder();
            html.append("<div id=\"").append(id).append("\"></div>\n");
            html.append("<script>require(['https://cdn.plot.ly/plotly-1.44.4.min.js'], Plotly => {\n");
            html.append("var target_").append(id).append(" = document.getElementById('").append(id).append("');\n");
            html.append(context.get("figure")).append('\n');
            html.append(context.get("plotFunction")).append('\n');
            html.append("})</script>\n");
            return html.toString();
        });
    });
Then we can create a table to display (taken from the Tablesaw docs).
import tech.tablesaw.api.*;
import tech.tablesaw.plotly.api.*;
import tech.tablesaw.plotly.components.*;
String[] animals = {"bear", "cat", "giraffe"};
double[] cuteness = {90.1, 84.3, 99.7};
Table cuteAnimals = Table.create("Cute Animals")
    .addColumns(
        StringColumn.create("Animal types", animals),
        DoubleColumn.create("rating", cuteness)
    );
cuteAnimals
Finally, we can create a Figure for the table and display it via one of the three methods IJava offers for displaying things.
VerticalBarPlot.create("Cute animals", cuteAnimals, "Animal types", "rating");
This is equivalent to a render call where "text/html" is implicit, because we set it as the preferred type (the preferring call during registration):
render(VerticalBarPlot.create("Cute animals", cuteAnimals, "Animal types", "rating"), "text/html");
If the figure is not at the end of a cell, the display function is another option. For example, to display the chart and then show cuteAnimals afterwards:
Figure figure = VerticalBarPlot.create("Cute animals", cuteAnimals, "Animal types", "rating");
display(figure);
cuteAnimals
Try using the command prompt and running pip install tablesaw, then follow the other suggestions.
I am working on an integration with Apache Drill which enables users to query PDF files directly using SQL. I'm about 80% done and really impressed with how well Tabula works for this.
However, when I execute the first Drill query that uses the Tabula libraries a Java icon pops up and I get the following text in the command line:
2020-10-25 15:06:55.770 java[71188:7121498] Persistent UI failed to open file file://localhost/Users/******/Saved%20Application%20State/net.java.openjdk.cmd.savedState/window_1.data: Permission denied (13)
I changed the permissions on that directory but I'm still getting the Java popup.
This is not normal behavior for Drill and my goal here was to integrate Tabula programmatically. Is Tabula trying to open a window or something like that and if so, is there a way to disable this? I noted that this does not occur in my unit tests.
Here are some relevant code snippets:
public static List<Table> extractTablesFromPDF(PDDocument document, ExtractionAlgorithm algorithm) {
    NurminenDetectionAlgorithm detectionAlgorithm = new NurminenDetectionAlgorithm();
    ExtractionAlgorithm algExtractor;
    SpreadsheetExtractionAlgorithm extractor = new SpreadsheetExtractionAlgorithm();
    ObjectExtractor objectExtractor = new ObjectExtractor(document);
    PageIterator pages = objectExtractor.extract();
    List<Table> tables = new ArrayList<>();
    while (pages.hasNext()) {
        Page page = pages.next();
        algExtractor = algorithm;
        /*if (extractor.isTabular(page)) {
            algExtractor = new SpreadsheetExtractionAlgorithm();
        } else {
            algExtractor = new BasicExtractionAlgorithm();
        }*/
        List<Rectangle> tablesOnPage = detectionAlgorithm.detect(page);
        for (Rectangle guessRect : tablesOnPage) {
            Page guess = page.getArea(guessRect);
            tables.addAll(algExtractor.extract(guess));
        }
    }
    return tables;
}
Thanks in advance for your help!
Because some code is executed that performs an operation that usually, but technically not necessarily, requires so-called 'headful' mode (that's perhaps not really a term, but its opposite, 'headless', certainly is). This causes a few things to happen, including that icon showing up.
One easy way out is to force headless mode. But note that when you do this, any of these 'usually but technically not necessarily headful' operations may either [1] work fine and no longer show that icon, or [2] crash with a HeadlessException. Which one you end up with depends not just on which operation you're doing, but also on which VM you're doing it on. As a rule, once one of these operations works fine and no longer throws, later versions won't revert to throwing (in other words, newer versions of Java offer more things that work in headless mode).
To force headless mode, run java with java -Djava.awt.headless=true.
If you must do it from within java code, run System.setProperty("java.awt.headless", "true"); at least once, and before you do any of these 'usually causes headful mode' operations.
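As a minimal sketch of the in-code variant (the property must be set before any AWT-touching code runs, because AWT reads it when it initializes):

```java
import java.awt.GraphicsEnvironment;

public class ForceHeadless {
    public static void main(String[] args) {
        // Set before any AWT class initializes a graphics environment.
        System.setProperty("java.awt.headless", "true");
        // AWT now reports headless mode; no Dock/taskbar icon will appear.
        System.out.println(GraphicsEnvironment.isHeadless());
    }
}
```

If some other code path touches AWT first (a static initializer, a logging appender that renders images, etc.), the property is read too late, which is why the command-line -D flag is the more reliable option.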
Presumably, the thing that causes headful mode to occur is something graphics-related, such as rendering a JPG or PNG into a BufferedImage. It's not surprising that Apache Drill is doing this to 'read' images, for example.
Another option is to just upgrade your VM; maybe that helps. As a general rule, features move downwards on this list over time:
Requires headful mode: running it makes the VM go headful (icon appears); if java.awt.headless is set, the operation fails with a HeadlessException.
Causes headful mode: running it makes the VM go headful; however, if headless is set, it works fine and won't do that.
Completely freed: running the code works fine and does not cause the VM to go headful; the headless flag has no bearing whatsoever on how the code operates.
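The 'requires headful mode' case is easy to demonstrate with a plain AWT window, which is a firmly headful operation:

```java
import java.awt.Frame;
import java.awt.HeadlessException;

public class HeadlessCrash {
    public static void main(String[] args) {
        System.setProperty("java.awt.headless", "true");
        try {
            // Constructing a top-level AWT window requires a real display.
            new Frame();
            System.out.println("no exception");
        } catch (HeadlessException e) {
            // With java.awt.headless=true, AWT refuses up front.
            System.out.println("HeadlessException");
        }
    }
}
```

Image decoding into a BufferedImage, by contrast, has long been in the 'works fine headless' category, which is why forcing headless is usually safe for PDF/image processing libraries.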
Help! I'm looking to create a Java application that generates a graph in any one of these formats:
.graphml
.ygf
.gml
.tgf
I need to be able to open the file in the graph editor "yEd".
So far, I have found these solutions:
yFiles For Java
Pro: Export to graphml, able to open in yEd, Java based, perfect.
Why I can't use it: It would cost me more than $2000 :( even though it is exactly what I need.
Gephi
Pro: FREE, Export to graphml, Java based!
Why I can't use it: When I try to open the generated graphml file in yEd, the graph is broken: it's laid out linearly, all nodes on one line.
If I get it to work, then this is perfect
The graph I tried was generated using their example project
JGraphX
Pro: Able to generate a graph, Java based, FREE
Why I can't use it: How to export the generated graph to graphml? I couldn't figure it out...
Prefuse
Pro: Free, graph generation, Java based
Why I can't use it: It seems like I can only read graphml, not write it. Also, I built the demos fine with build.sh all, but when I tried to run demos.jar, I got "Failed to load Main-Class"...
Blueprints with GraphML Reader and Writer Library (Tinkerpop?)
Pro: Java, Free, seems like you can export graphml with it
Why I can't use it: I'm confused, do I need to use this in conjunction with one of the "Implementations" listed? How do I use this?
JGraphT with GraphMLExporter
Pro: Able to generate graph, Java based, free, can export to graphml I think
Why I can't use it: I can't figure out how to export it! When I tried to open the generated graphml in yEd, I got "yEd has encountered the following error: Could not import file test.graphml." I used their example project, and did this:
JGraphT Code I Used:
UndirectedGraph<String, DefaultEdge> g = new SimpleGraph<String, DefaultEdge>(DefaultEdge.class);
String v1 = "v1";
String v2 = "v2";
String v3 = "v3";
String v4 = "v4";
// add the vertices
g.addVertex(v1);
g.addVertex(v2);
g.addVertex(v3);
g.addVertex(v4);
// add edges to create a circuit
g.addEdge(v1, v2);
g.addEdge(v2, v3);
g.addEdge(v3, v4);
g.addEdge(v4, v1);
FileWriter w;
try {
    GmlExporter<String, DefaultEdge> exporter = new GmlExporter<String, DefaultEdge>();
    w = new FileWriter("test.graphml");
    exporter.export(w, g);
} catch (IOException e) {
    e.printStackTrace();
}
Any ideas? Thanks!
It might be late to answer, but for solution number two:
Right after you import the graph into yEd, just click "Layout" and select one. yEd will not choose one for you by default; that's why the graph seemed linear.
I also wanted to export JGraphT graphs for yEd but was not happy with the results, so I created an extended GML writer supporting yEd's specific GML format (groups, colours, different edges, ...).
GML-Writer-for-yED
I don't know if this fits your use case, but I use neo4j for creating a graph and then use the neo4j-shell-tools to export the graph as graphml. Perhaps this will work for you.
Just replace every occurrence of GmlExporter with GraphMLExporter in your code. That should work.
I'm using the Prefuse library, and you can generate a GraphML file from a Graph object with the class GraphMLWriter.
I created a little tutorial/GitHub repo with sample code on how to work with the JGraphT classes to export to GraphML and GML, and how the results can look in yEd.
As already mentioned in another answer, if you don't want to do much configuration yourself, GML-Writer-for-yED might be handy.
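For what it's worth, the GraphML needed here is simple enough to emit by hand with no graph library at all. A stdlib-only sketch for the four-node circuit from the question (element layout follows the GraphML schema; yEd may still need a Layout applied after import, as noted above):

```java
public class GraphMlSketch {
    // Emits a bare-bones GraphML document: a graphml root, one undirected
    // graph, and node/edge elements referencing node ids.
    static String toGraphMl(String[] nodes, String[][] edges) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<graphml xmlns=\"http://graphml.graphdrawing.org/xmlns\">\n");
        sb.append("  <graph id=\"G\" edgedefault=\"undirected\">\n");
        for (String n : nodes) {
            sb.append("    <node id=\"").append(n).append("\"/>\n");
        }
        for (String[] e : edges) {
            sb.append("    <edge source=\"").append(e[0])
              .append("\" target=\"").append(e[1]).append("\"/>\n");
        }
        sb.append("  </graph>\n</graphml>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] nodes = {"v1", "v2", "v3", "v4"};
        String[][] edges = {{"v1", "v2"}, {"v2", "v3"}, {"v3", "v4"}, {"v4", "v1"}};
        String xml = toGraphMl(nodes, edges);
        System.out.println(xml.contains("<node id=\"v1\"/>"));
        System.out.println(xml.contains("edgedefault=\"undirected\""));
    }
}
```

Note this skips escaping of node ids and any yEd-specific styling; for real data, one of the libraries above (or a proper XML writer) is the safer route.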
How do you create/set a Record Selection Formula programmatically on Crystal Reports using Java? I tried searching the internet, but the only option I found is through IFilter, which requires a Crystal Reports Server. My program only uses the JRC library. Also, this is a Java desktop application using Swing.
It may be a bit late, but maybe this is useful for someone:
reportClientDoc.getDataDefController().getRecordFilterController().setFormulaText("your record selection formula here");
I was doing some research on this and noticed that there are three methods with which you can do it:
Using the IFilter interface as shown in this example provided by SAP
// Set the filter string to be used as the Record Filter
String freeEditingFilter = "{Customer.Country} = 'Canada'";
// Retrieve the record filter for the Data Definition Controller
IFilter iFilter = clientDoc.getDataDefController().getDataDefinition().getRecordFilter();
// Set the filter to free editing text filter string
iFilter.setFreeEditingText(freeEditingFilter);
// Modify the filter through the Record Filter Controller to the report
clientDoc.getDataDefController().getRecordFilterController().modify(iFilter);
I am using the JRC only without a Crystal Report Server and the above example worked for me.
As Francisco said in his answer, using the setFormulaText method:
clientDoc.getDataDefController().getRecordFilterController().setFormulaText("{Customer.Country} = 'Canada'");
Using parameters. Parameters can be passed to the report in code (you can use the addDiscreteParameterValue function in the helper class), or else they can be filled in by the user at runtime. I chose not to go with this option because parameters cannot be made optional.
If you want to create a Crystal Report from your program, you need an additional jar.
You can create your program in the NetBeans IDE and link it with the iReport tool, which is used with NetBeans for creating reports in Java.
You can find many examples of this on the internet.
I created a .jar Java applet with Processing. I want to dynamically embed the applet in my page, but what I'm doing currently replaces my entire page with the applet rather than appending it to a predefined DOM container. Here's my code:
var $projContainer = $('#project_container');
var attributes = {
    codebase: 'http://java.sun.com/update/1.6.0/jinstall-6u20-windows-i586.cab',
    code: 'tree',
    archive: projFile,
    width: '680',
    height: '360'
};
var parameters = {
    archive: projFile,
    code: 'tree',
    scriptable: 'true',
    image: '/images/structure/processing_loading.gid',
    boxMessage: 'Loading...',
    boxbgcolor: '#FFFFFF'
};
var version = '1.5' ; // JDK version
deployJava.runApplet(attributes, parameters, version); //Want to be able to specify a dom element to deploy to here
what I'm doing is currently replacing my entire current page with the java applet..
This is a known issue with the runApplet function. One alternative is to:
Extend deployJava.js (by adding it into your page, then writing a new function)
Add a new function getApplet or getAppletElement that basically mimics what runApplet does, but instead of writing the applet tag to the document, builds and returns it as a String.
Do as you will, with the String.
I ended up porting to processing.js for this, which has no problems with being loaded via AJAX.
I put a tutorial for that process up here, since people were contacting me about it:
http://mikeheavers.com/index.php/site/code_single/load_processing.js_sketch_with_ajax_on_user_click
What are the best Java libraries to fully download any webpage, render the built-in JavaScript, and then access the rendered page (that is, the DOM tree) programmatically, getting the DOM tree as HTML source?
(Something similarly what firebug does in the end, it renders the page and I get access to the fully rendered DOM Tree, as the page looks like in the browser! In contrast, if I click "show source" I only get the JavaScript source code. This is not what I want. I need to have access to the rendered page...)
(With rendering I mean only rendering the DOM Tree not a visual rendering...)
This does not have to be one single library, it's ok to have several libraries that can accomplish this together (one will download, one render...), but due to the dynamic nature of JavaScript most likely the JavaScript library will also have to have some kind of downloader to fully render any asynchronous JS...
Background:
In the "good old days" HttpClient (Apache Library) was everything required to build your own very simple crawler. (A lot of cralwers like Nutch or Heretrix are still built around this core princible, mainly focussing on Standard HTML parsing, so I can't learn from them)
My problem is that I need to crawl some websites that rely heavily on JavaScript and that I can't parse with HttpClient as I defenitely need to execute the JavaScripts before...
You can use JavaFX 2 WebEngine. Download JavaFX SDK (you may already have it if you installed JDK7u2 or later) and try code below.
It will print html with processed javascript.
You can uncomment lines in the middle to see rendering as well.
public class WebLauncher extends Application {

    @Override
    public void start(Stage stage) {
        final WebView webView = new WebView();
        final WebEngine webEngine = webView.getEngine();
        webEngine.load("http://stackoverflow.com");
        //stage.setScene(new Scene(webView));
        //stage.show();
        webEngine.getLoadWorker().workDoneProperty().addListener(new ChangeListener<Number>() {
            @Override
            public void changed(ObservableValue<? extends Number> observable, Number oldValue, Number newValue) {
                if (newValue.intValue() == 100 /* percent */) {
                    try {
                        org.w3c.dom.Document doc = webEngine.getDocument();
                        new XMLSerializer(System.out, new OutputFormat(doc, "UTF-8", true)).serialize(doc);
                    } catch (IOException ex) {
                        ex.printStackTrace();
                    }
                }
            }
        });
    }

    public static void main(String[] args) {
        launch();
    }
}
This is a bit outside of the box, but if you are planning on running your code in a server where you have complete control over your environment, it might work...
Install Firefox (or XulRunner, if you want to keep things lightweight) on your machine.
Using the Firefox plugins system, write a small plugin which loads a given URL, waits a few seconds, then copies the page's DOM into a String.
From this plugin, use the Java LiveConnect API (see http://jdk6.java.net/plugin2/liveconnect/ and https://developer.mozilla.org/en/LiveConnect ) to push that string across to a public static function in some embedded Java code, which can either do the required processing itself or farm it out to some more complicated code.
Benefits: You are using a browser that most application developers target, so the observed behavior should be comparable. You can also upgrade the browser along the normal upgrade path, so your library won't become out-of-date as HTML standards change.
Disadvantages: You will need to have permission to start a non-headless application on your server. You'll also have the complexity of inter-process communication to worry about.
I have used the plugin API to call Java before, and it's quite achievable. If you'd like some sample code, you should take a look at the XQuery plugin - it loads XQuery code from the DOM, passes it across to the Java Saxon library for processing, then pushes the result back into the browser. There are some details about it here:
https://developer.mozilla.org/en/XQuery
The Selenium library is normally used for testing, but it gives you remote control of most standard browsers (IE, Firefox, etc.) as well as a headless, browser-free mode (using HtmlUnit). Because it is intended for UI verification by page scraping, it may well serve your purposes.
In my experience it can sometimes struggle with very slow JavaScript, but with careful use of "wait" commands you can get quite reliable results.
It also has the benefit that you can actually drive the page, not just scrape it. That means that if you perform some actions on the page before you get to the data you want (click the search button, click next, now scrape) then you can code that into the process.
I don't know if you'll be able to get the full DOM in a navigable form from Selenium, but it does provide XPath retrieval for the various parts of the page, which is what you'd normally need for a scraping application.
You can use Java, or Groovy with or without Grails. Then use WebDriver, Selenium, Spock, and Geb; these are intended for testing purposes, but the libraries are useful for your case.
You can implement a crawler that won't open a new window, just a runtime instance of either browser.
Selenium : http://code.google.com/p/selenium/
Webdriver : http://seleniumhq.org/projects/webdriver/
Spock : http://code.google.com/p/spock/
Geb : http://www.gebish.org/manual/current/testing.html
MozSwing could help http://confluence.concord.org/display/MZSW/Home.
You can try JExplorer.
For more information see http://www.teamdev.com/downloads/jexplorer/docs/JExplorer-PGuide.html
You can also try Cobra, see http://lobobrowser.org/cobra.jsp
I haven't tried this project, but I have seen several implementations for node.js that include javascript dom manipulation.
https://github.com/tmpvar/jsdom