I'm working on an Apache Jena project. I've got a Fuseki server running on my localhost.
I want to create a Java program for my Fuseki server that shows all the data in the triplestore in a JTable. I just have no idea how to parse the result of my query into a JTable.
My code so far:
(left out the part where the window, table, frame etc is created)
private void Go() {
    String query = "SELECT ?subject ?predicate ?object \n" +
        "WHERE { \n" +
        "?subject ?predicate ?object }";
    Query sparqlQuery = QueryFactory.create(query, Syntax.syntaxARQ);
    QueryEngineHTTP httpQuery = new QueryEngineHTTP("http://localhost:3030/AnimalDataSet/", sparqlQuery);
    ResultSet results = httpQuery.execSelect();
    // note: ResultSetFormatter.asText() consumes the ResultSet, so the loop
    // below sees no rows unless the query is executed again
    System.out.println(ResultSetFormatter.asText(results));
    while (results.hasNext()) {
        QuerySolution solution = results.next();
    }
    httpQuery.close();
}
The sysout prints this, which is the correct data:
-------------------------------------------------------------------------------------------------------------------------------------
| subject | predicate | object |
=====================================================================================================================================
| <urn:animals:data> | <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> | <http://www.w3.org/1999/02/22-rdf-syntax-ns#Seq> |
| <urn:animals:data> | <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> | <urn:animals:lion> |
| <urn:animals:data> | <http://www.w3.org/1999/02/22-rdf-syntax-ns#_2> | <urn:animals:tarantula> |
| <urn:animals:data> | <http://www.w3.org/1999/02/22-rdf-syntax-ns#_3> | <urn:animals:hippopotamus> |
| <urn:animals:lion> | <http://www.some-ficticious-zoo.com/rdf#name> | "Lion" |
| <urn:animals:lion> | <http://www.some-ficticious-zoo.com/rdf#species> | "Panthera leo" |
| <urn:animals:lion> | <http://www.some-ficticious-zoo.com/rdf#class> | "Mammal" |
| <urn:animals:tarantula> | <http://www.some-ficticious-zoo.com/rdf#name> | "Tarantula" |
| <urn:animals:tarantula> | <http://www.some-ficticious-zoo.com/rdf#species> | "Avicularia avicularia" |
| <urn:animals:tarantula> | <http://www.some-ficticious-zoo.com/rdf#class> | "Arachnid" |
| <urn:animals:hippopotamus> | <http://www.some-ficticious-zoo.com/rdf#name> | "Hippopotamus" |
| <urn:animals:hippopotamus> | <http://www.some-ficticious-zoo.com/rdf#species> | "Hippopotamus amphibius" |
| <urn:animals:hippopotamus> | <http://www.some-ficticious-zoo.com/rdf#class> | "Mammal" |
-------------------------------------------------------------------------------------------------------------------------------------
I really hope someone here knows how to parse the data from the query into a JTable :D
Thanks in advance!
I've done some further research and finally found the solution! It's quite easy actually.
You simply change the while loop like this, filling the JTable's model as you iterate:
DefaultTableModel model = (DefaultTableModel) table.getModel();
while (results.hasNext()) {
    QuerySolution sol = results.nextSolution();
    RDFNode subject = sol.get("subject");
    RDFNode predicate = sol.get("predicate");
    RDFNode object = sol.get("object");
    model.addRow(new Object[]{subject, predicate, object});
}
And that works fine for me!
For everyone who's interested, I've published my version as it is now to Pastebin, with comments:
The link to the full (current) version of my project
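For anyone piecing it together, here is a fuller sketch of the whole thing as one hypothetical loadTriplesIntoTable() method. It is untested and makes a few assumptions: the JTable is a field called table, the Fuseki query endpoint is the dataset URL with /query appended, and the imports are the org.apache.jena ones from Jena 3.x (older releases use the com.hp.hpl.jena packages):

import javax.swing.table.DefaultTableModel;
import org.apache.jena.query.*;

private void loadTriplesIntoTable() {
    String queryString =
        "SELECT ?subject ?predicate ?object WHERE { ?subject ?predicate ?object }";
    // QueryExecution is AutoCloseable, so try-with-resources closes the HTTP connection
    try (QueryExecution qexec = QueryExecutionFactory.sparqlService(
            "http://localhost:3030/AnimalDataSet/query", queryString)) {
        ResultSet results = qexec.execSelect();
        DefaultTableModel model = (DefaultTableModel) table.getModel();
        model.setColumnIdentifiers(new Object[]{"subject", "predicate", "object"});
        model.setRowCount(0); // clear out any rows from a previous run
        while (results.hasNext()) {
            QuerySolution sol = results.nextSolution();
            // RDFNode.toString() is what ends up rendered in the JTable cells
            model.addRow(new Object[]{sol.get("subject"), sol.get("predicate"), sol.get("object")});
        }
    }
}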
Related
I'm building up a series of distribution analyses using the Java Spark library. This is the actual code I'm using to fetch the data from a JSON file and save the output.
Dataset<Row> dataset = spark.read().json("local/foods.json");
dataset.createOrReplaceTempView("cs_food");
List<GenericAnalyticsEntry> menu_distribution= spark
.sql(" ****REQUESTED QUERY ****")
.toJavaRDD()
.map(row -> Triple.of( row.getString(0), BigDecimal.valueOf(row.getLong(1)), BigDecimal.valueOf(row.getLong(2))))
.map(GenericAnalyticsEntry::of)
.collect();
writeObjectAsJsonToHDFS(fs, "/local/output/menu_distribution_new.json", menu_distribution);
The query I'm looking for is based on this structure:
+------------+-------------+------------+------------+
| FIRST_FOOD | SECOND_FOOD | DATE | IS_SPECIAL |
+------------+-------------+------------+------------+
| Pizza | Spaghetti | 11/02/2017 | TRUE |
+------------+-------------+------------+------------+
| Lasagna | Pizza | 12/02/2017 | TRUE |
+------------+-------------+------------+------------+
| Spaghetti | Spaghetti | 13/02/2017 | FALSE |
+------------+-------------+------------+------------+
| Pizza | Spaghetti | 14/02/2017 | TRUE |
+------------+-------------+------------+------------+
| Spaghetti | Lasagna | 15/02/2017 | FALSE |
+------------+-------------+------------+------------+
| Pork | Mozzarella | 16/02/2017 | FALSE |
+------------+-------------+------------+------------+
| Lasagna | Mozzarella | 17/02/2017 | FALSE |
+------------+-------------+------------+------------+
How can I achieve the output below from the code written above?
+------------+--------------------+----------------------+
| FOODS | occurrences(First) | occurrences (Second) |
+------------+--------------------+----------------------+
| Pizza | 2 | 1 |
+------------+--------------------+----------------------+
| Lasagna | 2 | 1 |
+------------+--------------------+----------------------+
| Spaghetti | 2 | 3 |
+------------+--------------------+----------------------+
| Mozzarella | 0 | 2 |
+------------+--------------------+----------------------+
| Pork | 1 | 0 |
+------------+--------------------+----------------------+
I've of course tried to figure out a solution myself, but had no luck. I may be wrong, but I think I need something like this:
"SELECT (first_food + second_food) as menu, COUNT(first_food), COUNT(second_food) from cs_food GROUP BY menu"
From the example data, this looks like it will produce the output you want:
select
foods,
first_count,
second_count
from
(select first_food as food from menus
union select second_food from menus) as f
left join (
select first_food, count(*) as first_count from menus
group by first_food
) as ff on ff.first_food=f.food
left join (
select second_food, count(*) as second_count from menus
group by second_food
) as sf on sf.second_food=f.food
;
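Not from the original answer, but as a rough Java sketch of how that query shape could be dropped into the pipeline from the question: the temp view here is the cs_food one registered above (the SQL above calls the table menus), spark is the SparkSession from the question's snippet, and COALESCE is added so a food that never appears in one of the columns shows 0 instead of null:

Dataset<Row> perFood = spark.sql(
    "SELECT f.food AS foods, " +
    "       COALESCE(ff.first_count, 0)  AS occurrences_first, " +
    "       COALESCE(sf.second_count, 0) AS occurrences_second " +
    "FROM (SELECT FIRST_FOOD AS food FROM cs_food " +
    "      UNION SELECT SECOND_FOOD FROM cs_food) f " +
    "LEFT JOIN (SELECT FIRST_FOOD, COUNT(*) AS first_count " +
    "           FROM cs_food GROUP BY FIRST_FOOD) ff ON ff.FIRST_FOOD = f.food " +
    "LEFT JOIN (SELECT SECOND_FOOD, COUNT(*) AS second_count " +
    "           FROM cs_food GROUP BY SECOND_FOOD) sf ON sf.SECOND_FOOD = f.food");
perFood.show();

Since the result is again a three-column Dataset<Row> (string plus two counts), the existing .toJavaRDD().map(...) chain from the question should work on it largely unchanged.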
A simple combination of flatMap and groupBy should do the job, like this (sorry, I can't check whether it's 100% correct right now):
import spark.sqlContext.implicits._
import org.apache.spark.sql.{Row, functions => F}
val df = Seq(("Pizza", "Pasta"), ("Pizza", "Soup")).toDF("first", "second")
// (food, 1, 0) for a first course, (food, 0, 1) for a second; then sum the counters per food
df.flatMap({ case Row(first: String, second: String) => Seq((first, 1, 0), (second, 0, 1)) })
  .groupBy("_1")
  .agg(F.sum("_2").as("occurrences(First)"), F.sum("_3").as("occurrences(Second)"))
I want to randomly select a subset of my data and then limit it to 200 entries. But after using the sample() function, I'm getting duplicate rows, and I don't know why. Let me show you:
DataFrame df= sqlContext.sql("SELECT * " +
" FROM temptable" +
" WHERE conditions");
DataFrame df1 = df.select(df.col("col1"))
.where(df.col("col1").isNotNull())
.distinct()
.orderBy(df.col("col1"));
df1.show();
System.out.println(df1.count());
Up until now, everything is OK. I get the output:
+-----------+
|col1 |
+-----------+
| 10016|
| 10022|
| 100281|
| 10032|
| 100427|
| 100445|
| 10049|
| 10070|
| 10076|
| 10079|
| 10081|
| 10082|
| 100884|
| 10092|
| 10099|
| 10102|
| 10103|
| 101039|
| 101134|
| 101187|
+-----------+
only showing top 20 rows
10512
with 10512 records without duplicates. AND THEN!
df1 = df1.sample(true, 0.5).limit(200);
df1.show();
System.out.println(df1.count());
This returns 200 rows full of duplicates:
+-----------+
|col1 |
+-----------+
| 10022|
| 100445|
| 100445|
| 10049|
| 10079|
| 10079|
| 10081|
| 10081|
| 10082|
| 10092|
| 10102|
| 10102|
| 101039|
| 101134|
| 101134|
| 101134|
| 101345|
| 101345|
| 10140|
| 10141|
+-----------+
only showing top 20 rows
200
Can anyone tell me why? This is driving me crazy. Thank you!
You explicitly ask for a sample with replacement, so there is nothing unexpected about getting duplicates:
public Dataset<T> sample(boolean withReplacement, double fraction)
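Not part of the original answer, but if the goal is 200 distinct random rows, here are two hedged sketches against the Spark 1.x DataFrame API used in the question (df1 is the deduplicated frame built above):

// 1) sample WITHOUT replacement: no source row is picked twice
DataFrame sampled = df1.sample(false, 0.5).limit(200);

// 2) or skip sample() entirely: shuffle the already-distinct rows and take the first 200
DataFrame random200 = df1.orderBy(org.apache.spark.sql.functions.rand()).limit(200);
random200.show();
System.out.println(random200.count());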
My table looks like this
+-------+--------------------+
|id | c1|
+-------+--------------------+
| 1|ArrayBuffer(a,b) |
| 1|ArrayBuffer(c ) |
| 2|ArrayBuffer(d ) |
| 2|ArrayBuffer(e,f) |
| 2|ArrayBuffer(g ) |
| 3|ArrayBuffer(h ) |
+-------+--------------------+
I want the output to look like this
+-------+--------------------+
|id | c1|
+-------+--------------------+
| 1|ArrayBuffer(a,b,c) |
|      2|ArrayBuffer(d,e,f,g)|
| 3|ArrayBuffer(h ) |
+-------+--------------------+
Here's what I was thinking
SQLQuery = "SELECT table.id, join(table.c1) FROM table GROUP BY table.id";
sqlContext.udf().register("join",
    new UDF1<ArrayBuffer, ArrayBuffer>() {
        @Override
        public ArrayBuffer call(ArrayBuffer idArray) {
            // how do I join them?
            return idArray;
        }
    }, DataTypes.StringType);
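As a hedged aside (not from the thread): a scalar UDF only ever sees one row's value, so it cannot merge arrays across rows. A common alternative is to explode the array column and collect the elements back per id. The sketch below assumes a Spark version (or HiveContext) where explode and collect_list are available, and keeps the question's placeholder table name:

// one row per (id, element), then gather the elements back into a single array per id
DataFrame merged = sqlContext.sql(
    "SELECT id, collect_list(item) AS c1 " +
    "FROM (SELECT id, explode(c1) AS item FROM table) exploded " +
    "GROUP BY id");
merged.show();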
I am running a map-only job in Hadoop. The data set is a set of HTML pages in a single file (returned by a crawler).
The mapper code is written in Java, and I am using Jsoup to parse. What I want as my output is a key that holds both the content of the title tag and the content of a meta tag. Ideally I should get 1592 map output records; I am getting 3184.
The concatenation I attempt with this line of code is not happening:
String MN_Job = (jobT + "\t" + jobsDetail);
What I get instead is each of these separately, hence double the number of outputs. What am I doing wrong here?
public class JobsDataMapper extends Mapper<LongWritable, Text, Text, Text> {
private Text keytext = new Text();
private Text valuetext = new Text();
@Override
public void map(LongWritable key, Text value, Context context)
throws IOException, InterruptedException {
String line = value.toString();
Document doc = Jsoup.parse(line);
Elements desc = doc.select("head title, meta[name=twitter:description]");
for (Element jobhtml : desc) {
Elements title = jobhtml.select("title");
String jobT = "";
for (Element titlehtml : title) {
jobT = titlehtml.text();
}
Elements meta = jobhtml.select("meta[name=twitter:description]");
String jobsDetail ="";
for (Element metahtml : meta) {
String content = metahtml.attr("content");
String content1 = content.replaceAll("\\p{Punct}+", " ");
jobsDetail = content1.replaceAll(" (?i)a | (?i)able | (?i)about | (?i)across | (?i)after | (?i)all | (?i)almost | (?i)also | (?i)am | (?i)among | (?i)an | (?i)and | (?i)any | (?i)are | (?i)as | (?i)at | (?i)be | (?i)because | (?i)been | (?i)but | (?i)by | (?i)can | (?i)cannot | (?i)could | (?i)dear | (?i)did | (?i)do | (?i)does | (?i)either | (?i)else | (?i)ever | (?i)every | (?i)for | (?i)from | (?i)get | (?i)got | (?i)had | (?i)has | (?i)have | (?i)he | (?i)her | (?i)hers | (?i)him | (?i)his | (?i)how | (?i)however | (?i)i | (?i)if | (?i)in | (?i)into | (?i)is | (?i)it | (?i)its | (?i)just | (?i)least | (?i)let | (?i)like | (?i)likely | (?i)may | (?i)me | (?i)might | (?i)most | (?i)must | (?i)my | (?i)neither | (?i)no | (?i)nor | (?i)not | (?i)nbsp | (?i)of | (?i)off | (?i)often | (?i)on | (?i)only | (?i)or | (?i)other | (?i)our | (?i)own | (?i)rather | (?i)said | (?i)say | (?i)says | (?i)she | (?i)should | (?i)since | (?i)so | (?i)some | (?i)than | (?i)that | (?i)the | (?i)their | (?i)them | (?i)then | (?i)there | (?i)these | (?i)they | (?i)this | (?i)tis | (?i)to | (?i)too | (?i)twas | (?i)us | (?i)wants | (?i)was | (?i)we | (?i)were | (?i)what | (?i)when | (?i)where | (?i)which | (?i)while | (?i)who | (?i)whom | (?i)why | (?i)will | (?i)with | (?i)would | (?i)yet | (?i)you | (?i)your "," ");
}
String IT_Job = (jobT + "\t" + jobsDetail);
keytext.set(IT_Job) ;
valuetext.set("JobDetail");
context.write( keytext, valuetext );
}
}
}
Edit: I know what the problem is. But the thing is that the solution might not be obvious in MapReduce. You might have to write your custom RecordReader. Let me explain the problem.
In your code you read line by line. Then you apply this to the line you read:
Elements desc = doc.select("head title, meta[name=twitter:description]");
But evidently, a given line might only contain a title or a <meta name=twitter:description> tag. So you read one of those and store it; the other one remains blank. At any given moment, only one of your variables, jobT or jobsDetail, has any data. So for the code snippet:
String IT_Job = (jobT + "\t" + jobsDetail);
one time the first part is blank, and the next time the other part is blank. So if you are expecting n records, you get 2n records. Similarly, if you attempt to extract three fields, you should get 3n records. You can test this theory by extracting another field and checking whether you get three times the expected number of records.
If the theory turns out to be correct, you might want to delimit the webpages you extract with a specific delimiter string. Then you want to write a custom RecordReader which will read one html file at a time according to the delimiter and then process the entire html file at once. That way you'll get the title and the meta tags together.
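As a possible shortcut before writing a full custom RecordReader: recent Hadoop 2.x releases let you change the record delimiter of the stock TextInputFormat through configuration. Assuming the crawler writes a known marker between pages (the ===PAGE=== string below is purely hypothetical), each call to map() would then receive one whole HTML page instead of one line, so the title and the meta description end up in the same output record:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class JobsDataDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // hypothetical marker the crawler inserts between pages
        conf.set("textinputformat.record.delimiter", "===PAGE===");

        Job job = Job.getInstance(conf, "jobs data");
        job.setJarByClass(JobsDataMapper.class);
        job.setMapperClass(JobsDataMapper.class);
        job.setNumReduceTasks(0);                 // map-only job
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}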
Just by looking at the numbers: 3184 / 2 = 1592.
I think your file is simply duplicated in the input folder. I can't tell for sure, because you have not shown the code that submits the job, but maybe you can verify it with a simple:
bin/hadoop fs -ls /your/input_path
When submitting, either make sure that there is just the single file in there, or just reference the single file in your submission logic.
I made changes to the original code, removing the loops that were not necessary. What was happening in the older code was that when there was a title in the record, it was output, and later, when there was content, it was output as well. So there were two writes per HTML file.
public class JobsDataMapper extends Mapper<LongWritable, Text, Text, Text> {
private Text keytext = new Text();
private Text valuetext = new Text();
private String jobT = new String();
private String jobName= new String();
@Override
public void map(LongWritable key, Text value, Context context)
throws IOException, InterruptedException {
String line = value.toString();
Document doc = Jsoup.parse(line);
Elements desc = doc.select("head title, meta[name=twitter:description]");
for (Element jobhtml : desc){
Elements title = jobhtml.select("title");
String jobTT = title.text();
jobT =jobTT ;
if (jobT.length()> 0){
jobName=jobTT;
}
Elements meta = jobhtml.select("meta[name=twitter:description]");
String jobsDetail ="";
String content = meta.attr("content");
String content1 = content.replaceAll("\\p{Punct}+", " ");
jobsDetail = content1.toLowerCase();
jobsDetail = content1.replaceAll(" a| able | about | across | after | all | almost | also | am | among | an | and | any | are | as | at | be| because | been | but | by | can | cannot | could | dear | did | do | does | either | else | ever | every | for | from | get | got | had | has | have | he | her | hers | him | his | how | however | i | if | in | into | is | it | its | just | least | let | like | likely | may | me | might | most | must | my | neither | no | nor | not | nbsp | of | off | often | on | only | or | other | our | own | rather | said | say | says | she | should | since | so | some | than | that | the | their | them | then | there | these | they | this | tis | to | too | twas | us | wants | was | we | were | what | when | where | which | while | who | whom | why | will | with | would | yet | you | your "," ");
if (jobsDetail.length()>0) {
String MN_Job = (jobName+ "\t" + jobsDetail);
keytext.set(MN_Job) ;
valuetext.set("JobInIT");
context.write( keytext, valuetext );
}
}
}
}
I have a problem implementing the tree structure of OIDs. When I click the parent, I need to display only its direct child details, not the sub-children of a child.
That is, I should not display an OID which contains a "." (dot) beyond the clicked level.
For example, if my OID structure is private.MIB.sample.first,
private.MIB.sample.second and so on,
then when I click on MIB, it should display only "sample", not first and second.
first and second are to be displayed when I click sample.
How can I implement this in Java?
My database is MySQL. The code which I tried is given below:
FilteredRowSet rs = new FilteredRowSetImpl();
// for Other Types Like OBJECT-TYPE, Object_IDENTIFIER
rs = new FilteredRowSetImpl();
rs.setCommand("Select * from MIBNODEDETAILS where " + "mn_OID like '" + OID
+ ".%' order by mn_NodeType, mn_OID");
rs.setUrl(Constants.DB_CONNECTION_URL);
rs.setFilter(new MibRowFilter(1, expString));
rs.execute();
rs.absolute(1);
rs.beforeFirst();
I guess the change is to be made in the setCommand argument.
How can I do this?
Structure of the mibnodedetails table:
+--------------------+-------------------+-------------+
| mn_OID | mn_name | mn_nodetype |
+--------------------+-------------------+-------------+
| 1 | iso | 0 |
| 1.3 | org | 1 |
| 1.3.6 | dod | 1 |
| 1.3.6.1 | internet | 1 |
| 1.3.6.1.1 | directory | 1 |
| 1.3.6.1.2 | mgmt | 1 |
| 1.3.6.1.2.1 | mib-2 | 0 |
| 1.3.6.1.2.1.1 | system | 1 |
| 1.3.6.1.2.1.10 | transmission | 1 |
You can use something like
SELECT *
FROM mibnodedetails
WHERE mn_oid LIKE "+mn_OID+"%
AND LENGTH ("+mn_OID+") + 2 = LENGTH (mn_oid)
ORDER BY mn_nodetype, mn_oid
So if you pass mn_OID as 1.3.6.1 (| 1.3.6.1 | internet | 1 |),
you will get the following result:
| 1.3.6.1.1 | directory | 1 |
| 1.3.6.1.2 | mgmt | 1 |
Working Demo
PS: This will not work for child IDs greater than 9, because we are using LENGTH + 2.
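Not from the original answer, but a hedged variation that also works for children with more than one digit: keep the prefix match and exclude anything with a second dot after the prefix. Sketched here as a plain java.sql PreparedStatement (connection is assumed to be an open java.sql.Connection; table and column names are the ones from the question):

// direct children only: "<oid>.<something>" but not "<oid>.<something>.<more>"
String sql = "SELECT * FROM MIBNODEDETAILS " +
             "WHERE mn_OID LIKE CONCAT(?, '.%') " +
             "  AND mn_OID NOT LIKE CONCAT(?, '.%.%') " +
             "ORDER BY mn_nodetype, mn_OID";
try (PreparedStatement ps = connection.prepareStatement(sql)) {
    ps.setString(1, oid);   // e.g. "1.3.6.1"
    ps.setString(2, oid);
    try (ResultSet children = ps.executeQuery()) {
        while (children.next()) {
            System.out.println(children.getString("mn_OID") + "  " + children.getString("mn_name"));
        }
    }
}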
The function given below displays the tree as required.
public void populateMibValues()
{
final DefaultTreeModel model = (DefaultTreeModel) this.m_mibTree.getModel();
model.setRoot(null);
this.rootNode.removeAllChildren();
final String query_MibNodeDetailsSelect = "Select * from MIBNODEDETAILS where LENGTH(mn_oid)<=9 "
+ " and mn_OID<='1.3.6.1.4.1' order by mn_OID"; // only
this.innerNodeNames.clear();
this.innerNodes.clear();
this.innerNodesOid = null;
try {
final ResultSet deviceRS = Application.getDBHandler().executeQuery(query_MibNodeDetailsSelect, null); // inner nodes
while (deviceRS.next()) {
final mibNode mb = new mibNode(deviceRS.getString("mn_OID").toString(), deviceRS.getString("mn_name")
.toString());
mb.m_Type = Integer.parseInt(deviceRS.getString("mn_nodetype").toString());
createMibTree(mb);
}
}
catch (final Exception e) {
Application.showErrorInConsole(e);
NmsLogger.writeErrorLog("ERROR creating MIB tree failed", e.toString());
}