Can't find a codec for class org.json.JSONArray - java

private void getUsersWithin24Hours(String id, Map<String, Object> payload) throws JSONException {
    JSONObject json = new JSONObject(String.valueOf(payload.get("data")));
    Query query = new Query();
    query.addCriteria(Criteria.where("user_id").is(id)
            .and("timezone").in(json.get("timezone"))
            .and("gender").in(json.get("gender"))
            .and("locale").in(json.get("language"))
            .and("time").gt(getDate()));
    mongoTemplate.getCollection("user_log").distinct("user_id", query.getQueryObject());
}
I wanted to make a query and get the result from MongoDB, and I succeeded with this mongo shell command:
db.getCollection('user_log').find({"user_id" : "1", "timezone" : {$in: [5,6]}, "gender" : {$in : ["male", "female"]}, "locale" : {$in : ["en_US"]}, "time" : {$gt : new ISODate("2017-01-26T16:57:52.354Z")}})
but when I tried it from Java, it gave me the error below:
org.bson.codecs.configuration.CodecConfigurationException: Can't find
a codec for class org.json.JSONArray
What is the ideal way to do this?
Hint: I think the error in my code comes from json.get("timezone"), because it contains an array. When I use hardcoded string arrays, this code works.

You don't have to use JSONObject/JSONArray for conversion.
Replace it with the line below if payload.get("data") is a Map:
BasicDBObject json = new BasicDBObject((Map) payload.get("data"));
Replace it with the line below if payload.get("data") holds a JSON string:
BasicDBObject json = (BasicDBObject) JSON.parse((String) payload.get("data"));
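For reference, here's a minimal sketch of the whole method without org.json types. It assumes payload.get("data") deserializes to a Map whose values are already Lists; that shape, the getDate() helper and the mongoTemplate field come from the question, not from anything verified here:
@SuppressWarnings("unchecked")
private void getUsersWithin24Hours(String id, Map<String, Object> payload) {
    // Assumption: "data" is a Map whose values are java.util.Lists built upstream.
    Map<String, Object> data = (Map<String, Object>) payload.get("data");
    Query query = new Query();
    query.addCriteria(Criteria.where("user_id").is(id)
            .and("timezone").in((List<?>) data.get("timezone")) // a plain List has a codec, unlike JSONArray
            .and("gender").in((List<?>) data.get("gender"))
            .and("locale").in((List<?>) data.get("language"))
            .and("time").gt(getDate()));
    mongoTemplate.getCollection("user_log").distinct("user_id", query.getQueryObject());
}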

Here's an example from a MongoDB University course, using a database named "students" with a collection named "grades":
pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.mongodb</groupId>
    <artifactId>test</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>jar</packaging>
    <name>test</name>
    <url>http://maven.apache.org</url>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.mongodb</groupId>
            <artifactId>mongodb-driver</artifactId>
            <version>3.2.2</version>
        </dependency>
    </dependencies>
</project>
com/mongo/Main.java
package com.mongo;

import com.mongodb.MongoClient;
import com.mongodb.client.*;
import org.bson.Document;
import org.bson.conversions.Bson;

public class Main {
    public static void main(String[] args) {
        MongoClient client = new MongoClient();
        MongoDatabase database = client.getDatabase("students");
        final MongoCollection<Document> collection = database.getCollection("grades");
        // Sort by student_id ascending, then score ascending
        Bson sort = new Document("student_id", 1).append("score", 1);
        MongoCursor<Document> cursor = collection.find().sort(sort).iterator();
        try {
            Integer student_id = -1; // placeholder carried over from the course example
            while (cursor.hasNext()) {
                Document document = cursor.next();
                // Doing more stuff
            }
        } finally {
            cursor.close();
        }
    }
}

Related

How to add a sub-document (if not exists) to an array field, in MongoDB using Java?

I'm having issues with a really simple task in Java:
I need to get the "users" array inside an object,
check if it contains a key ID, and if not, add a new user to the array.
I don't get any error, but the user isn't added. Why is that?
Please answer using the Java driver. Here is the code:
List<DBObject> queryResultList = cursor.toArray(1);
DBObject currentObj = queryResultList.get(0);
Set userIdsKeySet = ((BasicDBObject) currentObj.get("users")).keySet();
BasicDBObject newObj = null;
if (!userIdsKeySet.contains(userId)) {
    ((BasicDBList) currentObj.get("users")).add(new BasicDBObject().append(userId, user));
}
if (newObj != null) {
    collection.update(currentObj, new BasicDBObject("users", newObj));
    return true;
}
The document structure looks like this:
{
    "_id": "5de9604ef36d7f394a4ade2f",
    "_index": "sw0dfb0",
    "users": [{
        "e9604ef36d94a4ade7f394": {
            "some_field": "abcde"
        }
    }]
}
Would it be better to structure the users array this way?
"users": [{
    "user_id": "e9604ef36d94a4ade7f394",
    "some_field": "abcde"
}]
Note: I know there are much prettier and simpler ways to do it; any informative advice is welcome.
I have a Java example program that I continually hack on to learn. My example may have bits that don't apply, but it covers the basis for this question. Notice the use of "$push"...
This assumes a record is already available to query, so a new array item can be pushed...
package test.barry;

public class Main {
    public static void main(String[] args) {
        com.mongodb.client.MongoDatabase db = connectToClusterStandAlone();
        InsertArrayItem(db);
        return;
    }

    private static void InsertArrayItem(com.mongodb.client.MongoDatabase db) {
        System.out.println("");
        System.out.println("Starting InsertArrayItem...");
        com.mongodb.client.MongoCollection<org.bson.Document> collection = db.getCollection("people");
        com.mongodb.client.MongoCursor<org.bson.Document> cursor = collection
                .find(com.mongodb.client.model.Filters.eq("testfield", true))
                .sort(new org.bson.Document("review_date", -1))
                .limit(1)
                .iterator();
        if (cursor.hasNext()) {
            org.bson.Document document = cursor.next();
            Object id = document.get("_id");
            System.out.println("Selected Id: " + id.toString());
            org.bson.Document newDocument = new org.bson.Document("somekey", "somevalue");
            collection.findOneAndUpdate(
                    com.mongodb.client.model.Filters.eq("_id", id),
                    new org.bson.Document("$push", new org.bson.Document("myarray", newDocument))
            );
        }
        System.out.println("Completed InsertArrayItem.");
    }

    private static com.mongodb.client.MongoDatabase connectToClusterStandAlone() {
        // STANDALONE STILL REQUIRES HOSTS LIST WITH ONE ELEMENT...
        // http://mongodb.github.io/mongo-java-driver/3.9/javadoc/com/mongodb/MongoClientSettings.Builder.html
        java.util.ArrayList<com.mongodb.ServerAddress> hosts = new java.util.ArrayList<com.mongodb.ServerAddress>();
        hosts.add(new com.mongodb.ServerAddress("127.0.0.1", 27017));
        com.mongodb.MongoCredential mongoCredential = com.mongodb.MongoCredential.createScramSha1Credential("testuser", "admin", "mysecret".toCharArray());
        com.mongodb.MongoClientSettings mongoClientSettings = com.mongodb.MongoClientSettings.builder()
                .applyToClusterSettings(clusterSettingsBuilder -> clusterSettingsBuilder.hosts(hosts))
                .credential(mongoCredential)
                .writeConcern(com.mongodb.WriteConcern.W1)
                .readConcern(com.mongodb.ReadConcern.MAJORITY)
                .readPreference(com.mongodb.ReadPreference.nearest())
                .retryWrites(true)
                .build();
        com.mongodb.client.MongoClient client = com.mongodb.client.MongoClients.create(mongoClientSettings);
        com.mongodb.client.MongoDatabase db = client.getDatabase("test");
        return db;
    }
}
Example document after running twice...
{
    "_id" : ObjectId("5de7f472b0ba4011a7caa59c"),
    "name" : "someone somebody",
    "age" : 22,
    "state" : "WA",
    "phone" : "(739) 543-2109",
    "ssn" : "444-22-9999",
    "testfield" : true,
    "versions" : [
        "v1.2",
        "v1.3",
        "v1.4"
    ],
    "info" : {
        "x" : 444,
        "y" : "yes"
    },
    "somefield" : "d21ee185-b6f6-4b58-896a-79424d163626",
    "myarray" : [
        {
            "somekey" : "somevalue"
        },
        {
            "somekey" : "somevalue"
        }
    ]
}
For completeness, here is my Maven POM file...
<project
    xmlns="http://maven.apache.org/POM/4.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>test.barry</groupId>
    <artifactId>test</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>
    <name>test</name>
    <url>http://maven.apache.org</url>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.8.0</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>3.2.1</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <outputDirectory>${basedir}</outputDirectory>
                            <finalName>Test</finalName>
                            <transformers>
                                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                    <mainClass>test.barry.Main</mainClass>
                                </transformer>
                            </transformers>
                            <createDependencyReducedPom>false</createDependencyReducedPom>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
    <dependencies>
        <dependency>
            <groupId>org.mongodb</groupId>
            <artifactId>mongo-java-driver</artifactId>
            <version>3.10.1</version>
        </dependency>
    </dependencies>
</project>
I have a sample document and Java code (using MongoDB Java Driver 3.9.0) to update the users array.
MongoDB Enterprise > db.test.find()
{
    "_id" : 1,
    "index" : "999",
    "users" : [
        {
            "user11" : {
                "fld1" : "abcde"
            }
        },
        {
            "user22" : {
                "fld1" : "xyz123"
            }
        }
    ]
}
Java Code:
import org.bson.Document;
import org.bson.conversions.Bson;
import com.mongodb.client.result.UpdateResult;
import static com.mongodb.client.model.Updates.*;
import static com.mongodb.client.model.Filters.*;
import com.mongodb.client.*;

public class Testing9 {
    public static void main(String[] args) {
        MongoClient mongoClient = MongoClients.create("mongodb://localhost/");
        MongoDatabase database = mongoClient.getDatabase("users");
        MongoCollection<Document> collection = database.getCollection("test");

        String user = "user99"; // new user to be added to the array
        Bson userNotExistsFilter = exists(("users." + user), false);
        Bson idFilter = eq("_id", new Integer(1));
        Document newUser = new Document(user, new Document("fld1", "some_value"));
        Bson pushUser = push("users", newUser);

        UpdateResult result =
                collection.updateOne(and(idFilter, userNotExistsFilter), pushUser);
        System.out.println(result);
    }
}
Result:
Querying the collection shows the updated array field users with the new user "user99":
MongoDB Enterprise > db.test.find().pretty()
{
    "_id" : 1,
    "index" : "999",
    "users" : [
        {
            "user11" : {
                "fld1" : "abcde"
            }
        },
        {
            "user22" : {
                "fld1" : "xyz123"
            }
        },
        {
            "user99" : {
                "fld1" : "some_value"
            }
        }
    ]
}
Shell Query:
This is the equivalent update query from the mongo shell:
db.test.updateOne(
    { _id: 1, "users.user99": { $exists: false } },
    { $push: { users: { user99: { fld1: "some_value" } } } }
)
The following document is added to the collection's users array:
{
    "user99" : {
        "fld1" : "some_value"
    }
}

Not able to parse Protobuf in java

I have two protobuf files, and I have to compare the contents of both in order to proceed further with the code. To do this, I am trying to parse a .proto file, but somehow I am not able to get the various message types and other information within it. I have to do all this in Java.
Code snippets:
package com.example.demo;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import com.google.protobuf.DescriptorProtos;
import com.google.protobuf.DescriptorProtos.FileDescriptorProto;
import com.google.protobuf.Descriptors;
import com.google.protobuf.Descriptors.FileDescriptor;
import com.google.protobuf.InvalidProtocolBufferException;

public class TestProto {
    public static FileDescriptorProto parseProto(InputStream protoStream)
            throws InvalidProtocolBufferException, Descriptors.DescriptorValidationException {
        DescriptorProtos.FileDescriptorProto descriptorProto = null;
        try {
            descriptorProto = FileDescriptorProto.parseFrom(protoStream);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return descriptorProto;
    }

    public static InputStream readProto(File filePath) {
        InputStream is = null;
        Reader reader = null;
        try {
            is = new FileInputStream(filePath);
            reader = new InputStreamReader(is);
            int data = reader.read();
            while (data != -1) {
                System.out.print((char) data);
                data = reader.read();
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return is;
    }

    public static void main(String args[]) {
        InputStream protoStream = readProto(new File("D:/PROTOBUF CONVERTER/default.proto"));
        Descriptors.FileDescriptor fileDescriptor = null;
        DescriptorProtos.FileDescriptorProto fileDescriptorProto = null;
        try {
            fileDescriptorProto = parseProto(protoStream);
            fileDescriptor = FileDescriptor.buildFrom(fileDescriptorProto, new FileDescriptor[] {}, true);
            System.out.println("\n*******************");
            System.out.println(fileDescriptor.getFullName());
            System.out.println(fileDescriptor.getName());
            System.out.println(fileDescriptor.getPackage());
            System.out.println(fileDescriptor.getClass());
            System.out.println(fileDescriptor.getDependencies());
            System.out.println(fileDescriptor.toProto());
            System.out.println(fileDescriptor.getServices());
            System.out.println(fileDescriptor.getMessageTypes());
            System.out.println(fileDescriptor.getOptions());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.1.3.RELEASE</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.springboot</groupId>
    <artifactId>demo</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>demo</name>
    <description>Demo project for Spring Boot</description>
    <properties>
        <java.version>1.8</java.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.xolstice.maven.plugins</groupId>
            <artifactId>protobuf-maven-plugin</artifactId>
            <version>0.6.1</version>
        </dependency>
        <dependency>
            <groupId>com.google.protobuf</groupId>
            <artifactId>protobuf-java</artifactId>
            <version>3.5.1</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/commons-io/commons-io -->
        <dependency>
            <groupId>commons-io</groupId>
            <artifactId>commons-io</artifactId>
            <version>2.6</version>
        </dependency>
    </dependencies>
    <build>
        <finalName>ProtobufParseDemo</finalName>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <inherited>true</inherited>
                <configuration>
                    <source>1.7</source>
                    <target>1.7</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
default.proto
syntax = "proto3";
package tutorial;

option java_package = "com.example.tutorial";
option java_outer_classname = "AddressBookProtos";

message Person {
    required string name = 1;
    required int32 id = 2;
    optional string email = 3;

    enum PhoneType {
        MOBILE = 0;
        HOME = 1;
        WORK = 2;
    }

    message PhoneNumber {
        required string number = 1;
        optional PhoneType type = 2 [default = HOME];
    }

    repeated PhoneNumber phones = 4;
}

message AddressBook {
    repeated Person people = 1;
}
I can see the proto file's contents on the console, thanks to the line System.out.print((char) data);. However, I am not able to see any output from the sysout calls on the FileDescriptor.
I am new to Protocol Buffers.
Questions:
Is what I am trying to do even feasible, or am I making some mistake?
Is there any other method to do this in Java?
I have seen some answers, like the one here: Protocol Buffers: How to parse a .proto file in Java.
It says that the input to the parseFrom method should be binary, i.e. a compiled schema. Is there a way to obtain the compiled version of the .proto file in Java code (not on the command line)?
OK, to be clearer: I have to compare two .proto files.
The first is the one already uploaded with the ML model,
and
the second is the one to be uploaded for the same ML model.
If there are differences in the input or output message types of the two .proto files, then I have to increment the version number of the model accordingly.
I have found solutions where the .proto is converted to a proto descriptor, then to a byte array, and finally passed to the parseFrom method. Can't this process of converting .proto to proto.desc be done via Java code?
Keep in mind that I do not have the proto files on my classpath, and specifying input and output directories in pom.xml is not possible here, because I have to download the old proto and compare it with the new proto to be uploaded, as mentioned above.
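As a hedged sketch of that conversion step: since parseFrom expects a binary descriptor, one option is to invoke protoc as an external process at runtime and parse the descriptor set it writes. This assumes protoc is installed and on the PATH, so it is not a pure-Java compile, and the class and path names below are illustrative only:
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import com.google.protobuf.DescriptorProtos.FileDescriptorSet;

public class ProtoCompiler {
    // Shells out to protoc to build a binary descriptor set for one .proto file.
    public static FileDescriptorSet compile(File protoFile) throws IOException, InterruptedException {
        File descriptorFile = File.createTempFile("descriptor", ".desc");
        Process protoc = new ProcessBuilder(
                "protoc",
                "--proto_path=" + protoFile.getParentFile().getAbsolutePath(),
                "--descriptor_set_out=" + descriptorFile.getAbsolutePath(),
                protoFile.getAbsolutePath())
                .inheritIO()
                .start();
        if (protoc.waitFor() != 0) {
            throw new IOException("protoc failed for " + protoFile.getName());
        }
        // The descriptor set is binary, which is what parseFrom expects.
        try (FileInputStream in = new FileInputStream(descriptorFile)) {
            return FileDescriptorSet.parseFrom(in);
        }
    }
}
Each FileDescriptorProto inside the returned set can then be passed to FileDescriptor.buildFrom, as in the question, and the message types of the two models compared.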

Unable to connect to mongo database using Java, OSGI, Karaf

I've installed the mongo driver in my running Karaf server:
bundle:install -s wrap:mvn:org.mongodb/mongo-java-driver/3.6.3
I'm simply trying to connect to the DB and log the databases I have, currently against an out-of-the-box local instance. Below is the code I wrote to demo this in OSGi/Karaf. I'm using the maven-bundle-plugin.
I created a database under the alias osgiDatabase.
I'm running my debugger and the failure happens during the instantiation of the MongoClient(), but I don't understand what I could be doing wrong.
This works when I don't use Karaf. The only error I get is: Activator start error in bundle.
POM
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.qa</groupId>
    <artifactId>board</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>bundle</packaging>
    <dependencies>
        <dependency>
            <groupId>org.mongodb</groupId>
            <artifactId>mongo-java-driver</artifactId>
            <version>3.6.3</version>
        </dependency>
        <dependency>
            <groupId>org.osgi</groupId>
            <artifactId>org.osgi.core</artifactId>
            <version>6.0.0</version>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.felix</groupId>
                <artifactId>maven-bundle-plugin</artifactId>
                <extensions>true</extensions>
                <configuration>
                    <instructions>
                        <Import-Package>com.mongodb, org.osgi.framework</Import-Package>
                        <Bundle-Activator>Connection.Activator</Bundle-Activator>
                        <Export-Package>*</Export-Package>
                    </instructions>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
DBUtil
package Connection;

import com.mongodb.MongoClient;
import com.mongodb.client.MongoDatabase;
import java.util.List;

public class DBUtil {
    MongoClient client;
    MongoDatabase database;

    public DBUtil() {
    }

    public DBUtil(String databaseName) {
        if (client == null) {
            client = new MongoClient();
            database = client.getDatabase(databaseName);
        }
    }

    /**
     * Allows you to reveal all databases under the current connection
     */
    public void showDatabases() {
        if (client == null) {
            throw new NullPointerException();
        }
        List<String> databases = client.getDatabaseNames();
        for (String db : databases) {
            System.out.println("The name of the database is: " + db);
        }
    }
}
Activator
package Connection;

import org.osgi.framework.BundleActivator;
import org.osgi.framework.BundleContext;

public class Activator implements BundleActivator {
    public void start(BundleContext bundleContext) throws Exception {
        DBUtil util = new DBUtil("osgiDatabase");
        // util.showDatabases();
        System.out.println("Working");
    }

    public void stop(BundleContext bundleContext) throws Exception {
        System.out.println("Bundle disabled");
    }
}
Your Import-Package configuration looks wrong. If you configure it explicitly like this, you switch off the auto-detection of needed packages, so it is very likely you are missing some packages your code needs.
Instead, try configuring only the activator and leaving the rest at the defaults.
To get better logs, use a try/catch in your Activator and log the exception using slf4j. That way you get more information about what is wrong.
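A minimal sketch of that logging suggestion, assuming slf4j is resolvable in the container (Karaf provides it via pax-logging):
package Connection;

import org.osgi.framework.BundleActivator;
import org.osgi.framework.BundleContext;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Activator implements BundleActivator {
    private static final Logger LOG = LoggerFactory.getLogger(Activator.class);

    public void start(BundleContext bundleContext) throws Exception {
        try {
            DBUtil util = new DBUtil("osgiDatabase");
            LOG.info("Working");
        } catch (Exception e) {
            // Surfaces the real cause behind "Activator start error in bundle"
            LOG.error("Failed to connect to MongoDB", e);
            throw e;
        }
    }

    public void stop(BundleContext bundleContext) throws Exception {
        LOG.info("Bundle disabled");
    }
}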

Best way to read TSV file using Apache Spark in java

I have a TSV file, where the first line is the header. I want to create a JavaPairRDD from this file. Currently, I'm doing so with the following code:
TsvParser tsvParser = new TsvParser(new TsvParserSettings());
List<String[]> allRows;
List<String> headerRow;
try (BufferedReader reader = new BufferedReader(new FileReader(myFile))) {
    allRows = tsvParser.parseAll(reader);
    // Removes the header row
    headerRow = Arrays.asList(allRows.remove(0));
}
JavaPairRDD<String, MyObject> myObjectRDD = javaSparkContext
        .parallelize(allRows)
        .mapToPair(row -> new Tuple2<>(row[0], myObjectFromArray(row)));
I was wondering if there was a way to have the javaSparkContext read and process the file directly instead of splitting the operation into two parts.
EDIT: This is not a duplicate of How do I convert csv file to rdd, because I'm looking for an answer in Java, not Scala.
Use https://github.com/databricks/spark-csv:
import org.apache.spark.sql.SQLContext;

SQLContext sqlContext = new SQLContext(sc);
DataFrame df = sqlContext.read()
        .format("com.databricks.spark.csv")
        .option("inferSchema", "true")
        .option("header", "true")
        .option("delimiter", "\t")
        .load("cars.csv");
df.select("year", "model").write()
        .format("com.databricks.spark.csv")
        .option("header", "true")
        .save("newcars.csv");
Try the code below to read a CSV file and create a JavaPairRDD.
public class SparkCSVReader {

    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("CSV Reader");
        JavaSparkContext sc = new JavaSparkContext(conf);
        JavaRDD<String> allRows = sc.textFile("c:\\temp\\test.csv"); // read csv file
        String header = allRows.first(); // take out header
        JavaRDD<String> filteredRows = allRows.filter(row -> !row.equals(header)); // filter header
        JavaPairRDD<String, MyCSVFile> filteredRowsPairRDD = filteredRows.mapToPair(parseCSVFile); // create pair
        filteredRowsPairRDD.foreach(data -> {
            System.out.println(data._1() + " ### " + data._2().toString()); // print row and object
        });
        sc.stop();
        sc.close();
    }

    private static PairFunction<String, String, MyCSVFile> parseCSVFile = (row) -> {
        String[] fields = row.split(",");
        return new Tuple2<String, MyCSVFile>(row, new MyCSVFile(fields[0], fields[1], fields[2]));
    };
}
You can also use Databricks spark-csv (https://github.com/databricks/spark-csv). spark-csv is also included in Spark 2.0.0.
Apache Spark 2.x has a built-in CSV reader, so you don't have to use https://github.com/databricks/spark-csv:
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

/**
 * @author cpu11453local
 */
public class Main {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .master("local")
                .appName("meowingful")
                .getOrCreate();

        Dataset<Row> df = spark.read()
                .option("header", "true")
                .option("delimiter", "\t")
                .csv("hdfs://127.0.0.1:9000/data/meow_data.csv");

        df.show();
    }
}
And the Maven pom.xml file:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.meow.meowingful</groupId>
    <artifactId>meowingful</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>jar</packaging>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
    </properties>
    <dependencies>
        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11 -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>2.2.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>2.2.0</version>
        </dependency>
    </dependencies>
</project>
I'm the author of uniVocity-parsers and can't help you much with Spark, but I believe something like this can work for you:
parserSettings.setHeaderExtractionEnabled(true); // captures the header row
parserSettings.setProcessor(new AbstractRowProcessor() {
    @Override
    public void rowProcessed(String[] row, ParsingContext context) {
        String[] headers = context.headers(); // not sure if you need them
        JavaPairRDD<String, MyObject> myObjectRDD = javaSparkContext
                .mapToPair(r -> new Tuple2<>(r[0], myObjectFromArray(r)));
        // process your stuff.
    }
});
If you want to parallelize the processing of each row, you can wrap it in a ConcurrentRowProcessor:
parserSettings.setProcessor(new ConcurrentRowProcessor(new AbstractRowProcessor() {
    @Override
    public void rowProcessed(String[] row, ParsingContext context) {
        String[] headers = context.headers(); // not sure if you need them
        JavaPairRDD<String, MyObject> myObjectRDD = javaSparkContext
                .mapToPair(r -> new Tuple2<>(r[0], myObjectFromArray(r)));
        // process your stuff.
    }
}, 1000)); // 1000 rows loaded in memory.
Then just call parse:
new TsvParser(parserSettings).parse(myFile);
Hope this helps!

How to make an executable jar?

I'm creating a new web application using Maven. I have got some code from Spring's guides online which creates a database. However, for some reason, the code is never being run.
In my pom.xml, I have included the following code:
<properties>
    <start-class>hello.Application</start-class>
</properties>
And this is the 'Application' class, which I got from the Spring guides.
package hello;

import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.List;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jdbc.core.RowMapper;
import org.springframework.jdbc.datasource.SimpleDriverDataSource;

public class Application {
    public static void main(String args[]) {
        // simple DS for test (not for production!)
        SimpleDriverDataSource dataSource = new SimpleDriverDataSource();
        dataSource.setDriverClass(org.h2.Driver.class);
        dataSource.setUsername("sa");
        dataSource.setUrl("jdbc:h2:mem");
        dataSource.setPassword("");

        JdbcTemplate jdbcTemplate = new JdbcTemplate(dataSource);

        System.out.println("Creating tables");
        jdbcTemplate.execute("drop table customers if exists");
        jdbcTemplate.execute("create table customers(" +
                "id serial, first_name varchar(255), last_name varchar(255))");

        String[] names = "John Woo;Jeff Dean;Josh Bloch;Josh Long".split(";");
        for (String fullname : names) {
            String[] name = fullname.split(" ");
            System.out.printf("Inserting customer record for %s %s\n", name[0], name[1]);
            jdbcTemplate.update(
                    "INSERT INTO customers(first_name,last_name) values(?,?)",
                    name[0], name[1]);
        }

        System.out.println("Querying for customer records where first_name = 'Josh':");
        List<Customer> results = jdbcTemplate.query(
                "select * from customers where first_name = ?", new Object[] { "Josh" },
                new RowMapper<Customer>() {
                    @Override
                    public Customer mapRow(ResultSet rs, int rowNum) throws SQLException {
                        return new Customer(rs.getLong("id"), rs.getString("first_name"),
                                rs.getString("last_name"));
                    }
                });

        for (Customer customer : results) {
            System.out.println(customer);
        }
    }
}
My Project structure is as follows:
Project name
src
webapp
hello
application.java
I am pretty new to this, but I just can't see why it's not finding the application.java file.
Any ideas would be appreciated.
EDIT: This is the whole pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>embed.tomcat.here</groupId>
    <artifactId>EmbedTomcatNew</artifactId>
    <packaging>war</packaging>
    <version>0.0.1-SNAPSHOT</version>
    <name>EmbedTomcatNew Maven Webapp</name>
    <url>http://maven.apache.org</url>
    <properties>
        <start-class>hello.Application</start-class>
    </properties>
    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
    <build>
        <finalName>EmbedTomcatNew</finalName>
        <plugins>
            <plugin>
                <groupId>org.apache.tomcat.maven</groupId>
                <artifactId>tomcat7-maven-plugin</artifactId>
                <version>2.2</version>
                <configuration>
                    <port>9966</port>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
EDIT:
I should also mention that it is serving a JSP file (which only prints 'hello world' to the browser), so the application is running, but I want it to run the Java class first.
WAR
You are building a .war file, as evidenced by the <packaging>war</packaging> definition, which is only deployable to a web application container. There is no startup class, and as is well documented on Stack Overflow, there is no way to control the order of startup in most web app containers.
JAR
You have to change your project to be an executable .jar and specify the main class in the manifest via the jar plugin configuration options. Just setting some random property isn't going to do anything.
You probably want to use the shade plugin to bundle all the transitive dependencies into a monolithic .jar as well, otherwise you have a classpath installation nightmare on your hands.
Here is an example. Running this from the src/main/webapp dir is a bad, non-portable idea; that path should be passed in as an argument.
import java.io.File;
import org.apache.catalina.startup.Tomcat;

public class Main {
    public static void main(String[] args) throws Exception {
        String webappDirLocation = "src/main/webapp/";
        Tomcat tomcat = new Tomcat();

        // The port that we should run on can be set into an environment variable:
        // look for that variable and default to 8080 if it isn't there.
        String webPort = System.getenv("PORT");
        if (webPort == null || webPort.isEmpty()) {
            webPort = "8080";
        }
        tomcat.setPort(Integer.valueOf(webPort));

        tomcat.addWebapp("/", new File(webappDirLocation).getAbsolutePath());
        System.out.println("configuring app with basedir: " + new File("./" + webappDirLocation).getAbsolutePath());
        tomcat.start();
        tomcat.getServer().await();
    }
}
