Sending Streaming Data as JSON in Java/Scala - java

I'm used to Python and am using the Scala Spark Streaming libraries to handle real-time Twitter streaming data. Right now I can send each tweet as a string, but my streaming service requires JSON. Is there an easy way to adapt my code to send a JSON dictionary instead of a String?
%scala
import scala.collection.JavaConverters._
import com.microsoft.azure.eventhubs._
import java.util.concurrent._
val namespaceName = "hubnamespace"
val eventHubName = "hubname"
val sasKeyName = "RootManageSharedAccessKey"
val sasKey = "key"
val connStr = new ConnectionStringBuilder()
.setNamespaceName(namespaceName)
.setEventHubName(eventHubName)
.setSasKeyName(sasKeyName)
.setSasKey(sasKey)
val pool = Executors.newFixedThreadPool(1)
val eventHubClient = EventHubClient.create(connStr.toString(), pool)
def sendEvent(message: String) = {
val messageData = EventData.create(message.getBytes("UTF-8"))
// CONVERT IT HERE?
eventHubClient.get().send(messageData)
System.out.println("Sent event: " + message + "\n")
}
import twitter4j._
import twitter4j.TwitterFactory
import twitter4j.Twitter
import twitter4j.conf.ConfigurationBuilder
val twitterConsumerKey = "key"
val twitterConsumerSecret = "key"
val twitterOauthAccessToken = "key"
val twitterOauthTokenSecret = "key"
val cb = new ConfigurationBuilder()
cb.setDebugEnabled(true)
.setOAuthConsumerKey(twitterConsumerKey)
.setOAuthConsumerSecret(twitterConsumerSecret)
.setOAuthAccessToken(twitterOauthAccessToken)
.setOAuthAccessTokenSecret(twitterOauthTokenSecret)
val twitterFactory = new TwitterFactory(cb.build())
val twitter = twitterFactory.getInstance()
val query = new Query(" #happynewyear ")
query.setCount(100)
query.lang("en")
var finished = false
while (!finished) {
val result = twitter.search(query)
val statuses = result.getTweets()
var lowestStatusId = Long.MaxValue
for (status <- statuses.asScala) {
if(!status.isRetweet()){
sendEvent(status.getText())
}
lowestStatusId = Math.min(status.getId(), lowestStatusId)
Thread.sleep(2000)
}
query.setMaxId(lowestStatusId - 1)
}
eventHubClient.get().close()

Scala has no native way to convert a string to JSON, so you'll need an external library. I recommend Jackson. If you use Gradle, you can add a dependency like this: compile("com.fasterxml.jackson.module:jackson-module-scala_2.12") (use the suffix that matches your Scala version).
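Then you can convert your data to a JSON string like this:
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule
val mapper = new ObjectMapper()
mapper.registerModule(DefaultScalaModule)
// Wrap the tweet text in an object; the field name "text" is only an example
val json = mapper.writeValueAsString(Map("text" -> message))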
I'd strongly recommend investing the effort in Jackson; you'll need it a lot if you work with JSON.
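To make the shape of the payload concrete, here is a dependency-free sketch of what "sending JSON instead of a String" amounts to. It is not Jackson and not a replacement for it: it only wraps the tweet text in a one-field object and escapes the characters that would otherwise break the JSON string. The object name TweetJson and the field name "text" are assumptions made for this illustration.

```scala
object TweetJson {
  // Escape a string so it can sit inside a JSON string literal.
  def escape(s: String): String = {
    val sb = new StringBuilder
    s.foreach {
      case '"'  => sb.append("\\\"")
      case '\\' => sb.append("\\\\")
      case '\n' => sb.append("\\n")
      case '\r' => sb.append("\\r")
      case '\t' => sb.append("\\t")
      // Remaining control characters must be written as \u00XX escapes.
      case c if c < ' ' => sb.append("\\u%04x".format(c.toInt))
      case c => sb.append(c)
    }
    sb.toString
  }

  // Wrap the tweet text in a one-field JSON object (field name is an assumption).
  def toJson(text: String): String = "{\"text\":\"" + escape(text) + "\"}"
}
```

With this in place, the call in the loop would become sendEvent(TweetJson.toJson(status.getText())), and sendEvent itself would stay unchanged because it already sends the UTF-8 bytes of whatever string it receives. For anything beyond a flat object, a real library such as Jackson is the safer choice.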

Related

How to convert a map to Json string in kotlin?

I have a mutableMap,
val invoiceAdditionalAttribute = mutableMapOf<String, Any?>()
invoiceAdditionalAttribute.put("clinetId",12345)
invoiceAdditionalAttribute.put("clientName", "digital")
invoiceAdditionalAttribute.put("payload", "xyz")
I want to convert it into json string
the output should be,
"{\"clinetId\"=\"12345\", \"clientName\"=\"digital\", \"payload\"=\"xyz\"}"
Currently, I am using Gson library,
val json = gson.toJson(invoiceAdditionalAttribute)
and the output is
{"clinetId":12345,"clientName":"digital","payload":"xyz"}
The correctly formatted JSON string is:
{"clinetId":12345,"clientName":"digital","payload":"xyz"}
So this is the right method to get it:
val json = gson.toJson(invoiceAdditionalAttribute)
If you want a string formatted like this:
{"clinetId"=12345, "clientName"="digital", "payload"="xyz"}
just replace : with =:
val json = gson.toJson(invoiceAdditionalAttribute).replace(":", "=")
But if you truly want to have a string with backslashes and clinetId value to be inside quotes:
val invoiceAdditionalAttribute = mutableMapOf<String, Any?>()
invoiceAdditionalAttribute["clinetId"] = 12345.toString()
invoiceAdditionalAttribute["clientName"] = "digital"
invoiceAdditionalAttribute["payload"] = "xyz"
val json = gson.toJson(invoiceAdditionalAttribute)
.replace(":", "=")
.replace("\"", "\\\"")
EDIT:
As pointed out in the comments, .replace(":", "=") can be fragile if any string value contains a ":" character.
To avoid that, I would write a custom extension function on Map<String, Any?>:
fun Map<String, Any?>.toCustomJson(): String = buildString {
append("{")
var isFirst = true
this@toCustomJson.forEach {
it.value?.let { value ->
if (!isFirst) {
append(",")
}
isFirst = false
append("\\\"${it.key}\\\"=\\\"$value\\\"")
}
}
append("}")
}
// Using extension function
val customJson = invoiceAdditionalAttribute.toCustomJson()
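To see the fragility mentioned in the edit concretely, here is a small sketch. It is written in Scala to match the other examples on this page; a literal replace behaves the same way on Kotlin strings. The sample JSON and its URL are made up for illustration.

```scala
object ReplaceFragility {
  // A value that legitimately contains ':' (hypothetical sample data).
  val json: String = """{"site":"http://example.com","id":5}"""

  // Blindly swapping every ':' also rewrites the one inside the value.
  val mangled: String = json.replace(":", "=")
}
```

Here mangled contains "http=//example.com", so the data itself has been altered. Building the string key by key, as the extension function above does, avoids touching the values.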

Using Gson with interfaces to fetch data from API

I have no experience with Kotlin, and I'm trying to write an app that fetches data from different financial APIs using Gson. I have created two classes implementing an interface, and I'd like to instantiate them in a generic function. Right now I have two different methods, one per API, and I'd like to make this cleaner.
EDIT:
I want to make a generic function out of two given functions:
Interface and two classes:
interface TickerEntity{
val tickers: Array<String>
data class MainData (
val Bid: Double,
val Ask: Double
)
}
object API1TickerEntity : TickerEntity {
val tickers = arrayOf<String>("BTC-LTC", "BTC-DOGE", "BTC-POT", "BTC-USD")
data class MainData(
val success: Boolean,
val message: String,
val result: ResultData
)
data class ResultData (
val Bid: Double,
val Ask: Double,
val Last: Double
)
}
object API2TickerEntity : TickerEntity {
val tickers = arrayOf<String>("LTCBTC", "BTCDOGE", "BTCPOT", "BTCUSD")
data class MainData(
val max : Double,
val min : Double,
val last : Double,
val bid : Double,
val ask : Double,
val vwap : Double,
val average : Double,
val volume : Double
)
}
My functions to manage Json I want to be generic:
data class BuySell( val stockName: String, val buy: Double = 0.0, val sell: Double = 0.0)
fun getAPI1BuySell(): () -> BuySell {
val currency = API1TickerEntity.tickers[0]
val response = sendRequest("somesite.com")
val gson = Gson()
val ticker: API1TickerEntity.MainData = gson.fromJson(response.body, API1TickerEntity.MainData::class.java)
println(currency)
return { BuySell("API1", ticker.result.Ask, ticker.result.Bid) }
}
fun getAPI2BuySell(): () -> BuySell {
val currency = API2TickerEntity.tickers[0]
val response = sendRequest("someothersite.com")
val gson = Gson()
val ticker: API2TickerEntity.MainData = gson.fromJson(response.body, API2TickerEntity.MainData::class.java)
return { BuySell("API2", ticker.ask, ticker.bid) }
}
So far I have tried:
fun <T : TickerEntity> getStockBuySell(url: String, stockName: String): () -> BuySell {
val tickerEntity : T = T
val currency = tickerEntity.tickers[0]
val response = sendRequest(url.replace("{}", currency))
val gson = Gson()
val ticker: tickerEntity.MainData = gson.fromJson(response.body, tickerEntity.MainData::class.java)
println(currency)
return { BuySell (stockName, ticker.Ask, ticker.Bid) }
}
}
But I can't instantiate the interface alone, and also it seems I can't override data class alone since it is not a value.
JSON files to manage:
API1:
{"success":true,"message":"","result":{"Bid":0.00596655,"Ask":0.00597554,"Last":0.00597933}}
API2:
{"max":0.00606939,"min":0.0059345,"last":0.00595134,"bid":0.00594972,"ask":0.00599205,"vwap":0.00595134,"average":0.00595134,"volume":29.60407718}
All help appreciated
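One possible direction, sketched here in Scala to match the other examples on this page: rather than asking the type parameter to supply a nested class, pass in everything that differs between the two APIs as ordinary values and functions. The names TickerSource and fetch are hypothetical; fetch stands in for sendRequest(...).body, and parse would typically be a Gson call such as body => gson.fromJson(body, classOf[SomeResponseClass]).

```scala
object GenericTicker {
  final case class BuySell(stockName: String, buy: Double = 0.0, sell: Double = 0.0)

  // Everything that differs between the APIs, bundled as plain values and functions.
  final case class TickerSource[T](
      tickers: Seq[String],
      parse: String => T, // turns a response body into this API's data class
      ask: T => Double,   // reads the ask price from the parsed value
      bid: T => Double)   // reads the bid price from the parsed value

  // fetch is a hypothetical stand-in for the HTTP call used in the question.
  def getStockBuySell[T](source: TickerSource[T], stockName: String,
                         url: String, fetch: String => String): () => BuySell = {
    val currency = source.tickers.head
    val ticker = source.parse(fetch(url.replace("{}", currency)))
    // Same argument order as the original functions: ask first, then bid.
    () => BuySell(stockName, source.ask(ticker), source.bid(ticker))
  }
}
```

Each API then needs only one TickerSource value, and the generic function never has to instantiate an interface or look up a nested class through a type parameter.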

JsonObject to Json records - format output

I am using the CryptoCompare API to pull crypto symbol details. The output is nested JSON like the sample below, and I need to convert it into flat records:
{
"ETH":{
"USD":{
"FROMSYMBOL":"Ξ",
"TOSYMBOL":"$",
"MARKET":"CryptoCompare Index",
"PRICE":"$ 117.74",
"LASTUPDATE":"Just now",
"LASTVOLUME":"Ξ 0.01000",
"LASTVOLUMETO":"$ 1.17",
"LASTTRADEID":"44473885",
"VOLUMEDAY":"Ξ 340,510.0",
"VOLUMEDAYTO":"$ 39,874,960.0",
"VOLUME24HOUR":"Ξ 418,836.6",
"VOLUME24HOURTO":"$ 49,126,029.4",
"OPENDAY":"$ 118.40",
"HIGHDAY":"$ 119.29",
"LOWDAY":"$ 114.48",
"OPEN24HOUR":"$ 117.99",
"HIGH24HOUR":"$ 119.50",
"LOW24HOUR":"$ 114.12"
}
}
}
I need to generate output like the following, with a separate record for each symbol.
Mapping: each currency node becomes a "Sym" field (e.g. the "ETH" node becomes "Sym": "ETH");
the rest of the fields are copied straight from the innermost "USD" node.
{
"Sym":"ETH",
"PRICE":"$ 117.74",
"LASTTRADEID":"44473885",
"VOLUMEDAY":"Ξ 340,510.0",
"VOLUMEDAYTO":"$ 39,874,960.0",
"VOLUME24HOUR":"Ξ 418,836.6"
}
Code being used:
import com.crypto.cryptocompare.api.CryptoCompareApi
import com.google.gson.Gson
import com.google.gson.GsonBuilder;
object cryptoComapreMultiSCryptoPriceGson extends App{
val gson = new Gson()
val api = new CryptoCompareApi();
//val response = api.priceMulti("ETH,DASH","BTC,USD,EUR", new Nothing() {})
val m = new java.util.LinkedHashMap[String,Object]
m.put("extraParams", "TestProject")
val response = api.priceMultiFull( //to get priceMultiFull
"ETH,DASH,BTC",
"USD",
m)
//val jsonRec = gson.toJsonTree(response)
println(response.get("DISPLAY"))
}
Any pointers or help?
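A minimal sketch of the reshaping step, assuming the nested "DISPLAY" data has already been converted into Scala maps (symbol -> quote currency -> fields). How to get from the API response object to such maps depends on the client library and is not shown. The names FlattenDisplay and toRecords are made up for this illustration.

```scala
object FlattenDisplay {
  type Fields = Map[String, String]

  // display: symbol -> quote currency -> fields, mirroring the nested JSON above.
  // keep: the field names to copy into each flat record.
  def toRecords(display: Map[String, Map[String, Fields]],
                quote: String,
                keep: Seq[String]): Seq[Fields] =
    for {
      (sym, byQuote) <- display.toSeq
      fields <- byQuote.get(quote).toSeq // skip symbols without this quote currency
    } yield {
      val kept = keep.collect { case k if fields.contains(k) => k -> fields(k) }
      Map("Sym" -> sym) ++ kept
    }
}
```

Calling FlattenDisplay.toRecords(display, "USD", Seq("PRICE", "LASTTRADEID", "VOLUMEDAY", "VOLUMEDAYTO", "VOLUME24HOUR")) would give one map per symbol, which can then be handed to any JSON serializer.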

Passing type in Scala as an argument

I want to pass a type to a function in Scala.
Problem in detail
First iteration
I have the following Java classes (coming from an external source):
public class MyComplexType {
public String name;
public int number;
}
and
public class MyGeneric<T> {
public String myName;
public T myValue;
}
In this example I want MyComplexType to be the actual type of MyGeneric; in the real problem there are several possibilities.
I want to deserialize a JSON string using Scala code as follows:
import org.codehaus.jackson.map.ObjectMapper
object GenericExample {
def main(args: Array[String]) {
val jsonString = "{\"myName\":\"myNumber\",\"myValue\":{\"name\":\"fifteen\",\"number\":\"15\"}}"
val objectMapper = new ObjectMapper()
val myGeneric: MyGeneric[MyComplexType] = objectMapper.readValue(jsonString, classOf[MyGeneric[MyComplexType]])
val myComplexType: MyComplexType = myGeneric.myValue
}
}
It compiles fine, but a runtime error occurs:
java.lang.ClassCastException: java.util.LinkedHashMap cannot be cast to MyComplexType
at GenericExample$.main(GenericExample.scala:9)
Second iteration
Working solution to the problem:
val jsonString = "{\"myName\":\"myNumber\",\"myValue\":{\"name\":\"fifteen\",\"number\":\"15\"}}"
val objectMapper = new ObjectMapper()
val myGeneric: MyGeneric[MyComplexType] = objectMapper.readValue(jsonString, classOf[MyGeneric[MyComplexType]])
myGeneric.myValue = objectMapper.readValue(objectMapper.readTree(jsonString).get("myValue").toString, classOf[MyComplexType])
val myComplexType: MyComplexType = myGeneric.myValue
Not nice, but it works. (If anybody knows how to make it better, that would also be welcome.)
Third iteration
The lines from the second iteration's solution occur several times in the real problem, so I want to create a function. The parts that vary are the JSON string and the type MyComplexType.
I want something like this:
def main(args: Array[String]) {
val jsonString = "{\"myName\":\"myNumber\",\"myValue\":{\"name\":\"fifteen\",\"number\":\"15\"}}"
val myGeneric = extractMyGeneric[MyComplexType](jsonString)
val myComplexType: MyComplexType = myGeneric.myValue
}
private def extractMyGeneric[T](jsonString: String) = {
val objectMapper = new ObjectMapper()
val myGeneric = objectMapper.readValue(jsonString, classOf[MyGeneric[T]])
myGeneric.myValue = objectMapper.readValue(objectMapper.readTree(jsonString).get("myValue").toString, classOf[T])
myGeneric
}
This does not work (compiler error). I've already played around with various combinations of Class, ClassTag, classOf but none of them helped. There were compiler and runtime errors as well. Do you know how to pass and how to use such a type in Scala? Thank you!
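When you use Jackson to parse JSON, you can use a TypeReference to parse a generic type. With the Jackson 1.x packages used in the question, the import needs backticks in Scala because type is a keyword:
import org.codehaus.jackson.`type`.TypeReference
Example: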
val jsonString = "{\"myName\":\"myNumber\",\"myValue\":{\"name\":\"fifteen\",\"number\":\"15\"}}"
val objectMapper = new ObjectMapper()
val reference = new TypeReference[MyGeneric[MyComplexType]]() {}
val value: MyGeneric[MyComplexType] = objectMapper.readValue(jsonString, reference)
If you still want to use Jackson, I think you can add an implicit parameter of type TypeReference, like this:
implicit val typeReference = new TypeReference[MyGeneric[MyComplexType]] {}
val value = foo(jsonString)
println(value.myValue.name)
def foo[T](jsonStr: String)(implicit typeReference: TypeReference[MyGeneric[T]]): MyGeneric[T] = {
val objectMapper = new ObjectMapper()
objectMapper.readValue(jsonStr, typeReference)
}
Using your approach, I think this is how you can get classes that you need using ClassTags:
def extractMyGeneric[A : ClassTag](jsonString: String)(implicit generic: ClassTag[MyGeneric[A]]): MyGeneric[A] = {
val classOfA = implicitly[ClassTag[A]].runtimeClass.asInstanceOf[Class[A]]
val classOfMyGenericOfA = generic.runtimeClass.asInstanceOf[Class[MyGeneric[A]]]
val objectMapper = new ObjectMapper()
val myGeneric = objectMapper.readValue(jsonString, classOfMyGenericOfA)
myGeneric.myValue = objectMapper.readValue(objectMapper.readTree(jsonString).get("myValue").toString, classOfA)
myGeneric
}
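The reason the ClassTag version still needs the second readValue call for myValue is type erasure: the runtime class behind a ClassTag carries no type arguments, so Jackson only ever sees the raw MyGeneric class. The following sketch shows this with the standard library alone; Box is a hypothetical stand-in for MyGeneric.

```scala
import scala.reflect.{ClassTag, classTag}

object ErasureDemo {
  // Hypothetical stand-in for the Java class MyGeneric<T>.
  class Box[T](var value: T)

  // The runtime class behind a ClassTag has lost its type arguments.
  def runtimeClassOf[A: ClassTag]: Class[_] = classTag[A].runtimeClass

  // Box[String] and Box[Int] erase to the very same Class object.
  val sameClass: Boolean = runtimeClassOf[Box[String]] == runtimeClassOf[Box[Int]]
}
```

Because both tags resolve to the same class, a deserializer that is given only that class cannot know what T should be. That is why a TypeReference, which records the full generic type, avoids the extra step.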
I am not familiar with Jackson, but in play-json you could easily define Reads for your generic class like this:
import play.api.libs.functional.syntax._
import play.api.libs.json._
implicit def genReads[A: Reads]: Reads[MyGeneric[A]] = (
(__ \ "myName").read[String] and
(__ \ "myValue").read[A]
)((name, value) => {
val e = new MyGeneric[A]
e.myName = name
e.myValue = value
e
})
Having this, and provided that instance of Reads for MyComplexType exists, you can implement your method as
def extractMyGeneric[A: Reads](jsonString: String): MyGeneric[A] = {
Json.parse(jsonString).as[MyGeneric[A]]
}
The catch is that you need to provide Reads for all of your complex types, which would be as easy as
implicit val complexReads: Reads[MyComplexType] = Json.reads[MyComplexType]
if those were case classes; otherwise I think you would have to define them manually, in a similar way to what I've done with genReads.

Spark - How to use SparkContext within classes?

I am building an application in Spark, and would like to use the SparkContext and/or SQLContext within methods in my classes, mostly to pull/generate data sets from files or SQL queries.
For example, I would like to create a T2P object which contains methods that gather data (and in this case need access to the SparkContext):
class T2P (mid: Int, sc: SparkContext, sqlContext: SQLContext) extends Serializable {
def getImps(): DataFrame = {
val imps = sc.textFile("file.txt").map(line => line.split("\t")).map(d => Data(d(0).toInt, d(1), d(2), d(3))).toDF()
return imps
}
def getX(): DataFrame = {
val x = sqlContext.sql("SELECT a,b,c FROM table")
return x
}
}
//creating the T2P object
class App {
val conf = new SparkConf().setAppName("T2P App").setMaster("local[2]")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
val t2p = new T2P(0, sc, sqlContext);
}
Passing the SparkContext as an argument to the T2P class doesn't work since the SparkContext is not serializable (getting a task not serializable error when creating T2P objects). What is the best way to use the SparkContext/SQLContext inside my classes? Or perhaps is this the wrong way to design a data pull type process in Spark?
UPDATE
I realized from the comments on this post that passing the SparkContext was not the problem in itself. The problem was that I was calling a method of the class inside a 'map' function, causing Spark to try to serialize the entire class instance. That fails because the instance holds a SparkContext, which is not serializable.
def startMetricTo(userData: ((Int, String), List[(Int, String)]), startMetric: String) : T2PUser = {
//do something
}
def buildUserRollup() = {
this.userRollup = this.userSorted.map(line=>startMetricTo(line, this.startMetric))
}
This results in a 'task not serializable' exception.
I fixed this problem (with the help of the commenters and other StackOverflow users) by creating a separate MetricCalc object to hold my startMetricTo() method, and then changed buildUserRollup() to call this new startMetricTo(). The closure passed to map now refers only to the standalone MetricCalc object and a local parameter instead of 'this', so Spark no longer tries to serialize the whole T2P instance.
//newly created object
object MetricCalc {
def startMetricTo(userData: ((Int, String), List[(Int, String)]), startMetric: String) : T2PUser = {
//do something
}
}
//using function in T2P
def buildUserRollup(startMetric: String) = {
this.userRollup = this.userSorted.map(line=>MetricCalc.startMetricTo(line, startMetric))
}
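The difference between the two versions can be reproduced without Spark. The sketch below uses plain Java serialization as a stand-in for what Spark does when it ships a closure to executors; FakeContext is a hypothetical placeholder for SparkContext, and the method bodies are dummies.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object ClosureCapture {
  // Stand-in for SparkContext: deliberately not Serializable.
  class FakeContext

  object MetricCalc {
    def startMetricTo(line: String, startMetric: String): Int = line.length + startMetric.length
  }

  class T2P(val ctx: FakeContext, val startMetric: String) extends Serializable {
    def startMetricTo(line: String, startMetric: String): Int = line.length + startMetric.length

    // Calls an instance method and reads a field, so the closure captures `this` (and with it ctx).
    def viaMember: String => Int = line => startMetricTo(line, this.startMetric)

    // Calls a method on a standalone object; only the local String is captured.
    def viaObject(startMetric: String): String => Int =
      line => MetricCalc.startMetricTo(line, startMetric)
  }

  // True if plain Java serialization of f succeeds.
  def serializes(f: AnyRef): Boolean =
    try { new ObjectOutputStream(new ByteArrayOutputStream).writeObject(f); true }
    catch { case _: NotSerializableException => false }
}
```

With val t = new T2P(new FakeContext, "m"), serializing t.viaMember fails because the closure drags the whole instance along, while t.viaObject("m") serializes cleanly. Spark's own closure handling is more involved, but the capture rule illustrated here is the same.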
I tried several options; this is what eventually worked for me:
object SomeName extends App {
val conf = new SparkConf()...
val sc = new SparkContext(conf)
implicit val sqlC = SQLContext.getOrCreate(sc)
getDF1(sqlC)
def getDF1(sqlCo: SQLContext): Unit = {
val query1 = SomeQuery here
val df1 = sqlCo.read.format("jdbc").options(Map("url" -> dbUrl,"dbtable" -> query1)).load.cache()
//iterate through df1 and retrieve the 2nd DataFrame based on some values in the Row of the first DataFrame
df1.foreach(x => {
getDF2(x.getString(0), x.getDecimal(1).toString, x.getDecimal(3).doubleValue) (sqlCo)
})
}
def getDF2(a: String, b: String, c: Double)(implicit sqlCont: SQLContext) : Unit = {
val query2 = Somequery
val sqlcc = SQLContext.getOrCreate(sc)
//val sqlcc = sqlCont //Did not work for me. Also, omitting (implicit sqlCont: SQLContext) altogether did not work
val df2 = sqlcc.read.format("jdbc").options(Map("url" -> dbURL, "dbtable" -> query2)).load().cache()
.
.
.
}
}
Note: In the above code, if I omitted the (implicit sqlCont: SQLContext) parameter from the getDF2 method signature, it would not work. I tried several other ways of passing the sqlContext from one method to the other, and they always gave me a NullPointerException or a Task not serializable exception. The good thing is that it eventually worked this way, and I could retrieve parameters from a row of the first DataFrame and use those values when loading the second DataFrame.
