Converting an Array[ValueRange] to util.List[ValueRange] - java

I am currently trying to do a batch update to a Google Sheet from a Scala project, but I am having a lot of trouble converting an Array[ValueRange] to a util.List[ValueRange].
Here is the code producing the information for the body value of the batch update:
val values = util.Arrays.asList(util.Arrays.asList[AnyRef](
  "Y"
))
val data =
  allTasksToBeCompleted.map { id =>
    val range = s"$sheetName!G${id.toInt}:G${id.toInt}"
    new ValueRange().setRange(range).setValues(values)
  }
I have tried converting via util.Arrays.asList(data) and asInstanceOf, but neither works at runtime.

Have you tried the JavaConverters decorators?
import scala.collection.JavaConverters._
val data = Array(1, 2, 3)
val list = data.toList.asJava
In the REPL this gives:
data: Array[Int] = Array(1, 2, 3)
list: java.util.List[Int] = [1, 2, 3]
This should convert any Array[T] to a java.util.List[T].
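Applied to the code in the question, it would look roughly like this (a sketch; allTasksToBeCompleted, sheetName and values are assumed to be as defined in the question):
import scala.collection.JavaConverters._

val data: java.util.List[ValueRange] =
  allTasksToBeCompleted.map { id =>
    val range = s"$sheetName!G${id.toInt}:G${id.toInt}"
    new ValueRange().setRange(range).setValues(values)
  }.toList.asJava
toList copies the Array into an immutable Scala List, and asJava then wraps that list as a java.util.List without copying the elements again.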

Related

Retain keys with null values while writing JSON in spark

I am trying to write a JSON file using Spark. There are some keys that have null as a value. These show up just fine in the DataSet, but when I write the file, the keys get dropped. How do I ensure they are retained?
Code to write the file:
ddp.coalesce(20).write().mode("overwrite").json("hdfs://localhost:9000/user/dedupe_employee");
Part of the JSON data from the source:
"event_header": {
"accept_language": null,
"app_id": "App_ID",
"app_name": null,
"client_ip_address": "IP",
"event_id": "ID",
"event_timestamp": null,
"offering_id": "Offering",
"server_ip_address": "IP",
"server_timestamp": 1492565987565,
"topic_name": "Topic",
"version": "1.0"
}
Output:
"event_header": {
"app_id": "App_ID",
"client_ip_address": "IP",
"event_id": "ID",
"offering_id": "Offering",
"server_ip_address": "IP",
"server_timestamp": 1492565987565,
"topic_name": "Topic",
"version": "1.0"
}
In the above example keys accept_language, app_name and event_timestamp have been dropped.
Apparently, Spark does not provide any option to handle nulls, so the following custom solution should work.
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule
import com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper

case class EventHeader(
  accept_language: String, app_id: String, app_name: String, client_ip_address: String,
  event_id: String, event_timestamp: String, offering_id: String, server_ip_address: String,
  server_timestamp: Long, topic_name: String, version: String)

val ds = Seq(EventHeader(null, "App_ID", null, "IP", "ID", null, "Offering", "IP", 1492565987565L, "Topic", "1.0")).toDS()

val ds1 = ds.mapPartitions { records =>
  val mapper = new ObjectMapper with ScalaObjectMapper
  mapper.registerModule(DefaultScalaModule)
  records.map(mapper.writeValueAsString(_))
}

ds1.coalesce(1).write.text("hdfs://localhost:9000/user/dedupe_employee")
This will produce output like:
{"accept_language":null,"app_id":"App_ID","app_name":null,"client_ip_address":"IP","event_id":"ID","event_timestamp":null,"offering_id":"Offering","server_ip_address":"IP","server_timestamp":1492565987565,"topic_name":"Topic","version":"1.0"}
If you are on Spark 3, you can set
spark.sql.jsonGenerator.ignoreNullFields false
ignoreNullFields is an option available since Spark 3 that controls whether null fields are dropped when a DataFrame is written out as JSON.
If you need Spark 2 (specifically PySpark 2.4.6), you can try converting the DataFrame to an RDD of Python dicts and then call RDD.saveAsTextFile to write the output to HDFS. The following example may help.
cols = ddp.columns
ddp_ = ddp.rdd
ddp_ = ddp_.map(lambda row: dict([(c, row[c]) for c in cols]))
ddp_.repartition(1).saveAsTextFile(your_hdfs_file_path)
This should produce an output file like:
{"accept_language": None, "app_id":"123", ...}
{"accept_language": None, "app_id":"456", ...}
What's more, if you want to replace Python's None with JSON null, you will need to dump every dict to JSON:
import json
ddp_ = ddp_.map(lambda row: json.dumps(row, ensure_ascii=False))
Since Spark 3, if you are using the class DataFrameWriter
https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/DataFrameWriter.html#json-java.lang.String-
(the same applies to PySpark)
https://spark.apache.org/docs/3.0.0-preview/api/python/_modules/pyspark/sql/readwriter.html
its json method has an option ignoreNullFields=None, where None means True.
So just set this option to false:
ddp.coalesce(20).write().mode("overwrite").option("ignoreNullFields", "false").json("hdfs://localhost:9000/user/dedupe_employee")
To retain null values when converting to JSON, set this config option:
spark = (
    SparkSession.builder.master("local[1]")
    .config("spark.sql.jsonGenerator.ignoreNullFields", "false")
    .getOrCreate()
)

RxJava multiple group-by variables

The example below is C# code. I am using RxJava to use lambdas on Android.
var dex = device.OrderBy(x => x.Connection1)
    .ThenBy(x => x.EventName1)
    .GroupBy(x => new { x.Connection1, x.EventName1 }, y => new { EventNames = y.EventName1 })
    .Select(x => new { Connector = x.Key.Connection1, EventName = x.Key.EventName1, Counter = x.Select(y => y.EventNames).Count() });
I need this code written in Java with the help of RxJava. Does anyone know how to do a multiple group by in RxJava?

Prohibit resolving during loading in typesafe config

I want to prohibit resolving of a.b at load time; I want to substitute param from another config, like this:
val d = ConfigFactory.load(ConfigFactory.parseString(
  """
    |param = x
    |a.b = ${param}
  """.stripMargin))
val a = ConfigFactory.parseString("param = 1")
val result = a.withFallback(d).resolve()
In this case param gets the value 1, but a.b remains x.
I've tried to set ConfigResolveOptions.defaults().setAllowUnresolved(true) when loading config d, but that doesn't work.
How can I overcome this?
The problem is that ConfigFactory.load resolves the substitution immediately. If you take that out, it resolves the way you want:
val p = ConfigFactory.parseString(
  """
    |param = x
    |a.b = ${param}
  """.stripMargin)
val a = ConfigFactory.parseString("param = 1")
val result = a.withFallback(p).resolve()
println(result.getString("a.b"))
This prints 1.
You don't need to use ConfigFactory.load unless you want to pick up reference.conf, etc. If you do want to use ConfigFactory.load, then you should do it after you have composed all the configs together using withFallback.
For example, this also prints 1:
val p = ConfigFactory.parseString(
  """
    |param = x
    |a.b = ${param}
  """.stripMargin)
val a = ConfigFactory.parseString("param = 1")
val result = a.withFallback(p)
val loaded = ConfigFactory.load(result)
println(loaded.getString("a.b"))
Or, say you have an application.conf with an include that you want to use with ConfigFactory.load() (per your comment).
If application.conf looks like
include "foo"
and foo.conf looks like
a.b = ${param}
then this prints 1 also:
val a = ConfigFactory.parseString("param = 1")
val app = ConfigFactory.load("application", ConfigParseOptions.defaults,
ConfigResolveOptions.defaults.setAllowUnresolved(true))
val result = a.withFallback(app).resolve
println(result.getString("a.b"))
In general, if you want A to override B to override C then you should use A.withFallback(B).withFallback(C).
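For example, a small sketch with hypothetical configs a, b and c (highest priority first):
val a = ConfigFactory.parseString("x = 1")
val b = ConfigFactory.parseString("x = 2\ny = ${x}")
val c = ConfigFactory.parseString("x = 3")
val merged = a.withFallback(b).withFallback(c).resolve()
// x comes from a; b's ${x} is resolved against the merged stack at resolve() time,
// so merged.getString("x") and merged.getString("y") are both "1"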
I struggled a bit with the same thing: trying to use "fallbacks" to override values, when they are really designed for layering/merging configs.
Assuming I understand your use case, I recommend instead using file includes.
In my application.conf I have the default value
a.b = "placeholder"
And at the bottom I have the following include
# Local overrides - for development use
include "local.conf"
And finally in local.conf
param = 1
a.b = ${param}
The end result is that a.b will be overridden with 1.
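With those files in place, a plain load should pick up the override (a sketch, assuming the application.conf and local.conf shown above are on the classpath):
val conf = ConfigFactory.load()
println(conf.getString("a.b")) // prints "1"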
Found a workaround for my problem:
Say I have a config file application.conf which uses include to pull in config files that contain substitution syntax, as well as files that declare the config values to be substituted.
val a = ConfigFactory.parseString(s"""param = 1""")
val z = ConfigFactory.parseResources("application.conf") //this doesn't resolve substitutions
val result = a.withFallback(z).resolve().withFallback(ConfigFactory.load("application.conf"))

Devise confirmation tokens in Java

Can't seem to create a functional way to insert a user from Java for Devise. Currently there are these fields:
"_id",
"access_level",
"confirmation_sent_at",
"confirmation_token",
"confirmed_at",
"email",
"encrypted_password",
"sign_in_count"
I am able to insert a document that counts as a user. The problem is that when I go to:
http://www.mysite.com:3000/users/confirmation?confirmation_token=TOKENHERE
I get a message saying that it's invalid.
EDIT 1:
When I resend confirmation instructions for this user (WHICH GENERATES A NEW TOKEN), the user can be logged in. This confirms my doubts about the token being the problem. How can I port Devise's token generator to Java?
EDIT 2:
When I register on the site, it says I should check for a confirmation link. However, if I go into the Mongo shell, manually take out the confirmation token and paste it onto site.com/users/confirmation?confirmation_token=, then it doesn't work! However, if I actually use the confirmation link I was sent, it works. How can I make a VALID token, all from Java? Please help!
For this question you should refer to this stackoverflow answer and to the Rails API of protect_from_forgery.
The short answer is to disable forgery protection in your controller, but this makes your application vulnerable to CSRF attacks:
skip_before_action :verify_authenticity_token
The better way would be to authenticate with a JSON or XML request as these requests are not protected by CSRF protection. You can find a solution for devise here.
Edit
Monkey patch Devise to save the unencoded confirmation token. In your config/initializers/devise.rb:
module Devise
  module Models
    module Confirmable
      def generate_confirmation_token
        raw, enc = Devise.token_generator.generate(self.class, :confirmation_token)
        @raw_confirmation_token = raw
        self.my_unencoded_column = raw # Patch
        self.confirmation_token = enc
        self.confirmation_sent_at = Time.now.utc
      end
    end
  end
end
In case anyone else finds themselves trying to get a Java or Scala app to coexist with a Rails app, I hacked up the following. It's in Scala but uses Java APIs, so it should be easy to read. As far as I can tell it replicates Devise's behavior: if I hit the confirmation link in the Rails app with the raw token, Rails/Devise generates the same encoded string.
import java.security.spec.KeySpec
import javax.crypto.SecretKey
import javax.crypto.SecretKeyFactory
import javax.crypto.spec.PBEKeySpec
import javax.crypto.spec.SecretKeySpec
import javax.crypto.Mac
import javax.xml.bind.DatatypeConverter
import java.util.Base64

// copy functionality from Rails Devise
object TokenGenerator {
  // sample values 9exithzwZ8P9meqdVs3K => 54364224169895883e87c8412be5874039b470e26e762cb3ddc37c0bdcf014f5
  //               5zNMi6egbyPoDUy2t3NY => 75bd5d53aa36d3fc61ac186b4c6e2be8353e6b39536d3cf846719284e05474ca
  private val deviseSecret = sys.env("DEVISE_SECRET")
  private val factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1")
  val encoder = Base64.getUrlEncoder()

  case class TokenInfo(raw: String, encoded: String)

  def createConfirmationToken: TokenInfo = {
    // copy behavior from rails world. Don't know why it does this
    val replacements = Map('l' -> "s", 'I' -> "x", 'O' -> "y", '0' -> "z")
    // make a raw key of 20 chars, doesn't seem to matter what they are, just need url valid set
    val bytes = new Array[Byte](16)
    scala.util.Random.nextBytes(bytes)
    val raw = encoder.encodeToString(bytes).take(20).foldLeft("") { (acc, x) =>
      acc ++ replacements.get(x).getOrElse(x.toString)
    }
    TokenInfo(raw, digestForConfirmationToken(raw))
  }

  private def generateKey(salt: String): Array[Byte] = {
    val iter = 65536
    val keySize = 512
    val spec = new PBEKeySpec(deviseSecret.toCharArray, salt.getBytes("UTF-8"), iter, keySize)
    val sk = factory.generateSecret(spec)
    val skspec = new SecretKeySpec(sk.getEncoded, "AES")
    skspec.getEncoded
  }

  def sha256HexDigest(s: String, key: Array[Byte]): String = {
    val mac = Mac.getInstance("HmacSHA256")
    val keySpec = new SecretKeySpec(key, "RAW")
    mac.init(keySpec)
    val result: Array[Byte] = mac.doFinal(s.getBytes())
    DatatypeConverter.printHexBinary(result).toLowerCase
  }

  private def getDigest(raw: String, salt: String) = sha256HexDigest(raw, generateKey(salt))

  // devise uses salt "Devise #{column}", in this case its confirmation_token
  def digestForConfirmationToken(raw: String) = getDigest(raw, "Devise confirmation_token")
}
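For completeness, a minimal usage sketch (the field names and persistence code are up to you; this only shows the two values the object produces):
val TokenInfo(raw, encoded) = TokenGenerator.createConfirmationToken
// store `encoded` in the user's confirmation_token field (plus confirmation_sent_at),
// and put `raw` in the emailed link:
//   http://www.mysite.com:3000/users/confirmation?confirmation_token=<raw>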

mongodb javascript queries

Currently I'm working on a project in Java, and I need to run JavaScript Mongo queries from Java. I figured out I can do something like that using db.eval(). The problem is that I have the following JavaScript query for Mongo and I have no idea how to pass the whole script to the db.eval() method.
var red = function(doc, out) {
  out.count_order++;
  out.sum_qty += doc.quantity;
  out.sum_base_price += doc.extendedprice;
  out.sum_disc_price += doc.extendedprice * (1 - doc.discount);
  out.sum_charge += doc.extendedprice * (1 - doc.discount) * (1 + doc.tax);
  out.avg_disc += doc.discount
};

var avg = function(out) {
  out.avg_qty = out.sum_qty / out.count_order;
  out.avg_price = out.sum_base_price / out.count_order;
  out.avg_disc = out.avg_disc / out.count_order
};

db.deals.group({
  key: { RETURNFLAG: true, LINESTATUS: true },
  cond: { "SHIPDATE": { $lte: new Date(1998, 8, 1) } },
  initial: { count_order: 0, sum_qty: 0, sum_base_price: 0, sum_disc_price: 0,
             sum_charge: 0, avg_disc: 0 },
  reduce: red,
  finalize: avg
});
I encourage you to look at the MongoDB Java driver to run queries from Java. The Java driver lets you interact with your MongoDB database directly in Java, so you can just port this code to Java and do it all there, avoiding JavaScript and db.eval entirely. Let me know if you would like more clarification.
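For instance, a rough sketch of that port using the legacy Java driver's DBCollection.group overload, written here in Scala (host, port and database name are assumptions; the reduce and finalize functions are passed as JavaScript source strings):
import com.mongodb.{BasicDBObject, MongoClient}

val client = new MongoClient("localhost", 27017)        // assumed host/port
val deals = client.getDB("test").getCollection("deals") // assumed database name

val key = new BasicDBObject("RETURNFLAG", true).append("LINESTATUS", true)
// new Date(1998, 8, 1) in the shell is 1 September 1998 (months are zero-based)
val shipDate = new java.util.GregorianCalendar(1998, 8, 1).getTime
val cond = new BasicDBObject("SHIPDATE", new BasicDBObject("$lte", shipDate))
val initial = new BasicDBObject("count_order", 0).append("sum_qty", 0)
  .append("sum_base_price", 0).append("sum_disc_price", 0)
  .append("sum_charge", 0).append("avg_disc", 0)

// reduce and finalize stay as JavaScript, sent to the server as strings
val reduce =
  """function(doc, out) {
    |  out.count_order++;
    |  out.sum_qty += doc.quantity;
    |  out.sum_base_price += doc.extendedprice;
    |  out.sum_disc_price += doc.extendedprice * (1 - doc.discount);
    |  out.sum_charge += doc.extendedprice * (1 - doc.discount) * (1 + doc.tax);
    |  out.avg_disc += doc.discount;
    |}""".stripMargin
val finalize =
  """function(out) {
    |  out.avg_qty = out.sum_qty / out.count_order;
    |  out.avg_price = out.sum_base_price / out.count_order;
    |  out.avg_disc = out.avg_disc / out.count_order;
    |}""".stripMargin

val result = deals.group(key, cond, initial, reduce, finalize)
println(result)
Note that the group command was deprecated in later MongoDB versions, so on newer servers the aggregation framework is the usual replacement.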
You can also use stored procedures: store the JavaScript functions on the server and call them from the Java driver using eval().
Some details: http://dirolf.com/2010/04/05/stored-javascript-in-mongodb-and-pymongo.html
Recently, in v2.4, there were some concurrency improvements for JavaScript operations: http://docs.mongodb.org/manual/release-notes/2.4-javascript/
