Aerospike : Retrieve a set of keys from LDT Bin in one call - java

Suppose in my LDT (LargeMap) bin I have the following values:
key1, value1
key2, value2
key3, value3
key4, value4
...
key50, value50
Now, I get my required data using the following snippet:
Map<Object, Object> myFinalRecord = new HashMap<>();
// First call to the client to get the LargeMap associated with the bin
LargeMap largeMap = myDemoClient.getLargeMap(myPolicy, myKey, myLDTBinName, null);
for (String myLDTKey : myRequiredKeysFromLDTBin) {
    try {
        // Each get call results in one call to Aerospike
        myFinalRecord.putAll(largeMap.get(Value.get(myLDTKey)));
    } catch (Exception e) {
        log.warn("Key does not exist in LDT Bin");
    }
}
The problem here is that if myRequiredKeysFromLDTBin contains, say, 20 keys, then largeMap.get(Value.get(myLDTKey)) will make 20 calls to Aerospike.
Thus, at a retrieval time of 1 ms per transaction, one logical call to retrieve 20 ids from a record results in 20 calls to Aerospike, increasing my response time to approximately 20 ms!
So is there any way I can pass a set of ids to be retrieved from an LDT bin so that it takes only one call?

There is no direct API to do a multi-get. A way of doing this would be to call the lmap API directly on the server multiple times through a UDF.
Example 'mymap.lua':
local lmap = require('ldt/lib_lmap');

function getmany(rec, binname, keys)
  local resultmap = map()
  local keycount = #keys
  for i = 1, keycount, 1 do
    local rc = lmap.exists(rec, binname, keys[i])
    if (rc == 1) then
      resultmap[keys[i]] = lmap.get(rec, binname, keys[i]);
    else
      resultmap[keys[i]] = nil;
    end
  end
  return resultmap;
end
Register this Lua file:
aql> register module 'mymap.lua'
OK, 1 module added.
aql> execute lmap.put('bin', 'c', 'd') on test.demo where PK='1'
+-----+
| put |
+-----+
| 0 |
+-----+
1 row in set (0.000 secs)
aql> execute lmap.put('bin', 'b', 'c') on test.demo where PK='1'
+-----+
| put |
+-----+
| 0 |
+-----+
1 row in set (0.001 secs)
aql> execute mymap.getmany('bin', 'JSON["b","a"]') on test.demo where PK='1'
+--------------------------+
| getmany |
+--------------------------+
| {"a":NIL, "b":{"b":"c"}} |
+--------------------------+
1 row in set (0.000 secs)
aql> execute mymap.getmany('bin', 'JSON["b","c"]') on test.demo where PK='1'
+--------------------------------+
| getmany |
+--------------------------------+
| {"b":{"b":"c"}, "c":{"c":"d"}} |
+--------------------------------+
1 row in set (0.000 secs)
The Java code to invoke this would be:
try {
    resultmap = myClient.execute(myPolicy, myKey, "mymap", "getmany",
            Value.get(myLDTBinName), Value.getAsList(myRequiredKeysFromLDTBin));
} catch (Exception e) {
    log.warn("One of the keys does not exist in the LDT bin");
}
The value will be set for each key that exists, and NIL will be returned for keys that do not.
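For completeness, here is a minimal sketch of consuming that result on the client side, assuming the same myClient, myPolicy, myKey, myLDTBinName and myRequiredKeysFromLDTBin as above, and assuming the UDF result comes back as a plain Object that can be cast to a Map with server-side NIL values arriving as null:
import java.util.HashMap;
import java.util.Map;

Object raw = myClient.execute(myPolicy, myKey, "mymap", "getmany",
        Value.get(myLDTBinName), Value.getAsList(myRequiredKeysFromLDTBin));
Map<?, ?> resultmap = (Map<?, ?>) raw;
Map<Object, Object> found = new HashMap<>();
for (Map.Entry<?, ?> entry : resultmap.entrySet()) {
    // Keys missing in the LDT bin come back mapped to null (NIL on the server)
    if (entry.getValue() != null) {
        found.put(entry.getKey(), entry.getValue());
    }
}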

Related

Strange binding behavior of Hibernate when storing OffsetTime (JVM) into Time

I got an unexpected result when trying to store OffsetTime entity properties with Hibernate (5.6.1) into a PostgreSQL time with time zone column.
For example (if the current default zone is +02):
| OffsetTime | Timez    |
| ---------- | -------- |
| 00:00+01   | 00:00+02 |
| 00:00+02   | 00:00+02 |
| 00:00+03   | 00:00+02 |
The original offset was lost and the default was stored instead.
I researched two classes:
org.hibernate.type.descriptor.sql.TimeTypeDescriptor
final Time time = javaTypeDescriptor.unwrap( value, Time.class, options );
org.hibernate.type.descriptor.java.OffsetTimeJavaDescriptor
if ( java.sql.Time.class.isAssignableFrom( type ) ) {
    return (X) java.sql.Time.valueOf( offsetTime.toLocalTime() );
}
I think I have some mistake in understanding this logic, but in other answers I saw this recommendation: (LINK)
ZoneOffset zoneOffset = ZoneOffset.systemDefault().getRules()
.getOffset(LocalDateTime.now());
Notification notification = new Notification()
//...
).setClockAlarm(
OffsetTime.of(7, 30, 0, 0, zoneOffset)
);
So, must I convert all OffsetTime values to the default time zone so that they are stored correctly?
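For what it's worth, here is a minimal sketch of that kind of conversion done in application code before assigning the property; the helper class is made up for illustration, and it relies on OffsetTime.withOffsetSameInstant keeping the same instant while rewriting the offset:
import java.time.OffsetTime;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical helper, not part of Hibernate: shift an OffsetTime to the JVM's
// current default offset before persisting, so that the toLocalTime() call in
// OffsetTimeJavaDescriptor still represents the intended instant.
final class OffsetTimeNormalizer {
    static OffsetTime toDefaultOffset(OffsetTime value) {
        ZoneOffset defaultOffset = ZonedDateTime.now().getOffset();
        return value.withOffsetSameInstant(defaultOffset);
    }
}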

How to apply a function on sequential data within a group in Spark?

I have a custom function that depends on the order of the data. I want to apply this function to each group in Spark, with the groups processed in parallel. How can I do this?
For example,
public ArrayList<Integer> my_logic(ArrayList<Integer> glist) {
    boolean b = true;
    ArrayList<Integer> result = new ArrayList<>();
    for (int i = 1; i < glist.size(); i++) { // size is around 30000
        if (b && glist.get(i - 1) > glist.get(i)) {
            // some logic, then set b to false
            result.add(glist.get(i));
        } else {
            // some logic, then set b to true
        }
    }
    return result;
}
My data:
Col1  Col2
a     1
b     2
a     3
c     4
c     3
...   ...
I want something similar to the below:
df.group_by(col("Col1")).apply(my_logic(col("Col2")));
// output
a [1,3,5…]
b [2,5,8…]
…. ….
In Spark you can use window aggregate functions directly; I will show that here in Scala.
Here is your input data (my preparation):
import scala.collection.JavaConversions._
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._
import org.apache.spark.sql.Row
val schema = StructType(
StructField("Col1", StringType, false) ::
StructField("Col2", IntegerType, false) :: Nil
)
val row = Seq(Row("a", 1),Row("b", 8),Row("b", 2),Row("a", 5),Row("b", 5),Row("a", 3))
val df = spark.createDataFrame(row, schema)
df.show(false)
//input:
// +----+----+
// |Col1|Col2|
// +----+----+
// |a |1 |
// |b |8 |
// |b |2 |
// |a |5 |
// |b |5 |
// |a |3 |
// +----+----+
Here is the code to obtain the desired logic:
import org.apache.spark.sql.expressions.Window
df
// NEWCOLUMN: EVALUATE/CREATE LIST OF VALUES FOR EACH RECORD OVER THE WINDOW AS FRAME MOVES
.withColumn(
"collected_list",
collect_list(col("Col2")) over Window
.partitionBy(col("Col1"))
.orderBy(col("Col2"))
)
// NEWCOLUMN: MAX SIZE OF COLLECTED LIST IN EACH WINDOW
.withColumn(
"max_size",
max(size(col("collected_list"))) over Window.partitionBy(col("Col1"))
)
// FILTER TO GET ONLY HIGHEST SIZED ARRAY ROW
.where(col("max_size") - size(col("collected_list")) === 0)
.orderBy(col("Col1"))
.drop("Col2", "max_size")
.show(false)
// output:
// +----+--------------+
// |Col1|collected_list|
// +----+--------------+
// |a |[1, 3, 5] |
// |b |[2, 5, 8] |
// +----+--------------+
Note:
You can just use the collect_list() aggregate function with groupBy directly, but then you cannot get the collected list ordered.
You can explore the collect_set() aggregate function if you want to eliminate duplicates (with some changes to the above query).
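Since the original question uses Java, here is a rough, untested sketch of the same window-based query with the Java Dataset API, assuming the same df with columns Col1 and Col2:
import static org.apache.spark.sql.functions.*;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.expressions.Window;
import org.apache.spark.sql.expressions.WindowSpec;

// Same idea as the Scala query above: collect Col2 per Col1 ordered by Col2,
// then keep only the longest (i.e., complete) list per group.
WindowSpec byCol1Ordered = Window.partitionBy(col("Col1")).orderBy(col("Col2"));
WindowSpec byCol1 = Window.partitionBy(col("Col1"));

Dataset<Row> result = df
    .withColumn("collected_list", collect_list(col("Col2")).over(byCol1Ordered))
    .withColumn("max_size", max(size(col("collected_list"))).over(byCol1))
    .where(col("max_size").minus(size(col("collected_list"))).equalTo(0))
    .orderBy(col("Col1"))
    .drop("Col2", "max_size");

result.show(false);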
EDIT 2: You can write your own custom collect_list() as a UDAF (UserDefinedAggregateFunction) like this in Scala Spark for DataFrames.
Online docs: Spark 2.3.0 and the latest version.
The code below targets Spark version 2.3.0:
import scala.collection.mutable
import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}

object Your_Collect_Array extends UserDefinedAggregateFunction {
override def inputSchema: StructType = StructType(
StructField("yourInputToAggFunction", LongType, false) :: Nil
)
override def dataType: ArrayType = ArrayType(LongType, false)
override def deterministic: Boolean = true
override def bufferSchema: StructType = {
StructType(
StructField("yourCollectedArray", ArrayType(LongType, false), false) :: Nil
)
}
override def initialize(buffer: MutableAggregationBuffer): Unit = {
buffer(0) = new Array[Long](0)
}
override def update(buffer: MutableAggregationBuffer, input: Row): Unit = {
buffer.update(
0,
buffer.getAs[mutable.WrappedArray[Long]](0) :+ input.getLong(0)
)
}
override def merge(
buffer1: MutableAggregationBuffer,
buffer2: Row
): Unit = {
buffer1.update(
0,
buffer1.getAs[mutable.WrappedArray[Long]](0) ++ buffer2
.getAs[mutable.WrappedArray[Long]](0)
)
}
override def evaluate(buffer: Row): Any =
buffer.getAs[mutable.WrappedArray[Long]](0)
}
// Below is the same query with just one line changed, i.e., calling the custom UDAF written above
df
// NEWCOLUMN : USING OUR CUSTOM UDF
.withColumn(
"your_collected_list",
Your_Collect_Array(col("Col2")) over Window
.partitionBy(col("Col1"))
.orderBy(col("Col2"))
)
// NEWCOLUMN: MAX SIZE OF COLLECTED LIST IN EACH WINDOW
.withColumn(
"max_size",
max(size(col("your_collected_list"))) over Window.partitionBy(col("Col1"))
)
// FILTER TO GET ONLY HIGHEST SIZED ARRAY ROW
.where(col("max_size") - size(col("your_collected_list")) === 0)
.orderBy(col("Col1"))
.drop("Col2", "max_size")
.show(false)
//Output:
// +----+-------------------+
// |Col1|your_collected_list|
// +----+-------------------+
// |a |[1, 3, 5] |
// |b |[2, 5, 8] |
// +----+-------------------+
Note:
UDFs are not that efficient in Spark, so use them only when you absolutely need them; they are mainly intended for data analytics.

Create new dataset using existing dataset by adding null column in-between two columns

I created a dataset in Spark using Java by reading a CSV file. The following is my initial dataset:
+---+----------+-----+---+
|_c0| _c1| _c2|_c3|
+---+----------+-----+---+
| 1|9090999999|NANDU| 22|
| 2|9999999999| SANU| 21|
| 3|9999909090| MANU| 22|
| 4|9090909090|VEENA| 23|
+---+----------+-----+---+
I want to create a dataframe as follows (one column having null values):
+---+----+--------+
|_c0| _c1|     _c2|
+---+----+--------+
|  1|null|   NANDU|
|  2|null|    SANU|
|  3|null|    MANU|
|  4|null|   VEENA|
+---+----+--------+
The following is my existing code:
Dataset<Row> ds = spark.read().format("csv").option("header", "false").load("/home/nandu/Data.txt");
Column[] selectedColumns = new Column[2];
selectedColumns[0] = new Column("_c0");
selectedColumns[1] = new Column("_c2");
Dataset<Row> ds2 = ds.select(selectedColumns);
which will create a dataset as follows:
+---+-----+
|_c0| _c2|
+---+-----+
| 1|NANDU|
| 2| SANU|
| 3| MANU|
| 4|VEENA|
+---+-----+
To select the two columns you want and add a new one with nulls, you can use the following:
import static org.apache.spark.sql.functions.*;
import org.apache.spark.sql.types.DataTypes;

ds.select(col("_c0"), lit(null).cast(DataTypes.StringType).as("_c1"), col("_c2"));
Try the following code:
import org.apache.spark.sql.functions.{ lit => flit}
import org.apache.spark.sql.types._
val ds = spark.range(100).withColumn("c2",$"id")
ds.withColumn("new_col",flit(null: String)).selectExpr("id","new_col","c2").show(5)
Hope this Helps
Cheers :)
Adding a new column with a null string value may solve the problem. Try the following code; although it's written in Scala, you'll get the idea:
import org.apache.spark.sql.functions.lit
import org.apache.spark.sql.types.StringType
val ds2 = ds.withColumn("new_col", lit(null).cast(StringType)).selectExpr("_c0", "new_col as _c1", "_c2")

Passing a parameter in a jpql query select

I have a JPQL query that instantiates a Java object in the SELECT clause:
public List<ChampEtatOT> getEtatOT(Date dateDebut, Date dateFin) {
    Query query = em.createQuery("SELECT NEW ChampEtatOT( ot.numero, uo.denominationFr, ot.etat, ot.dateDebutReelle , ot.dateFinReelle, :dateParam1, :dateParam2, :dateParam3) FROM ordre ot JOIN ot.unite uo")
            .setParameter("dateParam1", dateDebut, TemporalType.DATE)
            .setParameter("dateParam2", dateFin, TemporalType.DATE)
            .setParameter("dateParam3", new Date("2015-01-01"), TemporalType.DATE);
    return query.getResultList();
}
I put three parameters so I can pass them to the constructor.
I get this error:
Caused by: Exception [EclipseLink-6137] (Eclipse Persistence Services - 2.3.2.v20111125-r10461): org.eclipse.persistence.exceptions.QueryException
Exception Description: An Exception was thrown while executing a ReportQuery with a constructor expression: java.lang.NoSuchMethodException: dz.elit.gmao.commun.reporting.classe.ChampEtatOT.<init>(java.lang.String, java.lang.String, java.lang.String, java.util.Date, java.util.Date)
Query: ReportQuery(referenceClass=TravOrdreTravail jpql="SELECT NEW dz.elit.gmao.commun.reporting.classe.ChampEtatOT( ot.numero, uo.denominationFr, ot.etat, ot.dateDebutReelle , ot.dateFinReelle, :dateParam1, :dateParam2, :dateParam3) FROM TravOrdreTravail ot JOIN ot.uniteOrganisationnellle uo")
I think it's not possible to put parameters in a SELECT clause, so does anyone have an idea? The constructor method is as follows:
public ChampEtatOT(String numero, String denominationFr, String etat, Date dateDebutReelle, Date dateFinReelle, Date dateParam1, Date dateParam2, Date dateParam3) {
    this.numero = numero;
    this.denominationFr = denominationFr;
    if (etat.equals("OUV")) {
        if (dateDebutReelle.before(dateParam1)) {
            etatEntreeSortie = "En instance debut du mois";
        } else {
            if (dateDebutReelle.before(dateParam2)) {
                etatEntreeSortie = "En instance fin du mois";
            } else {
                if (dateDebutReelle.after(dateParam1) && dateDebutReelle.before(dateParam2)) {
                    etatEntreeSortie = "Entree/Mois";
                }
            }
        }
    }
}
Problem solved. As you suggested, bRIMOs Bor, it's not possible to pass parameters in a SELECT clause, so I retrieved all the results in a List and then filtered them according to the three dates date1, date2, and date3:
Query query = em.createQuery("SELECT NEW ChampEtatAteliers"
        + "( ot.numero, uo.denominationFr, ot.etat, ot.dateDebutReelle, ot.dateFinReelle) "
        + "FROM ordre ot JOIN ot.unite uo");
List<ChampEtatAteliers> champEtatAtelierses = query.getResultList();
for (ChampEtatAteliers champEtatAtelierse : champEtatAtelierses) {
    if (champEtatAtelierse.getDateDebutReelle().compareTo(date1) >= 0 && champEtatAtelierse.getDateDebutReelle().compareTo(date2) <= 0) {
        champEtatAtelierList2.add(new ChampEtatAteliers(champEtatAtelierse.getNumero(), champEtatAtelierse.getDenominationFr(), "Entree/Mois"));
    }
    if (champEtatAtelierse.getEtat().equals("OUV")) {
        if (champEtatAtelierse.getDateDebutReelle().compareTo(date1) < 0) {
            champEtatAtelierse.setEtatEntreeSortie("En instance début du mois");
        } else {
            if (champEtatAtelierse.getDateDebutReelle().compareTo(date2) <= 0) {
                champEtatAtelierse.setEtatEntreeSortie("En instance fin du mois");
            }
        }
    }
}
I think it's not possible to reference a parameter in the constructor.
In your case it throws a NoSuchMethodException: it means there is no method with that signature in your ChampEtatOT class (5 parameters instead of 8).
You can refer to this answer => Passing a parameter in a jpql query select
So, try to retrieve all the data, then write a filter method that sets all the etatEntreeSortie values on the ChampEtatOT objects in the result list.
Clearly the JPQL BNF does permit passing parameters as constructor arguments.
constructor_expression ::= NEW constructor_name ( constructor_item {, constructor_item}* )
constructor_item ::= single_valued_path_expression | scalar_expression | aggregate_expression |
                     identification_variable
scalar_expression ::= simple_arithmetic_expression | string_primary | enum_primary |
                      datetime_primary | boolean_primary | case_expression | entity_type_expression
string_primary ::= state_field_path_expression | string_literal |
                   input_parameter | functions_returning_strings | aggregate_expression | case_expression
i.e., a scalar_expression can be a string_primary, which can be an input_parameter. So your JPA provider is not meeting the JPA spec, and you should raise a bug on it.

Row within a Row with Ext GWT Grid

Is it possible to have a RowExpander that is not HTML but rather another row? That is, a row has an expand [+] icon, and when expanded, sub-rows appear like "child rows"?
For example, I have a List<ModelData> like this:
ModelData model1 = new BaseModelData();
model1.set("Date", "11-11-11");
model1.set("Time", "11:11:11");
model1.set("Code", "abcdef");
model1.set("Status", "OK");

ModelData model2 = new BaseModelData();
model2.set("Date", "11-11-11");
model2.set("Time", "12:11:11");
model2.set("Code", "abcdef");
model2.set("Status", "Failed");

ModelData model3 = new BaseModelData();
model3.set("Date", "11-11-11");
model3.set("Time", "13:11:11");
model3.set("Code", "abcedf");
model3.set("Status", "Failed");

ModelData model4 = new BaseModelData();
model4.set("Date", "11-11-11");
model4.set("Time", "14:11:11");
model4.set("Code", "abcdef");
model4.set("Status", "Failed");

List<ModelData> data = ...
data.add(model1);
data.add(model2);
data.add(model3);
data.add(model4);
And this will be rendered in the Grid, grouped by the Code and Status columns, like this:
Date | Time | Code | Status
-------------------------------------
11-11-11 | 11:11:11 | abcedf | OK
[+] 11-11-11 | 12:11:11 | abcedf | Failed
|--->11-11-11 | 13:11:11 | abcedf | Failed
|--->11-11-11 | 14:11:11 | abcedf | Failed
Something like this.
Update:
I was advised that the solution would be to extend the RowExpander class and merge it with the GridView class.
You can take a look at GroupingView and TreeGrid and customize one of them for your purposes. It is much safer than trying to reuse GridView's row rendering functionality.
