functionally combine list of same object - java

I have a set of alerts that I need to combine and output. I'm struggling to see how to do this functionally. I have everything I need; I just want to combine, format a little, and output.
orderedStatuses contains a set of alerts
data class Alert(
val status: String,
val recordId: String
)
This is what I'm currently returning
Alerts:
Status1 :
000000000000
Status1 :
111111111111
Status2 :
222222222222
Status2 :
333333333333
Status3 :
444444444444
Status3 :
555555555555
This is what I want:
Alerts:
status1 :
('00000', '111111')
status2 :
('222222', '333333')
status3 :
('444444', '55555')
Current code:
val alert = if (orderedStatuses.isEmpty()) {
"No alerts found for status"
} else {
"Records:\n" + orderedStatuses.joinToString("\n") { it ->
"\t${it.status} : \n" + it.recordId
}
}

data class Alert(
val status: String,
val recordId: String
)
val alerts = listOf(
Alert("Status1", "00000"),
Alert("Status1", "111111"),
Alert("Status2", "222222"),
Alert("Status2", "333333"),
Alert("Status3", "444444"),
Alert("Status3", "55555")
)
alerts
.groupBy { it.status }
.map { map -> map.key + " : \n('" + map.value.joinToString("', '") { it.recordId } + "')\n" }
.forEach { print(it) }
This will print:
Status1 :
('00000', '111111')
Status2 :
('222222', '333333')
Status3 :
('444444', '55555')
This might be more readable:
alerts
.groupBy(Alert::status)
.map { (key, value) ->
key + " : \n('" + value.joinToString("', '", transform = Alert::recordId) + "')\n"
}
.forEach(::print)
Detailed example on Kotlin Playground
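Since the question is tagged java, here is a minimal Java-streams sketch of the same grouping for comparison (class and variable names are illustrative, not from the original code):
import java.util.LinkedHashMap;
import java.util.List;
import java.util.stream.Collectors;

public class GroupAlerts {
    // Same Alert shape as above, expressed as a Java record.
    record Alert(String status, String recordId) {}

    public static void main(String[] args) {
        List<Alert> alerts = List.of(
                new Alert("Status1", "00000"), new Alert("Status1", "111111"),
                new Alert("Status2", "222222"), new Alert("Status2", "333333"));

        // Group record ids by status, then format each group as "status :\n('id1', 'id2')".
        String output = alerts.stream()
                .collect(Collectors.groupingBy(Alert::status, LinkedHashMap::new,
                        Collectors.mapping(Alert::recordId,
                                Collectors.joining("', '", "('", "')"))))
                .entrySet().stream()
                .map(e -> e.getKey() + " : \n" + e.getValue())
                .collect(Collectors.joining("\n"));

        System.out.println(output);
    }
}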

Related

Convert DDL String to Spark structType?

I have Hive/Redshift tables and I want to create a Spark DataFrame with precisely the DDL of the original tables, written in Java. Is there an option to achieve that?
I think it may be better to convert the DDL string to a Spark schema JSON, and from that create a DataFrame StructType. I started to investigate the Spark parser API:
String ddlString = "CREATE TABLE data.baab (" +
"id STRING, " +
"test STRING, " +
"test2 STRING, " +
"audit STRUCT<createdDate: TIMESTAMP, createdBy: STRING, lastModifiedDate: TIMESTAMP, lastModifiedBy: STRING>) " +
"USING parquet " +
"LOCATION 's3://test.com' " +
"TBLPROPERTIES ('transient_lastDdlTime' = '1676593278')";
SparkSqlParser parser = new SparkSqlParser();
and I can't see anything related to a DDL parser:
override def parseDataType(sqlText : _root_.scala.Predef.String) : org.apache.spark.sql.types.DataType = { /* compiled code */ }
override def parseExpression(sqlText : _root_.scala.Predef.String) : org.apache.spark.sql.catalyst.expressions.Expression = { /* compiled code */ }
override def parseTableIdentifier(sqlText : _root_.scala.Predef.String) : org.apache.spark.sql.catalyst.TableIdentifier = { /* compiled code */ }
override def parseFunctionIdentifier(sqlText : _root_.scala.Predef.String) : org.apache.spark.sql.catalyst.FunctionIdentifier = { /* compiled code */ }
override def parseMultipartIdentifier(sqlText : _root_.scala.Predef.String) : scala.Seq[_root_.scala.Predef.String] = { /* compiled code */ }
override def parseTableSchema(sqlText : _root_.scala.Predef.String) : org.apache.spark.sql.types.StructType = { /* compiled code */ }
override def parsePlan(sqlText : _root_.scala.Predef.String) : org.apache.spark.sql.catalyst.plans.logical.LogicalPlan = { /* compiled code */ }
protected def astBuilder : org.apache.spark.sql.catalyst.parser.AstBuilder
protected def parse[T](command : _root_.scala.Predef.String)(toResult : scala.Function1[org.apache.spark.sql.catalyst.parser.SqlBaseParser, T]) : T = { /* compiled code */ }
This is what I tried:
StructType struct = null;
Pattern pattern = Pattern.compile("\\(([^()]*)\\)");
Matcher matcher = pattern.matcher(ddlString);
if (matcher.find()) {
String result = matcher.group(1);
struct = StructType.fromDDL(result);
}
return struct;
This works, but I'm afraid that this solution will not cover all the cases.
Any suggestions?
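One way to make the extraction more robust, as a sketch (not tested against every DDL shape): find the closing parenthesis that matches the first opening one by tracking nesting depth, so column types containing parentheses (e.g. DECIMAL(10,2)) don't break the match, and then hand the column list to StructType.fromDDL:
import org.apache.spark.sql.types.StructType;

public class DdlSchemaExtractor {
    // Returns the schema of a "CREATE TABLE name (col1 TYPE, ...) ..." statement.
    static StructType schemaFromCreateTable(String ddl) {
        int start = ddl.indexOf('(');
        if (start < 0) {
            throw new IllegalArgumentException("No column list found in DDL");
        }
        int depth = 0;
        for (int i = start; i < ddl.length(); i++) {
            char c = ddl.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')' && --depth == 0) {
                // e.g. "id STRING, test STRING, audit STRUCT<createdDate: TIMESTAMP, ...>"
                return StructType.fromDDL(ddl.substring(start + 1, i));
            }
        }
        throw new IllegalArgumentException("Unbalanced parentheses in DDL");
    }
}
The parseTableSchema method listed above accepts the same comma-separated column list, so once the list is isolated either parser entry point should work; isolating it is the only fragile part.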

Bluetooth LE - How do I get the Advertisement Interval in milliseconds?

I have to get the advertisement interval in milliseconds. I used result.periodicAdvertisingInterval, but this returns 0. I have to implement something like this:
private val scanCallback = object : ScanCallback(){
@RequiresApi(Build.VERSION_CODES.N)
@SuppressLint("MissingPermission", "NotifyDataSetChanged")
override fun onScanResult(callbackType: Int, result: ScanResult) {
val scanJob = CoroutineScope(Dispatchers.Main).launch {
val tag = deviceMap.computeIfAbsent(result.device.address) {
val newTag = BleTag(result.device.name ?: "Unbekannt", result.device.address, result.rssi , result.scanRecord?.bytes, "")
deviceList.add(newTag)
newTag
}
tag.name = result.device.name ?: "Unbekannt"
tag.rssi = result.rssi
tag.advertisementData = result.scanRecord?.bytes
}
deviceList.sortBy {result.rssi }
recyclerView.adapter?.notifyDataSetChanged()
menu.findItem(R.id.count).title = "Geräte: " + deviceList.size
super.onScanResult(callbackType, result)
}
override fun onScanFailed(errorCode: Int) {
super.onScanFailed(errorCode)
Log.e("Scan failed","")
}
}
This result is obtained by subtracting the timestamps of two consecutive advertisements of the same device.
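A minimal sketch of that approach in Java (class and log-tag names are illustrative): keep the timestamp of the last advertisement per device address via ScanResult.getTimestampNanos() and take the difference when the next one arrives:
import android.bluetooth.le.ScanCallback;
import android.bluetooth.le.ScanResult;
import android.util.Log;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

public class IntervalScanCallback extends ScanCallback {
    private final Map<String, Long> lastSeenNanos = new HashMap<>();

    @Override
    public void onScanResult(int callbackType, ScanResult result) {
        String address = result.getDevice().getAddress();
        long nowNanos = result.getTimestampNanos();           // when this advertisement was received
        Long previousNanos = lastSeenNanos.put(address, nowNanos);
        if (previousNanos != null) {
            long intervalMs = TimeUnit.NANOSECONDS.toMillis(nowNanos - previousNanos);
            Log.d("BleInterval", "Advertising interval for " + address + ": " + intervalMs + " ms");
        }
    }
}
Averaging over several consecutive deltas gives a steadier value, since a single difference is affected by scan-window timing and missed advertisements.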

How can I add where clauses dynamically in Spring Boot? Please tell me your approach

I'm using Spring Boot with an interface that extends CrudRepository, but for some of the queries I need more control. Previously, with Spring, I used to do something like this:
String queryStr = "SELECT s.clave, Descripcion, CAST(Cantidad1 AS SIGNED) cantidad,"
+ " (SELECT SUM(NrCompInt) FROM TransacSQL.Items i WHERE transacNr IN(SELECT DISTINCT ObserIng FROM ComunSql.Trazabi t WHERE t.nrTransacSalida = 0 AND t.articulo = ea.Clave) AND i.Articulo = ea.Clave) valorizacion"
+ " FROM ComunSql.ExiArt ea" + " INNER JOIN ComunSql.Stock s ON ea.Clave = s.Clave"
+ " WHERE ea.Cantidad1 > 0";
if (!idArticulo.isEmpty())
queryStr += " AND s.Clave = :idArticulo";
if (idExistencia != -1)
queryStr += " AND ea.NrExist = :idExistencia";
Query query = sessionFactory.getCurrentSession().createSQLQuery(queryStr);
There are many ways to create the queries you want. You can use criteria-based conditions, add them to the query, and execute them with the help of repositories at the data-access layer of your application. Besides that, you can build the query yourself by constructing the string and using native queries (the same as what you are trying to do). Please note that this approach is not what is called a DSL query; briefly, DSL queries are a query syntax created for Apache Lucene or Lucene-based databases such as Elasticsearch.
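As a sketch of the criteria-based route with Spring Data JPA Specifications (the Articulo entity, its field names, and the repository are hypothetical; newer Boot versions use jakarta.persistence instead of javax.persistence), the optional conditions become predicates that are only added when the corresponding parameter is present:
import java.util.ArrayList;
import java.util.List;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.criteria.Predicate;
import org.springframework.data.jpa.domain.Specification;

@Entity
class Articulo {
    @Id String clave;
    Integer cantidad1;
    Integer nrExist;
}

class ArticuloSpecifications {
    // Builds a WHERE clause whose optional parts depend on the incoming parameters.
    static Specification<Articulo> withFilters(String idArticulo, int idExistencia) {
        return (root, query, cb) -> {
            List<Predicate> predicates = new ArrayList<>();
            predicates.add(cb.greaterThan(root.<Integer>get("cantidad1"), 0)); // always applied
            if (idArticulo != null && !idArticulo.isEmpty()) {
                predicates.add(cb.equal(root.get("clave"), idArticulo));       // optional filter
            }
            if (idExistencia != -1) {
                predicates.add(cb.equal(root.get("nrExist"), idExistencia));   // optional filter
            }
            return cb.and(predicates.toArray(new Predicate[0]));
        };
    }
}
The repository only needs to extend JpaSpecificationExecutor<Articulo> in addition to CrudRepository, and the query is executed with repository.findAll(ArticuloSpecifications.withFilters(idArticulo, idExistencia)).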
For the case at hand, assume we have come through the controller and the service interface and are now in the service implementation. We build up a string that satisfies the relevant query syntax (T-SQL, DSL, etc.) and the logic we want executed, and then pass it to a repository class that extends the Spring Data interface used in the project.
Consider the example below, where the Spring Data repository is based on Elasticsearch:
@Override
public ResponseEntity<?> SearchCore(SequenceSearchDTO entity, Pageable pageable) {
String query="";
Double z=0.00;
Long x=z.longValue();
if(entity.getParentKnowledgeGroupId() != null && !entity.getParentKnowledgeGroupId().isEmpty() && !entity.getParentKnowledgeGroupId().equals("")){
query= "[{\"query_string\": { \"query\" : \""+entity.getParentKnowledgeGroupId()+"\", \"fields\" : [ \"parentKnowledgeGroupId\" ] }}," ;
}
if(entity.getKnowledgeGroupId() != null && !entity.getKnowledgeGroupId().isEmpty() && !entity.getKnowledgeGroupId().equals("")){
query=query+ "{\"query_string\" : { \"query\" : \""+entity.getKnowledgeGroupId()+"\", \"fields\" : [ \"knowledgeGroupId\" ]}},";
}
if(entity.getTerm() != null && !entity.getTerm().isEmpty() && !entity.getTerm().equals("")){
query=query+ "{\"query_string\" : { \"query\" : \""+entity.getTerm()+"\", \"fields\" : [ \"content\" ]}},";
}
if(entity.getPrice() != null && ! entity.getPrice().equals("")){
query=query+"{\"range\" : {\"price\" : {\"from\" :"+x+" , \"to\" : "+entity.getPrice()+", \"include_lower\" : true, \"include_upper\" : true } } },";
}
if(entity.getCourseType() != null && ! entity.getCourseType().equals("")){
query=query+ "{\"query_string\" : { \"query\" : \""+entity.getCourseType()+"\", \"fields\" : [ \"courseType\" ]}},";
}
if(entity.getScore() != null && ! entity.getScore().equals("")){
System.out.println("+++++");
System.out.println(entity.getScore());
query=query+"{\"range\" : {\"scoree\" : {\"from\" : "+0+", \"to\" : "+entity.getScore()+", \"include_lower\" : true, \"include_upper\" : false } } },";
}
if(entity.getBeginDate() != null && ! entity.getBeginDate().equals("")){
if(entity.getEndDate() != null && ! entity.getEndDate().equals("")) {
query = query + "{\"range\" : {\"LogicalDate\" : {\"from\" : " + entity.getBeginDate().getTime() / 1000 + ", \"to\" : " + entity.getEndDate().getTime() / 1000 + ", \"include_lower\" : true, \"include_upper\" : true } } },";
}
else{
query=query+ "{\"range\" : {\"LogicalDate\" : {\"from\" : "+entity.getBeginDate().getTime()/1000+", \"to\" : "+new Date().getTime()/1000+", \"include_lower\" : true, \"include_upper\" : true } } },";
}
}
if(entity.getDuration() != null && ! entity.getDuration().equals("")){
query=query+"{\"range\" : {\"studyTime\" : {\"from\" :"+x+" , \"to\" : "+entity.getDuration()+", \"include_lower\" : true, \"include_upper\" : true } } },";
}
if(entity.getFree() != null){
query=query+ "{\"query_string\" : { \"query\" : \""+entity.getFree()+"\", \"fields\" : [ \"free\" ]}}]";
}
Page<ElasticModelSequence> flag=sequenceRepository.stringQuery(query, pageable);
return ResponseEntity.ok().body(flag);
}
Finally, pass it to the related repository:
@Repository
public interface SequenceRepository extends ElasticsearchRepository<ElasticModelSequence, String> {
@Query("{\"bool\":"+
"{\"must\" :"+
"?0"+
"}"+
"}")
Page<ElasticModelSequence> stringQuery(@Param("flag") String flag, Pageable pageable);
}

What is the default value for spark.sql.columnNameOfCorruptRecord?

I have read the documentation but cannot find the default value of spark.sql.columnNameOfCorruptRecord, even with a Google search.
The second question: how does PERMISSIVE mode work when spark.sql.columnNameOfCorruptRecord is empty or null?
According to the code (19/01/2021) it's _corrupt_record:
val COLUMN_NAME_OF_CORRUPT_RECORD = buildConf("spark.sql.columnNameOfCorruptRecord")
.doc("The name of internal column for storing raw/un-parsed JSON and CSV records that fail " +
"to parse.")
.version("1.2.0")
.stringConf
.createWithDefault("_corrupt_record")
Regarding how PERMISSIVE mode works, you can see this in FailSafeParser[T]:
def parse(input: IN): Iterator[InternalRow] = {
try {
rawParser.apply(input).toIterator.map(row => toResultRow(Some(row), () => null))
} catch {
case e: BadRecordException => mode match {
case PermissiveMode =>
Iterator(toResultRow(e.partialResult(), e.record))
case DropMalformedMode =>
Iterator.empty
case FailFastMode =>
throw new SparkException("Malformed records are detected in record parsing. " +
s"Parse Mode: ${FailFastMode.name}. To process malformed records as null " +
"result, try setting the option 'mode' as 'PERMISSIVE'.", e)
}
}
}
private val toResultRow: (Option[InternalRow], () => UTF8String) => InternalRow = {
if (corruptFieldIndex.isDefined) {
(row, badRecord) => {
var i = 0
while (i < actualSchema.length) {
val from = actualSchema(i)
resultRow(schema.fieldIndex(from.name)) = row.map(_.get(i, from.dataType)).orNull
i += 1
}
resultRow(corruptFieldIndex.get) = badRecord()
resultRow
}
} else {
(row, _) => row.getOrElse(nullResult)
}
}
If it isn't specified, it'll fall back to the default value defined in the configuration.
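For completeness, a minimal sketch of how the column behaves when reading JSON in PERMISSIVE mode (file name and schema are made up): the corrupt-record column has to be declared in the schema, and omitting the per-read option makes Spark fall back to the configured name.
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class CorruptRecordExample {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("corrupt-record-demo")
                .master("local[*]")
                .getOrCreate();

        // Declare the corrupt-record column explicitly so malformed lines are captured in it.
        StructType schema = new StructType()
                .add("id", DataTypes.StringType)
                .add("amount", DataTypes.IntegerType)
                .add("_corrupt_record", DataTypes.StringType);

        Dataset<Row> df = spark.read()
                .schema(schema)
                .option("mode", "PERMISSIVE")
                .option("columnNameOfCorruptRecord", "_corrupt_record") // optional; defaults to the config value
                .json("events.json"); // hypothetical input file

        // Malformed lines keep the raw text in _corrupt_record and get nulls in the parsed columns.
        df.show(false);
    }
}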

Return substring of analyzed, non-stored text field in elasticsearch java api

I work on a project that has a string field (named urlOrContent) which can be short (less than 50 characters) or very long (more than 50 characters), and I want to return only the first 50 characters every time, based on a specific query. My database is Elasticsearch, and my problem is raised in this link; the questioner's response seems to be correct (urlOrContent is an analyzed, non-stored text field). It uses the following script:
{
"script_fields": {
"substring": {
"script": {
"lang": "painless",
"inline": "params._source.text.substring(0, 100)"
}
}
}
}
But my main problem is that I cannot find the equivalent Elasticsearch Java API code. In other words, what should be added to the code below so that it returns only the first 50 characters of the urlOrContent field? Note that this field may not even have 50 characters in some cases, and then the entire string should be returned.
String queryString =
EnumLinkFields.CREATE_TIME.getFieldName() + ":(>=" + dateFrom + " AND <=" + dateTo + ")";
QueryBuilder query = QueryBuilders.queryStringQuery(queryString);
SearchResponse response = TRANSPORT_CLIENT.prepareSearch(MY_INDEX)
.setTypes(MY_TYPE)
.setSearchType(SEARCH_TYPE)
.setQuery(query)
.setFetchSource(null, new String[]{EnumLinkFields.USER_ID.getFieldName()})
.setFrom(offset)
.setSize(count)
.addSort(orderByField, sortOrder)
.execute().actionGet();
I found the best answer.
String queryString =
EnumLinkFields.CREATE_TIME.getFieldName() + ":(>=" + dateFrom + " AND <=" + dateTo + ")";
QueryBuilder query = QueryBuilders.queryStringQuery(queryString);
String codeUrlOrContent = "if (" + EnumElasticScriptField.URL_OR_CONTENT.getFieldName() + ".length() > 50) {" +
"return " + EnumElasticScriptField.URL_OR_CONTENT.getFieldName() + ".substring(0, 50);" +
"} else { " +
"return " + EnumElasticScriptField.URL_OR_CONTENT.getFieldName() + "; }";
Script scriptUrlOrContent = new Script(ScriptType.INLINE, "painless",
codeUrlOrContent, Collections.emptyMap());
Script scriptIsUrl = new Script(ScriptType.INLINE, "painless",
EnumElasticScriptField.IS_URL.getFieldName(), Collections.emptyMap());
SearchResponse response = TRANSPORT_CLIENT.prepareSearch(MY_INDEX)
.setTypes(MY_TYPE)
.setSearchType(SEARCH_TYPE)
.setQuery(query)
.addScriptField(EnumLinkFields.URL_OR_CONTENT.getFieldName(),
scriptUrlOrContent)
.addScriptField(EnumLinkFields.IS_URL.getFieldName(), scriptIsUrl)
.setFrom(offset)
.setSize(count)
.addSort(orderByField, sortOrder)
.execute().actionGet();
Note that the call to the setFetchSource function must be removed and all returned fields must be returned through the script.
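To read the value back on the Java side, a sketch continuing the snippet above (in older transport-client versions getFields() returns SearchHitField instead of DocumentField, but getValue() works the same way):
import org.elasticsearch.common.document.DocumentField;
import org.elasticsearch.search.SearchHit;

for (SearchHit hit : response.getHits().getHits()) {
    DocumentField field = hit.getFields().get(EnumLinkFields.URL_OR_CONTENT.getFieldName());
    if (field != null) {
        // Either the first 50 characters or the full value when it is shorter than 50.
        String urlOrContent = field.getValue();
        System.out.println(urlOrContent);
    }
}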
You can put your script_fields section in the search source alongside the query, i.e. next to what you set with setQuery(query). Your query object probably looks like this right now:
"query" : {
"term" : { "user" : "kimchy" }
}
After you add the script_fields in the object, it should become:
"query" : {
"term" : { "user" : "kimchy" }
},
"script_fields": {
"urlOrContent": {
"script": {
"lang": "painless",
"inline": "if(params._source.urlOrContent.length() > 50){
params._source.urlOrContent.substring(0, 50)
}
else {
params._source.urlOrContent
}"
}
}
}
The resulting hits will have a fields array with the substring you required.
You have to enable scripting by changing the elasticsearch.yml file like so and restarting Elasticsearch:
script.engine.painless.inline.aggs: on
script.engine.painless.inline.update: on
script.inline: on
script.indexed: on
