[Hive] I got "ArrayIndexOutOfBoundsException" while querying the Hive database - java

I usually get an "ArrayIndexOutOfBoundsException" when I query the Hive database (both hive-0.11.0 and hive-0.12.0), but sometimes the query succeeds. Here is the error:
java.lang.RuntimeException: Hive Runtime Error while closing operators: java.lang.ArrayIndexOutOfBoundsException: 0
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:313)
at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:232)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:539)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException: 0
at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:231)
at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:74)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:645)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:758)
at org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:257)
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:298)
... 8 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:220)
... 13 more
Could someone help me?
Update: here is my query:
Select distinct jabberUseragent.gameID,agentPlayInfo.gameLabel,jabberUseragent.userAgent,CONCAT(CONCAT(CONCAT(triggerUsageStart.generateDate,' '),triggerUsageStart.timezone),CONCAT(' ',triggerUsageStart.generateTime)) as generateDate,(unix_timestamp(CONCAT(CONCAT(triggerUsageStop.generateDate,' '),triggerUsageStop.generateTime)) - unix_timestamp(CONCAT(CONCAT(triggerUsageStart.generateDate,' '),triggerUsageStart.generateTime))) from
(Select gameSession,gameID,userAgent from(Select distinct regexp_extract(t.payload,'playRequestId:(.*), playRequest',1) as gameSession,regexp_extract(t.payload,'gameId:(.*), userAgent:',1) as gameID,regexp_extract(t.payload,', userAgent:(.*), agentLocation',1) as userAgent,payload from (select * from ${hiveconf:DATA_BASE} p where p.dt >= '${hiveconf:LOW_DATE}' and p.dt <= '${hiveconf:UPPER_DATE}') t where CONCAT(t.generatedate,t.generatetime) >= CONCAT('${hiveconf:LOW_DATE}','${hiveconf:LOW_TIME}') and CONCAT(t.generatedate,t.generatetime) <= CONCAT('${hiveconf:UPPER_DATE}','${hiveconf:UPPER_TIME}'))jabberUseragent where jabberUseragent.gameSession!='' and jabberUseragent.userAgent!='') jabberUseragent
join
(Select gameID,gameLabel from(Select distinct regexp_extract(t.payload,'gameId=(.*),gameLabel=.*,configFilePath',1) as gameID,regexp_extract(t.payload,'gameId=.*,gameLabel=(.*),configFilePath',1) as gameLabel,payload from (select * from ${hiveconf:DATA_BASE} p where p.dt >= '${hiveconf:LOW_DATE}' and p.dt <= '${hiveconf:UPPER_DATE}') t where CONCAT(t.generatedate,t.generatetime) >= CONCAT('${hiveconf:LOW_DATE}','${hiveconf:LOW_TIME}') and CONCAT(t.generatedate,t.generatetime) <= CONCAT('${hiveconf:UPPER_DATE}','${hiveconf:UPPER_TIME}'))agentPlayInfo where agentPlayInfo.gameID!='' and agentPlayInfo.gameLabel!='') agentPlayInfo
join
(Select gameSession,generateDate,generateTime,timezone,payload from(Select distinct regexp_extract(t.payload,'GAME_SESSION=.*((.{8})-(.{4})-(.{4})-(.{4})-(.{12}))\" USAGE=\"([\\w\\-\\(\\)\\.]*,){41}9.*\"',1) as gameSession,generateDate,generateTime,timezone,payload from (select * from ${hiveconf:DATA_BASE} p where p.dt >= '${hiveconf:LOW_DATE}' and p.dt <= '${hiveconf:UPPER_DATE}') t where t.payload like '%[e] usage_record%' and CONCAT(t.generatedate,t.generatetime) <= CONCAT('${hiveconf:UPPER_DATE}','${hiveconf:UPPER_TIME}') and CONCAT(t.generatedate,t.generatetime) >= CONCAT('${hiveconf:LOW_DATE}','${hiveconf:LOW_TIME}'))triggerStart where triggerStart.gameSession!='')triggerUsageStart
join
(Select gameSession,generateDate,generateTime,timezone,payload from(Select distinct regexp_extract(t.payload,'GAME_SESSION=.*((.{8})-(.{4})-(.{4})-(.{4})-(.{12}))\" USAGE=\"([\\w\\-\\(\\)\\.]*,){41}[1-5].*\"',1) as gameSession,generateDate,generateTime,timezone,payload from (select * from ${hiveconf:DATA_BASE} p where p.dt >= '${hiveconf:LOW_DATE}' and p.dt <= '${hiveconf:UPPER_DATE}') t where t.payload like '%[e] usage_record%' and CONCAT(t.generatedate,t.generatetime) <= CONCAT('${hiveconf:UPPER_DATE}','${hiveconf:UPPER_TIME}') and CONCAT(t.generatedate,t.generatetime) >= CONCAT('${hiveconf:LOW_DATE}','${hiveconf:LOW_TIME}'))triggerStop where triggerStop.gameSession!='')triggerUsageStop
on (jabberUseragent.gameSession = triggerUsageStart.gameSession and triggerUsageStart.gameSession = triggerUsageStop.gameSession and jabberUseragent.gameID = agentPlayInfo.gameID) order by generateDate
Sorry, I can't share my sample data.
By the way, I also got the following exception before the "ArrayIndexOutOfBoundsException":
javax.jdo.JDODataStoreException: Error executing SQL query "select PARTITIONS.PART_ID from PARTITIONS inner join TBLS on PARTITIONS.TBL_ID = TBLS.TBL_ID inner join DBS on TBLS.DB_ID = DBS.DB_ID where TBLS.TBL_NAME = ? and DBS.NAME = ? and PARTITIONS.PART_NAME in (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)".
at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)
at org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:321)
at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:181)
at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:82)
at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByNamesInternal(ObjectStore.java:1717)
at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByNames(ObjectStore.java:1700)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)......
NestedThrowablesStackTrace:
org.postgresql.util.PSQLException: ERROR: relation "partitions" does not exist
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:1591)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1340)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:192)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:471)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:373)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:258)......

Based on the information provided, this is the only sensible explanation for your problem. The relevant method is RowContainer.first(), referenced in your stack trace; please go through its source to understand the details.
If you read the source code closely, there are two areas where the ArrayIndexOutOfBoundsException can be thrown:
Accessing the array of input splits obtained from the Configuration
Reading the row from the currentReadBlock array (this is mostly not the case here, since its size is greater than 0)
Please check the set of input files for the job, because InputFormat#getSplits() returns an array of InputSplit objects, and each InputSplit is then assigned to an individual mapper for processing. Most likely, the exception occurs while accessing this InputSplit[] array.
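To make the first possibility concrete, here is a minimal, hedged sketch (old mapred API, hypothetical warehouse path) that prints the InputSplit[] an InputFormat produces for a given input directory, so you can check whether one of the joined inputs yields an empty or otherwise unexpected split array:
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;

public class SplitInspector {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf();
        // Hypothetical path: point this at the table/partition directory you suspect.
        FileInputFormat.setInputPaths(conf, new Path("/user/hive/warehouse/mydb.db/mytable"));
        TextInputFormat format = new TextInputFormat();
        format.configure(conf);
        // The same call Hadoop makes when planning the job: one mapper per returned split.
        InputSplit[] splits = format.getSplits(conf, 1);
        System.out.println("Number of splits: " + splits.length);
        for (InputSplit split : splits) {
            System.out.println(split);
        }
    }
}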

Related

Comparing rows with multiples keys that might or might not be present in 2 Excel sheets

I need to process two Excel files and compare them in order to find the registers (records) that are not present in both tables. Each register is identified by four components: the last 4 digits of an account, the card number, the authorization code, and the amount of the transaction. The problem I'm facing is that while one of the files contains all of these components, the other one does not, leaving holes in the register, so the components are not guaranteed to be unique. I'll include an example:
File 1 with holes where I don't have data:
ACCOUNT  TC    AUTO   AMOUNT
456      0     98846  2302
0        4797  0      79268599
782      8415  38711  78636161
0        0     61658  199234
506      0     0      1029249
0        0     50724  33554244
782      0     12308  28755261
956      0     0      9392043
396      3238  0      2482527
614      4442  0      30846962
855      0     71143  61793724
File 2 with all the information:
ACCOUNT  TC    AUTO   AMOUNT
861      4797  19313  79268599
782      8415  38711  78636161
410      1022  61658  199234
506      6368  93040  1029249
785      2478  50724  33554244
782      4741  12308  28755261
956      2623  45025  9392043
396      3238  14196  2482527
614      4442  61568  30846962
855      5917  71143  61793724
As you can see, in this case I'd need to return this row:
456      0     98846  2302
Is there an efficient way I can do this, either in SQL or using Java data structures? It is important to note that values might be missing from either one of the files. I was thinking about uploading them to a temporary database, but that seems rather inefficient, and I wanted to use a Map that supports the query operation I need, possibly with repeated values. Thank you for your help!
I tried the query below, but it isn't returning any information, and I'm not sure about creating a temporary database table. Is there a Map implementation that supports values with several keys, some of them being optional?
select e1.id, e2.id
from excel1 e1
left join excel2 e2 on
    (
        e1.Cuenta = e2.Cuenta
        OR e1.TC = e2.TC
        OR e1.auto = e2.auto
    )
    AND e1.monto = e2.monto
where e1.id not in
    (
        SELECT e1.id
        FROM excel1 e1
        WHERE id NOT IN
        (
            SELECT e1.id
            FROM excel1 e1
            LEFT JOIN excel2 e2 ON
            (
                e1.Cuenta = e2.Cuenta
                OR e1.TC = e2.TC
                OR e1.auto = e2.auto
            )
            AND e1.monto = e2.monto
            GROUP BY e1.monto
            HAVING COUNT(*) = 1
        )
    )
    AND e2.id =
    (
        SELECT e2.id
        FROM excel1 e1
        WHERE id NOT IN
        (
            SELECT e1.id
            FROM excel1 e1
            LEFT JOIN excel2 e2 ON
            (
                e1.Cuenta = e2.Cuenta
                OR e1.TC = e2.TC
                OR e1.auto = e2.auto
            )
            AND e1.monto = e2.monto
            GROUP BY e1.monto
            HAVING COUNT(*) = 1
        )
    )
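For the Java data-structure route, here is a hedged sketch rather than a definitive implementation: it treats the amount as the mandatory key and compares the optional components only when both sides have them. The Register/Reconciler names, the field names, and the "0 means missing" convention are assumptions taken from the example above; it uses Java 16+ records for brevity.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

record Register(long account, long tc, long auth, long amount) {
    // A register matches when the amounts agree and no component present on BOTH sides disagrees.
    boolean matches(Register other) {
        if (amount != other.amount) return false;
        if (account != 0 && other.account != 0 && account != other.account) return false;
        if (tc != 0 && other.tc != 0 && tc != other.tc) return false;
        return auth == 0 || other.auth == 0 || auth == other.auth;
    }
}

class Reconciler {
    // Returns the registers of `left` that have no compatible counterpart in `right`.
    static List<Register> unmatched(List<Register> left, List<Register> right) {
        // Index the right-hand file by amount so only plausible candidates are compared.
        Map<Long, List<Register>> byAmount = new HashMap<>();
        for (Register r : right) {
            byAmount.computeIfAbsent(r.amount(), k -> new ArrayList<>()).add(r);
        }
        List<Register> result = new ArrayList<>();
        for (Register l : left) {
            List<Register> candidates = byAmount.getOrDefault(l.amount(), List.of());
            if (candidates.stream().noneMatch(l::matches)) {
                result.add(l);
            }
        }
        return result;
    }
}
On the sample data above, unmatched(file1, file2) returns only the 456 / 0 / 98846 / 2302 row; calling it with the arguments swapped gives the rows missing from file 1.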

How to increase Dataflow read parallelism from Cassandra

I am trying to export a lot of data (2 TB, 30kkk rows) from Cassandra to BigQuery. All my infrastructure is on GCP. My Cassandra cluster has 4 nodes (4 vCPUs, 26 GB memory, 2000 GB PD (HDD) each). There is one seed node in the cluster. I need to transform my data before writing to BQ, so I am using Dataflow. The worker type is n1-highmem-2. Workers and Cassandra instances are in the same zone, europe-west1-c. My limits for Cassandra:
The part of my pipeline code responsible for the read transform is shown below under "Pipeline read code".
Autoscaling
The problem is that when I don't set --numWorkers, autoscaling sets the number of workers in such a manner (2 workers on average):
Load balancing
When I set --numWorkers=15, the read rate doesn't increase, and only 2 workers communicate with Cassandra (I can tell from iftop, and only these workers have ~60% CPU load).
At the same time, the Cassandra nodes don't have much load (CPU usage 20-30%). Network and disk usage of the seed node is about 2 times higher than on the other nodes, but not too high, I think:
And for a non-seed node:
Pipeline launch warnings
I get some warnings when the pipeline is launching:
WARNING: Size estimation of the source failed:
org.apache.beam.sdk.io.cassandra.CassandraIO$CassandraSource#7569ea63
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.132.9.101:9042 (com.datastax.driver.core.exceptions.TransportException: [/10.132.9.101:9042] Cannot connect), /10.132.9.102:9042 (com.datastax.driver.core.exceptions.TransportException: [/10.132.9.102:9042] Cannot connect), /10.132.9.103:9042 (com.datastax.driver.core.exceptions.TransportException: [/10.132.9.103:9042] Cannot connect), /10.132.9.104:9042 [only showing errors of first 3 hosts, use getErrors() for more details])
My Cassandra cluster is in a GCE local network, and it seems that some queries are made from my local machine and cannot reach the cluster (I am launching the pipeline with the Dataflow Eclipse plugin, as described here). These queries are about size estimation of the tables. Can I specify the size estimation by hand, or launch the pipeline from a GCE instance? Or can I ignore these warnings? Do they have an effect on the read rate?
I've tried launching the pipeline from a GCE VM. There is no more problem with connectivity. I don't have varchar columns in my tables, but I still get warnings like this (no codec in the DataStax driver for [varchar <-> java.lang.Long]):
WARNING: Can't estimate the size
com.datastax.driver.core.exceptions.CodecNotFoundException: Codec not found for requested operation: [varchar <-> java.lang.Long]
at com.datastax.driver.core.CodecRegistry.notFound(CodecRegistry.java:741)
at com.datastax.driver.core.CodecRegistry.createCodec(CodecRegistry.java:588)
at com.datastax.driver.core.CodecRegistry.access$500(CodecRegistry.java:137)
at com.datastax.driver.core.CodecRegistry$TypeCodecCacheLoader.load(CodecRegistry.java:246)
at com.datastax.driver.core.CodecRegistry$TypeCodecCacheLoader.load(CodecRegistry.java:232)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3628)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2336)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2295)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2208)
at com.google.common.cache.LocalCache.get(LocalCache.java:4053)
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4057)
at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4986)
at com.datastax.driver.core.CodecRegistry.lookupCodec(CodecRegistry.java:522)
at com.datastax.driver.core.CodecRegistry.codecFor(CodecRegistry.java:485)
at com.datastax.driver.core.CodecRegistry.codecFor(CodecRegistry.java:467)
at com.datastax.driver.core.AbstractGettableByIndexData.codecFor(AbstractGettableByIndexData.java:69)
at com.datastax.driver.core.AbstractGettableByIndexData.getLong(AbstractGettableByIndexData.java:152)
at com.datastax.driver.core.AbstractGettableData.getLong(AbstractGettableData.java:26)
at com.datastax.driver.core.AbstractGettableData.getLong(AbstractGettableData.java:95)
at org.apache.beam.sdk.io.cassandra.CassandraServiceImpl.getTokenRanges(CassandraServiceImpl.java:279)
at org.apache.beam.sdk.io.cassandra.CassandraServiceImpl.getEstimatedSizeBytes(CassandraServiceImpl.java:135)
at org.apache.beam.sdk.io.cassandra.CassandraIO$CassandraSource.getEstimatedSizeBytes(CassandraIO.java:308)
at org.apache.beam.runners.direct.BoundedReadEvaluatorFactory$BoundedReadEvaluator.startDynamicSplitThread(BoundedReadEvaluatorFactory.java:166)
at org.apache.beam.runners.direct.BoundedReadEvaluatorFactory$BoundedReadEvaluator.processElement(BoundedReadEvaluatorFactory.java:142)
at org.apache.beam.runners.direct.TransformExecutor.processElements(TransformExecutor.java:146)
at org.apache.beam.runners.direct.TransformExecutor.run(TransformExecutor.java:110)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Pipeline read code
// Read data from Cassandra table
PCollection<Model> pcollection = p.apply(CassandraIO.<Model>read()
    .withHosts(Arrays.asList("10.10.10.101", "10.10.10.102", "10.10.10.103", "10.10.10.104")).withPort(9042)
    .withKeyspace(keyspaceName).withTable(tableName)
    .withEntity(Model.class).withCoder(SerializableCoder.of(Model.class))
    .withConsistencyLevel(CASSA_CONSISTENCY_LEVEL));
// Transform pcollection to KV PCollection by rowName
PCollection<KV<Long, Model>> pcollection_by_rowName = pcollection
    .apply(ParDo.of(new DoFn<Model, KV<Long, Model>>() {
      @ProcessElement
      public void processElement(ProcessContext c) {
        c.output(KV.of(c.element().rowName, c.element()));
      }
    }));
Number of splits (Stackdriver log)
W Number of splits is less than 0 (0), fallback to 1
I Number of splits is 1
W Number of splits is less than 0 (0), fallback to 1
I Number of splits is 1
W Number of splits is less than 0 (0), fallback to 1
I Number of splits is 1
What I've tried
No effect:
set read consistency level to ONE
nodetool setstreamthroughput 1000, nodetool setinterdcstreamthroughput 1000
increase Cassandra read concurrency (in cassandra.yaml): concurrent_reads: 32
setting different number of workers 1-40.
Some effect:
1. I've set numSplits = 10 as @jkff proposed. Now I can see in the logs:
I Murmur3Partitioner detected, splitting
W Can't estimate the size
W Can't estimate the size
W Number of splits is less than 0 (0), fallback to 10
I Number of splits is 10
W Number of splits is less than 0 (0), fallback to 10
I Number of splits is 10
I Splitting source org.apache.beam.sdk.io.cassandra.CassandraIO$CassandraSource#6d83ee93 produced 10 bundles with total serialized response size 20799
I Splitting source org.apache.beam.sdk.io.cassandra.CassandraIO$CassandraSource#25d02f5c produced 10 bundles with total serialized response size 19359
I Splitting source [0, 1) produced 1 bundles with total serialized response size 1091
I Murmur3Partitioner detected, splitting
W Can't estimate the size
I Splitting source [0, 0) produced 0 bundles with total serialized response size 76
W Number of splits is less than 0 (0), fallback to 10
I Number of splits is 10
I Splitting source org.apache.beam.sdk.io.cassandra.CassandraIO$CassandraSource#2661dcf3 produced 10 bundles with total serialized response size 18527
But I've got another exception:
java.io.IOException: Failed to start reading from source: org.apache.beam.sdk.io.cassandra.Cassandra...
(5d6339652002918d): java.io.IOException: Failed to start reading from source: org.apache.beam.sdk.io.cassandra.CassandraIO$CassandraSource#5f18c296
at com.google.cloud.dataflow.worker.WorkerCustomSources$BoundedReaderIterator.start(WorkerCustomSources.java:582)
at com.google.cloud.dataflow.worker.util.common.worker.ReadOperation$SynchronizedReaderIterator.start(ReadOperation.java:347)
at com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:183)
at com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:148)
at com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:68)
at com.google.cloud.dataflow.worker.DataflowWorker.executeWork(DataflowWorker.java:336)
at com.google.cloud.dataflow.worker.DataflowWorker.doWork(DataflowWorker.java:294)
at com.google.cloud.dataflow.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:244)
at com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:135)
at com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:115)
at com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:102)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.datastax.driver.core.exceptions.SyntaxError: line 1:53 mismatched character 'p' expecting '$'
at com.datastax.driver.core.exceptions.SyntaxError.copy(SyntaxError.java:58)
at com.datastax.driver.core.exceptions.SyntaxError.copy(SyntaxError.java:24)
at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:68)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:43)
at org.apache.beam.sdk.io.cassandra.CassandraServiceImpl$CassandraReaderImpl.start(CassandraServiceImpl.java:80)
at com.google.cloud.dataflow.worker.WorkerCustomSources$BoundedReaderIterator.start(WorkerCustomSources.java:579)
... 14 more
Caused by: com.datastax.driver.core.exceptions.SyntaxError: line 1:53 mismatched character 'p' expecting '$'
at com.datastax.driver.core.Responses$Error.asException(Responses.java:144)
at com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:179)
at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:186)
at com.datastax.driver.core.RequestHandler.access$2500(RequestHandler.java:50)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.setFinalResult(RequestHandler.java:817)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onSet(RequestHandler.java:651)
at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1077)
at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1000)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:267)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:129)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:642)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:565)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:479)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:441)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
... 1 more
Maybe there is a mistake here: CassandraServiceImpl.java#L220
And this statement looks like a typo: CassandraServiceImpl.java#L207
Changes I've made to the CassandraIO code
As @jkff proposed, I've changed CassandraIO in the way I needed:
@VisibleForTesting
protected List<BoundedSource<T>> split(CassandraIO.Read<T> spec,
                                       long desiredBundleSizeBytes,
                                       long estimatedSizeBytes) {
  long numSplits = 1;
  List<BoundedSource<T>> sourceList = new ArrayList<>();
  if (desiredBundleSizeBytes > 0) {
    numSplits = estimatedSizeBytes / desiredBundleSizeBytes;
  }
  if (numSplits <= 0) {
    LOG.warn("Number of splits is less than 0 ({}), fallback to 10", numSplits);
    numSplits = 10;
  }
  LOG.info("Number of splits is {}", numSplits);

  Long startRange = MIN_TOKEN;
  Long endRange = MAX_TOKEN;
  Long startToken, endToken;

  String pk = "$pk";
  switch (spec.table()) {
    case "table1":
      pk = "table1_pk";
      break;
    case "table2":
    case "table3":
      pk = "table23_pk";
      break;
  }

  endToken = startRange;
  Long incrementValue = endRange / numSplits - startRange / numSplits;
  String splitQuery;
  if (numSplits == 1) {
    // we have a unique split
    splitQuery = QueryBuilder.select().from(spec.keyspace(), spec.table()).toString();
    sourceList.add(new CassandraIO.CassandraSource<T>(spec, splitQuery));
  } else {
    // we have more than one split
    for (int i = 0; i < numSplits; i++) {
      startToken = endToken;
      endToken = startToken + incrementValue;
      Select.Where builder = QueryBuilder.select().from(spec.keyspace(), spec.table()).where();
      if (i > 0) {
        builder = builder.and(QueryBuilder.gte("token(" + pk + ")", startToken));
      }
      if (i < (numSplits - 1)) {
        builder = builder.and(QueryBuilder.lt("token(" + pk + ")", endToken));
      }
      sourceList.add(new CassandraIO.CassandraSource(spec, builder.toString()));
    }
  }
  return sourceList;
}
I think this should be classified as a bug in CassandraIO. I filed BEAM-3424. You can try building your own version of Beam with that default of 1 changed to 100 or something like that, while this issue is being fixed.
I also filed BEAM-3425 for the bug during size estimation.
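If you do build your own Beam while the ticket is open, here is a minimal, hedged sketch of the change described above (assuming the split() fallback shown earlier); the value 100 is purely illustrative, not a recommendation from the Beam project:
final class SplitFallback {
    // When the size estimate is unusable, fall back to a larger default number of splits
    // instead of 1, so the runner gets enough bundles to spread the Cassandra read.
    static long resolveNumSplits(long estimatedSizeBytes, long desiredBundleSizeBytes) {
        long numSplits = desiredBundleSizeBytes > 0 ? estimatedSizeBytes / desiredBundleSizeBytes : 0;
        return numSplits <= 0 ? 100 : numSplits; // the released CassandraIO falls back to 1 here
    }
}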

EclipseLink - Subqueries with Criteria Query (JPA)

Take the following two tables as given:
Objects with columns objId, objTaxonId
Taxa with columns taxId, taxValidSynonymId, taxName (note: Taxa is the plural of Taxon)
If a Taxon is valid, its taxId and taxValidSynonymId are identical; otherwise they differ. To find all synonyms of a Taxon you 'only' need to find all Taxa whose taxValidSynonymId holds the taxId of the valid Taxon.
How can I get all Objects whose Taxon has a given name (including its synonyms)?
In SQL this is done in a few lines (and minutes):
SELECT *
FROM Objects
WHERE objTaxonId IN (
SELECT taxId
FROM Taxa
WHERE taxName LIKE 'Test Taxon 1'
OR taxSynIdTaxon IN(
SELECT taxId
FROM Taxa
WHERE taxName LIKE 'Test Taxon 1'
)
)
I was able to work out the inner part, where I get the list of Taxa and their synonyms. Now I need to turn this query into a subquery...
String NAME_LIKE = "Test Taxon 1";
EntityManager em = EntityManagerProvider.getEntityManager("TestDB"); // get the EntityManager
CriteriaBuilder cb = em.getCriteriaBuilder();
CriteriaQuery<TaxonImpl> cqObject = cb.createQuery(TaxonImpl.class);
Root<TaxonImpl> taxonRoot = cqObject.from(TaxonImpl.class);
Expression<String> taxon_name = taxonRoot.<String> get("taxName");
Predicate where = cb.equal(taxon_name, NAME_LIKE);
// subquery
Subquery<Integer> subQuery = cqObject.subquery(Integer.class);
Root<TaxonImpl> subRoot = subQuery.from(TaxonImpl.class); // was subQuery.from(clsImpl)
subQuery.select(subRoot.<Integer> get("taxId"));
subQuery.where(cb.equal(subRoot.<String> get("taxName"), NAME_LIKE));
where = cb.or(where, taxonRoot.get("taxValidSynonymId").in(subQuery));
cqObject.where(where);
Query query = em.createQuery(cqObject);
List<TaxonImpl> result = query.getResultList();
NOTE: the Taxon is mapped as a many-to-one relation (target-entity is TaxonImpl).
In my actual application the code (for the subquery) will be built dynamically, so a native query does not help me.
I figured out how to "transform" the subquery into a query, but EclipseLink threw two errors.
The first was a "forbidden access via field" error, when I tried the IN on a result of TaxonImpl (I tried that first, as in my mapping files the Taxon is mapped as an entity).
So after this I tried to translate the SQL 1:1 into JPA.
But EclipseLink generated something weird:
SELECT t0.objIdObject, t0.objAdminCreated, t0.objAdminCreator, t0.objAdminEdited, t0.objAdminEditor, t0.objAdminImport1, t0.objAdminImport2, t0.objAddBool1, t0.objAddBool2, t0.objAddBool3, t0.objAddBool4, t0.objAddBool5, t0.objAddDateTime1, t0.objAddDateTime2, t0.objCommonComments, t0.objCommonDescription, t0.objCommonKeywords, t0.objCommonName, t0.objCommonPublished, t0.objCommonPublishedAs, t0.objCommonStatus, t0.objCommonType, t0.objCommonTypustype, t0.objDetAccuracy, t0.objDetCf, t0.objDetComments, t0.objDetDate, t0.objDetMethod, t0.objDetResult, t0.objAddFloat1, t0.objAddFloat2, t0.objAddFloat3, t0.objAddFloat4, t0.objAddFloat5, t0.objEventAbundance, t0.objEventCollectionMethod, t0.objEventComments, t0.objEventMoreContacts, t0.objEventDateDay1, t0.objEventDate1, t0.objEventDateMonth1, t0.objEventDate2, t0.objEventDateUncertain, t0.objEventDateYear1, t0.objEventEcosystem, t0.objEventHabitat, t0.objEventNumber, t0.objEventPermission, t0.objEventSubstratum, t0.objEventTime1, t0.objEventTime2, t0.objEventWeekNumber, t0.objFlora, t0.objGuidObject, t0.objIOComments, t0.objIODeAccessed, t0.objAddInt1, t0.objAddInt2, t0.objAddInt3, t0.objAddInt4, t0.objAddInt5, t0.objStorageForeignNumber, t0.objStorageNumber, t0.objStorageNumberInCollection, t0.objStorageNumberOld, t0.objStorageNumberPrefix, t0.objAddLkp1, t0.objAddLkp10, t0.objAddLkp2, t0.objAddLkp3, t0.objAddLkp4, t0.objAddLkp5, t0.objAddLkp6, t0.objAddLkp7, t0.objAddLkp8, t0.objAddLkp9, t0.objAddLkpCs1, t0.objAddLkpCs10, t0.objAddLkpCs11, t0.objAddLkpCs12, t0.objAddLkpCs13, t0.objAddLkpCs14, t0.objAddLkpCs15, t0.objAddLkpCs2, t0.objAddLkpCs3, t0.objAddLkpCs4, t0.objAddLkpCs5, t0.objAddLkpCs6, t0.objAddLkpCs7, t0.objAddLkpCs8, t0.objAddLkpCs9, t0.objOriginAccessionDate, t0.objOriginAccessionNumber, t0.objOriginComments, t0.objOriginMoreContacts, t0.objOriginSource, t0.objOriginType, t0.objPreparationComments, t0.objPreparationDate, t0.objPreparationType, t0.objPropAdults, t0.objPropAge, t0.objPropAgeUnit, t0.objPropEggs, t0.objPropFemale, t0.objPropHeight, t0.objPropHeightUnit, t0.objPropJuveniles, t0.objPropLarvae, t0.objPropLength, t0.objPropLengthUnit, t0.objPropMale, t0.objPropObservation, t0.objPropObservationComments, t0.objPropPupae, t0.objPropSex, t0.objPropStadium, t0.objPropWeight, t0.objPropWeightUnit, t0.objPropWidth, t0.objPropWidthUnit, t0.objSiteComments, t0.objStorageComments, t0.objStorageContainerNumber, t0.objStorageContainerPieces, t0.objStorageContainerType, t0.objStorageLevel1, t0.objStorageLevel2, t0.objStorageLevel3, t0.objStorageLevel4, t0.objStorageLevel5, t0.objStorageNumberInContainer, t0.objstoragePieces, t0.objStorageValue, t0.objStorageValueUnit, t0.objAddText1, t0.objAddText10, t0.objAddText2, t0.objAddText3, t0.objAddText4, t0.objAddText5, t0.objAddText6, t0.objAddText7, t0.objAddText8, t0.objAddText9, t0.objIdCollection, t0.objCommonIdReference, t0.objDetIdContact, t0.objDetIdReference, t0.objEventIdContact, t0.objIdExcursion, t0.objOriginIdContact, t0.objPreparationIdContact, t0.objIdProject, t0.objSiteIdSite, t0.objdetIdTaxon
FROM tObjects t0
WHERE t0.objdetIdTaxon IN (
SELECT t1.taxIdTaxon.t1.taxIdTaxon
FROM tTaxa t1
WHERE (t1.taxTaxonDisplay LIKE 'Test Taxon 1'
OR t1.taxSynIdTaxon IN (
SELECT t2.taxSynIdTaxon
FROM tTaxa t2
WHERE t2.taxTaxonDisplay LIKE 'Test Taxon 1')))
To single out the error:
SELECT t1.taxIdTaxon.t1.taxIdTaxon
which is complete nonsense; you cannot dereference a field on a value of type int.
Resolving this error (bug?) leads to a new construct (which still returns the same results):
SELECT t1.objIdObject, t1.objAdminCreated, t1.objAdminCreator, t1.objAdminEdited, t1.objAdminEditor, t1.objAdminImport1, t1.objAdminImport2, t1.objAddBool1, t1.objAddBool2, t1.objAddBool3, t1.objAddBool4, t1.objAddBool5, t1.objAddDateTime1, t1.objAddDateTime2, t1.objCommonComments, t1.objCommonDescription, t1.objCommonKeywords, t1.objCommonName, t1.objCommonPublished, t1.objCommonPublishedAs, t1.objCommonStatus, t1.objCommonType, t1.objCommonTypustype, t1.objDetAccuracy, t1.objDetCf, t1.objDetComments, t1.objDetDate, t1.objDetMethod, t1.objDetResult, t1.objAddFloat1, t1.objAddFloat2, t1.objAddFloat3, t1.objAddFloat4, t1.objAddFloat5, t1.objEventAbundance, t1.objEventCollectionMethod, t1.objEventComments, t1.objEventMoreContacts, t1.objEventDateDay1, t1.objEventDate1, t1.objEventDateMonth1, t1.objEventDate2, t1.objEventDateUncertain, t1.objEventDateYear1, t1.objEventEcosystem, t1.objEventHabitat, t1.objEventNumber, t1.objEventPermission, t1.objEventSubstratum, t1.objEventTime1, t1.objEventTime2, t1.objEventWeekNumber, t1.objFlora, t1.objGuidObject, t1.objIOComments, t1.objIODeAccessed, t1.objAddInt1, t1.objAddInt2, t1.objAddInt3, t1.objAddInt4, t1.objAddInt5, t1.objStorageForeignNumber, t1.objStorageNumber, t1.objStorageNumberInCollection, t1.objStorageNumberOld, t1.objStorageNumberPrefix, t1.objAddLkp1, t1.objAddLkp10, t1.objAddLkp2, t1.objAddLkp3, t1.objAddLkp4, t1.objAddLkp5, t1.objAddLkp6, t1.objAddLkp7, t1.objAddLkp8, t1.objAddLkp9, t1.objAddLkpCs1, t1.objAddLkpCs10, t1.objAddLkpCs11, t1.objAddLkpCs12, t1.objAddLkpCs13, t1.objAddLkpCs14, t1.objAddLkpCs15, t1.objAddLkpCs2, t1.objAddLkpCs3, t1.objAddLkpCs4, t1.objAddLkpCs5, t1.objAddLkpCs6, t1.objAddLkpCs7, t1.objAddLkpCs8, t1.objAddLkpCs9, t1.objOriginAccessionDate, t1.objOriginAccessionNumber, t1.objOriginComments, t1.objOriginMoreContacts, t1.objOriginSource, t1.objOriginType, t1.objPreparationComments, t1.objPreparationDate, t1.objPreparationType, t1.objPropAdults, t1.objPropAge, t1.objPropAgeUnit, t1.objPropEggs, t1.objPropFemale, t1.objPropHeight, t1.objPropHeightUnit, t1.objPropJuveniles, t1.objPropLarvae, t1.objPropLength, t1.objPropLengthUnit, t1.objPropMale, t1.objPropObservation, t1.objPropObservationComments, t1.objPropPupae, t1.objPropSex, t1.objPropStadium, t1.objPropWeight, t1.objPropWeightUnit, t1.objPropWidth, t1.objPropWidthUnit, t1.objSiteComments, t1.objStorageComments, t1.objStorageContainerNumber, t1.objStorageContainerPieces, t1.objStorageContainerType, t1.objStorageLevel1, t1.objStorageLevel2, t1.objStorageLevel3, t1.objStorageLevel4, t1.objStorageLevel5, t1.objStorageNumberInContainer, t1.objstoragePieces, t1.objStorageValue, t1.objStorageValueUnit, t1.objAddText1, t1.objAddText10, t1.objAddText2, t1.objAddText3, t1.objAddText4, t1.objAddText5, t1.objAddText6, t1.objAddText7, t1.objAddText8, t1.objAddText9, t1.objIdCollection, t1.objCommonIdReference, t1.objDetIdContact, t1.objDetIdReference, t1.objEventIdContact, t1.objIdExcursion, t1.objOriginIdContact, t1.objPreparationIdContact, t1.objIdProject, t1.objSiteIdSite, t1.objdetIdTaxon
FROM tTaxa t0, tObjects t1
WHERE (
t0.taxIdTaxon IN (
SELECT t2.taxIdTaxon
FROM tTaxa t2
WHERE (t2.taxTaxonDisplay LIKE 'Test Taxon 1'
OR t2.taxSynIdTaxon IN (
SELECT t3.taxSynIdTaxon
FROM tTaxa t3
WHERE t3.taxTaxonDisplay LIKE 'Test Taxon 1'
)
)
) AND (t0.taxIdTaxon = t1.objdetIdTaxon)
)
This seems strange to me, but it works, and it is faster than my alternative query, which includes an inner join.
NOTE: EclipseLink ignores the JoinType; regardless of what you pass, it uses a LEFT OUTER JOIN (the documentation says otherwise!).
Finally, I provide the two examples, with and without a join:
private static Predicate addSynonymsWithJoins(Root<BioObjectImpl> r, CriteriaBuilder b, CriteriaQuery cq,
        Attribute attr, Path path, Object value) {
    Join taxJoin = r.join(BioObjectEnum.taxon.name(), JoinType.INNER);
    Path<Object> taxValidSynonymId = taxJoin.get(TaxonEnum.validSynonymId.name());
    Subquery<TaxonImpl> innerSubquery = cq.subquery(TaxonImpl.class);
    Root fromSubTax = innerSubquery.from(TaxonImpl.class);
    innerSubquery.select(fromSubTax.<Integer> get(TaxonEnum.id.name()));
    // was cb.like(...): the builder parameter in this method is named b
    Predicate dynamic1 = b.like(fromSubTax.get(TaxonEnum.name.name()), NAME_LIKE);
    innerSubquery.where(dynamic1);
    Predicate dynamic2 = resolveComparator(b, attr, taxJoin.get(attr.getPropertyName()), attr.getValue());
    Predicate p = b.or(taxValidSynonymId.in(innerSubquery), dynamic2);
    return p;
}
private static Predicate addSynonymsWithoutJoins(Root<BioObjectImpl> r, CriteriaBuilder b, CriteriaQuery cq,
        Attribute attr, Path path, Object value) {
    cq.select(r);
    Path<Integer> objTaxonId = r.<Integer> get(BioObjectEnum.taxon.name()).get(TaxonEnum.id.name());
    Subquery<Integer> t2 = cq.subquery(Integer.class);
    Root<TaxonImpl> t2fromTaxon = t2.from(TaxonImpl.class);
    Path<Integer> t2taxId = t2fromTaxon.<Integer> get(TaxonEnum.validSynonymId.name());
    t2.select(t2taxId);
    Predicate t2dynamicWhere = resolveComparator(b, attr, t2fromTaxon.get(attr.getPropertyName()), attr.getValue());
    t2.where(t2dynamicWhere);
    Subquery<Integer> t1 = cq.subquery(Integer.class);
    Root<TaxonImpl> t1fromTaxon = t1.from(TaxonImpl.class);
    // was fromSubTax.get(...): the root in this method is t1fromTaxon
    Predicate t1dynamicWhere = b.like(t1fromTaxon.<String> get(TaxonEnum.name.name()), NAME_LIKE);
    Path<Integer> t1Select = t1fromTaxon.<Integer> get(TaxonEnum.id.name());
    t1.select(t1Select);
    Path<Integer> t1TaxSynonymId = t1fromTaxon.<Integer> get(TaxonEnum.validSynonymId.name());
    t1dynamicWhere = b.or(t1dynamicWhere, t1TaxSynonymId.in(t2));
    t1.where(t1dynamicWhere);
    Predicate where = objTaxonId.in(t1);
    return where;
}
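For completeness, here is a hedged usage sketch that wires the joinless helper into a full query; em and attr are assumed to come from the surrounding application (attr being the question's own Attribute helper, not javax.persistence.metamodel.Attribute):
private static List<BioObjectImpl> findObjectsForTaxonName(EntityManager em, Attribute attr) {
    CriteriaBuilder b = em.getCriteriaBuilder();
    CriteriaQuery<BioObjectImpl> cq = b.createQuery(BioObjectImpl.class);
    Root<BioObjectImpl> r = cq.from(BioObjectImpl.class);
    // build the IN (subquery) predicate exactly as in addSynonymsWithoutJoins above
    cq.select(r).where(addSynonymsWithoutJoins(r, b, cq, attr, null, null));
    return em.createQuery(cq).getResultList();
}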

Solr Custom Transformer Not Working?

I am trying to add a few fields to the Solr index while indexing with the Data Import Handler.
Below is my data-config.xml:
<dataConfig>
<dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost:3306/db" user="*****" password="*********"/>
<script><![CDATA[
function addMergedPdt(row)
{
var m = row.get('mergedPdt');
if(m == null)
{
row.put('mergedPdt',9999999999);
}
return row;
}
]]></script>
<script><![CDATA[
function transform(row)
{
    if(row.get('mergedPdt') == null)
    {
        row.put('catStock', 0);
        row.put('catPxMrp', 0);
        row.put('catPrice', 0);
        row.put('catCount', 1);
        row.put('catRating', 0);
        row.put('catAval', 0);
        return row;
    }
    else
    {
        row.put('catAval', 1);
        return row;
    }
}
]]></script>
<document>
<entity name="product" onError="continue" transformer="script:addMergedPdt" query="select p.id, name, image, stock, lower(p.seller) as seller, brand,
cast(price as signed) as price, cast(pxMrp as signed) as pxMrp, mergedPdt, shipDays, url, cast(rating as signed) as rating,
disc(price, pxMrp) as discount, mc.node as seller_cat, oc.node as cat, substring_index(oc.node, '|', 1) as cat1,
substring(substring_index(oc.node, '|', 2), length(substring_index(oc.node, '|', 1)) + 2) as cat2,
substring(substring_index(oc.node, '|', 3), length(substring_index(oc.node, '|', 2)) + 2) as cat3
from _products as p, _mergedCat as mc, _ourCat as oc where active = 1 and cat_id = mc.id and ourCat = oc.id and
('${dataimporter.request.full}' != 'false' OR last_visited > '${dataimporter.last_index_time}') limit 10000">
<!-- To Papulate the Catalog Data -->
<entity name="mergedPdt" transformer="script:transform" onError="continue" query = "SELECT mergedPdt,count(*) as catCount,cast(max(stock) as signed) as catStock,cast(max(pxMrp) as signed) as catPxMrp,cast(min(price) as signed) as catPrice,cast(avg(rating) as signed) as catRating FROM `_products` where mergedPdt = ${product.mergedPdt}"/>
</entity>
</document>
</dataConfig>
I am getting an error like this:
org.apache.solr.handler.dataimport.DataImportHandlerException: Error invoking script for entity mergedPdt Processing Document # 10000
at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:70)
at org.apache.solr.handler.dataimport.ScriptTransformer.transformRow(ScriptTransformer.java:59)
at org.apache.solr.handler.dataimport.EntityProcessorWrapper.applyTransformer(EntityProcessorWrapper.java:198)
at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:256)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:475)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:514)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)
Caused by: java.lang.NoSuchMethodException: no such method: transform
at com.sun.script.javascript.RhinoScriptEngine.invoke(RhinoScriptEngine.java:286)
at com.sun.script.javascript.RhinoScriptEngine.invokeFunction(RhinoScriptEngine.java:258)
at org.apache.solr.handler.dataimport.ScriptTransformer.transformRow(ScriptTransformer.java:55)
... 10 more
All fields are being indexed except the extra fields which I tried adding via the transformer.
Surprisingly, only one of those fields, "catCount", has been indexed.
You can trust me that the schema definition and the rest of the configuration are correct.
Any lead will be highly appreciated.
Thanks in advance :)
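As a hedged alternative to the script transformer, the same defaulting logic can be written as a Java DIH transformer (a sketch only, not a confirmed fix for the error above). The class name com.example.CatalogTransformer is hypothetical; you would reference it in data-config.xml as transformer="com.example.CatalogTransformer".
package com.example;

import java.util.Map;

import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.Transformer;

// Folds the defaulting logic of both script functions above into one Java transformer.
public class CatalogTransformer extends Transformer {
    @Override
    public Object transformRow(Map<String, Object> row, Context context) {
        if (row.get("mergedPdt") == null) {
            row.put("mergedPdt", 9999999999L); // same sentinel value as addMergedPdt()
            row.put("catStock", 0);
            row.put("catPxMrp", 0);
            row.put("catPrice", 0);
            row.put("catCount", 1);
            row.put("catRating", 0);
            row.put("catAval", 0);
        } else {
            row.put("catAval", 1);
        }
        return row;
    }
}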

JPA 2 Constructor Expression cannot use constructor

I use OpenJPA 2.1.1 on WebSphere Application Server 8.
I want to create an object from a SELECT query using a constructor expression:
String queryString = "SELECT NEW mypackage.StatisticDataObject(c.source, "
+ "SUM(CASE WHEN (c.validTo <= CURRENT_TIMESTAMP AND c.expireNoProblem LIKE 'N') THEN 1 ELSE 0 END ), "
+ "SUM(CASE WHEN (c.validTo <= CURRENT_TIMESTAMP AND c.expireNoProblem LIKE 'Y') THEN 1 ELSE 0 END ), "
+ "SUM(CASE WHEN (c.validTo > :timestamp30 ) THEN 1 ELSE 0 END ), "
+ "SUM(CASE WHEN (c.validTo > :timestamp10 AND c.validTo <= :timestamp30 ) THEN 1 ELSE 0 END ), "
+ "SUM(CASE WHEN (c.validTo > CURRENT_TIMESTAMP AND c.validTo <= :timestamp10 ) THEN 1 ELSE 0 END ) )"
+ "FROM MYTABLE c GROUP BY c.source";
TypedQuery<StatisticDataObject> q = em.createQuery(queryString,
StatisticDataObject.class);
q.setParameter("timestamp30", getTimestampIn(30));
q.setParameter("timestamp10", getTimestampIn(10));
Constructor:
public StatisticDataObject(String name, Integer expired,
Integer expiredButOK, Integer expireIn10Days, Integer expireIn30Days,
Integer expireGT30Days) {
this.name = name;
this.expired = expired;
this.expiredButOK = expiredButOK;
this.expireIn10Days = expireIn10Days;
this.expireIn30Days = expireIn30Days;
this.expireGT30Days = expireGT30Days;
}
But I get the following exception:
Caused by: <openjpa-2.1.1-SNAPSHOT-r422266:1141200 nonfatal user error> org.apache.openjpa.persistence.ArgumentException: Query "SELECT NEW mypackage StatisticDataObject(c.source, ... at org.apache.openjpa.kernel.QueryImpl.execute(QueryImpl.java:872)
at org.apache.openjpa.kernel.QueryImpl.execute(QueryImpl.java:794)
at org.apache.openjpa.kernel.DelegatingQuery.execute(DelegatingQuery.java:542)
at org.apache.openjpa.persistence.QueryImpl.execute(QueryImpl.java:315)
at org.apache.openjpa.persistence.QueryImpl.getResultList(QueryImpl.java:331)
...
Caused by: java.lang.RuntimeException: Es wurde kein Konstruktor für "class mypackage.StatisticDataObject" mit den Argumenttypen "[class java.lang.String, class java.lang.String, class java.lang.String, class java.lang.String, class java.lang.String, class java.lang.String]" gefunden, um die Daten einzutragen.
// English translation: Caused by: java.lang.RuntimeException: No constructor was found for "class mypackage.StatisticDataObject" with the argument types "[class java.lang.String, class java.lang.String, class java.lang.String, class java.lang.String, class java.lang.String, class java.lang.String]" to fill in the data.
at org.apache.openjpa.kernel.FillStrategy$NewInstance.findConstructor(FillStrategy.java:139)
at org.apache.openjpa.kernel.FillStrategy$NewInstance.fill(FillStrategy.java:144)
at org.apache.openjpa.kernel.ResultShape.pack(ResultShape.java:362)
at org.apache.openjpa.kernel.ResultShapePacker.pack(ResultShapePacker.java:48)
at org.apache.openjpa.kernel.QueryImpl$PackingResultObjectProvider.getResultObject(QueryImpl.java:2082)
at org.apache.openjpa.lib.rop.EagerResultList.<init>(EagerResultList.java:36)
at org.apache.openjpa.kernel.QueryImpl.toResult(QueryImpl.java:1251)
at org.apache.openjpa.kernel.QueryImpl.execute(QueryImpl.java:1007)
at org.apache.openjpa.kernel.QueryImpl.execute(QueryImpl.java:863)
... 84 more
If I run the query without NEW mypackage.StatisticDataObject(), it works and returns Object[]. Also, the class of object[1-5] (via .getClass()) is Integer.
So why does JPA return a String from SUM() instead of an Integer when using the constructor expression?
In your constructor, take into account that the field to which you apply SUM needs to be numeric, and the constructor's parameter types must correspond to the result type of SUM, which follows the field type. For example, if a Double field is summed, the result is returned as a Double; if a Long field is summed, the result is returned as a Long.
That is the root cause of the problem: SUM does not return an Integer.
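As a hedged follow-up to that point: the JPA specification maps SUM over integral state fields to Long, so one common adjustment, sketched here with the field names from the question, is to add a Long-based constructor that delegates to the existing Integer-based one:
// Sketch only: assumes the aggregates arrive as Long and delegates to the Integer constructor above.
public StatisticDataObject(String name, Long expired, Long expiredButOK,
        Long expireIn10Days, Long expireIn30Days, Long expireGT30Days) {
    this(name,
         expired == null ? null : expired.intValue(),
         expiredButOK == null ? null : expiredButOK.intValue(),
         expireIn10Days == null ? null : expireIn10Days.intValue(),
         expireIn30Days == null ? null : expireIn30Days.intValue(),
         expireGT30Days == null ? null : expireGT30Days.intValue());
}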
