Esper rules for different users

I recently started programming with Esper. I have a smart wearable that sends pedometer data to my laptop, and I process this data using Esper. Now suppose I have multiple smart wearables, each with a unique MAC address. I use time windows, and my question is: how can I change my rule so that it only fires for events with the same MAC address and takes appropriate action based on that MAC address? My initialization and rule are:
Configuration cepConfig = new Configuration();
cepConfig.addEventType("Steps", Steps.class.getName());
// We setup the engine
EPServiceProvider cep = EPServiceProviderManager.getProvider("myCEPEngine", cepConfig);
EPRuntime cepRT = cep.getEPRuntime();
// We register an EPL statement
EPAdministrator cepSteps1 = cep.getEPAdministrator();
EPStatement cepStatementSteps1 = cepSteps1.createEPL("select * from "
+ "Steps().win:time(1 hour) "
+ "group by macAddress "
+ "having sum(max(steps)-min(steps)) < 100");
cepStatementSteps1.addListener(new rule1Listener());
My Steps class has the following fields:
double steps;
String stepsTimestamp;
String macAddress;
And this is how I insert the events:
Steps steps0 = new Steps(0, new Date(timeStamp).toString(), "K5E45H778");
cepRT.sendEvent(steps0);
Steps steps00 = new Steps(0, new Date(timeStamp).toString(), "LD24ESF74");
cepRT.sendEvent(steps00);
Steps steps1 = new Steps(25, new Date(timeStamp).toString(), "K5E45H778");
cepRT.sendEvent(steps1);
Steps steps2 = new Steps(50, new Date(timeStamp).toString(), "LD24ESF74");
cepRT.sendEvent(steps2);
Steps steps3 = new Steps(55, new Date(timeStamp).toString(), "K5E45H778");
cepRT.sendEvent(steps3);
Steps steps4 = new Steps(105, new Date(timeStamp).toString(), "LD24ESF74");
cepRT.sendEvent(steps4);
Steps steps5 = new Steps(75, new Date(timeStamp).toString(), "K5E45H778");
cepRT.sendEvent(steps5);
Steps steps6 = new Steps(110, new Date(timeStamp).toString(), "K5E45H778");
cepRT.sendEvent(steps6);
This is my output:
Sending tick: Steps: 0.0 Timestamp: Mon Mar 14 18:13:23 CET 2016 Mac Address: K5E45H778
->Rule 1 fired: K5E45H778
Sending tick: Steps: 0.0 Timestamp: Mon Mar 14 18:18:23 CET 2016 Mac Address: LD24ESF7474
->Rule 1 fired: LD24ESF7474
Sending tick: Steps: 25.0 Timestamp: Mon Mar 14 18:23:23 CET 2016 Mac Address: K5E45H778
->Rule 1 fired: K5E45H778
Sending tick: Steps: 105.0 Timestamp: Mon Mar 14 18:28:23 CET 2016 Mac Address: LD24ESF7474
Sending tick: Steps: 55.0 Timestamp: Mon Mar 14 18:33:23 CET 2016 Mac Address: K5E45H778
->Rule 1 fired: K5E45H778
Sending tick: Steps: 75.0 Timestamp: Mon Mar 14 18:38:23 CET 2016 Mac Address: K5E45H778
Sending tick: Steps: 110.0 Timestamp: Mon Mar 14 18:43:23 CET 2016 Mac Address: K5E45H778
Why doesn't the rule fire for the second-to-last event of 75 steps?

The SQL-standard "group by" clause is for aggregation per group. Thus just adding "group by macAddress" should get it done.
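To make "aggregation per group" concrete, here is a minimal plain-Java sketch of what a per-macAddress aggregate computes. It is not Esper code: the class PerDeviceStepRange and its method addAndCheck are hypothetical names, and the one-hour window expiry from the question is deliberately ignored. It only tracks max(steps) - min(steps) separately for each MAC address and compares it with a threshold.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of per-group aggregation; not part of Esper.
class PerDeviceStepRange {

    /** Running min and max of the step counter for one device. */
    static final class Range {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;

        void add(double steps) {
            min = Math.min(min, steps);
            max = Math.max(max, steps);
        }

        double spread() {
            return max - min;
        }
    }

    // One aggregate per MAC address, like one row per group in "group by".
    private final Map<String, Range> byMac = new LinkedHashMap<>();

    /** Records one reading and reports whether this device's spread is below the threshold. */
    boolean addAndCheck(String macAddress, double steps, double threshold) {
        Range range = byMac.computeIfAbsent(macAddress, key -> new Range());
        range.add(steps);
        return range.spread() < threshold;
    }

    public static void main(String[] args) {
        // Same readings as in the question, in the same order.
        String[] macs = {"K5E45H778", "LD24ESF74", "K5E45H778", "LD24ESF74",
                         "K5E45H778", "LD24ESF74", "K5E45H778", "K5E45H778"};
        double[] steps = {0, 0, 25, 50, 55, 105, 75, 110};

        PerDeviceStepRange aggregate = new PerDeviceStepRange();
        for (int i = 0; i < macs.length; i++) {
            boolean below = aggregate.addAndCheck(macs[i], steps[i], 100);
            System.out.println(macs[i] + " steps=" + steps[i] + " belowThreshold=" + below);
        }
    }
}
```

With the readings from the question, the check stays true for K5E45H778 up to the 75-step event and turns false at 110, which is the per-device behavior the asker expects from the rule.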

Related

Agent configuration for 'a1' has no configfilters

I got the message "Agent configuration for 'a1' has no configfilters" when using Flume 1.9 to transfer data from Kafka to HDFS; no other error or info was reported.
The source is KafkaSource and the sink is an HDFS sink, connected through a file channel.
The interceptor is a custom one, which I show below.
The log output is below.
16 Aug 2022 11:45:27,600 WARN [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateConfigFilterSet:623) - Agent configuration for 'a1' has no configfilters.
16 Aug 2022 11:45:27,623 INFO [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration.validateConfiguration:163) - Post-validation flume configuration contains configuration for agents: [a1]
16 Aug 2022 11:45:27,624 INFO [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.loadChannels:151) - Creating channels
16 Aug 2022 11:45:27,628 INFO [conf-file-poller-0] (org.apache.flume.channel.DefaultChannelFactory.create:42) - Creating instance of channel c1 type file
16 Aug 2022 11:45:27,642 INFO [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.loadChannels:205) - Created channel c1
16 Aug 2022 11:45:27,643 INFO [conf-file-poller-0] (org.apache.flume.source.DefaultSourceFactory.create:41) - Creating instance of source r1, type org.apache.flume.source.kafka.KafkaSource
16 Aug 2022 11:45:27,655 INFO [conf-file-poller-0] (org.apache.flume.sink.DefaultSinkFactory.create:42) - Creating instance of sink: k1, type: hdfs
16 Aug 2022 11:45:27,786 INFO [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.getConfiguration:120) - Channel c1 connected to [r1, k1]
16 Aug 2022 11:45:27,787 INFO [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:162) - Starting new configuration:{ sourceRunners:{r1=PollableSourceRunner: { source:org.apache.flume.source.kafka.KafkaSource{name:r1,state:IDLE} counterGroup:{ name:null counters:{} } }} sinkRunners:{k1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor#2aa62fb9 counterGroup:{ name:null counters:{} } }} channels:{c1=FileChannel c1 { dataDirs: [/opt/module/flume/data/ranqi/behavior2] }} }
16 Aug 2022 11:45:27,788 INFO [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:169) - Starting Channel c1
16 Aug 2022 11:45:27,790 INFO [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:184) - Waiting for channel: c1 to start. Sleeping for 500 ms
16 Aug 2022 11:45:27,790 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.FileChannel.start:278) - Starting FileChannel c1 { dataDirs: [/opt/module/flume/data/ranqi/behavior2] }...
16 Aug 2022 11:45:27,833 INFO [lifecycleSupervisor-1-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.register:119) - Monitored counter group for type: CHANNEL, name: c1: Successfully registered new MBean.
16 Aug 2022 11:45:27,833 INFO [lifecycleSupervisor-1-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.start:95) - Component type: CHANNEL, name: c1 started
16 Aug 2022 11:45:27,839 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.Log.<init>:356) - Encryption is not enabled
16 Aug 2022 11:45:27,840 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.Log.replay:406) - Replay started
16 Aug 2022 11:45:27,845 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.Log.replay:418) - Found NextFileID 3, from [/opt/module/flume/data/ranqi/behavior2/log-3, /opt/module/flume/data/ranqi/behavior2/log-2]
16 Aug 2022 11:45:27,851 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.EventQueueBackingStoreFileV3.<init>:55) - Starting up with /opt/module/flume/checkpoint/ranqi/behavior2/checkpoint and /opt/module/flume/checkpoint/ranqi/behavior2/checkpoint.meta
16 Aug 2022 11:45:27,851 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.EventQueueBackingStoreFileV3.<init>:59) - Reading checkpoint metadata from /opt/module/flume/checkpoint/ranqi/behavior2/checkpoint.meta
16 Aug 2022 11:45:27,906 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.FlumeEventQueue.<init>:115) - QueueSet population inserting 0 took 0
16 Aug 2022 11:45:27,908 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.Log.replay:457) - Last Checkpoint Mon Aug 15 17:11:08 CST 2022, queue depth = 0
16 Aug 2022 11:45:27,918 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.Log.doReplay:542) - Replaying logs with v2 replay logic
16 Aug 2022 11:45:27,919 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.ReplayHandler.replayLog:249) - Starting replay of [/opt/module/flume/data/ranqi/behavior2/log-2, /opt/module/flume/data/ranqi/behavior2/log-3]
16 Aug 2022 11:45:27,920 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.ReplayHandler.replayLog:262) - Replaying /opt/module/flume/data/ranqi/behavior2/log-2
16 Aug 2022 11:45:27,925 INFO [lifecycleSupervisor-1-0] (org.apache.flume.tools.DirectMemoryUtils.getDefaultDirectMemorySize:112) - Unable to get maxDirectMemory from VM: NoSuchMethodException: sun.misc.VM.maxDirectMemory(null)
16 Aug 2022 11:45:27,926 INFO [lifecycleSupervisor-1-0] (org.apache.flume.tools.DirectMemoryUtils.allocate:48) - Direct Memory Allocation: Allocation = 1048576, Allocated = 0, MaxDirectMemorySize = 1908932608, Remaining = 1908932608
16 Aug 2022 11:45:27,982 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:660) - Checkpoint for file(/opt/module/flume/data/ranqi/behavior2/log-2) is: 1660554206424, which is beyond the requested checkpoint time: 1660555388025 and position 0
16 Aug 2022 11:45:27,982 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.ReplayHandler.replayLog:262) - Replaying /opt/module/flume/data/ranqi/behavior2/log-3
16 Aug 2022 11:45:27,983 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:658) - fast-forward to checkpoint position: 273662090
16 Aug 2022 11:45:27,983 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.LogFile$SequentialReader.next:683) - Encountered EOF at 273662090 in /opt/module/flume/data/ranqi/behavior2/log-3
16 Aug 2022 11:45:27,983 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.ReplayHandler.replayLog:345) - read: 0, put: 0, take: 0, rollback: 0, commit: 0, skip: 0, eventCount:0
16 Aug 2022 11:45:27,984 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.FlumeEventQueue.replayComplete:417) - Search Count = 0, Search Time = 0, Copy Count = 0, Copy Time = 0
16 Aug 2022 11:45:27,988 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.Log.replay:505) - Rolling /opt/module/flume/data/ranqi/behavior2
16 Aug 2022 11:45:27,988 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.Log.roll:990) - Roll start /opt/module/flume/data/ranqi/behavior2
16 Aug 2022 11:45:27,989 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.LogFile$Writer.<init>:220) - Opened /opt/module/flume/data/ranqi/behavior2/log-4
16 Aug 2022 11:45:27,996 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.Log.roll:1006) - Roll end
16 Aug 2022 11:45:27,996 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.EventQueueBackingStoreFile.beginCheckpoint:230) - Start checkpoint for /opt/module/flume/checkpoint/ranqi/behavior2/checkpoint, elements to sync = 0
16 Aug 2022 11:45:28,000 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint:255) - Updating checkpoint metadata: logWriteOrderID: 1660621527859, queueSize: 0, queueHead: 557327
16 Aug 2022 11:45:28,008 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.Log.writeCheckpoint:1065) - Updated checkpoint for file: /opt/module/flume/data/ranqi/behavior2/log-4 position: 0 logWriteOrderID: 1660621527859
16 Aug 2022 11:45:28,008 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.FileChannel.start:289) - Queue Size after replay: 0 [channel=c1]
16 Aug 2022 11:45:28,290 INFO [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:196) - Starting Sink k1
16 Aug 2022 11:45:28,291 INFO [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:207) - Starting Source r1
16 Aug 2022 11:45:28,292 INFO [lifecycleSupervisor-1-1] (org.apache.flume.instrumentation.MonitoredCounterGroup.register:119) - Monitored counter group for type: SINK, name: k1: Successfully registered new MBean.
16 Aug 2022 11:45:28,292 INFO [lifecycleSupervisor-1-1] (org.apache.flume.instrumentation.MonitoredCounterGroup.start:95) - Component type: SINK, name: k1 started
16 Aug 2022 11:45:28,292 INFO [lifecycleSupervisor-1-4] (org.apache.flume.source.kafka.KafkaSource.doStart:524) - Starting org.apache.flume.source.kafka.KafkaSource{name:r1,state:IDLE}...
The Flume agent is started with the shell script below.
#!/bin/bash
case $1 in
"start")
echo " --------start flume-------"
ssh hadoop104 "nohup /opt/module/flume/bin/flume-ng agent -n a1 -c /opt/module/flume/conf -f /opt/module/flume/job/ranqi/ranqi_kafka_to_hdfs_db.conf >/dev/null 2>&1 &"
;;
"stop")
echo " --------stop flume-------"
ssh hadoop104 "ps -ef | grep ranqi_kafka_to_hdfs_db.conf | grep -v grep |awk '{print \$2}' | xargs -n1 kill"
;;
esac
The Flume config is below.
a1.sources = r1
a1.channels = c1
a1.sinks = k1
a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.batchSize = 5000
a1.sources.r1.batchDurationMillis = 2000
a1.sources.r1.kafka.bootstrap.servers = hadoop102:9092,hadoop103:9092
a1.sources.r1.kafka.topics = copy_1015
a1.sources.r1.kafka.consumer.group.id = flume
a1.sources.r1.setTopicHeader = true
a1.sources.r1.topicHeader = topic
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = com.atguigu.flume.interceptor.ranqi.ranqiTimestampInterceptor$Builder
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /opt/module/flume/checkpoint/ranqi/behavior2
a1.channels.c1.dataDirs = /opt/module/flume/data/ranqi/behavior2/
a1.channels.c1.maxFileSize = 2146435071
a1.channels.c1.capacity = 1123456
a1.channels.c1.keep-alive = 6
## sink1
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /origin_data/ranqi/db/%{topic}_inc/%Y-%m-%d
a1.sinks.k1.hdfs.filePrefix = db
a1.sinks.k1.hdfs.round = false
a1.sinks.k1.hdfs.rollInterval = 10
a1.sinks.k1.hdfs.rollSize = 134217728
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.fileType = CompressedStream
a1.sinks.k1.hdfs.codeC = gzip
## assemble
a1.sources.r1.channels = c1
a1.sinks.k1.channel= c1
The ranqiTimestampInterceptor class I defined is below; it is placed in flume/lib.
package com.atguigu.flume.interceptor.ranqi;
import com.alibaba.fastjson.JSONObject;
import com.atguigu.flume.interceptor.db.TimestampInterceptor;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.interceptor.Interceptor;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.nio.charset.StandardCharsets;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.List;
import java.util.Map;
public class ranqiTimestampInterceptor implements Interceptor {
public static String dateToStamp(String s) throws ParseException {
String res;
SimpleDateFormat simpleDateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
Date date = simpleDateFormat.parse(s);
long ts = date.getTime();
res = String.valueOf(ts);
return res;
}
@Override
public void initialize() {
}
private final static Logger logger = LoggerFactory.getLogger(ranqiTimestampInterceptor.class);
@Override
public Event intercept(Event event) {
byte[] body = event.getBody();
Long createDate ;
String time = new String();
String log = new String(body, StandardCharsets.UTF_8);
JSONObject jsonObject = JSONObject.parseObject(log);
// logger.info(log);
logger.info(String.valueOf(jsonObject));
JSONObject data = jsonObject.getObject("data", JSONObject.class);
if(data.containsKey("createDate") && data.getLong("createDate") != null){
createDate = data.getLong("createDate");
try {
createDate = Long.valueOf(dateToStamp(String.valueOf(createDate)));
time = String.valueOf(createDate);
} catch (ParseException e) {
e.printStackTrace();
}
finally {
Long ts = jsonObject.getLong("ts");
time = String.valueOf(ts);
}
}else{
Long ts = jsonObject.getLong("ts");
time = String.valueOf(ts);
}
System.out.println(time);
logger.info(time);
Map<String, String> headers = event.getHeaders();
headers.put("timestamp",time);
return event;
}
@Override
public List<Event> intercept(List<Event> list) {
for (Event event : list) {
intercept(event);
}
return list;
}
@Override
public void close() {
}
public static class Builder implements Interceptor.Builder{
@Override
public Interceptor build() {
return new TimestampInterceptor();
}
@Override
public void configure(Context context) {
}
}
}

How to close Flume with kafka as Source

I'm upgrading Flume 1.7 to 1.9. I have 5 Kafka sources and 7 AWS S3 sinks in Flume's conf.
First I need to stop Flume 1.7, so I executed the command
kill `ps -ef | grep flume | grep bidinfo | awk '{print $2}'` to stop the bidinfo task, but the process still exists. 22 hours later, it is still alive. What else can I do besides 'kill -9 xxxx'? Suggestions are welcome!
This server has been running Flume 1.7 and Kafka for 60 days. I also tried 'kill -3 xxxx'; however, only two sources and two sinks stopped, and the others kept running.
I read the source code of Flume's KafkaSource and looked at Flume's log.
It is possible that two calls (consumer.wakeup(); consumer.close();) never complete.
#flume-conf
ag.sources = src_1 src_2 src_3 src_4 src_5
ag.channels = ch
ag.sinks = sk_1 sk_2 sk_3 sk_4 sk_5 sk_6 sk_7
#source:kafka
ag.sources.src_1.type = org.apache.flume.source.kafka.KafkaSource
ag.sources.src_1.kafka.bootstrap.servers = xxxx
ag.sources.src_1.kafka.consumer.group.id = flume.xxxxx
ag.sources.src_1.kafka.consumer.retry.backoff.ms = 10000
ag.sources.src_1.batchSize = 5000
ag.sources.src_1.batchDurationMillis = 2000
ag.sources.src_1.kafka.topics = xxxx
ag.sources.src_1.interceptors = i1
ag.sources.src_1.interceptors.i1.type = xxxx.interceptor
ag.sources.src_1.channels = ch
#sink:aws S3
ag.sinks.sk_1.type = hdfs
ag.sinks.sk_1.hdfs.path = s3a://xxxxxxx
ag.sinks.sk_1.hdfs.filePrefix = %{minute}
ag.sinks.sk_1.hdfs.fileSuffix = .xxx.1.lzo
ag.sinks.sk_1.hdfs.rollSize = 0
ag.sinks.sk_1.hdfs.rollCount = 0
ag.sinks.sk_1.hdfs.rollInterval = 0
ag.sinks.sk_1.hdfs.idleTimeout = 180
ag.sinks.sk_1.hdfs.callTimeout = 600000
ag.sinks.sk_1.hdfs.closeTries = 5
ag.sinks.sk_1.hdfs.retryInterval = 60
ag.sinks.sk_1.hdfs.batchSize = 3000
ag.sinks.sk_1.hdfs.codeC = lzop
ag.sinks.sk_1.hdfs.fileType = CompressedStream
ag.sinks.sk_1.hdfs.writeFormat = Text
ag.sinks.sk_1.channel = ch
#channels
ag.channels.ch.type = memory
ag.channels.ch.capacity = 2000000
ag.channels.ch.transactionCapacity = 100000
Error Message
09 Sep 2019 11:14:18,010 ERROR [PollableSourceRunner-KafkaSource-src_2] (org.apache.flume.source.kafka.KafkaSource.doProcess:314) - KafkaSource EXCEPTION, {}
org.apache.flume.ChannelException: java.lang.InterruptedException
at org.apache.flume.channel.BasicTransactionSemantics.commit(BasicTransactionSemantics.java:154)
at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:194)
at org.apache.flume.source.kafka.KafkaSource.doProcess(KafkaSource.java:295)
at org.apache.flume.source.AbstractPollableSource.process(AbstractPollableSource.java:60)
at org.apache.flume.source.PollableSourceRunner$PollingRunner.run(PollableSourceRunner.java:133)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
at java.util.concurrent.Semaphore.tryAcquire(Semaphore.java:582)
at org.apache.flume.channel.MemoryChannel$MemoryTransaction.doCommit(MemoryChannel.java:119)
at org.apache.flume.channel.BasicTransactionSemantics.commit(BasicTransactionSemantics.java:151)
... 5 more
#I think this is where the main problem is
09 Sep 2019 11:14:18,011 INFO [PollableSourceRunner-KafkaSource-src_2] (org.apache.flume.source.PollableSourceRunner$PollingRunner.run:143) - Source runner interrupted. Exiting
09 Sep 2019 11:14:18,011 INFO [agent-shutdown-hook] (org.apache.kafka.clients.consumer.internals.AbstractCoordinator$2.onFailure:571) - LeaveGroup request failed with error
org.apache.kafka.clients.consumer.internals.SendFailedException
#-------------
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:149) - Component type: SOURCE, name: src_2 stopped
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:155) - Shutdown Metric for type: SOURCE, name: src_2. source.start.time == 1567911973591
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:161) - Shutdown Metric for type: SOURCE, name: src_2. source.stop.time == 1567998858018
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:177) - Shutdown Metric for type: SOURCE, name: src_2. source.kafka.commit.time == 2470886
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:177) - Shutdown Metric for type: SOURCE, name: src_2. source.kafka.empty.count == 0
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:177) - Shutdown Metric for type: SOURCE, name: src_2. source.kafka.event.get.time == 78823378
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:177) - Shutdown Metric for type: SOURCE, name: src_2. src.append-batch.accepted == 0
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:177) - Shutdown Metric for type: SOURCE, name: src_2. src.append-batch.received == 0
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:177) - Shutdown Metric for type: SOURCE, name: src_2. src.append.accepted == 0
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:177) - Shutdown Metric for type: SOURCE, name: src_2. src.append.received == 0
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:177) - Shutdown Metric for type: SOURCE, name: src_2. src.events.accepted == 767319160
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:177) - Shutdown Metric for type: SOURCE, name: src_2. src.events.received == 767366357
09 Sep 2019 11:14:18,018 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:177) - Shutdown Metric for type: SOURCE, name: src_2. src.open-connection.count == 0

OutputStream.write is too slow

I have run into the following scenario:
My project has a servlet that handles a request from a Perl client. The request is to download a file, and it is a multipart request.
@RequestMapping(value = "/*", method = RequestMethod.POST)
public void tdRequest(@RequestHeader("Authorization") String authenticate,
HttpServletResponse response,
HttpServletRequest request) throws Exception
{
if (ServletFileUpload.isMultipartContent(request))
{
ServletFileUpload sfu = new ServletFileUpload();
FileItemIterator items = sfu.getItemIterator(request);
while (items.hasNext())
{
FileItemStream item = items.next();
if (("action").equals(item.getFieldName()))
{
InputStream stream = item.openStream();
String value = Streams.asString(stream);
if (("upload").equals(value))
{
uploadRequest(items, response);
return;
}
else if (("download").equals(value))
{
downloadRequest(items, response);
return;
}
The problem is not here; it appears in the downloadRequest() function.
void downloadRequest(FileItemIterator items,
HttpServletResponse response) throws Exception
{
log.info("Start downloadRequest.......");
OutputStream os = response.getOutputStream();
File file = new File("D:\\clip.mp4");
FileInputStream fileIn = new FileInputStream(file);
//while ((datablock = dataOutputStreamServiceImpl.readBlock()) != null)
byte[] outputByte = new byte[ONE_MEGABYE];
while (fileIn.read(outputByte) != -1)
{
System.out.println("--------" + (i = i + 1) + "--------");
System.out.println(new Date());
//dataContent = datablock.getContent();
System.out.println("Start write " + new Date());
os.write(outputByte, 0,outputByte.length);
System.out.println("End write " + new Date());
//System.out.println("----------------------");
}
os.close();
}
}
I read and write the file in blocks of 1 MB. However, downloading the whole file takes too long (in my case, 20 minutes for a 100 MB file).
I added print statements and saw output like this:
The first few blocks are read and written really fast:
--------1--------
Mon Dec 07 16:24:20 ICT 2015
Start write Mon Dec 07 16:24:20 ICT 2015
End write Mon Dec 07 16:24:21 ICT 2015
--------2--------
Mon Dec 07 16:24:21 ICT 2015
Start write Mon Dec 07 16:24:21 ICT 2015
End write Mon Dec 07 16:24:21 ICT 2015
--------3--------
Mon Dec 07 16:24:21 ICT 2015
Start write Mon Dec 07 16:24:21 ICT 2015
End write Mon Dec 07 16:24:21 ICT 2015
But later blocks get slower and slower:
--------72--------
Mon Dec 07 16:29:22 ICT 2015
Start write Mon Dec 07 16:29:22 ICT 2015
End write Mon Dec 07 16:29:29 ICT 2015
--------73--------
Mon Dec 07 16:29:29 ICT 2015
Start write Mon Dec 07 16:29:29 ICT 2015
End write Mon Dec 07 16:29:37 ICT 2015
--------124--------
Mon Dec 07 16:38:22 ICT 2015
Start write Mon Dec 07 16:38:22 ICT 2015
End write Mon Dec 07 16:38:35 ICT 2015
--------125--------
Mon Dec 07 16:38:35 ICT 2015
Start write Mon Dec 07 16:38:35 ICT 2015
End write Mon Dec 07 16:38:48 ICT 2015
The problem is in os.write().
I really cannot understand how the OutputStream writes: why does it take such a long time? Or did I make a mistake?
Sorry for my bad English. I really need your support. Thanks in advance!
This is the Perl code from the client side:
# ----- get connected to download the file
#
$Response = $ua->request(POST $remoteHost ,
Content_Type => 'form-data',
Authorization => $Authorization,
'Proxy-Authorization' => $Proxy_Authorization ,
Content => [ DOS => 1 ,
action => 'download' ,
first_run => 0 ,
dl_filename => $dl_filename ,
delivery_dir => $delivery_dir ,
verbose => $Verbose ,
debug => $debug ,
version => $VERSION
]
);
unless ($Response->is_success) {
my $Msg = $Response->error_as_HTML;
# Remove HTML tags - we're in a DOS shell!
$Msg =~ s/<[^>]+>//g;
print "ERROR! SERVER RESPONSE:\n$Msg\n";
print "$remoteHost\n\n" if $Options{'v'};
Error "Could not connect to " . $remoteHost ;
}
my $Result2 = $Response->content();
Error "Abnormal termination...\n$Result2" if $Result2 =~ /_APP_ERROR_/;
open(F, ">$dl_filename") or Error "Could not open '$dl_filename'!";
binmode F; # unless $dl_filename =~ /\.txt$|\.htm$/;
print F $Result2;
close F;
print "received.\n";
}

One problem is that fileIn.read(outputByte) may read any number of bytes, not necessarily a full outputByte buffer. You read a few KB, then write the full 1 MB, and you quickly run out of disk space. Try this; note the "readed" variable.
void downloadRequest(FileItemIterator items,
HttpServletResponse response) throws Exception
{
log.info("Start downloadRequest.......");
OutputStream os = response.getOutputStream();
File file = new File("D:\\clip.mp4");
FileInputStream fileIn = new FileInputStream(file);
//while ((datablock = dataOutputStreamServiceImpl.readBlock()) != null)
byte[] outputByte = new byte[ONE_MEGABYE];
int readed =0;
while ((readed =fileIn.read(outputByte)) != -1)
{
System.out.println("--------" + (i = i + 1) + "--------");
System.out.println(new Date());
//dataContent = datablock.getContent();
System.out.println("Start write " + new Date());
os.write(outputByte, 0,readed );
System.out.println("End write " + new Date());
//System.out.println("----------------------");
}
os.close();
}
}
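To see why the returned count matters, here is a small self-contained Java sketch. ShortReadDemo and its helper limited are hypothetical names, not part of the question's project; the wrapped stream simply caps every read at a few bytes, the way a socket or a busy disk might return fewer bytes than requested. It compares a copy that ignores the count with one that honors it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class; names are illustrative only.
class ShortReadDemo {

    /** Returns a stream over data whose reads deliver at most maxChunk bytes each. */
    static InputStream limited(byte[] data, int maxChunk) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, maxChunk));
            }
        };
    }

    /** Buggy copy: ignores the count from read() and always writes the whole buffer. */
    static int copyIgnoringCount(InputStream in, int bufferSize) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[bufferSize];
        while (in.read(buffer) != -1) {
            out.write(buffer, 0, buffer.length);
        }
        return out.size();
    }

    /** Correct copy: writes only the bytes that read() actually delivered. */
    static int copyUsingCount(InputStream in, int bufferSize) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[bufferSize];
        int count;
        while ((count = in.read(buffer)) != -1) {
            out.write(buffer, 0, count);
        }
        return out.size();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10];
        System.out.println("ignoring count: " + copyIgnoringCount(limited(data, 3), 4));
        System.out.println("using count:    " + copyUsingCount(limited(data, 3), 4));
    }
}
```

The buggy variant writes more bytes than the source contains whenever a read comes back short, so the output is both larger and corrupted; the corrected variant reproduces the source exactly.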

It looks like your download gets slower and slower the further you get into it. You start out at a second or less per block; by block 72 it is 7+ seconds per block, and by block 125 it is 13 seconds per block.
There is nothing on the server side to explain this. Rather, it has the "smell" of the client side doing something wrong. My guess is that the client side is reading the data from the socket into an in-memory data structure, and that data structure (maybe just a String or StringBuffer or StringBuilder) is getting larger and larger. Either the time taken to expand it is growing, or your memory footprint is growing and the GC is taking longer and longer. (Or both.)
If you showed us the client-side code .....
UPDATE
As I suspected, this line of code will be reading the entire content into the Perl equivalent of a string builder before turning it into a string.
my $Result2 = $Response->content();
Depending on how it is implemented under the hood, this will lead to repeated copying of the data as the builder runs out of buffer space and needs to be expanded. Depending on the buffer expansion strategy that Perl employs for this, it could give O(N^2) behavior, where N is the size of the file you are transferring. (The evidence is that you are not getting O(N) behavior ...)
If you want faster downloads, you need to stream the data on the client side. Read the response content in chunks and write them to the output file. (I'm not a Perl expert, so I can't offer you code.) This will also reduce the memory footprint on the client side ... which could be important if your file sizes increase.
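The chunked approach can be sketched in Java, for illustration only, since the actual client in the question is written in Perl. StreamingDownload and copyToFile are hypothetical names, and the 8 KB buffer size is an arbitrary assumption. The point is that only one small buffer is held in memory at any time, regardless of the file size.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; a sketch of streaming a response body to disk.
class StreamingDownload {

    /** Copies the stream to the target file chunk by chunk and returns the byte count. */
    static long copyToFile(InputStream in, Path target) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192]; // assumed chunk size
        try (OutputStream out = Files.newOutputStream(target)) {
            int count;
            while ((count = in.read(buffer)) != -1) {
                // Write only what was read; memory use stays at one buffer.
                out.write(buffer, 0, count);
                total += count;
            }
        }
        return total;
    }
}
```

A client would pass the response body stream of its HTTP library as the first argument; the work per chunk is constant, so the total time should grow linearly with the file size.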

Not able to change quantity of a commerceItem using handleSetOrderByCommerceId

I am not able to change the quantity of a commerceItem using handleSetOrderByCommerceId. Please find the details below.
I am using the Art Technology Group (ATG) web commerce platform.
Here is my code snippet:
for (CommerceItem cI : commerceItems) {
if (isLoggingDebug()) {
logDebug("inside handleRemoveItemFromOrder : commerceItems iteration : "+cI.getId() +" : "+ cI.getCatalogRefId());
}
request.setParameter(cI.getId(), cI.getQuantity());
if(cI.getCatalogRefId().equals(dealTobeDeleted)){
long quantity = cI.getQuantity();
//String currentSku = cI.getCatalogRefId();
if(quantity > 1){
// Set the new quantity for the commerce item being updated.
request.setParameter(cI.getCatalogRefId(), quantity-1);
setCheckForChangedQuantity(true);
result = super.handleSetOrderByCommerceId(request, response);
if (isLoggingDebug()) {
logDebug("inside handleRemoveItemFromOrder : after super call handleSetOrderByCommerceId : "+result);
}
}else{
String[] dealArray = new String[]{cI.getId()};
repoItemIdOfCommerceItemGettingRemoved = cI.getCatalogRefId();
setRemovalCommerceIds(dealArray);
if (isLoggingDebug()) {
logDebug("inside handleRemoveItemFromOrder : before super call handleRemoveItemFromOrder : "+repoItemIdOfCommerceItemGettingRemoved +" : "+getRemovalCommerceIds().length);
}
break;
}
}
}
The logs say:
**** debug Fri Sep 18 15:49:55 CAT 2015 1442584195886 /atg/dynamo/servlet/pipeline/RequestScopeManager/RequestScope-37/atg/commerce/order/purchase/CartModifierFormHandler no form errors - staying on same page.
**** debug Fri Sep 18 15:49:55 CAT 2015 1442584195886 /atg/dynamo/servlet/pipeline/RequestScopeManager/RequestScope-37/atg/commerce/order/purchase/CartModifierFormHandler no form errors - staying on same page.
**** info Fri Sep 18 15:49:55 CAT 2015 1442584195889 /com/cellc/online/commerce/pricing/calculators/OrderMonthlyCostCalculator monthlyPrice::::::::::::::509.0
**** debug Fri Sep 18 15:49:55 CAT 2015 1442584195889 /atg/dynamo/servlet/pipeline/RequestScopeManager/RequestScope-37/atg/commerce/order/purchase/CartModifierFormHandler runProcess skipped because chain ID is null
**** debug Fri Sep 18 15:49:55 CAT 2015 1442584195889 /atg/dynamo/servlet/pipeline/RequestScopeManager/RequestScope-37/atg/commerce/order/purchase/CartModifierFormHandler no form errors - staying on same page.
**** debug Fri Sep 18 15:49:55 CAT 2015 1442584195889 /atg/commerce/order/OrderManager Order: o8950004 Version in object: 183 Version in repItem: 183
**** debug Fri Sep 18 15:49:56 CAT 2015 1442584196206 /atg/dynamo/servlet/pipeline/RequestScopeManager/RequestScope-37/atg/commerce/order/purchase/CartModifierFormHandler no form errors - staying on same page.

Twitter4j authentication credentials are missing

I would like to make a tweet with Twitter4j in my Android app. Here is my code:
//TWITTER SHARE.
@Click(R.id.img_btn_twitter)
@Background
public void twitterPostWall(){
try {
//Twitter Conf.
ConfigurationBuilder cb = new ConfigurationBuilder();
cb.setDebugEnabled(true)
.setOAuthConsumerKey(CONSUMER_KEY)
.setOAuthConsumerSecret(CONSUMER_SECRET)
.setOAuthAccessToken(ACCESS_KEY)
.setOAuthAccessTokenSecret(ACCESS_SECRET);
TwitterFactory tf = new TwitterFactory(cb.build());
Twitter twitter = new TwitterFactory().getInstance();
twitter.setOAuthConsumer(CONSUMER_KEY, CONSUMER_SECRET);
try {
RequestToken requestToken = twitter.getOAuthRequestToken();
Log.e("Request token: ", "" + requestToken.getToken());
Log.e("Request token secret: ", "" + requestToken.getTokenSecret());
AccessToken accessToken = null;
}
catch (IllegalStateException ie) {
if (!twitter.getAuthorization().isEnabled()) {
Log.e("OAuth consumer key/secret is not set.", "");
}
}
Status status = twitter.updateStatus(postLink);
Log.e("Successfully updated the status to [", "" + status.getText() + "].");
}
catch (TwitterException te) {
Log.e("TWEET FAILED", "");
}
}
I always get this error message from Twitter4j: java.lang.IllegalStateException: Authentication credentials are missing. See http://twitter4j.org/en/configuration.html for the detail. But as you can see, I'm using the builder to set my keys. Can someone help me fix it, please? Thanks.

The problem is in the following lines:
TwitterFactory tf = new TwitterFactory(cb.build());
Twitter twitter = new TwitterFactory().getInstance();
You pass the configuration to one TwitterFactory instance, but you use a different TwitterFactory instance to get the Twitter instance. The second factory never receives the keys you set on the builder.
Hence, you get:
java.lang.IllegalStateException: Authentication credentials are missing
I suggest modifying your code as follows:
//Twitter Conf.
ConfigurationBuilder cb = new ConfigurationBuilder();
cb.setDebugEnabled(true)
.setOAuthConsumerKey(CONSUMER_KEY)
.setOAuthConsumerSecret(CONSUMER_SECRET)
.setOAuthAccessToken(ACCESS_KEY)
.setOAuthAccessTokenSecret(ACCESS_SECRET);
TwitterFactory tf = new TwitterFactory(cb.build());
Twitter twitter = tf.getInstance();
Then use this twitter instance, which is built from the configuration that holds your keys.
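To make the mix-up easier to see in isolation, here is a minimal sketch that uses only the standard library. Config, ClientFactory and Client are hypothetical stand-ins, not Twitter4j classes. The sketch also assumes that a factory created without arguments has no credentials at all, which is a simplification: the real no-argument TwitterFactory falls back to whatever default configuration it can find (for example a twitter4j.properties file).

```java
public class FactoryMixupSketch {

    /** Hypothetical stand-in for a built configuration that holds credentials. */
    static final class Config {
        final String consumerKey;
        final String accessToken;

        Config(String consumerKey, String accessToken) {
            this.consumerKey = consumerKey;
            this.accessToken = accessToken;
        }
    }

    /** Hypothetical stand-in for a client factory. */
    static final class ClientFactory {
        private final Config config; // null when the no-arg constructor is used

        ClientFactory() {
            this(null);
        }

        ClientFactory(Config config) {
            this.config = config;
        }

        Client getInstance() {
            return new Client(config);
        }
    }

    /** Hypothetical stand-in for the client that performs authenticated calls. */
    static final class Client {
        private final Config config;

        Client(Config config) {
            this.config = config;
        }

        boolean hasCredentials() {
            return config != null
                    && config.consumerKey != null
                    && config.accessToken != null;
        }

        String post(String text) {
            if (!hasCredentials()) {
                throw new IllegalStateException("Authentication credentials are missing");
            }
            return "posted: " + text;
        }
    }

    public static void main(String[] args) {
        Config config = new Config("myConsumerKey", "myAccessToken");
        ClientFactory configured = new ClientFactory(config);

        // Mistake: a second, unconfigured factory creates the client.
        Client broken = new ClientFactory().getInstance();

        // Fix: the client comes from the factory that received the config.
        Client working = configured.getInstance();

        System.out.println(broken.hasCredentials());  // false
        System.out.println(working.hasCredentials()); // true
        System.out.println(working.post("hello"));    // posted: hello
    }
}
```

Calling broken.post(...) in this sketch throws an IllegalStateException, which mirrors the symptom in the question: the configured factory exists, but the client was never created from it.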

I was having issues with Twitter4j because I was not providing the right configuration. To fix it, I created the getConfiguration() method below, which builds the configuration that getUserTimeLine() then uses:
public static void main(String args[]) throws Exception {
TwitterServiceImpl impl = new TwitterServiceImpl();
ResponseList<Status> resList = impl.getUserTimeLine("spacex");
for (Status status : resList) {
System.out.println(status.getCreatedAt() + ": " + status.getText());
}
}
public ResponseList<Status> getUserTimeLine(String screenName) throws TwitterException {
TwitterFactory twitterFactory = new TwitterFactory(getConfiguration().build());
Twitter twitter = twitterFactory.getInstance();
twitter.getAuthorization();
Paging paging = new Paging(1, 10);
twitter.getId();
return twitter.getUserTimeline(screenName, paging);
}
public ConfigurationBuilder getConfiguration() {
ConfigurationBuilder cb = new ConfigurationBuilder();
cb.setDebugEnabled(true)
.setOAuthConsumerKey("myConsumerKey")
.setOAuthConsumerSecret("myConsumerSecret")
.setOAuthAccessToken("myAccessToken")
.setOAuthAccessTokenSecret("myAccessTokenSecret");
return cb;
}
To get the required credentials, you must have a Twitter developer account. To find the auth info of an app you created previously, go to: Projects and Apps.
In the end, I was able to retrieve the data from a SpaceX account:
Tue Nov 24 20:58:13 CST 2020: Falcon 9 launches Starlink to orbit – the seventh launch and landing of this booster https://twitter.com/SpaceX/status/1331431972430700545
Tue Nov 24 20:29:36 CST 2020: Deployment of 60 Starlink satellites confirmed https://twitter.com/SpaceX/status/1331424769632215040
Tue Nov 24 20:23:17 CST 2020: Falcon 9’s first stage lands on the Of Course I Still Love You droneship! https://twitter.com/SpaceX/status/1331423180431396864
Tue Nov 24 20:14:20 CST 2020: Liftoff! https://twitter.com/SpaceX/status/1331420926450094080
Tue Nov 24 20:02:38 CST 2020: Watch Falcon 9 launch 60 Starlink satellites ? https://www.spacex.com/launches/index.html https://twitter.com/i/broadcasts/1ypKdgVXWgRxW
Tue Nov 24 19:43:14 CST 2020: T-30 minutes until Falcon 9 launches its sixteenth Starlink mission. Webcast goes live ~15 minutes before liftoff https://www.spacex.com/launches/index.html
Tue Nov 24 18:00:59 CST 2020: RT @elonmusk: Good Starship SN8 static fire! Aiming for first 15km / ~50k ft altitude flight next week. Goals are to test 3 engine ascent,…
Mon Nov 23 15:45:38 CST 2020: Now targeting Tuesday, November 24 at 9:13 p.m. EST for Falcon 9’s launch of Starlink, when weather conditions in the recovery area should improve
Sun Nov 22 20:45:13 CST 2020: Standing down from today’s launch of Starlink. Rocket and payload are healthy; teams will use additional time to complete data reviews and are now working toward backup opportunity on Monday, November 23 at 9:34 p.m. but keeping an eye on recovery weather
Sat Nov 21 22:09:12 CST 2020: More Falcon 9 launch and landing photos ? https://www.flickr.com/photos/spacex https://twitter.com/SpaceX/status/1330362669837082624
