I have an application that subscribes to a topic in GCP; when messages arrive, it downloads them and sends them to a queue on ActiveMQ.
To make this fast, I am using an ExecutorService and launching multiple threads to send messages to ActiveMQ. Since the subscription is supposed to be an ongoing task, I put the code in a while(true) loop, so I can't shut down the ExecutorService in the normal fashion: I would be creating and shutting down the executor service in every iteration.
I am searching for an elegant way to shut down the ExecutorService when the subscription has been empty (no data in the topic) for 2 or 3 minutes, or some other inactivity window, and then of course start it again when new data arrives.
The following is my idea, which I don't like: a counter that I increment whenever the subscription retrieves no data.
I am looking for a more elegant way of doing that.
@Service
@Slf4j
public class PubSubSubscriberService {
    private static final int EMPTY_SUBSCRIPTION_COUNTER = 4;
    private static final Logger businessLogger = LoggerFactory.getLogger("BusinessLogger");
    private Queue<PubsubMessage> messages = new ConcurrentLinkedQueue<>();

    public void pullMessagesAndSendToBroker(CompositeConfigurationElement cce) {
        var patchSize = cce.getSubscriber().getPatchSize();
        var nThreads = cce.getSubscriber().getSendingParallelThreads();
        var scheduledTasks = 0;
        var subscribeCounter = 0;
        ThreadPoolExecutor threadPoolExecutor = null;
        while (true) {
            try {
                if (subscribeCounter < EMPTY_SUBSCRIPTION_COUNTER) {
                    log.info("Creating Executor Service for uploading to broker with a thread pool of Size: " + nThreads);
                    threadPoolExecutor = getThreadPoolExecutor(nThreads);
                }
                var subscriber = this.getSubscriber(cce);
                this.startSubscriber(subscriber, cce);
                this.checkActivity(threadPoolExecutor, subscribeCounter++);
                // send patches of {{ messagesPerIteration }}
                while (this.messages.size() > patchSize) {
                    if (poolIsReady(threadPoolExecutor, nThreads)) {
                        UploadTask task = new UploadTask(this.messages, cce, cf, patchSize);
                        threadPoolExecutor.submit(task);
                        scheduledTasks++;
                    }
                    subscribeCounter = 0;
                }
                // send the rest
                if (this.messages.size() > 0) {
                    UploadTask task = new UploadTask(this.messages, cce, cf, patchSize);
                    threadPoolExecutor.submit(task);
                    scheduledTasks++;
                    subscribeCounter = 0;
                }
                if (scheduledTasks > 0) {
                    businessLogger.info("Scheduled " + scheduledTasks + " upload tasks of size upto: " + patchSize + ", preparing to start subscribing for 30 more sec");
                    scheduledTasks = 0;
                }
            } catch (Exception e) {
                e.printStackTrace();
                businessLogger.error(e.getMessage());
            }
        }
    }
    // helper methods (getSubscriber, startSubscriber, checkActivity, ...) are not shown in the post
}
Your pool takes little space and memory and consumes almost no CPU when it's not in use. Set a maximum limit on the pool's capacity and use it without trying to downscale it. If you have too many messages to process, the tasks are queued until a pool thread is free to run them.
If scaling up and down is a concern, your design could be reviewed. Instead of an executor pool internal to the pod, you could trigger events in your cluster and process them in parallel on other pods. These pods would be able to scale up and down according to the traffic (have a look at Knative).
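If reclaiming idle threads still matters, the JDK pool can do that on its own, without an inactivity counter in the polling loop. The following is a minimal sketch, not the poster's code: the class name IdleAwarePool and the timings are made up for illustration. It relies on ThreadPoolExecutor.allowCoreThreadTimeOut(true), which lets core threads exit after the keep-alive time and be recreated when new tasks arrive.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: a fixed-size pool whose threads exit when idle.
class IdleAwarePool {

    // keepAlive must be greater than zero, otherwise allowCoreThreadTimeOut(true) throws.
    static ThreadPoolExecutor create(int nThreads, long keepAlive, TimeUnit unit) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                nThreads, nThreads, keepAlive, unit, new LinkedBlockingQueue<>());
        // Let core threads time out too, so an idle pool holds no threads at all.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }

    public static void main(String[] args) throws Exception {
        // Short keep-alive for the demo; the question asks for 2 to 3 minutes.
        ThreadPoolExecutor pool = create(4, 200, TimeUnit.MILLISECONDS);
        for (int i = 0; i < 8; i++) {
            pool.submit(() -> { });
        }
        System.out.println("threads while busy: " + pool.getPoolSize());
        Thread.sleep(1000); // stay idle for longer than the keep-alive time
        System.out.println("threads after idle: " + pool.getPoolSize());
        pool.submit(() -> { }).get(); // new work recreates a thread on demand
        System.out.println("threads after new work: " + pool.getPoolSize());
        pool.shutdown();
    }
}
```

With this setup the executor is created once, outside the while(true) loop, and never needs an explicit shutdown between quiet periods.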
I am trying to create a pod from a Helm file on my Kubernetes cluster and wait until the pod is ready.
For this I use the Kubernetes Java client API. My code is:
public static String startWdPod(File specFile, String namespace, String imageName, String controllerAddress)
        throws IOException, ApiException, InterruptedException {
    LOGGER.info("Loading pod spec from {}", specFile);
    V1Pod preparedPod = (V1Pod) Yaml.load(specFile);
    // Dynamic settings
    String name = WD_NAME_BASE + UUID.randomUUID();
    Objects.requireNonNull(preparedPod.getMetadata()).setName(name);
    V1Container container = Objects.requireNonNull(preparedPod.getSpec()).getContainers().get(0);
    container.setImage(imageName);
    Objects.requireNonNull(container.getEnv()).add(new V1EnvVar().name("WD_CONTROLLER_ADDRESS").value(controllerAddress));
    LOGGER.info("Creating pod {} with image {}", name, imageName);
    V1Pod pod = getApi().createNamespacedPod(namespace, preparedPod, null, null, null);
    podReference = pod;
    int count = 0;
    LOGGER.info("POD Phase1 {}", pod.getStatus().getPhase());
    while (count < 50) {
        if (pod.getStatus().getPhase().equals("Ready")) {
            LOGGER.info("Found Ready");
            count = 50;
        }
        LOGGER.info("POD not ready, wait 1 second");
        LOGGER.info("POD Phase loop {}", pod.getStatus().getPhase());
        TimeUnit.SECONDS.sleep(1);
        count++;
    }
    return Objects.requireNonNull(pod.getMetadata()).getName();
}
The pod is created fine and becomes ready after some time. But in my loop the status never changes from "Pending". I expected it to change to "Ready" at some point. Is there something I have misunderstood?
I am trying to set up an integration flow to consume messages from an Amazon SQS queue, and it's working fine so far. But I would like to pace the number of messages per minute or second, e.g. 20 messages per minute.
Here is the definition of my SQS listener bean:
@Bean
public MessageProducer mySqsMessageDrivenChannelAdapter() {
    SqsMessageDrivenChannelAdapter adapter = new SqsMessageDrivenChannelAdapter(this.amazonSqs, queueName);
    adapter.setMessageDeletionPolicy(SqsMessageDeletionPolicy.ON_SUCCESS);
    adapter.setVisibilityTimeout(TIMEOUT_VISIBILITY);
    adapter.setWaitTimeOut(TIMEOUT_MESSAGE_WAIT);
    adapter.setMaxNumberOfMessages(prefetch);
    adapter.setOutputChannel(processMessageChannel());
    return adapter;
}
As you can see, I'm setting the maximum number of messages to fetch per poll, but how do I set the delay between polls?
With a regular JMS queue I could use a JMS.inboundAdapter with a custom poller, but it seems that with SqsMessageDrivenChannelAdapter I can't set any poll timer value.
Maybe I could use a MessageProducer other than SqsMessageDrivenChannelAdapter, but which one?
Is it possible to set up a JMS.inboundAdapter using SQS?
Spring Integration's SqsMessageDrivenChannelAdapter is a message-driven active component. It is based on the SimpleMessageListenerContainer from the Spring Cloud AWS project, which has a long-running while() loop that calls AmazonSQS.receiveMessage(). The logic in that loop isn't too complicated:
try {
    ReceiveMessageResult receiveMessageResult = getAmazonSqs().receiveMessage(this.queueAttributes.getReceiveMessageRequest());
    CountDownLatch messageBatchLatch = new CountDownLatch(receiveMessageResult.getMessages().size());
    for (Message message : receiveMessageResult.getMessages()) {
        if (isQueueRunning()) {
            MessageExecutor messageExecutor = new MessageExecutor(this.logicalQueueName, message, this.queueAttributes);
            getTaskExecutor().execute(new SignalExecutingRunnable(messageBatchLatch, messageExecutor));
        } else {
            messageBatchLatch.countDown();
        }
    }
    try {
        messageBatchLatch.await();
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
} catch (Exception e) {
As you can see, a messageBatchLatch is created there and awaited after the loop.
Each message is processed by its own SignalExecutingRunnable, which calls countDown() when its MessageExecutor finishes. So what you want may be achieved with an artificial Thread.sleep() in the target service method, to add some interval between SQS polls.
But I hear your request, and we indeed have to add something like:
/**
 * The sleep interval in milliseconds used in the main loop between shards polling cycles.
 * Defaults to {@code 1000} minimum {@code 250}.
 * @param idleBetweenPolls the interval to sleep between shards polling cycles.
 */
public void setIdleBetweenPolls(int idleBetweenPolls) {
    this.idleBetweenPolls = Math.max(250, idleBetweenPolls);
}
I did this for the KinesisMessageDrivenChannelAdapter, but here we have to request Spring Cloud AWS to do that for the SimpleMessageListenerContainer.
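Until such an option exists, the artificial Thread.sleep() workaround mentioned above could look roughly like the sketch below. It is only an illustration: PacedHandler is a made-up name and not part of Spring Integration or Spring Cloud AWS, and it assumes the message handling can be wrapped in a java.util.function.Consumer. Because the container waits on the batch latch before polling again, slowing the handler also spaces out the polls.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical wrapper: consecutive messages start at least minIntervalMillis apart.
class PacedHandler<T> implements Consumer<T> {
    private final Consumer<T> delegate;
    private final long minIntervalNanos;
    private long nextAllowedStart = System.nanoTime();

    PacedHandler(Consumer<T> delegate, long minIntervalMillis) {
        this.delegate = delegate;
        this.minIntervalNanos = TimeUnit.MILLISECONDS.toNanos(minIntervalMillis);
    }

    @Override
    public void accept(T message) {
        long waitNanos;
        // Reserve a start slot under the lock so concurrent handler threads are paced too.
        synchronized (this) {
            long now = System.nanoTime();
            waitNanos = nextAllowedStart - now;
            if (waitNanos < 0) {
                waitNanos = 0;
                nextAllowedStart = now;
            }
            nextAllowedStart += minIntervalNanos;
        }
        try {
            TimeUnit.NANOSECONDS.sleep(waitNanos); // returns immediately when waitNanos is 0
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while pacing", e);
        }
        delegate.accept(message);
    }

    public static void main(String[] args) {
        // 20 messages per minute would be an interval of 3000 ms; 500 ms keeps the demo short.
        PacedHandler<String> handler =
                new PacedHandler<>(m -> System.out.println("handled " + m), 500);
        long start = System.nanoTime();
        for (String m : new String[] {"a", "b", "c"}) {
            handler.accept(m);
        }
        System.out.println("elapsed ms: " + (System.nanoTime() - start) / 1_000_000);
    }
}
```

Keep the sleep well below the queue's visibility timeout; otherwise messages that are still waiting for their slot may become visible again and be delivered a second time.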
I've got an app that sends simple SQS messages to multiple queues. Previously, this sending happened serially, but now that we've got more queues we need to send to, I decided to parallelize it by doing all the sending in a thread pool (up to 10 threads).
However, I've noticed that sqs.sendMessage latency seems to increase when I throw more threads at the job!
I've created a sample program below to reproduce the problem (Note that numIterations is just to get more data, and this is just a simplified version of the code for demo purposes).
Running on an EC2 instance in the same region and using 7 queues, I'm typically getting average results of around 12-15ms with 1 thread and 21-25ms with 7 threads - nearly double the latency!
Even running from my laptop remotely (when creating this demo), I'm getting average latency of ~90ms with 1 thread and ~120ms with 7 threads.
public static void main(String[] args) throws Exception {
    AWSCredentialsProvider creds = new AWSStaticCredentialsProvider(new BasicAWSCredentials(A, B));
    final int numThreads = 7;
    final int numQueues = 7;
    final int numIterations = 100;
    final long sleepMs = 10000;
    AmazonSQSClient sqs = new AmazonSQSClient(creds);
    List<String> queueUrls = new ArrayList<>();
    for (int i = 0; i < numQueues; i++) {
        queueUrls.add(sqs.getQueueUrl("testThreading-" + i).getQueueUrl());
    }
    Queue<Long> resultQueue = new ConcurrentLinkedQueue<>();
    sqs.addRequestHandler(new MyRequestHandler(resultQueue));
    runIterations(sqs, queueUrls, numThreads, numIterations, sleepMs);
    System.out.println("Average: " + resultQueue.stream().mapToLong(Long::longValue).average().getAsDouble());
    System.exit(0);
}

private static void runIterations(AmazonSQS sqs, List<String> queueUrls, int threadPoolSize, int numIterations, long sleepMs) throws Exception {
    ExecutorService executor = Executors.newFixedThreadPool(threadPoolSize);
    List<Future<?>> futures = new ArrayList<>();
    for (int i = 0; i < numIterations; i++) {
        for (String queueUrl : queueUrls) {
            final String message = String.valueOf(i);
            futures.add(executor.submit(() -> sendMessage(sqs, queueUrl, message)));
        }
        Thread.sleep(sleepMs);
    }
    for (Future<?> f : futures) {
        f.get();
    }
}

private static void sendMessage(AmazonSQS sqs, String queueUrl, String messageBody) {
    final SendMessageRequest request = new SendMessageRequest()
            .withQueueUrl(queueUrl)
            .withMessageBody(messageBody);
    sqs.sendMessage(request);
}

// Use RequestHandler2 to get accurate timing metrics
private static class MyRequestHandler extends RequestHandler2 {
    private final Queue<Long> resultQueue;

    public MyRequestHandler(Queue<Long> resultQueue) {
        this.resultQueue = resultQueue;
    }

    public void afterResponse(Request<?> request, Response<?> response) {
        TimingInfo timingInfo = request.getAWSRequestMetrics().getTimingInfo();
        Long start = timingInfo.getStartEpochTimeMilliIfKnown();
        Long end = timingInfo.getEndEpochTimeMilliIfKnown();
        if (start != null && end != null) {
            long elapsed = end - start;
            resultQueue.add(elapsed);
        }
    }
}
I'm sure this is some weird client configuration issue, but the default ClientConfiguration should be able to handle 50 concurrent connections.
Any suggestions?
Update: It's looking like the key to this problem is something I left out of the original simplified version - there is a delay between batches of messages being sent (relating to doing processing). The latency issue isn't there if the delay is ~2s, but it is an issue when the delay between batches is ~10s. I've tried different values for ClientConfiguration.validateAfterInactivityMillis with no effect.
I have multiple resources, say 3 for the sake of illustration: XResource, YResource and ZResource (Java classes, all Runnables), each able to perform a certain Task. There is a list of Tasks that needs to be processed in parallel across the 3 resources. I need the resources to be locked: if one resource is locked, the task should go to another resource, and if none of the resources is available, it should wait until one becomes available. I am currently trying to lock a resource using a Semaphore, but the work gets assigned to one Runnable only and the other Runnables are always idle. I am very new to multithreading, so I might be overlooking something obvious. I am using Java SE 1.6.
Below is my code -
public class Test {
    private final static Semaphore xResourceSphore = new Semaphore(1, true);
    private final static Semaphore yResourceSphore = new Semaphore(1, true);
    private final static Semaphore zResourceSphore = new Semaphore(1, true);

    public static void main(String[] args) {
        ArrayList<Task> listOfTasks = new ArrayList<Task>();
        Task task1 = new Task();
        Task task2 = new Task();
        Task task3 = new Task();
        Task task4 = new Task();
        Task task5 = new Task();
        Task task6 = new Task();
        Task task7 = new Task();
        Task task8 = new Task();
        Task task9 = new Task();
        listOfTasks.add(task1);
        listOfTasks.add(task2);
        listOfTasks.add(task3);
        listOfTasks.add(task4);
        listOfTasks.add(task5);
        listOfTasks.add(task6);
        listOfTasks.add(task7);
        listOfTasks.add(task8);
        listOfTasks.add(task9);
        //Runnables
        XResource xThread = new XResource();
        YResource yThread = new YResource();
        ZResource zThread = new ZResource();
        ExecutorService executorService = Executors.newFixedThreadPool(3);
        for (int i = 0; i < listOfTasks.size(); i++) {
            if (xResourceSphore.tryAcquire()) {
                try {
                    xThread.setTask(listOfTasks.get(i));
                    executorService.execute(xThread);
                } finally {
                    xResourceSphore.release();
                }
            } else if (yResourceSphore.tryAcquire()) {
                try {
                    yThread.setTask(listOfTasks.get(i));
                    executorService.execute(yThread);
                } finally {
                    yResourceSphore.release();
                }
            } else if (zResourceSphore.tryAcquire()) {
                try {
                    zThread.setTask(listOfTasks.get(i));
                    executorService.execute(zThread);
                } finally {
                    zResourceSphore.release();
                }
            }
        }
        executorService.shutdown();
    }
}
You need to move the resource-locking logic into the task that runs in another thread.
By doing the locking in the current thread, you release the resource without waiting for the task to be performed. You see this problem because you are not waiting for the task to complete (or even start) before calling setTask() on the same resource again, which replaces the previously set task.
Queue<Resource> resources = new ConcurrentLinkedQueue<>();
resources.add(new XResource());
resources.add(new YResource());
resources.add(new ZResource());
ExecutorService service = Executors.newFixedThreadPool(resources.size());
// Each pool thread takes exactly one resource the first time it runs a task.
ThreadLocal<Resource> resourceToUse = ThreadLocal.withInitial(() -> resources.remove());
for (int i = 1; i < 9; i++) {
    service.execute(() -> {
        Task task = new Task();
        // get() returns the resource bound to the current pool thread
        resourceToUse.get().setTask(task);
    });
}
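Another way to express "take whichever resource is free, otherwise wait" is to keep the free resources in a BlockingQueue and let each task borrow one from inside the worker thread. This is a sketch of that alternative, not the code from the answer above: the Resource class below is a simplified stand-in for XResource, YResource and ZResource, and a task is reduced to an int id.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ResourcePoolDemo {

    // Hypothetical stand-in for XResource, YResource and ZResource.
    static class Resource {
        final String name;
        final AtomicInteger handled = new AtomicInteger();

        Resource(String name) {
            this.name = name;
        }

        void perform(int taskId) {
            handled.incrementAndGet();
            System.out.println(name + " performs task " + taskId);
        }
    }

    // Runs taskCount tasks and returns how many were performed in total.
    static int runAll(List<Resource> resources, int taskCount) throws InterruptedException {
        // The queue holds the resources that are currently free.
        BlockingQueue<Resource> free =
                new ArrayBlockingQueue<>(resources.size(), true, resources);
        ExecutorService executor = Executors.newFixedThreadPool(resources.size());
        for (int i = 1; i <= taskCount; i++) {
            final int taskId = i;
            executor.execute(() -> {
                Resource resource = null;
                try {
                    resource = free.take(); // blocks until some resource is free
                    resource.perform(taskId);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    if (resource != null) {
                        free.add(resource); // hand the resource back for the next task
                    }
                }
            });
        }
        executor.shutdown();
        executor.awaitTermination(1, TimeUnit.MINUTES);
        return resources.stream().mapToInt(r -> r.handled.get()).sum();
    }

    public static void main(String[] args) throws InterruptedException {
        List<Resource> resources =
                Arrays.asList(new Resource("X"), new Resource("Y"), new Resource("Z"));
        System.out.println("tasks performed: " + runAll(resources, 9));
    }
}
```

Since a resource is only in the queue while nobody is using it, no two tasks ever hold the same resource at once, and the submitting thread never has to spin.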
Following Peter Lawrey's suggestion, I passed the Semaphore into the runnable and released it after the runnable finished executing. However, I still could not allocate all the tasks to threads within the for loop, so I added a while(true) loop that spins until one of the resources is available for a task. Below is the code:
ExecutorService executorService = Executors.newFixedThreadPool(3);
for (int i = 0; i < listOfTasks.size(); i++) {
    while (true) {
        if (xResourceSphore.tryAcquire()) {
            xThread.setTask(listOfTasks.get(i));
            xThread.setSemaphore(xResourceSphore);
            executorService.execute(xThread);
            break;
        } else if (yResourceSphore.tryAcquire()) {
            yThread.setTask(listOfTasks.get(i));
            yThread.setSemaphore(yResourceSphore);
            executorService.execute(yThread);
            break;
        } else if (zResourceSphore.tryAcquire()) {
            zThread.setTask(listOfTasks.get(i));
            zThread.setSemaphore(zResourceSphore);
            executorService.execute(zThread);
            break;
        }
    }
}
executorService.shutdown();
I don't like this solution much because it does not extend well: if my resources handle different types of tasks and a particular kind of task needs a particular resource, my other tasks would keep waiting until that task is done. But for now I couldn't find any other way, even after a lot of research!
I know Apache Curator provides a distributed lock feature built on top of ZooKeeper. It looks very easy to use, based on the documentation on the official Apache Curator website. For example:
RetryPolicy retryPolicy = new ExponentialBackoffRetry(1000, 3);
CuratorFramework client = CuratorFrameworkFactory.newClient("host:ip", retryPolicy);
client.start();
InterProcessSemaphoreMutex lock = new InterProcessSemaphoreMutex(client, path);
if (lock.acquire(10, TimeUnit.SECONDS)) {
    try { /*do something*/ }
    finally { lock.release(); }
}
But what does the second parameter, "path", of InterProcessSemaphoreMutex mean? According to the API docs it is "the path for the lock", but what exactly is it? Can anybody give me an example?
If I have millions of locks, do I have to create millions of lock paths? Is there a limit on the number of locks (znodes) a ZooKeeper cluster can hold? Or can we remove a lock when a process releases it?
ZooKeeper presents what looks like a distributed file system. For any ZooKeeper operation, recipe, etc., you write "znodes" to a particular path and watch for changes. See here: http://zookeeper.apache.org/doc/trunk/zookeeperOver.html#Simple+API (regarding znodes).
For Curator recipes, it needs to know the base path you want to use to perform the recipe. For InterProcessSemaphoreMutex, the path is what every participant should use. i.e. Process 1 and Process 2 want to both contend for the lock. So, they both allocate InterProcessSemaphoreMutex instances with the same path, say "/my/lock". Think of the path as the lock identifier. In the same ZooKeeper cluster, you could have multiple locks by using different paths.
Hope this helps (disclaimer: I'm the main author of Curator).
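To make "the path is the lock identifier" concrete, here is a small sketch of deriving one lock path per protected item. The LockPaths class, the "/locks" prefix and the "orders" type are made up for this illustration and are not part of Curator. Every process that calls the helper with the same type and id gets the same path, and therefore contends for the same lock when it passes that path to new InterProcessSemaphoreMutex(client, path).

```java
// Hypothetical helper: builds one ZooKeeper lock path per protected item.
class LockPaths {

    // Returns "/locks/<type>/<id>"; the "/locks" prefix is an arbitrary choice.
    static String forResource(String type, String id) {
        return "/locks/" + segment(type) + "/" + segment(id);
    }

    // Rejects values that cannot be used as a single znode path segment.
    private static String segment(String s) {
        if (s == null || s.isEmpty() || s.contains("/") || s.equals(".") || s.equals("..")) {
            throw new IllegalArgumentException("not usable as a znode path segment: " + s);
        }
        return s;
    }

    public static void main(String[] args) {
        // Same type and id give the same path in every process, so the same lock.
        System.out.println(forResource("orders", "42")); // prints /locks/orders/42
        // A different id gives a different path, so an independent lock.
        System.out.println(forResource("orders", "43")); // prints /locks/orders/43
    }
}
```

Paths built this way only need to exist for the items that are actually being locked at a given moment.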
Some examples about Reaper.
@Test
public void testSomeNodes() throws Exception
{
    Timing timing = new Timing();
    ChildReaper reaper = null;
    CuratorFramework client = CuratorFrameworkFactory.newClient(server.getConnectString(), timing.session(), timing.connection(), new RetryOneTime(1));
    try
    {
        client.start();
        Random r = new Random();
        int nonEmptyNodes = 0;
        for ( int i = 0; i < 10; ++i )
        {
            client.create().creatingParentsIfNeeded().forPath("/test/" + Integer.toString(i));
            if ( r.nextBoolean() )
            {
                client.create().forPath("/test/" + Integer.toString(i) + "/foo");
                ++nonEmptyNodes;
            }
        }
        reaper = new ChildReaper(client, "/test", Reaper.Mode.REAP_UNTIL_DELETE, 1);
        reaper.start();
        timing.forWaiting().sleepABit();
        Stat stat = client.checkExists().forPath("/test");
        Assert.assertEquals(stat.getNumChildren(), nonEmptyNodes);
    }
    finally
    {
        CloseableUtils.closeQuietly(reaper);
        CloseableUtils.closeQuietly(client);
    }
}