Using AmazonSQS sendMessage over multiple threads causes it to run slower


I've got an app that sends simple SQS messages to multiple queues. Previously, this sending happened serially, but now that we've got more queues we need to send to, I decided to parallelize it by doing all the sending in a thread pool (up to 10 threads).
However, I've noticed that sqs.sendMessage latency seems to increase when I throw more threads at the job!
I've created a sample program below to reproduce the problem (Note that numIterations is just to get more data, and this is just a simplified version of the code for demo purposes).
Running on an EC2 instance in the same region and using 7 queues, I'm typically getting average results of around 12-15ms with 1 thread, and 21-25ms with 7 threads - nearly double the latency!
Even running from my laptop remotely (when creating this demo), I'm getting average latency of ~90ms with 1 thread and ~120ms with 7 threads.
public static void main(String[] args) throws Exception {
    AWSCredentialsProvider creds = new AWSStaticCredentialsProvider(new BasicAWSCredentials(A, B));
    final int numThreads = 7;
    final int numQueues = 7;
    final int numIterations = 100;
    final long sleepMs = 10000;
    AmazonSQSClient sqs = new AmazonSQSClient(creds);
    List<String> queueUrls = new ArrayList<>();
    for (int i = 0; i < numQueues; i++) {
        queueUrls.add(sqs.getQueueUrl("testThreading-" + i).getQueueUrl());
    }
    Queue<Long> resultQueue = new ConcurrentLinkedQueue<>();
    sqs.addRequestHandler(new MyRequestHandler(resultQueue));
    runIterations(sqs, queueUrls, numThreads, numIterations, sleepMs);
    System.out.println("Average: " + resultQueue.stream().mapToLong(Long::longValue).average().getAsDouble());
    System.exit(0);
}

private static void runIterations(AmazonSQS sqs, List<String> queueUrls, int threadPoolSize, int numIterations, long sleepMs) throws Exception {
    ExecutorService executor = Executors.newFixedThreadPool(threadPoolSize);
    List<Future<?>> futures = new ArrayList<>();
    for (int i = 0; i < numIterations; i++) {
        for (String queueUrl : queueUrls) {
            final String message = String.valueOf(i);
            futures.add(executor.submit(() -> sendMessage(sqs, queueUrl, message)));
        }
        Thread.sleep(sleepMs);
    }
    for (Future<?> f : futures) {
        f.get();
    }
}

private static void sendMessage(AmazonSQS sqs, String queueUrl, String messageBody) {
    final SendMessageRequest request = new SendMessageRequest()
            .withQueueUrl(queueUrl)
            .withMessageBody(messageBody);
    sqs.sendMessage(request);
}

// Use RequestHandler2 to get accurate timing metrics
private static class MyRequestHandler extends RequestHandler2 {
    private final Queue<Long> resultQueue;

    public MyRequestHandler(Queue<Long> resultQueue) {
        this.resultQueue = resultQueue;
    }

    public void afterResponse(Request<?> request, Response<?> response) {
        TimingInfo timingInfo = request.getAWSRequestMetrics().getTimingInfo();
        Long start = timingInfo.getStartEpochTimeMilliIfKnown();
        Long end = timingInfo.getEndEpochTimeMilliIfKnown();
        if (start != null && end != null) {
            long elapsed = end - start;
            resultQueue.add(elapsed);
        }
    }
}
I'm sure this is some weird client configuration issue, but the default ClientConfiguration should be able to handle 50 concurrent connections.
Any suggestions?
Update: It's looking like the key to this problem is something I left out of the original simplified version - there is a delay between batches of messages being sent (relating to doing processing). The latency issue isn't there if the delay is ~2s, but it is an issue when the delay between batches is ~10s. I've tried different values for ClientConfiguration.validateAfterInactivityMillis with no effect.

Related

Kubernetes JavaClient API Pod readiness

I am trying to create a pod from a Helm file on my Kubernetes cluster and wait until the pod is ready.
For this I use the Kubernetes Java Client API. My code is:
public static String startWdPod(File specFile, String namespace, String imageName, String controllerAddress)
        throws IOException, ApiException, InterruptedException {
    LOGGER.info("Loading pod spec from {}", specFile);
    V1Pod preparedPod = (V1Pod) Yaml.load(specFile);
    // Dynamic settings
    String name = WD_NAME_BASE + UUID.randomUUID();
    Objects.requireNonNull(preparedPod.getMetadata()).setName(name);
    V1Container container = Objects.requireNonNull(preparedPod.getSpec()).getContainers().get(0);
    container.setImage(imageName);
    Objects.requireNonNull(container.getEnv()).add(new V1EnvVar().name("WD_CONTROLLER_ADDRESS").value(controllerAddress));
    LOGGER.info("Creating pod {} with image {}", name, imageName);
    V1Pod pod = getApi().createNamespacedPod(namespace, preparedPod, null, null, null);
    podReference = pod;
    int count = 0;
    LOGGER.info("POD Phase1 {}", pod.getStatus().getPhase());
    while (count < 50) {
        if (pod.getStatus().getPhase().equals("Ready")) {
            LOGGER.info("Found Ready");
            count = 50;
        }
        LOGGER.info("POD not ready, wait 1 second");
        LOGGER.info("POD Phase loop {}", pod.getStatus().getPhase());
        TimeUnit.SECONDS.sleep(1);
        count++;
    }
    return Objects.requireNonNull(pod.getMetadata()).getName();
}
The pod is created successfully and becomes ready after some time. But in my loop the status never changes from "Pending". I expected it to change to "Ready" at some point. Is there something I'm misunderstanding?
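One thing that stands out in the loop above: `pod` is the object returned by the create call, so its status is a snapshot that does not change unless the pod is read again from the API server. Also, Kubernetes pod phases are Pending, Running, Succeeded, Failed and Unknown; readiness is reported as a pod condition, not a phase, so a phase of "Ready" will not appear. Below is a minimal sketch of a polling loop that re-reads the phase on every attempt. `PodPhasePoller`, `waitForPhase` and the `phaseSupplier` hook are hypothetical names used only for illustration; in real code the supplier would wrap whichever Kubernetes Java client call re-reads the pod, which is deliberately left out here.

```java
import java.util.function.Supplier;

final class PodPhasePoller {
    /**
     * Polls phaseSupplier until it reports wantedPhase or the attempts run out.
     * phaseSupplier is a hypothetical hook: it must fetch a fresh phase on every call,
     * e.g. by re-reading the pod from the API server.
     */
    static boolean waitForPhase(Supplier<String> phaseSupplier, String wantedPhase,
                                int maxAttempts, long pauseMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (wantedPhase.equals(phaseSupplier.get())) {
                return true;
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

With this shape, the fix to the original loop would be to call the API again inside the supplier rather than reusing the `pod` variable, and to wait for "Running" (or check the Ready condition) instead of a "Ready" phase.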

How to check gcp pubsub empty/inactive subscription

I have an application that subscribes to a topic in GCP; when there are messages on it, the application downloads them and sends them to a queue on ActiveMQ.
In order to make this process fast, I am using an ExecutorService and launching multiple threads to send messages to ActiveMQ. Since the subscription is supposed to be an ongoing task, I am putting the code in a while(true) loop, and hence I can't shut down the ExecutorService in the normal fashion, as I would be creating and shutting down the executor service on every iteration.
I am searching for an elegant way to shut down the ExecutorService when the subscription has been empty (no data in the topic) for 2 or 3 minutes or some other inactivity window, and then of course start it again when there is new data.
The following is my idea, which I don't like: it is just a counter that I increment when the subscription retrieves no data.
I am looking for a more elegant way of doing that.
@Service
@Slf4j
public class PubSubSubscriberService {
    private static final int EMPTY_SUBSCRIPTION_COUNTER = 4;
    private static final Logger businessLogger = LoggerFactory.getLogger("BusinessLogger");
    private Queue<PubsubMessage> messages = new ConcurrentLinkedQueue<>();

    public void pullMessagesAndSendToBroker(CompositeConfigurationElement cce) {
        var patchSize = cce.getSubscriber().getPatchSize();
        var nThreads = cce.getSubscriber().getSendingParallelThreads();
        var scheduledTasks = 0;
        var subscribeCounter = 0;
        ThreadPoolExecutor threadPoolExecutor = null;
        while (true) {
            try {
                if (subscribeCounter < EMPTY_SUBSCRIPTION_COUNTER) {
                    log.info("Creating Executor Service for uploading to broker with a thread pool of Size: " + nThreads);
                    threadPoolExecutor = getThreadPoolExecutor(nThreads);
                }
                var subscriber = this.getSubscriber(cce);
                this.startSubscriber(subscriber, cce);
                this.checkActivity(threadPoolExecutor, subscribeCounter++);
                // send patches of {{ messagesPerIteration }}
                while (this.messages.size() > patchSize) {
                    if (poolIsReady(threadPoolExecutor, nThreads)) {
                        UploadTask task = new UploadTask(this.messages, cce, cf, patchSize);
                        threadPoolExecutor.submit(task);
                        scheduledTasks++;
                    }
                    subscribeCounter = 0;
                }
                // send the rest
                if (this.messages.size() > 0) {
                    UploadTask task = new UploadTask(this.messages, cce, cf, patchSize);
                    threadPoolExecutor.submit(task);
                    scheduledTasks++;
                    subscribeCounter = 0;
                }
                if (scheduledTasks > 0) {
                    businessLogger.info("Scheduled " + scheduledTasks + " upload tasks of size upto: " + patchSize + ", preparing to start subscribing for 30 more sec");
                    scheduledTasks = 0;
                }
            } catch (Exception e) {
                e.printStackTrace();
                businessLogger.error(e.getMessage());
            }
        }

Your pool takes little space and memory and consumes almost no CPU when it isn't in use. Set a maximum limit on the pool's capacity and keep using it without trying to scale it down. If there are too many messages to process, the tasks are queued until a pool thread is free to run them.
If you have concerns about scaling up and down, your design could be reviewed. Instead of an executor pool internal to the pod, you could trigger an event in your cluster and process the events in parallel on other pods. These pods can scale up and down according to the traffic (have a look at Knative).
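If releasing idle threads still matters, one option that avoids shutting the executor down at all is to let the pool's own worker threads time out. The sketch below uses only `java.util.concurrent`; the class name `IdleAwarePool`, the helper `pause`, and the timeout values are made up for illustration and are not taken from the question's code.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class IdleAwarePool {
    /** Builds a pool of at most nThreads workers that exit after keepAliveMillis without work. */
    static ThreadPoolExecutor create(int nThreads, long keepAliveMillis) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                nThreads, nThreads, keepAliveMillis, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>());   // excess tasks wait in this queue
        pool.allowCoreThreadTimeOut(true);      // let even core threads die when idle
        return pool;
    }

    /** Sleeps without forcing callers to handle InterruptedException. */
    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = create(4, 200);
        for (int i = 0; i < 8; i++) {
            pool.execute(() -> pause(50));      // stand-in for an upload task
        }
        pause(1000);                            // simulated quiet period on the subscription
        System.out.println("Worker threads after idle period: " + pool.getPoolSize());
        pool.execute(() -> System.out.println("New work starts a fresh thread"));
        pool.shutdown();
    }
}
```

The executor object stays alive for the whole while(true) loop; only its threads come and go, so no inactivity counter is needed.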

Vertx WebClient responds slowly

I am new to vertx and RxJava. I am trying to implement a simple test program. However, I am not able to understand the dynamics of this program. Why do some requests take more than 10 seconds to respond?
Below is my sample Test application
public class Test {
    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();
        WebClient webClient = WebClient.create(vertx);
        Observable<Object> google = hitURL("www.google.com", webClient);
        Observable<Object> yahoo = hitURL("www.yahoo.com", webClient);
        for (int i = 0; i < 100; i++) {
            google.repeat(100).subscribe(timeTaken -> {
                if ((Long) timeTaken > 10000) {
                    System.out.println(timeTaken);
                }
            }, error -> {
                System.out.println(error.getMessage());
            });
            yahoo.repeat(100).subscribe(timeTaken -> {
                if ((Long) timeTaken > 10000) {
                    System.out.println(timeTaken);
                }
            }, error -> {
                System.out.println(error.getMessage());
            });
        }
    }

    public static Observable<Object> hitURL(String url, WebClient webClient) {
        return Observable.create(emitter -> {
            Long l1 = System.currentTimeMillis();
            webClient.get(80, url, "").send(ar -> {
                if (ar.succeeded()) {
                    Long elapsedTime = (System.currentTimeMillis() - l1);
                    emitter.onNext(elapsedTime);
                } else {
                    emitter.onError(ar.cause());
                }
                emitter.onComplete();
            });
        });
    }
}
What I want to know is, what is making my response time slow?

The problem here seems to be in the way you are using WebClient and/or the way you are measuring "response" times (depending on what you are trying to achieve here).
Vert.x's WebClient, like most HTTP clients, uses a limited-size connection pool under the hood to send requests. In other words, calling .send(...) does not necessarily start the HTTP request immediately; instead, the request might wait in a queue for an available connection. Your measurements include this potential waiting time.
You are using the default pool size, which seems to be 5 (at least in the latest version of Vert.x), and starting 200 HTTP requests almost immediately. It's not surprising that your requests spend most of their time waiting for an available connection.
You might try increasing the pool size if you want to test if I'm right:
WebClient webClient = WebClient.create(vertx, new WebClientOptions().setMaxPoolSize(...));
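To see the queueing effect in isolation, here is a small simulation that does not use Vert.x at all: a `Semaphore` stands in for the connection pool and `Thread.sleep` for the actual request. The class `PoolWaitDemo`, its method name and all numbers are assumptions chosen for illustration. As in the question's code, the clock starts when the request is submitted, not when it gets a connection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;

final class PoolWaitDemo {
    /**
     * Fires `requests` simulated calls at once through `poolSize` simulated connections.
     * Each call takes serviceMillis once it holds a connection.
     * Returns the worst latency in milliseconds, measured from submission.
     */
    static long worstLatencyMillis(int requests, int poolSize, long serviceMillis) {
        Semaphore connections = new Semaphore(poolSize, true);
        ExecutorService callers = Executors.newFixedThreadPool(requests);
        List<Future<Long>> results = new ArrayList<>();
        try {
            for (int i = 0; i < requests; i++) {
                long submitted = System.nanoTime();
                results.add(callers.submit(() -> {
                    connections.acquire();              // wait for a free connection
                    try {
                        Thread.sleep(serviceMillis);    // the "real" request time
                    } finally {
                        connections.release();
                    }
                    return (System.nanoTime() - submitted) / 1_000_000;
                }));
            }
            long worst = 0;
            for (Future<Long> f : results) {
                worst = Math.max(worst, f.get());
            }
            return worst;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            callers.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("pool of 5:  " + worstLatencyMillis(200, 5, 20) + " ms worst case");
        System.out.println("pool of 50: " + worstLatencyMillis(200, 50, 20) + " ms worst case");
    }
}
```

Even though every simulated request takes the same time once it has a connection, the measured worst case grows with the number of requests queued behind a small pool.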

Chronicle Queue: Usage with less or no lambdas

The documentation shows the usage of an appender or a tailer generally with a lambda, like this:
appender.writeDocument(wireOut -> wireOut.write("log").marshallable(m ->
        m.write("mkey").text(mkey)
         .write("timestamp").dateTime(now)
         .write("msg").text(data)));
For a tailer I use:
int count = 0;
while (read from tailer) {
    wire.read("log").marshallable(m -> {
        String mkey = m.read("mkey").text();
        LocalDateTime ts = m.read("timestamp").dateTime();
        String bmsg = m.read("msg").text();
        //... do more stuff, like updating counters
        count++;
    });
}
During the read I would like to do stuff like updating counters, but this is not possible in lambda (needs "effectively final" values/objects).
What is good practice for using the API without lambdas?
Any other ideas on how to do this? (Currently I use AtomicInteger objects)

static class Log extends AbstractMarshallable {
    String mkey;
    LocalDateTime timestamp;
    String msg;
}

int count;

public void myMethod() {
    Log log = new Log();
    final SingleChronicleQueue q = SingleChronicleQueueBuilder.binary(new File("q4")).build();
    final ExcerptAppender appender = q.acquireAppender();
    final ExcerptTailer tailer = q.createTailer();
    try (final DocumentContext dc = appender.writingDocument()) {
        // this will store the contents of log to the queue
        dc.wire().write("log").marshallable(log);
    }
    try (final DocumentContext dc = tailer.readingDocument()) {
        if (!dc.isData())
            return;
        // this will replace the contents of log
        dc.wire().read("log").marshallable(log);
        //... do more stuff, like updating counters
        count++;
    }
}
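For reference, the "effectively final" restriction from the question applies only to local variables captured by a lambda; instance fields (like `count` above) and ordinary loops are not affected. The following sketch shows the three options side by side using plain collections instead of Chronicle Queue. `LambdaCounterDemo` and its method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class LambdaCounterDemo {
    private int fieldCount;

    /** Workaround from the question: a final reference to a mutable AtomicInteger. */
    static int countWithAtomic(List<String> messages) {
        AtomicInteger count = new AtomicInteger();
        messages.forEach(m -> count.incrementAndGet());
        return count.get();
    }

    /** A lambda may update an instance field, because it captures `this`, not the int. */
    int countWithField(List<String> messages) {
        fieldCount = 0;
        messages.forEach(m -> fieldCount++);
        return fieldCount;
    }

    /** No lambda at all: a plain local variable can be incremented freely. */
    static int countWithLoop(List<String> messages) {
        int count = 0;
        for (String ignored : messages) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> messages = Arrays.asList("a", "b", "c");
        System.out.println(countWithAtomic(messages));
        System.out.println(new LambdaCounterDemo().countWithField(messages));
        System.out.println(countWithLoop(messages));
    }
}
```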

Multithreading: Issue in allocating and blocking resources

I have multiple resources - for the sake of understanding, say three resources named XResource, YResource and ZResource (Java classes, Runnables) - each of which can perform a certain Task. There is a list of Tasks that needs to be processed in parallel across the three resources. I need each resource to be locked while in use: if one resource is locked, the task should go to another resource, and if none of the resources is available, it should wait until one becomes available. I am currently trying to lock a resource using a Semaphore, but the thread gets assigned to one Runnable only and the other Runnables are always idle. I am very new to multithreading, so I might be overlooking something obvious. I am using Java SE 1.6.
Below is my code -
public class Test {
    private final static Semaphore xResourceSphore = new Semaphore(1, true);
    private final static Semaphore yResourceSphore = new Semaphore(1, true);
    private final static Semaphore zResourceSphore = new Semaphore(1, true);

    public static void main(String[] args) {
        ArrayList<Task> listOfTasks = new ArrayList<Task>();
        Task task1 = new Task();
        Task task2 = new Task();
        Task task3 = new Task();
        Task task4 = new Task();
        Task task5 = new Task();
        Task task6 = new Task();
        Task task7 = new Task();
        Task task8 = new Task();
        Task task9 = new Task();
        listOfTasks.add(task1);
        listOfTasks.add(task2);
        listOfTasks.add(task3);
        listOfTasks.add(task4);
        listOfTasks.add(task5);
        listOfTasks.add(task6);
        listOfTasks.add(task7);
        listOfTasks.add(task8);
        listOfTasks.add(task9);
        //Runnables
        XResource xThread = new XResource();
        YResource yThread = new YResource();
        ZResource zThread = new ZResource();
        ExecutorService executorService = Executors.newFixedThreadPool(3);
        for (int i = 0; i < listOfTasks.size(); i++) {
            if (xResourceSphore.tryAcquire()) {
                try {
                    xThread.setTask(listOfTasks.get(i));
                    executorService.execute(xThread);
                } finally {
                    xResourceSphore.release();
                }
            } else if (yResourceSphore.tryAcquire()) {
                try {
                    yThread.setTask(listOfTasks.get(i));
                    executorService.execute(yThread);
                } finally {
                    yResourceSphore.release();
                }
            } else if (zResourceSphore.tryAcquire()) {
                try {
                    zThread.setTask(listOfTasks.get(i));
                    executorService.execute(zThread);
                } finally {
                    zResourceSphore.release();
                }
            }
        }
        executorService.shutdown();
    }
}

You need to move the resource-locking logic into the task that runs in another thread.
By doing the locking in the current thread, you release the resource without waiting for the task to be performed. You see this problem because you do not wait for the task to complete (or even start) before calling setTask() on the same resource, which replaces the previously set task.
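// Resource is assumed to be a common supertype of XResource, YResource and ZResource.
Queue<Resource> resources = new ConcurrentLinkedQueue<>();
resources.add(new XResource());
resources.add(new YResource());
resources.add(new ZResource());
ExecutorService service = Executors.newFixedThreadPool(resources.size());
// Each pool thread claims one resource the first time it runs a task.
ThreadLocal<Resource> resourceToUse = ThreadLocal.withInitial(() -> resources.remove());
for (int i = 1; i < 9; i++) {
    service.execute(() -> {
        Task task = new Task();
        // get() returns the resource owned by the current thread
        resourceToUse.get().setTask(task);
    });
}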

Following Peter Lawrey's suggestion, I passed the Semaphore into the runnable and released it after the runnable finished executing. However, I still faced the issue that I was unable to allocate all the tasks to the threads within the for loop. So I added a while(true) loop that spins until one of the resources is available for a task. Below is the code:
ExecutorService executorService = Executors.newFixedThreadPool(3);
for (int i = 0; i < listOfTasks.size(); i++) {
    while (true) {
        if (xResourceSphore.tryAcquire()) {
            xThread.setTask(listOfTasks.get(i));
            xThread.setSemaphore(xResourceSphore);
            executorService.execute(xThread);
            break;
        } else if (yResourceSphore.tryAcquire()) {
            yThread.setTask(listOfTasks.get(i));
            yThread.setSemaphore(yResourceSphore);
            executorService.execute(yThread);
            break;
        } else if (zResourceSphore.tryAcquire()) {
            zThread.setTask(listOfTasks.get(i));
            zThread.setSemaphore(zResourceSphore);
            executorService.execute(zThread);
            break;
        }
    }
}
executorService.shutdown();
I don't like this solution much because it cannot be extended if my resources handle different types of tasks: if I need a particular resource for a particular kind of task, my other tasks would wait continuously until that task is done. But for now I couldn't find any other way, even after a lot of research!
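As an alternative to the busy-wait loop, here is a minimal sketch in which the free resources sit in a `BlockingQueue`: each task takes a resource (blocking if none is free), uses it, and puts it back. `ResourcePoolDemo` and its nested `Resource` class are hypothetical stand-ins for XResource, YResource and ZResource, the sleep simulates the work, and the code uses Java 8 lambdas rather than the Java SE 1.6 mentioned in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class ResourcePoolDemo {
    /** Hypothetical stand-in for XResource / YResource / ZResource. */
    static final class Resource {
        final String name;
        final AtomicInteger tasksDone = new AtomicInteger();

        Resource(String name) {
            this.name = name;
        }

        void perform(int taskId) throws InterruptedException {
            Thread.sleep(20);               // simulated work for task taskId
            tasksDone.incrementAndGet();
        }
    }

    /** Runs taskCount tasks across three resources and returns the resources with their counts. */
    static List<Resource> runTasks(int taskCount) {
        List<Resource> all = new ArrayList<>();
        all.add(new Resource("X"));
        all.add(new Resource("Y"));
        all.add(new Resource("Z"));
        // Queue of currently free resources; take() blocks while all are in use.
        BlockingQueue<Resource> free = new ArrayBlockingQueue<>(all.size(), true, all);
        ExecutorService executor = Executors.newFixedThreadPool(all.size());
        for (int i = 0; i < taskCount; i++) {
            final int taskId = i;
            executor.execute(() -> {
                try {
                    Resource resource = free.take();    // lock: remove from the free queue
                    try {
                        resource.perform(taskId);
                    } finally {
                        free.add(resource);             // unlock: hand it back
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        executor.shutdown();
        try {
            executor.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return all;
    }

    public static void main(String[] args) {
        for (Resource r : runTasks(9)) {
            System.out.println(r.name + " handled " + r.tasksDone.get() + " task(s)");
        }
    }
}
```

Because the locking happens inside the task, the submitting thread never spins. For resources that handle different kinds of tasks, one queue per kind would be a possible extension, though that is beyond this sketch.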
