Should parallel programming be used when developing microservices if the microservices are scalable and, for instance, deployed on ECS on AWS?
If yes, what are the benefits of consuming more resources by one instance vs the same resources by N instances?
How does parallel programming match https://12factor.net/?
P.S. To be more specific: conceptually, should I use parallel streams rather than sequential streams?
The link you provided already answers your question:
This does not exclude individual processes from handling their own internal multiplexing, via threads inside the runtime VM, or the async/evented model found in tools such as EventMachine, Twisted, or Node.js. But an individual VM can only grow so large (vertical scale), so the application must also be able to span multiple processes running on multiple physical machines.
https://12factor.net/concurrency
Sure, imagine a microservice that needs to execute multiple independent calls to a database or to another microservice and aggregate the results. As the calls are independent, they can be executed in parallel so that the total time is at most the time it takes to execute the slowest call.
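As a rough sketch of that idea, the independent calls can be submitted to a thread pool with CompletableFuture and joined afterwards. The `slowCall` helper and the names "users", "orders" and "prices" are hypothetical stand-ins for real database or service calls, and the pool size is an assumption.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;
import java.util.stream.Collectors;

class ParallelCalls {

    // Starts every call on the pool, then waits for all of them.
    // Results come back in the same order as the input list.
    static <T> List<T> callAll(List<Supplier<T>> calls, ExecutorService pool) {
        List<CompletableFuture<T>> futures = calls.stream()
                .map(call -> CompletableFuture.supplyAsync(call, pool))
                .collect(Collectors.toList());
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    // Hypothetical stand-in for a remote call that takes `millis` to answer.
    static String slowCall(String name, long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return name;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            List<Supplier<String>> calls = List.<Supplier<String>>of(
                    () -> slowCall("users", 300),
                    () -> slowCall("orders", 200),
                    () -> slowCall("prices", 100));
            long start = System.nanoTime();
            List<String> results = callAll(calls, pool);
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            // Elapsed time should be close to the slowest call, not the sum of all three.
            System.out.println(results + " in about " + elapsedMillis + " ms");
        } finally {
            pool.shutdown();
        }
    }
}
```

Note that this kind of parallelism is about waiting on I/O concurrently inside one instance; it does not replace scaling out to N instances.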
Parallel streams are worth considering when the tasks at hand are independent of each other and can be done in parallel. However, parallel programming comes with overhead: it uses somewhat more resources. So you need to weigh the trade-offs for the tasks at hand and decide which approach works best for you.
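To make the stream question concrete, here is a minimal sketch in which the same pipeline runs either sequentially or in parallel. The `sumOfSquares` method is a made-up example of independent, CPU-bound per-element work; whether the parallel version is actually faster depends on the data size and on how many cores the container is given.

```java
import java.util.stream.LongStream;

class StreamChoice {

    // Each element is processed independently, so the result is the same
    // whether the stream runs sequentially or in parallel.
    static long sumOfSquares(long n, boolean parallel) {
        LongStream range = LongStream.rangeClosed(1, n);
        if (parallel) {
            range = range.parallel();
        }
        return range.map(i -> i * i).sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1_000, false));
        System.out.println(sumOfSquares(1_000, true));
    }
}
```

One thing to keep in mind: by default, parallel streams run on the common ForkJoinPool, which is shared by the whole JVM, so a slow parallel pipeline in one request can delay parallel pipelines in other requests of the same instance.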
Related
I have a program which spins up thousands of threads. I am currently using one host for all the threads, which takes a lot of time. If I want to use multiple hosts (say 10 hosts, each running 100 different threads), how should I proceed?
Having thousands of threads on a single JVM sounds like a bad idea - you may spend most time context-switching instead of doing the actual work.
To split your work across multiple hosts, you cannot use threads managed by a single JVM. You'll need each host to expose an API that can receive part of the work and return the result of the work done.
One approach would be to use Java RMI (remote method invocation) to complete this task, but really, your question lacks so many details important for the decision of what architecture to choose.
Creating 1000 threads in one JVM is a very bad design; you need to minimise the thread count.
A high thread count will not give you the benefit of multi-threading, because context switching will be very frequent and will hurt performance.
If you are thinking of dividing the work across multiple hosts, then you need a parallel processing system such as Hadoop or Spark.
They handle task allocation internally and also provide a central system for coordinating all the hosts on which the threads/tasks are running.
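Before moving to multiple hosts, the context-switching problem on a single JVM can usually be reduced by queuing the tasks on a small fixed-size pool instead of giving each task its own thread. This is only a sketch: the task body (`id * 2`) is a hypothetical placeholder, and the pool size of 8 is an assumption that should be tuned to the cores available.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BoundedPool {

    // Runs `taskCount` tasks on at most `poolSize` threads and collects the results in order.
    static List<Integer> runAll(int taskCount, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                final int id = i;
                tasks.add(() -> id * 2); // hypothetical unit of work
            }
            List<Integer> results = new ArrayList<>();
            // invokeAll blocks until every task has finished.
            for (Future<Integer> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // 1000 tasks, but only 8 threads ever exist at the same time.
        System.out.println(runAll(1_000, 8).size());
    }
}
```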
In order to improve the execution speed of a Java program running in Google App Engine, can I create additional Java threads during the runtime to make use of idle machines in the data center?
I've found conflicting data thus far.
If your primary concern is to improve the execution time, take a look at Memcache and Tasks. They can be used to reduce or avoid the latency of reading from or writing to the Datastore or other storage options, fetching URLs, sending emails, etc. If you do a lot of difficult computations that can run in parallel, look at MapReduce API.
Once you remove all the delays from your program, there will be no reason to use multiple threads within a single request.
Note that App Engine instances can use multithreading to execute multiple requests at the same time, so they tend to use allocated resources efficiently. To enable it, see:
https://developers.google.com/appengine/docs/java/config/appconfig#Java_appengine_web_xml_Using_concurrent_requests
If you have a problem that calls for a multithreaded solution, you can use threads (as described on the link that you included in your question).
However, based on your reasoning ("to make use of idle machines in the datacenter"), it seems like you're misguided. You should not use threads for that reason. You use the machine hours that you pay for and not more. The only time you will have an idle machine is if you tell App Engine to keep around an extra idle machine so that it doesn't have to start up an extra machine when your app gets a big usage spike.
Most of the time, unless you are truly doing parallel computation, you won't need to use multiple threads in App Engine. For instance, the datastore has an asynchronous API so that you can do multiple datastore operations in parallel without having to deal with threads yourself.
Does that make sense?
I'm not really understanding the dyno and worker process model of Heroku as it relates to a single process but multi-threaded Java-based server.
For example: How do I know (for a single dyno) how many processors are available for my background threads? Do I need to use something like RabbitMQ and create a separate process (app) for each background processing task and communicate between the server and these? Seems a little overkill for some Scheduled Tasks using Thread Cached Executors. Should all Futures be changed to inter-process Futures?
I guess it comes down to this question. Can I no longer write a multi-threaded server and scale the processors available to my server process in order to accommodate my thread activity? Or do I need to refactor my architecture to use separate processes for concurrency? If the former, do I need workers or just multiple dynos?
Thanks.
Heroku supports multiple concurrency models, so it's really up to you how you would like to architect your application. You have access to the full Java stack, so if something makes more sense to just be run as multiple threads in your web processes, you can definitely do that, or you can always enqueue jobs on something like RabbitMQ or Redis and process them on separate worker dynos. Multithreading is simpler and makes sense if the amount of work is light and proportional to your web requests because it will be scaled along with the web dynos; however, if the work is large, not proportional, and/or needs to be scaled independently, then breaking it out into a separate process would be better.
Heroku was originally just a Ruby platform, which does not have the same threading capabilities as Java, so the use of separate worker dynos is more important for Ruby and this is reflected in some of the documentation and examples out there, which might have led to your confusion. Luckily, with Java you have more options available to you and can use what's best for the job at hand.
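On the question of how many processors are available to background threads: the JVM reports a count through Runtime.getRuntime().availableProcessors(), and an in-process scheduler can be sized from it. The sketch below is an illustration, not Heroku-specific code; `workerCount` and `runPeriodically` are made-up names, and what the call returns inside a container depends on the JVM version and the platform's limits.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class BackgroundWork {

    // Number of threads to use for background work, based on what the JVM reports.
    static int workerCount() {
        return Math.max(1, Runtime.getRuntime().availableProcessors());
    }

    // Runs a periodic task `runs` times inside this process.
    // Returns true if all runs completed before the timeout.
    static boolean runPeriodically(int runs, long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(workerCount());
        CountDownLatch remaining = new CountDownLatch(runs);
        try {
            // The task here only counts down; real work would go in its place.
            scheduler.scheduleAtFixedRate(remaining::countDown, 0, periodMillis, TimeUnit.MILLISECONDS);
            return remaining.await(runs * periodMillis + 5_000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("workers: " + workerCount());
        System.out.println("completed: " + runPeriodically(3, 100));
    }
}
```

Scheduled tasks like this scale with the number of web dynos, which fits the "light and proportional" case described above; anything heavier is a candidate for a queue and separate worker dynos.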
I am creating a distributed service and I am looking at restricting a set of time-consuming operations to a single thread of execution across all JVMs at any given time. (I will have to deal with 3 JVMs at most.)
My initial investigations point me towards java.util.concurrent.Executors and java.util.concurrent.Semaphore. However, using the singleton pattern with Executors or Semaphore does not guarantee a single thread of execution across multiple JVMs.
I am looking for a core Java API (or at least a pattern) that I can use to accomplish my task.
P.S.: I have access to ActiveMQ in my existing project, but I was planning to use it to achieve a single thread of execution across multiple JVMs only if I have no other choice.
There is no simple solution for this with a core java API. If the 3 JVMs have access to a shared file system you could use it to track state across JVMs.
So basically you create a lock file when you start the expensive operation and delete it when the operation finishes. Each JVM then checks for the existence of this lock file before starting the operation. However, there are some issues with this approach, such as what happens if the JVM dies in the middle of the expensive operation and the file isn't deleted.
ZooKeeper is a nice solution for problems like this and any other cross-process synchronization issue. Check it out if that is a possibility for you. I think it's a much more natural way to solve a problem like this than a JMS queue.
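A minimal sketch of the lock-file approach, assuming all JVMs can see the same path. It relies on Files.createFile, which creates the file atomically and fails if it already exists, so checking and creating happen in one step. `LockFileGuard` and `runExclusively` are made-up names; the stale-lock problem after a JVM crash is not solved here, and atomicity on some network file systems is an assumption worth verifying.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class LockFileGuard {

    // Runs the task only if this JVM manages to create the lock file.
    // Returns true if the task ran, false if the lock was already held.
    static boolean runExclusively(Path lockFile, Runnable task) {
        try {
            Files.createFile(lockFile); // atomic check-and-create
        } catch (FileAlreadyExistsException e) {
            return false; // another JVM (or thread) is running the operation
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try {
            task.run();
            return true;
        } finally {
            // Release the lock. If the JVM dies before this line, the file stays behind.
            try {
                Files.deleteIfExists(lockFile);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical lock location; in practice this would be on the shared file system.
        Path lock = Paths.get(System.getProperty("java.io.tmpdir"), "expensive-operation.lock");
        boolean ran = runExclusively(lock, () -> System.out.println("running expensive operation"));
        System.out.println("ran: " + ran);
    }
}
```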
I was just wondering whether an algorithm actually needs to be multi-threaded in order to make use of multi-core processors, or whether the JVM will make use of multiple cores even though our algorithm is sequential?
UPDATE:
Related Question:
Muti-Threaded quick or merge sort in java
I don't believe any current, production JVM implementations perform automatic multi-threading. They may use other cores for garbage collection and some other housekeeping, but if your code is expressed sequentially it's difficult to automatically parallelize it and still keep the precise semantics.
There may be some experimental/research JVMs which try to parallelize areas of code which the JIT can spot as being embarrassingly parallel, but I haven't heard of anything like that for production systems. Even if the JIT did spot this sort of thing, it would probably be less effective than designing your code for parallelism in the first place. (Writing the code sequentially, you could easily end up making design decisions which would hamper automatic parallelism unintentionally.)
Your implementation needs to be multi-threaded in order to take advantage of the multiple cores at your disposal.
Your system as a whole can use a single core per running application or service. Each running application, though, will work off a single thread/core unless implemented otherwise.
Java will not automatically split your program into threads. Currently, if you want your code to be able to run on multiple cores at once, you need to tell the computer through threads, or some other mechanism, how to split up the code into tasks and the dependencies between tasks in your program. However, other tasks can run concurrently on the other cores, so your program may still run faster on a multicore processor if you are running other things concurrently.
An easy way to make your current code parallelizable is to use JOMP to parallelize for loops and the processing-intensive, easily parallelized parts of your code.
I don't think a multi-threaded algorithm will make effective use of multi-core processors unless it is coded for that. Here is a nice article which talks about making use of multi-core processors for developers -
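One "other mechanism" in the standard library is the fork/join framework, where the code states explicitly how to split the work and how the pieces depend on each other. The following is a sketch only: `ParallelSum` is a made-up example, and the threshold of 10,000 elements is an arbitrary assumption.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ParallelSum extends RecursiveTask<Long> {

    private static final int THRESHOLD = 10_000; // assumed cut-off for splitting

    private final long[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    ParallelSum(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        // Small enough: sum sequentially on the current thread.
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        // Otherwise split in two; the halves are independent of each other.
        int mid = (from + to) >>> 1;
        ParallelSum left = new ParallelSum(data, from, mid);
        ParallelSum right = new ParallelSum(data, mid, to);
        left.fork();                       // schedule the left half on the pool
        long rightSum = right.compute();   // compute the right half here
        return left.join() + rightSum;     // the result depends on both halves
    }

    static long sum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new ParallelSum(data, 0, data.length));
    }

    public static void main(String[] args) {
        long[] data = new long[100_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(sum(data));
    }
}
```

For sorting specifically, Arrays.parallelSort in the standard library applies the same idea without any hand-written splitting.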
http://java.dzone.com/news/building-multi-core-ready-java