I am reading about Scala’s actors, so say we have something like:
object Worker extends Actor {
  def act() {
    while(true) {
      receive {
        case "exit" => {
          println("exiting...")
          sender ! Exit
        }
        case s:String if s.startsWith("scp") => {
          println("Starting scp")
          Thread.sleep(2000)
          sender ! Done(s)
        }
        case s:String => {
          println("Starting " + s)
          sender ! Done(s)
        }
      }
    }
  }
}
(http://www.naildrivin5.com/scalatour/wiki_pages/ActorsAndConcurrency)
What would the equivalent pattern be like with Java? I understand it is much more cumbersome to do this in Java.
Are there any performance implications with Scala’s actors? From what I gather they are far easier to both implement and understand, but I am curious whether there are any tradeoffs.
Take a look at the Akka framework. With it you will have the power of the Actor Model in Java.
As someone else mentioned, Akka is probably the best candidate: while it has been written in Scala, it has been done in such a way as to also make it very accessible from Java. As a side note, the Akka implementation will replace the current Scala implementation in the future.
Also the Scala actor implementation isn't a feature of the language itself, it's just that the standard library includes that implementation.
As for the performance implications, the current Scala implementation isn't that good anyway, so it would be a bad example. I can't recommend the Akka docs highly enough, however: http://doc.akka.io/docs/akka/2.0.4/
Scala's actor (not to be confused with Akka's actor) is effectively a thread with an input queue, and its equivalent can easily be implemented in Java:
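import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

interface Port<T> {
    void send(T msg);
}

class StringMessage {
    String value;
    Port<Object> sender; // where replies are sent
}

// Reply messages, mirroring Exit and Done(s) in the Scala example
class Exit {}

class Done {
    final String value;
    Done(String value) { this.value = value; }
}

class Worker extends Thread implements Port<StringMessage> {
    // A blocking queue, so that take() waits for the next message
    private final BlockingQueue<StringMessage> q = new LinkedBlockingQueue<>();

    @Override
    public void send(StringMessage m) {
        q.add(m);
    }

    @Override
    public void run() {
        try {
            while (true) {
                StringMessage msg = q.take();
                String s = msg.value;
                if (s.equals("exit")) {
                    System.out.println("exiting...");
                    msg.sender.send(new Exit());
                    return;
                } else if (s.startsWith("scp")) {
                    System.out.println("Starting scp");
                    Thread.sleep(2000);
                    msg.sender.send(new Done(s));
                } else {
                    System.out.println("Starting " + s);
                    msg.sender.send(new Done(s));
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop when interrupted
        }
    }
}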
This is only a sketch; to make it workable you have to develop contracts and protocols between the communicating threads. Or you can take an existing actor framework for Java (there are many).
To choose wisely, you have to answer the following questions:
Should actors be based on threads or on lightweight tasks executing on a thread pool? Threads consume a lot of memory but allow blocking operations. Akka, the most widely known framework, uses lightweight tasks.
Is the actor model enough for you? A classic actor has a single input port; the broader dataflow model allows an actor node to have several input ports, and the node fires when all input ports are non-empty. This allows constructing "nested callbacks", as in another question. Java dataflow frameworks are rare; the only open-source library I know of is my own, df4j. It allows both thread-based and task-based actor nodes, and has an Actor subclass with a single input.
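To make the first question more concrete, here is a minimal sketch of a task-based actor, assuming only the JDK. The name TaskActor and its methods are hypothetical and are not taken from Akka or df4j. The actor occupies a pool thread only while its mailbox has messages, and at most one drain task runs at a time, so the handler never runs concurrently with itself.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical task-based actor: a mailbox plus a drain task run on an Executor.
class TaskActor<T> implements Runnable {
    private final Queue<T> mailbox = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean scheduled = new AtomicBoolean(false);
    private final Executor executor;
    private final Consumer<T> handler;

    TaskActor(Executor executor, Consumer<T> handler) {
        this.executor = executor;
        this.handler = handler;
    }

    // Enqueue a message and make sure a drain task is scheduled.
    void send(T msg) {
        mailbox.add(msg);
        trySchedule();
    }

    private void trySchedule() {
        // Only one drain task may be scheduled at any time.
        if (!mailbox.isEmpty() && scheduled.compareAndSet(false, true)) {
            executor.execute(this);
        }
    }

    // Runs on a pool thread and processes messages until the mailbox is empty.
    @Override
    public void run() {
        try {
            T msg;
            while ((msg = mailbox.poll()) != null) {
                handler.accept(msg);
            }
        } finally {
            scheduled.set(false);
            trySchedule(); // a message may have arrived after the last poll
        }
    }
}
```

Because such an actor shares pool threads with others, a handler that blocks (like the Thread.sleep call in the worker above) holds up a pool thread; that is the tradeoff against the thread-per-actor design.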
The program I am working on has a distributed architecture, more precisely the Broker-Agent pattern. The broker sends messages to its corresponding agent in order to tell the agent to execute a task. Each message contains the target task information (the task name, the configuration properties the task needs, etc.). In my code, each task on the agent side is implemented in a separate class, like:
public class Task1 {}
public class Task2 {}
public class Task3 {}
...
Messages are in JSON format like:
{
"taskName": "Task1", // put the class name here
"config": {
}
}
So what I need is to associate the message sent from the broker with the right task on the agent side.
I know one way is to put the target task class name in the message, so that the agent can create an instance of that task class by name using reflection, like:
Class.forName(className).getConstructor(String.class).newInstance(arg);
I want to know what the best practice is for implementing this association. The number of tasks is growing, and I think writing class names as strings is error-prone and hard to maintain.
If you're that specific about class names, you could even think about serializing task objects and sending them directly. That's probably simpler than your reflection approach (though even more tightly coupled).
But usually you don't want that kind of coupling between Broker and Agent. A broker needs to know which task types there are and how to describe the task in a way that everybody understands (like in JSON). It doesn't / shouldn't know how the Agent implements the task. Or even in which language the Agent is written. (That doesn't mean that it's a bad idea to define task names in a place that is common to both code bases)
So you're left with finding a good way to construct objects (or call methods) inside your agent based on some string. And the common solution for that is some form of factory pattern like: http://alvinalexander.com/java/java-factory-pattern-example - also helpful: a Map<String, Factory> like
interface Task {
    void doSomething();
}

interface Factory {
    Task makeTask(String taskDescription);
}

Map<String, Factory> taskMap = new HashMap<>();

void init() {
    taskMap.put("sayHello", new Factory() {
        @Override
        public Task makeTask(String taskDescription) {
            return new Task() {
                @Override
                public void doSomething() {
                    System.out.println("Hello" + taskDescription);
                }
            };
        }
    });
}

void onTask(String taskName, String taskDescription) {
    Factory factory = taskMap.get(taskName);
    if (factory == null) {
        System.out.println("Unknown task: " + taskName);
        return; // nothing to execute
    }
    Task task = factory.makeTask(taskDescription);
    // execute task somewhere
    new Thread(task::doSomething).start();
}
http://ideone.com/We5FZk
And if you want it fancy, consider annotation-based reflection magic. Whether that is worth it depends on how many task classes there are: the more there are, the more effort it makes sense to put into an automagic solution that hides the complexity from you.
For example above Map could be filled automatically by adding some class path scanning for classes of the right type with some annotation that holds the string(s). Or you could let some DI framework inject all the things that need to go into the map. DI in larger projects usually solves those kinds of issues really well: https://softwareengineering.stackexchange.com/questions/188030/how-to-use-dependency-injection-in-conjunction-with-the-factory-pattern
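As a rough sketch of the annotation idea, assuming only the JDK: the annotation TaskName, the interface NamedTask and the class TaskRegistry below are hypothetical names invented for illustration. The task classes are passed in by hand here; a class path scanner or a DI container would supply that list in a real setup.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.HashMap;
import java.util.Map;

// Hypothetical annotation holding the task name used in broker messages.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface TaskName {
    String value();
}

interface NamedTask {
    String run(String taskDescription);
}

@TaskName("sayHello")
class SayHelloTask implements NamedTask {
    public String run(String taskDescription) {
        return "Hello " + taskDescription;
    }
}

@TaskName("shout")
class ShoutTask implements NamedTask {
    public String run(String taskDescription) {
        return taskDescription.toUpperCase();
    }
}

class TaskRegistry {
    private final Map<String, Class<? extends NamedTask>> tasks = new HashMap<>();

    // Reads the name from each class's annotation, so names live next to the code.
    @SafeVarargs
    TaskRegistry(Class<? extends NamedTask>... taskClasses) {
        for (Class<? extends NamedTask> c : taskClasses) {
            TaskName name = c.getAnnotation(TaskName.class);
            if (name == null) {
                throw new IllegalArgumentException(c + " has no @TaskName");
            }
            if (tasks.put(name.value(), c) != null) {
                throw new IllegalArgumentException("Duplicate task name: " + name.value());
            }
        }
    }

    // Creates a fresh task instance for the name found in a message.
    NamedTask create(String taskName) {
        Class<? extends NamedTask> c = tasks.get(taskName);
        if (c == null) {
            throw new IllegalArgumentException("Unknown task: " + taskName);
        }
        try {
            return c.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + c, e);
        }
    }
}
```

With this shape the broker only ever sees the strings "sayHello" and "shout", and a duplicate or missing name fails at startup rather than when a message arrives.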
And instead of writing your own distribution system you can probably use an existing one (reusing rather than reinventing is a best practice). Maybe http://www.typesafe.com/activator/template/akka-distributed-workers or, more generally, http://twitter.github.io/finagle/ would work in your context. But there are way too many other open source distributed systems covering different aspects to name all the interesting ones.
I am coming to Akka after spending quite a bit of time over in Hystrix-land where, like Akka, failure is a first-class citizen.
In Hystrix, I might have a SaveFizzToDbCmd that attempts to save a Fizz instance to an RDB (MySQL, whatever), and a backup/“fallback” SaveFizzToMemoryCmd that saves that Fizz to an in-memory cache in case the primary (DB) command goes down/starts failing:
// Groovy pseudo-code
class SaveFizzToDbCmd extends HystrixCommand<Fizz> {
    SaveFizzToMemoryCmd memoryFallback
    Fizz fizz

    @Override
    Fizz run() {
        // Use raw JDBC to save 'fizz' to an RDB.
    }

    @Override
    Fizz getFallback() {
        // This only executes if the 'run()' method above throws
        // an exception.
        memoryFallback.fizz = this.fizz
        memoryFallback.execute()
    }
}
In Hystrix, if run() throws an exception (say a SQLException), its getFallback() method is invoked. If enough exceptions get thrown within a certain amount of time, the HystrixCommand's “circuit breaker” is “tripped” and only the getFallback() method will be invoked.
I am interested in accomplishing the same in Akka, but with actors. With Akka, we might have a JdbcPersistor actor and an InMemoryPersistor backup/fallback actor like so:
class JdbcPersistor extends UntypedActor {
    @Override
    void onReceive(Object message) {
        if(message instanceof SaveFizz) {
            SaveFizz saveFizz = message as SaveFizz
            Fizz fizz = saveFizz.fizz
            // Use raw JDBC to save 'fizz' to an RDB.
        }
    }
}

class InMemoryPersistor extends UntypedActor {
    // Should be obvious what this does.
}
The problem I’m struggling with is:
How to get InMemoryPersistor correctly configured/wired as the backup to JdbcPersistor when it is failing; and
How to fail back over to the JdbcPersistor if/when it “heals” (though it may never)
I would imagine this is logic that belongs inside JdbcPersistor's SupervisorStrategy, but I can find nothing in the Akka docs nor any code snippets that implement this kind of behavior. Which tells me “hey, maybe this isn’t the way Akka works, and perhaps there’s a different way of doing this sort of circuit breaking/failover/failback in Akka-land.” Thoughts?
Please note: Java examples are enormously appreciated as Scala looks like hieroglyphics to me!
One way would be to have a FailoverPersistor actor that consuming code communicates with. It has both a JdbcPersistor and an InMemoryPersistor as children, plus a flag that decides which one to use, and it routes traffic to the correct child depending on that flag. The flag could then be manipulated by both the supervisor and timed logic/statistics inside the actor.
There is a circuit breaker in the contrib package of akka that might be an inspiration (or maybe even useable to achieve what you want): http://doc.akka.io/docs/akka/current/common/circuitbreaker.html
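The flag-based routing described above can be sketched in plain Java. This is deliberately not the Akka API: FailoverRouter and all of its members are hypothetical names, and the two Consumer arguments stand in for the JdbcPersistor and InMemoryPersistor children. Inside an actor the same state would live in the FailoverPersistor and the synchronized keyword would be unnecessary.

```java
import java.util.function.Consumer;

// Hypothetical failover router: sends to the primary until it fails repeatedly,
// then uses the fallback and periodically retries the primary.
class FailoverRouter<T> {
    private final Consumer<T> primary;
    private final Consumer<T> fallback;
    private final int maxFailures;       // consecutive failures before switching over
    private final long retryAfterMillis; // wait this long before trying the primary again
    private int failures = 0;
    private long openedAt = 0;
    private boolean usingFallback = false;

    FailoverRouter(Consumer<T> primary, Consumer<T> fallback,
                   int maxFailures, long retryAfterMillis) {
        this.primary = primary;
        this.fallback = fallback;
        this.maxFailures = maxFailures;
        this.retryAfterMillis = retryAfterMillis;
    }

    // 'now' is passed in (in milliseconds) so the timing logic is deterministic.
    synchronized void route(T message, long now) {
        if (usingFallback && now - openedAt < retryAfterMillis) {
            fallback.accept(message); // primary is considered down
            return;
        }
        try {
            primary.accept(message);
            failures = 0;
            usingFallback = false;    // primary works (again): fail back
        } catch (RuntimeException e) {
            failures++;
            if (usingFallback || failures >= maxFailures) {
                usingFallback = true; // trip, or stay tripped after a failed retry
                openedAt = now;
            }
            fallback.accept(message); // do not lose the message
        }
    }

    synchronized boolean isUsingFallback() {
        return usingFallback;
    }
}
```

The thresholds play the role of the "timed logic/statistics" mentioned above; in practice the values would need tuning for the database in question.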
Please note: I am a Java developer with no working knowledge of Scala (sadly). I would ask that any code examples provided in the answer would be using Akka's Java API.
I am brand-spanking-new to Akka and actors, and am trying to set up a fairly simple actor system:
So a DataSplitter actor runs and splits up a rather large chunk of binary data, say 20GB, into 100 KB chunks. For each chunk, the data is stored in the DataCache via the DataCacher. In the background, a DataCacheCleaner rummages through the cache and finds data chunks that it can safely delete. This is how we prevent the cache from becoming 20GB in size.
After sending the chunk off to the DataCacher for caching, the DataSplitter then notifies the ProcessorPool of the chunk which now needs to be processed. The ProcessorPool is a router/pool consisting of tens of thousands of different ProcessorActors. When each ProcessorActor receives a notification to "process" a 100KB chunk of data, it then fetches the data from the DataCacher and does some processing on it.
If you're wondering why I am bothering even caching anything here (hence the DataCacher, DataCache and DataCacheCleaner), my thinking was that 100KB is still a fairly large message to pass around to tens of thousands of actor instances (100KB * 1,000 = 100MB), so I am trying to just store the 100KB chunk once (in a cache) and then let each actor access it by reference through the cache API.
There is also a Mailman actor that subscribes to the event bus and intercepts all DeadLetters.
So, altogether, 6 actors:
DataSplitter
DataCacher
DataCacheCleaner
ProcessorPool
ProcessorActor
Mailman
The Akka docs preach that you should decompose your actor system based on dividing up subtasks rather than purely by function, but I'm not exactly seeing how this applies here. The problem at hand is that I'm trying to organize a supervisor hierarchy between these actors and I'm not sure what the best/correct approach is. Obviously ProcessorPool is a router that needs to be the parent/supervisor to the ProcessorActors, so we have this known hierarchy:
/user/processorPool/
processorActors
But other than that known/obvious relationship, I'm not sure how to organize the rest of my actors. I could make them all "peers" under one common/master actor:
/user/master/
dataSplitter/
dataCacher/
dataCacheCleaner/
processorPool/
processorActors/
mailman/
Or I could omit a master (root) actor and try to make things more vertical around the cache:
/user/
dataSplitter/
cacheSupervisor/
dataCacher/
dataCacheCleaner/
processorPool/
processorActors/
mailman/
Being so new to Akka I'm just not sure what the best course of action is, and if someone could help with some initial hand-holding here, I'm sure the lightbulbs will all turn on. And, just as important as organizing this hierarchy is, I'm not even sure what API constructs I can use to actually create the hierarchy in the code.
Organising them under one master makes it easier to manage since you can access all the actors watched by the supervisor (in this case master).
One hierarchical implementation can be:
Master Supervisor Actor
class MasterSupervisor extends UntypedActor {
    ActorRef dataSplitter  // created lazily on the first "SPLIT" message

    private static SupervisorStrategy strategy = new AllForOneStrategy(2,
        Duration.create(5, TimeUnit.MINUTES),
        new Function<Throwable, Directive>() {
            @Override
            public Directive apply(Throwable t) {
                if (t instanceof SQLException) {
                    log.error("Error: SQLException")
                    return restart()
                } else if (t instanceof IllegalArgumentException) {
                    log.error("Error: IllegalArgumentException")
                    return stop()
                } else {
                    log.error("Error: GeneralException")
                    return stop()
                }
            }
        });

    @Override
    public SupervisorStrategy supervisorStrategy() { return strategy }

    @Override
    void onReceive(Object message) throws Exception {
        if (message.equals("SPLIT")) {
            // CREATE THE DataSplitter CHILD IF IT DOES NOT EXIST YET
            if (!dataSplitter) {
                dataSplitter = context().actorOf(FromConfig.getInstance().props(Props.create(DataSplitter.class)), "DataSplitter")
                // WATCH THE CHILD
                context().watch(dataSplitter)
                log.info("${self().path()} has created, watching and sent JobId = ${message} message to DataSplitter")
            }
            // do something with message such as Forward
            dataSplitter.forward(message, context())
        }
    }
}
DataSplitter Actor
class DataSplitter extends UntypedActor {
    // Inject a Service to do the main operation
    DataSplitterService dataSplitterService

    @Override
    void onReceive(Object message) throws Exception {
        if (message.equals("SPLIT")) {
            log.info("${self().path()} received message: ${message} from ${sender()}")
            // do something with message such as Forward
            dataSplitterService.splitData()
        }
    }
}
I have a layered architecture in a Java web application. The UI layer is just Java, services are typed Akka actors and external service calls (WS, DB etc.) are wrapped in Hystrix commands.
The UI calls the service and the service returns an Akka future. It's an Akka future because I want to make UI coding simpler with the onComplete and onFailure callbacks that Akka futures provide. The service then creates the future that does some mapping etc. and wraps a call to a HystrixCommand that returns a Java future.
So in pseudocode:
UI
AkkaFuture future = service.getSomeData();
Service
public AkkaFuture getSomeData() {
return future {
JavaFuture future = new HystrixCommand(mapSomeData()).queue()
//what to do here, currently just return future.get()
}
}
The problem is that I would like to free up the thread the service actor is using and just tie up the threads that Hystrix uses. But the Java future prevents that because I have to block on its completion. The only option I can think of (which I'm not sure I like) is to poll the Java future(s) constantly and complete the Akka future when the Java future finishes.
Note: the question isn't really related to Hystrix per se, but I decided to mention it if somebody comes up with a solution specifically related to Hystrix.
I'm marking the answer by @Hbf as a solution, since I ended up doing an Akka poller as explained in "How do I wrap a java.util.concurrent.Future in an Akka Future?". For reference I also tried:
Creating a HystrixCommandExecutionHook and extending HystrixCommand to allow callbacks. That didn't work because the hook wasn't called at the right time.
Using Guava's ListenableFuture by having a decorated executor create the futures inside Hystrix and then casting the futures from the commands. That doesn't work because Hystrix uses a ThreadPoolExecutor, which can't be decorated.
EDIT: I'm adding the Akka poller code below, since the original answer was in Scala and it hangs if the Java future doesn't cancel nicely. The solution below always walks away from threads after a timeout.
protected <T> Future<T> wrapJavaFutureInAkkaFuture(final java.util.concurrent.Future<T> javaFuture, final Option<FiniteDuration> maybeTimeout, final ActorSystem actorSystem) {
    final Promise<T> promise = Futures.promise();
    if (maybeTimeout.isDefined()) {
        pollJavaFutureUntilDoneOrCancelled(javaFuture, promise, Option.option(maybeTimeout.get().fromNow()), actorSystem);
    } else {
        pollJavaFutureUntilDoneOrCancelled(javaFuture, promise, Option.<Deadline> none(), actorSystem);
    }
    return promise.future();
}

protected <T> void pollJavaFutureUntilDoneOrCancelled(final java.util.concurrent.Future<T> javaFuture, final Promise<T> promise, final Option<Deadline> maybeTimeout, final ActorSystem actorSystem) {
    if (maybeTimeout.isDefined() && maybeTimeout.get().isOverdue()) {
        // on timeouts, try to cancel the Java future and simply walk away
        javaFuture.cancel(true);
        promise.failure(new ExecutionException(new TimeoutException("Future timed out after " + maybeTimeout.get())));
    } else if (javaFuture.isDone()) {
        try {
            promise.success(javaFuture.get());
        } catch (final Exception e) {
            promise.failure(e);
        }
    } else {
        // not done yet: check again in 50 ms without blocking a thread
        actorSystem.scheduler().scheduleOnce(Duration.create(50, TimeUnit.MILLISECONDS), new Runnable() {
            @Override
            public void run() {
                pollJavaFutureUntilDoneOrCancelled(javaFuture, promise, maybeTimeout, actorSystem);
            }
        }, actorSystem.dispatcher());
    }
}
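For comparison, the same polling idea can be sketched with the JDK alone, swapping the Akka Promise and scheduler for CompletableFuture and ScheduledExecutorService. This is a different technique from the Akka code above, shown only to isolate the mechanism; the class name FuturePoller and its parameters are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: adapts a blocking-style Future to a callback-capable one by polling.
class FuturePoller {
    // Checks 'javaFuture' every 'pollMillis' and gives up after 'timeoutMillis'.
    static <T> CompletableFuture<T> wrap(Future<T> javaFuture,
                                         ScheduledExecutorService scheduler,
                                         long pollMillis, long timeoutMillis) {
        CompletableFuture<T> result = new CompletableFuture<>();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        poll(javaFuture, result, scheduler, pollMillis, deadline);
        return result;
    }

    private static <T> void poll(Future<T> javaFuture, CompletableFuture<T> result,
                                 ScheduledExecutorService scheduler,
                                 long pollMillis, long deadline) {
        if (javaFuture.isDone()) {
            try {
                result.complete(javaFuture.get()); // does not block: already done
            } catch (Exception e) {
                result.completeExceptionally(e);
            }
        } else if (System.nanoTime() - deadline >= 0) {
            // On timeout, try to cancel the Java future and walk away.
            javaFuture.cancel(true);
            result.completeExceptionally(new TimeoutException("Future timed out"));
        } else {
            // Not done yet: look again later without holding a thread in between.
            scheduler.schedule(
                () -> poll(javaFuture, result, scheduler, pollMillis, deadline),
                pollMillis, TimeUnit.MILLISECONDS);
        }
    }
}
```

The cost of polling is latency of up to one poll interval per future plus scheduler load that grows with the number of outstanding futures, which is why a library-provided callback is preferable when one exists.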
Java futures are known to be inferior in design compared to something like Scala futures. Take a look at the discussion "How do I wrap a java.util.concurrent.Future in an Akka Future", for example.
But: Maybe, instead of polling (as suggested in the above discussion), Hystrix offers some kind of onComplete callback? I do not know the library at all but stumbled upon an onComplete in the Hystrix API. Maybe it helps?
As of Hystrix 1.3 it now also supports true non-blocking callbacks and that will fit much better into Akka/Scala Future behavior that is non-blocking and composable: https://github.com/Netflix/Hystrix/wiki/How-To-Use#wiki-Reactive-Execution
I am in the process of moving the business logic of my Swing program onto the server.
What would be the most efficient way to communicate client-server and server-client?
The server will be responsible for authentication and for fetching and storing data, so the program will have to communicate with it frequently.
it depends on a lot of things. if you want a real answer, you should clarify exactly what your program will be doing and exactly what falls under your definition of "efficient"
if rapid productivity falls under your definition of efficient, a method that I have used in the past involves serialization to send plain old java objects down a socket. recently I have found that, in combination with the netty api, i am able to rapidly prototype fairly robust client/server communication.
the guts are fairly simple; the client and server both run Netty with an ObjectDecoder and ObjectEncoder in the pipeline. A class is made for each object designed to handle data. for example, a HandshakeRequest class and HandshakeResponse class.
a handshake request could look like:
public class HandshakeRequest extends Message {
private static final long serialVersionUID = 1L;
}
and a handshake response may look like:
public class HandshakeResponse extends Message {
private static final long serialVersionUID = 1L;
private final HandshakeResult handshakeResult;
public HandshakeResponse(HandshakeResult handshakeResult) {
this.handshakeResult = handshakeResult;
}
public HandshakeResult getHandshakeResult() {
return handshakeResult;
}
}
in netty, the server would send a handshake request when a client connects as such:
@Override
public void channelConnected(ChannelHandlerContext ctx, ChannelStateEvent e) {
    Channel ch = e.getChannel();
    ch.write(new HandshakeRequest());
}
the client receives the HandshakeRequest Object, but it needs a way to tell what kind of message the server just sent. for this, a Map<Class<?>, Method> can be used. when your program is run, it should iterate through the Methods of a class with reflection and place them in the map. here is an example:
public HashMap<Class<?>, Method> populateMessageHandler() {
HashMap<Class<?>, Method> temp = new HashMap<Class<?>, Method>();
for (Method method : getClass().getMethods()) {
if (method.getAnnotation(MessageHandler.class) != null) {
Class<?>[] methodParameters = method.getParameterTypes();
temp.put(methodParameters[1], method);
}
}
return temp;
}
this code would iterate through the current class and look for methods marked with an @MessageHandler annotation, then look at the message parameter of the method (the second parameter here, after the ChannelHandlerContext, as in public void handleHandshakeRequest(ChannelHandlerContext ctx, HandshakeRequest request)) and place its class into the map as a key with the actual method as its value.
with this map in place, it is very easy to receive a message and send the message directly to the method that should handle the message:
@Override
public void messageReceived(ChannelHandlerContext ctx, MessageEvent e) {
    try {
        Message message = (Message) e.getMessage();
        Method method = messageHandlers.get(message.getClass());
        if (method == null) {
            System.out.println("No handler for message!");
        } else {
            method.invoke(this, ctx, message);
        }
    } catch(Exception exception) {
        exception.printStackTrace();
    }
}
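to see the map-based dispatch in isolation, here is a small sketch without netty. the MessageHandler annotation below is an assumed definition (the original is not shown), and Ping, Pong and Dispatcher are hypothetical names; handler methods take only the message here, so the key is the first parameter type.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Assumed definition: must be retained at runtime to be visible via reflection.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface MessageHandler {}

class Ping {}
class Pong {}

class Dispatcher {
    private final Map<Class<?>, Method> handlers = new HashMap<>();
    final StringBuilder log = new StringBuilder();

    Dispatcher() {
        // Index every annotated one-argument method by its parameter type.
        for (Method m : getClass().getDeclaredMethods()) {
            if (m.isAnnotationPresent(MessageHandler.class) && m.getParameterCount() == 1) {
                handlers.put(m.getParameterTypes()[0], m);
            }
        }
    }

    @MessageHandler
    void onPing(Ping ping) { log.append("ping;"); }

    @MessageHandler
    void onPong(Pong pong) { log.append("pong;"); }

    // Returns false when no handler is registered for the message's class.
    boolean dispatch(Object message) {
        Method m = handlers.get(message.getClass());
        if (m == null) {
            return false;
        }
        try {
            m.invoke(this, message);
            return true;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

note that the lookup uses the exact runtime class, so a subclass of a handled message type would not be matched.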
there's not really anything left to it. netty handles all of the messy stuff, allowing us to send serialized objects back and forth with ease. if you decide that you do not want to use netty, you can wrap your own protocol around java's ObjectOutputStream. you will have to do a little bit more work overall, but the simplicity of communication remains intact.
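as a rough idea of what the ObjectOutputStream route involves, here is a sketch that round-trips one message through byte arrays. ChatMessage and ObjectWire are hypothetical names; over a real connection the streams would wrap socket.getOutputStream() and socket.getInputStream() instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical message type; anything sent this way must be Serializable.
class ChatMessage implements Serializable {
    private static final long serialVersionUID = 1L;
    final String text;

    ChatMessage(String text) {
        this.text = text;
    }
}

class ObjectWire {
    // Serializes one message to bytes.
    static byte[] encode(Serializable message) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(message);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads one message back; the receiver needs the same classes on its class path.
    static Object decode(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

one caveat: java deserialization should only be used with peers you trust, since reading objects from an untrusted source is a known security risk.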
It's a bit hard to say which method is "most efficient" in terms of what, and I don't know your use cases, but here's a couple of options:
The most basic way is to simply use "raw" TCP-sockets. The upside is that there's nothing extra moving across the network and you create your protocol yourself, the latter being also a downside; you have to design and implement your own protocol for the communication, plus the basic framework for handling multiple connections in the server end (if there is a need for such).
Using UDP-sockets, you'll probably save a little latency and bandwidth (not much; unless you're using something like mobile data, you probably won't notice any difference from TCP in terms of latency), but the networking code is a bit harder to write. UDP-sockets are "connectionless", meaning all the clients' messages will end up in the same handler and must be distinguished from one another. If the server needs to keep track of client state, this can be somewhat troublesome to implement correctly.
MadProgrammer brought up RMI (remote method invocation), I've personally never used it, and it seems a bit cumbersome to set up, but might be pretty good in the long run in terms of implementation.
Probably one of the most common ways is to use http for the communication, for example via REST-interface for Web services. There are multiple frameworks (I personally prefer Spring MVC) to help with the implementation, but learning a new framework might be out of your scope for now. Also, complex http-queries or long urls could eat your bandwidth a bit more, but unless we're talking about very large amounts of simultaneous clients, this usually isn't a problem (assuming you run your server(s) in a datacenter with something like 100/100MBit connections). This is probably the easiest solution to scale, if it ever comes to that, as there're lots of load-balancing solutions available for web servers.