Grid Computing and Java

Grid Computing and Java - java

I couldn't seem to find a similar question to this.
I am currently looking at the best solution solving a grid computing problem.
The setup:
I have a server/client situation where there clients [typically dumb of most logic] and recieve instructions from the server
Have an authorization request
Clients report back information on speed of completing the task (the task's difficult is judged by the task type)
Clients recieve the best fit task for their previous performance (the best clients receive the worst problems)
Eventually the requirements would be:
The client's footprint must be small and standalone - I can't have a client that requires lots to install and setup
The client should be able to grab new jobs and job runtimes from the server (it would be nice to have the grid scale to new problems [and the new problems would be distributed by the server] that are introduced)
I need to have an authentication layer (doesn't have to be complex or conform to an existing ldap) [easier requirement: clients can signup for a new "membership" and get access] (I'm not sure that RMI's strengths lie here)
The clients would be able to run from the Internet rather in a networked environement
Which means encryption of the results requested
I'm currently using webservices to communicate between the clients and and the server. All of the information and results goes back to the hosting server (J2EE).
My question is there a grid system setup that matches all/most of these requirements, and is open source?
I'm not interested in doing a cloud because most of these tasks are small, but very frequent (once a day but the task may be easy, but performs maintenance).
All of the code for this system is in Java.

You may want to investigate space-based architectures, and in particular Jini and Javaspaces. What's Jini ? It is essentially RMI with a configurable discovery mechanism. You request an implementor of a Java interface, and the Jini subsystem finds current services implementing that interface and dynamically informs your service of these.
Briefly, you'd write the work items into a space. The grid nodes would be set up to read data transactionally from the space. Each grid node would take a work item, process it and write a result back into that space (or another space). The distributing node can monitor for results being written back (and.or for your projected result timings, as you've requested).
It's all Java, and will scale linearly. Because it's Jini, the grid nodes can dynamically load their classes from an HTTP server and so you can propagate code updates trivially.

Take a look on Grid Beans

BOINC sounds like it would work for your problem, though you have to wrap java for your clients. That, and it may be overkill for you.

Related

How to send/receive data to/from MetaTrader Ternminal 4 with JAVA (or anything!)

I have been working on an algorithm ( Not mine, I am just modifying it ) that predicts when to buy and sell on the FOREX market. I need to be able to open and close orders, dynamically update parameters of the orders ( such as stoploss, maximum stop etc. ) and receive real time tick data.
I have been researching for well over a week, and have no success.
The closest I have gotten is using JavoNet and Mt4 Api
I managed to import the DLL into java and use a MQL4 function, which was AccountBalance(), however this has returned 0.0, which was not the account balance, I messed around with the code and the settings on MT4 client but still no luck.
Q0: Can anyone please point me in the right direction?
I am new to automated FOREX trading but from what I understand there is a broker somewhere with a MT4 server and I connect to that server with my MT4 client on my windows machine.
Q1: If this is the case, do I need to make an API work with the server side instead of my client side?
All these DLL's I have tried so far have been used with the MT4 client software on my machine.
I have also been doing some reading on the FIX-Protocol and ZeroMQ.
Q2: Can these help me achieve my goal in any way (instead of creating some bridges between JAVA and MT4 DLL's)?

A0: yes, forget straight about REST and synchronous, blocking chains in FX-trading domain
A1: well, not a typical way. MetaTrader Server is a proprietary suite of systems on the Broker-side and theirs API are not disclosed to allow some 3rd party integrations against.
A2: FIX-Protocol is the industry standard LP-interfacing lingua franca. In case you have contracted relations with your institutional trading provider, incl. the FIX-Protocol GWY-port, this may provide you an A-level access to the Market and to integrate your trading tools against. If this is the case, forget about MT4 instrumentation, as prime-time cadences are far beyond the MT4 Terminal localhost processing architecture ( multiple events with a sub-millisecond TimeDOMAIN resolution are common, whereas MQL4 does not provide any direct support for multithreaded-concurrent / better parallel programme scheduling designs ). FIX-Protocol events are simply off-the picture above, being far left, "before" the graph starts from 1st [ms] column.
ZeroMQ may help liberate your further designs from MQL4 limitations. May like to read my other posts on distributed systems, where MQL4 / ZeroMQ / ML-AI-predictors / GPU-processing infrastructures appear.
Anyway:
Enjoy the Wild Worlds of MQL4/MQL5
Interested? May also like reading other MQL4, ZeroMQ distributed processing and low-latency trading posts

You can try MetaApi https://metaapi.cloud cloud service which provides REST API and WebSocket API access to both MetaTrader 4 and MetaTrader 5 accounts.
Official REST API documentation: https://metaapi.cloud/docs/client
SDKs: https://metaapi.cloud/sdks (javascript, python and Java SDKs are provided as per April 2021)
It supports reading account information, positions, orders, trade history, receiving quotes, and accessing market data.
The service also provides copy trading API https://metaapi.cloud/docs/copyfactory and API to calculate forex trading metrics on a MetaTrader account https://metaapi.cloud/docs/metastats.

I started to code an expert with MQL5, naturally on MT5 platform, and I must admit that the difficulty of managing the application along with the increase of its complexity is high. It's not only due to a missing garbage collector, that of course imposes the deletion of the new instances, but also because Java offers a set of powerful data structures and syntax that MQL5 naturally doesn't have. Last but not least, talking about the community and the third party libraries available, there's a light year of the distance between Java and MQL5. I.e. if I need to find a library for a JSON conversion on the Java side I find dozens of official and stable versions, in the MQL5 community I have found only rubbish that I had to modify myself.
So, after numerous failed tries on coding my expert in MQL5 (not a simple one of course), I decided to adopt a radical approach: coding an application, client-side MQL5, and server-side Java, that provides a Java facade for the MT5 platform. Same API, same basic events and so on. Even though I thought more than once that I was getting stuck in a blind path, I kept coding and eventually, I made it, obtaining a really solid result.
Naturally, the REST interface drastically reduces the performances, and each request, even with Tomcat and MT5 running in the same localhost, is in the order of milliseconds, not micros, but on the other side this reduces only the suitability of this architecture, it doesn't make it useless at all.
Strategies like scalpelling and every kind of high-frequency trading are not good for such kind of scenario, vice-versa every other strategy in the longer period, even if intraday's ones, can be implemented successfully without any cons.
Last but not least, it isn't necessary to use the WebRequest() MQL5 method to call any Servlet container, it is possible to import the wininet.dll from the OS (talking about Windows) and the strategy tester will work as if the strategy has been coded in MQL5, maybe just a little bit slower.
To sum up, I wouldn't be so sarcastic on the Java facade approach for the FX trading platforms, citing only the nude performances without contextualizing the overall scenario is a naive approach to face the argument.

If you need to send/receive synchronous message between MT4 and Java application, REST would be the best approach because fast response matters in this scenario. Message Queue solutions like ZeroMQ fits better in asynchronous solutions, so it won't help you. Once you choose REST approach, you can use MQL4 WebRequest() to call your Java application.

WebRequest isn't the end of the world, you can submit http requests from your EA using API, works even with Strategy Tester.

In order to collect the tick information and open, update or close orders, you can use mt4 server api.
please check this url.
http://mtapi.online/#overlappable-4
Maybe you will find what you want.
And then I have also mt4 server api. If you have any questions please update me.

Simple node discovery method

I'm starting work on a system that will need to discover nodes in a cluster and send those nodes jobs to work on. I know that a myriad of systems exist that solve this but I'm unclear about the complexities of each and which would best suit my specific needs.
Our requirements are that an application should be able to send out job requests. Each request will specify multiple segments of data to work on. Nodes in the cluster should get these job requests and figure out whether the data segments being requested are "convenient". The application will need to keep track of which segments are being worked on by some node and then possibly send out a further requests if there are data segments that it needs to force some nodes to work on (all the nodes have access to all the data, but they should prefer to work on data segments that they have already cached).
This is a very typical map/reduce problem but we don't want to use the standard hadoop solutions because we are trying to avoid the overhead of writing preliminary results to files. This is more of a streaming problem where we want nodes to perform filtering on data that they read and then send it over a network socket to the application that will combine the results from all the nodes.
I've taken a quick look at akka, apache-spark (streaming), storm and just plain simple UPNP and I'm not quite sure which one would suit my needs best. One thing that works against at least spark is that it seems to require ZooKeeper to be set up on the network which is a complication that we'd like to be able to avoid.
Is there any simple library that does something similar to this "auto discover nodes via network multicast" and then allows you to simply send messages back and forth to negotiate which node will handle which data segment? Will Akka be able to help me here? How are nodes added/discovered in a cluster there? Again, we'd like to keep the configuration overhead to a minimum which is why UPNP/SSDP look sort of nice.
Any suggestions for how to use the solutions mentioned above or even other libraries or solutions to look into are very much appreciated.

You could use Akka Clustering: http://doc.akka.io/docs/akka/current/java/cluster-usage.html. However, it doesn't use multicast, it uses a Gossip protocol to handle node up/down messages. You could use a Cluster-Aware Router (see the Akka Clustering doc and http://doc.akka.io/docs/akka/current/java/routing.html) to route your messages to the cluster, there are several different types of routers depending on your needs and what you mean by "convenient". If "convenient" just means which actor is currently free, you can use a Smallest Mailbox router. If it has something to do with the content of the message, you could use a Consistent Hashing router.

See Balancing Workload Across Nodes with Akka 2.
This post describes a work distribution algorithm using Akka. The algorithm doesn't use multicast to discover workers. There is a well-known master address and the workers register with the master. Other than that though it fits your requirements well.
Another variation on it is described in Akka Work Pulling Pattern.
I've used this pattern in a number of projects - it works great.

Storm is fairly resilient when it comes to worker-nodes coming offline & online. However, just like Spark, it does require Zookeeper.
The good news is that Storm is comes with a sister project to make deployment a breeze: https://github.com/nathanmarz/storm-deploy/wiki
If you're running vanilla storm on EC2, the storm-deploy project could be what you're looking for.

Inter-process-communication between a Java application and a local server

Firstly Cheers to all PROGRAMMERS [ Today = Programmers day :) ]
Secondly,
I'm working on a project where the specifications require using a server as a front end and an application in the back end. The project is an advanced smart home system. The server will handle commands coming from the client through the internet (let's say like a remote control from outside the house) and send them (through a channel of communication) to the application (planning on using JAVA application) which will handle the main logic like controlling hardware stuff (lights ...) , reading from a microphone (local mic) and accessing a database to act as a speech recognition system (offline).
Now I'm still in the planning phase and I'm not sure which technologies are the best for this project. I'm thinking to use Node.js or Apache as the server and a JAVA application as the back end and any SQL database for the application's SRS.
I hope this illustration demonstrates clearly how the system works:
The main question is:
What is the best way to make the Java application communicate with the server (communication channel [must be bidirectional]) ?
and Do you recommend a specific server other than the mentioned ones for this job ?
What crossed my mind so far:
1- JSP and servlets (making the server is the application too). But I don't want a server to handle the offline stuff and I'm not sure if java servlets can access hardware interface. I also want the server to be separate from making critical decisions (different layer for security reasons and since it won't be used as frequently as the local [offline] system).
2- Communication channel :
A- A shared file, but it's a bad idea since I don't want the application to check if the file contents changed (command received) or not from time to time (excessive operations).
B- A an inter-process-communication through a port (socket communication) seems the best solution but I don't know how that would turn in terms of operation cost and communication errors.
OS used : Linux Raspbian
EDIT:
I'm sure ZMQ+Apache is good enough for this task, but how is it in comparison to WebServices (like SOAP) ? Would WebServices be a better solution in terms of standard implementation and security ?
All related suggestions are welcomed, TQ

ZeroMQ is great for internal communications, or any other similar communication solutions.
For specifically your case, I can see that ZeroMQ would be a best fit.
Reasons:
You offline server have to be agnostic to web solution.
Communication can be reliable and bi-directional, possibly another patterns like (pub>sub, req<>res, etc).
Restarting any of sides would not require to restart the sockets (connection) on other side, as messages are queued.
Possibility to scale not just on same hardware, but as well to local area network or even through internet.
Big community of support. It might look a bit hard to get into, but in reality it is dead simple, just go to examples and once concept understood - it is very easy and neat to work with.
ZeroMQ has lots drivers for most popular languages, that includes Java and Node.js.
Considerations:
You need to think over packets and data will be sent. So some popular data protocols like XML or JSON is good way of thinking.
Responsibilities over different services - make sure they are not dependant on each other too much. Or if main offline server - is a core of system, make sure it does not depend on web facing service, so that web face can be removed/replaced/improved etc.
Few more points to think about:
Why Java, and what about modular approach? For example if you want to expand and scale - add more sensors into smart home solutions, then having one giant application would require to change it, it is harder to maintain as well as maintain different clients with own needs. Think modular way - some core functionality for offline stuff, but many aggregator processes that would talk to different sensors. This makes easier to support different setups and environments, as well maintain the system as a whole by improving independent components.

GAE/GWT server side data inconsistent / not persisting between instances

I'm writing a game app on GAE with GWT/Java and am having a issues with server-side persistent data.
Players are polling using RPC for active games and game states, all being stores on the server. Sometimes client polling fails to find game instances that I know should exist. This only happens when I deploy to google appspot, locally everything is fine.
I understand this could be to do with how appspot is a clouded service and that it can spawn and use a new instance of my servlet at any point, and the existing data is not persisting between instances.
Single games only last a minute or two and data will change rapidly, (multiple times a second) so what is the best way to ensure that RPC calls to different instances will use the same server-side data?
I have had a look at the DataStore API and it seems to be database like storage which i'm guessing will be way too slow for what I need. Also Memcache can be flushed at any point so that's not useful.
What am I missing here?

You have two issues here: persisting data between requests and polling data from clients.
When you have a distributed servlet environment (such as GAE) you can not make request to one instance, save data to memory and expect that data is available on other instances. This is true for GAE and any other servlet environment where you have multiple servers.
So to you need to save data to some shared storage: Datastore is costly, persistent, reliable and slow. Memcache is fast, free, but non-reliable. Usually we use a combination of both. Some libraries even transparently combine both: NDB, objectify.
On GAE there is also a third option to have semi-persisted shared data: backends. Those are always-on instances, where you control startup/shutdown.
Data polling: if you have multiple clients waiting for updates, it's best not to use polling. Polling will make a lot of unnecessary requests (data did not change on server) and there will still be a minimum delay (since you poll at some interval). Instead of polling you use push via Channel API. There are even GWT libs for it: gwt-gae-channel, gwt-channel-api.

Short answer: You did not design your game to run on App Engine.
You sound like you've already answered your own question. You understand that data is not persisted across instances. The two mechanisms for persisting data on the server side are memcache and the datastore, but you also understand the limitations of these. You need to architect your game around this.
If you're not using memcache or the datastore, how are you persisting your data (my best guess is that you aren't actually persisting it). From the vague details, you have not architected your game to be able to run across multiple instances, which is essential for any app running on App Engine. It's a basic design principle that you don't know which instance any HTTP request will hit. You have to rearchitect to use the datastore + memcache.
If you want to use a single server, you can use backends, which behave like single servers that stick around (if you limit it to one instance). Frankly though, because of the cost, you're better off with Amazon or Rackspace if you go this route. You will also have to deal with scaling on your own - ie if a game is running on a particular server instance, you need to build a way such that playing the game consistently hits that instance.

Remember you can deploy GWT applications without GAE, see this explanation:
https://developers.google.com/web-toolkit/doc/latest/DevGuideServerCommunication#DevGuideRPCDeployment
You may want to ask yourself: Will your application ever NEED multiple server instances or GAE-specific features?
If so, then I agree with Peter Knego's reply regarding memcache etc.
If not, then you might be able to work around your problem by choosing a different hosting option (other than GAE). Particularly one that lets you work with just a single instance. You could then indeed simply manage all your game data in server memory, like I understand you have been doing so far.
If this solution suits your purpose, then all you need to do is find a suitable hosting provider. This may well be a cloud-based PaaS offer, provided that they let you put a hard limit (unlike with GAE) on the number of server instances, and that it goes as low as one. For example, Heroku (currently) lets you do that, as far as I understand, and apparently it's suitable for GWT applications, according to this thread:
https://stackoverflow.com/a/8583493/2237986
Note that the above solution involves a bit of fiddling and I don't know your needs well enough to make a strong recommendation. There may be easier and better solutions for what you're trying to do. In particular, have a look at non-cloud-based hosting options and server architectures that are optimized for highly time-critical, real-time multiplayer gaming.
Hope this helps! Keep us posted on your progress.

How to route to the nearest RMI Server?

In continuation to my question How to improve the performance of client server architecture application
I have decided to maintain a centralized database and several slave server-database configuration. I plan to use Symmetric DS for replicating between the slave and master database. Each server-database configuration would be installed closer to the client. Ideally I want the request from a client to route to the nearest slave server-database for the obvious reason. Since I'm using RMI to connect to the server, I want to know if there is any product/API currently available, which would solve this?
Any other solution than the above one is highly regarded :)
Note: Refactoring the client code is definitely one alternative but since the application is very huge, its a huge risk (can break existing code), time taking & expensive.

Take a look at distributed and consistent hashing:
http://en.wikipedia.org/wiki/Distributed_hash_table#Keyspace_partitioning
http://en.wikipedia.org/wiki/Consistent_hashing
Barebones, you would setup a variant of consistent hashing that would take the identifier of the client (in lieu of the 'key') and locate the nearest server. Bonus benefit here is that if one of the slaves goes down, your infrastructure will transparently route to the next nearest server.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.