Anomaly detection in java application

Anomaly detection in java application - java

What i'm trying to do is to integrate anomaly detection module in the existing java application to allow user to chose from different algorithms and forecasting models
Egads library looks quite optimistic, but I'm not sure whether it fits my purposes, in case new data coming in should I store and update the existing model or to pass the whole data once again. Also what if I would like to forecast only 15 min time window, by passing only 15 min data in the results won't be precise for sure.
Might be there're any other useful techniques and someone could share his experience of similar tasks. Unfortunately can't find any other java libs for that purposes.

What I found out is that we can't store the model trained initially and apply it against any incoming data, as soon as the initial time series is changed the exception is thrown. That's why the only possible option here to to train the model every time new data comes in, fortunately it doesn't have great performance impact on our system yet.
Library itself looks fine and could be used as a base in building anomaly detection systems, but its still not that flexible as it Python competitors, however it's open sourced and can be modified anytime depending on your needs.

Related

Python requests vs java webservices

I have a legacy web application sys-1 written in cgi that currently uses a TCP socket connection to communicated with another system sys-2. Sys-1 sends out the data in the form a unix string. Now sys-2 is upgrading to java web service which in turn requires us to upgrade. Is there any way to upgrade involving minimal changes to the existing legacy code. I am contemplating the creating of a code block which gets the output of Sys-1 and changes it into a format required by Sys-2 and vice versa.
While researching, I found two ways of doing this:
By using the "requests" library in python.
Go with the java webservices.
I am new to Java web services and have some knowledge in python. Can anyone advise if this method works and which is a better way to opt from a performance and maintenance point of view? Any new suggestions are also welcome!

Is there any way to upgrade involving minimal changes to the existing legacy code.
The solution mentioned, adding a conversion layer outside of the application, would have the least impact on the existing code base (in that it does not change the existing code base).
Can anyone advise if this method works
Would writing a Legacy-System-2 to Modern-System-2 converter work? Yes. You could write this in any language you feel comfortable in. Web Services are Web Services, it matters not what they are implemented in. Same with TCP sockets.
better way to opt from a performance
How important is performance? If this is used once in a blue moon then who cares. Adding a box between services will make the communication between services slower. If implemented well and running close to either System 1 or System 2 likely not much slower.
maintenance point of view?
Adding additional infrastructure adds complexity thus more problems with maintenance. It also adds a new chunk of code to maintain, and if System 1 needs to use System 2 in a new way you have two lots of code to maintain (Legacy System 1 and Legacy/Modern converter).
Any new suggestions are also welcome!
How bad is legacy? Could you rip the System-1-to-System-2 code out into some nice interfaces that you could update to use Modern System 2 without too much pain? Long term this would have a lower overall cost, but would have a (potentially significantly) larger upfront cost. So you have to make a decision on what, for your organisation, is more important. Time To Market or Long Term Maintenance. No one but your organisation can answer that.

Maybe you could add a man-in-the-middle. A socket server who gets the unix strings, parse them into a sys-2 type of message and send it to sys-2. That could be an option to not re-write all calls between the two systems.

changing 3d model at run time

I am new in 3D modelling, I am not expecting exact ans but any hint, link or direction will good for me. I am working in java and i have to work with 3D model now. So its like, I will make a model (using XYZ software) and define some parts of it lets say part1, part2. Then at run time on browser user have a drop down to select the part and then he give some value , now i want to change the 3d model according to that value of that part (that could be length,width etc) and user should be able to see that on browser.
First, Is it possible, changing dimension at runtime (I can make some restriction also)?
Second , Any hint , library or logic direction will be great help.
I can go for any language , any software now as because I am about to start from scratch.
If I haven't made my question clear, do let me know.
One solution I can think of is for different possibilities ,I should have different model already in background, and load that one , whose dimensions matches with user selection.
OR any other suggestions ??

Your question is quite unclear in that you say that you are using Java but can go for any language...
I am making the assumption here that you are indeed using Java. There are several libraries available for Java that support "rendering" of 3d objects. The level of abstraction depends on the individual library/framework.
Some example of what are referred to as low-level libraries:
Java 3D, (OpenGL wrappers)lwjgl, JOGL.
Some higher level frameworks:
JMonkeyEngine, Andor3D, Ogre4J.
I only list a few as an exhaustive list is not what SO is for, nor am I going to compare them for the same reason.
If you want to code the loading of assets, how they render, how they are stored at runtime, how they should be rendered then you will probably be going with the lower level ones. Going with a framework means you care less about what low level graphics library you use rather what the framework can offer.
Your second set of questions really deal with details that you would only need to worry about after the initial choice. If you have model loading code then you can just load a new model triggered by an event.
Edit: Only caught that you said run in browser, if that is a requirement then that complicates things. You would probably need to go another route, perhaps WebGL and javascript, or three.js.
Again without a more specific question, can't really give a more concrete answer.
Edit : Reuest per comment:
Although flash support is being phased out of web-browsers there are several libraries available. papervision, unity3d-web, away 3d. I have away3d myself on a small project for a simulation visualisation project. It was pretty easy to use, but for what I needed perfomance was not good, but then again I would not recommend using flash for 3d either. That said it has probably improved since then.

Loading facebook's big text file to memory (39MB) for autocompletion

I'm trying to implement part of the facebook ads api, the auto complete function ads.getAutoCompleteData
Basically, Facebook supplies this 39MB file which updated weekly, and which contains targeting ads data including colleges, college majors, workplaces, locales, countries, regions and cities.
Our application needs to access all of those objects and supply auto completion using this file's data.
I'm thinking of preferred ways to solved this. I was thinking about one of the following options:
Loading it to memory using Trie (Patricia-trie), the disadvantage of course that it will take too much memory on the server.
Using a dedicated search platform such as Solr on a different machine, the disadvantage is perhaps over-engineering (Though the file size will probably increase largely in the future).
(Fill here cool, easy and speed of light option) ?
Well, what do you think?

I would stick with a service oriented architecture (especially if the product is supposed to handle high volumes) and go with Solr. That being said, 39 MB is not a lot of hold in memory if it's going to be a singleton. With indexes and all this will get up to what? 400MB? This of course depends on what your product does and what kind of hardware you wish to run it on.
I would go with Solr or write your own service that reads the file into a fast DB like MySQL's MyISAM table (or even in-memory table) and use mysql's text search feature to serve up results. Barring that I would try to use Solr as a service.
The benefit of writing my own service is that I know what is going on, the down side is that it'll be no where as powerful as Solr. However I suspect writing my own service will take less time to implement.
Consider writing your own service that serves up request in a async manner (if your product is a website then using ajax). The trouble with Solr or Lucene is that if you get stuck, there is not a lot of help out there.
Just my 2 cents.

Implementing a Dynamic Award System

I have been developing a Online Poker Game. But I keep hitting a wall. I want to implement Awards into the system, but I want them to be dynamic. Meaning I don't want to recompile for every award I would like to add.
I have thought about using Python code for each award. Then when the server checks to see if the user qualifies for the award it runs the python script with Jython (server is in Java and Netty NIO) and if the function returns a certain value I award the award to the user. Which could work but is there maybe a more efficient technique out there that will not force me to run hundreds of python scripts each time I need to check if a user got a award.
And when are the best times to do these checks? I have tought about a hook system where I will specify the hooks like ( [onconnect][ondisconnect][chatmessage.received] ). Which also could work but feels a bit crude and I will still have to run all the scripts from the database.

If I were you, I'd have a totally separate process that grants awards. It runs perhaps once a day on the underlying database that contains all your player/game data.
Your core customer-facing app knows about awards, but all it knows about them is data it loads from the DB -- something like a title, image, description, maybe how many people have the award, etc., and (based on DB tables) who has won the award.
Your "award granter" process simply runs in batch mode, once per day / hour etc, and grants new awards to eligible players. Then the core customer-facing app notifies them but doesn't actually have to know the smarts of how to grant them. This gives you the freedom to recompile and re-run your award granter any time you want with no core app impact.
Another approach, depending on how constrained your awards are, would be to write a simple rules interface that allows you to define rules in data. That would be ideal to achieve what you describe, but it's quite a bit of work for not much reward, in my opinion.
PS -- in running something like an online poker server, you're going to run into versions of this problem all the time. You are absolutely going to need to develop a way to deploy new code without killing your service or having a downtime window. Working around a java-centric code solution for awards is not going to solve that problem for you in the long run. You should look into the literature on running true 24/7 services, there are quite a few ways to address the issue and it's actually not that difficult these days.

There are a number of options I can think of:
OSGi as described above - it comes at a cost, but is probably the most generic and dynamic solution out there
If you're open to restart (just not recompile), a collection of jars in a well known folder and spring give you a cheaper but equally generic solution. Just have your award beans implement a standard interface, be beans, and let spring figure #Autowire all the available awards into your checker.
If you award execution is fairly standard, and the only variation between awards are the rules themselves, you can have some kind of scripted configuration. Many options there, from the python you described (except I'd go for a few big script managing all awards), to basic regular expressions, with LUA and Drools in the middle. In all cases you're looking at some kind of rules engine architecture, which is flexible in term of what the award can trigger on but doesn't offer much flexibility in term of what an award can lead to (i.e. perfect for achievements).

Some comments to the answer with batch ideas:
Implementing a Dynamic Award System
That batch processes can be on separate server/machine, so you can recompile the app or restart the server at any time. Having that new awards can be handled using for example the mentioned approach with adding jars and restarting the server, also new batch jobs can be introduced at any time and so on. So your core application is running 99% of the time, batch server can be restarted frequently. So separate batch machines is good to have.
When you need to deploy new version of core app I think you can just stop, deploy and start it with maintenance notice to users. That approach is used even by top poker rooms with great software (e.g. FullTiltPoker did so, right now it is down due to the license loss, but their site says 'System Update' :) ).
So one approach for versions update is to redeploy/restart in off-hours.
Another approach is real time updates. As a rule it is done by migrating users bunch by bunch to new version. So at the same time some users are using old version, some - new. Not pretty cool for poker software were users with different versions can interact. But if you are sure in the versions 'compatibility' you can go with that approach, checking user's client version for example on login.
In my answer I tried to say that you need not introduce 24/7 support logic to your code. Leave that system availability problems for hardware (failovers, loadbalancers, etc.). You can follow any good techniques you used to write code, only you need to remember that your crucial core logic is deployed not frequently (for example once a week), and batch part can be updated/restarted at any time if needed.

As far as I understand you, you probably do not have to run external processes from your application nor use OSGI.
Just create a simple Java interface and implement each plugin ('award') as a class implementing the interface. You could then just simply compile any new plugin and load it in as a class file from your application at run-time using Class.forName(String className).
Any logic you need from such a plugin would be contained in methods on the interface.
http://download.oracle.com/javase/1,5.0/docs/api/java/lang/Class.html#forName(java.lang.String)

Experiences with "language converters"?

I have read a few articles mentioning converters from one language to another.
I'm a bit more than skeptical about the use of such kind of tools. Does anyone know or have experiences let's say about Visual Basic to Java or vs converters? Just one example to pick
http://www.tvobjects.com/products/products.html, claims to be the "world leader" or so in that aspect, However if read this:
http://dev.mysql.com/tech-resources/articles/active-grid.html
There the author states:
"The consensus of MySQL users is that automated conversion tools for MS Access do not work. For example, tools that translate existing Access applications to Java often result in 80% complete solutions where finishing the last 20% of the work takes longer than starting from scratch."
Well we know we need 80% of the time to implement the first 80% functionality and another 80% of the time for the other 20 %....
So has anyone tried such tools and found them to be worthwhile?

Tried? No, actually built (more than one) language convertor.
Here's one I (and my coworkers) built for the B2 Spirit Stealth Bomber to convert the mission software, coded in a legacy language, JOVIAL, into maintainable C code, with 100% automated conversion. One of the requirements was that we were NOT allowed to see the actual source code. No joke.
You are right: if you get only a medium high conversion rate (e.g., 70-80%), the effort to finish the conversion is still very significant if indeed you can do it at all. We target 95%+ and do better when told to try harder as was the case for the B2. The only reason people accept medium high rate converters is because they can't find (or won't fund!) a better one, insist on starting now, and accept the fact that converting it this way may be painful (usually they don't know how much) but is in fact less painful than rebuilding it from scratch. (I happen to agree with this assessment: in general, projects that try to recode a large system from scratch usually fail and conversions using medium high conversion rate tools don't have as high a failure rate.)
There are lots of bad conversion tools out there, something slapped together with a mountain of PERL code doing regexes on text strings, or some YACC-based parser with code generation essentially one-to-one for each statement in the compilation unit. The former are built by people who had a conversion dropped on them out of the sky. The latter are often built by well-intentioned engineers that don't have decent compiler background.
For a singularly bad example, see my response to this SO question about COBOL migration: Experience migrating legacy Cobol/PL1 to Java, which is exactly a direct statement translator... producing the stuff that gave rise to the term "JOBOL".
To get such high-accuracy conversion rates, you need high-quality parsers, and means to build high-quality translation rules that preserve semantics, and optimize for target-language properties and special cases. In essence, you need what amounts to configurable compiler technology. The reason we succeed, IMHO, is our DMS Software Reengineering Toolkit, which was designed to do this job. (I'm the architect; check out my SO icon/bio).
Lots of careful testing helps, too.
DMS "knows" what the compiler knows about code, by virtue of having a compiler-like front end for the language of interest, and having the ability to build ASTs, symbol tables, control and data flows, call graphs. It uses much of the compiler technology that the compiler community spent the last half-century inventing, because that stuff has been proven to be useful in translation!
DMS knows more than most compilers know, because it can read/analyze/transform the entire application at once; most compilers stick to single compilation units. Thus one can code translation rules that depend on the entire application as opposed to just the current statement. We often add problem- or application-specific knowledge to improve the translation. This often shows up when converting special features of a language, or calls on libraries, where one must recognize the library calls as special idioms, and translate them to calls on compositions of target libraries and language constructs.
This capability is used to build translators (e.g., the JOVIAL translator), or domain-specific code generators.
More often we build complex automated software engineering tools that solve problems specific to customers, such as program analysis tools (dead code, duplicate code, style-broken code, metrics, architecture extraction, ...), and mass change tools (platform [not langauge] migrations, data layer insertion, API replacement, ...)

It seems to me, as is almost always the case with MS-ACCESS questions having tags that attract the wider StackOverflow population, that the people answering are missing the key question here, which I read as:
Are there any tools that can successfully convert an Access application to any other platform?
And the answer is
ABSOLUTELY NOT
The reason for that is simply that tools in the same family that use similar models for the UI objects (e.g., VB6) lack so many things that Access provides by default (how do you convert an Access continuous subform to VB6 and not lose functionality?). And other platforms don't even share the same core model as VB6 and Access, so those have even more hurdles to clear.
The cited MySQL article is quite interesting, but it really confuses the problems that come with incompetently-developed apps vs. the problems that come with the development tools being used. A bad data schema is not inherent to Access -- it's inherent to [most] novice database users. But the articles seems to attribute this problem to Access.
And entirely overlooks the possibility of fixing the schema, upsizing it to MySQL and keeping the front end in Access, which is by far the easiest approach to the problem.
This is exactly what I expect from people who just don't get Access -- they don't even consider that Access as front end to a securable, large-capacity server database engine can be a superior solution to the problem.
That article doesn't even really consider conversion of an Access app, and there's good reason for that. All the tools that I've seen that claim to convert Access applications (to whatever platform) either convert nothing but data (in which case they don't convert the app at all -- morons!), or convert the front end structure slavishly, with a 1:1 correspondence between UI objects in the Access application and in the target app.
This doesn't work.
Access's application design is specific to itself, and other platforms don't support the same set of features. Thus, there has to be translation of Access features into a working substitute for the original feature in the converted application. This is not something that can be done in an automated fashion, in my opinion.
Secondly, when contemplating converting an Access app for deployment in the web browser, the whole application model is different, i.e., from stateful to stateless, and so it's not just a matter of a few Access features that are unsupported, but of a completely different fundamental model of how the UI objects interact with the data. Perhaps a 100% unbound Access app could be relatively easily be converted to a browser-based implementation, but how many of those are there? It would mean an Access app that uses no subforms whatsoever (since they can't be unbound), and an app that uses only a handful of events from the rich event model (most of which work only with bound forms/controls). In short, a 100% unbound Access app would be one that fights against the whole Access development paradigm. Anyone who thinks they want to build an unbound app in Access really shouldn't be using Access in the first place, as the whole point of Access is the bound forms/controls! If you eliminate that, you've thrown out the majority of Access's RAD advantage over other development platforms, and gained almost nothing in return (other than enormous code complexity).
To build an app for deployment in the web browser that accomplishes the same tasks as an Access applications requires from-the-ground-up redesign of the application UI and workflow. There is no conversion or translation that will work because the successful Access application model is antithetical to the successful web application model.
Of course, all of this changes with Access 2010 and Sharepoint Server 2010 with Access Services. In that case, you can build your app in Access (using web objects) and deploy on Sharepoint for users to run it in the browser. The results are functionally 100% equivalent (and 90% visually), and run on all browsers (no IE-specific dependencies here).
So, starting this June, the cheapest way to convert an Access app for deployment in the browser may very well be to upgrade to A2010, convert the design to use all web objects, and then deploy with Sharepoint. That's not a trivial project, as Access web objects have a limited set of features in comparison to client objects (and no VBA, for instance, so you have to learn the new macros, which are much more powerful and safe than the old ones, so that's not the terrible hardship it may seem for those familiar with Access's legacy macros), but it would likely be much less work than a full-scale redesign for deployment on the web.
The other thing is that it won't require any retraining for end users (insofar as the web-object version is the same as the original client version), as it will be the same in the Access client as in the web browser.
So, in short, I'd say conversion is a chimera, and almost always not worth the effort. I'm agreeing with the cited sentiment, in fact (even if I have a lot of problems with the other comments from that source). But I'd also caution that the desire for conversion is often misguided and misses out on cheaper, easier and better solutions that don't require wholesale replacement of the Access app from top to bottom. Very often the dissatisfaction with Jet/ACE as data store confuses people into thinking they have to replace the Access application as well. And it's true that many user-developed Access apps are filled with terrible, unmaintainable compromises and are held together with chewing gum and bailing wire. But a badly-designed Access application can be improved in conjunction with the back-end upsizing andrevision of the data schema -- it doesn't have to be discarded.
That doesn't mean it's easy -- it's very often not. As I tell clients all the time, it's usually easier to build a new house than to remodel an old one. But one of the reasons we remodel old houses is because they have irreplaceable characteristics that we don't want to lose. It's very often the case that an Access app implicitly includes a lot of business rules and modelling of workflows that should not be lost in a new app (the old Netscape conundrum, pace Joel Spolsky). These things may not be obvious to the outside developer trying to port to a different platform, but for the end user, if the app produces results that are off by a penny in comparison to the old app, they'll be unhappy (and probably should be, since it may mean that other aspects of the app are not producing reliable results, either).
Anyway, I've rambled on for too long, but my opinion is that conversion never works except for the most trivial apps (or for ones that were designed to be converted, e.g., a 100% unbound Access app). I'm all for revision in place of replacment.
But, of course, that's how I make my living, i.e., fixing Access apps.

A couple of issues that effect the success or failure of cross-language conversion are the relative semantic richness of the languages, and their semantic models.
Translation from C++ to C should be relatively easy, but translation of C to idiomatic C++ would be next to impossible because that would be next to impossible to automatically turn a procedural program into an OO program.
Translation of Java to C would be relatively simple, though handling storage management would be messy. Translation of C into Java would be next to impossible if the C program did funky pointer arithmetic or casting between integers and different kinds of pointer.
Translation of a functional language to an imperative language would be much easy though the result would probably be inefficient, an non-idiomatic. Translation of an imperative language to a functional language is probably beyond the state of the art .... unless you implement an interpreter for the imperative language in the functional language.
What this means is that some translators are necessarily going to be more successful than others in terms of:
completeness and accuracy of translation, and
readability and maintainability of the resulting code.

Things You Should Never Do, Part I by Joel Spolsky
"....They did it by making the single worst strategic mistake that any software company can make:
They decided to rewrite the code from scratch."
I have a list of MS Access converters on my website. I've never heard anything good about any of them in any postings in the Access related newsgroups I read on a daily basis. And I read a lot of postings on a daily basis.
Also note that there is a significant amount of functionality in Access, such as bound continuous forms or subforms, that is more work to reproduce in other systems. Not necessarily a lot of work but more work. And more troubles when it comes time to distribute and install the app.

I've used an automated converter from C# to Visual Basic.NET. It worked pretty well except for adding some unnecessary If True statements.
I've also attempted to use Shed Skin to convert Python-to-C++, but it didn't work because of its lack of support for new-style division.

I've used tools for converting a VB6 Project into VB.Net - which you would hope would be perhaps one of the simpler examples of this sort of thing. My experience was that everything had to be checked, in fine detail, and half the stuff was missing / wrong.
Certainly I would recommend a migration by hand, or depending on the language you're targetting, I would consider a complete rewrite if this gives you a chance to make major improvements to your codebase.
Martin

I have only tried free and basic paid for converters. But the main problem is that it is very very hard to have confidence that the conversion is entirely successful.
Usually they are best used to hand convert code section at a time, where you review each piece of code. Often in my experience a rewrite instead of a conversion turns out to be a better option.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.