In many languages (for me specifically, Java and C++) there is a massive standard library. Many classic problems in computer science, such as searching, sorting, and hashing, are implemented in this library. My question is: are there any benefits to, say, implementing one's own algorithm versus simply using the library's version? Are there any particular instances where this would be true?
I only ask because in school a huge amount of time is spent on, say, sorting, yet in my actual code I have found no reason to use this knowledge, since people have already implemented and optimized sorting algorithms in both Java and C++.
EDIT: I discussed this at length with a professor I know and have posted his response below. Can anyone think of more to add to it?
Most of the time, the stock library functions will be more performant than anything you'll custom code.
If you have a highly specific (as opposed to a generic) problem, you may find a performance gain by coding a specialized function, but as a developer you should make a conscious effort to not "reinvent the wheel."
Sorting is a good example to consider. If you know nothing whatsoever about the data to be sorted, except how to compare elements, then the standard sort algorithms fare well. In this situation, in C++, the STL sort will do fine.
But sometimes you know more about your data. For example, if your data consists of uniformly distributed numbers, a radix sort can be much faster. But radix sort is 'invasive' in the sense that it needs to know more about your data than simply whether one number is bigger than another. That makes it harder to write a generic interface that can be shared by everyone. So STL lacks radix sort and for this case you can do better by writing your own code.
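To make the contrast concrete, here is a minimal sketch in Java (rather than C++/STL) of an LSD radix sort for non-negative ints alongside the generic library sort. The data is hypothetical; the point is only that the radix version exploits the knowledge that the keys are fixed-width integers, which a purely comparison-based sort cannot use.

import java.util.Arrays;

public class RadixVsLibrarySort {

    // LSD radix sort on non-negative ints: a constant number of O(n) counting passes,
    // one byte (256 buckets) at a time, instead of O(n log n) generic comparisons.
    static void radixSort(int[] a) {
        int[] buffer = new int[a.length];
        for (int shift = 0; shift < 32; shift += 8) {
            int[] count = new int[257];
            for (int v : a) {
                count[((v >>> shift) & 0xFF) + 1]++;
            }
            for (int i = 0; i < 256; i++) {
                count[i + 1] += count[i];                    // prefix sums: bucket start positions
            }
            for (int v : a) {
                buffer[count[(v >>> shift) & 0xFF]++] = v;   // stable placement
            }
            System.arraycopy(buffer, 0, a, 0, a.length);
        }
    }

    public static void main(String[] args) {
        int[] data = {170, 45, 75, 90, 802, 24, 2, 66};      // hypothetical input
        int[] copy = data.clone();

        radixSort(data);        // uses extra knowledge: fixed-width, non-negative keys
        Arrays.sort(copy);      // generic library sort, comparisons only

        System.out.println(Arrays.toString(data));
        System.out.println(Arrays.toString(copy));
    }
}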
In general, standard libraries contain very fast code for very general problems. If you have a specific problem, you can in many cases do better than the library. Of course, you may eventually come across a complex problem which is not solved by a library, in which case the knowledge you have gained from studying solutions to solved problems could prove invaluable.
In college, or school, or if you are learning as a recreational programmer, you will be (or, in my strident opinion, should be) encouraged to implement a subset of these things yourself. Why? To learn. Tackling the implementation of an important, already-invented wheel (the B-tree) was, for me, one of the most formative experiences of my time in college.
Sure, I would agree that as a developer you should make an effort not to reinvent the wheel, but when learning through formative experiences, different rules apply. I read somewhere else on this forum that to use something at abstraction level N, it is a very good idea to have a working knowledge of abstraction level N-1 and to be familiar with level N-2. I would agree. In addition to being formative, it prepares you for the day when you do encounter a problem for which the stock libraries are not a good fit. Believe me, this can happen in a 50-year career. If you are learning fundamentals such as data structures, where the end goal is not the completeness of your finished product but, instead, self-improvement, it is time well spent to "reinvent the wheel".
Is pre-algebra/algebra/trigonometry/calculus worth learning?
I can't tell if this is an "am I wasting my time/money in school?" question or a sincere question about whether your own version is going to be better.
As for wasting your time/money in school: If all you want to do is take pot shots at developing a useful application, then you're absolutely wasting your time by learning about these already-implemented algorithms -- you just need to kludge something together that works good 'nuff.
On the other hand if you're trying to make something that really matters, needs to be fast, and needs to be the right tool for the right job -- well, then it often doesn't exist already and you'll be back at some site like Stack Overflow asking first or second year computer science questions because you're not familiar enough with existing techniques to roll your own variations.
Depending on my job, I've been on both sides: do I need to develop it fast, or does it have to work well? For fast application programming, it's stock functions galore unless there's a performance or functionality hindrance I absolutely must resolve. For professional game programming it has to run blazing fast. That's when the real knowledge kicks in: memory management, IO access optimization, computational geometry, low-level and algorithmic optimization, and all sorts of clever fun. And it's rarely ever a stock implementation that gets the job done.
And did I learn most of that in school? No, because I already knew most of it, but the degrees helped without a doubt. On the other hand, you don't know most of it (otherwise you wouldn't be asking), so yes, in short: it is worthwhile.
Some specific examples:
If you ever want to make truly amazing games, live and breathe algorithms so you can code what other people can't. If you want to make fun games that aren't particularly amazing, use stock code and focus on design. It's limiting, but it's faster development.
If you want to program embedded devices (a rather large market), often stock code just won't do. Often there's a code or data memory constraint that the library implementations won't satisfy.
If you need serious server performance from modest hardware, stock code won't do. (See this Slashdot entry.)
If you ever want to do any interesting phone development the resource crunch requires you to get clever, even often for "boring" applications. (User experience is everything, and the stock sort function on a large section of data is often just too slow.)
Often the libraries you're restricted to using don't do what you need. (For example, C#'s built-in Array.Sort is not a stable sort. I run into this annoyance all the time and have since written my own solution; a sketch of the usual workaround appears after this list.)
If you're dealing with large amounts of data (most businesses have it these days) you'll end up running into situations where an interface is too slow and needs some clever workarounds, often involving good use of custom data structures.
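Regarding the stable-sort item above: Java's Arrays.sort for objects happens to be stable already (it uses TimSort), so the following is only an illustrative sketch of the usual workaround when a platform's sort is not stable: carry the original position along and use it as a comparison tiebreaker. The Trade type and data are hypothetical.

import java.util.Arrays;
import java.util.Comparator;

public class StableSortWorkaround {

    // Hypothetical record: 'arrivalOrder' is the original position we want preserved
    // among trades with the same symbol.
    record Trade(String symbol, long arrivalOrder) {}

    public static void main(String[] args) {
        Trade[] trades = {
            new Trade("GOOG", 0), new Trade("AAPL", 1),
            new Trade("GOOG", 2), new Trade("AAPL", 3),
        };

        // Sort by the real key first, then by original position as a tiebreaker;
        // this forces stable behaviour out of any correct (even unstable) sort.
        Arrays.sort(trades, Comparator
                .comparing(Trade::symbol)
                .thenComparingLong(Trade::arrivalOrder));

        System.out.println(Arrays.toString(trades));
    }
}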
Those libraries offer you tested implementations that work well, so the rule of thumb is to use them. If you have a very particular or complex problem where you can apply some domain knowledge, you have a case where you will need to implement your own version of an algorithm.
I remember an example Bill Pugh gave in his programming languages class: they analyzed the performance of a complex application and realized that a faulty custom implementation of a sorting algorithm by a programmer (code that was used many times in real runs of the application) was responsible for a 90% performance decrease!
After discussing this at length with a professor of Computer Science, here are his opinions:
Reasons to Use Libraries
1. You are writing code with a deadline.
There is no sense in hampering your ability to complete a project in a quick and timely manner. That's why libraries are written, after all: to save time and avoid "reinventing the wheel."
2. If you want to optimize your code fully.
Chances are the team of incredibly talented people who wrote the algorithm in Java's or C++'s (or whoever's) library did a far better job at optimizing their algorithm for that language, in however long it took them, than you can possibly do in an hour or two. Or four.
3. You've already solved this problem before.
If you have already solved this problem and have a good, complete understanding of how it is designed, you don't need to labor over a complex solution, as you don't stand to gain much benefit.
That being said, there are still many reasons to make your own solution.
Reasons to Do It Yourself
1. A fundamental understanding of problem-solving techniques and algorithms is completely necessary once you reach a problem that is better optimized by a non-library solution.
Highly specialized problems often come up when working with networking, gaming, and the like. It becomes invaluable to be able to spot situations in which a specific algorithm will outperform the library's version.
2. Having a very good understanding of algorithms and their design and use makes you much more valuable in the work place.
Any halfway decent programmer can write a function to compare two objects and then toss them into a library function, but the one who is able to spot a situation and ultimately improve the program's functionality and speed is going to be looked upon well by management.
3. Having the concept of how to do something is often just as, if not more so, valuable than being able to do it.
With an outstanding knowledge of Java's libraries and how to use them, chances are you can field any problem in Java with reasonable success. But when you get hired to work in Erlang, you're going to have some rough times ahead. Whereas if you had understood how Java's libraries work, and not merely what they do, you could carry those ideas over to any language.
4. We as programmers are never truly satisfied with merely having something "work".
Chances are that you have an itch to understand why things work. It was this curiosity that probably drove you to this area of study. Don't deny this curiosity! Encourage it and learn to your heart's content.
5. Finally, there is a huge feeling of success and accomplishment that comes with creating your own personal way of sorting or hashing etc.
Just imagine how impressed your friends will be when you proclaim that you can find the shortest path between two vertices in n log(n) time! On a serious note, it is very rewarding to know that you are completely capable of understanding and choosing an optimum solution based on knowledge, not on whatever some library gives you.
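For instance, rolling your own Dijkstra with a priority queue is exactly the kind of exercise this point is about. Here is a minimal sketch over a small hypothetical graph; with a binary heap it runs in roughly O((V + E) log V), which is where the "n log n" feeling comes from.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

public class Dijkstra {

    record Edge(int to, int weight) {}

    // Shortest distances from `source` over a non-negative-weight adjacency list.
    static int[] shortestDistances(List<List<Edge>> graph, int source) {
        int[] dist = new int[graph.size()];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[source] = 0;

        // Heap entries are {distance, vertex}; stale entries are skipped when popped.
        PriorityQueue<int[]> heap = new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        heap.add(new int[]{0, source});

        while (!heap.isEmpty()) {
            int[] top = heap.poll();
            int d = top[0], v = top[1];
            if (d > dist[v]) continue;                 // stale entry
            for (Edge e : graph.get(v)) {
                int candidate = d + e.weight();
                if (candidate < dist[e.to()]) {
                    dist[e.to()] = candidate;
                    heap.add(new int[]{candidate, e.to()});
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        // Hypothetical 4-vertex graph: 0->1 (1), 0->2 (4), 1->2 (2), 2->3 (1)
        List<List<Edge>> g = new ArrayList<>();
        for (int i = 0; i < 4; i++) g.add(new ArrayList<>());
        g.get(0).add(new Edge(1, 1));
        g.get(0).add(new Edge(2, 4));
        g.get(1).add(new Edge(2, 2));
        g.get(2).add(new Edge(3, 1));

        System.out.println(Arrays.toString(shortestDistances(g, 0)));  // [0, 1, 3, 4]
    }
}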
Related
Personally, I use high-level concurrency abstractions because they are much easier. In fact, I cannot remember the last time I used raw threads. But in technical interviews this is a frequently asked question, and yes, I ask about it too.
Are there any use cases where it is necessary to use the low-level Thread API instead of Executors/Locks/Latches/etc.?
Is there any reason to discuss the low-level thread API during a technical interview?
Technical interviews are often designed to measure the depth of the candidate's knowledge, not any particular ability. Arguably, there has been no need to implement your own linked lists and binary trees for a good 15-20 years, yet questions asking to implement these data structures routinely come up in technical interviews. A smart candidate should be able to figure out high-level concurrency APIs from a short tutorial and the API docs. You ask about thread primitives to see whether the candidate understands what is happening behind the scenes when he or she uses concurrency in general, no matter what API is called.
Personally, when I ask questions about things you'd never use, I do not insist on getting the correct names or the right order of the API method parameters. As long as the candidates are clear on the concept, I do not mind them not remembering the particulars of the specific API.
If you want to implement something very specific, are stuck with legacy code or an old version of Java, or need an abstraction that isn't provided, then I'd consider using the raw low-level threading API. Even when there's an old version or legacy code in use, though, bear in mind that making the switch to the newer API may well be worth it if it drives maintenance costs down; personally, I try to keep the amount of hand-written concurrent code to a minimum!
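As a rough illustration of the gap such interview questions probe, here is a minimal sketch of the same (hypothetical) task run once through the raw Thread primitives and once through an ExecutorService:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class LowLevelVsExecutor {

    public static void main(String[] args) throws Exception {
        Runnable task = () -> System.out.println("working on " + Thread.currentThread().getName());

        // Low-level primitive: you create, start, and join the thread yourself.
        Thread worker = new Thread(task, "raw-thread");
        worker.start();
        worker.join();

        // High-level API: a pool owns the threads; you submit work and get Futures back.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Future<?> done = pool.submit(task);
        done.get();          // wait for completion
        pool.shutdown();
    }
}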
I want to write a backend system for a web site (it'll be a custom search-style service). It needs to be highly concurrent and fast. Given my wish for concurrency, I was planning on using a functional language such as Haskell or Scala.
However, speed is also a priority. The results at http://benchmarksgame.alioth.debian.org appear to show that Java is almost as fast as C/C++, Scala is generally pretty good, but Haskell ranges from slower to a lot slower for most tasks.
Does anyone have any performance benchmarks/experience of using Haskell vs Scala vs Java for performing highly concurrent tasks?
Some sites I've seen suggest that Scala has memory leaks which could be terrible for long running services such as this one.
What should I write my service in, or what should I take into account before choosing (performance and concurrency being the highest priorities)?
Thanks
This question is superficially about performance of code compiled with GHC vs code running on the JVM. But there are a lot of other factors that come into play.
People
Is there a team working on this, or just you?
How familiar/comfortable is that team with these languages?
Is this a language you (all) want to invest time in learning?
Who will maintain it?
Behavior
How long is this project expected to live?
When, if ever, is downtime acceptable?
What kind of processing will this program do?
Are there well-known libraries that can aid you in this?
Are you willing to roll your own library? How difficult would this be in that language?
Community
How much do you plan to draw from open source?
How much do you plan to contribute to open source?
How lively and helpful is the community
on StackOverflow
on irc
on Reddit
working on open source components that you might make use of
Tools
Do you need an IDE?
Do you need code profiling?
What kind of testing do you want to do?
How helpful is the language's documentation? And for the libraries you will use?
Are there tools to fill needs you didn't even know you had yet?
There are a million and one other factors that you should consider. Whether you choose Scala, Java, or Haskell, I can almost guarantee that you will be able to meet your performance requirements (meaning, it probably requires approximately the same amount of intelligence to meet your performance requirements in any of those languages). The Haskell community is notoriously helpful, and my limited experience with the Scala community has been much the same as with Haskell. Personally I am starting to find Java rather icky compared to languages that at least have first-class functions. Also, there are a lot more Java programmers out there, causing a proliferation of information on the internet about Java, for better (more likely what you need to know is out there) or worse (lots of noise to sift through).
tl;dr I'm pretty sure performance is roughly the same. Consider other criteria.
You should pick the language that you know the best and which has the best library support for what you are trying to accomplish (note that Scala can use Java libraries). Haskell is very likely adequate for your needs, if you learn enough to use it efficiently, and the same for Scala. If you don't know the language reasonably well, it can be hard to write high-performance code.
My observation has been that one can write moderately faster and more compact high-performance parallel code in Scala than in Haskell. You can't just use whatever most obviously comes to mind in either language, however, and expect it to be blazing fast.
Scala doesn't have actor-related memory leaks any more except if you use the default actors in a case where either you're CPU-limited so messages get created faster than they're consumed, or you forget to process all your messages. This is a design choice rather than a bug, but can be the wrong design choice for certain types of fault-tolerant applications. Akka overcomes these problems by using a different implementation of actors.
Take a look at the head-to-head comparison. For some problems ghc and java7-server are very close. For equally many, there's a 2x difference, and for only one there's a 5x difference. That problem is k-nucleotide for which the GHC version uses a hand-rolled mutable hashtable since there isn't a good one in the stdlibs. I'd be willing to bet that some of the new datastructures work provides better hashtables than that one now.
In any case, if your problem is more like the first set of problems (pure computation) then there's not a big performance difference, and if it's more like the second (typically making essential use of mutation) then even with mutation you'll probably notice somewhat of a performance difference.
But again, it really depends on what you're doing. If you're searching over a large data set, you'll tend to be IO-bound. If you're optimizing traversal of an immutable structure, Haskell will be fine. If you're mutating a complex structure, then you may (depending) pay somewhat more.
Additionally, GHC's lightweight green threads can make certain types of server applications extremely efficient. So if the serving/switching itself would tend to be a bottleneck, then GHC may have the leg up.
Speed is well and good to care about, but the real difference is between using any compiled language and any scripting language. Beyond that, only in certain HPC situations are the sorts of differences we're talking about really going to matter.
The shootout benchmark assumes the same algorithm is used in all implementations. This gives the greatest advantage to C/C++ (which is the reference implementation in most cases) and languages like it. If you were to use a different approach that suited a different language better, it would be disqualified.
If you start with a problem that is more naturally described in Haskell, it will perform best in that language (or one very much like it).
Often when people talk about using concurrency, they forget the reason they are doing it is to make the application faster. There are plenty of examples where using multiple threads is not much faster, or is much, much slower. I would start with an efficient single-threaded implementation, as profiled and tuned as you can make it, and then consider what could be performed concurrently. If it's not faster with more than one CPU, don't make it concurrent.
IMHO: performance is your highest priority (behind correctness); concurrency is only a priority in homework exercises.
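In that spirit, here is a crude "measure before you make it concurrent" sketch. The workload is arbitrary, System.nanoTime is only a rough timer, and a serious comparison would use a proper benchmark harness such as JMH.

import java.util.stream.LongStream;

public class MeasureBeforeParallelizing {

    public static void main(String[] args) {
        long n = 100_000_000L;

        long t0 = System.nanoTime();
        long sequential = LongStream.rangeClosed(1, n).map(x -> (x * x) % 1_000_003).sum();
        long t1 = System.nanoTime();
        long parallel = LongStream.rangeClosed(1, n).parallel().map(x -> (x * x) % 1_000_003).sum();
        long t2 = System.nanoTime();

        // Keep the concurrent version only if it is actually faster on your hardware.
        System.out.printf("sequential: %d ms, parallel: %d ms, results match: %b%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, sequential == parallel);
    }
}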
Does anyone have any performance benchmarks/experience of using Haskell vs Scala vs Java for performing highly concurrent tasks?
Your specific solution architecture matters - it matters a lot.
I would say Scala, but then I have been experimenting with Scala, so my preference would definitely be Scala. Anyhow, I have seen quite a few high-performance multi-threaded applications written in Java, so I am not sure why an application of this nature would mandate going for FP. I would suggest you write a very small module based on what your application would need in both Scala and Haskell and measure the performance on your setup. And may I also add Clojure to the mix? :-) I suspect you may want to stay with Java, unless you are looking to benefit from some other feature of the language you choose.
I know embedded C is used for microcontrollers, along with other languages. But what if the control is done from a PC? I have two possible candidates: Java and C++.
Java is simple, easy, and developer-friendly when it comes to threading or GUIs, but of course C++ has much better performance (I know computers are getting faster, and performance depends on good algorithms). On the other hand, C++'s compilation, makefiles, shared libraries, and cross-compiling waste lots of time on technicalities when I should be working on other important issues.
Still, I've faced things like const references, which Java doesn't support, forcing you to use clone() or copying, and when it came to arrays that was a giant mess.
NOTE: I'm going to use inverse kinematics and maybe a neural network for pattern recognition, which requires tons of calculation. But as I said, I also care about the whole life cycle of the project (speed of development, performance, user-friendliness, and quick deployment).
I'm swinging between the two languages, and I'm planning a long-term learning process, so I don't want to waste it on the wrong language, or let's say choose without asking. So please help; I hope this question won't be considered subjective but rather a reference.
cheers
Why did you eliminate C?
Why do you think Java has worse performance than C++? Some things are just as good as in C++, and it is easy to run a Java program on different platforms without much hassle.
Just pick the language you feel comfortable and you have most experience with, and go with it.
Personally I would lean toward C++. Java has a garbage collector, which can put your app to sleep at random. In C++ I have to collect my own garbage, which gives me an incentive to generate less of it. Also C++ allows macros, which I know have been declared a bad thing by Java-nistas, but I use as a way of shortening the code and making it more like a DSL. Making the code more like a DSL is the main way I shorten development effort and minimize introducing bugs.
I wouldn't assume that Java is inherently slower than either C++ or C. IME slowness (and bigness) comes not from how well they spin cycles, but from the design practices that they encourage you to follow. The nice things they give you, like collection classes, are usually well-built, but that doesn't stop you from over-using them because they are so convenient.
IME, the secret of good performance is to have as little data structure as possible (i.e. minimal garbage), and keep it as normalized as possible. That way, you minimize the need to keep it consistent via message-waves. To the extent the data has to be unnormalized, it is better to be able to tolerate temporary inconsistency, that you periodically patch up, than to try to keep it always consistent through notifications (which OO languages encourage you to do). Unless carefully monitored, those make it extremely easy to introduce performance bugs.
Here's an example of some of these points.
I wouldn't worry too much about performance at first; write the code in whatever language you feel comfortable in and then refactor as necessary.
You can always use something like JNI to call out to C/C++ if needed, although the performance gap between Java and C/C++ is nowhere near what it once was...
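For what it's worth, the Java side of a JNI call-out looks roughly like the sketch below. The library name and native function are hypothetical, and it won't run without a matching C/C++ implementation compiled into a native library (header generation via javac -h is not shown).

public class FastKinematics {

    // Implemented in a native library (C or C++), loaded at class-load time.
    static {
        System.loadLibrary("fastkinematics");   // expects libfastkinematics.so / fastkinematics.dll
    }

    // Declared here, implemented natively; javac -h generates the matching C header.
    public static native double[] solveInverseKinematics(double[] jointAngles, double[] target);

    public static void main(String[] args) {
        double[] solution = solveInverseKinematics(new double[]{0.0, 0.5, 1.0},
                                                   new double[]{1.0, 2.0, 0.0});
        System.out.println(java.util.Arrays.toString(solution));
    }
}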
Depending upon your circumstance, Java is no more quick to deploy than is C++. This mainly boils down to: are you guaranteed the same environment in your testbed that you are in production? With all of the modern additions to C++, there is little cause to suggest that Java is easier on the developer unless you are still new to the C++ language.
That aside, you have performance concerns. Unless it is a real-time system, there's no reason to eliminate any language just yet. If you code your Java intelligently (for instance, do your best to avoid copying objects and creating garbage in the most-used sections), the performance differences won't be seriously noticeable for a compute-bound process.
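As a minimal sketch of the "avoid creating garbage in the most-used sections" advice (with hypothetical data): reuse one buffer across iterations of a hot loop instead of allocating a new object per record.

public class HotLoopAllocation {

    public static void main(String[] args) {
        String[] records = {"a,1", "b,2", "c,3"};   // hypothetical input

        // Wasteful in a hot path: a new StringBuilder (and garbage) on every iteration.
        for (String record : records) {
            StringBuilder sb = new StringBuilder();
            sb.append("row:").append(record);
            process(sb.toString());
        }

        // Cheaper: allocate the builder once, reset, and reuse its internal buffer.
        StringBuilder reusable = new StringBuilder(64);
        for (String record : records) {
            reusable.setLength(0);                 // reset without reallocating
            reusable.append("row:").append(record);
            process(reusable.toString());
        }
    }

    private static void process(String row) {
        // stand-in for real work
        System.out.println(row);
    }
}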
All told, I think you are focusing too much on textbook definitions of these two languages rather than actual use. You haven't really given any overriding reason to choose one over the other.
Java is a bit more portable, but as far as I know the only real factor for something like this is personal preference.
It would really help if you described your problem in greater detail.
You are willing to use IK, which suggests some kind of robotic arm manipulation. What it doesn't say is anything about your real-time requirements. If it's going on a class-A production line, it'll be hard to get away with a garbage-collected language.
Java is great. There are some very mature NN libraries (Neuroph, Encog) which could save you a lot of coding time. I don't know of any IK library, but I'm sure there are at least good matrix-manipulation libraries to help.
Garbage collection in Java is getting better and better. The latest collector (G1) is a lot better than anything before it, but even with it the best you can get is soft real-time, so you can't expect pause-free runs.
On the other hand, you might also want to look at some dedicated environments, such as Matlab toolboxes for robotics and artificial intelligence. I think that would yield the fastest prototypes.
If it's going into production, then you are pretty much stuck with C or C++.
Are there any tests that compare Javascript's performance with Java's?
UPDATE: Since everyone is asking why the hell this question, here is some context :)
As you all know (I hope), Javascript nowadays doesn't only reside in the web client but also on the web server, thanks to node.js.
It can also run on mobile phones and desktops with Appcelerator and PhoneGap.
It can also be used substantially in the web browser to make the user experience first class, as with desktop applications.
But Java can do these things too: running applets on the web client, and running on mobile phones. It's also a language for the backend, with many frameworks to choose from.
Since each of them can almost or entirely replace the other in the areas mentioned, I want to know the performance difference between them for every case I described:
Client: Java Applets vs Javascript
Server: Java EE vs Javascript with Node.js + Express
Mobile phones: Java ME vs Javascript with Phonegap / Appcelerator
Desktop: Java SE vs Javascript with Phonegap / Appcelerator
I hope the context is more clear now.
Java and JavaScript are both programming languages. Programming languages are just a bunch of abstract mathematical rules. Programming languages aren't fast. Or slow. They just are.
The performance of an application has nothing to do with the language. The most important factor is the application architecture. Then comes algorithmic efficiency. Then micro-optimizations. Then comes the quality of the compiler/interpreter. Then the CPU. Maybe a couple of other steps in between. The language, however, doesn't directly play a role. (And of course if you're talking about benchmarks, then also the particular benchmark plays a role, as well as how well implemented the benchmark is, how well run it is, whether the guy who performs the benchmark actually knows something about benchmarking, and even more importantly statistics. Also, the precise definition of what you actually mean by "fast" is pretty important, since it can also have significant influence on the benchmark.)
However, the language might indirectly play a role: it is much easier to find and fix performance bottlenecks in 10 lines of highly expressive, clear, concise, readable, well-factored, isolated, high-level Lisp code, than in 100 lines of tangled, low-level C. (Note that those two languages are only examples. I don't mean to single any one language out.) Twitter, for example, have said that with a less expressive language than Ruby, they wouldn't have been able to make such radical changes to their architecture in such a short amount of time, to fix their scalability problems. And the reason why Node.js is able to provide such good evented I/O performance is because JavaScript's standard library is so crappy. (That way, Node.js has to provide all I/O itself, so they can optimize it for evented I/O from the ground up. Ruby and Python, for example, have evented I/O libraries that work just as well as Node.js and are much more mature ... but, Ruby and Python already have large standard libraries, including I/O libraries, all of which are synchronous and don't play well with evented libraries. JavaScript doesn't have the problem of I/O libraries that don't play well with evented I/O, because JavaScript doesn't have I/O libraries at all.)
But if you really want to compare the two, here's an interesting datapoint for you: HotSpot, which is one of the more popular, and also more performant, JVM implementations out there, was created by a team of guys which included, among other people, a guy named Lars Bak. But actually, HotSpot didn't appear out of thin air; it was based on the sourcecode of the Animorphic Smalltalk VM, which was created by a team of guys which included, among other people, a guy named Lars Bak.
V8, which is one of the more popular, and also more performant, JavaScript implementations out there, was created by a team of guys which included, among other people, a guy named Lars Bak. But actually, V8 didn't appear out of thin air; it was based on the sourcecode of the Animorphic Smalltalk VM, which was created by a team of guys which included, among other people, a guy named Lars Bak.
Given that the two are more or less the same, we can expect similar performance. The only difference is that HotSpot has had over a hundred engineers working on it for 15 years, whereas V8 has had a dozen engineers working on it for less than 5 years. That is the only difference in performance. It's not about static vs. dynamic typing (Java is statically typed, but most JVMs, and certainly HotSpot, make no static optimizations whatsoever; all optimizations are purely dynamic), compilation vs. interpretation (HotSpot is actually interpreted with an additional JIT compiler, whereas V8 is purely compiled), or high-level vs. low-level. It is purely about money.
But I am going to bet that for every pair of Java and JavaScript implementations where the Java implementation is faster, I can find another pair where the JavaScript implementation is faster. Also, I can probably keep the pair and just use a different benchmark. There's a reason they call the Computer Language Benchmarks Game a "game": they even encourage you, right on their own page, to play around with the benchmarks to make any arbitrary language rise to the top.
I only have an anecdote to add: I've recently reimplemented a Java calc server (finance) in Javascript (nodejs v0.6.8). WRT development time, the Javascript implementation was a breeze compared to the original Java implementation with far fewer lines of code. It was a breath of fresh air, really.
The Javascript-based server is able to calc through 2.4k trades/sec whereas the Java server handles 400+/sec on the same hardware using less memory. I wouldn't attribute the speed increase to raw V8 vs. Java 7 performance but rather to the implementation. The Javascript implementation uses far fewer data structures, does an order of magnitude fewer method calls and takes a more straight-forward and terse approach.
Needless to say, I'm very happy with the performance of node.js. And this, coming from someone who was Java only for many (9) years.
Here are some tests comparing Javascript (V8) and compiled Java:
32 bit
64 bit
They indicate that Java is generally faster.[1] However, if you dig around with those pages and the linked resources, you will notice that it is very difficult to compare like with like.
Interestingly, Javascript does significantly better than Java (under certain conditions) on the "regex-dna" benchmark. My guess is that this is because the Javascript regex engine is faster than the Java regex engine. This is not entirely surprising, given the importance of regexes in typical Javascript applications.
[1] Strictly speaking, you cannot say that language X is faster than language Y. You can only compare specific implementations of the respective languages. And the site I linked to is clear about that ... if you care to go in via the front page. However, it is not entirely unreasonable to generalize from specific datapoints ... and the apparent absence of contradictory datapoints ... that Java is typically faster than Javascript in computationally intensive tasks. But the flip side is that that kind of performance is often not an objectively important criterion.
Java, obviously.
Programmers love to compare execution speed like some sort of pissing contest. It is just one metric, and the majority of the time not the most important one by a long shot. Java is a language that is fast enough for almost anything, yet high-level enough that you get things like GC, which you don't usually get in similar languages. Javascript is a dynamic closure language that is great for getting stuff done quickly (and for FP programmers stuck in an OO world ;-) ). There isn't much intersection between the spaces where either would be appropriate.
I'll stop pontificating now
EDIT: to address the edit in the post
Due to the way one writes idiomatic javascript (functions composed of functions), it lends itself surprisingly well to asynchronous programming, probably better than any other language of similar popularity. Node.js shines when it comes to a huge number of short connections, so javascript is a really great fit for that sort of thing.
While node.js is absolutely drenched in awesome, being the new hotness really doesn't mean it is the best at everything, no matter what the hype says. If a java app is replaceable by node, chances are java wasn't really appropriate in the first place.
Probably not, but it doesn't really matter.
Prior to Google Chrome's JavaScript JIT, Java would win over JavaScript as soon as the problem got big enough to overcome the load time.
Java should still roundly trounce JavaScript due to integer vs. floating-point math. No matter how good the JIT is, it can't really make up for this.
WebAssembly will turn this on its head anyway.
http://benchmarksgame.alioth.debian.org/u64q/javascript.html
(Remember to look at the CPU column as well as elapsed seconds.)
According to the above link, JavaScript as things stand now is much slower for almost everything.
I am constantly reading about how much Cobol code is still in production, and the main reason it hasn't been rewritten in a more modern language is that it would take too long and cost too much.
My question is: If there was a tool that converted Cobol to, say, Java, would any organizations find it useful? Or would they rather continue maintaining what they know already works?
Currently, a large volume of the COBOL code (I'd estimate well over 90%) is untestable.
No one knows what it really does.
They know that -- minimally -- it does the expected job most of the time. And when it doesn't, the bugs are known.
Worse, some percentage of COBOL is just workarounds for bugs in other parts of the COBOL.
Therefore, if you subject it to any scrutiny, you'll find that you don't know what's really going on. You can't create test cases.
Indeed, you'll find that most organizations can't even agree on what's "right". But they're willing to compromise on what's available.
The cost and risk of examining the core business processing is unthinkable.
Any conversion tool would have risks associated with it, and the resulting code would have to undergo a lot of testing.
Given that a lot of these systems are in use daily to run a business, a lot rides on the continuing operation. So it is not just "how long" or "how expensive", but can we trust it to work 100% the same.
One will always find tools to convert one language to another - they usually go by the term "compilers".
There is always a shortcoming with compilers that have to perform the task of converting code in language X to language Y, especially when the said code was written by a person. That shortcoming is that readability is often lost in the process of translation. There is no guarantee that the code compiled from COBOL to Java will be understood by any programmer, so in effect the cost of translation has actually increased. In fact, it is difficult to define readability in such a context.
Lack of readability and understandability translates into lack of knowledge of the runtime behavior of the translated code. Besides, there is no guarantee that people understand the original code completely; surely they only understand bits and pieces of it.
Probably a little of both. There are companies that provide tools and services for conversion using both automated and manual techniques.
Many companies, however, follow the "ain't broke" philosophy, which is likely as wise as anything. Especially since many conversions result in attempts to "improve" the existing system or try to introduce modern software design/construction philosophies and result in a mess.
Many systems written in Cobol have many transactions going through them. They work well on the mainframe platforms they run on. It would be risky to change them just for the sake of change.
I think some organizations could find it useful, particularly organizations where interfacing with or designing around legacy code has become more costly and problematic than converting the code to Java (or another language):
while ( (CostToPortToJava > CostOfNotPortingOverTime++) && DoesLegacyCodeStillWork() )
{
StayWithLegacyCode();
}
PortCodeToJava();
There are a few factors here:
Cobol program files are super long and just about always on ultra-secure mainframes. Usually the Java developers don't have access to them.
Colleges and universities haven't taught Cobol for more than 20 years. As a result, all of the really top-notch Cobol developers have moved up in their companies and been replaced with a bunch of tech-school grads. These people didn't love programming enough to be hackers (or they'd do C, Python, C++, whatever, and wouldn't have taken a course) or enough to go to school (and do Java, .NET, Python, whatever).
Java developers generally lose their minds when they look at Cobol programs in their 50,000 line glory, so they aren't any help.
There really isn't any documentation, and the logic is so tight in these programs that you really just have to read them and convert them.
Most of these companies are financial companies, where the best way to blow up and not be in the industry anymore is to screw something up. A good way to screw something up is to take on something like converting a critical task from Cobol to Java.
It's going to take a long time. Every so often, part of one of the programs stops working or can't do something, and it gets replaced. I don't see a lot of senior managers having the stomach for all of the FUD in one of these projects, and the timeframes are pretty long in terms of return on money spent.
COBOL is, in effect, a superb DSL (domain specific language).
It's domain is business rules as embedded in (mainly) backend applications.
Find another language that....
is feature rich in that specific domain
has some years of actual, applied, experience behind it so all the gotchas are cured or out in the open
has a TCO (total cost of ownership) lower than the existing COBOL legacy mountain
is cost-effective to convert to
....and you will have the killer application for backend business applications.
Something to realize about old COBOL applications, besides the language dissimilarity, is that a lot of the data structures built into these applications don't conform to any later RDBMS structure, so really you would be talking about rethinking a lot of the underlying architecture and design, not just changing the language syntax. Replacing that would carry a lot of performance risk once it hit real-world loads, even if it could be QA'd sufficiently.
The bottom line is that it is more economical to bolt on new features in a modern language than rewrite it. As long as that continues to be the case, COBOL will continue to live on.
Cobol has the advantage of being fast at moving data around, which is what these kinds of applications tend to do a LOT. Also, the machines are designed for I/O, not processing speed. Hence, any translation to another language will most likely be slower than the Cobol counterpart on identical or similar hardware, leaving no reason to do it.
Let me ask a counter question: WHY convert it, if you have something in place that works?
(Similar to tearing down a bridge every 10 years just to rebuild it right afterwards: it is almost always cheaper just to maintain what you have.)
There are translators around which can be modified at little cost to run on a specific machine or operating system; some are available from England and can be run there or on site. Standard versions exist for the major models (anyone can contact me about them). Converting Cobol to another language's source code or script is relatively easy to do automatically and produces a text file for import into a source file on the target machine with 95 percent or more code compatibility. Simple manual amendments are all that are necessary before running the compiler or JIT software to produce a new program; do not forget to amend the job control language or macros for mainframe jobs when testing or going live. New Cobol compilers exist for ICT/ICL mainframes and one or two others; these compile faster than the old software, and sometimes the newly compiled program can run several times faster.