Java checkpointing - java

I hope my question is not too vague, but I'm looking for more information about checkpointing in Java. I have to generate a big search tree, and I'd like to be able to resume the calculation after the program gets interrupted (for example, after a sudden reboot). Therefore I need checkpointing. I find very little documentation about it, and I get the impression that a lot of development stopped in the mid-90s.
So far I've found a library called Padmig, but I hope alternatives are available. Can anyone point me in the right direction with some info about checkpointing for Java?

What you describe sounds a lot like object prevalence. In Java there is a library called Prevayler that has been around since 2002. I haven't used it, but when it came out there was a lot of buzz about it, and it had quite a few interesting concepts.
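In the same spirit (keep the state in plain objects and persist them), here is a minimal sketch of explicit checkpointing with standard Java serialization. It is not how Prevayler or Padmig work internally; the names SearchState, Checkpointer and the file name search.ckpt are hypothetical, and it assumes the search can be driven by an explicit stack of pending nodes rather than by recursion.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayDeque;

// Hypothetical search state: an explicit stack of pending nodes instead of
// recursion, so everything needed to resume lives in one serializable object.
final class SearchState implements Serializable {
    private static final long serialVersionUID = 1L;
    final ArrayDeque<int[]> pending = new ArrayDeque<>(); // nodes still to expand
    long expanded;                                         // progress counter
}

final class Checkpointer {
    // Write to a temp file first, then move it into place, so a crash in the
    // middle of a write does not clobber the previous checkpoint.
    // Note: whether ATOMIC_MOVE replaces an existing target is platform-specific.
    static void save(SearchState state, Path file) throws IOException {
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(tmp))) {
            out.writeObject(state);
        }
        Files.move(tmp, file, StandardCopyOption.ATOMIC_MOVE);
    }

    // Resume from the last checkpoint if there is one, otherwise start fresh.
    static SearchState loadOrNew(Path file) throws IOException, ClassNotFoundException {
        if (!Files.exists(file)) {
            return new SearchState();
        }
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (SearchState) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Paths.get("search.ckpt"); // hypothetical checkpoint location
        SearchState state = loadOrNew(file);
        if (state.expanded == 0 && state.pending.isEmpty()) {
            state.pending.push(new int[] {0}); // root node of a fresh search
        }
        while (!state.pending.isEmpty()) {
            int[] node = state.pending.pop();
            state.expanded++;
            // ... expand node and push its children here ...
            if (state.expanded % 100_000 == 0) {
                save(state, file); // periodic checkpoint
            }
        }
    }
}
```

After an interruption, at most the work done since the last save is lost. How often to save is a trade-off between I/O cost and the amount of recomputation you can tolerate.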

Java doesn't support first-class continuations, so transparently capturing a running computation (call stack included) and resuming it later is not possible. juancn's answer about object prevalence might help you, though.
Alternatively, if you're not tied to Java, you might use a language which does support first-class continuations, such as Lisp.

Related

Google Go for Java platform?

On the one hand, the JVM provides great performance. On the other hand, Golang sounds like a new paradigm and is extremely productive. If we could bring together the best of the two worlds, JVM performance and Golang productivity, we could get a lot of benefits. Does anyone know of a project that provides a Golang implementation in Java?
A quick search came up with
http://code.google.com/p/jgo/
This link suggests it's the main or only effort.
http://en.wikipedia.org/wiki/List_of_JVM_languages
It may be difficult to make a good JVM implementation of Go. Rob Pike, who is one of Go's creators, spoke about this on episode 0.0.3 of the Changelog podcast:
[timecode 17:05] For instance, it is quite difficult to implement Go's interface model using a JVM: you might have to add a bytecode to deal with some of the type stuff. So for some of these existing systems [(JVM and CLR)] it's not quite obvious how Go would run with them […]
You should check the JGO website:
http://jgo.herokuapp.com/
And the JGO Docs: http://jgo.herokuapp.com/api/
A different route might be to use a JVM library which provides the most important features of Go, which in my opinion and experience are lightweight goroutines multiplexed onto JVM threads, and channels for communication and synchronization.
There is one such library, Quasar, from Parallel Universe (see e.g. this blog post comparing Quasar and Go). Also, it works well with Kotlin, which is getting more popular now as an officially supported Android language, and providing much more compact (productive?) syntax than Java.

Just enough Java for Hadoop [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
I have been a C++ developer for about 10 years. I need to pick up Java just for Hadoop. I doubt I will be doing anything else in Java. So, I would like a list of things I would need to pick up. Of course, I would need to learn the core language, but what else?
I did Google around for this, and this could be seen as a possible duplicate of "I want to learn Java. Show me how?" but it's not. Java is a huge programming language with lots of libraries, and what I need to learn will depend largely on what I am using Hadoop for. But I suppose it is possible to say something like "don't bother learning this". That would be quite useful too.
In my day job, I've just spent some time helping a C++ person to pick up enough Java to use some Java libraries via JNI (Java Native Interface) and then shared memory into their primarily C++ application. Here are some of the key things I noticed:
You cannot manage for anything beyond a toy project without an IDE. The very first thing you should do is download a popular Java IDE (Eclipse is a fine choice, but there are also alternatives including NetBeans and IntelliJ). Do not be tempted to try and manage with vi / emacs and javac / make. You will be living in a cave and not realising it. Once you're up to speed with even basic IDE functions you will be literally dozens of times more productive than without an IDE.
Learn how to lay out a simple project structure and packages. There will be simple walkthroughs of how to do this on the Eclipse site or elsewhere. Never put anything into the default package.
Java has a type system whereby the reference and primitive types are relatively separate for historic / performance reasons.
Java's generics are not the same as C++ templates. Read up on "type erasure".
You may wish to understand how Java's GC works. Just google "mark and sweep" - at first, you can just settle for the naivest mental model and then learn the details of how a modern production GC would do it later.
The core of the Collections API should be learned without delay. Map / HashMap, List / ArrayList & LinkedList and Set should be enough to get going.
Learn modern Java concurrency. Thread is an assembly-language level primitive compared to some of the cool stuff in java.util.concurrent. Learn ConcurrentHashMap, Atomic*, Lock, Condition, CountDownLatch, BlockingQueue and the threadpools from Executors. Good books here are those by Brian Goetz and Doug Lea.
As soon as you want to use 3rd party libraries, you'll need to learn how the classpath works. It's not rocket science, but it is a bit verbose.
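To make the type-erasure point from the list above concrete for someone coming from C++ templates, here is a small sketch (the class name ErasureDemo is made up for illustration). Unlike template instantiations, List<String> and List<Integer> are the same class at runtime.

```java
import java.util.ArrayList;
import java.util.List;

final class ErasureDemo {
    // Both lists share one runtime class: the type argument exists only at
    // compile time and is erased from the generated bytecode.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // prints "true"
    }
}
```

One practical consequence is that you cannot write new T() or overload a method on List<String> versus List<Integer>, both of which would be natural with C++ templates.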
If you're a low-level C++ guy, then you may find some of this interesting also:
Java has virtual dispatch by default. The keyword static on a Java method is used to indicate a class method. private Java methods use invokespecial dispatch, which is a dispatch onto the exact type in use.
On an Oracle VM at least, objects comprise two machine words of header (the mark word and the class word). The mark word is a bunch of flags the VM uses - notably for thread synchronization. The class word you can think of as a pointer to the VM's representation of the Class object (which is where the vtables for methods live). Following the class word are the member fields of the instance of the object.
Java .class files are an intermediate language, and not really that similar to x86 object code. In particular there are lots more useful tools for .class files (including the javap disassembler which ships with the JDK).
The Java equivalent of the symbol table is called the Constant Pool. It's typed and it has a lot of information in it - arguably more than the x86 object code equivalent.
Java virtual method dispatch consists of looking up the correct method to be called in the Constant Pool and then converting that to an offset into a vtable, then walking up the class hierarchy until a non-null value is found at that vtable offset.
Java starts off interpreted and then goes compiled (for Oracle and some other VMs anyway). The switch to compiled mode is done method-by-method on an as-needed basis. When benchmarking and perf tuning, you need to make sure that you've warmed the system up before you start, and you should typically profile at the method level to start with. The optimizations that are made can be quite aggressive / optimistic (with a check and a fallback if the assumptions are violated) - so perf tuning is a bit of an art.
Hopefully there's some useful stuff in there to be going on with - please comment / ask followup questions.
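As a small illustration of the "virtual by default" and static points above, here is a sketch with made-up classes Base and Derived. In C++ terms, every non-static, non-private instance method behaves as if it were declared virtual.

```java
class Base {
    String name() { return "Base"; }         // instance method: virtual by default
    static String kind() { return "Base"; }  // class method: resolved at compile time
}

class Derived extends Base {
    @Override
    String name() { return "Derived"; }         // overrides Base.name()
    static String kind() { return "Derived"; }  // hides Base.kind(), does not override it
}

final class DispatchDemo {
    public static void main(String[] args) {
        Base b = new Derived();
        System.out.println(b.name());    // prints "Derived": chosen by the runtime type
        System.out.println(Base.kind()); // prints "Base": chosen by the class named in the call
    }
}
```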
Learning "just enough" Java is learning Java. Either you learn all the core principles and language design decisions, or you suffer along making easily avoidable mistakes. Considering that you already know how to program, a lot of the information can be skimmed (with an eye for where it differs from other languages you are intimately familiar with).
So you need to learn:
How to get started
The language itself
The core, essential classes
The major Collections
And if you don't have a build framework in place, how to package your compiled code.
Beyond that, nearly every other item you might need to learn depends heavily on what you intend to do. Don't discount the on-line tutorials from Oracle/Sun, they are quite good (compared to other online tutorials).
Hadoop can use C++ : WordCount example in C++
You can't really use Java without knowing these packages in the standard API:
java.lang
java.util
java.io
And, to a lesser degree:
java.text
java.math
java.net
java.lang.reflect
java.util.concurrent
They contain a lot of classes you'll need to use constantly for pretty much any application, and it's a good idea to look through them until you know which classes they contain and what those are good for, lest you end up reinventing wheels.
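As a taste of what java.util and java.util.concurrent from the lists above provide, here is a minimal sketch that counts words on a thread pool. The class ConcurrentCount and its count method are made up for illustration; the library calls are standard JDK ones.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ConcurrentCount {
    // Count occurrences of each word using a fixed-size thread pool.
    static Map<String, Integer> count(String[] words, int threads) throws InterruptedException {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String word : words) {
            // merge() is atomic per key on a ConcurrentHashMap, so no explicit lock is needed
            pool.execute(() -> counts.merge(word, 1, Integer::sum));
        }
        pool.shutdown();                            // accept no new tasks
        pool.awaitTermination(1, TimeUnit.MINUTES); // wait for the submitted ones to finish
        return counts;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, Integer> counts = count(new String[] {"a", "b", "a"}, 2);
        System.out.println(counts.get("a")); // prints 2
    }
}
```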
Take it easy, learning Java can be pleasant and fast if you already know C++.
Buy these two books:
The Java™ Programming Language (4th Edition), Ken Arnold, James Gosling, David Holmes
Effective Java (2nd Edition), Joshua Bloch
You will soon be mastering Java, and you will not regret it. Good luck.
Since C++ and Java share common roots, the core language shouldn't give you too much trouble. You will need to become familiar with the Java SDK, particularly java.lang and the Collections framework (java.util).
But perhaps learning Java is overkill if you don't see yourself using it elsewhere. Hadoop also has bindings to Python - perhaps learning Python would be a better alternative? See Java vs Python on Hadoop.
Here is the quickstart for all you will need
I suggest Eclipse (java) to start working, see this for that
Maybe you don't even need to know Java to use Hadoop.
Pig is more than enough for simple to advanced usage of Hadoop.
I don't know how familiar you are with other higher-level programming languages. Garbage collection is an important function in Java. It would be important to read a bit about the GC in your VM of choice.
Besides the obvious packages, check out the java.util packages for the collection framework. You might want to check out the source of some classes. I suggest HashMap to get the idea of the computing/memory cost of these operations.
Java likes to use streams instead of buffers when processing large amounts of data. That may take some time getting used to.
Java has no unsigned integer types (apart from the 16-bit char). Depending on the packets of data you need to process at once, you can either use larger variables and straight arithmetic (if we're talking about relatively small packets), or you have to apply (b[i] & 0xff) every time you read, for example, unsigned bytes. Also note that Java uses network byte order (most significant byte first) when serializing multibyte numbers.
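A short sketch of both points, using a made-up class UnsignedDemo: masking with 0xff to read a byte as unsigned, and DataOutputStream writing an int most significant byte first.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

final class UnsignedDemo {
    // Widen a signed byte (-128..127) to its unsigned value (0..255).
    static int toUnsigned(byte b) {
        return b & 0xff;
    }

    // DataOutputStream writes multibyte values most significant byte first.
    static byte[] encode(int value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(value);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte raw = (byte) 0xF0;
        System.out.println(raw);             // prints -16
        System.out.println(toUnsigned(raw)); // prints 240
        byte[] bytes = encode(0x01020304);
        System.out.println(bytes[0] + " " + bytes[3]); // prints "1 4"
    }
}
```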
The design patterns the API favors most are Singleton, Decorator and Factory. Check the source of the JFC itself for best practices and for how these patterns are realized in the language.
... and you can still post more concrete questions on SO :)
Answer 1 :
It is very desirable to know Java. Hadoop is written in Java. Its popular Sequence File format is dependent on Java.
Even if you use Hive or Pig, you'll probably need to write your own UDF someday. Some people still try to write them in other languages, but I guess that Java has more robust and primary support for them.
Most Hadoop tools (like Sqoop, HCatalog and so on) are not mature enough, so you'll see many Java error stack traces, and you'll probably want to hack the source code someday.
Answer 2
It is not required for you to know Java.
As the others said, it would be very helpful depending on how complex your processing may be. However, there is an incredible amount you can do with just Pig and say Hive.
I would agree that it is fairly likely you will eventually need to write a user defined function (UDF), however, I've written those in Python, and it is very easy to write UDFs in Python.
Granted, if you have very stringent performance requirements, then a Java based MapReduce program would be the way to go. However, great advancements in performance are being made all of the time in both Pig and Hive.
So, the short answer to your question is, "No", it is not required for you to know Java in order to perform Hadoop development.
Source :
http://www.linkedin.com/groups/Is-it-must-Hadoop-Developer-988957.S.141072851
Most of the stuff should be pretty familiar to you. I'd just download Eclipse and google a tutorial site. Familiarize yourself with classloading and the keywords. One tricky thing a lot of C++ guys run into is how to run a Java app so that it finds its library classes (sort of analogous to dynamic linking). Learn the difference between the JRE and JDK. If you can get a few hello-world type apps working, you ought to be able to get a start on Hadoop if you follow the tutorials.
You don't need to learn Java to use Hadoop.
You need to know Linux to install and configure Hadoop.
Then you can write your MapReduce jobs using the streaming API in any language that understands standard input/output.
Further, you can do more complex MapReduce work using other tools like Hive.
Other components of the Hadoop ecosystem, like HBase and Cassandra, also have clients in most languages.
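To show what "understands standard input/output" amounts to, here is a rough sketch of a word-count mapper in the streaming style: read lines from stdin, write tab-separated key/value lines to stdout. It is written in Java only to stay consistent with the rest of this page; the same shape works in Python, C++ or a shell script. The class name WordCountMapper is made up, and how you register the program with your Hadoop installation is not shown.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

final class WordCountMapper {
    // Emit one "word<TAB>1" record per word, writing as we read so that
    // the whole input never has to fit in memory.
    static void map(BufferedReader in, PrintWriter out) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    out.print(word + "\t1\n");
                }
            }
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        BufferedReader stdin =
                new BufferedReader(new InputStreamReader(System.in, StandardCharsets.UTF_8));
        map(stdin, new PrintWriter(System.out));
    }
}
```

A matching reducer would read the sorted "word<TAB>count" lines from its own stdin and sum the counts for each run of identical keys.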

What's the best way to learn Smali (and how/when to use Dalvik VM opcodes)?

I know Java, and learned C but never used it. I do not know any form of assembly, either for a virtual machine or a real one.
What's the best way to learn how to hack Smali?
UPDATE: As I promised yesterday, I added some more links to the list.
Ufff. Not much documentation around! Best advice? Decompile, and read, and tweak, and see how it did, and start the cycle again and again. But you did not ask for that advice, right? ;)
Now, there are a few places out there that will help a little bit:
http://androidcracking.blogspot.com/search/label/smali
This is the best one. I even asked the guy a question and he answered very quickly, so go and take a look.
http://pallergabor.uw.hu/androidblog/dalvik_opcodes.html
Very comprehensive table - good reference!
http://webchat.freenode.net/?channels=smali
I never tried it, but it's on the google code page of the baksmali author ( http://code.google.com/p/smali/ )
http://forum.xda-developers.com/showthread.php?t=777707
Lastly, this is a post I made some time ago describing some hacks to the Captivate camera.
You can follow the diffs in there as I comment a little bit on what each .diff file is doing. The good stuff starts at post #20.
http://www.slideshare.net/paller/understanding-the-dalvik-bytecode-with-the-dedexer-tool
Interesting slide show with some basic concepts. Good way to start.
http://sites.google.com/site/haynesmathew/home/projects/dalvik-notes
Even more low level than the typical .smali. A reference for later, but a good read.
http://jasmin.sourceforge.net/guide.html
Smali syntax is based on Jasmin, so this gives good concepts.
http://groups.google.com/group/apktool?pli=1
Some discussions there are worth reading through. Also a good place to search when you're stuck on something.
And last, but not least, the most helpful trick I used: start coding very basic classes and methods in Java, compile them and then baksmali your own code. You know exactly what it does, so it will be a lot easier to follow.
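For that last trick, something as small as the following is a reasonable first subject. The class and method names are made up; the point is only that each method exercises one or two constructs (arithmetic, a loop with a branch, a static call), so the resulting smali is short enough to read line by line.

```java
// A deliberately tiny class for disassembly practice.
final class SmaliPractice {
    // Integer arithmetic and a return value.
    static int add(int a, int b) {
        return a + b;
    }

    // A loop with a comparison, to see how branches and jumps are encoded.
    static int sumTo(int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3)); // prints 5
        System.out.println(sumTo(4));  // prints 10
    }
}
```

Change one thing at a time (an int to a long, a static method to an instance method) and disassemble again to see which instructions and registers change.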
I continue to be surprised that folks don't use the official Dalvik format documents as a primary reference. On older releases, the Dalvik docs are in the Android source under dalvik/docs. The particular file you'll want to look at is called dalvik-bytecode.html. A few releases back the bytecode definition became part of the android.com developer docs:
dalvik-bytecode.html on source.android.com
As an additional convenience, I occasionally mirror these docs on my personal website. In this case:
dalvik-bytecode.html on milk.com
I do have a couple of wiki pages on the smali site with some information:
https://github.com/JesusFreke/smali/wiki/Registers
https://github.com/JesusFreke/smali/wiki/TypesMethodsAndFields
And there are examples of smali code that use every opcode in the integration tests for smali/baksmali:
smali integration tests

Good start for .Net and c++ to Java transition?

I know the .NET framework very well and know where to find things, e.g. StreamReader, StreamWriter, Graphics, etc., and I know Java has similar things. The syntax is different but quite similar to C++, in which I have a lot of native experience. Therefore, what would you recommend as a good starting point for tutorials and such? Thanks
In my new job, I quickly found myself working on a common library in C++, C# and Java. I had no Java knowledge and yet found it pretty intuitive to make simple mods to the Java code - the general C# principle that there is a framework class/namespace for most things you want to do appears to hold in Java.
The thing that bothers me is that this MO would not teach me tricks and improvements in Java that are specific to that language. That's where I would like to see other answers to this question lead.
In the meantime: http://en.wikipedia.org/wiki/Comparison_of_C_Sharp_and_Java
btw while I found C# and Java pretty congruent, I would not say the same about C++ vs Java.
If you work in Eclipse/NetBeans/IntelliJ it may actually be a no-brainer. Guess at a class name, start typing it and hit ctrl-space (for Eclipse, others vary). Regardless of which package it is in, it will find all the classes that match and list them for you faster than you could look them up anywhere else.
The other really nice thing to have on hand is the javadocs for the SDK you are working with--you can code effectively with nothing else. They are online (just search for JDK 6.0 or whatever version) or they can be downloaded from the same place you get the JDK.
The javadocs are your friend - once you figure out some of the main packages in java.*, it's easier to know where to look for specific classes / functionality.
Once you're writing some code, buy Effective Java - it's full of tips for the language, and is just a good programming book.

How to master java and ruby fastest for a PHPer?

I've been using PHP all the time.
Any advice to taking on these two languages?
I would say it really depends on whether you're used to the way OOP (object oriented programming) works. If you're not familiar with this way of thinking I would definitely go with the book "Objects First With Java". It might look really, really basic at first, and you might be able to skip the first chapter or two. But if you read it from chapter 2 or 3 or so and finish it, you should have a good amount of knowledge to start building applications.
It's a little hard to help here because I don't know your level of skill when it comes to OOP. :) I've been writing PHP for a long time and didn't know a thing about OOP until I read the above-mentioned book.
All the best,
Bo
The same way you learned PHP - read the documentation, write some code, compile or execute it, debug it. Repeat until you are good. But don't expect to master a language quickly - anyone can learn to write code in a given language, but it takes time and effort to actually write good, high quality, and idiomatic code in that language.
The way I learn new languages is to read the documentation and other people's source code. It really helps to see what is possible in the language, without having it all wrapped up in academic speak.
Books are also helpful, if you have the time/patience to read through them.
A really good idea is to look up programs written in those languages and see if you can write the pseudo code for the programs. Then compare those to the source code and see what the difference is.
The best way to learn Ruby and/or Java is to forget the "PHP way" and to tackle each new language under their own idioms.
Both Ruby and Java have a fair selection of books (dead tree, electronic, free and non-free) as well as numerous free online tutorials. Ruby even has a nifty online interactive tutorial by _why (you did search didn't you?).
Learning the basic operation and syntax of each language is essential to avoid wasting time with random guessing as to why X doesn't work like Y. (Hint: If X doesn't work like Y, it's because X isn't Y.)
Trying to learn two languages at once is probably not the best idea. Ruby is quite similar to PHP, so the transition may be fairly simple, depending on your prior experience with other dynamic languages. You may find this site useful: http://railsforphp.com/
I recommend you try and build basic applications. Have a target, use the documentation and search the blogs or ask somebody if you're stuck. That's how I learned Ruby.
Also, for Ruby and Rails documentation I like APIdock, too bad they don't have Ruby 1.9 (which I recommend you use).
Click this: code-golf
Then solve all the challenges that got at least 10 upvotes in both Java and Ruby. Don't worry about the golf-scoring part, just do the best you can. If you post your efforts you may get some feedback, and you can compare your results with others.
Keep it enjoyable and simple at first. Use the learning style that works for YOU. If you like reading docs - great, otherwise you'll just end up with a nasty aftertaste. I'll say keep it enjoyable again because your initial exposure/experience can be greatly influenced by how you hit it off with a new language. Try to approach it from an angle of familiarity, you will find that there's some overlap between what you know and the new material. It will help if you can introduce the new stuff with as much ease as possible.
Recall what speaks to you or demonstrates most effectively when you learn, and plot your course based on that. If you like books, find one that suits your style. Most of the books will give you most of the same information, so what will make the most difference in a case like this is the style of the book. For me "Java Objects" by J. Baker did the trick, my friend swears by "Thinking in Java". ... or find some screencasts if you like screencasts.
Then of course fire up the debugger and step through some code, but whatever you do, first make sure you're enjoying it.
Start with Ruby. There is a wonderful online tutorial that lets you try Ruby right in your browser. It covers the essentials of flow control and collections.
http://tryruby.sophrinix.com/
Java and PHP have more in common syntactically than Ruby and PHP. Sometimes that makes it harder instead of easier when learning a new language. That's why I think learning Ruby before Java will help. Ruby is also a lot easier to learn than Java, so the emotional return is greater and you'll be more equipped for Java.
As to learning Java, start with a Tutorial using Tapestry. Not so that you learn Tapestry, but so you gain the benefit of rapid development.
"Java developers love it because they can make Java code changes and see them immediately ... no redeploy, no restart!"
The downside to Tapestry is the potential mess of setting up Tomcat.
I'd recommend getting yourself familiar with an IDE (e.g. Eclipse) and working through some basic HelloWorld-esque problems. This will let you understand the lifecycle of a Java program and some basic I/O. Maybe even take parts of a PHP project you've written and port them over to Java to get the basic syntax ideas down.
