Clojure targeted initially to servers? - java

Since Clojure is designed to run in a Java virtual machine (JVM), I don't understand this statement:
While Clojure started its life mainly as a server-side language, the advent of ClojureScript demonstrates that the core developers don't see that as its only purpose.
I am not really familiar with Java, though I am interested in Lisp languages and hence Clojure, so this makes me wonder. Most web servers I've worked on are traditional Apache variants with standard server-side languages like Ruby, PHP, Perl, but I've never seen Java as a default installed server language in my hosting environments, so what is the meaning of this statement?
Second, JVMs are typically run on client operating systems like Mac or Windows, are they not? Sun says there are many billions of JVMs in the world, obviously this is not referring to servers.

One main point is that Clojure has several important philosophies and practices that, when applied to a particular runtime environment such as the JVM, JavaScript/ECMAScript, etc. yield a powerful language. These philosophies include:
Simplicity: The ability to separate distinct parts. All the Clojure variants provide for the separation of code and data. This includes the ability to deal with data independently of the code that produced it. Concretely, the ability to read and write both compound and simple (non-compound) data structures is built into the language.
Immutable Data Structures: All Clojure variants have data structures where producing a new version of even a very large data structure is efficient and leaves the old data intact. If you pass a large data structure to several threads, there is no need for locking because they work on different "forks" of the data. This is all done without copying (with structural sharing) and is efficient.
Explicit handling of Identity, State and Time: All the Clojure variants provide explicit handling of sequences of events built into the language. This differs between variants depending on the platform. For instance, ClojureScript, which produces JavaScript as its output, has no place for coordinated synchronous updates because JavaScript has only one thread, though it has all the rest of the types.
There is a lot more, and it can be found on the Clojure Philosophy Page. It is also worth mentioning that, if not the majority, then a very large part of the world's web applications are written almost entirely in Java. Many people find that Clojure provides them a way to interact with this world, even if Java isn't their preferred language.
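The structural sharing mentioned under Immutable Data Structures can be illustrated outside Clojure too. The following is only a minimal sketch in Python, not how Clojure's own collections are implemented (those use more elaborate tree structures): a hypothetical immutable linked list, where `Node`, `cons` and `to_list` are names made up for this example, in which two "forks" reuse the same tail without copying it.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """One immutable cell of a singly linked list (hypothetical example type)."""
    value: object
    rest: Optional["Node"]


def cons(value, rest=None):
    """Return a NEW list with `value` in front; `rest` is reused, not copied."""
    return Node(value, rest)


def to_list(node):
    """Walk the cells and collect the values into an ordinary Python list."""
    out = []
    while node is not None:
        out.append(node.value)
        node = node.rest
    return out


base = cons(2, cons(3))   # the shared data: 2 -> 3
fork_a = cons(1, base)    # 1 -> 2 -> 3
fork_b = cons(0, base)    # 0 -> 2 -> 3

print(to_list(fork_a), to_list(fork_b), to_list(base))
print(fork_a.rest is fork_b.rest)  # both forks point at the very same tail
```

Because `base` is never modified, both forks could be handed to different threads without any locking.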

Java is exactly as much of a server-side language as Ruby or Perl (though not really PHP): It's a general-purpose language that is frequently used to write server applications, including Web applications and SOA services. Whether or not Java is "installed as a default", it's typically trivial to install on the Unix machines that are the usual hosts for Java servers.
A JVM can theoretically run on any platform; there are JVMs that run on bare x86 hardware, and Blu-Ray players have embedded JVMs. Sun originally thought that Java was the future for rich-client applications, but instead it's found a much wider use in powering Web sites and other services that clients access through various APIs.

By server-side language the author does not mean just the web server. It may include a whole stack of services running on the server from simple file upload to big data processing, served to the clients via the web.
Second, JVMs are typically run on client operating systems like Mac or Windows, are they not?
The JVM is a development platform as well as a deployment platform. There are numerous web-scale applications deployed on the JVM. It is very common to have a JVM installed on your servers if your stack includes Java-based services.

The author of that statement means that Clojure was envisioned as a server-side language, but that it is powerful enough that there is demand to use Clojure on the client as well.
An important distinction though is that it isn't like Clojure is actually running in a browser. ClojureScript is a tool that compiles client-side (i.e. browser) Clojure code into JavaScript. It is similar to CoffeeScript, which compiles Ruby-style code into JavaScript.
So ClojureScript is nothing more than syntactic sugar that lets people who love the power and succinctness of Clojure on the server side still write Clojure on the client side in the browser. But in the end, that client-side Clojure isn't really Clojure at all but JavaScript.
So when it comes to ClojureScript, the JVM is irrelevant.

I don't know much about Clojure's history, but it seems clear that it has been intended as a general-purpose language for some time--whatever initially pushed Hickey et al. to want to develop it. Because Clojure supports easy access to existing Java libraries and is able to create standard Java-style jar files--both crucial benefits on a server as well as elsewhere--it would have been obvious early on that Clojure could be useful outside of servers. So my answer to why "the advent of ClojureScript demonstrates that the core developers don't see [server side applications] as its only purpose" is that no such demonstration was needed.

Related

Languages that compile to Java Bytecode and can run on the JVM

I am an embedded programmer and working with an embedded JVM.
This enables running Java files on constrained devices.
These Java files are first compiled to bytecode into .class files which are then further optimized and uploaded to the device which has a micro JVM to run the optimized bytecode.
The micro JVM does not support all features, e.g., no reflection.
The main benefit is obvious: this allows programming in Java for constrained devices.
However, I was thinking that plenty of languages compile to bytecode, some are listed https://en.wikipedia.org/wiki/Java_bytecode.
So in theory these languages could also be used to program.
I'd like to obtain a list of common languages that compile down to bytecode and was wondering if you could help.
For example, Python has special implementations that reduce to Java Bytecode, if I'm not mistaken, and stuff like C to Java virtual machine compilers also exist.
So what languages would you think are logical to try and run on the devices? Any pointers on how to or similar experiences?
Also, I'm not clear what the difference is from reading Wikipedia between (Python) bytecode and Java bytecode, could anybody help explain that?
I agree with you about the overall idea, and it would be nice to develop an embedded application using any language that can run on a JVM. But there are some practical issues that you should consider, and I think that's why none of the major vendors or open source initiatives have any active/serious project on this (as far as I know).
As you mentioned, JVM implementations that can run on embedded devices each have their own constraints and limitations. The most obvious one is that some packages may not be available at runtime. In order to enforce such a constraint, you should either control it in the compile process or have a toolchain (sort of an SDK) which accepts the bytecode and checks such constraints.
This situation gets worse when a developer tries to use a third-party library that is available for that specific language. It's not easy to guess whether a library is safe to use with such a framework or not.
One great facility for developers would be to have their IDE check such issues on the fly (something like inspections in IntelliJ IDEA). This would make it much smoother to move toward such a solution. But again, the problem is that each such language needs a specific plugin compatible with its own syntax.
Also, some JVM languages that are actually implementations of an existing language (e.g. Jython or JRuby) are most of the time out of sync with the original language when it comes to supporting that language's libraries and syntax changes.
Anyway, a list of JVM languages is easy to find on Wikipedia. Maybe you mean those that are worth considering in this regard because they have a large community and tool support. In my opinion, you should focus on the following JVM languages as the ones worth including in your final list:
Groovy
Kotlin
Scala
These are all pure JVM languages and only use a different syntax than Java.
Regarding the topic in general, when you search for embedded JVM implementations, you'll notice that it's also a fairly academic subject, and there are many publications on it covering the overall architecture, threading support, toolchain, error handling, memory management, etc. This means that you need a strong background in both embedded systems and programming language concepts and implementation to be able to devise a proper architecture for such a platform.
About your last question regarding the difference between Python bytecode and Java bytecode (if I understand your question correctly), these are conceptually the same, but each has its own instruction set and constraints. The bytecode concept refers to the output of the compiler: a platform-independent representation of the original code that can be run/interpreted at runtime by another software component, the virtual machine. In the Java world, this software is called the Java Virtual Machine (JVM). I'm from the Java world so I don't know what it's called in Python vocabulary, but it should be something similar (e.g. Python virtual machine).
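For the Python side, CPython also compiles source code to bytecode and executes it in its own interpreter loop, and the standard `dis` module can display it. A small sketch (the function `add` is only an illustration, and the exact instruction names differ between Python versions, so none are listed here):

```python
import dis


def add(a, b):
    return a + b


# The raw bytecode is a bytes object attached to the function's code object.
raw_bytecode = add.__code__.co_code

# dis decodes those bytes into named instructions for the Python virtual machine.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]

print(len(raw_bytecode), "bytes of bytecode")
print(opnames)
```

These instructions are unrelated to JVM instructions, which is why a JVM cannot run Python bytecode directly; implementations such as Jython compile Python source to Java bytecode instead.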
I think that, due to the complexity of developing such a toolchain and considering the unprecedented development of new IoT and SoC devices, many of them capable of running higher-level operating systems, in the long run most developers may prefer to develop for higher-end devices using higher-level APIs and SDKs. Who knows! In that case, we would be in the same situation that we're in today for PCs. Languages like C and Assembly are still in use, but they have their own domain of applications. Over time, layers of abstraction are added on top of the previous ones. The same thing can happen for embedded devices.

Using IPC to combine multiple languages

This is a general "noob" question about software design, so I apologise if it seems vague, but I would really appreciate the advice. Note the system described below is purely an example, not a specific product I have in mind.
I often have a need to combine the functionality of several libraries or utilities, written in different languages. For example, if I want to code a high-performance audio processing application for the desktop, I will write it in C / C++. Then, I want to add a nice GUI. But I don't want to learn Qt. I like the look and feel of Adobe Air, and would like to use that. Later, I have a need to access a USB device. But the USB library I have only has an API in Java. How can I combine all these elements together, to take advantage of their relative strengths?
Clearly, I cannot compile these various elements into one single executable. So I need to create and run them separately, and give them a means to communicate. The most common way to do this seems to be using IPC (Inter Process Communication), e.g. shared memory or sockets. I prefer the idea of sockets, as the programs could potentially run on separate machines on a network.
So I decide to create a local client / server system, with a custom API, to allow these elements to communicate. For example, the Air application will receive a message from the C application, telling it to update its UI. The USB application running in Java will use the sockets to stream audio from the USB hardware, into the C application.
My question : is using local sockets in this way a typical way to design such a system?
Will the performance be much worse than a truly native application (e.g. everything in Java or C, in a single executable) ? It also seems likely that such an approach would be prone to bugs, and difficult to maintain?
I frequently find myself coming up against the limits of existing software libraries (e.g. a graphics library with a pretty, flexible UI but no way to access low-level hardware, or a media library that can mix many audio streams, but has no support for video playback), and find it very frustrating. If anyone could advise the best way to combine arbitrary software libraries like this, I would really appreciate it.
Thanks in advance!
As you have correctly identified, combining libraries from different languages or platforms is hard. There are several ways to do it, but none are ideal. Examples:
Native call interfaces (e.g. JNI / JNA) - very fast but tricky to make work correctly, and you have the problem that the data types used typically don't map cleanly across different platforms. Adds native dependencies.
Socket based IPC with text protocol (XML, JSON, etc) - works OK and common formats are likely to be supported at both ends, but adds a lot of overhead. Can be a pain to maintain custom schema mappings etc.
Socket based IPC with binary protocol (e.g. Google protocol buffers) - quite efficient, but needs a lot of work to get a custom protocol working correctly on both ends.
Communication via a 3rd system (e.g. database, message queue, filesystem) - lots of overhead, can get fragile, introduces a major dependency on a 3rd system.
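To make the two socket-based options above a bit more concrete, here is a rough sketch in Python that sends the same message once as JSON text and once in a hand-rolled binary layout. Everything specific in it is an assumption made for illustration: the message fields, the 6-byte binary layout, and the 4-byte length prefix used for framing. `socket.socketpair()` stands in for two separate processes so the example runs on its own.

```python
import json
import socket
import struct


def send_frame(sock, payload):
    """Prefix each message with its length so the receiver knows where it ends."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, count):
    """Keep reading until exactly `count` bytes have arrived."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        data += chunk
    return data


def recv_frame(sock):
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


# Hypothetical message from the audio engine to the GUI.
message = {"cmd": "set_level", "channel": 3, "level": 0.75}

# Text protocol: self-describing, easy to parse in most languages, but larger.
text_payload = json.dumps(message).encode("utf-8")

# Binary protocol (made-up layout): 1-byte command id, 1-byte channel, 4-byte float.
binary_payload = struct.pack("!BBf", 1, 3, 0.75)

audio_side, gui_side = socket.socketpair()
send_frame(audio_side, text_payload)
send_frame(audio_side, binary_payload)

received_text = json.loads(recv_frame(gui_side).decode("utf-8"))
received_binary = struct.unpack("!BBf", recv_frame(gui_side))

audio_side.close()
gui_side.close()

print(len(text_payload), "bytes as JSON,", len(binary_payload), "bytes as binary")
```

The binary form is smaller, but both ends have to agree on the layout by hand, which is the maintenance cost the list above refers to.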
In my experience, it usually isn't worth integrating a new language / platform just to get one specific library or feature. Take your user interface example - no matter how nice Adobe Air looks, I doubt it is worth trying to integrate it with an existing C/C++ application.
Even if you get it to work, it will significantly complicate the future maintenance and development of your application. Builds become more complex. You need to maintain additional communication / "glue" code. You need to manage more dependencies. Your users will get hit by many more configuration issues. Testing becomes more difficult. It becomes harder to teach someone new about how the whole system works. You need to maintain your skills in more languages / frameworks etc.
I'd recommend the following strategy:
Pick a primary platform
Whenever you need a new library or feature, look for something on your primary platform first. Hopefully (usually?) there is something good available - but even if not then it might be worth coding something yourself if the requirement is quite small.
Only if there is no reasonable option on the primary platform, then you can start to think about integrating a new language/platform
In terms of primary platform, I'd normally suggest a JVM language like Java, Scala or Clojure since the JVM is very well engineered, offers great performance, is highly portable and has the largest / most cohesive library ecosystem (most of which is open source). The JVM is therefore probably the best "general purpose" choice unless you have some very specific requirement which is unlikely to be possible on the JVM, e.g.:
If you are doing lots of embedded / realtime / systems programming that requires hardware access you probably need to go for C/C++
If you are coding purely for web-based clients, you probably want to use JavaScript (if you are also writing code on the server side you can consider JavaScript code generation frameworks/libraries that can work on the JVM, e.g. Vaadin or ClojureScript)
The answer pretty much depends on the technologies you're using, and there is no silver-bullet solution for this.
In general, the solutions fall into one of the following categories:
Some interprocess communication techniques
Integrations provided by the language/platform itself
Database/some common storage (even files :) )
Example of the first:
Sockets/pipes/whatever your operating system allows.
CORBA - allows you to write distributed code in different languages.
Google protobuf - allows serialization/deserialization of data objects, and it's language agnostic
For the second it really depends on language/ecosystem you're using.
Examples for Java:
JNI - Java Native Interface - allows you to execute native code (DLLs/.so files) outside the JVM.
JCA - if you're in the enterprise environment - you can write the integration with the legacy systems in this.
For languages that are compiled into native code it's less tricky - you can write and compile some code, say in Pascal, and then use the DLL in C.
In the Java world there is also a plethora of languages that have their own syntax and compiler, but whose compiler produces Java bytecode that can run inside the JVM. So if your solution is based on these languages, the integration will be easier. Languages like Scala, Groovy, Clojure, Jython and so on fall into this category.
Last but not least, Web Services should be mentioned. This is a very popular tool for integrating different systems, although it's used more in enterprise environments.
Basically it's an abstraction over the sockets layer that allows you to send data objects in XML/JSON format between processes/servers. Both XML and JSON are language agnostic, so it's not an issue to create an XML document in a program written in C++ and then consume it in Java.
Hope this helps

RPG to Java migration on an IBM iSeries

Our company uses an IBM iSeries for a majority of our data processing. All of our internal apps are written in RPG. According to IBM's roadmap, IBM is pushing companies to move to Java/J2EE. We're looking to modernize our internal apps to a more GUI interface. We provide an external web presence by using Asp.Net webs, although perhaps greenfield projects could be Java. One option is to use a screen scraper app while staying on RPG but I think it may be better to slowly go the way of IBM's roadmap and move to Java. Our goal is to migrate to a GUI interface and to be inline with IBM's roadmap.
Have you been involved with an RPG to Java migration, even if only greenfield projects were Java and the brownfield projects remained RPG?
My management is afraid that:
1) updating JRE on workstations, particularly thin clients, could cause an administrative nightmare (Our company uses 80% thin clients and 20% PCs) and
2) Java demands too much overhead of the workstation to run effectively
3) Incompatibility between JRE clients as we update, potentially breaking other apps requiring the JRE.
Can you shed some light on this? Are there any huge benefits? Any huge gotchas?
CLARIFICATION: I am only interested in a migration to Java. What is the difficulty level and do I lose anything when going from RPG to Java? Are the screens very responsive when migrated to Java?
My company is also attempting to migrate to Java from RPG.
1) We're not attempting to use a JRE on a thin client; we're moving to web applications delivered through a browser. This may entail (eventually) replacing our old POS-scanners with some of the newer PC-based ones.
2) I have been informed (by company architects) that the JVM on the iSeries OS does have some performance issues. I do not personally know what these limitations are. In our case the migration has involved allocating an AIX resource, which is supposed to be much better - talk to your IBM rep about whether you just need to purchase the OS license (I just program on the thing, I don't get involved in hardware).
3) See response to question 1. In a larger context, where you're trying to update the browser (or any other resource), this is usually handled by having enterprise licenses - most will have options to allow forced, remote updates.
Some other notes:
You should be able to move to just using .NET, although you may need different hardware/partitions to run the environment. You can talk to DB2 that way, at least. The only benefit Java has there is that it will run on the same OS/hardware as the database.
I've seen a screenscraper application here (it was in VB.NET, but I'm fairly sure the example applies). Screen-scraping was accomplished by getting/putting characters to specific positions on the screens (the equivalent of substring()). That could be just the API we were using - I think I've heard of solutions that were able to read the field names. However, it also relied on the RPG program flow for its logic, and was otherwise not maintainable.
Most of the RPG programs I've seen and written tend to be a violation of MVC, meaning you can't do anything less than integration testing - the history and architecture of the language itself (and some developers) prefers that everything (file access to screen display) be in one file. This will also make attempting to wrap RPG for calling remotely effectively impossible. If you've properly separated everything into Service Programs, you should be able to wrap them up (as the equivalent of a native method call, almost) neatly - unfortunately I haven't seen anything here that didn't tend to rely on one or more tricks that wouldn't hold up for typical Web use (for example, using a file in QTEMP for controlling program execution - the session on the iSeries effectively disappears every time a new page is requested...).
Java as a language tends to promote better separation of code (note that it can be misused just as badly), as it doesn't have quite the history of RPG. In general, it may be helpful to think of Java as a language where everything is a service program, where all parameters are passed with VALUE set, OPTIONS(*nopass : *omit) is disallowed, CONST is generally recommended, and most parameters are of type DS (datastructure - this is a distinct type in RPG) and passed around by pointer. Module level parameters are frowned upon, in favor of encapsulating everything either in passed datastructures or the service program procedures themselves. STATIC has a somewhat different use in Java: it makes a variable global, and it is not available inside of procedures.
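The position-based screen scraping described in the notes above can be pictured roughly as follows. This is a sketch in Python rather than VB.NET or Java, and the 24x80 screen size, the field position and all function names are assumptions made up for the example, not the API that application used.

```python
ROWS, COLUMNS = 24, 80  # assumed terminal screen size


def blank_screen():
    return [" " * COLUMNS for _ in range(ROWS)]


def put_field(screen, row, col, text):
    """Write text at a 1-based (row, col) position, like typing into the screen."""
    line = screen[row - 1]
    screen[row - 1] = line[:col - 1] + text + line[col - 1 + len(text):]


def get_field(screen, row, col, length):
    """Read `length` characters at a 1-based (row, col) position."""
    return screen[row - 1][col - 1:col - 1 + length].rstrip()


# Hypothetical layout: customer number at row 3, column 20, 8 characters wide.
CUSTOMER_FIELD = (3, 20, 8)

screen = blank_screen()
put_field(screen, 3, 2, "Customer number:")
put_field(screen, 3, 20, "00012345")

customer = get_field(screen, *CUSTOMER_FIELD)
print(customer)
```

If the RPG program moves that field by even one column, `CUSTOMER_FIELD` silently reads the wrong characters, which is part of why this approach is hard to maintain.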
RPG is quite a bit more simple than Java, generally, and OO-programming is quite a different paradigm. Here are some things that are likely to trip up developers migrating to Java:
Arrays in RPG start at 1. Arrays in Java start at 0.
Java doesn't have 'output' parameters, and all primitive types are passed by value (copied). This means that editing an integer won't be visible in calling methods.
Java doesn't have packed/signed encoding, and so translating to/from numbers/strings is more involved. The Date type in Java also has some serious problems (it includes time, sort of), and is far more difficult to meaningfully change to/from a character representation.
It's harder to read/write files in Java, even when using SQL (and forget about using native I/O directly with Java) - this can be mitigated somewhat with a good framework, however.
There are no ENDxx operators in Java, everything uses brackets ({}) to specify the start/end of blocks.
Everything in Java is in free format, and there are no columnar specifications of any sort (although procedure signatures are still required). There is no hard limit on line length, although ~80 characters is still recommended. The tools (the free ones, even) are better, period, and generally far more helpful (although they may take some getting used to for those exposed to SEU). There are also huge, free libraries available for download.
The = sign is not context-sensitive in Java the way it is in RPG, it is always used for assignments. Use the double-equals, == operator for comparisons of values in Java.
Objects (datastructures) cannot be meaningfully compared with == - you will often need to implement a method called equals() instead.
Strings are not mutable, they cannot be changed. All operations performed on strings (either on the class/datastructure itself, or from external libraries) return brand new references. And yes, strings are considered datastructures, not value types, so you can't compare them with == either.
There are no built-in equivalents to the /copy pre-compiler directives. Attempting to implement them is using Java incorrectly. Because these are usually used to deal with 'boilerplate' code (variable definitions or common code), it's better to deal with this in the architecture. Variable (all D-specs, actually) definitions will be handled with import or import static statements, while common-code variants are usually handled by a framework, or by defining a new class.
I'm sure there are a number of other things out there, let me know if you have any other questions.
Distributing and managing a fat client would be an absolute nightmare.
The ideal solution is a Java based web application hosted on the iSeries. The workstations access your applications through a web browser just like ASP.NET.
I've been using the Grails Framework to modernize and create new applications and it is working wonderfully.
When IBM says you should move to Java/J2EE then you should probably move your applications to web applications like your asp.net web apps. You should probably use a feature rich interface like JSF or GWT.
Web applications don't have to worry about JRE problems as you just need a standard browser.
However I don't know RPG and I don't know the suggested migration strategy.
I am a developer involved in as400 modernization. So far, from my experiences, I can give you my insights.
In addition to Java EE based websites, you can probably go for jax-ws based web services, which provide services for different flat and grid screens.
The clients can consume them in whichever technology they desire. Some lag is there, but the overall usability is good as in the normal web based applications.

Find an efficient way to integrate different language libraries into one project using Python as the "glue"

I am about to get involved in an NLP-related project and I need to use various libraries. Some are in Java, others in C/C++ (for tasks that require more speed) and finally some are in Python. I was thinking of using Python as the "glue" and creating wrapper classes for every task that I want to do that relies on a different language. In order to do that, the wrapper class, for example, would execute the Java program and communicate with it using pipes.
My questions are:
Do you think that would work for cpu-demanding and highly repetitive tasks? Or would the overhead added by the pipe-communication be too heavy?
Is there any other (preferably simple) architecture that you would suggest?
I would simply advise not doing this.
Don't implement stuff in C/C++ "for speed". The performance benefit is not likely to be as great as you expect; e.g. compared with implementing in Java using "best practice" design and performance techniques.
Don't try and glue lots of languages together. You are setting yourself up for lots of portability issues, difficulties in debugging, and reliability issues; e.g. due to C / C++ bugs crashing the JVM. In addition, there are performance overheads in bridging between languages, and there can be unexpected bottlenecks. (For instance, you may find that your C/C++ has to be run single-threaded due to threading issues, and that you therefore can't get the benefit of Java multi-threading on a typically multi-core system.)
Instead, I advise you to look for libraries that allow you to implement the entire application in one language. If that is not possible, design it so that the different language components are different executables / processes, communicating via some kind of RPC, messaging, or whatever.
Whether or not you'd have problems communicating over pipes / sockets has nothing to do with how CPU intensive the tasks are, but how frequently you'd need to send information between the processes and how much data they need to send. Setting up threads to do your communication will have little processing overhead.
You can probably automatically wrap the C/C++ code with Python (SWIG, ctypesgen, Boost.Python), so the only glue you'll have to write yourself would then be talking to Java.
You could also do it the other way -- run the Python code in the JVM with Jython so the Python and Java code are together, then talk to the C/C++ from there.
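As a rough idea of what such generated wrappers boil down to, Python's standard `ctypes` module can call a C function in a shared library directly. The sketch below calls `Py_GetVersion` from the C API of the running CPython interpreter itself (through `ctypes.pythonapi`), chosen only because it is available wherever CPython runs. A real project would load its own library with `ctypes.CDLL` and a path, which is not shown here.

```python
import ctypes
import sys

# ctypes.pythonapi exposes the C API of the running CPython interpreter.
get_version = ctypes.pythonapi.Py_GetVersion

# Declare the C signature by hand: const char *Py_GetVersion(void)
get_version.restype = ctypes.c_char_p
get_version.argtypes = []

# The call crosses from Python into compiled C code and back.
version_from_c = get_version().decode("utf-8")
print(version_from_c)
```

Declaring `restype` and `argtypes` for every function is the tedious part that tools like ctypesgen or SWIG automate.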
You should take a look at Apache UIMA. It is designed exactly for this. From the project website:
The Frameworks run the components, and are available for both Java and C++. The Java Framework supports running both Java and non-Java components (using the C++ framework). The C++ framework, besides supporting annotators written in C/C++, also supports Perl, Python, and TCL annotators.
UIMA can manage pipes and annotators and is built to scale.
I would look at Jepp or JPype instead of using IPC for this. I would avoid Jython since loading the C/C++ libraries into Java would probably be harder than into CPython.
1) Do you think that would work for cpu-demanding and highly repetitive tasks? Or would the overhead added by the pipe-communication be too heavy?
Depends on your task. If this is a typical NLP app where you have a large model loaded in memory and you only communicate relatively small pieces of data (strings in, label sequences/parse trees out), it may work. Pipe communication is hard to get right, though, since there's a lot of buffering and synchronization issues you have to tackle. Python is a very good glue language, but it doesn't solve everything.
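A minimal sketch of such a pipe-based wrapper class is shown below. The child process here is a tiny Python script standing in for the Java program (an assumption made so the example runs anywhere), and the one-line-in, one-line-out protocol and the `PipeWorker` name are likewise invented for illustration. Note the explicit flushes on both sides: forgetting one of them is a typical way for this design to hang.

```python
import subprocess
import sys

# Stand-in for the external component: reads a line, answers with one line.
CHILD_SOURCE = r'''
import sys
for line in sys.stdin:
    sys.stdout.write(line.strip().upper() + "\n")
    sys.stdout.flush()  # without this the parent may wait forever
'''


class PipeWorker:
    """Keeps one child process alive and talks to it over stdin/stdout."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(
            argv,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered on the parent side
        )

    def request(self, line):
        self.proc.stdin.write(line + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().rstrip("\n")

    def close(self):
        self.proc.stdin.close()  # EOF tells the child to exit its loop
        self.proc.wait()
        self.proc.stdout.close()


worker = PipeWorker([sys.executable, "-c", CHILD_SOURCE])
replies = [worker.request(text) for text in ["the cat", "sat down"]]
worker.close()
print(replies)
```

Starting the child once and reusing it matters for the NLP case: the expensive model load happens a single time, and each request only pays for a small write and read.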
2) Is there any other (preferably simple) architecture that you would suggest?
Make your NLP components services and connect to them via REST interfaces. There are off-the-shelf tools that do this, e.g. CLAM. Pyro and SPIRO make communication between Java and Python even more direct and might be easier to use than HTTP/REST (but YMMV).
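To give a feel for the service approach without any of the tools named above, here is a bare-bones sketch using only Python's standard library. The whitespace "tokenizer", the `/tokenize` path and the JSON shape are all placeholders invented for the example; a real component would wrap the actual NLP library behind the same kind of endpoint.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class TokenizeHandler(BaseHTTPRequestHandler):
    """Toy NLP service: POST {"text": ...} and get {"tokens": [...]} back."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        text = json.loads(self.rfile.read(length))["text"]
        body = json.dumps({"tokens": text.split()}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def tokenize_remote(url, text):
    """Client side: any language with an HTTP library could do the same."""
    request = urllib.request.Request(
        url,
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # Ignore any system proxy settings for this local call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return json.loads(response.read())["tokens"]


server = HTTPServer(("127.0.0.1", 0), TokenizeHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

tokens = tokenize_remote(f"http://127.0.0.1:{server.server_port}/tokenize", "pipes are hard")
server.shutdown()
server.server_close()
print(tokens)
```

Compared with raw pipes, HTTP takes care of message framing and works unchanged if the component later moves to another machine, at the cost of more overhead per call.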
The parts that are written in C/C++ can also be integrated with CPython using Cython. Don't start implementing things in C or C++ because you think they'll be faster, though; you can also implement them in Python first, then see if you can get the desired performance with NumPy and/or Cython.

JVM/CLR Source-compatible Language Options

I have an open source Java database migration tool (http://www.liquibase.org) which I am considering porting to .Net.
The majority of the tool (at least from a complexity side) is around logic like "if you are adding a primary key and the database is Oracle use this SQL. If database is MySQL use this SQL. If the primary key is named and the database is Postgres use this SQL".
I could fork the Java codebase and convert it (manually and/or automatically), but as updates and bug fixes to the above logic come in I do not want to have to apply them to both versions. What I would like to do is move all that logic into a form that can be compiled and used by both Java and .Net versions natively.
The code I am looking to convert does not contain any advanced library usage (JDBC, System.out, etc) that would vary significantly from Java to .Net, so I don't think that will be an issue (at worst it can be designed around).
So what I am looking for is:
A language in which I can code common parts of my app in and compile it into classes usable by the "standard" languages on the target platform
Does not add any runtime requirements to the system
Nothing so strange that it scares away potential contributors
I know Python and Ruby both have implementations for the JVM and CLR. How well do they fit my requirements? Has anyone been successful (or unsuccessful) using this technique for cross-platform applications? Are there any gotchas I need to worry about?
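For a sense of what the shared "if database is X use this SQL" logic could look like in one of those languages, here is a purely illustrative Python sketch. It is not Liquibase's actual code: the function name is made up, and the only per-database difference it models is identifier quoting (backticks for MySQL, double quotes for Oracle and PostgreSQL). Whether Jython or IronPython would meet the "no runtime requirements" condition is a separate matter that this sketch does not address.

```python
# Identifier quote character per database (the only dialect difference modelled here).
IDENTIFIER_QUOTES = {
    "oracle": '"',
    "postgresql": '"',
    "mysql": "`",
}


def add_primary_key_sql(database, table, columns, constraint_name=None):
    """Build an ADD PRIMARY KEY statement for the given database (simplified)."""
    try:
        q = IDENTIFIER_QUOTES[database]
    except KeyError:
        raise ValueError(f"unsupported database: {database}") from None

    def quote(name):
        # Simplification: names containing the quote character are not escaped.
        return f"{q}{name}{q}"

    column_list = ", ".join(quote(column) for column in columns)
    if constraint_name is None:
        return f"ALTER TABLE {quote(table)} ADD PRIMARY KEY ({column_list})"
    return (
        f"ALTER TABLE {quote(table)} "
        f"ADD CONSTRAINT {quote(constraint_name)} PRIMARY KEY ({column_list})"
    )


print(add_primary_key_sql("mysql", "person", ["id"]))
print(add_primary_key_sql("postgresql", "person", ["id"], "pk_person"))
```

Keeping rules like these in a data table rather than in nested conditionals is one way to make the logic easier to port or share, whichever language ends up hosting it.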
Check out the Fantom programming language. It has its own Java-like/C#-like syntax but can target either the Java VM or .NET CLR.
Their "Why Fantom" page gives a high-level overview of their approach to portability versus dynamic languages running on a VM.
You might have some luck using IKVM.NET. I'm not sure on its exact status, but it's worth a try if you're insistent on running Java code on the .NET Framework. It includes a .NET implementation of the Java base class library, so it seems reasonably complete.
The only other option I might suggest is porting the code to the J# language, a full .NET language (although not first class in the sense that C# or VB.NET is). The language was designed so that the differences with Java were minimal.
If you are thinking about an embedded approach, you might look at Lua.
