Advanced Java File I/O tutorials? Tips? Advice?

I'm working on a project right now that will make use of Java File I/O that goes beyond the simple "write this string to a file" documentation and tutorials that I find on the net. This project will essentially provide a database mechanism, similar to the popular "NoSQL" databases that are gaining a lot of press these days. However, I'm unable to find a ton of documentation that provides detailed information on which APIs to use, how to use them, etc. I've also been looking for any generally accepted design patterns around Java File I/O, but without any luck.
If I had to list a couple of requirements, I'd say:
Pseudo-transactional support (not a hard requirement, as it can be implemented higher up in the API stack)
Ability to write data of an arbitrary length in a structure that can be read back later on
Indexing
Ability to remove an object from the "database" efficiently
Fast searching
Possible multi-threaded access (multiple read threads, single write, most likely)
Can anyone point me to any tutorials, documentation, design patterns, etc. that may be helpful? Are there any open source frameworks that revolve around Java File I/O? I know of a lot of frameworks that provide wrappers around NIO for the purposes of Network I/O, but nothing File-related.
Thanks for any help you can provide!

Take a look at Apache Commons Transaction. It supports transactional file access, by performing the work in temporary files, and committing the work by moving them to the actual files.
You might also be interested in the XADisk project, although I haven't pored through its sources.
As far as searching is concerned, the Apache Solr and Lucene projects would be of help.
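The temp-file-then-move idea mentioned above can be sketched with plain java.nio.file, without any library. This is only an illustration of the general technique, not how Apache Commons Transaction is implemented; the class and method names (AtomicFileCommit, commit) are made up for the example, and it assumes the temporary file and the target live in the same directory so the move stays on one file system.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: writes new content to a temp file, then "commits" it by moving it over the target.
final class AtomicFileCommit {

    static void commit(Path target, byte[] data) throws IOException {
        // Create the temp file next to the target so the move does not cross file systems.
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "commit-", ".tmp");
        try {
            Files.write(tmp, data);
            try {
                // Whether an existing target is replaced by an atomic move is
                // implementation specific; on POSIX systems it usually is.
                Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
            } catch (AtomicMoveNotSupportedException e) {
                // Fallback: a plain replace, which is not guaranteed to be atomic.
                Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
            }
        } finally {
            // Cleans up the temp file if the write or move failed; a no-op after a successful move.
            Files.deleteIfExists(tmp);
        }
    }
}
```

Readers that open the target see either the old content or the new content, never a half-written file, as long as the move is atomic on the platform. Note that this sketch does not force the data to disk before the move, so it is not a durability guarantee.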

Related

Import .csv to RTC using JAVA

I am working on IBM RTC and I need to import a .csv file to RTC using JAVA. Is there a way to do this? If yes, can someone help me with the same.

Parsing CSV data is something that you definitely do not want to implement yourself; there are plenty of libraries for that (see here).
RTC offers a wide range of APIs that you can use; see:
rsjazz.wordpress.com or
jazz.net
In that sense: you can write Java code that reads CSV data, and RTC has a rich API that allows you to push "content" into the system.
But a word of warning: I used that Java API some years ago to manipulate information within our RTC instance. That was a very painful experience. I found the APIs to be badly documented and extremely hard to use. It took me several days to come to working code that would make just a few small updates to our stories/tasks.
Maybe things have improved since then, but be prepared for, as said ... a painful experience.
EDIT, regarding your comment on "other options":
Well, I don't see any: you want to push data you have in CSV into your RTC instance. So, if you still want to do that, you have to use the means that are available to you! And don't let my words discourage you. A) It was some time back when I did my programming with RTC, so maybe their APIs are better structured and more intuitive today. B) There is some documentation out there (for example here). And I think everybody can register at jazz.net, so when you have further, specific questions, you might find "better" answers there!
All I wanted to say was: I know that other products such as Jenkins or SonarQube have great APIs, and working with those is nice, easy, fun. You can get things working with RTC, too. It's just that the path there may not be that nice and easy.
My personal recommendation: start with the RTC part first. Meaning: just try to write a small program that authenticates against the server and then pushes some example data into the system. If that works nicely for you, then spend the time on pulling / transforming the real data that you have in mind!

How to write custom storage plugin for apache drill

I have my data in a proprietary format, none of the ones supported by Apache Drill.
Is there any tutorial on how to write my own storage plugin to handle such data?

This is something that really should be in the docs but currently is not. The interface isn't too complicated, but it can be a bit much to look at one of the existing plugins and understand everything that is going on.
There are 2 major components to writing a storage plugin: exposing information to the query planner and schema management system, and then actually implementing the translation from the datasource API to the Drill record representation.
The Kudu plugin was added recently and is a reasonable model for a storage system with a lot of the elements Drill can take advantage of. One thing I would note is that if your storage system is not distributed and you just plan on making all reads remote, you don't have to do as much work around affinities/work lists/assignments in the group scan. If I have some time soon I'll try to write up a doc on the different parts of the interface and maybe write a tutorial about one of the existing plugins.
https://github.com/apache/drill/tree/master/contrib/storage-kudu/src/main/java/org/apache/drill/exec/store/kudu

Java app & C++ app integration / communication

We have two code bases, one written in C++ (MS VS 6) and another in Java (JDK 6).
Looking for creative ways to make the two talk to each other.
More Details:
Both applications are GUI applications.
Major rewrites or translations are not an option.
Communications needs to be two-way.
Try to avoid anything involving writing files to disk.
So far the options considered are:
ZeroMQ
RPC
CORBA
JNI
Compiling Java to native code, and then linking
Essentially, apart from the last item, this boils down to a choice between various ways to achieve interprocess communication between a Java application and a C++ application. Still open to other creative suggestions!
If you have attempted this, or something similar before please chime in with your suggestions, lessons learnt, pitfalls to avoid, etc.
Someone will no doubt point out shortly, that there is no one correct answer to this question. I thought I would tap on the collective expertise of the SO community anyway, and hope to get many excellent answers.

Well, it depends on how tightly integrated you want these applications to be and how you see them evolving in the future. If you just want to communicate data between the two of them (e.g. you want one to be able to open a file written by the other, or read a stream directly from the other), then I would say that protocol buffers are your best bet. If you want the window rendered by one of these GUI apps to actually be embedded in a panel of the other GUI app, then you probably want to use the JNI approach. With the JNI approach, you can use SWIG to automate a great deal of it, though it is dangerously magical and comes with a number of caveats (e.g. it doesn't do so well with function overloading).
I strongly recommend against CORBA, RMI, and similar remote-procedure-call implementations, mostly because, in my experience, they tend to be very heavyweight and consume a lot of resources. If you do want something similar to RMI, I would recommend something lighter weight where you pass messages, but not actual objects (as is the case with RMI). For example, you could use protocol buffers as your message format, and then simply serialize these back and forth across normal sockets.
Kit Ho mentioned XML or JSON, but protocol buffers are significantly more efficient than either of those formats and also have notions of backwards-compatibility built directly into the definition language.
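To make the "messages over normal sockets" suggestion concrete, here is a minimal sketch of length-prefixed framing on the Java side. It treats each message as an opaque byte array, which is where a serialized protocol buffer would go; the class name MessageFraming, the 4-byte big-endian length prefix, and the size limit are assumptions for the example, not something the protocol buffers library prescribes. The C++ side would need to read and write the same framing.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical framing helper: each message is a 4-byte big-endian length followed by the payload bytes.
final class MessageFraming {

    // Arbitrary sanity limit (an assumption) so a corrupt length cannot trigger a huge allocation.
    static final int MAX_MESSAGE_BYTES = 16 * 1024 * 1024;

    static void writeMessage(OutputStream out, byte[] payload) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(payload.length); // DataOutputStream writes ints in big-endian order
        data.write(payload);
        data.flush();
    }

    static byte[] readMessage(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readInt(); // throws EOFException if the peer closed the stream
        if (length < 0 || length > MAX_MESSAGE_BYTES) {
            throw new IOException("Invalid message length: " + length);
        }
        byte[] payload = new byte[length];
        data.readFully(payload); // blocks until the whole payload has arrived
        return payload;
    }
}
```

With a real socket you would pass socket.getOutputStream() and socket.getInputStream(). The length prefix is what lets the receiver find message boundaries, since a TCP stream does not preserve them.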

Use Jacob ( http://sourceforge.net/projects/jacob-project ), JCom ( http://sourceforge.net/projects/jcom ), or j-Interop ( http://j-interop.org ) and use COM for communication.

Since you're using Windows, I'd suggest using DDE (Dynamic Data Exchange). There's a Java library available from Java Parts.

I don't know how much data, or what type of data, you want to transfer.
But to keep things simple, I suggest using XML or JSON over HTTP.
There are lots of libraries for both on either side, so you won't spend much effort implementing or understanding it.
Also, if you have additional applications to talk to later, that won't be hard, since both technologies are cross-language.
Correct me if I am wrong.

What's the best way to implement a simple document management system?

I am planning to build a simple document management system, preferably built around the Java platform. Are there any best practices around this? The requirements are:
Ability to upload documents
Ability to Tag documents
Version the documents
Comment on documents
There are a couple of options that I am currently considering. The first option would be a simple API on top of SVN or CVS and use a DB backend to track tags, uploader, comments etc
Another option is to use the filesystem. Version the documents as copies in a versions folder and work with filenames.
Or, if there is an Open non GPL'ed doc management system, we could customize it to our needs and package it in our application. Does anybody have any experience building something like this?

You may want to take a look at Content repository API for Java and the several implementations (some of them free).

Take a look at the many document-oriented database systems out there. I can't speak about MongoDB or any of the others, but my experience with CouchDB has been fantastic.
http://couchdb.apache.org/
The best part of it is that you communicate with it via a REST protocol.

The best way is to reuse the efforts of others. This particular wheel has been invented quite a few times.
Who will use this and for what purpose?

Where can I find an AS400 to Java interface?

Does anyone have links and resources to connect to an AS400 from Java?
I remember that years ago somebody told me about a connector that simulates keystrokes from the keyboard, and about another, "purer" approach that connected directly.
On the web I have found a lot of links, but I cannot find a complete product to do this (I am probably not using the right keywords).
EDIT
Thanks for the answers:
What we are looking for is a way to access the data inside the AS400 and/or the screens it uses and expose them for other new applications re-use. Either as a webservice of some sort, or directly through Java ( and java will expose the operations using webservices )
Thanks in advance.
EDIT
As per MicSim post, I've also found this link:
http://www.ibm.com/developerworks/library/ws-as400/index.html

What you are looking for is probably the Toolbox for Java™ & JTOpen from IBM. There is also an AS400 class in the toolbox for performing specific AS400 tasks. You can look here and here for more details. Just googled it and hope it's helpful.

IBM's 5250 screen-scraping technology was "WebFacing" - I would post a link but you're probably better off Googling it, since IBM's documentation is so scattered. There are other technologies available too but: Screen-scraping was never anyone's favourite since typically you end up with something which, although it looks more up-to-date, actually is harder to use than a green screen and no more functional. The 5250 is probably the single best data entry platform I've ever used - web forms in a browser are one of the worst.
As mentioned, jt400 is the way to go for most other things. In particular:
JDBC - for all things SQL. If you do it right and address your files as though they really are tables, it's a way to get away from the 400 entirely.
Record-level access - write Java programs using a similar database API to RPGLE (all those chains, setlls that 400 programmers love)
Call programs, system commands, manage resources (data queues, data areas, prints / spools, jobs etc etc)
Good luck
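As a rough illustration of the JDBC route mentioned above, the sketch below connects through the jt400 JDBC driver and reads the first column of a query. It assumes the jt400 JAR is on the classpath (recent versions register the driver automatically; older ones may need Class.forName("com.ibm.as400.access.AS400JDBCDriver")). The class name As400JdbcSketch and the host, library, and credential parameters are placeholders for the example.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper showing the shape of JDBC access to an AS/400 via jt400.
final class As400JdbcSketch {

    // jt400 JDBC URLs have the form jdbc:as400://host/defaultLibrary
    static String jdbcUrl(String host, String library) {
        return "jdbc:as400://" + host + "/" + library;
    }

    // Requires the jt400 driver on the classpath and a reachable system.
    static Connection connect(String host, String library, String user, String password)
            throws SQLException {
        return DriverManager.getConnection(jdbcUrl(host, library), user, password);
    }

    // Runs a query and returns the first column of every row as strings.
    static List<String> firstColumn(Connection connection, String sql) throws SQLException {
        List<String> values = new ArrayList<>();
        try (Statement statement = connection.createStatement();
             ResultSet rows = statement.executeQuery(sql)) {
            while (rows.next()) {
                values.add(rows.getString(1));
            }
        }
        return values;
    }
}
```

From there the files on the system are queried like ordinary SQL tables, which is the point made above about getting away from the 400 entirely.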

If you just want to run Java on the AS/400 (or iSeries, or System i, or whatever IBM's marketing department has decided to call it this month), that's a supported language. You can access the pseudo-DB2 database directly. Or are you after some other form of integration?

This obviously depends on what you want to do, however if you want to simulate keystrokes across a network connection to an AS400 process then Expect4j may be the library you are looking for.
This is generally a really nasty hack though and there are frequently better ways to achieve your goals. What are you trying to do?
The Expect4j library can be found here. Expect was originally a Unix command that allowed you to specify a string that you expect to see and then a string of characters to send in response. It was frequently used for automating logins and for screen-scraping applications.

Even better is the TN5250j Console, which can be used to extract data from the AS/400.

Jacada makes tools to do what you're looking for:
http://www.jacada.com/
