A Java Excel API for addressing my requirements? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 9 years ago.
Problem description: I want to load image pixel data into an Excel sheet.
What I have tried: using Apache POI to write the data to Excel, but I found some limitations in Apache POI (elaborated below).
I know of some workarounds, but they are tedious for the programmer, and I am not willing to go through that for such a trivial-looking task.
Details:
I have been using Apache POI for quite some time, and I have come across a few limitations:
the whole file is held in memory at once, so it can't be used directly for bigger files.
(specific to HSSF):
no more than 256 columns
no more than 4000 cell styles
custom colors can't be used directly.
My requirement is to read an image (say, 1024x764) pixel by pixel and write each pixel value into the rows and columns of the Excel sheet, with every distinct pixel value styled differently.
The problems I have faced are:
an out-of-memory exception while writing to the Excel sheet, because of so many rows/columns and styles
writing logic for reusing styles would slow down the whole program
even if I reuse styles, what do I do about the huge number of rows/columns?
I know that there are workarounds for these problems:
reusing styles
writing logic for efficient memory usage
But I do not want to go to that much trouble for a job as simple as this, and since these are not limitations of Excel itself (at least not of .xlsx), I am looking for a library that can do it for me.
Can someone please suggest another library that can do this, or some easier workarounds for these problems?
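On the worry that reusing styles would slow the program down: a style cache is usually just one hash lookup per pixel. The following is a minimal sketch using only the JDK, in which plain integer ids stand in for the spreadsheet library's style objects; the class and method names (`PixelStyleCache`, `styleFor`, `assignStyles`) are hypothetical and not part of POI. It does not lift HSSF's cap on cell styles, so an image with more distinct colours than that cap would still need .xlsx or colour quantisation.

```java
import java.awt.image.BufferedImage;
import java.util.HashMap;
import java.util.Map;

// Sketch only: an int id stands in for whatever style object the spreadsheet library creates.
final class PixelStyleCache {
    private final Map<Integer, Integer> styleIdByRgb = new HashMap<>();

    /** Returns the style id for a colour, allocating a new one only the first time it is seen. */
    int styleFor(int rgb) {
        Integer id = styleIdByRgb.get(rgb);
        if (id == null) {
            id = styleIdByRgb.size();   // real code would create the library style here, once
            styleIdByRgb.put(rgb, id);
        }
        return id;
    }

    int distinctStyles() {
        return styleIdByRgb.size();
    }

    /** Maps every pixel to a style id; the result is indexed as [row][column]. */
    static int[][] assignStyles(BufferedImage image, PixelStyleCache cache) {
        int[][] ids = new int[image.getHeight()][image.getWidth()];
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                int rgb = image.getRGB(x, y) & 0xFFFFFF; // drop the alpha byte
                ids[y][x] = cache.styleFor(rgb);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        BufferedImage image = new BufferedImage(4, 2, BufferedImage.TYPE_INT_RGB);
        image.setRGB(1, 0, 0xFF0000); // one red pixel on a black background
        PixelStyleCache cache = new PixelStyleCache();
        assignStyles(image, cache);
        System.out.println("distinct styles: " + cache.distinctStyles());
    }
}
```

With a cache like this, the number of style objects grows with the number of distinct colours rather than with the number of cells.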

Can someone please suggest a good library to do this? Otherwise I will switch from Java to C#.
In short, nope - the POI libraries are, in my experience, the best ones available for the job. They're not perfect, but I don't know of an alternative that's better. You may want to try checking trunk out and seeing if any of your issues have been resolved there - entirely possible, it's a relatively active project.
The only other thing I'd suggest looking at is the OpenOffice API, but note that requires OO to be installed (or distributed with your app.)
In all honesty though, POI's strength is its cross-platform nature: it's a pure Java implementation with no native components. If you don't care about this and could therefore go with C# and use the native Office APIs, surely that would be the logical approach? It seems odd to me that you're not doing this already.

JExcelApi
http://jexcelapi.sourceforge.net/
It works in declarative mode, like Adobe LiveCycle and JReport: you create an XLS template file, and in every cell you put a reference to the beans.
When you invoke the engine, you end up with an XLS file.
Sorry for being so brief, but I worked with it many years ago and I don't remember the details; the documentation is on the website.

Related

Memory-efficient Java library to read Excel files? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 15 days ago.
Is there a memory-efficient Java library to read large Microsoft Excel files (both .xls and .xlsx)? I have very limited experience with Apache POI, and it seemed to be a huge memory hog from what I recall (though perhaps this was just for writing and not for reading). Is there something better? Or am I misremembering and/or misusing POI?
It would be important for it to have a "friendly" open-source license as well.
Apache's POI library has an event-based API that has a smaller memory-footprint. Unfortunately, it only works with HSSF (Horrible Spreadsheet Format) and not XSSF (XML Spreadsheet Format - for OOXML files).
The Excel file formats are (both) huge and extremely complicated, and anything that reads all of their possible contents is going to be equally huge and complicated. Remember they can contain ranges, macros, links, embedded stuff etc.
However if you are reading something simple like a grid of numbers, I recommend first converting the spreadsheet to something simpler like CSV and then reading that format.
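As a rough illustration of why the CSV route is light on memory, here is a minimal sketch that streams a CSV grid of numbers line by line and keeps only a running total per column. It assumes plain numeric cells with no quoting or embedded commas, and the name `CsvGridSum` is made up for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;

final class CsvGridSum {
    /** Streams the CSV one line at a time, so memory use does not grow with the row count. */
    static double[] columnSums(Reader source) {
        double[] sums = new double[0];
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                String[] cells = line.split(",", -1); // assumption: no quoted fields
                if (cells.length > sums.length) {
                    sums = Arrays.copyOf(sums, cells.length); // new columns start at 0.0
                }
                for (int i = 0; i < cells.length; i++) {
                    String cell = cells[i].trim();
                    if (!cell.isEmpty()) {
                        sums[i] += Double.parseDouble(cell);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sums;
    }

    public static void main(String[] args) {
        double[] sums = columnSums(new StringReader("1,2\n3,4\n"));
        System.out.println(Arrays.toString(sums));
    }
}
```

For a real file, the `StringReader` would be replaced by a reader over the exported CSV, for example one obtained from `java.nio.file.Files.newBufferedReader`.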
Take a look at JExcel:
http://jexcelapi.sourceforge.net/
I can't vouch for the memory footprint, but obviously with large spreadsheets you're going to consume lots of memory for processing.
You should be able to use it for xls and xlsx:
Read XLSX file in Java
I cannot answer your question directly, as I'm not using Java; however I can share a similar experience in Perl that may be partially relevant.
The OOXML format is indeed very large and complex, so any software that aims at covering the full specification is likely to be quite costly in terms of resources. In Perl, the best-known module for reading .xlsx files is https://metacpan.org/pod/Spreadsheet::ParseXLSX, which does the job well for small and medium files; however, it is far too slow on large amounts of data. So I ended up writing another module, https://metacpan.org/pod/Excel::ValueReader::XLSX, with far fewer features but optimized for fast parsing of large files.
The moral is: there is no one-size-fits-all solution. If you are willing to sacrifice some features for better speed or lower memory consumption, you might find other libraries. In Java, https://github.com/dhatim/fastexcel could perhaps be a good candidate (judging only from the documentation).

What's the most useful and complete Java cheat sheet? [closed]

I need a cheat sheet for Java and started looking around, but could not find one that seemed "canonical" - which surprised me considering how widespread the language is. Could experienced Java coders please suggest a cheat sheet that is useful (organized so well you actually use it often) and complete (covers real-world daily usage) please?
By contrast, here's what I'd consider a canonical cheat sheet for Python: http://rgruet.free.fr/PQR26/PQR2.6.html
It is complete (syntax, types, statements, built-ins, common modules, idioms) and useful (well-organized: sectioned and hyperlinked; easy to search, and easy to explore).
Also, I have looked at the listing here already: http://devcheatsheet.com/tag/java/ and did not find a cheat sheet comparable to RGruet's Python cheat sheet above. The top listing in Google for "Java cheat sheet" is http://www.cs.princeton.edu/introcs/11cheatsheet/ which is fairly complete, but not organized to be useful. There's gotta' be something better out there!? BTW, it need not fit on 1 page. I'm aware of the Java API docs, but that's more what I'd expect a cheat sheet to link to, not be.
Update
Some SO members thought this question was subjective, but I think I explained my criteria to be fairly objective: completeness (content) and usefulness (presentation) are not hard to judge in this context. I've accepted one of the more useful answers, but remain surprised that Java doesn't have a canonical cheat-sheet.
This one didn't seem too bad.
http://mindprod.com/jgloss/jcheat.html
found one interesting cheat sheet here..
http://introcs.cs.princeton.edu/java/11cheatsheet/
This Quick Reference looks pretty good if you're looking for a language reference. It's especially geared towards the user interface portion of the API.
For the complete API, however, I always use the Javadoc. I reference it constantly.
Here is a great one
http://download.oracle.com/javase/1.5.0/docs/api/
These languages are big. You can't expect a cheat sheet to fit on a piece of paper.
I have personally found the DZone cheat sheet on core Java to be really handy in the beginning. However, your needs change as you grow and get used to things.
There are a few listed (at the end of the post) in this Java learning resources article too.
For the most practical use, I have recently found the Java API docs to be the best place to crib code and learn a new API. This helps especially when you want to focus on the latest version of Java.
mkyong is one of my favourite places to crib a lot of code for a quick start: http://www.mkyong.com/
And last but not least, Stack Overflow is the king of small, handy code snippets. Just google whatever you are trying to do and there is a good chance that a Stack Overflow page will be at the top of the results; most of my Google searches end there. Many of the common questions are available here: https://stackoverflow.com/questions/tagged/java?sort=frequent
It's not really a cheat sheet, but I set up a 'java' search keyword in Google Chrome to search the Javadoc, using site:<javadoc_domain_here>.
You could do the same but also add the domain for the Sun Java Tutorial and for several Java FAQ sites and you'd be OK.
Otherwise, StackOverflow is a pretty good cheat-sheet :)

Which key-value store has the best performance? [closed]

About two months ago I found an open-source project from Google that can store key-value pairs with high performance, but I have forgotten its name. Could anybody tell me what it is, or suggest alternatives? I have been using Berkeley DB, but I found it is not fast enough for my program. However, Berkeley DB is convenient to use because it ships as a Java library JAR that integrates seamlessly with my program, which is also written in Java.
Two strong competitors in the DHT (Distributed Hash Table) 'market':
Cassandra (created by Facebook, in use by Digg and Twitter)
HBase
Here is a presentation about Cassandra. On slide 20 you'll see some speed benchmarks- 0.12 ms / write
(You can search around for the whole presentation, including Eric Evans talking)
Nobody mentions leveldb and yet this post is at the top when searching for "good key value store". Leveldb in my experience is simply awesome. It's so fast I couldn't believe it.
I've been trying quite a few databases for a task I was doing. I tried:
Windows Azure Table storage (expensive, max value size 1 MB and max property size 64 KB)
Redis (awesome if you have as much RAM as you please)
MongoDB (awesome as long as there is enough RAM, breaks after that point)
SQL Server (expensive, needs maintenance such as rebuilding indexes, and eventually still not fast enough)
SQLite (free, but not as simple to use as leveldb and not fast)
leveldb. If you can model your job as reading large consecutive chunks of data through an iterator, then you'll get great speed. Writing is also pretty fast. Combine it with an SSD and you'll love it.
Bigtable?
Redis
http://code.google.com/p/redis/
Maybe you should describe what features you need. If it doesn't need to be distributed (does it?) then I would try using the H2 Database. For those who think "it can't be fast because it's using SQL" please note that when using prepared statement, SQL parsing is only done once. Disclaimer: I'm the main author of H2.
Many answers seem to assume a need for distribution automatically, but that seems odd given that the question refers to BDB.
With that in mind, beyond Redis and H2 (which are both good), there is also Tokyo Cabinet to consider, which seems to offer benefits over BDB. And one more newer possibility is Krati.
I think you saw Guava or Google collections.

Real-time Java graph / chart library? [closed]

There was an earlier thread on Java graph or chart library, where JFreeChart was found to be quite good, but, as stated in its FAQ, it's not meant for real-time rendering.
Can anyone recommend a comparable library that supports real-time rendering? Just some basic xy-rendering - for instance, getting a voltage signal from data acquisition system and plotting it as it comes (time on x-axis, voltage on y-axis).
What the FAQ actually says is that JFreeChart doesn't support hard real-time charting, meaning that the chart isn't updated when new data arrives or at deterministic interval after it. However I have found that JFreeChart can be used for the kind of applications you are describing. You can achieve 1 update per second, which is fine. I don't think a human eye can follow something quicker than this.
If you want something more than this, I doubt you will find anything in Java (or even in another language). The operating systems that we use aren't designed to be real-time. You have no guarantee that they will respond within a bounded interval after an event. A tight integration with the hardware driver will be needed to show more than 1-10 frames per second.
However, if you design your application correctly, the OS will respond quickly and your application can easily display a "real-time" graph (meaning a graph that updates once a second). Just don't use your application to shut down a valve in an emergency situation!
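One common way to structure such a soft real-time display is to let the acquisition thread push samples into a bounded window and have the chart read a copy of that window on each repaint. The sketch below shows only the window; the class name `SampleWindow` is made up for this example and it is not tied to JFreeChart or any other charting library.

```java
import java.util.Arrays;

/** Fixed-size window holding the most recent samples; the oldest sample is overwritten first. */
final class SampleWindow {
    private final double[] samples;
    private int next;   // index where the next sample will be written
    private int count;  // number of valid samples, never more than samples.length

    SampleWindow(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        samples = new double[capacity];
    }

    /** Called from the acquisition thread whenever a new reading arrives. */
    synchronized void add(double value) {
        samples[next] = value;
        next = (next + 1) % samples.length;
        if (count < samples.length) {
            count++;
        }
    }

    /** Called from the UI thread on each repaint; returns the samples oldest first. */
    synchronized double[] snapshot() {
        double[] out = new double[count];
        int start = (next - count + samples.length) % samples.length;
        for (int i = 0; i < count; i++) {
            out[i] = samples[(start + i) % samples.length];
        }
        return out;
    }

    public static void main(String[] args) {
        SampleWindow window = new SampleWindow(3);
        for (double v : new double[] {1, 2, 3, 4}) {
            window.add(v);
        }
        System.out.println(Arrays.toString(window.snapshot()));
    }
}
```

A repaint timer firing about once a second would call `snapshot()` and hand the array to whichever chart component is in use, so the chart never blocks the data acquisition.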
http://www.live-graph.org/
Just stumbled upon a description on how to use the visualvm charting library. Looks very nice!
Have a look at Processing: it's an open-source, Java-based environment designed for all sorts of animated visualizations.
Well, if it has to be Java, then you might want to look into these.
Java Real-Time Systems (includes both real-time and non-real-time demos, and a JavaFX version of the charting application)
Real-time Java application development using multicore systems
Expedited Real-Time Task Graphs (This technology runs on Linux, but development can be done on any platform that supports Java 5.0 and Eclipse.)
JavaFX - A Pie Chart Demo
You probably have already found a good solution, but if not, I have recently done some work on a framework for producing 2D charts allowing for live updates at a rate of over 50 changes per second.
The original intention was to mimic the appearance of a chart recorder in a scrolling region of a web page, but I believe the approach has wider application.
A demo can be found at Chart Recorder Demo if anyone is interested.
The appearance is defined by a template (www.journeylog.co.uk/chart/templates/chartRecorder.xml). One feature is the ability to specify drawing either on the server or in the browser using ExplorerCanvas.
If anyone is interested I could start an open source project for it.
Fast enough for real time is swtchart, at least in my experience. Even with lots of data. Don't be scared away by the version number, yes it is a rather new API, but I use it successfully without problems.
As the name implies, it is based on SWT, which uses native OS drawing. Also it does some clever optimizations for drawing fast, like not drawing all points in the dataset (see Large Series Example Snippet).
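The idea of not drawing every point can be illustrated with a generic min/max decimation: split the series into as many buckets as there are horizontal pixels and keep only each bucket's extremes. This is a sketch of that general technique under my own naming (`MinMaxDecimator`); it is not taken from SWTChart's source and may differ from what that library actually does.

```java
import java.util.Arrays;

final class MinMaxDecimator {
    /**
     * Reduces a series to at most 2 * buckets points by keeping the minimum and
     * maximum of each bucket, in the order in which they occurred.
     */
    static double[] decimate(double[] values, int buckets) {
        if (buckets <= 0) {
            throw new IllegalArgumentException("buckets must be positive");
        }
        if (values.length <= 2 * buckets) {
            return values.clone(); // already small enough to draw as-is
        }
        double[] out = new double[2 * buckets];
        for (int b = 0; b < buckets; b++) {
            int from = (int) ((long) b * values.length / buckets);
            int to = (int) ((long) (b + 1) * values.length / buckets);
            int minIdx = from;
            int maxIdx = from;
            for (int i = from + 1; i < to; i++) {
                if (values[i] < values[minIdx]) minIdx = i;
                if (values[i] > values[maxIdx]) maxIdx = i;
            }
            // Preserve time order so spikes keep their rough shape on screen.
            out[2 * b] = values[Math.min(minIdx, maxIdx)];
            out[2 * b + 1] = values[Math.max(minIdx, maxIdx)];
        }
        return out;
    }

    public static void main(String[] args) {
        double[] series = {0, 5, 1, 9, -3, 2, 7, 4};
        System.out.println(Arrays.toString(decimate(series, 2)));
    }
}
```

Choosing `buckets` equal to the plot width in pixels keeps the visible envelope of the signal while bounding the drawing cost per frame.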
this seems like a good candidate.
http://jchart2d.sourceforge.net/
demo:
http://jchart2d.sourceforge.net/applet.shtml
JCCKit is a very good library that targets low memory use, especially in embedded environments: https://sourceforge.net/projects/jcckit
It takes less than 100 KB.
You could dig around the source for NetBeans. The profiler does real time graphing of various things such as memory usage.
SWT XYGraph can plot data with your own data provider, so you can create a real time data provider providing live data. With SWTChart and JFreeChart, you have to prepare the whole array for it.
This question has been answered well in:
Java Real time graph plotting
As VisualVM includes a Charting API, and this API is included in the JDK, you have a good/fast charting API available.

What software would you recommend for image enhancement prior to OCR (Optical Character Recognition)? [closed]

We are currently researching ways of enhancing image quality prior to submission to OCR. The OCR engine we are currently using is the Scansoft API from Nuance (v15). We were researching Lead Tools but have since decided to look elsewhere; the licensing costs associated with Lead Tools are just too great. To start with, we are looking for simple image enhancement features such as deskewing, despeckling, line removal, punch hole removal, sharpening, etc. We are running a mix of .NET and Java software, but a Java solution would be preferred.
Kofax is good for pre-processing, but for the types of cleanup you are talking about may be overkill unless the images are really bad. Unless your specialty is in image processing, I'd recommend working with a provider that does the image cleanup and the OCR so you can focus on the value you actually add.
We license the OCR development kit from ABBYY (the ABBYY SDK) and have found it to be superb for both image processing and OCR. The API is quite extensive, and the sample apps, help and support have been beyond impressive. I definitely recommend taking a look.
Disclaimer: I work for Atalasoft
We have those functions and run-time royalty-free licensing for .NET.
http://www.atalasoft.com/products/dotimage/
We also have OCR components including a .NET wrapper for Abbyy, Tesseract and others and Searchable PDF generation (image on top of text in a PDF)
Not sure if this would be quite up to the standards that you guys would need, but perhaps you should look at some of the Paint.Net APIs. I don't know how easy it would be to extract their image processing algorithms for use in your project, but I believe they do some of the things you are looking for. Plus it is an open source project with an MIT License, so it should be pretty friendly for business use.
Research about KOFAX VRS at KOFAX.com
Maybe JMagick, it is an open source Java interface of ImageMagick. It is implemented in the form of a thin Java Native Interface (JNI) layer into the ImageMagick API. It's licensed under the LGPL so it shouldn't be a problem license wise.
http://sourceforge.net/projects/jmagick/
I would suggest Intel for its zero-cost runtime licensing.
It depends on the number and quality of the original images. Managed code and imaging toolkits will work, but they are not always the best solution if you have several million images to process. For small batches and tight budgets, I agree with the previous posters that projects like AForge, Paint.NET, and other open-source computer vision libraries will do the trick. Of course, you are on your own if the results are not improving... At least this lets you put everything you need under one application for a low cost.
If you are processing several hundred thousand images a month, then I would suggest you divide the process into smaller workflow steps and tweak each one until your cost per image gets as close to zero as you can. You will find that the OCR results rise quickly at first and then level off sooner than you expected. (I'm not a big fan of OCR but it has its place.)
I use a commercial Windows product from Recogniform to process and clean up the images prior to OCR in batch mode, using scripts adjusted for various kinds of images. If an image fails QC or is rejected by the OCR engine, it is "repaired" by hand using a custom .NET application built with Atalasoft's toolkit. Batch process everything and only touch what fails.
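For the simplest items on the question's list, the JDK's own `java.awt.image` classes may be enough to experiment with before buying anything. Below is a minimal sketch of two of them: sharpening with a standard 3x3 convolution kernel and despeckling with a 3x3 median filter. The class name `OcrPreprocess` is made up, `despeckle` assumes a greyscale (`TYPE_BYTE_GRAY`) input, and deskewing, line removal and punch hole removal are not covered.

```java
import java.awt.image.BufferedImage;
import java.awt.image.ConvolveOp;
import java.awt.image.Kernel;
import java.awt.image.Raster;
import java.awt.image.WritableRaster;
import java.util.Arrays;

final class OcrPreprocess {
    /** Sharpens with a common 3x3 kernel whose weights sum to 1; edge pixels are copied as-is. */
    static BufferedImage sharpen(BufferedImage source) {
        float[] weights = {
             0f, -1f,  0f,
            -1f,  5f, -1f,
             0f, -1f,  0f
        };
        ConvolveOp op = new ConvolveOp(new Kernel(3, 3, weights), ConvolveOp.EDGE_NO_OP, null);
        return op.filter(source, null);
    }

    /** Removes isolated specks with a 3x3 median filter; assumes a TYPE_BYTE_GRAY input. */
    static BufferedImage despeckle(BufferedImage gray) {
        int width = gray.getWidth();
        int height = gray.getHeight();
        Raster in = gray.getRaster();
        BufferedImage result = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY);
        WritableRaster out = result.getRaster();
        int[] window = new int[9];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                boolean border = x == 0 || y == 0 || x == width - 1 || y == height - 1;
                if (border) {
                    out.setSample(x, y, 0, in.getSample(x, y, 0)); // no full neighbourhood here
                    continue;
                }
                int n = 0;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        window[n++] = in.getSample(x + dx, y + dy, 0);
                    }
                }
                Arrays.sort(window);
                out.setSample(x, y, 0, window[4]); // the median of the nine neighbours
            }
        }
        return result;
    }
}
```

A typical order would be to despeckle first and sharpen afterwards, since sharpening a noisy scan also amplifies the specks.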
