Java: How to automatically write a unique ID number to a .CSV file

I'm building a Java desktop application that writes "ID, Name, Address, Phone number" to a .CSV file, then reads the file and shows it in a JTable. The problem is that the ID needs to be a unique integer that is generated automatically: each new ID must be different from every previously written ID. I tried creating a method that increases the ID by 1 every time you click a button, but if you exit the program and run it again, the ID starts from 0 again because that's what I initialized it to.
Edit: I'm new to programming.

The best option is to use an out-of-the-box solution: the
UUID.randomUUID() method. It gives you a unique id.
Second option: write your last used ID to persistent storage (a file, a database, or something else). When your program starts, you read the last used ID and generate the next value from it, so you can keep a numeric sequence. If thread safety is an issue, you can store the value in an AtomicLong (though that won't help if you run your app twice as two separate processes).
Third option: use a timestamp, which you can get as a long. (Simple, and no tracking of previous values is needed.)
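A minimal sketch of the second option, assuming the last used ID is kept in a small text file next to the CSV (the file name is an invention for the example):

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicLong;

// Sketch: persist the last used ID in "last-id.txt" so it survives a restart.
public class PersistentIdGenerator {
    private final Path idFile;
    private final AtomicLong current;

    public PersistentIdGenerator(Path idFile) throws IOException {
        this.idFile = idFile;
        long start = 0L;
        if (Files.exists(idFile)) {
            // Read the previously stored value, if any.
            String text = new String(Files.readAllBytes(idFile), StandardCharsets.UTF_8).trim();
            start = Long.parseLong(text);
        }
        this.current = new AtomicLong(start);
    }

    // Returns the next ID and writes it back to the file immediately.
    public synchronized long nextId() throws IOException {
        long id = current.incrementAndGet();
        Files.write(idFile, Long.toString(id).getBytes(StandardCharsets.UTF_8));
        return id;
    }
}

Usage would be something like new PersistentIdGenerator(Paths.get("last-id.txt")).nextId() each time the save button is pressed.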

There are essentially two approaches to this:
Use a UUID:
UUIDs are big random numbers. There is a chance that you'll get the same one twice, but the probability is so low as to be negligible, because the number space is so unimaginably huge. Get one with java.util.UUID.randomUUID().
Use an atomic identifier source:
This is just something with a lock to prevent concurrent access, which emits unique numbers on request.
A very simple identifier generator uses synchronized to ensure atomicity:
public class UniqueIdGenerator {
    private long currentId;

    public UniqueIdGenerator(long startingId) {
        this.currentId = startingId;
    }

    public synchronized long getUniqueId() {
        return currentId++;
    }
}
(You can also use AtomicLong, or let a database engine take care of atomicity for you)
There are more advanced strategies for making this work in a distributed system -- for example, the generator could be accessible as a web service. But this is probably beyond the scope of your question.

You have to persist the last written ID, and there are many different ways to do that:
Writing the ID to a file
Writing the ID to user preferences (maybe a Windows registry entry?)
You also have to think about the uniqueness of the ID. What if you run the program as two different users on the same machine? What if you run your program on two different machines?

At the start of your application, and every time you manipulate (write) your .csv file, you could update your ID to start from max(IDs in your .csv) and then add 1 every time you create a new entry (a sketch of this follows).
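A rough sketch of that max(ID) + 1 approach, assuming the ID is the first comma-separated column and the file has no header row:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CsvIdScanner {
    // Scans the CSV once and returns max(ID) + 1.
    // Assumes the ID is the first column and there is no header row.
    public static long nextId(Path csvFile) throws IOException {
        long max = 0L;
        if (Files.exists(csvFile)) {
            for (String line : Files.readAllLines(csvFile)) {
                if (line.trim().isEmpty()) {
                    continue;
                }
                String first = line.split(",", 2)[0].trim();
                max = Math.max(max, Long.parseLong(first));
            }
        }
        return max + 1;
    }

    public static void main(String[] args) throws IOException {
        // "todos.csv" is just an example file name.
        System.out.println(nextId(Paths.get("todos.csv")));
    }
}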
You might also consider using a small embedded database (e.g. Apache Derby) instead of writing .csv files. That might be a "cleaner" solution, because you can use database operations to guarantee this behaviour.
Best regards!

If the ID is required as a long and the environment is not multi-threaded, then System.nanoTime() can be used.
Otherwise, for multi-threaded environments, there are several options:
java.security.SecureRandom
java.util.UUID.randomUUID() --> internally uses SecureRandom
File.createTempFile().getName() --> internally uses SecureRandom
If a numeric output is required, String.hashCode() can be applied to the result, but note that hashCode() returns an int, not a long, which increases the collision risk.
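If what you actually need is a numeric value, a short sketch using SecureRandom directly sidesteps the int-sized hashCode() step (still not a uniqueness guarantee, just a very low collision probability):

import java.security.SecureRandom;
import java.util.UUID;

public class RandomIds {
    private static final SecureRandom RANDOM = new SecureRandom();

    // 63-bit random value; not guaranteed unique, but collisions are unlikely.
    public static long randomLongId() {
        return RANDOM.nextLong() & Long.MAX_VALUE;
    }

    // The hashCode() route mentioned above collapses a UUID string to 32 bits,
    // which noticeably raises the collision risk.
    public static int uuidHashId() {
        return UUID.randomUUID().toString().hashCode();
    }
}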

Related

Multiple keys pointing to a single value in Redis (Cache) with Java

I want to store multiple keys with a single value using Jedis (Redis cache) with Java.
I have three keys like user_1, driver_10, admin_5 and value = "this is user", and I want to get the value by using any one of those three keys.
Having multiple keys point to the same value is not supported in Redis for now; see issue #2668.
You would need a workaround.
Some ideas below, possibly obvious or stupid :)
Maybe have an intermediate key:
- user_10 → id_123
- driver_5 → id_123
- id_123 → data_that_you_dont_want_to_duplicate
You could implement that logic in your client code, or in custom Lua scripts on the server, and have your client code use those scripts (but I don't know enough about that to provide details).
If you implement the indirection logic on the client side, and if accesses are unbalanced (for example, you access the data via the user key 99% of the time and via the driver key 1% of the time), it might be worth avoiding two client-server round trips for the 99% case. For this you can encode redirections in the value itself: for example, if the first character is * then the rest is the data, and if the first character is # then the rest is the actual key.
user_10 → *data_that_you_dont_want_to_duplicate
driver_5 → #user_10
Here is a Lua script that can save on traffic and pull the data in one call (for the intermediate-key layout above):
eval "return redis.call('get',redis.call('get',KEYS[1]))" 1 user-10
The above returns the requested data.
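For completeness, a small sketch of the intermediate-key approach with Jedis (host, port and key names are taken from the question or assumed):

import redis.clients.jedis.Jedis;

public class SharedValueExample {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // Store the data once under an internal id key...
            jedis.set("id_123", "this is user");
            // ...and point every alias key at that id.
            jedis.set("user_1", "id_123");
            jedis.set("driver_10", "id_123");
            jedis.set("admin_5", "id_123");

            // Two round trips: resolve the alias, then fetch the data.
            String value = jedis.get(jedis.get("driver_10"));
            System.out.println(value); // prints "this is user"
        }
    }
}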

Hardcoding values vs. reading from file

This is a general question about the efficiency of hardcoding data - I'm writing a program in Java that does some chemical analysis, and I need to use the isotopic abundances of different elements. The way I have it set up right now is that all values (which never need to be modified) are stored as final fields in my class, i.e.
static final double C12Abundance = .989;
static final double C12Mass = 12;
A lot of similar programs store this type of data in an XML file, then read the values from there, like this:
<compounds>
<elements>
<element symbol='C' mono_isotopic_mass ='12.00000000000' abundance='.989'/>
Is there any reason (performance, memory, etc) to read from it this way? Seems easier to just leave it as a field.
Hard-coding is faster in terms of performance and memory allocation.
What you gain from reading from a file is reusability: running your program with different parameters without the need to recompile it.
Note that reading from a file involves the following steps:
Declare variable to use for storing a value.
Create an input (stream) object
Initialize it with a path
Open The file from FS
Find the correct line to read from
Read the value
Store it in the variable above
Close the input (stream)
That's a fair amount of overhead compared to a pre-compiled final variable that already holds its value.
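For comparison, this is roughly what the file-based variant looks like with a plain properties file; the file name and keys are invented for the example:

import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;

public class IsotopeConfig {
    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        // elements.properties is a hypothetical file containing, e.g.:
        //   c12.abundance=0.989
        //   c12.mass=12.0
        try (InputStream in = Files.newInputStream(Paths.get("elements.properties"))) {
            props.load(in);
        }
        double c12Abundance = Double.parseDouble(props.getProperty("c12.abundance"));
        double c12Mass = Double.parseDouble(props.getProperty("c12.mass"));
        System.out.println(c12Abundance + " / " + c12Mass);
    }
}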
As these are truly universal constants, limited in number, you can put them in code, but nicely organized:
public enum Element {
    // Name  Mass   Abundance
    C12("C", 12.0, .989),
    He4(...),
    O32(...),
    ...;

    public final String name;
    public final double monoIsotopicMass;
    public final double abundancy;

    private Element(String name, double monoIsotopicMass, double abundancy) {
        this.name = name;
        this.monoIsotopicMass = monoIsotopicMass;
        this.abundancy = abundancy;
    }
}

for (Element elem : Element.values()) {
    if (elem.abundancy > 0.5) {
        ...
    }
}
If you hard-code the values and want to change them, you have to recompile your program; that's the problem. Reading your data from a file has the following benefits:
You don't have to wait for the program to recompile for every change in the data. For a fairly large program, this can take time.
Your users can change the data without even having access to the source.
You can have different data sets and switch between them just by changing the config file name.
Maybe none of this matters to you; then just go ahead and put your data in the source.
Performance itself (as in the performance of the program) is never a problem unless your profiler says so. And I don't see how reading a small set of data from a file at startup could be a long process, so I'm fairly sure you wouldn't see a difference.
If you want to simulate a universe with a different abundance of C12, having the values hard-coded would mean you have to recompile the program.
There may be other reasons as well: if the values are read from an external file the file serves as documentation, an external file may be easier to check for errors, there may be tools that generate the file or use it for other purposes besides running your program, ...
A configuration file holds properties, and generally speaking those properties change over time. I believe that in your case they are fixed and will never change, by definition.
For this reason I would do the easiest thing possible, which is leaving them as fields.
This is not a performance matter; as long as performance doesn't show up as an issue, it is just a matter of what is more easily usable in your codebase.
I would advise you to extract these values into a class as constants, so you can always import it to access the values.
The Java code is only readable by a Java compiler, whereas XML is readable by any reasonable (meaning XML-aware) language. Also, if you want to add a value, you don't have to recompile everything.
Personally I'd go for hard-coding if the values are never going to change and the app is small. Otherwise I would choose an external source of configuration data.
But every time people tell me that values are never going to change, it pretty much means that they will, so preparing a dynamic environment is the way to go in general: XML config files, database config tables, etc.
If you write them in XML you can use different values for different devices.
E.g. assume you have a dimension named item_margin that needs to differ based on device width. In values/dimens.xml you have this:
<dimen name="item_margin">0dp</dimen>
and on devices that are at least 600dp wide you want this margin to be 60dp, so in values-sw600dp/dimens.xml you have this:
<dimen name="item_margin">60dp</dimen>
This way the right value is automatically selected based on device width, and you don't have to check the width and pick the appropriate value in your Java code.

Generate unique ID

I need to generate unique IDs for my application. When I used UUID.randomUUID().toString(), I got a code that (I assume) is unique, but it is very lengthy.
I'm not sure how unique the codes will be if we generate them from a Java timestamp or a random string.
I need to generate unique codes that are only 8-10 characters long (alphanumeric). How can I do this?
I am using a MySQL database.
Is generating the unique code on the database side the best way, or can we generate such short (but unique) codes in Java?
Any suggestions with example code will be very helpful.
I use RandomStringUtils.randomAlphanumeric() method from commons-lang to achieve this:
import org.apache.commons.lang.RandomStringUtils;

public static final int ID_LENGTH = 10;

public String generateUniqueId() {
    return RandomStringUtils.randomAlphanumeric(ID_LENGTH);
}
If you are using Maven, ensure that you have added commons-lang to your project's dependencies:
<dependency>
<groupId>commons-lang</groupId>
<artifactId>commons-lang</artifactId>
<version>2.6</version>
</dependency>
Is generating a unique code on the database side the best way, or can we generate such short (but unique) codes in Java?
It's up to you and your project. Is ID generation part of the business logic? If yes, and all the logic is written in Java, then write it in Java. If all or part of the logic is delegated to the database, then generate the ID there (but in that case you will have a strong dependency on that particular database).
Do you have any specific limitation you need to take into account? Such as cross-application uniqueness? Because otherwise, MySQL is quite capable of generating IDs by itself, all you need to do is define an autoincrement column and not specify it at insert time (meaning, inserting a NULL value for it) - that will make MySQL fill it with the next available ID, unique and requiring no work from you.
It won't be an alphanumeric string (I'm not sure whether you stated that as a requirement or a restriction), but if all you require is uniqueness, it's more than enough. 8-10 alphanumeric characters aren't enough to guarantee uniqueness in a randomly generated string, so you'd have to perform an insert check against the database.
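If you do let MySQL assign the ID, a minimal JDBC sketch (connection details, table and column names are assumptions) looks like this:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class DbGeneratedId {
    public static void main(String[] args) throws SQLException {
        // Assumes a table: items(id BIGINT AUTO_INCREMENT PRIMARY KEY, name VARCHAR(100)).
        try (Connection con = DriverManager.getConnection(
                     "jdbc:mysql://localhost:3306/mydb", "user", "password");
             PreparedStatement ps = con.prepareStatement(
                     "INSERT INTO items(name) VALUES (?)",
                     Statement.RETURN_GENERATED_KEYS)) {
            ps.setString(1, "example");
            ps.executeUpdate();
            // Retrieve the ID MySQL just assigned.
            try (ResultSet keys = ps.getGeneratedKeys()) {
                if (keys.next()) {
                    System.out.println("Generated id: " + keys.getLong(1));
                }
            }
        }
    }
}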
Is generating a unique code on the database side the best way, or can we generate such short (but unique) codes in Java?
Databases are designed to be able to generate unique IDs where needed. I doubt anything you (or I) could code would be a 'better' variant of that.
I have written a simple service which can generate semi-unique non-sequential 64 bit long numbers. It can be deployed on multiple machines for redundancy and scalability. It uses ZeroMQ for messaging. For more information on how it works look at github page: zUID
Take a look at: UIDGenerator.java
You can customize it (unique per process only, or globally unique); it is easy to use and fast:
private static final UIDGenerator SCA_GEN = new UIDGenerator(new ScalableSequence(0, 100));
.......
SCA_GEN.next();
You can change the implementation to reduce the size of the ID (and add other tradeoffs)
see my benchmarking results at:
http://zoltran.com/roller/zoltran/entry/generating_a_unique_id
or run them yourself.
On the question of whether ID generation should be done on the database side or the Java side: that has to be answered by you, depending on the requirements of your application.
1) One way is to go with System.currentTimeMillis(). But if your application runs in a multi-clustered environment, you may end up with duplicate values.
http://www2.sys-con.com/itsg/virtualcd/java/archives/0512/Westra/index.html
2) Another way is to use a UUID generator. It will help if you have different databases that need to be merged; using this method you don't have to worry about duplicate IDs when merging databases.
https://marketplace.informatica.com/solutions/mapping_uuid_using_java
There may be other factors you want to consider, but as your question stands, the UUID approach should work.

Read/write to a large file in Java

I have a binary file with the following format:
[N bytes identifier & record length] [n1 bytes data]
[N bytes identifier & record length] [n2 bytes data]
[N bytes identifier & record length] [n3 bytes data]
As you can see, I have records of different lengths. Each record starts with N fixed bytes that contain an id and the length of the data in the record.
This file is very big and can contain 3 million records.
I want to open this file in an application and let the user browse and edit the records
(insert / update / delete records).
My initial plan is to create an index file from the original file and, for each record, keep the next and previous record addresses to navigate forward and backward easily (some sort of linked list, but in a file rather than in memory).
Is there a library (Java library) to help me implement this requirement?
Any recommendation or experience that you think is useful?
----------------- EDIT ----------------------------------------------
Thanks for the guides and suggestions;
some more info:
The original file and its format are out of my control (it's a third-party file) and I can't change the file format. But I have to read it, let the user navigate over the records and edit some of them (insert a new record / update an existing record / delete a record), and at the end save it back in the original file format.
Do you still recommend a database instead of a normal index file?
----------------- SECOND EDIT ----------------------------------------------
The record size in update mode is fixed: an updated (edited) record has the same length as the original record, unless the user deletes the record and creates another record with a different format.
Many thanks.
Seriously, you should NOT be using a binary file for this. You should use a database.
The problems with trying to implement this as a regular file stem from the fact that operating systems do not allow you to insert extra bytes into the middle of an existing file. So if you need to insert a record (anywhere but the end), update a record (with a different size) or remove a record, you would need to:
rewrite other records (after the insertion/update/deletion point) to make or reclaim space, or
implement some kind of free space management within the file.
All of this is complicated and / or expensive.
Fortunately, there is a class of software that implements this kind of thing. It is called database software. There are a wide range of options, ranging from using a full-scale RDBMS to light-weight solutions like BerkeleyDB files.
In response to your 1st and 2nd edits, a database will still be simpler.
However, here's an alternative that might perform better for this use-case than using a DB... without doing complicated free-space management.
1. Read the file and build an in-memory index that maps ids to file locations.
2. Create a second file to hold new and updated records.
3. Perform the record adds/updates/deletes:
   - An addition is handled by writing the new record to the end of the second file, and adding an index entry for it.
   - An update is handled by writing the updated record to the end of the second file, and changing the existing index entry to point to it.
   - A delete is handled by deleting the index entry for the record's key.
4. Compact the file as follows:
   4.1. Create a new file.
   4.2. Read each record in the old file in order, and check the index for the record's key. If the entry still points to the location of the record, copy the record to the new file. Otherwise skip it.
   4.3. Repeat step 4.2 for the second file.
5. If we completed all of the above successfully, delete the old file and the second file.
Note this relies on being able to keep the index in memory. If that is not feasible, then the implementation is going to be more complicated ... and more like a database.
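A sketch of step 1, building the in-memory index; the header layout here (an 8-byte id followed by a 4-byte data length) is only an assumption, since the real layout is dictated by the third-party format:

import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.HashMap;
import java.util.Map;

public class RecordIndexer {
    // Maps record id -> offset of the record header in the file.
    public static Map<Long, Long> buildIndex(String path) throws IOException {
        Map<Long, Long> index = new HashMap<>();
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long offset = 0;
            long fileLength = raf.length();
            while (offset < fileLength) {
                raf.seek(offset);
                long id = raf.readLong();     // assumed 8-byte id
                int dataLen = raf.readInt();  // assumed 4-byte data length
                index.put(id, offset);
                offset += 8 + 4 + dataLen;    // skip the data, jump to the next header
            }
        }
        return index;
    }
}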
Having a data file and an index file would be the general base idea for such an implementation, but you'd pretty much find yourself dealing with data fragmentation upon repeated data updates/deletion, etc. This kind of project, in itself, should be a separate project and should not be part of your main application. However, essentially, a database is what you need as it is specifically designed for such operations and use cases and will also allow you to search, sort, and extend (alter) your data structure without having to refactor an in-house (custom) solution.
May I suggest you download Apache Derby and create a local embedded database (Derby does it for you when you create a new embedded connection at run-time). It will not only be faster than anything you'll write yourself, but will make your application easier to maintain.
Apache Derby is a single jar file that you can simply include and distribute with your project (check the license if any legal issue may apply in your app). There is no need for a database server or third party software; it's all pure Java.
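For example, opening an embedded Derby database needs no server at all once derby.jar is on the classpath; the database and table names below are made up for the sketch:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class DerbyExample {
    public static void main(String[] args) throws SQLException {
        // ";create=true" creates the database directory on first use.
        try (Connection con = DriverManager.getConnection("jdbc:derby:recordsDB;create=true");
             Statement st = con.createStatement()) {
            // Run once; fails harmlessly on a second run because the table already exists.
            st.executeUpdate("CREATE TABLE records (id BIGINT PRIMARY KEY, data BLOB)");
        }
    }
}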
Bottom line as that it all depends on how large is your application, if you need to share the data across many clients, if speed is a critical aspect of your app, etc.
For a stand-alone, single user project, I recommend Apache Derby. For a n-tier application, you might want to look into MySQL, PostgreSQL or (hrm) even Oracle. Using already made and tested solutions is not only smart, but will cut down your development time (and maintenance efforts).
Cheers.
Generally you are better off letting a library or database do the work for you.
You may not want to have an SQL database and there are plenty of simple databases which don't use SQL. http://nosql-database.org/ lists 122 of them.
At a minimum, if you are going to write this I suggest you read the source for one of these databases to see how they work.
Depending on the size of the records, 3 million isn't that much and I would suggest you keep as much in memory as possible.
The problem you are likely to have is ensuring the data is consistent and recovering it when corruption occurs. The second problem is dealing with fragmentation efficiently (something the brightest minds working on garbage collectors deal with). The third problem is likely to be maintaining the index in a transactional fashion with respect to the source data, to ensure there are no inconsistencies.
While this may appear simple at first, there are significant complexities in making sure the data is reliable, maintainable and can be accessed efficiently. This is why most developers use an existing database/datastore library and concentrate on the features which are unique to their application.
(Note: My answer is about the problem in general, not considering any Java libraries or - like the other answers also proposed - using a database (library), which might be better than reinventing the wheel)
The idea to create an index is good and will be very helpful performance-wise (although you wrote "index file", I think it should be kept in memory). Generating the index should be quite fast if you read the ID and record length for each entry and then just skip the data with a file seek.
You should also think about the edit functionality. In particular, inserting and deleting can be very slow on such a big file if you do it wrong (e.g. deleting and then moving all the following entries to close the gap).
The best option would be to only mark deleted entries as deleted. When inserting, you can overwrite one of those or append to the end of the file.
Insert / Update / Delete records
Inserting (rather than merely appending) records into a file and deleting records from it is expensive, because you have to move all the following content of the file to create space for the new record or to reclaim the space it used. Updating is similarly expensive if the update changes the length of the record (you say they are variable length).
The file format you propose is fundamentally unsuitable for the kinds of operations you want to perform. Others have suggested using a data-base. If you don't want to go that far, adding an index file (as you suggest) is the way to go. I recommend making the index records all the same length.
As others have stated, a database seems the better solution. The following are Java SQL databases that could be used: H2, Derby or HSQLDB.
If you want to use an index file, look at Berkeley DB or a NoSQL store.
If there is some reason for sticking with a file, look at JRecord. It has:
Several classes for reading/writing files with variable-length binary records (they were written for Cobol VB files). Any of the Mainframe / Fujitsu / Open Cobol VB file structures should do the job.
An editor for editing JRecord files. The latest version of the editor can handle large files (it uses compression / a spill file). The editor suffers from having to download the whole file, and only one user can edit the file at a time.
The JRecord solution will only work if:
There is a limited number (preferably one) of users, all located in the one location
There is a fast infrastructure

Java application design question

I have a hobby project, which is basically to maintain 'todo' tasks in the way I like.
One task can be described as:
public class TodoItem {
    private String subject;
    private Date dueBy;
    private Date startBy;
    private Priority priority;
    private String category;
    private Status status;
    private String notes;
}
As you can imagine I would have 1000s of todo items at a given time.
What is the best strategy to store a todo item (currently an XML file), such that all the items are loaded quickly on application start-up? (The application shows a kind of dashboard of all the items at start-up.)
What is the best way to design its back-end so that it can be ported to Android or a J2ME-based phone? Currently this is done using Java Swing. What should I concentrate on so that it works efficiently on a device where memory is limited?
The application throws open a form to enter a new todo task. For now, I would like to save the newly added task to my-todos.xml once the user presses the "save" button. What are the common ways to append such a change to an existing XML file? (Note that I don't want to read the whole file again and then persist it.)
For storing: SQLite seems like a good solution for things such as searching and cross platform support. Android and many other devices support SQLite.
As with any programming question there are a lot of ways to do things. However, by specifying that you intend to target phones, your list of considerations changes. First you need to look at your intended phones to see what they support, especially in terms of data storage.
XML or some other flat file format will work fine if you don't have too much data and don't need searching or other functions that access the data in random ways.
But if you want to store larger amounts of data or do random access, you need to look into data storage techniques that are more database-like. This is where your intended target platforms are likely to impose limits in terms of performance or storage.
The other alternative is to design the application so that its storage is decoupled from the core program. This means you can plug in different types of data storage, depending on whether it's a PC or a phone, without having to recode everything else; a sketch of that idea follows.
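A minimal sketch of that decoupling, reusing the TodoItem class from the question (the interface and method names are only illustrative):

import java.util.List;

// Illustrative abstraction: the UI only talks to TodoStore, so an XML-file,
// SQLite or in-memory implementation can be swapped in per platform.
public interface TodoStore {
    List<TodoItem> loadAll();
    void save(TodoItem item);
    void delete(TodoItem item);
}

The Swing build could ship an XML- or Derby-backed implementation, while the Android build plugs in a SQLite-backed one, without touching the dashboard code.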
One option that comes to mind is an in-memory DB, which exists in various flavors. I've yet to use one of these, so I can't tell you about memory usage or platform constraints. Still, it's worth looking at.
Another option that comes to mind is to maintain a large collection of TodoItem objects, and write your own code to read from and persist this collection to the XML file. Essentially, build a class that contains the large Map (or whatever you decide to use) and have this class implement Externalizable.
Both of these options will allow you to read the XML file to its in-memory representation, search and alter the state, and eventually write the final state back to XML when the app goes down (or at fixed intervals, whatever you decide).
You might be able to use java.util.prefs.Preferences.
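For small amounts of data that could be as simple as (the keys are illustrative):

import java.util.prefs.Preferences;

public class PrefsExample {
    public static void main(String[] args) {
        // Stored per user in a platform-specific backing store (e.g. the registry on Windows).
        Preferences prefs = Preferences.userNodeForPackage(PrefsExample.class);
        prefs.put("todo.1.subject", "Buy milk");
        System.out.println(prefs.get("todo.1.subject", "<none>"));
    }
}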
