Java String hashcode as Mysql ID

The scenario is as follows.
I have a typical MySQL table for storing users; currently the user ID is an auto-increment integer. Many of the REST API endpoints look up the user by alias (which is unique), so I'm thinking of using alias.hashCode() (an integer) as the user ID, so that every lookup can go directly by ID.
Is it a good idea to use a Java String hash code as the MySQL ID? Would it improve performance?

I don't think it's a great idea. The pigeonhole principle states (from Wikipedia) that if n items are put into m containers, with n > m, then at least one container must contain more than one item. A String hash code is a 32-bit int, and there are far more possible aliases than int values, so collisions are very possible and your solution has no way to handle them.

Don't use String hashCode as your ID, since it's not unique. Two different Strings may have the same hashCode. I'm assuming your ID should be unique.
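To make the collision concrete, here is a minimal Java sketch. The class name HashCollisionDemo and its helper method are made up for illustration; the two strings used in the usage note are a well-known colliding pair under String.hashCode().

```java
// Illustrative only: HashCollisionDemo is a hypothetical helper, not a library class.
class HashCollisionDemo {
    /** Returns true when two different strings share the same hashCode. */
    static boolean collide(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }
}
```

HashCollisionDemo.collide("Aa", "BB") returns true, because both strings hash to 2112. Two users with those aliases would end up with the same ID.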

Just add an index on the alias column, and query the db by alias directly. There are two problems with using the alias hash code or other derivatives as an id. First, as others pointed out, hash codes are not unique. (This can be almost solved by changing the id type to string and using a digest instead of the hash. Collisions with digests, while still possible, are extremely unlikely.) Second, if the user changes his alias, the value will get out of sync with the id. If the functionality of your application is such that this situation is either impossible or unimportant, then you don't really need an id at all, and can identify users by alias directly.
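For reference, the digest variant mentioned in the parenthesis could look roughly like the sketch below. The class AliasDigest and the choice of SHA-256 with hex encoding are assumptions made for illustration; the answer does not name a specific digest, and the second problem (alias changes) still applies.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: derives a string id from an alias using SHA-256.
class AliasDigest {
    /** Returns the SHA-256 digest of the alias as 64 lowercase hex characters. */
    static String sha256Hex(String alias) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] hash = md.digest(alias.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                // Emit the high and low nibble of each byte as a hex digit.
                hex.append(Character.forDigit((b >> 4) & 0xF, 16));
                hex.append(Character.forDigit(b & 0xF, 16));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a standard algorithm name, so this is not expected.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

The resulting id is much longer than an int, which is part of why a plain index on the alias column is usually the simpler choice.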

Related

Dynamic Named SQL Fields

So I've got a bot that serves as a roleplaying manager, handling combat, skill points and the like. I'm trying to make my code a bit more general so I can have fewer pages, since they all do the same thing and just have different initializers. But I ran into a snag: I need to check whether the user has a minimum in a particular stat (Strength, Perception, Agility, etc.),
so I call
mainSPECIAL = rows[0].Strength;
Here's the rub: whether it's Strength, Perception, Intelligence, Luck, or whatever, I'm always going to be checking rows[0].<that attribute>, e.g. rows[0].Luck for luck perks, and I already set this earlier in my initializers:
var PERKSPECIALName = "Strength";
But I can't call
mainSPECIAL = rows[0].PERKSPECIALName
There should be a way to do that, right? So that when it sees "rows[0].PERKSPECIALName" it looks up "PERKSPECIALName" and then fetches the value of rows[0].Strength.

For this you need to use reflection:
Field f1 = rows[0].getClass().getField(PERKSPECIALName);
Integer attribute = (Integer) f1.get(rows[0]);
Where "Integer" is the type of the element you're pulling from the object (the type of Strength).
getField only finds public fields, so the field must be declared public! Non-public fields can be read too, using getDeclaredField followed by setAccessible(true), but that requires more code.
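A fuller sketch of the reflection approach, covering both the public and the non-public case. StatRow and FieldLookup are hypothetical classes invented for this example, as are the stat values; the checked reflection exceptions are wrapped so callers do not have to declare them.

```java
import java.lang.reflect.Field;

// Hypothetical row type standing in for whatever rows[0] is in the question.
class StatRow {
    public int Strength = 7;
    private int Luck = 3;
}

class FieldLookup {
    /** Reads a public int field by name. */
    static int readPublic(Object row, String fieldName) {
        try {
            Field f = row.getClass().getField(fieldName);
            return (Integer) f.get(row);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("No readable field: " + fieldName, e);
        }
    }

    /** Reads an int field by name even if it is not public. */
    static int readAny(Object row, String fieldName) {
        try {
            Field f = row.getClass().getDeclaredField(fieldName);
            f.setAccessible(true); // bypass the access check for non-public fields
            return (Integer) f.get(row);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("No readable field: " + fieldName, e);
        }
    }
}
```

With these definitions, FieldLookup.readPublic(new StatRow(), "Strength") returns 7 and FieldLookup.readAny(new StatRow(), "Luck") returns 3. Note that getDeclaredField only looks at the class itself, not at its superclasses.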

Seems like you have a set of integers that you need to identify with a constant identifier. You might find an EnumMap useful. Have a look at How to use enumMap in java.
Or if you want to only use a string to identify which perk you want to reference, just use a Map.
Java doesn't have reference-to-member like some other languages, so if you don't want to change your data structure, you are looking at using lambda functions or heavier language features to increase re-use, which seems like overkill for what you're trying to do.
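The EnumMap suggestion could be sketched as follows. The names Special and CharacterSheet are invented for this example, and the enum lists only the stats mentioned in the question; treating an unset stat as 0 is also an assumption.

```java
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical enum of stat names taken from the question.
enum Special { STRENGTH, PERCEPTION, AGILITY, INTELLIGENCE, LUCK }

class CharacterSheet {
    private final Map<Special, Integer> stats = new EnumMap<>(Special.class);

    void set(Special stat, int value) {
        stats.put(stat, value);
    }

    /** Returns the stored value, or 0 if the stat was never set. */
    int get(Special stat) {
        return stats.getOrDefault(stat, 0);
    }

    boolean meetsMinimum(Special stat, int minimum) {
        return get(stat) >= minimum;
    }

    /** Converts a name such as "Strength" to the matching enum constant. */
    static Special parse(String name) {
        return Special.valueOf(name.toUpperCase(Locale.ROOT));
    }
}
```

A perk check then becomes sheet.meetsMinimum(CharacterSheet.parse(PERKSPECIALName), 5), with the stat chosen by a variable rather than hard-coded as a field access.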

Should I compare all fields in my class's "equals" method?

I'm working on an application that allows the user to manage accounts. So, suppose I have an Account class, representing one of the user's accounts:
class Account
{
    public int id;
    public String accountName;
    public String accountIdentifier;
    public String server;
    public String notes;
}
My equals method looks like this:
public boolean equals(Object o)
{
    if (this == o)
        return true;
    if (o == null || !(o instanceof Account))
        return false;
    Account other = (Account) o;
    if (!accountIdentifier.equals(other.accountIdentifier))
        return false;
    if (!server.equals(other.server))
        return false;
    return true;
}
As you can see, I'm only comparing the accountIdentifier and the server, but not the other fields. There are several reasons why I chose this approach.
I keep the accounts in a List. When the user updates an account, by changing the account name (which is just a name specified by the user to identify the account) or the notes, I can do accountList.set(accountList.indexOf(account), account); to update the account in the list. If equals compared all properties, this approach wouldn't work, and I'd have to work around it (for example by iterating over the list and checking for these properties manually).
This might actually be more important, but it only came to my mind after thinking about it for a while. An Account is uniquely identified by the accountIdentifier and the server it belongs to. The user might decide to rename the account, or change the notes, but it's still the same account. But if the server is changed, I think I would consider it a different account. The id is just an internal ID since the accounts are stored in a database. Even if that changed, the account is still considered the same account if the accountIdentifier and the server stayed the same.
What I'm trying to say is that I basically implemented equals this way to allow for shorter, more concise code in the rest of the application. But I'm not sure if I'm breaking some rules here, or if I'm doing something that might cause other developers headaches if it ever happens that someone is working with my application's API.
Is it okay to only compare some fields in the equals method, or should I compare all fields?

Yes, it's definitely okay to do this. You get to decide what equality means for your class, and you should use it in a way that makes the most sense for your application's logic — in particular, for collections and other such classes that make use of equality. It sounds like you have thought about that and decided that the (server, identifier) pair is what uniquely distinguishes instances.
This would mean, for instance, that two instances with the same (server, identifier) pair but a different accountName are different versions of the same Account, and that the difference might need to be resolved somehow; that's a perfectly reasonable semantic.
It may make sense to define a separate boolean allFieldsEqual(Account other) method to cover the "extended" definition, depending on whether you need it (or would find it useful for testing).
And, of course, you should override hashCode to make it consistent with whatever definition of equals you go with.
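A minimal sketch of what a consistent equals/hashCode pair might look like for the Account class from the question. The constructor is an assumption added so the example is self-contained, and Objects.equals is used so that null fields do not throw.

```java
import java.util.Objects;

// Sketch of the question's Account class; the constructor is hypothetical.
class Account {
    int id;
    String accountName;
    String accountIdentifier;
    String server;
    String notes;

    Account(String accountIdentifier, String server, String accountName) {
        this.accountIdentifier = accountIdentifier;
        this.server = server;
        this.accountName = accountName;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Account)) return false; // also false for null
        Account other = (Account) o;
        return Objects.equals(accountIdentifier, other.accountIdentifier)
                && Objects.equals(server, other.server);
    }

    @Override
    public int hashCode() {
        // Uses exactly the fields that equals compares.
        return Objects.hash(accountIdentifier, server);
    }
}
```

Because hashCode uses the same two fields as equals, two Account objects that differ only in accountName or notes are treated as the same element by HashSet and as the same key by HashMap.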

You should compare all of the fields that are necessary to determine equality. If the accountIdentifier and server fields are enough to determine if two objects are equal, then that is perfectly fine. No need to include any of the other fields that don't matter in terms of equality.

For the key, you should normally use the business key. It can be a simple or a composite key, and it does not necessarily need to include all the fields of the entity. So which fields identify an entity depends on each case. If possible, it should be the minimum set of fields that fully and uniquely identifies the entity.
Some people prefer (and it is good practice) to create a surrogate key to identify the object. This is very useful when you persist your objects with an ORM, because you don't need to export the business keys to the child entities in 1:M or M:N relations. For example, the id in your sample can be considered a surrogate key if you create it as an internal unique identifier.
You may also want to take the following into consideration:
Whenever you override equals you must override hashCode too; this is important for the class to work properly with collections such as HashMap and HashSet.
Apache Commons Lang provides a really nice API to help implement equals and hashCode: the EqualsBuilder and HashCodeBuilder classes. Both let you append the fields you want to use in the comparison, and both also offer a reflection-based variant.

The answer is "it depends on the semantics of your data".
For example, you might internally store a field that can be derived (calculated) from the other fields. In which case, you don't need to compare the calculated value.
As a gross generalisation, anything that cannot be derived from other fields should be included.

This is fine - and probably a good thing to do. If you've identified that the accountIdentifier and the server together uniquely determine an account, then that's perfectly valid for your use case.
You don't want to compare more fields than you need to, since that would make two objects that represent the same account compare as unequal. This approach is perfectly suitable for your needs.

Best way to represent set of string values in DB and UI

I'm looking for opinions, so I guess this is a 'which is better' question. I have a webapp built in JavaScript/jQuery and Struts that uses Hibernate to access data in a relational DB (MySQL). When an object/database field has a limited set of strings for values, is it better to use the full string in the object/DB or a 'code' for that string, like a single CHAR instead of the entire string?
class User {
    int id;
    String userName;
    String type; // Values of 'Administrator', 'Regular'
    // OR
    char type; // Values of 'A', 'R'
    // OR
    char type; // Values of 'A', 'R'
    String typeString; // Can be returned on the fly based on 'type' or by DB in SQL CASE statement
}
If the database has the full text string, then it's easy coding all the way around, but it wastes space (in the DB and in data transfer) on something that only has a few values.
If the database has just a 'code', then when presenting this field to a user (like in a grid of existing users, or a dropdown selection list when creating a new user) the char value must be converted to the full string. Then the question is where that conversion should be done. It could be at the DB level, where Hibernate can fill in the full string value from a CASE statement. This saves DB space, but not data transfer or memory. It could be at the object level, where it's done in the getter/setter for the 'type' field. Or it could be all the way in the GUI, where JavaScript converts the 'char' to the appropriate string for the user to see.
Also... if either method is OK to use, what might influence the choice you make? The number of different values? The max length of the strings? How many rows are expected in the table?
I'm sure every DB/programmer has come across this situation many times and probably has a preference.

If you only have a fixed set of user types like Admin and Regular, I think it will be easier to keep a static HashMap in your code and just store A and R in the database. Something like:
static HashMap<Character, String> userRoles = new HashMap<>();
static {
    // Keys are char literals so they match the Character key type.
    userRoles.put('A', "Admin");
    userRoles.put('R', "Regular");
}
Whenever you get a result from the DB, you can just call userRoles.get(type) to look up the actual type. This saves space and is also readable.

I would put the full name in the database alongside an associated short code or ID in some kind of lookup table. Use the shortcode/ID as the primary key for the lookup table, and as a foreign key from other tables. If someone needs to investigate the database layer, or someone needs to use the database for reporting, data warehousing, or analytics this will simplify things greatly.
It's commonly seen as bad practice to name variables, database tables, database columns, functions, etc. with unclear names or abbreviations that not everyone will understand - short codes like this should be seen the same way.

I think it's better to do the conversion from the type code to the type (and vice versa) as close to the database interaction as possible, in this case in Hibernate. This is because your application logic becomes more readable and intuitive if it uses the explicit types.
In my opinion- if(BMW.equals(carTypeCode)) {} is lot more readable than if("X".equals(carTypeCode)) {}.
I am not very familiar with Hibernate, but it would be awesome if you could leverage Hibernate for the mapping of String to DB representation and vice versa (maybe using CASE as you mentioned). Personally, I would probably have modeled these Strings as enums and used something like Hibernate Enum Type mapping. Also, you should think about making these type codes a little bit readable by making them at least few chars because these may come in handy when you are debugging some issue by looking at DB dump and you don't have to consult your type-code to type conversion chart.
Performance-wise, I don't think either option would have much impact in the average case.
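Modeling the values as an enum, as suggested above, might look like the following sketch. The UserType name, its fields and the fromCode helper are assumptions for illustration; how the code is wired into the Hibernate mapping is deliberately left out.

```java
// Hypothetical enum pairing the short DB code with the display label.
enum UserType {
    ADMINISTRATOR('A', "Administrator"),
    REGULAR('R', "Regular");

    private final char code;
    private final String label;

    UserType(char code, String label) {
        this.code = code;
        this.label = label;
    }

    /** Short code as it would be stored in the database column. */
    char code() {
        return code;
    }

    /** Full text shown to the user. */
    String label() {
        return label;
    }

    /** Maps a stored code back to its enum constant. */
    static UserType fromCode(char code) {
        for (UserType t : values()) {
            if (t.code == code) {
                return t;
            }
        }
        throw new IllegalArgumentException("Unknown user type code: " + code);
    }
}
```

Application code can then compare against UserType.ADMINISTRATOR instead of a bare 'A', and the conversion between code and label lives in one place.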

Value object with OID

Can you describe the pros and cons of including an OID (typically a database row identifier) in a POJO representing an entity in your model?
In fact I'm not talking about issues related to equals/hashcode and so on, I should have described better my problem (my bad :) )...
We've got some of those entity classes which represent business objects (like Product, Catalog and so on...). Sometimes they have a 'business id'; for example, a Product can be found by its unique ProductId (which has 3 fields: id, type, repository).
In our database, the Product table has a surrogate primary key column (OID) in addition to the 3 business columns (id, type, repository), to make foreign key references easier and to need fewer join clauses.
The Product/ProductId classes are part of the API that we expose to other applications. So for example they can call :
productManager.findProductById(ProductId productId);
The question is, should or should not the OID be included in the Product or in the ProductId class knowing that our clients are expected to use the ProductId identifier.
Pros :
I can use the OID to do another lookup like
Product p = productManager.findProductById(ProductId productId);
Catalog c = productManager.findAllCatalogsContainingProduct(p.getOid());
We look things up by ProductId a lot in the application, so having the OID on the Product saves a round trip to the database each time, since we don't first have to find the OID matching a ProductId.
Cons :
I've just exposed the OID to a client (let's hope he doesn't use it instead of the business key!!)
Can you list other pros and cons?

Database row identifier = Primary key? If so, there is no pro or con, you have to have it otherwise you can't relate the POJO back to its corresponding database row.
To retrieve Products and Catalogs, the standard SQL way is to do a Join. For example, with my DAL I can do:
SearchCriteria sc = new SearchCriteria();
sc.AddBinding("ProductId", productId);
List<Entity> Products = SQL.Read(sc, new Product(new Catalog()));
or
List<Entity> Products = SQL.Read(sc, new Catalog(new Product()));
This way there is no need to reveal anything to the caller, nor for a roundtrip.

You can run into problems if your implementation of equals() or hashCode() is based off the identifier since it will likely be null initially and then change later once the object is persisted. See below:
http://java.sun.com/javase/6/docs/api/java/util/Set.html
Note: Great care must be exercised if mutable objects are used as set elements. The behavior of a set is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is an element in the set. A special case of this prohibition is that it is not permissible for a set to contain itself as an element.
Let's assume that your implementation of hashCode() is based off the identifier and equals() uses hashCode() in its comparison. If you add the object to a Set while its identifier is null, the equals comparisons will perform one way. If you then persist the object in the set, its identifier value will likely change, thus changing the behavior of equals() and hashCode(). This breaks the "contract" of Set as described above.
It's a bit of an edge case but one worth noting.
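The edge case can be reproduced with a few lines of Java. PersistedEntity and MutableKeyDemo are hypothetical names, and assigning oid by hand stands in for what a persistence layer would do on save. The Set documentation leaves the behavior unspecified; the result described here is what a typical HashSet does.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity whose equality depends on a database-assigned identifier.
class PersistedEntity {
    Long oid; // null until the object is persisted

    @Override
    public boolean equals(Object o) {
        return o instanceof PersistedEntity && Objects.equals(oid, ((PersistedEntity) o).oid);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(oid); // 0 while oid is null
    }
}

class MutableKeyDemo {
    /** Returns true if the set can no longer find the entity after its oid changes. */
    static boolean lostAfterPersist() {
        Set<PersistedEntity> set = new HashSet<>();
        PersistedEntity e = new PersistedEntity();
        set.add(e);   // stored under the hash of a null oid
        e.oid = 42L;  // simulates the id being assigned on persist
        return !set.contains(e);
    }
}
```

The entity is still physically in the set (its size stays 1), but contains and remove look for it under the new hash code and miss it.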

Any implementation of Map<K1, K2, V>, i.e. two keys?

I need a map that has two keys, e.g.
Map2<String /*ssn*/, String /*empId*/, Employee> _employees;
So that I can
_employees.put(e.ssn(), e.empId(), e)
And later
_employees.get1(someSsn);
_employees.get2(someEmpId);
Or even
_employees.remove2(someEmpId);
I am not sure why I want to stop at two, why not more, probably because that's the case I need right now :-) But the type needs to handle a fixed number of keys to be type-safe -- type parameters cannot be vararg :-)
Appreciate any pointers, or advice on why this is a bad idea.

I imagine the main key would be empId, so I would build a Map with that as the key, i.e. empId ---> Employee. All other unique attributes (e.g. ssn) would be treated as secondary and will use separate Maps as a lookup table for empId (e.g. ssn ---> empId).
This implementation makes it easy to add/remove employees, since you only need to change one Map, i.e. empId ---> Employee; the other Maps can be rebuilt only when needed.
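A minimal sketch of this two-map layout. The Employee fields and the EmployeeIndex class are assumptions based on the question, and this version keeps the secondary map in sync eagerly on every change rather than rebuilding it on demand.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical employee type with the two unique attributes from the question.
class Employee {
    final String ssn;
    final String empId;

    Employee(String ssn, String empId) {
        this.ssn = ssn;
        this.empId = empId;
    }
}

class EmployeeIndex {
    private final Map<String, Employee> byEmpId = new HashMap<>();  // primary: empId -> Employee
    private final Map<String, String> empIdBySsn = new HashMap<>(); // secondary: ssn -> empId

    void put(Employee e) {
        removeByEmpId(e.empId); // drop any stale ssn entry for this empId first
        byEmpId.put(e.empId, e);
        empIdBySsn.put(e.ssn, e.empId);
    }

    Employee getByEmpId(String empId) {
        return byEmpId.get(empId);
    }

    Employee getBySsn(String ssn) {
        String empId = empIdBySsn.get(ssn);
        return empId == null ? null : byEmpId.get(empId);
    }

    /** Removes the employee from both maps; returns it, or null if absent. */
    Employee removeByEmpId(String empId) {
        Employee removed = byEmpId.remove(empId);
        if (removed != null) {
            empIdBySsn.remove(removed.ssn);
        }
        return removed;
    }
}
```

Wrapping both maps in one class keeps callers from updating one map and forgetting the other.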

My first thought was: the easiest way to do this, I think, would be two maps.
Map< String, Map< String,Employee> > _employees;
But from what it looks like, you just want to be able to look up an employee by either SSN or ID. What's to stop you then from making two maps, or at worst a class that contains two maps?
As a clarification: are you looking for a compound key, where employees are uniquely identified by the combination of their SSN and ID but not by either one alone, or are you looking for two different ways of referencing an employee?

The Spiffy Framework appears to provide exactly what you're looking for. From the Javadocs:
"A two-dimensional hashmap, is a HashMap that enables you to refer to values via two keys rather than one"
The relevant class is TwoDHashMap. It also provides a ThreeDHashMap.
