The use case that we are working to solve with Cassandra is this: We need to retrieve a list of entity UUIDs that have been updated within a certain time range within the last 90 days. Imagine that we're building a document tracking system, so our relevant entity is a Document, whose key is a UUID.
The query we need to support in this use case is: Find all Document UUIDs that have changed between StartDateTime and EndDateTime.
Question 1: What's the best Cassandra table design to support this query?
I think the answer is as follows:
CREATE TABLE document_change_events (
    event_uuid TIMEUUID,
    document_uuid uuid,
    PRIMARY KEY ((event_uuid), document_uuid)
) WITH default_time_to_live='7776000';
And given that we can't do range queries on partition keys, we'd need to use the token() function. The query would then be:
SELECT document_uuid
FROM document_change_events
WHERE token(event_uuid) > token(minTimeuuid(?))
AND token(event_uuid) < token(maxTimeuuid(?))
For example:
SELECT document_uuid
FROM document_change_events
WHERE token(event_uuid) > token(minTimeuuid('2015-05-10 00:00+0000'))
AND token(event_uuid) < token(maxTimeuuid('2015-05-20 00:00+0000'))
Question 2: I can't seem to get the following Java code, using DataStax's driver, to reliably return the correct results.
If I run the following code 10 times, pausing 30 seconds between runs, I end up with 10 rows in this table:
private void addEvent() {
    String cql = "INSERT INTO document_change_events (event_uuid, document_uuid) VALUES(?,?)";
    PreparedStatement preparedStatement = cassandraSession.prepare(cql);
    BoundStatement boundStatement = new BoundStatement(preparedStatement);
    boundStatement.setConsistencyLevel(ConsistencyLevel.ANY);
    boundStatement.setUUID("event_uuid", UUIDs.timeBased());
    boundStatement.setUUID("document_uuid", UUIDs.random());
    cassandraSession.execute(boundStatement);
}
Here are the results:
cqlsh:> select event_uuid, dateOf(event_uuid), document_uuid from document_change_events;
event_uuid | dateOf(event_uuid) | document_uuid
--------------------------------------+--------------------------+--------------------------------------
414decc0-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:51:09-0500 | 92b6fb6a-9ded-47b0-a91c-68c63f45d338
9abb4be0-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:53:39-0500 | 548b320a-10f6-409f-a921-d4a1170a576e
6512b960-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:52:09-0500 | 970e5e77-1e07-40ea-870a-84637c9fc280
53307a20-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:51:39-0500 | 11b4a49c-b73d-4c8d-9f88-078a6f303167
ac9e0050-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:54:10-0500 | b29e7915-7c17-4900-b784-8ac24e9e72e2
88d7fb30-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:53:09-0500 | c8188b73-1b97-4b32-a897-7facdeecea35
0ba5cf70-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:49:39-0500 | a079b30f-be80-4a99-ae0e-a784d82f0432
76f56dd0-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:52:39-0500 | 3b593ca6-220c-4a8b-8c16-27dc1fb5adde
1d88f910-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:50:09-0500 | ec155e0b-39a5-4d2f-98f0-0cd7a5a07ec8
2f6b3850-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:50:39-0500 | db42271b-04f2-45d1-9ae7-0c8f9371a4db
(10 rows)
But if I then run this code:
private static void retrieveEvents(Instant startInstant, Instant endInstant) {
    String cql = "SELECT document_uuid FROM document_change_events " +
            "WHERE token(event_uuid) > token(?) AND token(event_uuid) < token(?)";
    PreparedStatement preparedStatement = cassandraSession.prepare(cql);
    BoundStatement boundStatement = new BoundStatement(preparedStatement);
    boundStatement.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
    boundStatement.bind(UUIDs.startOf(Date.from(startInstant).getTime()),
            UUIDs.endOf(Date.from(endInstant).getTime()));
    ResultSet resultSet = cassandraSession.execute(boundStatement);
    if (resultSet == null) {
        System.out.println("None found.");
        return;
    }
    while (!resultSet.isExhausted()) {
        System.out.println(resultSet.one().getUUID("document_uuid"));
    }
}
It only retrieves three results:
3b593ca6-220c-4a8b-8c16-27dc1fb5adde
ec155e0b-39a5-4d2f-98f0-0cd7a5a07ec8
db42271b-04f2-45d1-9ae7-0c8f9371a4db
Why didn't it retrieve all 10 results? And what do I need to change to achieve the correct results to support this use case?
For reference, I've tested this against dsc-2.1.1, dse-4.6 and using the DataStax Java Driver v2.1.6.
First of all, please only ask one question at a time. Both of your questions here could easily stand on their own. I know these are related, but it just makes the readers come down with a case of tl;dr.
I'll answer your 2nd question first, because the answer ties into a fundamental understanding that is central to getting the data model correct. When I INSERT your rows and run the following query, this is what I get:
aploetz#cqlsh:stackoverflow2> SELECT document_uuid FROM document_change_events
WHERE token(event_uuid) > token(minTimeuuid('2015-05-10 00:00-0500'))
AND token(event_uuid) < token(maxTimeuuid('2015-05-22 00:00-0500'));
document_uuid
--------------------------------------
a079b30f-be80-4a99-ae0e-a784d82f0432
3b593ca6-220c-4a8b-8c16-27dc1fb5adde
ec155e0b-39a5-4d2f-98f0-0cd7a5a07ec8
db42271b-04f2-45d1-9ae7-0c8f9371a4db
(4 rows)
Which is similar to what you are seeing. Why didn't that return all 10? Well, the answer becomes apparent when I include token(event_uuid) in my SELECT:
aploetz#cqlsh:stackoverflow2> SELECT token(event_uuid),document_uuid FROM document_change_events
WHERE token(event_uuid) > token(minTimeuuid('2015-05-10 00:00-0500'))
AND token(event_uuid) < token(maxTimeuuid('2015-05-22 00:00-0500'));
token(event_uuid) | document_uuid
----------------------+--------------------------------------
-2112897298583224342 | a079b30f-be80-4a99-ae0e-a784d82f0432
2990331690803078123 | 3b593ca6-220c-4a8b-8c16-27dc1fb5adde
5049638908563824288 | ec155e0b-39a5-4d2f-98f0-0cd7a5a07ec8
5577339174953240576 | db42271b-04f2-45d1-9ae7-0c8f9371a4db
(4 rows)
Cassandra stores partition keys (event_uuid in your case) in order of their hashed token values, which you can see using the token function. Cassandra generates partition tokens with a process called consistent hashing to ensure even distribution across the cluster. In other words, rows come back ordered by hash, not by time, so querying by token range doesn't make sense unless the actual (hashed) token values are meaningful to your application.
Getting back to your first question, this means you will have to find a different column to partition on. My suggestion is to use a time-series mechanism called a "date bucket." Picking the date bucket can be tricky, as it depends on your requirements and query patterns, so picking a useful one is really up to you.
For the purposes of this example, I'll pick "month." So I'll re-create your table partitioning on month and clustering by event_uuid:
CREATE TABLE document_change_events2 (
    event_uuid TIMEUUID,
    document_uuid uuid,
    month text,
    PRIMARY KEY ((month), event_uuid, document_uuid)
) WITH default_time_to_live='7776000';
Now I can query by a date range, while also filtering by month:
aploetz#cqlsh:stackoverflow2> SELECT document_uuid FROM document_change_events2
WHERE month='201505'
AND event_uuid > minTimeuuid('2015-05-10 00:00-0500')
AND event_uuid < maxTimeuuid('2015-05-22 00:00-0500');
document_uuid
--------------------------------------
a079b30f-be80-4a99-ae0e-a784d82f0432
ec155e0b-39a5-4d2f-98f0-0cd7a5a07ec8
db42271b-04f2-45d1-9ae7-0c8f9371a4db
92b6fb6a-9ded-47b0-a91c-68c63f45d338
11b4a49c-b73d-4c8d-9f88-078a6f303167
970e5e77-1e07-40ea-870a-84637c9fc280
3b593ca6-220c-4a8b-8c16-27dc1fb5adde
c8188b73-1b97-4b32-a897-7facdeecea35
548b320a-10f6-409f-a921-d4a1170a576e
b29e7915-7c17-4900-b784-8ac24e9e72e2
(10 rows)
Again, "month" may not work for your application. So put some thought into coming up with an appropriate column to partition on, and then you should be able to solve this.
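To tie this back to the Java code from the question, here's a minimal sketch of retrieveEvents rewritten against document_change_events2. It assumes the whole query range falls within a single month bucket (a range spanning buckets would need one query per bucket) and that java.time and the driver's Row are imported; names like cassandraSession are carried over from the question's snippet.
private static void retrieveEvents(Instant startInstant, Instant endInstant) {
    String cql = "SELECT document_uuid FROM document_change_events2 " +
            "WHERE month = ? AND event_uuid > ? AND event_uuid < ?";
    PreparedStatement preparedStatement = cassandraSession.prepare(cql);
    BoundStatement boundStatement = new BoundStatement(preparedStatement);
    boundStatement.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
    // Derive the "201505"-style bucket from the start of the range (UTC assumed).
    String month = DateTimeFormatter.ofPattern("yyyyMM")
            .withZone(ZoneOffset.UTC)
            .format(startInstant);
    boundStatement.bind(month,
            UUIDs.startOf(Date.from(startInstant).getTime()),
            UUIDs.endOf(Date.from(endInstant).getTime()));
    for (Row row : cassandraSession.execute(boundStatement)) {
        System.out.println(row.getUUID("document_uuid"));
    }
}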
Related
I am developing an app for ordering products online. I will first present part of the ER diagram and then explain.
Here the "products" are food items and they will go inside the fresh_products table. As you can see a fresh_product consists of product species, category, type, size and product grade.
Then, when I place an order, the order id and order_status are saved in the order table. All the ordered items are saved in the order_item table. In the order_item table you can see I have a direct connection to fresh_products.
At certain times, the management will decide to change the existing fresh_products. For example, let's take fish products such as Tuna. This fish item has a category called Loin (fished with Loin); after some time, management decides to remove or rename it because they no longer fish with Loin.
Now, my design is affected by the above change. Why? A change to any of the fresh_product fields directly affects the order_item rows that hold the information of already-ordered items. So if you rename Loin, all order_items that have Loin will now be linked to the new name, which is wrong. You can't delete a fresh_product either, because existing order history is bound to it.
My Suggestion
As a solution to this problem, I am thinking of removing the relationship between fresh_products and order_item. Then I will add String fields to order_item representing all the fields of fresh_products, for example productType, productCategory and so on.
Now I don't have a direct connection to fresh_products but have all the information needed. At the same time, any fresh_product item can undergo any change without affecting already-purchased items in order_item.
My question is: is my suggestion the best way to solve this issue? If you have better solutions, I am open to them.
Consider the following; in this example, only the product name changes, but you could easily add columns for each attribute (although at some point this would move from the sublime to the ridiculous)...
Also, I rarely use correlated subqueries, so apologies if there's an error there...
create table product_history
(product_id int not null
,product_name varchar(12) not null
,date date not null
,primary key (product_id,date)
);
insert into product_history values
(1,'soda','2020-01-01'),
(1,'pop','2020-01-04'),
(1,'cola','2020-01-07');
create table order_detail
(order_id int not null
,product_id int not null
,quantity int not null default 1
,primary key(order_id,product_id)
);
insert into order_detail values
(100,1,4);
create table orders
(order_id serial primary key
,customer_id int not null
,order_date date not null
);
insert into orders values
(100,22,'2020-01-05');
SELECT o.order_id
, o.customer_id
, o.order_date
, ph.product_id
, ph.product_name
, od.quantity
FROM orders o
JOIN order_detail od
ON od.order_id = o.order_id
JOIN product_history ph
ON ph.product_id = od.product_id
WHERE ph.date IN
( SELECT MAX(x.date)
FROM product_history x
WHERE x.product_id = ph.product_id
AND x.date <= o.order_date
);
| order_id | customer_id | order_date | product_id | product_name | quantity |
| -------- | ----------- | ---------- | ---------- | ------------ | -------- |
| 100 | 22 | 2020-01-05 | 1 | pop | 4 |
I need to check from Java if a certain user has at least one group membership. In Oracle (12, by the way) there is a big table that looks like this:
DocId | Group
-----------------
1 | Group-A
1 | Group-E
1 | Group-Z
2 | Group-A
3 | Group-B
3 | Group-W
In Java I have this information:
docId = 1
listOfUsersGroups = { "Group-G", "Group-A", "Group-Of-Something-Totally-Different" }
I have seen solutions like this, but this is not the approach I want to go for. I would like to do something like this (I know this is incorrect syntax) ...
SELECT * FROM PERMSTABLE WHERE DOCID = 1 AND ('Group-G', 'Group-A', 'Group-Of-Something-Totally-Different' ) HASATLEASTONE OF Group
... and not use any temporary SQL INSERTs. The outcome should be that after executing this query I know that my user has a match, because he is a member of Group-A.
You can do this using an IN condition:
SELECT * FROM PERMSTABLE WHERE DocId = 1 AND Group IN
('Group-G', 'Group-A', 'Group-Of-Something-Totally-Different')
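From Java, one way to express this is to generate one bind marker per group; a minimal sketch, assuming an open java.sql.Connection named connection and the table/columns from the question (GROUP is quoted because it is a reserved word in Oracle):
List<String> groups = Arrays.asList(
        "Group-G", "Group-A", "Group-Of-Something-Totally-Different");
String placeholders = String.join(",", Collections.nCopies(groups.size(), "?"));
String sql = "SELECT COUNT(*) FROM PERMSTABLE " +
        "WHERE DOCID = ? AND \"GROUP\" IN (" + placeholders + ")";
try (PreparedStatement ps = connection.prepareStatement(sql)) {
    ps.setInt(1, 1); // docId
    for (int i = 0; i < groups.size(); i++) {
        ps.setString(i + 2, groups.get(i)); // one bind per group name
    }
    try (ResultSet rs = ps.executeQuery()) {
        rs.next();
        boolean hasAtLeastOne = rs.getInt(1) > 0; // true if any group matches
    }
}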
I have successfully called a stored procedure ("Users_info") from my Java app, which returns a table that contains two columns ("user" and "status"):
CallableStatement proc_stmt = con.prepareCall("{call Users_info (?)}");
proc_stmt.setInt(1, 1);
ResultSet rs = proc_stmt.executeQuery();
I managed to get all the user names and their respective statuses ("0" for inactive and "1" for active) into ArrayLists using ResultSet.getString:
ArrayList<String> userName = new ArrayList<String>();
ArrayList<String> userStatus = new ArrayList<String>();
while (rs.next()) {
    userName.add(rs.getString("user"));
    userStatus.add(rs.getString("status"));
}
Database shows:
user | status|
------------|--------
Joshua H. | 1 |
Mark J. | 0 |
Jonathan R. | 1 |
Angie I. | 0 |
Doris S. | 0 |
I want to find the best or most efficient way to separate the active users from the inactive users.
I can do this by comparing values using if statements which, because of my limited knowledge of Java, I suspect is not the most efficient way; maybe there is a better way that you can help me understand.
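For instance, I imagine a single-pass grouping like this (just a sketch, assuming the same "user" and "status" columns) instead of chained if statements:
Map<String, List<String>> usersByStatus = new HashMap<>();
while (rs.next()) {
    // Group each user name under its status ("1" = active, "0" = inactive).
    usersByStatus
            .computeIfAbsent(rs.getString("status"), k -> new ArrayList<>())
            .add(rs.getString("user"));
}
List<String> activeUsers = usersByStatus.getOrDefault("1", new ArrayList<>());
List<String> inactiveUsers = usersByStatus.getOrDefault("0", new ArrayList<>());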
Thanks.
I'm having a big problem with the embedded Firebird database engine and Java SE.
I'm currently developing a filtering tool for users to filter out data.
So I have made two options for filtering; the user can choose one or both:
Filter out from a black list (the black list is controlled by the user).
Filter out according to a massive list that records every record ever uploaded and filtered out.
The data the user uploads is plain text, comma- or token-separated, like this:
(SET OF COLUMNS)| RECORD TO FILTER |
0-MANY COLUMNS | ABC2 |
0-MANY COLUMNS | ABC5 |
When I upload it to the DB, I add a flag for every filter:
(SET OF COLUMNS) | RECORD TO FILTER | FLAG FOR FILTER A | FLAG FOR FILTER B |
0-MANY COLUMNS | ABC2 | | |
0-MANY COLUMNS | ABC5 | | |
So, for the second filter, the program has a main table that is empty on the first run of the software; it then fills that table with all the records from the very first upload.
After a few text uploads by the user, the main table will have unique records like the following:
Record | Date criteria for filtering |
ABC1 | 08/11/2012:1,07/11/2012:3,06/11/2012:5|
ABC2 | 05/11/2012:1,04/11/2012:0,03/11/2012:0|
ABC3 | 12/11/2012:3,11/11/2012:0,10/11/2012:0|
ABC4 | 12/11/2012:1,11/11/2012:0,10/11/2012:0|
ABC5 | 12/11/2012:3,11/11/2012:0,10/11/2012:3|
ABC9 | 11/11/2012:3,10/11/2012:1,09/11/2012:0|
When the data is processed, for example, the software updates both the main table and the user table:
(SET OF COLUMNS)| RECORD TO FILTER | FLAG FOR FILTER A | FLAG FOR FILTER B |
0-MANY COLUMNS | ABC4 | | |
0-MANY COLUMNS | ABC9 | | |
So the main table will be updated:
Record | Day criteria for filtering |
ABC1 | 08/11/2012:1,07/11/2012:3,06/11/2012:5|
ABC2 | 05/11/2012:1,04/11/2012:0,03/11/2012:0|
ABC3 | 12/11/2012:3,11/11/2012:0,10/11/2012:0|
ABC4 | 12/11/2012:1,11/11/2012:0,10/11/2012:0| ->12/11/2012:2,11/11/2012:0,10/11/2012:0
ABC5 | 12/11/2012:3,11/11/2012:0,10/11/2012:3|
ABC9 | 11/11/2012:3,10/11/2012:1,09/11/2012:0| ->12/11/2012:1,11/11/2012:3,10/11/2012:1
If the criteria count over the last three days has reached more than four, the user table will flag filter B (see the sketch after the next table). Notice that each date has an integer next to it.
(SET OF COLUMNS)| RECORD TO FILTER | FLAG FOR FILTER A | FLAG FOR FILTER B |
0-MANY COLUMNS | ABC4 | | |
0-MANY COLUMNS | ABC9 | | X |
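For reference, the flagging decision boils down to summing the integers in the dates string; a sketch with a hypothetical helper, assuming the "dd/MM/yyyy:count" comma-separated format shown above:
// Hypothetical helper: sums the count after each date in a
// "dd/MM/yyyy:n,dd/MM/yyyy:n,..." string and checks the filter-B threshold.
static boolean shouldFlagFilterB(String dayCriteria, int threshold) {
    int total = 0;
    for (String entry : dayCriteria.split(",")) {
        total += Integer.parseInt(entry.substring(entry.indexOf(':') + 1).trim());
    }
    return total > threshold; // e.g. threshold = 4 for the last three days
}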
Both updates are in a single transaction. The problem is that when the user uploads more than 800,000 records, my program throws the following exception in the while loop.
I use StringBuilder parsing and append methods for maximum performance on the mutable days string.
java.lang.OutOfMemoryError: Java heap space
Here is my code; I use five days:
FactoriaDeDatos factoryInstace = FactoriaDeDatos.getInstance();
Connection cnx = factoryInstace.getConnection();
cnx.setAutoCommit(false);
PreparedStatement pstmt = null;
ResultSet rs = null;
pstmt = cnx.prepareStatement("SELECT CM.MAIL,CM.FECHAS FROM TCOMERCIALMAIL CM INNER JOIN TEMPMAIL TMP ON CM.MAIL=TMP." + colEmail);
rs = pstmt.executeQuery();
pstmtDet = cnx.prepareStatement("ALTER INDEX IDX_MAIL INACTIVE");
pstmtDet.executeUpdate();
pstmtDet = cnx.prepareStatement("SET STATISTICS INDEX IDX_FECHAS");
pstmtDet.executeUpdate();
pstmtDet = cnx.prepareStatement("ALTER INDEX IDX_FECHAS INACTIVE");
pstmtDet.executeUpdate();
pstmtDet = cnx.prepareStatement("SET STATISTICS INDEX IDX_FECHAS");
pstmtDet.executeUpdate();
sql_com_local_tranx = 0;
int trxNum = 0;
int ix = 0;
int ixE1 = 0;
int ixAc = 0;
StringBuilder sb;
StringTokenizer st;
String fechas;
int pos1, pos2, pos3, pos4, pos5, pos6, pos7, pos8, pos9;
StringBuilder s1, s2, sSQL, s4, s5, s6, s7, s8, s9, s10;
long startLoop = System.nanoTime();
long time2;
boolean ejecutoMax = false;
//int paginador_sql = 1000;
//int trx_ejecutada = 0;
sb = new StringBuilder();
s1 = new StringBuilder();
s2 = new StringBuilder();
sSQL = new StringBuilder();
s4 = new StringBuilder();
s6 = new StringBuilder();
s8 = new StringBuilder();
s10 = new StringBuilder();
while (rs.next()) {
    // From here: parse the dates string
    actConteoDia = 0;
    sb.setLength(0);
    sb.append(rs.getString(2));
    pos1 = sb.indexOf(":", 0);
    pos2 = sb.indexOf(",", pos1 + 1);
    pos3 = sb.indexOf(":", pos2 + 1);
    pos4 = sb.indexOf(",", pos3 + 1);
    pos5 = sb.indexOf(":", pos4 + 1);
    pos6 = sb.indexOf(",", pos5 + 1);
    pos7 = sb.indexOf(":", pos6 + 1);
    pos8 = sb.indexOf(",", pos7 + 1);
    pos9 = sb.indexOf(":", pos8 + 1);
    s1.setLength(0);
    s1.append(sb.substring(0, pos1));
    s2.setLength(0);
    s2.append(sb.substring(pos1 + 1, pos2));
    s4.setLength(0);
    s4.append(sb.substring(pos3 + 1, pos4));
    s6.setLength(0);
    s6.append(sb.substring(pos5 + 1, pos6));
    s8.setLength(0);
    s8.append(sb.substring(pos7 + 1, pos8));
    s10.setLength(0);
    s10.append(sb.substring(pos9 + 1));
    actConteoDia = Integer.parseInt(s2.toString());
    actConteoDia++;
    sb.setLength(0);
    //sb.append(s1).a
    if (actConteoDia > MAXIMO_LIMITE_POR_SEMANA) {
        actConteoDia = MAXIMO_LIMITE_POR_SEMANA + 1;
    }
    sb.append(s1).append(":").append(actConteoDia).append(",").append(rs.getString(2).substring(pos2 + 1, rs.getString(2).length()));
    // Every date record takes approx. 8.3 ms to process
    sSQL.setLength(0);
    sSQL.append("UPDATE TCOMERCIALMAIL SET FECHAS='").append(sb.toString()).append("' WHERE MAIL='").append(rs.getString(1)).append("'");
    pstmtDet1.addBatch(sSQL.toString());
    //actConteoDia = 0;
    //actConteoDia += Integer.parseInt(s2.toString());
    actConteoDia += Integer.parseInt(s4.toString());
    actConteoDia += Integer.parseInt(s6.toString());
    actConteoDia += Integer.parseInt(s8.toString());
    actConteoDia += Integer.parseInt(s10.toString());
    if (actConteoDia > MAXIMO_LIMITE_POR_SEMANA) {
        sSQL.setLength(0);
        sSQL.append("UPDATE TEMPMAIL SET DIASLIMITE='S' WHERE ").append(colEmail).append("='").append(rs.getString(1)).append("'");
        pstmtDet.addBatch(sSQL.toString());
    }
    sql_com_local_tranx++;
    if (sql_com_local_tranx % 2000 == 0 || sql_com_local_tranx % 7000 == 0) {
        brDias.setString("PROCESANDO " + sql_com_local_tranx);
        pstmtDet1.executeBatch();
        pstmtDet.executeBatch();
    }
    if (sql_com_local_tranx % 100000 == 0) {
        System.gc();
        System.runFinalization();
    }
}
pstmtDet1.executeBatch();
pstmtDet.executeBatch();
cnx.commit();
I've made telemetry tests so I can trace where the problem lies.
I think the big while loop is the problem, but I don't know where exactly the problem lies.
I'm adding some images of the telemetry tests; please help me interpret them properly.
The GC activity becomes inversely related to the time the JVM keeps objects alive:
http://imageshack.us/photo/my-images/849/66780403.png
The memory heap grows from 50 MB to 250 MB; when the used heap reaches 250 MB, the OutOfMemory exception occurs:
50 MB
http://imageshack.us/photo/my-images/94/52169259.png
REACHING 250 MB
http://imageshack.us/photo/my-images/706/91313357.png
OUT OF MEMORY
http://imageshack.us/photo/my-images/825/79083069.png
The final stack of objects generated, ordered by LiveBytes:
http://imageshack.us/photo/my-images/546/95529690.png
Any help, suggestion, answer will be vastly appreciated.
The problem is that you are using a PreparedStatement as if it were a Statement, by calling addBatch(String). The javadoc of this method says:
Note: This method cannot be called on a PreparedStatement or CallableStatement.
This note was added with JDBC 4.0; before that, it said the method was optional. The fact that Jaybird allows you to call this method on a PreparedStatement is therefore a bug: I created issue JDBC-288 in the Jaybird tracker.
Now to the cause of the OutOfMemoryError: when you use addBatch(String) on the PreparedStatement implementation of Jaybird (FBPreparedStatement), it is added to a list internal to the Statement implementation (FBStatement). In the case of FBStatement, calling executeBatch() executes all statements in this list and then clears it. In FBPreparedStatement, however, executeBatch() is overridden to execute the originally prepared statement with batch parameters (in your example it won't do anything, as you never actually add a PreparedStatement-style batch). It will never execute the statements you added with addBatch(String), but it will also never clear the list of statements in FBStatement, and that is most likely the cause of your OutOfMemoryError.
Based on this, the solution is to create a Statement using cnx.createStatement() and use that to execute your queries, or to investigate whether you could benefit from using one or more PreparedStatement objects with parameterized queries. It looks like you should be able to use two separate PreparedStatements, but I am not 100% sure; the added benefits would be protection against SQL injection and a minor performance improvement.
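As a sketch of that second option (table, column and variable names are taken from your snippet; this is an illustration under those assumptions, not a drop-in fix):
PreparedStatement updFechas = cnx.prepareStatement(
        "UPDATE TCOMERCIALMAIL SET FECHAS = ? WHERE MAIL = ?");
PreparedStatement updLimite = cnx.prepareStatement(
        "UPDATE TEMPMAIL SET DIASLIMITE = 'S' WHERE " + colEmail + " = ?");
while (rs.next()) {
    // ... rebuild the dates string (sb) and actConteoDia as before ...
    updFechas.setString(1, sb.toString());
    updFechas.setString(2, rs.getString(1));
    updFechas.addBatch(); // parameter batch; cleared by executeBatch()
    if (actConteoDia > MAXIMO_LIMITE_POR_SEMANA) {
        updLimite.setString(1, rs.getString(1));
        updLimite.addBatch();
    }
    if (++sql_com_local_tranx % 2000 == 0) {
        updFechas.executeBatch();
        updLimite.executeBatch();
    }
}
updFechas.executeBatch();
updLimite.executeBatch();
cnx.commit();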
Addendum
This issue has been fixed since Jaybird 2.2.2
Full disclosure: I am the developer of the Jaybird / Firebird JDBC driver.
Do not execute batch statements while iterating through the result set. Store the SQL you want to execute in a collection, and when you have finished processing the result set, start executing the new SQL. Does everything have to happen inside the same transaction?
I get an exception (see below) if I try to do
resultset.getString("add_date");
for a JDBC connection to a MySQL database containing a DATETIME value of 0000-00-00 00:00:00 (the quasi-null value for DATETIME), even though I'm just trying to get the value as a string, not as an object.
I got around this by doing
SELECT CAST(add_date AS CHAR) as add_date
which works, but seems silly... is there a better way to do this?
My point is that I just want the raw DATETIME string, so I can parse it myself as is.
Note: here's where the 0000 comes in (from http://dev.mysql.com/doc/refman/5.0/en/datetime.html):
Illegal DATETIME, DATE, or TIMESTAMP values are converted to the “zero” value of the appropriate type ('0000-00-00 00:00:00' or '0000-00-00').
The specific exception is this one:
SQLException: Cannot convert value '0000-00-00 00:00:00' from column 5 to TIMESTAMP.
SQLState: S1009
VendorError: 0
java.sql.SQLException: Cannot convert value '0000-00-00 00:00:00' from column 5 to TIMESTAMP.
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1055)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:956)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:926)
at com.mysql.jdbc.ResultSetImpl.getTimestampFromString(ResultSetImpl.java:6343)
at com.mysql.jdbc.ResultSetImpl.getStringInternal(ResultSetImpl.java:5670)
at com.mysql.jdbc.ResultSetImpl.getString(ResultSetImpl.java:5491)
at com.mysql.jdbc.ResultSetImpl.getString(ResultSetImpl.java:5531)
Alternative answer, you can use this JDBC URL directly in your datasource configuration:
jdbc:mysql://yourserver:3306/yourdatabase?zeroDateTimeBehavior=convertToNull
Edit:
Source: MySQL Manual
Datetimes with all-zero components (0000-00-00 ...) — These values can not be represented reliably in Java. Connector/J 3.0.x always converted them to NULL when being read from a ResultSet.
Connector/J 3.1 throws an exception by default when these values are encountered as this is the most correct behavior according to the JDBC and SQL standards. This behavior can be modified using the zeroDateTimeBehavior configuration property. The allowable values are:
exception (the default), which throws an SQLException with an SQLState of S1009.
convertToNull, which returns NULL instead of the date.
round, which rounds the date to the nearest closest value which is 0001-01-01.
Update: Alexander reported a bug affecting mysql-connector-5.1.15 on that feature. See CHANGELOGS on the official website.
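For illustration, a minimal plain-JDBC sketch using that property (server, database, credentials and table name are placeholders):
String url = "jdbc:mysql://yourserver:3306/yourdatabase"
        + "?zeroDateTimeBehavior=convertToNull";
try (Connection con = DriverManager.getConnection(url, "user", "password");
     Statement st = con.createStatement();
     ResultSet rs = st.executeQuery("SELECT add_date FROM yourtable")) {
    while (rs.next()) {
        String addDate = rs.getString("add_date"); // NULL for zero dates
    }
}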
I stumbled across this attempting to solve the same issue. The installation I am working with uses JBOSS and Hibernate, so I had to do this a different way. For the basic case, you should be able to add zeroDateTimeBehavior=convertToNull to your connection URI as per this configuration properties page.
I found other suggestions across the land referring to putting that parameter in your hibernate config:
In hibernate.cfg.xml:
<property name="hibernate.connection.zeroDateTimeBehavior">convertToNull</property>
In hibernate.properties:
hibernate.connection.zeroDateTimeBehavior=convertToNull
But I had to put it in my mysql-ds.xml file for JBOSS as:
<connection-property name="zeroDateTimeBehavior">convertToNull</connection-property>
Hope this helps someone. :)
My point is that I just want the raw DATETIME string, so I can parse it myself as is.
That makes me think that your "workaround" is not a workaround, but in fact the only way to get the value from the database into your code:
SELECT CAST(add_date AS CHAR) as add_date
By the way, some more notes from the MySQL documentation:
MySQL Constraints on Invalid Data:
Before MySQL 5.0.2, MySQL is forgiving of illegal or improper data values and coerces them to legal values for data entry. In MySQL 5.0.2 and up, that remains the default behavior, but you can change the server SQL mode to select more traditional treatment of bad values such that the server rejects them and aborts the statement in which they occur.
[..]
If you try to store NULL into a column that doesn't take NULL values, an error occurs for single-row INSERT statements. For multiple-row INSERT statements or for INSERT INTO ... SELECT statements, MySQL Server stores the implicit default value for the column data type.
MySQL 5.x Date and Time Types:
MySQL also allows you to store '0000-00-00' as a “dummy date” (if you are not using the NO_ZERO_DATE SQL mode). This is in some cases more convenient (and uses less data and index space) than using NULL values.
[..]
By default, when MySQL encounters a value for a date or time type that is out of range or otherwise illegal for the type (as described at the beginning of this section), it converts the value to the “zero” value for that type.
DATE_FORMAT(column name, '%Y-%m-%d %T') as dtime
Use this to avoid the error. It returns the date in string format, and then you can get it as a string:
resultset.getString("dtime");
This actually does NOT work: even though you call getString, internally MySQL still tries to convert the value to a date first:
at com.mysql.jdbc.ResultSetImpl.getDateFromString(ResultSetImpl.java:2270) ~[mysql-connector-java-5.1.15.jar:na]
at com.mysql.jdbc.ResultSetImpl.getStringInternal(ResultSetImpl.java:5743) ~[mysql-connector-java-5.1.15.jar:na]
at com.mysql.jdbc.ResultSetImpl.getString(ResultSetImpl.java:5576) ~[mysql-connector-java-5.1.15.jar:na]
If, after adding the lines:
<property name="hibernate.connection.zeroDateTimeBehavior">convertToNull</property>
hibernate.connection.zeroDateTimeBehavior=convertToNull
<connection-property name="zeroDateTimeBehavior">convertToNull</connection-property>
there is still an error:
Illegal DATETIME, DATE, or TIMESTAMP values are converted to the “zero” value of the appropriate type ('0000-00-00 00:00:00' or '0000-00-00').
then find lines like:
1) resultSet.getTime("time"); // time = 00:00:00
2) resultSet.getTimestamp("timestamp"); // timestamp = 00000000000000
3) resultSet.getDate("date"); // date = 0000-00-00 00:00:00
replace with the following lines, respectively:
1) Time.valueOf(resultSet.getString("time"));
2) Timestamp.valueOf(resultSet.getString("timestamp"));
3) Date.valueOf(resultSet.getString("date"));
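Note that if you also use zeroDateTimeBehavior=convertToNull, getString can return NULL, and Timestamp.valueOf(null) throws; a guarded sketch:
String rawTimestamp = resultSet.getString("timestamp");
// Guard against NULL before converting (zero dates arrive as NULL
// with convertToNull).
Timestamp timestamp = (rawTimestamp == null)
        ? null
        : Timestamp.valueOf(rawTimestamp);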
I wrestled with this problem and implemented the 'convertToNull' solutions discussed above. It worked in my local MySQL instance, but when I deployed my Play/Scala app to Heroku it no longer worked. Heroku concatenates several args to the DB URL that they provide users, and because of Heroku's concatenation of "?" before their own set of args, this solution will not work. However, I found a different solution which seems to work equally well.
SET sql_mode = 'NO_ZERO_DATE';
I put this in my table descriptions, and it solved the problem of '0000-00-00 00:00:00' not being representable as java.sql.Timestamp.
I suggest you use null to represent a null value.
What is the exception you get?
BTW:
There is no year 0 or 0000 (though some date systems allow such a year).
And there is no month 0 of the year or day 0 of the month (which may be the cause of your problem).
I solved the problem by considering that '00-00-....' isn't a valid date; I changed my SQL column definition, adding the NULL keyword to permit null values:
SELECT "-- Tabla item_pedido";
CREATE TABLE item_pedido (
id INTEGER AUTO_INCREMENT PRIMARY KEY,
id_pedido INTEGER,
id_item_carta INTEGER,
observacion VARCHAR(64),
fecha_estimada TIMESTAMP,
fecha_entrega TIMESTAMP NULL, -- HERE IT IS!! NULL = DELIVERY DATE NOT SET YET
CONSTRAINT fk_item_pedido_id_pedido FOREIGN KEY (id_pedido)
REFERENCES pedido(id),...
Then I have to be able to insert NULL values, meaning "I didn't register that timestamp yet"...
SELECT "++ INSERT item_pedido";
INSERT INTO item_pedido VALUES
(01, 01, 01, 'Ninguna', ADDDATE(#HOY, INTERVAL 5 MINUTE), NULL),
(02, 01, 02, 'Ninguna', ADDDATE(#HOY, INTERVAL 3 MINUTE), NULL),...
The table looks like this:
mysql> select * from item_pedido;
+----+-----------+---------------+-------------+---------------------+---------------------+
| id | id_pedido | id_item_carta | observacion | fecha_estimada | fecha_entrega |
+----+-----------+---------------+-------------+---------------------+---------------------+
| 1 | 1 | 1 | Ninguna | 2013-05-19 15:09:48 | NULL |
| 2 | 1 | 2 | Ninguna | 2013-05-19 15:07:48 | NULL |
| 3 | 1 | 3 | Ninguna | 2013-05-19 15:24:48 | NULL |
| 4 | 1 | 6 | Ninguna | 2013-05-19 15:06:48 | NULL |
| 5 | 2 | 4 | Suave | 2013-05-19 15:07:48 | 2013-05-19 15:09:48 |
| 6 | 2 | 5 | Seco | 2013-05-19 15:07:48 | 2013-05-19 15:12:48 |
| 7 | 3 | 5 | Con Mayo | 2013-05-19 14:54:48 | NULL |
| 8 | 3 | 6 | Bilz | 2013-05-19 14:57:48 | NULL |
+----+-----------+---------------+-------------+---------------------+---------------------+
8 rows in set (0.00 sec)
Finally: JPA in action:
@Stateless
@LocalBean
public class PedidosServices {
    @PersistenceContext(unitName = "vagonpubPU")
    private EntityManager em;
    private Logger log = Logger.getLogger(PedidosServices.class.getName());

    @SuppressWarnings("unchecked")
    public List<ItemPedido> obtenerPedidosRetrasados() {
        log.info("Obteniendo listado de pedidos retrasados");
        Query qry = em.createQuery("SELECT ip FROM ItemPedido ip, Pedido p WHERE" +
                " ip.fechaEntrega IS NULL" +
                " AND ip.idPedido=p.id" +
                " AND ip.fechaEstimada < :arg3" +
                " AND (p.idTipoEstado=:arg0 OR p.idTipoEstado=:arg1 OR p.idTipoEstado=:arg2)");
        qry.setParameter("arg0", Tipo.ESTADO_BOUCHER_ESPERA_PAGO);
        qry.setParameter("arg1", Tipo.ESTADO_BOUCHER_EN_SERVICIO);
        qry.setParameter("arg2", Tipo.ESTADO_BOUCHER_RECIBIDO);
        qry.setParameter("arg3", new Date());
        return qry.getResultList();
    }
}
At last, it all works. I hope this helps you.
To add to the other answers: if you want the 0000-00-00 string, you can use noDatetimeStringSync=true (with the caveat of sacrificing timezone conversion).
The official MySQL bug: https://bugs.mysql.com/bug.php?id=47108.
Also, for history: JDBC used to return NULL for 0000-00-00 dates but now returns an exception by default.
You can append the JDBC URL with:
?zeroDateTimeBehavior=convertToNull&autoReconnect=true&characterEncoding=UTF-8&characterSetResults=UTF-8
With this, the driver converts '0000-00-00 00:00:00' to a null value. For example:
jdbc:mysql://<host-name>/<db-name>?zeroDateTimeBehavior=convertToNull&autoReconnect=true&characterEncoding=UTF-8&characterSetResults=UTF-8