I have the following SQL query to group orders by the order date and hour in a day:
select to_char(o.order_date, 'YYYY-MM-DD HH24') order_date_hour,
sum(o.quantity) quantity
from orders o
where o.order_date >= to_date('01.02.2016', 'DD.MM.YYYY')
and o.order_date < to_date('03.02.2016', 'DD.MM.YYYY')
group by to_char(o.order_date, 'YYYY-MM-DD HH24')
order by to_char(o.order_date, 'YYYY-MM-DD HH24');
An example for the result is as follows:
ORDER_DATE_HOUR | QUANTITY
2016-02-01 06 | 10
2016-02-03 09 | 20
The query works as expected in SQL Developer.
In QueryDSL I came up with the following query:
SQLQuery q = queryFactory.createSQLQuery();
q.from(order);
q.where(order.orderDate.goe(Timestamp.valueOf(from)))
.where(order.orderDate.lt(Timestamp.valueOf(to)));
q.groupBy(to_char(order.orderDate, "YYYY-MM-DD HH24"));
q.orderBy(order.orderDate.asc());
List<Tuple> result = q.list(to_char(order.orderDate, "YYYY-MM-DD HH24"), order.quantity);
to_char is a method I found in this thread: https://groups.google.com/forum/#!msg/querydsl/WD04ZRon-88/nP5QhqhwCUcJ
The exception I get is:
java.sql.SQLSyntaxErrorException: ORA-00979: not a GROUP BY expression
I tried a few variations of the query with no luck at all.
Does anyone know why the query is failing?
Thanks :)
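The original query fails because it selects order.quantity without aggregating it and orders by order.orderDate, and neither expression appears in the GROUP BY clause. You can use StringTemplate and DateTemplate to build custom expressions, as done in the unit test com.querydsl.sql.TemplateTest: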
StringTemplate datePath = Expressions.stringTemplate(
"to_char({0},'{1s}')", order.orderDate, ConstantImpl.create("YYYY-MM-DD HH24"));
DateTemplate from = Expressions.dateTemplate(
Date.class, "to_date({0},'{1s}')", fromStr, ConstantImpl.create("DD.MM.YYYY"));
DateTemplate to = Expressions.dateTemplate(
Date.class, "to_date({0},'{1s}')", toStr, ConstantImpl.create("DD.MM.YYYY"));
query.select(datePath.as("order_date_hour"), order.quantity.sum().as("quantity"))
.from(order)
.where(order.orderDate.goe(from)
.and(order.orderDate.lt(to)))
.groupBy(datePath)
.orderBy(datePath.asc());
List<Tuple> results = query.fetch();
Here the printout for query.getSQL().getSQL():
select to_char("order".order_date,'YYYY-MM-DD HH24') order_date_hour, sum("order".quantity) quantity
from "order" "order"
where "order".order_date >= to_date(?,'DD.MM.YYYY') and "order".order_date < to_date(?,'DD.MM.YYYY')
group by to_char("order".order_date,'YYYY-MM-DD HH24')
order by to_char("order".order_date,'YYYY-MM-DD HH24') asc
I have a dataframe in the following schema, that I extract from a Hive table using the SQL below:
Id | Group_name | Sub_group_number | Year_Month
1 | Active | 1 | 202110
2 | Active | 3 | 202110
3 | Inactive | 4 | 202110
4 | Active | 1 | 202110
The T-SQL to extract the information is:
SELECT Id, Group_Name, Sub_group_number, Year_Month
FROM table
WHERE Year_Month = 202110
AND id IN (SELECT Id FROM table WHERE Year_Month = 202109 AND Sub_group_number = 1)
After extracting this information, I want to group by Sub_group_number to count the Ids, as below:
df = (df.withColumn('FROM', F.lit(1))
.groupBy('Year_Month', 'FROM', 'Sub_group_number')
.count())
The result is a table as below:
Year_Month | From | Sub_group_number | Quantity
202110 | 1 | 1 | 2
202110 | 1 | 3 | 1
202110 | 1 | 4 | 1
Up to this point there is no issue with my code, and I am able to run action commands with Spark. The issue appears when I try to make Year_Month and Sub_group_number parameters of my T-SQL in order to build a complete table. I'm using the following code:
sub_groups = [i for i in range(22)]
year_months = [202101, 202102, 202103]
for month in year_months:
    for group in sub_groups:
        query = f"""SELECT Id, Group_Name, Sub_group_number, Year_Month
        FROM table
        WHERE Year_Month = {month + 1}
        AND id IN (SELECT Id FROM table WHERE Year_Month = {month} AND Sub_group_number = {group})"""
        df_temp = (spark.sql(query)
                   .withColumn('FROM', F.lit(group))
                   .groupBy('Year_Month', 'FROM', 'Sub_group_number')
                   .count())
        df = df.union(df_temp).dropDuplicates()
When I call df.show() or try to write the result as a table, I get this error:
An error occurred while calling o8522.showString
Any ideas of what is causing this error?
You're attempting string interpolation.
If using Python, maybe try this:
# Triple quotes are needed for a string literal that spans several lines.
query = """SELECT Id, Group_Name, Sub_group_number, Year_Month
FROM table
WHERE Year_Month = {0}
AND id IN (SELECT Id FROM table WHERE Year_Month = {1}
AND Sub_group_number = {2})""".format(month + 1, month, group)
The error states that it is a StackOverflowError, which can happen when the DAG plan grows too large. Because of Spark's lazy evaluation, this happens easily with for-loops, especially nested ones. If you are curious, call df.explain() where you called df.show(); you should see a very long physical plan that Spark cannot actually execute.
To solve this, avoid for-loops as much as possible; in your case it seems you don't need one.
sub_groups = [i for i in range(22)]
year_months = [202101, 202102, 202103]
# Modify this to use datetime lib for more robustness (ex: handle 202112 -> 202201).
month_plus = [x + 1 for x in year_months]
def _to_str_elms(li):
    # Render [1, 2, 3] as "1, 2, 3" for use inside an SQL IN (...) list.
    return str(li)[1:-1]
# The f prefix is required so that the {...} placeholders are substituted.
spark.sql(f"""
SELECT Id, Group_Name, Sub_group_number, Year_Month
FROM table
WHERE Year_Month IN ({_to_str_elms(month_plus)})
AND id IN (SELECT Id FROM table WHERE Year_Month IN ({_to_str_elms(year_months)}) AND Sub_group_number IN ({_to_str_elms(sub_groups)}))
""")
UPDATE:
I think I understand why you are looping. You need the "parent" group alongside the Sub_group_number of each record, and you are using lit with the looped value. One way to rethink the problem is to first fetch all records in [202101, 202102, 202103, 202104] with a single query, then use window functions to work out the parent group. I can't yet foresee what that looks like, so if you can give us some sample records and the logic for deriving the "group", I can perhaps provide updates.
I am using HQL to get the data inserted exactly 21 days ago. Here is my code:
Query queryThreeWeek = session.createQuery("from Users where createdDate = CURDATE()-21");
List<Users> userDetailsThreeWeekList = queryThreeWeek.list();
I can not use createSQLQuery.
Right now I am not getting any data, although there is data for the date 2016-06-20. That is because the subtraction crosses a month boundary; when I used CURDATE()-7 I got the correct data for the date 2016-07-04.
The calculation for the date goes like this:
2016-07-11 - 7 = 20160704
2016-07-11 - 21 = 20160690
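The two results above are what integer subtraction on the number 20160711 produces, as opposed to calendar arithmetic on the date. A minimal Java sketch of the difference, using the dates from the question; the class and method names are hypothetical and only serve the illustration:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class DateArithmeticDemo {
    // Treats the date as the plain number yyyyMMdd and subtracts the days as an integer.
    static int numericSubtract(LocalDate date, int days) {
        int asNumber = Integer.parseInt(date.format(DateTimeFormatter.BASIC_ISO_DATE));
        return asNumber - days;
    }

    // Calendar-aware subtraction that rolls over month boundaries.
    static LocalDate calendarSubtract(LocalDate date, int days) {
        return date.minusDays(days);
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2016, 7, 11);
        System.out.println(numericSubtract(today, 7));   // 20160704, happens to be a valid date
        System.out.println(numericSubtract(today, 21));  // 20160690, not a valid date
        System.out.println(calendarSubtract(today, 21)); // 2016-06-20
    }
}
```

The 7-day case only looks correct because the subtraction stays inside the same month.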
I also tried using INTERVAL, which is meant for native SQL queries. Here is my code for using INTERVAL in HQL:
Query queryThreeWeek = session.createQuery("from Users where createdDate = DATE( DATE_SUB( NOW() , INTERVAL 21 DAY ) )");
List<Users> userDetailsThreeWeekList = queryThreeWeek.list();
Also tried
Query queryThreeWeek = session.createQuery("from Users where createdDate = DATE( DATE_SUB( CURDATE() , INTERVAL 21 DAY ) )");
List<Users> userDetailsThreeWeekList = queryThreeWeek.list();
but it throws an exception: org.hibernate.hql.internal.ast.QuerySyntaxException: unexpected token: 21.
So what can I use in HQL only, instead of subtracting the days like this: CURDATE()-21?
I have solved the issue by using one native SQL query which can get me the exact date.
Query sub3Week = session.createSQLQuery("select DATE( DATE_SUB( CURDATE() , INTERVAL 21 DAY ) ) from dual");
List<Date> sub3WeekList = sub3Week.list();
And then I use this data in the HQL query like this:
Query queryThreeWeek = session.createQuery("from Users where createdDate = :createdDate");
queryThreeWeek.setParameter("createdDate", sub3WeekList.get(0).toString());
List<Users> userDetailsThreeWeekList = queryThreeWeek.list();
You can use DATE_SUB in a native SQL query (not an HQL query!):
"from Users where createdDate = DATE( DATE_SUB( NOW() , INTERVAL 21 DAY ) )"
The solution with HQL is quite simple:
final long time = System.currentTimeMillis() - java.time.Duration.ofDays(21).toMillis();
final javax.persistence.Query query = entityManagerOrSession.createQuery(
"SELECT x FROM users x WHERE x.createddate> :time");
query.setParameter("time", new java.sql.Timestamp(time));
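Note that the query above returns everything newer than the instant 21 days ago. If the goal is only the rows created on the calendar day exactly 21 days ago, one option is to bind the start of that day and the start of the following day as a half-open range. A minimal sketch, assuming createdDate is a timestamp column; the class name and the parameter names :start and :end are hypothetical:

```java
import java.sql.Timestamp;
import java.time.LocalDate;

class DayWindow {
    // Inclusive lower bound: midnight at the start of the day `daysAgo` days before `today`.
    static Timestamp startOfDay(LocalDate today, int daysAgo) {
        return Timestamp.valueOf(today.minusDays(daysAgo).atStartOfDay());
    }

    // Exclusive upper bound: midnight at the start of the following day.
    static Timestamp startOfNextDay(LocalDate today, int daysAgo) {
        return Timestamp.valueOf(today.minusDays(daysAgo).plusDays(1).atStartOfDay());
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.now();
        // Intended for an HQL query such as (assumption, not from the original answer):
        //   from Users where createdDate >= :start and createdDate < :end
        System.out.println("start = " + startOfDay(today, 21));
        System.out.println("end   = " + startOfNextDay(today, 21));
    }
}
```

Both bounds are computed in the JVM's default time zone, which may or may not match the database's.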
I am trying to retrieve records from an Oracle database table using the JDBC thin driver.
The prepared statement I'm using:
(1)
SELECT (t1.LOGGED_TIME - ?) AS TDIFF, t1.ID, t1.STATUS, t1.LOGGED_TIME, t1.SERVER_TIME
FROM table_1 t1
WHERE (
((t1.LOGGED_TIME - ?) <= INTERVAL '10' DAY)
AND ((t1.LOGGED_TIME - ?) >= INTERVAL '-10' DAY))
ORDER BY t1.LOGGED_TIME DESC
where t1.LOGGED_TIME represents a timestamp column. All three parameters are the same timestamp, set with
java.sql.Timestamp controlTime = Timestamp.valueOf("2014-08-15 03:52:00");
lookupTime.setTimestamp(1, controlTime);
lookupTime.setTimestamp(2, controlTime);
lookupTime.setTimestamp(3, controlTime);
Executing the code works fine - no exceptions or warnings are displayed. Nevertheless the resultset returned by
rs = lookupTime.executeQuery();
is empty.
Setting the query to
(2)
SELECT (t1.LOGGED_TIME - TO_TIMESTAMP('2014-08-15 03:52', 'yyyy-mm-dd hh24:mi')) AS TDIFF, t1.ID, t1.STATUS, t1.LOGGED_TIME, t1.SERVER_TIME
FROM table_1 t1
WHERE (
((t1.LOGGED_TIME - TO_TIMESTAMP('2014-08-15 03:52', 'yyyy-mm-dd hh24:mi')) <= INTERVAL '10' DAY)
AND ((t1.LOGGED_TIME - TO_TIMESTAMP('2014-08-15 03:52', 'yyyy-mm-dd hh24:mi')) >= INTERVAL '-10' DAY))
ORDER BY t1.LOGGED_TIME DESC
returns the expected data.
When I query e.g. strings from another column of the same table with a prepared statement the result is ok.
What am I missing here? Any ideas?
To be clear: the point is not to identify some wrong date/time format conversion in (2). That would always lead to an Oracle error message and could be fixed easily.
The question is: why does the ResultSet returned by the prepared statement (1) stay empty (not a single record) without any error notification? If the Timestamp format is wrong in any way, why is there no error or warning?
Check your TO_TIMESTAMP format:
TO_TIMESTAMP('2014-08-15 03:52',
'dd.mm.yy hh24:mi')
Aug. 14, 2015, not Aug. 15, 2014
Update
Actually, I get the following error when trying that one:
ORA-01843: not a valid month
01843. 00000 - "not a valid month"
Update2
A Java Timestamp maps to an Oracle DATE data type, not a TIMESTAMP. Don't know if that makes a difference, but you might try TO_TIMESTAMP(?).
I would however change the query to allow use of a potential index on LOGGED_TIME:
SELECT ID, STATUS, LOGGED_TIME, SERVER_TIME
FROM table_1
WHERE LOGGED_TIME BETWEEN ? AND ?
ORDER BY LOGGED_TIME DESC
Then do all the math in Java:
Timestamp controlTime = Timestamp.valueOf("2014-08-15 03:52:00");
Calendar cal = Calendar.getInstance();
cal.setTime(controlTime);
cal.add(Calendar.DAY_OF_MONTH, -10);
lookupTime.setTimestamp(1, new Timestamp(cal.getTimeInMillis()));
cal.setTime(controlTime);
cal.add(Calendar.DAY_OF_MONTH, 10);
lookupTime.setTimestamp(2, new Timestamp(cal.getTimeInMillis()));
try (ResultSet rs = lookupTime.executeQuery()) {
while (rs.next()) {
long tdiffInSeconds = (rs.getTimestamp("LOGGED_TIME").getTime() - controlTime.getTime()) / 1000;
// other code
}
}
I'm trying to select all entities from a database where a certain date is older than 7 days. It works fine via SQLyog, but in Java it always throws this error:
[33, 76] The expression is not a valid conditional expression.
[76, 101] The query contains a malformed ending.
This is my query in Java:
SELECT a FROM Applicants a WHERE (a.lastMod <= CURRENT_DATE - INTERVAL 7 DAY) ORDER BY a.applDate ASC
Could the problem be the "CURRENT_DATE" part?
CURRENT_DATE is ok, but INTERVAL 7 DAY is not a valid JPQL expression. You'll need to supply the date as a parameter:
WHERE a.lastMod <= :dateParam
Example:
Query q = em.createQuery("SELECT a FROM Applicants a WHERE a.lastMod <= :dateParam ORDER BY a.applDate ASC");
q.setParameter("dateParam", dateParam);
List<Applicants> applicants = (List<Applicants>)q.getResultList();
// or, to avoid casting (thanks to @DavidSN)
TypedQuery<Applicants> q = em.createQuery("SELECT a FROM Applicants a WHERE a.lastMod <= :dateParam ORDER BY a.applDate ASC", Applicants.class);
q.setParameter("dateParam", dateParam);
List<Applicants> applicants = q.getResultList();
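The snippets above leave open how dateParam is computed. A minimal sketch of one way to derive it with java.time, assuming lastMod is mapped as a java.util.Date and that "7 days ago" means midnight at the start of that day, as CURRENT_DATE - INTERVAL 7 DAY would; the class and method names are hypothetical:

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.util.Date;

class CutoffDate {
    // Midnight at the start of the day `days` days before `today`, in the given zone.
    static Date daysBefore(LocalDate today, int days, ZoneId zone) {
        return Date.from(today.minusDays(days).atStartOfDay(zone).toInstant());
    }

    public static void main(String[] args) {
        ZoneId zone = ZoneId.systemDefault();
        Date dateParam = daysBefore(LocalDate.now(zone), 7, zone);
        // dateParam can then be passed to q.setParameter("dateParam", dateParam).
        System.out.println(dateParam);
    }
}
```

Passing the zone explicitly keeps the cutoff predictable when the application server and the database run in different time zones.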
EntityManager em = ...
Query q = em.createQuery ("SELECT a FROM Applicants a WHERE a.lastMod <= :dateParam");
q.setParameter("dateParam" , dateParam);
List<blabla> results = q.getResultList ();
I have a native query I need to change to HQL.
The original query is:
SELECT COUNT(DISTINCT `table`.`id`) FROM `database`.`table` WHERE
(`table`.`date`" " BETWEEN '" + year + "-01-01 00:00:00' AND '" + year
+ "-12-31 23:59:59') AND `table`.`box` NOT LIKE '' GROUP BY MONTH(`table`.`date`)";
I tried something like:
StringBuilder hql = new StringBuilder();
hql.append(" select count(distinct table.id)");
hql.append(" from Table table");
hql.append(" where table.date between '?-01-01 00:00:00' and '?-12-31 23:59:59'");
hql.append(" and table.box not like ''");
hql.append("group by month (table.date)");
query.setParameter(1, Integer.toString(year));
query.setParameter(2, Integer.toString(year));
where year is an int passed to the method as an argument.
The generated query is:
Hibernate: select count(distinct table0_.id) as col_0_0_ from table table0_ where (table0_.date between '2013-01-01 00:00:00' and '2013-12-31 23:59:59') and (table0_.box not like '') group by month(table0_.date)
My problem is: using the native query, I get one value and using the hql I get another for month 2 (February). For month 1 (January) results are the same.
What am I missing here?
Thanks in advance,
gtludwig
They look like the same query to me, apart from the schema qualification. Are you running them against different instances? For example, one in 'database' and the other pointing to another database with similar data loaded?