I'm having a hard time finding my mistake, so I could use some help.
I'm using 2 tables: "diamerismata", and another one chosen from what the user selects in previous steps. It appears below as "+KataxDiamTable+".
My query is:
String monthly = "SELECT diamerismata.DIAMERISMA as Διαμέρισμα, diamerismata.ΟΝΟΜΑ as Όνομα, "
    + KataxDiamTable + ".LASTDIFF as ΠρΔιαφορά, "
    + KataxDiamTable + ".ΧΡΕΩΣΕΙΣ as Χρέωση, "
    + KataxDiamTable + ".DATE as Ημερομηνία, "
    + KataxDiamTable + ".DIFF as Διαφορά "
    + "FROM diamerismata, " + KataxDiamTable + " "
    + "WHERE MONTH(" + KataxDiamTable + ".DATE) = " + ms + " "
    + "AND YEAR(" + KataxDiamTable + ".DATE) = " + yr + " "
    + "AND " + KataxDiamTable + ".IDdiam = diamerismata.IDdiam "
    + "AND ID = ( SELECT MAX(ID) FROM " + KataxDiamTable + " WHERE " + KataxDiamTable + ".IDdiam = diamerismata.IDdiam) "
    + "ORDER BY diamerismata.DIAMERISMA";
But it returns only 4 rows when it should return 10.
"yr" and "ms" are date filters, but every record of mine has the same date, so the problem isn't there.
IDdiam and ID are ascending and unique.
Any ideas? Thank you.
I have a dataframe in the following schema, that I extract from a Hive table using the SQL below:
Id | Group_name | Sub_group_number | Year_Month
1  | Active     | 1                | 202110
2  | Active     | 3                | 202110
3  | Inactive   | 4                | 202110
4  | Active     | 1                | 202110
The T-SQL to extract the information is:
SELECT Id, Group_Name, Sub_group_number, Year_Month
FROM table
WHERE Year_Month = 202110
AND id IN (SELECT Id FROM table WHERE Year_Month = 202109 AND Sub_group_number = 1)
After extracting this information, I want to group by Sub_group_number and count the Ids, as below:
df = (df.withColumn('FROM', F.lit(1))
      .groupBy('Year_Month', 'FROM', 'Sub_group_number')
      .count())
The result is a table as below:
Year_Month | From | Sub_group_number | Quantity
202110     | 1    | 1                | 2
202110     | 1    | 3                | 1
202110     | 1    | 4                | 1
Up to this point there is no issue in my code, and I'm able to run action commands with Spark. The issue appears when I try to make Year_Month and Sub_group_number parameters of my T-SQL in order to build a complete table. I'm using the following code:
sub_groups = [i for i in range(22)]
year_months = [202101, 202102, 202103]
for month in year_months:
    for group in sub_groups:
        query = f"""SELECT Id, Group_Name, Sub_group_number, Year_Month
        FROM table
        WHERE Year_Month = {month + 1}
        AND id IN (SELECT Id FROM table WHERE Year_Month = {month} AND Sub_group_number = {group})"""
        df_temp = (spark.sql(query)
                   .withColumn('FROM', F.lit(group))
                   .groupBy('Year_Month', 'FROM', 'Sub_group_number')
                   .count())
        df = df.union(df_temp).dropDuplicates()
When I execute df.show() or try to write the result as a table, I get this error:
An error occurred while calling o8522.showString
Any ideas of what is causing this error?
You're attempting string interpolation.
If you are using Python, maybe try str.format with a triple-quoted string (a single-quoted string cannot span several lines):
query = """SELECT Id, Group_Name, Sub_group_number, Year_Month
FROM table
WHERE Year_Month = {0}
AND id IN (SELECT Id FROM table WHERE Year_Month = {1}
AND Sub_group_number = {2})""".format(month + 1, month, group)
The error states it is a StackOverflowError, which can happen when the DAG plan grows too large. Because of Spark's lazy evaluation, this can easily happen with for-loops, especially nested ones like yours. If you are curious, call df.explain() where you called df.show(); you should see a very long physical plan that Spark cannot actually run.
To solve this, avoid for-loops as much as possible. In your case, it seems you don't need one.
sub_groups = [i for i in range(22)]
year_months = [202101, 202102, 202103]
# Modify this to use datetime lib for more robustness (ex: handle 202112 -> 202201).
month_plus = [x + 1 for x in year_months]

def _to_str_elms(li):
    # Render a list as a comma-separated string for an SQL IN (...) clause.
    return ', '.join(str(x) for x in li)

# The f prefix is required so that the {...} expressions are interpolated.
spark.sql(f"""
SELECT Id, Group_Name, Sub_group_number, Year_Month
FROM table
WHERE Year_Month IN ({_to_str_elms(month_plus)})
AND id IN (SELECT Id FROM table WHERE Year_Month IN ({_to_str_elms(year_months)}) AND Sub_group_number IN ({_to_str_elms(sub_groups)}))
""")
UPDATE:
I think I understand why you are looping. You need the "parent" group alongside the Sub_group_number of each record, and you are using lit with the looped value to get it. One way to rethink the problem is to first fetch all records in [202101, 202102, 202103, 202104] with a single query, then use window functions to work out the parent group. I can't yet foresee exactly how that would look, so if you can give us some sample records and the logic for how you want to derive the "group", I can perhaps provide updates.
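The comment in the snippet above points out that adding 1 to a YYYYMM value breaks at a year boundary (202112 would become 202113). A minimal sketch of a rollover-safe helper, assuming Year_Month is always an integer in YYYYMM form. It uses plain integer arithmetic rather than the datetime library, and next_year_month is a hypothetical name, not something from the original code:

```python
def next_year_month(year_month: int) -> int:
    """Return the YYYYMM integer for the month after `year_month`."""
    year, month = divmod(year_month, 100)
    if not 1 <= month <= 12:
        raise ValueError(f"not a valid YYYYMM value: {year_month}")
    # December rolls over to January of the following year.
    if month == 12:
        return (year + 1) * 100 + 1
    return year * 100 + month + 1


# Example values only; 202112 is included to exercise the rollover.
year_months = [202101, 202102, 202112]
month_plus = [next_year_month(ym) for ym in year_months]
print(month_plus)  # [202102, 202103, 202201]
```

With this in place, month_plus could be built as [next_year_month(x) for x in year_months] instead of [x + 1 for x in year_months].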
I'm running this script in MySQL to retrieve the count of appointments grouped by type and month:
SELECT Type, month(Start), COUNT(Type) FROM appointments
GROUP BY Type, month(Start);
Which provides me with this table:
[1]: https://i.stack.imgur.com/sRJwx.png
This is great! Now my problem is trying to place the data into variables:
String sql = "SELECT `Type`, month(`Start`), COUNT(Type) FROM appointments " +
        "GROUP BY `Type`, month(`Start`)";
try (Statement s = DatabaseConnection.getConnection().createStatement();
     ResultSet rs = s.executeQuery(sql)) {
    while (rs.next()) {
        String typeKey = rs.getString("Type");
        int monthValue = rs.getInt("month(Start)");
        int count = rs.getInt("COUNT(Type)");
I get an SQLException when I run this, saying that the column "month(Type)" cannot be found. I have tried the column name "Start" and also just "month", and receive the same SQLException (I know, kind of a stupid troubleshooting step, but worth a shot). I've looked around for solutions but I'm coming up short.
This is for a school project, so I can only use JDBC and Java specific drivers. I'd appreciate it if someone could point me in the right direction.
Try renaming the columns with AS, then read them by alias, for example rs.getInt("mstart"):
String sql = "SELECT `Type` AS type, " +
        "month(`Start`) AS mstart, " +
        "COUNT(Type) AS ctype " +
        "FROM appointments " +
        "GROUP BY `Type`, month(`Start`)";
Initially the count is 0 in the database. I need to increment it every time a request comes in.
Update Query
Here count is initially 0.
Suppose int count = 10;
query.setParameter("count", count);
It updates only once:
the output is 11. The next time, the same query gives the same output, 11, but I need 12.
String hql = "Update User f set f.count =:count + 1 where f.userId =:userId";
Query query = session.createQuery(hql);
query.setParameter("userId", userId);
//query.setParameter("count", count);
result = query.executeUpdate();
The count is not updated. Looking forward to a positive reply.
Thank you..!
Did you try incrementing the column itself, so that the database adds 1 to whatever value is currently stored?
"Update User f set f.count=(f.count + 1) where f.userId =:userId";
I only need the two most current items in a result set and was wondering what the best way to do that would be without a break. I realize that rs.next() returns true or false and tried to stop it with a counter but that failed. I have this at the moment:
while (rs.next()) {
    String name = rs.getString("name");
    String startTime = rs.getString("starting_time");
    String endTime = rs.getString("ending_time");
    String date = rs.getString("directory");
    String loc = rs.getString("location");
    htmlBuilder.append("<li><a href='public/committees/calendar'>"+ name+"<br>");
    htmlBuilder.append(date +" "+startTime+" - "+endTime+"</a> <!-- Link/title/date/start-end time --><br>");
    htmlBuilder.append("<strong>Location: </strong>"+loc+"<br>");
    htmlBuilder.append("</li>");
}
html = htmlBuilder.toString();
As you can tell, this returns everything from the ResultSet but I only need the first two entries.
Here is my correct query:
SELECT to_char(to_date(to_char(x.starting_date), 'J'),'mm/dd/yyyy') as start_date,
       to_char(to_date(to_char(x.ending_date), 'J'),'mm/dd/yyyy') as end_date,
       to_char(to_date(to_char(x.starting_date), 'J'),'yyyy-mm-dd') as directory,
       x.starting_time, x.ending_time, x.description, x.description as location,
       x.name, x.short_name, x.add_info_url, x.contact_name, x.contact_info
FROM calitem x, calendar x, calitemtypes x
WHERE x.calendar_id = x.calendar_id
  AND x.short_name LIKE ?
  AND x.style_id = 0
  AND x.starting_date > to_char(sysdate-1, 'J')
  AND x.item_type_id = x.item_type_id
  AND ROWNUM <= 3
ORDER BY to_date(to_char(x.starting_date), 'J')
Adding the rownum attribute worked perfectly and the query was ordered before return. Thanks for the help
You should limit the query so that it returns only the two most current rows. This can be achieved with an ORDER BY clause and a LIMIT clause (LIMIT exists in MySQL; not sure about other DBs).
There is no need to add a counter for the rows returned by the ResultSet.
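A minimal sketch of the ORDER BY plus LIMIT idea, using Python's built-in sqlite3 module with an in-memory database so it runs on its own. The table contents are made-up sample rows, and only two columns of the question's calitem table are modelled:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calitem (name TEXT, starting_date TEXT)")
# Made-up sample rows, inserted out of order on purpose.
conn.executemany(
    "INSERT INTO calitem VALUES (?, ?)",
    [
        ("Budget review", "2014-03-20"),
        ("Kickoff", "2014-03-18"),
        ("Retrospective", "2014-03-25"),
    ],
)

# Sort first, then keep only the first two rows of the sorted result.
rows = conn.execute(
    "SELECT name, starting_date FROM calitem ORDER BY starting_date LIMIT 2"
).fetchall()
print(rows)  # [('Kickoff', '2014-03-18'), ('Budget review', '2014-03-20')]
```

LIMIT is not available in Oracle, which the ROWNUM in the question suggests is the database in use. Oracle 12c and later accept FETCH FIRST 2 ROWS ONLY after the ORDER BY. On older versions, note that ROWNUM is assigned before ORDER BY is applied, so the ordered query has to go in a subquery with the ROWNUM filter outside it.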
I want to filter my table to show records by month, so I made text boxes for the user input. I don't know whether my query is correct: I don't get any error, but I also don't get any results. I used LIKE because no specific day is provided. Can someone suggest a better way?
ConnectToDatabase conn = null;
conn = ConnectToDatabase.getConnectionToDatabase();
String query = "Select * from inventoryreport where InDate LIKE "+txtYear.getText()+""+ txtMonth.getText()+"";
conn.setPreparedStatement(conn.getConnection().prepareStatement(query));
conn.setResultSet(conn.getPreparedStatement().executeQuery());
java.sql.ResultSetMetaData metaData = conn.getResultSet().getMetaData();
int columns = metaData.getColumnCount();
for (int i = 1; i <= columns; i++) {
    columnNames.addElement(metaData.getColumnName(i));
}
LIKE is the wrong choice because the database can't use an index with it, so it will be slow (and here it doesn't work).
The query should look like this:
SELECT * FROM inventoryreport WHERE YEAR(Date_column) = 2014 AND MONTH(Date_column) = 3;
So your code is:
String query = "Select * from inventoryreport where YEAR(InDate) = " +txtYear.getText()+" AND MONTH(InDate) = "+ txtMonth.getText();
I think it is a small mistake in the date format:
Your format: YYYYMM (no separator)
Right format: YYYY-MM-DD (with '-' as the separator)
I think
String query = "Select * from inventoryreport where InDate LIKE '"+txtYear.getText()+"-"+ txtMonth.getText()+"-00'";
should fix it, if your database only includes monthly exact values.
Otherwise you should use
Select * from inventoryreport where InDate BETWEEN '2014-03-18' AND '2014-03-20'
A SQL query can take advantage of indexes when the column is not wrapped in a function. The following WHERE clause allows the use of an index:
SELECT *
FROM inventoryreport
WHERE Date_Column >= str_to_date(concat_ws('-', txtYear.getText(), txtMonth.getText(), '01'), '%Y-%m-%d') and
Date_Column < adddate(str_to_date(concat_ws('-', txtYear.getText(), txtMonth.getText(), '01'), '%Y-%m-%d'), interval 1 month)
Although more complicated, all the manipulations are on constants, so the query engine can still take advantage of an index on Date_Column.
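A minimal sketch of the same half-open range (first day of the month inclusive, first day of the next month exclusive), with the bounds computed in application code and passed as bound parameters instead of being concatenated into the SQL. It uses Python's built-in sqlite3 module with made-up sample rows so it runs on its own. The helper month_bounds and the item column are hypothetical, and InDate is assumed to hold ISO YYYY-MM-DD text:

```python
import sqlite3
from datetime import date


def month_bounds(year: int, month: int):
    """Return (first day of the month, first day of the next month)."""
    start = date(year, month, 1)
    # December rolls over to January of the following year.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start, end


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventoryreport (item TEXT, InDate TEXT)")
# Made-up sample rows around the edges of March 2014.
conn.executemany(
    "INSERT INTO inventoryreport VALUES (?, ?)",
    [
        ("bolts", "2014-02-28"),
        ("nuts", "2014-03-01"),
        ("washers", "2014-03-31"),
        ("screws", "2014-04-01"),
    ],
)

start, end = month_bounds(2014, 3)
# The column is compared directly against constants, so an index on it stays usable.
rows = conn.execute(
    "SELECT item FROM inventoryreport WHERE InDate >= ? AND InDate < ? ORDER BY InDate",
    (start.isoformat(), end.isoformat()),
).fetchall()
items = [r[0] for r in rows]
print(items)  # ['nuts', 'washers']
```

In the Java code from the question, the equivalent would be a PreparedStatement with two ? placeholders, which also keeps the text-box input out of the SQL string.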