split the request for every 100 characters - java

I am calling a stored procedure with a request in which one field contains more than 100 characters, and the call fails because the field size is limited to 100. How can I split the request into pieces of 100 characters for that field? For example, when 250 characters are passed in that field, the call has to be split so that each call carries at most 100 characters.
FYI, I am not updating anything in the DB, just reading values from it.

You've given very little to go on, but here's my best guess at a solution:
String longStr; // your long string
// \G anchors at the end of the previous match, so this splits after every 100 characters
for (String max100 : longStr.split("(?<=\\G.{100})")) {
    connection.execute("call someProc('" + max100 + "')");
}
This code is very simplistic and is for illustrative use only. In reality you'd use a prepared statement with placeholders.
That said, the splitting code, which is the core of this question, should be helpful.

Try this:
// longStr is your long string
String substring = null;
PreparedStatement prepped = connection.prepareStatement("call someProc(?)", ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_UPDATABLE);
int maxLoop = longStr.length() / 100;
for (int i = 0; i < maxLoop; i++) {
substring = longStr.substring(100 * i, 100 * (i + 1));
prepped.setString(1, substring);
ResultSet result = prepped.executeQuery();
}
if (longStr.length() % 100 != 0) {
substring = longStr.substring(maxLoop * 100);
prepped.setString(1, substring);
ResultSet result = prepped.executeQuery();
}
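Both answers above come down to taking consecutive substrings of at most 100 characters. Here is a minimal sketch of that splitting step on its own. The class name Chunker and the method splitEvery are hypothetical, made up for illustration; each resulting piece would then be bound to the prepared statement as in the answer above.

```java
import java.util.ArrayList;
import java.util.List;

class Chunker {
    /** Splits text into consecutive pieces of at most size characters. */
    static List<String> splitEvery(String text, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<String> parts = new ArrayList<>();
        for (int start = 0; start < text.length(); start += size) {
            // Math.min keeps the final, shorter piece in bounds.
            parts.add(text.substring(start, Math.min(text.length(), start + size)));
        }
        return parts;
    }

    public static void main(String[] args) {
        String longStr = "x".repeat(250); // stand-in for the 250-character field value
        for (String part : splitEvery(longStr, 100)) {
            System.out.println(part.length()); // prints 100, 100, 50
        }
    }
}
```

Using Math.min for the end index removes the need for a separate "remainder" branch after the loop.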

Related

Binding query param in LENGTH() condition doesn't work

I'm having some trouble finding any info about this problem, but it appears to be a limitation of SQLite.
Consider a simple words table with 2 fields, _id (int) and word (text). The following query works and returns the expected results (all words which are 12 characters or less):
SELECT * FROM words WHERE LENGTH(word) <= 12;
However if this character limit needs to be dynamic and made into a parameter, the query no longer works.
It returns all rows of the table:
String query = "SELECT * FROM words WHERE LENGTH(word) <= ?";
Cursor cursor = database.rawQuery(query, new String[]{ Integer.toString(12) });
I also tried selecting the length as a new column, then applying the condition to that, but it gives the same results:
String query = "SELECT w.*, LENGTH(w.word) AS word_length FROM words w WHERE word_length <= ?";
Cursor cursor = database.rawQuery(query, new String[]{ Integer.toString(12) });
Is my only option to just filter through the query results afterward? Why do parameterized conditions on normal INT columns work but not on LENGTH()? (e.g. WHERE _id < ? works fine)
The sql statement that is executed with:
rawQuery(query, new String[]{ Integer.toString(12) });
is:
SELECT * FROM words WHERE LENGTH(word) <= '12';
and not:
SELECT * FROM words WHERE LENGTH(word) <= 12;
because rawQuery() treats all the passed parameters as strings and encloses all of them inside single quotes.
So the integer LENGTH(word) is compared to the string literal '12', and this is where a rule of SQLite comes into play, which states that:
An INTEGER or REAL value is less than any TEXT or BLOB value.
(from Datatypes In SQLite Version 3).
So all integers are considered less than the string literal '12'.
Of course this is not what you want and expect, so what you can do is force a conversion of '12' to the integer 12 and you can do it by adding 0 to it:
String query = "SELECT * FROM words WHERE LENGTH(word) <= ? + 0";
Applying a numeric operation to '12' implicitly converts it to the integer 12.
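To see why every row comes back, here is a toy model of the ordering rule quoted above, written in Java. It is not SQLite's implementation: the class StorageClassOrder and its compare method are made up for illustration, and NULL, BLOB, and column affinity are ignored.

```java
class StorageClassOrder {
    /**
     * Toy comparison: any number sorts before any text,
     * mirroring "An INTEGER or REAL value is less than any TEXT or BLOB value".
     */
    static int compare(Object left, Object right) {
        boolean leftIsNumber = left instanceof Number;
        boolean rightIsNumber = right instanceof Number;
        if (leftIsNumber && rightIsNumber) {
            return Double.compare(((Number) left).doubleValue(), ((Number) right).doubleValue());
        }
        if (leftIsNumber) {
            return -1; // number vs text: the number is always smaller
        }
        if (rightIsNumber) {
            return 1;  // text vs number: the text is always larger
        }
        return left.toString().compareTo(right.toString());
    }

    public static void main(String[] args) {
        // A word length of 1000 against the bound parameter '12' (text):
        System.out.println(compare(1000, "12") <= 0); // true, so the row passes the filter
        // The same length against the integer 12:
        System.out.println(compare(1000, 12) <= 0);   // false, so the row is filtered out
    }
}
```

In this model the value of the text never matters once the other side is a number, which matches the behaviour seen with the bound string parameter.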
Of course as soon as I rubber duck this I'm able to figure out a workaround by casting the field to an integer:
String query = "SELECT * FROM words WHERE CAST(LENGTH(word) AS INTEGER) <= ?";
Cursor cursor = database.rawQuery(query, new String[]{ Integer.toString(12) });
Seems excessive but I guess the return value of LENGTH() isn't considered an integer (all documentation I've come across just says it "returns the number of characters")
Did you try Integer.parseInt()? I think your query needs an integer parameter and you are converting it to a string :)

Cut out different elements from a string and put them into a list

Here's the updated code. For those following along, the question's edit history contains the original question.
if (0 != searchString.length()) {
for (int index = input.indexOf(searchString, 0);
index != -1;
index = input.indexOf(searchString, eagerMatching ? index + 1 : index + searchString.length())) {
occurences++;
System.out.println(occurences);
indexIN=input.indexOf(ListStringIN, occurences - 1) + ListStringIN.length();
System.out.println(indexIN);
System.out.println(ListStringIN.length());
indexOUT=input.indexOf(ListStringOUT, occurences - 1);
System.out.println(indexOUT);
Lresult.add(input.substring(indexIN, indexOUT));
System.out.println();
}
}
As you can see, I print out the index numbers.
My code works well with only one element.
But when I write something like this: %%%%ONE++++ %%%%TWO++++
I get this exception:
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: begin 16, end 7, length 23
at java.base/java.lang.String.checkBoundsBeginEnd(String.java:3410)
at java.base/java.lang.String.substring(String.java:1883)
at com.DMMS.Main.identify(Main.java:81)
I found out that indexIN moves to the start of the second element, but indexOUT does not.
I couldn't find out why.
When you look at your code, you can notice that in the first loop, the one that counts the number of occurrences, your code "knows" that it has to use the version of indexOf() that takes an offset into the string being searched.
In other words: you know that you have to search after previous "hits" when walking through your string.
But in your second loop, the one that has to extract the actual elements, you are using indexOf() without a meaningful offset. Therefore you keep "copying out" the same part repeatedly.
Thus: "simply" apply the same logic from loop 1 to loop 2!
Beyond that:
you don't need two loops for that. Counting occurrences and "copying out" the matching code ... can be done in one loop
and honestly: rewrite that first loop. This code is almost incomprehensible for human beings. A reader would have to sit down and read this 10, 20 times, and then run it in a debugger to understand what it is doing
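The advice above (search after the previous hit, and do everything in one loop) can be sketched as follows. This is only an illustration under assumptions: the class MarkerExtractor and the method extractBetween are hypothetical names, and the markers are passed in as parameters instead of being read from the static fields in the question.

```java
import java.util.ArrayList;
import java.util.List;

class MarkerExtractor {
    /** Collects every piece of input that sits between an open marker and the next close marker. */
    static List<String> extractBetween(String input, String open, String close) {
        if (open.isEmpty() || close.isEmpty()) {
            throw new IllegalArgumentException("markers must not be empty");
        }
        List<String> result = new ArrayList<>();
        int from = 0; // position where the next search starts
        while (true) {
            int start = input.indexOf(open, from);
            if (start == -1) {
                break; // no further opening marker
            }
            int contentStart = start + open.length();
            // Look for the closing marker only after the current opening marker.
            int end = input.indexOf(close, contentStart);
            if (end == -1) {
                break; // unterminated element, stop here
            }
            result.add(input.substring(contentStart, end));
            from = end + close.length(); // continue after this element
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(extractBetween("%%%%ONE++++ %%%%TWO++++", "%%%%", "++++")); // prints [ONE, TWO]
    }
}
```

Because both indexOf() calls receive an offset that advances with each element, the second element's end index can never be smaller than its start index, which is what caused the StringIndexOutOfBoundsException.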
I did it!
Here's the code:
.........................
static String ListStringIN = "%%%%";
static String ListStringOUT = "++++";
........................
else if (input.contains(ListStringIN) && input.contains(ListStringOUT)) {
System.out.println("Identifiziere Liste...");
String searchString = ListStringIN;
int occurences = 0;
boolean eagerMatching = false;
if (0 != searchString.length()) {
for (int index = input.indexOf(searchString, 0); index != -1; index = input
.indexOf(searchString, eagerMatching ? index + 1 : index + searchString.length())) {
occurences++;
System.out.println(occurences);
indexIN=input.indexOf(ListStringIN, occurences - 1) + ListStringIN.length();
System.out.println(indexIN);
//indexOUT=input.indexOf(ListStringOUT, occurences);
//indexOUT=input.indexOf(ListStringOUT, occurences - 1);
indexOUT = input.indexOf(ListStringOUT, eagerMatching ? index + 1 : index + ListStringOUT.length());
System.out.println(indexOUT);
Lresult.add(input.substring(indexIN, indexOUT));
System.out.println();
}
}
//for (int i = 0; i <occurences; i ++) {
// Lresult.add(input.substring(input.indexOf(ListStringIN, 0) + ListStringIN.length(), input.indexOf(ListStringOUT)));
//}
result = Lresult.toString();
return result;
}
I hope this is useful for other people
@GhostCat Thanks for your advice!

Calculating levenshtein distance between two strings

I'm executing the following Postgres query.
SELECT * FROM description WHERE levenshtein(desci, 'Description text?') <= 6 LIMIT 10;
I'm using the following code to execute the above query.
public static boolean authQuestion(String question) throws SQLException{
boolean isDescAvailable = false;
Connection connection = null;
try {
connection = DbRes.getConnection();
String query = "SELECT * FROM description WHERE levenshtein(desci, ? ) <= 6";
PreparedStatement checkStmt = connection.prepareStatement(query);
checkStmt.setString(1, question);
ResultSet rs = checkStmt.executeQuery();
while (rs.next()) {
isDescAvailable = true;
}
} catch (URISyntaxException e1) {
e1.printStackTrace();
} catch (SQLException sqle) {
sqle.printStackTrace();
} catch (Exception e) {
if (connection != null)
connection.close();
} finally {
if (connection != null)
connection.close();
}
return isDescAvailable;
}
I want to compute the edit distance between the input text and the values that exist in the database, and fetch all rows that are at least 60 percent similar. The above query doesn't work as expected. How do I get the rows with 60 percent similarity?
Use this:
SELECT *
FROM description
WHERE 100 * (length(desci) - levenshtein(desci, ?))
/ length(desci) > 60
The Levenshtein distance is the number of single-character edits (substitutions, deletions, or insertions) needed for one string to become the other. Put simply, it's roughly the number of letters that are different.
The number of letters that are the same is then length - levenshtein.
To express this as a fraction, divide by the length, ie (length - levenshtein) / length.
To express a fraction as a percentage, multiply by 100.
I perform the multiplication by 100 first to avoid integer division truncation problems.
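The same arithmetic can be checked outside the database. The sketch below is an assumption-laden illustration: it uses a plain dynamic-programming Levenshtein with unit costs (which is what the Postgres function defaults to), and the names Similarity, levenshtein, and similarityPercent are hypothetical.

```java
class Similarity {
    /** Levenshtein distance with unit costs for insertion, deletion, and substitution. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j characters of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i characters of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Mirrors 100 * (length - levenshtein) / length, multiplying first to limit truncation. */
    static int similarityPercent(String stored, String input) {
        if (stored.isEmpty()) {
            return input.isEmpty() ? 100 : 0; // avoid dividing by zero
        }
        return 100 * (stored.length() - levenshtein(stored, input)) / stored.length();
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting"));       // prints 3
        System.out.println(similarityPercent("kitten", "sitting")); // prints 50
    }
}
```

Note that the result can go negative when the input is much longer than the stored text, because the distance can then exceed the stored length; such rows simply fail the > 60 test.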
The most general version of the levenshtein function is:
levenshtein(text source, text target, int ins_cost, int del_cost, int sub_cost) returns int
Both source and target can be any non-null string, with a maximum of
255 characters. The cost parameters specify how much to charge for a
character insertion, deletion, or substitution, respectively. You can
omit the cost parameters, as in the second version of the function; in
that case they all default to 1.
So, with the default cost parameters, the result you get is the total number of characters you need to change (by insertion, deletion, or substitution) in the source to get the target.
If you need to calculate the percentage difference, you should divide the levenshtein function result by the length of your source text (or target length - according to your definition of the percentage difference).

Return the number of times query occurs as a substring of src

/** Return the number of times query occurs as a substring of src
* (different occurrences may overlap).
* Precondition: query is not the empty string "".
* Examples: For src = "ab", query = "b", return 1.
* For src = "Luke Skywalker", query = "ke", return 2.
* For src = "abababab", query = "aba", return 3.
* For src = "aaaa", query = "aa", return 3.*/
public static int numOccurrences(String src, String query) {
/* This should be done with one loop. If at all possible, don't have
* each iteration of the loop process one character of src. Instead,
* see whether some method of class String can be used to jump from one
* occurrence of query to the next. */
int count = 0;
for (int i = 0; i < src.length(); i++) {
int end = i + query.length() - 1;
if (end < src.length()) {
String sub = src.substring(i, end);
if (query.contentEquals(sub))
++count;
}
}
return count;
}
I tested the code. If the src is "cherry" and the query is "err", then the output is expected to be 1 but it turns out to be 0. What's wrong with the code? BTW, I cannot use methods outside the String class.
Check the existence of query in src and loop until it return false. With each occurrence, take the substring, update the count and repeat until query is not found in src.
Pseudo code:
int count = 0;
loop (flag is true)
int index = find start index of query in src;
if (query is found)
src = update the src with substring(index + 1); // + 1 rather than + query.length(), so overlapping occurrences are counted
count++;
else
flag = false;
return count;
What's wrong is that you're comparing err to:
i | sub
--|------
0 | ch
1 | he
2 | er
3 | rr
Notice that these strings you're comparing to look short, and you don't even get to the end of "cherry" before you stop checking for a match. So there are two things you need to fix in your code: the way you calculate end and the comparison between end and src.length().
Hint: the second argument (ending index) to substring is exclusive.
Pseudo code:
init 'count' and 'start' to 0
while true do
find first occurrence of 'query' in 'source', start search at 'start'
if found
set 'start' to found position + 1
set count to count + 1
else
break out of while loop
end while
return count
Tip: Use String#indexOf(String str, int fromIndex) when finding an occurrence of query in source
This does the job:
public static int numOccurrences(String src, String query) {
int count = 0;
for(int i = src.indexOf(query); i > -1;i = src.indexOf(query, i + 1))
count++;
return count;
}
Here, i is the index of query in src, but the increment term makes use of indexOf(String str, int fromIndex), which javadoc says:
Returns the index within this string of the first occurrence of the specified substring, starting at the specified index.
which is passed the index i plus 1 to start searching for another occurrence after the previous hit.
This also addresses the NFR hinted at in the comment:
Instead, see whether some method of class String can be used to jump from one occurrence of query to the next.

What is the best alternative to BatchStatement execute for retrieving values from database (MSSQL 2008)

I have a SQL query as shown below.
SELECT O_DEF,O_DATE,O_MOD from OBL_DEFINITVE WHERE OBL_DEFINITVE_ID =?
A collection of IDs is passed to this query, which is run as a batch query. It executes 10000 times to retrieve values from the database. (Someone else's mess.)
public static Map getOBLDefinitionsAsMap(Collection oblIDs)
throws java.sql.SQLException
{
Map retVal = new HashMap();
if (oblIDs != null && (!oblIDs.isEmpty()))
{
BatchStatementObject stmt = new BatchStatementObject();
stmt.setSql("SELECT O_DEF,O_DATE,O_MOD from OBL_DEFINITVE WHERE OBL_DEFINITVE_ID=?");
stmt.setParameters(
PWMUtils.convertCollectionToSubLists(oblIDs, 1));
stmt.setResultsAsArray(true);
QueryResults rows = stmt.executeBatchSelect();
int rowSize = rows.size();
for (int i = 0; i < rowSize; i++)
{
QueryResults.Row aRow = (QueryResults.Row) rows.getRow(i);
CoblDefinition ctd = new CoblDefinition(aRow);
retVal.put(aRow.getLong(0), ctd);
}
}
return retVal;
}
We have identified that the query could be changed to
SELECT O_DEF,O_DATE,O_MOD from OBL_DEFINITVE WHERE OBL_DEFINITVE_ID in (???)
so that we can reduce it to one query.
The problem is that MSSQL Server throws an exception saying:
Prepared or callable statement has more than 2000 parameter
We are stuck here. Can someone suggest a better alternative?
There is a maximum number of allowed parameters, let's call it n. You can do one of the following:
If you have m*n + k parameters, you can create m batches (or m+1 batches, if k is not 0). If you have 10000 parameters and 2000 is the maximum allowed parameters, you will only need 5 batches.
Another solution is to generate the query string in your application and add your parameters as strings. This way you run your query only once. This is an obvious speed optimization, but the query string is then generated in your application. You would build your where clause like this:
String myWhereClause = "where TaskID = " + taskIDs[0];
for (int i = 1; i < numberOfTaskIDs; i++)
{
myWhereClause += " or TaskID = " + taskIDs[i];
}
It looks like you are using your own wrapper around PreparedStatement and addBatch(). You are clearly reaching a limit on how many statements/parameters can be batched at once. You will need to call executeBatch periodically (e.g. every 100 or 1000 statements) instead of letting it build up until the limit is reached.
Edit: Based on the comment below I reread the problem. The solution: make sure you use fewer than 2000 parameters when building the query. If necessary, break it up into two or more queries.
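The batching idea from the answers above can be sketched like this: split the IDs into groups no larger than the limit, and build one IN (...) query with the right number of placeholders per group. This is a sketch under assumptions: the class InClauseBatcher and its methods are hypothetical, the limit is passed in by the caller, and executing each query through BatchStatementObject (or plain JDBC) is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class InClauseBatcher {
    /** Splits ids into consecutive groups of at most maxPerQuery elements. */
    static <T> List<List<T>> partition(List<T> ids, int maxPerQuery) {
        if (maxPerQuery <= 0) {
            throw new IllegalArgumentException("maxPerQuery must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += maxPerQuery) {
            int end = Math.min(ids.size(), start + maxPerQuery);
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    /** Builds the select with one "?" placeholder per id in the group. */
    static String selectFor(int parameterCount) {
        String placeholders = String.join(",", Collections.nCopies(parameterCount, "?"));
        return "SELECT O_DEF,O_DATE,O_MOD FROM OBL_DEFINITVE WHERE OBL_DEFINITVE_ID IN ("
                + placeholders + ")";
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long id = 1; id <= 10000; id++) {
            ids.add(id);
        }
        // 2000 per query follows the example in the answer; the real limit may differ.
        for (List<Long> batch : partition(ids, 2000)) {
            String sql = selectFor(batch.size());
            System.out.println(batch.size() + " parameters, query length " + sql.length());
        }
    }
}
```

Each group would then be bound to its own statement and the results merged into the map, so 10000 IDs need 5 round trips instead of 10000.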
