how to retrieve data with \n directly in java? - java

Suppose I enter katheline\njoseph in a column (datatype CLOB) of a DB table.
At the Java end, I want the output as:
katheline
joseph
i.e. \n should be recognized as a newline character at the Java end.
Is there any ONE single method in Java to retrieve a DB column with \n as a newline character?
I don't want to do any manipulation, e.g. using the StringTokenizer class or the replace method is not desirable. They work fine, but I am looking for a direct method.

\n is just a way to write char number 10 ("newline") in a string literal. If the DB actually holds that char, it will be in your string when you retrieve it. You can iterate over the chars to verify that it is there:
for (int i = 0; i < value.length(); i++)
{
    char c = value.charAt(i);
    int code = value.codePointAt(i);
    System.out.println(c + " - " + code);
}
It should print:
A - 65
B - 66
- 10
C - 67
for:
AB
C
Manual SQL
In SQL: try char(10) or something similar. It depends on the RDBMS.
Replacing in Java
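If you can't convert the values in the DB to real newlines, then you can do:
value = value.replaceAll("\\\\n", "\n");
The first argument is a regex: \\n, meaning a backslash followed by n.
The second is the replacement: a real newline character.
Because this is a Java string literal, each backslash must itself be escaped, which is why the regex needs so many of them. Note that writing "\\n" as the replacement would not work: in a replacement string a backslash escapes the next character, so the result would be a plain n.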
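As a minimal sketch of the replacement described above, the following uses String.replace, which treats both arguments as literal text and so avoids the regex escaping altogether. The class and method names (NewlineUnescape, unescape) are made up for illustration, and the hard-coded value only simulates what a DB column might return.

```java
class NewlineUnescape {
    // Turns the two-character sequence backslash + n into a real newline (char 10).
    static String unescape(String value) {
        // "\\n" is the Java literal for a backslash followed by n; "\n" is a newline.
        return value.replace("\\n", "\n");
    }

    public static void main(String[] args) {
        // Simulates a column value that was stored with a literal backslash-n.
        String fromDb = "katheline\\njoseph";
        System.out.println(unescape(fromDb));
    }
}
```

Values that already contain a real newline pass through unchanged, so the call is safe to apply to every retrieved string.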
Writing values with newline to DB
Never write values as literal SQL:
insert into .... values ('myvalue');
Use params:
insert into... values (?)
and
preparedStatement.setString(1, myValue);

Related

Java How to Remove Double-Quote Character Between Double Quote Text Qualifier

I have a csv file where each field (except column headings) has a double quote text qualifier: field: "some value". However some of the fields in the file have a double quote within the value; field2: "25" TV" or field3: "25" x 14" x 2"" or field4: "A"bcd"ef"g". (I think you get the point). In cases where I have data like in fields 2-4, my java file process fails due to me specifying that the double-quote is a text-qualifier on the fields and it looks as if there are too many fields for that row. How do I do either or all of the following:
remove the double-quote character from inside the field
replace the double-quote character with another value
have my java process "ignore" or "skip" double-quotes within a field.
What is my level of control over this file? The file comes in as-is, but I just need data from two different columns in the file. I can do whatever I need to do to it to get that data.
First, if it is indeed a CSV file, you should be using the commas to break each line into columns.
Once it's broken into columns, if we know for sure that each value should begin and end with a double quote ("), we can simply remove all of the double quotes and then re-apply the ones at the beginning and end.
String input = "\"hello\",\"goodbye Java \"the best\" language\", \"this is really \"\"\"bad\"";
String[] parsed = input.split(",");
String[] clean = new String[parsed.length];
int index = 0;
for (String value : parsed) {
    clean[index] = "\"" + value.replace("\"", "") + "\"";
    index++;
}
If a comma could exist inside a value, the following should be used instead:
String input = "\"hello\",\"goodbye,\" Java \"the best\" language\", \"this is really \"\"\"bad\"";
String[] parsed = input.split("\"\\s*,\\s*\"");
String[] clean = new String[parsed.length];
int index = 0;
for (String value : parsed) {
    clean[index] = "\"" + value.replace("\"", "") + "\"";
    index++;
}
Note that if the sequence "\s*,\s*" (a quote, a comma, and a quote, with optional whitespace around the comma) appeared inside a value, the record would be ambiguous. For example, in a two-column file, the input record
"abc","def","ghi" could be either
value 1 = "abc","def" value 2 = "ghi"
or
value 1 = "abc" value 2 = "def","ghi"
Note many CSV implementations will escape a double quote as two consecutive quotes.
So "25"" TV" might (should?) be your input.
Assuming that a comma is the column separator and that every column is surrounded by double quotes:
String[] columns = input.split("\",\"");
if (columns.length > 0) {
    columns[0] = columns[0].substring(1);
    String lastColumn = columns[columns.length - 1];
    columns[columns.length - 1] = lastColumn.substring(0, lastColumn.length() - 1);
}
The columns will still have the internal double quotes. You can strip them with replace if you don't want them.
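The splitting rule used in the answers above (a quote only closes a field when a comma or the end of the line follows it) can also be written as an explicit scan. This is only a sketch under stated assumptions: every field is wrapped in double quotes, a comma is the separator, and the class and method names (LenientCsvSplitter, split) are hypothetical. It inherits the same ambiguity when a value itself contains a quote followed by a comma.

```java
import java.util.ArrayList;
import java.util.List;

class LenientCsvSplitter {
    // Splits one line into fields and drops every double quote inside a field.
    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        int n = line.length();
        int i = 0;
        while (i < n) {
            // Skip whitespace in front of the opening qualifier.
            while (i < n && Character.isWhitespace(line.charAt(i))) i++;
            if (i >= n) break;
            if (line.charAt(i) != '"') {
                throw new IllegalArgumentException("Expected opening quote at index " + i);
            }
            int start = i + 1;
            int close = -1;
            int next = n;
            for (int j = start; j < n; j++) {
                if (line.charAt(j) != '"') continue;
                // A quote closes the field only if a comma or end of line follows it.
                int k = j + 1;
                while (k < n && Character.isWhitespace(line.charAt(k))) k++;
                if (k == n || line.charAt(k) == ',') {
                    close = j;
                    next = (k == n) ? n : k + 1;
                    break;
                }
            }
            if (close < 0) {
                throw new IllegalArgumentException("Unterminated field starting at index " + start);
            }
            fields.add(line.substring(start, close).replace("\"", ""));
            i = next;
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(split("\"hello\",\"25\" TV\",\"A\"bcd\"ef\"g\""));
    }
}
```

To replace the inner quotes with another value instead of removing them, change the second argument of the replace call.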

How to divide a file into parts and assign it (multiword parts)?

Ex#1. 1 AM Roset Malin 18 19
Ex#2. 2 PM Margie 20 21
If you did something like this for Ex#1...
int a = scanner.nextInt();     // 1
String b = scanner.next();     // AM
String name = scanner.next();  // Roset (Malin will not be included here)
int c = scanner.nextInt();     // meant to read 18, but the next token is actually Malin
What if you wanted to have "Roset Malin" in a single variable AND still work correctly for Ex#2, where there is no two part name?
I can't seem to find a way to do this. And before people ask, I'm not familiar with tokenizers, BufferedReaders and such. I have only used Scanners.
As you have already noticed, names can be a real issue when retrieving Space delimited data since a name can contain one or more words, for example:
Dr. Steven Jon Parker III
Even perhaps no name at all which may or may not need to be considered.
I think an easy solution would be to use the Scanner.nextLine() method instead of the Scanner.next() method and then split the incoming file line with the String.split() method. In my opinion this then gives you easier control of the read data in this case.
The code example below utilizes the aforementioned Scanner.nextLine() and String.split() methods. Of course everything is enclosed within a while loop code block so as to iterate through the entire file. The condition for the while loop utilizes the Scanner.hasNextLine() method which returns boolean true if there is another line in the input of the scanner object.
You will also notice that there are Regular Expressions (RegEx) used within the code. These are simple expressions and I'll explain what they do now:
Expression used in the String.split() method - "\\s+":
The String.split() method splits its string into a one-dimensional String array based on the delimiter supplied to the method, which in this case is a whitespace delimiter. The "\\s" expression means a single whitespace character, whereas the "\\s+" expression actually used means one or more whitespace characters. Why one or more? Basically to handle typos where an unintentional extra whitespace was entered and not removed before the data was saved to file.
Expression used in the String.matches() method - "\\d+":
In the code below we also use the String.matches() method to verify that the string we are about to convert with Integer.parseInt() is indeed a string representation of a numerical value. If we pass a value that contains a space or an alphabetic (non-numerical) character, Integer.parseInt() will throw a NumberFormatException, possibly halting code execution.
The "\\d" expression means: does the string match a single numerical digit? The "\\d+" expression actually used means: does the string consist of one or more numerical digits? We already know why we want one or more digits.
As mentioned, the String.matches() method is only used to validate that what we are about to pass to Integer.parseInt() is indeed a numerical string. If it fails this validation, 0 is placed into its respective int variable. We use a ternary operator to do this. A ternary operator (also sometimes called a conditional operator) is basically a shortened version of an if/else.
Here is the code. Yes...it looks like a lot but it's mostly comments which you can safely remove if you like:
// A try/catch is required in case the supplied
// file could not be found.
try {
    // Try With Resources to auto close the Scanner object
    try (Scanner scanner = new Scanner(new File("Data.txt"))) {
        // Iterate through the file line by line
        while (scanner.hasNextLine()) {
            String strg = scanner.nextLine();
            // Skip past blank lines (if any) and Comment Lines
            // that start with a semicolon (;) (if any). We don't
            // want to process these.
            if (strg.trim().equals("") || strg.trim().startsWith(";")) {
                continue;
            }
            // Split the read in line into a String Array
            String[] parts = strg.split("\\s+");
            // Get the number from the current file data line. Ternary
            // is used here. If the data is found not to be numerical
            // then 0 is used.
            int number = parts[0].matches("\\d+") ? Integer.parseInt(parts[0]) : 0;
            // Get the AM or PM from the current file data line
            String amPM = parts[1];
            // Get the name from the current file data line
            String name = ""; // Declare & initialize the name variable
            // Start from parts Array index 2 because we already used 0 & 1
            int i = 2;
            // Get the Name regardless of its length...
            for (; i < parts.length; i++) {
                // If we hit an element that is a numerical value
                // then we know we hit the end of our name, so we get
                // out of this loop with the break statement.
                if (parts[i].matches("\\d+")) { break; }
                // Ternary used here to see if the name variable contains
                // anything. If it doesn't then just a word is appended to
                // the string held by the variable otherwise a whitespace
                // and the word is appended.
                name += name.equals("") ? parts[i] : " " + parts[i];
            }
            // If it was found that there is no name supplied
            // in the current file data line then we will make
            // the name hold the string of: "Unknown".
            if (name.trim().equals("")) {
                name = "Unknown";
            }
            // Get first value after the name. Ternary is used here.
            // If the data is found not to be numerical then 0 is used.
            int valOne = parts[i].matches("\\d+") ? Integer.parseInt(parts[i]) : 0;
            // Get second value after the name. Ternary is used here.
            // If the data is found not to be numerical then 0 is used.
            int valTwo = parts[i + 1].matches("\\d+") ? Integer.parseInt(parts[i + 1]) : 0;
            // Display variable contents to the Console Window.
            System.out.println("Number:\t" + number);
            System.out.println("AMPM:\t" + amPM);
            System.out.println("Name:\t" + name);
            System.out.println("Value1:\t" + valOne);
            System.out.println("Value2:\t" + valTwo);
            System.out.println("=====================");
        }
    }
}
catch (FileNotFoundException ex) {
    Logger.getLogger("Get Data Test").log(Level.SEVERE, null, ex);
}
This code has been tested against a data text file named Data.txt. This is what it contains:
1 AM Roset Malin 18 19
2 PM Margie 20 21
3 PM Dr. Steven Jon Parker III 18 19
4 AM 20 21
5 AM Jack B Black 20 21
Console Window Display:
Number: 1
AMPM: AM
Name: Roset Malin
Value1: 18
Value2: 19
=====================
Number: 2
AMPM: PM
Name: Margie
Value1: 20
Value2: 21
=====================
Number: 3
AMPM: PM
Name: Dr. Steven Jon Parker III
Value1: 18
Value2: 19
=====================
Number: 4
AMPM: AM
Name: Unknown
Value1: 20
Value2: 21
=====================
Number: 5
AMPM: AM
Name: Jack B Black
Value1: 20
Value2: 21
=====================
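For experimenting without a data file, the per-line logic can be pulled into a small method. The sketch below is a variation, not the code above: instead of stopping at the first numeric token, it assumes the last two tokens of every line are the two values and treats everything between the AM/PM token and them as the name. The class name ScheduleLine and its field names are made up for illustration.

```java
import java.util.Arrays;

class ScheduleLine {
    final int number;
    final String amPm;
    final String name;
    final int valOne;
    final int valTwo;

    ScheduleLine(int number, String amPm, String name, int valOne, int valTwo) {
        this.number = number;
        this.amPm = amPm;
        this.name = name;
        this.valOne = valOne;
        this.valTwo = valTwo;
    }

    // Same fallback as the answer: non-numeric text becomes 0.
    static int toInt(String s) {
        return s.matches("\\d+") ? Integer.parseInt(s) : 0;
    }

    static ScheduleLine parse(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length < 4) {
            throw new IllegalArgumentException("Too few fields: " + line);
        }
        int last = parts.length - 1;
        // Tokens from index 2 up to (but excluding) the two trailing values form the name.
        String name = String.join(" ", Arrays.copyOfRange(parts, 2, last - 1));
        if (name.isEmpty()) {
            name = "Unknown";
        }
        return new ScheduleLine(toInt(parts[0]), parts[1], name,
                toInt(parts[last - 1]), toInt(parts[last]));
    }

    public static void main(String[] args) {
        ScheduleLine s = parse("3 PM Dr. Steven Jon Parker III 18 19");
        System.out.println(s.name + " -> " + s.valOne + ", " + s.valTwo);
    }
}
```

One consequence of this choice is that a name containing a number is kept intact, at the cost of requiring both trailing values to be present on every line.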

How to process Text Qualifier delimited file in scala

I have a lot of delimited files with a text qualifier (every column starts and ends with a double quote). The delimiter is not consistent, i.e. it can be a comma (,), pipe (|), ~, or tab (\t).
I need to read each file as text (a single column) and then check the number of delimiters while taking the text qualifier into account. If a record has fewer or more columns than defined, that record should be rejected and loaded to a different path.
Below is test data with 3 columns: ID, Name and DESC. The DESC column has an extra delimiter.
"ID","Name","DESC"
"1" , "ABC", "A,B C"
"2" , "XYZ" , "ABC is bother"
"3" , "YYZ" , ""
4 , "XAA" , "sf,sd
sdfsf"
The last record is split into two records because of the newline char in the DESC field.
Below is the code I tried, but it does not handle this correctly.
val SourceFileDF = spark.read.text(InputFilePath)
  .filter("value != ''") // Removing empty records while reading
val aCnt = coalesce(length(regexp_replace($"value","[^,]", "")), lit(0)) // to count no. of delimiters
val Delimitercount = SourceFileDF.withColumn("a_cnt", aCnt)
var invalidrecords = Delimitercount
  .filter(col("a_cnt")
  .!==(NoOfDelimiters)).toDF()
val GoodRecordsDF = Delimitercount
  .filter(col("a_cnt")
  .equalTo(NoOfDelimiters)).drop("a_cnt")
With the above code I am able to reject all the records that have fewer or more delimiters than expected, but I am not able to ignore a delimiter that is within the text qualifier.
Thanks in advance.
You may use a closure with replaceAllIn to remove any chars you want inside a match:
var y = """4 , "XAA" , "sf,sd\nsdfsf""""
val pattern = """"[^"]*(?:""[^"]*)*"""".r
y = pattern replaceAllIn (y, m => m.group(0).replaceAll("[,\n]", ""))
print(y) // => 4 , "XAA" , "sfsdnsdfsf"
See the Scala demo.
Details
" - matches a "
[^"]* - any 0+ chars other than "
(?:""[^"]*)* - matches 0 or more sequences of "" and then 0+ chars other than "
" - a ".
The code finds all non-overlapping matches of the above pattern in y and upon finding a match (m) the , and newlines (LF) are removed from the match value (with m.group(0).replaceAll("[,\n]", ""), where m.group(0) is the match value and [,\n] matches either , or a newline).
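For readers working in Java rather than Scala, roughly the same idea can be sketched with Matcher.appendReplacement. This is an illustration only: the class and method names are hypothetical, the comma is assumed to be the delimiter, and the input is assumed to contain a real line feed rather than the two characters backslash and n.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QualifiedFieldCleaner {
    // A qualified field: opening quote, non-quote chars, optional doubled quotes, closing quote.
    private static final Pattern QUOTED = Pattern.compile("\"[^\"]*(?:\"\"[^\"]*)*\"");

    // Removes commas and line feeds that occur inside double-quoted fields.
    static String stripInsideQuotes(String record) {
        Matcher m = QUOTED.matcher(record);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String cleaned = m.group().replaceAll("[,\\n]", "");
            // quoteReplacement keeps any backslash or $ in the field from being reinterpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(cleaned));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Counts the delimiters that remain once the qualified fields are cleaned.
    static int countDelimiters(String record) {
        int count = 0;
        for (char c : stripInsideQuotes(record).toCharArray()) {
            if (c == ',') count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String record = "4 , \"XAA\" , \"sf,sd\nsdfsf\"";
        System.out.println(stripInsideQuotes(record));
        System.out.println(countDelimiters(record));
    }
}
```

The delimiter count computed this way can then be compared with the expected number of delimiters to accept or reject a record.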

JAVA: Space delimiting all non-numerical characters in a String

I am having some trouble with modifying Strings to be space delimited under the special case of adding spaces to all non-numerical characters.
My code must take a string representing a math equation and split it up into its individual parts. It does so using space delimiters between values. This part works great if the string is already delimited.
The problem is that I do not always get a space delimited input. To deal with this, I want to first insert these spaces so that the array is created properly.
What my code must do is take any character that is NOT a number, and add a space before and after it.
Something like this:
3*24+321 becomes 3 * 24 + 321
or
((3.0)*(2.5)) becomes ( ( 3.0 ) * ( 2.5 ) )
Obviously I need to avoid inserting spaces inside the numbers, or 2.5 becomes 2 . 5 and gets entered into the array as 3 elements, which it is not.
So far, I have tried using
String InputLineDelmit = InputLine.replaceAll("\\B", " ");
which successfully changes a string of all letters "abcd" to "a b c d"
But it makes mistakes when it runs into numbers. Using this method, I have gotten that:
(((1)*(2))) becomes ( ( (1) * (2) ) ) ---- * The numbers must be separate from parens
12.7+3.1 becomes 1 2.7+3.1 ----- * 12.7 is split
51/3 becomes 5 1/3 ----- * same issue
and 5*4-2 does not change at all.
So, I know that \D can be used as a regular expression for all non-numbers in java. However, my attempts to implement this (by replacing, or combining it with \B above) have led either to compiler errors or it REPLACING the char with a space, not adding one.
EDIT:
==== Answered! ====
It won't let me add my own answer because I'm new, but an edit to neo108's code below (which, itself, does not work) did the job. What I did was change it to check isDigit, not isLetter, and then do nothing in that case (or in the special case of a decimal point, for doubles). Otherwise, the character gets a space on either side.
public static void main(String[] args){
    String formula = "12+((13.0)*(2.5)-17*2)+(100/3)-7";
    StringBuilder builder = new StringBuilder();
    for (int i = 0; i < formula.length(); i++){
        char c = formula.charAt(i);
        char cdot = '.';
        if(Character.isDigit(c) || c == cdot) {
            builder.append(c);
        }
        else {
            builder.append(" "+c+" ");
        }
    }
    System.out.println("OUTPUT:" + builder);
}
OUTPUT: 12 + ( ( 13.0 ) * ( 2.5 ) - 17 * 2 ) + ( 100 / 3 ) - 7
However, any ideas on how to do this more succinctly, and also a decent explanation of StringBuilders, would be appreciated. Namely what is with this limit of 16 chars that I read about on javadocs, as the example above shows that you CAN have more output.
Something like this should work...
String formula = "Ab((3.0)*(2.5))";
StringBuilder builder = new StringBuilder();
for (int i = 0; i < formula.length(); i++){
    char c = formula.charAt(i);
    if(Character.isLetter(c)) {
        builder.append(" "+c+" ");
    } else {
        builder.append(c);
    }
}
Define the operations in your math equation: + - * / ( ) etc.
Convert your equation string to a char[].
Traverse the char[] one char at a time and append each char to a StringBuilder object.
If you encounter a character matching one of the defined operations, add a space before and after that character and then append this to the StringBuilder object.
Well, this is one algorithm you can implement. There might be other ways of doing it as well.
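On the request for something more succinct, one possible alternative is a regex with a capture group, which pads every character that is neither a digit nor a decimal point. This is a sketch, not the approach from the answers above; the class and method names are invented, and a leading minus sign is treated as an operator (so -7 becomes - 7).

```java
import java.util.Arrays;

class FormulaSpacer {
    // Surrounds every char that is not a digit or '.' with spaces, then tidies up.
    static String space(String formula) {
        return formula
                .replaceAll("([^\\d.])", " $1 ") // $1 re-inserts the matched character
                .trim()
                .replaceAll("\\s+", " ");        // collapse runs of whitespace
    }

    // Splits the spaced formula into its individual parts.
    static String[] tokens(String formula) {
        return space(formula).split(" ");
    }

    public static void main(String[] args) {
        System.out.println(space("((3.0)*(2.5))"));
        System.out.println(Arrays.toString(tokens("3*24+321")));
    }
}
```

Because existing spaces are padded and then collapsed as well, input that is already space delimited comes out unchanged.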

SQL Query passing values from Java

I would like to know: when I build a SQL query in Java, does it retain the new lines?
For instance, if I have
"IF EXISTS (SELECT * FROM mytable WHERE EMPLOYEEID='"+EMPID+"')"+
"UPDATE myTable SET ....)"
So after the "+" sign in the first line the UPDATE follows. Does it maintain the new line when it is passed to the database?
Thank you
No. For the query to work successfully you will have to add a space before UPDATE or after ).
Firstly, there is no newline in the example source code to "maintain" ...
Secondly, your problem is with Java rather than SQL. You will only get a newline into a Java String if you put it there explicitly; e.g.
// No newline in this string
String s = "a" +
        "b";
// Line break in these strings
String s1 = "a" + "\n" + "b";
String s2 = "a\nb";
String s3 = "a" + System.getProperty("line.separator") + "b";
Finally, in your example, a space or TAB will do just as well as a line break.
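To see why the missing separator matters, here is a small sketch with two placeholder fragments (they are not the asker's query, and the class name ConcatDemo is made up). Splitting a literal across source lines adds nothing between the pieces, so the keywords run together unless a space is written explicitly.

```java
class ConcatDemo {
    // Two literals on separate source lines; nothing is inserted between them.
    static String withoutSeparator() {
        return "WHERE a = 1" +
               "AND b = 2";
    }

    // Same fragments, with an explicit leading space on the second one.
    static String withSpace() {
        return "WHERE a = 1" +
               " AND b = 2";
    }

    public static void main(String[] args) {
        System.out.println(withoutSeparator());
        System.out.println(withSpace());
    }
}
```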
