Split Java Spring - java

I have a DB table with two columns, key and value. Example record:
------------------------------------
| key | value |
------------------------------------
| A | 1,desc 1;2,desc 2;3,desc 3 |
------------------------------------
I want to split the value column and turn it into JSON format:
[{"key":"1","value":"desc 1"},{"key":"2","value":"desc 2"},{"key":"3", "value":"desc 3"}]
Where should I put the split function? In the service? Splitting twice seems too difficult. How can I solve this problem?
Thanks,
Bobby

That depends on how your application usually works with this value. If the usual case is to use specific data from this column, I would already parse it at the repository level:
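// Assumption: JSONArray and JSONObject come from the org.json library
import java.util.stream.Stream;
import org.json.JSONArray;
import org.json.JSONObject;

public static void main(String[] args) {
    // You actually get this from the DB
    String value = "1,desc 1;2,desc 2;3,desc 3";
    JSONArray j = new JSONArray();
    // First split: one "key,description" pair per ';'
    Stream.of(value.split(";")).forEach(pair -> {
        // Second split: a limit of 2 keeps any further commas inside the description
        String[] keyValue = pair.split(",", 2);
        JSONObject o = new JSONObject();
        o.put("key", keyValue[0]);
        o.put("value", keyValue[1]);
        j.put(o);
    });
    System.out.println(j);
}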
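If you would rather keep the parsing in a service method and let Spring (Jackson) serialize the result, a minimal sketch without any JSON library could look like this. The class and method names (PairParser, parse) are made up for illustration, and it assumes that keys never contain a comma and that pairs never contain a semicolon:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the original answer.
class PairParser {

    // Turns "1,desc 1;2,desc 2" into a list of {key=..., value=...} maps.
    static List<Map<String, String>> parse(String raw) {
        List<Map<String, String>> result = new ArrayList<>();
        if (raw == null || raw.isBlank()) {
            return result;
        }
        for (String pair : raw.split(";")) {
            // Limit 2: only the first comma separates key from value.
            String[] kv = pair.split(",", 2);
            if (kv.length < 2) {
                continue; // skip malformed pairs instead of failing
            }
            Map<String, String> entry = new LinkedHashMap<>();
            entry.put("key", kv[0].trim());
            entry.put("value", kv[1].trim());
            result.add(entry);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("1,desc 1;2,desc 2;3,desc 3"));
    }
}
```

Returning a List of Maps from a controller method is normally enough for Spring to render a JSON array of objects, so the two splits stay in one small, testable place.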

Related

reading data from a file and printing the only specific instances of the data

I am trying to work through this exercise but I am having a hard time getting it right. Input is scanned from a file, and the information is then formatted as it is output.
The csv file currently has the following information:
16:40,Wonders of the World,G
20:00,Wonders of the World,G
19:00,End of the Universe,NC-17
12:45,Buffalo Bill And The Indians or Sitting Bull's History Lesson,PG
15:00,Buffalo Bill And The Indians or Sitting Bull's History Lesson,PG
19:30,Buffalo Bill And The Indians or Sitting Bull's History Lesson,PG
10:00,Adventure of Lewis and Clark,PG-13
14:30,Adventure of Lewis and Clark,PG-13
19:00,Halloween,R
But my output is coming out like this:
Wonders of the World | G | 16:40
Wonders of the World | G | 20:00
End of the Universe | NC-17 | 19:00
Buffalo Bill And The Indians or Sitting Bull | PG | 12:45
Buffalo Bill And The Indians or Sitting Bull | PG | 15:00
Buffalo Bill And The Indians or Sitting Bull | PG | 19:30
Adventure of Lewis and Clark | PG-13 | 10:00
Adventure of Lewis and Clark | PG-13 | 14:30
Halloween | R | 19:00
I need to output each movie only once, with all of its showtimes on the same line, so it looks like this:
Wonders of the World | G | 16:40 20:00
End of the Universe | NC-17 | 19:00
Buffalo Bill And The Indians or Sitting Bull | PG | 12:45 15:00 19:30
Adventure of Lewis and Clark | PG-13 | 10:00 14:30
Halloween | R | 19:00
My code so far:
public class LabProgram4 {
public static void main(String[] args) throws IOException {
String filename = "movies.csv";
int recordCount = 0;
Scanner fileScanner = new Scanner(new File(filename));
while (fileScanner.hasNextLine()) {
fileScanner.nextLine();
++recordCount;
}
String[] showtimes = new String[recordCount];
String[] title = new String[recordCount];
String[] rating = new String[recordCount];
fileScanner.close();
fileScanner = new Scanner(new File(filename));
for (int i = 0; i < recordCount; ++i) {
String[] data = fileScanner.nextLine().strip().split(",");
showtimes[i] = data[0].strip();
title[i] = data[1].strip();
rating[i] = data[2].strip();
}
fileScanner.close();
for (int i = 0; i < recordCount; ++i) {
if (title[i].length() > 44)
title[i] = title[i].substring(0, 44);
System.out.printf("%-44s | %5s | %s\n", title[i], rating[i], showtimes[i]);
}
}
}
public static final class Movie {
private String title;
private String showTime;
private String rating;
}
public static void main(String... args) throws FileNotFoundException {
List<Movie> movies = readMovies(new File("d:/movies.csv"));
Map<String, List<Movie>> map = movies.stream().collect(Collectors.groupingBy(movie -> movie.title, LinkedHashMap::new, Collectors.toList())); // LinkedHashMap keeps the titles in file order
print(map);
}
private static void print(Map<String, List<Movie>> map) {
int titleWidth = getTitleWidth(map);
int ratingWidth = getRatingWidth(map);
map.forEach((title, movies) -> {
String rating = movies.stream().map(movie -> movie.rating).distinct().collect(Collectors.joining(" "));
String showTime = movies.stream().map(movie -> movie.showTime).distinct().sorted().collect(Collectors.joining(" "));
System.out.format("%-" + titleWidth + "s | %-" + ratingWidth + "s | %s\n", title, rating, showTime);
});
}
private static int getTitleWidth(Map<String, List<Movie>> map) {
return map.keySet().stream()
.mapToInt(String::length)
.max().orElse(0);
}
private static int getRatingWidth(Map<String, List<Movie>> map) {
return map.values().stream()
.mapToInt(movies -> movies.stream()
.map(movie -> movie.rating)
.distinct()
.mapToInt(String::length)
.sum())
.max().orElse(0);
}
private static final int SHOW_TIME = 0;
private static final int TITLE = 1;
private static final int RATING = 2;
private static List<Movie> readMovies(File file) throws FileNotFoundException {
try (Scanner scan = new Scanner(file)) {
List<Movie> movies = new ArrayList<>();
while (scan.hasNextLine()) {
String[] data = scan.nextLine().split(",");
Movie movie = new Movie();
movie.title = data[TITLE].trim();
movie.showTime = data[SHOW_TIME].trim();
movie.rating = data[RATING].trim();
movies.add(movie);
}
return movies;
}
}
In my opinion, reading a file once solely to count the records is the wrong way to go. Read the file once and carry out the task as it is being read.
There are many ways to read a file and store or display records without duplicate titles. Parallel arrays are one option, but arrays must be initialized to a fixed length and cannot grow dynamically. That is not impossible to work around, but it is awkward in this situation and requires far more code than a Collection such as an ArrayList (accessed through the List interface), which grows as needed.
The code below uses the java.util.List interface to store and later display the movies read from the movies.csv file. It looks long, but most of it is comments explaining each step. I suggest you read those comments and then delete them if you like, since they are excessive:
// The Movies data file name.
String filename = "movies.csv";
// Counter to keep track of the number of movies stored.
int moviesCount = 0;
// List Interface object to store movie titles in.
java.util.List<String> movies = new java.util.ArrayList<>();
// 'Try With Resources' used here to auto-close the reader
try (Scanner reader = new Scanner(new File(filename))) {
// Read the data file line by line.
String dataLine;
while (reader.hasNextLine()) {
dataLine = reader.nextLine().trim();
// Skip blank lines...
if (dataLine.isEmpty()) {
continue;
}
/* The regex ("\\s*,\\s*") passed to the String#split() method
below handles any number of whitespaces before or after the
comma delimiter on any read in data file line. */
String[] data = dataLine.split("\\s*,\\s*");
/* Do we already have title in the 'movies' List?
If so just add the show-time to the title and
continue on to the next file data line. */
boolean alreadyHave = false; // Flag that we don't already have this title
for (int i = 0; i < movies.size(); i++) {
// Do we already have the movie title in the list?
if (movies.get(i).startsWith(data[1] + " | ")) { // match the whole title, not just a substring
// Yes we do so flag it that we already do have this title.
alreadyHave = true;
// Add the additional show-time to that title's stored information
movies.set(i, movies.get(i) + " " + data[0]);
/* Break out of this 'for' loop since there is no
more need to check other titles in the List. */
break;
}
}
/* If however we don't already have this movie title
in the List then add it in the desired display
format using the Pipe (|) character as a delimiter. */
if (!alreadyHave) {
moviesCount++; // Increment the number of movies
movies.add(String.format("%s | %s | %s", data[1], data[2], data[0]));
}
}
}
// DISPLAY THE MOVIES LIST IN TABLE STYLE FASHION
// Display Title in Console Window:
String msg = "There are " + moviesCount + " movies with various show-times:";
System.out.println(msg); // Print title
// Display Table Header:
String header = String.format("%-44s | %6s | %s", "Movie Title", "Rating", "Show Times");
String overUnderline = String.join("", java.util.Collections.nCopies(header.length(), "="));
// Header Overline
System.out.println(overUnderline);
System.out.println(header);
// Header Underline
System.out.println(overUnderline);
// Display the movies in console window.
for (String movie : movies) {
/* Split the current List element into its respective parts
using the String#split() method so that the List contents
can be displayed in a table format. The regex passed to
the 'split()' method ("\\s*\\|\\s*") will take care of any
whitespaces before or after any Pipe (|) delimiter so that
String#trim() or String#strip() is not required. Note that
the Pipe (|) delimiter needs to be escaped (\\|) within the
expression so as to acquire its literal meaning since it is
a regex meta character. */
String[] movieParts = movie.split("\\s*\\|\\s*");
/* 'Ternary Operators' are used in the String#format() data
section components so to truncate movie Title Names to the
desired table cell width of 44 and to convert Rating and
Show-Times to "N/A" should they EVER be empty (contain no
data). */
System.out.println(String.format("%-44s | %6s | %s",
(movieParts[0].length() > 44 ? movieParts[0].substring(0, 44) : movieParts[0]),
(movieParts[1].isEmpty() ? "N/A" : movieParts[1]),
(movieParts[2].isEmpty() ? "N/A" : movieParts[2])));
}
System.out.println(overUnderline);
If the data file actually contains what you've indicated in your post then the code above will display the following into the Console Window:
There are 5 movies with various show-times:
==================================================================
Movie Title | Rating | Show Times
==================================================================
Wonders of the World | G | 16:40 20:00
End of the Universe | NC-17 | 19:00
Buffalo Bill And The Indians or Sitting Bull | PG | 12:45 15:00 19:30
Adventure of Lewis and Clark | PG-13 | 10:00 14:30
Halloween | R | 19:00
==================================================================
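The same single-pass idea can also be sketched with a LinkedHashMap keyed by title, which avoids re-splitting the stored strings and keeps the titles in file order. This is only an illustration: the class name ShowtimeGrouper and its group method are made up, the lines are passed in as a List (for example from Files.readAllLines), and it assumes that titles never contain a comma:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the original answer.
class ShowtimeGrouper {

    // Each input line is "showtime,title,rating"; returns one formatted row per title.
    static List<String> group(List<String> lines) {
        Map<String, String> ratings = new LinkedHashMap<>();
        Map<String, List<String>> times = new LinkedHashMap<>();
        for (String line : lines) {
            String[] data = line.trim().split("\\s*,\\s*");
            if (data.length < 3) {
                continue; // skip blank or malformed lines
            }
            ratings.putIfAbsent(data[1], data[2]);
            times.computeIfAbsent(data[1], k -> new ArrayList<>()).add(data[0]);
        }
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : times.entrySet()) {
            // "%-44.44s" pads the title to 44 characters and truncates longer ones.
            rows.add(String.format("%-44.44s | %5s | %s",
                    e.getKey(), ratings.get(e.getKey()), String.join(" ", e.getValue())));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
                "16:40,Wonders of the World,G",
                "20:00,Wonders of the World,G",
                "19:00,Halloween,R");
        group(sample).forEach(System.out::println);
    }
}
```

Because the title is the map key, the lookup is exact, so one title that happens to be contained in another is not merged by accident.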

Compare and Highlight the differences of two dataframes using spark and java

I am using Spark and Java to try to compare two data frames.
Once I convert my CSV files into data frames, I want to highlight exactly what changed between the two.
They have the same columns.
As you can see, the only difference between the data frames below is the emp_name of emp_id 4 in df2.
Dataset<Row> df1 = spark.read().csv("/Users/dataframeOne.csv");
Dataset<Row> df2 = spark.read().csv("/Users/dataframeTwo.csv");
df1.unionAll(df2).except(df1.intersect(df2)).show(true);
Df1
+------+---------+--------+----------+-------+--------+
|emp_id| emp_city|emp_name| emp_phone|emp_sal|emp_site|
+------+---------+--------+----------+-------+--------+
| 3| Chennai| rahman|9848022330| 45000|SanRamon|
| 1|Hyderabad| ram|9848022338| 50000| SF|
| 2|Hyderabad| robin|9848022339| 40000| LA|
| 4| sanjose| romin|9848022331| 45123|SanRamon|
+------+---------+--------+----------+-------+--------+
Df2
+------+---------+--------+----------+-------+--------+
|emp_id| emp_city|emp_name| emp_phone|emp_sal|emp_site|
+------+---------+--------+----------+-------+--------+
| 3| Chennai| rahman|9848022330| 45000|SanRamon|
| 1|Hyderabad| ram|9848022338| 50000| SF|
| 2|Hyderabad| robin|9848022339| 40000| LA|
| 4| sanjose| romino|9848022331| 45123|SanRamon|
+------+---------+--------+----------+-------+--------+
Difference
+------+--------+--------+----------+-------+--------+
|emp_id|emp_city|emp_name| emp_phone|emp_sal|emp_site|
+------+--------+--------+----------+-------+--------+
| 4| sanjose| romino|9848022331| 45123|SanRamon|
+------+--------+--------+----------+-------+--------+
How can I highlight the incorrect field, 'romino', in yellow using Java and Spark?
Highlighting something in Spark depends on your GUI, so as a first step I would suggest detecting the differing values and adding the information about the differences as an additional column to the dataframe.
Step 1: Add a suffix to all columns of the two dataframes and join them over the primary key (emp_id):
import static org.apache.spark.sql.functions.*;
private static Dataset<Row> prefix(Dataset<Row> df, String prefix) {
for(String col: df.columns()) df = df.withColumnRenamed(col, col + prefix);
return df;
}
[...]
Dataset<Row> df1 = spark.read().option("header", "true").csv(...);
Dataset<Row> df2 = spark.read().option("header", "true").csv(...);
String[] columns = df1.columns();
Dataset<Row> joined = prefix(df1, "_1").join(prefix(df2, "_2"),
col("emp_id_1").eqNullSafe(col("emp_id_2")), "full_outer");
Step 2: create a list of column objects that check if the value from one table is different from the other table. This list will later be used as input parameter for map.
List<Column> diffs = new ArrayList<>();
for( String column: columns) {
diffs.add(lit(column));
diffs.add(when(col(column + "_1").eqNullSafe(col(column + "_2")), null)
.otherwise(concat_ws("/", col(column + "_1"), col(column + "_2"))));
}
Step 3: create a new column containing a map with all differences:
joined.withColumn("differences", map(diffs.toArray(new Column[]{})))
.withColumn("differences", map_filter(col("differences"), (k, v) -> not(v.isNull())))
.select("emp_id_1", "differences")
.filter(size(col("differences")).gt(0))
.show(false);
Output:
+--------+--------------------------+
|emp_id_1|differences |
+--------+--------------------------+
|4 |{emp_name -> romin/romino}|
+--------+--------------------------+
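To see the per-column comparison without a Spark session, here is a plain-Java sketch of the same idea applied to two rows held as maps. It does not use any Spark API; RowDiff and diff are made-up names, and unlike concat_ws it prints the text "null" for a missing value:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical illustration of the comparison logic, independent of Spark.
class RowDiff {

    // Returns column -> "left/right" for every column whose values differ.
    static Map<String, String> diff(Map<String, String> left, Map<String, String> right) {
        Map<String, String> differences = new LinkedHashMap<>();
        for (String column : left.keySet()) {
            String a = left.get(column);
            String b = right.get(column);
            // Objects.equals is null-safe, similar in spirit to eqNullSafe.
            if (!Objects.equals(a, b)) {
                differences.put(column, a + "/" + b);
            }
        }
        return differences;
    }

    public static void main(String[] args) {
        Map<String, String> row1 = new LinkedHashMap<>();
        row1.put("emp_id", "4");
        row1.put("emp_name", "romin");
        Map<String, String> row2 = new LinkedHashMap<>();
        row2.put("emp_id", "4");
        row2.put("emp_name", "romino");
        System.out.println(diff(row1, row2));
    }
}
```

A GUI layer can then use the keys of the returned map to decide which cells to color.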

Parsing key value pairs as Hive Dataset rows using java spark

I have a hdfs file with the following data
key1=value1 key2=value2 key3=value3...
key1=value11 key2=value12 key3=value13..
We use an internal framework that passes a Dataset as input to a Java method. The Dataset should be transformed as follows and written to a Hive table:
The keys should become the Hive column names.
Each row is formed by splitting every field on the = delimiter and keeping the value to its right.
Expected Output:
key1 | key 2 | key3
----------+-------------+----------
value1 | value2 | value3
value11 | value12 | value13
The HDFS file has roughly 60 key-value pairs, so it is impractical to call withColumn() manually for each one. Any help is appreciated.
Edit1:
This is what I could write so far. Dataset.withColumn() doesn't seem to work in a loop except for the first iteration:
String[] columnNames = new String[dataset.columns().length];
String unescapedColumn;
Row firstRow= (Row)dataset.first();
String[] rowData = firstRow.mkString(",").split(",");
for(int i=0;i<rowData.length;i++) {
unescapedColumn=rowData[i].split("=")[0];
if(unescapedColumn.contains(">")) {
columnNames[i] = unescapedColumn.substring(unescapedColumn.indexOf(">")+1).trim();
}else {
columnNames[i] = unescapedColumn.trim();
}
}
Dataset<Row> namedDataset = dataset.toDF(columnNames);
for(String column : namedDataset.columns()) {
System.out.println("Column name :" + column);
namedDataset = namedDataset.withColumn(column, functions.substring_index(namedDataset.col(column),"=",-1));
}
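Independent of Spark, the per-line transformation described above can be sketched in plain Java as follows. This is only an illustration of the parsing step: KeyValueLine and parse are made-up names, and it assumes that pairs are separated by whitespace and that values contain no spaces:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of parsing one "key=value key=value" line.
class KeyValueLine {

    // Keys become column names, values become the cells of the row.
    static Map<String, String> parse(String line) {
        Map<String, String> row = new LinkedHashMap<>();
        for (String token : line.trim().split("\\s+")) {
            int eq = token.indexOf('=');
            if (eq <= 0) {
                continue; // no key before '=' or no '=' at all
            }
            // Split on the first '=' only, so values may contain '='.
            row.put(token.substring(0, eq), token.substring(eq + 1));
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, String> row = parse("key1=value1 key2=value2 key3=value3");
        System.out.println(row.keySet()); // column names
        System.out.println(row.values()); // row values
    }
}
```

The key set of the first line gives the column names, and every later line contributes only its values.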

Looping through 2d array with String.split()

I've got a simple problem, but I'm new to Java coming from PHP. I need to split a delimited text file into an array. I've broken it down into an array of lines, each one would look something like this:
{
{Bob | Smithers | Likes Cats | Doesnt Like Dogs},
{Jane | Haversham | Likes Bats | Doesnt Like People}
}
I need to turn this into a 2 dimensional array.
In PHP, it's a cinch. You just use explode(). I tried using String.split on a 1d array and it wasn't that bad either. The thing is, I haven't yet learned how to be nice to Java, so I don't know how to loop through the array and turn it into a 2d array. This is what I have:
for (i = 0; i < array.length; i++) {
String[i][] 2dArray = array[i].split("|", 4);
}
PHP would be
for ($i = 0; $i < count($array); $i++) {
$array[i][] = explode(",", $array[i]);
}
You can loop the array like this:
// Initialize array
String[] array = {
"Bob | Smithers | Likes Cats | Doesnt Like Dogs",
"Jane | Haversham | Likes Bats | Doesnt Like People"
};
// Convert 1d to 2d array
String[][] array2d = new String[2][4];
for(int i=0;i<array.length;i++) {
String[] temp = array[i].split(" \\| ");
for(int j=0;j<temp.length;j++) {
array2d[i][j] = temp[j];
}
}
// Print the array
for(int i=0;i<array2d.length;i++) {
System.out.println(Arrays.toString(array2d[i]));
}
Note: I used \\| to split on the pipe character, because | is a regex metacharacter and has to be escaped.
Problem
If I got you right you have an input like this:
{{Bob | Smithers | Likes Cats | Doesnt Like Dogs},{Jane | Haversham | Likes Bats | Doesnt Like People}}
Readable version:
{
{Bob | Smithers | Likes Cats | Doesnt Like Dogs},
{Jane | Haversham | Likes Bats | Doesnt Like People}
}
And you want to represent that structure in a 2-dimensional String array, String[][].
Solution
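The key is the method String#split, which splits a given String into substrings around matches of a given delimiter. In your example the delimiters are , and |. Note that the argument is a regular expression, so | must be escaped.
First of all we remove all { and } characters, as we don't need them (this works as long as the text itself contains neither curly braces nor the delimiters):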
String input = ...
String inputWithoutCurly = input.replaceAll("[{}]", "");
The text is now:
Bob | Smithers | Likes Cats | Doesnt Like Dogs,Jane | Haversham | Likes Bats | Doesnt Like People
Next we want to create the outer dimension of the array, that is split by ,:
String[] entries = inputWithoutCurly.split(",");
Structure now is:
[
"Bob | Smithers | Likes Cats | Doesnt Like Dogs",
"Jane | Haversham | Likes Bats | Doesnt Like People"
]
We now want to split each of the inner texts into their components. We therefore iterate all entries, split them by | and collect them to the result:
// Declaring a new 2-dim array with unknown inner dimension
String[][] result = new String[entries.length][];
// Iterating all entries
for (int i = 0; i < entries.length; i++) {
String[] data = entries[i].split(" \\| "); // | must be escaped, otherwise it means regex alternation
// Collect data to result
result[i] = data;
}
Finally we have the desired structure of:
[
[ "Bob", "Smithers", "Likes Cats", "Doesnt Like Dogs" ],
[ "Jane", "Haversham", "Likes Bats", "Doesnt Like People"]
]
Everything compact:
String[] entries = input.replaceAll("[{}]", "").split(",");
String[][] result = new String[entries.length][];
for (int i = 0; i < entries.length; i++) {
result[i] = entries[i].split(" \\| ");
}
Stream
If you have Java 8 or newer you can use the Stream API for a compact functional style:
String[][] result = Arrays.stream(input.replaceAll("[{}]", "").split(","))
.map(entry -> entry.split(" \\| "))
.toArray(String[][]::new);
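One pitfall is worth a small demonstration: String#split takes a regular expression, and an unescaped | means alternation, so a pattern of space, pipe, space matches a single space rather than the three-character delimiter. The sketch below (PipeSplit and splitRow are made-up names) uses Pattern.quote to treat the delimiter literally:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo of splitting on a literal " | " delimiter.
class PipeSplit {

    // Pattern.quote turns the delimiter into a literal, so '|' is not alternation.
    static String[] splitRow(String row) {
        return row.split(Pattern.quote(" | "));
    }

    public static void main(String[] args) {
        String row = "Bob | Smithers | Likes Cats | Doesnt Like Dogs";
        // Unescaped: the regex means "space OR space", so every space splits.
        System.out.println(Arrays.toString(row.split(" | ")));
        // Quoted: splits only on the full " | " delimiter.
        System.out.println(Arrays.toString(splitRow(row)));
    }
}
```

Writing the pattern as " \\| " has the same effect as Pattern.quote here; the quoted form is just harder to get wrong.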

In java, what's the best way to read a url and split it into its parts?

Firstly, I am aware that there are other similar posts, but since mine involves a URL and I am not always sure what my delimiter will be, I feel that I am alright posting my question. My assignment is to make a crude web browser. I have a textField that the user enters the desired URL into, and I then have to navigate to that webpage. Here is an example from my teacher of roughly what I'm supposed to send to my socket. Sample URL: http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol
GET /wiki/Hypertext_Transfer_Protocol HTTP/1.1\n
Host: en.wikipedia.org\n
\n
So my question is this: I am going to read in the url as just one complete string, so how do I extract just the "en.wikipedia.org" part and just the extension? I tried this as a test:
String url = "http://en.wikipedia.org/wiki/Hypertext Transfer Protocol";
String done = " ";
String[] hope = url.split(".org");
for ( int i = 0; i < hope.length; i++)
{
done = done + hope[i];
}
System.out.println(done);
This just prints out the URL without the ".org" in it. I think I'm on the right track, but I am not sure. Also, I know that websites can have different endings (.org, .com, .edu, etc.), so I am assuming I'll need a few if statements that compensate for the possible endings. Basically, how do I get the URL into the two parts that I need?
The URL class pretty much does this, look at the tutorial. For example, given this URL:
http://example.com:80/docs/books/tutorial/index.html?name=networking#DOWNLOADING
This is the kind of information you can expect to obtain:
protocol = http
authority = example.com:80
host = example.com
port = 80
path = /docs/books/tutorial/index.html
query = name=networking
filename = /docs/books/tutorial/index.html?name=networking
ref = DOWNLOADING
This is how you should split your URL parts: http://docs.oracle.com/javase/tutorial/networking/urls/urlInfo.html
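For the request in the question, a minimal sketch using java.net.URI (a close relative of the URL class) could look like this. RequestBuilder and buildRequest are made-up names. Note two assumptions: HTTP/1.1 terminates lines with \r\n rather than the bare \n shown in the question, and URI.create throws an IllegalArgumentException for addresses that contain spaces, so user input may need encoding first:

```java
import java.net.URI;

// Hypothetical helper that builds the request text for a given address.
class RequestBuilder {

    static String buildRequest(String address) {
        URI uri = URI.create(address);
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/"; // an empty path must be requested as "/"
        }
        if (uri.getRawQuery() != null) {
            path += "?" + uri.getRawQuery();
        }
        String host = uri.getHost();
        if (uri.getPort() != -1) {
            host += ":" + uri.getPort(); // include an explicit port in Host
        }
        return "GET " + path + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "\r\n";
    }

    public static void main(String[] args) {
        System.out.print(buildRequest(
                "http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol"));
    }
}
```

No if statements for .org, .com or .edu are needed, because the host is everything between the // and the next /, whatever its ending.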
Instead of url.split(".org"); try url.split("/"); and iterate through your array of strings.
Or you can look into regular expressions. This is a good example to start with.
Good luck on your homework.
Even though the answer using the URL class is great, here is one more way to split a URL into its components, using a regular expression:
"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?"
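The capturing groups, numbered by their opening parenthesis from left to right, are:
group 2 - scheme
group 4 - authority (includes hostname/IP and port number)
group 5 - path
group 7 - query
group 9 - fragment
Groups 1, 3, 6 and 8 capture the same parts together with their delimiters.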
You can use it with Pattern class:
var regex = "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?";
var pattern = Pattern.compile(regex);
var matcher = pattern.matcher("http://example.com:80/docs/books/tutorial/index.html?name=networking#DOWNLOADING");
if (matcher.matches()) {
System.out.println("scheme: " + matcher.group(2));
System.out.println("authority: " + matcher.group(4));
System.out.println("path: " + matcher.group(5));
System.out.println("query: " + matcher.group(7));
System.out.println("fragment: " + matcher.group(9));
}
You can use String#split() and store the result in a String array, then iterate over the array and store each name and its value in a Map.
import java.util.HashMap;
import java.util.Map;

public class URLSPlit {
    public static Map<String, String> splitString(String s) {
        // Split on runs of '=', '&', '?' or space; the tokens alternate name, value
        String[] split = s.split("[= & ?]+");
        Map<String, String> maps = new HashMap<>();
        // i + 1 < length guards against a trailing name without a value
        for (int i = 0; i + 1 < split.length; i += 2) {
            maps.put(split[i], split[i + 1]);
        }
        return maps;
    }

    public static void main(String[] args) {
        String word = "q=java+online+compiler&rlz=1C1GCEA_enIN816IN816&oq=java+online+compiler&aqs=chrome..69i57j69i60.18920j0j1&sourceid=chrome&ie=UTF-8?k1=v1";
        Map<String, String> newmap = splitString(word);
        for (Map.Entry<String, String> entry : newmap.entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
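Splitting on every = and & at once breaks down when a parameter has no value or a value contains =. A slightly more defensive sketch splits on & first and then on the first = of each pair, and also decodes the values. QueryParser and parse are made-up names, and it assumes the input is only the query part (without the leading ?) and that repeated names may overwrite each other:

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for parsing a URL query string into a map.
class QueryParser {

    static Map<String, String> parse(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate "a=1&&b=2"
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            // URLDecoder turns '+' into a space and resolves %XX escapes.
            params.put(URLDecoder.decode(name, StandardCharsets.UTF_8),
                    URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return params;
    }

    public static void main(String[] args) {
        parse("q=java+online+compiler&sourceid=chrome&ie=UTF-8")
                .forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

The decode overload that takes a Charset requires Java 10 or newer; on older versions the overload taking the charset name as a String can be used instead.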
