My program will read a file line by line, then split the line by delimiter | (vertical line) and stored into a String []. However, as column position and number of columns in the line will change in the future, instead of using concrete index number 0,1,2,3..., I use index++ to iterate the line split tokens;
After running the program, instead of increase 1, the index will increase more than 1 each time.
My code is like as follows:
BufferedReader br = null;
String line = null;
String[] lineTokens = null;
int index = 1;
DataModel dataModel = new DataModel();
try {
br = new BufferedReader(new FileReader(filePath));
while((line = br.readLine()) != null) {
// check Group C only
if(line.contains("CCC")) {
lineTokens = line.split("\\|");
dataModel.setGroupID(lineTokens[index++]);
//System.out.println(index); The value of index not equal to 2 here. The value change each running time
dataModel.setGroupName(lineTokens[index++]);
//System.out.println(index);
// dataModel.setOthers(lineTokens[index++]); <- if the file add columns in the middle of the line in the future, this is required.
dataModel.setMemberID(lineTokens[index++]);
dataModel.setMemberName(lineTokens[index++]);
dataModel.setXXX(lineTokens[index++]);
dataModel.setYYY(lineTokens[index++]);
index = 1;
//blah blah below
}
}
br.close();
} catch (Exception ex) {
}
The file format is like as follows:
Prefix | Group ID | Group Name | Memeber ID | Member Name | XXX | YYY
GroupInterface | AAA | Group A | 001 | Amy | XXX | YYY
GroupInterface | BBB | Group B | 002 | Tom | XXX | YYY
GroupInterface | AAA | Group A | 003 | Peter | XXX | YYY
GroupInterface | CCC | Group C | 004 | Sam | XXX | YYY
GroupInterface | CCC | Group C | 005 | Susan | XXX | YYY
GroupInterface | DDD | Group D | 006 | Parker| XXX | YYY
Instead of increase 1, the index++ will increase more than 1. I wonder why this happen and how to solve it? Any help is highly appreciated.
Well, #SomeProgrammerDude slyly hinted it, but I'll just come out and say it: when you reset index it should be set to zero, not 1.
By starting index at 1, you're always indexing one position ahead of where you should be, and you're probably eventually getting an IndexOutOfBoundsException that's being swallowed up by your empty catch clause.
Related
I am new to Spark 2.4 with Java 8. I need help. Here is example of instances:
Source DataFrame
+--------------+
| key | Value |
+--------------+
| A | John |
| B | Nick |
| A | Mary |
| B | Kathy |
| C | Sabrina|
| B | George |
+--------------+
Meta DataFrame
+-----+
| key |
+-----+
| A |
| B |
| C |
| D |
| E |
| F |
+-----+
I would like to transform it to the following: Column names from Meta Dataframe and Rows will be transformed based on Source Dataframe
+-----------------------------------------------+
| A | B | C | D | E | F |
+-----------------------------------------------+
| John | Nick | Sabrina | null | null | null |
| Mary | Kathy | null | null | null | null |
| null | George | null | null | null | null |
+-----------------------------------------------+
Need to write a code Spark 2.3 with Java8. Appreciated your help.
To make things clearer (and easily reproducible) let's define dataframes:
val df1 = Seq("A" -> "John", "B" -> "Nick", "A" -> "Mary",
"B" -> "Kathy", "C" -> "Sabrina", "B" -> "George")
.toDF("key", "value")
val df2 = Seq("A", "B", "C", "D", "E", "F").toDF("key")
From what I see, you are trying to create one column by value in the key column of df2. These columns should contain all the values of the value column that are associated to the key naming the column. If we take an example, column A's first value should be the value of the first occurrence of A (if it exists, null otherwise): "John". Its second value should be the value of the second occurrence of A: "Mary". There is no third value so the third value of the column should be null.
I detailed it to show that we need a notion of rank of the values for each key (windowing function), and group by that notion of rank. It would go as follows:
import org.apache.spark.sql.expressions.Window
val df1_win = df1
.withColumn("id", monotonically_increasing_id)
.withColumn("rank", rank() over Window.partitionBy("key").orderBy("id"))
// the id is just here to maintain the original order.
// getting the keys in df2. Add distinct if there are duplicates.
val keys = df2.collect.map(_.getAs[String](0)).sorted
// then it's just about pivoting
df1_win
.groupBy("rank")
.pivot("key", keys)
.agg(first('value))
.orderBy("rank")
//.drop("rank") // I keep here it for clarity
.show()
+----+----+------+-------+----+----+----+
|rank| A| B| C| D| E| F|
+----+----+------+-------+----+----+----+
| 1|John| Nick|Sabrina|null|null|null|
| 2|Mary| Kathy| null|null|null|null|
| 3|null|George| null|null|null|null|
+----+----+------+-------+----+----+----+
Here is the very same code in Java
Dataset<Row> df1_win = df1
.withColumn("id", functions.monotonically_increasing_id())
.withColumn("rank", functions.rank().over(Window.partitionBy("key").orderBy("id")));
// the id is just here to maintain the original order.
// getting the keys in df2. Add distinct if there are duplicates.
// Note that it is a list of objects, to match the (strange) signature of pivot
List<Object> keys = df2.collectAsList().stream()
.map(x -> x.getString(0))
.sorted().collect(Collectors.toList());
// then it's just about pivoting
df1_win
.groupBy("rank")
.pivot("key", keys)
.agg(functions.first(functions.col("value")))
.orderBy("rank")
// .drop("rank") // I keep here it for clarity
.show();
I have this string to parse and extract all elements between <>:
String text = "test user #myhashtag <#C5712|user_name_toto> <#U433|user_hola>";
I tried with this pattern, but it doesn't work (no result):
String pattern = "<#[C,U][0-9]+\\|[.]+>";
So in this example I want to extract:
<#C5712|user_name_toto>
<#U433|user_hola>
Then for each, I want to extract:
C or U element
ID (ie: 5712 or 433)
user name (ie: user_name_toto)
Thank you very much guys
The main problem I can see with your pattern is that it doesn't contain groups, hence retrieving parts of it will be impossible without further parsing.
You define numbered groups within parenthesis: (partOfThePattern).
From Java 7 onwards, you can also define named groups as follows: (?<theName>partOfThePattern).
The second problem is that [.] corresponds to a literal dot, not an "any character" wildcard.
The third problem is your last quantifier, which is greedy, therefore it would consume the whole rest of the string starting from the first username.
Here's a self-contained example fixing all that:
String text = "test user #myhashtag <#C5712|user_name_toto> <#U433|user_hola>";
// | starting <#
// | | group 1: any 1 char
// | | | group 2: 1+ digits
// | | | | escaped "|"
// | | | | | group 3: 1+ non-">" chars, greedy
// | | | | | | closing >
// | | | | | |
Pattern p = Pattern.compile("<#(.)(\\d+)\\|([^>]+))>");
Matcher m = p.matcher(text);
while (m.find()) {
System.out.printf(
"C or U? %s%nUser ID: %s%nUsername: %s%n",
m.group(1), m.group(2), m.group(3)
);
}
Output
C or U? C
User ID: 5712
Username: user_name_toto
C or U? U
User ID: 433
Username: user_hola
Note
I'm not validating C vs U here (gives you another . example).
You can easily replace the initial (.) with (C|U) if you only have either. You can also have the same with ([CU]).
<#([CU])(\d{4})\|(\w+)>
Where:
$1 --> C/U
$2 --> 5712/433
$3 --> user_name_toto/user_hola
I have a result set like this…
+--------------+--------------+----------+--------+
| LocationCode | MaterialCode | ItemCode | Vendor |
+--------------+--------------+----------+--------+
| 1 | 11 | 111 | 1111 |
| 1 | 11 | 111 | 1112 |
| 1 | 11 | 112 | 1121 |
| 1 | 12 | 121 | 1211 |
+--------------+--------------+----------+--------+
And so on for LocationCode 2,3,4 etc. I need an object (to be converted to json, eventually) as : List<Location>
Where the the hierarchy of nested objects in Location Class are..
Location.class
LocationCode
List<Material>
Material.class
MaterialCode
List<Item>
Item.class
ItemCode
Vendor
This corresponds to the resultset, where 1 location has 2 materials, 1 material(11) has 2 Items, 1 item(111) has 2 vendors. How do i achieve this? I have used AliasToBeanResultTransformer before, but i doubt it will be of help in this case.
I don't think there is a neat way to do that mapping. I'd just do it with nested loops, and custom logic to decide when to when to start building the next Location, Material, Item, whatever.
Something like this pseudo-code:
while (row = resultSet.next()) {
if (row.locationCode != currentLocation.locationCode) {
currentLocation = new Location(row.locationCode)
list.add(currentLocation)
currentMaterial = null
} else if (currentMaterial == null ||
row.materialCode != currentMaterial.materialCode) {
currentMaterial = new Material(row.materialCode)
currentLocation.add(currentMaterial)
} else {
currentMaterial.add(new Item(row.itemCode, row.vendorCode))
}
}
I have a string like this:
1 | 2 | 3 | 4 | 5 | 6 | 7 | | | | | | | |
The string might have more/less data also.
I need to remove | and get only numbers one by one.
Guava's Splitter Rocks!
String input = "1 | 2 | 3 | 4 | 5 | 6 | 7 | | | | | | | |";
Iterable<String> entries = Splitter.on("|")
.trimResults()
.omitEmptyStrings()
.split(input);
And if you really want to get fancy:
Iterable<Integer> ints = Iterables.transform(entries,
new Function<String, Integer>(){
Integer apply(String input){
return Integer.parseInt(input);
}
});
Although you definitely could use a regex method or String.split, I feel that using Splitter is less likely to be error-prone and is more readable and maintainable. You could argue that String.split might be more efficient but since you are going to have to do all the trimming and checking for empty strings anyway, I think it will probably even out.
One comment about transform, it does the calculation on an as-needed basis which can be great but also means that the transform may be done multiple times on the same element. Therefore I recommend something like this to perform all the calculations once.
Function<String, Integer> toInt = new Function...
Iterable<Integer> values = Iterables.transform(entries, toInt);
List<Integer> valueList = Lists.newArrayList(values);
You can try using a Scanner:
Scanner sc = new Scanner(myString);
sc.useDelimiter("|");
List<Integer> numbers = new LinkedList<Integer>();
while(sc.hasNext()) {
if(sc.hasNextInt()) {
numbers.add(sc.nextInt());
} else {
sc.next();
}
}
Here you go:
String str = "1 | 2 | 3 | 4 | 5 | 6 | 7 | | | | | | | |".replaceAll("\\|", "").replaceAll("\\s+", "");
Do you mean like?
String s = "1 | 2 | 3 | 4 | 5 | 6 | 7 | | | | | | | |";
for(String n : s.split(" ?\\| ?")) {
int i = Integer.parseInt(n);
System.out.println(i);
}
prints
1
2
3
4
5
6
7
inputString.split("\\s*\\|\\s*") will give you an array of the numbers as strings. Then you need to parse the numbers:
final List<Integer> ns = new ArrayList<>();
for (String n : input.split("\\s*\\|\\s*"))
ns.add(Integer.parseInt(n);
You can use split with the following regex (allows for extra spaces, tabs and empty buckets):
String input = "1 | 2 | 3 | 4 | 5 | 6 | 7 | | | | | | | | ";
String[] numbers = input.split("([\\s*\\|\\s*])+");
System.out.println(Arrays.toString(numbers));
outputs:
[1, 2, 3, 4, 5, 6, 7]
Or with Java onboard methods:
String[] data="1 | 2 | 3 | 4 | 5 | 6 | 7 | | | | | | | |".split("|");
for(String elem:data){
elem=elem.trim();
if(elem.length()>0){
// do someting
}
}
Split the string at its delimiter | and then parse the array.
Something like this should do:
String test = "|1|2|3";
String delimiter = "|";
String[] testArray = test.split(delimiter);
List<Integer> values = new ArrayList<Integer>();
for (String string : testArray) {
int number = Integer.parseInt(string);
values.add(number);
}
I have a problem in implementing the tree structure of OID. when I click the parent , i need to display only child details, not the sub child of a child.
i.e., i need not display an OID which contains a "." (dot).
For example, if my OID structure is private.MIB.sample.first
private.MIB.sample.second and so on.
when I click on MIB, it should display only "sample" not first and second.
first and second is to be displayed when I click sample.
How can I implement this in java.
My datyabase is MySQL. The code which I tried is given below
FilteredRowSet rs = new FilteredRowSetImpl();
// for Other Types Like OBJECT-TYPE, Object_IDENTIFIER
rs = new FilteredRowSetImpl();
rs.setCommand("Select * from MIBNODEDETAILS where " + "mn_OID like '" + OID
+ ".%' order by mn_NodeType, mn_OID");
rs.setUrl(Constants.DB_CONNECTION_URL);
rs.setFilter(new MibRowFilter(1, expString));
rs.execute();
rs.absolute(1);
rs.beforeFirst();
I guess the change is to be made in the setCommand argument.
How can I do this?
Structure of mobnodedetails table
+--------------------+-------------------+-------------+
| mn_OID | mn_name | mn_nodetype |
+--------------------+-------------------+-------------+
| 1 | iso | 0 |
| 1.3 | org | 1 |
| 1.3.6 | dod | 1 |
| 1.3.6.1 | internet | 1 |
| 1.3.6.1.1 | directory | 1 |
| 1.3.6.1.2 | mgmt | 1 |
| 1.3.6.1.2.1 | mib-2 | 0 |
| 1.3.6.1.2.1.1 | system | 1 |
| 1.3.6.1.2.1.10 | transmission | 1 |
You can use something like
SELECT *
FROM mibnodedetails
WHERE mn_oid LIKE "+mn_OID+"%
AND LENGTH ("+mn_OID+") + 2 = LENGTH (mn_oid)
ORDER BY mn_nodetype, mn_oid
So if you pass mm_OID as 1.3.6.1 (|1.3.6.1 |internet |1 |)
You will get following result:
| 1.3.6.1.1 | directory | 1 |
| 1.3.6.1.2 | mgmt | 1 |
Working Demo
PS: This will not work for child more than 9 as we are using length + 2
The function given below dispalys the tree as required.
public void populateMibValues()
{
final DefaultTreeModel model = (DefaultTreeModel) this.m_mibTree.getModel();
model.setRoot(null);
this.rootNode.removeAllChildren();
final String query_MibNodeDetailsSelect = "Select * from MIBNODEDETAILS where LENGTH(mn_oid)<=9 "
+ " and mn_OID<='1.3.6.1.4.1' order by mn_OID"; // only
this.innerNodeNames.clear();
this.innerNodes.clear();
this.innerNodesOid = null;
try {
final ResultSet deviceRS = Application.getDBHandler().executeQuery(query_MibNodeDetailsSelect, null);// inner
// nodes
while (deviceRS.next()) {
final mibNode mb = new mibNode(deviceRS.getString("mn_OID").toString(), deviceRS.getString("mn_name")
.toString());
mb.m_Type = Integer.parseInt(deviceRS.getString("mn_nodetype").toString());
createMibTree(mb);
}
}
catch (final Exception e) {
Application.showErrorInConsole(e);
NmsLogger.writeErrorLog("ERROR creating MIB tree failed", e.toString());
}