Regex to get particular values from string data - Java

While executing a particular shell command, I get the following output, which I keep in a String variable:
dat /var/france.log
exit
bluetooth_mac=45h6kuu
franceIP=testMarmiton
build_type=BFD
france_mac=F4:0E:83:35:E8:D1
seloger_mac=F4:0E:83:35:E8:D0
tdVersion=1.2
td_number=45G67j
france_mac=fjdjjjgj
logo_mac=tiuyiiy
logout
Connection to testMarmiton closed.
Disconnected channel and session
From this I have to fetch particular details like those below and put these values in a Map. How can I perform this using Java?
bluetooth_mac=45h6kuu
build_type=BFD
tdVersion=1.2
seloger_mac=F4:0E:83:35:E8:D0
france_mac=fjdjjjgj
Map<String, String> details = new HashMap<String, String>();
details.put("bluetooth_mac", "45h6kuu");
details.put("build_type", "BFD");
// etc.

Solution 1
If you are using Java 8 you can use:
String fileName = "shell.txt";
try (Stream<String> stream = Files.lines(Paths.get(fileName))) {
    Map<String, String> result = stream
            .filter(line -> line.matches("\\w+=\\w+"))
            .map(line -> line.split("="))
            .collect(Collectors.toMap(a -> a[0], a -> a[1]));
} catch (IOException e) {
    e.printStackTrace();
}
Outputs
{franceIP=testMarmiton, bluetooth_mac=45h6kuu, logo_mac=tiuyiiy, td_number=45G67j, france_mac=fjdjjjgj, build_type=BFD}
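Note that the `\w+=\w+` filter silently drops seloger_mac and tdVersion (their values contain `:` and `.`) and keeps keys the question does not want, such as franceIP. If only the five keys from the question are needed, one option is to filter against a whitelist; a sketch (the class name is mine, the key names are taken from the question):

```java
import java.util.*;
import java.util.stream.*;

public class KeyFilterDemo {
    // Keys the question asks for; everything else is dropped.
    private static final Set<String> WANTED = new HashSet<>(Arrays.asList(
            "bluetooth_mac", "build_type", "tdVersion", "seloger_mac", "france_mac"));

    public static Map<String, String> extract(List<String> lines) {
        return lines.stream()
                .filter(line -> line.matches("[^=]+=[^=]+")) // keep key=value lines only
                .map(line -> line.split("=", 2))             // split on the first '=' only
                .filter(kv -> WANTED.contains(kv[0]))        // drop franceIP, td_number, ...
                .collect(Collectors.toMap(kv -> kv[0], kv -> kv[1],
                        (first, second) -> second));         // later entries win (france_mac)
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "bluetooth_mac=45h6kuu", "franceIP=testMarmiton", "build_type=BFD",
                "france_mac=F4:0E:83:35:E8:D1", "seloger_mac=F4:0E:83:35:E8:D0",
                "tdVersion=1.2", "td_number=45G67j", "france_mac=fjdjjjgj", "logout");
        System.out.println(extract(lines));
    }
}
```

The merge function `(first, second) -> second` is what lets the duplicate france_mac lines through; without it, Collectors.toMap throws an IllegalStateException on the second occurrence.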
Solution 2
It seems that you have multiple lines with the same key; in this case I would group into a Map<String, List<String>>:
String fileName = "shell.txt";
try (Stream<String> stream = Files.lines(Paths.get(fileName))) {
    Map<String, List<String>> result = stream
            .filter(line -> line.matches("[^=]+=[^=]+")) // keep only lines which contain one = sign
            .map(line -> line.split("="))                // split on the = sign
            .collect(Collectors.groupingBy(e -> e[0], Collectors.mapping(e -> e[1], Collectors.toList())));
    result.forEach((k, v) -> System.out.println(k + " : " + v));
} catch (IOException e) {
    e.printStackTrace();
}
Outputs
franceIP : [testMarmiton]
bluetooth_mac : [45h6kuu]
logo_mac : [tiuyiiy]
td_number : [45G67j]
seloger_mac : [F4:0E:83:35:E8:D0]
france_mac : [F4:0E:83:35:E8:D1, fjdjjjgj]
tdVersion : [1.2]
build_type : [BFD]
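If a flat Map<String, String> is still the goal, the grouped map from Solution 2 can be collapsed by keeping the last value per key, which matches the france_mac=fjdjjjgj the question expects. A small sketch (class and method names are mine):

```java
import java.util.*;

public class LastValueDemo {
    // Collapse Solution 2's grouped map to a flat map, keeping the last value per key.
    public static Map<String, String> lastWins(Map<String, List<String>> grouped) {
        Map<String, String> flat = new LinkedHashMap<>();
        grouped.forEach((k, v) -> flat.put(k, v.get(v.size() - 1)));
        return flat;
    }

    public static void main(String[] args) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        grouped.put("france_mac", Arrays.asList("F4:0E:83:35:E8:D1", "fjdjjjgj"));
        grouped.put("build_type", Arrays.asList("BFD"));
        System.out.println(lastWins(grouped)); // {france_mac=fjdjjjgj, build_type=BFD}
    }
}
```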

public static void main(String[] args) {
    String str = "abc def \n"
            + "key=123 \n "
            + "pass=456 \n"
            + "not working";
    String[] sarray = str.split("\\r?\\n");
    for (String eachline : sarray) {
        System.out.println("line " + " : " + eachline);
        if (eachline.contains("=")) {
            String[] sarray2 = eachline.split("=");
            System.out.println("key:" + sarray2[0] + ":Value:" + sarray2[1]);
        }
    }
    System.out.println("" + sarray.length);
}
Use split("\\r?\\n") for new-line splitting.

You could try:
Pattern re = Pattern.compile("^\\s*(.*)\\s*=(.*)$", Pattern.MULTILINE);
Matcher matcher = re.matcher(input);
while (matcher.find()) {
    map.put(matcher.group(1), matcher.group(2));
}
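A runnable version of this approach, with reluctant quantifiers and whitespace handling added so keys and values come out trimmed (the class name is mine; assumes Unix line endings):

```java
import java.util.*;
import java.util.regex.*;

public class MultilineRegexDemo {
    // Line-by-line key=value extraction; the reluctant (.*?) stops at the
    // first '=', and the \s* groups strip surrounding whitespace.
    public static Map<String, String> parse(String input) {
        Map<String, String> map = new LinkedHashMap<>();
        Pattern re = Pattern.compile("^\\s*(.*?)\\s*=\\s*(.*?)\\s*$", Pattern.MULTILINE);
        Matcher matcher = re.matcher(input);
        while (matcher.find()) {
            map.put(matcher.group(1), matcher.group(2));
        }
        return map;
    }

    public static void main(String[] args) {
        String input = "bluetooth_mac=45h6kuu\nfranceIP=testMarmiton\nlogout\n";
        System.out.println(parse(input)); // {bluetooth_mac=45h6kuu, franceIP=testMarmiton}
    }
}
```

Lines without an `=` (like "logout" or "Connection to testMarmiton closed.") simply never match.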

Here's a complete example, extracting the values using regex matching and building a HashMap (an appropriate map for key-value pairs). You can copy the whole program and run it yourself:
import java.util.HashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class main {
    public static void main(String[] args) {
        String input = // Your input log
                "dat /var/france.log\n" +
                "\n" +
                "exit\n" +
                "\n" +
                "root#france24:~# \n" +
                "root#france24:~# dat /var/france.log\n" +
                "bluetooth_mac=45h6kuu\n" +
                "franceIP=testMarmiton\n" +
                "build_type=BFD\n" +
                "france_mac=F4:0E:83:35:E8:D1\n" +
                "seloger_mac=F4:0E:83:35:E8:D0\n" +
                "tdVersion=1.2\n" +
                "td_number=45G67j\n" +
                "france_mac=fjdjjjgj\n" +
                "logo_mac=tiuyiiy\n" +
                "root#france24:~# \n" +
                "root#france24:~# exit\n" +
                "logout\n" +
                "Connection to testMarmiton closed.\n" +
                "\n" +
                "Disconnected channel and session";
        String[] keys = {
                "bluetooth_mac",
                "build_type",
                "tdVersion",
                "seloger_mac",
                "france_mac"
        };
        HashMap<String, String> map = new HashMap<>();
        for (String key : keys) {
            String value = getValueOf(input, key);
            if (value != null)
                map.put(key, value);
        }
        for (String key : keys)
            System.out.println(key + " = " + map.get(key));
    }

    public static String getValueOf(String input, String key) { // returns null if not found
        String result = null;
        Pattern pattern = Pattern.compile(key + "=.*+\\s");
        Matcher matcher = pattern.matcher(input);
        if (matcher.find()) {
            result = matcher.group();
            result = result.substring(result.indexOf('=') + 1, result.length() - 1);
        }
        return result;
    }
}
Output (add more keys to the keys array if you want to get more values):
bluetooth_mac = 45h6kuu
build_type = BFD
tdVersion = 1.2
seloger_mac = F4:0E:83:35:E8:D0
france_mac = F4:0E:83:35:E8:D1
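The substring bookkeeping in getValueOf can be avoided with a capturing group, and anchoring the key with `^` prevents, say, "mac" from matching inside "bluetooth_mac". A sketch of that variant (class name is mine; assumes \n line endings, like the original):

```java
import java.util.regex.*;

public class ValueOfDemo {
    // Variant of getValueOf using a capturing group instead of substring math.
    // Returns null if the key is not found, like the original.
    public static String getValueOf(String input, String key) {
        Matcher m = Pattern.compile("^" + Pattern.quote(key) + "=(.*)$", Pattern.MULTILINE)
                           .matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "bluetooth_mac=45h6kuu\nfrance_mac=F4:0E:83:35:E8:D1\n";
        System.out.println(getValueOf(input, "bluetooth_mac")); // 45h6kuu
        System.out.println(getValueOf(input, "missing"));       // null
    }
}
```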

Related

Comparing two JSON objects irrespective of JSON array sequence in them using Java

Is there any way in Java to compare two JSON objects and print the changes/differences?
I have tried a "flat-map utility" which flattens and compares the JSON objects; however, the diff fails when JSON array elements appear in different orders in the two JSON objects.
e.g.
JSON 1
{
    "name": "Jack",
    "student_ID": 1,
    "subject": [
        {
            "marks": 50,
            "subjectId": "PHY",
            "subjectName": "Physics"
        },
        {
            "marks": 60,
            "subjectId": "CHE",
            "subjectName": "Chemistry"
        }
    ]
}
JSON 2
{
    "name": "Jack",
    "student_ID": 1,
    "subject": [
        {
            "marks": 60,
            "subjectId": "CHE",
            "subjectName": "Chemistry"
        },
        {
            "marks": 50,
            "subjectId": "PHY",
            "subjectName": "Physics"
        }
    ]
}
The JSON diff shows the two JSONs as mismatched, although logically they are identical.
Is there a good way of matching and comparing JSON in Java?
code used:
public class compareUtil
{
    public static void main(String[] args) throws IOException, UnresolvedDatasetException, ParseException
    {
        String baselineFolderPath = "input//jsonBaseline";
        String regressionfolderPath = "input//jsonRegression";
        String fileToCompare = "sampleJson1_simple.json";
        String baselineFile = baselineFolderPath + "//" + fileToCompare;
        String regressionFile = regressionfolderPath + "//" + fileToCompare;
        InputStream getBaselineJsonFile = new FileInputStream(baselineFile);
        InputStream getRegressionJsonFile = new FileInputStream(regressionFile);
        ObjectMapper mapper = new ObjectMapper();
        TypeReference<HashMap<String, Object>> type = new TypeReference<HashMap<String, Object>>() {};
        Map<String, Object> baselineMap = mapper.readValue(getBaselineJsonFile, type);
        Map<String, Object> regressionMap = mapper.readValue(getRegressionJsonFile, type);
        MapDifference<String, Object> differenceNormal = Maps.difference(baselineMap, regressionMap);
        System.out.println(differenceNormal);
        Map<String, Object> baselineFlatMap = FlatMapUtil.flatten(baselineMap);
        Map<String, Object> regressionFlatMap = FlatMapUtil.flatten(regressionMap);
        MapDifference<String, Object> difference = Maps.difference(baselineFlatMap, regressionFlatMap);
        System.out.println("The diff " + difference);
        System.out.println("Entries only on the baseline json\n--------------------------");
        difference.entriesOnlyOnLeft()
                .forEach((key, value) -> System.out.println(key + ": " + value));
        System.out.println("\n\nEntries only on the regression json\n--------------------------");
        difference.entriesOnlyOnRight()
                .forEach((key, value) -> System.out.println(key + ": " + value));
        System.out.println("\n\nEntries differing (baseline, regression)\n--------------------------");
        difference.entriesDiffering()
                .forEach((key, value) -> System.out.println(key + ": " + value));
    }
}
------------------------------------------------------------------------------------------------
public final class FlatMapUtil {
    private FlatMapUtil() {
        throw new AssertionError("No instances for you!");
    }

    public static Map<String, Object> flatten(Map<String, Object> map) {
        return map.entrySet().stream()
                .flatMap(FlatMapUtil::flatten)
                .collect(LinkedHashMap::new, (m, e) -> m.put("/" + e.getKey(), e.getValue()), LinkedHashMap::putAll);
    }

    private static Stream<Map.Entry<String, Object>> flatten(Map.Entry<String, Object> entry) {
        if (entry == null) {
            return Stream.empty();
        }
        if (entry.getValue() instanceof Map<?, ?>) {
            return ((Map<?, ?>) entry.getValue()).entrySet().stream()
                    .flatMap(e -> flatten(new AbstractMap.SimpleEntry<>(entry.getKey() + "/" + e.getKey(), e.getValue())));
        }
        if (entry.getValue() instanceof List<?>) {
            List<?> list = (List<?>) entry.getValue();
            return IntStream.range(0, list.size())
                    .mapToObj(i -> new AbstractMap.SimpleEntry<String, Object>(entry.getKey() + "/" + i, list.get(i)))
                    .flatMap(FlatMapUtil::flatten);
        }
        return Stream.of(entry);
    }
}
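The flattening above keeps list indices in the keys, which is exactly why reordered arrays diff as mismatched. One way around it, if array order should not matter, is to normalize the Map/List tree (the same structure Jackson produces for TypeReference<HashMap<String, Object>>) by sorting every list into a canonical order before comparing. This is a sketch of that idea, not the asker's utility; class and method names are mine:

```java
import java.util.*;
import java.util.stream.*;

public class OrderInsensitiveCompare {
    // Recursively replace every List with a copy sorted by the elements'
    // string form, and every Map with a key-sorted TreeMap, so two trees
    // that differ only in array order normalize to equal structures.
    public static Object normalize(Object node) {
        if (node instanceof Map<?, ?>) {
            Map<String, Object> out = new TreeMap<>();
            ((Map<?, ?>) node).forEach((k, v) -> out.put(String.valueOf(k), normalize(v)));
            return out;
        }
        if (node instanceof List<?>) {
            return ((List<?>) node).stream()
                    .map(OrderInsensitiveCompare::normalize)
                    .sorted(Comparator.comparing(String::valueOf))
                    .collect(Collectors.toList());
        }
        return node; // scalar: leave as-is
    }

    public static boolean sameIgnoringArrayOrder(Object a, Object b) {
        return normalize(a).equals(normalize(b));
    }
}
```

Sorting by String.valueOf is crude but deterministic; the two subject arrays from the question normalize to the same list and then compare equal.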
Josson & Jossons provides the operator <==> to compare two JSON datasets.
https://github.com/octomix/josson
Deserialization
Jossons jossons = new Jossons();
jossons.putDataset("json1", Josson.fromJsonString(
        "{" +
        "\"name\":\"Jack\"," +
        "\"student_ID\":1," +
        "\"subject\":[ {" +
        "\"marks\":50," +
        "\"subjectId\":\"PHY\"," +
        "\"subjectName\":\"Physics\"" +
        "}, {" +
        "\"marks\":60," +
        "\"subjectId\":\"CHE\"," +
        "\"subjectName\":\"Chemistry\"" +
        "} ]" +
        "}"));
jossons.putDataset("json2", Josson.fromJsonString(
        "{" +
        "\"name\":\"Jack\"," +
        "\"student_ID\":1," +
        "\"subject\":[ {" +
        "\"marks\":60," +
        "\"subjectId\":\"CHE\"," +
        "\"subjectName\":\"Chemistry\"" +
        "}, {" +
        "\"marks\":50," +
        "\"subjectId\":\"PHY\"," +
        "\"subjectName\":\"Physics\"" +
        "} ]" +
        "}"));
jossons.putDataset("json3", Josson.fromJsonString(
        "{" +
        "\"name\":\"Jack\"," +
        "\"student_ID\":1," +
        "\"subject\":[ {" +
        "\"marks\":60," +
        "\"subjectId\":\"CHE\"," +
        "\"subjectName\":\"Chemistry\"" +
        "}, {" +
        "\"marks\":70," +
        "\"subjectId\":\"BIO\"," +
        "\"subjectName\":\"Biology\"" +
        "} ]" +
        "}"));
Query
JsonNode node = jossons.evaluateQuery("json1 <==> json2");
System.out.println("json1 <==> json2 : " + node.toString());
node = jossons.evaluateQuery("json1 <==> json3");
System.out.println("json1 <==> json3 : " + node.toString());
node = jossons.evaluateQuery("json2 <==> json3");
System.out.println("json2 <==> json3 : " + node.toString());
Output
json1 <==> json2 : true
json1 <==> json3 : false
json2 <==> json3 : false

If I have two keys with an equal value, how do I return a Set<String> like (value + " - " + key + ", " + key)?

This code works when each name (value) has a single number (key), but I'm stuck with the situation of one name with several numbers: it returns name + number1, name + number2 as separate entries.
public Set<String> getAllContacts() {
    TreeSet<String> contacts = new TreeSet<>();
    if (phoneBook.isEmpty()) {
        return new TreeSet<>();
    } else {
        for (Map.Entry<String, String> entry : phoneBook.entrySet()) {
            contacts.add(entry.getValue() + " - " + entry.getKey());
        }
    }
    return contacts;
}
Basically you want to invert key and value of your phoneBook map. Since one name can have multiple numbers, according to your example the output structure should be Map<String, List<String>>. With this name-oriented phone book you can easily create your contact collection, e.g.:
Map<String, List<String>> phoneBookByName = new HashMap<>();
phoneBook.forEach((k, v) -> {
    List<String> numbers = phoneBookByName.computeIfAbsent(v, s -> new ArrayList<>());
    numbers.add(k);
});
TreeSet<String> contacts = new TreeSet<>();
phoneBookByName.forEach((k, v) -> contacts.add(k + " - " + String.join(", ", v)));
return contacts;
or using streams
Map<String, List<String>> phoneBookByName = phoneBook.entrySet().stream()
        .collect(Collectors.groupingBy(Map.Entry::getValue,
                Collectors.mapping(Map.Entry::getKey, Collectors.toList())));
return phoneBookByName.entrySet().stream()
        .map(entry -> entry.getKey() + " - " + String.join(", ", entry.getValue()))
        .collect(Collectors.toCollection(TreeSet::new));
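Here is the stream version run against a tiny number-to-name phone book (keys are numbers, values are names, as in the question); the class name and sample data are mine:

```java
import java.util.*;
import java.util.stream.*;

public class ContactsDemo {
    // Group numbers by name, then render "name - number1, number2" entries.
    public static Set<String> getAllContacts(Map<String, String> phoneBook) {
        Map<String, List<String>> byName = phoneBook.entrySet().stream()
                .collect(Collectors.groupingBy(Map.Entry::getValue,
                        Collectors.mapping(Map.Entry::getKey, Collectors.toList())));
        return byName.entrySet().stream()
                .map(e -> e.getKey() + " - " + String.join(", ", e.getValue()))
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        Map<String, String> phoneBook = new TreeMap<>(); // sorted so number order is stable
        phoneBook.put("111", "Alice");
        phoneBook.put("222", "Alice");
        phoneBook.put("333", "Bob");
        System.out.println(getAllContacts(phoneBook)); // [Alice - 111, 222, Bob - 333]
    }
}
```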

How to display the first row of results from extracted data?

I used the following script to read a CSV file and extract suspicious data in descending order. I am trying to print only the result with the highest occurrence, which is the first result shown. Should I specify that in the map script? Can I use println to show it?
public class Part2 {
    public static void main(String[] args) {
        try (Scanner csvData = new Scanner(
                new File("C:\\Users\\amber\\Documents\\IN300_Dataset1.csv"))) {
            List<String> list = new ArrayList<String>();
            while (csvData.hasNext()) {
                list.add(csvData.nextLine());
            }
            String[] tempArray = list.toArray(new String[1]);
            String[][] csvArray = new String[tempArray.length][];
            String combined_list[] = new String[tempArray.length];
            String myData = null;
            for (int i = 0; i < tempArray.length; i++) {
                if (i == 0) continue;
                csvArray[i] = tempArray[i].split(",");
                if (csvArray[i][4].matches("^\"[a-zA-Z].*\"")) {
                    continue;
                } else {
                    myData = csvArray[i][2] + " " +
                            csvArray[i][3] + " " +
                            csvArray[i][4] + " " +
                            csvArray[i][5] + " " +
                            csvArray[i][6];
                    combined_list[i] = myData;
                }
            }
            getOccurences("Suspicious Result(s)", combined_list);
        } catch (Exception e) {
            System.out.println(e);
        }
    }

    private static void getOccurences(String message, String[] myArray) {
        Map<String, Integer> map = new HashMap<>();
        for (String key : myArray) {
            if (map.containsKey(key)) {
                int occurrence = map.get(key);
                occurrence++;
                map.put(key, occurrence);
            } else {
                map.put(key, 1);
            }
        }
        Map<String, Integer> sortedMap =
                map.entrySet().stream()
                        .sorted(Collections.reverseOrder(Entry.comparingByValue()))
                        .collect(Collectors.toMap(Entry::getKey, Entry::getValue, (e1, e2) -> e2, LinkedHashMap::new));
        printMap(message, sortedMap);
    }

    private static void printMap(String message, Map<String, Integer> map)
    {
        System.out.println();
        System.out.println();
        System.out.println("Printing" + message);
        System.out.println();
        map.forEach((key, value) -> {
            if (key != null && value > 100) {
                System.out.println(key + "appeared " + value + " times(s).");
            }
        });
    }
}
Here is a sample of my results (the bold text is the only result I want to display):
PrintingSuspicious Result(s)
00:28:00:01:00:00 02:00:00:00:45:00 0x4006 44 Ethernet IIappeared 7536 times(s).
209.99.61.21 192.168.1.24 UDP 1359 17212 > 52797 Len=1309appeared 2990 times(s).
192.168.1.24 209.99.61.21 UDP 170 52797 > 17212 Len=128appeared 2905 times(s).
209.99.61.21 192.168.1.24 UDP 1351 17212 > 52797 Len=1309appeared 2851 times(s).
If you want to display only the first result, just quit the loop after printing the first element (you'll have to use a normal for loop instead of forEach to be able to break):
private static void printMap(String message, Map<String, Integer> map)
{
    System.out.println();
    System.out.println();
    System.out.println("Printing" + message);
    System.out.println();
    for (Map.Entry<String, Integer> entry : map.entrySet()) {
        if (entry.getKey() != null && entry.getValue() > 100) {
            System.out.println(entry.getKey() + "appeared " + entry.getValue() + " times(s).");
            break;
        }
    }
}
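An equivalent stream-based take, if you prefer to stay close to the forEach style: filter, take the first qualifying entry, and print it if present. This is a sketch (class and method names are mine); it relies on the map iterating in sorted order, which the LinkedHashMap built in getOccurences guarantees:

```java
import java.util.*;

public class FirstEntryDemo {
    // Return the formatted line for the first entry with count > 100, if any.
    public static Optional<String> firstSuspicious(Map<String, Integer> map) {
        return map.entrySet().stream()
                .filter(e -> e.getKey() != null && e.getValue() > 100)
                .findFirst()
                .map(e -> e.getKey() + " appeared " + e.getValue() + " time(s).");
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new LinkedHashMap<>(); // already sorted descending
        map.put("a", 500);
        map.put("b", 200);
        firstSuspicious(map).ifPresent(System.out::println); // a appeared 500 time(s).
    }
}
```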

Performance tuning for JavaRDD function

I want to convert a dataframe to an array of JSON using Java and Spark 1.6, for which I am converting the data from
Dataframe -> JSON -> RDD -> Array
where the data looks like this:
[
    {
        "prtdy_pgm_x": "P818_C",
        "prtdy_pgm_x": "P818",
        "prtdy_attr_c": "Cost",
        "prtdy_integer_r": 0,
        "prtdy_cds_d": "prxm",
        "prtdy_created_s": "2018-05-12 04:12:19.0",
        "prtdy_created_by_c": "brq",
        "prtdy_create_proc_x": "w_pprtdy_security_t",
        "snapshot_d": "2018-05-12-000018"
    },
    {
        "prtdy_pgm_x": "P818_I",
        "prtdy_pgm_x": "P818",
        "prtdy_attr_c": "Tooling",
        "prtdy_integer_r": 0,
        "prtdy_cds_d": "prxm",
        "prtdy_created_s": "2018-05-12 04:12:20.0",
        "prtdy_created_by_c": "brq",
        "prtdy_create_proc_x": "w_pprtdy_security_t",
        "snapshot_d": "2018-05-12-000018"
    },
    {
        "prtdy_pgm_x": "P818_W",
        "prtdy_pgm_x": "P818",
        "prtdy_attr_c": "Weight",
        "prtdy_integer_r": 0,
        "prtdy_cds_d": "prxm",
        "prtdy_created_s": "2018-05-12 04:12:20.0",
        "prtdy_created_by_c": "brq",
        "prtdy_create_proc_x": "w_pprtdy_security_t",
        "snapshot_d": "2018-05-12-000018"
    },
    ......
]
so I wrote my code like this:
if (cmnTableNames != null && cmnTableNames.length > 0) {
    for (int i = 0; i < cmnTableNames.length; i++) {
        String cmnTableName = cmnTableNames[i];
        DataFrame cmnTableContent = null;
        if (cmnTableName.contains("PTR_security_t")) {
            cmnTableContent = hiveContext.sql("SELECT * FROM " + cmnTableName + " where fbrn04_snapshot_d = '" + snapshotId + "'");
        } else {
            cmnTableContent = hiveContext.sql("SELECT * FROM " + cmnTableName);
        }
        String cmnTable = cmnTableName.substring(cmnTableName.lastIndexOf(".") + 1);
        if (cmnTableContent.count() > 0) {
            String cmnStgTblDir = hdfsPath + "/staging/" + rptName + "/common/" + cmnTable;
            JavaRDD<String> cmnTblCntJson = cmnTableContent.toJSON().toJavaRDD();
            String result = cmnTblCntJson.reduce((ob1, ob2) -> (String) ob1 + "," + (String) ob2); // This part takes more time than usual on a large data set.
            String output = "[" + result + "]";
            ArrayList<String> outputList = new ArrayList<String>();
            outputList.add(output);
            JavaRDD<String> finalOutputRDD = sc.parallelize(outputList);
            String cmnStgMrgdDir = cmnStgTblDir + "/mergedfile";
            if (dfs.exists(new Path(cmnStgTblDir + "/mergedfile"))) dfs.delete(new Path(cmnStgTblDir + "/mergedfile"), true);
            finalOutputRDD.coalesce(1).saveAsTextFile(cmnStgMrgdDir, GzipCodec.class);
            fileStatus = dfs.getFileStatus(new Path(cmnStgMrgdDir + "/part-00000.gz"));
            dfs.setPermission(fileStatus.getPath(), FsPermission.createImmutable((short) 0770));
            dfs.rename(new Path(cmnStgMrgdDir + "/part-00000.gz"), new Path(CommonPath + "/" + cmnTable + ".json.gz"));
        } else {
            System.out.println("There are no records in " + cmnTableName);
        }
    }
} else {
    System.out.println("The common table lists are null.");
}
sc.stop();
but the reduce step takes a long time:
JavaRDD<String> cmnTblCntJson = cmnTableContent.toJSON().toJavaRDD();
String result = cmnTblCntJson.reduce((ob1, ob2) -> (String) ob1 + "," + (String) ob2); // This part takes more time than usual on a large data set.
The table with the partition "PTR_security_t" is huge and takes a lot of time compared to the other tables, which don't have partitions (40-50 odd minutes for 588 KB).
I tried applying a lambda but ended up with a "Task not serializable" error. Check the code below.
if (cmnTableNames != null && cmnTableNames.length > 0) {
    List<String> commonTableList = Arrays.asList(cmnTableNames);
    DataFrame commonTableDF = sqc.createDataset(commonTableList, Encoders.STRING()).toDF();
    commonTableDF.toJavaRDD().foreach(cmnTableNameRDD -> {
        DataFrame cmnTableContent = null;
        String cmnTableName = cmnTableNameRDD.mkString();
        if (cmnTableName.contains("PTR_security_t")) {
            cmnTableContent = hiveContext.sql("SELECT * FROM " + cmnTableName + " where fbrn04_snapshot_d = '" + snapshotId + "'");
        } else {
            cmnTableContent = hiveContext.sql("SELECT * FROM " + cmnTableName);
        }
        String cmnTable = cmnTableName.substring(cmnTableName.lastIndexOf(".") + 1);
        if (cmnTableContent.count() > 0) {
            String cmnStgTblDir = hdfsPath + "/staging/" + rptName + "/common/" + cmnTable;
            JavaRDD<String> cmnTblCntJson = cmnTableContent.toJSON().toJavaRDD();
            String result = cmnTblCntJson.reduce((ob1, ob2) -> (String) ob1 + "," + (String) ob2);
            String output = "[" + result + "]";
            ArrayList<String> outputList = new ArrayList<String>();
            outputList.add(output);
            JavaRDD<String> finalOutputRDD = sc.parallelize(outputList);
            String cmnStgMrgdDir = cmnStgTblDir + "/mergedfile";
            if (dfs.exists(new Path(cmnStgTblDir + "/mergedfile"))) dfs.delete(new Path(cmnStgTblDir + "/mergedfile"), true);
            finalOutputRDD.coalesce(1).saveAsTextFile(cmnStgMrgdDir, GzipCodec.class);
            fileStatus = dfs.getFileStatus(new Path(cmnStgMrgdDir + "/part-00000.gz"));
            dfs.setPermission(fileStatus.getPath(), FsPermission.createImmutable((short) 0770));
            dfs.rename(new Path(cmnStgMrgdDir + "/part-00000.gz"), new Path(CommonPath + "/" + cmnTable + ".json.gz"));
        } else {
            System.out.println("There are no records in " + cmnTableName);
        }
    });
} else {
    System.out.println("The common table lists are null.");
}
sc.stop();
Is there any efficient way I can enhance the performance?
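No answer is given here, but one observation on the slow step: the pairwise reduce builds the result by repeated string concatenation, copying the ever-growing accumulated string on every step, so its total cost grows quadratically with the output size. Joining once into a single buffer is linear and produces the same string. A plain-Java illustration of the difference (not Spark code; class and method names are mine):

```java
import java.util.*;

public class JoinVsReduceDemo {
    // Mirrors the RDD reduce above: each step allocates and copies the whole
    // accumulated string again, which is quadratic overall.
    public static String viaReduce(List<String> parts) {
        return parts.stream().reduce((a, b) -> a + "," + b).orElse("");
    }

    // Same result in one pass with one buffer.
    public static String viaJoin(List<String> parts) {
        return String.join(",", parts);
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("{\"a\":1}", "{\"b\":2}");
        System.out.println("[" + viaJoin(rows) + "]"); // [{"a":1},{"b":2}]
    }
}
```

In Spark terms, an option worth exploring is skipping the driver-side reduce entirely and letting coalesce(1).saveAsTextFile write the JSON lines directly (one record per line), wrapping with "[", "]" and commas afterwards if that exact array format is required.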

Java custom implode method like in PHP

I am trying to replicate the PHP function implode() in Java.
This is what I made:
private String implode(String delimiter, Map<String, String> map) {
    StringBuilder sb = new StringBuilder();
    for (Entry<String, String> e : map.entrySet()) {
        sb.append(" " + delimiter + " ");
        sb.append(" " + e.getKey() + " = '" + e.getValue() + "' ");
    }
    return sb.toString();
}
Testing:
Map<String, String> myList = new HashMap<String, String>();
myList.put("address", "something1");
myList.put("last_name", "something2");
myList.put("first_name", "something3");
update_database("dummy", myList, "");

public void update_database(String table, Map<String, String> update_list, String condition) {
    String query = "UPDATE " + table + " SET ";
    query += implode(",", update_list) + " " + condition;
    System.out.println(query);
}
Output:
UPDATE dummy SET , address = 'something1' , last_name = 'something2',
first_name = 'something3'
If you have worked with MySQL before, you know this is not a valid query because the SET clause starts with ",".
How can I format my string to get a correct query?
You could try something like this:
Own Implementation
private String implode(String delimiter, Map<String, String> map) {
    boolean first = true;
    StringBuilder sb = new StringBuilder();
    for (Entry<String, String> e : map.entrySet()) {
        if (!first) sb.append(" " + delimiter + " ");
        sb.append(" " + e.getKey() + " = '" + e.getValue() + "' ");
        first = false;
    }
    return sb.toString();
}
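The same idea fits in a few lines with Collectors.joining, which places the delimiter between elements for you, so no 'first' flag is needed. A runnable sketch (class name is mine):

```java
import java.util.*;
import java.util.stream.*;

public class ImplodeDemo {
    // implode via streams: joining() inserts the delimiter only between entries.
    public static String implode(String delimiter, Map<String, String> map) {
        return map.entrySet().stream()
                .map(e -> e.getKey() + " = '" + e.getValue() + "'")
                .collect(Collectors.joining(delimiter + " "));
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>(); // preserves insertion order
        values.put("address", "something1");
        values.put("last_name", "something2");
        System.out.println("UPDATE dummy SET " + implode(",", values));
        // UPDATE dummy SET address = 'something1', last_name = 'something2'
    }
}
```

For real SQL, note that concatenating values into the query string invites SQL injection; a PreparedStatement with placeholders is the safer route.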
StringUtils
Another solution would be to use public static String join(Collection collection, char separator) from Apache Commons Lang's StringUtils.
You don't need to write this yourself. There's already a library with this functionality named Guava, made by Google. It has a class called Joiner. Here's an example:
Joiner joiner = Joiner.on("; ").skipNulls();
return joiner.join("Harry", null, "Ron", "Hermione");
This returns the string "Harry; Ron; Hermione". Note that all input elements are converted to strings using Object.toString() before being appended.
