Simultaneous traversal of two arrays - Java

I have to read a CSV file that has a fixed number of fields. I need to traverse the first column (which I have read into an array), detect runs of consecutive identical strings, and, only for those strings, sum their int values from the third column of the file, which I have stored in another array. So far I can detect the consecutive identical strings, but how can I grab their values and compute the sum for each string? Is it possible to do this by traversing both arrays simultaneously? I don't have much experience with Java, so please help. Thanks.
Here's my code.
The csv file is something like this with random values:
ip, timestamp,elapsed,..
127.0.0.1,...,1500
127.0.0.2,...,2800
127.0.0.2,...,2400
127.0.0.2,...,2500
127.0.0.3,...,1700
127.0.0.4,...,1600
127.0.0.4,...,1500
127.0.0.5,...,2000
I must get something like this: 127.0.0.2:7700, 127.0.0.4:3100
public static void main(String[] args) {
try {
System.out.println("Give file's name: ");
Scanner in = new Scanner(System.in);
String filename = in.nextLine();
File file = new File(filename);
Scanner inputfile = new Scanner(file);
String csv_data[];
ArrayList<String> ip_list = new ArrayList<String>();
ArrayList<String> elapsed_list = new ArrayList<String>();
String[] ip_array = new String[ip_list.size()];
String[] elapsed_array = new String[elapsed_list.size()];
int i = 0;
int j = 0;
int sum = 0;
while (inputfile.hasNextLine()) {
String line = inputfile.nextLine();
csv_data = line.split(",");
ip_list.add(csv_data[0]);
elapsed_list.add(csv_data[2]);
}
ip_array = ip_list.toArray(ip_array);
elapsed_array = elapsed_list.toArray(elapsed_array);
for (String element : elapsed_array) {
try {
int num = Integer.parseInt(element);
} catch (NumberFormatException fe) {
fe.printStackTrace();
System.out.println(" That's not a number");
}
}
while (i < ip_array.length) {
int start = i;
while (i < ip_array.length && (ip_array[i].equals(ip_array[start]))) {
i++;
}
int count = i - start;
if (count >= 5) {
System.out.println(ip_array[start] + " " + "|" + " " + count);
}
}
} catch (FileNotFoundException ex) {
ex.printStackTrace();
}
}

public static void main(String[] args) throws Exception {
List<Data> data = read(getFile());
Map<String, Integer> idSum = groupByIdWithSum(data);
// ...
}
private static File getFile() throws Exception {
try (Scanner scan = new Scanner(System.in)) {
System.out.println("Give file's name: ");
return new File(scan.nextLine());
}
}
private static List<Data> read(File file) throws Exception {
try (Scanner scan = new Scanner(file)) {
List<Data> res = new ArrayList<>();
while(scan.hasNextLine()){
String[] lineParts = scan.nextLine().split(",");
res.add(new Data(lineParts[0], Integer.parseInt(lineParts[2])));
}
return res;
}
}
private static Map<String, Integer> groupByIdWithSum(List<Data> data) {
Map<String, Integer> map = new HashMap<>();
for(Data d : data)
map.put(d.getId(), map.getOrDefault(d.getId(), 0) + d.getElapsed());
return map;
}
final static class Data {
private final String ip;
private final int elapsed;
public Data(String ip, int elapsed) {
this.ip = ip;
this.elapsed = elapsed;
}
public String getId() {
return ip;
}
public int getElapsed() {
return elapsed;
}
}
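Note that the code above sums the elapsed values per IP over the whole file, while the question asks only for runs of consecutive identical IPs. A minimal sketch of the simultaneous traversal, assuming the two columns are already in parallel lists with the header line skipped, and assuming a "run" means two or more identical neighbours (which matches the expected output in the question). The names ConsecutiveSums and sumRuns are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;

final class ConsecutiveSums {

    // Returns "ip:sum" for every run of two or more consecutive identical IPs.
    // ips and elapsed are parallel lists: elapsed.get(k) belongs to ips.get(k).
    static List<String> sumRuns(List<String> ips, List<Integer> elapsed) {
        List<String> result = new ArrayList<>();
        int i = 0;
        while (i < ips.size()) {
            int start = i;
            int sum = 0;
            // Walk through the run, adding the elapsed value at the same index.
            while (i < ips.size() && ips.get(i).equals(ips.get(start))) {
                sum += elapsed.get(i);
                i++;
            }
            if (i - start >= 2) { // assumption: single occurrences are not reported
                result.add(ips.get(start) + ":" + sum);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> ips = List.of("127.0.0.1", "127.0.0.2", "127.0.0.2", "127.0.0.2",
                "127.0.0.3", "127.0.0.4", "127.0.0.4", "127.0.0.5");
        List<Integer> elapsed = List.of(1500, 2800, 2400, 2500, 1700, 1600, 1500, 2000);
        System.out.println(sumRuns(ips, elapsed)); // prints [127.0.0.2:7700, 127.0.0.4:3100]
    }
}
```

The same index is used for both lists inside the inner loop, so no second traversal is needed.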

Related

How to sort a CSV file by one field in Java?

How can I sort a CSV file by one field in Java?
For example, I want to sort it by the third field.
I have a CSV file that looks like this:
1951,Jones,5
1984,Smith,7
...
I tried using Scanner as such, with a delimiter but I couldn't figure out how to go on:
public static void main(String[] args)
{
//String data = args[0];
Scanner s = null;
String delim = ";";
try
{
s = new Scanner(new BufferedReader (new FileReader("test.csv")));
List<Integer> three = new ArrayList<Integer>();
while(s.hasNext())
{
System.out.println(s.next());
s.useDelimiter(delim);
}
}
catch(FileNotFoundException e)
{
System.out.println("File not found");
}
finally
{
if(s != null)
{
s.close();
}
}
}
Thank you!
public static void main(String[] args)
{
final String DELIM = ";";
final int COLUMN_TO_SORT = 2; //First column = 0; Third column = 2.
List<List<String>> records = new ArrayList<>();
try (Scanner scanner = new Scanner(new File("test.csv"))) {
while (scanner.hasNextLine()) {
records.add(getRecordFromLine(scanner.nextLine(), DELIM));
}
}
catch(FileNotFoundException e){
System.out.println("File not found");
}
Collections.sort(records, new Comparator<List<String>>(){
@Override
public int compare(List<String> row1, List<String> row2){
if(row1.size() > COLUMN_TO_SORT && row2.size() > COLUMN_TO_SORT)
return row1.get(COLUMN_TO_SORT).compareTo(row2.get(COLUMN_TO_SORT));
return 0;
}
});
for (Iterator<List<String>> iterator = records.iterator(); iterator.hasNext(); ) {
System.out.println(iterator.next());
}
}
private static List<String> getRecordFromLine(String row, String delimiter) {
List<String> values = new ArrayList<String>();
try (Scanner rowScanner = new Scanner(row)) {
rowScanner.useDelimiter(delimiter);
while (rowScanner.hasNext()) {
values.add(rowScanner.next());
}
}
return values;
}
Note that the example file is separated by commas, but your code uses a semicolon as the delimiter.
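One more thing to watch: the comparator above compares the column as strings, so "10" would sort before "5". If the third field is always an integer, a numeric comparison may be what you want. A minimal sketch under that assumption, working on lines already read into a list; the class name CsvSort, the method sortByIntColumn, and the third sample row are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Pattern;

final class CsvSort {

    // Returns a copy of the lines sorted by the integer in the given zero-based column.
    // Assumes every line has that column and that it parses as an int.
    static List<String> sortByIntColumn(List<String> lines, String delimiter, int column) {
        String regex = Pattern.quote(delimiter); // split() takes a regex, so quote the delimiter
        List<String> sorted = new ArrayList<>(lines);
        sorted.sort(Comparator.comparingInt(
                (String line) -> Integer.parseInt(line.split(regex)[column].trim())));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("1990,Brown,10", "1984,Smith,7", "1951,Jones,5");
        for (String line : sortByIntColumn(lines, ",", 2)) {
            System.out.println(line);
        }
    }
}
```

With the sample rows this prints the Jones line first, then Smith, then Brown.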

Use threading to process multiple files

I have a file on which I need to run a word-count function (based on MapReduce), but using threads. I split the file into multiple small files, then loop over the small files and count the occurrences of each word with a Reduce() function. How can I implement threads with the run() method so that they work with the Reduce function?
Here's my code:
public class WordCounter implements Runnable {
private String Nom;
protected static int Chunks = 1 ;
public WordCounter (String n) {
Nom = n;
}
public void split () throws IOException
{
File source = new File(this.Nom);
int maxRows = 100;
int i = 1;
try(Scanner sc = new Scanner(source)){
String line = null;
int lineNum = 1;
File splitFile = new File(this.Nom+i+".txt");
FileWriter myWriter = new FileWriter(splitFile);
while (sc.hasNextLine()) {
line = sc.nextLine();
if(lineNum > maxRows){
Chunks++;
myWriter.close();
lineNum = 1;
i++;
splitFile = new File(this.Nom+i+".txt");
myWriter = new FileWriter(splitFile);
}
myWriter.write(line+"\n");
lineNum++;
}
myWriter.close();
}
}
public void Reduce() throws IOException
{
ArrayList<String> words = new ArrayList<String>();
ArrayList<Integer> count = new ArrayList<Integer>();
for (int i = 1; i < Chunks; i++) {
//create the input stream (recevoir le texte)
FileInputStream fin = new FileInputStream(this.getNom()+i+".txt");
//go through the text with a scanner
Scanner sc = new Scanner(fin);
while (sc.hasNext()) {
//Get the next word
String nextString = sc.next();
//Determine if the string exists in words
if (words.contains(nextString)) {
int index = words.indexOf(nextString);
count.set(index, count.get(index)+1);
}
else {
words.add(nextString);
count.add(1);
}
}
sc.close();
fin.close();
}
// Creating a File object that represents the disk file.
FileWriter myWriter = new FileWriter(new File(this.getNom()+"Result.txt"));
for (int i = 0; i < words.size(); i++) {
myWriter.write(words.get(i)+ " : " +count.get(i) +"\n");
}
myWriter.close();
//delete the small files
deleteFiles();
}
public void deleteFiles()
{
File f= new File("");
for (int i = 1; i <= Chunks; i++) {
f = new File(this.getNom()+i+".txt");
f.delete();
}
}
}
It is better to implement Callable instead of Runnable, because a Callable can return a result, which lets you retrieve your data.
To fix your code, you can do something more or less like this:
public class WordCounter {
private static ExecutorService threadPool = Executors.newFixedThreadPool(5); // 5 represents the number of concurrent threads.
public Map<String, Integer> count(String filename) throws IOException, InterruptedException, ExecutionException {
int chunks = splitFileInChunks(filename);
List<Future<Report>> reports = new ArrayList<Future<Report>>();
for (int i=1; i<=chunks; i++) {
Callable<Report> callable = new ReduceCounter(filename + i + ".txt");
Future<Report> future = threadPool.submit(callable);
reports.add(future);
}
Map<String, Integer> finalMap = new HashMap<>();
for (Future<Report> future : reports) {
Map<String, Integer> map = future.get().getWords();
for (Map.Entry<String, Integer> entry : map.entrySet()) {
int oldCnt = finalMap.get(entry.getKey()) != null ? finalMap.get(entry.getKey()) : 0;
finalMap.put(entry.getKey(), entry.getValue() + oldCnt);
}
}
// return a map with the key being the word and the value the counter for that word
return finalMap;
}
// this method doesn't need to be run on the separate thread
private int splitFileInChunks(String filename) throws IOException { .... }
}
public class Report {
Map<String, Integer> words = new HashMap<>();
// ... getter, setter, constructor etc
}
public class ReduceCounter implements Callable<Report> {
String filename;
public ReduceCounter(String filename) { this.filename = filename;}
public Report call() {
// store the values in a Map<String, Integer> since it's easier that way
Map<String, Integer> myWordsMap = new HashMap<String, Integer>();
// here add the logic from your Reduce method, without the for loop iteration
// you should add logic to read only the file named with the value from "filename"
return new Report(myWordsMap);
}
}
Please note you can skip the Report class and return Future<Map<String,Integer>>, but I used Report to make it easier to follow.
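For reference, here is a minimal self-contained sketch of the same Callable idea without the Report class. To keep it runnable without files on disk, it assumes the chunks are already in memory as strings rather than split files; the names ParallelWordCount, countChunk and countAll are made up for illustration, and the pool size of 4 is arbitrary:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelWordCount {

    // Counts the words of one chunk. Each task owns its own map, so no locking is needed.
    static Map<String, Integer> countChunk(String chunk) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : chunk.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Submits one task per chunk, then merges the partial maps on the calling thread.
    static Map<String, Integer> countAll(List<String> chunks)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Map<String, Integer>>> futures = new ArrayList<>();
            for (String chunk : chunks) {
                Callable<Map<String, Integer>> task = () -> countChunk(chunk);
                futures.add(pool.submit(task));
            }
            Map<String, Integer> total = new HashMap<>();
            for (Future<Map<String, Integer>> future : futures) {
                // get() blocks until that task has finished
                future.get().forEach((word, n) -> total.merge(word, n, Integer::sum));
            }
            return total;
        } finally {
            pool.shutdown(); // otherwise the pool threads keep the JVM alive
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countAll(List.of("a b a", "b c", "a")));
    }
}
```

The merge step runs on a single thread, so a plain HashMap is enough for the final result.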
Update for Runnable as requested by user
public class WordCounter {
public Map<String, Integer> count(String filename) throws InterruptedException {
int chunks = splitFileInChunks(filename);
List<ReduceCounter> counters = new ArrayList<>();
List<Thread> reducerThreads = new ArrayList<>();
for (int i=1; i<=chunks; i++) {
ReduceCounter rc = new ReduceCounter(filename + i + ".txt");
Thread t = new Thread(rc);
counters.add(rc);
reducerThreads.add(t);
t.start();
}
// next wait for the threads to finish processing
for (Thread t : reducerThreads) {
t.join();
}
// now grab the results from each of them
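Map<String, Integer> finalMap = new HashMap<>();
for (ReduceCounter cnt : counters) {
// merge this counter's results into the final map
cnt.getWords().forEach((word, n) -> finalMap.merge(word, n, Integer::sum));
}
return finalMap;
}
}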
Reducer class should look like:
public class ReduceCounter implements Runnable {
String filename;
Map<String, Integer> words = new HashMap<>();
public ReduceCounter(String filename) { this.filename = filename;}
public void run() {
// store the values in the "words" map
// here add the logic from your Reduce method, without the for loop iteration
// also read, only the file named with the value from "filename"
}
public Map<String, Integer> getWords() {return words;}
}
I more or less found a solution: I assign a thread to each small file, then call the Reduce() function inside the run() function. I still don't fully have my head around it, though. Here's the code:
public void Reduce() throws IOException
{
ArrayList<String> words = new ArrayList<String>();
ArrayList<Integer> count = new ArrayList<Integer>();
Thread TT= new Thread();
for (int i = 1; i < Chunks; i++) {
//create the input stream (recevoir le texte)
FileInputStream fin = new FileInputStream(this.getNom()+i+".txt");
TT=new Thread(this.getNom()+i+".txt");
TT.start();
//go through the text with a scanner
Scanner sc = new Scanner(fin);
while (sc.hasNext()) {
//Get the next word
String nextString = sc.next();
//Determine if the string exists in words
if (words.contains(nextString)) {
int index = words.indexOf(nextString);
count.set(index, count.get(index)+1);
}
else {
words.add(nextString);
count.add(1);
}
}
sc.close();
fin.close();
}
// Creating a File object that represents the disk file.
FileWriter myWriter = new FileWriter(new File(this.getNom()+"Result.txt"));
for (int i = 0; i < words.size(); i++) {
myWriter.write(words.get(i)+ " : " +count.get(i) +"\n");
}
myWriter.close();
//Store the result in the new file
deleteFiles();
}
public void run() {
try {
this.Reduce();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
public static void main(String[] args) throws IOException {
WordCounter w1 = new WordCounter("Words.txt");
Thread T1= new Thread(w1);
T1.start();
}

Converting lines of text from a file to String Array

I read the information in a .txt file and now I would like to store the lines of information from the text into a String Array or a variable.
The information in the .txt file is as given:
Onesimus, Andrea
BAYV
Twendi, Meghan
RHHS
Threesten, Heidi
MDHS
I want to store BAYV, RHHS, MDHS into a different array from the names.
import java.io.File;
import java.util.Scanner;
class testing2 {
public static void main(String [] args) throws Exception {
File Bayviewcamp = new File ("H:\\Profile\\Desktop\\ICS3U\\Bayviewland Camp\\Studentinfo.txt");
Scanner scanner = new Scanner (Bayviewcamp);
while (scanner.hasNextLine())
System.out.println(scanner.nextLine());
}
}
Check whether each name matches the regex "[A-Z]+":
List<String> upperCaseList = new ArrayList<>();
List<String> lowerCaseList = new ArrayList<>();
while (scanner.hasNextLine()) {
String[] names = scanner.nextLine().split(",");
for(String name:names) {
if(name.matches("[A-Z]+")) {
upperCaseList.add(name);
}
else {
lowerCaseList.add(name);
}
}
}
As per your example, some of the names have leading spaces. You may have to trim those spaces before you compare them with the regex:
for(String name:names) {
if(name.trim().matches("[A-Z]+")) {
upperCaseList.add(name.trim());
}
else {
lowerCaseList.add(name.trim());
}
}
The code below has a few restrictions:
The file must have the format you described (a name line followed by a value line).
The array size is 100 by default, but you can change it as you like.
By "name" I mean one whole line: (Onesimus, Andrea) is stored at the first index of the names array.
private static final int ARRAY_LENGTH = 100;
public static void main(String[] args) throws FileNotFoundException {
boolean isValue = false;
File txt = new File("file.txt");
Scanner scanner = new Scanner(txt);
String[] names = new String[ARRAY_LENGTH];
String[] values = new String[ARRAY_LENGTH];
int lineNumber = 0;
while (scanner.hasNextLine()) {
if (isValue) {
values[lineNumber / 2] = scanner.nextLine();
} else {
names[lineNumber / 2] = scanner.nextLine();
}
isValue = !isValue;
lineNumber++;
}
for (int i = 0; i < ARRAY_LENGTH; i++) {
System.out.println(names[i]);
System.out.println(values[i]);
}
}
The code below stores the first and last names separately:
private static final int ARRAY_LENGTH = 100;
public static void main(String[] args) throws FileNotFoundException {
boolean isValue = false;
File txt = new File("file.txt");
Scanner scanner = new Scanner(txt);
String[] names = new String[ARRAY_LENGTH];
String[] values = new String[ARRAY_LENGTH];
int namesNumber = 0;
int valuesNumber = 0;
while (scanner.hasNextLine()) {
if (!isValue) {
String tempArrayNames[] = scanner.nextLine().split(",");
names[namesNumber++] = tempArrayNames[0].trim();
names[namesNumber++] = tempArrayNames[1].trim();
} else {
values[valuesNumber++] = scanner.nextLine();
}
isValue = !isValue;
}
}
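If you would rather not fix the array size up front, a list can grow as needed and be converted to a String array at the end. A minimal sketch, assuming the lines are already read into a list and that names and codes strictly alternate as in the question; the names AlternatingLines and everySecondLine are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AlternatingLines {

    // Collects every second line, starting at index 'first' (0 = names, 1 = codes).
    static String[] everySecondLine(List<String> lines, int first) {
        List<String> picked = new ArrayList<>();
        for (int i = first; i < lines.size(); i += 2) {
            picked.add(lines.get(i).trim());
        }
        return picked.toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<String> lines = List.of("Onesimus, Andrea", "BAYV",
                "Twendi, Meghan", "RHHS", "Threesten, Heidi", "MDHS");
        System.out.println(Arrays.toString(everySecondLine(lines, 0)));
        System.out.println(Arrays.toString(everySecondLine(lines, 1))); // prints [BAYV, RHHS, MDHS]
    }
}
```

For a real file, the list could come from java.nio.file.Files.readAllLines instead of the hard-coded sample.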

Manipulating strings and integers via two dimensional array from an external file java

I am trying to design a program that takes data from an external file, stores the values in arrays, and then allows for manipulation. Sample input:
String1 intA1 intA2
String2 intB1 intB2
String3 intC1 intC2
String4 intD1 intD2
String5 intE1 intE2
I want to be able to take these values from the array and manipulate them as follows:
For each string, I want to take StringX and compute ((intX1 + intX2)/)
For each int column, I want to compute, for example, (intA1 + intB1 + intC1 + intD1 + intE1)
This is what I have so far, any tips?
**please note java naming conventions have not been taught in my course yet.
public class 2D_Array {
public static void inputstream(){
File file = new File("data.txt");
try (FileInputStream fis = new FileInputStream(file)) {
int content;
while ((content = fis.read()) != -1) {
readLines("data.txt");
FivebyThree();
System.out.print((char) content);
}
} catch (IOException e) {
e.printStackTrace();
}
}
public static int FivebyThree() throws IOException {
Scanner sc = new Scanner(new File("data.txt"));
int[] arr = new int[10];
while(sc.hasNextLine()) {
String line[] = sc.nextLine().split("\\s");
int ele = Integer.parseInt(line[1]);
int index = Integer.parseInt(line[0]);
arr[index] = ele;
}
int sum = 0;
for(int i = 0; i<arr.length; i++) {
sum += arr[i];
System.out.print(arr[i] + "\t");
}
System.out.println("\nSum : " + sum);
return sum;
}
public static String[] readLines(String filename) throws IOException {
FileReader fileReader = new FileReader(filename);
BufferedReader bufferedReader = new BufferedReader(fileReader);
List<String> lines = new ArrayList<String>();
String line = null;
while ((line = bufferedReader.readLine()) != null)
{
lines.add(line);
}
return lines.toArray(new String[lines.size()]);
}
/* int[][] FivebyThree = new int[5][3];
int row, col;
for (row =0; row < 5; row++) {
for(col = 0; col < 3; col++) {
System.out.printf( "%7d", FivebyThree[row][col]);
}
System.out.println();*/
public static void main(String[] args)throws IOException {
inputstream();
}
}
I see that you read data.txt twice and do not use the result of the first read at all. I do not understand what you want to do with the String, but with the data in a two-dimensional structure, calculating the sum of each int column is very easy:
public class Array_2D {
static final class Item {
final String str;
final int val1;
final int val2;
Item(String str, int val1, int val2) {
this.str = str;
this.val1 = val1;
this.val2 = val2;
}
}
private static List<Item> readFile(Reader reader) throws IOException {
try (BufferedReader in = new BufferedReader(reader)) {
List<Item> content = new ArrayList<>();
String str;
while ((str = in.readLine()) != null) {
String[] parts = str.split(" ");
content.add(new Item(parts[0], Integer.parseInt(parts[1]), Integer.parseInt(parts[2])));
}
return content;
}
}
private static void FivebyThree(List<Item> content) {
StringBuilder buf = new StringBuilder();
int sum1 = 0;
int sum2 = 0;
for (Item item : content) {
// TODO do what you want with item.str
sum1 += item.val1;
sum2 += item.val2;
}
System.out.println("str: " + buf);
System.out.println("sum1: " + sum1);
System.out.println("sum2: " + sum2);
}
public static void main(String[] args) throws IOException {
List<Item> content = readFile(new InputStreamReader(Array_2D.class.getResourceAsStream("data.txt")));
FivebyThree(content);
}
}
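If you do want an actual two-dimensional array, as in the question title, here is a minimal sketch that keeps the int columns in an int[][] and computes both per-row and per-column sums. The divisor in the question's per-row formula is not given, so only the row sum is shown. The names TableSums, parseValues, rowSums and columnSums, and the sample values, are made up for illustration:

```java
import java.util.Arrays;
import java.util.List;

final class TableSums {

    // Parses lines of the form "label int int ..." into a matrix of the int columns.
    static int[][] parseValues(List<String> lines) {
        int[][] values = new int[lines.size()][];
        for (int r = 0; r < lines.size(); r++) {
            String[] parts = lines.get(r).trim().split("\\s+");
            values[r] = new int[parts.length - 1]; // parts[0] is the label
            for (int c = 1; c < parts.length; c++) {
                values[r][c - 1] = Integer.parseInt(parts[c]);
            }
        }
        return values;
    }

    // One sum per row, e.g. intA1 + intA2.
    static int[] rowSums(int[][] values) {
        int[] sums = new int[values.length];
        for (int r = 0; r < values.length; r++) {
            for (int v : values[r]) {
                sums[r] += v;
            }
        }
        return sums;
    }

    // One sum per column, e.g. intA1 + intB1 + ...; assumes all rows have equal length.
    static int[] columnSums(int[][] values) {
        if (values.length == 0) {
            return new int[0];
        }
        int[] sums = new int[values[0].length];
        for (int[] row : values) {
            for (int c = 0; c < row.length; c++) {
                sums[c] += row[c];
            }
        }
        return sums;
    }

    public static void main(String[] args) {
        int[][] values = parseValues(List.of("alpha 1 2", "beta 3 4", "gamma 5 6"));
        System.out.println(Arrays.toString(rowSums(values)));    // prints [3, 7, 11]
        System.out.println(Arrays.toString(columnSums(values))); // prints [9, 12]
    }
}
```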

Parse a file into an array and then search for words in the file, count the words in the file

I can parse the file and I can read out the contents of the file, but I am unable to search for a specific word in the file or count the number of words in the file:
Code below:
public class Manager{
private String[] textData;
private String path;
public String loadFile (String file_path){
return path= file_path;
}
public String [] openFile() throws IOException{
FileReader fr = new FileReader(path);
BufferedReader textReader = new BufferedReader (fr);
int numberOfLines = readLines();
textData = new String[numberOfLines];
int i;
for (i=0; i < numberOfLines; i++) {
textData[i] = textReader.readLine();
}
textReader.close( );
return textData;
}
int readLines() throws IOException{
FileReader file_to_read = new FileReader(path);
BufferedReader bf = new BufferedReader (file_to_read);
String aLine;
int numberOfLines = 0;
while ((aLine =bf.readLine()) !=null){
numberOfLines++;
}
bf.close();
return numberOfLines;
}
private int findText(String s){
for (int i = 0; i < textData.length; i++){
if (textData[i] != null && textData[i].equals(s)){
return i;
}
}
return -1;
}
public boolean contains(String s){
for(int i=0; i<textData.length; i++){
if(textData[i] !=null && textData[i].equals(s)){
return true;
}
}
return false;
}
public int count(){
int counter = 0;
for (int i = 0; i < textData.length; i++){
if (textData[i] != null) counter++;
}
return counter;
}
}
My other class:
public class Runner {
private String fileInput;
private Scanner scanner = new Scanner(System.in);
private boolean keepRunning= true;
private Manager m = new Manager();
public static void main(String[] args) throws Exception {
new Runner();
}
public Runner() throws Exception{
do {
System.out.println("--------------------------------------------------");
System.out.println("\t\t\t\tText Analyser");
System.out.println("--------------------------------------------------");
System.out.println("1)Parse a File");
System.out.println("2)Parse a URL");
System.out.println("3)Exit");
System.out.println("Select option [1-3]>");
String option = scanner.next();
if (option.equals("1")){
parseFile();
}else if(option.equals("2")){
parseUrl();
}else if(option.equals("3")){
keepRunning = false;
}else{
System.out.println("Please enter option 1-3!");
}
} while (keepRunning);
System.out.println("Bye!");
scanner.close();
}
private void parseFile()throws Exception{
String file_name;
System.out.print("What is the full file path name of the file you would like to parse?\n>>"); ////The user might enter in a path name like: "C:/Users/Freddy/Desktop/catDog.txt";
file_name = scanner.next();
try {
Manager file = new Manager();
file.loadFile(file_name);
String[] aryLines = file.openFile( );
int i;
for ( i=0; i < aryLines.length; i++ ) {
System.out.println( aryLines[ i ] ) ;
}
}
catch ( IOException e ) {
System.out.println( e.getMessage() );
}
do {
System.out.println("*** Parse a file or URL ***");
System.out.println("1)Search File");
System.out.println("2)Print Stats about File");
System.out.println("3)Exit");
System.out.println("Select option [1-3]>");
String option = scanner.next();
if (option.equals("1")){
}else if(option.equals("2")){
}else if(option.equals("3")){
keepRunning = false;
}else{
System.out.println("Please enter option 1-3!");
}
} while (keepRunning);
System.out.println("Bye!");
scanner.close();
}
private void parseUrl()throws Exception{
}
private void search() throws Exception{
do {
System.out.println("*** Search ***");
System.out.println("1)Does the file/URL contain a certain word");
System.out.println("2)Count all words in the file/url");
System.out.println("9)Exit");
System.out.println("Select option [1-9]>");
String choice = scanner.next(); //Get the selected item
if (choice.equals("1")){
contains();
}else if(choice.equals("2")){
count();
}else if(choice.equals("3")){
keepRunning = false;
}else{
System.out.println("Please enter option 1-3!");
}
} while (keepRunning);
System.out.println("Bye!");
scanner.close();
}
private void contains(){
System.out.println("*** Check to see if a certain Word/Letter appears in the file/URL ***");
System.out.println("Enter what you would like to search for:");
String s = scanner.next();
boolean sc = m.contains(s);
System.out.println("If its true its in the file, if its false its not in the file/URL");
System.out.println("The answer = " + sc);
}
private void count(){
int totalNumberOfElement = m.count();
System.out.println("Total number of elements in the file/Url is" + totalNumberOfElement );
}
Consider the following changes to your code:
1. Use a List (e.g. List<String> contents = new ArrayList<>()) to hold the lines read from the file.
2. Use contents.size() to get the number of lines. That is simpler.
3. You are using the equals() method to search for text in a line. Use the contains() or indexOf() methods instead. Better still, use a regex if you are familiar with them.
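The three points above can be sketched roughly as follows. This is only an illustration, assuming the lines are already in a List<String>; the names TextSearch, containsText and countWords are made up, and the case-insensitive comparison is an extra assumption rather than something the question requires:

```java
import java.util.List;
import java.util.Locale;

final class TextSearch {

    // True if any line contains the text, ignoring case (uses contains(), not equals()).
    static boolean containsText(List<String> lines, String text) {
        String needle = text.toLowerCase(Locale.ROOT);
        for (String line : lines) {
            if (line.toLowerCase(Locale.ROOT).contains(needle)) {
                return true;
            }
        }
        return false;
    }

    // Counts whitespace-separated words over all lines; blank lines count as zero.
    static int countWords(List<String> lines) {
        int total = 0;
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                total += trimmed.split("\\s+").length;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("The cat sat", "", "on the Dog");
        System.out.println(lines.size());                // number of lines: 3
        System.out.println(containsText(lines, "dog"));  // prints true
        System.out.println(countWords(lines));           // prints 6
    }
}
```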
I think the easiest way is something like the following:
public boolean contains(String s){
return textData.toLowerCase().contains(s.toLowerCase());
}
but that is only going to work if textData is a single String, not an array!
I've written some code below. See whether it helps you. Please don't copy-paste, try to learn the logic also.
import java.io.*;
public class Manager
{
String[] textData;
String path;
int numberOfWords;
public void loadFile(String path) throws Exception
{
this.path = path;
StringBuffer buffer = new StringBuffer();
BufferedReader reader = new BufferedReader(new FileReader(path));
String line;
String[] words;
numberOfWords = 0;
while((line=reader.readLine()) != null)
{
buffer.append(line + "\n");
words = line.split(" ");
numberOfWords = numberOfWords + words.length;
}
//deleting the last extra newline character
buffer.deleteCharAt(buffer.lastIndexOf("\n"));
textData = buffer.toString().split("\n");
}
public void printFile()
{
for(int i=0;i<textData.length;i++)
System.out.println(textData[i]);
}
public int findText(String text)
{
for(int i=0;i<textData.length;i++)
if(textData[i].contains(text))
return i;
return -1;
}
public boolean contains(String text)
{
for(int i=0;i<textData.length;i++)
if(textData[i].contains(text))
return true;
return false;
}
public int getNumberOfWords()
{
return numberOfWords;
}
public int getCount()
{
return textData.length;
}
}
