Using streams for parsing string from file - java

I am learning how to use the Java 8 Stream API. I have a simple log file with the following contents:
SVF2018-05-24_12:02:58.917
NHR2018-05-24_12:02:49.914
FAM2018-05-24_12:13:04.512
KRF2018-05-24_12:03:01.250
SVM2018-05-24_12:18:37.735
MES2018-05-24_12:04:45.513
LSW2018-05-24_12:06:13.511
BHS2018-05-24_12:14:51.985
EOF2018-05-24_12:17:58.810
RGH2018-05-24_12:05:14.511
SSW2018-05-24_12:16:11.648
KMH2018-05-24_12:02:51.003
PGS2018-05-24_12:07:23.645
CSR2018-05-24_12:03:15.145
SPF2018-05-24_12:12:01.035
DRR2018-05-24_12:14:12.054
LHM2018-05-24_12:18:20.125
CLS2018-05-24_12:09:41.921
VBM2018-05-24_12:00:00.000
My goal is to parse it using streams. The desired output is the following:
[{SVF = [2018-05-24, 12:02:58.917]}, {NHR = [2018-05-24, 12:02:49.914]}...]
I already have the following:
public class FileParser {
    Stream<String> outputStream;
    public FileParser(String fileName) throws IOException, URISyntaxException {
        FileReader fr = new FileReader();
        this.outputStream = fr.getStreamFromFile(fileName);
    }
    public List<HashMap<String,ArrayList<String>>> getRacersInfo(){
        return outputStream.map(line -> Arrays.asList(line.substring(0,3))
            .collect(Collectors.toMap(???)); //Some code here which I cannot come up with.
    }
}
Any help appreciated. If you need any additional information feel free to ask, I'll be glad to provide it.

Problem: FileReader
FileReader is obsolete, don't use it. It's an outdated API, and it's problematic in that it presumes the 'platform default encoding', which is a different way of saying 'a bug waiting to happen that no test will catch but that will blow up in your face later'. You never want 'platform default encoding', especially as a silent default.
There's a new file API (java.nio.file), and it lets you specify the encoding explicitly. Also, in the new API, if you don't, UTF-8 is assumed, which is a far saner default than 'platform default'.
Problem: resources
Resources are objects that represent something that takes up OS-level handles. Files, network connections, database connections: those are some common examples of resources. The thing is, unlike normal objects, you MUST explicitly CLOSE those. If you don't, your VM will, eventually, crash. That means you can basically never put readers/inputstreams/outputstreams/writers in fields, because how do you guarantee closing them? The only way is to make your own class a resource too (a thing that must explicitly be closed), which you can do, but it is complicated, and not a good idea here.
You should never make resources unless you do so safely:
bad:
FileReader fr = new FileReader(..);
good:
try (FileReader fr = new FileReader(..)) {
// use here
}
// it's gone here
It does require you to restyle things a bit. You have to open a resource, use the resource, and close it. This meshes well with pragmatic concerns: Resources are a drain on the OS, you don't want to keep em open any longer than you must, so 'open it, use it, and lose it' is the right mindset.
Furthermore, of course, resources as a concept are generally 'once-through-only'. For example, when reading a file, well, you read it, once, from the top to the bottom, and then any further attempts to read from it don't work anymore. So, in your example, the first time I call getRacersInfo(), it works. But the second time I call it, it won't, as the reader has now been consumed.
The solution to both problems is to do the reading in the constructor*.
*) See later - we're going to move this out of the constructor eventually, but that's a separate concern.
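The 'once-through-only' behaviour is easy to see with a plain java.util.stream.Stream, which is what the question keeps in a field. The following is only a minimal sketch; the class name SingleUseStreamDemo and its method are made up for illustration, and the two log lines are taken from the sample file.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SingleUseStreamDemo {
    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondUseFails() {
        Stream<String> lines = Stream.of(
                "SVF2018-05-24_12:02:58.917",
                "NHR2018-05-24_12:02:49.914");
        // First terminal operation: works and consumes the stream.
        List<String> first = lines.collect(Collectors.toList());
        try {
            // Second terminal operation on the same stream object.
            lines.collect(Collectors.toList());
            return false;
        } catch (IllegalStateException e) {
            // The stream has already been operated upon; 'first' still holds the data.
            return first.size() == 2;
        }
    }
}
```

Calling getRacersInfo() twice on the class from the question would run into the same IllegalStateException.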
Problem: Misunderstanding of responsibilities of constructors
This class is called a FileParser. So, its job is to parse files (that, or, this class has a bad name). Generally, your constructors represent the 'data gathering' phase, not the 'do the job' phase. Therefore, parsing the file in the first place, in that constructor, is bad code style. You should not do this - your constructors should as a rule do as little as possible and definitely nothing tricky, such as opening files or actually parsing things. Again - the JOB of a FileParser is to parse files, and constructors should not do the job. They just set up the object so that it can do the job later.
The proper design, then, is:
public class FileParser {
private final Path path;
public FileParser(Path path) {
this.path = path;
}
public List<Map<String, List<String>>> parseRacersInfo() throws IOException {
try (Stream<String> lines = Files.lines(path)) {
lines.map(.... parse the content here ....);
}
}
}
We have now:
Moved the 'job' part to a method that accurately describes the job.
Ensured the constructor is simple and just gathers information to do the job.
Safely use resources by applying the try(){} concept.
Use the new API (java.nio.file.Files and java.nio.file.Path).
Clarified our typing: That parameter to the constructor represents a path. If I call new FileParser("Hello, IceTea, how's life?") - that call makes no sense. Path is more descriptive than String, and if your method makes sense looking only at the types of the parameters? That's better than if you need to read the docs too.
Problem: Not using java the way it wants to be used
Java is typed. Nominally so. Things should be stored in types that represent that thing. Thus, the string 2018-05-24_12:18:20.125 should be represented by an object that represents a time of some sort. Not a List<String> containing the string 2018-05-24 and 12:18:20.125.
Finally: How do I actually write the mapping?
Streams work by zooming in on a single element in the stream, and doing a series of operations on these elements, transforming them, filtering some out, etcetera. You cannot 'go back' in the process (once you map a thing to another thing, you can't go back to what it used to be), and you can't refer to other objects in your stream (you can't ask: Give me the item before me in the stream).
Thus, once you go: line.substring(0, 3), you've thrown out the date, and that's a problem because we need that info. Therefore, you can't do that; not in a .map() operation, at any rate.
In fact, we can go straight to collecting the stream back into a map here - we need that entire string and we can derive the key from it (SVF), and we need that entire string and we can derive the value from it (the date).
Let's write these conversion functions, and let's translate our string representing a time to a proper (also new in java 8) type for it: java.time.LocalDateTime:
Function<String, String> toKey = in -> in.substring(0, 3);
DateTimeFormatter DATETIME_FORMAT =
DateTimeFormatter.ofPattern("uuuu-MM-dd_HH:mm:ss.SSS", Locale.ENGLISH);
Function<String, LocalDateTime> toValue = in ->
LocalDateTime.parse(in.substring(3), DATETIME_FORMAT);
These are simple and we can test them:
assertEquals("VBM", toKey.apply("VBM2018-05-24_12:00:00.000"));
assertEquals(LocalDateTime.of(2018, 5, 24, 12, 0, 0),
toValue.apply("VBM2018-05-24_12:00:00.000"));
Then we put it all together:
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.Files;
import java.io.IOException;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;
public class FileParser {
private static final DateTimeFormatter DATETIME_FORMAT =
DateTimeFormatter.ofPattern("uuuu-MM-dd_HH:mm:ss.SSS", Locale.ENGLISH);
private final Path path;
public FileParser(Path path) {
this.path = path;
}
public Map<String, LocalDateTime> parseRacersInfo() throws IOException {
try (Stream<String> lines = Files.lines(path)) {
return lines.collect(Collectors.toMap(
in -> in.substring(0, 3),
in -> LocalDateTime.parse(in.substring(3), DATETIME_FORMAT)));
}
}
public static void main(String[] args) throws Exception {
System.out.println(new FileParser(Paths.get("test.txt")).parseRacersInfo());
}
}
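One caveat with the two-argument Collectors.toMap: it throws an IllegalStateException if the same key appears twice, and it makes no promise about the order of the resulting map. If either matters, the four-argument overload can be used. The sketch below assumes that the first timestamp should win on a duplicate key and that file order should be kept; both are assumptions, not something the question states. The class name RacerLines and the method parse are made up, and the method takes a List of lines so that it can be tried without a file.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

class RacerLines {
    private static final DateTimeFormatter DATETIME_FORMAT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd_HH:mm:ss.SSS", Locale.ENGLISH);

    // Hypothetical helper: same key and value mappers as above,
    // plus a merge function and a map factory.
    static Map<String, LocalDateTime> parse(List<String> lines) {
        return lines.stream().collect(Collectors.toMap(
                in -> in.substring(0, 3),
                in -> LocalDateTime.parse(in.substring(3), DATETIME_FORMAT),
                (first, second) -> first,  // duplicate key: keep the first timestamp (assumption)
                LinkedHashMap::new));      // keep the order in which the lines were read
    }
}
```

With Files.lines, the same collector can be passed to lines.collect(...) inside the try block.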

Path path = Paths.get(fileName);
try (Stream<String> lines = Files.lines(path)) {
    Map<String, List<LocalDateTime>> map = lines.collect(Collectors.groupingBy(
        line -> line.substring(0, 3),
        Collectors.mapping(
            line -> LocalDateTime.parse(line.substring(3).replace('_', 'T')),
            Collectors.toList())));
}
groupingBy receives a key mapper (the classifier) and a downstream collector; mapping converts each line to a LocalDateTime before the values for a key are gathered into a list. Here I keep the Stream of lines as it is and do all the work in the collector.
Declare the result as a plain Map. Do not insist on a particular implementation such as HashMap, so that collect may return its own implementation. (In fact, you could supply a map factory if you need a specific implementation.)
(I used Files.lines, which defaults to UTF-8 encoding, but you can pass a Charset explicitly. I used Path because it is more general than File.)

Something like :
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;
public class Test {
public static void main(String[] args) {
String fileName = "C:\\Users\\Asmir\\Desktop\\input1.txt";
Map<String,List<String>> map = new HashMap<>();
try (Stream<String> stream = Files.lines(Paths.get(fileName))) {
map = stream
.collect(Collectors.toMap(s -> s.substring(0,3), s -> Arrays.asList(s.substring(3).split("_"))));
} catch (IOException e) {
e.printStackTrace();
}
System.out.println(map);
}
}

Related

Akka streams don't run when Source has large number of records

I'm trying to write a very simple introductory example of using Akka Streams. I'm attempting to basically create a stream that takes a range of integers as a source and filters out all the integers that are not prime, producing a stream of prime integers as its output.
The class that constructs the stream is rather simple; for that I have the following.
import akka.NotUsed;
import akka.actor.ActorSystem;
import akka.stream.javadsl.Flow;
import com.aparapi.Kernel;
import com.aparapi.Range;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
public class PrimeStream {
private final AverageRepository averageRepository = new AverageRepository();
private final ActorSystem actorSystem;
public PrimeStream(ActorSystem actorSystem) {
this.actorSystem = actorSystem;
}
public Flow<Integer, Integer, NotUsed> filterPrimes() {
return Flow.of(Integer.class).grouped(10000).mapConcat(PrimeKernel::filterPrimes).filter( v -> v != 0);
}
}
When I run the following test, it works fine.
private final ActorSystem actorSystem = ActorSystem.create("Sys");
@Test
public void testStreams() {
Flow<Integer, Integer, NotUsed> filterStream = new PrimeStream(actorSystem).filterPrimes();
Source<Integer, NotUsed> flow = Source.range(10000000, 10001000).via(filterStream);
flow.runForeach(System.out::println, ActorMaterializer.create(actorSystem));
}
However, when I increase the range by a factor of x10 by changing the line in the test to the following, it no longer works.
Source<Integer, NotUsed> flow = Source.range(10000000, 10010000).via(filterStream);
Now when the test runs, no exceptions are thrown, no warnings. It simply runs, then exits, without displaying any text to the console at all.
Just to be extra certain that the problem wasn't in my primality test itself, I ran the test over the same range without using Akka Streams, and it runs fine. The following code runs without a problem.
@Test
public void testPlain() {
List<Integer> in = IntStream.rangeClosed(10000000, 10010000).boxed().collect(Collectors.toList());
List<Integer> out = PrimeKernel.filterPrimes(in);
System.out.println(out);
}
Just for the sake of clarity, the primality test itself takes in a list of integers and sets any element in the list to 0 if it is not prime.
As suggested by @RamonJRomeroyVigil, if I remove the mapConcat part altogether but leave everything else the same, it does, in fact, print out 10,000 integers. However, if I leave everything the same but simply replace filterPrimes with a method that just returns its parameter untouched, then it doesn't print anything to the screen at all. I've also tried adding a println to the beginning of filterPrimes to debug it. Whenever no output is printed, the debugging statement is missing as well, so no attempt is even made to call filterPrimes at all.
runForeach returns a CompletionStage, so if you want to see all the numbers getting printed then you have to await on the CompletionStage otherwise the test function returns and the program terminates without the CompletionStage getting completed.
Example:
flow.runForeach(System.out::println, ActorMaterializer.create(actorSystem)).toCompletableFuture().join();
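The same effect can be reproduced without Akka. The sketch below is an analogy using only the JDK's CompletableFuture, not Akka code: the asynchronous work stands in for the running stream, and the class name AwaitCompletionDemo is made up. Without the join() call, the method could return before the background work has finished, which is what happens to the test method above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.atomic.AtomicInteger;

class AwaitCompletionDemo {
    // Starts asynchronous work and blocks until it has finished.
    // Returns how many items were processed.
    static int runAndWait(int items) {
        AtomicInteger processed = new AtomicInteger();
        CompletionStage<Void> done = CompletableFuture.runAsync(() -> {
            for (int i = 0; i < items; i++) {
                processed.incrementAndGet(); // stands in for System.out::println
            }
        });
        // Same idiom as in the answer: convert the stage and wait for it.
        done.toCompletableFuture().join();
        return processed.get();
    }
}
```

In a real test, a bounded wait such as get(timeout, unit) on the CompletableFuture may be preferable to an unbounded join().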

Tests becoming very long with configurations Selenium

It has been a while since I was last here, and I am just trying to re-familiarize myself with the test automation framework I have been working on. Maybe a stupid question, but I am going to throw it out there anyway as I think aloud.
Because I have introduced a config file which contains the path to an Excel file (which contains test data) and implemented a basic Excel reader to extract this data for testing, I am finding that a great deal of my initial test is taken up by all this setup.
For instance:
create an instance of a ReadPropertyFile class
Create an object of an ExcellDataConfig class and pass to it the location of the excel file from the config file
set the testcase id for this test to scan the excel file for where to start reading the data from the sheet - excel sheet contains markers
get the row / col locations from the sheet of all the interesting stuff I need for my test, e.g. username / password, or some other data
open the browser
in the case of running a test for multiple users, set up a for loop that iterates through the Excel sheet, logs in, and then does the actual test.
That is a lot of configuration. Is there a simpler way?
I have a separate TestBase class which contains the login class, and I thought I could somehow move this user login info there, but I am not sure if that is such a good idea.
I just don't want to get bogged down duplicating work. Does anyone have any high-level suggestions?
Here is a compilable (but not fully coded) quick-and-dirty example how you could design a base class for Selenium test classes. This follows the DRY principle (Don't Repeat Yourself).
The base class defines a login/logout method which would be called prior/after test execution of derived test classes.
Data is read from a Json file (based on javax.json) and used for locating elements (using the keys) and entering data (using the values). You can easily expand the code to support handling of other elements or location strategies (css, xpath).
Note that this example is not performance-optimised. But it is quick enough for a start and you could adapt it to your needs (e.g. eager data loading in a static context).
package myproject;
import java.io.*;
import java.util.*;
import javax.json.Json;
import javax.json.stream.JsonParser;
import javax.json.stream.JsonParser.Event;
import org.junit.*;
import org.openqa.selenium.*;
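import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.support.ui.*;
public class MyProjectBaseTest {
protected static WebDriver driver;
@Before
public void before() {
// assign the shared field; a local "WebDriver driver" here would leave the field null
driver = new FirefoxDriver();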
driver.get("http://myapp");
login();
}
@After
public void after() {
logout();
}
private void login() {
Map<String, String> data = readData("/path/to/testdata/login.json");
Set<String> keys = data.keySet();
for (String key : keys) {
WebDriverWait wait = new WebDriverWait(driver, 20L);
wait.until(ExpectedConditions.visibilityOfElementLocated(By.id(key)));
final WebElement we = driver.findElement(By.id(key));
if ("input".equals(we.getTagName())) {
we.clear();
we.sendKeys(data.get(key));
}
//else if "button".equals(we.getTagName())
}
}
private void logout() {
//logout code ...
}
private Map<String, String> readData(String filename) {
Map<String, String> data = new HashMap<String, String>();
InputStream is = null;
String key = null;
try {
is = new FileInputStream(filename);
JsonParser parser = Json.createParser(is);
while (parser.hasNext()) {
Event e = parser.next();
if (e == Event.KEY_NAME) {
key = parser.getString();
}
if (e == Event.VALUE_STRING) {
data.put(key, parser.getString());
}
}
parser.close();
}
catch (IOException e) {
//error handling
}
finally {
//close is
}
return data;
}
}
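The base class above re-reads the data file before every test. One way to get the "load once" behaviour mentioned in the note on performance is a small cache in front of the loader. This is only a sketch under assumptions: TestDataCache is a made-up class, and the loader is passed in as a function so that any reader (the JSON reader above, or an Excel reader) could be plugged in.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class TestDataCache {
    private final Function<String, Map<String, String>> loader;
    private final Map<String, Map<String, String>> cache = new ConcurrentHashMap<>();

    // 'loader' reads one data file and returns its key/value pairs.
    TestDataCache(Function<String, Map<String, String>> loader) {
        this.loader = loader;
    }

    // Runs the loader on the first request for a file; later calls reuse the result.
    Map<String, String> get(String filename) {
        return cache.computeIfAbsent(filename, loader);
    }
}
```

A base class could keep one such cache in a static field and call get(...) from login(), provided the loader method is made static or otherwise reachable from that field.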
All this "setup work" you've described is actually pretty common, and it is how the AAA (Arrange-Act-Assert) pattern really works:
a pattern for arranging and formatting code in UnitTest methods
For advanced Fixture usage you could utilize the most suitable for your case xUnit Setup pattern.
I totally agree with @Würgspaß's comment. What he is describing is called Object Map and I've used it heavily in the past 3 years with great success across multiple Automation projects.
I don't see in your scenario any usage of specific framework, so I would suggest that you pick some mature one, like TestNG in combination with Cucumber JVM. The last one will provide a Context injection, so you can get always clean step definition objects that can share context/state during the scenario run. And you will be able to reuse all the heavy setup just once and share it between all the tests. I/O operations are expensive and may cause issues in more complex cases, e.g. parallel execution of your tests.
As for the design of your code, you can find some of the Selenium's Test design considerations very useful, like the CallWrappers.

Google Guava - Filter a Collection by the value of one of its elements' properties relative to another element's

Really poorly worded title for my first question here, but hopefully I'll still get an answer to it!
What I would like to have is to be able to chain what the filterIt-method in the following code-snippet does into my existing FluentIterable.
I'm VERY new to Guava (at least the functional programming part of it), so please bear with me.
import com.google.common.collect.Sets;
import org.joda.time.DateTime;
import org.joda.time.Days;
import java.util.Set;
public class Blah {
private DateTime date;
private Blah(final DateTime date) {
this.date = date;
}
public static void main(String[] args) {
Set<Blah> blahs = Sets.newHashSet(
new Blah(DateTime.now()),
new Blah(DateTime.now().minusDays(10)),
new Blah(DateTime.now().minusDays(21)),
new Blah(DateTime.now().minusDays(15))
);
Set<Blah> filteredBlahs = filterIt(blahs);
final int filtered = blahs.size() - filteredBlahs.size();
System.out.println(filtered + " results were filtered out");
}
private static Set<Blah> filterIt(final Set<Blah> blahs) {
final Set<Blah> filteredBlahs = Sets.newHashSet();
for (Blah currentBlah : blahs) {
final DateTime currentDate = currentBlah.date;
for (Blah blah : blahs) {
if (blah != currentBlah && !filteredBlahs.contains(blah)) {
final Days days = Days.daysBetween(currentDate, blah.date);
if (Math.abs(days.getDays()) < 5) {
filteredBlahs.add(currentBlah);
filteredBlahs.add(blah);
}
}
}
}
return filteredBlahs;
}
}
This code is written quickly as an example for what I want implemented. My problem is that I want this type of filtering to happen in the middle of some other transformations, and being able to chain it instead of splitting it up into different Iterables would make the flow more understandable at a glance.
Any feedback on how I can better the question, or clarify it, would be very much welcome!
NO.
To elaborate: You could write Functions to extract the date, you could write Predicates to test them, but even if you've found the way to stick them together, it would be a mess because of all those caveats. You'd have to wait for JDK8 in order to make something at least remotely readable out of it.
The other thing is that you're testing pairs of Blahs (+1 for the name), which goes really too far for a general purpose library. Imagine the myriads of methods like this.
The last thing is that what you're doing is not really functional: Your condition depends on the filteredBlahs list, which changes during the iteration. That's fine, if you need it, but converting this into something functionally-looking would be an obfuscation.
Predicates used for filtering really shouldn't change in the process, otherwise you can run in undefined or confusing behavior like in this issue.
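For comparison, here is roughly what a chainable version could look like with Java 8 streams. This is a sketch under assumptions, not a Guava solution: it uses java.time.LocalDate instead of Joda-Time, works on bare dates instead of Blah objects, and expresses the condition as "has at least one other date fewer than N days away", which is my reading of what filterIt is meant to compute. The class and method names are made up. The predicate only reads the input set, so it does not change during the iteration.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.Set;
import java.util.stream.Collectors;

class CloseDates {
    // Keeps every date that has at least one *other* date fewer than maxDays away.
    // Quadratic in the size of the set, like the nested loops in the question.
    static Set<LocalDate> withNeighbourWithin(Set<LocalDate> dates, long maxDays) {
        return dates.stream()
                .filter(d -> dates.stream()
                        .anyMatch(other -> !other.equals(d)
                                && Math.abs(ChronoUnit.DAYS.between(d, other)) < maxDays))
                .collect(Collectors.toSet());
    }
}
```

Because the filter step is an ordinary stream operation, further map or filter calls can be chained before the final collect.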

How to access inline text file using Java?

A program i am working on deals with processing file content. Right now i am writing jUnit tests to make sure things work as expected. As part of these tests, i'd like to reference an inline text file which would define the scope of a particular test.
How do i access such a file?
--
Let me clarify:
Normally, when opening a file, you need to indicate where the file is. What I want to say instead is "in this project". This way, when someone else looks at my code, they too will be able to access the same file. I may be wrong, but isn't there a special way to access files which are a part of "this" project, as opposed to "some files out there on disk"?
If what you mean is you have a file you need your tests to be able to read from, if you copy the file into the classpath your tests can read it using Class.getResourceAsStream().
For an example try this link (Jon Skeet answered a question like this):
read file in classpath
You can also implement your own test classes for InputStream or what have you.
package thop;
import java.io.InputStream;
/**
*
* @author tonyennis
*/
public class MyInputStream extends InputStream {
private char[] input;
private int current;
public MyInputStream(String s) {
input = s.toCharArray();
current = 0;
}
public int read() {
return (current == input.length) ? -1 : input[current++];
}
@Override
public void close() {
}
}
This is a simple InputStream. You give it a string, it gives you the string. If the code you wanted to test required an InputStream, you could use this (or something like it, heh) to feed exactly the strings wanted to test. You wouldn't need resources or disk files.
Here I use my lame class as input to a BufferedInputStream...
package thop;
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
/**
*
* @author tonyennis
*/
public class Main {
/**
* @param args the command line arguments
*/
public static void main(String[] args) throws IOException {
InputStream is = new MyInputStream("Now is the time");
BufferedInputStream bis = new BufferedInputStream(is);
int res;
while((res = bis.read()) != -1) {
System.out.println((char)res);
}
}
}
Now, if you want to make sure your program parses the inputStream correctly, you're golden. You can feed it the string you want to test with no difficulty. If you want to make sure the class being tested always closes the InputStream, add a "isOpen" boolean instance variable, set it to true in the constructor, set it to false in close(), and add a getter.
Now your test code would include something like:
MyInputStream mis = new MyInputStream("first,middle,last");
classBeingTested.testForFullName(mis);
assertFalse(mis.isOpen());
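The isOpen idea described above could be sketched as follows. Instead of extending the hand-written MyInputStream, this variant builds on the JDK's ByteArrayInputStream, which already serves bytes from memory; that swap is my choice, and the class name TrackingInputStream is made up. The string is encoded as UTF-8 here, which is an assumption about what the code under test expects.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

class TrackingInputStream extends ByteArrayInputStream {
    private boolean open = true;

    TrackingInputStream(String content) {
        super(content.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public void close() {
        // ByteArrayInputStream holds no OS resource, so only record the call.
        open = false;
    }

    public boolean isOpen() {
        return open;
    }
}
```

It can be passed anywhere an InputStream is expected, and the test can check isOpen() after the call returns.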

How to write a hashtable<string, string > in to text file,java?

I have a Hashtable.
htmlcontent is the HTML string of urlstring.
I want to write the Hashtable into a text file.
Can anyone suggest a solution?
How about one row for each entry, and two strings separated by a comma? Sort of like:
"key1","value1"
"key2","value2"
...
"keyn","valuen"
keep the quotes and you can write out keys that refer to null entries too, like
"key", null
To actually produce the table, you might want to use code similar to:
public void write(OutputStreamWriter out, Hashtable<String, String> table)
throws IOException {
String eol = System.getProperty("line.separator");
for (String key: table.keySet()) {
out.write("\"");
out.write(key);
out.write("\",\"");
out.write(String.valueOf(table.get(key)));
out.write("\"");
out.write(eol);
}
out.flush();
}
For the I/O part, you can use a new PrintWriter(new File(filename)). Just call the println methods like you would System.out, and don't forget to close() it afterward. Make sure you handle any IOException gracefully.
If you have a specific format, you'd have to explain it, but otherwise a simple for-each loop on the Hashtable.entrySet() is all you need to iterate through the entries of the Hashtable.
By the way, if you don't need the synchronized feature, a HashMap<String,String> would probably be better than a Hashtable.
Related questions
Java io ugly try-finally block
Java hashmap vs hashtable
Iterate Over Map
Here's a simple example of putting things together, but omitting a robust IOException handling for clarity, and using a simple format:
import java.io.*;
import java.util.*;
public class HashMapText {
public static void main(String[] args) throws IOException {
//PrintWriter out = new PrintWriter(System.out);
PrintWriter out = new PrintWriter(new File("map.txt"));
Map<String,String> map = new HashMap<String,String>();
map.put("1111", "One");
map.put("2222", "Two");
map.put(null, null);
for (Map.Entry<String,String> entry : map.entrySet()) {
out.println(entry.getKey() + "\t=>\t" + entry.getValue());
}
out.close();
}
}
Running this on my machine generates a map.txt containing three lines:
null => null
2222 => Two
1111 => One
As a bonus, you can use the first declaration and initialization of out, and print the same to standard output instead of a text file.
See also
Difference between java.io.PrintWriter and java.io.BufferedWriter?
java.io.PrintWriter API
Methods in this class never throw I/O exceptions, although some of its constructors may. The client may inquire as to whether any errors have occurred by invoking checkError().
For text representation, I would recommend picking a few characters that are very unlikely to occur in your strings, then outputting a CSV format file with those characters as separators, quotes, terminators, and escapes. Essentially, each row (as designated by the terminator, since otherwise there might be line-ending characters in either string) would have as the first CSV "field" the key of an entry in the hashtable, as the second field, the value for it.
A simpler approach along the same lines would be to designate one arbitrary character, say the backslash \, as the escape character. You'll have to double up backslashes when they occur in either string, and express in escape-form the tab (\t) and line-end (\n); then you can use a real (not escape-sequence) tab character as the field separator between the two fields (key and value), and a real (not escape-sequence) line-end at the end of each row.
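A minimal sketch of that backslash-escaping scheme, assuming only tab and \n need escaping (a carriage return is not handled) and that neither keys nor values are null. The class name TabSeparatedWriter and its methods are made up for illustration.

```java
import java.util.Map;

class TabSeparatedWriter {
    // Doubles backslashes first, then replaces tab and newline with escape forms.
    static String escape(String s) {
        return s.replace("\\", "\\\\")
                .replace("\t", "\\t")
                .replace("\n", "\\n");
    }

    // One entry per row: escaped key, a real tab, escaped value, a real line end.
    static String format(Map<String, String> table) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> e : table.entrySet()) {
            out.append(escape(e.getKey()))
               .append('\t')
               .append(escape(e.getValue()))
               .append('\n');
        }
        return out.toString();
    }
}
```

The resulting string can be written with any Writer. Reading it back means splitting each row at the first real tab and undoing the escapes in reverse.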
You can try
public static void save(String filename, Map<String, String> hashtable) throws IOException {
Properties prop = new Properties();
prop.putAll(hashtable);
FileOutputStream fos = new FileOutputStream(filename);
try {
prop.store(fos, null); // second argument is an optional comment string, not the Properties object
} finally {
fos.close();
}
}
This stores the hashtable (or any Map) as a properties file. You can use the Properties class to load the data back in again.
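Loading the data back could look like the sketch below. It round-trips through a String with StringWriter and StringReader so that it can be tried without touching the disk; the class name PropertiesRoundTrip and both method names are made up. Note that Properties does not accept null keys or values.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

class PropertiesRoundTrip {
    // Serializes the map in properties format.
    static String toText(Map<String, String> map) {
        Properties prop = new Properties();
        prop.putAll(map);
        StringWriter out = new StringWriter();
        try {
            prop.store(out, null); // null: no comment header
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    // Parses properties-format text back into a plain map.
    static Map<String, String> fromText(String text) {
        Properties prop = new Properties();
        try {
            prop.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> map = new HashMap<>();
        for (String name : prop.stringPropertyNames()) {
            map.put(name, prop.getProperty(name));
        }
        return map;
    }
}
```

For a file, the same store and load calls accept a FileOutputStream and a FileInputStream instead.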
import java.io.*;
import java.util.Hashtable;
class FileWrite
{
    public static void main(String args[])
    {
        Hashtable<String, String> table = new Hashtable<>(); // get or populate the table here
        // try-with-resources flushes and closes the writer, even on failure
        try (BufferedWriter writer = new BufferedWriter(new FileWriter("out.txt"))) {
            writer.write(table.toString());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Since you don't have any requirements for the file format, I would not create a custom one. Just use something standard. I would recommend using JSON for that!
Alternatives include xml and csv but I think that json is the best option here. Csv doesn't handle complex types like having a list in one of the keys of your map and xml can be quite complex to encode/decode.
Using json-simple as example:
String serialized = JSONValue.toJSONString(yourMap);
and then just save the string to your file (which is not specific to your domain either), for example using Apache Commons IO:
FileUtils.writeStringToFile(new File(yourFilePath), serialized);
To read the file:
Map map = (JSONObject) JSONValue.parse(FileUtils.readFileToString(new File(yourFilePath)));
You can use other json library as well but I think this one fits your need.
