finding character count between two special symbols - java

I am trying to count the characters between = and the newline character \n using the Java code below, but \n is not being matched in my case.
I am using the org.apache.commons.lang3.StringUtils class.
Here is my Java code:
public class CharCountInLine {
public static void main(String[] args)
{
BufferedReader reader = null;
try
{
reader = new BufferedReader(new FileReader("C:\\wordcount\\sample.txt"));
String currentLine = reader.readLine();
String[] line = currentLine.split("=");
while (currentLine != null ){
String res = StringUtils.substringBetween(currentLine, "=", "\n"); // \n is not working.
if(res != null) {
System.out.println("line -->"+res.length());
}
currentLine = reader.readLine();
}
}
catch (IOException e)
{
e.printStackTrace();
}
finally
{
try
{
reader.close();
}
catch (IOException e)
{
e.printStackTrace();
}
}
}
}
Please find my sample text file.
sample.txt
Karthikeyan=123456
sathis= 23546
Arun = 23564

Well, you're reading the string using readLine(), which according to the Javadoc (emphasis mine):
Returns:
A String containing the contents of the line, not including
any line-termination characters, or null if the end of the stream has
been reached
So your code doesn't work because the string does not contain a newline character.
You can address this in a number of ways:
Use StringUtils.substringAfter() instead of StringUtils.substringBetween().
If it meets the requirements, treat your file as a Java properties file so you don't need to parse it yourself.
Use String.split().
Use String.lastIndexOf().
Some simple regex matching and grouping.
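As a minimal sketch of the index-based options in the list above, the snippet below uses only the JDK: it finds the first = with String.indexOf() and measures what follows. The class name ValueLengthSketch, the helper valueLength and the decision to trim whitespace before counting are assumptions made for illustration, not part of the original code.

```java
class ValueLengthSketch {

    // Returns the number of characters after the first '=' in the line,
    // or -1 if the line contains no '='.
    // Assumption: whitespace around the value should not be counted.
    static int valueLength(String line) {
        int eq = line.indexOf('=');
        if (eq < 0) {
            return -1;
        }
        return line.substring(eq + 1).trim().length();
    }

    public static void main(String[] args) {
        // Lines as readLine() would return them, i.e. without the trailing newline.
        String[] sample = {"Karthikeyan=123456", "sathis= 23546", "Arun = 23564"};
        for (String line : sample) {
            System.out.println(line + " --> " + valueLength(line));
        }
    }
}
```

For the three sample lines this prints the lengths 6, 5 and 5. Drop the trim() call if spaces after = should count.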

You don't need to change how you read the lines, simply change your logic to extract the text after =.
Pattern p = Pattern.compile("(?:.+)=(.+)$");
Matcher m = p.matcher("Karthikeyan=123456");
if (m.find()) {
System.out.println(m.group(1).length());
}
No need for Apache StringUtils either, simple Java regex will do. If you don't want to count whitespace, trim the string before calling length().
Alternatively, you can also split the line around = as discussed here.
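With java.nio the file reading itself becomes much shorter:
Path p = Paths.get("C:\\wordcount\\sample.txt");
// Files.lines() holds the file open, so close the stream with try-with-resources
try (Stream<String> lines = Files.lines(p)) {
lines.forEach(line -> {
// Put the above code here
});
}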

Related

Java - Reading a text file when a certain sequence occurs

I haven't been able to find a way to read from a .txt file when a certain sequence occurs.
This is what an entry in my file looks like:
&1551:John:Packard:83:Heavy:Blonde&
I want my file to be read from &1551 (1551 is the unique ID number of the user) until the next "&". Do you guys have any suggestions as to how to accomplish this? The ":" is later used for splitting the string.
Thanks!
A simple JDK Scanner has the ability to read a file stopping at certain patterns:
public String findWithinHorizon(String pattern,
int horizon)
Attempts to find the next occurrence of the specified pattern.
public Scanner skip(Pattern pattern)
Skips input that matches the specified pattern, ignoring delimiters. This method will skip input if an anchored match of the specified pattern succeeds.
If a match to the specified pattern is not found at the current position, then no input is skipped and a NoSuchElementException is thrown.
So this should be enough:
// skip anything up to "&1551:" (but keep "1551:" for the next read)
Pattern toSkip = Pattern.compile(".*?&(?=1551:)", Pattern.DOTALL);
sc.skip(toSkip);
// get everything starting at "1551:" up to the next "&" on the same line
String line = sc.findWithinHorizon(".*?(?=&)", 0);
If line breaks can occur between the & signs, compile the second pattern with the DOTALL flag as well, as I did for toSkip.
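The Scanner approach can be put together as a small, self-contained sketch. It assumes entries are delimited by & as in the question, and it reads from an in-memory String instead of a file so that it runs without external data. The class name EntryScannerSketch and the helper findEntry are hypothetical.

```java
import java.util.NoSuchElementException;
import java.util.Scanner;
import java.util.regex.Pattern;

class EntryScannerSketch {

    // Returns the text of the entry that starts with "&<id>:", from the id up to
    // (not including) the next '&', or null if no such entry exists.
    static String findEntry(String input, String id) {
        try (Scanner sc = new Scanner(input)) {
            // Skip everything up to and including the '&' that precedes "<id>:".
            Pattern toSkip = Pattern.compile(".*?&(?=" + Pattern.quote(id) + ":)", Pattern.DOTALL);
            try {
                sc.skip(toSkip);
            } catch (NoSuchElementException e) {
                return null; // skip() throws when the pattern does not match
            }
            // Read up to the closing '&'; horizon 0 means "no limit".
            return sc.findWithinHorizon("[^&]*(?=&)", 0);
        }
    }

    public static void main(String[] args) {
        String data = "&1001:Ann:Lee:30:Light:Brown&\n"
                    + "&1551:John:Packard:83:Heavy:Blonde&\n";
        String entry = findEntry(data, "1551");
        if (entry != null) {
            for (String field : entry.split(":")) {
                System.out.println(field);
            }
        }
    }
}
```

To read from a file, construct the Scanner from a File or Path instead of a String; the sample data and the second user are invented for the demonstration.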
First, extract the text between the leading & and the trailing &,
then split that text on :.
Assuming each entry is on its own line, the code below should work:
public static void main(String args[]) {
BufferedReader reader = null;
try {
reader = new BufferedReader(new FileReader("D://test.txt"));
String sCurrentLine;
String[] fields;
while ((sCurrentLine = reader.readLine()) != null) {
sCurrentLine = sCurrentLine.substring(sCurrentLine.indexOf('&') + 1);
sCurrentLine = sCurrentLine.substring(0, sCurrentLine.indexOf('&'));
fields = sCurrentLine.split(":");
for (String tmp : fields)
System.out.println(tmp);
}
} catch (Exception e) {
System.out.println("Error in accepting String");
} finally {
try {
reader.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
Hope this helps.

Java - Groovy : regex parse text block

I know that this is a common question, and I've been through a lot of forums trying to figure out what's wrong with my code.
I have to read a text file with several blocks in the following format:
import com.myCompanyExample.gui.Layout
/*some comments here*/
#Layout
LayoutModel currentState() {
MyBuilder builder = new MyBuilder()
form example
title form{
row_1
row_1
row_n
}
return build.get()
}
#Layout
LayoutModel otherState() {
....
....
return build.get()
}
I have this code to read all the file and I'd like to extract each block between the keyword "#Layout" and the keyword "return". I need also to catch all newline so later I'll be able to split each matched block into a list
private void myReadFile(File fileLayout){
String line = null;
StringBuilder allText = new StringBuilder();
try{
FileReader fileReader = new FileReader(fileLayout);
BufferedReader bufferedReader = new BufferedReader(fileReader);
while((line = bufferedReader.readLine()) != null) {
allText.append(line)
}
bufferedReader.close();
}
catch(FileNotFoundException ex) {
System.out.println("Unable to open file");
}
catch(IOException ex) {
System.out.println("Error reading file");
}
Pattern pattern = Pattern.compile("(?s)#Layout.*?return",Pattern.DOTALL);
Matcher matcher = pattern.matcher(allText);
while(matcher.find()){
String [] layoutBlock = (matcher.group()).split("\\r?\\n")
for(index in layoutBlock){
//check each line of the current block
}
}
layoutBlock returns size=1
I think this may be a so-called XY problem. In any case, if the Groovy source consists only of #Layout-annotated blocks of code, you can use a tempered greedy token to select everything up to the next annotation.
Change the pattern line to this:
Pattern pattern = Pattern.compile( "#Layout(?:(?!#Layout).)*", Pattern.DOTALL );
PS: the inline flag (?s) inside the regex and the Pattern.DOTALL parameter do the same thing (they enable dotall mode, in which . also matches line terminators), so use only one of them.
UPDATE
I tried your code. The problem (preserving newlines) lies in how you slurp the file: bufferedReader.readLine() strips the line terminator from the end of each line.
Simply re-add a newline when appending to allText:
String ln = System.lineSeparator();
while((line = bufferedReader.readLine()) != null) {
allText.append(line + ln);
}
Or you can replace all the code to slurp the file with this:
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
//can throw an IOException
String filePath = "/path/to/layout.groovy";
String allText = new String(Files.readAllBytes(Paths.get(filePath)),StandardCharsets.UTF_8);
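To see why the newlines matter, here is a small sketch that runs the tempered-greedy-token pattern over an in-memory string and splits each match into lines. The class name LayoutBlockSketch and the helper layoutBlocks are made up for the example, and the sample text is a shortened version of the file in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LayoutBlockSketch {

    // Splits the text into blocks that each start at "#Layout" and run up to
    // (not including) the next "#Layout" or the end of the text.
    // Each block is returned as an array of its lines.
    static List<String[]> layoutBlocks(String allText) {
        Pattern pattern = Pattern.compile("#Layout(?:(?!#Layout).)*", Pattern.DOTALL);
        Matcher matcher = pattern.matcher(allText);
        List<String[]> blocks = new ArrayList<>();
        while (matcher.find()) {
            blocks.add(matcher.group().split("\\r?\\n"));
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Newlines are preserved here, as they are after either fix above.
        String allText = String.join("\n",
                "import com.myCompanyExample.gui.Layout",
                "#Layout",
                "LayoutModel currentState() {",
                "return build.get()",
                "}",
                "#Layout",
                "LayoutModel otherState() {",
                "return build.get()",
                "}");
        for (String[] block : layoutBlocks(allText)) {
            System.out.println(block.length + " lines, starting with: " + block[0]);
        }
    }
}
```

If the same text is concatenated without line separators, each block still matches, but split() returns a single element, which is the size=1 symptom from the question.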

Assigning part of txt file to java variable

I have a txt file with the following output:
"CN=COUD111255,OU=Workstations,OU=Mis,OU=Accounts,DC=FLHOSP,DC=NET"
What I'm trying to do is read the COUD111255 part and assign it to a Java variable. I assigned ldap to sCurrentLine, but I'm getting a NullPointerException. Any suggestions?
try (BufferedReader br = new BufferedReader(new FileReader("resultofbatch.txt")))
{
final Pattern PATTERN = Pattern.compile("CN=([^,]+).*");
try {
while ((sCurrentLine = br.readLine()) != null) {
//Write the function you want to do, here.
String[] tokens = PATTERN.split(","); //This will return you a array, containing the string array splitted by what you write inside it.
//should be in your case the split, since they are seperated by ","
}
System.out.println(sCurrentLine);
} catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
} catch (IOException e2) {
// TODO Auto-generated catch block
e2.printStackTrace();
}
}
});
You just need to read data from a file line by line and assign the line to your variable str. Refer to following link:
How to read a large text file line by line using Java?
Your code is almost correct. You are writing this string to standard output - what for? If I understand you right, what you need is simply this:
private static final Pattern PATTERN = Pattern.compile("CN=([^,]+).*");
public static String solve(String str) {
Matcher matcher = PATTERN.matcher(str);
if (matcher.matches()) {
return matcher.group(1);
} else {
throw new IllegalArgumentException("Wrong string " + str);
}
}
This call
solve("CN=COUD111255,OU=Workstations,OU=Mis,OU=Accounts,DC=FLHOSP,DC=NET")
gave me "COUD111255" as answer.
To read from a .txt file, use BufferedReader. To create one, write:
BufferedReader br = new BufferedReader(new FileReader("testing.txt"));
testing.txt is the name of the file you're reading; a relative path like this is resolved against the program's working directory. After initializing the reader, continue with:
while ((CurrentLine = br.readLine()) != null) {
//Write the function you want to do, here.
String[] tokens = CurrentLine.split(","); //This returns an array containing the pieces of the line, split on the separator you pass in.
//in your case that is ",", since the fields are separated by commas
}
You now have a tokens array equal to [CN=COUD111255, OU=Workstations, OU=Mis, OU=Accounts, DC=FLHOSP, DC=NET].
Take element 0 of the array and use it: that gives you CN=COUD111255. I'll stop here rather than give the whole code.
Hope that helps!
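One way to finish the split-based approach is sketched below: take the first token and strip the CN= prefix. The class name CommonNameSketch and the helper commonName are hypothetical, and removing the surrounding double quotes is an assumption based on how the line appears in the question.

```java
class CommonNameSketch {

    // Returns the value of the leading CN= component of a comma-separated
    // distinguished name, or null if the line does not start with "CN=".
    // Assumption: the line may be wrapped in double quotes, as in the question.
    static String commonName(String line) {
        String cleaned = line.trim().replace("\"", "");
        String first = cleaned.split(",")[0];
        return first.startsWith("CN=") ? first.substring("CN=".length()) : null;
    }

    public static void main(String[] args) {
        String line = "\"CN=COUD111255,OU=Workstations,OU=Mis,OU=Accounts,DC=FLHOSP,DC=NET\"";
        System.out.println(commonName(line));
    }
}
```

Call commonName(CurrentLine) inside the read loop and assign the result to your variable; a null return means the line did not have the expected shape.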

JAVA: Getting the content of specific strings from text files

I have a text file like this:
text
text
text
.
.
#data
instances1
instances2
.
.
instancesN
I want to get the contents of this file from #data until the end of the file. How can I do this?
I found this method of the FileUtils class (from Apache Commons IO), but it is only usable if I already know the line number.
String ln = FileUtils.readLines(new File("arff_file/"+results.get(0)))
.get(lineNumber);
Since you are using Apache Commons, you can do it in one line:
String contents = FileUtils.readFileToString(new File("arff_file/"+results.get(0)), "UTF-16").replaceAll("(?s)^.*?(?=#data)", "");
This works by
reading the whole file into a single String
using regex-based replaceAll() to remove (by replacing with an empty string) everything up to, but not including, #data
The regex breakdown of (?s)^.*?(?=#data) is:
(?s) the dotall flag, so that . also matches line terminators and the match can span the lines before #data
^ start of input
.*? a reluctantly quantified wildcard
(?=#data) a positive (non-consuming) look ahead that asserts that the next input is #data
The reluctant quantifier matters: it stops the match at the first #data, in case it appears more than once in the input.
try {
String file = "fileName";
BufferedReader br = new BufferedReader(new FileReader(file));
String line;
while ((line = br.readLine()) != null) {
if (line.equals("#data"))
nowRead(br);//I just do this for more efficiency, you can set a boolean flag instead
}
br.close();
}catch (IOException e) {
//OMG Exception again!
}
}
static ArrayList<String> nowRead(BufferedReader br) throws IOException {
ArrayList<String> s = new ArrayList<String>();// do it as you wish
String line;
while ((line = br.readLine()) != null) {
s.add(line);
}
return s;
}
Path start = Paths.get("test.txt");
try
{
List<String> lines = Files.readAllLines(start);
for (Iterator<String> it = lines.iterator(); it.hasNext();)
{
String line = it.next();
if (!"#data".equals(line.trim()))
{
it.remove();
}
else
{
break;
}
}
System.out.println(lines);
}
catch (IOException e)
{
e.printStackTrace();
}
I was reading about Path online, so why not something like this as an alternative to Bohemian's code?
Maybe something could be done with Java 8's stream(), but I haven't found a way yet...
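A stream-based variant is possible from Java 9 onward, where Stream.dropWhile() is available; it is not part of Java 8. The sketch below works on a list of lines such as the one returned by Files.readAllLines(). The class name FromMarkerSketch and the helper fromMarker are invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FromMarkerSketch {

    // Returns the lines from the first line equal to the marker (ignoring
    // surrounding whitespace) to the end, marker line included.
    // Returns an empty list if the marker never appears. Requires Java 9+.
    static List<String> fromMarker(List<String> lines, String marker) {
        return lines.stream()
                .dropWhile(line -> !marker.equals(line.trim()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("text", "text", "#data", "instances1", "instances2");
        System.out.println(fromMarker(lines, "#data"));
    }
}
```

Like the iterator version above, this keeps the #data line itself; add .skip(1) after dropWhile() to drop it.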

Search text file for a specific line

I want to search for specific lines of text in a text file. If the piece of text I am looking for is on a specific line, I would like to read further on that line for more input.
So far I have 3 tags I am looking for.
#public
#private
#virtual
If I find any of these on a line, I would like to read what comes next so for example I could have a line like this:
#public double getHeight();
If I determine that the tag I found is #public, then I have to take the part that follows the whitespace, up to the semicolon. The problem is that I can't think of an efficient way to do this without excessive use of charAt(..), which doesn't look pretty and probably won't perform well in the long run for a large file, or for multiple files in a row.
I would like help to solve this efficiently as I currently can't comprehend how I would do it. The code itself is used to parse comments in a C++ file, to later generate a Header file. The Pseudo Code part is where I am stuck. Some people suggest BufferedReader, others say Scanner. I went with Scanner as that seems to be the replacement for BufferedReader.
public void run() {
Scanner scanner = null;
String filename, path;
StringBuilder puBuilder, prBuilder, viBuilder;
puBuilder = new StringBuilder();
prBuilder = new StringBuilder();
viBuilder = new StringBuilder();
for(File f : files) {
try {
filename = f.getName();
path = f.getCanonicalPath();
scanner = new Scanner(new FileReader(f));
} catch (FileNotFoundException ex) {
System.out.println("FileNotFoundException: " + ex.getMessage());
} catch (IOException ex) {
System.out.println("IOException: " + ex.getMessage());
}
String line;
while((line = scanner.nextLine()) != null) {
/**
* Pseudo Code
* if #public then
* puBuilder.append(line.substring(after white space)
* + line.substring(until and including the semicolon);
*/
}
}
}
I may be misunderstanding you.. but are you just looking for String.contains()?
if(line.contains("#public")){}
String tag = "";
if(line.startsWith("#public")){
tag = "#public";
}else if{....other tags....}
line = line.substring(tag.length(), line.indexOf(";")).trim();
This gives you a string that runs from the end of the tag (in this case #public) to the character preceding the semicolon, with the whitespace trimmed off both ends.
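The startsWith/substring idea can be extended to all three tags with a short loop. This is only a sketch: the class name TagLineSketch and the helper declarationAfterTag are hypothetical, and unlike the snippet above it keeps the semicolon, following the pseudo code in the question.

```java
class TagLineSketch {

    private static final String[] TAGS = {"#public", "#private", "#virtual"};

    // If the line starts with one of the known tags, returns the text after the
    // tag up to and including the first semicolon, trimmed of whitespace.
    // Returns null if no tag matches or the line has no semicolon after the tag.
    static String declarationAfterTag(String line) {
        String trimmed = line.trim();
        for (String tag : TAGS) {
            if (trimmed.startsWith(tag)) {
                int semi = trimmed.indexOf(';', tag.length());
                if (semi < 0) {
                    return null;
                }
                return trimmed.substring(tag.length(), semi + 1).trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(declarationAfterTag("#public double getHeight();"));
    }
}
```

Each line is examined with one startsWith() per tag and a single indexOf(), so no character-by-character charAt() loop is needed.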
if (line.startsWith("#public")) {
...
}
If you are allowed to use open-source libraries, I suggest the Apache commons-io and commons-lang libraries. These are widely used Java libraries that will make your life a lot simpler.
String text = null;
InputStream in = null;
List<String> lines = null;
for(File f : files) {
try{
in = new FileInputStream(f);
lines = IOUtils.readLines(in);
for (String line: lines){
if (line.contains("#public")) {
text = StringUtils.substringBetween(line, "#public", ";");
...
}
}
}
catch (Exception e){
...
}
finally{
// always remember to close the resource
IOUtils.closeQuietly(in);
}
}
