Align all the strings in proper way using String.format java - java

I am creating a file by appending all the strings to a StringBuilder and then dumping it to a file. But when I check the file, the strings in each line are (unsurprisingly) not properly aligned.
Below is the code, which is called by an upper layer that passes in everything it needs.
private static StringBuilder appendStrings(String pro, List<ProcConfig> list,
        StringBuilder sb) {
    for (ProcConfig pc : list) {
        sb.append(pro).append(" ").append(pc.getNo()).append(" ").append(pc.getId())
                .append(" ").append(pc.getAddress()).append(" ").append(pc.getPort()).append(" ")
                .append(pc.getNumPorts()).append(" ").append(pc.getName())
                .append(System.getProperty("line.separator"));
    }
    sb.append(System.getProperty("line.separator"));
    return sb;
}
And here is what sample lines produced by the above code look like. Each set of lines is followed by a blank line, and I want to align all the lines within each set.
config 51 106 10.178.151.25 8095 5 tyt_87612nsas_woqa_7y2_0
config 51 104 10.124.192.124 8080 5 tyt_abc_pz1_rn03c-7vb_01
if_hello_abc_tree tyt.* then process_is_not_necessary 32 80 10.86.25.29 9091 5 tyt_goldenuserappslc22
if_hello_abc_tree tyt.* then process_is_not_necessary 51 50 10.174.192.209 9091 5 tyt_goldenuserapprno01
if_hello_abc_tree tyt.* then config 4 140 10.914.198.26 10001 1 silos_lvskafka-1702600
if_hello_abc_tree tyt.* then config 4 184 10.444.289.138 10001 1 silos_lvskafka-1887568
Is there any way I can align each of the above lines properly? The output should look like this:
config 51 106 10.178.151.25 8095 5 tyt_87612nsas_woqa_7y2_0
config 51 104 10.124.192.124 8080 5 tyt_abc_pz1_rn03c-7vb_01
if_hello_abc_tree tyt.* then process_is_not_necessary 32 80 10.86.25.29 9091 5 tyt_goldenuserappslc22
if_hello_abc_tree tyt.* then process_is_not_necessary 51 50 10.174.192.209 9091 5 tyt_goldenuserapprno01
if_hello_abc_tree tyt.* then config 4 140 10.914.198.26 10001 1 silos_lvskafka-1702600
if_hello_abc_tree tyt.* then config 4 184 10.444.289.138 10001 1 silos_lvskafka-1887568
Update:
Below is how it is getting generated now. As you can see on IP Address it is slightly off if you compare the two lines.
config 51 106 97.143.765.65 8095 5 abc_tyewaz1_rna03c-7nhl_02
config 51 104 97.143.162.184 8080 5 abc_tyewaz1_rna03c-7vjb_01
Instead can we generate like below? Basically making each column straight. Is this possible to do?
config 51 106 97.143.765.65  8095 5 abc_tyewaz1_rna03c-7nhl_02
config 51 104 97.143.162.184 8080 5 abc_tyewaz1_rna03c-7vjb_01

First of all: I'm not a Java developer, but I know a little about it. This may not be the best way to solve your problem, but you'll get the point.
Instead of calling the append method multiple times, use a Formatter inside the loop, like this:
private static StringBuilder appendStrings(String pro, List<ProcConfig> list, StringBuilder sb) {
    // The width of the first column depends only on pro, so choose the template once.
    String template;
    if (pro.length() == 6)
        template = "%-6s %d %3d %-15s %d %d %s %n";
    else if (pro.length() > 35)
        template = "%-53s %d %3d %-15s %d %d %s %n";
    else
        template = "%-35s %d %3d %-15s %d %d %s %n";
    // The Formatter appends its output directly to sb.
    Formatter formatter = new Formatter(sb);
    for (ProcConfig pc : list) {
        formatter.format(template, pro, pc.getNo(), pc.getId(), pc.getAddress(), pc.getPort(), pc.getNumPorts(), pc.getName());
    }
    formatter.close();
    return sb;
}
Because pro varies in length, you can switch to a suitable template for each pro.
Note: Don't forget to import java.util.Formatter.
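If you would rather not hard-code the widths, one option is to measure the longest value in each column first and build the template from that. The following is only a rough sketch of that idea, not the code from the answer above: ColumnAligner and align are made-up names, and it assumes each row has already been turned into an array of strings (one per column) and that all rows in a set have the same number of columns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
final class ColumnAligner {

    /** Pads every column to the width of its longest cell; the last column is left unpadded. */
    static String align(List<String[]> rows) {
        if (rows.isEmpty()) {
            return "";
        }
        int columns = rows.get(0).length; // assumes every row has the same number of cells
        int[] widths = new int[columns];
        Arrays.fill(widths, 1); // a width of 0 is not a valid format width
        for (String[] row : rows) {
            for (int i = 0; i < columns; i++) {
                widths[i] = Math.max(widths[i], row[i].length());
            }
        }
        // Build a template such as "%-6s %-2s %-3s ... %s%n".
        StringBuilder template = new StringBuilder();
        for (int i = 0; i < columns - 1; i++) {
            template.append("%-").append(widths[i]).append("s ");
        }
        template.append("%s%n");

        String format = template.toString();
        StringBuilder out = new StringBuilder();
        for (String[] row : rows) {
            out.append(String.format(format, (Object[]) row));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"config", "51", "106", "97.143.765.65", "8095", "5", "abc_tyewaz1_rna03c-7nhl_02"});
        rows.add(new String[] {"config", "51", "104", "97.143.162.184", "8080", "5", "abc_tyewaz1_rna03c-7vjb_01"});
        System.out.print(align(rows));
    }
}
```

Calling align once per set (rather than once for the whole file) keeps short sets from being padded out to the widest pro in the file.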
You can use format specifiers to specify the way the data is formatted. This is a list of common formatters:
%S or %s: Specifies String
%X or %x: Specifies hexadecimal integer
%o: Specifies Octal integer
%d: Specifies Decimal integer
%c: Specifies character
%T or %t: Specifies Time and date
%n: Inserts newline character
%B or %b: Specifies Boolean
%A or %a: Specifies floating point hexadecimal
%f: Specifies Decimal floating point
As I said before in the comments, the dash means the string is left-justified: it is padded on the right up to the number of characters defined in the template, so the next value always starts at the same position. Try to find the width that best suits your needs.
To learn more, see the documentation for java.util.Formatter.
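To see what the dash and the width do in isolation, here is a tiny sketch. PaddingDemo and its two methods are made-up names for illustration; only String.format itself is the real API.

```java
// Hypothetical demo class, for illustration only.
final class PaddingDemo {

    /** "%-Ns": left-justified, padded with spaces on the right up to N characters. */
    static String leftJustified(String value, int width) {
        return String.format("%-" + width + "s", value);
    }

    /** "%Ns": right-justified, padded with spaces on the left up to N characters. */
    static String rightJustified(String value, int width) {
        return String.format("%" + width + "s", value);
    }

    public static void main(String[] args) {
        // Brackets make the padding visible.
        System.out.println("[" + leftJustified("10.86.25.29", 15) + "]");
        System.out.println("[" + rightJustified("10.86.25.29", 15) + "]");
    }
}
```

Values longer than the width are not truncated, so a width that is too small just pushes the following columns to the right.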

Related

Java Parsing a String to Extract Data

I have to write a program but I have no idea where to start. Can anyone help me with an outline of how I should go about it? Please excuse my novice level at programming. I have provided the input and output of the program.
The trouble that I'm facing is how do I handle the input text? How should I store the input text to extract the data that I need to produce the output commands? Any guidance would be so helpful.
A little explanation of the input:
The output will start with APPLE1: CT= (whatever number is there for CT in line 4)
The following lines of the output will begin with "APPLES:"
I must include and extract the values for CR, PLANTING and RW in the output.
Wherever there is a non-zero or not null in the DATA portion, it will appear in the output.
When the program reads END, "APP;APPLER:CT=(whatever number);" will be the last two commands
INPUT:
<apple:ct=12;
FARM DATA
INPUT DATA
CT CH CR PLANTING RW DATA
12 YES PG -0 FA=1 R=CODE1 MM2 COA COB CI COC COD
0 0 1 0
COE RN COF COG COH
4 00 0
COI COJ D
0
FA=2 R=CODE2 112 COA COB CI COC COD
0 0 0 0
COE RN COF COG COH
4 00 0
COI COJ D
7
END
OUTPUT:
APPLE1:CT=12;
APPLES:CR=PG-0,FA=1,R=CODE1,RW=MM2,COC=1,COE=4;
APPLES:FA=2,R=CODE2,RW=112,COE=4,COI=7;
APP;
APPLER:CT=12;

Predict function R returns 0.0 [duplicate]

I posted earlier today about an error I was getting with using the predict function. I was able to get that corrected, and thought I was on the right path.
I have a number of observations (actuals) and I have a few data points that I want to extrapolate or predict. I used lm to create a model, then I tried to use predict with the actual value that will serve as the predictor input.
This code is all repeated from my previous post, but here it is:
df <- read.table(text = '
Quarter Coupon Total
1 "Dec 06" 25027.072 132450574
2 "Dec 07" 76386.820 194154767
3 "Dec 08" 79622.147 221571135
4 "Dec 09" 74114.416 205880072
5 "Dec 10" 70993.058 188666980
6 "Jun 06" 12048.162 139137919
7 "Jun 07" 46889.369 165276325
8 "Jun 08" 84732.537 207074374
9 "Jun 09" 83240.084 221945162
10 "Jun 10" 81970.143 236954249
11 "Mar 06" 3451.248 116811392
12 "Mar 07" 34201.197 155190418
13 "Mar 08" 73232.900 212492488
14 "Mar 09" 70644.948 203663201
15 "Mar 10" 72314.945 203427892
16 "Mar 11" 88708.663 214061240
17 "Sep 06" 15027.252 121285335
18 "Sep 07" 60228.793 195428991
19 "Sep 08" 85507.062 257651399
20 "Sep 09" 77763.365 215048147
21 "Sep 10" 62259.691 168862119', header=TRUE)
str(df)
'data.frame': 21 obs. of 3 variables:
$ Quarter : Factor w/ 24 levels "Dec 06","Dec 07",..: 1 2 3 4 5 7 8 9 10 11 ...
$ Coupon: num 25027 76387 79622 74114 70993 ...
$ Total: num 132450574 194154767 221571135 205880072 188666980 ...
Code:
model <- lm(df$Total ~ df$Coupon, data=df)
> model
Call:
lm(formula = df$Total ~ df$Coupon)
Coefficients:
(Intercept) df$Coupon
107286259 1349
Predict code (based on previous help):
(These are the predictor values I want to use to get the predicted value)
Quarter = c("Jun 11", "Sep 11", "Dec 11")
Total = c(79037022, 83100656, 104299800)
Coupon = data.frame(Quarter, Total)
Coupon$estimate <- predict(model, newdate = Coupon$Total)
Now, when I run that, I get this error message:
Error in `$<-.data.frame`(`*tmp*`, "estimate", value = c(60980.3823396919, :
replacement has 21 rows, data has 3
My original data frame that I used to build the model had 21 observations in it. I am now trying to predict 3 values based on the model.
I either don't truly understand this function, or have an error in my code.
Help would be appreciated.
Thanks
First, you want to use
model <- lm(Total ~ Coupon, data=df)
not model <-lm(df$Total ~ df$Coupon, data=df).
Second, by saying lm(Total ~ Coupon), you are fitting a model that uses Total as the response variable, with Coupon as the predictor. That is, your model is of the form Total = a + b*Coupon, with a and b the coefficients to be estimated. Note that the response goes on the left side of the ~, and the predictor(s) on the right.
Because of this, when you ask R to give you predicted values for the model, you have to provide a set of new predictor values, ie new values of Coupon, not Total.
Third, judging by your specification of newdata, it looks like you're actually after a model to fit Coupon as a function of Total, not the other way around. To do this:
model <- lm(Coupon ~ Total, data=df)
new.df <- data.frame(Total=c(79037022, 83100656, 104299800))
predict(model, new.df)
Thanks Hong, that was exactly the problem I was running into. The error you get suggests that the number of rows is wrong, but the problem is actually that the model was fitted with a command that ends up with the wrong names for the variables.
This is a critical detail that is entirely non-obvious for lm and similar functions. Some tutorials use lines like lm(olive$Area ~ olive$Palmitic), which ends up with variable names of olive$Area, NOT Area, so a new data frame created with anewdata <- data.frame(Palmitic=2) can't then be used. If you use lm(Area ~ Palmitic, data=olive), the variable names are right and prediction works.
The real problem is that the error message does not indicate the problem at all:
Warning message: 'anewdata' had 1 rows but variable(s) found to have X rows
You are using newdate instead of newdata in your predict call; check that first. Then just use Coupon$estimate <- predict(model, Coupon)
It will work.
To avoid the error, an important point about the new dataset is the name of the independent variable: it must be the same as the one in the model. Another way is to nest the two functions without creating a new dataset:
model <- lm(Coupon ~ Total, data=df)
predict(model, data.frame(Total=c(79037022, 83100656, 104299800)))
Pay attention to the model. The next two commands look similar, but with the predict function the first works and the second doesn't.
model <- lm(Coupon ~ Total, data=df) #Ok
model <- lm(df$Coupon ~ df$Total) #Ko

Java Convert an object in table format

I'm implementing an api that reads data from json response and writes the resulting objects to csv.
Is there a way to convert an object in java to a table format (row-column)?
E.g. assume I have these objects:
public class Test1 {
    private int a;
    private String b;
    private Test2 c;
    private List<String> d;
    private List<Test2> e;
    // getters-setters ...
}
public class Test2 {
    private int x;
    private String y;
    private List<String> z;
    // getters-setters ...
}
Lets say I have an instance with the following values
Test1 c1 = new Test1();
c1.setA(11);
c1.setB("12");
c1.setC(new Test2(21, "21", Arrays.asList(new String[] {"211", "212"}) ));
c1.setD(Arrays.asList(new String[] {"111", "112"}));
c1.setE(Arrays.asList(new Test2[] {
    new Test2(31, "32"),
    new Test2(41, "42")
}));
I would like to see something like this returned as a List<Map<String, Object>> or some other object:
a    b    c.x    c.y     c.z    d    e.x    e.y
---- ---- ------ ------- ------ ---- ------ ------
11   12   21     21      211    111  31     32
11   12   21     21      211    111  41     42
11   12   21     21      211    112  31     32
11   12   21     21      211    112  41     42
11   12   21     21      212    111  31     32
11   12   21     21      212    111  41     42
11   12   21     21      212    112  31     32
11   12   21     21      212    112  41     42
I have already implemented something that achieves this result using reflection, but my solution is too slow for larger objects.
I was thinking of using an in-memory database, so as to convert the object into a database table and then select the result, something like MongoDB or ObjectDB, but I think that's overkill, and maybe slower than my approach. Also, these two do not support in-memory databases, and I do not want to use another disk database, since I'm already using MySQL with Hibernate. Using a ramdisk is not an option, since my server only has limited RAM. Is there an in-memory OODBMS that can do this?
As a solution I would prefer an algorithm or, even better, an existing library that can convert any object to a row-column format, something like Jackson or JAXB, which convert data to/from other formats.
Thanks for the help
Finally, after one week of banging my head against every possible thing available in my house, I managed to find a solution.
I shared the code on GitHub so that anyone who encounters this problem again can avoid a couple of migraines :)
You can get the code from here:
https://github.com/Sebb77/Denormalizer
Note: I had to use the getType() function and the FieldType enum for my specific problem.
In the future I will try to speed up the code with some caching, or something else :)
Note2: this is just a sample code that should be used only for reference. Lots of improvements can be done.
Anyone is free to use the code, just send me a thank you email :)
Any suggestions, improvements or bugs reports are very welcome.
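For readers who only want the core idea, the table in the question is essentially a Cartesian product over the list-valued fields. The sketch below is not the code from the linked repository. It is a simplified illustration with made-up names (RowExpander, expand), and it assumes the object has already been flattened into a map from column name to either a single value or a List of values. Lists of nested objects such as e, whose fields must stay paired, are not handled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class RowExpander {

    /**
     * Expands one record into flat rows: scalar columns are copied into every row,
     * and each list-valued column multiplies the rows by its number of elements.
     * An empty list therefore produces no rows at all.
     */
    static List<Map<String, Object>> expand(Map<String, Object> record) {
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(new LinkedHashMap<>());
        for (Map.Entry<String, Object> column : record.entrySet()) {
            List<?> values = column.getValue() instanceof List
                    ? (List<?>) column.getValue()
                    : Collections.singletonList(column.getValue());
            List<Map<String, Object>> next = new ArrayList<>();
            for (Map<String, Object> row : rows) {
                for (Object value : values) {
                    Map<String, Object> copy = new LinkedHashMap<>(row);
                    copy.put(column.getKey(), value);
                    next.add(copy);
                }
            }
            rows = next;
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("a", 11);
        record.put("b", "12");
        record.put("c.z", Arrays.asList("211", "212"));
        record.put("d", Arrays.asList("111", "112"));
        for (Map<String, Object> row : expand(record)) {
            System.out.println(row);
        }
    }
}
```

Because every row is copied once per list element, the number of rows is the product of the list sizes, which is worth keeping in mind for large objects.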

same strings, different charset, not equals

I have a weird problem.
I have an application that crawls a webpage to get a list of names. This list is then passed to another application that uses those names to request information from a site through its API.
When I compare strings from the first webpage with those returned by the API, I often get wrong results.
I printed the character values letter by letter and got this:
Rocco De Nicola
82 111 99 99 111 160 68 101 32 78 105 99 111 108 97 1st web page
82 111 99 99 111 32 68 101 32 78 105 99 111 108 97 2nd
As you can see, in the first string a space is codified by 160 (non-breaking space) instead of 32.
How can I encode the first set of strings correctly?
I have also tried setting the charset to UTF-8, but it didn't work.
Maybe I just have to replace 160 with 32?
I would first trim the strings to compare and replace the complicated characters in them, and only then call equals. This also helps when your text needs language-specific replacements. It's also a good idea to convert the resulting strings to lower case.
Normally I use something like this:
private String removeExtraCharsAndToLower(String str) {
    str = str.toLowerCase();
    // Transliterate German umlauts and sharp s before stripping.
    str = str.replaceAll("ä", "ae");
    str = str.replaceAll("ö", "oe");
    str = str.replaceAll("ü", "ue");
    str = str.replaceAll("ß", "ss");
    // Drop everything that is not a-z; this also removes spaces, digits and the non-breaking space.
    return str.replaceAll("[^a-z]", "");
}
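If the non-breaking space is the only difference, a narrower fix along the lines the question suggests is to replace character 160 with 32 before comparing. A minimal sketch, where SpaceNormalizer and normalizeSpaces are made-up names:

```java
// Hypothetical helper, for illustration only.
final class SpaceNormalizer {

    /** Replaces non-breaking spaces (U+00A0, decimal 160) with ordinary spaces (decimal 32). */
    static String normalizeSpaces(String text) {
        return text.replace('\u00A0', ' ');
    }

    public static void main(String[] args) {
        String fromWebPage = "Rocco\u00A0De Nicola"; // as crawled from the first page
        String fromApi = "Rocco De Nicola";           // as returned by the API
        System.out.println(fromWebPage.equals(fromApi));                  // false
        System.out.println(normalizeSpaces(fromWebPage).equals(fromApi)); // true
    }
}
```

This keeps case, digits and accents intact, so it is less aggressive than stripping everything outside a-z.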
Using brute force, this lists every character set that converts 160 to 32 when encoding:
String s = "" + (char) 160;
for (Map.Entry<String, Charset> stringCharsetEntry : Charset.availableCharsets().entrySet()) {
    try {
        ByteBuffer bytes = stringCharsetEntry.getValue().encode(s);
        if (bytes.get(0) == 32)
            System.out.println(stringCharsetEntry.getKey());
    } catch (Exception ignored) {
    }
}
It prints nothing.
If I change the condition to
        if (bytes.get(0) != (byte) 160)
            System.out.println(stringCharsetEntry.getKey()+" "+new String(bytes.array(), 0));
I get quite a few examples.

Converting a large ASCII to CSV file in python or java or something else on Linux or Win7

Need a hint so I can convert a huge (300-400 mb) ASCII file to a CSV file.
My ASCII file is a database with a lot of products (about 600,000 pcs = 55,200,000 lines in the file).
Below is ONE product. It is like a tablerow in a database, with 88 columns.
If you count the lines below, there are 92 of them.
Each occurrence of '00I' followed by CR+LF indicates the start of a new row/product.
Each line ends with a CR+LF.
A whole product/row is ended with the following three lines:
A00
A10
A21
-as shown below.
Between the starting line '00I' (plus CR+LF) and the three ending lines, every line starts with 2 digits (the column name), and whatever comes after those digits is the data for that column.
If we take the first line below the starting line, we see:
'0109321609'. 01 indicates that it is the column named 01, and the rest is the data stored in that column: '09321609'.
I want to strip out the two digits, indicating each column name/line-number, so the first line (after the starting indication '00I'): 0109321609 comes out as the following: ”09321609”.
Putting it together with the next line (02), it should give an output like:
”09321609”,”15274”, etc.
When coming to the end, we want a new row.
The first line '00I' and the three last lines 'A00', 'A10' and 'A21' we don't want to be included in the output file.
Here is how a row looks like (every line is ended by a CR+LF):
00I
0109321609
0215274
032
0419685
05
062
072
081
09
111
121
15
161
17
1814740
1920120401
2020120401
2120120401
22
230
240
251
26BLAHBLAH 1000MG
27
281
29
30
31BLAHBLAH 1000 mg Filmtablets Hursutacinzki
32
3336
341
350
361
371
401
410
420
43
445774
45FTA
46
47AN03AX14
48BLAHBLAH00000000000000000000010
491
501
512
522
5317
542
552
561
572
581
591
60
61
62
631
641
65
66
67
681
69
721
74884
761
771
780
790
801
811
831
851474
86
871
880
891
901
911
922
930
941
951
961
97
98
990
A00
A10
A21
Anyone got a hint on how it can be converted?
The file is too big for a webserver with php and mysql to run. My thought was to put the file in a directory on my local server, and read the file, strip out the line numbers, and insert the data directly in a mysql database on the fly, but the file is too big, and the server stalls.
I'm able to run under Linux (Ubuntu) and Windows 7.
Maybe some python or java is recommended? I'm able to run both, but my experience with those is low, but I'm a quick learner, so if someone can give a hint? :-)
Best Regards
Bjarke :-)
If you are absolutely certain that each entry is 92 lines long:
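import csv

# Python 3 version; on Python 2 use itertools.izip instead of zip and open the output with 'wb'.
with open('data.txt') as inf, open('data.csv', 'w', newline='') as outf:
    # Drop the two-character column code and the line ending from every line.
    lines = (line[2:].rstrip() for line in inf)
    # Group the stream into 92-line records: item 0 is the '00I' marker,
    # items 1..88 are the columns and items 89..91 are the A00/A10/A21 trailer.
    rows = (data[1:89] for data in zip(*([lines]*92)))
    csv.writer(outf).writerows(rows)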
In Python (3), it should look something like this:
import csv

with open('eg.txt', 'r') as f, open('out.csv', 'w', newline='') as out:
    fo = csv.writer(out)
    for line in f:
        # Each record starts with the '00I' marker line.
        assert line[:3] == '00I'
        buf = []
        # The next 88 lines are the columns; strip the two-digit column code.
        for i in range(88):
            line = next(f)
            buf.append(line.strip()[2:])
        # The record ends with the three trailer lines.
        for trailer in ('A00', 'A10', 'A21'):
            line = next(f)
            assert line[:3] == trailer
        fo.writerow(buf)
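Since the question also mentions Java, here is a rough streaming sketch in Java that does not depend on every record having exactly 92 lines. It is an illustration under assumptions, not a tested converter for the real file: RecordsToCsv, convert and quote are made-up names, the markers are assumed to appear on their own lines exactly as '00I', 'A00', 'A10' and 'A21', and ISO-8859-1 is only a guess at the file's encoding.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical converter, for illustration only.
final class RecordsToCsv {

    /** Reads one line at a time, so memory use does not grow with the size of the file. */
    static void convert(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        List<String> row = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.equals("00I")) {
                row.clear();                       // start of a new product
            } else if (line.equals("A21")) {
                out.write(String.join(",", row));  // last trailer line: emit the row
                out.write("\r\n");
                row.clear();
            } else if (line.equals("A00") || line.equals("A10")) {
                // trailer lines carry no column data
            } else if (line.length() >= 2) {
                row.add(quote(line.substring(2))); // drop the two-digit column code
            }
        }
        out.flush();
    }

    /** Wraps a field in double quotes, doubling any quotes inside it. */
    static String quote(String field) {
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: RecordsToCsv <input file> <output file>");
            return;
        }
        // The encoding is an assumption; adjust it to match the real file.
        try (BufferedReader in = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
             BufferedWriter out = Files.newBufferedWriter(Paths.get(args[1]), StandardCharsets.ISO_8859_1)) {
            convert(in, out);
        }
    }
}
```

Writing one CSV row per product as soon as its trailer is seen avoids holding the whole 300-400 MB file in memory.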
