I need to train the OpenNLP Chunker so that it classifies the training data as noun phrases. How do I proceed? The online documentation does not explain how to do this from inside a program, without the command line. It says to use en-chunker.train, but how do you make that file?
EDIT: @Alaye
After running the code you gave in your answer, I get the following error that I cannot fix:
Indexing events using cutoff of 5
Computing event counts... done. 3 events
Dropped event B-NP:[w_2=bos, w_1=bos, w0=He, w1=reckons, w2=., w_1=bosw0=He, w0=Hew1=reckons, t_2=bos, t_1=bos, t0=PRP, t1=VBZ, t2=., t_2=bost_1=bos, t_1=bost0=PRP, t0=PRPt1=VBZ, t1=VBZt2=., t_2=bost_1=bost0=PRP, t_1=bost0=PRPt1=VBZ, t0=PRPt1=VBZt2=., p_2=bos, p_1=bos, p_2=bosp_1=bos, p_1=bost_2=bos, p_1=bost_1=bos, p_1=bost0=PRP, p_1=bost1=VBZ, p_1=bost2=., p_1=bost_2=bost_1=bos, p_1=bost_1=bost0=PRP, p_1=bost0=PRPt1=VBZ, p_1=bost1=VBZt2=., p_1=bost_2=bost_1=bost0=PRP, p_1=bost_1=bost0=PRPt1=VBZ, p_1=bost0=PRPt1=VBZt2=., p_1=bosw_2=bos, p_1=bosw_1=bos, p_1=bosw0=He, p_1=bosw1=reckons, p_1=bosw2=., p_1=bosw_1=bosw0=He, p_1=bosw0=Hew1=reckons]
Dropped event B-VP:[w_2=bos, w_1=He, w0=reckons, w1=., w2=eos, w_1=Hew0=reckons, w0=reckonsw1=., t_2=bos, t_1=PRP, t0=VBZ, t1=., t2=eos, t_2=bost_1=PRP, t_1=PRPt0=VBZ, t0=VBZt1=., t1=.t2=eos, t_2=bost_1=PRPt0=VBZ, t_1=PRPt0=VBZt1=., t0=VBZt1=.t2=eos, p_2=bos, p_1=B-NP, p_2=bosp_1=B-NP, p_1=B-NPt_2=bos, p_1=B-NPt_1=PRP, p_1=B-NPt0=VBZ, p_1=B-NPt1=., p_1=B-NPt2=eos, p_1=B-NPt_2=bost_1=PRP, p_1=B-NPt_1=PRPt0=VBZ, p_1=B-NPt0=VBZt1=., p_1=B-NPt1=.t2=eos, p_1=B-NPt_2=bost_1=PRPt0=VBZ, p_1=B-NPt_1=PRPt0=VBZt1=., p_1=B-NPt0=VBZt1=.t2=eos, p_1=B-NPw_2=bos, p_1=B-NPw_1=He, p_1=B-NPw0=reckons, p_1=B-NPw1=., p_1=B-NPw2=eos, p_1=B-NPw_1=Hew0=reckons, p_1=B-NPw0=reckonsw1=.]
Dropped event O:[w_2=He, w_1=reckons, w0=., w1=eos, w2=eos, w_1=reckonsw0=., w0=.w1=eos, t_2=PRP, t_1=VBZ, t0=., t1=eos, t2=eos, t_2=PRPt_1=VBZ, t_1=VBZt0=., t0=.t1=eos, t1=eost2=eos, t_2=PRPt_1=VBZt0=., t_1=VBZt0=.t1=eos, t0=.t1=eost2=eos, p_2B-NP, p_1=B-VP, p_2B-NPp_1=B-VP, p_1=B-VPt_2=PRP, p_1=B-VPt_1=VBZ, p_1=B-VPt0=., p_1=B-VPt1=eos, p_1=B-VPt2=eos, p_1=B-VPt_2=PRPt_1=VBZ, p_1=B-VPt_1=VBZt0=., p_1=B-VPt0=.t1=eos, p_1=B-VPt1=eost2=eos, p_1=B-VPt_2=PRPt_1=VBZt0=., p_1=B-VPt_1=VBZt0=.t1=eos, p_1=B-VPt0=.t1=eost2=eos, p_1=B-VPw_2=He, p_1=B-VPw_1=reckons, p_1=B-VPw0=., p_1=B-VPw1=eos, p_1=B-VPw2=eos, p_1=B-VPw_1=reckonsw0=., p_1=B-VPw0=.w1=eos]
Indexing... done.
Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:653)
at java.util.ArrayList.get(ArrayList.java:429)
at opennlp.tools.ml.model.AbstractDataIndexer.sortAndMerge(AbstractDataIndexer.java:89)
at opennlp.tools.ml.model.TwoPassDataIndexer.<init>(TwoPassDataIndexer.java:105)
at opennlp.tools.ml.AbstractEventTrainer.getDataIndexer(AbstractEventTrainer.java:74)
at opennlp.tools.ml.AbstractEventTrainer.train(AbstractEventTrainer.java:91)
at opennlp.tools.ml.model.TrainUtil.train(TrainUtil.java:53)
at opennlp.tools.chunker.ChunkerME.train(ChunkerME.java:253)
at com.oracle.crm.nlp.CustomChunker2.main(CustomChunker2.java:91)
Sorting and merging events... Process exited with exit code 1.
(My en-chunker.train had only the first two lines and the last line of your sample data set.)
Could you please tell me why this is happening and how to fix it?
EDIT2: I got the Chunker to work, however it gives an error when I change the sentence in the training set to any sentence other than the one you've given in your answer. Can you tell me why that could be happening?
As described in the OpenNLP documentation, a sample sentence of the training data looks like this:
He PRP B-NP
reckons VBZ B-VP
the DT B-NP
current JJ I-NP
account NN I-NP
deficit NN I-NP
will MD B-VP
narrow VB I-VP
to TO B-PP
only RB B-NP
# # I-NP
1.8 CD I-NP
billion CD I-NP
in IN B-PP
September NNP B-NP
. . O
This is how you make your en-chunker.train file. You can then create the corresponding .bin file using the CLI:
$ opennlp ChunkerTrainerME -model en-chunker.bin -lang en -data en-chunker.train -encoding UTF-8
or using the API:
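import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.Charset;
import java.util.Objects;
import opennlp.tools.chunker.ChunkSample;
import opennlp.tools.chunker.ChunkSampleStream;
import opennlp.tools.chunker.ChunkerME;
import opennlp.tools.chunker.ChunkerModel;
import opennlp.tools.chunker.DefaultChunkerContextGenerator;
import opennlp.tools.util.MarkableFileInputStreamFactory;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;
import opennlp.tools.util.TrainingParameters;
public class SentenceTrainer {
public static void trainModel(String inputFile, String modelFile)
throws IOException {
// Fail fast on missing arguments (Objects.nonNull only returns a boolean).
Objects.requireNonNull(inputFile);
Objects.requireNonNull(modelFile);
MarkableFileInputStreamFactory factory = new MarkableFileInputStreamFactory(
new File(inputFile));
Charset charset = Charset.forName("UTF-8");
// Read the training data from inputFile rather than a hard-coded file name.
ObjectStream<String> lineStream =
new PlainTextByLineStream(factory, charset);
ObjectStream<ChunkSample> sampleStream = new ChunkSampleStream(lineStream);
ChunkerModel model;
try {
model = ChunkerME.train("en", sampleStream,
new DefaultChunkerContextGenerator(), TrainingParameters.defaultParams());
}
finally {
sampleStream.close();
}
// Write the trained model to modelFile.
OutputStream modelOut = null;
try {
modelOut = new BufferedOutputStream(new FileOutputStream(modelFile));
model.serialize(modelOut);
} finally {
if (modelOut != null)
modelOut.close();
}
}
}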
and the main method will be:
public class Main {
public static void main(String args[]) throws IOException {
String inputFile = "//path//to//data.train";
String modelFile = "//path//to//.bin";
SentenceTrainer.trainModel(inputFile, modelFile);
}
}
reference: this blog
hope this helps!
PS: Collect or write the data as shown above in a .txt file and rename it with a .train extension; even trainingdata.txt will work. That is how you make a .train file.
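The training file can also be produced from a program. The following is a minimal sketch using only the Java standard library, assuming each token is held as a {word, POS tag, chunk tag} triple; the class name ChunkTrainFileWriter and the output location are hypothetical. It writes one token per line in the format shown above and puts an empty line after each sentence. Note that a three-token sentence like this is far too small for real training: the log in the question shows every event being dropped at the cutoff of 5.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class ChunkTrainFileWriter {

    // Formats sentences as "word POS chunk" lines, with one empty line after each sentence.
    static List<String> toLines(List<String[][]> sentences) {
        List<String> lines = new ArrayList<>();
        for (String[][] sentence : sentences) {
            for (String[] token : sentence) {
                lines.add(token[0] + " " + token[1] + " " + token[2]);
            }
            lines.add(""); // empty line marks the end of a sentence
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        String[][] sentence = {
            {"He", "PRP", "B-NP"},
            {"reckons", "VBZ", "B-VP"},
            {".", ".", "O"}
        };
        List<String[][]> sentences = new ArrayList<>();
        sentences.add(sentence);

        // Hypothetical output location; adjust to wherever the trainer reads from.
        Path out = Paths.get("en-chunker.train");
        Files.write(out, toLines(sentences), StandardCharsets.UTF_8);
    }
}
```

The resulting file can be passed as inputFile to trainModel above.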
I am trying to write a MapReduce program that takes input from two files: one has the details of occupations and states, and the other has details of occupations and job growth percentages. I use two mappers, combine their output, and in my reducer try to find the jobs whose growth percentage is more than 30. My output should ideally be the occupation followed by the list of states. However, I am only getting the occupation names and not the states. I have posted the code and the sample input files below. Please point out what I am doing wrong. Thanks.
(Please note that the samples of the input files I have provided are just small portions of the actual files.)
package com;
import java.io.IOException;
//import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
public class GrowthState extends Configured implements Tool {
//Parser for Mapper1
public static class StateParser{
private String State,Occupation;
public void parse(String record){
String str[] = record.split("\t");
if(str[4].length() != 0)
setOccupation(str[4]);
else
setOccupation("Default Occupation");
if(str[2].length() != 0)
setState(str[2]);
else
setState("Default State");
}
public void parse(Text record){
parse(record.toString());
}
public String getState() {
return State;
}
public void setState(String state) {
State = state;
}
public String getOccupation() {
return Occupation;
}
public void setOccupation(String occupation) {
Occupation = occupation;
}
}
//Mapper1 - Processing state.txt
public static class GrowthMap1 extends Mapper<LongWritable,Text,Text,Text>{
StateParser sp = new StateParser();
Text outkey = new Text();
Text outvalue = new Text();
public void map(LongWritable key,Text value,Context context) throws IOException, InterruptedException{
sp.parse(value);
outkey.set(sp.getOccupation());
outvalue.set("m1\t"+sp.getState());
context.write(outkey,outvalue);
//String str[] = value.toString().split("\t");
//context.write(new Text(str[2]), new Text("m1\t"+str[4]));
}
}
public static class ProjParser{
private String Occupation,percent;
public void parse(String record){
String str[] = record.split("\t");
if(str[0].length() != 0)
setOccupation(str[0]);
else
setOccupation("Default Occupation");
if(str[5].length() != 0)
setPercent(str[5]);
else
setPercent("0");
}
public void parse(Text record){
parse(record.toString());
}
public String getOccupation() {
return Occupation;
}
public void setOccupation(String occupation) {
Occupation = occupation;
}
public String getPercent() {
return percent;
}
public void setPercent(String percent) {
this.percent = percent;
}
}
//Mapper2 - processing projection.txt
public static class GrowthMap2 extends Mapper<LongWritable,Text,Text,Text> {
ProjParser pp = new ProjParser();
Text outkey = new Text();
Text outvalue = new Text();
public void map(LongWritable key,Text value,Context context) throws IOException, InterruptedException{
pp.parse(value);
outkey.set(pp.getOccupation());
outvalue.set("m2\t"+pp.getPercent());
context.write(outkey, outvalue);
//String str[] = value.toString().split("\t");
//context.write(new Text(str[0]), new Text("m2\t"+str[5]));
}
}
//Reducer
public static class GrowthReduce extends Reducer<Text,Text,Text,Text>{
Text outvalue = new Text();
public void reduce(Text key,Iterable<Text> value,Context context)throws IOException, InterruptedException{
float cent = 0;
String state = "";
for(Text values : value){
String[] str = values.toString().split("\t");
if(str[0].equals("m1")){
state = state + " " + str[1];
}else if(str[0].equals("m2")){
try{
cent = Float.parseFloat(str[1]);
}catch(Exception nf){
cent = 0;
}
}
}
if(cent>=30){
outvalue.set(state);
context.write(key,outvalue );
}
}
}
//Driver
@Override
public int run(String[] args) throws Exception {
Job job = new Job(getConf(), "States of Growth");
job.setJarByClass(GrowthState.class);
job.setReducerClass(GrowthReduce.class);
MultipleInputs.addInputPath(job, new Path(args[0]), TextInputFormat.class, GrowthMap1.class);
MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, GrowthMap2.class);
FileOutputFormat.setOutputPath(job,new Path(args[2]));
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
return job.waitForCompletion(true)?0:1;
}
public static void main(String args[]) throws Exception{
int exitcode = ToolRunner.run(new GrowthState(), args);
System.exit(exitcode);
}
}
Sample input file1:
01 AL Alabama 00-0000 All Occupations total "1,857,530" 0.4 1000.000 1.00 19.66 "40,890" 0.5 8.30 9.72 14.83 23.95 36.04 "17,260" "20,220" "30,850" "49,810" "74,950"
01 AL Alabama 11-0000 Management Occupations major "67,500" 1.1 36.338 0.73 51.48 "107,080" 0.6 24.54 33.09 44.98 62.09 88.43 "51,050" "68,830" "93,550" "129,150" "183,940"
01 AL Alabama 11-1011 Chief Executives detailed "1,080" 4.8 0.580 0.32 97.67 "203,150" 2.5 52.05 67.58 # # # "108,270" "140,570" # # #
01 AL Alabama 11-1021 General and Operations Managers detailed "26,480" 1.5 14.258 0.94 58.00 "120,640" 0.9 27.65 35.76 49.00 71.44 # "57,510" "74,390" "101,930" "148,590" #
01 AL Alabama 11-1031 Legislators detailed "1,470" 8.7 0.790 1.94 * "21,920" 3.5 * * * * * "16,120" "17,000" "18,450" "20,670" "32,820" TRUE
01 AL Alabama 11-2011 Advertising and Promotions Managers detailed 80 16.3 0.042 0.19 44.88 "93,350" 9.5 21.59 30.28 38.92 52.22 74.07 "44,900" "62,980" "80,960" "108,620" "154,060"
01 AL Alabama 11-2021 Marketing Managers detailed 610 11.5 0.329 0.24 61.28 "127,460" 7.4 31.96 37.63 53.39 73.17 # "66,480" "78,280" "111,040" "152,200" #
01 AL Alabama 11-2022 Sales Managers detailed "2,330" 5.4 1.253 0.47 54.63 "113,620" 2.2 27.28 35.42 48.92 67.62 89.42 "56,740" "73,660" "101,750" "140,640" "186,000"
05 AR Arkansas 43-4161 "Human Resources Assistants, Except Payroll and Timekeeping" detailed "1,470" 6.6 1.265 1.26 17.25 "35,870" 1.5 11.09 13.54 17.11 20.74 23.30 "23,060" "28,170" "35,590" "43,150" "48,450"
05 AR Arkansas 43-4171 Receptionists and Information Clerks detailed "7,080" 3.3 6.109 0.84 11.26 "23,420" 0.8 8.14 9.19 10.87 13.09 14.94 "16,940" "19,110" "22,600" "27,230" "31,070"
05 AR Arkansas 43-4181 Reservation and Transportation Ticket Agents and Travel Clerks detailed 590 23.6 0.510 0.50 12.61 "26,220" 6.1 8.99 9.81 10.88 14.82 20.59 "18,710" "20,400" "22,630" "30,830" "42,830"
05 AR Arkansas 43-4199 "Information and Record Clerks, All Other" detailed 920 4.7 0.795 0.61 18.45 "38,370" 1.8 13.59 15.33 18.49 21.35 23.86 "28,270" "31,880" "38,470" "44,410" "49,630"
05 AR Arkansas 43-5011 Cargo and Freight Agents detailed 480 16.5 0.418 0.73 * * * * * * * * * * * * *
05 AR Arkansas 43-5021 Couriers and Messengers detailed 510 12.4 0.444 0.84 11.92 "24,790" 2.1 8.73 9.91 11.26 13.49 16.03 "18,160" "20,620" "23,420" "28,060" "33,350"
sample input file 2:
Management occupations 11-0000 "8,861.5" "9,498.0" 636.6 7.2 22.2 "2,586.7" "$93,910" — — —
Top executives 11-1000 "2,361.5" "2,626.8" 265.2 11.2 3.3 717.4 "$99,550" — — —
Chief executives 11-1011 330.5 347.9 17.4 5.3 17.7 87.8 "$168,140" Bachelor's degree 5 years or more None
General and operations managers 11-1021 "1,972.7" "2,216.8" 244.1 12.4 1.0 613.1 "$95,440" Bachelor's degree Less than 5 years None
Legislators 11-1031 58.4 62.1 3.7 6.4 — 16.5 "$19,780" Bachelor's degree Less than 5 years None
"Advertising, marketing, promotions, public relations, and sales managers" 11-2000 637.4 700.5 63.1 9.9 3.4 203.3 "$107,950" — — —
Advertising and promotions managers 11-2011 35.5 38.0 2.4 6.9 17.8 13.4 "$88,590" Bachelor's degree Less than 5 years None
Marketing and sales managers 11-2020 539.8 592.5 52.7 9.8 2.6 168.6 "$110,340" — — —
Marketing managers 11-2021 180.5 203.4 22.9 12.7 2.6 61.7 "$119,480" Bachelor's degree 5 years or more None
Sales managers 11-2022 359.3 389.0 29.8 8.3 2.7 106.9 "$105,260" Bachelor's degree Less than 5 years None
Public relations and fundraising managers 11-2031 62.1 70.1 8.0 12.9 1.6 21.3 "$95,450" Bachelor's degree 5 years or more None
Operations specialties managers 11-3000 "1,647.5" "1,799.7" 152.1 9.2 3.3 459.1 "$100,720" — — —
Administrative services managers 11-3011 280.8 315.0 34.2 12.2 0.1 79.9 "$81,080" Bachelor's degree Less than 5 years None
Computer and information systems managers 11-3021 332.7 383.6 50.9 15.3 3.1 97.1 "$120,950" Bachelor's degree 5 years or more None
Financial managers 11-3031 532.1 579.2 47.1 8.9 5.1 146.9 "$109,740" Bachelor's degree 5 years or more None
Industrial production managers 11-3051 172.7 168.6 -4.1 -2.4 6.1 31.4 "$89,190" Bachelor's degree 5 years or more None
Purchasing managers 11-3061 71.9 73.4 1.5 2.1 0.3 17.3 "$100,170" Bachelor's degree 5 years or more None
"Transportation, storage, and distribution managers" 11-3071 105.2 110.3 5.1 4.9 4.8 29.1 "$81,830" High school diploma or equivalent 5 years or more None
Compensation and benefits managers 11-3111 20.7 21.4 0.6 3.1 — 6.1 "$95,250" Bachelor's degree 5 years or more None
Human resources managers 11-3121 102.7 116.3 13.6 13.2 1.0 40.6 "$99,720" Bachelor's degree 5 years or more None
Training and development managers 11-3131 28.6 31.8 3.2 11.2 — 10.7 "$95,400" Bachelor's degree 5 years or more None
Other management occupations 11-9000 "4,215.0" "4,371.0" 156.1 3.7 43.1 "1,207.0" "$81,940" — — —
There is a problem with your reducer.
The faulty code is shown below. The loop runs once for each value of a particular key (e.g. for "Advertising and promotions managers" it runs twice: once with the value "Alabama" and again with the value "6.9"). The problem is that the if(cent>=30) check sits outside the for loop. It should be inside, so that it is applied when the matching key is found.
for(Text values : value){
String[] str = values.toString().split("\t");
if(str[0].equals("m1")){
state = state + " " + str[1];
}else if(str[0].equals("m2")){
try{
cent = Float.parseFloat(str[1]);
}catch(Exception nf){
cent = 0;
}
}
}
if(cent>=30){
outvalue.set(state);
context.write(key,outvalue );
}
The following piece of code works fine:
//Reducer (requires: import java.util.HashMap;)
public static class GrowthReduce extends Reducer<Text,Text,Text,Text>{
Text outvalue = new Text();
//Maps a lower-cased occupation to the most recent state seen for it
HashMap<String, String> stateMap = new HashMap<String, String>();
public void reduce(Text key,Iterable<Text> value,Context context)throws IOException, InterruptedException{
float cent = 0;
//Normalize the key once so that put, get and remove all use the same form
String occupation = key.toString().toLowerCase();
for(Text values : value){
String[] str = values.toString().split("\t");
if(str[0].equals("m1")){
stateMap.put(occupation, str[1]);
}
else if(str[0].equals("m2")){
try{
cent = Float.parseFloat(str[1]);
if(stateMap.containsKey(occupation))
{
if(cent>30) {
outvalue.set(stateMap.get(occupation));
context.write(key, outvalue);
}
//Remove with the same lower-cased key that was used for put
stateMap.remove(occupation);
}
}catch(Exception nf){
cent = 0;
}
}
}
}
}
The logic is:
As and when you encounter a state (value "m1"), you put it in state map.
Next time, when you encounter percent with same key (value "m2"), you check if the state is already in the map. If yes, then you output the key/value.
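The tagged-value join can also be tried outside Hadoop. The following is a plain-Java sketch, not Hadoop API code; the class TaggedJoinSketch and its reduce helper are hypothetical names. It is a variant that does not depend on the "m1" values arriving before the "m2" value, since the order of values within a key is not guaranteed unless a secondary sort is set up: it buffers every state and applies the threshold once all values for the key have been seen.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TaggedJoinSketch {

    // Mimics one reduce() call: values are tagged "m1\t<state>" or "m2\t<percent>".
    // Returns the joined states, or null when the growth threshold is not met.
    static String reduce(List<String> values, float threshold) {
        List<String> states = new ArrayList<>();
        float cent = 0;
        for (String value : values) {
            String[] str = value.split("\t");
            if (str.length < 2) {
                continue; // skip malformed values
            }
            if (str[0].equals("m1")) {
                states.add(str[1]);
            } else if (str[0].equals("m2")) {
                try {
                    cent = Float.parseFloat(str[1]);
                } catch (NumberFormatException e) {
                    cent = 0; // an unparsable percent counts as 0
                }
            }
        }
        if (cent > threshold && !states.isEmpty()) {
            return String.join(" ", states);
        }
        return null;
    }

    public static void main(String[] args) {
        // The percent arrives between two states on purpose.
        List<String> values = Arrays.asList("m1\tAlabama", "m2\t35.0", "m1\tArkansas");
        System.out.println(reduce(values, 30f));
    }
}
```

The join only produces output when both mappers emit exactly the same key, so differences in case or quoting between the two input files are worth checking as well.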
I have a file which contains thousands of lines of digits and text.
I want to create a file containing only those lines that include certain keywords. Here is my code, but in the output file I can see some lines which do not have those keywords.
I have put a sample of the input data and output data at the end.
import java.io.*;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class TagIdentifier {
/**
* Creates an instance of TagIdentifier to find the tags in a file.
* <p>
* The euclidean distance will be used as default distance measure.
*
* @param inputFilePath the name and address of the file that should be read
* @param OutputFilePath the name and address of the output file
* @param fieldSeparator the character used to split the fields in a line
* @param keywords the list of tags/keywords which should be found
* @throws FileNotFoundException, IOException
*/
public TagIdentifier(String inputFilePath, String OutputFilePath, String fieldSeparator,List<String> keywords )
throws FileNotFoundException, IOException {
/*Create a file to write the result in*/
FileWriter fileStream = new FileWriter(OutputFilePath, false);
BufferedWriter fileResult = new BufferedWriter(fileStream);
/* Create a file reader and buffer the data */
FileReader flickrFileReader = new FileReader(inputFilePath);
BufferedReader bufferedReader = new BufferedReader(flickrFileReader);
//StringBuffer stringBuffer = new StringBuffer();
String line;
int linecount = 0;
while ((line = bufferedReader.readLine()) != null) {
linecount++;
for (String keyword : keywords) {
//keyword = "//b"+keyword+"//b";
Pattern p = Pattern.compile(keyword, Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(line);
if(m.find()){
fileResult.write(line);
fileResult.newLine();
break;
}
}
}
fileResult.close();
fileStream.close();
}
}
Sample of list of tags:
[place_of_worship, place of worship, religious_administration, cathedral, chapel, mosque, Church, temple, Religion, animist, bahai, buddhist, christian, hindu, jain, jewish, multifaith, muslim, pagan, pastafarian, scientologist, shinto, sikh, spiritualist, taoist, unitarian, yazidi, zoroastrian, nichiren, jodo_shinshu, jodo_shu, vajrayana, shingon_shu, zen, thai_mahanikaya, thai_thammayut, ahmadiyya, alaouite, druze, ibadi, ismaili, nondenominational, shia, sunni, sufi, asatru, celtic, greco-roman, wicca, EVKdFSMiD, VKdFSMA, CotFSM, irani, parsi, alternative, ashkenazi, buchari, conservative, egalitarian, hasidic , humanistic , kabbalistic , karaite , liberal , lubavitch , lubavitch_messianic , mizrachi_baghdadi , mizrachi_chida , mizrachi_jerusalemite , mizrachi_livorno , mizrachi_moroccan , modern_orthodox , neo_orthodox , nondenominational , orthodox, Orthodox Judaism, orthodox_ashkenaz , orthodox_sefard , progressive , reconstructionist , reform , renewal , samaritan , sefardi , sefardi_amsterdam , sefardi_london , traditional , ultra_orthodox , unaffiliated , yemenite , yemenite_baladi , yemenite_shami , Devi/Bhagavati, Krishna, Siva, Parasurama, Muthappan, adventist, alliance, anglican, assemblies_of_god, apostolic, armenian_apostolic, assyrian, baptist, catholic, catholic_apostolic, christ_scientist, christian_community, church_of_scotland, church_of_sweden, coptic_orthodox, czechoslovak_hussite, dutch_reformed, episcopal, evangelical, evangelical_covenant, exclusive_brethren, foursquare, greek_catholic, greek_orthodox, iglesia_ni_cristo, jehovahs_witness, kimbanguist, living_waters_church, lutheran, mariavite, maronite, mennonite, messianic_jewish, methodist, mission_covenant_church_of_sweden, moravian, mormon, nazarene, new_apostolic, nondenominational, orthodox, old_believers, old_catholic, pentecostal, philippine_independent, polish_catholic, polish_national_catholic, presbyterian, protestant, quaker, reformed, roman_catholic, russian_orthodox, 
salvation_army, santo_daime, serbian_orthodox, seventh_day_adventist, spiritist, united, united_church_of_christ, united_free_church_of_scotland, united_methodist, united_reformed, uniting]
Sample of lines that I want to filter by tags:
35653969 15 -0.14235 51.506416 74937968#N00 DSC02635 1124566870 1085303897 http://www.flickr.com/photos/mount_otz/35653969/ 6 UK;England;London;Hyde Park;Speaker's Corner;Singers uk;england;london;hydepark;speakerscorner;singers
35654116 15 -0.14235 51.506416 74937968#N00 DSC02641 1124566908 1085304006 http://www.flickr.com/photos/mount_otz/35654116/ 5 UK;England;London;Hyde Park;Speaker's Corner uk;england;london;hydepark;speakerscorner
35654245 15 -0.14235 51.506416 74937968#N00 DSC02639 1124566937 1085303967 http://www.flickr.com/photos/mount_otz/35654245/ "Today Speaker's Corner has three main topics, religion, the "evil USA" and the war - he was overdoing the first one....<br />" 5 UK;England;London;Hyde Park;Speaker's Corner uk;england;london;hydepark;speakerscorner
Sample of input:
1934995263 15 -0.072269 51.502712 99245765#N00 "Zaha Hadid Exhibition, Abu Dhabi Performing Arts Centre, 2007- Ongoing" 1194630416 1194615799 http://www.flickr.com/photos/blahflowers/1934995263/ 7 architecture;buildings;Abu Dhabi;United Arab Emirates;London;Zaha Hadid;Design Museum architecture;buildings;abudhabi;unitedarabemirates;london;zahahadid;designmuseum
1935258871 15 -0.128198 51.508354 20914166#N00 lomographers 1194632555 1194632555 http://www.flickr.com/photos/dreifachzucker/1935258871/ if I only remembered all the names. 22 "voigtl?�nder;bessa;rangefinder;wide angle;bessa L;super wide heliar aspherical 15mm f:4,5;film;agfa ultra 100;c41;analog;analogue;september;2007;september 21, 2007;september 2007;september 21 til 23, 2007;london;england;uk;united kingdom;lomography world congress 2007;lomo green shoot with scootiepye" voigtl?�nder;bessa;rangefinder;wideangle;bessal;superwideheliaraspherical15mmf45;film;agfaultra100;c41;analog;analogue;september;2007;september212007;september2007;september21til232007;london;england;uk;unitedkingdom;lomographyworldcongress2007;lomogreenshootwithscootiepye
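One thing worth checking is whether the keywords are being matched as substrings. The following is a minimal sketch, assuming whole-word matching is what is wanted; KeywordFilterSketch, buildPattern and containsKeyword are hypothetical names. It trims each keyword, quotes it so that characters such as "/" or "-" are taken literally, and wraps the alternation in \b word boundaries. Note that "united" is in the tag list, and the sample input contains both "United Arab Emirates" and "unitedarabemirates", so that line matches with plain find() and still matches as a whole word.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class KeywordFilterSketch {

    // Builds one case-insensitive pattern that matches any keyword as a whole word.
    static Pattern buildPattern(List<String> keywords) {
        String alternation = keywords.stream()
                .map(String::trim)             // the tag list has entries like "hasidic "
                .filter(k -> !k.isEmpty())
                .map(Pattern::quote)           // treat every keyword as literal text
                .collect(Collectors.joining("|"));
        return Pattern.compile("\\b(?:" + alternation + ")\\b", Pattern.CASE_INSENSITIVE);
    }

    // True when the line contains at least one keyword as a whole word.
    static boolean containsKeyword(Pattern pattern, String line) {
        return pattern.matcher(line).find();
    }

    public static void main(String[] args) {
        Pattern p = buildPattern(Arrays.asList("zen", "united", "place of worship"));
        System.out.println(containsKeyword(p, "citizen;london"));       // "zen" only inside a word
        System.out.println(containsKeyword(p, "United Arab Emirates")); // "united" as a whole word
    }
}
```

Compiling the pattern once, outside the read loop, also avoids recompiling every keyword for every line.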
I have this assignment where I need to populate a 2D array using info from a text file. I am trying to keep everything aligned by using the primekey variable to replace the posCon counter that I am using temporarily. What is stopping me is that primekey won't increment correctly because another variable, emailcheck, is stuck (this is the main problem). I'm pretty sure it has to do with my for loop syntax, but I can't quite figure it out.
...
try{
Scanner check = new Scanner(file);
Scanner checkNext = new Scanner(file);
System.out.println("Success File load");
String data=check.next();
System.out.println("data.next() works");
int emailcheck=0;
int primekey=0;
while(check.hasNext()){
posCon++;
//check for # symbol
for(int i=0;i<data.length();i++){
if(data.charAt(i)=='#'){
emailcheck=emailcheck+1;
}
}
//populates position array
if(data.equalsIgnoreCase("staff")||
data.equalsIgnoreCase("freshman")||
data.equalsIgnoreCase("sophomore")||
data.equalsIgnoreCase("junior")||
data.equalsIgnoreCase("senior")||
data.equalsIgnoreCase("adjunct")||
data.equalsIgnoreCase("professor"))
{
db[0][posCon]=data;
sort=1;
data=check.next();
}
//id
else if(sort==1){
db[1][posCon]=data;
sort=2;
data=check.next();
}
//firstname
else if(sort==2){
db[2][posCon]=data;
sort=3;
data=check.next();
}
//lastname
else if(sort==3){
db[3][posCon]=data;
sort=4;
data=check.next();
}
//department
else if(sort==4){
db[4][posCon]=data;
sort=5;
data=check.next();
}
//email
else if(sort==5 && emailcheck==1){
db[5][posCon]=data;
sort=6;
emailcheck=0;
}
else if(sort==5 && emailcheck==0){
db[5][posCon]="not here";
sort=6;
}
//room
else if(sort==6){
db[6][posCon]=data;
sort=0;
data=check.next();
emailcheck=0;
primekey=primekey+1;
System.out.println(primekey);
}
else{
sort=0;
data=check.next();
emailcheck=0;
}
}
}catch(FileNotFoundException e) {
e.printStackTrace();
}
}//End Constructor
Here is the data from the text file:
Staff 77778 Julie Chang Registrar
Adjunct 19778 Mike Thompson CS mtxxx#gmail.com GITC2400
Staff 30041 Anne Mathews Security
Junior 98444 Serene Murray Math smyyy#gmail.com
Freshman 98772 Bob Mathew CS bmyyy#gmail.com
Professor 19010 Joan Berry Math jbxxx#gmail.com GITC2315C
Professor 19871 Aparna Khat CS akxxx#gmail.com GITC1400
Adjunct 18821 Hari Mentor Physics hmxxx#gmail.com CK231
Staff 20112 Jim George Plant
Junior 68339 Tom Harry CS thyyy#gmail.com
Senior 78883 Vince Charles IT vcyyy#gmail.com
Freshman 87777 Susan Han EE shyyy#gmail.com
Senior 88888 Janki Khat IE jkyyy#gmail.com
Staff 5555 Aparna Sen Plant
Senior 66663 Jill Kaley it jk#jk.com
Staff 77777 Joe Batra plumbing
Staff 33333 Jim Natale Plumbing
The data = check.next() call is in the wrong place: it should come just after the start of the loop that has the check.hasNext() test, and nowhere else. You also only need one Scanner instance; the second one you declare is redundant.
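A minimal sketch of that loop shape, assuming the text is read from a String instead of a file so that it runs on its own; SingleScannerSketch and findEmails are hypothetical names. There is one Scanner and exactly one next() call, placed directly after the hasNext() test, so every token is examined once. The '#' test mirrors the check in the question's code and sample data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class SingleScannerSketch {

    // Collects the tokens that contain the marker character used in the sample emails.
    static List<String> findEmails(String text) {
        List<String> emails = new ArrayList<>();
        try (Scanner check = new Scanner(text)) {
            while (check.hasNext()) {
                String data = check.next();   // the only next() call in the loop
                if (data.indexOf('#') >= 0) { // '#' as in the sample data
                    emails.add(data);
                }
            }
        }
        return emails;
    }

    public static void main(String[] args) {
        String sample = "Staff 77778 Julie Chang Registrar\n"
                + "Adjunct 19778 Mike Thompson CS mtxxx#gmail.com GITC2400\n";
        System.out.println(findEmails(sample));
    }
}
```

Because each token is read and tested in the same iteration, a per-token flag such as emailcheck has no chance to carry over from an earlier token.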
Yesterday I asked a question about building a GUI. I have completely dropped that approach, since I found it a little too complicated for me to deal with.
Now I am reworking the program for the console.
And I have got myself stuck again. My problem this time: how can I jump back to a point before an if statement was executed?
Direct example:
import java.util.Scanner;
import java.io.*;
public class HBA {
public static void main(String[] args) {
Scanner scan = new Scanner(System.in);
System.out.println("Congratulations, Anna! Have fun with your present!") ;
System.out.println("Next to this program there is a password-protected file that contains another part of your present. "
+ "To get the password, however, you will have to answer a few especially difficult questions! "
+ "When you are ready for the quiz, type 'ok'.");
String OK, Q1, Q2, Q3, Q4, Q5, Q6, Q7;
BufferedReader repo = null;
OK = scan.next();
if (OK == "ok") {
System.out.println("All right, let's start with something easy!");
}
else {
System.out.println("What... you already fail at something this simple? Try again!");
}
System.out.println("Question 1: Who is Supergeil? \n A: Erik \n B: Anna \n C: The J \n D: Friedrich Liechtenstein");
mark(0);
Q1 = scan.next();
if (Q1 == "D") {
System.out.println("Correct! The first letter is: S");
}
else {
System.out.println("Sorry, that's wrong. Try again.");
reset();
}
}
}
The script works as expected, except when you type a wrong answer in the last part:
System.out.println("Question 1: Who is Supergeil? \n A: Erik \n B: Anna \n C: The J \n D: Friedrich Liechtenstein");
mark(0);
Q1 = scan.next();
if (Q1 == "D") {
System.out.println("Correct! The first letter is: S");
}
else {
System.out.println("Sorry, that's wrong. Try again.");
reset();
}
}
}
It just ends the script. Instead, it should jump back to the point before the if statement.
That is: the answer to the question printed by System.out.println is D. If you type A instead, the program should reset so that you can try again with a different answer. How can I achieve that? I read that BufferedReader has mark() and reset() methods, but I don't know whether they work the way I expect or how I would integrate them.
I also thought about using a while or a do loop, but I haven't found a way to do that yet.
Can someone please enlighten me?
Thanks!
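A loop is the usual way to get this "try again" behaviour. The following is a minimal sketch, assuming the question text and expected answer are passed in; RetrySketch and askUntilCorrect are hypothetical names. It repeats the question until the expected answer is typed, and it compares strings with equals(), because == on strings compares object references rather than contents.

```java
import java.util.Scanner;

public class RetrySketch {

    // Asks the question until the expected answer is typed.
    // Returns the number of attempts, or -1 if the input ends first.
    static int askUntilCorrect(Scanner scan, String question, String expected) {
        int attempts = 0;
        System.out.println(question);
        while (scan.hasNext()) {
            attempts++;
            if (scan.next().equals(expected)) {
                return attempts;
            }
            System.out.println("Sorry, that's wrong. Try again.");
            System.out.println(question); // "jump back": ask the same question again
        }
        return -1;
    }

    public static void main(String[] args) {
        Scanner scan = new Scanner(System.in);
        String question = "Question 1: Who is Supergeil? \n A: Erik \n B: Anna \n C: The J \n D: Friedrich Liechtenstein";
        if (askUntilCorrect(scan, question, "D") > 0) {
            System.out.println("Correct! The first letter is: S");
        }
    }
}
```

The same helper can be called once per question, which keeps the retry logic in one place.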