small java problem - java

Sorry if my question sounds dumb. But some time small things create big problem for you and take your whole time to solve it. But thanks to stackoverflow where i can get GURU advices. :)
So here is my problem. i search for a word in a string and put 0 where that word occur.
For example : search word is DOG and i have string "never ever let dog bite you" so the string
would be 000100 . Now when I try to convert this string into INT it produce result 100 :( which is bad. I also can not use int array i can only use string as i am concatinating it, also using somewhere else too in program.
Now i am sure you are wondering why i want to convert it into INT. So here my answer. I am using 3 words from each string to make this kind of binary string. So lets say i used three search queries like ( dog, dog, ever ) so all three strings would be
000100
000100
010000
Then I want to SUM them it should produce result like this "010200" while it produce result "10200" which is wrong. :(
Thanks in advance

Of course the int representation won't retain leading zeros. But you can easily convert back to a String after summing and pad the zeros on the left yourself - just store the maximum length of any string (assuming they can have different lengths). Or if you wanted to get even fancier you could use NumberFormat, but you might find this to be overkill for your needs.
Also, be careful - you will get some unexpected results with this code if any word appears in 10 or more strings.

Looks like you might want to investigate java.util.BitSet.

You could prefix your value with a '1', that would preserve your leading 0's. You can then take that prefix into account you do your sum in the end.
That all is assuming you work through your 10 overflow issue that was mentioned in another comment.

Could you store it as a character array instead? Your using an int, which is fine, but your really not wanting an int - you want each position in the int to represent words in a string, and you turn them on or off (1 or 0). Seems like storing them in a character array would make more sense.

Related

Making a prescriptionCode (variable) appear in specific format in java

sorry for the title but couldn't really express this in another way.
So, let's say I have a variable that represents a prescription's unique code. I already know that there a total of 400 prescriptions. So for every new prescription I would like that code to change by one. The first one I want it to be 001, the second one 002 etc. I know I can just set a static int but how can I make the 0's appear in the front so it prints 001 and not just 1? I am new to java so I might be asking a really stupid question. Thanks for your time!
You can format it with Java format specifiers.
Here would be the code to do that:
int prescriptionCode = 1;
System.out.println(String.format("%03d", prescriptionCode));
The string "%03d" is the format specifier. Going backwards, "d" indicates that the value you want here is a decimal, "3" indicates that you want the value to be 3 characters long regardless of its actual length, "0" means that you want to fill the remaining space the number doesn't take up with 0's.

Java: Limiting decimals from a double in output without e.g. %5.2d

I'm very (read: extremely) new to java and was going to make a table with 5 columns print out for an assignement.
The two first columns are strings, third one an int, fourth a double and fifth an int. It is three rows in total that are affected by this.
I formatted it with printf and:
System.out.printf(Locale.ENGLISH, "%s%10s%14d%20.2f%12d\n", stringOne,
stringTwo, firstInt, stupidDouble, secondInt);
and
"%s\t%s\t\t%d\t\t%2.4f\t\t%d\n" (with the format and stuff above as well ofc).
But the teacher didn't want me to "hard code" the layout since it's part of learning special commands (like \t) and thus wanted me to change it and adress it in the variable itself instead.
I've been trying everything I can think of but can't get it to skip the last decimals, which are unneccesary 0's.
I wish I could've just used the numbers as a string instead, but I need it to calculate the last int with:
static int lastInt = (int) Math.round ( stupidDouble - firstInt )
What I suspect that my teacher is looking for me to do is tabs all the way like:
"%s\t%s\t\t%d\t\t%f\t\t%d\n"
But I need to double to be e.g. 1.2345 instead of 1.234500.
If it makes any difference, it's 3 different doubles (3 different rows in the table); one with 2 decimals, one with 4 and one with 5.
It would more or less save my weekend if someone could help me with this. The simplest possible solution would be much appreciated. <3

using java to parse a csv then save in 2D array

Okay so i am working on a game based on a Trading card game in java. I Scraped all of the game peices' "information" into a csv file where each row is a game peice and each column is a type of attribute for that peice. I have spent hours upon hours writing code with Buffered reader and etc. trying to extract the information from my csv file into a 2d Array but to no avail. My csv file is linked Here: http://dl.dropbox.com/u/3625527/MonstersFinal.csv I have one year of computer science under my belt but I still cannot figure out how to do this.
So my main question is how do i place this into a 2D array that way i can keep the rows and columns?
Well, as mentioned before, some of your strings contain commas, so initially you're starting from a bad place, but I do have a solution and it's this:
--------- If possible, rescrape the site, but perform a simple encoding operation when you do. You'll want to do something like what you'll notice tends to be done in autogenerated XML files which contain HTML; reserve a 'control character' (a printable character works best, here, for reasons of debugging and... well... sanity) that, once encoded, is never meant to be read directly as an instance of itself. Ampersand is what I like to use because it's uncommon enough but still printable, but really what character you want to use is up to you. What I would do is write the program so that, at every instance of ",", that comma would be replaced by "&c" before being written to the CSV, and at every instance of an actual ampersand on the site, that "&" would be replaced by "&a". That way, you would never have the issue of accidentally separating a single value into two in the CSV, and you could simply decode each value after you've separated them by the method I'm about to outline in...
-------- Assuming you know how many columns will be in each row, you can use the StringTokenizer class (look it up- it's awesome and built into Java. A good place to look for information is, as always, the Java Tutorials) to automatically give you the values you need in the form of an array.
It works by your passing in a string and a delimiter (in this case, the delimiter would be ','), and it spitting out all the substrings which were separated by those commas. If you know how many pieces there are in total from the get-go, you can instantiate a 2D array at the beginning and just plug in each row the StringTokenizer gives them to you. If you don't, it's still okay, because you can use an ArrayList. An ArrayList is nice because it's a higher-level abstraction of an array that automatically asks for more memory such that you can continue adding to it and know that retrieval time will always be constant. However, if you plan on dynamically adding pieces, and doing that more often than retrieving them, you might want to use a LinkedList instead, because it has a linear retrieval time, but a much better relation than an ArrayList for add-remove time. Or, if you're awesome, you could use a SkipList instead. I don't know if they're implemented by default in Java, but they're awesome. Fair warning, though; the cost of speed on retrieval, removal, and placement comes with increased overhead in terms of memory. Skip lists maintain a lot of pointers.
If you know there should be the same number of values in each row, and you want them to be positionally organized, but for whatever reason your scraper doesn't handle the lack of a value for a row, and just doesn't put that value, you've some bad news... it would be easier to rewrite the part of the scraper code that deals with the lack of values than it would be to write a method that interprets varying length arrays and instantiates a Piece object for each array. My suggestion for this would again be to use the control character and fill empty columns with &n (for 'null') to be interpreted later, but then specifics are of course what will individuate your code and coding style so it's not for me to say.
edit: I think the main thing you should focus on is learning the different standard library datatypes available in Java, and maybe learn to implement some of them yourself for practice. I remember implementing a binary search tree- not an AVL tree, but alright. It's fun enough, good coding practice, and, more importantly, necessary if you want to be able to do things quickly and efficiently. I don't know exactly how Java implements arrays, because the definition is "a contiguous section of memory", yet you can allocate memory for them in Java at runtime using variables... but regardless of the specific Java implementation, arrays often aren't the best solution. Also, knowing regular expressions makes everything much easier. For practice, I'd recommend working them into your Java programs, or, if you don't want to have to compile and jar things every time, your bash scripts (if your using *nix) and/or batch scripts (if you're using Windows).
I think the way you've scraped the data makes this problem more difficult than it needs to be. Your scrape seems inconsistent and difficult to work with given that most values are surrounded by quotes inconsistently, some data already has commas in it, and not each card is on its own line.
Try re-scraping the data in a much more consistent format, such as:
R1C1|R1C2|R1C3|R1C4|R1C5|R1C6|R1C7|R1C8
R2C1|R2C2|R2C3|R2C4|R2C5|R2C6|R2C7|R3C8
R3C1|R3C2|R3C3|R3C4|R3C5|R3C6|R3C7|R3C8
R4C1|R4C2|R4C3|R4C4|R4C5|R4C6|R4C7|R4C8
A/D Changer|DREV-EN005|Effect Monster|Light|Warrior|100|100|You can remove from play this card in your Graveyard to select 1 monster on the field. Change its battle position.
Where each line is definitely its own card (As opposed to the example CSV you posted with new lines in odd places) and the delimiter is never used in a data field as something other than a delimiter.
Once you've gotten the input into a consistently readable state, it becomes very simple to parse through it:
BufferedReader br = new BufferedReader(new FileReader(new File("MonstersFinal.csv")));
String line = "";
ArrayList<String[]> cardList = new ArrayList<String[]>(); // Use an arraylist because we might not know how many cards we need to parse.
while((line = br.readLine()) != null) { // Read a single line from the file until there are no more lines to read
StringTokenizer st = new StringTokenizer(line, "|"); // "|" is the delimiter of our input file.
String[] card = new String[8]; // Each card has 8 fields, so we need room for the 8 tokens.
for(int i = 0; i < 8; i++) { // For each token in the line that we've read:
String value = st.nextToken(); // Read the token
card[i] = value; // Place the token into the ith "column"
}
cardList.add(card); // Add the card's info to the list of cards.
}
for(int i = 0; i < cardList.size(); i++) {
for(int x = 0; x < cardList.get(i).length; x++) {
System.out.printf("card[%d][%d]: ", i, x);
System.out.println(cardList.get(i)[x]);
}
}
Which would produce the following output for my given example input:
card[0][0]: R1C1
card[0][1]: R1C2
card[0][2]: R1C3
card[0][3]: R1C4
card[0][4]: R1C5
card[0][5]: R1C6
card[0][6]: R1C7
card[0][7]: R1C8
card[1][0]: R2C1
card[1][1]: R2C2
card[1][2]: R2C3
card[1][3]: R2C4
card[1][4]: R2C5
card[1][5]: R2C6
card[1][6]: R2C7
card[1][7]: R3C8
card[2][0]: R3C1
card[2][1]: R3C2
card[2][2]: R3C3
card[2][3]: R3C4
card[2][4]: R3C5
card[2][5]: R3C6
card[2][6]: R3C7
card[2][7]: R4C8
card[3][0]: R4C1
card[3][1]: R4C2
card[3][2]: R4C3
card[3][3]: R4C4
card[3][4]: R4C5
card[3][5]: R4C6
card[3][6]: R4C7
card[3][7]: R4C8
card[4][0]: A/D Changer
card[4][1]: DREV-EN005
card[4][2]: Effect Monster
card[4][3]: Light
card[4][4]: Warrior
card[4][5]: 100
card[4][6]: 100
card[4][7]: You can remove from play this card in your Graveyard to select 1 monster on the field. Change its battle position.
I hope re-scraping the information is an option here and I hope I haven't misunderstood anything; Good luck!
On a final note, don't forget to take advantage of OOP once you've gotten things worked out. a Card class could make working with the data even simpler.
I'm working on a similar problem for use in machine learning, so let me share what I've been able to do on the topic.
1) If you know before you start parsing the row - whether it's hard-coded into your program or whether you've got some header in your file that gives you this information (highly recommended) - how many attributes per row there will be, you can reasonably split it by comma, for example the first attribute will be RowString.substring(0, RowString.indexOf(',')), the second attribute will be the substring from the first comma to the next comma (writing a function to find the nth instance of a comma, or simply chopping off bits of the string as you go through it, should be fairly trivial), and the last attribute will be RowString.substring(RowString.lastIndexOf(','), RowString.length()). The String class's methods are your friends here.
2) If you are having trouble distinguishing between commas which are meant to separate values, and commas which are part of a string-formatted attribute, then (if the file is small enough to reformat by hand) do what Java does - represent characters with special meaning that are inside of strings with '\,' rather than just ','. That way you can search for the index of ',' and not '\,' so that you will have some way of distinguishing your characters.
3) As an alternative to 2), CSVs (in my opinion) aren't great for strings, which often include commas. There is no real common format to CSVs, so why not make them colon-separated-values, or dash-separated-values, or even triple-ampersand-separated-values? The point of separating values with commas is to make it easy to tell them apart, and if commas don't do the job there's no reason to keep them. Again, this applies only if your file is small enough to edit by hand.
4) Looking at your file for more than just the format, it becomes apparent that you can't do it by hand. Additionally, it would appear that some strings are surrounded by triple double quotes ("""string""") and some are surrounded by single double quotes ("string"). If I had to guess, I would say that anything included in a quotes is a single attribute - there are, for example, no pairs of quotes that start in one attribute and end in another. So I would say that you could:
Make a class with a method to break a string into each comma-separated fields.
Write that method such that it ignores commas preceded by an odd number of double quotes (this way, if the quote-pair hasn't been closed, it knows that it's inside a string and that the comma is not a value separator). This strategy, however, fails if the creator of your file did something like enclose some strings in double double quotes (""string""), so you may need a more comprehensive approach.

Printing an ArrayList of Strings to a PrintWriter with word wrap

Some classmates and I are working on a homework assignment for Java that requires we print an ArrayList of Strings to a PrintWriter using word wrap, so that none of the output passes 80 characters. We've extensively Googled this and can't find any Java API based way to do this.
I know it's generally "wrong" to ask a homework question on SO, but we're just looking for recommendations of the best way to do this, or if we missed something in the API. This isn't the major part of the homework, just a small output requirement.
Ideally, I'd like to be able to wordwrap the ArrayList's toString since it's nicely formatted already.
Well, this is a first for me, it's the first time one of my students has posted a question about one of the projects I've assigned them. The way it was phrased, that he was looking for an algorithm, and the answers you've all shared are just fine with me. However, this is a typical case of trying to make things too complicated. A part of the spec that was not mentioned was that the 80 characters limit was not a hard limit. I said that each line of the output file had to be roughly 80 characters long. It was OK to go over 80 a little. In my version of the solution, I just had a running count and did a modulus of the count to add the line end. I varied the value of the modulus until the output file looked right. This resulted in lines with small numbers being really short so I used a different modulus when the numbers were small. This wasn't a big part of the project and it's interesting that this got so much attention.
Our solution was to create a temporary string and append elements one by one, followed by a comma. Before adding an element, check if adding it will make the string longer than 80 characters and choose whether to print it and reset or just append.
This still has the issue with the extra trailing comma, but that's been dealt with so many times we'll be fine. I was looking to avoid this because it was originally more complicated in my head than it really is.
I think that better solution is to create your own WrapTextWriter that wraps any other writer and overrides method public void write(String str, int off, int len) throws IOException. Here it should run in loop and perform logic of wrapping.
This logic is not as simple as str.substring(80). If you are dealing with real text and wish to wrap it correctly (i.e. do not cut words, do not move comas or dots to the next line etc) you have to implement some logic. it is probably not too complicated but probably language dependent. For example in English there is not space between word and colon while in French they put space between them.
So, I performed 5 second googling and found the following discussion that can help you.
private static final int MAX_CHARACTERS = 80;
public static void main(String[] args)
throws FileNotFoundException
{
List<String> strings = new ArrayList<String>();
int size = 0;
PrintWriter writer = new PrintWriter(System.out, true); // Just as example
for (String str : strings)
{
size += str.length();
if (size > MAX_CHARACTERS)
{
writer.print(System.getProperty("line.separator") + str);
size = 0;
}
else
writer.print(str);
}
}
You can simply write a function, like "void printWordWrap(List<String> strings)", with that algorithm inside. I think, it`s a good way to solve your problem. :)

comparing "the likes" smartly

Suppose you need to perform some kind of comparison amongst 2 files. You only need to do it when it makes sense, in other words, you wouldn't want to compare JSON file with Property file or .txt file with .jar file
Additionally suppose that you have a mechanism in place to sort all of these things out and what it comes down to now is the actual file name. You would want to compare "myFile.txt" with "myFile.txt", but not with "somethingElse.txt". The goal is to be as close to "apples to apples" rules as possible.
So here we are, on one side you have "myFile.txt" and on another side you have "_myFile.txt", "_m_y_f_i_l_e.txt" and "somethingReallyClever.txt".
Task is to pick the closest name to later compare. Unfortunately, identical name is not found.
Looking at the character composition, it is not hard to figure out what the relationship is. My algo says:
_myFile.txt to _m_y_f_i_l_e.txt 0.312
_myFile.txt to somethingReallyClever.txt 0.16
So _m_y_f_i_l_e.txt is closer to_myFile.txt then somethingReallyClever.txt. Fantastic. But also says that ist is only 2 times closer, where as in reality we can look at the 2 files and would never think to compare somethingReallyClever.txt with _myFile.txt.
Why?
What logic would you suggest i apply to not only figure out likelihood by having chars on the same place, but also test whether determined weight makes sense?
In my example, somethingReallyClever.txt should have had a weight of 0.0
I hope i am being clear.
Please share your experience and thoughts on this.
(whatever approach you suggest should not depend on number of characters filename consists out of)
Possibly helpful previous question which highlights several possible algorithms:
Word comparison algorithm
These algorithms are based on how many changes would be needed to get from one string to the other - where a change is adding a character, deleting a character, or replacing a character.
Certainly any sensible metric here should have a low score as meaning close (think distance between the two strings) and larger scores as meaning not so close.
Sounds like you want the Levenshtein distance, perhaps modified by preconverting both words to the same case and normalizing spaces (e.g. replace all spaces and underscores with empty string)

Categories

Resources