Manual String Find and Replace [closed] - java

Given a string I #eat# #hamburgers# and a StringBuilder eat: [eat, consume, like] hamburgers: [hamburgers, spinach, bananas], I want to randomly replace the words within hashmarks with randomly chosen ones from their wordbanks, so that phrases such as I like bananas and I consume spinach will be generated. Code to randomly select another word, given a token (i.e. eat, hamburgers) has been written.
I need to use this regex #[^#]+# to find words within the initial string contained by hashmarks, pass them to the replace method, and then put their random correlates back inside the initial string. I tried using StringTokenizer, but realized it's not the tool for the job.
I need to somehow extract the first word within hashmarks and pass it to the method calling for its replacement before calling the method archetypeString(#[^#]+#, replacement) in such a way so that when the loop runs again, both the word grabber&passer-to method and the replacement method are then working with the second hashed word.
tokenizer dead-end:
StringTokenizer stt = new StringTokenizer(archetype);
while(stt.hasMoreTokens()){
String temp = stt.nextToken();
if(temp.charAt(0)=='#');
}
and the getPhrase method:
public List<String> getPhrases(StringBuilder fileContent, String token) {
StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(fileContent.toString()));
List<String> list = new ArrayList<String>();
try {
while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
if (tokenizer.sval.equals(token)) {
tokenizer.nextToken(); // '['
do {
tokenizer.nextToken(); // go to the number
list.add(String.valueOf(tokenizer.sval));
} while (tokenizer.nextToken() == ',');
break;
}
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
return list;
}

It is not clear from your question whether this is part of some sadistic homework assignment or just the first way you thought of to solve whatever problem you're trying to solve. This is not a regular expression problem any more than it's a StringTokenizer problem.
Look at String.format(), and the formatting capabilities of Formatter. I do not understand why you would ever need to know what the last string you generated was if your object is to generate the next one at random. Just pick a new random value and format it with String.format().
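As a rough illustration of that suggestion, here is a minimal sketch that picks one word from each bank and formats the sentence with String.format(). The class name, the method names and the hard-coded word banks are assumptions made for the example; they are not taken from the asker's program.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

final class RandomPhrase {
    // Hypothetical word banks, copied from the example in the question.
    static final List<String> VERBS = Arrays.asList("eat", "consume", "like");
    static final List<String> FOODS = Arrays.asList("hamburgers", "spinach", "bananas");

    // Returns one randomly chosen entry from the given bank.
    static String pick(List<String> bank, Random random) {
        return bank.get(random.nextInt(bank.size()));
    }

    // Builds a sentence such as "I like bananas" from two random picks.
    static String phrase(Random random) {
        return String.format("I %s %s", pick(VERBS, random), pick(FOODS, random));
    }

    public static void main(String[] args) {
        System.out.println(phrase(new Random()));
    }
}
```

Each call to phrase() is independent of the previous one, which is why nothing about the last generated string needs to be remembered.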
--
After reading your comment to this answer and looking at the question you referred to, I'm going to make a couple of recommendations.
(1) Start with a simpler coding assignment or two, something without regular expressions. Make sure you absolutely understand the following concepts: instance variables, variable scope, public versus private methods, passing parameters to methods, and returning values from methods. You can do quite a bit with just that much. You don't need to study inheritance until you have all of those down cold, and I recommend that you do not try.
(2) for each coding assignment for at least your first 5, make sure you have written out what your program is to be provided as data and what output it is supposed to produce. List any constraints someone has given you separately (must use class X, must display error message, whatever).
(3) Put opening braces and closing braces on lines by themselves; match each opening brace with a closing brace indented the same amount. Indent code within each pair of braces another 2 or 3 spaces further to the right. This means that brace pairs inside other brace pairs will be indented further. I know this is not the way you see most code, and plenty of people will tell you that it is "wrong". But until you get comfortable with scope and whether a given place in your code is inside or outside a method or a loop, I think it best that you give yourself these extra visual cues. For someone not familiar with other ways of doing things, this is easiest.
(4) be careful of your terms when posting here. In the other question you refer to, you say it is about inheritance, but it uses "implements", indicating that it is implementing an interface, not inheriting from a class. It is confusing to those of us trying to help you if you get the terminology wrong.
(5) when you post here: post the entire program (these early assignments should all be under 100 lines total, no reason not to post all of it). Make sure it is properly indented; use spaces instead of tabs. In text, and maybe also in comments, point out the place in the code where you seem to have the problem (if you know). If there is an error message, post the entire error message (don't tell us what it is, and don't try to interpret it for us). Work on your code until you have a specific question: why do I get a compile error here? Why do I get (or fail to get) this output? The program outputs X but I expected Y, why is that? etc.
We're not a tutorial shop; most of us need instruction to learn to program, and you need to get most of that somewhere besides here. We are willing to help with your questions, given that your questions are specific and reasonable and you aren't expecting us to provide the instruction. By itself, "I'm lost and need help" is a bit beyond StackOverflow's normal way of operating.
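For readers who do want to keep the #token# template from the question, one possible sketch uses java.util.regex.Matcher with appendReplacement, which walks the tokens one at a time. The HashTemplate class, the fill method and the Map of word banks are hypothetical stand-ins for the asker's own code; unknown tokens are simply left in place.

```java
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HashTemplate {
    // Same idea as the regex in the question, with a group around the word itself.
    private static final Pattern TOKEN = Pattern.compile("#([^#]+)#");

    // Replaces each #word# with a random entry from that word's bank.
    static String fill(String template, Map<String, List<String>> banks, Random random) {
        Matcher matcher = TOKEN.matcher(template);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String token = matcher.group(1);          // e.g. "eat"
            List<String> bank = banks.get(token);
            String replacement = (bank == null || bank.isEmpty())
                    ? matcher.group()                 // unknown token: keep "#word#" as it is
                    : bank.get(random.nextInt(bank.size()));
            // quoteReplacement stops '$' and '\' in a word from being treated specially
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(result);                   // copy whatever follows the last token
        return result.toString();
    }
}
```

Because find() advances through the string, the second pass of the loop is already working with the second hashed word, which is the behaviour the question asks for.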

Related

Count how many list entries have a string property that ends with a particular char

I have an array list with some names inside it (first and last names). What I have to do is go through each "first name" and see how many times a character (which the user specifies) shows up at the end of every first name in the array list, and then print out the number of times that character showed up.
public int countFirstName(char c) {
int i = 0;
for (Name n : list) {
if (n.getFirstName().length() - 1 == c) {
i++;
}
}
return i;
}
That is the code I have. The problem is that the counter (i) doesn't add 1 even if there is a character that matches the end of the first name.
You're comparing the index of last character in the string to the required character, instead of the last character itself, which you can access with charAt:
String firstName = n.getFirstName();
if (firstName.charAt(firstName.length() - 1) == c) {
i++;
}
When you're setting out learning to code, there is great value in using pencil and paper, or describing your algorithm ahead of time, in the language you think in. Most people who learn a foreign language start out by assembling a sentence in their native language, translating it, then speaking the translation. Few, if any, learners of a foreign language are able to think in it natively.
Coding is no different; all your life you've been speaking English and thinking in it. Now you're aiming to learn a different pattern of thinking, syntax, key words. This task will go a lot easier if you:
work out in high level natural language what you want to do first
write down the steps in clear and simple language, like a recipe
don't try to do too much at once
Had I been a tutor marking your program, I'd have been looking for something like this:
//method to count the number of list entries ending with a particular character
public int countFirstNamesEndingWith(char lookFor) {
//declare a variable to hold the count
int cnt = 0;
//iterate the list
for (Name n : list) {
//get the first name
String fn = n.getFirstName();
//get the last char of it
char lc = fn.charAt(fn.length() - 1);
//compare
if (lc == lookFor) {
cnt++;
}
}
return cnt;
}
Taking the bullet points in turn:
The comments serve as a high level description of what must be done. We write them all first, before writing a single line of code. My course penalised uncommented code, and writing the comments first was a handy way of getting that requirement out of the way (they're a chore, right? Not always, but..). It is also really easy to write an algorithm in high level language and then translate the steps into the language you're learning. I definitely think that if you'd taken this approach you wouldn't have made the error you did, as it would have been clear that the code you wrote didn't implement the algorithm you'd described earlier.
Don't try to do too much in one line. Yes, I'm sure plenty of coders think it looks cool, or trick, or shows off what impressive coding smarts they have to pack a good 10 line algorithm into a single line of code that uses some obscure language features but one day it's highly likely that someone else is going to have to come along to maintain that code, improve it or change part of what it does - at that moment it's no longer cool, and it was never really a smart thing to do
Aominee, in their comment, actually gives us something like an example of this:
return (int) list.stream().filter(e -> e.charAt(e.length() - 1) == c).count();
It's a one line implementation of a solution to your problem. Cool huh? Well, it has a bug* (for a start) but it's not the main thrust of my argument. At a more basic level: have you got any idea what it's doing? can you look at it and in 2 seconds tell me how it works?
It's quite an advanced language feature, it's trick for sure, but it might be a very poor solution because it's hard to understand, hard to maintain as a result, and does a lot while looking like a little- it only really makes sense if you're well versed in the language. This one line bundles up a facility that loops over your list, a feature that effectively has a tiny sub method that is called for every item in the list, and whose job is to calculate if the name ends with the sought char
It's a brilliant feature, a cute example, and it surely has its place in production Java, but its place is probably not here, in your learning exercise.
Similarly, I'd go as far to say that this line of yours:
if (n.getFirstName().length() - 1 == c) {
Is approaching "doing too much" - I say this because it's where your logic broke down; you didn't write enough code to effectively implement the algorithm. You'd actually have to write even more code to implement this way:
if (n.getFirstName().charAt(n.getFirstName().length() - 1) == c) {
This is a right eyeful to load into your brain and understand. The accepted answer broke it down a bit by first getting the name into a temporary variable. That's a sensible optimisation. I broke it out another step by getting the last char into a temp variable. In a production system I probably wouldn't go that far, but this is your learning phase - try to minimise the number of operations each of your lines does. It will aid your understanding of your own code a great deal
If you do ever get a penchant for writing as much code as possible in as few chars, look at some code golf games here on the Stack Exchange network; the game is to abuse as many language features as possible to make really short, trick code. Pretty much every winner stands as a testament to condensed code that should never, ever be put into a production system maintained by normal coders who value their sanity.
*the bug is it doesn't get the first name out of the Name object
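For completeness, here is a sketch of the stream version with the footnoted bug fixed, so that it reads the first name out of each Name object before looking at the last character. The nested Name class is a hypothetical stand-in for the asker's class, and the empty-name guard is an addition that the question does not mention.

```java
import java.util.List;

final class FirstNameCounter {
    // Hypothetical stand-in for the Name class used in the question.
    static final class Name {
        private final String firstName;

        Name(String firstName) {
            this.firstName = firstName;
        }

        String getFirstName() {
            return firstName;
        }
    }

    // Counts the entries whose first name ends with the given character.
    static int countFirstNamesEndingWith(List<Name> names, char lookFor) {
        return (int) names.stream()
                .map(Name::getFirstName)              // the step the one-liner left out
                .filter(fn -> !fn.isEmpty())          // charAt would throw on an empty name
                .filter(fn -> fn.charAt(fn.length() - 1) == lookFor)
                .count();
    }
}
```

Splitting the pipeline into one small step per line follows the same advice as the loop version: each line does one thing.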

Can Java skip .toUpperCase() on literal string constants already in upper case?

I have a .toUpperCase() happening in a tight loop and have profiled and shown it is impacting application performance. Annoying thing is it's being called on strings already in capital letters. I'm considering just dropping the call to .toUpperCase() but this makes my code less safe for future use.
This level of Java performance optimization is past my experience thus far. Is there any way to do a pre-compilation, set an annotation, etc. to skip the call to toUpperCase on already upper case strings?
What you need to do if you can is call .toUpperCase() on the string once, and store it so that when you go through the loop you won't have to do it each time.
I don't believe there is a pre-compilation option: you can't know in advance what data the code will be handling. If anyone can correct me on this, it'd be pretty awesome.
If you post some example code, I'd be able to help a lot more - it really depends on what kind of access you have to the data before you get to the loop. If your loop is actually doing the data access (e.g., reading from a file) and you don't have control over where those files come from, your hands are a lot more tied than if the data is hardcoded.
In many cases there's an easy answer, but in some, there's not much you can do.
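A minimal sketch of the "convert once, outside the loop" idea follows. It assumes the loop compares many values against a single key; the class and method names are invented for the example, and the asker's real loop may look quite different.

```java
import java.util.List;
import java.util.Locale;

final class HoistedUpperCase {
    // Counts how many values equal the upper-cased key.
    static int countMatches(List<String> values, String key) {
        // Converted a single time, before the loop starts.
        String upperKey = key.toUpperCase(Locale.ROOT);
        int matches = 0;
        for (String value : values) {
            if (value.equals(upperKey)) {   // no per-iteration toUpperCase() call
                matches++;
            }
        }
        return matches;
    }
}
```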
You can try equalsIgnoreCase, too. It doesn't make a new string.
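If the upper-casing is only there so that two strings can be compared, a sketch of the equalsIgnoreCase route might look like this. The isCommand helper and its parameters are hypothetical.

```java
final class CaseInsensitiveCompare {
    // True when input matches command regardless of letter case.
    // Calling the method on command means a null input yields false
    // instead of a NullPointerException.
    static boolean isCommand(String input, String command) {
        return command.equalsIgnoreCase(input);
    }
}
```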
No you cannot do this using an annotation or pre-compilation because your input is given during the runtime and the annotation and pre-compilation are compile time constructions.
If you would have known the input in advance then you could simply convert it to uppercase before running the application, i.e. before you compile your application.
Note that there are many ways to optimize string handling but without more information we cannot give you any tailor made solution.
You can write a simple function isUpperCase(String) and call it before calling toUpperCase():
if (!isUpperCase(s)) {
    s = s.toUpperCase();
}
It might not be significantly faster, but at least this way less garbage will be created. If a majority of the strings in your input are already upper case, this is a very valid optimization.
The isUpperCase function will look roughly like this:
boolean isUpperCase(String s) {
    for (int i = 0; i < s.length(); i++) {
        // any lower-case character means the string still needs converting
        if (Character.isLowerCase(s.charAt(i))) {
            return false;
        }
    }
    return true;
}
You need an if statement that filters those letters out. The idea is good; it just needs a condition. Work with the ASCII codes: cast each character with (int), check whether it falls in the uppercase range (65 to 90 for 'A' to 'Z'), and if it does, skip the conversion for that charAt(i).
Sorry, it's a rough explanation.

using java to parse a csv then save in 2D array

Okay, so I am working on a game based on a trading card game in Java. I scraped all of the game pieces' "information" into a CSV file where each row is a game piece and each column is a type of attribute for that piece. I have spent hours upon hours writing code with BufferedReader etc., trying to extract the information from my CSV file into a 2D array, but to no avail. My CSV file is linked here: http://dl.dropbox.com/u/3625527/MonstersFinal.csv I have one year of computer science under my belt but I still cannot figure out how to do this.
So my main question is: how do I place this into a 2D array so that I can keep the rows and columns?
Well, as mentioned before, some of your strings contain commas, so initially you're starting from a bad place, but I do have a solution and it's this:
--------- If possible, rescrape the site, but perform a simple encoding operation when you do. You'll want to do something like what you'll notice tends to be done in autogenerated XML files which contain HTML; reserve a 'control character' (a printable character works best, here, for reasons of debugging and... well... sanity) that, once encoded, is never meant to be read directly as an instance of itself. Ampersand is what I like to use because it's uncommon enough but still printable, but really what character you want to use is up to you. What I would do is write the program so that, at every instance of ",", that comma would be replaced by "&c" before being written to the CSV, and at every instance of an actual ampersand on the site, that "&" would be replaced by "&a". That way, you would never have the issue of accidentally separating a single value into two in the CSV, and you could simply decode each value after you've separated them by the method I'm about to outline in...
-------- Assuming you know how many columns will be in each row, you can use the StringTokenizer class (look it up; it's awesome and built into Java. A good place to look for information is, as always, the Java Tutorials) to hand you the values you need one token at a time, ready to be copied into an array.
It works by your passing in a string and a delimiter (in this case, the delimiter would be ','), and it spitting out all the substrings which were separated by those commas. If you know how many pieces there are in total from the get-go, you can instantiate a 2D array at the beginning and just plug in each row as the StringTokenizer gives it to you. If you don't, it's still okay, because you can use an ArrayList. An ArrayList is nice because it's a higher-level abstraction of an array that automatically asks for more memory, so you can keep adding to it and know that retrieval by index will always take constant time. However, if you plan on dynamically adding and removing pieces more often than retrieving them, you might want to use a LinkedList instead: retrieval by index takes linear time, but adding or removing at a position you have already reached is much cheaper than in an ArrayList. Or, if you're awesome, you could use a SkipList instead. I don't know if they're implemented by default in Java, but they're awesome. Fair warning, though; the speed on retrieval, removal, and placement comes at the cost of increased memory overhead. Skip lists maintain a lot of pointers.
If you know there should be the same number of values in each row, and you want them to be positionally organized, but for whatever reason your scraper doesn't handle the lack of a value for a row, and just doesn't put that value, you've some bad news... it would be easier to rewrite the part of the scraper code that deals with the lack of values than it would be to write a method that interprets varying length arrays and instantiates a Piece object for each array. My suggestion for this would again be to use the control character and fill empty columns with &n (for 'null') to be interpreted later, but then specifics are of course what will individuate your code and coding style so it's not for me to say.
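The "&c" / "&a" encoding described above could be sketched roughly as follows. The FieldCodec class and its method names are assumptions made for the example; the "&n" null marker is left out to keep it short.

```java
final class FieldCodec {
    // Escapes a single value before it is written to the CSV.
    // Ampersands are escaped first so that the "&" introduced by "&c"
    // is not escaped a second time.
    static String encode(String raw) {
        return raw.replace("&", "&a").replace(",", "&c");
    }

    // Reverses encode(). Here "&c" is handled first: every "&" in an encoded
    // value starts an escape, so turning "&a" back into "&" too early could
    // create a false "&c".
    static String decode(String encoded) {
        return encoded.replace("&c", ",").replace("&a", "&");
    }
}
```

With values encoded this way, a row can be split on every comma and each piece decoded afterwards, because a comma inside a value never reaches the file as a literal ','.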
edit: I think the main thing you should focus on is learning the different standard library datatypes available in Java, and maybe learn to implement some of them yourself for practice. I remember implementing a binary search tree; not an AVL tree, but alright. It's fun enough, good coding practice, and, more importantly, necessary if you want to be able to do things quickly and efficiently. I don't know exactly how Java implements arrays, because the definition is "a contiguous section of memory", yet you can allocate memory for them in Java at runtime using variables... but regardless of the specific Java implementation, arrays often aren't the best solution. Also, knowing regular expressions makes everything much easier. For practice, I'd recommend working them into your Java programs, or, if you don't want to have to compile and jar things every time, your bash scripts (if you're using *nix) and/or batch scripts (if you're using Windows).
I think the way you've scraped the data makes this problem more difficult than it needs to be. Your scrape seems inconsistent and difficult to work with given that most values are surrounded by quotes inconsistently, some data already has commas in it, and not each card is on its own line.
Try re-scraping the data in a much more consistent format, such as:
R1C1|R1C2|R1C3|R1C4|R1C5|R1C6|R1C7|R1C8
R2C1|R2C2|R2C3|R2C4|R2C5|R2C6|R2C7|R3C8
R3C1|R3C2|R3C3|R3C4|R3C5|R3C6|R3C7|R3C8
R4C1|R4C2|R4C3|R4C4|R4C5|R4C6|R4C7|R4C8
A/D Changer|DREV-EN005|Effect Monster|Light|Warrior|100|100|You can remove from play this card in your Graveyard to select 1 monster on the field. Change its battle position.
Where each line is definitely its own card (As opposed to the example CSV you posted with new lines in odd places) and the delimiter is never used in a data field as something other than a delimiter.
Once you've gotten the input into a consistently readable state, it becomes very simple to parse through it:
BufferedReader br = new BufferedReader(new FileReader(new File("MonstersFinal.csv")));
String line = "";
ArrayList<String[]> cardList = new ArrayList<String[]>(); // Use an arraylist because we might not know how many cards we need to parse.
while((line = br.readLine()) != null) { // Read a single line from the file until there are no more lines to read
StringTokenizer st = new StringTokenizer(line, "|"); // "|" is the delimiter of our input file.
String[] card = new String[8]; // Each card has 8 fields, so we need room for the 8 tokens.
for(int i = 0; i < 8; i++) { // For each token in the line that we've read:
String value = st.nextToken(); // Read the token
card[i] = value; // Place the token into the ith "column"
}
cardList.add(card); // Add the card's info to the list of cards.
}
for(int i = 0; i < cardList.size(); i++) {
for(int x = 0; x < cardList.get(i).length; x++) {
System.out.printf("card[%d][%d]: ", i, x);
System.out.println(cardList.get(i)[x]);
}
}
Which would produce the following output for my given example input:
card[0][0]: R1C1
card[0][1]: R1C2
card[0][2]: R1C3
card[0][3]: R1C4
card[0][4]: R1C5
card[0][5]: R1C6
card[0][6]: R1C7
card[0][7]: R1C8
card[1][0]: R2C1
card[1][1]: R2C2
card[1][2]: R2C3
card[1][3]: R2C4
card[1][4]: R2C5
card[1][5]: R2C6
card[1][6]: R2C7
card[1][7]: R3C8
card[2][0]: R3C1
card[2][1]: R3C2
card[2][2]: R3C3
card[2][3]: R3C4
card[2][4]: R3C5
card[2][5]: R3C6
card[2][6]: R3C7
card[2][7]: R3C8
card[3][0]: R4C1
card[3][1]: R4C2
card[3][2]: R4C3
card[3][3]: R4C4
card[3][4]: R4C5
card[3][5]: R4C6
card[3][6]: R4C7
card[3][7]: R4C8
card[4][0]: A/D Changer
card[4][1]: DREV-EN005
card[4][2]: Effect Monster
card[4][3]: Light
card[4][4]: Warrior
card[4][5]: 100
card[4][6]: 100
card[4][7]: You can remove from play this card in your Graveyard to select 1 monster on the field. Change its battle position.
I hope re-scraping the information is an option here and I hope I haven't misunderstood anything; Good luck!
On a final note, don't forget to take advantage of OOP once you've gotten things worked out. a Card class could make working with the data even simpler.
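As one possible shape for such a class, here is a sketch that turns a single pipe-delimited line into a Card. The field names are guesses based on the sample "A/D Changer" row and are not taken from the real data set; attack and defense are kept as strings in case some rows hold non-numeric values.

```java
final class Card {
    // Field names are assumptions inferred from the sample row above.
    final String name;
    final String setCode;
    final String cardType;
    final String attribute;
    final String monsterType;
    final String attack;
    final String defense;
    final String text;

    private Card(String[] f) {
        name = f[0];
        setCode = f[1];
        cardType = f[2];
        attribute = f[3];
        monsterType = f[4];
        attack = f[5];
        defense = f[6];
        text = f[7];
    }

    // Parses one line of the pipe-delimited file into a Card.
    static Card parse(String line) {
        // The limit of -1 keeps empty trailing fields instead of dropping them.
        String[] fields = line.split("\\|", -1);
        if (fields.length != 8) {
            throw new IllegalArgumentException("Expected 8 fields but found " + fields.length);
        }
        return new Card(fields);
    }
}
```

A list of Card objects can then replace the list of String arrays, so the rest of the game reads card.name rather than card[0].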
I'm working on a similar problem for use in machine learning, so let me share what I've been able to do on the topic.
1) If you know before you start parsing the row - whether it's hard-coded into your program or whether you've got some header in your file that gives you this information (highly recommended) - how many attributes per row there will be, you can reasonably split it by comma, for example the first attribute will be RowString.substring(0, RowString.indexOf(',')), the second attribute will be the substring from the first comma to the next comma (writing a function to find the nth instance of a comma, or simply chopping off bits of the string as you go through it, should be fairly trivial), and the last attribute will be RowString.substring(RowString.lastIndexOf(',') + 1) (the + 1 skips the comma itself). The String class's methods are your friends here.
2) If you are having trouble distinguishing between commas which are meant to separate values, and commas which are part of a string-formatted attribute, then (if the file is small enough to reformat by hand) do what Java does - represent characters with special meaning that are inside of strings with '\,' rather than just ','. That way you can search for the index of ',' and not '\,' so that you will have some way of distinguishing your characters.
3) As an alternative to 2), CSVs (in my opinion) aren't great for strings, which often include commas. There is no real common format to CSVs, so why not make them colon-separated-values, or dash-separated-values, or even triple-ampersand-separated-values? The point of separating values with commas is to make it easy to tell them apart, and if commas don't do the job there's no reason to keep them. Again, this applies only if your file is small enough to edit by hand.
4) Looking at your file for more than just the format, it becomes apparent that you can't do it by hand. Additionally, it would appear that some strings are surrounded by triple double quotes ("""string""") and some are surrounded by single double quotes ("string"). If I had to guess, I would say that anything enclosed in quotes is a single attribute - there are, for example, no pairs of quotes that start in one attribute and end in another. So I would say that you could:
Make a class with a method to break a string into each comma-separated fields.
Write that method such that it ignores commas preceded by an odd number of double quotes (this way, if the quote-pair hasn't been closed, it knows that it's inside a string and that the comma is not a value separator). This strategy, however, fails if the creator of your file did something like enclose some strings in double double quotes (""string""), so you may need a more comprehensive approach.
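The two steps above can be sketched as a small splitter that flips an "inside quotes" flag on every double quote and only treats a comma as a separator while the flag is off. The class and method names are hypothetical, and dropping the quote characters from the output is a choice made for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class QuoteAwareSplitter {
    // Splits one row on commas that are not inside a quoted string.
    static List<String> splitLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean insideQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char ch = line.charAt(i);
            if (ch == '"') {
                // An odd number of quotes seen so far means we are inside a string.
                insideQuotes = !insideQuotes;
            } else if (ch == ',' && !insideQuotes) {
                fields.add(current.toString());   // separator: close the current field
                current.setLength(0);
            } else {
                current.append(ch);               // ordinary character, or a comma inside quotes
            }
        }
        fields.add(current.toString());           // the last field has no trailing comma
        return fields;
    }
}
```

As the answer warns, a value wrapped in an even number of quotes on each side (""string"") leaves the flag off, so commas inside it would still be treated as separators.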

Printing an ArrayList of Strings to a PrintWriter with word wrap

Some classmates and I are working on a homework assignment for Java that requires we print an ArrayList of Strings to a PrintWriter using word wrap, so that none of the output passes 80 characters. We've extensively Googled this and can't find any Java API based way to do this.
I know it's generally "wrong" to ask a homework question on SO, but we're just looking for recommendations of the best way to do this, or if we missed something in the API. This isn't the major part of the homework, just a small output requirement.
Ideally, I'd like to be able to wordwrap the ArrayList's toString since it's nicely formatted already.
Well, this is a first for me, it's the first time one of my students has posted a question about one of the projects I've assigned them. The way it was phrased, that he was looking for an algorithm, and the answers you've all shared are just fine with me. However, this is a typical case of trying to make things too complicated. A part of the spec that was not mentioned was that the 80 characters limit was not a hard limit. I said that each line of the output file had to be roughly 80 characters long. It was OK to go over 80 a little. In my version of the solution, I just had a running count and did a modulus of the count to add the line end. I varied the value of the modulus until the output file looked right. This resulted in lines with small numbers being really short so I used a different modulus when the numbers were small. This wasn't a big part of the project and it's interesting that this got so much attention.
Our solution was to create a temporary string and append elements one by one, followed by a comma. Before adding an element, check if adding it will make the string longer than 80 characters and choose whether to print it and reset or just append.
This still has the issue with the extra trailing comma, but that's been dealt with so many times we'll be fine. I was looking to avoid this because it was originally more complicated in my head than it really is.
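A sketch of that approach is shown below: build up one line at a time and print it whenever the next element would not fit. The class name, method name and the maxWidth parameter are assumptions; the comma is attached to every element except the last, which sidesteps the trailing comma.

```java
import java.io.PrintWriter;
import java.util.List;

final class WrappedListPrinter {
    // Prints the items separated by ", " and starts a new line whenever the
    // next item would push the current line past maxWidth characters.
    static void printWrapped(List<String> items, PrintWriter out, int maxWidth) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < items.size(); i++) {
            boolean last = (i == items.size() - 1);
            String piece = last ? items.get(i) : items.get(i) + ",";
            // The + 1 accounts for the space that separates pieces on one line.
            if (line.length() > 0 && line.length() + 1 + piece.length() > maxWidth) {
                out.println(line.toString());
                line.setLength(0);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(piece);
        }
        if (line.length() > 0) {
            out.println(line.toString());   // whatever is left on the final line
        }
        out.flush();
    }
}
```

A single item longer than maxWidth still ends up on a line of its own and exceeds the limit, which fits the "roughly 80 characters" reading of the assignment.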
I think a better solution is to create your own WrapTextWriter that wraps any other writer and overrides the method public void write(String str, int off, int len) throws IOException. That method should run in a loop and perform the wrapping logic.
This logic is not as simple as str.substring(80). If you are dealing with real text and wish to wrap it correctly (i.e. do not cut words, do not move commas or full stops to the next line, etc.) you have to implement some logic. It is probably not too complicated, but it is probably language dependent. For example, in English there is no space between a word and a colon, while in French there is.
A five-second search turned up the following discussion, which may help you.
private static final int MAX_CHARACTERS = 80;

public static void main(String[] args)
{
    List<String> strings = new ArrayList<String>(); // fill with the values to print
    int size = 0; // number of characters already written on the current line
    PrintWriter writer = new PrintWriter(System.out, true); // Just as example
    for (String str : strings)
    {
        if (size > 0 && size + str.length() > MAX_CHARACTERS)
        {
            writer.println(); // break the line before the string that would not fit
            size = 0;
        }
        writer.print(str);
        size += str.length(); // count the string that was just written
    }
    writer.flush(); // print() alone does not trigger the auto-flush
}
You can simply write a function, like "void printWordWrap(List<String> strings)", with that algorithm inside. I think it's a good way to solve your problem. :)

Effective way to handle singular/plural word based on some collection size [closed]

There are many instances in my work projects where I need to display the size of some collection in a sentence. For example, if the collection's size is 5, it will say "5 users". If it is size of 1 or 0, it will say "1 user" or "0 user". Right now, I'm doing it with if-else statements to determine whether to print the "s" or not, which is tedious.
I'm wondering if there's an open source JSP custom tag library that allows me to accomplish this. I know I can write one myself... basically, it will have 2 parameters like this: <lib:display word="user" collection="userList" />. Depending on the collection size, it will determine whether to append an "s" or not. But then, this implementation is not going to be too robust because I also need to handle "ies" and some words don't use any of those. So, instead of creating a half-baked tool, I'm hoping there's a more robust library I could utilize right away. I'm not too worried about prefixing the word with is/are in this case.
I use Java, by the way.
Thanks much.
Take a look at inflector, a java project which lets you do Noun.pluralOf("user"), or Noun.pluralOf("user", userList.size()), and which handles a bunch of variations and unusual cases (person->people, loaf->loaves, etc.), as well as letting you define custom mapping rules when necessary.
Hmm, I don't quite see why you need a library for this. I would think the function to do it is trivial:
public String singlePlural(int count, String singular, String plural)
{
return count==1 ? singular : plural;
}
Calls would look like:
singlePlural(count, "user", "users");
singlePlural(count, "baby", "babies");
singlePlural(count, "person", "people");
singlePlural(count, "cherub", "cherubim");
... etc ...
Maybe this library does a whole bunch of other things that make it useful. I suppose you could say that it supplies a dictionary of what all the plural forms are, but in any given program you don't care about the plurals of all the words in the language, just the ones you are using in this program. I guess if the word that could be singular or plural is not known at compile time, if it's something entered by the user, then I'd want a third party dictionary rather than trying to build one myself.
Edit
Suddenly it occurs to me that what you were looking for was a function for making plurals generically, embodying a set of rules like "normally just add 's', but if the word ends in 'y' change the 'y' to 'ies', if it ends in 's' change it to 'ses', ..." etc. I think in English that would be impossible for any practical purpose: there are too many special cases, like "person/people" and "child/children" etc. I think the best you could do would be to have a generic "add an 's'" rule, maybe a few other common cases, and then a long list of exceptions. Perhaps in other languages one could come up with a fairly simple rule.
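To make the "generic rule plus a list of exceptions" idea concrete, here is a deliberately small sketch. The class name, the handful of suffix rules and the two-entry exception table are illustrative assumptions and come nowhere near covering English.

```java
import java.util.HashMap;
import java.util.Map;

final class SimplePluralizer {
    // Irregular forms are looked up first; a real table would be far longer.
    private static final Map<String, String> EXCEPTIONS = new HashMap<>();
    static {
        EXCEPTIONS.put("person", "people");
        EXCEPTIONS.put("child", "children");
    }

    private static boolean isVowel(char c) {
        return "aeiou".indexOf(c) >= 0;
    }

    // Returns a plural form using the exception table and a few suffix rules.
    static String pluralOf(String word) {
        String irregular = EXCEPTIONS.get(word);
        if (irregular != null) {
            return irregular;
        }
        if (word.endsWith("s") || word.endsWith("x") || word.endsWith("ch") || word.endsWith("sh")) {
            return word + "es";                                    // bus -> buses
        }
        int n = word.length();
        if (n > 1 && word.endsWith("y") && !isVowel(word.charAt(n - 2))) {
            return word.substring(0, n - 1) + "ies";               // baby -> babies
        }
        return word + "s";                                         // the generic rule
    }

    // Formats a count with the matching singular or plural noun.
    static String describe(int count, String singular) {
        return count + " " + (count == 1 ? singular : pluralOf(singular));
    }
}
```

Every word that the rules get wrong has to be added to the exception table by hand, which is the maintenance cost the answer is pointing at.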
So as I say, if the word is not known at compile time but comes from some user input, then yes, a third-party dictionary is highly desirable.
This gets complicated in languages other than English, which inflector aims to support in the future.
I am familiar with Czech where user = uživatel and:
1 uživatel
2 uživatelé
3 uživatelé
4 uživatelé
5 uživatelů
...
You can see why programs written with hardcoded singular+plural would get un-i18n-able.
Edit:
Java (since JDK 1.1) allows you to use the following:
ChoiceFormat fmt = new ChoiceFormat("1#uživatel|1<uživatelé|4<uživatelů");
System.out.println(fmt.format(1));
System.out.println(fmt.format(4));
System.out.println(fmt.format(5));
ChoiceFormat documentation
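The same class can cover the English case from the question. The sketch below is only one way to wire it up; the UserCount class name, the describe method and the pattern are assumptions made for the example.

```java
import java.text.ChoiceFormat;

final class UserCount {
    // 0 and anything above 1 use "users"; exactly 1 uses "user".
    private static final ChoiceFormat NOUN = new ChoiceFormat("0#users|1#user|1<users");

    // Formats a count followed by the matching noun form.
    static String describe(int count) {
        return count + " " + NOUN.format(count);
    }

    public static void main(String[] args) {
        System.out.println(describe(0));
        System.out.println(describe(1));
        System.out.println(describe(5));
    }
}
```

Keeping the whole pattern in one string means a translator can supply a different pattern per language without touching the code.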
This functionality is built into Ruby on Rails. I don't know exactly where, but it should be easy enough to find in the source code, and then you could simply crib the code.
EDIT: Found you some code:
inflector.rb (very helpful comments!)
inflections.rb (extensive word list)
If I remember correctly, it's mainly a matter of appending an "s" to most words, though I believe there is a list (probably hash, err dictionary) of some common exceptions. Notable is the conversion from "person" to "people" :)
You would of course be in for a world of pain if you decided you want to internationalize this to other languages than English. Welcome to the world of highly irregular grammars, and good luck!
