In the last few weeks I've seen some people using really long names for a method or class (50 characters), usually on the premise that it improves readability. My opinion is that a name this long is an indicator that we are trying to do too much in the method or class. However, I wanted to know what you think about it.
An example is:
getNumberOfSkinCareEligibleItemsWithinTransaction
A name in Java, or any other language, is too long when a shorter name exists that equally conveys the behavior of the method.
Some techniques for reducing the length of method names:
If your whole program, or class, or module is about 'skin care items' you can drop skin care. For example, if your class is called SkinCareUtils,
that brings you to getNumberOfEligibleItemsWithinTransaction
You can change within to in, getNumberOfEligibleItemsInTransaction
You can change Transaction to Tx, which gets you to getNumberOfEligibleItemsInTx.
Or if the method accepts a param of type Transaction you can drop the InTx altogether: getNumberOfEligibleItems
You can replace numberOf with count: getEligibleItemsCount
Now that is very reasonable. And it is 60% shorter.
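As a rough sketch of where those steps can end up: the class name carries the "skin care" context and the parameter type carries the "in transaction" context, so neither needs to appear in the method name. The Item and Transaction types and the eligible flag below are hypothetical, invented only for illustration.

```java
import java.util.List;

// Hypothetical domain types, invented for this sketch.
final class Item {
    final String name;
    final boolean eligible;

    Item(String name, boolean eligible) {
        this.name = name;
        this.eligible = eligible;
    }
}

final class Transaction {
    final List<Item> items;

    Transaction(List<Item> items) {
        this.items = items;
    }
}

// "SkinCare" lives in the class name and "Transaction" in the parameter type,
// so the method name only has to say what is being counted.
final class SkinCareUtils {
    private SkinCareUtils() {}

    static long getEligibleItemsCount(Transaction transaction) {
        return transaction.items.stream()
                .filter(item -> item.eligible)
                .count();
    }
}
```

A call site then reads SkinCareUtils.getEligibleItemsCount(transaction), which still says everything the 49-character name did.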
Just for a change, a non-subjective answer: 65536 characters.
A.java:1: UTF8 representation for string "xxxxxxxxxxxxxxxxxxxx..." is too long
for the constant pool
;-)
I agree with everyone: method names should not be too long. I do want to add one exception though:
The names of JUnit test methods, however, can be long and should resemble sentences.
Why?
Because they are not called in other code.
Because they are used as test names.
Because they then can be written as sentences describing requirements. (For example, using AgileDox)
Example:
@Test
public void testDialogClosesDownWhenTheRedButtonIsPressedTwice() {
...
}
See "Behavior Driven Design" for more info on this idea.
Context "...WithinTransaction" should be obvious. That's what object-orientation is all about.
The method is part of a class. If the class doesn't mean "Transaction" -- and if it doesn't save you from having to say "WithinTransaction" all the time, then you've got problems.
Java has a culture of encouraging long names, perhaps because the IDEs come with good autocompletion.
This site says that the longest class name in the JRE is InternalFrameInternalFrameTitlePaneInternalFrameTitlePaneMaximizeButtonWindowNotFocusedState which is 92 chars long.
As for longest method name I have found this one supportsDataDefinitionAndDataManipulationTransactions, which is 52 characters.
Never use a long word when a diminutive one will do.
I don't think your thesis of "length of method name is proportional to length of method" really holds water.
Take the example you give: "getNumberOfSkinCareEligibleItemsWithinTransaction". That sounds to me like it does just one thing: it counts the number of items in a transaction that fall into a certain category. Of course I can't judge without seeing the actual code for the method, but that sounds like a good method to me.
On the other hand, I've seen lots of methods with very short and concise names that do way too much work, like "processSale" or the ever popular "doStuff".
I think it would be tough to give a hard-and-fast rule about method name length, but the goal should be: long enough to convey what the function does, short enough to be readable. In this example, I'd think "getSkinCareCount" would probably have been sufficient. The question is what you need to distinguish. If you have one function that counts skin-care-eligible items in transactions and another that counts skin-care-eligible items in something else, then "withinTransactions" adds value. But if it doesn't mean anything to talk about such items outside of a transaction, then there's no point cluttering up the name with such superfluous information.
Two, I think it's wildly unrealistic to suppose that a name of any manageable length will tell you exactly what the function does in all but the most trivial cases. A realistic goal is to make a name that gives a reader a clue, and that can be remembered later. Like, if I'm trying to find the code that calculates how much antimatter we need to consume to reach warp speed, if I look at function names and see "calibrateTransporter", "firePhasers", and "calcAntimatterBurn", it's pretty clear that the first two aren't it but the third one might be. If I check and find that that is indeed the one I'm looking for, it will be easy to remember that when I come back tomorrow to work on this problem some more. That's good enough.
Three, long names that are similar are more confusing than short names. If I have two functions called "calcSalesmanPay" and "calcGeekPay", I can make a good guess which is which at a quick glance. But if they are called "calculateMonthlyCheckAmountForSalesmanForExportToAccountingSystemAndReconciliation" and "calculateMonthlyCheckAmountForProgrammersForExportToAccountingSystemAndReconciliation", I have to study the names to see which is which. The extra information in the name is probably counter-productive in such cases. It turns a half-second think into a 30-second think.
I tend to use the haiku rule for names:
Seven syllable class names
five for variables
seven for method and other names
These are rules of thumb for max names. I violate this only when it improves readability. Something like recalculateMortgageInterest(currentRate, quoteSet...) is better than recalculateMortgageInterestRate or recalculateMortgageInterestRateFromSet since the fact that it involves rates and a set of quotes should be pretty clear from the embedded docs like javadoc or the .NET equivalent.
NOTE: Not a real haiku, as it is 7-5-7 rather than 5-7-5. But I still prefer calling it haiku.
Design your interface the way you want it to be, and make the implementation match.
For example, maybe I'd write that as
getTransaction().getItems(SKIN_CARE).getEligible().size()
or with Java 8 streams:
getTransaction().getItems().stream()
.filter(item -> item.getType() == SKIN_CARE)
.filter(item -> item.isEligible())
.count();
My rule is as follows: if a name is so long that it has to appear on a line of its own, then it is too long. (In practice, this means I'm rarely above 20 characters.)
This is based upon research showing that the number of visible vertical lines of code positively correlates with coding speed/effectiveness. If class/method names start significantly hurting that, they're too long.
Add a comment where the method/class is declared and let the IDE take you there if you want a long description of what it's for.
The length of the method itself is probably a better indicator of whether it's doing too much, and even that only gives you a rough idea. You should strive for conciseness, but descriptiveness is more important. If you can't convey the same meaning in a shorter name, then the name itself is probably okay.
The next time you write a method name, just think of the quote below:
"The man who is going to maintain your code is a psycho who knows where you stay"
That method name is definitely too long. My mind tends to wander when I am reading such sized method names. It's like reading a sentence without spaces.
Personally, I prefer as few words in methods as possible. You are helped if the package and class name can convey meaning. If the responsibility of the class is very concise, there is no need for a giant method name. I'm curious why "WithinTransaction" is on there.
"getNumberOfSkinCareEligibleItemsWithinTransaction" could become:
com.mycompany.app.product.SkinCareQuery.getNumEligibleItems();
Then when in use, the method could look like "query.getNumEligibleItems()"
A variable name is too long when a shorter name will allow for better code readability over the entire program, or the important parts of the program.
A longer name allows you to convey more information about a value. However, if a name is too long, it will clutter the code and reduce the ability to comprehend the rest of the code. This typically happens by causing line wraps and pushing other lines of code off the page.
The trick is determining which will offer better readability. If the variable is used often or several times in a short amount of space, it may be better to give it a short name and use a comment to clarify. The reader can refer back to the comment easily. If the variable is used often throughout the program, often as a parameter or in other complicated operations, it may be best to trim down the name, or use acronyms as a reminder to the reader. They can always reference a comment by the variable declaration if they forget the meaning.
This is not an easy trade off to make, since you have to consider what the code reader is likely to be trying to comprehend, and also take into account how the code will change and grow over time. That's why naming things is hard.
Readability is why it's acceptable to use i as a loop counter instead of DescriptiveLoopCounterName. Because this is the most common use for a variable, you can spend the least amount of screen space explaining why it exists. The longer name is just going to waste time by making it harder to understand how you are testing the loop condition or indexing into an array.
On the other end of the spectrum, if a function or variable is used rarely as in a complex operation, such as being passed to a multi-parameter function call, you can afford to give it an overly descriptive name.
As with any other language: when it no longer describes the single action the function performs.
I'd say use a combination of the good answers and be reasonable.
Completely, clearly and readably describe what the method does.
If the method name seems too long--refactor the method to do less.
It's too long when the name of the method wraps onto another line and the call to the method is the only thing on the line and starts pretty close to the margin. You have to take into account the average size of the screen of the people who will be using it.
But! If the name seems too long then it probably is too long. The way to get around it is to write your code in such a way that you are within a context and the name is short but duplicated in other contexts. This is like when you can say "she" or "he" in English instead of someone's full name.
It's too long when it too verbosely explains what the thing is about.
For example, these names are functionally equivalent.
in Java: java.sql.SQLIntegrityConstraintViolationException
in Python/Django: django.db.IntegrityError
Ask yourself, in a SQL/db package, how many more types of integrity errors can you come up with? ;)
Hence db.IntegrityError is sufficient.
An identifier name is too long when it exceeds the length your Java compiler can handle.
There are two points of view here. One is that it really doesn't matter how long the method name is, as long as it describes what the method is doing as fully as possible (a basic rule of Java best practices). On the other hand, I agree with the flybywire post. We should use our intelligence to shorten the method name as much as possible, but without reducing its descriptiveness. Descriptiveness is more important :)
A name is too long if:
it takes more than 1 second to read
it takes up more RAM than you allocate for your JVM
it is absurdly named
a shorter name makes perfect sense
it wraps around in your IDE
Honestly, the name only needs to convey its purpose to the developers who will use it as a public API method or have to maintain the code when you leave. Just remember KISS (keep it simple, stupid).
I know this is probably a question of personal opinion, but I want to know what's standard practice and what would be frowned upon.
One of my profs in university always seems to make his variable and method names as short as possible (getAmt() instead of getAmount(), for instance).
I have no objection to this, but personally, I prefer to have mine a little longer if it adds descriptiveness so the person reading it won't have to check or refer to documentation.
For instance, we made a method that, given a list of players, returns the player who scored the most goals. I named the method getPlayerWithMostGoals(). Is this wrong? I toiled over choosing a way to make it shorter for a while, but then I thought "why?". It gets the point across clearly, and Eclipse makes it easy to autocomplete when I type.
I'm just wondering if the short variable names are a piece of the past due to needing everything to be as small as possible to be efficient. Is this still a requirement?
Nothing inherently wrong; it's better to make it descriptive than cryptic. However, it's often a code smell for a method that is trying to do too much or could be refactored.
Bad: getActInfPstWeek
OK: getAccountInformationForPastWeek()
Better: getAccountInformation(DateRange range)
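To make the "Better" variant concrete, here is a minimal sketch. DateRange is not a JDK class, so it is defined here as a hypothetical value type, and Account with its list of activity dates is likewise invented for illustration. The "past week" part of the long name moves out of the method name and into the argument.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical value type: an inclusive range of dates.
final class DateRange {
    final LocalDate start;
    final LocalDate end;

    DateRange(LocalDate start, LocalDate end) {
        if (end.isBefore(start)) {
            throw new IllegalArgumentException("end is before start");
        }
        this.start = start;
        this.end = end;
    }

    // Covers the "ForPastWeek" case that used to be baked into the method name.
    static DateRange pastWeek(LocalDate today) {
        return new DateRange(today.minusDays(7), today);
    }

    boolean contains(LocalDate date) {
        return !date.isBefore(start) && !date.isAfter(end);
    }
}

// Hypothetical account that only tracks the dates on which activity occurred.
final class Account {
    private final List<LocalDate> activityDates;

    Account(List<LocalDate> activityDates) {
        this.activityDates = activityDates;
    }

    // One general method instead of one method per time window.
    List<LocalDate> getAccountInformation(DateRange range) {
        List<LocalDate> result = new ArrayList<>();
        for (LocalDate date : activityDates) {
            if (range.contains(date)) {
                result.add(date);
            }
        }
        return result;
    }
}
```

The caller writes account.getAccountInformation(DateRange.pastWeek(today)), and a "past month" query needs a new argument rather than a new method.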
I prefer to have long variable/method names that describe what's going on. In your case, I think getPlayerWithMostGoals() is appropriate. It bothers me when I see a short variable name like "amt" and I have to transpose that in my head (into "amount").
Something like getAmt() looks like C++ code style. In Java, more descriptive names are usually used.
Your professor's method name is understandable, but only because "amount" is such a common word; that is not the general case. Stick with your "longWordStyle"; it is more idiomatic Java.
As per the standards, longer descriptive names are advised because they make code more readable and maintainable in the long term. If you use very short names, e.g. a variable called a, you will yourself forget what that variable is for after some time. This becomes more problematic in bigger programs. I don't see an issue in using getAmt() in place of getAmount(), but getPlayerWithMostGoals() is definitely preferable to something like getPlayer().
Long names, short names, it all depends. There are a lot of approaches and discussions but in fact a method's name should reflect its intention. This helps you to further understand the code. Take this example.
public void print(String s)
Nifty name, short, concise... isn't it? Well, actually no, if there's no documentation to tell you what is meant by "printing". I'd say System.out.println is a way of printing a string, but you can also define printing as saving the string in a file or showing it in a dialog.
public void printInConsole(String s)
Now there are no misunderstandings. Most people will tell you that you can read the method's JavaDoc to understand it but... are you going to read a full paragraph to decide if the method you're going to use does what you need?
IMO, methods should describe at least an action and an entity (if they're related to one). "Long" is also a perception... but really long names make the code hard to structure. It's a matter of getting the proper balance.
As a rule of thumb, I'd avoid abbreviations and use JavaDoc to further describe a method's intention. Descriptive names can be long, but the reward is both readability and self-explanatory code.
Could you give some good reasons for having the class name as part of the name of any variable? We use to have this policy, which I find quite useful. Some team member wants to revert the decision.
My arguments for the moment:
you can directly know what you're talking about:
for (Student student: students) {
...
}
is quite easy to understand (vs Student s or Student anyone)
it helps self-commenting the code
our ide provides direct support for that
you can directly see whether you're using apples instead of pears (or bears ;-) )
Less confusion where subtle differences matter:
criteriaBuilder.equal(nameExpression, name);
The only argument I can see against this is that it makes the code longer (which I think isn't an issue with modern IDEs).
Is there public provisioning for such a recommendation? Anyone using the same rule? Any alternative?
That sounds like Hungarian Notation to me.
In principle it sounds like a good idea but I'm honestly not sure there are good reasons for it:
Self commenting / documenting code - this should be possible without putting types in the variable names;
An IDE should also provide support for seeing what type a variable is without putting it in the variable name (e.g. Eclipse can do this)
I don't know that this is really an advantage.
One problem with Hungarian Notation that you don't mention is that if you refactor code, you have to change all the variable names as well. There are plenty of examples on The Daily WTF where variables are named 'strSOMETHING' or 'intSOMETHING', even though the types are defined as something else.
In general, IMO the case for using Hungarian Notation is pretty flimsy and generally I wouldn't recommend making it a policy.
(If this isn't exactly what you are talking about, I apologise!)
Your bible on this question is Steve McConnell's book, Code Complete, which is the most comprehensive book on software construction practices like this. He has a whole chapter on variable naming and why it is important.
The key is to make the name a full description of what the variable does, so that it is easy to understand for the person reading it. If it achieves that, then it's good practice.
Student student looks like a simple-to-understand policy, but it has an immediate disadvantage: it contains no extra information about the variable. You already know it's a student. If you know anything else about the object then add it to the variable name: studentUnderReview, graduatingStudent, etc. "student" should only be used if you know absolutely nothing else, such as when the variable is used to iterate over all Students. Now in a long method it's useful to know the type by just looking at the name, but if the variable has short scope then it's marginal whether it's useful or not. There are some studies (see McConnell) which indicate that for variables with very short scope, such as for loop indices, short names are better.
As soon as you have two variables, this system breaks down. If the default is to call one variable "student" then the temptation is to call two variables "student1" and "student2", which is bad practice indeed (see McConnell for details). You need to make names that describe the object: goodStudent and badStudent; studentBeingSaved and studentBeingRead.
The policy should be to use descriptive variable names. One-letter variable names are bad, but so are variable names based exclusively on class names. Your main argument is really for descriptive variable names.
As for the others:
it helps self-commenting the code - no, it duplicates information from the variable declaration
our ide provides direct support for that - that would only be an argument if the alternatives provide no benefits
you can directly see whether you're using apples instead of pears (or bears ;-) ) - that's the job of the type system
Of course, if your class names are descriptive, then sometimes it will make sense to have variables with the same name - when the variable describes an instance of the class without any distinctive characteristics. As in your example:
for (Student student: students) { ... }
If you're looping over all students, this is fine. But if you have a non-generic instance of Student, the variable name should describe what particular role that student has in this part of the program (e.g. candidate or graduate).
Generally your variable names should help the developer see quickly what they actually represent.
Student student would be OK if the field it defines expresses an anything-to-student relation; likewise, Student[] students (or better, some collection of Student) would be OK in a class Professor or the like.
String string is generally a bad idea, since it doesn't say anything about the use of that variable. Better names would be String name, String description or similar. In some cases, where all that matters is that you're dealing with one string - like general string utilities - you might call the variable string but if you have two or more, you should use better names (e.g. source and target etc. depending on the class/method).
IMHO, adding prefixes/suffixes might be a good idea if they tell you something about the variable that its base name wouldn't, e.g. in a web environment you might deal with strings that are input by the user as well as escaped strings (e.g. to prevent code injection), so you might use a prefix/suffix to make a distinction between the user input version and the escaped counterpart.
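A small sketch of that last point, where the raw/escaped prefixes carry information the type String cannot. CommentRenderer and escapeHtml are hypothetical names, and the escaping here is deliberately simplified for illustration; it is not meant as a complete defence against injection.

```java
// Hypothetical helper; the prefixes "raw" and "escaped" mark which strings
// are safe to place into markup, something the String type alone cannot say.
final class CommentRenderer {
    private CommentRenderer() {}

    // Simplified HTML escaping for illustration only.
    // '&' must be replaced first so the entities added later are not re-escaped.
    static String escapeHtml(String rawText) {
        return rawText
                .replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    static String renderComment(String rawComment) {
        String escapedComment = escapeHtml(rawComment);
        // Only the escaped variable is ever concatenated into markup.
        return "<p>" + escapedComment + "</p>";
    }
}
```

With this convention, a line that concatenates rawComment into markup looks wrong at a glance, which is the benefit the prefix is buying.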
Sometime ago, I remember being told not to use numbers in Java method names. Recently, I had a colleague ask me why and, for the life of me, I could not remember.
According to Sun (and now Oracle) the general naming convention for method names is:
Methods should be verbs, in mixed case
with the first letter lowercase, with
the first letter of each internal word
capitalized.
Code Conventions of Java
This doesn't specifically say that numbers can't be used, although by omission you can see that it's not advised.
Consider the situation (that my colleague has) where you want to perform some logic based on a specific year, for instance, a new policy that takes effect in 2011, and so your application must act on the information and process it based on its year. Common sense could tell you that you could call the method:
boolean isSessionPost2011(int id) {}
Is it acceptable to use numbers in method names (despite the wording of the standard)? If not, why?
Edit: "This doesn't specifically say that numbers can't be used, although by omission you can see that it's not advised." Perhaps I worded this incorrectly. The standard says 'Methods should be verbs'. I read this to say that considering a number is not a verb, then method names should not use numbers.
The standard Java class library is full of classes and methods with numbers in it, like Graphics2D.
The method seems ... overly specific.
Couldn't you instead use:
boolean isSessionAfter(int id, Date date)
?
That way the next time you have a policy applied to anything after a particular date, you don't need to copy-paste the old method and change the number - you just call it with a different date.
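A minimal sketch of that generalisation follows. It swaps java.util.Date for java.time.LocalDate, which is an assumption on my part rather than what the answer wrote, and SessionRegistry with its in-memory map is a hypothetical stand-in for however sessions are really looked up by id.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory lookup of session dates by id.
final class SessionRegistry {
    private final Map<Integer, LocalDate> sessionDates = new HashMap<>();

    void register(int id, LocalDate sessionDate) {
        sessionDates.put(id, sessionDate);
    }

    // The cutoff is an argument, so no year has to appear in the method name.
    // Unknown ids are treated as "not after" in this sketch.
    boolean isSessionAfter(int id, LocalDate cutoff) {
        LocalDate sessionDate = sessionDates.get(id);
        return sessionDate != null && sessionDate.isAfter(cutoff);
    }
}
```

Under one reading of "post 2011", the original check becomes registry.isSessionAfter(id, LocalDate.of(2011, 12, 31)), and a later policy only needs a different date.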
Sure, it's acceptable to use numbers in method names. But as per your example, that's why it's generally frowned upon. Let's say that there is now a new policy in place for the year 2012. Now, there's a new policy in place for 2014. And maybe 2020! So, you have four methods that are roughly equivalent.
What you want isn't a boolean but rather a strategy to do something, or do nothing, based on whether or not a policy was found. Hence, a method void processPolicy(Structure yourStructure); would be a better approach - now you can shield that you're doing a lookup based on the year, and don't have to have separate methods per year, or even limit it to just one policy per year (maybe a policy takes place in two different years, for example, or just three months).
The Java Language Specification seems fairly specific on this topic:
3.8 Identifiers
An identifier is an unlimited-length sequence of Java letters and Java digits, the first of which must be a Java letter.
...
The Java letters include uppercase and lowercase ASCII Latin letters A-Z (\u0041-\u005a), and a-z (\u0061-\u007a), and, for historical reasons, the ASCII underscore (_, or \u005f) and dollar sign ($, or \u0024). The $ character should be used only in mechanically generated source code or, rarely, to access preexisting names on legacy systems.
The "Java digits" include the ASCII digits 0-9 (\u0030-\u0039).
This doesn't specifically say that numbers can't be used, although by omission you can see that it's not advised.
I certainly wouldn't read the Java Style Guide that way. And judging from numerous examples in the Java class libraries, neither do they.
I guess the only caveat is that the JSG recommends use of meaningful names. And the corollary is that you should only use numbers in identifiers when they are semantically meaningful. Good examples are
"3D",
"i18n" ( == internationalization ),
"2020" (the year),
"X509" (a standard), and so on.
Even "int2Real" is meaningful in a folksy way.
UPDATE
@biziclomp has raised the case of LayoutManager2, and claims that the 2 conveys no meaning.
Here's what the javadoc says about the purpose of this interface:
This minimal extension to LayoutManager is intended for tool providers who wish to the creation of constraint-based layouts. It does not yet provide full, general support for custom constraint-based layout managers.
From this, I would say that the 2 in the name is meaningful. Basically, it is saying that you can view this as a successor to LayoutManager. I guess that could have been said in words, but see the examples above on how numbers are used as shorthand.
@BlueRaja writes:
The 2 does not explain anything - how is LayoutManager2 any different from LayoutManager?
The advice of the Style Guide is NOT that names should explain things. Rather, it advises that they should be meaningful. (For the explanation, refer to the javadoc.) Obviously meaningfulness is relative, but there is a practical limit on the amount of information you can put into an identifier before it becomes hard to read and hard to type.
My take is that the identifier should remind the reader of the meaning of the thing (class, field, method, etc.) that is named.
It is a trade-off.
Methods should be verbs, in mixed case with the first letter lowercase, with the first letter of each internal word capitalized.
This phrasing alone already shows that they use a more general meaning of verb than the usual, where only is would be the verb, neither session nor post are verbs. The sentence means something like Method names should be verbs or verbal phrases, ..., and numbers can very well be parts of verbal phrases.
The idea is that a complete method call can be read as a complete sentence, with the subject being the object before the dot, the verb being the method name, and additional objects being the arguments to the method:
if (buffer.isEmpty())
buffer.append(word);
(Most such sentences would be either questioning or imperative ones.)
From a naming convention viewpoint, the only problem with your method name is that the subject of the sentence (the session) is not the this object of your method but a parameter. I think this can't be avoided with Java (please, someone prove me wrong).
For multiple-parameter methods the smalltalk approach would work better:
"Hello" replace: "e" with: "x"
(where replace:with: is one method of the string class.)
Yes, in some circumstances. For example, maybe you want to handle X.509 certificates. I think it would be perfectly acceptable to write a method called handleX509Certificate.
The only problem I see with using numbers in method names is that it may be an indication that something in your design could be improved upon. (I hesitate to say "is wrong.") For instance, in your example, you stated that you have a specific policy which comes into effect after 2011. However, having a method specifically to check for that year seems overly specific and magic-number-y. I'd instead suggest creating a generalized function to check if an event occurred after a specified date as Anon suggested.
(Anon's answer popped up while I was halfway through mine, so my apologies if it seems like I'm just duplicating what he said. I felt that mine expanded on what he was saying a bit, so I thought I'd post it anyway.)
I would consider calling your method something else. Nothing against numbers exactly, but what happens if the project slips its release date? You'll have a method called post2011 when it should be called post2012 now. Consider calling it postProjNameImplentation instead maybe?
Using numbers is not bad in itself, but it is not very common.
In this specific case, I don't think isSessionPost2011(int id) {} is a good name; isSessionPostYear(int id, int year) {} is better and more extensible for future uses.
The fact that it is a coding convention, and the use of the verb "should", suggest that digits are permitted but advised against in method names. However, in your example, why not generalize the code as follows?
session.isPostYear(int year);
We use 'em all the time, like the example you showed. Also for interface versions, like IConnection2 and IConnection3.
Eclipse doesn't complain that it's a nontraditional name, either. :)
But acceptable? That's kind-of up to you.
Don't ever forget - rules are made to be broken. The only absolute should be that there are no absolutes.
I don't believe there's a per se reason to avoid numbers in identifiers, although in the case you describe, I don't know that I'd use it. Rather, I'd name the method something like boolean isPolicyXyzApplicable(int id).
If this is a policy that's expected to change more over time, consider splitting policies out into different classes so you don't end up growing a long vine of if(isPolicyX) ... else if(isPolicyY) ... else if(isPolicyZ) ... in your methods. Once this is factored out, use an abstract or interface method Policy.isApplicableTo(transaction) and a collection of Policy objects to determine what to do.
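One way that refactoring could look, as a sketch under stated assumptions: only Policy.isApplicableTo comes from the suggestion above, while DatedTransaction, EffectiveFromPolicy and PolicyEngine are hypothetical names, and "applicable" is assumed to mean "the transaction date is on or after the policy's effective date".

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical transaction type that only carries the date the policies need.
final class DatedTransaction {
    final LocalDate date;

    DatedTransaction(LocalDate date) {
        this.date = date;
    }
}

// Each policy decides for itself whether it applies.
interface Policy {
    boolean isApplicableTo(DatedTransaction transaction);
}

// Assumed rule: a policy applies from its effective date onward.
final class EffectiveFromPolicy implements Policy {
    private final LocalDate effectiveFrom;

    EffectiveFromPolicy(LocalDate effectiveFrom) {
        this.effectiveFrom = effectiveFrom;
    }

    @Override
    public boolean isApplicableTo(DatedTransaction transaction) {
        return !transaction.date.isBefore(effectiveFrom);
    }
}

// Replaces the if / else-if chain with a loop over a collection of policies.
final class PolicyEngine {
    private final List<Policy> policies;

    PolicyEngine(List<Policy> policies) {
        this.policies = policies;
    }

    List<Policy> applicablePolicies(DatedTransaction transaction) {
        List<Policy> applicable = new ArrayList<>();
        for (Policy policy : policies) {
            if (policy.isApplicableTo(transaction)) {
                applicable.add(policy);
            }
        }
        return applicable;
    }
}
```

Adding a 2014 policy then means adding one more object to the list, with no new year-specific method name anywhere.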
As long as you have a reason for using numbers, then imho I think it's fine.
For your example, there might be 2 isSessionPost methods, so how would you name them? isSessionPost and isSessionPost2? Not very clear, to be honest.
Just remember that all names must be meaningful and you won't go wrong.
I think in your case it's OK to use it as a one-off marker, specifically if you expect that the method will only live for a short period of time and eventually be deprecated.
If I understand your use case, you need to bring some legacy data into the new version of your application. If this is the case, then definitely add this method, mark it @deprecated and retire it when all your clients are updated.
On the other hand, Ralph here has a valid point. Don't let this project slip into 2012 :)
nothing is wrong
String int2string(int i)
User findUser4Id(long id)
void startHibern8();
wow! this website doesn't like these method names! I got captchaed!
My boss keeps using the term "string bashing" (we're a Java shop) and usually makes an example out of me whenever I ask him anything (as if, I'm supposed to know it already). I Googled the term only to find results pertaining to theoretical physics and string theory.
I am guessing it has something to do with using String/StringBuilders incorrectly or not in keeping with best practices, but for the life of me, I can't figure out what it is.
"String bashing" is a slang term for cutting up strings and manipulating them: splitting, joining, inserting, tokenizing, parsing, etc.
It's not inherently bad (despite the connotation of "bashing"), but as you point out, in Java, one needs to be careful not to use String when StringBuilder would be more efficient.
Why don't you ask your boss for an example of string bashing?
Don't forget to ask him for the correct way of refactoring the examples he gives you.
Out of context, "string bashing" doesn't really have any meaning in itself. It's not a buzz word for any good or bad behaviour. It would just mean "bashing strings", as in using string operations.
Whether that is good or bad depends on what you are doing, and the role of the strings would not really be important. There are good and bad ways of handling any kind of data.
Sometimes "bashing strings" is actually the best solution. Consider for example that you want to pick out the first three characters of a string. You could create a regular expression that isolates the characters, but that would certainly be overkill as there is a simple string operation that can do the same, which is a lot faster and easier to maintain.
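To illustrate the comparison, here is a sketch of both approaches in Java. PrefixDemo and its method names are hypothetical, and handling strings shorter than three characters by returning them unchanged is an assumption the answer does not address.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PrefixDemo {
    private PrefixDemo() {}

    // Plain string operation: short, obvious, no pattern to decode.
    static String firstThreeWithSubstring(String text) {
        return text.length() <= 3 ? text : text.substring(0, 3);
    }

    // The regex equivalent: up to three characters anchored at the start.
    // DOTALL lets '.' match line terminators so both methods agree.
    private static final Pattern FIRST_THREE = Pattern.compile("^.{0,3}", Pattern.DOTALL);

    static String firstThreeWithRegex(String text) {
        Matcher matcher = FIRST_THREE.matcher(text);
        return matcher.find() ? matcher.group() : "";
    }
}
```

Both methods return the same result, but the substring version needs no pattern, no matcher, and no explanation of flags.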
Effective Java has an item about using strings: "Item 50: Avoid strings where other types are more appropriate". Also on stackoverflow: "Stringly typed".
A guess: It might imply something related to creation of unnecessary temporary objects, and in this particular case Strings. For example, if you're constructing a String token by token then it's usually a good idea to use a StringBuilder. If the String is not built using a builder, each concatenation will cause another temporary object to be created (and later garbage collected).
In modern VMs (I'm thinking HotSpot 1.5 or 1.6) this is rarely a problem unless you're in performance critical code or you're building long strings, e.g. in for loops.
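If that guess is right, the difference would look roughly like the sketch below. JoinDemo and its method names are hypothetical; both methods produce the same result, and only the number of intermediate String objects differs.

```java
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
final class JoinDemo {

    // Concatenation in a loop: every += produces a new String that copies
    // everything accumulated so far, leaving the previous one as garbage.
    static String joinWithConcat(List<String> tokens) {
        String result = "";
        for (String token : tokens) {
            result += token;
        }
        return result;
    }

    // StringBuilder appends into one growable buffer and creates the
    // final String once, in toString().
    static String joinWithBuilder(List<String> tokens) {
        StringBuilder sb = new StringBuilder();
        for (String token : tokens) {
            sb.append(token);
        }
        return sb.toString();
    }
}
```

For a handful of short tokens the two are practically indistinguishable; the builder version starts to matter when the loop runs many times or the strings get long.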
Only a guess; might be better to ask what he or she means? I've never heard the term before.
There are a few results on Google which refer to string bashing in this context. They don't appear to refer to the concern about the inefficient temporaries and using StringBuilder.
Instead, it appears to refer to simplistic string parsing. I.e. doing stuff like checking for substrings, slicing the string, etc. In particular, it appears to have the implication of it being a hacky solution to the problem.
It might be frowned upon because you should either use real parsing or obtain the data in a non-string format.
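As an illustration of that distinction (my own example, not one from those search results), compare slicing a year out of an ISO date string with parsing the date properly. DateParsingDemo and its methods are hypothetical names, and the sketch assumes Java 8 or later for java.time.

```java
import java.time.LocalDate;

// Hypothetical demo class; the names are illustrative only.
final class DateParsingDemo {

    // "Bashing" approach: assumes the year occupies the first four characters
    // and never looks at the rest of the string.
    static int yearBySlicing(String isoDate) {
        return Integer.parseInt(isoDate.substring(0, 4));
    }

    // Parsing approach: LocalDate.parse checks the whole ISO-8601 date and
    // throws DateTimeParseException if it is malformed.
    static int yearByParsing(String isoDate) {
        return LocalDate.parse(isoDate).getYear();
    }
}
```

The slicing version happily accepts "2011-99-99", while the parsing version rejects it, which is probably the kind of hackiness the term is hinting at.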
I overheard two of my colleagues arguing about whether or not to create a new data model class which contains only one string field and a setter and a getter for it. A program will then create a few objects of the class and put them in an array list. The guy who is storing them argues that there should be a new type, while the guy who is getting the data says there is no point going through all this trouble when you can simply store a string.
Personally I prefer creating a new type so we know what's being stored in the array list, but I don't have strong arguments to persuade the 'getting' data guy. Do you?
Sarah
... a new data model class which only contains one string field and a setter and a getter for it.
If it was just a getter, then it is not possible to say in general whether a String or a custom class is better. It depends on things like:
consistency with the rest of your data model,
anticipating whether you might want to change the representation,
anticipating whether you might want to implement validation when creating an instance, add helper methods, etc,
implications for memory usage or persistence (if they are even relevant).
(Personally, I would be inclined to use a plain String by default, and only use a custom class if for example, I knew that it was likely that a future representation change / refinement would be needed. In most situations, it is not a huge problem to change a String into custom class later ... if the need arises.)
However, the fact that there is proposed to be a setter for the field changes things significantly. Instances of the class will be mutable, where instances of String are not. On the one hand this could possibly be useful; e.g. where you actually need mutability. On the other hand, mutability would make the class somewhat risky for use in certain contexts; e.g. in sets and as keys in maps. And in other contexts you may need to copy the instances. (This would be unnecessary for an immutable wrapper class or a bare String.)
(The simple answer is to get rid of the setter, unless you really need it.)
There is also the issue that the semantics of equals will be different for a String and a custom wrapper. You may therefore need to override equals and hashCode to get a more intuitive semantic in the custom wrapper case. (And that relates back to the issue of a setter, and use of the class in collections.)
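For what it's worth, a setter-free wrapper with value-based equality might look like the sketch below. The class name Label and its field are placeholders, since the question doesn't say what the string represents.

```java
import java.util.Objects;

// Hypothetical immutable wrapper; "Label" is a placeholder name.
final class Label {
    private final String value;

    Label(String value) {
        // Reject null up front so equals/hashCode never have to deal with it.
        this.value = Objects.requireNonNull(value, "value");
    }

    String getValue() {
        return value;
    }

    // Two Labels are equal when they wrap equal strings, mirroring String semantics.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Label)) return false;
        return value.equals(((Label) o).value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }

    @Override
    public String toString() {
        return "Label(" + value + ")";
    }
}
```

Because there is no setter, instances are safe to put in a HashSet or use as map keys, and no defensive copying is needed.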
Wrap it in a class, if it matches the rest of your data model's design.
It gives you a label for the string so that you can tell what it represents at run time.
It makes it easier to take your entity and add additional fields and behavior (which can be a likely occurrence).
That said, the key is if it matches the rest of your data model's design... be consistent with what you already have.
Counterpoint to mschaef's answer:
Keep it as a string, if it matches the rest of your data model's design. (See how the opening sounds so important, even if I temper it with a sentence that basically says we don't know the answer?)
If you need a label saying what it is, add a comment. Cost = one line, total. Heck, for that matter, you need a line (or three) to comment your new class, anyway, so what's the class declaration for?
If you need to add additional fields later, you can refactor it then. You can't design for everything, and if you tried, you'd end up with a horrible mess.
As Yegge says, "the worst thing that can happen to a code base is size". Add a class declaration, a getter, a setter, now call those from everywhere that touches it, and you've added size to your code without an actual (i.e., non-hypothetical) purpose.
I disagree with the other answers:
It depends whether there's any real possibility of adding behavior to the type later [Matthew Flaschen]
No, it doesn’t. …
Never hurts to future-proof the design [Alex]
True, but not relevant here …
Personally, I would be inclined to use a plain String by default [Stephen C]
But this isn’t a matter of opinion. It’s a matter of design decisions:
Is the entity you store logically a string, a piece of text? If yes, then store a string (ignoring the setter issue).
If not, then do not store a string. That the data may be stored as a string is an implementation detail; it should not be reflected in your code.
For the second point it’s irrelevant whether you might want to add behaviour later on. All that matters is that in a strongly typed language, the data type should describe the logical entity. If you handle things that are not text (but may be represented by text, may contain text …) then use a class that internally stores said text. Do not store the text directly.
This is the whole point of abstraction and strong typing: let the types represent the semantics of your code.
And finally:
As Yegge says, "the worst thing that can happen to a code base is size". [Ken]
Well, this is so ironic. Have you read any of Steve Yegge’s blog posts? I haven’t, they’re just too damn long.
It depends whether there's any real possibility of adding behavior to the type later. Even if the getters and setters are trivial now, a type makes sense if there is a real chance they could do something later. Otherwise, clear variable names should be sufficient.
In the time spent discussing whether to wrap it in a class, it could be wrapped and done with. Never hurts to future-proof the design, especially when it only takes minimal effort.
I see no reason why the String should be wrapped in a class. The premise of the discussion is that what is needed right now is a String object. If it needs to be augmented later, refactor it then. Why add unnecessary code in the name of future-proofing?
Wrapping it in a class provides you with more type safety - in your model you can then only use instances of the wrapper class, and you can't easily make a mistake where you put a string that contains something different into the model.
However, it does add overhead, extra complexity and verbosity to your code.
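A small sketch of that trade-off, with made-up names (ProductCode, Catalog) and an assumed non-empty rule standing in for whatever the real model would check:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper type; the validation rule is an assumption for illustration.
final class ProductCode {
    private final String code;

    ProductCode(String code) {
        if (code == null || code.isEmpty()) {
            throw new IllegalArgumentException("code must be non-empty");
        }
        this.code = code;
    }

    String getCode() {
        return code;
    }
}

// Hypothetical model class that only accepts the wrapper type.
final class Catalog {
    private final List<ProductCode> codes = new ArrayList<>();

    void add(ProductCode code) {
        codes.add(code);
    }

    int size() {
        return codes.size();
    }
}
```

Here catalog.add("some free text") would not compile, which is the type safety being bought; the extra class and constructor call are the overhead being paid for it.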