How to designate resources as do-not-translate? - java

I work on the localization of Java software, and my projects have both .properties files and XML resources. We currently use comments to instruct translators to not translate certain strings, but the problem with comments is that they are not machine-readable.
The only solution I can think of is to prefix each do-not-translate key with something like _DNT_ and train our translation tools to ignore these entries. Does anyone out there have a better idea?

Could you break the files up into ones to be translated or ones to be not translated and then only send them the one that are to be translated? (Don't know the structure so har dto know when answering if that is practical...)

The Eclipse JDT also uses comments to prevent the translation of certain Strings:
How to write Eclipse plug-ins for the international market
I think your translation tool should work in a similar way?

The simplest solution is to not put do-not-translate strings (DNTs) in your resource files.
.properties files don't offer much in the way of metadata handling, and since you don't need the data at runtime, its presence in .properties files would be a side-effect rather than something that is desirable. Consider too, partial DNTs where you have something that cannot be translated contained in a translatable string (e.g. a brand name or URI).
"IDENTIFIER english en en en" -> "french fr IDENTIFIER fr fr"
As far as I am aware, even standards like XLIFF do not take DNTs into consideration and you'll have to manage them through custom metadata files, terminology files and/or comments (such as the note element in XLIFF).

Like axelclk posted in his link... eclipse provide a
//$NON-NLS-1$
Statement to notify the project that the first string in this line should not translated. All other string you can find by calling
Source->Externalize Strings
External Strings include all languages you want to support.
File which include the translations looking like:
PluginPage.Error1 = text1
PluginPage.Error2 = text2
Class which read the translation
private static final String BUNDLE_NAME = "com.plugin.name"; //$NON-NLS-1$
private static final ResourceBundle RESOURCE_BUNDLE = ResourceBundle.getBundle(BUNDLE_NAME);
private PluginMessages() {
}
public static String getString(String key) {
// TODO Auto-generated method stub
try {
return RESOURCE_BUNDLE.getString(key);
} catch (MissingResourceException e) {
return '!' + key + '!';
}
}
And you can call it like:
String msg = PluginMessages.getString("PluginPage.Error2"); //$NON-NLS-1$
EDIT:
When a string is externalized and you want to use the original string, you can delete the externalize string from all properties files, without the default one. When the Bundle can not find a message file which is matching to the local language, the default is used.
But this is not working at runtime.

If you do decide to use do-not-translate comments in your properties files, I would recommend you follow the Eclipse convention. It's nothing special, but life will be easier if we all use the same magic string!
(Eclipse doesn't actually support DO-NOT-TRANSLATE comments yet, as far as I know, but Tennera Ant-Gettext has an implementation of the above scheme which is used when converting from resource bundles to Gettext PO files.)

Related

Reading the spss file java

SPSSReader reader = new SPSSReader(args[0], null);
Iterator it = reader.getVariables().iterator();
while (it.hasNext())
{
System.out.println(it.next());
}
I am using this SPSSReader to read the spss file. Here,every string is printed with some junk characters appended with it.
Obtained Result :
StringVariable: nameogr(nulltpc{)(10)
NumericVariable: weightppuo(nullf{nd)
DateVariable: datexsgzj(nulllanck)
DateVariable: timeppzb(null|wt{l)
DateVariable: datetimegulj{(null|ns)
NumericVariable: commissionyrqh(nullohzx)
NumericVariable: priceeub{av(nullvlpl)
Expected Result :
StringVariable: name (10)
NumericVariable: weight
DateVariable: date
DateVariable: time
DateVariable: datetime
NumericVariable: commission
NumericVariable: price
Thanks in advance :)
I tried recreating the issue and found the same thing.
Considering that there is a licensing for that library (see here), I would assume that this might be a way of the developers to ensure that a license is bought as the regular download only contains a demo version as evaluation (see licensing before the download).
As that library is rather old (copyright of the website is 2003-2008, requirement for the library is Java 1.2, no generics, Vectors are used, etc), I would recommend a different library as long as you are not limited to the one used in your question.
After a quick search, it turned out that there is an open source spss reader here which is also available through Maven here.
Using the example on the github page, I put this together:
import com.bedatadriven.spss.SpssDataFileReader;
import com.bedatadriven.spss.SpssVariable;
public class SPSSDemo {
public static void main(String[] args) {
try {
SpssDataFileReader reader = new SpssDataFileReader(args[0]);
for (SpssVariable var : reader.getVariables()) {
System.out.println(var.getVariableName());
}
} catch (Exception ex) {
ex.printStackTrace();
}
}
}
I wasn't able to find stuff that would print NumericVariable or similar things but as those were the classnames of the library you were using in the question, I will assume that those are not SPSS standardized. If they are, you will either find something like that in the library or you can open an issue on the github page.
Using the employees.sav file from here I got this output from the code above using the open source library:
resp_id
gender
first_name
last_name
date_of_birth
education_type
education_years
job_type
experience_years
monthly_income
job_satisfaction
No additional characters no more!
Edit regarding the comment:
That is correct. I read through some SPSS stuff though and from my understanding there are only string and numeric variables which are then formatted in different ways. The version published in maven only gives you access to the typecode of a variable (to be honest, no idea what that is) but the github version (that does not appear to be published on maven as 1.3-SNAPSHOT unfortunately) does after write- and printformat have been introduced.
You can clone or download the library and run mvn clean package (assuming you have maven installed) and use the generated library (found under target\spss-reader-1.3-SNAPSHOT.jar) in your project to have the methods SpssVariable#getPrintFormat and SpssVariable#getWriteFormat available.
Those return an SpssVariableFormat which you can get more information from. As I have no clue what all that is about, the best I can do is to link you to the source here where references to the stuff that was implemented there should help you further (I assume that this link referenced to in the documentation of SpssVariableFormat#getType is probably the most helpful to determine what kind of format you have there.
If absolutely NOTHING works with that, I guess you could use the demo version of the library in the question to determine the stuff through it.next().getClass().getSimpleName() as well but I would resort to that only if there is no other way to determining the format.
I am not sure, but looking at your code, it.next() is returning a Variable object.
There has to be some method to be chained to the Variable object, something like it.next().getLabel() or it.next().getVariableName(). toString() on an Object is not always meaningful. Check toString() method of Variable class in SPSSReader library.

In Java, how do I extract the domain of a URL?

I'm using Java 8. I want to extract the domain portion of a URL. Just in case I'm using the word "domain" incorrectly, what i want is if my server name is
test.javabits.com
I want to extract "javabits.com". Similarly, if my server name is
firstpart.secondpart.lastpart.org
I want to extract "lastpart.org". I tried the below
final String domain = request.getServerName().replaceAll(".*\\.(?=.*\\.)", "");
but its not extracting the domain properly. Then I tried what this guy has in his site -- https://www.mkyong.com/regular-expressions/domain-name-regular-expression-example/, e.g.
private static final String DOMAIN_NAME_PATTERN = "^((?!-)[A-Za-z0-9-]{1,63}(?<!-)\\.)+[A-Za-z]{2,6}$";
but that is also not extracting what I want. How can I extract the domain name portion properly?
Summary: Do not use regex for this. Use whois.
If I try to extrapolate from your question, to find out what you really want to do, I guess you want to find the domain belonging to some non-infrastructural owner from the host part of a URL. Additionally, from the tag of your question, you want to do it with the help of a regex.
The task you are undertaking is at best impractical, but probably impossible.
There are a number of corner cases that you would have to weed out. Apart from the list of infrastructural domains kindly provided by Lennart in https://publicsuffix.org/list/public_suffix_list.dat, you also have the cases of an empty host field in the URL or an IP-address forming the host part.
So, is there a better approach to this? Of course there is. What you do want to do is query a public database for the data you need. The protocol for such queries is called WHOIS.
Apache Commons provide an easy way to access WHOIS information in the WhoisClient. From there you can query the domain field, and find some more information that may be useful to you.
It shouldn't be harder than
import org.apache.commons.net.whois.WhoisClient;
import java.io.IOException;
public class CommonsTest {
public static void main(String args) {
WhoisClient c = new WhoisClient();
try {
c.connect(WhoisClient.DEFAULT_HOST);
System.out.println(c.query(URL));
c.disconnect();
} catch (IOException e) {
e.printStackTrace();
}
}
}
Using this will get you the whois information aboutt he domain you are asking for. If the domain is uregistered, that is, is a private domain, as in the case of www.stackexchange.com you will get an error saying no domain is registered. Remove the first part of the address and try again. Once you found the registered domain, you will also find the registrar and the registrer.
Now, unfortunately, whois is not as simple as one would think. Read further on https://manpages.debian.org/jessie/whois/whois.1.en.html for an elaboration on how to use it and what information you can expect from different sources.
Also, check related questions here.
try it like this:
String parts[] = longDomain.split(".");
String domain = parts[parts.length-2] + "." + [parts.length -1];

How to resolve External Control of File Name or Path (CWE ID 73)

I am working on fixing Veracode issues in my application. Veracode has highlighted the flaw "External Control of File Name or Path (CWE ID 73) " in below code.
Thread.currentThread().getContextClassLoader().getResourceAsStream(lookupName)
How do I validate the parameter? If I need to use below ESAPI validation, then what is the exact parameter I should be passing in getValidFileName() method. Currently I am passing the parameters as below.
ESAPI.validator().getValidFileName(lookupName, lookupName,
ESAPI.securityConfiguration().getAllowedFileExtensions(), false);
Correct me whether I am following the right approach for fixing this issue.
There are several suggestions at: https://community.veracode.com/s/article/how-do-i-fix-cwe-73-external-control-of-file-name-or-path-in-java
You can use hardcoded values, if these files are stored in the server side.
(i.e.: in a HashMap).
Another solution is to use a custom validator (from veracode page) :
// GOOD Code
String extension = request.getParameter("extension");
File f = new File(buildValidAvatarPath(extension))
#FilePathCleanser
public String buildValidAvatarPath(extension) {
String[] allowedExtensions = new String[]{"jpg","gif","png"};
String extension = "png"; // Default extension
for (String allowedExtension: allowedExtensions) {
if (allowedExtension.equals(request.getParameter("extension"))) {
extension = request.getParameter("extension");
}
}
// See "Note on authorization"
User user = getCurrentUser();
if (!userMayAccessFile(user, path)) {
throw new AuthorizationException("User may not access this file", user);
}
File(configPath + "avatar." + extension)
return path;
}
Okay, so the problem is that you are allowing user-control of that file path. Imagine its on a UNIX box and they enter:
../../../../../../../etc/shadow
Whatever user privileges are granted to the user running that java Thread is possible to expose to the user in question. I don't know what processing is going on in your application, but the danger is that you need to prevent user control of that lookup variable.
The call you're making is consistent with the single test in ValidatorTest.java, which is definitely a deficiency in code coverage on our behalf.
Now, there's an excellent chance that even if you use this call that Veracode might still flag it: the default file list in ESAPI.properties will need to be either truncated for your use case, or you'll have to create your own Validator rule for legal file extensions for your specific use case.
Which brings up the next bit: There's a lot of mischief that can happen in regards to file uploads.
In short, to be actually secure about file uploads will require more than what ESAPI currently offers, which is unfortunately, only an extension check. In your particular case, make sure you try some directory traversal attacks. And use that OWASP link to help analyze your application.
Given that the OP wants to clear the issue in Veracode, you would want to chain a couple calls:
ESAPI.validator().getValidDirectoryPath() and ESAPI.Validator.getValidFileName()
But be sure you've properly truncated the extension list in HttpUtilities.ApprovedUploadExtensions in validator.properties as the default list is too permissive, at least until we release 2.1.0.2.
I have to stress however that even with this particular combination there is absolutely nothing ESAPI does to prevent a user from renaming "netcat.exe" to "puppies.xlsx" and bypassing your validation check, that's why the rant on the first part of this answer.
ESAPI's file validation is NOT secure, it's quite simply better than nothing at all.
Doing this correctly requires more work than just using 1-2 calls to ESAPI.
DISCLAIMER: as of this writing I am the project co-lead for ESAPI.
You can change file name by sanitizing it as below code snippet:
private static String sanitizeFileName(String name) {
return name
.chars()
.mapToObj(i -> (char) i)
.map(c -> Character.isWhitespace(c) ? '_' : c)
.filter(c -> Character.isLetterOrDigit(c) || c == '-' || c == '_' || c == ':')
.map(String::valueOf)
.collect(Collectors.joining());
}

How is obfuscation done in Java?

Today I came across an obfuscated class (well a lot of obfuscated classes in a jar) and I do not have a clue on how this kind of obfuscation is done.
An example:
protected void a(ChannelHandlerContext ☃, ByteBuf ☃, ByteBuf ☃)
throws Exception
{
int ☃ = ☃.readableBytes();
if (☃ < this.c)
{
☃.b(0);
☃.writeBytes(☃);
}
else
{
byte[] ☃ = new byte[☃];
☃.readBytes(☃);
☃.b(☃.length);
this.b.setInput(☃, 0, ☃);
this.b.finish();
while (!this.b.finished())
{
int ☃ = this.b.deflate(this.a);
☃.writeBytes(this.a, 0, ☃);
}
this.b.reset();
}
}
}
As you see above, all the parameter variables are a snow-man. How can this be undone? Also how is it done in the first place; how is the JVM able to "process" those and execute the code without any problem?
To clarify, I am not going to use this code, it is just for educational purposes. I am taking the Computer Science course at school so since we are learning Java and talking of limitations such as decompilations. I am interested in learning more, so I decided to have a look into bigger projects especially servers. This piece of code is pulled out of the Spigot server for Minecraft (A game) that is a fork of Bukkit server for Minecraft that was supposed to be open source.
First of all, you should note that it is the parameters which have this unicode and not the methods. Why is this important?
Parameters do not need to have names specified, as they are mostly indexed by a number reference. However it can be specified and I assume that most java runtimes do in fact not check this name as it is not needed for execution.
In the opposite, class names, method names, and field names are however needed.
About you mentioning Spigot, Spigot is indeed open source. However you most likely decompiled a class which is originally from the original Mojang Minecraft server, which is not open source and is indeed obfuscated.
Edit: In the case you want to investigate these classes, I recently found a tool called Bytecode Viewer, which is available at https://github.com/Konloch/bytecode-viewer
This tool has multiple decompilers as well as some options to view a more bytecode like version of the class file.
An example of a function I found contains the following bytecode data:
<localVar:index=1 , name=☃ , desc=D, sig=null, start=L1, end=L2>
<localVar:index=3 , name=☃ , desc=D, sig=null, start=L1, end=L2>
<localVar:index=5 , name=☃ , desc=D, sig=null, start=L1, end=L2>
Indeed as is visible, the unicode name has been set the same, but it does not matter as in the end the indexes (1,3,5) are used to reference these variables.
protected void a(ChannelHandlerContext ☃, ByteBuf ☃, ByteBuf ☃)
This isn't valid. You cannot have multiple parameters with the same name. It could be that you are not reading the unicode text with the right text format.
Your Text editor is showing the value of the unicode character.
I just tested on eclipse and names with unicode characters are acceptable.
public String publicationXmlUrl(int \u9090currentPage) {
But writing with values are not:
public String publicationXmlUrl(int ♥currentPage) {

Strategy to consolidate Java webapp configuration files for multiple deployments

I apologize if this is a duplicate, I couldn't find anything describing exactly what I wanted. I'm building a webapp that has a number of different properties that need to change depending on the environment in addition to a number of .properties configuration files that need to change as well. Right now I have a global enum (DEVELOPMENT, STAGING, and PRODUCTION) that is used to determine which string constants are used in the application and then I utilize a bunch of comments in the configuration files to switch between database servers, etc. There has got to be a better way to do this...I'd ideally like to be able to make one change in one file (A large block comment would be fine...) to adjust these configurations. I saw this post where the answer is to utilize JNDI which I really like, but it would seem as though I would need to call that from a servlet that starts up or a bean that gets initialized on start in order to use it for my log4j or JDBC configuration files.
Does anybody have any strategies for dealing with this?
Thanks!
I'm not sure if this strategy will apply to your situation, but in the past I've successfully used our build tool (ant in that case) to build different war files depending on the profile. So you would have multiple log4j configuration files in your source tree, and then delete the ones you don't want from the final build depending on the profile that was used to build it.
Traceability becomes slightly hard (i.e. difficult sometimes to figure out which one was used to build it), but it's a very clean solution, from your code perspective, since it's all done in your build script.
We store all our default configuration values in a single XML file. During deployment we apply an XML patch (RFC-5261) with values specific to the environment.
https://www.rfc-editor.org/rfc/rfc5261
http://xmlpatch.sourceforge.net/
I am going to assume that your properties files are made up of 95% name=value pairs that are identical across all your deployment environments and 5% of name=value pairs that change from one deployment environment to another.
If this assumption is correct, then you could try something like the following pseudocode.
void generateRuntimeConfigFiles(int deploymentMode)
{
String[] searchAndReplacePairs;
if (deploymentMode == Constants.PRODUCTION) {
searchAndReplacePairs = ...
} else if (deploymentMode == Constants.STAGING) {
searchAndReplacePairs = ...
} else { // Constants.DEVELOPMENT
searchAndReplacePairs = ...
}
String[] filePairs = new String[] {
"log4j-template.properties", "log4j.properties",
"jdbc-template.properties", "jdbc.properties",
"foo-template.xml", "foo.xml",
...
};
for (int i = 0; i < filePairs.length; i += 2) {
String inFile = filePairs[i + 0];
String ouFile = filePairs[i + 1];
searchAndReplaceInFile(inFile, outFile,
searchAndReplacePairs);
}
}
Your application calls generateRuntimeConfigFiles() before initialising anything else that might rely on properties/XML files.
Now the only problem you have to deal with is how to store and retrieve different settings for searchAndReplacePairs. Perhaps you could obtain them from files with names such as production.properties, staging.properties and development.properties.
If the above approach is appealing to you, then email me for the source code of searchAndReplaceInFile() to save you having to re-invent the wheel. You can find my email address from the "info" box in my Stackoverflow profile.
I suggest using Apache Commons Configuration. It provides all the plumbing for dealing with different configurations depending on your environment.
http://commons.apache.org/configuration

Categories

Resources