Validating Java MessageFormat strings

I'm working on a Play 2 app which is being translated. Play uses Java's MessageFormat behind the scenes, so I have a fair number of property values à la:
my.interface.key={0,choice,0#{0} families|1#1 family|1<{0,number,integer} families}
I just received back a translation of this in the form:
my.interface.key={0,choix,0#{0} familles|1#1 famille|1<{0,nombre,entier} familles}
If it's not obvious, some bits of that should not have been translated, but mistakes will happen from time to time. That's fair enough, but I'm sure there must be a way of validating these strings before my app crashes at runtime with an IllegalArgumentException: unknown format type at ... exception. Preferably with a Git commit hook, or even an SBT build task.
If I were to hack this up myself, I would probably make a tool to read these property files and check that, for each value, running MessageFormat.format(value) doesn't blow up.
Ideally I could do this via a Perl (or Python) script. Sadly, the only non-Java library I can find - Text::MessageFormat on CPAN - doesn't seem to support the most error-prone formats, such as pluralisation.
Can anyone suggest a more sensible approach based on existing tooling before I dive in?
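The check described in the question can be sketched in a few lines of Java. This is only a minimal sketch: the class and method names (MessageFormatValidator, findBrokenPatterns, parse) are hypothetical, and it assumes that constructing a java.text.MessageFormat from each value is enough to surface pattern errors, since the constructor parses the pattern and throws IllegalArgumentException when it is invalid.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.text.MessageFormat;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper; not part of Play or the JDK.
final class MessageFormatValidator {

    // Returns key -> parser error for every value that MessageFormat rejects.
    static Map<String, String> findBrokenPatterns(Properties messages) {
        Map<String, String> errors = new TreeMap<>();
        for (String key : messages.stringPropertyNames()) {
            try {
                // The constructor parses the pattern; no arguments are needed.
                new MessageFormat(messages.getProperty(key));
            } catch (IllegalArgumentException e) {
                errors.put(key, e.getMessage());
            }
        }
        return errors;
    }

    // Convenience for scripts and tests: parse .properties text from a string.
    static Properties parse(String propertiesText) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties;
    }
}
```

A build task or commit hook could load each messages file this way and fail when the returned map is not empty. Note that this only catches patterns that fail to parse; it would not notice a translation that is syntactically valid but changes the meaning.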

We had a similar problem. Our solution was to create classes that model the structure of the message format, then use XML to define the messages in our message bundle.
If the translator uses an XML editor then there is some hope they won't "break" the structure of the message.
See this answer for details.

Related

Java - Managing many if else statements which must be easily changed

I have a little design dilemma. I have Java and SQL and no rules engine, and I don't want to implement a full-on rules engine either.
My scenario:
I have some input data, i.e. a code, a description and an amount.
I will pass these into a function which runs lots of if/else statements (my business rules) and determines the output.
I can do this in Java, but the codes and descriptions may change at any time, and so can the business rules, so my if/else chains need to be easy to change. Given what I have to work with, my thought was to manage the many if/else branches in a SQL stored procedure instead: it can be changed by editing the stored procedure and hitting F5, whereas with Java I'd have to modify the code, recompile and deploy, which takes much longer.
I would like to know if anyone has had such a problem and what were their experiences and successful approaches. The requirement is speed and being able to edit these business rules easily.
Thanks guys

If your requirement is only to change the values checked in the if and else statements, then the answer by ema is the right way to go. If the logic itself must also be changed and refreshed on the fly, then you need to externalize it and deploy it separately. There are several ways to do this. In my experience I've used Drools, a rule engine library originally from Codehaus and now from JBoss, which lets you build anything from very simple to very complex rules in a scriptable way, so that you can change your rule files, deploy them and reload them. This is the link to their site: http://www.drools.org/
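For the simpler case, where only the values being compared change, one possible shape (not necessarily what the referenced answer proposes) is to keep those values in an external properties file and look them up at runtime. In this sketch the class name ThresholdRules, the "limit.<code>" key scheme and the APPROVE/REVIEW outcomes are all made up for illustration.

```java
import java.math.BigDecimal;
import java.util.Properties;

// Hypothetical example: the rule shape is fixed in code, the values live in config.
final class ThresholdRules {
    private final Properties config;

    ThresholdRules(Properties config) {
        this.config = config;
    }

    // Looks up "limit.<code>", falling back to "limit.default" when the code is unknown.
    String decide(String code, BigDecimal amount) {
        String raw = config.getProperty("limit." + code, config.getProperty("limit.default"));
        if (raw == null) {
            return "REVIEW"; // no limit configured at all: be conservative
        }
        BigDecimal limit = new BigDecimal(raw);
        return amount.compareTo(limit) <= 0 ? "APPROVE" : "REVIEW";
    }
}
```

Reloading the Properties object from disk changes the limits without recompiling. Changing the shape of a rule still needs a code change, which is the point where a rule engine starts to pay off.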

How to generate sequence diagrams automatically on executing junit

I have been given the task of generating sequence diagrams automatically when a JUnit test case is executed in Eclipse. I am learning UML. I found tools that can generate a sequence diagram, and I am familiar with JUnit, but how do I combine the two?
The tools that I found good were UMLet, ModelGoon UML and ObjectAid. I zeroed in on ModelGoon, which I found simple and easy to use. Can this task be automated? If so, please guide me.
If there are any other tools available, please point me to them.

First: This is a very good idea, and there are several ways to go about it. I will assume that you are working in a JVM language (e.g. Kotlin or Java), so the suggestions I make are biased by that.
Direct approach
Set up your logging to log using JSON; it makes the rest much simpler: https://www.baeldung.com/java-log-json-output
Make a library where you log the name of the component/method you are in, and the session you are processing. There are many ways of doing this, but a simple one is to use a thread-local variable: set the variable to contain the name of the thing you are tracing ("usecase foobar") and some unique ID (UUIDs are a decent choice). Another would be to generate some tracing ID (or get one from an external interaction) and send that as a parameter to all involved methods. Both of these will work, and which one is simplest in practice depends on the architecture of your application.
In the methods you want to trace, write a log entry that contains that tracing information (name of usecase, trace ID, or any combination thereof), the location where the log entry was written, and any other information you want to add to your sequence diagram.
Run your test normally. A log will be produced. You need to be able to retrieve that log. There are many ways this can be done, use one :-)
Filter the log entries so you get only the ones you are interested in. Using the "jq" utility is a decent choice.
Process the filtered output to generate PlantUML input files (http://plantuml.com/) for sequence diagrams.
Process the PlantUML files to get sequence diagrams.
Done.
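The step that turns filtered log entries into PlantUML input could look roughly like this. It is a sketch under assumptions: the Call fields (traceId, caller, callee, label) stand in for whatever you chose to log, and the class name SequenceDiagramWriter is made up. The output uses PlantUML's basic sequence syntax, one `"caller" -> "callee" : label` line per call between `@startuml` and `@enduml`.

```java
import java.util.List;

// Hypothetical converter from traced calls to PlantUML sequence-diagram text.
final class SequenceDiagramWriter {

    // One traced call, as it might be recovered from a filtered JSON log entry.
    static final class Call {
        final String traceId;
        final String caller;
        final String callee;
        final String label;

        Call(String traceId, String caller, String callee, String label) {
            this.traceId = traceId;
            this.caller = caller;
            this.callee = callee;
            this.label = label;
        }
    }

    // Emits the calls belonging to one trace, in log order.
    static String toPlantUml(String traceId, List<Call> calls) {
        StringBuilder out = new StringBuilder("@startuml\n");
        for (Call call : calls) {
            if (!call.traceId.equals(traceId)) {
                continue; // entry belongs to another test or session
            }
            out.append('"').append(call.caller).append("\" -> \"")
               .append(call.callee).append("\" : ")
               .append(call.label).append('\n');
        }
        return out.append("@enduml\n").toString();
    }
}
```

Writing the returned text to a .puml file and running PlantUML over it then produces the diagram image.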
Industrial approach
Use some standard tooling for tracing, such as OpenTracing (https://opentracing.io/), instrument your application with it, and extract your diagrams using that standard tooling.
This will also work in production and will probably scale much better than the direct approach, but if scaling isn't your thing, then the direct approach may be what you want to do.

how to format i18n files?

I have three files for internationalization: messages_es.properties, messages_en.properties and messages_pt.properties. These files follow the pattern:
message1=value
message2=value2
and the values change according to the file. Example:
messages_en.properties:
hello=welcome
messages_pt.properties:
hello=bem vindo
The problem is that over the course of the project these files have become inconsistent: lines that exist in one file don't exist in the others, and the lines are not in the same order across files. Is there an easy way to rearrange and format these i18n files, so that lines that exist in one file but not in another are copied over and the lines are ordered identically?

Interesting question. You are dealing with text files, so there are a lot of possible options for managing this situation, but it depends on your scenario (source control, IDE, etc.).
If you are using Eclipse, check: http://marketplace.eclipse.org/content/eclipse-resourcebundle-editor
And for IntelliJ: https://www.jetbrains.com/idea/features/i18n_support.html

Yes, the messages should usually appear in each file, unless there's a default message for some key that doesn't need translating (perhaps technical terms). Different IDEs have different support for managing message files.
As far as ordering the messages, there's no technical need to do so, but it can help the human maintainers. Any text-editor's sort routine will work just fine.

The NetBeans IDE has a properties editor that works across languages, displaying them side by side in a matrix. Similarly, there are stand-alone editors that can do this. One would assume that such an editor would keep the source text synchronized and in one consistent layout.
First go looking for a translator's editor that can maintain a fixed layout. A format like gettext (.po/.pot) which is similar to .properties might be a better choice, depending on the tool.
For more than three languages it would make sense to use a source format more directed at translators, like the XML format xliff (though .properties are well known). And generate from this source (via XSLT perhaps) the several .properties files, or even ListResourceBundles.
The effort for i18n should not stop at providing a list of phrases to translate, but should include some info where needed (a disambiguating note), and maybe even a glossary for consistent use of the same term. The text presented to the user is a very significant part of the product's quality and appeal. Using different synonyms may make the user interface fuzzy, needlessly unclear, tangled.

The problem you are facing is a broken localization process. It has nothing to do with properties files, and it is likely that you shouldn't even compare these files now (that is, until you fix the process).
To compare properties files, you can use a very simple trick: sort each one of them and use a standard diff tool to show the differences. Sure, you'll miss the comments and the logical arrangement in the English file, but at least you can see what's going on. That could be done, but it is a lot of manual work.
Instead of manually fixing the files, you should fix the broken process. A successful localization process basically looks like this:
Once the English file is modified, send the English file for translation. By that I mean all the translations should be based on the English file and the localization files should be recreated (stay tuned).
Use Translation Memory to fill up the translations you already have. This could be done by your translation service provider or yourself if you really know how to do it (guess what? it is difficult).
Have the translators translate strings that are missing.
Put localized file back.
Before releasing the software to the public, have somebody walk the Linguistic Reviewer through the UI and correct mistranslations.
I intentionally skipped a few steps (like localization testing, using pseudo-translations and searching for i18n defects, etc.), but if you use this kind of process, your properties files should always be in sync.
And now your question could be reduced to the one that was already asked (and answered):
Managing the localization of Java properties files.

Look at java.util.PropertyResourceBundle. It is a convenience class for reading a property file and you can obtain a Set<String> of the keys. This should help to compare the contents of several resource files.
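A minimal sketch of that comparison, assuming the English file is treated as the reference. The class and method names (BundleKeyDiff, keysOf, missingKeys) are hypothetical; PropertyResourceBundle(Reader) and keySet() are the standard JDK calls.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for spotting keys that a translation file lacks.
final class BundleKeyDiff {

    // Parses .properties text and returns its keys.
    static Set<String> keysOf(String propertiesText) {
        try {
            return new PropertyResourceBundle(new StringReader(propertiesText)).keySet();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Keys present in the reference bundle but absent from the translation, sorted.
    static Set<String> missingKeys(Set<String> reference, Set<String> translation) {
        Set<String> missing = new TreeSet<>(reference);
        missing.removeAll(translation);
        return missing;
    }
}
```

For real files, the text would come from something like Files.newBufferedReader on messages_en.properties and messages_pt.properties. Swapping the two arguments lists keys that exist only in the translation.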
But I think that a better approach is to maintain the n languages in a single file, e.g., using XML and to generate the resource files from a single source.
<entry>
<key>somekey</key>
<value lang="en">good bye</value>
<value lang="es">hasta luego</value>
</entry>

define error code, number or string?

When I use enterprise applications, I am often greeted by some error for which I need to consult the help desk. I find that many applications still tend to use a number as the error code instead of a human-readable string. Given that most enterprise applications are written in modern languages like Java or C#, I cannot figure out what the benefit of a numeric error code is.
So the question is: for enterprise applications, is there a commonly adopted pattern for defining error codes? Is there any reason to prefer a number over a string?
BTW: I understand that applications using a REST API are likely to use the HTTP status code as the error code, which makes sense because the HTTP status code is itself a number. For other cases, I don't understand it.

It's usually convenient to have both, a code, numeric or otherwise, and something human-readable.
The code makes it easy for machines to know what happened (and serves as shorthand for humans), and is independent of wording and locale.

The single greatest benefit of error codes -- along with more informative strings -- is that they can be looked up even after your code has been translated into another language which you may not read.
The next greatest benefit is that if someone ever writes code that reads your error messages -- perhaps your own company, to help folks manage your application -- it is tremendously helpful for them to have an error code at the start of the message. That both speeds up their decision of what to do, and partially guards them against the risk that you might rephrase the message later (which would mess up attempts to search for the message text).
Any convention can work if you reliably stick with it. If you're thinking about the long term and multiple products, you'll probably want the code to include some indication of which code (application and/or library and/or other module) issued the error, and then you want to be able to quickly find the error in that product's support table.
IBM usually uses a moderately recognizable alphabetic prefix to identify the code, and a numeric suffix to indicate the specific message. Other companies probably do it differently.

I recently had a similar problem and decided to use the following rules:
Similar errors are grouped in types (in Java these would be Exception Classes)
Each error has enumerated value that identifies it (that is an error code)
Each error has human readable message
Errors optionally can have a cause (in many situations your errors were generated because some other error occurred, and that error is a cause)
You could argue about the need of the second rule since you could have as many types as possible errors. However if your system is growing, sooner or later you will find the need of introducing new types of errors, which will entail modification of your API and that's not always possible and even if it is, it might be not very easy since you will have to modify all of your clients. Enumerated error code list is simply easier to maintain in that case.
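In Java, the four rules might translate into something like the sketch below. All names and numbers here (ErrorCode, ApplicationException, OrderException, 1001, 1002) are invented for illustration: the exception subclass is the error type, the enum constant is the error code, and the message and optional cause ride on the standard Throwable fields.

```java
// Rule 2: an enumerated value identifies each error (hypothetical codes).
enum ErrorCode {
    ORDER_NOT_FOUND(1001),
    PAYMENT_DECLINED(1002);

    private final int number;

    ErrorCode(int number) {
        this.number = number;
    }

    int number() {
        return number;
    }
}

// Rules 3 and 4: a human-readable message and an optional cause.
class ApplicationException extends RuntimeException {
    private final ErrorCode code;

    ApplicationException(ErrorCode code, String message, Throwable cause) {
        super(message, cause); // cause may be null
        this.code = code;
    }

    ErrorCode getCode() {
        return code;
    }

    // Code first, so support staff and log parsers can key on it.
    String describe() {
        return code.name() + " (" + code.number() + "): " + getMessage();
    }
}

// Rule 1: similar errors are grouped into a type.
class OrderException extends ApplicationException {
    OrderException(ErrorCode code, String message, Throwable cause) {
        super(code, message, cause);
    }
}
```

Adding a new error then usually means adding one enum constant rather than a new exception class, which is the maintenance argument made above.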

Is it possible to add custom metadata to .class files?

We have used liquibase at our company for a while, and we've had a continuous integration environment set up for the database migrations that would break a job when a patch had an error.
An interesting "feature" of that CI environment is that the breakage had a "likely culprit", because all patches need to have an "author", and the error message shows the author name.
If you don't know what liquibase is, that's ok, it's not the point.
The point is: having a person's name attached to an error is really good for the software development process: problems get addressed much faster.
So I was thinking: Is that possible for Java stack traces?
Could we possibly have a stack trace with people's names alongside the line numbers, like the one below?
java.lang.NullPointerException
at org.hibernate.tuple.AbstractEntityTuplizer.createProxy(AbstractEntityTuplizer.java:372:john)
at org.hibernate.persister.entity.AbstractEntityPersister.createProxy(AbstractEntityPersister.java:3121:mike)
at org.hibernate.event.def.DefaultLoadEventListener.createProxyIfNecessary(DefaultLoadEventListener.java:232:bob)
at org.hibernate.event.def.DefaultLoadEventListener.proxyOrLoad(DefaultLoadEventListener.java:173:bob)
at org.hibernate.event.def.DefaultLoadEventListener.onLoad(DefaultLoadEventListener.java:87:bob)
at org.hibernate.impl.SessionImpl.fireLoad(SessionImpl.java:862:john)
That kind of information would have to be pulled out from a SCM system (like performing "svn blame" for each source file).
Now, forget about trashing the compilation time for a minute: Would that be even possible?
To add metadata to class files like that?

In principle you can add custom information to .class files (there's an attribute section where you can add stuff). You will have to write your own compiler/compiler extension to do so. There is no way to add something to your source code that then will show up in the class file.
You will also have major problems in practice:
The way stack traces are built/printed is not aware of anything you add to the class file. So if you want this stuff printed like you show above, you have to hack some core JDK classes.
How much detail do you want? The last person who committed any change to a given file? That's not precise enough in practice, unless files are owned by a single developer.
Adding "last-committed-by" information at a finer granularity, say per method or, even worse, per line, will quickly bloat your class file (and the class file format has hard limits of its own, such as 65,535 constant-pool entries).
As a side note, whether or not blaming people for bugs helps getting bugs fixed faster strongly depends on the culture of the development organization. Make sure you work in one where this helps before you spend a lot of time developing something like this.

Normally such feature can be implemented on top of the version control system. You need to know revision of your file in your version control system, then you can call blame/annotate command to get information on who has changed each individual line. You don't need to store this info into the class file, as long as you can identify revision of each class you deploy (e.g. you only deploy certain tag or label).
If you don't want to go into version control when investigating a stack trace, you could store line annotation info in the class file, e.g. using a class post-processor during your build that adds a custom annotation at the class level (this is relatively trivial to implement using ASM). Then the logger that prints the stack trace could read this annotation at runtime, similarly to showing jar versions.

One way to add custom information to your class files is to use annotations in the source code. I don't know how you would put that information reliably in the stack trace, but you could create a tool to retrieve it.
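As a rough sketch of such a tool: a class-level annotation with runtime retention can be read back by class name, which is exactly what each stack frame provides. The annotation name LastChangedBy, the sample class ProxyFactory and the formatter AnnotatedStackTrace are all hypothetical. The sketch formats its own copy of the trace and does not change what the JDK prints, and it works per class, not per line.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical annotation; RUNTIME retention keeps it readable via reflection.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface LastChangedBy {
    String value();
}

// Sample class carrying the metadata (it could be added by hand or by a build step).
@LastChangedBy("john")
class ProxyFactory {
    static void createProxy() {
        throw new IllegalStateException("boom");
    }
}

final class AnnotatedStackTrace {

    // Renders the throwable's frames, appending the annotated name where one is found.
    static String describe(Throwable t) {
        StringBuilder out = new StringBuilder(t.toString()).append('\n');
        for (StackTraceElement frame : t.getStackTrace()) {
            out.append("\tat ").append(frame);
            String author = authorOf(frame.getClassName());
            if (author != null) {
                out.append(" [").append(author).append(']');
            }
            out.append('\n');
        }
        return out.toString();
    }

    // Returns the annotated name for a class, or null if absent or not loadable.
    static String authorOf(String className) {
        try {
            Class<?> type = Class.forName(className, false,
                    AnnotatedStackTrace.class.getClassLoader());
            LastChangedBy annotation = type.getAnnotation(LastChangedBy.class);
            return annotation == null ? null : annotation.value();
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }
}
```

A logging wrapper could call describe(e) wherever it currently prints the exception. Frames from classes without the annotation, such as JDK or library classes, are printed unchanged.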

As @theglauber correctly pointed out, you can use annotations to add custom metadata. Although I am not really sure whether you couldn't instead retrieve that information from your database, by implementing beans and decorating your custom exception manager.
