First, I'll explain my code in a nutshell.
A config file is present which has a list of IDs. In a for loop these IDs are read one at a time and a list of JSON structures is created. If everything goes well without any exception (e.g. some expected data not being present) they are pushed into a database.
Coming to my question: for each ID, there are a bunch of business rules which are executed. I've coded it in such a way that if any of the expected data is missing, or if a business rule fails at any point, I don't insert the data into the DB. The processing for that ID stops there, an error message is written to a log file, and processing continues with the next ID. My question is: can this be described as the Fail Fast design pattern?
As answered at CodeRanch: http://www.coderanch.com/t/663709/java/java/Fail-Fast-Design-Pattern
As is often the case, the answer is "it depends". It depends on what you consider a failure.
Let's say that an empty config file is an error condition that you want to know about. A "fail fast" approach might be to throw an exception immediately after detecting the file is empty, thus halting the operation at the point of failure. The alternative might be to continue with the processing and end up writing no data into the database. At this point you have a bug in that there's no data in the database. Why is that? Did the database insert code fail? Is there a problem with the database itself? Did the file processing fail? Is the data corrupt? You're now on a bug hunt to track down the error and you have lots of possibilities.
For each ID, some business rules are applied during processing. If the requirement is that all IDs are valid, then aborting processing upon finding an invalid ID would be the "fail fast" approach. However, in your example you appear to be simply ignoring invalid data and only writing valid data to the database. This is not failing in any way at all; rather, you are making a decision about how to handle invalid data, and you simply ignore it. So no, you are not failing fast in this design.
The purpose of "fail fast" is to abort the program at the exact point of failure citing the exact cause of failure in order to avoid entering into a running but invalid state.
We're currently running into an interesting problem regarding the sanitization of errors printed into our server logs. We have proper global error handling set up, and have custom error messages that are sent back as responses from our OSGi Java servlets.
We use dockerized containers as server instances that are autoscaled, so we're thinking about setting up a log aggregator and storing our exceptions within a DB in the cloud, that way we can also track metrics about our exceptions and pinpoint how we could improve our development process to reduce certain types of errors, etc.
I did a bit of research about how that should be done and found the OWASP Logging Cheat Sheet. It mentions that passwords should never be logged, among a few other things. That brings us to my question:
How do I go about properly sanitizing my logs without using some janky text processing or manually covering up all the potential cases?
Example stacktrace:
pkg.exceptions.CustomException: some registration error
ERROR: duplicate key value violates unique constraint "x_username_org_id_key"
Detail: Key (username, org_id)=(SOME EMAIL, 1) already exists.
Query: with A as (some query) insert into someTable (..values...) Parameters: [X, X, X, X, X, SOME_EMAIL, THE_PASSWORD]
at somepkg.etc
This is a pretty common error with registration systems that happens due to username collisions. Sure, there are ways this specific case can be avoided, by ensuring the username isn't taken before the insertion is attempted and handling that case separately, but that's just a single case among many others.
After looking around, there doesn't seem to be an obvious way to solve the problem, and I'm wondering whether everyone out there has simply implemented their own version of a log sanitizer. We could simply purge the stack trace if certain troublesome strings are present, but that's not the best solution. Any suggestions?
If you only store and pass around password hashes you won't need to sanitize the logs for passwords. In cases where a password must be preserved temporarily in code use char[]s rather than Strings. This is a more secure approach in general and is considered a best practice. The standard library APIs all use character arrays for passwords.
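As a minimal sketch of the char[] approach (hashPassword() and store() are hypothetical stand-ins; in practice you'd delegate to bcrypt, scrypt, Argon2, or similar):

Console console = System.console();                    // java.io.Console; may be null in an IDE
char[] password = console.readPassword("Password: ");  // never lands in a String
try {
    String hash = hashPassword(password);  // hypothetical: delegate to a real hashing library
    store(hash);                            // only the hash is persisted or logged
} finally {
    java.util.Arrays.fill(password, '\0');  // wipe the secret from memory when done
}

Because no String ever holds the raw password, anything logged downstream can only ever contain the hash.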
In some automated tests, I am trying to delete and immediately recreate an index at the start of every test, using Elasticsearch's high-level REST client (version 6.4), as follows:
// Delete the existing index; lenient options avoid an error if it is missing
DeleteIndexRequest deleteIndexRequest = new DeleteIndexRequest(indexName);
deleteIndexRequest.indicesOptions(IndicesOptions.lenientExpandOpen());
client.indices().delete(deleteIndexRequest, RequestOptions.DEFAULT);

// Recreate the index with the desired mapping
CreateIndexRequest createIndexRequest = new CreateIndexRequest(indexName);
createIndexRequest.mapping("_doc", "{...}", XContentType.JSON);
client.indices().create(createIndexRequest, RequestOptions.DEFAULT);
The problem I have is that, intermittently, my tests fail at the point of creating the index, with an error:
{"error": {"root_cause":[{"type":"resource_already_exists_exception","reason":"index [(index-name)/(UUID)] already exists, ...,}] "status":400}
The more tests I run, the more likely I am to see the error, which seems to be a strong indicator that it's a race condition - presumably when I try to recreate the index, the previous delete operation hasn't always completed.
This is backed up by the fact that if I put a breakpoint immediately after the delete operation and manually run a curl request to look at the index I tried to delete, I find that it's still there some of the time; on those occasions, the error above appears if I continue the test.
I've tried asserting the isAcknowledged() method of the response to the delete operation, but that always returns true, even in cases when the error occurs.
I've also tried doing an exists() check before the create operation. Interestingly, in that case, if I run the tests without breakpoints, the exists() check always returns false (i.e. the index doesn't exist) even in cases where the error will then occur; but if I put a breakpoint in before the create operation, the exists() check returns true in cases where the error will happen.
I'm at a bit of a loss. As far as I understand, my requests should be synchronous, and from a comment on this question, this should mean that the delete() operation only returns when the index has definitely been deleted.
I suspect a key part of the problem might be that these tests are running on a cluster of 3 nodes. In setting up the client, I'm only addressing one of the nodes:
client = new RestHighLevelClient(RestClient.builder(new HttpHost("example.com", 9200, "https")));
but I can see that each operation is being replicated to the other two nodes.
When I stop at a breakpoint before the create operation, in cases where the index is not deleted, I can see that it hasn't been deleted on any of the nodes, and no matter how long I wait, it never gets deleted.
Is there some way I can reliably determine whether the index has been deleted before I create it? Or perhaps something I need to do before I attempt the delete operation, to guarantee that it will succeed?
Hey, I think there are quite a few things to think about here. For one, I'd test everything with curl or some kind of REST client before doing anything in code. It might just help you conceptually, but that's just my opinion.
This is one thing you should consider:
"If an external versioning variant is used, the delete operation automatically creates an index if it has not been created before (check out the create index API for manually creating an index)."
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete.html
That would partly explain why exists() returns false: if the external versioning variant is used, then the delete operation would actually create an index with the same name prior to deleting it.
You mentioned that you are working with a three-node cluster. Something you can try is:
"When making delete requests, you can set the wait_for_active_shards parameter to require a minimum number of shard copies to be active before starting to process the delete request." Here is a super detailed explanation which is certainly worth reading: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-wait-for-active-shards
I suggest you try:
curl -X DELETE 127.0.0.1:9200/fooindex?wait_for_active_shards=3
You said you have 3 nodes in your cluster, so this means that "...indexing operation will require 3 active shard copies before proceeding, a requirement which should be met because there are 3 active nodes in the cluster, each one holding a copy of the shard."
This check is probably not 100% watertight, since according to the docs here: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-wait-for-active-shards
"It is important to note that this setting greatly reduces the chances of the write operation not writing to the requisite number of shard copies, but it does not completely eliminate the possibility, because this check occurs before the write operation commences. Once the write operation is underway, it is still possible for replication to fail on any number of shard copies but still succeed on the primary. The _shards section of the write operation’s response reveals the number of shard copies on which replication succeeded/failed." so perhaps use this parameter, but have your code check the response to see if any operations failed.
Something else you can try (though I can't seem to find good documentation to back this up) is wait_for_completion, which should be able to tell you if the cluster isn't ready to accept deletes:
curl -X DELETE 127.0.0.1:9200/index?wait_for_completion=true
I have a situation in which I have to update video details on Facebook. That update needs two different calls. Now my problem is:
If either of the calls fails, what should I do?
I don't want to show the user partially updated data, nor do I want to retry, as that might fail again if it has already failed once.
The only solution I can think of is to make a new call that reverts the previous call's change, but that doesn't seem like a nice solution.
Can someone suggest a better approach?
It's a difficult situation.
Prior to making any API calls you should do anything you can to check for possible reasons why the calls may fail ahead of time - like missing authentication or authorization.
I agree that it is best practice not to show the user partially updated data, so you have a few options depending on what kind of failure you are up against.
If it fails with a network-like error you could just attempt a single retry to the 2nd endpoint.
Otherwise I'd recommend rolling back. If your rollback call fails, you should inform your user that some data has changed and could not be rolled back. If the error is a network error again, you could create a queued job to run later that rolls back the changes.
In any event, you should inform your user.
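As a minimal sketch of that rollback approach (ApiException and all of the method names here are hypothetical stand-ins for the real Graph API calls and error type):

boolean firstCallSucceeded = false;
try {
    updateVideoTitle(videoId, newTitle);       // first call
    firstCallSucceeded = true;
    updateVideoTags(videoId, newTags);         // second call
} catch (ApiException e) {
    if (firstCallSucceeded) {
        try {
            updateVideoTitle(videoId, oldTitle);   // compensating call
            notifyUser("The update failed; your changes were rolled back.");
        } catch (ApiException rollbackFailure) {
            // A queued job could retry this rollback later
            notifyUser("The update failed and could not be fully rolled back.");
        }
    } else {
        notifyUser("The update failed; no changes were made.");
    }
}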
I have my program working and all done (java). It's a short and easy program for a job interview. I handle stuff like improper input format by throwing a custom exception. Is that the best way to do it or should I just make a print statement?
Exceptions are only useful if they will be handled by other code.
If you're writing a reusable library, you should by all means throw an exception.
There is nothing more frustrating than calling a third-party library that logs errors to the console instead of telling your code about them.
However, if you're writing a standalone utility, it's nicer to print friendly error messages than an ugly stack trace.
The most flexible approach is to write reusable code that throws exceptions, then add catch blocks in main() (or elsewhere in the standalone portion) that print friendly messages.
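A minimal sketch of that split; InvalidInputException and parsePositiveInt() are hypothetical examples:

class InvalidInputException extends Exception {
    InvalidInputException(String message) { super(message); }
}

public class Cli {
    // Reusable code: reports problems by throwing, never by printing
    static int parsePositiveInt(String input) throws InvalidInputException {
        try {
            int value = Integer.parseInt(input.trim());
            if (value <= 0) {
                throw new InvalidInputException("Expected a positive number, got: " + input);
            }
            return value;
        } catch (NumberFormatException e) {
            throw new InvalidInputException("Not a number: " + input);
        }
    }

    // Standalone entry point: translates exceptions into friendly messages
    public static void main(String[] args) {
        try {
            System.out.println(parsePositiveInt(args.length > 0 ? args[0] : ""));
        } catch (InvalidInputException e) {
            System.err.println("Sorry, bad input: " + e.getMessage());
        }
    }
}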
If you handle the improper format inline, is the code still readable? If so, fine; if not, throw an exception and handle it elsewhere.
Are you able to handle the improper format properly in the place where you are parsing it, or is some more generic method/class/module actually calling your routine and better placed to decide what to do? If the latter is the case, throw an exception.
In general, it depends. If you can handle this special situation "inline", you can do so (make sure it stays readable). If not, throw an exception.
Here's a good reference on exception best practices. You should make sure you are following these.
In your particular case (based on the details you have provided), a user may upload/select a file that has bad data. Your program should handle that by catching any basic Java runtime issues and returning information to the user (not "Exception in thread..." but something more readable to a user). If you are checking for these alpha characters, then you should just handle that (with an error to the user) without throwing an exception - unless this is truly the behavior you want.
Exceptions arise when the program cannot continue in a normally correct manner.
Exceptions get more complicated and more numerous as you move from J2SE to J2EE.
For a standalone application
If your application is just an extremely simple calculator, you may almost forget about exceptions, because user input would be filtered and one of the few exceptions would be division by zero.
If your application is a simple utility tool, say screen capture, then if your file cannot be saved (an exception at file I/O), all you need to do is terminate your tasks and show an error message to the user.
For a more advanced version of example 2, you would need to save the image to a temporary location and perform the final save once the issue is rectified.
For an enterprise-scale, distributed application
Here transactions (interrelated activities) are involved. A simple message to the user is still needed at times, but you must also handle the exception (make the needed changes to related transactions)!
If the application is distributed across many countries, then an exception in one transaction may need alteration on another server in another country; this calls for the optional incorporation of something that uses the JMS API (message sending inside the application).
JPA (Java Persistence API) implicitly rolls back the database in the event of an exception and provides facilities to do so for interrelated transactions. But the rollback only affects the database, not instance variables (object values).
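A minimal sketch of what that looks like with a resource-local EntityManager (entityManagerFactory and entity are stand-ins; in a container, JTA would usually manage this for you):

EntityManager em = entityManagerFactory.createEntityManager();
EntityTransaction tx = em.getTransaction();
try {
    tx.begin();
    em.persist(entity);   // in-memory changes to 'entity' fields are not transactional
    tx.commit();
} catch (RuntimeException e) {
    if (tx.isActive()) {
        tx.rollback();    // restores the database, not your Java object state
    }
    throw e;
} finally {
    em.close();
}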
And at no point do you want the user to read your exact stack trace that says "error at line number ...".
First, this may be a stupid question, but I'm hoping someone will tell me so, and why. I also apologize if my explanation of what/why is lacking.
I am using a servlet to upload a HUGE (247MB) file, which is pipe (|) delimited. I grab about 5 of 20 fields, create an object, then add it to a list. Once this is done, I pass the list to an OpenJPA transactional method called persistList().
This would be okay, except for the size of the file. It's taking forever, so I'm looking for a way to improve it. An idea I had was to use a BlockingQueue in conjunction with the persist/persistList method in a new thread. Unfortunately, my skills in java concurrency are a bit weak.
Does what I want to do make sense? If so, has anyone done anything like it before?
Servlets should respond to requests within a short amount of time. In this case, the persist of the file contents needs to be an asynchronous job, so:
The servlet should respond with some text about the upload job, expected time to complete or something like that.
The uploaded content should be written to some temp space in binary form, rather than keeping it all in memory. This is the usual way the multi-part POST libraries do their work.
You should have a separate service that blocks on a queue of pending jobs. Once it gets a job, it processes it (see the sketch after this list).
The 'job' is simply some handle to the temporary file that was written when the upload happened... and any metadata like who uploaded it, job id, etc.
The persisting service needs to insert a large number of rows while making the whole job appear 'atomic': either model the intermediate state as part of the table model(s), or write to temp spaces.
If you are writing to temp tables, and then copying all the content to the live table, remember to have enough log space and temp space at the database level.
If you have a full J2EE stack, consider modelling the job queue as a JMS queue, so recovery makes sense. Once again, remember to have proper XA boundaries, so all the row persists fall within an outer transaction.
Finally, consider also having a status check API and/or UI, where you can determine the state of any particular upload job: Pending/Processing/Completed.
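A minimal sketch of the queue-and-worker idea using a BlockingQueue (UploadJob and the commented-out persistence call are hypothetical stand-ins for your real parsing and OpenJPA code):

import java.nio.file.Path;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class UploadJob {
    final Path tempFile;  // handle to the uploaded content written to temp space
    final String jobId;   // metadata: job id, who uploaded it, etc.
    UploadJob(Path tempFile, String jobId) {
        this.tempFile = tempFile;
        this.jobId = jobId;
    }
}

final class PersistWorker implements Runnable {
    private final BlockingQueue<UploadJob> queue;
    PersistWorker(BlockingQueue<UploadJob> queue) { this.queue = queue; }

    @Override
    public void run() {
        try {
            while (true) {
                UploadJob job = queue.take();  // blocks until a job is available
                // parseAndPersist(job);       // e.g. read tempFile, call persistList()
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // allow a clean shutdown
        }
    }
}

// Wiring: start one worker thread; the servlet just enqueues and returns.
// BlockingQueue<UploadJob> queue = new LinkedBlockingQueue<>();
// new Thread(new PersistWorker(queue), "persist-worker").start();
// queue.put(new UploadJob(tempFile, jobId));  // from the servlet, after saving the upload

The servlet returns immediately after the put(), satisfying the "respond quickly" requirement, while the worker drains the queue in the background.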