The purpose of this question is to evaluate whether it is reasonable to use Drools for complex display-value lookup processing within a web application. I have been tasked with retrieving coverage information (from an insurance policy) stored in a vendor database and displaying it within a custom web app. There are hundreds of coverages, and the display values for each coverage can be based on a combination of 4 or 5 different columns per type of coverage. I think there may be as many as 40 different types of coverage.
So, with that being said, would Drools or a decision tree mechanism provide a good way of handling this? I should point out that it is very likely that we'll need to add or modify the coverage information often, and one thing that draws me to this mechanism is that the BAs could help keep the rules up to date. I am worried, however, that speed may be adversely affected by this option. Currently I have a working prototype that uses a database in combination with reflection, with XML stored in the database, to make sure the mapping is done successfully.
I am open to other options if you can think of them as well.
Thanks,
Jeremy
Based solely on the fact that you want to offer BAs the possibility to dynamically update the coverage rules, it seems that Drools would offer you a big head start.
Regarding performance, it seems very unlikely to me that it would be an issue, as you're really talking about a small number of facts and rules. The biggest performance penalty is usually the parsing of the rules, which can be done at start-up and cached/shared afterward.
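To illustrate the "parse once at start-up, then reuse" point, here is a minimal sketch in plain Python. It is not Drools and does not use any Drools API; the rule format, the column names and the display texts are all made up for the example.

```python
from functools import lru_cache

# Hypothetical rule source: each line is "column=value,... => display text".
# "*" means "any value". None of this is Drools syntax.
RULES_TEXT = """
type=AUTO,limit=100000 => Auto liability, $100k limit
type=AUTO,limit=* => Auto liability
type=HOME,deductible=500 => Homeowners, $500 deductible
"""

@lru_cache(maxsize=1)
def compiled_rules():
    """Parse the rule text once; later calls reuse the cached result."""
    rules = []
    for line in RULES_TEXT.strip().splitlines():
        conditions, display = line.split(" => ")
        pairs = dict(item.split("=") for item in conditions.split(","))
        rules.append((pairs, display))
    return tuple(rules)

def display_value(coverage):
    """Return the display text of the first rule whose conditions all match."""
    for conditions, display in compiled_rules():
        if all(v == "*" or str(coverage.get(k)) == v for k, v in conditions.items()):
            return display
    return None

print(display_value({"type": "AUTO", "limit": 100000}))
```

The parsing cost is paid on the first call only; every later lookup is a scan over a small in-memory tuple, which is why a few hundred rules are unlikely to be a bottleneck.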
I have a collection of 350 locations in the United States with each containing about 25 subcategories. The data structure looks something like this:
Location (ex: Albany, NY)
--> Things to do
--> Population
... 23 More
Which of the following would be best for loading this data into the app: JSON, XML, or SQLite? Just to clarify, I don't need to edit this data in any way. I simply need to read it so that the information can be loaded into TextViews.
Edit:
I'm attempting to implement Room and XML and so far the XML seems to be the simplest to implement. Is it bad practice to use the XML solution? It doesn't seem to be using too many resources and it isn't running slow at all when tested on a few devices. Would it still be a better practice to implement the Room solution?
Undoubtedly, among all of these, an RDB is the most efficient one, both in terms of storage and query response. I personally do not see any point in using XML or JSON, as these have traditionally been used for exchanging data and are inefficient for storage and queries.
I would suggest that you evaluate the following:
a) How are you going to store the data: a single file vs. multiple files (for example, by subject)?
b) Are you going to be updating the strings or just appending? (SQL is better suited for updates, but if you are just reading data after batch processing, flat files might be better suited.)
c) How complex are the queries that you want to implement? XML and SQL are better suited than JSON for queries that address metadata (date stored, original location address, etc.).
Once you determine what you want to optimize: whether it is on adding metadata, fast updates, fast querying, ease of storage, fast retrieval of subject files, etc. then you can decide the tradeoffs with other less important goals. In this specific instance the devil is very much in the details.
In most cases it would be better to use a database because it increases readability and maintainability, especially if you want to show this information inside some kind of list view. If you use JSON or XML, you'll have to write a lot of parsing code to switch between things or to load them with good performance. Consider using Room, LiveData and a RecyclerView: this will reduce the code you need and improve (a lot) the performance and readability of your app code. By the way, you should provide more information about how you want to use this information and where you want to show it. XML (or the Android resource system) should be used if you plan to use the resource system itself, with its qualifiers, to reduce your work. Most of the time JSON is used to communicate with the outside or with another app in an easy way, or for REST requests/responses.
The one option that wouldn't make sense at all for your use case is SQLite. Unless you plan on running specific queries on the data for preprocessing before loading it into your view, it isn't worth the overhead (even if I don't imagine the overhead is large with 350 locations).
XML and JSON serve the same use case without much difference; read up on their specifics on this website: https://www.json.org/xml.html
I would personally go for JSON due to the simplicity of the format.
Edit:
@simo-r's argument is also a valid one in regard to the readability of your code. While there are libraries that can make reading JSON/XML easier, Android has really good SQLite support by default, so it might make sense to use it. Ultimately it comes down to your personal preference and where you see the project growing.
I have a collection of 350 locations in the United States with each containing about 25 subcategories.
The main issue is scalability
Will you, in the next few years, keep just a few hundred locations, or do you imagine that, if your software becomes successful, your data will grow to many thousands of locations?
If yes: choose SQLite, because it can store many records in an efficient way. Don't forget to have a good database schema with appropriate indexes. See this and read about database normalization. Also, an SQLite database could later be migrated (with some effort) to PostgreSQL.
If no (your data has just a few megabytes): keep JSON or XML. The data is in the page cache.
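For the SQLite branch, a schema with appropriate indexes could look roughly like the sketch below. The table and column names are assumptions for illustration, the content values are placeholders, and an in-memory database stands in for the real file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a real app would open a database file instead
conn.executescript("""
CREATE TABLE location (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
-- One row per (location, subcategory) instead of 25 columns per location.
CREATE TABLE subcategory (
    location_id INTEGER NOT NULL REFERENCES location(id),
    label       TEXT NOT NULL,
    content     TEXT NOT NULL,
    PRIMARY KEY (location_id, label)
);
-- Supports queries such as "this subcategory across all locations".
CREATE INDEX idx_subcategory_label ON subcategory(label);
""")
conn.execute("INSERT INTO location (id, name) VALUES (1, 'Albany, NY')")
conn.executemany(
    "INSERT INTO subcategory (location_id, label, content) VALUES (?, ?, ?)",
    [(1, "Things to do", "placeholder text"), (1, "Population", "placeholder text")],
)

def subcategories_for(name):
    """Return {label: content} for one location, looked up by its name."""
    rows = conn.execute(
        "SELECT s.label, s.content FROM subcategory s "
        "JOIN location l ON l.id = s.location_id WHERE l.name = ? ORDER BY s.label",
        (name,),
    )
    return dict(rows.fetchall())

print(subcategories_for("Albany, NY"))
```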
Consider also YAML, and sometimes a mixed approach.
Don't forget to document how your data is organized and accessed.
See also the data persistence chapter of this draft report
If you are simply going to bind the data into text views, you can just store the text in strings.xml. As simple as that.
Go with JSON.
Advantages :
Low overhead ( Vs SQLite )
Lightweight parsers like Jackson are available, with which you can easily convert your data into custom objects or data structures if you need to.
Maintainable, as most developers understand the format.
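As a rough idea of what "convert your data into custom objects" looks like, here is a sketch using Python's standard `json` module rather than Jackson. The JSON layout and the `Location` class are assumptions, and the values are placeholders.

```python
import json
from dataclasses import dataclass

# Hypothetical sample; the real file would hold all 350 locations.
LOCATIONS_JSON = """
[
  {"name": "Albany, NY",
   "subcategories": {"Things to do": "placeholder text", "Population": "placeholder text"}}
]
"""

@dataclass(frozen=True)
class Location:
    name: str
    subcategories: dict

def load_locations(text):
    """Parse the JSON text into Location objects keyed by name."""
    return {item["name"]: Location(item["name"], item["subcategories"])
            for item in json.loads(text)}

locations = load_locations(LOCATIONS_JSON)
print(locations["Albany, NY"].subcategories["Population"])
```

Since the data is read-only, it can be parsed once when the app starts and kept in memory.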
I would suggest using JSON. Reason below
JSON vs XML
JSON is more lightweight than XML and takes fewer resources (network and storage), which helps the app's performance.
JSON parsing is easy and, as mentioned above, it's trivial.
JSON is friendly to javascript, in case it's required.
JSON vs SQLite
A data set of 350 locations with about 25 attributes each can easily be managed with JSON. An RDBMS is not required.
SQLite becomes an overhead. It's an extra layer, and every layer comes with a cost. Especially if the application is containerized, the architecture becomes complicated. One needs to deal with volume mapping etc.; in the case of JSON you can keep the data as part of the application code.
Importantly, since the data is static, keep the application stateless by keeping the data alongside the codebase. This makes a lot more sense from an architectural perspective.
Problem
You have a fixed set of information with a simple structure that you wish to deliver to clients.
Questions to Reflect On
Do I expect this information to ever be significantly changed or modified?
Do I expect to increase the amount of information available?
What kind of help do I have? Do they have a background in software engineering or is it someone of a different profession that has to wear a lot of hats?
What is the scale of the project? Are you expecting a large number of users or just people interested in a very niche application?
JSON or XML
JSON and XML provide similar services: they are both data interchange formats. If the information is not expected to grow, either might be a great option. If it's public information, just serve these files statically over nginx. You can point a worker with limited software engineering experience to update these files; they're just files in a folder presented in a human-readable format, so it's extremely simple to do. These updates should be minor and infrequent.
JavaScript Object Notation(JSON) Pros
solid browser and backend support
small size and fast parsing by the javascript engine
very human readable, easy for the untrained eye to make changes
Extensible Markup Language(XML) Pros
standard meta-data option
supports namespaces
solid backend support and is often baked into frameworks
This article explains XML and JSON differences really well (in 2020) if these highlights were not sufficient for your investigation.
Database System
There are a plethora of database systems out there. Their job is to efficiently retrieve specific information from a large volume of data stored. The key reason to use databases is scalability. Scalability means a number of things; I view it as adapting to drastic change. If you expect this information to frequently change or grow, go with a database.
Object Relational Mapping (ORM)
Databases can be cumbersome to use. I would recommend using an ORM on top of them. ORMs encapsulate a database and make it more user friendly (in a language-specific way). Room makes sense in your use case, especially for Java Android development. Encapsulation also allows you to migrate to other databases later without changing your code. Here's a good article that discusses Room and SQLite!
Miscellaneous
"Is it bad practice to use an XML solution?"
No. The important thing is that it works, is understandable, and runs efficiently. Just keep in mind that XML and JSON are data interchange formats and they do THAT job well. This Stack Overflow discussion may be helpful to gain a better picture of what that means; be sure to read more than just the accepted answer.
"It doesn't seem to be using too many resources and it isn't running slow at all when tested on a few devices."
Although testing for functionality is great, keep in mind that your test is not a load test and does not verify what you're trying to confirm. I would explore load testing; Wikipedia is a good place to start!
I am a newbie with the rules engine, so bear with me if this question is very basic. All the tutorials for rules engines have been saying that you can move your business logic outside your code and get it updated by BAs/ end users instead of putting it inside Java code.
I have the following questions
1. But why can't we write our code to read values from property files and do the same thing?
2. Also, the rules files seem to have a syntax which is not simply one-liners, compared to .properties files.
3. Does putting these rules in a rule engine make the code/app work without requiring an app server restart?
3a. If it does NOT, then how can we achieve it?
I have been doing some reading over the last few days, and I think (IMHO) that the capacity to let business rules be updated through simple spreadsheets gives rules engines the edge over property files. I can make property files as highly configurable as possible, using multiple properties and with instructions for modifying rules as comments under each property.
But in a scenario where the business user is able to directly configure the application to apply values based on a "decision table" in a spreadsheet, then that solution will be more desirable.
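To make the "decision table in a spreadsheet" idea concrete, here is a small sketch in plain Python that evaluates a table exported as CSV. It is not how Drools processes decision tables; the columns, the discount values and the "empty cell means any value" convention are assumptions for the example.

```python
import csv
import io

# Hypothetical decision table, as a business user might export it from a spreadsheet.
# An empty cell means "any value"; rows are tried from top to bottom.
TABLE_CSV = """customer_type,min_order,discount_percent
GOLD,100,15
GOLD,,10
,500,5
"""

def load_table(text):
    """Read the CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def discount_for(table, customer_type, order_total):
    """Return the discount of the first row whose conditions all hold, else 0."""
    for row in table:
        type_ok = not row["customer_type"] or row["customer_type"] == customer_type
        total_ok = not row["min_order"] or order_total >= float(row["min_order"])
        if type_ok and total_ok:
            return int(row["discount_percent"])
    return 0

table = load_table(TABLE_CSV)
print(discount_for(table, "GOLD", 150))
```

The business user only ever edits the rows; the matching logic stays in code and does not change when a row is added.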
If any other (budding) developer looking for a justification of the need for rules engines is convinced by this answer, please leave a thumbs up!
If there's a change in logic, you'll change the properties file and deploy the whole project again. Whereas if you maintain the logic in a BRMS, you can change and test it in the BRMS alone, without redeploying the whole project. Once testing is done and you finally want to put the new rule in place, there is still no need to deploy the whole project to production. If you've exposed your rule as an API using KIE Server, redeploying just the KIE Server will do.
One can write decision tables in such a way that all the logic is contained in the top rows. The developer can then lock and hide those top rows before giving the file to the BA. Now the BA doesn't see any logic but knows how to maintain the file. Also, not all logic should be written as decision tables.
As I mentioned above, one can deploy each rule as a separate REST API, so each is deployable independently of the rest.
In the end, I'd say the main reasons we use Red Hat BRMS are, as they mention in their documentation:
Agility: No need to involve developers for a change request. BAs themselves can change the logic.
Visibility: What you see (in Excel) is what you get.
Consistency: Rules are evaluated the same way every time.
Rules engines are not always the answer. However, they provide, in theory, the advantage that the engine can perform complex processing on a simple rule expression and return a result. Other advantages are visibility to the rules and less code.
Answers to your questions.
You can. In simple cases, using property files makes sense.
Rules need to be sufficiently complex to cover the business issues they validate. A good rules engine uses a syntax that is readable, even if it is complicated.
In theory, the rules server could run independently of the app server. In large companies, that is normal. The rules server could allow updates without a restart, or it could be restarted (rippled, if there are multiple instances,) without affecting the app server.
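On the "updates without a restart" point: for the simple property-file style case, one way to get the same effect inside the app is to re-read the rule file whenever it changes on disk. The sketch below assumes a JSON rule file and a made-up `ReloadingRules` class; it says nothing about how a real rules server handles this.

```python
import json
import os
import tempfile

class ReloadingRules:
    """Holds rules loaded from a JSON file and reloads them when the file changes."""

    def __init__(self, path):
        self.path = path
        self._mtime = None
        self._rules = {}

    def get(self):
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:  # first call, or the file was edited since
            with open(self.path) as fh:
                self._rules = json.load(fh)
            self._mtime = mtime
        return self._rules

# Demo with a temporary file standing in for the externally edited rule file.
path = os.path.join(tempfile.mkdtemp(), "rules.json")
with open(path, "w") as fh:
    json.dump({"max_items": 5}, fh)
rules = ReloadingRules(path)
print(rules.get())
```

Each call to `get()` costs one `stat` call, and the file is only parsed again after its modification time changes.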
A rules engine comes into the picture when the business users of a company want to set certain rules and drive the application based on the outcome of executing that rule set. One example of such a company could be a law firm or an insurance company, where lawyers set the rules that drive quote calculation for an insurance product, and the rules are subject to change over time. A property file is developer territory, and a business user may not be proficient enough to change it. A separate rules engine keeps track of the rules and lets a business user and a developer work together to automate the business seamlessly, which could be difficult with a properties file.
Rule file syntax is a way to convert (verbal) business rules into executable coding instructions; that's where the syntax comes into the picture. That way, the rules engine provides a data abstraction for business entities and their relationships.
Integration with the rules engine may be done through a broker, a web service, or whatever else; depending on that, the server app needs the rules client JARs to make calls against. So it's a matter of deployment and of how the server picks up changes / hot-deploys when the rules client JAR is updated.
Rules engines are just algorithms for organizing many rules. See the Rete Algorithm.
Basically, it all comes down to complexity. If you have a few simple rules, of course you can use a .properties file. But imagine if some of your rules are 'chained' - one rule affects some other property, which triggers some other rule, which changes another property... you'd have to scan every rule, every change. For thousands of rules, it would take forever. Hence a 'rules engine'.
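The "chained" case described above can be sketched as a naive forward-chaining loop. This is deliberately the brute-force version that rescans every rule after every change, not the Rete algorithm; the rules and fact names are made up.

```python
def run_rules(facts, rules):
    """Fire rules repeatedly until no rule changes the facts (naive forward chaining)."""
    facts = dict(facts)
    changed = True
    while changed:
        changed = False
        for condition, action in rules:
            if condition(facts):
                updates = action(facts)
                if any(facts.get(k) != v for k, v in updates.items()):
                    facts.update(updates)
                    changed = True
    return facts

# Hypothetical chained rules. The shipping rule is listed first on purpose:
# it can only fire on a later pass, after the tier rule has set "tier".
RULES = [
    (lambda f: f.get("tier") == "gold", lambda f: {"free_shipping": True}),
    (lambda f: f["order_total"] > 1000, lambda f: {"tier": "gold"}),
]

print(run_rules({"order_total": 1500}, RULES))
```

Every pass re-evaluates every condition, which is the cost that grows badly with thousands of rules. Rete-style engines avoid much of that rework by remembering which conditions already matched.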
There are many articles on why you should or shouldn't use a rule engine. Here is one good example.
https://martinfowler.com/bliki/RulesEngine.html
Here is my scenario: I am using Drools for storing rules, mainly as decision tables. Now I have a situation where I want to create versions of my knowledge base and, depending on a certain parameter, choose which version to use.
Today's rules are my base version. If they change tomorrow, the change should affect only new users, and old users should keep using the older rules. If they change again, then we have three sets of users and three sets of rules.
I can see that I could maintain a separate Excel file per version, load them all, and keep adding files whenever the rules change.
Alternatively, within the same file I could have a date or some flag by which I decide which rule has to be used.
But none of this looks like a standard Drools solution to me. Any thoughts or suggestions?
If I'm not mistaken, Drools Guvnor offers an option to publish different versions of your rule set, in a similar fashion to how companies publish different versions of web services.
Please comment back on this as I'm also interested whether I'm correct or not and it'll work in such a manner :)
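Independent of what Guvnor offers, the "pick the rule set by date" idea from the question can be sketched in a few lines. This is plain Python, not a Drools feature; the dates, the `rate` values and the use of the user's sign-up date as the selecting parameter are assumptions.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical versions: (effective-from date, rule set). Must stay sorted by date.
VERSIONS = [
    (date(2020, 1, 1), {"rate": 0.10}),
    (date(2021, 1, 1), {"rate": 0.12}),
    (date(2022, 1, 1), {"rate": 0.15}),
]

def rules_for(signup_date):
    """Return the rule set that was in force when the user signed up."""
    starts = [start for start, _ in VERSIONS]
    index = bisect_right(starts, signup_date) - 1
    if index < 0:
        raise ValueError("no rule set was in force on %s" % signup_date)
    return VERSIONS[index][1]

print(rules_for(date(2020, 6, 1)))
```

Adding a new version is then just appending one entry; existing users keep resolving to the version that matches their date.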
I am in the process of creating a UI configuration tool for my pet project. One aspect of this tool lets the end user DEFINE his orchestration. I then need to save this orchestration definition into a database. There will be a executable version of this definition in a running system. The executable version is created dynamically on-demand.
Idea is to separate the DEFINITION from EXECUTABLE version so that I have the flexibility to choose the runtime version among BPMN or JPDL or a POJO based workflow solution (BeanFlow).
Limitation: I can't use the BPMN editors that come with frameworks like jBPM, Activiti, etc., as I want to use my own UI that is specific to my domain.
I need suggestions on HOW to PERSIST the definition.
Should I use rdbms tables? If so, is there a db schema I can borrow that is close to orchestration concepts?
Should I serialize my definition to BPMN/JPDL XML instance document?
Are there any other simple formats that I can use?
By "orchestration" I'm assuming you mean a finite state machine, where the current state dictates what transitions can be followed to other states. Representing states and transitions as vertices and edges often produces a directed acyclic graph; however, there are times when the graph will cycle (e.g. draft -- submit for approval --> pending approval -- reject --> draft).
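A minimal sketch of that state machine, using the draft/approval example from the paragraph above. The dictionary layout and the extra "approve"/"approved" entries are assumptions added to make the example complete; they are not taken from JPDL or BPMN.

```python
# Hypothetical definition: state -> {transition name -> next state}.
WORKFLOW = {
    "draft": {"submit for approval": "pending approval"},
    "pending approval": {"approve": "approved", "reject": "draft"},
    "approved": {},
}

def next_state(state, transition):
    """Follow a transition, refusing any that the current state does not allow."""
    allowed = WORKFLOW[state]
    if transition not in allowed:
        raise ValueError(f"{transition!r} is not allowed from {state!r}")
    return allowed[transition]

def has_cycle(workflow):
    """Depth-first search that reports whether any state can reach itself."""
    visiting, done = set(), set()

    def visit(state):
        if state in done:
            return False
        if state in visiting:  # reached a state that is still on the current path
            return True
        visiting.add(state)
        found = any(visit(nxt) for nxt in workflow[state].values())
        visiting.discard(state)
        done.add(state)
        return found

    return any(visit(state) for state in workflow)

print(next_state("draft", "submit for approval"))
print(has_cycle(WORKFLOW))
```

The reject transition back to draft is what makes `has_cycle` report a cycle here.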
In practice, separating the definition from execution calls for a persistence format that can easily accommodate customization. As your system evolves you will find a number of unanticipated edge cases whose solution should not require altering a persistence schema, only code. This implies XML or a NoSQL solution - something whose schema is easily changed or non existent.
Now, having written my own XML definition for this purpose (for uninteresting reasons I'll exclude), my suggestion is to use JPDL (or BPMN). The reason is that their definitions likely incorporate whatever you're considering now and will consider in the future, and they enable customization, such as hanging arbitrary data or behavior off them at a given point. You also get the advantage of tools already built (not just UI) for dealing with cycle detection and, for example, ensuring there is a path to completion.
Some of the interesting features I know JPDL possesses are an ability to help merge forked processes, timed tasks (including those that repeat periodically), and facilities for sending notifications. This last item, notification, bears some further exposition. One of the things I've found with my own system is the need to send out configurable email whose content is based on the data flowing through. These existing engines make that relatively easy by providing a way to plug variables into text, for instance, which is then dynamically evaluated at run time before transmission. They also provide bridges between the engine and whatever user store you have, for the purpose of sending notifications to groups of people, tasking them and enforcing security policy.
Finally, depending on the scope of your system, you will probably still be using a database as well. What I suggest is storing off the XML and data being orchestrated into the database in a serialized format. Then, if the data is being altered as it travels through the execution, write out serializations of the data - and perhaps workflow if it is also changed - into a history/audit log table as well.
I would NOT use rdbms tables, or if you do, store the definitions as text blobs. Trying to make records for the definition is a bad idea because it's much more inflexible and difficult to change your definition over time. Many people would use different approaches, but I'd use JSON or YAML, and avoid XML. The motivation for that is to make it as simple as possible. Trying to use XML, especially a formalized specific format of XML is going to make you spend much more time meeting an exact specification that doesn't actually do anything to help what you're trying to accomplish. JSON and YAML are both very easy to work with from a code perspective. YAML is more easily readable by humans and easier to edit, and isn't as tricky for punctuation and escaping as JSON. JSON is more widely used, and is smaller than YAML. JSON also has a binary counterpart, BSON, if document size is a concern.
Once you have an importer/exporter that goes to/from your internal objects to your data format, then persisting using RDBMS, or other mechanisms, will be straightforward. You could even use CouchDB, which could offer other benefits to your application and may be a great fit.
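A rough sketch of such an importer/exporter, storing the definition as a JSON text blob in a single table. The table name, the shape of the definition and the use of an in-memory SQLite database are assumptions for illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # stands in for the real database
conn.execute("CREATE TABLE definition (name TEXT PRIMARY KEY, body TEXT NOT NULL)")

def save_definition(name, definition):
    """Export the in-memory definition to JSON and store it as a text blob."""
    conn.execute(
        "INSERT OR REPLACE INTO definition (name, body) VALUES (?, ?)",
        (name, json.dumps(definition, sort_keys=True)),
    )

def load_definition(name):
    """Import a stored definition back into plain Python objects."""
    row = conn.execute("SELECT body FROM definition WHERE name = ?", (name,)).fetchone()
    return None if row is None else json.loads(row[0])

# Hypothetical orchestration definition produced by the UI tool.
example = {"steps": [{"id": "a", "next": "b"}, {"id": "b", "next": None}]}
save_definition("example", example)
print(load_definition("example"))
```

Because the database only sees an opaque text column, the definition format can evolve without a schema change.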
Very good question! Here is my two cents:
RDBMS: if you do this you will be able to query the workflow instances, for example which tokens are at 'node X'?
Storing XML as a CLOB: simplicity is the strength of this solution, but you can't really query these; you can only get them by id.
NOSQL: there are a lot of different solutions for different problems. MongoDB is a popular solution, it provides document oriented persistence.
How about a simple serialisation of the composed UI, using for example XStream, and then storing the serialised bits in the database as a binary column? Then, when the user logs in, get the associated data, deserialise, initialise if required, and display.
Let's say you have a database with a lot of products/customers/orders and the codebase (java/c#) contains all the business logic. During the night several batches are needed to export data to flat files and then ftp them to a proprietary system.
How should we do this "write-database-into-a-flat-file"? What are the best practices?
Some thoughts:
We could create a stored procedure and use, e.g., SSIS to fetch the data? Maybe we can do this if we have a "batch-output-database-table", but not if we have to apply logic before the file is written?
We could do all the logic in managed code, using the same repositories / business logic as the rest of the domain? (This could be a slow process compared to the stored procedure solution.)
What if the only interface to the domain services is web services (which could take a "long" time for each request)? Will the "best practices" change?
I personally prefer to use normal (managed) code to implement feeds instead of stored procs, mainly because:
1) It's usually easier to interface with the other system (even if it is only shared drive)
2) It is easy to log everything you need and debug if something goes wrong
3) You can reuse the same code you use for normal business logic (it's beneficial even if you just reference the same projects, etc.)
4) Often you need to enrich data with some information from other systems and this again is much easier to do from managed code.
5) It's much easier to test managed code, have all the unit tests, automated builds, etc.
I am not sure why it needs to be that much slower than doing it all in a stored proc. You just need to write a good stored procedure to extract the data you need, and the C#/java app will do all the transformations, enrichment, etc.
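The split described above (the database extracts the rows, the application transforms and writes them) can be sketched like this. The `orders` table, the pipe-delimited layout and the transformations are made up, and an in-memory SQLite query stands in for the stored procedure call.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")  # stands in for the production database
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total_cents INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "acme", 1250), (2, "globex", 9900)])

def export_orders(out):
    """Stream rows from the database, transform them in code, write delimited lines."""
    writer = csv.writer(out, delimiter="|", lineterminator="\n")
    writer.writerow(["order_id", "customer", "total"])
    # Iterating over the cursor fetches rows as needed instead of loading them all.
    for order_id, customer, total_cents in conn.execute(
            "SELECT id, customer, total_cents FROM orders ORDER BY id"):
        writer.writerow([order_id, customer.upper(), f"{total_cents / 100:.2f}"])

buffer = io.StringIO()  # a real batch would write to a file and then FTP it
export_orders(buffer)
print(buffer.getvalue())
```

Writing to any file-like object keeps the export logic easy to unit test, which ties in with point 5 above.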
EDIT: Answering the comment:
I don't think it is possible to say in general whether you should reuse the existing stored procs, tweak them, or create new ones. I think that if the performance hit or the needed changes are not too big, then I would try to use one set of procs, to avoid duplicating logic. But if the differences are substantial, then the cost of maintaining extra procs will probably be lower than that of changing and releasing the existing ones.
Go with the repository code you've already got. Do a few performance tests and see if it meets the perf. requirements. If there is a significant perf. issue that can be nailed down to too much DB IO then go for the sproc or implement a bulk export repository.