Background:
I'm working on a web-based application built with Spring MVC and Angular. We have a help desk module that agents use for customer care. The application is deployed on a single server. We have a ticket-locking mechanism: when an agent opens a ticket to start working on it, the ticket gets locked to that agent so that other agents cannot work on the same ticket at the same time. As soon as the agent closes the ticket, it becomes available for other agents to open and update if needed. To avoid too many DB calls for locking, we implemented the lock table as a ConcurrentHashMap, so every request sees the current set of locked tickets through the same map. This works absolutely fine on one server.
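A minimal sketch of this kind of map-based lock (class and method names here are illustrative, not our real code):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class TicketLockRegistry {

    // ticketId -> id of the agent currently holding the lock
    private final ConcurrentMap<Long, String> locks = new ConcurrentHashMap<>();

    /** Returns true if the agent acquired the lock, false if another agent holds it. */
    public boolean tryLock(long ticketId, String agentId) {
        return locks.putIfAbsent(ticketId, agentId) == null;
    }

    /** Releases the lock only if this agent is the holder (atomic check-and-remove). */
    public void unlock(long ticketId, String agentId) {
        locks.remove(ticketId, agentId);
    }
}
```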
Issue:
Now the application is deployed on two different servers, and this ConcurrentHashMap no longer works because each server maintains its own map. If one user locks a ticket via node 1 and a second user's request goes to node 2, node 2 knows nothing about the lock. To avoid this we are planning to change the flow. At the same time, we don't want to save the locking details directly to the DB, to avoid DB I/O in such a frequently used area of the application.
Options
After doing some R&D, here are the options we could implement, keeping persistence in mind:
We can implement the in-memory table concept using MSSQL, or Redis (see the sketch after this list)
RabbitMQ
We can implement an API deployed on a single node that both servers use to maintain ticket locks, but this has two problems: calling the API takes time, and the data is not persisted, so if that server restarts we lose the locks.
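For option 1 with Redis, the lock could be a single atomic SET ... NX EX command. A minimal sketch using the Jedis client (host, key names, and the TTL are assumptions):

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

public class RedisTicketLock {

    private final Jedis jedis = new Jedis("redis-host", 6379); // assumed host/port

    /** SET ticket:lock:<id> <agent> NX EX <ttl>: succeeds only if no one holds the lock. */
    public boolean tryLock(long ticketId, String agentId) {
        String reply = jedis.set("ticket:lock:" + ticketId, agentId,
                SetParams.setParams().nx().ex(30 * 60)); // 30-minute safety TTL
        return "OK".equals(reply);
    }

    public void unlock(long ticketId) {
        jedis.del("ticket:lock:" + ticketId);
    }
}
```

The TTL ensures a crashed node cannot leave a ticket locked forever, and both servers see the same lock because it lives in Redis rather than in process memory.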
Can anyone advise which approach would be best for the above case, and how to implement it? I just need a starting point.
Thanks in advance.
I think your real problem is this:
To avoid too many DB calls for locking, [you decided not to use the database].
IMO, that was a mistake. A database call to acquire a "lock" on a ticket is unlikely to result in too many database calls.
In analyzing this, you need to consider how often someone will want to start working on a ticket, and how often it is likely to fail because someone is already working on the ticket. I don't know your use-case details, but I would be very surprised if the latter event happens more often than once per second.
If your database cannot sustain one "small" database operation per second (worst case!) for locking, then it won't be able to sustain the larger transactions involved in creating tickets, agents updating them, users reading them, and so on.
So my suggestions are:
Work out what the actual database load for ticket locking will be ... relative to all of the other things that the database needs to do.
If it is small, just go back to the database for ticket locking; a sketch follows this list. Keep it simple!
If it is large, either:
Scale up or scale out the existing database; e.g. use sharding. It seems likely that you will need to do this anyway. That should give you the "headroom" to use the existing database for locking as well.
Create a separate database server for the locking. It is unlikely that it will need to be big, and I can't envisage that it needs to be very fast. (See below!!)
Use one of your proposed solutions.
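To make the database option concrete: the "lock" can be a single conditional UPDATE, which is atomic on any node and costs one small statement per attempt. A minimal sketch (table and column names are assumptions):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class DbTicketLock {

    /**
     * Atomically claims the ticket: the UPDATE matches a row only if no one
     * holds the lock, so exactly one agent can win, regardless of which
     * server the request lands on.
     */
    public boolean tryLock(Connection conn, long ticketId, String agentId) throws SQLException {
        String sql = "UPDATE ticket SET locked_by = ?, locked_at = CURRENT_TIMESTAMP "
                   + "WHERE id = ? AND locked_by IS NULL";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, agentId);
            ps.setLong(2, ticketId);
            return ps.executeUpdate() == 1; // one row updated = lock acquired
        }
    }

    /** Releases the lock only if this agent is the holder. */
    public void unlock(Connection conn, long ticketId, String agentId) throws SQLException {
        String sql = "UPDATE ticket SET locked_by = NULL WHERE id = ? AND locked_by = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, ticketId);
            ps.setString(2, agentId);
            ps.executeUpdate();
        }
    }
}
```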
But my main advice is to AVOID the trap of premature optimization. You seem to be designing for bottlenecks that you think will exist without any clear evidence for this. For example:
"We can implement an API that will be deployed on a single node and both of our servers will use that to maintain locking tickets but we still have [the problem] with this calling API would be time taking ..."
Unless the time taken is multiple seconds, this is unlikely to be a real problem. The best strategy is to implement the system the simple way first, and then measure performance to see 1) whether optimization effort is warranted and 2) where the real bottlenecks in the complete system are.
In your case, I doubt that the users will care if it takes (say) 1 second versus 2 seconds to be told that someone else is already working on a ticket.
Finally, wouldn't it be simpler to use an existing off-the-shelf ticketing system? There are many of them out there: commercial products, open source, hosted, etcetera. (OK, it is probably too late for this, because it sounds like you are committed to implementing your own ticketing system from scratch. But it may not be too late to reconsider your strategy.)
I already have a blog application built on Spring Boot as a monolith.
There are 2 entities.
user
post
And the mapping is one to many.
A single user can create multiple blog posts.
How can I recreate the same functionality as separate microservice applications?
From researching on the internet so far, what I see people saying is to create a database per service, etc.
Suppose I create two services, say:
UserService (which has a separate DB and the CRUD operations associated with it)
PostService (which has a separate DB and the CRUD operations associated with it)
How can I make them communicate with each other?
In the monolith app the Post entity has createdBy mapped to User.
But how does this work in a microservices architecture?
Can anyone please help me design such an architecture?
First, list out the reasons why you want to break it into microservices, e.g. to make it more scalable in scenarios like the following:
Posting comments becomes slow, and during that period registration of new users should remain unaffected.
Very few users upload/download files, and you want general users who simply view and post comments to be unaffected, even while upload/download remains slow.
Answering the above questions and analyzing and prioritizing other NFRs will help determine how and what to break out.
Additionally,
The Post service only needs to validate that the user is a valid, logged-in user (correct?); see the sketch after this list.
The User service does not really need to communicate with the Post service at all.
Further, you might want to decouple other minor features as well. These, in turn, talk to each other and can be authenticated via other means (certificates, etc.) since they are internal, e.g. services updating stats (user rankings) or aggregating data.
The system might also have a lot of smaller hidden features which might or might not have anything to do with the Post service at all (like video/file/picture/any binary content upload/download). These can also be separated into different microservices, prioritized based on computation power needed, hit frequency, and business priority.
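To make the Post-to-User communication concrete, here is a minimal sketch of the usual approach: Post stores only the user's id (no JPA relation across services) and validates it with a REST call to the User service. The class names, URL, and endpoint are assumptions, not a prescribed API:

```java
import org.springframework.stereotype.Service;
import org.springframework.web.client.HttpClientErrorException;
import org.springframework.web.client.RestTemplate;

// In PostService, the entity keeps a plain id instead of a mapped User.
class Post {
    Long id;
    Long createdByUserId; // cross-service reference by id only
    String content;
}

@Service
class UserClient {

    private final RestTemplate rest = new RestTemplate();
    private static final String USER_SERVICE = "http://user-service/users/"; // assumed URL

    /** Asks the User service whether the user exists before accepting a post. */
    boolean userExists(Long userId) {
        try {
            rest.getForEntity(USER_SERVICE + userId, Void.class);
            return true;
        } catch (HttpClientErrorException.NotFound e) {
            return false;
        }
    }
}
```

In practice you would go through service discovery or an API gateway rather than a hard-coded URL, but the key design point is the same: services share ids, not database tables.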
After breaking it into microservices, you need to run some stress tests (based on the current load) to know which services need replication and automatic load balancing and which don't. Writing the stress load before breaking things apart can also help you understand which features to move out of the monolith first.
A relative of mine wants me to make a program to organize some information where she works. It's fairly simple; however, I don't know what kind of office drama they have going on, but she doesn't want to bother IT with all sorts of questions about the database that can be used or how it will connect (multiple people will read info stored in a database of some sort hosted within the company's intranet).
Anyway, I'm thinking it shouldn't be a problem to just use something like a local Microsoft Access database file for now, and then rewrite the database component when I have more information. Is that an insane idea? This program is not hard; it could probably be written and tested in a week if I were working on it full time (I'm still in college). For that matter, I am thinking of using Java in NetBeans simply because I am comfortable with it. Should I worry that I'll find out they use some sort of database or other solution that cannot be (easily) worked with in Java?
While knowing a requirement like database type upfront is a good idea, being able to adapt to new requirements is a part of Agile development.
I'd argue it's not an insane idea. If you're careful about your design, switching out the database won't be too bad. If you don't mind, I'll elaborate on a (possible) pattern that might save you trouble.
Overview
In my experience I have found it best to abstract the database logic (how to communicate) from the business logic (how to accomplish a task). These two layers are going to make your code much more maintainable for when you find out the company is running an Oracle database and not Access.
Data Access Layer
The DAL has one job and that is to communicate with the database. It needs to read, it needs to write, and that is it. Your classes will likely include information like table attribute names or queries. It's OK to do that here since each class is specific to a particular database. Using a DAL will greatly simplify your database calls later on.
I would highly suggest looking into the factory pattern for how to design these classes. Using the factory pattern will completely decouple the business layer from the database-specific classes using interfaces. In other words, you can completely change out the database without needing to modify the business logic.
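A minimal sketch of what that decoupling might look like (all class names are hypothetical; the Book type is sketched under Data Transfer Object below):

```java
// The business layer depends only on this interface.
interface BookDao {
    Book findById(String id);
}

// One concrete DAO per database; only these classes know the SQL details.
class AccessBookDao implements BookDao {
    @Override
    public Book findById(String id) {
        // Access-specific JDBC/SQL would live here (e.g. via UCanAccess).
        return new Book(id, "stub title", "stub author", false);
    }
}

class OracleBookDao implements BookDao {
    @Override
    public Book findById(String id) {
        // Oracle-specific JDBC/SQL would live here.
        return new Book(id, "stub title", "stub author", false);
    }
}

// The factory is the single place that knows which database is in use.
final class DaoFactory {
    static BookDao bookDao(String dbType) {
        switch (dbType) {
            case "access": return new AccessBookDao();
            case "oracle": return new OracleBookDao();
            default: throw new IllegalArgumentException("Unknown database: " + dbType);
        }
    }
}
```

When you learn what the company actually runs, you write one new DAO class and change one factory argument; the business layer never changes.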
Business Layer
In fancy terms, all I'm talking about is the logic for how to accomplish a task. The business layer doesn't have anything to do with where buttons appear on a screen nor should it worry about table names.
In this layer you will find yourself needing access to the database to read/write information, and that is when you call on your Data Access Layer. It will handle the ugly details, keeping your business logic from having to know what type of database you are using.
Data Transfer Object
Lastly, you're going to be pushing a lot of information between these layers. I suggest you design some classes that can help you transfer data that belongs together. Consider a call to the DAL requesting a book...
Book book = libraryAccessObject.getBookById("ABC123.45");
Getting a book is going to return a lot of information. Creating a book object to organize that information will make life easier.
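A sketch of what such a Book object might look like (the fields are assumptions about what a library record carries):

```java
// A simple immutable data transfer object: no logic, just the data
// that belongs together when a book crosses layer boundaries.
public class Book {
    private final String id;
    private final String title;
    private final String author;
    private final boolean checkedOut;

    public Book(String id, String title, String author, boolean checkedOut) {
        this.id = id;
        this.title = title;
        this.author = author;
        this.checkedOut = checkedOut;
    }

    public String getId() { return id; }
    public String getTitle() { return title; }
    public String getAuthor() { return author; }
    public boolean isCheckedOut() { return checkedOut; }
}
```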
In summary, it's not a far-fetched idea, but be careful with your design. Poor design now could cause you a lot of problems next week. Layering your code will make it much more maintainable.
I am new to ZooKeeper and Apache Curator and need your help to design a program:
I need to create a Java program that will run a script every hour (based on a cron expression provided by the end user).
Consider that I have 3 servers. I need to make sure the script runs every hour without failure, even if a server is down (in that case the script must run on another server). Each hour, the script should run on only one server.
I have to create an interface to provide input to this Java program. The input will be (i) the script to be run and (ii) the cron expression to schedule it.
1) Please suggest an idea of how I can design my program to achieve this, and how ZooKeeper and Apache Curator can be used for it.
2) Is there any way to cache the script that the end user provides on these 3 servers?
Can Apache Curator's NodeCache be used to cache the script on these 3 servers?
Your response will be highly appreciated.
With three servers, where one is to run no matter what, you need a distributed approach. The problem is that in the event of failures, you might not be able to solve the puzzle of whether to run the script or not.
For a start, you could just have one computer connect to the others and tell them not to run. This is called a "hold down" approach, but it has a lot of issues when you can't connect to the other computers. The underlying problem is that most starting programmers fail to really understand how a network environment changes the way they need to design programs. Please take a little time to read over the typical fallacies of distributed computing.
Cron solves this by not caring what happens on other computers, so cron has the wrong design goals.
With three computers, you will also have three different clocks, with their own speeds and times. A good distributed solution will have some concept of time that doesn't directly rely on each machine's clock.
Distributed solutions (if they are to tolerate faults or failures) must be able to run without reliable communication to the other machines. Sometimes the group gets split in half, where one group of machines cannot communicate with the other group. In many cases, both groups will perform the "critical" action for fear that the other group didn't. In other cases, both groups might skip the "critical" action, assuming the other group did it. A good solution ensures that the "critical" action is performed exactly once, even when the computers cannot communicate. Often this is done by majority: your group (quorum) cannot perform a critical action unless it contains at least a majority of the involved machines.
Look at the Paxos algorithm to get an idea of the issues; and, once you are more aware of the problems, look back at your chosen technologies to determine which parts of the problem they are attempting to solve, keeping the "fallacies of distributed computing" in mind. Also realize that a perfect, 100% correct solution might not be possible: the machine(s) pre-selected to run the script might suffer a network failure, and then a power failure, in such a sequence that the surviving machines assume there is only a network outage.
This is an interview question, right? If yes, be aware that this answer only gets you partway.
The simplest solution is to have all three servers running, and attempt to acquire a lock to perform the processing. See http://zookeeper.apache.org/doc/trunk/recipes.html#sc_recipes_Locks
To ensure that only one server runs the job, you will need to record the last execution time. This is simply "store a value with known key," and you'll find it in one of the intro tutorials.
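A minimal sketch of both points with Curator (the ensemble address, znode paths, and timeout are assumptions): each server tries for the lock when the cron fires, and only the winner runs the script and records the execution time.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.TimeUnit;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class HourlyJobRunner {

    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "zk1:2181,zk2:2181,zk3:2181",          // assumed ZooKeeper ensemble
                new ExponentialBackoffRetry(1000, 3));
        client.start();

        InterProcessMutex lock = new InterProcessMutex(client, "/jobs/hourly-script/lock");

        // Called by the scheduler on every server each hour; only one wins the lock.
        if (lock.acquire(10, TimeUnit.SECONDS)) {
            try {
                // ... run the script here ...

                // "Store a value with a known key": record the last execution time.
                byte[] now = String.valueOf(System.currentTimeMillis())
                        .getBytes(StandardCharsets.UTF_8);
                if (client.checkExists().forPath("/jobs/hourly-script/lastRun") == null) {
                    client.create().creatingParentsIfNeeded()
                          .forPath("/jobs/hourly-script/lastRun", now);
                } else {
                    client.setData().forPath("/jobs/hourly-script/lastRun", now);
                }
            } finally {
                lock.release();
            }
        }
    }
}
```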
Of course, if this is an interview question, the interviewer will ask follow-on questions such as "what happens if the script fails halfway through?" or "what if the computers don't have the same time?" You won't (easily) solve either of those problems with ZooKeeper.
Given a series of URLs from a stream, where millions could be bit.ly, Google, or TinyURL shortened links, what is the most scalable way to resolve those and get their final URL?
A multi-threaded crawler doing HEAD requests on each short link while caching ones you've already resolved? Are there services that already provide this?
Also factor in not getting blocked by the URL-shortening services.
Assume the scale is 20 million shortened URLs per day.
Google provides an API. So does bit.ly (and bit.ly asks to be notified of heavy use, and specifies what it means by light usage). I am not aware of an appropriate API for TinyURL (for decoding), but there may be one.
Then you have to fetch on the order of 230 URLs per second (20 million per day is roughly 230 per second) to keep up with your desired rate. I would measure typical latencies for each service and create one master actor and as many worker actors as needed so the actors can block on lookups. (I'd use Akka for this, not default Scala actors, and make sure each worker actor gets its own thread!)
You also should cache the answers locally; it's much faster to look up a known answer than it is to ask these services for one. (The master actor should take care of that.)
After that, if you still can't keep up because of, for example, throttling by the sites, you had better either talk to the sites or you'll have to do things that are rather questionable (rent a bunch of inexpensive servers at different sites and farm out the requests to them).
Using the HEAD method is an interesting idea, but I am afraid it can fail because I am not sure the services you mentioned support HEAD at all. If, for example, the service is implemented as a Java servlet, it may implement doGet() only; in that case doHead() is unsupported.
I'd suggest you try GET but do not read the whole response; read the HTTP status line only.
Since you have very serious performance requirements, you cannot make these requests synchronously, i.e., you cannot use HttpUrlConnection. You should use the NIO package directly. That way you will be able to send requests to millions of destinations using only one thread and get responses very quickly.
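As a rough illustration of the non-blocking idea (a sketch using the newer java.net.http.HttpClient rather than raw NIO; the method choice and header handling are assumptions): the client never follows redirects, so the Location header of the first response is the expanded URL.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;

public class ShortUrlResolver {

    // NEVER follow redirects: we only want the Location header, not the body.
    private final HttpClient client = HttpClient.newBuilder()
            .followRedirects(HttpClient.Redirect.NEVER)
            .build();

    /** Resolves one short URL asynchronously; empty if there was no redirect. */
    public CompletableFuture<Optional<String>> resolve(String shortUrl) {
        HttpRequest request = HttpRequest.newBuilder(URI.create(shortUrl))
                .method("HEAD", HttpRequest.BodyPublishers.noBody()) // fall back to GET if unsupported
                .build();
        return client.sendAsync(request, HttpResponse.BodyHandlers.discarding())
                .thenApply(resp -> resp.headers().firstValue("Location"));
    }
}
```

A cache of already-resolved links (as suggested above) in front of this resolver would cut most of the outbound traffic.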
We are developing a Java EE application backed by any database of the customer's choice.
We will sell to customers based on a per-user license price. How do I make sure the application is used according to our conditions, i.e., that the licensing is not easily hackable? Are there any tutorials available?
Bill Karwin's answer was the most useful of the answers from the question mentioned in the comments. Assuming that you will go ahead with a "protection" scheme, try to do the bare minimum. Anything else tends to frustrate users immensely and leads to lower repeat business and/or an increased desire to hack around your frustrating system.
From your question, it's tough to tell whether each user will install the application. If so, you probably just need to require a license code that they must contact you in some way to get. If it's a client-server application, then your options are a lot more limited; in fact, I can't think of a single solution I've ever designed in my head or come across in practice that isn't massively frustrating. You could probably do a license-code solution here too, except the license code would carry a payload indicating the number of users they paid for, and the application would then disallow the creation/use of users in excess of that number. At that point, though, you're really walking that frustration line I mentioned.
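For a sense of how such a payload can be made tamper-resistant (a generic sketch using the JDK's java.security API, not TrueLicense's or any other product's actual licensing API; the payload format is made up): the vendor signs the payload with a private key, and the application verifies it with an embedded public key.

```java
import java.nio.charset.StandardCharsets;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;

public class LicenseVerifier {

    private final PublicKey vendorKey; // public key shipped inside the application

    public LicenseVerifier(PublicKey vendorKey) {
        this.vendorKey = vendorKey;
    }

    /** Returns the licensed user count, or throws if the license was tampered with. */
    public int verifiedMaxUsers(String payload, String base64Signature) throws Exception {
        Signature sig = Signature.getInstance("SHA256withRSA");
        sig.initVerify(vendorKey);
        sig.update(payload.getBytes(StandardCharsets.UTF_8));
        if (!sig.verify(Base64.getDecoder().decode(base64Signature))) {
            throw new SecurityException("Invalid license signature");
        }
        // payload is assumed to look like "maxUsers=50"
        return Integer.parseInt(payload.substring(payload.indexOf('=') + 1));
    }
}
```

A determined attacker can still patch the check out of the bytecode, which is why the answers below lean toward obfuscation and support-based incentives rather than technical enforcement alone.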
If you can obfuscate, that is the way to go for a start. But it could be painful if you use inversion-of-control frameworks (e.g. Spring). I have heard that it's possible to obfuscate a Spring context as well, though I've never tried it. Also (just guessing) there could be some surprises with reflection, dynamic proxies, and such. As to the licensing, I can suggest using TrueLicense. It has very flexible means of handling various aspects of protection, as well as free trial periods out of the box. It works very well and has great documentation.
Do clients pay for support of this application? If so, there is a chance that support is a bigger pay-off than the licensing of the application itself. In that case, you may consider not locking down the application but rather choosing to provide support only for authentic copies of the software (unmodified copies, proven via checksums and the like). Many businesses licensing this software would be inclined to avoid any modifications (even though the chance of them actually wanting to modify it is probably tiny) in order not to jeopardize their support.
FYI: This is how Oracle tends to operate with their e-Business Suite. You can modify pretty much any component you want. Good luck on getting support, though!
Look at how Atlassian sells their products. I believe this is an approach that works very well, and probably would for you too. Note: There should be added value in subscribing to updates!