I have a Spring Boot RESTful API whose consumers are other applications. One of the app's controllers returns up to 1,000,000 strings per request.
What is the best practice for splitting such responses in Spring applications?
Update:
I figured out the response is needed only for a developer's one-off task and would be executed only once, so it's better to write a script for this operation.
Thanks for the answers.
Here is a good example of using multipart requests in Spring Boot: https://murygin.wordpress.com/2014/10/13/rest-web-service-file-uploads-spring-boot/
However, I would prefer to think about your problem from an architectural point of view. Why should the REST endpoint return such a huge response? And is it really necessary to return all those results? There are a few factors that might help me give a better answer.
This is the kind of situation where there is always a trade-off.
1) The basic question is: can't you provide additional parameters (they don't have to be mandatory, they can be optional) to reduce the amount of returned results?
2) How frequently does your data change? If it doesn't change very often (say, once a day), then you can introduce a paging mechanism so that you return only a segment of the result. On your side, you can introduce a caching mechanism between your business logic layer/database and the REST client.
3) If your data changes frequently (for example, you are providing a list of flight prices), then you can introduce a caching layer per client ID. You can cache the results on your side and send them to the client divided into several requests. Of course, you will have to add a timestamp and expiry date to each cached request, or otherwise you will face memory issues.
4) This leads us to another question: where does the pain come from?
Do the application clients complain about not being able to handle the amount of data they receive? Do they complain about the response time of your service? Or are you having performance issues on the server side?
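To make the paging idea in point 2 concrete, here is a minimal, framework-free sketch of limit/offset slicing over an in-memory result set. In Spring you would typically accept a Pageable instead of the raw pageIndex/pageSize parameters, which are made up here:

```java
import java.util.List;

public class PagingSketch {

    // Returns one page (0-based index) of up to pageSize items from the full result.
    // Out-of-range pages yield an empty list rather than throwing.
    static List<String> page(List<String> all, int pageIndex, int pageSize) {
        int from = Math.min(pageIndex * pageSize, all.size());
        int to = Math.min(from + pageSize, all.size());
        return all.subList(from, to);
    }

    public static void main(String[] args) {
        List<String> data = List.of("a", "b", "c", "d", "e");
        System.out.println(page(data, 0, 2)); // [a, b]
        System.out.println(page(data, 2, 2)); // [e]
    }
}
```

The same slicing is what Spring Data does under the hood for a Pageable-accepting repository method; the caller advances pageIndex until an empty page comes back.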
In order to understand the following question, you need to know that I'm a complete novice in the whole Spring Boot ecosystem, as well as the architectural philosophy behind it.
Task
The app I'm developing with Spring Boot requires, on a business level, some data in the form of simple collections stored in Firestore. When the user inputs some parameters on the front end (the REQUEST) and asks for the execution of a certain algorithm on the back end, the following happens:
1. The business logic part of the app is going to retrieve some data from the database based on the user input.
2. It's going to process this data and create a RESPONSE based on the retrieved data and a number of other user inputs.
The problem
So I'm not really sure whether I should even bother creating a service connection for the database, since the only component accessing it will be the business logic layer. The database will primarily be built for reads only, though I want to leave open the possibility of later creating a system for auto-updating it (again, only from the back end, with no user interaction/input). Also, I may be forgetting about support for multiple connections: each user may trigger the main algorithm with a different set of data retrieved from the database. With that in mind, while I would love to leverage the capabilities of Firestore, is using it justified given that the data is static for the time being?
You should strive to keep the business logic as pure as possible from implementation choices. Ideally your business logic should not talk to network, file systems or databases. It should be just the pure, refined business logic.
You will then have outer layers that abstract these external dependencies as much as possible. In the case of a database, you'd usually have a persistence layer of sorts, which is responsible for accessing the database directly.
For instance, let's say the business logic needs a list of clients sorted by last name. From the business perspective, it calls a method fetchClientsSortedByLastName(), and what that method does is a black box. If at a later moment you decide to switch from Firestore to Postgres or MySQL, you only need to change the persistence method. The business logic will remain exactly the same.
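To make that layering concrete, here is a minimal, framework-free sketch. All the names (ClientRepository, InMemoryClientRepository, BillingService) are hypothetical; the point is that the business class depends only on the interface:

```java
import java.util.Comparator;
import java.util.List;

// The abstraction the business logic sees. How it is implemented is a black box.
interface ClientRepository {
    List<String> fetchClientsSortedByLastName();
}

// One concrete adapter; it could be backed by Firestore, Postgres or MySQL
// instead of a list, and the business code would never know.
class InMemoryClientRepository implements ClientRepository {
    private final List<String> clients;

    InMemoryClientRepository(List<String> clients) {
        this.clients = clients;
    }

    @Override
    public List<String> fetchClientsSortedByLastName() {
        return clients.stream()
                .sorted(Comparator.comparing(
                        (String name) -> name.substring(name.lastIndexOf(' ') + 1)))
                .toList();
    }
}

// Pure business logic: no database, network or file-system knowledge.
class BillingService {
    private final ClientRepository repository;

    BillingService(ClientRepository repository) {
        this.repository = repository;
    }

    String firstClientByLastName() {
        return repository.fetchClientsSortedByLastName().get(0);
    }
}
```

Swapping the data store later means writing a new ClientRepository implementation; BillingService stays untouched.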
I have a monolithic app that implements the following logic:
Get a list A (Customer) from database
Validate data in A using some criteria, if it's not validated, throw an error
Do some operations on A to get a list B (e.g. Regional customers)
Do something with B
Now I am transforming my app to use microservices, but I am having trouble designing the calls.
As B can be deduced entirely from A, I want to make a single microservice operation getCustomerA that returns the whole dataset A. That way only a single database access is needed, which is a performance plus.
But the problem is that the operations on A to derive list B are also part of the business code. So it is more logical to put that code on the Customer microservice side; if we follow domain-driven design, the Customer microservice would expose something like getRegionalCustomer.
So I want to know: what is the best practice in this case? Should we prioritize the single database call (first case), or is it better to make two calls (which, in this case, means two database calls)?
Since this is mainly opinion based I can only give you that :-)
From my experience splitting the app into microservices just for the sake of doing it puts technical dogma over technical simplicity and often introduces a lot of unnecessary overhead.
With regard to the database calls, I can also tell you from experience that quite often you win performance by doing two simple calls rather than one overly complex one, especially if you start introducing big joins over many tables or - ouch - subselects in the ON clause.
See if the most simple solution works and keeps the code tidy. Constantly improve quality and optimize when the need arises. If you have a piece of logic that warrants being split off into a microservice (e.g. because you want to use a different language or framework, or want to offload some calculations), then go for it.
Domain-driven design does not say that each bounded context can contain only one entity. In fact, a bounded context (or microservice) can contain more than one entity when those entities are clearly related - in other words, when they need to be persisted transactionally.
In your case, due to the tight relation between the two entities, the best approach is to build a single microservice that performs both operations.
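As an illustration of that single-microservice shape, here is a hedged, framework-free sketch; the class and method names are made up, and a real service would expose these operations as REST endpoints. The key point is that list B is derived from the single load of list A inside the same bounded context:

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: one service owns the Customer data and exposes both
// views, so no second service (or second database round trip) is needed for B.
class CustomerService {

    private final List<String> database; // stand-in for the real data store

    CustomerService(List<String> database) {
        this.database = database;
    }

    // "getCustomerA": one data access returns the whole set A.
    List<String> getAllCustomers() {
        return List.copyOf(database);
    }

    // "getRegionalCustomer": B deduced from A inside the same bounded context.
    List<String> getRegionalCustomers(Predicate<String> regionFilter) {
        return getAllCustomers().stream().filter(regionFilter).toList();
    }
}
```

Callers that need B never see the derivation logic; it stays with the Customer domain, as the answer recommends.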
I am still new to reactive programming in Java, and my project requirements ask me to implement pagination with reactive programming.
For example, if I hit an API which returns 10000 records as a stream, I need to return a Flux with proper pagination.
Can anyone suggest a good approach for this?
This is the approach that I am following.
Repository:
public interface CouchBaseRepository extends ReactiveCouchBaseRepository<Book, Integer> {

    @Query("#{#n1ql.selectEntity} where name = $1")
    Flux<Book> getPaginatedFlux(String name, final Pageable pageable);
}
This is my repository, but when I start my application it shows the following error:
java.lang.IllegalStateException: Method has to have one of the following return types! [interface org.springframework.data.domain.Page, interface org.springframework.data.domain.Slice, interface java.util.List]
I cannot use the Page interface here, as it is blocking. Is there any way to deal with this problem?
I haven't worked with spring-webflux yet, so I can't comment on specific API calls, but I'll provide a "theoretical" answer that might help as well.
Flux represents a stream of data (possibly infinite). So pagination is somewhat at odds with reactivity, because the two are talking about different things.
Consider implementing pagination with input parameters (the usual limit/offset) in the method that returns a Flux of up to 10000 records, as per your requirement.
So one call will be handled in a "reactive manner", but it will return only one page of data; if you want to load another page, make another reactive call.
Of course, at the stream level, the stream should be completed once those 10000 objects have been received.
This is the approach I suggest.
There is another option: implement everything via one stream, but in this case the client side (a UI or whatever consumes the paged data) will have to be smart enough to load/unload only the required data. In other words, if all in all you have, say, 1 million objects to show, think about whether you should avoid the situation where all 1 million are loaded on the client side at once.
In addition, page navigation will be somewhat tricky (things like getting the next/previous page). I haven't worked like this before. I think, bottom line, the choice will be requirement-driven.
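To sketch the suggested limit/offset approach without pulling in Reactor, here is the same slicing expressed with java.util.stream. Flux offers the analogous skip(n) and take(n) operators, so the reactive equivalent of one page would be flux.skip(page * size).take(size):

```java
import java.util.List;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class ReactivePagingSketch {

    // One "page" of a (potentially large) stream: drop the earlier pages,
    // then cap the result at pageSize elements.
    static List<Integer> page(Stream<Integer> source, long pageIndex, long pageSize) {
        return source.skip(pageIndex * pageSize)
                     .limit(pageSize)   // Flux equivalent: take(pageSize)
                     .toList();
    }

    public static void main(String[] args) {
        // Stand-in for the API that emits 10000 records.
        Stream<Integer> records = IntStream.rangeClosed(1, 10_000).boxed();
        System.out.println(page(records, 1, 3)); // [4, 5, 6]
    }
}
```

Each client request maps to one such call with its own pageIndex, so every page is still delivered reactively while the stream per call stays bounded.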
We are building a system which needs to put tons of data into some persistent storage for a fixed amount of time - 30 to 60 days. Since the data is not critical (we can lose some, for example, when a virtual machine goes down) and we don't want to pay the price of persisting it with every request (latency is critical for us), we were thinking about either buffering and batching the data or sending it in an async manner.
Data is append only, we would need to persist 2-3 items per request, system processes ~10k rps on multiple hosts scaled horizontally.
We are hesitating between choosing Mongo (3.x?) or Cassandra, but we can go with any other solution. Does anyone here have any experience or hints in solving this kind of problem? We are running some PoCs, but we might not be able to find all the problems early enough and pivot might be costly.
I can't comment on MongoDB, but I can speak to Cassandra. Cassandra does indeed have a TTL feature with which you can expire data after a certain time. You have to plan for it, though, because TTLs add some overhead during a process Cassandra runs called 'compaction' - see: http://docs.datastax.com/en/cassandra/2.1/cassandra/dml/dml_write_path_c.html
and: http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_expire_c.html
As long as you size for that kind of workload, you should be OK. That being said, Cassandra really excels when you have event-driven data - things like time series, product catalogs, click-stream data, etc.
If you aren't familiar with Patrick McFadin, meet your new best friend: https://www.youtube.com/watch?v=tg6eIht-00M
And of course, the plenty of free tutorials and training here: https://academy.datastax.com/
EDIT: one more idea for expiring data 'safely' and with the least overhead, from a sharp guy by the name of Ryan Svihla: https://lostechies.com/ryansvihla/2014/10/20/domain-modeling-around-deletes-or-using-cassandra-as-a-queue-even-when-you-know-better/
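For reference, expiry in Cassandra is set per write with USING TTL (in seconds). The sketch below only shows the arithmetic and the statement shape for the 30-60 day window from the question; the table and column names are made up, and real code should use the driver's prepared statements rather than string concatenation:

```java
import java.time.Duration;

public class TtlSketch {

    // Builds a CQL INSERT whose row Cassandra expires automatically
    // after the given number of days (TTL is expressed in seconds).
    static String insertWithTtl(String id, String payload, int days) {
        long ttlSeconds = Duration.ofDays(days).getSeconds();
        return "INSERT INTO events (id, payload) VALUES ('" + id + "', '" + payload + "')"
                + " USING TTL " + ttlSeconds;
    }

    public static void main(String[] args) {
        // 30 days = 2,592,000 seconds.
        System.out.println(insertWithTtl("e1", "data", 30));
    }
}
```

Since expired cells are only reclaimed during compaction, this is exactly the overhead the links above discuss when sizing the cluster.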
I know a real-time chat app definitely needs reverse AJAX. But what about other applications where real-time behavior is not so important?
Say there is a notification function like the one on Stack Overflow: when people answer your question, you get a notification. It is probably not so important for the user to be notified immediately when there is a new answer.
Does this kind of function need reverse AJAX? Or is it good enough to use basic AJAX that requests new notifications every 60 seconds? Does basic AJAX consume a lot of server resources? How do I choose between them?
I think generally there can be 3 use cases:
You want to display every single update to the user as soon as possible. If so, definitely go with Long Polling. Example: a chat.
You don't need real time notifications and big delays are not a problem. In this case, Traditional Polling with a big interval is good enough for you. Example: updating some statistics in user profile.
The third case is something in between - you don't need real-time updates, but still don't want a big delay.
a. If your data changes frequently, but you don't need to report every single change to users, then Traditional Polling might be better, because it wouldn't send unnecessary updates.
b. If the data rarely changes and you prefer to notify about every change, then again Long Polling might be better, because it won't send the same data again and again.
As user489041 noted, you also need to take your server environment into consideration. Long Polling keeps a TCP connection open the whole time a user is on the page that runs your Long Polling AJAX script. This can become a problem if you have tens of thousands of users and only one server. Even if you have fewer than 10000 users, you need to make sure that your application server is configured to handle that many simultaneous connections. For example, Tomcat in its default configuration can't handle more than 200 simultaneous connections.
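The Traditional Polling variant from case 2 can be sketched with plain java.util.concurrent; in a browser this would be a setInterval'd AJAX request every 60 seconds, and the interval is shortened here only so the example finishes quickly:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Client-side "traditional polling": ask the server for new notifications
// on a fixed interval instead of holding a connection open (Long Polling).
public class PollingSketch {

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch threePolls = new CountDownLatch(3);

        scheduler.scheduleAtFixedRate(() -> {
            // In a real app this would be the AJAX request for new notifications.
            System.out.println("polling for notifications...");
            threePolls.countDown();
        }, 0, 50, TimeUnit.MILLISECONDS);

        threePolls.await(5, TimeUnit.SECONDS); // let a few polls run
        scheduler.shutdownNow();
    }
}
```

The trade-off from the answer is visible here: each poll is a short-lived request the server can finish immediately, at the cost of up to one interval of notification delay and some wasted polls when nothing changed.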