I have this problem where I need to design a Java package which is used for:
Getting data from different data sources. For example, Class A will retrieve customer data from an Oracle database, while Class B will retrieve the same information from a web service data source (via SOAP).
The results will need to be combined; the combination rule is quite complex, so ideally I should hide it from the users (other developers) of this package.
When one data source fails, I still need to return the results from the other data source. However, I also need to let the caller know that one of the data sources failed to respond.
Right now I'm doing it by having a boolean value inside Class A and Class B indicating whether there's an error, plus another object for storing the actual error message. The caller has to check this boolean value after making a call to see whether an error has occurred.
What is a good design model for this?
The answer would be very broad, so I suggest you use:
The Data Access Object (DAO) design pattern to abstract the source of the data (database or webservice)
The strategy pattern to abstract the algorithm by which the data is merged (when both sources are available and when there is only one)
And finally the state design pattern to change the way your application works depending on which source is available.
All this wrapped (why not) in a nice facade.
This pseudocode uses a syntax similar to UML and Python:
// The data implements one interface
Data {interface}
// And you implement it with DatabaseData
DbData -> Data
...
// Or WebServiceData
WsData -> Data
...
// -- DAO part
Dao {interface}
+ fetch(): Data[]
+ hadError(): boolean
// From database
DatabaseDao -> Dao
- data: Data[0..*]
// Query database and create dbData from rows...
+ fetch(): Data[]
self.status = "Not ok"
self.status = connectToDb()
if( self.status == ok ,
performQuery()
forEach( row in resultSet,
data.add( DbData.new( row ) )
)
disconnect()
)
...
// From web service
WebServiceDao -> Dao
- data: Data[0..*]
// Execute remote method and create wsData from some strange object
+ fetch(): Data[]
remoteObject: SoapObject = SoapObject()
remoteObject.connect()
if (remoteObject.connected?(),
differentData: StrangeObject = remoteObject.getRemoteData()
forEach( object in differentData ,
self.data.add( WsData.new( object ) )
)
).else(
self.status = "Disconnected"
)
....
// -- State part
// Abstract the way the data is going to be retrieved
// either from two sources or from a single one.
FetchState { abstract }
- context: Service
- dao: Dao // Used for a single source
+ doFetch(): Data[] { abstract }
+ setContext( context: Service )
self.context = context
+ setSingleSource( dao: Dao)
self.dao = dao
// Fetches only from one DAO, and it doesn't quite merge anything
// because there is only one source after all.
OneSourceState -> FetchState
// Use the single DAO and fetch
+ doFetch(): Data[]
data: Data[] = self.dao.fetch()
// It doesn't hurt to call the "context's" merger anyway.
return context.merger.merge( data, null )
// Two sources, are more complex, fetches both DAOs, and validates error.
// If one source had an error, it changes the "state" of the application (context),
// so it can fetch from single source next time.
TwoSourcesState -> FetchState
- db: Dao = DatabaseDao.new()
- ws: Dao = WebServiceDao.new()
+ doFetch(): Data[]
dbData: Data[] = db.fetch()
wsData: Data[] = ws.fetch()
if( ws.hadError() or db.hadError(),
// Changes the context's state
context.fetcher = OneSourceState.new()
context.merger = OneKindMergeStrategy.new()
context.fetcher.setContext( self.context )
// Find out which one was broken
if( ws.hadError(),
context.fetcher.setSingleSource( db )
)
if( db.hadError(),
context.fetcher.setSingleSource( ws )
)
)
// Since we have the data already let's
// merge it with the "context's" merger.
return context.merger.merge( dbData, wsData)
// -- Strategy part --
// Encapsulate the algorithm used to merge the data
Strategy { interface }
+ merge( a: Data[], with: Data[] ): Data[]
// One kind doesn't merge too much, just "cast" one array
// because there is only one source after all.
OneKindMergeStrategy -> Strategy
+ merge( a: Data[], with: Data[] ): Data[]
mergedData: Data[]
forEach( item, in( a ),
mergedData.add( Data.new( item ) ) // Take values from wsData or dbData
)
return mergedData
// Two kinds merge, encapsulate the complex algorithm to
// merge data from two sources.
TwoKindsMergeStrategy -> Strategy
+ merge( a: Data[], with: Data[] ): Data[]
mergedData: Data[]
forEach( item, in( a ),
forEach( other, in( with ),
wsData: WsData = WsData.cast( item )
dbData: DbData = DbData.cast( other )
// Add the strange and complex logic here.
newItem = Data.new()
if( wsData.name == dbData.column.name and etc. etc.,
newItem.name = wsData + dbData... etc. etc.
...
mergedData.add( newItem )
)
)
)
return mergedData
// Finally, the service where the actual fetch is being performed.
Service { facade }
- merger: Strategy
- fetcher: FetchState
// Initialise the object with the default "strategy" and the default "state".
+ init()
self.fetcher = TwoSourcesState.new()
self.merger = TwoKindsMergeStrategy.new()
fetcher.setContext( self )
// Nahh, just let the state do its work.
+ doFetch(): Data[]
// Fetch using the current application state
return fetcher.doFetch()
Client usage:
service: Service = Service.new()
service.init()
data: Data[] = service.doFetch()
Unfortunately, it looks a bit complex.
OOP is based a lot on polymorphism.
So in Dao, you let each subclass fetch the data from wherever it lives, and you just call dao.fetch().
In Strategy it's the same: each subclass performs one algorithm or the other (avoiding a lot of strange ifs, elses, switches, etc.).
With State the same thing happens. Instead of writing code like:
if isBroken and itDoesntWork() and if ImAlive()
etc., etc., you just say, "Hey, this will be the code when there are two connections, and this is the code when there is only one."
Finally, the facade says to the client, "Don't worry, I'll handle this."
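If it helps to see that skeleton in real Java, here is a minimal compilable sketch of the same structure (the class names mirror the pseudocode; both merge bodies are placeholders rather than the real combination rule):

import java.util.ArrayList;
import java.util.List;

class Data { /* common shape of a customer record */ }

interface Dao {
    List<Data> fetch();
    boolean hadError();
}

interface MergeStrategy {
    List<Data> merge(List<Data> a, List<Data> b);
}

abstract class FetchState {
    protected Service context;
    protected Dao singleSource;

    void setContext(Service context) { this.context = context; }
    void setSingleSource(Dao dao) { this.singleSource = dao; }
    abstract List<Data> doFetch();
}

class OneSourceState extends FetchState {
    @Override
    List<Data> doFetch() {
        // Only one source left, so there is nothing to merge.
        return context.merger.merge(singleSource.fetch(), null);
    }
}

class TwoSourcesState extends FetchState {
    private final Dao db;
    private final Dao ws;

    TwoSourcesState(Dao db, Dao ws) { this.db = db; this.ws = ws; }

    @Override
    List<Data> doFetch() {
        List<Data> dbData = db.fetch();
        List<Data> wsData = ws.fetch();
        if (db.hadError() || ws.hadError()) {
            // Degrade the context: later calls use the surviving source only.
            OneSourceState next = new OneSourceState();
            next.setContext(context);
            next.setSingleSource(db.hadError() ? ws : db);
            context.fetcher = next;
            context.merger = new OneKindMergeStrategy();
            return db.hadError() ? wsData : dbData;
        }
        return context.merger.merge(dbData, wsData);
    }
}

class OneKindMergeStrategy implements MergeStrategy {
    @Override
    public List<Data> merge(List<Data> a, List<Data> b) {
        return a != null ? a : b; // single source: pass it through
    }
}

class TwoKindsMergeStrategy implements MergeStrategy {
    @Override
    public List<Data> merge(List<Data> a, List<Data> b) {
        List<Data> merged = new ArrayList<>(a);
        merged.addAll(b); // placeholder for the "strange and complex" rule
        return merged;
    }
}

// Facade: the only class the other developers need to touch.
class Service {
    MergeStrategy merger;
    FetchState fetcher;

    void init(Dao db, Dao ws) {
        fetcher = new TwoSourcesState(db, ws);
        fetcher.setContext(this);
        merger = new TwoKindsMergeStrategy();
    }

    List<Data> doFetch() {
        return fetcher.doFetch(); // the current state decides one source or two
    }
}

Client code then mirrors the pseudocode's usage: construct the Service, call init(...) with the two concrete DAOs, and call doFetch().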
Do you need to write a solution, or do you need a solution? There's plenty of free Java software that does these things - why reinvent the wheel? See:
Pentaho Data Integration (Kettle)
CloverETL
Jitterbit
I would suggest a Facade that represents the object as a whole (the customer data), plus a factory that creates that object by retrieving the data from each source and passing the results to the Facade (via the constructor or a builder, depending on how many there are). Each source-specific class would have a method (on a common interface or base class) indicating whether there was an error retrieving its data. The Facade (or a delegate) would be responsible for combining the data.
Then the Facade would have a method that would return a collection of some sort indicating which data sources the object represented, or which ones failed - depending on what the client needs to know.
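For illustration, a minimal sketch of that arrangement (all the names are invented for the example, and the combine step is a placeholder for the real combination rule):

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CustomerData { /* the combined customer fields */ }

// Each source reports its own outcome through a common interface.
interface CustomerSource {
    CustomerData fetch();
    boolean hadError();
    String name(); // e.g. "oracle" or "soap"
}

// Facade over the combined data; also exposes which sources failed.
class CustomerFacade {
    private final CustomerData data;
    private final Set<String> failedSources;

    CustomerFacade(CustomerData data, Set<String> failedSources) {
        this.data = data;
        this.failedSources = failedSources;
    }

    CustomerData getData() { return data; }
    Set<String> getFailedSources() { return failedSources; }
}

class CustomerFacadeFactory {
    CustomerFacade create(List<CustomerSource> sources) {
        List<CustomerData> parts = new ArrayList<>();
        Set<String> failed = new HashSet<>();
        for (CustomerSource source : sources) {
            CustomerData part = source.fetch();
            if (source.hadError()) {
                failed.add(source.name());
            } else {
                parts.add(part);
            }
        }
        return new CustomerFacade(combine(parts), failed);
    }

    private CustomerData combine(List<CustomerData> parts) {
        // Placeholder: the complex combination rule lives here,
        // hidden from users of the facade.
        return parts.isEmpty() ? new CustomerData() : parts.get(0);
    }
}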
I'm writing a service that calls 20 external vendor APIs, aggregates the data, and writes it to blob storage. This is how I am calling each API; afterwards I use Mono.zip() and write the result to blob storage. However, I feel that the way I am writing the code is really redundant, specifically the error handling using .onErrorResume() and .doOnSuccess(). Is there a way I can make this code cleaner by using generics or utilizing inheritance in some way? I just don't want hundreds of lines of code that are basically doing the same thing...
Mono<MailboxProvidersDTO> mailboxProvidersDTOMono = partnerAsyncService.asyncCallPartnerApi(getMailboxProvidersUrl, MailboxProvidersDTO.class)
.retryWhen(getRetrySpec())
//need to have error handling for all 20 api calls
.doOnSuccess(res -> {
log.info(Logger.EVENT_SUCCESS, "Mailbox Providers report successfully retrieved.");
res.setStatus("Success");
})
.onErrorResume(BusinessException.class, ex -> {
log.error(Logger.EVENT_FAILURE, ex.getMessage());
MailboxProvidersDTO audienceExplorerDTO = new MailboxProvidersDTO();
audienceExplorerDTO.setStatus("Failed");
return Mono.just(audienceExplorerDTO);
});
Mono<TimeZonesDTO> timeZonesDTOMono = partnerAsyncService.asyncCallPartnerApi(getTimeZonesUrl, TimeZonesDTO.class);
Mono<RegionsDTO> regionsDTOMono = partnerAsyncService.asyncCallPartnerApi(getRegionsUrl, RegionsDTO.class);
Mono<AudienceExplorerDataSourcesDTO> audienceExplorerDataSourcesDTOMono = partnerAsyncService.asyncCallPartnerApi(getAudienceExplorerDataSourcesUrl, AudienceExplorerDataSourcesDTO.class);
...
You can effectively use generics to refactor your code. You can couple functional interfaces and Generics to create what you need:
On your example, you need both to "setStatus" and create new instances of different classes. You could then create a utility function to add onSuccess/onFailure behaviours over your initial data fetching Mono:
public <T> Mono<T> withRecovery(Mono<T> fetchData, BiConsumer<T, String> setStatus, Supplier<T> createFallbackDto) {
return fetchData
.doOnSuccess(result -> {
log.info...
setStatus.accept(result, "Success");
})
.onErrorResume(BusinessException.class, err -> {
log.error...
T fallback = createFallbackDto.get();
setStatus.accept(fallback, "Error");
return Mono.just(fallback);
});
}
Then, you can use this method like that:
Mono<MailboxProvidersDTO> mails = withRecovery(
partnerAsyncService.asyncCallPartnerApi(getMailboxProvidersUrl, MailboxProvidersDTO.class),
MailboxProvidersDTO::setStatus,
MailboxProvidersDTO::new
);
Mono<TimeZonesDTO> timezone = withRecovery(
partnerAsyncService.asyncCallPartnerApi(getTimeZonesUrl, TimeZonesDTO.class),
TimeZonesDTO::setStatus,
TimeZonesDTO::new
);
... // Repeat for each api
Notes:
If the setStatus method is available through a common interface that all the DTOs implement, you can get rid of the BiConsumer and directly call result.setStatus(String), by specializing the T generic to T extends StatusInterface.
With that, you could also factorize the initial fetching and retry calls, by passing the related parameters (URL, class, retry spec) as method inputs.
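For illustration, a minimal sketch of that variant, reusing the question's partnerAsyncService, getRetrySpec(), and BusinessException; the StatusInterface name and the "Failed" fallback status are assumptions:

import java.util.function.Supplier;
import reactor.core.publisher.Mono;

// Assumed: every DTO implements this.
public interface StatusInterface {
    void setStatus(String status);
}

public <T extends StatusInterface> Mono<T> fetchWithRecovery(
        String url, Class<T> dtoClass, Supplier<T> createFallbackDto) {
    return partnerAsyncService.asyncCallPartnerApi(url, dtoClass)
            .retryWhen(getRetrySpec())
            .doOnSuccess(result -> result.setStatus("Success"))
            .onErrorResume(BusinessException.class, ex -> {
                T fallback = createFallbackDto.get();
                fallback.setStatus("Failed");
                return Mono.just(fallback);
            });
}

// One line per API instead of a full operator chain each time:
Mono<MailboxProvidersDTO> mails =
        fetchWithRecovery(getMailboxProvidersUrl, MailboxProvidersDTO.class, MailboxProvidersDTO::new);
Mono<TimeZonesDTO> timeZones =
        fetchWithRecovery(getTimeZonesUrl, TimeZonesDTO.class, TimeZonesDTO::new);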
I'm new to Spark; in our project we are using Spark Structured Streaming to write a Kafka consumer.
We have a use case where I need to modularize the code so that multiple people can work on different pieces of the Spark job simultaneously.
In the first step we read different Kafka topics, so I have two datasets.
Let's say ds_input1 and ds_input2.
I need to pass these to the next step, which another person is working on.
So I have done it as below in Java 8:
DriverClass{
Dataset<Row> ds_input1 = //populate it from kafka topic
Dataset<Row> ds_output1 = null;
SecondPersonClass.process(ds_input1, ds_output1);
//here outside I get ds_output1 as null
//Why is it not working the way List<Object> does in Java?
//Is there anything wrong I am doing? What is the correct way to do this?
Dataset<Row> ds_output2 = null;
ThirdPersonClass.process(ds_output1, ds_output2);
//here outside I get ds_output2 as null
//though ds_output2 is populated inside the function, why is it still null outside?
}
SecondPersonClass{
static void process(Dataset<Row> ds_input1, Dataset<Row> ds_output1) {
//here we have business logic to work on ds_input1 data,
//then we update and assign the result back to the output dataset,
//i.e. ds_output1
//for simplicity let's say as below
ds_output1 = ds_input1;
//here I see data in ds_output1, i.e. ds_output1 is not null
}
}
ThirdPersonClass{
static void process(Dataset<Row> ds_input2, Dataset<Row> ds_output2) {
//here we have business logic to work on ds_input2 data,
//then we update and assign the result back to the output dataset,
//i.e. ds_output2
//for simplicity let's say as below
ds_output2 = ds_input2;
//here I see data in ds_output2, i.e. ds_output2 is not null
}
}
Questions:
Even though the dataset is populated inside the static method, why is that not reflected outside the function, i.e. why is it still null?
Why is Java's call by reference to objects not working here?
How should this be handled?
Can we return multiple Datasets from a function, and if so, how? (A sketch of that approach follows below.)
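For reference, a minimal sketch of the conventional fix: Java passes object references by value, so assigning to a parameter inside process() only changes the local copy of the reference, never the caller's variable. Returning the dataset (or a small holder for several) avoids the out-parameter pattern entirely (names here are illustrative):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

class SecondPersonClass {
    // Return the result instead of trying to fill an "out" parameter.
    static Dataset<Row> process(Dataset<Row> dsInput1) {
        // ... business logic on dsInput1 ...
        return dsInput1; // for simplicity, the transformed dataset
    }
}

// To hand back several datasets at once, return a small holder object.
class StageResult {
    final Dataset<Row> first;
    final Dataset<Row> second;

    StageResult(Dataset<Row> first, Dataset<Row> second) {
        this.first = first;
        this.second = second;
    }
}

// In the driver:
// Dataset<Row> dsOutput1 = SecondPersonClass.process(dsInput1);
// ThirdPersonClass would take dsOutput1 as input in the same way.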
The thing is that I am using Hibernate on the server side and I am basically sending "raw" database data to the client - which is fine, I guess, but it also means that my client gets a List<UpcomingEventDTO> when calling the corresponding service, which is just a list of events from one specified date to another.
If I now want to split those events into a map where each key maps to the list of events of one day, e.g. a Map<Integer, List<UpcomingEventDTO>>, then I have to do this on the client side. This wouldn't bother me if I didn't have to do it in my presenter.
On the one hand I have the loading in my presenter:
private void loadUpcomingEvents(final Integer calendarWeekOffset) {
new XsrfRequest<StoreServletAsync, List<UpcomingEventDTO>>(this.storeServlet) {
@Override
protected void onCall(AsyncCallback<List<UpcomingEventDTO>> asyncCallback) {
storeServlet.getUpcomingEventsForCalendarWeek(storeId, calendarWeekOffset, asyncCallback);
}
@Override
protected void onFailure(Throwable caught) {
}
@Override
protected void onSuccess(List<UpcomingEventDTO> result) {
upcomingEvents = result;
presentUpcomingEvents();
}
}.request();
}
and the conversion of the data before I can present it:
private void presentUpcomingEvents() {
Map<Integer, List<UpcomingEventDTO>> dayToUpcomingEvent = new HashMap<>();
for (UpcomingEventDTO upcomingEvent : this.upcomingEvents) {
@SuppressWarnings("deprecation")
Integer day = upcomingEvent.getDate().getDay();
List<UpcomingEventDTO> upcomingEvents = dayToUpcomingEvent.get(day);
if(upcomingEvents == null) {
upcomingEvents = new ArrayList<>();
}
upcomingEvents.add(upcomingEvent);
dayToUpcomingEvent.put(day, upcomingEvents);
}
List<Integer> days = new ArrayList<Integer>(dayToUpcomingEvent.keySet());
Collections.sort(days);
this.calendarWeekView.removeUpcomingEvent();
for(Integer day : days) {
CalendarDayPresenterImpl eventCalendarDayPresenter = null;
eventCalendarDayPresenter = this.dayToEventCalendarDayPresenter.get(day);
if(eventCalendarDayPresenter == null) {
List<UpcomingEventDTO> upcomingEvents = dayToUpcomingEvent.get(day);
eventCalendarDayPresenter = new CalendarDayPresenterImpl(upcomingEvents);
this.dayToEventCalendarDayPresenter.put(day, eventCalendarDayPresenter);
}
this.calendarWeekView.appendEventCalendarDay(eventCalendarDayPresenter.getView());
}
}
So my problem is basically that I am not really happy with having code like this in my presenter, but on the other hand I don't know how and where to provide the data in this "upgraded" form for my presenter(s).
One could argue that I could just return the data from the server in the form I need, but then I would lose generality, and I don't want to write a separate API to the database for every view and presenter.
Another possibility would be to introduce another layer between the service/servlet layer and my presenter's model, something like a DAO or database layer. But this raises quite a lot of questions for me, e.g. what would be the name of such a layer ^^ and would that layer provide "customized" data for the presenters, or would the data still be kind of generalized?
I'm having quite a hard time figuring out what to do here, so I hope I can benefit from someone's experience.
Thanks a lot for any help here!
The presentation logic should be on the server side, in the controller layer, which is meant to prepare the view for the clients (MVC pattern).
And if many views want to use this, you can make an abstract controller that can be reused by other views.
It's also good to prepare your controller layer for future requirements. Ask yourself: will another client ask to present the data at a different granularity? Maybe show the upcoming events by month or by hour? You could give your API a granularity enum, UPCOMING_EVENTS_DAY_GRANULARITY(DAY, MONTH, HOUR), as a method parameter, so that the client decides what it wants, as sketched below.
And to make it more beautiful, you can also rename/move the controller layer into a web-service layer, which can be considered your future API for external systems (not only for your views but for anyone outside your system).
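A minimal sketch of what that parameter could look like (UpcomingEventDTO and the method name come from the question; the enum, rendered here idiomatically as UpcomingEventsDayGranularity, and the synchronous server-side signature are illustrative):

import java.util.List;
import java.util.Map;

// Hypothetical granularity switch for the events endpoint.
enum UpcomingEventsDayGranularity { DAY, MONTH, HOUR }

interface StoreService {
    // The client picks the grouping; the server returns the events
    // already bucketed by the requested granularity.
    Map<Integer, List<UpcomingEventDTO>> getUpcomingEventsForCalendarWeek(
            long storeId,
            int calendarWeekOffset,
            UpcomingEventsDayGranularity granularity);
}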
Apologies if this question is a duplicate (or if it has an obvious answer that I'm missing) -->
Is there a practice or pattern that involves a web service returning a function definition to the client, instead of a value or object?
For an extra rough outlining example:
I'm interested in the results of some statistical model. I have a dataset of 100,000 objects of class ClientSideClass.
The statistical model sits on a server, where it has to have constant access to a large database and be re-calibrated/re-estimated frequently.
The statistical model takes some mathematical form, like RESULT = function(ClientSideClass) = AX + BY + anotherFunction(List(Z))
The service in question takes requests that have a ClientSideClass object, performs the calculation using the most recent statistical model, and then returns a result object of class ModelResultClass.
In pseudo OOP (again, sorry for the gnarly example):
My program as a client :
static void main() {
/* assume that this assignment is meaningful and that all
the objects in allTheThings have the same identifying kerjigger */
SomeIdentifier id = new SomeIdentifier("kerjigger");
ClientSideClass[100000] allTheThings = GrabThoseThings(id);
for (ClientSideClass c : allTheThings) {
ModelResult mr = Service.ServerSideMethod(c);
// more interesting things
}
}
With my client side class :
ClientSideClass {
SomeIdentifier ID {}
int A {}
double[] B {}
HashTable<String,SomeSimpleClass> SomeHash {}
}
On the server, my main service :
Service {
HashTable<SomeIdentifier,ModelClass> currentModels {}
ModelClass GetCurrentModel(SomeIdentifier id) {
return currentModels.get(id);
}
ModelResultClass ServerSideMethod(ClientSideClass clientObject) {
ModelClass mc = GetCurrentModel(clientObject.ID);
return mc.Calculate(clientObject);
}
}
ModelClass {
FormulaClass ModelFormula {}
ModelResultClass Calculate(ClientSideClass clientObject) {
// apply formula to client object in whatever way
ModelResult mr = ModelFormula.Execute(clientObject);
return mr;
}
}
FormulaClass {
/* no idea what this would look like, just assume
that it is mutable and can change when the model
is updated */
ModelResultClass Execute(clientObject) {
/* do whatever operations on the client object
to get the forecast result
!!! this method is mutable, it could change in
functional form and/or parameter values */
return someResult;
}
}
This form results in a lot of network chatter, and it seems like it could make parallel processing problematic because there's a potential bottleneck in the number of requests the server can process simultaneously and/or how blocking those calls might be.
In a contrasting form, instead of returning a result object, could the service return a function specification? I'm thinking along the lines of a Lisp macro or an F# quotation or something. Those could be sent back to the client as simple text and then processed client-side, right?
So the ModelClass would instead look something like this? -->
ModelClass {
FormulaClass ModelFormula {}
String FunctionSpecification {
/* some algorithm to transform the current model form
to a recognizable text-formatted form */
string myFuncForm = FeelTheFunc();
return myFuncForm;
}
}
And the ServerSideMethod might look like this -->
String ServerSideMethod(SomeIdentifier id) {
ModelClass mc = GetCurrentModel(id);
return mc.FunctionSpecification;
}
As a client, I guess I would call the new service like this -->
static void main() {
/* assume that this assignment is meaningful and that all
the objects in allTheThings have the same identifier */
SomeIdentifier id = new SomeIdentifier("kerjigger");
ClientSideClass[100000] allTheThings = GrabThoseThings(id);
string functionSpec = Service.ServerSideMethod(id);
for (ClientSideClass c : allTheThings) {
ModelResult mr = SomeExecutionFramework.Execute(functionSpec, c);
}
}
This seems like an improvement in terms of cutting the network bottleneck, and it could also readily be modified so that it could be sped up by simply throwing threads at it.
Is this approach reasonable? Are there existing resources or frameworks that do this sort of thing or does anyone have experience with it? Specifically, I'm very interested in a use-case where an "interpretable" function can be utilized in a large web service that's written in an OO language (i.e. Java or C#).
I would be interested in specific implementation suggestions (e.g. use Clojure with a Java service or F# with a C#/WCF service) but I'd also be stoked on any general advice or insight.
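For what it's worth, one low-tech way to prototype this in Java is the JDK's javax.script API (the Nashorn JavaScript engine on Java 8): the service ships the function specification as text, and the client evaluates it locally against each object. A sketch, with all names invented for the example:

import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// Client-side evaluation of a server-provided function specification.
public class SpecExecutor {
    private final ScriptEngine engine =
            new ScriptEngineManager().getEngineByName("JavaScript"); // Nashorn on Java 8

    // functionSpec example (from the server):
    //   "function(a, b) { return 2.5 * a + 0.7 * b; }"
    public double execute(String functionSpec, int a, double b) throws ScriptException {
        engine.put("a", a);
        engine.put("b", b);
        // Wrap the spec in parentheses and apply it to the bound values.
        return ((Number) engine.eval("(" + functionSpec + ")(a, b)")).doubleValue();
    }
}

The loop over allTheThings then calls execute(functionSpec, c.A, ...) per object, which is trivially parallelizable since no network call is involved.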
I'm writing a Java application that copies one database's information (DB2) to another database (SQL Server). The order of operations is very simple:
Check to see if anything has been updated in a certain time frame
Grab everything from the first database that is within the designated time frame
Map database information to POJOs
Divide subsets of POJOs into threads (predefined number in a properties file)
Threads cycle through each POJO individually
Update the second database
I have everything working just fine, but at certain times of the day there is a huge jump in the number of updates that need to take place (it can get into the hundreds of thousands).
Below you can see a generic version of my code. It follows the basic algorithm of the application. Object is generic; the actual application has 5 different types of specified objects, each with its own updater thread class, but the generic functions below are exactly what they all look like. And in the updateDatabase() method, they all get added to threads and run at the same time.
private void updateDatabase()
{
List<Thread> threads = new ArrayList<>();
addObjectThreads( threads );
startThreads( threads );
joinAllThreads( threads );
}
private void addObjectThreads( List<Thread> threads )
{
List<Object> objects = getTransformService().getObjects();
logger.info( "Found " + objects.size() + " Objects" );
createThreads( threads, objects, ObjectUpdaterThread.class );
}
private void createThreads( List<Thread> threads, List<?> objects, Class threadClass )
{
final int BASE_OBJECT_LOAD = 1;
int objectLoad = objects.size() / Database.getMaxThreads() > 0 ? objects.size() / Database.getMaxThreads() + BASE_OBJECT_LOAD : BASE_OBJECT_LOAD;
for (int i = 0; i < (objects.size() / objectLoad); ++i)
{
int startIndex = i * objectLoad;
int endIndex = (i + 1) * objectLoad;
try
{
List<?> objectSubList = objects.subList( startIndex, endIndex > objects.size() ? objects.size() : endIndex );
threads.add( new Thread( (Thread) threadClass.getConstructor( List.class ).newInstance( objectSubList ) ) );
}
catch (Exception exception)
{
logger.error( exception.getMessage() );
}
}
}
public class ObjectUpdaterThread extends BaseUpdaterThread
{
private List<Object> objects;
final private Logger logger = Logger.getLogger( ObjectUpdaterThread.class );
public ObjectUpdaterThread( List<Object> objects)
{
this.objects = objects;
}
public void run()
{
for (Object object : objects)
{
logger.info( "Now Updating Object: " + object.getId() );
getTransformService().updateObject( object );
}
}
}
All of these go to a Spring service that looks like the code below. Again, it's generic, but each type of object has the exact same type of logic. The getObjects() calls from the code above are just one-line pass-throughs to the DAO, so there's no need to post them.
@Service
@Scope(value = "prototype")
public class TransformServiceImpl implements TransformService
{
final private Logger logger = Logger.getLogger( TransformServiceImpl.class );
@Autowired
private TransformDao transformDao;
@Override
public void updateObject( Object object )
{
String sql;
if ( object.exists() )
{
sql = Object.Mapper.UPDATE;
}
else
{
sql = Object.Mapper.INSERT;
}
boolean isCompleted = false;
while ( !isCompleted )
{
try
{
transformDao.updateObject( object, sql );
isCompleted = true;
}
catch (Exception exception)
{
logger.error( exception.getMessage() );
threadSleep();
logger.info( "Now retrying update for Object: " + object.getId() );
}
}
logger.info( "Updated Object: " + object.getId() );
}
}
Finally these all go to the DAO that looks like this:
@Repository
@Scope(value = "prototype")
public class TransformDaoImpl implements TransformDao
{
//@Resource is like @Autowired but with the added option of being able to specify the name
//Good for autowiring two different instances of the same class [NamedParameterJdbcTemplate]
//Another alternative = @Autowired @Qualifier(BEAN_NAME)
@Resource(name = "db2")
private NamedParameterJdbcTemplate db2;
@Resource(name = "sqlServer")
private NamedParameterJdbcTemplate sqlServer;
final private Logger logger = Logger.getLogger( TransformDaoImpl.class );
@Override
public void updateObject( Object object, String sql )
{
MapSqlParameterSource source = new MapSqlParameterSource();
source.addValue( "column1_value", object.getColumn1Value() );
//put all source values from the POJO in just like above
sqlServer.update( sql, source );
}
}
My insert statements look like this:
"INSERT INTO dbo.OBJECT_TABLE " +
"(COLUMN1, COLUMN2...) " +
"VALUES(:column1_value, :column2_value... "
And my update statements look like this:
"UPDATE dbo.OBJECT_TABLE SET " +
"COLUMN1 = :column1_value, COLUMN2 = :column2_value, " +
"WHERE PRIMARY_KEY_COLUMN = :primary_key_value"
It's a lot of code and stuff, I know, but I just wanted to lay out everything I have in the hope of getting help making this faster or more efficient. It takes hours upon hours to update so many rows, and it would be nice if it only took a couple/few hours instead. Thanks for any help. I welcome all learning experiences about Spring, threads, and databases.
If you're sending large amounts of SQL to the server, you should consider batching it using the Statement.addBatch and Statement.executeBatch methods. Batches are finite in size (I always limited mine to 64K of SQL), but they dramatically lower the round trips to the database.
As I was iterating and creating SQL, I would keep track of how much I had batched already, when the SQL crossed the 64K boundary, I'd fire off an executeBatch and start a fresh one.
You may want to experiment with the 64K number, it may have been an Oracle limitation, which I was using at the time.
I can't speak to Spring, but batching is a part of the JDBC Statement. I'm sure it's straightforward to get to this.
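For illustration, a minimal sketch of that batching loop with a plain JDBC PreparedStatement (dataSource, MyObject, and the table/column names are placeholders; Spring's JdbcTemplate and NamedParameterJdbcTemplate expose the same idea through their batchUpdate methods):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;
import javax.sql.DataSource;

void batchUpdate(DataSource dataSource, List<MyObject> objects) throws SQLException {
    String sql = "UPDATE dbo.OBJECT_TABLE SET COLUMN1 = ? WHERE PRIMARY_KEY_COLUMN = ?";
    try (Connection con = dataSource.getConnection();
         PreparedStatement ps = con.prepareStatement(sql)) {
        int pending = 0;
        for (MyObject o : objects) {
            ps.setString(1, o.getColumn1Value());
            ps.setLong(2, o.getId());
            ps.addBatch();
            if (++pending == 1000) { // flush periodically to keep batches bounded
                ps.executeBatch();
                pending = 0;
            }
        }
        if (pending > 0) {
            ps.executeBatch(); // flush the remainder
        }
    }
}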
Check to see if anything has been updated in a certain time frame
Grab everything from the first database that is within the designated time frame
Is there an index on the LAST_UPDATED_DATE column (or whatever you're using) in the source table? Rather than put the burden on your application, if it's within your control, why not write some triggers in the source database that create entries in an "update log" table? That way, all that your app would need to do is consume and execute those entries.
How are you managing your transactions? If you're creating a new transaction for each operation it's going to be brutally slow.
Regarding the threading code, have you considered using something more standard rather than writing your own? What you have is a pretty typical producer/consumer and Java has excellent support for that type of thing with ThreadPoolExecutor and numerous queue implementations to move data between threads that perform different tasks.
The benefit of using something off the shelf is that 1) it's well tested and 2) there are numerous tuning options and sizing strategies that you can adjust to increase performance.
Also, rather than using 5 different thread types for each type of object that needs to be processed, have you considered encapsulating the processing logic for each type into separate strategy classes? That way, you could use a single pool of worker threads (which would be easier to size and tune); a minimal sketch follows below.
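A minimal sketch of that combination (UpdateStrategy and UpdateRunner are invented names; maxThreads would come from the existing Database.getMaxThreads() setting):

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// One strategy per object type replaces the five thread subclasses.
interface UpdateStrategy<T> {
    void update(T object); // e.g. delegate to transformService.updateObject(...)
}

class UpdateRunner {
    <T> void runAll(List<T> objects, UpdateStrategy<T> strategy, int maxThreads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        for (T object : objects) {
            pool.submit(() -> strategy.update(object)); // tasks are queued, not one thread each
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }
}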