Apache Flink Table 1.4: External SQL execution on Table possible? - java

Is it possible to query an existing StreamTable externally, without uploading a .jar to get the execution environment and retrieve the table environment? I had waited for the Apache Flink Table 1.4 release because of its dynamic (continuous) table features. I expected something else: I thought it would be possible to alter the table at runtime and modify its parameters in order to run some live queries, instead of defining (continuous or append-only) database views on top of a data stream. I know I could export my table into some database and query that database dynamically using SQL, but this feels awkward. The beauty of Flink is that everything is real-time and everything is a stream, so is it possible to query a Flink table in real time from an external program?

No, this is not supported at the moment.
There has been some work on exposing the result table of a streaming query as queryable state. This would allow point (key look-up) queries on a fixed key attribute. This feature might become available with Flink 1.5.
There are no concrete plans to support SQL queries on a dynamic table produced by a streaming SQL (or Table API) query. You would have to emit the table to an RDBMS and query the data from there.

Related

Generic approach of mirroring data from Oracle to another database

We have a source Oracle database with a lot of tables (say 100) which we need to mirror to a target database, i.e. copy data increments periodically to the target tables. The target database is currently Oracle, but in the near future it will probably be changed to a different database technology.
Currently we could create a PL/SQL procedure which dynamically generates DML (insert, update, or merge statements) for each table from Oracle metadata, assuming that the source and target tables have exactly the same attributes.
But we would rather create a database-technology-independent solution, so that when we change the target database to another (e.g. MS SQL or Postgres) we will not need to change the whole mirroring logic.
Does anyone have a suggestion how to do this differently (preferably in Java)?
Thanks for any advice.
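The metadata-driven statement generation does not have to live in PL/SQL; the same templating can be done in plain Java, which keeps the generator in one place even while the emitted dialect still varies per target. A minimal sketch, assuming identical source and target columns (the table and column names, and the `target.`/`source.` schema prefixes, are hypothetical; the generated statement uses Oracle's MERGE syntax):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class MergeGenerator {

    // Builds an Oracle-style MERGE for one table, assuming the source and
    // target tables have identical columns. keyCols drive the ON clause.
    public static String buildMerge(String table, List<String> keyCols,
                                    List<String> dataCols) {
        List<String> all = new ArrayList<>(keyCols);
        all.addAll(dataCols);

        String on  = join(keyCols, " AND ", c -> "t." + c + " = s." + c);
        String set = join(dataCols, ", ",  c -> "t." + c + " = s." + c);
        String ins = String.join(", ", all);
        String val = join(all, ", ", c -> "s." + c);

        return "MERGE INTO target." + table + " t USING source." + table + " s"
             + " ON (" + on + ")"
             + " WHEN MATCHED THEN UPDATE SET " + set
             + " WHEN NOT MATCHED THEN INSERT (" + ins + ") VALUES (" + val + ")";
    }

    private static String join(List<String> cols, String sep,
                               Function<String, String> f) {
        StringBuilder sb = new StringBuilder();
        for (String c : cols) {
            if (sb.length() > 0) sb.append(sep);
            sb.append(f.apply(c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildMerge("EMP",
                List.of("ID"), List.of("NAME", "SALARY")));
    }
}
```

In practice the key and data column lists would come from the JDBC DatabaseMetaData of the source connection, and a per-dialect template would replace the MERGE string for targets that lack it.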
The problem you have is called CDC - change data capture. In the case of Oracle this is complicated, because Oracle usually asks money for it.
So you can use:
PL/SQL or Java with plain SQL to incrementally detect changes in the data. It requires plenty of work and performance is poor.
Tools based on Oracle triggers, which detect data changes and push them into some queue.
A tool which can parse the content of Oracle archive logs. These are commercial products: GoldenGate (from Oracle) and SharePlex (Quest). GoldenGate also contains a Java technology (XStreams) which allows you to inject a Java visitor into the data stream. These technologies also support sending data changes into a Kafka stream.
There are plenty of tools such as Debezium, Informatica, and Tibco which cannot parse the archive logs themselves; instead they use Oracle's built-in LogMiner. These tools usually do not scale well and cannot cope with higher data volumes.
As a summary: if you have money, pick GoldenGate or SharePlex. If you don't, pick Debezium or another Java CDC project based on LogMiner.
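The first option (incremental detection with plain SQL) usually boils down to keeping a high-water mark per table, typically the maximum value of a last-modified timestamp column, and repeatedly selecting the rows above it. A minimal in-memory sketch of that bookkeeping (the Row type and the column semantics are hypothetical; in a real implementation the scan would be a JDBC query against the source table):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HighWaterMarkPoller {

    // One source row: primary key plus a monotonically increasing
    // "last modified" value (e.g. a timestamp column on the Oracle side).
    public static class Row {
        final long id;
        final long lastModified;
        public Row(long id, long lastModified) {
            this.id = id;
            this.lastModified = lastModified;
        }
    }

    // High-water mark per table name.
    private final Map<String, Long> watermarks = new HashMap<>();

    // Returns only the rows modified since the previous poll of this table
    // and advances the watermark. With JDBC this loop would instead be:
    //   SELECT ... FROM t WHERE last_modified > ? ORDER BY last_modified
    public List<Row> poll(String table, List<Row> currentRows) {
        long mark = watermarks.getOrDefault(table, Long.MIN_VALUE);
        long newMark = mark;
        List<Row> changed = new ArrayList<>();
        for (Row r : currentRows) {
            if (r.lastModified > mark) {
                changed.add(r);
                newMark = Math.max(newMark, r.lastModified);
            }
        }
        watermarks.put(table, newMark);
        return changed;
    }
}
```

Note what this approach misses: deletes leave no row behind to detect, and in-flight transactions can commit "behind" the watermark, which is part of why the answer rates this option as labor-intensive with poor reliability.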

Enhance persistence.xml for database update

For development and deployment of my WAR application I use the drop-and-create functionality: basically erasing everything from the database and then automatically recreating all the necessary tables and fields according to my @Entity classes.
Obviously, for production the drop-and-create functionality is out of question. How would I have to create the database tables and fields?
The nice thing about @Entity classes is that, thanks to JPQL and the use of EntityManager, all the database queries are generated, hence the WAR application stays database independent. If I now had to create the queries by hand in SQL and then let the application execute them, I would have to decide which SQL dialect to use (i.e. MySQL, Oracle, SQL Server, ...). Is there a way to create the tables database-independently? Is there a way to run structural database updates database-independently as well (i.e. from database version 1 to database version 2), like altering field or table names, adding tables, dropping tables, etc.?
Thank you @Qwerky for mentioning Liquibase. This absolutely is a solution, and perfect for my case as I won't have to worry about versioning anymore. Liquibase is very easy to understand and can be picked up in minutes.
For anyone looking for database versioning / schema management:
Liquibase
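Liquibase describes schema changes declaratively in a changelog and translates them into the dialect of whatever database it runs against, which covers both initial creation and versioned structural updates. A small sketch of a changelog (the table and column names are made up for illustration):

```xml
<databaseChangeLog
    xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
        http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.1.xsd">

    <!-- version 1: create the table -->
    <changeSet id="1" author="dev">
        <createTable tableName="customer">
            <column name="id" type="bigint" autoIncrement="true">
                <constraints primaryKey="true" nullable="false"/>
            </column>
            <column name="name" type="varchar(255)"/>
        </createTable>
    </changeSet>

    <!-- version 2: an incremental structural update -->
    <changeSet id="2" author="dev">
        <renameColumn tableName="customer"
                      oldColumnName="name" newColumnName="full_name"/>
    </changeSet>
</databaseChangeLog>
```

Each changeSet is applied exactly once per database (Liquibase records applied sets in a tracking table), so the same changelog upgrades any environment from whatever version it is at.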

Database independency using Hibernate

I am using Hibernate for ORM in my Java application. I want to write custom queries combining multiple tables and using DB functions like sum(salary).
I also want to support multiple databases without rewriting the SQL again and again for each database. The approach currently followed is to have stored procedures specific to each DB (Oracle, MySQL, etc.), and for whichever one we want to support, we change the configuration file in the application.
What I am looking for is a solution very generic so that I need not write Stored Procedures or SQLs for every new functionality.
If you really want to keep it portable, you'll need to do it all with HQL.
There's no reason you couldn't do multi-table joins and aggregate functions in HQL; you just need to limit yourself to the ones it supports.
Once you start doing database-vendor specific things, you are no longer database independent, by definition.
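For example, a sum over a join can be expressed entirely in HQL against entities rather than tables, so Hibernate renders it in each configured database's dialect. A sketch (the Employee/Department entities, the `department` association, and the `salary` field are hypothetical mappings):

```sql
-- HQL, not SQL: it references mapped entities and associations,
-- and Hibernate translates it per database dialect
select d.name, sum(e.salary)
from Employee e join e.department d
group by d.name
```

As long as the query sticks to HQL constructs like this, no per-database stored procedures or SQL variants are needed.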
A perfect fit is the Hibernate Criteria API.
Hibernate provides alternate ways of manipulating objects and, in turn, the data available in RDBMS tables. One of them is the Criteria API, which allows you to build up a criteria query object programmatically, applying filtration rules and logical conditions.
http://www.tutorialspoint.com/hibernate/hibernate_criteria_queries.htm

How do I implement the UnityJDBC in my Java project?

I have a project I am working on; it's all about querying data from multiple databases from different vendors (I mean querying databases like MySQL, HSQLDB, Microsoft SQL Server, Oracle, etc. at the same time, using one query statement).
I have achieved this by loading each database's connector driver and executing the query sequentially across the databases. But the project architecture is such that when I send a query statement, it should go simultaneously to each database and retrieve the item if available in all the databases involved.
I came across this UnityJDBC software, a mediation software, but I don't know how to implement it in my Java source file so as to achieve my aim. I have read the UnityJDBC user manual, but it is not clear and straightforward.
Can anyone please advise how to implement this UnityJDBC driver in my Java application and use it to successfully query multiple databases?
Suggestions for any other way to simultaneously query multiple databases with a single statement would also be welcome.
UnityJDBC allows you to query multiple databases in one SQL query. You cannot do this using separate threads as you would then be responsible for merging the data from the multiple databases yourself in your Java program.
The setup steps are easy:
Use the SourceBuilder application to specify the JDBC connection information to your databases.
Test a sample query that accesses multiple databases. Standard SQL is supported. To reference tables in different databases use databaseName.tableName in your FROM clause.
For example:
SELECT *
FROM Database1.Table1 T1 INNER JOIN Database2.Table2 T2 ON T1.id = T2.id
The SourceBuilder application will produce an XML configuration file as output, often called sources.xml. To use this in your own Java program, or in any software that supports JDBC, the connection URL is jdbc:unity://sources.xml. You may specify an absolute or relative path to the sources.xml file.
There is documentation on their site at http://www.unityjdbc.com/support/ or contact them for free support.
Another way to get started quickly is to use the MultiSource SQL Plugin that comes with the open source query software SQuirreL SQL. The plugin will allow you to query any number of databases using SQL in SQuirreL and will generate the XML configuration files for you to use in other programs. The plugin is open source and free. The plugin also supports querying and joining NoSQL databases like MongoDB with relational databases such as MySQL and Postgres.
You don't need UnityJDBC for what you want to do, if you've already managed to load all the db-specific JDBC drivers.
Instead, you should look at doing each query in a separate thread. That way, you don't need to wait for one database to return its results before querying the next one.
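That fan-out can be done with an ExecutorService: submit one Callable per database and collect the results once all have completed. A sketch with the per-database queries stubbed out (in real code each task would open its own JDBC Connection and execute the statement):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelQuery {

    // Runs one "query" per database in parallel and merges the row lists.
    // The queries are stubbed here; a real task would use DriverManager
    // with that database's own JDBC driver.
    public static List<String> queryAll(List<Callable<List<String>>> perDbQueries)
            throws Exception {
        ExecutorService pool =
                Executors.newFixedThreadPool(Math.max(1, perDbQueries.size()));
        try {
            List<Future<List<String>>> futures = pool.invokeAll(perDbQueries);
            List<String> merged = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                merged.addAll(f.get());   // rethrows any per-database failure
            }
            return merged;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<List<String>>> tasks = List.of(
                () -> List.of("row-from-mysql"),
                () -> List.of("row-from-oracle"));
        System.out.println(queryAll(tasks));
    }
}
```

Note the trade-off both answers describe: threads parallelize independent queries, but any cross-database joining or merging of the result sets is then yours to write in Java, which is exactly what UnityJDBC handles for you.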

Can I use JPQL to query a MySQL database which was populated by loading CSV files?

This is something of a noob question, so please bear with me.
I'm building a Java web app which is deployed on JBoss. Part of the functionality is populating a MySQL DB with data from an Excel spreadsheet. This can be achieved in 2 ways:
Using JExcel / Apache POI to parse the spreadsheet data and creating Entity "beans" which are then persisted to the DB.
Using scripts to convert the spreadsheet to csv files and then load the csv files into the DB.
My question is: If I choose the scripting / csv route, can I still use JPQL to query the DB or will I have to resort to native SQL queries in the Java code?
JPQL can be used to query a table independently of the method that was used to populate it. Data stored in a table carries no trace of how it was inserted.
JPA is not notified of changes made to the data via a script, but in the typical use case, with no additional caches and a transaction-scoped PersistenceContext, that is not an issue, because the query will hit the database and deliver fresh data.
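So a JPQL query like the following behaves the same whether the rows came from persisted entities or from a CSV load, as long as the entity is mapped to the table the CSV was loaded into (the Product entity and its fields here are hypothetical):

```sql
-- JPQL: queries the mapped entity, not the table directly
SELECT p FROM Product p WHERE p.price > :minPrice
```

The only requirement is that the CSV data satisfies the mapping (column types, nullability, and any generated-ID expectations of the entity).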

Categories

Resources