Neo4j encryption

Neo4j encryption - java

How can I encrypt a Neo4j database?
For example, suppose I have a small database with only two nodes and one relationship:
a node (Tom:person), a node (ABC:company), and the relationship EMPLOYER between them,
and I have this query.
Cypher query:
MATCH (Tom:person)-[:EMPLOYER]->(ABC:company)
WHERE Tom.name = "Tom"
RETURN ABC.name;
I have read about Neo4j encryption and I found the following:
Neo4j does not currently deal with data encryption explicitly, for scenarios where additional security is desired two approaches are common:
Encrypting the filesystem the database sits upon and encrypting the data itself from the application.
Many Thanks

As the explanation states, Neo4j has no built-in encryption. Either encrypt the filesystem, or encrypt just the data before you insert it. The latter is probably easier if you don't have the resources for an encrypted filesystem, but it requires you to write more code.
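A minimal sketch of the second approach (encrypting a value in the application before it is written as a node property), assuming AES-GCM from the standard javax.crypto API. The class name PropertyCrypto and its methods are hypothetical, and key storage is deliberately left out.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical helper: encrypts a property value before it is sent to the database.
final class PropertyCrypto {
    private static final int IV_BYTES = 12;   // commonly recommended nonce size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generates a fresh AES key; a real application would load it from a key store.
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns Base64(iv || ciphertext), suitable for storing as a string property.
    static String encrypt(String plainText, SecretKey key) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] cipherText = cipher.doFinal(plainText.getBytes(StandardCharsets.UTF_8));
            byte[] out = ByteBuffer.allocate(iv.length + cipherText.length)
                    .put(iv).put(cipherText).array();
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reverses encrypt(): splits off the IV, then decrypts and verifies the rest.
    static String decrypt(String stored, SecretKey key) {
        try {
            byte[] in = Base64.getDecoder().decode(stored);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
            byte[] plain = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        String stored = encrypt("Tom", key);
        System.out.println("value to store in the name property: " + stored);
        System.out.println("decrypted after reading it back: " + decrypt(stored, key));
    }
}
```

Because a random IV is used, the same name encrypts to a different string each time, so a filter such as WHERE Tom.name = "Tom" can no longer match on the encrypted property; the query would have to select nodes by some other, unencrypted property.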

Related

Fetch large volume of data from DynamoDB?

I am developing a Spring Boot REST API which has to fetch a large volume of data (100-200k records) from a DynamoDB table based on search conditions and return the response to the API consumer without loading the entire object list into memory. With SQL-based databases, I have used JdbcTemplate's queryForStream method for a similar requirement. But for a NoSQL database like DynamoDB, I could not find similar methods to stream the data.
One sample scenario is to fetch all passengers who booked a business class ticket on Christmas weekend from the xyz airline DynamoDB database.
note: Edited for clarity.

Reading GBs of data per request from DynamoDB does not seem scalable. Does the end user require all that data? What is the purpose?
DynamoDB returns at most 1 MB of data per Query or Scan request, so for a single end-user API call you would have to make many paginated requests to DynamoDB.
If you are using Scan, then your solution is not at all scalable, and I would suggest considering a different database.
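The paginated requests mentioned above can be wrapped in a lazy iterator so that only one page is held in memory at a time. This is a rough sketch with an in-memory stand-in for the table: PagedStream, Page and fakeFetch are hypothetical names, and the Integer resume key plays the role that LastEvaluatedKey / ExclusiveStartKey play in the real DynamoDB API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

final class PagedStream {
    // One page of results plus the key to resume from (null when there are no more pages).
    static final class Page<T> {
        final List<T> items;
        final Integer nextKey;

        Page(List<T> items, Integer nextKey) {
            this.items = items;
            this.nextKey = nextKey;
        }
    }

    // Lazily walks all pages; the next page is fetched only when the current one is exhausted.
    static <T> Iterator<T> iterate(Function<Integer, Page<T>> fetchPage) {
        return new Iterator<T>() {
            private Iterator<T> current = Collections.emptyIterator();
            private Integer nextKey = null;
            private boolean started = false;

            @Override
            public boolean hasNext() {
                while (!current.hasNext()) {
                    if (started && nextKey == null) {
                        return false; // last page consumed
                    }
                    Page<T> page = fetchPage.apply(nextKey);
                    started = true;
                    nextKey = page.nextKey;
                    current = page.items.iterator();
                }
                return true;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }

    // Stand-in for one paginated request: returns up to pageSize records starting at startKey.
    static Page<String> fakeFetch(Integer startKey, int total, int pageSize) {
        int from = startKey == null ? 0 : startKey;
        int to = Math.min(from + pageSize, total);
        List<String> items = new ArrayList<>();
        for (int i = from; i < to; i++) {
            items.add("passenger-" + i);
        }
        return new Page<>(items, to < total ? to : null);
    }

    public static void main(String[] args) {
        Iterator<String> it = iterate(key -> fakeFetch(key, 10, 4));
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

In a REST endpoint, each record could be written to the response as it is produced by the iterator, rather than being collected into a list first.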

This is not a good use case for REST in general. Have you considered storing the query result in S3?
Your REST API would return a task ID that the client can then use to check the progress of the query and eventually download the result.
This way the API itself scales well, and you can run large numbers of parallel DynamoDB scans or queries in the background.
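The task-ID flow described above might look roughly like this. It is an in-memory sketch only: ExportTasks is a hypothetical class, the job here just returns a made-up location string instead of running a query and uploading to S3, and a real service would keep task state somewhere durable.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical in-memory registry of background export jobs.
final class ExportTasks {
    private final Map<String, CompletableFuture<String>> tasks = new ConcurrentHashMap<>();

    // Starts the export in the background and returns a task id immediately.
    String submit(Supplier<String> exportJob) {
        String taskId = UUID.randomUUID().toString();
        tasks.put(taskId, CompletableFuture.supplyAsync(exportJob));
        return taskId;
    }

    // Reports UNKNOWN, PENDING, FAILED or DONE for a task id.
    String status(String taskId) {
        CompletableFuture<String> future = tasks.get(taskId);
        if (future == null) {
            return "UNKNOWN";
        }
        if (!future.isDone()) {
            return "PENDING";
        }
        return future.isCompletedExceptionally() ? "FAILED" : "DONE";
    }

    // Returns where the result was stored once the task is DONE, otherwise null.
    String resultLocation(String taskId) {
        CompletableFuture<String> future = tasks.get(taskId);
        if (future == null || !future.isDone() || future.isCompletedExceptionally()) {
            return null;
        }
        return future.getNow(null);
    }

    public static void main(String[] args) {
        ExportTasks tasks = new ExportTasks();
        // The location below is a made-up example value.
        String id = tasks.submit(() -> "s3://example-bucket/exports/result.csv");
        while (tasks.status(id).equals("PENDING")) {
            Thread.yield(); // a real client would poll a status endpoint instead
        }
        System.out.println(id + " -> " + tasks.status(id) + " at " + tasks.resultLocation(id));
    }
}
```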

The fastest way to do this is going to be using a parallel Scan operation. Assuming you have sufficient read capacity on the DynamoDB table, this is going to give you very high speed results.
See "parallel scan using Java" at https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ScanJavaDocumentAPI.html for an example
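To illustrate the idea only (the linked page has the actual SDK example): a parallel scan splits the table into segments and lets one worker read each segment. The sketch below simulates that with an in-memory list; ParallelScanSketch and scanSegment are hypothetical names, and the segment and totalSegments arguments mirror the Segment and TotalSegments parameters of the real Scan request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelScanSketch {
    // Stand-in for scanning one segment: returns only the items that fall into that segment.
    static List<Integer> scanSegment(List<Integer> table, int segment, int totalSegments) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < table.size(); i++) {
            if (i % totalSegments == segment) {
                out.add(table.get(i));
            }
        }
        return out;
    }

    // Runs one worker per segment and merges the results once all workers finish.
    static List<Integer> parallelScan(List<Integer> table, int totalSegments) {
        ExecutorService pool = Executors.newFixedThreadPool(totalSegments);
        try {
            List<Future<List<Integer>>> futures = new ArrayList<>();
            for (int s = 0; s < totalSegments; s++) {
                final int segment = s;
                futures.add(pool.submit(() -> scanSegment(table, segment, totalSegments)));
            }
            List<Integer> all = new ArrayList<>();
            for (Future<List<Integer>> future : futures) {
                all.addAll(future.get());
            }
            return all;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> table = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            table.add(i);
        }
        System.out.println(parallelScan(table, 4));
    }
}
```

Note that the merged result is grouped by segment, so it is not in the original item order; callers that need an order have to sort afterwards.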

DynamoDB
Think of DDB in the way that Amazon uses it for e-commerce: small, paginated lists of at most a few hundred items, where the items are usually small in size but must be easy to update.
In this case you would never need to store or fetch GBs of data from the tables.
Your needs as a 'how might we...' question
How might we store GBs of data in AWS and retrieve that data quickly?
AWS Best Practices
Before we dive into solving the 'hmw' question above, we need to understand some core tenets of AWS:
operational excellence
security
reliability
performance efficiency
cost optimisation
sustainability
AWS calls these tenets, or 'pillars', its Well-Architected Framework.
You can read more about them here: https://aws.amazon.com/architecture/well-architected/
Most of them are as described: monitoring, security, reliability, performance, cost efficiency, and being computationally cheap (which also means environmentally friendly).
A sprawling buffet of solutions
Storage
Your need is to store GBs of data.
It still depends on what you're trying to store, but for most storage needs you'd use S3.
To keep things 'compliant' with the Well-Architected Framework, we'd need to enable encryption (in transit and at rest), block public bucket access, etc.
To make everything cost efficient, we have to think about when we want to access this data. If it is accessed regularly, we'll have to use 'hot' storage; otherwise the 'cold' S3 storage options are cheaper, but you trade away retrieval time.
Notable mentions
If you have specific data science needs, you should check out data lakes (which still use S3 under the hood), Glue, and Athena (a query layer on top of S3).
If you're storing text-based data and require near-instant searching and retrieval, use OpenSearch - this is very useful for chat-related data.
Data storage
This depends on your app, but most people still keep a DynamoDB table that acts as a map for S3 queries.
DDB is query optimised and super performant when you fully understand your data queries or access patterns.
Design your table around your access patterns, not around entities.
e.g.
Option 1: One table
  PK                SK
  type#order        timestamp
  type#transaction  timestamp
  ...
Option 2: Multiple entity-based tables
Orders table
  PK   SK         Attr
  id   timestamp  productIDs
Transactions table
  PK   SK         Attr
  id   timestamp  amount, orderId
Products table
  PK   SK
  id   category
The one-table design simply lets you retrieve the data in a small number of requests, but you do need to play with your table design until it's just right.
My recommendation: be creative and mix and match the table styles to your needs. Entity-based tables are still useful in most apps.
Also expect to redo your tables once you find out new things.
It's crucial here that you use an infrastructure-as-code tool to make it easier to tear down and recreate tables - CDK is great for this.
Remember that you are billed per Read and Write units. This is where a well-designed table (to match your access patterns) will help you make concise queries at a low cost.
Data retrieval
This is where you have some options, depending on your app
Again, I would recommend storing big items in S3, not DynamoDB; it is then relatively easy to download GBs of data from S3.
You can also store data in an optimised format such as Parquet.
Also, if you choose to use DynamoDB as a hash map for the S3 bucket, you can quickly find your files and their locations and then place those in a queue, so that the retrieval happens in the background.
You can also copy files within the bucket to a job folder, zip the data and provide the user with the URL to that zip.
You can also use DataSync for copying across buckets.
Final notes
It sounds to me like you are storing data in AWS and downloading for processing.
Most teams approach this by moving their processing and storage to AWS, running the whole process in the cloud.

How to Convert SQL table into Redis Data

Hi, I am new to Redis and would like some help. I am using Java, SQL Server 2008 and a Redis server. To interact with Redis I am using the Jedis API for Java. I know that Redis is used to store key-value data: every key has a value.
Problem Background:
I have a table named "user" which stores data like id, name, email, age and country. This is the schema of the SQL table. The table already has some rows (that is, some data). My primary key is id; it is just for DB use and is of no use to me in the application.
In SQL I can insert a new row, update a row, search for any user, and delete a user.
I want to store this table's data in Redis and then perform similar operations on Redis as well, like search, insert and delete. If I have a good design for storing this info in both the DB and Redis, then these operations can be carried out simply. Remember that I can have multiple tables as well, so I should store data in Redis on a per-table basis.
My Problem
Can you advise me on any design or information about how I can convert DB data to Redis and perform all these operations? I am asking this because I know Facebook also uses Redis to store data. How do they store it?
Any help would be much appreciated.

This is a very hard question to answer, as there are multiple ways you could do it.
The best way, in my opinion, would be to use hashes. A hash is basically a nested key-value type, so each key maps to a hash in which you can store the username, password, etc.
One problem is indexing: you would need to have an ID stored in the key. For example, each user would have to have a key like USER:21414.
The second thing is that, unless you want to rely on commands like KEYS or SCAN, you are going to have to maintain your own list of users to iterate over (only if you need to do that). For this you will need to look at lists or sorted sets.
To be honest, there is no true answer to this question: SQL-style data does not map to key-value pairs in any direct way. You usually have to do a lot more work yourself.
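As a rough illustration of the layout described above, here is a sketch that maps one row of the "user" table to a hash stored under a USER:<id> key and keeps a separate collection of ids for iteration. It uses in-memory maps as stand-ins so that it runs without a server; UserStoreSketch and its methods are hypothetical names, and the comments note which Redis command each step would correspond to.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class UserStoreSketch {
    // In-memory stand-ins for Redis: key -> hash fields, plus an index of known user ids.
    private final Map<String, Map<String, String>> hashes = new HashMap<>();
    private final Set<Long> userIds = new TreeSet<>();

    // Builds the key for one row, e.g. USER:21414.
    static String key(long id) {
        return "USER:" + id;
    }

    // Insert or update: one hash per row, one field per column.
    void save(long id, String name, String email, int age, String country) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", name);
        fields.put("email", email);
        fields.put("age", Integer.toString(age));
        fields.put("country", country);
        hashes.put(key(id), fields); // HSET on a real server
        userIds.add(id);             // e.g. RPUSH or ZADD on a separate index key
    }

    // Lookup by id; returns null when the user does not exist.
    Map<String, String> find(long id) {
        return hashes.get(key(id)); // HGETALL on a real server
    }

    // Delete the row and remove it from the index.
    void delete(long id) {
        hashes.remove(key(id)); // DEL on a real server
        userIds.remove(id);     // e.g. LREM or ZREM on the index key
    }

    // The ids to iterate over, instead of using KEYS or SCAN.
    Set<Long> allIds() {
        return userIds;
    }

    public static void main(String[] args) {
        UserStoreSketch store = new UserStoreSketch();
        store.save(21414, "Tom", "tom@example.com", 30, "UK");
        System.out.println(key(21414) + " -> " + store.find(21414));
        store.delete(21414);
        System.out.println("after delete: " + store.find(21414));
    }
}
```

Searching by anything other than the id (for example by email) would need an additional index of your own, such as a key that maps each email to its user id.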
I would suggest reading as much as you can and would start here http://redis.io/commands and here http://redis.io/documentation.
I have no experience using Jedis so I can't help on that side. If you want an example, I have an open-source social networking site which uses Redis as its sole data store. You can take a look at the code to get some ideas: https://github.com/pjuu/pjuu/blob/master/pjuu/auth/backend.py. It uses Python, but Redis is such an easy thing to use everywhere that there will not be much difference.
Edit: My site above no longer solely uses Redis. An older branch will need to be checked such as 0.4 or 0.3 :)

Web-App using Hibernate that queries a SQL Server 2005 encrypted column

We are developing a web application using Spring 3.1.2 and Hibernate 4.1.7 with a SQL Server 2005 database.
One table has an encrypted column, and we need to perform queries like this one:
OPEN SYMMETRIC KEY PasswordFieldSymmetricKey
    DECRYPTION BY PASSWORD = 'myPassword'
SELECT id,
       plain,
       cipher,
       CONVERT(varchar(50), DecryptByKey(cipher)) AS 'Decrypted'
FROM TS_OWN.cryptest;
GO
CLOSE SYMMETRIC KEY PasswordFieldSymmetricKey
As a solution, someone proposed creating a view that handles the decryption, but we need to ensure that no one can see the decrypted data, and a DBA, for example, could query that view.
At the same time, we don't want to perform the decryption on the Java side, because some heavy aggregation logic is expected to be performed by the database engine for performance reasons.
A possible solution is to create a view that performs the decryption and the aggregations and then encrypts the result again, with the aggregated values being decrypted on the Java side.
Does someone know alternatives?
Thank you all,
Luca

From a server-side perspective, the most transparent solution is to use Jasypt. This library comes with several Hibernate UserTypes for encrypting text/password fields.
As mentioned in the reference documentation, there are limitations:
But encryption sets a limitation on your Hibernate usage: security
standards establish that two different encryption operations on the
same data should not return the same value (due to the use of a random
salt). Because of this, none of the fields that are set to be
encrypted when persisted can be a part of a WHERE clause in your
search queries for the entity they belong to.
While your HQL/SQL queries will hide the decrypting complexity, you won't get the same performance as with a specific database decryption function.
Using database decryption functions performs better, but then all your queries will be embedded in views and that's going to change dramatically the way you use Hibernate.
You could map entities to views instead, but you'll have to pay attention to DML statements (some DBs offer updatable views, others give you materialized views or you might use INSTEAD OF triggers).
One possible solution for OPEN/CLOSE SYMMETRIC KEY is to use your own @Decrypt annotation and add an aspect that issues those statements right after the transaction starts and right before it ends. This will work because the SQL session/connection is bound to the current transaction/thread.

Performance improvement of queries against an encrypted table without changing the application code

I have tagged this problem with both Oracle and Java because both Oracle and Java solutions would be accepted for this problem.
I am new to Oracle security and have been presented with the problem below. I have done some research on the internet but have had no luck so far. At first I thought Oracle TDE might help, but according to the question "Can Oracle TDE protect data from the DBA?", TDE doesn't protect data against the DBA, and that cannot be tolerated here.
Here is the problem:
I have a table containing millions of records. I have a Java application which queries this table using equality or range criteria against the table's primary key column. The primary key column contains sensitive data and has therefore already been encrypted. As a result, queries from the application that use normal (i.e. decrypted) values cannot use the primary key's unique index access path. I need to improve the queries' performance without any changes to the application code (the application config can be modified if necessary, but not the code). Any necessary changes on the database side are OK as long as that column remains encrypted.
Oracle people please: What solution(s) do you suggest to this problem? How can I create an index on decrypted column values and somehow force Oracle to utilize this index? How can I use partitioning such as hash-partitioning? How about views? Any, Any solution?
Java people, please: I have a very vague idea myself, which is to create a separate application in between (i.e. between the database and the application) that acts as a proxy: it receives the queries from the application, replaces the decrypted values with encrypted values, and sends them to the database; it then receives the response and returns the results to the application. The proxy should behave like a database, so that the application can connect to it just by changing the connection string in the configuration file. Would this work? How?
Thanks for all your help in advance!

which queries this table using equality or range criteria against a column in the table which is the primary key column of the table
To find a specific value it's simple enough - you can store the data encrypted any way you like - even as a hash and still retrieve a specific value using an index. But as per my comment elsewhere, you can't do range queries without either:
decrypting each and every row in the table
or
using an algorithm that can be cracked in a few seconds.
Using a linked list (or a related table) to define order instead of an algorithm with intrinsic ordering would force a brute force check on a much larger set of values - but it's nowhere near as secure as a properly encrypted value.
It doesn't matter if you use Oracle, Java or pencil and paper. Might be possible using quantum computing - but if you can't afford to ensure the security of your application / pay for good advice from an expert cryptographer, then you certainly won't be able to afford that.
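The equality-versus-range point can be sketched with a keyed hash (HMAC-SHA256 from the standard javax.crypto API is used here as one possible choice; the answer does not name an algorithm). Equal inputs always give the same token, so an ordinary index on the token column supports exact lookups, but the tokens carry no ordering of the original values, so a range predicate cannot use them. BlindIndex and the demo secret are hypothetical.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

final class BlindIndex {
    // Deterministic keyed hash of a value: same input and secret always give the same token.
    static String token(String value, byte[] secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            byte[] digest = mac.doFinal(value.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Demo secret only; a real one would come from a key store.
        byte[] secret = "demo-secret-change-me".getBytes(StandardCharsets.UTF_8);

        // Stand-in for an indexed token column: token -> row payload.
        Map<String, String> index = new HashMap<>();
        for (String id : new String[] {"1001", "1002", "1003"}) {
            index.put(token(id, secret), "row for " + id);
        }

        // Equality lookup: recompute the token and probe the index.
        System.out.println(index.get(token("1002", secret)));

        // The tokens of consecutive ids are unrelated strings, so "between 1001 and 1003"
        // cannot be answered from the tokens alone.
        System.out.println(token("1001", secret));
        System.out.println(token("1002", secret));
        System.out.println(token("1003", secret));
    }
}
```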

How can I create an index on decrypted column values and somehow force Oracle to utilize this index?
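Maybe you could create a function-based index in which you index the decrypted value. Note that Oracle requires a user-defined function used in such an index to be declared DETERMINISTIC.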
create index ix1 on tablename (decryptfunction(pk1));

how to create "has many" between two documents in couchdb?

Basically, I am wondering how you would do in CouchDB what you would do in MySQL: store the username and password in one table and link the user id as a foreign key in another table of tasks.
Should I just use MySQL for the user authentication part and CouchDB to store lots of user-submitted documents, creating a random unique token to link each user to their "documents" in CouchDB?
Also, I am looking to store Java objects in CouchDB and retrieve them to be used directly in my application. Which Java CouchDB library does this? Ektorp's example seems more complicated compared to couchdb4j.

I do not know Java very well, but I suggest using the simplest tool you can find. CouchDB is very simple, and it is usually most beneficial to access it with simple tools too.
Yes, if you will have many relationships in the data, MySQL will help. However, CouchDB can do some simple has-many queries.
First, there is view collation. You use map/reduce, and for every "child" document, you emit a key pointing to the parent document. When you query for ?key=parent then you get a long list of children. (The wiki explains it pretty well.)
Secondly, I suggest the article What's new in CouchDB 0.11 which shows how to use document _ids to link between two documents.
Good luck!
