I have a set (set1)
Bins:
bin1 (PK = key1)
bin2 (PK = key1)
bin3 (PK = key2)
bin4 (PK = key2)
Which is the more optimized way (in terms of query time, CPU usage, and failure cases for 1 client call vs 2 client calls) of querying the data with the Aerospike client, out of the 2 approaches below?
Approach 1: Make 1 get call using the Aerospike client with bins = [bin1, bin2, bin3, bin4] and keys = [key1, key2].
Approach 2: Make 2 Aerospike client get calls. The first call will have bins = [bin1, bin2] and keys = [key1], and the second call will have bins = [bin3, bin4] and keys = [key2].
I find Approach 2 cleaner, since in Approach 1 we will try to get the record for all combinations (e.g. bin1 with key2 as the primary key), which is extra computation, and the primary key set can be large. But the disadvantage of Approach 2 is two Aerospike client calls.
A. Batch reads vs. multiple single reads
This is something of a false choice. Yes, you could make a batch call for [key1, key2] (approach 1), but then you shouldn't specify bin1, bin2, bin3, bin4; just get the full records without selecting bins. Or you could make two independent get() calls, one for key1 and one for key2 (approach 2).
However, there's no reason you need to read key1, wait for the result, then read key2. You can read them with a synchronous get(key1) in one thread and a synchronous get(key2) in another thread; the Java client can handle multi-threaded use. Alternatively, you can issue an async get(key1) and immediately follow it with an async get(key2).
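For concreteness, here is a minimal sketch of both variants with the Aerospike Java client; the host, namespace, and set names are assumptions, not taken from your setup:

import com.aerospike.client.AerospikeClient;
import com.aerospike.client.Key;
import com.aerospike.client.Record;

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

AerospikeClient client = new AerospikeClient("127.0.0.1", 3000);
Key key1 = new Key("test", "set1", "key1");
Key key2 = new Key("test", "set1", "key2");

// (approach 1) one batch call, reading the full records without selecting bins:
Record[] records = client.get(null, new Key[] { key1, key2 });

// (approach 2) two single-record reads issued in parallel from two threads,
// each selecting only the bins that belong to that key:
ExecutorService pool = Executors.newFixedThreadPool(2);
Callable<Record> read1 = () -> client.get(null, key1, "bin1", "bin2");
Callable<Record> read2 = () -> client.get(null, key2, "bin3", "bin4");
Future<Record> r1 = pool.submit(read1);
Future<Record> r2 = pool.submit(read2);
// r1.get() and r2.get() block only until each read completes
pool.shutdown();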
Batch reads (as in approach 1) are not as efficient as single reads when the number of records requested is no larger than, or not much larger than, the number of nodes in the cluster. The records are evenly distributed, so if you have a 4 node cluster and you make a batch request with 4 keys, you end up with parallel sub-batches of roughly 1 record per node. The overhead associated with batch reads isn't worth it in that case. See more about batch index in the docs and the knowledge base article FAQ - batch-index tuning parameters. The FAQ - Differences between getting single record versus batch should answer your question.
B. The number of records in an Aerospike database doesn't impact read performance!
You are worried that "the primary key set can be large". That is not a problem at all for Aerospike. In fact, one of the best things about Aerospike is that getting a single record from a database with 1 million records or one with 1 trillion records is pretty much the same big-O computational cost.
Each record has a 64 byte metadata entry in the primary index. The primary index is spread evenly across the nodes of the cluster, because data distribution in Aerospike is extremely even. Each node stores an even share of the partitions, out of 4096 logical partitions for each namespace in the cluster. The partitions are represented as a collection of red-black binary trees (sprigs) with a hash table leading to the correct sprig.
To find any record, the client hashes its key into a 20 byte digest. Using 12 bits of the digest, the client finds the partition ID, looks it up in the partition map it holds locally, and finds the correct node. Reading the record is now a single hop to the correct node. On that node, a service thread picks up the call from a channel of the network card and looks up the record in the correct partition (again, finding the partition ID from the digest is a simple O(1) operation). It hops directly to the correct sprig (also O(1)) and then does a simple O(log n) binary tree lookup for the record's metadata. Now the service thread knows exactly where to find the record in storage, with a single read IO. I explained this read flow in more detail here (though in version 4.7 transaction queues and threads were removed; the service thread does all the work).
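As a rough sketch of that digest-to-partition step (the exact choice of digest bytes below is my assumption for illustration; the point is that it is a constant-time bit operation):

// Map a 20-byte digest to one of the 4096 partitions using 12 bits of the digest.
static int partitionId(byte[] digest) {
    return ((digest[0] & 0xFF) | ((digest[1] & 0xFF) << 8)) & 0x0FFF;   // value in 0..4095
}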
Another point is that the time spent looking up record metadata in the index is orders of magnitude less than getting the record from storage.
So, the number of records in the cluster doesn't change how long it takes to read a random record, whatever the size of the data set.
I wrote an article Aerospike Modeling: User Profile Store that shows how this fact is leveraged to make sub-millisecond reads at millions of transactions-per-second from a petabyte scale data store.
Related
We have around 20k merchants' data, with a size of around 3 MB.
If we cache this much data together, then Hazelcast performance is not good.
Please note that if we cache all 20k merchants individually, then the "get all merchants" call slows down, as reading each merchant from the cache costs high network time.
How should we partition this data?
What will be the partition key?
What will be the max size per partition?
The merchant entity attributes are as below:
Merchant id, parent merchant id, name, address, contacts, status, type
Merchant id is the unique attribute.
Please suggest.
Adding to what Mike said, it's not unusual to see Hazelcast maps with millions of entries, so I wouldn't be concerned with the number of entries.
You should structure your map(s) to fit your application's design needs. Doing a 'getAll' on a single map seems inefficient to me. It may make more sense to create multiple maps or use a complex key that allows you to be more selective with the entries returned.
Also, you may want to look at indexes. You can index the key and/or value which can really help with performance. Predicates you construct for selections will automatically use any defined indexes.
I wouldn't worry about changing partition key unless you have reason to believe the default partitioning scheme is not giving you a good distribution of keys.
With 20K merchants and 3MB of data per merchant, your total data is around 60GB. How many nodes are you using for your cache, and what memory size per node? Distributing the cache across a larger number of nodes should give you more effective bandwidth.
Make sure you're using an efficient serialization mechanism; the default Java serialization is very inefficient (both in terms of object size and the speed of serializing and deserializing). Using something like IdentifiedDataSerializable (if Java) or Portable (if using non-Java clients) could help a lot.
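As an illustration, here is a hedged sketch of what an IdentifiedDataSerializable merchant entity could look like; the factory/class ids are placeholders, the method names follow the Hazelcast 4.x Java API (on 3.x the class-id method is getId()), and you would also register a matching DataSerializableFactory in the Hazelcast config:

import com.hazelcast.nio.ObjectDataInput;
import com.hazelcast.nio.ObjectDataOutput;
import com.hazelcast.nio.serialization.IdentifiedDataSerializable;

import java.io.IOException;

public class Merchant implements IdentifiedDataSerializable {
    public static final int FACTORY_ID = 1;   // placeholder: must match your registered factory
    public static final int CLASS_ID = 1;     // placeholder: unique per class within the factory

    private String merchantId;
    private String parentMerchantId;
    private String name;
    private String address;
    private String contacts;
    private String status;
    private String type;

    @Override
    public int getFactoryId() { return FACTORY_ID; }

    @Override
    public int getClassId() { return CLASS_ID; }

    @Override
    public void writeData(ObjectDataOutput out) throws IOException {
        out.writeUTF(merchantId);
        out.writeUTF(parentMerchantId);
        out.writeUTF(name);
        out.writeUTF(address);
        out.writeUTF(contacts);
        out.writeUTF(status);
        out.writeUTF(type);
    }

    @Override
    public void readData(ObjectDataInput in) throws IOException {
        merchantId = in.readUTF();
        parentMerchantId = in.readUTF();
        name = in.readUTF();
        address = in.readUTF();
        contacts = in.readUTF();
        status = in.readUTF();
        type = in.readUTF();
    }
}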
I would strongly recommend that you break down your object from 3 MB to a few tens of KB, otherwise you will run into problems that are not particularly related to Hazelcast. For example: fat packets blocking other packets, resulting in heavy latency in read/write operations; heavy serialization/deserialization overhead; a choked network, etc. You have already identified high network time, and it is not going to go away without flattening the value object. If yours is a read-heavy use case, then I also suggest looking into NearCache for ultra-low-latency read operations.
As for partition size, keep it under 100 MB; I'd say between 50-100 MB per partition. Simple maths will help you:
3mb/object x 20k objects = 60GB
Default partition count = 271
Each partition size = 60,000 MB / 271 = 221MB.
So increasing the partition count to, let's say, 751 will mean:
60,000 MB / 751 = 80 MB.
So you can go with the partition count set to 751. To cater to a possible increase in future traffic, I'd set the partition count to an even higher number: 881.
Note: Always use a prime number for partition count.
FYI: in one of the future releases, the default partition count will be changed from 271 to 1999.
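If you do decide to change it, a minimal sketch of setting the partition count programmatically (the same property can go in hazelcast.xml, and every member must use the same value):

import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

Config config = new Config();
config.setProperty("hazelcast.partition.count", "751");   // prime; keeps partitions under ~100 MB for ~60 GB of data
HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);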
I have this table in Cassandra:
CREATE TABLE adress (
adress_id uuid,
adress_name text,
key1 text,
key2 text,
key3 text,
key4 text,
effective_date timestamp,
value text,
active boolean,
PRIMARY KEY ((adress_id, adress_name), key1, key2, key3, key4, effective_date)
)
As far as I understand, Cassandra will distribute the data of the adress table based on the partition key, which is (adress_id, adress_name).
There is a risk when I try to insert too much data that shares the same (adress_id, adress_name).
I would like to perform a check before inserting data. The check happens like this:
How much data do I already have in Cassandra for the pair (adress_id, adress_name)? Let's suppose it's 5 MB.
I need to check that the size of the data I'm trying to insert doesn't exceed the Cassandra limit per partition key minus the existing data in Cassandra.
My question is how to query Cassandra to get the size of the data for the pair (adress_id, adress_name).
And after that, what is the size limit of a partition in Cassandra?
As Alex Ott noted above, you should spend more time on the data model to avoid the possibility of huge partitions in the first place, by organizing your data differently or by artificially splitting partitions into more pieces (time-series data, for example, is often split into a separate partition per day).
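For illustration, a minimal Java-side sketch of that daily-split idea, assuming you add a day_bucket column to the partition key (the column name and format are my assumptions, not part of the question's schema):

import java.time.Instant;
import java.time.ZoneOffset;

Instant effectiveDate = Instant.now();   // the row's timestamp
String dayBucket = effectiveDate.atZone(ZoneOffset.UTC).toLocalDate().toString();   // e.g. "2024-05-17"
// The table would then be keyed as
// PRIMARY KEY ((adress_id, adress_name, day_bucket), key1, key2, key3, key4, effective_date)
// so one logical address is spread over many bounded daily partitions.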
It is technically possible to figure out the existing size of a partition, but it will never be efficient. To understand why, you need to recall how Cassandra stores data. The content of a single partition isn't always stored in the same sstable (on-disk file) - data for the same partition may be spread across multiple files. One file may have a few rows, another file may have a few more rows, a third file may delete or modify some old rows, and so on. To figure out the size of the partition, Cassandra would need to read all of this data, merge it together, and measure the size of the result. Cassandra does not normally do this on writes - it just writes the new update to memory (and eventually to a new sstable), without reading the old data first. This is what makes writes in Cassandra so fast - and your idea of reading the entire partition before each write will drastically slow them down.
Finally, while Cassandra does not handle huge partitions very well, there is no inherent reason why it never could, if the developers wanted to solve this issue. The developers of the Cassandra clone Scylla are worried about this issue and are working to improve it, but even in Scylla the handling of huge partitions isn't perfect yet. But eventually it will be - almost: there will always be a limit on the size of a single partition (which, by definition, is stored on a single node), namely the size of a single disk. Even that limit may become a serious problem if your data model is really broken and you end up with a terabyte in a single partition.
We have a DynamoDB table structure which has a Hash and a Range key as its primary key.
Hash = date.random_number
Range = timestamp
How do we get items within X and Y timestamps? Since the hash key has a random_number attached, the query has to be fired that many times. Is it possible to give multiple hash values and a single RangeKeyCondition?
What would be most efficient in terms of cost and time?
Random number range is from 1 to 10.
If I understood correctly, you have a table with the following definition of Primary Keys:
Hash Key : date.random_number
Range Key : timestamp
One thing that you have to keep in mind is that, whether you are using GetItem or Query, you have to be able to calculate the Hash Key in your application in order to successfully retrieve one or more items from your table.
It makes sense to use the random numbers as part of your Hash Key so your records can be evenly distributed across the DynamoDB partitions, however, you have to do it in a way that your application can still calculate those numbers when you need to retrieve the records.
With that in mind, let's create the query needed for the specified requirements. The native AWS DynamoDB operations that you have available to obtain several items from your table are:
Query, BatchGetItem and Scan
In order to use BatchGetItem you would need to know beforehand the entire primary key (Hash Key and Range Key), which is not the case.
The Scan operation will literally go through every record of your table, something that in my opinion is unnecessary for your requirements.
Lastly, the Query operation allows you to retrieve one or more items from a table applying the EQ (equality) operator to the Hash Key and a number of other operators that you can use when you don't have the entire Range Key or would like to match more than one.
The operator options for the Range Key condition are: EQ | LE | LT | GE | GT | BEGINS_WITH | BETWEEN
It seems to me that the most suitable for your requirements is the BETWEEN operator, that being said, let's see how you could build the query with the chosen SDK:
import com.amazonaws.services.dynamodbv2.document.DynamoDB;
import com.amazonaws.services.dynamodbv2.document.ItemCollection;
import com.amazonaws.services.dynamodbv2.document.QueryOutcome;
import com.amazonaws.services.dynamodbv2.document.RangeKeyCondition;
import com.amazonaws.services.dynamodbv2.document.Table;

// dynamoDB is a com.amazonaws.services.dynamodbv2.document.DynamoDB instance created elsewhere
Table table = dynamoDB.getTable(tableName);

String hashKey = "<YOUR_COMPUTED_HASH_KEY>";
String timestampX = "<YOUR_TIMESTAMP_X_VALUE>";
String timestampY = "<YOUR_TIMESTAMP_Y_VALUE>";

// BETWEEN condition on the Range Key (the timestamp attribute)
RangeKeyCondition rangeKeyCondition = new RangeKeyCondition("RangeKeyAttributeName").between(timestampX, timestampY);

ItemCollection<QueryOutcome> items = table.query("HashKeyAttributeName", hashKey,
                                                 rangeKeyCondition,
                                                 null, //FilterExpression - not used in this example
                                                 null, //ProjectionExpression - not used in this example
                                                 null, //ExpressionAttributeNames - not used in this example
                                                 null); //ExpressionAttributeValues - not used in this example
You might want to look at the following post to get more information about DynamoDB Primary Keys:
DynamoDB: When to use what PK type?
QUESTION: My concern is querying multiple times because of the random_number attached to the hash key. Is there a way to combine these queries and hit DynamoDB once?
Your concern is completely understandable; however, the only way to fetch all the records via BatchGetItem is by knowing the entire primary key (HASH + RANGE) of all records you intend to get. Although minimizing the HTTP round trips to the server might seem to be the best solution at first sight, the documentation actually suggests doing exactly what you are doing to avoid hot partitions and uneven use of your provisioned throughput:
Design For Uniform Data Access Across Items In Your Tables
"Because you are randomizing the hash key, the writes to the table on
each day are spread evenly across all of the hash key values; this
will yield better parallelism and higher overall throughput. [...] To
read all of the items for a given day, you would still need to Query
each of the 2014-07-09.N keys (where N is 1 to 200), and your
application would need to merge all of the results. However, you will
avoid having a single "hot" hash key taking all of the workload."
Source: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GuidelinesForTables.html
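To make that concrete, here is a hedged sketch (same Document API and placeholder attribute names as the snippet above) of firing the date.N sub-queries in parallel and merging the results client-side:

import com.amazonaws.services.dynamodbv2.document.Item;
import com.amazonaws.services.dynamodbv2.document.RangeKeyCondition;
import com.amazonaws.services.dynamodbv2.document.Table;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

static List<Item> queryAllBuckets(Table table, String date,
                                  String timestampX, String timestampY) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(10);
    List<Future<List<Item>>> futures = new ArrayList<>();
    for (int n = 1; n <= 10; n++) {                        // random_number range is 1 to 10
        final String hashKey = date + "." + n;             // e.g. "2014-07-09.3"
        futures.add(pool.submit(() -> {
            RangeKeyCondition between =
                    new RangeKeyCondition("RangeKeyAttributeName").between(timestampX, timestampY);
            List<Item> items = new ArrayList<>();
            for (Item item : table.query("HashKeyAttributeName", hashKey, between)) {
                items.add(item);                           // drain this sub-query's pages
            }
            return items;
        }));
    }
    List<Item> merged = new ArrayList<>();
    for (Future<List<Item>> f : futures) {
        merged.addAll(f.get());                            // wait for all sub-queries, then merge
    }
    pool.shutdown();
    return merged;
}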
Here is another interesting point suggesting moderate use of reads in a single partition... if you remove the random number from the hash key to be able to get all records in one shot, you are likely to run into this issue, regardless of whether you are using Scan, Query or BatchGetItem:
Guidelines for Query and Scan - Avoid Sudden Bursts of Read Activity
"Note that it is not just the burst of capacity units the Scan uses
that is a problem. It is also because the scan is likely to consume
all of its capacity units from the same partition because the scan
requests read items that are next to each other on the partition. This
means that the request is hitting the same partition, causing all of
its capacity units to be consumed, and throttling other requests to
that partition. If the request to read data had been spread across
multiple partitions, then the operation would not have throttled a
specific partition."
And lastly, because you are working with time series data, it might be helpful to look into some best practices suggested by the documentation as well:
Understand Access Patterns for Time Series Data
For each table that you create, you specify the throughput
requirements. DynamoDB allocates and reserves resources to handle your
throughput requirements with sustained low latency. When you design
your application and tables, you should consider your application's
access pattern to make the most efficient use of your table's
resources.
Suppose you design a table to track customer behavior on your site,
such as URLs that they click. You might design the table with hash and
range type primary key with Customer ID as the hash attribute and
date/time as the range attribute. In this application, customer data
grows indefinitely over time; however, the applications might show
uneven access pattern across all the items in the table where the
latest customer data is more relevant and your application might
access the latest items more frequently and as time passes these items
are less accessed, eventually the older items are rarely accessed. If
this is a known access pattern, you could take it into consideration
when designing your table schema. Instead of storing all items in a
single table, you could use multiple tables to store these items. For
example, you could create tables to store monthly or weekly data. For
the table storing data from the latest month or week, where data
access rate is high, request higher throughput and for tables storing
older data, you could dial down the throughput and save on resources.
You can save on resources by storing "hot" items in one table with
higher throughput settings, and "cold" items in another table with
lower throughput settings. You can remove old items by simply deleting
the tables. You can optionally backup these tables to other storage
options such as Amazon Simple Storage Service (Amazon S3). Deleting an
entire table is significantly more efficient than removing items
one-by-one, which essentially doubles the write throughput as you do
as many delete operations as put operations.
Source: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GuidelinesForTables.html
My Cassandra table has the following schema:
CREATE TABLE cachetable1 (
id text,
lsn text,
lst timestamp,
PRIMARY KEY ((id))
) WITH
bloom_filter_fp_chance=0.010000 AND
caching='{"keys":"ALL", "rows_per_partition":"ALL"}' AND
comment='' AND
dclocal_read_repair_chance=0.100000 AND
gc_grace_seconds=864000 AND
read_repair_chance=0.000000 AND
default_time_to_live=0 AND
speculative_retry='99.0PERCENTILE' AND
memtable_flush_period_in_ms=0 AND
compaction={'class': 'SizeTieredCompactionStrategy'} AND
compression={'sstable_compression': 'LZ4Compressor'};
The above table contains 221 million rows (approx. 16 GB of data). The CassandraDaemon is running with 4 GB of heap space and I have configured 4 GB of memory for the row cache. I am trying to run select queries from my Java code like this:
// random is a java.util.Random and sess is a com.datastax.driver.core.Session created elsewhere
List<String> ls = new ArrayList<>();
long count = 0;
for (int i = 0; i < 1000; i++)
{
    // build a batch of 101 consecutive ids, starting at a random offset
    int id = random.nextInt(20000000 - 0) + 0;
    for (int j = id; j <= id + 100; j++)
    {
        ls.add(j + "");
    }
    Statement s = QueryBuilder.select("lst", "lsn").from("ks1", "cachetable1").where(QueryBuilder.in("id", ls.toArray()));
    s.setFetchSize(100);
    ResultSet rs = sess.execute(s);
    List<Row> lsr = rs.all();
    for (Row rw : lsr)
    {
        //System.out.println(rw.toString());
        count++;
    }
    ls.clear();
}
In the above code, I am trying to fetch 0.1 million records. But the read/get performance is very bad: it takes 400-500 seconds to fetch 0.1 million rows. Is there any better way to read/get records from Cassandra through Java? Is some tuning required other than the row cache size and the Cassandra heap size?
You appear to want to retrieve your data in 100 row chunks. This sounds like a good candidate for a clustering column.
Change your schema to use an id as the partition key and a chunk index as a clustering column, i.e. PRIMARY KEY ( (id), chunk_idx ). When you insert the data, you will have to figure out how to map your single indexes into an id and chunk_idx (e.g. perhaps do a modulo 100 on one of your values to generate a chunk_idx).
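For example, a minimal sketch of that mapping, assuming your current sequential index is an integer (the names are illustrative):

// sequential index -> (partition key, clustering column), 100 rows per partition
long index = 1234567L;
String id = String.valueOf(index / 100);   // partition key: groups 100 consecutive indexes together
int chunkIdx = (int) (index % 100);        // clustering column within that partition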
Now when you query for an id and don't specify a chunk_idx, Cassandra can efficiently return all 100 rows with one disk read on the partition. And you can still do range queries and retrievals of single rows within the partition by specifying the chunk_idx if you don't always want to read a whole chunk of rows.
So your mistake is you are generating 100 random partition reads with each query, and this will hit all the nodes and require a separate disk read for each one. Remember that just because you are querying for sequential index numbers doesn't mean the data is stored close together, and with Cassandra it is exactly the opposite, where sequential partition keys are likely stored on different nodes.
The second mistake you are making is you are executing the query synchronously (i.e. you are issuing the query and waiting for the request to finish before you issue any more queries). What you want to do is use a thread pool so that you can have many queries running in parallel, or else use the executeAsync method in a single thread. Since your query is not efficient, waiting for the 100 random partition reads to complete is going to be a long wait, and a lot of the highly pipelined Cassandra capacity is going to be sitting there twiddling its thumbs waiting for something to do. If you are trying to maximize performance, you want to keep all the nodes as busy as possible.
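A hedged sketch of the asynchronous variant with the 3.x Java driver: one single-partition query per id, all in flight before any results are collected (in practice you would also cap the number of in-flight queries):

import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.Statement;
import com.datastax.driver.core.querybuilder.QueryBuilder;

import java.util.ArrayList;
import java.util.List;

static long readAsync(Session session, List<String> ids) {
    List<ResultSetFuture> futures = new ArrayList<>();
    for (String id : ids) {
        Statement s = QueryBuilder.select("lst", "lsn")
                .from("ks1", "cachetable1")
                .where(QueryBuilder.eq("id", id));   // one partition per query
        futures.add(session.executeAsync(s));        // don't wait; queue the next query immediately
    }
    long count = 0;
    for (ResultSetFuture f : futures) {
        for (Row row : f.getUninterruptibly()) {     // block only while collecting results
            count++;
        }
    }
    return count;
}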
Another thing to look into is using the TokenAwarePolicy when connecting to your cluster. This allows each query to go directly to a node that has a replica of the partition rather than to a random node that might have to act as a coordinator and get the data via an extra hop. And of course using consistency level ONE on reads is faster than higher consistency levels.
The row cache size and heap size are not the source of your problem, so that's the wrong path to go down.
I am going to guess that this is your culprit:
.where(QueryBuilder.in("id",ls.toArray()))
Use of the IN relation in the WHERE clause is widely known to be non-performant. In some cases, performing many parallel queries can be faster than using one IN query. From the DataStax SELECT documentation:
When not to use IN
...Using IN can degrade performance because usually many nodes must be
queried. For example, in a single, local data center cluster with 30
nodes, a replication factor of 3, and a consistency level of
LOCAL_QUORUM, a single key query goes out to two nodes, but if the
query uses the IN condition, the number of nodes being queried are
most likely even higher, up to 20 nodes depending on where the keys
fall in the token range.
So you have two options (assuming that living with this poor-performing query isn't one of them):
Rewrite your code to make multiple, parallel requests for each id.
Revisit your data model to see if you have another value that it makes sense to key your data by. For instance, if all of your ids in ls happen to share a common column value that is unique to them, that's a good candidate for a primary key. Basically, find another way to query all of the ids that you are looking for, and build a specific query table to support that.
I've generally implemented sequence number generation using database sequences in the past.
e.g. Using Postgres SERIAL type http://www.neilconway.org/docs/sequences/
I'm curious though as how to generate sequence numbers for large distributed systems where there is no database. Does anybody have any experience or suggestions of a best practice for achieving sequence number generation in a thread safe manner for multiple clients?
OK, this is a very old question, which I'm first seeing now.
You'll need to differentiate between sequence numbers and unique IDs that are (optionally) loosely sortable by a specific criteria (typically generation time). True sequence numbers imply knowledge of what all other workers have done, and as such require shared state. There is no easy way of doing this in a distributed, high-scale manner. You could look into things like network broadcasts, windowed ranges for each worker, and distributed hash tables for unique worker IDs, but it's a lot of work.
Unique IDs are another matter; there are several good ways of generating unique IDs in a decentralized manner:
a) You could use Twitter's Snowflake ID network service. Snowflake is a:
Networked service, i.e. you make a network call to get a unique ID;
which produces 64 bit unique IDs that are ordered by generation time;
and the service is highly scalable and (potentially) highly available; each instance can generate many thousand IDs per second, and you can run multiple instances on your LAN/WAN;
written in Scala, runs on the JVM.
b) You could generate the unique IDs on the clients themselves, using an approach derived from how UUIDs and Snowflake's IDs are made. There are multiple options, but something along the lines of the following (a code sketch follows this list):
The most significant 40 or so bits: A timestamp; the generation time of the ID. (We're using the most significant bits for the timestamp to make IDs sort-able by generation time.)
The next 14 or so bits: A per-generator counter, which each generator increments by one for each new ID generated. This ensures that IDs generated at the same moment (same timestamps) do not overlap.
The last 10 or so bits: A unique value for each generator. Using this, we don't need to do any synchronization between generators (which is extremely hard), as all generators produce non-overlapping IDs because of this value.
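A minimal sketch of such a generator with exactly those bit widths (the custom epoch and the counter-overflow handling are my assumptions):

public final class IdGenerator {
    private static final long EPOCH = 1262304000000L;  // assumed custom epoch: 2010-01-01 UTC
    private final long generatorId;                    // unique per generator, 10 bits (0..1023)
    private long lastTimestamp = -1L;
    private long counter = 0L;

    public IdGenerator(long generatorId) {
        this.generatorId = generatorId & 0x3FF;
    }

    public synchronized long nextId() {
        long now = System.currentTimeMillis() - EPOCH;  // ~40 bits of milliseconds
        if (now == lastTimestamp) {
            counter = (counter + 1) & 0x3FFF;           // 14-bit counter within one millisecond
            if (counter == 0) {                         // counter exhausted: wait for the next millisecond
                while (now <= lastTimestamp) {
                    now = System.currentTimeMillis() - EPOCH;
                }
            }
        } else {
            counter = 0;
        }
        lastTimestamp = now;
        return (now << 24) | (counter << 10) | generatorId;   // timestamp | counter | generator id
    }
}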
c) You could generate the IDs on the clients, using just a timestamp and a random value. This avoids the need to know all generators and to assign each generator a unique value. On the flip side, such IDs are not guaranteed to be globally unique; they're only very highly likely to be unique. (To collide, one or more generators would have to create the same random value at the exact same time.) Something along the lines of the following (sketched after this list):
The most significant 32 bits: Timestamp, the generation time of the ID.
The least significant 32 bits: 32-bits of randomness, generated anew for each ID.
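A minimal sketch of that layout, assuming a seconds-resolution timestamp so it fits in the high 32 bits:

import java.util.concurrent.ThreadLocalRandom;

long seconds = System.currentTimeMillis() / 1000L;                   // timestamp half
long random = ThreadLocalRandom.current().nextInt() & 0xFFFFFFFFL;   // 32 random bits
long id = (seconds << 32) | random;                                  // sortable by generation time, only probabilistically unique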
d) The easy way out, use UUIDs / GUIDs.
You could have each node have a unique ID (which you may have anyway) and then prepend that to the sequence number.
For example, node 1 generates sequence 001-00001 001-00002 001-00003 etc. and node 5 generates 005-00001 005-00002
Unique :-)
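A minimal sketch of that scheme, zero-padded to match the example format (the node id is assumed to be assigned out of band):

import java.util.concurrent.atomic.AtomicLong;

final int nodeId = 1;                        // unique per node
final AtomicLong localSeq = new AtomicLong();
String id = String.format("%03d-%05d", nodeId, localSeq.incrementAndGet());   // e.g. "001-00001"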
Alternatively, if you want some sort of centralized system, you could consider having your sequence server give out sequence numbers in blocks. This reduces the overhead significantly. For example, instead of requesting a new ID from the central server for each ID that must be assigned, you request IDs in blocks of 10,000 from the central server and then only have to make another network request when you run out.
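A hedged sketch of the client side of that block scheme; fetchBlockStart() below stands in for whatever call your central sequence server exposes (it is not a real API):

public final class BlockIdAllocator {
    private static final long BLOCK_SIZE = 10_000;

    private long next = 0;
    private long blockEnd = 0;   // exclusive upper bound of the current block

    public synchronized long nextId() {
        if (next >= blockEnd) {
            next = fetchBlockStart();          // one network round trip per 10,000 IDs
            blockEnd = next + BLOCK_SIZE;
        }
        return next++;
    }

    private long fetchBlockStart() {
        // Placeholder: ask the central sequence server to reserve the next block
        // of BLOCK_SIZE IDs and return its starting value.
        throw new UnsupportedOperationException("wire this up to your sequence server");
    }
}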
Now there are more options.
Though this question is "old", I got here, so I think it might be useful to leave the options I know of (so far):
You could try Hazelcast. In its 1.9 release it includes a distributed implementation of java.util.concurrent.atomic.AtomicLong.
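A minimal sketch of that Hazelcast approach (classic pre-CP-subsystem API shown; in Hazelcast 4+ the same object is obtained via hz.getCPSubsystem().getAtomicLong(...)):

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IAtomicLong;

HazelcastInstance hz = Hazelcast.newHazelcastInstance();
IAtomicLong sequence = hz.getAtomicLong("my-sequence");   // cluster-wide counter
long next = sequence.incrementAndGet();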
You can also use ZooKeeper. It provides methods for creating sequence nodes (appended to znode names), though I prefer using the version numbers of the nodes. Be careful with this one though: if you don't want gaps (missed numbers) in your sequence, it may not be what you want.
Cheers
It can be done with Redisson. It implements a distributed and scalable version of AtomicLong. Here is an example:
Config config = new Config();
config.addAddress("some.server.com:8291");
Redisson redisson = Redisson.create(config);
RAtomicLong atomicLong = redisson.getAtomicLong("anyAtomicLong");
atomicLong.incrementAndGet();
If it really has to be globally sequential, and not simply unique, then I would consider creating a single, simple service for dispensing these numbers.
Distributed systems rely on lots of little services interacting, and for this simple kind of task, do you really need or would you really benefit from some other complex, distributed solution?
There are a few strategies, but none that I know of can be really distributed and give a real sequence.
Have a central number generator. It doesn't have to be a big database. memcached has a fast atomic counter; in the vast majority of cases it's fast enough for your entire cluster.
Separate an integer range for each node (like Steven Schlanskter's answer).
Use random numbers or UUIDs.
Use some piece of data, together with the node's ID, and hash it all (or HMAC it).
Personally, I'd lean towards UUIDs, or memcached if I want to have a mostly-contiguous space.
Why not use a (thread safe) UUID generator?
I should probably expand on this.
UUIDs are guaranteed to be globally unique (if you avoid the ones based on random numbers, where the uniqueness is just highly probable).
Your "distributed" requirement is met, regardless of how many UUID generators you use, by the global uniqueness of each UUID.
Your "thread safe" requirement can be met by choosing "thread safe" UUID generators.
Your "sequence number" requirement is assumed to be met by the guaranteed global uniqueness of each UUID.
Note that many database sequence number implementations (e.g. Oracle) do not guarantee either monotonically increasing, or (even) increasing sequence numbers (on a per "connection" basis). This is because a consecutive batch of sequence numbers gets allocated in "cached" blocks on a per-connection basis. This guarantees global uniqueness and maintains adequate speed. But the sequence numbers actually allocated (over time) can be jumbled when they are being allocated by multiple connections!
Distributed ID generation can be achieved with Redis and Lua. The implementation is available on GitHub. It produces distributed and k-sortable unique IDs.
I know this is an old question, but we were also facing the same need and were unable to find a solution that fulfilled it.
Our requirement was to get a unique sequence (0, 1, 2, 3... n) of ids, and hence Snowflake did not help.
We created our own system to generate the ids using Redis. Redis is single-threaded, hence its list/queue mechanism would always give us 1 pop at a time.
What we do is create a buffer of ids. Initially, the queue will have 0 to 20 ids that are ready to be dispatched when requested. Multiple clients can request an id and Redis will pop 1 id at a time. After every pop from the left, we insert BUFFER + currentId to the right, which keeps the buffer list going. Implementation here.
I have written a simple service which can generate semi-unique, non-sequential 64-bit long numbers. It can be deployed on multiple machines for redundancy and scalability. It uses ZeroMQ for messaging. For more information on how it works, look at the GitHub page: zUID
Using a database, you can reach 1,000+ increments per second with a single core. It is pretty easy. You can use its own database as a backend to generate that number (as it should be its own aggregate, in DDD terms).
I had what seems a similar problem. I had several partitions and I wanted to get an offset counter for each one. I implemented something like this:
CREATE DATABASE example;
USE example;
CREATE TABLE offsets (partition INTEGER, offset BIGINT, PRIMARY KEY (partition));
INSERT offsets VALUES (1,0);
Then executed the following statement:
SELECT @offset := offset FROM offsets WHERE partition=1 FOR UPDATE;
UPDATE offsets SET offset=@offset+1 WHERE partition=1;
If your application allows you, you can allocate a block at once (that was my case).
SELECT @offset := offset FROM offsets WHERE partition=1 FOR UPDATE;
UPDATE offsets SET offset=@offset+100 WHERE partition=1;
If you need further throughput and cannot allocate offsets in advance, you can implement your own service using Flink for real-time processing. I was able to get around 100K increments per partition.
Hope it helps!
The problem is similar to:
In the iSCSI world, each LUN/volume has to be uniquely identifiable by the initiators running on the client side.
The iSCSI standard says that the first few bits have to represent the storage provider/manufacturer information, and the rest must be monotonically increasing.
Similarly, one can use the initial bits in a distributed system of nodes to represent the node ID, and the rest can be monotonically increasing.
One solution that is decent is to use time-based generation of long values.
It can be done with the backing of a distributed database.
My two cents for gcloud: using a storage file.
Implemented as a cloud function; it can easily be converted to a library.
https://github.com/zaky/sequential-counter