"Atomically" update an entire array

I have a single writer thread and a single reader thread that update and process a pool of arrays (references stored in a map). The ratio of writes to reads is almost 5:1 (write latency is a concern).
The writer thread needs to update a few elements of an array in the pool based on some events. The entire write operation (all elements) needs to be atomic.
I want to ensure that the reader thread reads the previously updated array if the writer thread is updating it (something like volatile, but on the entire array rather than on individual fields). Basically, I can afford to read stale values but not to block.
Also, since the writes are so frequent, it would be really expensive to create new objects or lock the entire array while read/write.
Is there a more efficient data structure that could be used, or are there cheaper locks?

How about this idea: The writer thread does not mutate the array. It simply queues the updates.
The reader thread, whenever it enters a read session that requires a stable snapshot of the array, applies the queued updates to the array, then reads the array.
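class Update
{
    final int position;
    final Object value;
    Update(int position, Object value)
    {
        this.position = position;
        this.value = value;
    }
}
// Unbounded, non-blocking queue. (An ArrayBlockingQueue with capacity Integer.MAX_VALUE
// would try to allocate its whole backing array up front.)
ConcurrentLinkedQueue<Update> updates = new ConcurrentLinkedQueue<>();
void write()
{
    updates.offer(new Update(...));
}
Object[] read()
{
    Update update;
    while((update=updates.poll())!=null)
        array[update.position] = update.value;
    return array;
}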

Is there a more efficient data structure?
Yes, absolutely! They're called persistent data structures. They can represent a new version of a vector, map, etc. merely by storing the differences with respect to a previous version. All versions are immutable, which makes them appropriate for concurrency (writers don't interfere with or block readers, and vice versa).
In order to express change, one stores references to a persistent data structure in a reference type such as AtomicReference, and changes what those references point to, not the structures themselves.
Clojure provides a top-notch implementation of persistent data structures. They're written in pure, efficient Java.
The following program shows how one would approach your described problem using persistent data structures.
import java.util.HashMap;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.atomic.AtomicReference;
import clojure.lang.IPersistentVector;
import clojure.lang.PersistentVector;
public class AtomicArrayUpdates {
    public static Map<Integer, AtomicReference<IPersistentVector>> pool
        = new HashMap<>();
    public static Random rnd = new Random();
    public static final int SIZE = 60000;
    // For simulating the reads/writes ratio
    public static final int SLEEP_TIME = 5;
    static {
        for (int i = 0; i < SIZE; i++) {
            pool.put(i, new AtomicReference<IPersistentVector>(PersistentVector.EMPTY));
        }
    }
    public static class Writer implements Runnable {
        @Override public void run() {
            while (true) {
                try {
                    Thread.sleep(SLEEP_TIME);
                } catch (InterruptedException e) {}
                int index = rnd.nextInt(SIZE);
                IPersistentVector vec = pool.get(index).get();
                // note how we repeatedly assign vec to a new value
                // cons() means "append a value".
                vec = vec.cons(rnd.nextInt(SIZE + 1));
                // assocN(): "update" at index 0
                vec = vec.assocN(0, 42);
                // appended values are nonsense, just an example!
                vec = vec.cons(rnd.nextInt(SIZE + 1));
                pool.get(index).set(vec);
            }
        }
    }
    public static class Reader implements Runnable {
        @Override public void run() {
            while (true) {
                try {
                    Thread.sleep(SLEEP_TIME * 5);
                } catch (InterruptedException e) {}
                IPersistentVector vec = pool.get(rnd.nextInt(SIZE)).get();
                // Now you can do whatever you want with vec.
                // nothing can mutate it, and reading it doesn't block writers!
            }
        }
    }
    public static void main(String[] args) {
        new Thread(new Writer()).start();
        new Thread(new Reader()).start();
    }
}

Another idea, given that the array contains only 20 doubles.
Have two arrays, one for write, one for read.
Reader locks the read array during read.
read()
lock();
read stuff
unlock();
Writer first modifies the write array, then tryLock the read array, if locking fails, fine, write() returns; if locking succeeds, copy the write array to the read array, then release the lock.
write()
update write array
if tryLock()
copy write array to read array
unlock()
Reader can be blocked, but only for the time it takes to copy the 20 doubles, which is short.
Reader should use spin lock, like do{}while(tryLock()==false); to avoid being suspended.
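The pseudocode above could be turned into Java roughly as follows. This is only a sketch under the question's single-writer, single-reader assumption; the class name TwoArrayBuffer is made up, and returning a copy from read() is one possible way to "read stuff" while holding the lock. Thread.onSpinWait() requires Java 9 or later.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical class name; sketch of the two-array scheme for one writer and one reader.
class TwoArrayBuffer {
    private final double[] writeArray;   // touched only by the writer thread
    private final double[] readArray;    // guarded by readLock
    private final ReentrantLock readLock = new ReentrantLock();

    TwoArrayBuffer(int length) {
        writeArray = new double[length];
        readArray = new double[length];
    }

    // Writer thread only. Never blocks: if the reader holds the lock,
    // the change stays in writeArray and is published by a later write.
    boolean write(int index, double value) {
        writeArray[index] = value;
        if (!readLock.tryLock()) {
            return false;
        }
        try {
            System.arraycopy(writeArray, 0, readArray, 0, writeArray.length);
            return true;
        } finally {
            readLock.unlock();
        }
    }

    // Reader thread. Spins instead of parking, then copies a consistent snapshot.
    double[] read() {
        while (!readLock.tryLock()) {
            Thread.onSpinWait();
        }
        try {
            return readArray.clone();
        } finally {
            readLock.unlock();
        }
    }
}
```

Note that a write which fails to take the lock is not lost, it simply becomes visible to the reader together with the next successful write.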

I would do as follows:
synchronize the whole thing and see if the performance is good enough. Considering you only have one writer thread and one reader thread, contention will be low and this could work well enough
private final Map<Key, double[]> map = new HashMap<> ();
public synchronized void write(Key key, double value, int index) {
double[] array = map.get(key);
array[index] = value;
}
public synchronized double[] read(Key key) {
return map.get(key);
}
if it is too slow, I would have the writer make a copy of the array, change some values and put the new array back into the map. Note that array copies are very fast: copying a 20-item array would most likely take less than 100 nanoseconds
//If all the keys and arrays are constructed before the writer/reader threads
//start, no need for a ConcurrentMap - otherwise use a ConcurrentMap
private final Map<Key, AtomicReference<double[]>> map = new HashMap<> ();
public void write(Key key, double value, int index) {
AtomicReference<double[]> ref = map.get(key);
double[] oldArray = ref.get();
double[] newArray = oldArray.clone();
newArray[index] = value;
//you might want to check the return value to see if it worked
//or you might just skip the update if another writes was performed
//in the meantime
ref.compareAndSet(oldArray, newArray);
}
public double[] read(Key key) {
return map.get(key).get(); //check for null
}
since the writes are so frequent, it would be really expensive to create new objects or lock the entire array while read/write.
How frequent? Unless there are hundreds of them every millisecond you should be fine.
Also note that:
object creation is fairly cheap in Java (think around 10 CPU cycles = a few nanoseconds)
garbage collection of short lived object is generally free (as long as the object stays in the young generation, if it is unreachable it is not visited by the GC)
whereas long lived objects have a GC performance impact because they need to be copied across to the old generation

The following variation is inspired by both my previous answer and one of zhong.j.yu's.
Writers don't interfere/block readers and vice versa, and there are no thread safety/visibility issues, or delicate reasoning going on.
public class V2 {
    // double[] throughout, so clone() copies the values themselves
    static Map<Integer, AtomicReference<double[]>> committed = new HashMap<>();
    static Random rnd = new Random();
    static class Writer {
        private Map<Integer, double[]> writeable = new HashMap<>();
        void write() {
            int i = rnd.nextInt(writeable.size());
            // manipulate writeable.get(i)...
            committed.get(i).set(writeable.get(i).clone());
        }
    }
    static class Reader {
        void read() {
            double[] arr = committed.get(rnd.nextInt(committed.size())).get();
            // do something useful with arr...
        }
    }
}

You need two static references: readArray and writeArray and a simple mutex to track when write has been changed.
have a locked function called changeWriteArray make changes to a deepCopy of writeArray:
synchronized String[] changeWriteArray(String[] writeArrayCopy, other params go here){
// here make changes to deepCopy of writeArray
//then return deepCopy
return writeArrayCopy;
}
Notice that changeWriteArray is functional programming with effectively no side effect since it is returning a copy that is neither readArray nor writeArray.
Whoever calls changeWriteArray must call it as writeArray = changeWriteArray(writeArray.deepCopy()).
the mutex is changed by both changeWriteArray and updateReadArray but is only checked by updateReadArray. If the mutex is set, updateReadArray will simply point the reference of readArray to the actual block of writeArray
EDIT:
@vemv, concerning the answer you mentioned: while the ideas are the same, the difference is significant. The two references are static so that no time is spent actually copying the changes into the readArray; rather, the pointer of readArray is moved to point to writeArray. Effectively we are swapping by means of a tmp array that changeWriteArray generates as necessary. Also, the locking here is minimal, as reading does not require locking, in the sense that you can have more than one reader at any given time.
In fact, with this approach, you can keep a count of concurrent readers and check the counter to be zero for when to update readArray with writeArray; again, furthering that reading requires no lock at all.

Improving on @zhong.j.yu's answer: it is really a good idea to queue the writes instead of trying to perform them when they occur. However, we must tackle the problem that updates may come in so fast that the reader would choke on them. My idea: what if the reader only performs the writes that were queued before the read and ignores subsequent writes (those would be handled by the next read)?
You will need to write your own synchronized queue. It will be based on a linked list and would contain only two methods:
public synchronized void enqueue(Write write);
This method will atomically enqueue a write. There is a possible bottleneck when writes arrive faster than they can be enqueued, but I think there would have to be hundreds of thousands of writes every second to get there.
public synchronized Element cut();
This will atomically empty the queue and return its head (or tail) as the Element object. It will contain a chain of other Elements (Element.next, etc., just the usual linked-list stuff), all of them representing the chain of writes since the last read. The queue would then be empty, ready to accept new writes. The reader can then walk the Element chain (which will be standalone by then, untouched by subsequent writes), perform the writes, and finally perform the read. While the reader processes the read, new writes would be enqueued in the queue, but those will be the next read's problem.
I wrote this once, albeit in C++, to represent a sound data buffer. There were more writes (driver sends more data), than reads (some mathematical stuff over the data), while the writes had to finish as soon as possible. (The data came in real-time, so I needed to save them before next batch was ready in the driver.)
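A Java version of that queue might look like the sketch below. The names WriteQueue, Element and drainInto are hypothetical, and a write is simplified to an (index, value) pair on a double[]; the original C++ code is not shown in this answer, so this is an illustration of the idea rather than a port.

```java
// Hypothetical names; sketch of a queue with only enqueue() and cut().
class WriteQueue {
    static final class Element {
        final int position;
        final double value;
        Element next;   // only modified while holding the queue's monitor

        Element(int position, double value) {
            this.position = position;
            this.value = value;
        }
    }

    private Element head;
    private Element tail;

    // Writer side: append one pending write.
    synchronized void enqueue(int position, double value) {
        Element e = new Element(position, value);
        if (tail == null) {
            head = e;
        } else {
            tail.next = e;
        }
        tail = e;
    }

    // Reader side: detach the whole chain; the queue is empty afterwards.
    synchronized Element cut() {
        Element chain = head;
        head = null;
        tail = null;
        return chain;
    }

    // Reader side: apply everything queued so far, in arrival order.
    void drainInto(double[] array) {
        for (Element e = cut(); e != null; e = e.next) {
            array[e.position] = e.value;
        }
    }
}
```

Because cut() resets both head and tail, the writer never touches the detached chain again, so the reader can walk it without holding any lock.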

I've got a funny solution using three arrays and a volatile boolean toggle. Basically, each thread has its own array. Additionally, there's a shared array controlled via the toggle.
When the writer finishes and the toggle allows it, it copies the newly written array into the shared array and flips the toggle.
Similarly, before the reader starts, when the toggle allows it, it copies the shared array into its own array and flips the toggle.
public class MolecularArray {
private final double[] writeArray;
private final double[] sharedArray;
private final double[] readArray;
private volatile boolean writerOwnsShared;
MolecularArray(int length) {
writeArray = new double[length];
sharedArray = new double[length];
readArray = new double[length];
}
void read(Consumer<double[]> reader) {
if (!writerOwnsShared) {
copyFromTo(sharedArray, readArray);
writerOwnsShared = true;
}
reader.accept(readArray);
}
void write(Consumer<double[]> writer) {
writer.accept(writeArray);
if (writerOwnsShared) {
copyFromTo(writeArray, sharedArray);
writerOwnsShared = false;
}
}
private void copyFromTo(double[] from, double[] to) {
System.arraycopy(from, 0, to, 0, from.length);
}
}
It depends on the "single writer thread and single reader" assumption.
It never blocks.
It uses a constant (albeit huge) amount of memory.
Repeated calls to read without any intervening write do no copying and vice versa.
The reader does not necessarily see the most recent data, but it sees the data from the first write started after the previous read, if any.
I guess, this could be improved using two shared arrays.

Related

Concurrency of RandomAccessFile in Java

I am creating a RandomAccessFile object to write to a file (on SSD) by multiple threads. Each thread tries to write a direct byte buffer at a specific position within the file and I ensure that the position at which a thread writes won't overlap with another thread:
file_.getChannel().write(buffer, position);
where file_ is an instance of RandomAccessFile and buffer is a direct byte buffer.
For the RandomAccessFile object, since I'm not using fallocate to allocate the file, and the file's length is changing, will this utilize the concurrency of the underlying media?
If it is not, is there any point in using the above function without calling fallocate while creating the file?

I made some testing with the following code :
public class App {
public static CountDownLatch latch;
public static void main(String[] args) throws InterruptedException, IOException {
File f = new File("test.txt");
RandomAccessFile file = new RandomAccessFile("test.txt", "rw");
latch = new CountDownLatch(5);
for (int i = 0; i < 5; i++) {
Thread t = new Thread(new WritingThread(i, (long) i * 10, file.getChannel()));
t.start();
}
latch.await();
file.close();
InputStream fileR = new FileInputStream("test.txt");
byte[] bytes = IOUtils.toByteArray(fileR);
for (int i = 0; i < bytes.length; i++) {
System.out.println(bytes[i]);
}
}
public static class WritingThread implements Runnable {
private long startPosition = 0;
private FileChannel channel;
private int id;
public WritingThread(int id, long startPosition, FileChannel channel) {
super();
this.startPosition = startPosition;
this.channel = channel;
this.id = id;
}
private ByteBuffer generateStaticBytes() {
ByteBuffer buf = ByteBuffer.allocate(10);
byte[] b = new byte[10];
for (int i = 0; i < 10; i++) {
b[i] = (byte) (this.id * 10 + i);
}
buf.put(b);
buf.flip();
return buf;
}
@Override
public void run() {
Random r = new Random();
while (r.nextInt(100) != 50) {
try {
System.out.println("Thread " + id + " is Writing");
this.channel.write(this.generateStaticBytes(), this.startPosition);
this.startPosition += 10;
} catch (IOException e) {
e.printStackTrace();
}
}
latch.countDown();
}
}
}
So far what I've seen:
Windows 7 (NTFS partition): Run linearly (aka one thread writes and when it is over, another one gets to run)
Linux Parrot 4.8.15 (ext4 partition) (Debian based distro), with Linux Kernel 4.8.0: Threads intermingle during the execution
Again as the documentation says:
File channels are safe for use by multiple concurrent threads. The
close method may be invoked at any time, as specified by the Channel
interface. Only one operation that involves the channel's position or
can change its file's size may be in progress at any given time;
attempts to initiate a second such operation while the first is still
in progress will block until the first operation completes. Other
operations, in particular those that take an explicit position, may
proceed concurrently; whether they in fact do so is dependent upon the
underlying implementation and is therefore unspecified.
So I'd suggest first giving it a try to see whether the OS(es) you are going to deploy your code to (and possibly the filesystem type) support parallel execution of a FileChannel.write call.
Edit: As pointed out, the above does not mean that threads can write concurrently to the file. It is actually the opposite, as the write call behaves according to the contract of WritableByteChannel, which clearly specifies that only one thread at a time can write to a given channel:
If one thread initiates a write operation upon a channel then any
other thread that attempts to initiate another write operation will
block until the first operation is complete

As the documentation states and Adonis already mentioned, a write can only be performed by one thread at a time. You won't achieve performance gains through concurrency. Moreover, you should only worry about performance if it's an actual issue, because writing concurrently to a disk may actually degrade your performance (probably less for SSDs than for HDDs).
The underlying media is in most cases (SSD, HDD, network) single-threaded. Actually, there is no such thing as a thread at the hardware level; threads are nothing but an abstraction.
In your case the media is an SSD.
While the SSD may internally write data to multiple modules concurrently (they may reach a level of parallelism where writes are as fast as reads or even outperform them), the internal mapping data structures are a shared resource and therefore contended, especially on frequent updates such as concurrent writes. Nevertheless, updates of this data structure are quite fast and therefore nothing to worry about unless they become a problem.
But apart from this, those are just internals of the SSD. On the outside you communicate over a Serial ATA interface, thus one byte at a time (actually packets in a Frame Information Structure, FIS). On top of this is an OS/filesystem that again probably has a contended data structure and/or applies its own means of optimization, such as write-behind caching.
Further, since you know what your media is, you may optimize especially for that, and SSDs are really fast when one single thread writes a large piece of data.
Thus, instead of using multiple threads for writing, you may create a large in-memory buffer (consider a memory-mapped file) and write concurrently into this buffer. The memory itself is not contended, as long as you ensure each thread accesses its own region of the buffer. Once all threads are done, you write this one buffer to the SSD (not needed if using a memory-mapped file).
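A minimal sketch of the memory-mapped variant, assuming a fixed slice size per thread and a temporary file; the class and method names are made up for illustration. Each thread gets its own duplicate() view of the mapping, limited to its own region, so the threads never share a position or overlap.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; not taken from the question's code.
class MappedParallelWrite {
    // Maps a temp file of threads * sliceSize bytes, lets thread t fill its own
    // slice with the byte value t, and returns the resulting file contents.
    static byte[] demo(int threads, int sliceSize) {
        try {
            Path path = Files.createTempFile("mapped-demo", ".bin");
            path.toFile().deleteOnExit();
            try (FileChannel ch = FileChannel.open(path,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                MappedByteBuffer map =
                    ch.map(FileChannel.MapMode.READ_WRITE, 0, (long) threads * sliceSize);
                Thread[] workers = new Thread[threads];
                for (int t = 0; t < threads; t++) {
                    // duplicate() shares the mapped memory but has its own position/limit
                    ByteBuffer view = map.duplicate();
                    view.position(t * sliceSize);
                    view.limit((t + 1) * sliceSize);
                    byte fill = (byte) t;
                    workers[t] = new Thread(() -> {
                        while (view.hasRemaining()) {
                            view.put(fill);
                        }
                    });
                    workers[t].start();
                }
                for (Thread w : workers) {
                    w.join();
                }
                map.force(); // ask the OS to flush the mapped pages to the device
            }
            return Files.readAllBytes(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Whether this is actually faster than positional FileChannel.write calls depends on the OS and filesystem, so it is worth measuring on the target system.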
See also this good summary about developing for SSDs:
A Summary – What every programmer should know about solid-state drives
The point of pre-allocation (or, to be more precise, file_.setLength(), which actually maps to ftruncate) is that resizing the file may use extra cycles, and you may want to avoid that. But again, this may depend on the OS/filesystem.

Volatile arrays and memory barriers and visibility in Java

I am having difficulties understanding memory barriers and cache coherence in Java, and how these concepts relate to arrays.
I have the following scenario, where one thread modifies an array (both the reference to it and one of its internal values) and another thread reads from it.
int[] integers;
volatile boolean memoryBarrier;
public void resizeAndAddLast(int value) {
integers = Arrays.copyOf(integers, integers.length + 1);
integers[integers.length - 1] = value;
memoryBarrier = true;
}
public int read(int index) {
boolean memoryBarrier = this.memoryBarrier;
return integers[index];
}
My question is, does this do what I think it does, i.e. does "publishing" to memoryBarrier and subsequently reading the variable force a cache-coherence action and make sure that the reader thread will indeed get both the latest array reference and the correct underlying value at the specified index?
My understanding is that the array reference does not have to be declared volatile, it should be enough to force a cache-coherence action using any volatile field. Is this reasoning correct?
EDIT: there is precisely one writer thread and many reader threads.

Nope, your code is thread-unsafe. A variation which would make it safe is as follows:
void raiseFlag() {
if (memoryBarrier == true)
throw new IllegalStateException("Flag already raised");
memoryBarrier = true;
}
public int read(int index) {
if (memoryBarrier == false)
throw new IllegalStateException("Flag not raised yet");
return integers[index];
}
You only get to raise the flag once and you don't get to publish more than one integers array. This would be quite useless for your use case, though.
Now, as to the why... You do not guarantee that between the first and second line of read() there wasn't an intervening write to integers which was observed by the second line. The lack of a memory barrier does not prevent another thread from observing an action. It makes the result unspecified.
There is a simple idiom that would make your code thread-safe (specialized for the assumption that a single thread calls resizeAndAddLast, otherwise more code is necessary and an AtomicReference):
volatile int[] integers;
public void resizeAndAddLast(int value) {
int[] copy = Arrays.copyOf(integers, integers.length + 1);
copy[copy.length - 1] = value;
integers = copy;
}
public int read(int index) {
return integers[index];
}
In this code you never touch an array once it got published, therefore whatever you dereference from read will be observed as intended, with the index updated.

There are multiple reasons why it won't work in general:
Java doesn't say anything about memory barriers or about the ordering of unrelated variables. Global memory barriers are a side effect of x86.
Even with global memory barriers: the write order of the array reference and the indexed array value is undefined. It is guaranteed that both happen before the memory barrier, but in which order? An unsynchronized read may see the reference but not the array value. Your read barrier doesn't help here in case of multiple reads/writes.
Beware of arrays of references: visibility of referenced values requires special attention.
A slightly better approach would be to declare the array itself as volatile and treat its values as immutable:
volatile int[] integers; // volatile (or maybe better AtomicReference)
public void resizeAndAddLast(int value) {
// enforce exactly one volatile read!
int[] copy = integers;
copy = Arrays.copyOf(copy, copy.length + 1);
copy[copy.length - 1] = value;
// may lose concurrent updates. Add synchronization or a compareExchange-loop!
integers = copy;
}
public int read(int index) {
return integers[index];
}
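If more than one thread may append, the compare-and-set loop hinted at in the comment could look like the following sketch. GrowableIntArray and size() are hypothetical names; the published array is still never modified after publication, and a writer that loses the race simply retries on the newer array.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical class name; copy-on-write int array that tolerates several writers.
class GrowableIntArray {
    private final AtomicReference<int[]> integers = new AtomicReference<>(new int[0]);

    void resizeAndAddLast(int value) {
        while (true) {
            int[] current = integers.get();
            int[] copy = Arrays.copyOf(current, current.length + 1);
            copy[copy.length - 1] = value;
            // Succeeds only if nobody replaced the array since we read it.
            if (integers.compareAndSet(current, copy)) {
                return;
            }
        }
    }

    int read(int index) {
        return integers.get()[index];
    }

    int size() {
        return integers.get().length;
    }
}
```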

Unless you declare a variable volatile, there is no guarantee that the reading thread will get the correct value. Volatile guarantees that a change to the variable is visible to other threads; informally, it is read from and written to main memory instead of a CPU cache.
You will also need synchronization so that the reading thread does not read before the write is complete. Is there any reason for using an array rather than an ArrayList, given that you are already using Arrays.copyOf and resizing?

ReentrantReadWriteLock multiple reading threads

Good Day
I have a question relating to ReentrantReadWriteLock. I am trying to solve a problem where multiple reader threads should be able to operate in parallel on a data structure, while one writer thread can only operate alone (while no reader thread is active). I am implementing this with ReentrantReadWriteLock in Java; however, from time measurements it seems that the reader threads are locking each other out as well. I don't think this is supposed to happen, so I am wondering if I implemented it wrong. The way I implemented it is as follows:
readingMethod(){
lock.readLock().lock();
do reading ...
lock.readLock().unlock();
}
writingMethod(){
lock.writeLock().lock();
do writing ...
lock.writeLock().unlock();
}
Where the reading method is called by many different threads. From measuring the time, the reading method is being executed sequentially, even if the writing method is never invoked! Any Idea on what is going wrong here? Thank you in advance -Cheers
EDIT: I tried to come up with a SSCCE, I hope this is clear:
public class Bank {
private Integer[] accounts;
public ReadWriteLock lock = new ReentrantReadWriteLock();
// Multiple Threads are doing transactions.
public void transfer(int from, int to, int amount){
lock.readLock().lock(); // Locking read.
// Consider this the do-reading.
synchronized(accounts[from]){
accounts[from] -= amount;
}
synchronized(accounts[to]){
accounts[to] += amount;
}
lock.readLock().unlock(); // Unlocking read.
}
// Only one thread does summation.
public int totalMoney(){
lock.writeLock().lock(); // Locking write.
// Consider this the do-writing.
int sum = 0;
for(int i = 0; i < accounts.length; i++){
synchronized(accounts[i]){
sum += accounts[i];
}
}
lock.writeLock().unlock(); // Unlocking write.
return sum;
}}
I know the parts inside the read-Lock are not actually reads but writes. I did it this way because there are multiple threads performing writes, while only one thread performs reads, but while reading, no changes can be made to the array. This works in my understanding. And again, the code inside the read-Locks works fine with multiple threads, as long as no write method and no read-locks are added.

Your code is so horribly broken that you should not worry about any performance implication. Your code is not thread safe. Never synchronize on a mutable variable!
synchronized(accounts[from]){
accounts[from] -= amount;
}
This code does the following:
read the contents of the array accounts at position from without any synchronization, thus possibly reading a hopelessly outdated value, or a value just being written by a thread still inside its synchronized block
lock on whatever object it has read (keep in mind that the identity of Integer objects created by auto-boxing is unspecified [except for the -128 to +127 range])
read again the contents of the array accounts at position from
subtract amount from its int value, auto-box the result (yielding a different object in most cases)
store the new object in array accounts at position from
This implies that different threads can write to the same array position concurrently while having a lock on different Integer instances found on their first (unsynchronized) read, opening the possibility of data races.
It also implies that threads may block each other on different array positions if these positions happen to hold the same value and that value happens to be represented by the same instance. E.g. pre-initializing the array with zero values (or all with the same value within the range -128 to +127) is a good recipe for getting close to single-thread performance, as zero (or these other small values) is one of the few Integer values guaranteed to be represented by the same instance. Since you didn’t experience NullPointerExceptions, you obviously have pre-initialized the array with something.
To summarize, synchronized works on object instances, not variables. That’s why it won’t compile when trying to do it on int variables. Since synchronizing on different objects is like not having any synchronization at all, you should never synchronize on mutable variables.
If you want thread-safe, concurrent access to the different accounts, you may use AtomicIntegers. Such a solution will use exactly one AtomicInteger instance per account which will never change. Only its balance value will be updated using its thread-safe methods.
public class Bank {
private final AtomicInteger[] accounts;
public final ReadWriteLock lock = new ReentrantReadWriteLock();
Bank(int numAccounts) {
// initialize, keep in mind that this array MUST NOT change
accounts=new AtomicInteger[numAccounts];
for(int i=0; i<numAccounts; i++) accounts[i]=new AtomicInteger();
}
// Multiple Threads are doing transactions.
public void transfer(int from, int to, int amount){
final Lock sharedLock = lock.readLock();
sharedLock.lock();
try {
accounts[from].addAndGet(-amount);
accounts[to ].addAndGet(+amount);
}
finally {
sharedLock.unlock();
}
}
// Only one thread does summation.
public int totalMoney(){
int sum = 0;
final Lock exclusiveLock = lock.writeLock();
exclusiveLock.lock();
try {
for(AtomicInteger account: accounts)
sum += account.get();
}
finally {
exclusiveLock.unlock();
}
return sum;
}
}
For completeness, as I guess this question will arise, here is how a withdraw process forbidding taking more money than available may look like:
static void safeWithdraw(AtomicInteger account, int amount) {
for(;;) {
int current=account.get();
if(amount>current) throw new IllegalStateException();
if(account.compareAndSet(current, current-amount)) return;
}
}
It may be included by replacing the line accounts[from].addAndGet(-amount); by safeWithdraw(accounts[from], amount);.
Well after writing the example above, I remembered that there is the class AtomicIntegerArray which fits even better to this kind of task…
private final AtomicIntegerArray accounts;
public final ReadWriteLock lock = new ReentrantReadWriteLock();
Bank(int numAccounts) {
accounts=new AtomicIntegerArray(numAccounts);
}
// Multiple Threads are doing transactions.
public void transfer(int from, int to, int amount){
final Lock sharedLock = lock.readLock();
sharedLock.lock();
try {
accounts.addAndGet(from, -amount);
accounts.addAndGet(to, +amount);
}
finally {
sharedLock.unlock();
}
}
// Only one thread does summation.
public int totalMoney(){
int sum = 0;
final Lock exclusiveLock = lock.writeLock();
exclusiveLock.lock();
try {
for(int ix=0, num=accounts.length(); ix<num; ix++)
sum += accounts.get(ix);
}
finally {
exclusiveLock.unlock();
}
return sum;
}

You can run 2 threads on this test
static ReadWriteLock l = new ReentrantReadWriteLock();
static void readMethod() {
l.readLock().lock();
System.out.println(Thread.currentThread() + " entered");
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
e.printStackTrace();
}
l.readLock().unlock();
System.out.println(Thread.currentThread() + " exited");
}
and see if both threads enter the readlock.
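If you prefer a check that does not rely on reading the console output, something like the sketch below could be used. The class name, the latch and the two-second timeout are assumptions for illustration: each reader counts down a latch while holding the read lock and then waits for the others, so the method can only return true if all readers were inside the lock at the same time.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical helper; verifies that several threads can hold the read lock together.
class ReadLockOverlapCheck {
    static boolean readersOverlap(int readers) {
        ReadWriteLock lock = new ReentrantReadWriteLock();
        CountDownLatch allInside = new CountDownLatch(readers);
        boolean[] sawOthers = new boolean[readers];
        Thread[] threads = new Thread[readers];
        for (int i = 0; i < readers; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                lock.readLock().lock();
                try {
                    allInside.countDown();
                    // Reaches zero only if every reader is inside the lock right now.
                    sawOthers[id] = allInside.await(2, TimeUnit.SECONDS);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    lock.readLock().unlock();
                }
            });
            threads[i].start();
        }
        boolean all = true;
        for (int i = 0; i < readers; i++) {
            try {
                threads[i].join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            all &= sawOthers[i];
        }
        return all;
    }
}
```

If the readers were serialized, the first one would time out waiting for the latch and the method would return false.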

How to modify a 2D array from multiple threads without synchronization

How to design a 2D array in java in such a way that it allows multiple threads to modify or insert value at a particular position without using synchronization

Well, you cannot do it without synchronization. The only thing you can do is reduce the scope of the lock used. If you don't know it yet, read about lock striping. ConcurrentHashMap uses this kind of fine-grained locking. The idea is that instead of locking the whole array against modifications, you lock just a segment of the array (e.g. start, middle or end) where the modification would happen. This keeps the array data structure open for reads and writes by other threads, and no blocking happens unless two threads try to mutate the same segment simultaneously.
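The idea of locking only one segment (usually called lock striping) can be sketched as follows. Using one lock per row is just an assumption made for this example; any partition of the cells into segments works the same way, and StripedIntMatrix is a made-up name.

```java
// Hypothetical class; a 2D int array guarded by one lock per row.
class StripedIntMatrix {
    private final int[][] cells;
    private final Object[] rowLocks;

    StripedIntMatrix(int rows, int cols) {
        cells = new int[rows][cols];
        rowLocks = new Object[rows];
        for (int r = 0; r < rows; r++) {
            rowLocks[r] = new Object();
        }
    }

    void set(int row, int col, int value) {
        synchronized (rowLocks[row]) {   // threads working on other rows are not blocked
            cells[row][col] = value;
        }
    }

    int get(int row, int col) {
        synchronized (rowLocks[row]) {
            return cells[row][col];
        }
    }

    // Compound read-modify-write, atomic with respect to the same row.
    int addAndGet(int row, int col, int delta) {
        synchronized (rowLocks[row]) {
            cells[row][col] += delta;
            return cells[row][col];
        }
    }
}
```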

There are several ways to do it.
Firstly, if you want maximum concurrency as a common library regardless of how many threads are accessing the matrix, and your int[][] is reasonably small, you can do it like this.
Set up a matrix of AtomicBoolean with the same dimensions as the int[][].
When you want to update the int matrix, lock the corresponding AtomicBoolean first, then use Unsafe.putIntVolatile() to update the value in a volatile way, then unlock the AtomicBoolean.
Secondly, if you know the number of threads in advance:
Keep a marker value per thread, such as an int; then create an additional int[][] as the marker matrix. Initialize the marker matrix to some dummy value that doesn't equal any of your thread marker values, e.g. -1.
When you want to update the int matrix, first lock the cell in the marker matrix using Unsafe.compareAndSwapInt(); then update the value with Unsafe.putIntVolatile(); and finally release the cell by resetting the marker to -1.
Lastly, if you can tolerate that the int[][] isn't really an int matrix in the actual code, you can use a long[][] matrix, where the upper 32 bits hold the current number of updates to the value and the lower 32 bits hold the current value. Then you can easily use Unsafe.compareAndSwapLong() to set this long value as a whole. Note that you can let the upper 32 bits wrap around, so the number of updates is not limited to Integer.MAX_VALUE.
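The last variant (update count in the upper 32 bits, value in the lower 32 bits) can also be written without Unsafe. The sketch below swaps in java.util.concurrent.atomic.AtomicLongArray over a flattened, row-major matrix, which is a substitution for the answer's Unsafe calls, not the same code; VersionedIntMatrix and its method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical class; each cell is a long holding (version << 32) | value.
class VersionedIntMatrix {
    private final int cols;
    private final AtomicLongArray cells;   // row-major layout

    VersionedIntMatrix(int rows, int cols) {
        this.cols = cols;
        this.cells = new AtomicLongArray(rows * cols);
    }

    int get(int row, int col) {
        return (int) cells.get(row * cols + col);           // lower 32 bits
    }

    int version(int row, int col) {
        return (int) (cells.get(row * cols + col) >>> 32);  // upper 32 bits
    }

    void set(int row, int col, int value) {
        int i = row * cols + col;
        while (true) {
            long old = cells.get(i);
            int nextVersion = (int) (old >>> 32) + 1;       // int overflow wraps around
            long updated = ((long) nextVersion << 32) | (value & 0xFFFFFFFFL);
            if (cells.compareAndSet(i, old, updated)) {
                return;
            }
        }
    }
}
```

The version counter lets a compare-and-set detect that a cell was rewritten in between even if it was set back to the same value.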

Does this question make sense? If we assume an int[][] all operations on it will be atomic so there should be no need for synchronization! But what confuses me on the question: you cannot insert values in means of changing the array bounds after you have declared it.

If you can guarantee that each element of the array(s) is only ever accessed by a single thread (until you completed the computation that can change the array(s) contents), then you can safely use many threads. (because they will be effectively independent) Otherwise, it's probably a bad idea…

Create a static variable that can be accessed by all your Threads like:
public class MainClass {
    public static void main(String[] args) {
        // Writer: sets cell [0][0].
        Thread t1 = new Thread(new Runnable() {
            public void run() {
                Constants.intValue[0][0] = Integer.valueOf(10);
            }
        });
        // Writer: sets cell [0][1].
        Thread t2 = new Thread(new Runnable() {
            public void run() {
                Constants.intValue[0][1] = Integer.valueOf(20);
            }
        });
        // Reader: prints every cell (cells not yet written print as null).
        Thread t3 = new Thread(new Runnable() {
            public void run() {
                for (Integer[] valueArr : Constants.intValue) {
                    for (Integer intValue : valueArr) {
                        System.err.println(intValue);
                    }
                }
            }
        });
        t1.start();
        t2.start();
        t3.start();
    }
}

interface Constants {
    // Interface fields are implicitly public static final.
    Integer[][] intValue = new Integer[2][2];
}
The only problematic case is when a later thread overwrites a value set by an earlier one because both use the same array position.

Lock Free Array Element Swapping

In a multi-threaded environment, we use synchronized locking to make array element swapping thread safe.
// a is a char array.
synchronized (a) {
    char tmp = a[1];
    a[1] = a[0];
    a[0] = tmp;
}
Is it possible that we can make use of the following API in the above situation, so that we can have a lock free array element swapping? If yes, how?
http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/atomic/AtomicReferenceFieldUpdater.html#compareAndSet%28T,%20V,%20V%29

Regardless of API used you won't be able to achieve both thread-safe and lock-free array element swapping in Java.
The element swapping requires multiple read and update operations that need to be performed atomically. To simulate the atomicity you need a lock.
EDIT:
An alternative to a lock-free algorithm might be micro-locking: instead of locking the entire array, it's possible to lock only the elements that are being swapped.
The value of this approach is questionable. That is to say, if the algorithm that requires swapping elements can guarantee that different threads work on different parts of the array, then no synchronisation is required.
In the opposite case, when different threads can actually attempt to swap overlapping elements, thread execution order will matter. For example, if one thread tries to swap elements 0 and 1 of the array and another simultaneously attempts to swap 1 and 2, the result depends entirely on the order of execution: from an initial {'a','b','c'} you can end up with either {'b','c','a'} or {'c','a','b'}. Hence you'd require more sophisticated synchronisation.
Here is a quick and dirty class for character arrays that implements micro locking:
import java.util.concurrent.atomic.AtomicIntegerArray;

class SyncCharArray {
    private final char[] array;
    private final AtomicIntegerArray locktable;

    SyncCharArray(char[] array) {
        this.array = array;
        // create a lock table the size of the array
        // to track currently locked elements (0 = unlocked, 1 = locked)
        this.locktable = new AtomicIntegerArray(array.length);
        for (int i = 0; i < array.length; i++) unlock(i);
    }

    void swap(int idx1, int idx2) {
        // return if the same element
        if (idx1 == idx2) return;
        // lock element with the smaller index first to avoid possible deadlock
        lock(Math.min(idx1, idx2));
        lock(Math.max(idx1, idx2));
        char tmp = array[idx1];
        array[idx1] = array[idx2];
        unlock(idx1);
        array[idx2] = tmp;
        unlock(idx2);
    }

    private void lock(int idx) {
        // if the required element is locked, then wait ...
        while (!locktable.compareAndSet(idx, 0, 1)) Thread.yield();
    }

    private void unlock(int idx) {
        locktable.set(idx, 0);
    }
}
You’d need to create the SyncCharArray and then pass it to all threads that require swapping:
char array [] = {'a','b','c','d','e','f'};
SyncCharArray sca = new SyncCharArray(array);
// then pass sca to any threads that require swapping
// then within a thread
sca.swap(1,3); // both indices must be within the array bounds (0..5 here)
Hope that makes some sense.
UPDATE:
Some testing demonstrated that unless you have a great number of threads accessing the array simultaneously (100+ on run-of-the-mill hardware), a simple synchronized (array) {} works much faster than the elaborate synchronisation.

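// lock-free swap of array[i] and array[j]
// (assumes the array contains only non-null elements; null marks a slot as taken)
static <T> void swap(AtomicReferenceArray<T> array, int i, int j) {
    if (i == j) return; // nothing to swap; also avoids spinning forever on one slot
    while (true) {
        T ai = array.getAndSet(i, null);
        if (ai == null) continue; // slot i is taken by another swap, retry
        T aj = array.getAndSet(j, null);
        if (aj == null) {
            array.set(i, ai); // slot j is taken: put i back and retry
            continue;
        }
        array.set(i, aj);
        array.set(j, ai);
        break;
    }
}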

The closest you're going to get is java.util.concurrent.atomic.AtomicReferenceArray, which offers CAS-based operations such as boolean compareAndSet(int i, E expect, E update). It does not have a swap(int pos1, int pos2) operation though so you're going to have to emulate it with two compareAndSet calls.
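For illustration, here is a rough sketch of what emulating the swap with two compareAndSet calls could look like. The names CasSwap and trySwap are made up, and the big caveat is an assumption you have to accept: the two calls are not atomic as a pair, so another thread can briefly see the same element in both slots, and the undo step is best effort only.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

class CasSwap {
    // Attempts to swap slots i and j; returns false if another thread
    // changed either slot in the meantime. NOT atomic as a whole:
    // between the two CAS calls both slots hold the same reference.
    static <T> boolean trySwap(AtomicReferenceArray<T> array, int i, int j) {
        T ai = array.get(i);
        T aj = array.get(j);
        // compareAndSet compares references with ==, not equals().
        if (!array.compareAndSet(i, ai, aj)) {
            return false; // slot i changed underneath us
        }
        if (!array.compareAndSet(j, aj, ai)) {
            // slot j changed: try to restore slot i (best effort only)
            array.compareAndSet(i, aj, ai);
            return false;
        }
        return true;
    }
}
```

A caller would typically loop until trySwap returns true. If readers must never observe the intermediate state, this approach is not enough and some form of locking is still needed.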

"The principal threat to scalability in concurrent applications is the exclusive resource lock." - Java Concurrency in Practice.
I think you need a lock, but as others mention that lock can be more granular than it is at present.
You can use lock striping like java.util.concurrent.ConcurrentHashMap.
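A small sketch of lock striping applied to the swap, assuming a fixed number of stripes chosen by the caller and mapping index i to stripe i % stripeCount. StripedCharArray is a hypothetical class for this example, not something from the JDK.

```java
import java.util.concurrent.locks.ReentrantLock;

class StripedCharArray {
    private final char[] data;
    private final ReentrantLock[] stripes;

    StripedCharArray(char[] data, int stripeCount) {
        this.data = data;
        this.stripes = new ReentrantLock[stripeCount];
        for (int s = 0; s < stripeCount; s++) {
            stripes[s] = new ReentrantLock();
        }
    }

    void swap(int i, int j) {
        int si = i % stripes.length;
        int sj = j % stripes.length;
        // Always take the lower-numbered stripe first to avoid deadlock.
        ReentrantLock first = stripes[Math.min(si, sj)];
        ReentrantLock second = stripes[Math.max(si, sj)];
        first.lock();
        second.lock(); // reentrant, so both indices in one stripe is fine
        try {
            char tmp = data[i];
            data[i] = data[j];
            data[j] = tmp;
        } finally {
            second.unlock();
            first.unlock();
        }
    }

    char get(int i) {
        ReentrantLock lock = stripes[i % stripes.length];
        lock.lock();
        try {
            return data[i];
        } finally {
            lock.unlock();
        }
    }
}
```

More stripes means less contention but more lock objects; with one stripe this degenerates to locking the whole array.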

The API you mentioned, as already stated by others, may only be used to set values of a single object, not an array. Nor even for two objects simultaneously, so you wouldn't have a secure swap anyway.
The solution depends on your specific situation. Can the array be replaced by another data structure? Is it also changing in size concurrently?
If you must use an array, it could be changed to hold updatable objects (neither primitive types nor Character), synchronizing on both objects being swapped. A data structure like this would work:
public class CharValue {
public char c;
}
CharValue[] a = new CharValue[N];
Remember to use a deterministic synchronization order to avoid deadlocks (http://en.wikipedia.org/wiki/Deadlock#Circular_wait_prevention)! You could simply follow index ordering.
If items should also be added or removed concurrently from the collection, you could use a Map instead, synchronize swaps on the Map.Entry objects, and use a synchronized Map implementation. A simple List wouldn't do, because it has no isolated structures holding the values (or you don't have access to them).
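To make the ordering concrete, here is a sketch of such a swap. It assumes every array slot holds its own distinct CharValue object and that the slots themselves are never reassigned; the constructor and the CharValueSwapper class are additions for the example only.

```java
// Mutable holder, as above, with a constructor added for convenience.
class CharValue {
    char c;

    CharValue(char c) {
        this.c = c;
    }
}

class CharValueSwapper {
    // Locks the two holders in ascending index order, so two concurrent
    // swaps can never wait on each other in a cycle.
    static void swap(CharValue[] a, int i, int j) {
        if (i == j) {
            return;
        }
        CharValue first = a[Math.min(i, j)];
        CharValue second = a[Math.max(i, j)];
        synchronized (first) {
            synchronized (second) {
                char tmp = first.c;
                first.c = second.c;
                second.c = tmp;
            }
        }
    }
}
```

Readers that need a consistent view of a single element should synchronize on the same CharValue object before reading c.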

I don't think the AtomicReferenceFieldUpdater is meant for array access, and even if it were, it only provides atomic guarantees on one reference at a time. AFAIK, all the classes in java.util.concurrent.atomic only provide atomic access to one reference at a time. In order to change two or more references as one atomic operation, you must use some kind of locking.
