Java multiple objects reading different parts of same file [closed]


Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I am working on a project where I will have a binary file. The file is split into multiple sections, each of which represents a list of primitive values. I need a solution where I can have a collection of objects, each of which represents a section of the file. These collections are then all held within a "file" object that represents the file as a whole.
Each collection object will need to provide sequential access to each value in the section of the file it represents. What method would provide the fastest data retrieval without loading all the data into memory first?
It would also be nice if two separate collections of the same "file" object could be accessed by two separate threads, but this is not as important.

A good approach is to divide the solution into layers: one for the file I/O, mapping bytes to Java shorts and ints, and another for the abstraction of the file sections and the entire file.
java.nio's MappedByteBuffer provides a good interface between the "byte array" of a random access file and what you need for getting the Java typed data from that.
As Kayaman has mentioned, FileChannel.map() returns a MappedByteBuffer and you can navigate easily on that with its methods.
The implementation should make use of the OS feature for mapping memory pages to file pages, so that only the parts of the file you actually touch in memory are read from disk. (I've used this recently with Java 8 and Linux, and it performed well on files exceeding even the capacity of a single MappedByteBuffer.)
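
The layering described above might be sketched as follows. This is a minimal illustration, not a definitive design: the class names `SectionedFile` and `Section`, and the assumption that the section offsets and lengths are already known, are hypothetical and not part of the original answer. Each call to `FileChannel.map()` returns an independent buffer with its own position, so two sections can be read by two threads as long as each `Section` is used by only one thread.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical "file" object: hands out one mapped view per section.
class SectionedFile implements AutoCloseable {
    private final FileChannel channel;

    SectionedFile(Path path) throws IOException {
        this.channel = FileChannel.open(path, StandardOpenOption.READ);
    }

    // Maps lengthBytes bytes starting at offset; nothing is read until accessed.
    Section section(long offset, long lengthBytes) throws IOException {
        return new Section(channel.map(FileChannel.MapMode.READ_ONLY, offset, lengthBytes));
    }

    @Override
    public void close() throws IOException {
        channel.close(); // existing mappings stay valid after the channel is closed
    }

    // Hypothetical section object: sequential access to primitives in one region.
    static final class Section {
        private final ByteBuffer buf; // big-endian by default, like DataOutputStream

        Section(ByteBuffer buf) { this.buf = buf; }

        boolean hasRemaining() { return buf.hasRemaining(); }
        int nextInt() { return buf.getInt(); }
        double nextDouble() { return buf.getDouble(); }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("sections", ".bin");
        path.toFile().deleteOnExit();
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(path))) {
            out.writeInt(10); out.writeInt(20); out.writeInt(30); // section 1: 12 bytes
            out.writeDouble(1.5); out.writeDouble(2.5);           // section 2: 16 bytes
        }
        try (SectionedFile file = new SectionedFile(path)) {
            Section ints = file.section(0, 12);
            Section doubles = file.section(12, 16);
            while (ints.hasRemaining()) System.out.println(ints.nextInt());
            while (doubles.hasRemaining()) System.out.println(doubles.nextDouble());
        }
    }
}
```

A single mapping is limited to `Integer.MAX_VALUE` bytes, so a larger file needs several mappings, which fits naturally with one mapping per section.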

Related

Move C++ memory without copy [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 years ago.
Currently I am working on transferring an image from C++ to Java.
The destination buffer is allocated by Java; the source is the image generated by C++.
I have a uint8_t* pixelPtr, and I want to move its content to a __uint8_t* data without copying.
There are 1920*1080*3 bytes in total, so I want to move rather than copy for speed. Is there any trick to do so?
Thank you in advance!

Let's recap:
The source is a buffer allocated in C++ by an image generation function.
The destination is a buffer allocated in Java by some other code somewhere.
You want to transfer data between the two buffers.
As long as those two buffers are distinct, there is no "trick" to avoid this. "Moving" in this context would mean swapping the pointers around, but that does nothing to the underlying buffers. You will just have to copy the data.
Explore solutions such as generating the data in the destination buffer in the first place, or making use of appropriate functionality exposed by the C++ image generation function (or the Java code). Unfortunately we can't speculate on the possible existence or form of such solutions, from here.

The standard way is to modify your C++ code so that it creates the data not wherever it wants, but in a given place. That is, if you have code like this

    uint8_t* GenerateImage(...parameters...)
    {
        uint8_t* output = ... allocate ...
        return output;
    }

you should change it to receive the destination as a parameter:

    void GenerateImage(...parameters..., __uint8_t* destination)
    {
        ... fill the destination ...
    }

The latter is better C++ design anyway: you don't need a separate DestroyImage function, because the memory is managed entirely by Java.
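
On the Java side, one way to provide such a destination is a direct ByteBuffer. The sketch below is only an illustration under assumptions: `generateImage` here is a pure-Java stand-in (hypothetical) for the C++ generator so that the example runs on its own. In a real setup it would presumably be declared `native`, and the C++ side could obtain the buffer's address with the JNI function `GetDirectBufferAddress` and write the pixels there directly.

```java
import java.nio.ByteBuffer;

class ImageTransfer {
    static final int WIDTH = 1920;
    static final int HEIGHT = 1080;
    static final int CHANNELS = 3;

    // Stand-in for the native generator (hypothetical): it fills the
    // caller-supplied buffer instead of allocating and returning its own.
    static void generateImage(ByteBuffer destination) {
        destination.clear();
        while (destination.hasRemaining()) {
            destination.put((byte) 0x7F); // dummy pixel data
        }
        destination.flip(); // make the written bytes readable from position 0
    }

    public static void main(String[] args) {
        // Allocated once by Java, outside the garbage-collected heap.
        ByteBuffer image = ByteBuffer.allocateDirect(WIDTH * HEIGHT * CHANNELS);
        generateImage(image);
        System.out.println(image.remaining() + " bytes written in place");
    }
}
```

The buffer can be reused for every frame, so no per-image allocation or copy is needed on the Java side.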

read Java doubles from binary file into Python [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I have several matrices in Java that I would like to transfer to Python as efficiently as possible, without requiring anything but standard libraries on both the Java and Python sides.
Currently I serialize them to file using the writeDouble function to write the entries out one by one, and writeInt to write the dimensions of the matrices. Now I would like to read these matrices back into Python. I can get the integers using struct.unpack, but Java's serialization of doubles does not correspond to an algorithm that struct.unpack can implement.
How can I decode a Java double in the binary format that writeDouble uses? I have trouble even finding a specification for the encoding that writeDouble uses.

You're overengineering it; DataOutputStream.writeDouble() and related methods are for manually serializing a Java Object, so it can be re-read as a Java Object. If all you need is to transfer data, you can simply write them out as text (or bytes), then read them back in. Common formats are CSV, JSON, XML, and ProtoBuf.
If all you're doing is trying to transfer a list of doubles, you can even just write them out one per line, and read them right back in with Python.
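
For reference, the binary layout itself is simple: `DataOutputStream.writeInt` writes 4 bytes and `writeDouble` writes the 8 bytes of the IEEE 754 representation (via `Double.doubleToLongBits`), both high byte first. The sketch below, with a hypothetical `encode` helper, writes a small matrix and reads it back through a big-endian ByteBuffer to make the layout visible. On the Python side the same bytes should be readable with the big-endian struct formats `>i` and `>d`.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class DoubleLayout {
    // Hypothetical helper: dimensions first, then the entries one by one.
    static byte[] encode(int rows, int cols, double[] values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(rows);
            out.writeInt(cols);
            for (double v : values) {
                out.writeDouble(v);
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = encode(1, 2, new double[] {1.0, -2.5});
        // Decode with an explicit big-endian view to show the byte order.
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.BIG_ENDIAN);
        System.out.println(buf.getInt() + "x" + buf.getInt());       // prints "1x2"
        System.out.println(buf.getDouble() + " " + buf.getDouble()); // prints "1.0 -2.5"
    }
}
```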

Most efficient way to replace many (5000+) strings in a .txt file [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
Using a general-purpose programming language like Java, what is the most efficient way to search through a ~20 page document to replace a set of 5000+ strings with some predetermined replacement string? The program should not replace any strings that have already been replaced. What data structure would be optimal to store the 5000+ strings and each of their replacements - two arrays, a dictionary, or something else?
Here are some of the options that I have considered so far:
Iterate through the entire .txt document once per string using string.replace. The problem is that the algorithm must iterate through the entire .txt document an extra time for each string stored.
Iterate through the .txt once while replacing string as necessary while creating a new string by appending replacements. This seems more efficient, but each step would still require checking the entire set of 5000+ strings for any strings to replace.
Is there a more optimized means of solving this problem, or is one of the above attempts already optimal?
Also, would it be possible to run this algorithm more efficiently in a lower-level language like C?

You want to replace a set of 5000+ strings in a document, and you want to do it optimally. My question to you is: how will you know whether a string has to be replaced if you don't read it? It's not possible; you have to read everything. The shortest way to do that is to go line by line and replace immediately. Somebody can correct me if I'm wrong, but reading a file is one of the most basic operations there is, so using a library for it beyond what the language provides by default seems like total overkill to me. Furthermore, every language has basic I/O, and if one doesn't, don't use it.
To store the strings, it all depends on what you want to do with them. Different data structures have different purposes, and some are better suited to certain situations than others. If you just need to store them, a simple array is fine. If you need more advanced functions, you need to consider your options. But again, it's all up to what you want to do with them later.
And there is the memory issue: you need to calculate how much memory your 5000+ strings will take, because you might run out of memory. Then you need to decide whether it's worth using all that memory. Check this link.
Finally, your question about C: of course it will be more efficient. Java runs in a virtual machine that adds considerable overhead. So basically your Java program runs inside another program (the JVM), and if you accept that every single operation has a cost, then you understand that C will be more efficient than Java in terms of performance.

I would use the commons-lang library, which I think has exactly what you are looking for. Basically you create one array with all the strings you want to substitute and another array with the substitutions. See http://commons.apache.org/proper/commons-lang/javadocs/api-release/index.html for details on the StringUtils#replaceEach method.
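
If an external library is not an option, the single-pass idea from the question can be sketched with the standard library alone. This is one possible approach, not necessarily the fastest: it stores the pairs in a Map, builds one regex alternation from the keys, and scans the text once, so text produced by a replacement is never scanned again. The class and method names are hypothetical, and the keys are assumed to be non-empty.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultiReplace {
    // Replaces every key of the map found in text with its value, in one pass.
    static String replaceAll(String text, Map<String, String> replacements) {
        if (replacements.isEmpty()) {
            return text;
        }
        // Longest keys first, so a longer key wins over a key that is its prefix.
        List<String> keys = new ArrayList<>(replacements.keySet());
        keys.sort(Comparator.comparingInt(String::length).reversed());

        StringJoiner alternation = new StringJoiner("|");
        for (String key : keys) {
            alternation.add(Pattern.quote(key)); // treat keys as literals
        }

        Matcher matcher = Pattern.compile(alternation.toString()).matcher(text);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String replacement = replacements.get(matcher.group());
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> pairs = new HashMap<>();
        pairs.put("cat", "dog");
        pairs.put("dog", "cat");
        // Already-replaced text is not replaced again:
        System.out.println(replaceAll("cat chases dog", pairs)); // prints "dog chases cat"
    }
}
```

With thousands of keys, a dedicated multi-pattern matcher such as an Aho-Corasick automaton would likely scale better than a regex alternation, at the cost of more code.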

OO PHP service performance [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
So I have this app, a Java servlet. It uses a dictionary object that reads words from a file specified as a constructor parameter on instantiation and then serves queries.
I can do basically the same in PHP, but it's my understanding that the class will be instantiated on each and every request, and the file will be read again every time. In fact, I did it and it works, but it brings my humble Amazon EC2 micro instance down at the ridiculously low load of 11 requests per second or more.
My question is: Shouldn't some kind of compiler/file system optimization be kicking in and making the performance impact insignificant when the file does not change at all?
If the answer is no, I guess my design is quite poor and I should try to improve it. In that case, my second question is: What would be the best approach to improve it?
Building a servlet-like service so the code is properly reused?
Using memcached to keep the words file content in memory?
Using an RDBMS instead of a plain text file and having my dictionary query it (despite the dictionary being only a few KB of static data, and despite having to perform some complex queries, such as selecting a cryptographically safe random word from those longer than some per-request user setting)?
Something else?

Your best bet is to generate a PHP file which contains the final structure of the dictionary in PHP code. You could then include() that cache file into your code or write a new one when the file changes. You should store it on the filesystem, no databases. You could cache it in memory as well. But I don't think this is really needed at this point.

how to properly insert data to database from txt file [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
Simple question: any ideas on how this should be done properly? I have 3 txt files with lots of information. I created a class in charge of reading the data from the txt files and returning it as a list of DTO components (yes, the information can be bundled into such a logical unit), depending on the txt file. The client then uses a DAO to insert the contents of that list into a local database (SQLite). My concern is that holding such a List could be memory demanding. Should I avoid the list and somehow insert the data through the DAO object directly, without bundling it into DTOs and then into a list?

You are asking a good question and partially answering it yourself.
Yes: if you really have a lot of information, you should not read everything from the file first and then store it in the DB. You should read it chunk by chunk or, if the application allows it, line by line, and store each line in the DB as you go.
In this case you will need memory for one line only at any time.
You can design the application as follows:
A file parser that returns Iterable<Row>,
a DB writer that accepts Iterable<Row> and stores the rows in the DB,
a manager that calls both.
In this case the logic responsible for reading the file and for writing to the DB is encapsulated in separate modules, and no extra memory consumption is required.
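
The three parts above could be sketched roughly as follows. Everything here is an assumption made for illustration: the `Row` fields, the comma-separated line format, and the `Consumer<Row>` that stands in for the DAO (a real one would presumably run an INSERT per row, ideally inside one transaction). The parser is lazy, so only the current line is held in memory. Note that the returned Iterable reads from the underlying reader and can only be iterated once.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Consumer;

class StreamingImport {
    // Hypothetical DTO: one parsed line of the text file.
    static final class Row {
        final String name;
        final int value;

        Row(String name, int value) {
            this.name = name;
            this.value = value;
        }
    }

    // File parser: turns lines of the form "name, value" into Rows on demand.
    static Iterable<Row> parse(BufferedReader reader) {
        return () -> new Iterator<Row>() {
            private String nextLine = readLine(); // read one line ahead

            private String readLine() {
                try {
                    return reader.readLine();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }

            @Override
            public boolean hasNext() {
                return nextLine != null;
            }

            @Override
            public Row next() {
                if (nextLine == null) {
                    throw new NoSuchElementException();
                }
                String[] parts = nextLine.split(",", 2);
                nextLine = readLine();
                return new Row(parts[0].trim(), Integer.parseInt(parts[1].trim()));
            }
        };
    }

    // DB writer: consumes rows one at a time; dao stands in for the real DAO.
    static int writeAll(Iterable<Row> rows, Consumer<Row> dao) {
        int count = 0;
        for (Row row : rows) {
            dao.accept(row);
            count++;
        }
        return count;
    }

    // Manager: wires the parser to the writer.
    public static void main(String[] args) {
        BufferedReader reader = new BufferedReader(new StringReader("a, 1\nb, 2\n"));
        List<String> inserted = new ArrayList<>();
        int n = writeAll(parse(reader), row -> inserted.add(row.name + "=" + row.value));
        System.out.println(n + " rows: " + inserted); // prints "2 rows: [a=1, b=2]"
    }
}
```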

Do not return a list; return an iterator instead, as in this example: Iterating over the content of a text file line by line - is there a best practice? (vs. PMD's AssignmentInOperand)
You have to modify that iterator to return your DTO instead of a String:

    for (MyDTO dto : new BufferedReaderIterator(br)) {
        // do some work
    }

Now you will iterate over the file line by line, but you will get DTOs back instead of raw lines. Such a solution has a small memory footprint.
