I have the code below, but it is not efficient at all: it is very slow, and the more pictures I have to compare, the longer it takes.
For example, with 500 pictures and about 2 minutes per picture, that is 500 x 2 min = 1000 min!
The special requirement is that as soon as a picture is found to be identical to another one, it is moved to another folder; the remaining files are then listed again for the next comparison.
Any ideas?
public static void main(String[] args) throws IOException {
String PicturesFolderPath=null;
String removedFolderPath=null;
String pictureExtension=null;
if(args.length>0) {
PicturesFolderPath=args[0];
removedFolderPath=args[1];
pictureExtension=args[2];
}
if(StringUtils.isBlank(pictureExtension)) {
pictureExtension="jpg";
}
if(StringUtils.isBlank(removedFolderPath)) {
removedFolderPath=Paths.get(".").toAbsolutePath().normalize().toString()+"/removed";
}
if(StringUtils.isBlank(PicturesFolderPath)) {
PicturesFolderPath=Paths.get(".").toAbsolutePath().normalize().toString();
}
System.out.println("path to find pictures folder "+PicturesFolderPath);
System.out.println("path to find removed pictures folder "+removedFolderPath);
Collection<File> fileList = FileUtils.listFiles(new File(PicturesFolderPath), new String[] { pictureExtension }, false);
System.out.println("there is "+fileList.size()+" files founded with extention "+pictureExtension);
Iterator<File> fileIterator=fileList.iterator();
//Iterator<File> loopFileIterator=fileList.iterator();
File dest=new File(removedFolderPath);
while(fileIterator.hasNext()) {
File file=fileIterator.next();
System.out.println("process image :"+file.getName());
//each new iteration we retrieve the files staying
Collection<File> list = FileUtils.listFiles(new File(PicturesFolderPath), new String[] { pictureExtension }, false);
for(File f:list) {
if(compareImage(file,f) && !file.getName().equals(f.getName()) ) {
String filename=file.getName();
System.out.println("file :"+file.getName() +" equal to "+f.getName()+" and will be moved on removed folder");
File existFile=new File(removedFolderPath+"/"+file.getName());
if(existFile.exists()) {
existFile.delete();
}
FileUtils.moveFileToDirectory(file, dest, false);
fileIterator.remove();
System.out.println("file :"+filename+" removed");
break;
}
}
}
}
// This API will compare two image file //
// return true if both image files are equal else return false//**
public static boolean compareImage(File fileA, File fileB) {
try {
// take buffer data from both image files //
BufferedImage biA = ImageIO.read(fileA);
DataBuffer dbA = biA.getData().getDataBuffer();
int sizeA = dbA.getSize();
BufferedImage biB = ImageIO.read(fileB);
DataBuffer dbB = biB.getData().getDataBuffer();
int sizeB = dbB.getSize();
// compare data-buffer objects //
if(sizeA == sizeB) {
for(int i=0; i<sizeA; i++) {
if(dbA.getElem(i) != dbB.getElem(i)) {
return false;
}
}
return true;
}
else {
return false;
}
}
catch (Exception e) {
e.printStackTrace();
return false;
}
}
The already mentioned answer should help you a bit, as considering the width and height of a picture should exclude more candidate pairs quickly.
However, you still have a big problem: for every new file, you read all the old files. The number of comparisons grows quadratically, and since every comparison calls ImageIO.read, it is bound to be slow.
You need fingerprints that can be compared very fast. You can't fingerprint the whole file content, because it includes the metadata, but you can fingerprint the image data alone.
Just iterate over the image data of a file (like you do), and compute e.g. an MD5 hash of it. Store it, e.g. as a String, in a HashSet and you'll get a very fast lookup.
Some untested code:
For every image file you want to compare, you compute (using Guava's hashing):
HashCode imageFingerprint(File file) throws IOException {
Hasher hasher = Hashing.md5().newHasher();
BufferedImage image = ImageIO.read(file);
DataBuffer buffer = image.getData().getDataBuffer();
int size = buffer.getSize();
// feed only the pixel data into the hash, so the file metadata is ignored
for(int i=0; i<size; i++) {
hasher.putInt(buffer.getElem(i));
}
return hasher.hash();
}
The computation works with the image data only, just like compareImage in the question, so the metadata get ignored.
Instead of searching for a duplicate in a directory, you compute the fingerprints of all its files and store them in a HashSet<HashCode>. For a new file, you compute its fingerprint and look it up in the set.
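The lookup described above can be sketched with the JDK alone. This is a minimal sketch, not the answer's exact code: it swaps Guava's MD5 hasher for java.security.MessageDigest with SHA-256, and the class and method names (ImageFingerprints, fingerprint, findDuplicates) are made up for illustration. Each file is decoded once, so the work grows linearly with the number of files instead of quadratically.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Base64;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.imageio.ImageIO;

// Hypothetical helper: names are illustrative, not taken from the question or answer.
final class ImageFingerprints {

    // Hashes the dimensions and the ARGB value of every pixel; file metadata is never touched.
    static String fingerprint(BufferedImage image) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required on every Java platform", e);
        }
        int width = image.getWidth();
        int height = image.getHeight();
        digest.update(ByteBuffer.allocate(8).putInt(width).putInt(height).array());
        int[] pixels = new int[width];
        ByteBuffer row = ByteBuffer.allocate(4 * width);
        for (int y = 0; y < height; y++) {
            image.getRGB(0, y, width, 1, pixels, 0, width); // one row at a time
            row.clear();
            for (int pixel : pixels) {
                row.putInt(pixel);
            }
            digest.update(row.array());
        }
        return Base64.getEncoder().encodeToString(digest.digest());
    }

    // Returns every file whose pixel data was already seen in an earlier file of the list.
    static List<File> findDuplicates(List<File> files) throws IOException {
        Set<String> seen = new HashSet<>();
        List<File> duplicates = new ArrayList<>();
        for (File file : files) {
            BufferedImage image = ImageIO.read(file);
            if (image == null) {
                continue; // ImageIO found no reader for this file
            }
            if (!seen.add(fingerprint(image))) {
                duplicates.add(file);
            }
        }
        return duplicates;
    }
}
```

The files returned by findDuplicates are the ones that would be moved to the removed folder. Two different images could in theory share a hash, so a final pixel comparison on a hit is a reasonable safeguard.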
Related
I am trying to create a custom photo gallery in which the user can store images picked from the MediaStore photo gallery.
My goal is that the user can only store distinct images in the database; a duplicate image should not be stored. So I thought I could use the file path as a sort of ID to check for duplicates. But every time I restart the app and pick the same picture again, the Uri/file path is different, especially the last part of it.
My weird implementation to check for duplicates:
public void checkAndSetPhotoGalleryData() throws IOException {
RoomDB db = RoomDB.getInstance(this);
PhotoGalleryData photoGalleryData = new PhotoGalleryData();
String stringUri = String.valueOf(imageUri);
String lastPartFile = stringUri.substring(stringUri.lastIndexOf('/')+1);
TextView testThis = findViewById(R.id.testName);
testThis.setText(db_path);
if(photoGarList.isEmpty()) {
try {
InputStream iStream = getContentResolver().openInputStream(imageUri);
byte[] inputData = getBytes(iStream);
photoGalleryData.setKey_Value_Quiz(String.valueOf(lastPartFile));
photoGalleryData.setPhoto(inputData);
db.questionDao().insert(photoGalleryData);
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}if(!photoGarList.isEmpty())
{for(int i = 0; i<photoGarList.size();i++){
if (photoGarList.get(i).getKey_Value_Quiz().length() > 0 && !photoGarList.isEmpty() && photoGarList.get(i).getKey_Value_Quiz().equals(String.valueOf(lastPartFile))) {
showLongToast("Are Identical");
}
if (photoGarList.get(i).getKey_Value_Quiz().length() > 0 && !photoGarList.isEmpty() && !photoGarList.get(i).getKey_Value_Quiz().equals(String.valueOf(lastPartFile)) && i == photoGarList.size()-1) {
try {
InputStream iStream = getContentResolver().openInputStream(imageUri);
byte[] inputData = getBytes(iStream);
photoGalleryData.setKey_Value_Quiz(String.valueOf(lastPartFile));
photoGalleryData.setPhoto(inputData);
db.questionDao().insert(photoGalleryData);
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
}
}
}
Last part of the file path:
// both of these are the same picture:
KeyValue:924270434
KeyValue:239256090
I guess that for security reasons the last part changes every time. So I want to know: is there any constant property of a picture other than the byte[] or Bitmap (their size is too large to handle at once, so that is not an option here)?
You could use a simple pixel-by-pixel comparison of the images.
Try to use ImageMagick.
Or you could try Java OpenCV: How to compare two images using Java OpenCV library
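Another option, offered here as a sketch rather than as part of the answer above, is to derive the key from the image content instead of the Uri: stream the bytes through a digest and store only the short hex string next to the photo. The class and method names (ContentKey, sha256Hex) are hypothetical. The stream is read in small chunks, so the whole image never has to be in memory at once.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: computes a stable key from the bytes of a picture.
final class ContentKey {

    static String sha256Hex(InputStream in) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required on every Java platform", e);
        }
        byte[] buffer = new byte[8192];
        int read;
        // feed the stream to the digest chunk by chunk
        while ((read = in.read(buffer)) != -1) {
            digest.update(buffer, 0, read);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

In the question's code, the result of sha256Hex(getContentResolver().openInputStream(imageUri)) could replace lastPartFile as the value passed to setKey_Value_Quiz. Note that this detects byte-identical files only; the same photo re-encoded or resized would get a different key.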
I built a classic Huffman coder, with an encoder and a decoder. I noticed a problem: I use a BitSet to hold the code when compressing the input file, but the decoder does not handle every file I give it. For example, a txt file works great, but other files, such as BMP, do not.
Before I used BitSet the code worked, but without any compression, so I suspect the problem is with BitSet.
The decoder I built is:
public void Decompress(String[] input_names, String[] output_names) {
HuffmanVerticle tree = new HuffmanVerticle();
tree = readTreeFile(output_names);
restoreInput(tree, output_names, input_names);
}
public static void restoreInput(HuffmanVerticle tree, String[] binary_names, String[] original_names) {
BitSet huffmanCodeBit;
try {
FileOutputStream to_original = new FileOutputStream(original_names[0]);
FileInputStream binary = new FileInputStream(binary_names[0]);
ObjectInputStream s = new ObjectInputStream(binary);
huffmanCodeBit = (BitSet) s.readObject();
System.out.println(huffmanCodeBit.toString());
int index = 0;
while(huffmanCodeBit.length() > index)
{
HuffmanVerticle tmp = tree;
while (!tmp.isNullTree())
{
boolean bit = huffmanCodeBit.get(index);
index++;
System.out.println(bit);
if (!bit)
tmp = tmp.left;
else
tmp = tmp.right;
}
to_original.write(tmp.character);
}
binary.close();
to_original.close();
} catch (Exception e) {
e.printStackTrace();
}
}
What am I missing here? Why doesn't the code work for certain files? When I run it on some files, the restored files are unusable.
For BMP files the code does not finish at all, even after half an hour, whereas for txt files it runs very fast.
Thanks for your help.
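One thing worth checking, as a guess at the cause rather than a confirmed diagnosis: BitSet.length() returns the index of the highest set bit plus one, not the number of bits that were written. If the encoded stream ends in zero bits, a loop bounded by huffmanCodeBit.length() stops before those bits are decoded. The sketch below uses a hypothetical BitBuffer wrapper that tracks the real bit count next to the BitSet; the encoder would have to store that count along with the bits.

```java
import java.util.BitSet;

// Hypothetical wrapper: a BitSet plus the number of bits actually written.
final class BitBuffer {
    private final BitSet bits = new BitSet();
    private int count = 0; // counts every appended bit, including trailing zeros

    void append(boolean bit) {
        bits.set(count, bit);
        count++;
    }

    boolean get(int index) {
        return bits.get(index);
    }

    // The bound a decoder loop should use.
    int bitCount() {
        return count;
    }

    // What BitSet.length() reports: highest set bit + 1, so trailing zeros are not counted.
    int bitSetLength() {
        return bits.length();
    }
}
```

Separately, the decoder prints every single bit with System.out.println, and the whole BitSet once via toString(); for a large file such as a BMP that output alone can take a very long time, so removing it is a cheap first experiment.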
I am trying to encode PDF documents to Base64. With a small number of documents (around 2000) it works nicely, but I have more than 100k documents to encode.
Encoding all those files takes a long time. Is there a better approach for encoding a large data set?
Here is my current approach:
String filepath=doc.getPath().concat(doc.getFilename());
file = new File(filepath);
if(file.exists() && !file.isDirectory()) {
try {
// readAllBytes reads the whole file; a single InputStream.read(bytes) call may return early
byte[] bytes = Files.readAllBytes(file.toPath());
encodedfile = Base64.getEncoder().encodeToString(bytes);
} catch (IOException e) {
e.printStackTrace();
}
}
Try this:
Figure out how many files you need to encode.
long files;
try (Stream<Path> paths = Files.list(Paths.get(directory))) {
files = paths.count(); // count() returns a long, and the stream must be closed
}
Split them up into chunks of a size that one thread can reasonably handle. For example, if you have 100k files to encode, split them into 100 lists of 1000 files each, or something like that.
int currentIndex = 0;
for (File file : filesInDir) {
if (fileMap.get(currentIndex).size() >= cap)
currentIndex++;
fileMap.get(currentIndex).add(file);
}
/* It is going to take a little more effort than this, but this is the idea I am trying to show you. */
Execute each worker thread one after another if the computer's resources are available.
for (Integer key : fileMap.keySet()) {
new WorkerThread(fileMap.get(key)).start();
}
You can check the current resources available with:
public boolean areResourcesAvailable() {
return imNotThatNice();
}
/**
* Gets the resource utility instance
*
* @return the current instance of the resource utility
*/
private static OperatingSystemMXBean getInstance() {
if (ResourceUtil.instance == null) {
ResourceUtil.instance = ManagementFactory.getOperatingSystemMXBean();
}
return ResourceUtil.instance;
}
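The splitting and worker threads described above can also be left to a fixed-size thread pool, which queues the files and keeps a bounded number of threads busy. The following is a minimal sketch under that assumption; ParallelBase64 and encodeAll are hypothetical names, and the thread count is a parameter you would tune to your machine.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: encodes many files to Base64 on a bounded number of threads.
final class ParallelBase64 {

    static Map<Path, String> encodeAll(List<Path> files, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // submit one task per file; the pool queues whatever it cannot run yet
            Map<Path, Future<String>> pending = new LinkedHashMap<>();
            for (Path file : files) {
                Callable<String> task =
                        () -> Base64.getEncoder().encodeToString(Files.readAllBytes(file));
                pending.put(file, pool.submit(task));
            }
            // collect the results in submission order; get() rethrows a task's failure
            Map<Path, String> encoded = new LinkedHashMap<>();
            for (Map.Entry<Path, Future<String>> entry : pending.entrySet()) {
                encoded.put(entry.getKey(), entry.getValue().get());
            }
            return encoded;
        } finally {
            pool.shutdown();
        }
    }
}
```

Holding 100k encoded documents in one map may not fit in memory, so in practice each task would more likely write its result to the destination (database, file, queue) instead of returning it.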
I want to find out whether two audio files are the same, or whether one contains the other.
For this I use the fingerprint feature of musicg:
byte[] firstAudio = readAudioFileData("first.mp3");
byte[] secondAudio = readAudioFileData("second.mp3");
FingerprintSimilarityComputer fingerprint =
new FingerprintSimilarityComputer(firstAudio, secondAudio);
FingerprintSimilarity fingerprintSimilarity = fingerprint.getFingerprintsSimilarity();
System.out.println("clip is found at " + fingerprintSimilarity.getScore());
To convert the audio to a byte array I use the Java Sound API:
public static byte[] readAudioFileData(final String filePath) {
byte[] data = null;
try {
final ByteArrayOutputStream baout = new ByteArrayOutputStream();
final File file = new File(filePath);
final AudioInputStream audioInputStream = AudioSystem.getAudioInputStream(file);
byte[] buffer = new byte[4096];
int c;
while ((c = audioInputStream.read(buffer, 0, buffer.length)) != -1) {
baout.write(buffer, 0, c);
}
audioInputStream.close();
baout.close();
data = baout.toByteArray();
} catch (Exception e) {
e.printStackTrace();
}
return data;
}
But when I execute it, I get an exception at fingerprint.getFingerprintsSimilarity():
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 15999
at com.musicg.fingerprint.PairManager.getPairPositionList(PairManager.java:133)
at com.musicg.fingerprint.PairManager.getPair_PositionList_Table(PairManager.java:80)
at com.musicg.fingerprint.FingerprintSimilarityComputer.getFingerprintsSimilarity(FingerprintSimilarityComputer.java:71)
at Main.main(Main.java:42)
How can I compare 2 mp3 files with fingerprint in Java?
I never did any audio stuff in Java before, but I looked into your code briefly. I think that musicg only works for WAV files, not for MP3. Thus, you need to convert the files first. A web search reveals that you can e.g. use JLayer for that purpose. The corresponding code looks like this:
package de.scrum_master.so;
import com.musicg.fingerprint.FingerprintManager;
import com.musicg.fingerprint.FingerprintSimilarity;
import com.musicg.fingerprint.FingerprintSimilarityComputer;
import com.musicg.wave.Wave;
import javazoom.jl.converter.Converter;
import javazoom.jl.decoder.JavaLayerException;
public class Application {
public static void main(String[] args) throws JavaLayerException {
// MP3 to WAV
new Converter().convert("White Wedding.mp3", "White Wedding.wav");
new Converter().convert("Poison.mp3", "Poison.wav");
// Fingerprint from WAV
byte[] firstFingerPrint = new FingerprintManager().extractFingerprint(new Wave("White Wedding.wav"));
byte[] secondFingerPrint = new FingerprintManager().extractFingerprint(new Wave("Poison.wav"));
// Compare fingerprints
FingerprintSimilarity fingerprintSimilarity = new FingerprintSimilarityComputer(firstFingerPrint, secondFingerPrint).getFingerprintsSimilarity();
System.out.println("Similarity score = " + fingerprintSimilarity.getScore());
}
}
Of course you should make sure that you do not convert each file again whenever the program starts, i.e. you should check if the WAV files already exist. I skipped this step and reduced the sample code to a minimal working version.
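The existence check mentioned above could look roughly like this. It is a sketch with made-up names (WavCache, ensureWav); the actual conversion is passed in as a function so that the example stays independent of JLayer.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.function.BiConsumer;

// Hypothetical helper: converts an MP3 only when the WAV next to it is missing.
final class WavCache {

    // converter receives (sourcePath, destinationPath) and performs the real conversion.
    static Path ensureWav(Path mp3, BiConsumer<String, String> converter) {
        String name = mp3.getFileName().toString();
        String base = name.toLowerCase(Locale.ROOT).endsWith(".mp3")
                ? name.substring(0, name.length() - 4)
                : name;
        Path wav = mp3.resolveSibling(base + ".wav");
        if (Files.notExists(wav)) {
            converter.accept(mp3.toString(), wav.toString());
        }
        return wav;
    }
}
```

With JLayer, the converter argument would be a lambda that calls new Converter().convert(source, destination) and rethrows the checked JavaLayerException as an unchecked exception. The check only looks at whether the WAV file exists, so a half-written file left behind by a crash would be treated as valid.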
FingerprintSimilarityComputer(input1, input2) is supposed to take the fingerprints of the loaded audio data, not the loaded audio data itself.
In your case, it should be:
// Convert your audio to wav using FFMpeg
Wave w1 = new Wave("first.wav");
Wave w2 = new Wave("second.wav");
FingerprintSimilarityComputer fingerprint =
new FingerprintSimilarityComputer(w1.getFingerprint(), w2.getFingerprint());
// print fingerprint.getFingerprintsSimilarity()
Maybe I am missing a point, but if I understood you right, this should do:
byte[] firstAudio = readAudioFileData("first.mp3");
byte[] secondAudio = readAudioFileData("second.mp3");
byte[] smaller = firstAudio.length <= secondAudio.length ? firstAudio : secondAudio;
byte[] bigger = firstAudio.length > secondAudio.length ? firstAudio : secondAudio;
int ixS = 0;
int ixB = 0;
boolean contains = false;
for (; ixB<bigger.length; ixB++) {
if (smaller[ixS] == bigger[ixB]) {
ixS++;
if (ixS == smaller.length) {
contains = true;
break;
}
}
else {
// go back so the next attempt starts one byte after the failed one
ixB -= ixS;
ixS = 0;
}
}
if (contains) {
if (smaller.length == bigger.length) {
System.out.println("Both tracks are equal");
}
else {
// ixB points at the last matched byte, hence the +1
System.out.println("The bigger track fully contains the smaller track, starting at byte: "+(ixB-smaller.length+1));
}
}
else {
System.out.println("No track completely contains the other track");
}
I'd like to make my Java program compare the actual screen with a picture (screenshot).
I don't know if it's possible, but I have seen it in Jitbit (a macro recorder) and I would like to implement it myself. (Maybe with that example you understand what I mean).
Thanks
----edit-----
In other words, is it possible to check whether an image is currently showing on the screen, i.e. to find and compare those pixels on the screen?
You may try aShot (documentation link):
1) aShot can ignore areas that you mark with a special color.
2) aShot can produce an image that displays the differences between two images.
private void compareTowImages(BufferedImage expectedImage, BufferedImage actualImage) {
ImageDiffer imageDiffer = new ImageDiffer();
ImageDiff diff = imageDiffer
.withDiffMarkupPolicy(new PointsMarkupPolicy()
.withDiffColor(Color.YELLOW))
.withIgnoredColor(Color.MAGENTA)
.makeDiff(expectedImage, actualImage);
// areImagesDifferent will be true if images are different, false - images the same
boolean areImagesDifferent = diff.hasDiff();
if (areImagesDifferent) {
// code in case of failure
} else {
// Code in case of success
}
}
To save image with differences:
private void saveImage(BufferedImage image, String imageName) {
// Path where you are going to save image
String outputFilePath = String.format("target/%s.png", imageName);
File outputFile = new File(outputFilePath);
try {
ImageIO.write(image, "png", outputFile);
} catch (IOException e) {
// Some code in case of failure
}
}
You can do this in two steps:
Create a screenshot using awt.Robot
BufferedImage image = new Robot().createScreenCapture(new Rectangle(Toolkit.getDefaultToolkit().getScreenSize()));
ImageIO.write(image, "png", new File("/screenshot.png"));
Compare the screenshots using something like this: How to check if two images are similar or not using openCV in java?
Have a look at the Sikuli project. Their automation engine is based on image comparison.
I guess that internally they still use OpenCV for calculating image similarity, but there are plenty of OpenCV Java bindings like this one that let you do the same from Java.
Project source code is located here: https://github.com/sikuli/sikuli
Ok then, so I found an answer after a few days.
This method takes the screenshot:
public static void takeScreenshot() {
try {
BufferedImage image = new Robot().createScreenCapture(new Rectangle(490,490,30,30));
/* the first two parameters are the X and Y coordinates of the top-left corner; the last two are the width and height of the captured area */
ImageIO.write(image, "png", new File("C:\\Example\\Folder\\capture.png"));
} catch (IOException e) {
e.printStackTrace();
} catch (HeadlessException e) {
e.printStackTrace();
} catch (AWTException e) {
e.printStackTrace();
}
}
And this other one compares the images:
public static String compareImage() throws Exception {
// savedImage is the image we want to look for in the new screenshot.
// Both must have the same width and height
String c1 = "savedImage";
String c2 = "capture";
BufferedInputStream in = new BufferedInputStream(new FileInputStream(c1
+ ".png"));
BufferedInputStream in1 = new BufferedInputStream(new FileInputStream(
c2 + ".png"));
int i, j;
int k = 1;
// read both files in lockstep; a differing byte or a differing length means they are not equal
do {
i = in.read();
j = in1.read();
if (i != j) {
k = 0;
break;
}
} while (i != -1);
in.close();
in1.close();
if (k == 1) {
System.out.println("Ok...");
return "Ok";
} else {
System.out.println("Fail ...");
return "Fail";
}
}
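The method above compares two files of the same size byte by byte. To check whether a saved image appears anywhere inside a larger capture, one possible approach is an exact pixel search over the decoded images. This is a sketch with hypothetical names (SubImageFinder, find), not the method used by Jitbit or Sikuli, and it only finds pixel-perfect matches.

```java
import java.awt.Point;
import java.awt.image.BufferedImage;

// Hypothetical helper: brute-force search for a small image inside a larger one.
final class SubImageFinder {

    // Returns the top-left corner of the first exact match, or null if there is none.
    static Point find(BufferedImage screen, BufferedImage needle) {
        int maxX = screen.getWidth() - needle.getWidth();
        int maxY = screen.getHeight() - needle.getHeight();
        for (int y = 0; y <= maxY; y++) {
            for (int x = 0; x <= maxX; x++) {
                if (matchesAt(screen, needle, x, y)) {
                    return new Point(x, y);
                }
            }
        }
        return null;
    }

    // Compares every pixel of needle with the screen region whose corner is (offsetX, offsetY).
    private static boolean matchesAt(BufferedImage screen, BufferedImage needle,
                                     int offsetX, int offsetY) {
        for (int y = 0; y < needle.getHeight(); y++) {
            for (int x = 0; x < needle.getWidth(); x++) {
                if (screen.getRGB(offsetX + x, offsetY + y) != needle.getRGB(x, y)) {
                    return false; // stop at the first differing pixel
                }
            }
        }
        return true;
    }
}
```

The screen argument would be a capture from Robot.createScreenCapture, as in takeScreenshot above, and needle the saved image loaded with ImageIO.read. Any scaling, anti-aliasing or lossy compression breaks exact equality; for tolerant matching, the OpenCV-based suggestions above are the better fit.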