I'm interested in creating a layered tif with Java in a way that Photoshop will recognize the layers. I was able to create a multi-page tif, but Photoshop does not recognize the pages as layers. The pages are viewable with Acrobat though. Does anyone know how Photoshop stores tif layer data and how that could be generated with Java?
Thanks.
I have researched this for my TIFF ImageIO plugin, and as far as I understand, the way Photoshop stores layer information in TIFFs is completely proprietary and not using standard TIFF mechanisms, like multi-page documents utilizing linked or nested IFDs (330/SubIFD), or file types (254/NewSubFileType), etc.
Instead, it stores the layer information, along with the layer image data, in a Photoshop-specific TIFF tag: 37724/ImageSourceData, which has type UNDEFINED (or "just bytes"). Luckily, the contents of this tag are documented in the Adobe Photoshop® TIFF Technical Notes.
The content of this tag will always start with the 0-terminated string "Adobe Photoshop Document Data Block". The rest of the contents is a series of Photoshop resources, each identified by the 4-byte Photoshop resource signature 8BIM, followed by a 4-byte resource key and a 4-byte length for each individual resource.
The interesting resource in this block, with regard to Photoshop layers, is the one identified by the resource key Layr. This is the same structure documented in the Layer and Mask Information Section of the Photoshop File Format.
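To make that layout concrete, here is a small sketch that only walks the resource blocks in the tag and prints their keys. The big-endian byte order and the 4-byte padding of resource data are my reading of the tech notes, so verify them there before relying on this:

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ImageSourceDataScanner {

    // Walks the raw bytes of TIFF tag 37724/ImageSourceData and prints each resource key and length.
    public static void scan(byte[] tagBytes) {
        ByteBuffer buf = ByteBuffer.wrap(tagBytes); // Photoshop data is big-endian, ByteBuffer's default

        // Skip the 0-terminated "Adobe Photoshop Document Data Block" signature
        while (buf.get() != 0) {
            // keep reading until the terminating 0 byte
        }

        while (buf.remaining() >= 12) {
            byte[] signature = new byte[4]; // should be "8BIM"
            byte[] key = new byte[4];       // e.g. "Layr" for the layer data
            buf.get(signature);
            buf.get(key);
            int length = buf.getInt();

            System.out.println(new String(signature, StandardCharsets.US_ASCII) + "/"
                    + new String(key, StandardCharsets.US_ASCII) + ": " + length + " bytes");

            // Skip the resource data, rounded up to a 4-byte boundary (check the tech notes for the exact rule)
            buf.position(buf.position() + ((length + 3) & ~3));
        }
    }
}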
There's also a different tag, 34377/Photoshop, which contains other image resources read and written by Photoshop. It's also documented in the Image Resources Section of the above document. It does contain some information that is interesting with regard to layers, but I'm not sure how much of it you need to write. You will probably need a Photoshop installation and to test against the "real thing".
I do have code to read both of these structures in the PSD ImageIO plugin, which might be worth looking at, but it doesn't yet support writing.
Once you can write the contents of these Photoshop TIFF tags, you should be able to pass them to the TIFFImageWriter as part of the TIFF IIOMetadata, and the writer will write them along with any other metadata and pixel data you pass.
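As a rough sketch of what that could look like with the bundled ImageIO TIFF writer (Java 9+) and its native metadata format: the element and attribute names below (TIFFIFD, TIFFField, TIFFUndefined and its comma-separated value attribute) are my reading of the javax_imageio_tiff_image_1.0 format, and psdBytes is assumed to already contain a valid Adobe Photoshop Document Data Block, so treat this as a starting point rather than tested code:

import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageTypeSpecifier;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.metadata.IIOMetadata;
import javax.imageio.metadata.IIOMetadataNode;
import javax.imageio.stream.ImageOutputStream;

public class PhotoshopTiffWriter {

    private static final String TIFF_METADATA_FORMAT = "javax_imageio_tiff_image_1.0";

    public static void write(BufferedImage image, byte[] psdBytes, File file) throws IOException {
        ImageWriter writer = ImageIO.getImageWritersByFormatName("TIFF").next();
        ImageWriteParam param = writer.getDefaultWriteParam();
        IIOMetadata metadata = writer.getDefaultImageMetadata(
                ImageTypeSpecifier.createFromRenderedImage(image), param);

        // Add a TIFFField node for 37724/ImageSourceData of type UNDEFINED to the image IFD
        IIOMetadataNode root = (IIOMetadataNode) metadata.getAsTree(TIFF_METADATA_FORMAT);
        IIOMetadataNode ifd = (IIOMetadataNode) root.getFirstChild(); // the TIFFIFD node

        IIOMetadataNode field = new IIOMetadataNode("TIFFField");
        field.setAttribute("number", "37724");
        field.setAttribute("name", "ImageSourceData");

        // TIFFUndefined takes the raw bytes as a comma-separated list of unsigned values
        StringBuilder value = new StringBuilder();
        for (int i = 0; i < psdBytes.length; i++) {
            if (i > 0) value.append(',');
            value.append(psdBytes[i] & 0xff);
        }
        IIOMetadataNode undefined = new IIOMetadataNode("TIFFUndefined");
        undefined.setAttribute("value", value.toString());
        field.appendChild(undefined);
        ifd.appendChild(field);

        metadata.setFromTree(TIFF_METADATA_FORMAT, root);

        try (ImageOutputStream output = ImageIO.createImageOutputStream(file)) {
            writer.setOutput(output);
            writer.write(null, new IIOImage(image, null, metadata), param);
        } finally {
            writer.dispose();
        }
    }
}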
So, as you see, this is all (mostly) documented and for sure doable in Java, but still not completely trivial.
I started a solution based on TinyTIFF, the answer from @haraldK on this SO question, the TIFF spec, and the Photoshop TIFF spec. It is about the simplest possible way to write a TIFF. I put in the code to write the Photoshop section, but it is not finished.
Note that Photoshop uses the TIFF image as the "preview" image, similar to the flattened composite image at the very end of a PSD file. The Photoshop TIFF section is what contains the pixel data for all the layers (again similar to a PSD). Adobe's use of TIFF in this way is pretty dirty. You might as well just use the (also terrible) PSD format, since smashing PSD data into the TIFF format just adds complexity for no benefit. This is why I did not finish the code below. If you do finish it, please post it here.
The Output class is from Kryo. pixmap.getPixels() is 4 bytes per pixel, RGBA.
/* Copyright (c) 2008-2015 Jan W. Krieger (<jan@jkrieger.de>, <j.krieger@dkfz.de>), German Cancer Research Center (DKFZ) & IWR, University of Heidelberg
* Copyright (c) 2018, Nathan Sweet, Esoteric Software LLC
* All rights reserved.
*
* This software is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public
* License (LGPL) as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later
* version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied
* warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You
* should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. */
public class TiffWriter {
private Output out;
private int width, height;
private int ifdCount, ifdLastOffset, ifdData, headerStart;
private Output header;
public void start (OutputStream output, int width, int height) throws IOException {
this.out = new Output(output);
this.width = width;
this.height = height;
out.writeByte('M'); // Big endian.
out.writeByte('M');
out.writeShort(42); // Magic number.
ifdLastOffset = out.total();
out.writeInt(8); // Offset of first IFD.
}
public void frame (Pixmap pixmap, String name, int frame, int endFrame) throws IOException {
ByteBuffer pixels = pixmap.getPixels();
headerStart = out.total();
ifdData = 2 + TIFF_HEADER_MAX_ENTRIES * 12;
ifdCount = 0;
header = new Output(TIFF_HEADER_SIZE + 2);
header.setPosition(2);
writeLongIFD(TIFF_FIELD_IMAGEWIDTH, width);
writeLongIFD(TIFF_FIELD_IMAGELENGTH, height);
writeShortIFD(TIFF_FIELD_BITSPERSAMPLE, 8, 8, 8, 8); // 4 samples per pixel (RGB + alpha)
writeShortIFD(TIFF_FIELD_COMPRESSION, COMPRESSION_NO);
writeShortIFD(TIFF_FIELD_PHOTOMETRICINTERPRETATION, PHOTOMETRIC_INTERPRETATION_RGB);
writeLongIFD(TIFF_FIELD_STRIPOFFSETS, headerStart + 2 + TIFF_HEADER_SIZE);
writeShortIFD(TIFF_FIELD_SAMPLESPERPIXEL, 4);
writeLongIFD(TIFF_FIELD_ROWSPERSTRIP, height);
writeLongIFD(TIFF_FIELD_STRIPBYTECOUNTS, width * height * 4); // 4 bytes per pixel (RGBA)
writeRationalIFD(TIFF_FIELD_XRESOLUTION, 720000, 10000);
writeRationalIFD(TIFF_FIELD_YRESOLUTION, 720000, 10000);
writeShortIFD(TIFF_FIELD_PLANARCONFIG, PLANAR_CONFIGURATION_CHUNKY);
writeShortIFD(TIFF_FIELD_RESOLUTIONUNIT, RESOLUTION_UNIT_INCH);
writeShortIFD(TIFF_FIELD_EXTRASAMPLES, 1); // Adds alpha to last samples per pixel.
// writeIFDEntrySHORT(TIFF_FIELD_SAMPLEFORMAT, SAMPLE_FORMAT_FLOAT);
// Photoshop layer entry.
ifdCount++;
header.writeShort(TIFF_FIELD_PHOTOSHOP_IMAGESOURCEDATA);
header.writeShort(TIFF_TYPE_UNDEFINED);
int sizePosition = header.position();
header.writeInt(0); // Size in bytes.
header.writeInt(ifdData + headerStart);
int pos = header.position();
header.setPosition(ifdData);
writeString(header, "Adobe Photoshop Document Data Block");
// Unfinished!
int size = header.position() - ifdData;
ifdData = header.position();
header.setPosition(sizePosition);
header.writeInt(size);
header.setPosition(pos);
if (ifdCount > TIFF_HEADER_MAX_ENTRIES) throw new RuntimeException();
header.setPosition(0);
header.writeShort(ifdCount);
header.setPosition(2 + ifdCount * 12); // header start + 12 bytes per IFD entry
header.writeInt(headerStart + 2 + TIFF_HEADER_SIZE + width * height * 4); // next IFD follows the RGBA pixel data
out.writeBytes(header.getBuffer(), 0, TIFF_HEADER_SIZE + 2);
ifdLastOffset = headerStart + 2 + ifdCount * 12;
pixels.position(0);
for (int i = 0, n = width * height * 4; i < n; i += 4) {
byte a = pixels.get(i + 3);
float pma = (a & 0xff) / 255f;
out.writeByte((byte)((pixels.get(i) & 0xff) * pma));
out.writeByte((byte)((pixels.get(i + 1) & 0xff) * pma));
out.writeByte((byte)((pixels.get(i + 2) & 0xff) * pma));
out.writeByte(a);
}
pixels.position(0);
}
public void end () throws IOException {
out.close();
// Erase last IFD offset.
RandomAccessFile file = new RandomAccessFile("test.tif", "rw");
file.seek(ifdLastOffset);
file.write((byte)0);
file.write((byte)0);
file.write((byte)0);
file.write((byte)0);
file.close();
}
public void close () throws IOException {
end();
}
private void writeString (Output output, String value) {
for (int i = 0, n = value.length(); i < n; i++)
output.writeByte(value.charAt(i));
output.writeByte(0);
}
private void writeLongIFD (int tag, int data) {
ifdCount++;
header.writeShort(tag);
header.writeShort(TIFF_TYPE_LONG);
header.writeInt(1);
header.writeInt(data);
}
private void writeShortIFD (int tag, int data) {
ifdCount++;
header.writeShort(tag);
header.writeShort(TIFF_TYPE_SHORT);
header.writeInt(1);
header.writeShort(data);
header.writeShort(0); // Pad bytes.
}
private void writeShortIFD (int tag, int... data) {
ifdCount++;
header.writeShort(tag);
header.writeShort(TIFF_TYPE_SHORT);
header.writeInt(data.length);
if (data.length == 1)
header.writeInt(data[0]);
else {
header.writeInt(ifdData + headerStart);
int pos = header.position();
header.setPosition(ifdData);
for (int value : data)
header.writeShort(value);
ifdData = header.position();
header.setPosition(pos);
}
}
private void writeRationalIFD (int tag, int numerator, int denominator) {
ifdCount++;
header.writeShort(tag);
header.writeShort(TIFF_TYPE_RATIONAL);
header.writeInt(1);
header.writeInt(ifdData + headerStart);
int pos = header.position();
header.setPosition(ifdData);
header.writeInt(numerator);
header.writeInt(denominator);
ifdData = header.position();
header.setPosition(pos);
}
static private final int TIFF_HEADER_SIZE = 510;
static private final int TIFF_HEADER_MAX_ENTRIES = 16;
static private final int TIFF_FIELD_IMAGEWIDTH = 256;
static private final int TIFF_FIELD_IMAGELENGTH = 257;
static private final int TIFF_FIELD_BITSPERSAMPLE = 258;
static private final int TIFF_FIELD_COMPRESSION = 259;
static private final int TIFF_FIELD_PHOTOMETRICINTERPRETATION = 262;
static private final int TIFF_FIELD_IMAGEDESCRIPTION = 270;
static private final int TIFF_FIELD_STRIPOFFSETS = 273;
static private final int TIFF_FIELD_SAMPLESPERPIXEL = 277;
static private final int TIFF_FIELD_ROWSPERSTRIP = 278;
static private final int TIFF_FIELD_STRIPBYTECOUNTS = 279;
static private final int TIFF_FIELD_XRESOLUTION = 282;
static private final int TIFF_FIELD_YRESOLUTION = 283;
static private final int TIFF_FIELD_PLANARCONFIG = 284;
static private final int TIFF_FIELD_RESOLUTIONUNIT = 296;
static private final int TIFF_FIELD_EXTRASAMPLES = 338;
static private final int TIFF_FIELD_SAMPLEFORMAT = 339;
static private final int TIFF_FIELD_PHOTOSHOP_IMAGESOURCEDATA = 37724;
static private final int TIFF_TYPE_BYTE = 1;
static private final int TIFF_TYPE_ASCII = 2;
static private final int TIFF_TYPE_SHORT = 3;
static private final int TIFF_TYPE_LONG = 4;
static private final int TIFF_TYPE_RATIONAL = 5;
static private final int TIFF_TYPE_UNDEFINED = 7;
static private final int SAMPLE_FORMAT_UNSIGNED_INT = 1;
static private final int SAMPLE_FORMAT_SIGNED_INT = 2;
static private final int SAMPLE_FORMAT_FLOAT = 3;
static private final int SAMPLE_FORMAT_UNDEFINED = 4;
static private final int COMPRESSION_NO = 1;
static private final int COMPRESSION_CCITT_HUFFMAN = 2;
static private final int COMPRESSION_T4 = 3;
static private final int COMPRESSION_T6 = 4;
static private final int COMPRESSION_LZW = 5;
static private final int COMPRESSION_JPEG_OLD = 6;
static private final int COMPRESSION_JPEG_NEW = 7;
static private final int COMPRESSION_DEFLATE = 8;
static private final int PHOTOMETRIC_INTERPRETATION_WHITE_IS_ZERO = 0;
static private final int PHOTOMETRIC_INTERPRETATION_BLACK_IS_ZERO = 1;
static private final int PHOTOMETRIC_INTERPRETATION_RGB = 2;
static private final int PHOTOMETRIC_INTERPRETATION_PALETTE = 3;
static private final int PHOTOMETRIC_INTERPRETATION_TRANSPARENCY = 4;
static private final int PLANAR_CONFIGURATION_CHUNKY = 1;
static private final int PLANAR_CONFIGURATION_PLANAR = 2;
static private final int RESOLUTION_UNIT_NO = 1;
static private final int RESOLUTION_UNIT_INCH = 2;
static private final int RESOLUTION_UNIT_CENTIMETER = 3;
static public void main (String[] args) throws Exception {
FileOutputStream output = new FileOutputStream("test.tif");
TiffWriter writer = new TiffWriter();
writer.start(output, imageWidth, imageHeight);
for (int i = 0; i < 16; i++) {
Pixmap pixmap = new Pixmap(...);
writer.frame(pixmap, "run", i, 16);
}
writer.end();
writer.close();
}
}
This MainActivity.java was written for quantised models and I'm trying to use an unquantised model.
After making the changes as mentioned here and here to MainActivity.java, my code is:
public class MainActivity extends AppCompatActivity implements AdapterView.OnItemSelectedListener {
private static final String TAG = "MainActivity";
private Button mRun;
private ImageView mImageView;
private Bitmap mSelectedImage;
private GraphicOverlay mGraphicOverlay;
// Max width (portrait mode)
private Integer mImageMaxWidth;
// Max height (portrait mode)
private Integer mImageMaxHeight;
private final String[] mFilePaths =
new String[]{"mountain.jpg", "tennis.jpg","96580.jpg"};
/**
* Name of the model file hosted with Firebase.
*/
private static final String HOSTED_MODEL_NAME = "mobilenet_v1_224_quant";
private static final String LOCAL_MODEL_ASSET = "retrained_graph_mobilenet_1_224.tflite";
/**
* Name of the label file stored in Assets.
*/
private static final String LABEL_PATH = "labels.txt";
/**
* Number of results to show in the UI.
*/
private static final int RESULTS_TO_SHOW = 3;
/**
* Dimensions of inputs.
*/
private static final int DIM_BATCH_SIZE = 1;
private static final int DIM_PIXEL_SIZE = 3;
private static final int DIM_IMG_SIZE_X = 224;
private static final int DIM_IMG_SIZE_Y = 224;
private static final int IMAGE_MEAN = 128;
private static final float IMAGE_STD = 128.0f;
/**
* Labels corresponding to the output of the vision model.
*/
private List<String> mLabelList;
private final PriorityQueue<Map.Entry<String, Float>> sortedLabels =
new PriorityQueue<>(
RESULTS_TO_SHOW,
new Comparator<Map.Entry<String, Float>>() {
@Override
public int compare(Map.Entry<String, Float> o1, Map.Entry<String, Float>
o2) {
return (o1.getValue()).compareTo(o2.getValue());
}
});
/* Preallocated buffers for storing image data. */
private final int[] intValues = new int[DIM_IMG_SIZE_X * DIM_IMG_SIZE_Y];
/**
* An instance of the driver class to run model inference with Firebase.
*/
private FirebaseModelInterpreter mInterpreter;
/**
* Data configuration of input & output data of model.
*/
private FirebaseModelInputOutputOptions mDataOptions;
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
mGraphicOverlay = findViewById(R.id.graphic_overlay);
mImageView = findViewById(R.id.image_view);
Spinner dropdown = findViewById(R.id.spinner);
List<String> items = new ArrayList<>();
for (int i = 0; i < mFilePaths.length; i++) {
items.add("Image " + (i + 1));
}
ArrayAdapter<String> adapter = new ArrayAdapter<>(this, android.R.layout
.simple_spinner_dropdown_item, items);
dropdown.setAdapter(adapter);
dropdown.setOnItemSelectedListener(this);
mLabelList = loadLabelList(this);
mRun = findViewById(R.id.button_run);
mRun.setOnClickListener(new View.OnClickListener() {
@Override
public void onClick(View v) {
runModelInference();
}
});
int[] inputDims = {DIM_BATCH_SIZE, DIM_IMG_SIZE_X, DIM_IMG_SIZE_Y, DIM_PIXEL_SIZE};
int[] outputDims = {DIM_BATCH_SIZE, mLabelList.size()};
try {
mDataOptions =
new FirebaseModelInputOutputOptions.Builder()
.setInputFormat(0, FirebaseModelDataType.FLOAT32, inputDims)
.setOutputFormat(0, FirebaseModelDataType.FLOAT32, outputDims)
.build();
FirebaseModelDownloadConditions conditions = new FirebaseModelDownloadConditions
.Builder()
.requireWifi()
.build();
FirebaseLocalModelSource localModelSource =
new FirebaseLocalModelSource.Builder("asset")
.setAssetFilePath(LOCAL_MODEL_ASSET).build();
FirebaseCloudModelSource cloudSource = new FirebaseCloudModelSource.Builder
(HOSTED_MODEL_NAME)
.enableModelUpdates(true)
.setInitialDownloadConditions(conditions)
.setUpdatesDownloadConditions(conditions) // You could also specify
// different conditions
// for updates
.build();
FirebaseModelManager manager = FirebaseModelManager.getInstance();
manager.registerLocalModelSource(localModelSource);
manager.registerCloudModelSource(cloudSource);
FirebaseModelOptions modelOptions =
new FirebaseModelOptions.Builder()
.setCloudModelName(HOSTED_MODEL_NAME)
.setLocalModelName("asset")
.build();
mInterpreter = FirebaseModelInterpreter.getInstance(modelOptions);
} catch (FirebaseMLException e) {
showToast("Error while setting up the model");
e.printStackTrace();
}
}
private void runModelInference() {
if (mInterpreter == null) {
Log.e(TAG, "Image classifier has not been initialized; Skipped.");
return;
}
// Create input data.
ByteBuffer imgData = convertBitmapToByteBuffer(mSelectedImage, mSelectedImage.getWidth(),
mSelectedImage.getHeight());
try {
FirebaseModelInputs inputs = new FirebaseModelInputs.Builder().add(imgData).build();
// Here's where the magic happens!!
mInterpreter
.run(inputs, mDataOptions)
.addOnFailureListener(new OnFailureListener() {
@Override
public void onFailure(@NonNull Exception e) {
e.printStackTrace();
showToast("Error running model inference");
}
})
.continueWith(
new Continuation<FirebaseModelOutputs, List<String>>() {
@Override
public List<String> then(Task<FirebaseModelOutputs> task) {
float[][] labelProbArray = task.getResult()
.<float[][]>getOutput(0);
List<String> topLabels = getTopLabels(labelProbArray);
mGraphicOverlay.clear();
GraphicOverlay.Graphic labelGraphic = new LabelGraphic
(mGraphicOverlay, topLabels);
mGraphicOverlay.add(labelGraphic);
return topLabels;
}
});
} catch (FirebaseMLException e) {
e.printStackTrace();
showToast("Error running model inference");
}
}
/**
* Gets the top labels in the results.
*/
private synchronized List<String> getTopLabels(float[][] labelProbArray) {
for (int i = 0; i < mLabelList.size(); ++i) {
sortedLabels.add(
new AbstractMap.SimpleEntry<>(mLabelList.get(i), (labelProbArray[0][i] )));
if (sortedLabels.size() > RESULTS_TO_SHOW) {
sortedLabels.poll();
}
}
List<String> result = new ArrayList<>();
final int size = sortedLabels.size();
for (int i = 0; i < size; ++i) {
Map.Entry<String, Float> label = sortedLabels.poll();
result.add(label.getKey() + ":" + label.getValue());
}
Log.d(TAG, "labels: " + result.toString());
return result;
}
/**
* Reads label list from Assets.
*/
private List<String> loadLabelList(Activity activity) {
List<String> labelList = new ArrayList<>();
try (BufferedReader reader =
new BufferedReader(new InputStreamReader(activity.getAssets().open
(LABEL_PATH)))) {
String line;
while ((line = reader.readLine()) != null) {
labelList.add(line);
}
} catch (IOException e) {
Log.e(TAG, "Failed to read label list.", e);
}
return labelList;
}
/**
* Writes Image data into a {@code ByteBuffer}.
*/
private synchronized ByteBuffer convertBitmapToByteBuffer(
Bitmap bitmap, int width, int height) {
ByteBuffer imgData =
ByteBuffer.allocateDirect(
4*DIM_BATCH_SIZE * DIM_IMG_SIZE_X * DIM_IMG_SIZE_Y * DIM_PIXEL_SIZE);
imgData.order(ByteOrder.nativeOrder());
Bitmap scaledBitmap = Bitmap.createScaledBitmap(bitmap, DIM_IMG_SIZE_X, DIM_IMG_SIZE_Y,
true);
imgData.rewind();
scaledBitmap.getPixels(intValues, 0, scaledBitmap.getWidth(), 0, 0,
scaledBitmap.getWidth(), scaledBitmap.getHeight());
// Convert the image to int points.
int pixel = 0;
for (int i = 0; i < DIM_IMG_SIZE_X; ++i) {
for (int j = 0; j < DIM_IMG_SIZE_Y; ++j) {
final int val = intValues[pixel++];
imgData.putFloat((((val >> 16) & 0xFF)-IMAGE_MEAN)/IMAGE_STD);
imgData.putFloat((((val >> 8) & 0xFF)-IMAGE_MEAN)/IMAGE_STD);
imgData.putFloat(((val & 0xFF)-IMAGE_MEAN)/IMAGE_STD);
}
}
return imgData;
}
private void showToast(String message) {
Toast.makeText(getApplicationContext(), message, Toast.LENGTH_SHORT).show();
}
public void onItemSelected(AdapterView<?> parent, View v, int position, long id) {
mGraphicOverlay.clear();
mSelectedImage = getBitmapFromAsset(this, mFilePaths[position]);
if (mSelectedImage != null) {
// Get the dimensions of the View
Pair<Integer, Integer> targetedSize = getTargetedWidthHeight();
int targetWidth = targetedSize.first;
int maxHeight = targetedSize.second;
// Determine how much to scale down the image
float scaleFactor =
Math.max(
(float) mSelectedImage.getWidth() / (float) targetWidth,
(float) mSelectedImage.getHeight() / (float) maxHeight);
Bitmap resizedBitmap =
Bitmap.createScaledBitmap(
mSelectedImage,
(int) (mSelectedImage.getWidth() / scaleFactor),
(int) (mSelectedImage.getHeight() / scaleFactor),
true);
mImageView.setImageBitmap(resizedBitmap);
mSelectedImage = resizedBitmap;
}
}
@Override
public void onNothingSelected(AdapterView<?> parent) {
// Do nothing
}
// Utility functions for loading and resizing images from app asset folder.
public static Bitmap getBitmapFromAsset(Context context, String filePath) {
AssetManager assetManager = context.getAssets();
InputStream is;
Bitmap bitmap = null;
try {
is = assetManager.open(filePath);
bitmap = BitmapFactory.decodeStream(is);
} catch (IOException e) {
e.printStackTrace();
}
return bitmap;
}
// Returns max image width, always for portrait mode. Caller needs to swap width / height for
// landscape mode.
private Integer getImageMaxWidth() {
if (mImageMaxWidth == null) {
// Calculate the max width in portrait mode. This is done lazily since we need to
// wait for a UI layout pass to get the right values. So delay it to first time image
// rendering time.
mImageMaxWidth = mImageView.getWidth();
}
return mImageMaxWidth;
}
// Returns max image height, always for portrait mode. Caller needs to swap width / height for
// landscape mode.
private Integer getImageMaxHeight() {
if (mImageMaxHeight == null) {
// Calculate the max height in portrait mode. This is done lazily since we need to
// wait for a UI layout pass to get the right values. So delay it to first time image
// rendering time.
mImageMaxHeight =
mImageView.getHeight();
}
return mImageMaxHeight;
}
// Gets the targeted width / height.
private Pair<Integer, Integer> getTargetedWidthHeight() {
int targetWidth;
int targetHeight;
int maxWidthForPortraitMode = getImageMaxWidth();
int maxHeightForPortraitMode = getImageMaxHeight();
targetWidth = maxWidthForPortraitMode;
targetHeight = maxHeightForPortraitMode;
return new Pair<>(targetWidth, targetHeight);
}
}
But I'm still getting Failed to get input dimensions. 0-th input should have 268203 bytes, but found 1072812 bytes for Inception, and 0-th input should have 150528 bytes, but found 602112 bytes for MobileNet. So there is always a factor of 4.
To see what I've changed, the output of diff original.java changed.java is: (Ignore the line numbers)
32a33,34
> private static final int IMAGE_MEAN = 128;
> private static final float IMAGE_STD = 128.0f;
150,151c152,153
< byte[][] labelProbArray = task.getResult()
< .<byte[][]>getOutput(0);
---
> float[][] labelProbArray = task.getResult()
> .<float[][]>getOutput(0);
170c172
< private synchronized List<String> getTopLabels(byte[][] labelProbArray) {
---
> private synchronized List<String> getTopLabels(float[][] labelProbArray) {
173,174c175
< new AbstractMap.SimpleEntry<>(mLabelList.get(i), (labelProbArray[0][i] &
< 0xff) / 255.0f));
---
> new AbstractMap.SimpleEntry<>(mLabelList.get(i), (labelProbArray[0][i] )));
214c215,216
< DIM_BATCH_SIZE * DIM_IMG_SIZE_X * DIM_IMG_SIZE_Y * DIM_PIXEL_SIZE);
---
> 4*DIM_BATCH_SIZE * DIM_IMG_SIZE_X * DIM_IMG_SIZE_Y * DIM_PIXEL_SIZE);
>
226,228c228,232
< imgData.put((byte) ((val >> 16) & 0xFF));
< imgData.put((byte) ((val >> 8) & 0xFF));
< imgData.put((byte) (val & 0xFF));
---
> imgData.putFloat((((val >> 16) & 0xFF)-IMAGE_MEAN)/IMAGE_STD);
> imgData.putFloat((((val >> 8) & 0xFF)-IMAGE_MEAN)/IMAGE_STD);
> imgData.putFloat(((val & 0xFF)-IMAGE_MEAN)/IMAGE_STD);
This is how the buffer is allocated in the code lab:
ByteBuffer imgData = ByteBuffer.allocateDirect(
DIM_BATCH_SIZE * DIM_IMG_SIZE_X * DIM_IMG_SIZE_Y * DIM_PIXEL_SIZE);
DIM_BATCH_SIZE - A typical usage is for supporting batch processing (if the model supports it). In our sample and probably your test, you feed 1 image at a time and just keep it as 1.
DIM_PIXEL_SIZE - We set 3 in the code lab, which corresponds to r/g/b 1 byte each.
However, it looks like you are using a float model. In that case, instead of one byte each for r/g/b, you use a float (4 bytes) to represent each of r/g/b (you already figured out this part yourself), so the buffer allocated by the code above is no longer sufficient.
You can follow example here for float models:
https://github.com/tensorflow/tensorflow/blob/25b4086bb5ba1788ceb6032eda58348f6e20a71d/tensorflow/contrib/lite/java/demo/app/src/main/java/com/example/android/tflitecamerademo/ImageClassifierFloatInception.java
To be exact about populating imgData, the allocation formula should be:
ByteBuffer imgData = ByteBuffer.allocateDirect(
DIM_BATCH_SIZE * getImageSizeX() * getImageSizeY() * DIM_PIXEL_SIZE
* getNumBytesPerChannel());
getNumBytesPerChannel() should be 4 in your case.
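Putting the pieces together, a minimal sketch of a float-model conversion could look like this. It reuses the DIM_* and IMAGE_* constants from your MainActivity; the sizes are just an assumption and should be tuned to your model:

// Sketch of a float-model input buffer: one float (4 bytes) per channel, mean/std normalized.
private ByteBuffer convertBitmapToFloatBuffer(Bitmap bitmap) {
    int bytesPerChannel = 4; // float
    ByteBuffer imgData = ByteBuffer.allocateDirect(
            DIM_BATCH_SIZE * DIM_IMG_SIZE_X * DIM_IMG_SIZE_Y * DIM_PIXEL_SIZE * bytesPerChannel);
    imgData.order(ByteOrder.nativeOrder());

    Bitmap scaled = Bitmap.createScaledBitmap(bitmap, DIM_IMG_SIZE_X, DIM_IMG_SIZE_Y, true);
    int[] pixels = new int[DIM_IMG_SIZE_X * DIM_IMG_SIZE_Y];
    scaled.getPixels(pixels, 0, scaled.getWidth(), 0, 0, scaled.getWidth(), scaled.getHeight());

    for (int val : pixels) {
        // Extract r/g/b and normalize each channel to roughly [-1, 1]
        imgData.putFloat((((val >> 16) & 0xFF) - IMAGE_MEAN) / IMAGE_STD);
        imgData.putFloat((((val >> 8) & 0xFF) - IMAGE_MEAN) / IMAGE_STD);
        imgData.putFloat(((val & 0xFF) - IMAGE_MEAN) / IMAGE_STD);
    }
    return imgData;
}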
[Update for the new question, regarding the error below]:
Failed to get input dimensions. 0-th input should have 268203 bytes, but found 1072812 bytes
This check verifies that the number of bytes expected by the model equals the number of bytes passed in: 268203 = 299 * 299 * 3 and 1072812 = 4 * 299 * 299 * 3. It looks like you are using a quantized model but feeding it data for a float model. Could you double-check the model you used? To keep things simple, don't specify a cloud model source and use the local model from assets only.
[Update 0628, developer said they trained a float model]:
It could be that your model is wrong; it could also be that you have a cloud model downloaded which overrides your local model. But the error message tells us that the model being loaded is NOT a float model.
To isolate the issue, I'd recommend the following tests:
1) Remove setCloudModelName / registerCloudModelSource from the quick start app.
2) Try the official TFLite float model. You will have to download the model mentioned in the comment and change Camera2BasicFragment to use ImageClassifierFloatInception (instead of ImageClassifierQuantizedMobileNet).
3) Still using the same TFLite sample app, switch to your own trained model. Make sure to tune the image size to your values.
I am working with the Android Google Vision API and have created a standard barcode reader, but I want to detect what type/format of barcode is read, i.e. CODE 39, CODE 128, QR Code, etc.
Is there any way to return the type?
Thanks
Because I did not find any built-in function to decode the format integer value to a text value, I used the following custom method:
private String decodeFormat(int format) {
switch (format){
case Barcode.CODE_128:
return "CODE_128";
case Barcode.CODE_39:
return "CODE_39";
case Barcode.CODE_93:
return "CODE_93";
case Barcode.CODABAR:
return "CODABAR";
case Barcode.DATA_MATRIX:
return "DATA_MATRIX";
case Barcode.EAN_13:
return "EAN_13";
case Barcode.EAN_8:
return "EAN_8";
case Barcode.ITF:
return "ITF";
case Barcode.QR_CODE:
return "QR_CODE";
case Barcode.UPC_A:
return "UPC_A";
case Barcode.UPC_E:
return "UPC_E";
case Barcode.PDF417:
return "PDF417";
case Barcode.AZTEC:
return "AZTEC";
default:
return "";
}
}
Found it in the documentation (missed it previously): https://developers.google.com/android/reference/com/google/android/gms/vision/barcode/Barcode
Using format you can get the barcode format; this is returned as an integer.
valueFormat returns the content type; it can be matched against the static constants of the API.
Example:
final SparseArray<Barcode> barcodes = detections.getDetectedItems();
if (barcodes.size() != 0) {
txtBarcodeValue.post(new Runnable() {
@Override
public void run() {
System.out.println("barcodes");
System.out.println(barcodes.valueAt(0).format); // 256
System.out.println(barcodes.valueAt(0).valueFormat); // 1 or 2 or 3 ....
......
.
And the codes can be found in the Barcode class:
public static final int CONTACT_INFO = 1;
public static final int EMAIL = 2;
public static final int ISBN = 3;
public static final int PHONE = 4;
public static final int PRODUCT = 5;
public static final int SMS = 6;
public static final int TEXT = 7;
public static final int URL = 8;
public static final int WIFI = 9;
public static final int GEO = 10;
I am experiencing weird behavior with Java objects. I have this ComponentPlane class in two different versions. The difference is marked by ******.
First WORKING Version
package app.pathsom.som.output;
import java.awt.Color;
import java.awt.Font;
import java.awt.Graphics;
import java.awt.Graphics2D;
import javax.swing.JPanel;
import app.pathsom.som.map.Lattice;
import app.pathsom.som.map.Node;
public class ComponentPlane extends JPanel{
private Lattice lattice;
private int componentNumber;
private double minValue;
private double maxValue;
private double origMinValue;
private double origMaxValue;
public ComponentPlane(Lattice lattice, int componentNumber){
this.lattice = new Lattice();
this.componentNumber = componentNumber;
initLattice(lattice);
initComponentPlane();
}
private void initLattice(Lattice lattice){
this.lattice.setLatticeHeight(lattice.getLatticeHeight());
this.lattice.setLatticeWidth(lattice.getLatticeWidth());
this.lattice.setNumberOfNodeElements(lattice.getNumberOfNodeElements());
this.lattice.initializeValues();
this.lattice.setNodeHeight(lattice.getNodeHeight());
this.lattice.setNodeWidth(lattice.getNodeWidth());
this.lattice.setTotalNumberOfNodes(lattice.getTotalNumberOfNodes());
for(int i = 0; i < lattice.getTotalNumberOfNodes(); i++){
******this.lattice.getLatticeNode()[i] = new Node(lattice.getLatticeNode()[i]);******
}
}
}
The only difference in the second NON-WORKING version is this method, which REPLACES the method above:
private void initLattice(Lattice lattice){
//same code here
for(int i = 0; i < lattice.getTotalNumberOfNodes(); i++){
******this.lattice.getLatticeNode()[i] = lattice.getLatticeNode()[i];******
}
}
I have also tried doing a third non-working version which is...
private void initLattice(Lattice lattice){
//same code here
******this.lattice.setLatticeNode(lattice.getLatticeNode());******
}
A constructor in the Node class (which is USED in the first WORKING version) is this one:
public Node (Node node){
this.xPos = node.xPos;
this.yPos = node.yPos;
this.numOfElements = node.numOfElements;
this.cluster = -1;
this.nodeIndex = node.getNodeIndex();
for(int i = 0; i < this.numOfElements; i++){
this.addElement(node.getDoubleElementAt(i));
}
}
Lattice.class
public class Lattice {
private int latticeWidth;
private int latticeHeight;
private int numOfNodeElements;
private int nodeWidth;
private int nodeHeight;
private int totalNumOfNodes;
private Node[] latticeNodes;
private final int MAP_RADIUS = 225;
public Lattice(int latticeWidth, int latticeHeight, int numOfNodeElements){
this.latticeWidth = latticeWidth;
this.latticeHeight = latticeHeight;
this.numOfNodeElements = numOfNodeElements;
initializeLattice();
}
public Lattice(){
this(10, 10, 3);
}
public void initializeValues(){
totalNumOfNodes = this.latticeHeight * this.latticeWidth;
latticeNodes = new Node[totalNumOfNodes]; //specify the array of nodes
nodeWidth = (int) Math.floor(450/this.latticeWidth);
nodeHeight = (int) Math.floor(450/this.latticeHeight);
}
protected void initializeLattice(){
totalNumOfNodes = this.latticeHeight * this.latticeWidth;
latticeNodes = new Node[totalNumOfNodes];
nodeWidth = (int) Math.floor(450/this.latticeWidth);
nodeHeight = (int) Math.floor(450/this.latticeHeight);
//initialize colors
for(int i = 0; i <totalNumOfNodes; i++){
latticeNodes[i] = new Node(((i % this.latticeWidth) * nodeWidth) + nodeWidth / 2,
((i / this.latticeWidth) * nodeHeight ) + nodeHeight/2, numOfNodeElements, i);
latticeNodes[i].setNodeColor(new Color((int)(latticeNodes[i].getDoubleElementAt(0)
* 255), (int)(latticeNodes[i].getDoubleElementAt(1) * 255), (int) (latticeNodes[i].getDoubleElementAt(2) * 255)));
}
}
public int getLatticeHeight(){
return latticeHeight;
}
public void setLatticeHeight(int latticeHeight){
this.latticeHeight = latticeHeight;
}
public Node[] getLatticeNode(){
return latticeNodes;
}
public void setLatticeNode(Node[] latticeNodes){
this.latticeNodes = latticeNodes;
}
public int getLatticeWidth(){
return latticeWidth;
}
public void setLatticeWidth(int latticeWidth){
this.latticeWidth = latticeWidth;
}
public int getNodeHeight(){
return nodeHeight;
}
public int getNodeWidth(){
return nodeWidth;
}
public void setNodeHeight(int nodeHeight){
this.nodeHeight = nodeHeight;
}
public void setNodeWidth(int nodeWidth){
this.nodeWidth = nodeWidth;
}
public int getNumberOfNodeElements(){
return numOfNodeElements;
}
public void setNumberOfNodeElements(int numOfNodeElements){
this.numOfNodeElements = numOfNodeElements;
}
public int getTotalNumberOfNodes(){
return totalNumOfNodes;
}
public void setTotalNumberOfNodes(int totalNumberOfNodes){
this.totalNumOfNodes = totalNumberOfNodes;
}
}
A Visualization class initiates all these actions and stores the ComponentPlane array. Here is the method:
public void initComponentPlanes(){
componentPlanes = new ComponentPlane[somtrainer.getLattice().getNumberOfNodeElements()];
int size = somtrainer.getLattice().getNumberOfNodeElements();
for(int i = 0; i < size; i++){
System.out.println(i + ": " + inputData.getVariableLabels()[i] + " size : " + size);
componentPlanes[i] = new ComponentPlane(somtrainer.getLattice(), i);
componentPlanes[i].setBounds((240 - 225)/2, (280-240)/2, 225, 240);
componentPlanes[i].setOrigMaxMin(maxMin[i][0], maxMin[i][1]);
}
}
My problems are:
The first one works fine. It creates heatmaps (component planes) for each component number, meaning they differ from each other, but I cannot use it: the line marked with ****** (which calls "this.addElement..." in the Node constructor) gives me an OutOfMemoryError, so it lags and freezes whenever I have many component planes to create (I am actually building an ARRAY of ComponentPlane objects). That is why I decided to try the second and third option. I have already increased my heap size, so that is out of the question.
If I use the second or third one, I get no lag even with a large number of ComponentPlanes (probably because less memory is used since no new Node objects are created), but they produce wrong heatmaps. All heatmaps are the same, and they all look like the last element of the ComponentPlanes array (e.g. if I have ten ComponentPlane objects, all heatmaps look exactly like the tenth one).
All of the heatmaps are like this - the same as the last heatmap in the array:
Is there a way to make the second and third one work?
The obvious difference is that in the first one you're creating new nodes, and in the others you're reusing the old ones. The line
this.lattice.getLatticeNode()[i] = lattice.getLatticeNode()[i];
is setting one object equal to another. If, later, a property of lattice.getLatticeNode()[i] changes, that will also affect this.lattice.getLatticeNode()[i]. And that can cause hard-to-find bugs. In contrast, the line
this.lattice.getLatticeNode()[i] = new Node(lattice.getLatticeNode()[i]);
is creating a new object, distinct from the old one. But, of course, this means that you're using more memory, because now you have two objects instead of one.
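Here is a tiny standalone illustration of that difference, using a made-up Point class rather than your Node, just to show the aliasing:

public class AliasingDemo {
    static class Point {
        int x;
        Point(int x) { this.x = x; }
        Point(Point other) { this.x = other.x; } // copy constructor, like your Node(Node)
    }

    public static void main(String[] args) {
        Point original = new Point(1);

        Point alias = original;           // same object, two references
        Point copy = new Point(original); // new, independent object

        original.x = 42;                  // later change to the original

        System.out.println(alias.x);      // 42 - the alias sees the change
        System.out.println(copy.x);       // 1  - the copy is unaffected
    }
}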
There are little things you can do to reduce the amount of memory used. private final int MAP_RADIUS = 225; could be made static, so a new copy of the constant isn't created for each Lattice instance.
How can you play multiple (audio) byte arrays simultaneously? These byte arrays are recorded by a TargetDataLine and transferred via a server.
What I've tried so far
Using SourceDataLine:
There is no way to play multiple streams using SourceDataLine, because the write method blocks until the buffer is written. This problem cannot be fixed using Threads, because only one SourceDataLine can write concurrently.
Using the AudioPlayer Class:
ByteInputStream stream2 = new ByteInputStream(data, 0, data.length);
AudioInputStream stream = new AudioInputStream(stream2, VoiceChat.format, data.length);
AudioPlayer.player.start(stream);
This just plays noise on the clients.
EDIT
I don't receive the voice packets at the same time; it's not simultaneous, more "overlapping".
Apparently Java's Mixer interface was not designed for this.
http://docs.oracle.com/javase/7/docs/api/javax/sound/sampled/Mixer.html:
A mixer is an audio device with one or more lines. It need not be
designed for mixing audio signals.
And indeed, when I try to open multiple lines on the same mixer, this fails with a LineUnavailableException. However, if all your audio recordings have the same audio format, it's quite easy to mix them together manually. For example, if you have 2 inputs:
Convert both to the appropriate data type (for example byte[] for 8 bit audio, short[] for 16 bit, float[] for 32 bit floating point etc)
Sum them in another array. Make sure summed values do not exceed the range of the datatype.
Convert output back to bytes and write that to the SourceDataLine
See also How is audio represented with numbers?
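For example, a minimal sketch of those three steps for two equal-length byte arrays of 16-bit little-endian PCM (here clamping keeps the sum in range; the sample further down instead halves the volume):

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.ShortBuffer;

public class PcmMixer {

    // Mixes two equal-length buffers of 16-bit little-endian PCM into one.
    public static byte[] mix(byte[] a, byte[] b) {
        short[] samplesA = new short[a.length / 2];
        short[] samplesB = new short[b.length / 2];
        ByteBuffer.wrap(a).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(samplesA);
        ByteBuffer.wrap(b).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(samplesB);

        byte[] out = new byte[a.length];
        ShortBuffer outSamples = ByteBuffer.wrap(out).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer();
        for (int i = 0; i < samplesA.length; i++) {
            int sum = samplesA[i] + samplesB[i];
            // Clamp so the sum stays inside the 16-bit range instead of wrapping around
            outSamples.put((short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, sum)));
        }
        return out; // ready to be written to the SourceDataLine
    }
}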
Here's a sample mixing down 2 recordings and outputting them as 1 signal, all in 16-bit 48 kHz stereo.
// print all devices (both input and output)
int i = 0;
Mixer.Info[] infos = AudioSystem.getMixerInfo();
for (Mixer.Info info : infos)
System.out.println(i++ + ": " + info.getName());
// select 2 inputs and 1 output
System.out.println("Select input 1: ");
int in1Index = Integer.parseInt(System.console().readLine());
System.out.println("Select input 2: ");
int in2Index = Integer.parseInt(System.console().readLine());
System.out.println("Select output: ");
int outIndex = Integer.parseInt(System.console().readLine());
// ugly java sound api stuff
try (Mixer in1Mixer = AudioSystem.getMixer(infos[in1Index]);
Mixer in2Mixer = AudioSystem.getMixer(infos[in2Index]);
Mixer outMixer = AudioSystem.getMixer(infos[outIndex])) {
in1Mixer.open();
in2Mixer.open();
outMixer.open();
try (TargetDataLine in1Line = (TargetDataLine) in1Mixer.getLine(in1Mixer.getTargetLineInfo()[0]);
TargetDataLine in2Line = (TargetDataLine) in2Mixer.getLine(in2Mixer.getTargetLineInfo()[0]);
SourceDataLine outLine = (SourceDataLine) outMixer.getLine(outMixer.getSourceLineInfo()[0])) {
// audio format 48 kHz 16 bit stereo (signed little endian)
AudioFormat format = new AudioFormat(48000.0f, 16, 2, true, false);
// 4 bytes per frame (16 bit samples stereo)
int frameSize = 4;
int bufferSize = 4800;
int bufferBytes = frameSize * bufferSize;
// buffers for java audio
byte[] in1Bytes = new byte[bufferBytes];
byte[] in2Bytes = new byte[bufferBytes];
byte[] outBytes = new byte[bufferBytes];
// buffers for mixing
short[] in1Samples = new short[bufferBytes / 2];
short[] in2Samples = new short[bufferBytes / 2];
short[] outSamples = new short[bufferBytes / 2];
// how long to record & play
int framesProcessed = 0;
int durationSeconds = 10;
int durationFrames = (int) (durationSeconds * format.getSampleRate());
// open devices
in1Line.open(format, bufferBytes);
in2Line.open(format, bufferBytes);
outLine.open(format, bufferBytes);
in1Line.start();
in2Line.start();
outLine.start();
// start audio loop
while (framesProcessed < durationFrames) {
// record audio
in1Line.read(in1Bytes, 0, bufferBytes);
in2Line.read(in2Bytes, 0, bufferBytes);
// convert input bytes to samples
ByteBuffer.wrap(in1Bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(in1Samples);
ByteBuffer.wrap(in2Bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(in2Samples);
// mix samples - lower volume by 50% since we're mixing 2 streams
for (int s = 0; s < bufferBytes / 2; s++)
outSamples[s] = (short) ((in1Samples[s] + in2Samples[s]) * 0.5);
// convert output samples to bytes
ByteBuffer.wrap(outBytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().put(outSamples);
// play audio
outLine.write(outBytes, 0, bufferBytes);
framesProcessed += bufferBytes / frameSize;
}
in1Line.stop();
in2Line.stop();
outLine.stop();
}
}
Alright, I put something together which should get you started. I'll post the full code below, but I'll first try to explain the steps involved.
The interesting part here is to create your own audio "mixer" class which allows consumers of that class to schedule audio blocks at specific points in the (near) future. The specific-point-in-time part is important here: I'm assuming you receive network voices in packets where each packet needs to start exactly at the end of the previous one in order to play back a continuous sound for a single voice. Also, since you say voices can overlap, I'm assuming (yes, lots of assumptions) a new one can come in over the network while one or more old ones are still playing. So it seems reasonable to allow audio blocks to be scheduled from any thread. Note that there's only one thread actually writing to the dataline; it's just that any thread can submit audio packets to the mixer.
So for the submit-audio-packet part we now have this:
private final ConcurrentLinkedQueue<QueuedBlock> scheduledBlocks;
public void mix(long when, short[] block) {
scheduledBlocks.add(new QueuedBlock(when, Arrays.copyOf(block, block.length)));
}
The QueuedBlock class is just used to tag a byte array (the audio buffer) with the "when": the point in time where the block should be played.
Points in time are expressed relative to the current position of the audio stream. It is set to zero when the stream is created and updated with the buffer size each time an audio buffer is written to the dataline:
private final AtomicLong position = new AtomicLong();
public long position() {
return position.get();
}
Apart from all the hassle to set up the data line, the interesting part of the mixer class is obviously where the mixdown happens. For each scheduled audio block, it's split up into 3 cases:
The block has already been played in its entirety. Remove it from the scheduledBlocks list.
The block is scheduled to start at some point in time after the current buffer. Do nothing.
(Part of) the block should be mixed down into the current buffer. Note that the beginning of the block may (or may not) have already been played in previous buffer(s). Similarly, the end of the scheduled block may exceed the end of the current buffer, in which case we mix down the first part of it and leave the rest for the next round, until all of it has been played and the entire block is removed.
Also note that there's no reliable way to start playing audio data immediately. When you submit packets to the mixer, be sure to always have them start at least the duration of 1 audio buffer from now; otherwise you'll risk losing the beginning of your sound. Here's the mixdown code:
private static final double MIXDOWN_VOLUME = 1.0 / NUM_PRODUCERS;
private final List<QueuedBlock> finished = new ArrayList<>();
private final short[] mixBuffer = new short[BUFFER_SIZE_FRAMES * CHANNELS];
private final byte[] audioBuffer = new byte[BUFFER_SIZE_FRAMES * CHANNELS * 2];
private final AtomicLong position = new AtomicLong();
Arrays.fill(mixBuffer, (short) 0);
long bufferStartAt = position.get();
for (QueuedBlock block : scheduledBlocks) {
int blockFrames = block.data.length / CHANNELS;
// block fully played - mark for deletion
if (block.when + blockFrames <= bufferStartAt) {
finished.add(block);
continue;
}
// block starts after end of current buffer
if (bufferStartAt + BUFFER_SIZE_FRAMES <= block.when)
continue;
// mix in part of the block which overlaps current buffer
int blockOffset = Math.max(0, (int) (bufferStartAt - block.when));
int blockMaxFrames = blockFrames - blockOffset;
int bufferOffset = Math.max(0, (int) (block.when - bufferStartAt));
int bufferMaxFrames = BUFFER_SIZE_FRAMES - bufferOffset;
for (int f = 0; f < blockMaxFrames && f < bufferMaxFrames; f++)
for (int c = 0; c < CHANNELS; c++) {
int bufferIndex = (bufferOffset + f) * CHANNELS + c;
int blockIndex = (blockOffset + f) * CHANNELS + c;
mixBuffer[bufferIndex] += (short)
(block.data[blockIndex]*MIXDOWN_VOLUME);
}
}
scheduledBlocks.removeAll(finished);
finished.clear();
ByteBuffer
.wrap(audioBuffer)
.order(ByteOrder.LITTLE_ENDIAN)
.asShortBuffer()
.put(mixBuffer);
line.write(audioBuffer, 0, audioBuffer.length);
position.addAndGet(BUFFER_SIZE_FRAMES);
And finally a complete, self-contained sample which spawns a number of threads submitting audio blocks representing sine waves of random duration and frequency to the mixer (called AudioConsumer in this sample). Replace the sine waves with incoming network packets and you should be halfway to a solution.
package test;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.Line;
import javax.sound.sampled.Mixer;
import javax.sound.sampled.SourceDataLine;
public class Test {
public static final int CHANNELS = 2;
public static final int SAMPLE_RATE = 48000;
public static final int NUM_PRODUCERS = 10;
public static final int BUFFER_SIZE_FRAMES = 4800;
// generates some random sine wave
public static class ToneGenerator {
private static final double[] NOTES = {261.63, 311.13, 392.00};
private static final double[] OCTAVES = {1.0, 2.0, 4.0, 8.0};
private static final double[] LENGTHS = {0.05, 0.25, 1.0, 2.5, 5.0};
private double phase;
private int framesProcessed;
private final double length;
private final double frequency;
public ToneGenerator() {
ThreadLocalRandom rand = ThreadLocalRandom.current();
length = LENGTHS[rand.nextInt(LENGTHS.length)];
frequency = NOTES[rand.nextInt(NOTES.length)] * OCTAVES[rand.nextInt(OCTAVES.length)];
}
// make sound
public void fill(short[] block) {
for (int f = 0; f < block.length / CHANNELS; f++) {
double sample = Math.sin(phase * 2.0 * Math.PI);
for (int c = 0; c < CHANNELS; c++)
block[f * CHANNELS + c] = (short) (sample * Short.MAX_VALUE);
phase += frequency / SAMPLE_RATE;
}
framesProcessed += block.length / CHANNELS;
}
// true if length of tone has been generated
public boolean done() {
return framesProcessed >= length * SAMPLE_RATE;
}
}
// dummy audio producer, based on sinewave generator
// above but could also be incoming network packets
public static class AudioProducer {
final Thread thread;
final AudioConsumer consumer;
final short[] buffer = new short[BUFFER_SIZE_FRAMES * CHANNELS];
public AudioProducer(AudioConsumer consumer) {
this.consumer = consumer;
thread = new Thread(() -> run());
thread.setDaemon(true);
}
public void start() {
thread.start();
}
// repeatedly play random sine and sleep for some time
void run() {
try {
ThreadLocalRandom rand = ThreadLocalRandom.current();
while (true) {
long pos = consumer.position();
ToneGenerator g = new ToneGenerator();
// if we schedule at current buffer position, first part of the tone will be
// missed so have tone start somewhere in the middle of the next buffer
pos += BUFFER_SIZE_FRAMES + rand.nextInt(BUFFER_SIZE_FRAMES);
while (!g.done()) {
g.fill(buffer);
consumer.mix(pos, buffer);
pos += BUFFER_SIZE_FRAMES;
// we can generate audio faster than it's played
// sleep a while to compensate - this more closely
// corresponds to playing audio coming in over the network
double bufferLengthMillis = BUFFER_SIZE_FRAMES * 1000.0 / SAMPLE_RATE;
Thread.sleep((int) (bufferLengthMillis * 0.9));
}
// sleep a while in between tones
Thread.sleep(1000 + rand.nextInt(2000));
}
} catch (Throwable t) {
System.out.println(t.getMessage());
t.printStackTrace();
}
}
}
// audio consumer - plays continuously on a background
// thread, allows audio to be mixed in from arbitrary threads
public static class AudioConsumer {
// audio block with "when to play" tag
private static class QueuedBlock {
final long when;
final short[] data;
public QueuedBlock(long when, short[] data) {
this.when = when;
this.data = data;
}
}
// need not normally be so low but in this example
// we're mixing down a bunch of full scale sinewaves
private static final double MIXDOWN_VOLUME = 1.0 / NUM_PRODUCERS;
private final List<QueuedBlock> finished = new ArrayList<>();
private final short[] mixBuffer = new short[BUFFER_SIZE_FRAMES * CHANNELS];
private final byte[] audioBuffer = new byte[BUFFER_SIZE_FRAMES * CHANNELS * 2];
private final Thread thread;
private final AtomicLong position = new AtomicLong();
private final AtomicBoolean running = new AtomicBoolean(true);
private final ConcurrentLinkedQueue<QueuedBlock> scheduledBlocks = new ConcurrentLinkedQueue<>();
public AudioConsumer() {
thread = new Thread(() -> run());
}
public void start() {
thread.start();
}
public void stop() {
running.set(false);
}
// gets the play cursor. note - this is not accurate and
// must only be used to schedule blocks relative to other blocks
// (e.g., for splitting up continuous sounds into multiple blocks)
public long position() {
return position.get();
}
// put copy of audio block into queue so we don't
// have to worry about caller messing with it afterwards
public void mix(long when, short[] block) {
scheduledBlocks.add(new QueuedBlock(when, Arrays.copyOf(block, block.length)));
}
// better hope mixer 0, line 0 is output
private void run() {
Mixer.Info[] mixerInfo = AudioSystem.getMixerInfo();
try (Mixer mixer = AudioSystem.getMixer(mixerInfo[0])) {
Line.Info[] lineInfo = mixer.getSourceLineInfo();
try (SourceDataLine line = (SourceDataLine) mixer.getLine(lineInfo[0])) {
line.open(new AudioFormat(SAMPLE_RATE, 16, CHANNELS, true, false), BUFFER_SIZE_FRAMES);
line.start();
while (running.get())
processSingleBuffer(line);
line.stop();
}
} catch (Throwable t) {
System.out.println(t.getMessage());
t.printStackTrace();
}
}
// mix down single buffer and offer to the audio device
private void processSingleBuffer(SourceDataLine line) {
Arrays.fill(mixBuffer, (short) 0);
long bufferStartAt = position.get();
// mixdown audio blocks
for (QueuedBlock block : scheduledBlocks) {
int blockFrames = block.data.length / CHANNELS;
// block fully played - mark for deletion
if (block.when + blockFrames <= bufferStartAt) {
finished.add(block);
continue;
}
// block starts after end of current buffer
if (bufferStartAt + BUFFER_SIZE_FRAMES <= block.when)
continue;
// mix in part of the block which overlaps current buffer
// note that block may have already started in the past
// but extends into the current buffer, or that it starts
// in the future but before the end of the current buffer
int blockOffset = Math.max(0, (int) (bufferStartAt - block.when));
int blockMaxFrames = blockFrames - blockOffset;
int bufferOffset = Math.max(0, (int) (block.when - bufferStartAt));
int bufferMaxFrames = BUFFER_SIZE_FRAMES - bufferOffset;
for (int f = 0; f < blockMaxFrames && f < bufferMaxFrames; f++)
for (int c = 0; c < CHANNELS; c++) {
int bufferIndex = (bufferOffset + f) * CHANNELS + c;
int blockIndex = (blockOffset + f) * CHANNELS + c;
mixBuffer[bufferIndex] += (short) (block.data[blockIndex] * MIXDOWN_VOLUME);
}
}
scheduledBlocks.removeAll(finished);
finished.clear();
ByteBuffer.wrap(audioBuffer).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().put(mixBuffer);
line.write(audioBuffer, 0, audioBuffer.length);
position.addAndGet(BUFFER_SIZE_FRAMES);
}
}
public static void main(String[] args) {
System.out.print("Press return to exit...");
AudioConsumer consumer = new AudioConsumer();
consumer.start();
for (int i = 0; i < NUM_PRODUCERS; i++)
new AudioProducer(consumer).start();
System.console().readLine();
consumer.stop();
}
}
You can use the Tritonus library to do software audio mixing (it's old but still works quite well).
Add the dependency to your project:
<dependency>
<groupId>com.googlecode.soundlibs</groupId>
<artifactId>tritonus-all</artifactId>
<version>0.3.7.2</version>
</dependency>
Use the org.tritonus.share.sampled.FloatSampleBuffer. Both buffers must be of the same AudioFormat before calling #mix.
// TODO instantiate these variables with real data
byte[] audio1, audio2;
AudioFormat af1, af2;
SourceDataLine sdl = AudioSystem.getSourceDataLine(af1);
FloatSampleBuffer fsb1 = new FloatSampleBuffer(audio1, 0, audio1.length, af1.getFormat());
FloatSampleBuffer fsb2 = new FloatSampleBuffer(audio2, 0, audio2.length, af2.getFormat());
fsb1.mix(fsb2);
byte[] result = fsb1.convertToByteArray(af1);
sdl.write(result, 0, result.length); // play it
I am writing a program for a smartphone (on Android). It is about:
Analyzing the spectrum of sound with an FFT algorithm
Measuring the intensity of the sound at f = fo (e.g. fo = 18 kHz) from the spectrum obtained by the analysis above
Calculating the distance from the smartphone to the sound source from that intensity
After the FFT, I get two arrays (real and imaginary). I calculate the sound intensity at f = 18000 Hz (assuming the sound source at 18000 Hz is constant, which makes it easier to measure the sound intensity), as follows:
The frequency at bin FFT[i] is:
if i <= [N/2] then i * SamplingFrequency / N
if i >= [N/2] then (N - i) * SamplingFrequency / N
Therefore, for frequency = 18000 Hz I choose i = 304, and
sound intensity = real_array[304] * real_array[304] + image_array[304] * image_array[304]
However, the intensity in fact varies a lot, making it difficult to measure the distance, and I have no idea how to explain this.
Besides, I would like to ask what unit the intensity I have measured above is expressed in.
Here is my code:
a. FFT algorithm (I use a 512-point FFT)
import define.define512;
public class fft {
private static float[] W_real;
private static float[] W_img;
private static float[] input_real= new float[512];
private static float[] input_img;
//input_real1 is values from mic(smartphone)
//output is values of sound intensity
public static void FFT(float[] input_real1, float[] output)
{
for(int i =0;i<512;i++) input_real[i] = input_real1[i];
input_img = new float[512];
W_real = define512.W_IMAG;
W_img = define512.W_IMAG;
int[] W_order = define512.ORDER;
float[] output_real = new float[512], output_img = new float[512];
fftradix2(0,511);
//reorder deals with inverse bit
reorder(input_real, input_img, output_real, output_img, W_order, 512);
for(int i =0;i<512;i++)
{
output[i] = sqrt((output_real[i]*output_real[i] + output_img[i]*output_img[i]));
}
}
private static void reorder(float[] in_real,float[] in_imag, float[] out_real,float[] out_imag,int[] order,int N){
for(int i=0;i<N;i++){
out_real[i]=in_real[order[i]];
out_imag[i]=in_imag[order[i]];
}
}
//fft algorithms
private static void fftradix2(int dau,int cuoi)
{
int check = cuoi - dau;
if (check == 1)
{
input_real[dau] = input_real[dau] + input_real[cuoi];
input_img[dau] = input_img[dau] + input_img[cuoi];
input_real[cuoi] = input_real[dau] -2* input_real[cuoi];
input_img[cuoi] = input_img[dau] -2* input_img[cuoi];
}
else
{
int index = 512/(cuoi - dau + 1);
int tg = (cuoi - dau)/2;
fftradix2(dau,(dau+tg));
fftradix2((cuoi-tg),cuoi);
for(int i = dau;i<=(dau+tg);i++)
{
input_real[i] = input_real[i] + input_real[i+tg+1]*W_real[(i-dau)*index] - input_img[i+tg+1]*W_img[(i-dau)*index];
input_img[i] = input_img[i] + input_real[i+tg+1]*W_img[(i-dau)*index] + input_img[i+tg+1]*W_real[(i%(tg+1))*index];
input_real[i+tg+1] = input_real[i] -2* input_real[i+tg+1]*W_real[(i-dau)*index] +2* input_img[i+tg+1]*W_img[(i-dau)*index];
input_img[i+tg+1] = input_img[i] -2* input_real[i+tg+1]*W_img[(i-dau)*index] -2* input_img[i+tg+1]*W_real[(i-dau)*index];
}
}
}
}
b. Code using the mic on the smartphone
NumOverlapSample = 800;
NumNewSample = 224;
private static int Fs = 44100;
private byte recorderAudiobuffer[] = new byte [1024];
AudioRecord recorder = new AudioRecord(AudioSource.MIC, Fs, AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, 4096);
//start recorder
recorder.startRecording();
timer.schedule(new task_update(), 1000, 10);
class task_update extends TimerTask
{
@Override
public void run() {
// TODO Auto-generated method stub
for(int i=0;i<NumOverlapSample;i++)
recorderAudiobuffer[i] = recorderAudiobuffer[i+NumNewSample];
int bufferRead = recorder.read(recorderAudiobuffer,NumOverlapSample,NumNewSample);
convert.decode(recorderAudiobuffer, N, input);
fft.FFT(input, output);
}
And my source: https://www.box.com/s/zuppzkicymfsuv4kb65p
Thanks to all.
At 18 kHz, microphone type, position and direction, as well as sound reflections from the nearby acoustic environment will strongly influence the sound level.