How to export an embedded file from Excel using POI? - java

I have written a basic Java program (below) that embeds three kinds of files (ppt, doc, txt) in an Excel sheet using Apache POI. Now I want to export these embedded files in their original formats. How can I do this?
Reference link: Embed files into Excel using Apache POI.
I built my program from that link.
In short, I want export functionality for the embedded files.
I have tried to solve the problem with the code below, but it does not export the embedded files from the Excel sheet:
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Iterator;
import org.apache.poi.hslf.HSLFSlideShow;
import org.apache.poi.hslf.usermodel.SlideShow;
import org.apache.poi.hssf.usermodel.HSSFObjectData;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.poifs.filesystem.DirectoryNode;
import org.apache.poi.poifs.filesystem.Entry;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;
// wrapper class added so the snippet compiles
public class ReadEmbedded {
    public static void main(String[] args) throws IOException {
        String fileName = "ole_ppt_in_xls.xls";
        ReadExcel(fileName);
    }
    public static void ReadExcel(String fileName) throws IOException {
        FileInputStream inputFileStream = new FileInputStream(fileName);
        POIFSFileSystem fs = new POIFSFileSystem(inputFileStream);
        HSSFWorkbook workbook = new HSSFWorkbook(fs);
        for (HSSFObjectData obj : workbook.getAllEmbeddedObjects()) {
            // the OLE2 Class Name of the object
            String oleName = obj.getOLE2ClassName();
            System.out.println(oleName);
            if (oleName.equals("Worksheet")) {
                System.out.println("Worksheet");
                DirectoryNode dn = (DirectoryNode) obj.getDirectory();
                HSSFWorkbook embeddedWorkbook = new HSSFWorkbook(dn, fs, false);
            } else if (oleName.equals("Document")) {
                System.out.println("Document");
                DirectoryNode dn = (DirectoryNode) obj.getDirectory();
                HWPFDocument embeddedWordDocument = new HWPFDocument(dn, fs);
            } else if (oleName.equals("Presentation")) { // duplicate "Presentation" branch removed
                System.out.println("Presentation");
                DirectoryNode dn = (DirectoryNode) obj.getDirectory();
                SlideShow embeddedPowerPointDocument = new SlideShow(
                        new HSLFSlideShow(dn, fs));
            } else {
                System.out.println("Else part");
                if (obj.hasDirectoryEntry()) {
                    System.out.println("obj.hasDirectoryEntry(): " + obj.hasDirectoryEntry());
                    // The DirectoryEntry is a DocumentNode. Examine its entries
                    DirectoryNode dn = (DirectoryNode) obj.getDirectory();
                    for (Iterator<Entry> entries = dn.getEntries(); entries.hasNext();) {
                        Entry entry = entries.next();
                        System.out.println(oleName + "." + entry.getName());
                    }
                } else {
                    System.out.println("Else part 22");
                    byte[] objectData = obj.getObjectData();
                }
            }
        }
    }
}
Output screen of the above program:
So, how can I implement the export functionality?

import java.io.File;
import java.io.FileInputStream;
import java.util.ArrayList;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.openxml4j.opc.PackagePart;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
/**
 * Demonstrates how you can extract embedded data from a .xlsx file
 */
public class GetEmbedded {
    public static void main(String[] args) throws Exception {
        String path = "SomeExcelFile.xlsx";
        XSSFWorkbook workbook = new XSSFWorkbook(new FileInputStream(new File(path)));
        for (PackagePart pPart : workbook.getAllEmbedds()) {
            String contentType = pPart.getContentType();
            if (contentType.equals("application/vnd.ms-excel")) { // an xls workbook embedded in the xlsx file
                HSSFWorkbook embeddedWorkbook = new HSSFWorkbook(pPart.getInputStream());
                int countOfSheetXls = embeddedWorkbook.getNumberOfSheets();
            } else if (contentType.equals("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet")) { // an xlsx workbook embedded in the xlsx file
                // "/xl/embeddings/Microsoft_Excel_Worksheet12.xlsx" - reads only the embedded workbook of parent sheet 12
                if (pPart.getPartName().getName().equals("/xl/embeddings/Microsoft_Excel_Worksheet12.xlsx")) {
                    XSSFWorkbook embeddedWorkbook = new XSSFWorkbook(pPart.getInputStream());
                    int countOfSheetXlsx = embeddedWorkbook.getNumberOfSheets();
                    ArrayList<String> sheetNames = new ArrayList<String>();
                    for (int i = 0; i < countOfSheetXlsx; i++) {
                        // read the names from the embedded workbook, not the parent
                        String name = embeddedWorkbook.getSheetName(i);
                        sheetNames.add(name);
                    }
                }
            }
        }
    }
}
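Note that the snippet above only opens the embedded workbooks in memory. To export a part in its original format, it is enough to copy the raw bytes of the PackagePart to a file. A minimal sketch (the target path handling is just an illustration):
import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import org.apache.poi.openxml4j.opc.PackagePart;
public class PackagePartExporter {
    /** Copies the raw bytes of an embedded package part to the given file. */
    static void exportPart(PackagePart pPart, File targetFile) throws Exception {
        try (InputStream is = pPart.getInputStream();
                OutputStream os = new FileOutputStream(targetFile)) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = is.read(buf)) != -1) {
                os.write(buf, 0, n);
            }
        }
    }
}
Inside the loop above this could be called e.g. as exportPart(pPart, new File("out", new File(pPart.getPartName().getName()).getName())).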

This is partly a duplicate of How to get pictures with names from an xls file using Apache POI, for which I've written the original paste.
As requested, I've also added an example of how to add an embedding with the help of an OLE 1.0 packager - in the meantime I've added that code to POI, so this is easier now. For the OOXML-based files, have a look into this answer.
The code iterates through all shapes of the DrawingPatriarch and extracts the pictures and embedded files.
I've added the full code - instead of a snippet - to this answer, as I expect the next "why can't I export this kind of embedding" question to come up soon ...
package poijartest;
import java.awt.Color;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.Method;
import java.net.URL;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.imageio.ImageIO;
import org.apache.poi.ddf.EscherComplexProperty;
import org.apache.poi.ddf.EscherOptRecord;
import org.apache.poi.ddf.EscherProperty;
import org.apache.poi.hpsf.ClassID;
import org.apache.poi.hslf.usermodel.HSLFSlideShow;
import org.apache.poi.hssf.usermodel.HSSFClientAnchor;
import org.apache.poi.hssf.usermodel.HSSFObjectData;
import org.apache.poi.hssf.usermodel.HSSFPatriarch;
import org.apache.poi.hssf.usermodel.HSSFPicture;
import org.apache.poi.hssf.usermodel.HSSFPictureData;
import org.apache.poi.hssf.usermodel.HSSFShape;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFSimpleShape;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.openxml4j.opc.PackagePart;
import org.apache.poi.poifs.filesystem.DirectoryNode;
import org.apache.poi.poifs.filesystem.Entry;
import org.apache.poi.poifs.filesystem.Ole10Native;
import org.apache.poi.poifs.filesystem.Ole10NativeException;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;
import org.apache.poi.sl.usermodel.AutoShape;
import org.apache.poi.sl.usermodel.ShapeType;
import org.apache.poi.sl.usermodel.Slide;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import org.apache.poi.util.IOUtils;
import org.apache.poi.xssf.usermodel.XSSFDrawing;
import org.apache.poi.xssf.usermodel.XSSFPicture;
import org.apache.poi.xssf.usermodel.XSSFPictureData;
import org.apache.poi.xssf.usermodel.XSSFShape;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.openxmlformats.schemas.drawingml.x2006.spreadsheetDrawing.CTPicture;
/**
* Tested with POI 3.16-beta1
*
* 17.12.2014: original version for
* http://apache-poi.1045710.n5.nabble.com/How-to-get-the-full-file-name-of-a-picture-in-xls-file-td5717205.html
*
* 17.12.2016: added sample/dummy data for
* https://stackoverflow.com/questions/41101012/how-to-export-embeded-file-which-from-excel-using-poi
*/
public class EmbeddedReader {
private File excel_file;
private ImageReader image_reader;
public static void main(String[] args) throws Exception {
File sample = new File("bla.xls");
getSampleEmbedded(sample);
ImageReader ir = new ImageReader(sample);
for (EmbeddedData ed : ir.embeddings) {
System.out.println(ed.filename);
FileOutputStream fos = new FileOutputStream(ed.filename);
IOUtils.copy(ed.is, fos);
fos.close();
}
ir.close();
}
static void getSampleEmbedded(File sample) throws IOException {
HSSFWorkbook wb = new HSSFWorkbook();
int storageId = wb.addOlePackage(getSamplePPT(), "dummy.ppt", "dummy.ppt", "dummy.ppt");
int picId = wb.addPicture(getSamplePng(), HSSFPicture.PICTURE_TYPE_PNG);
HSSFSheet sheet = wb.createSheet();
HSSFPatriarch pat = sheet.createDrawingPatriarch();
HSSFClientAnchor anc = pat.createAnchor(0, 0, 0, 0, 1, 1, 3, 6);
HSSFObjectData od = pat.createObjectData(anc, storageId, picId);
od.setNoFill(true);
wb.write(sample);
wb.close();
}
static byte[] getSamplePng() throws IOException {
ClassLoader cl = Thread.currentThread().getContextClassLoader();
URL imgUrl = cl.getResource("javax/swing/plaf/metal/icons/ocean/directory.gif");
BufferedImage img = ImageIO.read(imgUrl);
ByteArrayOutputStream bos = new ByteArrayOutputStream();
ImageIO.write(img, "PNG", bos);
return bos.toByteArray();
}
static byte[] getSamplePPT() throws IOException {
HSLFSlideShow ppt = new HSLFSlideShow();
Slide<?,?> slide = ppt.createSlide();
AutoShape<?,?> sh1 = slide.createAutoShape();
sh1.setShapeType(ShapeType.STAR_32);
sh1.setAnchor(new java.awt.Rectangle(50, 50, 100, 200));
sh1.setFillColor(Color.red);
ByteArrayOutputStream bos = new ByteArrayOutputStream();
ppt.write(bos);
ppt.close();
POIFSFileSystem poifs = new POIFSFileSystem(new ByteArrayInputStream(bos.toByteArray()));
poifs.getRoot().setStorageClsid(ClassID.PPT_SHOW);
bos.reset();
poifs.writeFilesystem(bos);
poifs.close();
return bos.toByteArray();
}
public EmbeddedReader(String excel_path) throws IOException {
excel_file = new File(excel_path);
image_reader = new ImageReader(excel_file);
}
public String[] get_file_names() {
ArrayList<String> file_names = new ArrayList<String>();
for (EmbeddedData ed : image_reader.embeddings) {
file_names.add(ed.filename);
}
return file_names.toArray(new String[file_names.size()]);
}
public InputStream get_stream(String file_name) {
InputStream input_stream = null;
for (EmbeddedData ed : image_reader.embeddings) {
if(file_name.equals(ed.filename)) {
input_stream = ed.is;
break;
}
}
return input_stream;
}
static class ImageReader implements Closeable {
EmbeddedExtractor extractors[] = {
new Ole10Extractor(), new PdfExtractor(), new WordExtractor(), new ExcelExtractor(), new FsExtractor()
};
List<EmbeddedData> embeddings = new ArrayList<EmbeddedData>();
Workbook wb;
public ImageReader(File excelfile) throws IOException {
try {
wb = WorkbookFactory.create(excelfile);
Sheet receiptImages = wb.getSheet("Receipt images");
if (wb instanceof XSSFWorkbook) {
addSheetPicsAndEmbedds((XSSFSheet)receiptImages);
} else {
addAllEmbedds((HSSFWorkbook)wb);
addSheetPics((HSSFSheet)receiptImages);
}
} catch (Exception e) {
// todo: error handling
}
}
protected void addSheetPicsAndEmbedds(XSSFSheet sheet) throws IOException {
if (sheet == null) return;
XSSFDrawing draw = sheet.createDrawingPatriarch();
for (XSSFShape shape : draw.getShapes()) {
if (!(shape instanceof XSSFPicture)) continue;
XSSFPicture picture = (XSSFPicture)shape;
XSSFPictureData pd = picture.getPictureData();
PackagePart pp = pd.getPackagePart();
CTPicture ctPic = picture.getCTPicture();
String filename = null;
try {
filename = ctPic.getNvPicPr().getCNvPr().getName();
} catch (Exception e) {}
if (filename == null || "".equals(filename)) {
filename = new File(pp.getPartName().toString()).getName();
}
EmbeddedData ed = new EmbeddedData();
ed.filename = fileNameWithoutPath(filename);
ed.is = pp.getInputStream();
embeddings.add(ed);
}
}
protected void addAllEmbedds(HSSFWorkbook hwb) throws IOException {
for (HSSFObjectData od : hwb.getAllEmbeddedObjects()) {
String alternativeName = getAlternativeName(od);
if (od.hasDirectoryEntry()) {
DirectoryNode src = (DirectoryNode)od.getDirectory();
for (EmbeddedExtractor ee : extractors) {
if (ee.canExtract(src)) {
EmbeddedData ed = ee.extract(src);
if (ed.filename == null || ed.filename.startsWith("MBD") || alternativeName != null) {
// guard against overwriting the filename with null when no alternative name exists
if (alternativeName != null) {
ed.filename = alternativeName;
}
}
ed.filename = fileNameWithoutPath(ed.filename);
ed.source = "object";
embeddings.add(ed);
break;
}
}
}
}
}
protected String getAlternativeName(HSSFShape shape) {
EscherOptRecord eor = reflectEscherOptRecord(shape);
if (eor == null) return null;
for (EscherProperty ep : eor.getEscherProperties()) {
if ("groupshape.shapename".equals(ep.getName()) && ep.isComplex()) {
return new String(((EscherComplexProperty)ep).getComplexData(),
Charset.forName("UTF-16LE"));
}
}
return null;
}
protected void addSheetPics(HSSFSheet sheet) {
if (sheet == null) return;
int picIdx=0;
int emfIdx = 0;
HSSFPatriarch patriarch = sheet.getDrawingPatriarch();
if (patriarch == null) return;
// Loop through the objects
for (HSSFShape shape : patriarch.getChildren()) {
if (!(shape instanceof HSSFPicture)) {
continue;
}
HSSFPicture picture = (HSSFPicture) shape;
if (picture.getShapeType() != HSSFSimpleShape.OBJECT_TYPE_PICTURE) continue;
HSSFPictureData pd = picture.getPictureData();
byte pictureBytes[] = pd.getData();
int pictureBytesOffset = 0;
int pictureBytesLen = pictureBytes.length;
String filename = picture.getFileName();
// try to find an alternative name
if (filename == null || "".equals(filename)) {
filename = getAlternativeName(picture);
}
// default to dummy name
if (filename == null || "".equals(filename)) {
filename = "picture"+(picIdx++);
}
filename = filename.trim();
// check for emf+ embedded pdf (poor mans style :( )
// Mac Excel 2011 embeds pdf files with this method.
boolean validFile = true;
if (pd.getFormat() == Workbook.PICTURE_TYPE_EMF) {
validFile = false;
int idxStart = indexOf(pictureBytes, 0, "%PDF-".getBytes());
if (idxStart != -1) {
int idxEnd = indexOf(pictureBytes, idxStart, "%%EOF".getBytes());
if (idxEnd != -1) {
pictureBytesOffset = idxStart;
pictureBytesLen = idxEnd-idxStart+6;
validFile = true;
}
} else {
// This shape was not a Mac Excel 2011 embedded pdf file.
// So this is a shape related to a regular embedded object
// Lets update the object filename with the shapes filename
// if the object filename is of format ARGF1234.pdf
EmbeddedData ed_obj = embeddings.get(emfIdx);
Pattern pattern = Pattern.compile("^[A-Z0-9]{8}\\.[pdfPDF]{3}$");
Matcher matcher = pattern.matcher(ed_obj.filename);
if(matcher.matches()) {
ed_obj.filename = filename;
}
emfIdx += 1;
}
}
EmbeddedData ed = new EmbeddedData();
ed.filename = fileNameWithoutPath(filename);
ed.is = new ByteArrayInputStream(pictureBytes, pictureBytesOffset, pictureBytesLen);
if(fileNotInEmbeddings(ed.filename) && validFile) {
embeddings.add(ed);
}
}
}
private static EscherOptRecord reflectEscherOptRecord(HSSFShape shape) {
try {
Method m = HSSFShape.class.getDeclaredMethod("getOptRecord");
m.setAccessible(true);
return (EscherOptRecord)m.invoke(shape);
} catch (Exception e) {
// todo: log ... well actually "should not happen" ;)
return null;
}
}
private String fileNameWithoutPath(String filename) {
int last_index = filename.lastIndexOf("\\");
return filename.substring(last_index + 1);
}
private boolean fileNotInEmbeddings(String filename) {
boolean exists = true;
for(EmbeddedData ed : embeddings) {
if(ed.filename.equals(filename)) {
exists = false;
}
}
return exists;
}
public void close() throws IOException {
Iterator<EmbeddedData> ed = embeddings.iterator();
while (ed.hasNext()) {
ed.next().is.close();
}
wb.close();
}
}
static class EmbeddedData {
String filename;
InputStream is;
String source;
}
static abstract class EmbeddedExtractor {
abstract boolean canExtract(DirectoryNode dn);
abstract EmbeddedData extract(DirectoryNode dn) throws IOException;
protected EmbeddedData extractFS(DirectoryNode dn, String filename) throws IOException {
assert(canExtract(dn));
POIFSFileSystem dest = new POIFSFileSystem();
copyNodes(dn, dest.getRoot());
EmbeddedData ed = new EmbeddedData();
ed.filename = filename;
ByteArrayOutputStream bos = new ByteArrayOutputStream();
dest.writeFilesystem(bos);
dest.close();
ed.is = new ByteArrayInputStream(bos.toByteArray());
return ed;
}
}
static class Ole10Extractor extends EmbeddedExtractor {
public boolean canExtract(DirectoryNode dn) {
ClassID clsId = dn.getStorageClsid();
return ClassID.OLE10_PACKAGE.equals(clsId);
}
public EmbeddedData extract(DirectoryNode dn) throws IOException {
try {
Ole10Native ole10 = Ole10Native.createFromEmbeddedOleObject(dn);
EmbeddedData ed = new EmbeddedData();
ed.filename = new File(ole10.getFileName()).getName();
ed.is = new ByteArrayInputStream(ole10.getDataBuffer());
return ed;
} catch (Ole10NativeException e) {
throw new IOException(e);
}
}
}
static class PdfExtractor extends EmbeddedExtractor {
static ClassID PdfClassID = new ClassID("{B801CA65-A1FC-11D0-85AD-444553540000}");
public boolean canExtract(DirectoryNode dn) {
ClassID clsId = dn.getStorageClsid();
return (PdfClassID.equals(clsId)
|| dn.hasEntry("CONTENTS"));
}
public EmbeddedData extract(DirectoryNode dn) throws IOException {
EmbeddedData ed = new EmbeddedData();
ed.is = dn.createDocumentInputStream("CONTENTS");
ed.filename = dn.getName()+".pdf";
return ed;
}
}
static class WordExtractor extends EmbeddedExtractor {
public boolean canExtract(DirectoryNode dn) {
ClassID clsId = dn.getStorageClsid();
return (ClassID.WORD95.equals(clsId)
|| ClassID.WORD97.equals(clsId)
|| dn.hasEntry("WordDocument"));
}
public EmbeddedData extract(DirectoryNode dn) throws IOException {
return extractFS(dn, dn.getName()+".doc");
}
}
static class ExcelExtractor extends EmbeddedExtractor {
public boolean canExtract(DirectoryNode dn) {
ClassID clsId = dn.getStorageClsid();
return (ClassID.EXCEL95.equals(clsId)
|| ClassID.EXCEL97.equals(clsId)
|| dn.hasEntry("Workbook") /*...*/);
}
public EmbeddedData extract(DirectoryNode dn) throws IOException {
return extractFS(dn, dn.getName()+".xls");
}
}
static class FsExtractor extends EmbeddedExtractor {
public boolean canExtract(DirectoryNode dn) {
return true;
}
public EmbeddedData extract(DirectoryNode dn) throws IOException {
return extractFS(dn, dn.getName()+".dat");
}
}
private static void copyNodes(DirectoryNode src, DirectoryNode dest) throws IOException {
for (Entry e : src) {
if (e instanceof DirectoryNode) {
DirectoryNode srcDir = (DirectoryNode)e;
DirectoryNode destDir = (DirectoryNode)dest.createDirectory(srcDir.getName());
destDir.setStorageClsid(srcDir.getStorageClsid());
copyNodes(srcDir, destDir);
} else {
InputStream is = src.createDocumentInputStream(e);
dest.createDocument(e.getName(), is);
is.close();
}
}
}
/**
* Knuth-Morris-Pratt Algorithm for Pattern Matching
* Finds the first occurrence of the pattern in the text.
*/
private static int indexOf(byte[] data, int offset, byte[] pattern) {
int[] failure = computeFailure(pattern);
int j = 0;
if (data.length == 0) return -1;
for (int i = offset; i < data.length; i++) {
while (j > 0 && pattern[j] != data[i]) {
j = failure[j - 1];
}
if (pattern[j] == data[i]) { j++; }
if (j == pattern.length) {
return i - pattern.length + 1;
}
}
return -1;
}
/**
* Computes the failure function using a boot-strapping process,
* where the pattern is matched against itself.
*/
private static int[] computeFailure(byte[] pattern) {
int[] failure = new int[pattern.length];
int j = 0;
for (int i = 1; i < pattern.length; i++) {
while (j > 0 && pattern[j] != pattern[i]) {
j = failure[j - 1];
}
if (pattern[j] == pattern[i]) {
j++;
}
failure[i] = j;
}
return failure;
}
}

Required jar file list:
commons-codec-1.10.jar
dom4j.jar
poi-3.16-beta1.jar
poi-ooxml-3.8.jar
poi-ooxml-schemas-3.9.jar
poi-scratchpad-3.9.jar
xmlbeans-2.3.0.jar
(Note that mixing POI jars from different releases like this is error-prone; ideally poi, poi-ooxml, poi-ooxml-schemas and poi-scratchpad should all come from the same version, e.g. 3.16-beta1.)
This is my whole code implementation:
import java.awt.Color;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.Method;
import java.net.URL;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.imageio.ImageIO;
import org.apache.poi.ddf.EscherComplexProperty;
import org.apache.poi.ddf.EscherOptRecord;
import org.apache.poi.ddf.EscherProperty;
import org.apache.poi.hpsf.ClassID;
import org.apache.poi.hssf.usermodel.HSSFClientAnchor;
import org.apache.poi.hssf.usermodel.HSSFObjectData;
import org.apache.poi.hssf.usermodel.HSSFPatriarch;
import org.apache.poi.hssf.usermodel.HSSFPicture;
import org.apache.poi.hssf.usermodel.HSSFPictureData;
import org.apache.poi.hssf.usermodel.HSSFShape;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFSimpleShape;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.openxml4j.opc.PackagePart;
import org.apache.poi.poifs.filesystem.DirectoryNode;
import org.apache.poi.poifs.filesystem.Entry;
import org.apache.poi.poifs.filesystem.Ole10Native;
import org.apache.poi.poifs.filesystem.Ole10NativeException;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;
import org.apache.poi.sl.usermodel.AutoShape;
import org.apache.poi.sl.usermodel.Slide;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import org.apache.poi.util.IOUtils;
import org.apache.poi.xssf.usermodel.XSSFDrawing;
import org.apache.poi.xssf.usermodel.XSSFPicture;
import org.apache.poi.xssf.usermodel.XSSFPictureData;
import org.apache.poi.xssf.usermodel.XSSFShape;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.openxmlformats.schemas.drawingml.x2006.spreadsheetDrawing.CTPicture;
public class EmbeddedReader {
public static final OleType OLE10_PACKAGE = new OleType("{0003000C-0000-0000-C000-000000000046}");
public static final OleType PPT_SHOW = new OleType("{64818D10-4F9B-11CF-86EA-00AA00B929E8}");
public static final OleType XLS_WORKBOOK = new OleType("{00020841-0000-0000-C000-000000000046}");
public static final OleType TXT_ONLY = new OleType("{5e941d80-bf96-11cd-b579-08002b30bfeb}");
public static final OleType EXCEL97 = new OleType("{00020820-0000-0000-C000-000000000046}");
public static final OleType EXCEL95 = new OleType("{00020810-0000-0000-C000-000000000046}");
public static final OleType WORD97 = new OleType("{00020906-0000-0000-C000-000000000046}");
public static final OleType WORD95 = new OleType("{00020900-0000-0000-C000-000000000046}");
public static final OleType POWERPOINT97 = new OleType("{64818D10-4F9B-11CF-86EA-00AA00B929E8}");
public static final OleType POWERPOINT95 = new OleType("{EA7BAE70-FB3B-11CD-A903-00AA00510EA3}");
public static final OleType EQUATION30 = new OleType("{0002CE02-0000-0000-C000-000000000046}");
public static final OleType PdfClassID = new OleType("{B801CA65-A1FC-11D0-85AD-444553540000}");
private File excel_file;
private ImageReader image_reader;
static class OleType {
final String classId;
OleType(String classId) {
this.classId = classId;
}
ClassID getClassID() {
ClassID cls = new ClassID();
byte clsBytes[] = cls.getBytes();
String clsStr = classId.replaceAll("[{}-]", "");
for (int i = 0; i < clsStr.length(); i += 2) {
clsBytes[i / 2] = (byte) Integer.parseInt(
clsStr.substring(i, i + 2), 16);
}
return cls;
}
}
public static void main(String[] args) throws Exception {
File sample = new File("D:\\ole_ppt_in_xls.xls");
ImageReader ir = new ImageReader(sample);
for (EmbeddedData ed : ir.embeddings) {
FileOutputStream fos = new FileOutputStream(System.getProperty("user.home") + "/Desktop" + "/sumit/"+ ed.filename);
IOUtils.copy(ed.is, fos);
fos.close();
}
ir.close();
}
static byte[] getSamplePng() throws IOException {
ClassLoader cl = Thread.currentThread().getContextClassLoader();
URL imgUrl = cl.getResource("javax/swing/plaf/metal/icons/ocean/directory.gif");
BufferedImage img = ImageIO.read(imgUrl);
ByteArrayOutputStream bos = new ByteArrayOutputStream();
ImageIO.write(img, "PNG", bos);
return bos.toByteArray();
}
public EmbeddedReader(String excel_path) throws IOException {
excel_file = new File(excel_path);
image_reader = new ImageReader(excel_file);
}
public String[] get_file_names() {
ArrayList<String> file_names = new ArrayList<String>();
for (EmbeddedData ed : image_reader.embeddings) {
file_names.add(ed.filename);
}
return file_names.toArray(new String[file_names.size()]);
}
public InputStream get_stream(String file_name) {
InputStream input_stream = null;
for (EmbeddedData ed : image_reader.embeddings) {
if (file_name.equals(ed.filename)) {
input_stream = ed.is;
break;
}
}
return input_stream;
}
static class ImageReader implements Closeable {
EmbeddedExtractor extractors[] = { new Ole10Extractor(),new PdfExtractor(), new WordExtractor(), new ExcelExtractor(),new FsExtractor() };
List<EmbeddedData> embeddings = new ArrayList<EmbeddedData>();
Workbook wb;
public ImageReader(File excelfile) throws IOException {
try {
wb = WorkbookFactory.create(excelfile);
Sheet receiptImages = wb.getSheet("Receipt images");
if (wb instanceof XSSFWorkbook) {
addSheetPicsAndEmbedds((XSSFSheet) receiptImages);
} else {
addAllEmbedds((HSSFWorkbook) wb);
addSheetPics((HSSFSheet) receiptImages);
}
} catch (Exception e) {
e.printStackTrace();
}
}
protected void addSheetPicsAndEmbedds(XSSFSheet sheet)throws IOException {
if (sheet == null)
return;
XSSFDrawing draw = sheet.createDrawingPatriarch();
for (XSSFShape shape : draw.getShapes()) {
if (!(shape instanceof XSSFPicture))
continue;
XSSFPicture picture = (XSSFPicture) shape;
XSSFPictureData pd = picture.getPictureData();
PackagePart pp = pd.getPackagePart();
CTPicture ctPic = picture.getCTPicture();
String filename = null;
try {
filename = ctPic.getNvPicPr().getCNvPr().getName();
} catch (Exception e) {
}
if (filename == null || "".equals(filename)) {
filename = new File(pp.getPartName().toString()).getName();
}
EmbeddedData ed = new EmbeddedData();
ed.filename = fileNameWithoutPath(filename);
ed.is = pp.getInputStream();
embeddings.add(ed);
}
}
protected void addAllEmbedds(HSSFWorkbook hwb) throws IOException {
for (HSSFObjectData od : hwb.getAllEmbeddedObjects()) {
String alternativeName = getAlternativeName(od);
if (od.hasDirectoryEntry()) {
DirectoryNode src = (DirectoryNode) od.getDirectory();
for (EmbeddedExtractor ee : extractors) {
if (ee.canExtract(src)) {
EmbeddedData ed = ee.extract(src);
if (ed.filename == null || ed.filename.startsWith("MBD")|| alternativeName != null) {
if (alternativeName != null) {
ed.filename = alternativeName;
}
}
ed.filename = fileNameWithoutPath(ed.filename);
ed.source = "object";
embeddings.add(ed);
break;
}
}
}
}
}
protected String getAlternativeName(HSSFShape shape) {
EscherOptRecord eor = reflectEscherOptRecord(shape);
if (eor == null) {
return null;
}
for (EscherProperty ep : eor.getEscherProperties()) {
if ("groupshape.shapename".equals(ep.getName())
&& ep.isComplex()) {
return new String(
((EscherComplexProperty) ep).getComplexData(),
Charset.forName("UTF-16LE"));
}
}
return null;
}
protected void addSheetPics(HSSFSheet sheet) {
if (sheet == null)
return;
int picIdx = 0;
int emfIdx = 0;
HSSFPatriarch patriarch = sheet.getDrawingPatriarch();
if (patriarch == null)
return;
// Loop through the objects
for (HSSFShape shape : patriarch.getChildren()) {
if (!(shape instanceof HSSFPicture)) {
continue;
}
HSSFPicture picture = (HSSFPicture) shape;
if (picture.getShapeType() != HSSFSimpleShape.OBJECT_TYPE_PICTURE)
continue;
HSSFPictureData pd = picture.getPictureData();
byte pictureBytes[] = pd.getData();
int pictureBytesOffset = 0;
int pictureBytesLen = pictureBytes.length;
String filename = picture.getFileName();
// try to find an alternative name
if (filename == null || "".equals(filename)) {
filename = getAlternativeName(picture);
}
// default to dummy name
if (filename == null || "".equals(filename)) {
filename = "picture" + (picIdx++);
}
filename = filename.trim();
// check for emf+ embedded pdf (poor mans style :( )
// Mac Excel 2011 embeds pdf files with this method.
boolean validFile = true;
if (pd.getFormat() == Workbook.PICTURE_TYPE_EMF) {
validFile = false;
int idxStart = indexOf(pictureBytes, 0, "%PDF-".getBytes());
if (idxStart != -1) {
int idxEnd = indexOf(pictureBytes, idxStart,"%%EOF".getBytes());
if (idxEnd != -1) {
pictureBytesOffset = idxStart;
pictureBytesLen = idxEnd - idxStart + 6;
validFile = true;
}
} else {
// This shape was not a Mac Excel 2011 embedded pdf file.
// So this is a shape related to a regular embedded object
// Lets update the object filename with the shapes filename
// if the object filename is of format ARGF1234.pdf
EmbeddedData ed_obj = embeddings.get(emfIdx);
Pattern pattern = Pattern
.compile("^[A-Z0-9]{8}\\.[pdfPDF]{3}$");
Matcher matcher = pattern.matcher(ed_obj.filename);
if (matcher.matches()) {
ed_obj.filename = filename;
}
emfIdx += 1;
}
}
EmbeddedData ed = new EmbeddedData();
ed.filename = fileNameWithoutPath(filename);
ed.is = new ByteArrayInputStream(pictureBytes,
pictureBytesOffset, pictureBytesLen);
if (fileNotInEmbeddings(ed.filename) && validFile) {
embeddings.add(ed);
}
}
}
private static EscherOptRecord reflectEscherOptRecord(HSSFShape shape) {
try {
Method m = HSSFShape.class.getDeclaredMethod("getOptRecord");
m.setAccessible(true);
return (EscherOptRecord) m.invoke(shape);
} catch (Exception e) {
e.printStackTrace();
return null;
}
}
private String fileNameWithoutPath(String filename) {
int last_index = filename.lastIndexOf("\\");
return filename.substring(last_index + 1);
}
private boolean fileNotInEmbeddings(String filename) {
boolean exists = true;
for (EmbeddedData ed : embeddings) {
if (ed.filename.equals(filename)) {
exists = false;
}
}
return exists;
}
public void close() throws IOException {
Iterator<EmbeddedData> ed = embeddings.iterator();
while (ed.hasNext()) {
ed.next().is.close();
}
wb.close();
}
}
static class EmbeddedData {
String filename;
InputStream is;
String source;
}
static abstract class EmbeddedExtractor {
abstract boolean canExtract(DirectoryNode dn);
abstract EmbeddedData extract(DirectoryNode dn) throws IOException;
protected EmbeddedData extractFS(DirectoryNode dn, String filename)
throws IOException {
assert (canExtract(dn));
POIFSFileSystem dest = new POIFSFileSystem();
copyNodes(dn, dest.getRoot());
EmbeddedData ed = new EmbeddedData();
ed.filename = filename;
ByteArrayOutputStream bos = new ByteArrayOutputStream();
dest.writeFilesystem(bos);
bos.close();
ed.is = new ByteArrayInputStream(bos.toByteArray());
return ed;
}
}
static class Ole10Extractor extends EmbeddedExtractor {
public boolean canExtract(DirectoryNode dn) {
ClassID clsId = dn.getStorageClsid();
return OLE10_PACKAGE.getClassID().equals(clsId); // compare ClassIDs, not the OleType wrapper
}
public EmbeddedData extract(DirectoryNode dn) throws IOException {
try {
Ole10Native ole10 = Ole10Native.createFromEmbeddedOleObject(dn);
EmbeddedData ed = new EmbeddedData();
ed.filename = new File(ole10.getFileName()).getName();
ed.is = new ByteArrayInputStream(ole10.getDataBuffer());
return ed;
} catch (Ole10NativeException e) {
e.printStackTrace();
throw new IOException(e);
}
}
}
static class PdfExtractor extends EmbeddedExtractor {
public boolean canExtract(DirectoryNode dn) {
ClassID clsId = dn.getStorageClsid();
return (PdfClassID.getClassID().equals(clsId) || dn.hasEntry("CONTENTS"));
}
public EmbeddedData extract(DirectoryNode dn) throws IOException {
EmbeddedData ed = new EmbeddedData();
ed.is = dn.createDocumentInputStream("CONTENTS");
ed.filename = dn.getName() + ".pdf";
return ed;
}
}
static class WordExtractor extends EmbeddedExtractor {
public boolean canExtract(DirectoryNode dn) {
ClassID clsId = dn.getStorageClsid();
return (WORD95.getClassID().equals(clsId) || WORD97.getClassID().equals(clsId) || dn.hasEntry("WordDocument"));
}
public EmbeddedData extract(DirectoryNode dn) throws IOException {
return extractFS(dn, dn.getName() + ".doc");
}
}
static class ExcelExtractor extends EmbeddedExtractor {
public boolean canExtract(DirectoryNode dn) {
ClassID clsId = dn.getStorageClsid();
return (EXCEL95.getClassID().equals(clsId) || EXCEL97.getClassID().equals(clsId) || dn.hasEntry("Workbook") /* ... */);
}
public EmbeddedData extract(DirectoryNode dn) throws IOException {
return extractFS(dn, dn.getName() + ".xls");
}
}
static class FsExtractor extends EmbeddedExtractor {
public boolean canExtract(DirectoryNode dn) {
return true;
}
public EmbeddedData extract(DirectoryNode dn) throws IOException {
return extractFS(dn, dn.getName() + ".dat");
}
}
private static void copyNodes(DirectoryNode src, DirectoryNode dest)
throws IOException {
for (Entry e : src) {
if (e instanceof DirectoryNode) {
DirectoryNode srcDir = (DirectoryNode) e;
DirectoryNode destDir = (DirectoryNode) dest
.createDirectory(srcDir.getName());
destDir.setStorageClsid(srcDir.getStorageClsid());
copyNodes(srcDir, destDir);
} else {
InputStream is = src.createDocumentInputStream(e);
dest.createDocument(e.getName(), is);
is.close();
}
}
}
/**
* Knuth-Morris-Pratt Algorithm for Pattern Matching Finds the first
* occurrence of the pattern in the text.
*/
private static int indexOf(byte[] data, int offset, byte[] pattern) {
int[] failure = computeFailure(pattern);
int j = 0;
if (data.length == 0)
return -1;
for (int i = offset; i < data.length; i++) {
while (j > 0 && pattern[j] != data[i]) {
j = failure[j - 1];
}
if (pattern[j] == data[i]) {
j++;
}
if (j == pattern.length) {
return i - pattern.length + 1;
}
}
return -1;
}
/**
* Computes the failure function using a boot-strapping process, where the
* pattern is matched against itself.
*/
private static int[] computeFailure(byte[] pattern) {
int[] failure = new int[pattern.length];
int j = 0;
for (int i = 1; i < pattern.length; i++) {
while (j > 0 && pattern[j] != pattern[i]) {
j = failure[j - 1];
}
if (pattern[j] == pattern[i]) {
j++;
}
failure[i] = j;
}
return failure;
}
}

To simplify the processing of embedded data, I've added an extractor class to POI, which will be available in POI 3.16-beta2 or in a nightly build until then.
The following extracts the objects of .xls/.xlsx files - all that is left is to write the embedded bytes somewhere. It's possible to extend the extractor classes by extending EmbeddedExtractor and providing your own iterator() method.
import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.poi.ss.extractor.EmbeddedData;
import org.apache.poi.ss.extractor.EmbeddedExtractor;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
public class BlaExtract {
    public static void main(String[] args) throws Exception {
        InputStream fis = new FileInputStream("bla.xlsx");
        Workbook wb = WorkbookFactory.create(fis);
        fis.close();
        EmbeddedExtractor ee = new EmbeddedExtractor();
        for (Sheet s : wb) {
            for (EmbeddedData ed : ee.extractAll(s)) {
                System.out.println(ed.getFilename() + " (" + ed.getContentType() + ") - " + ed.getEmbeddedData().length + " bytes");
            }
        }
        wb.close();
    }
}
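To actually write the extracted objects to disk, each EmbeddedData can simply be dumped to a file. A minimal sketch along the lines of the snippet above (the output directory name is just an illustration):
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import org.apache.poi.ss.extractor.EmbeddedData;
import org.apache.poi.ss.extractor.EmbeddedExtractor;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
public class BlaExtractToDisk {
    public static void main(String[] args) throws Exception {
        InputStream fis = new FileInputStream("bla.xlsx");
        Workbook wb = WorkbookFactory.create(fis);
        fis.close();
        File outDir = new File("embedded-out");
        outDir.mkdirs();
        EmbeddedExtractor ee = new EmbeddedExtractor();
        for (Sheet s : wb) {
            for (EmbeddedData ed : ee.extractAll(s)) {
                // write the raw embedded bytes under the extracted filename
                try (FileOutputStream fos = new FileOutputStream(new File(outDir, ed.getFilename()))) {
                    fos.write(ed.getEmbeddedData());
                }
            }
        }
        wb.close();
    }
}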

Related

Add custom metadata to tiff

I want to add some custom metadata to a multipage tiff for further processing steps, like
identifier1 = XYZ1
identifier2 = XYZ2
...
My idea was to update (see code/TODO below)
IIOMetadata streamMetadata [option 1]
IIOMetadata imageMetadata [option 2]
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.metadata.IIOMetadata;
import javax.imageio.stream.ImageInputStream;
import javax.imageio.stream.ImageOutputStream;
public class TiffMetadataExample {
public static void addMetadata(File tiff, File out, Object metadata2Add)
throws FileNotFoundException, IOException {
try (FileInputStream fis = new FileInputStream(tiff);
FileOutputStream fos = new FileOutputStream(out)) {
addMetadata(fis, fos, metadata2Add);
}
}
public static void addMetadata(InputStream inputImage, OutputStream out, Object metadata2Add)
throws IOException {
List<IIOMetadata> metadata = new ArrayList<>();
List<BufferedImage> images = getImages(inputImage, metadata);
if (metadata.size() != images.size()) {
throw new IllegalStateException();
}
// Obtain a TIFF writer
ImageWriter writer = ImageIO.getImageWritersByFormatName("TIFF").next();
try (ImageOutputStream output = ImageIO.createImageOutputStream(out)) {
writer.setOutput(output);
ImageWriteParam params = writer.getDefaultWriteParam();
params.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
// Compression: None, PackBits, ZLib, Deflate, LZW, JPEG and CCITT variants allowed
// (different plugins may use a different set of compression type names)
params.setCompressionType("Deflate");
// streamMetadata is null here
IIOMetadata streamMetadata = writer.getDefaultStreamMetadata(params);
// TODO: add custom metadata fields [option 1]
writer.prepareWriteSequence(streamMetadata);
for (int i = 0; i < images.size(); i++) {
BufferedImage image = images.get(i);
IIOMetadata imageMetadata = metadata.get(i);
// TODO: add custom metadata fields [option 2]
writer.writeToSequence(new IIOImage(image, null, imageMetadata), params);
}
writer.endWriteSequence();
} finally {
writer.dispose();
}
}
private static List<BufferedImage> getImages(final InputStream inputImage,
final List<IIOMetadata> metadata) throws IOException {
List<BufferedImage> images = new ArrayList<>();
ImageReader reader = null;
try (ImageInputStream is = ImageIO.createImageInputStream(inputImage)) {
Iterator<ImageReader> iterator = ImageIO.getImageReaders(is);
reader = iterator.next();
reader.setInput(is);
int numPages = reader.getNumImages(true);
for (int numPage = 0; numPage < numPages; numPage++) {
BufferedImage pageImage = reader.read(numPage);
IIOMetadata imageMetadata = reader.getImageMetadata(numPage);
metadata.add(imageMetadata);
images.add(pageImage);
}
return images;
} finally {
if (reader != null) {
reader.dispose();
}
}
}
}
Trying to update imageMetadata [option 2] with the following code does not work. What is wrong here?
IIOMetadataNode textEntry = new IIOMetadataNode("tEXtEntry");
textEntry.setAttribute("keyword", "aaaaaa");
textEntry.setAttribute("value", "bbb");
IIOMetadataNode text = new IIOMetadataNode("tEXt");
text.appendChild(textEntry);
// e.g. formatName = "javax_imageio_1.0"
Node root = imageMetadata.getAsTree(formatName);
root.appendChild(text);
imageMetadata.setFromTree(imageMetadata.getNativeMetadataFormatName(), root);
Or is there a nicer/other way to store some further processing information within the tiff?
This is my working solution.
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.attribute.UserDefinedFileAttributeView;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.TreeMap;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.metadata.IIOMetadata;
import javax.imageio.metadata.IIOMetadataNode;
import javax.imageio.stream.ImageInputStream;
import javax.imageio.stream.ImageOutputStream;
import org.apache.commons.imaging.common.RationalNumber;
import org.apache.commons.imaging.formats.tiff.constants.TiffTagConstants;
import org.apache.commons.imaging.formats.tiff.taginfos.TagInfo;
import org.apache.commons.imaging.formats.tiff.taginfos.TagInfoAscii;
import org.apache.commons.imaging.formats.tiff.taginfos.TagInfoBytes;
import org.apache.commons.imaging.formats.tiff.taginfos.TagInfoDouble;
import org.apache.commons.imaging.formats.tiff.taginfos.TagInfoFloat;
import org.apache.commons.imaging.formats.tiff.taginfos.TagInfoLong;
import org.apache.commons.imaging.formats.tiff.taginfos.TagInfoRational;
import org.apache.commons.imaging.formats.tiff.taginfos.TagInfoShort;
import com.twelvemonkeys.imageio.metadata.tiff.Rational;
public class TiffMetadataExample {
public static final int TIFF_TAG_XMP = 0x2BC;
public static final String TIFF_TAG_XMP_NAME = "XMP";
private static final String SUN_TIFF_FORMAT = "com_sun_media_imageio_plugins_tiff_image_1.0";
private static final String SUN_TIFF_STREAM_FORMAT =
"com_sun_media_imageio_plugins_tiff_stream_1.0";
private static final String TAG_SET_CLASS_NAME =
"com.sun.media.imageio.plugins.tiff.BaselineTIFFTagSet";
public static void setMetaData(File in, File out, Metadata metaData) throws IOException {
try (FileInputStream fis = new FileInputStream(in);
FileOutputStream fos = new FileOutputStream(out)) {
setMetaData(fis, fos, metaData);
}
UserDefinedFileAttributeView userDefView =
Files.getFileAttributeView(out.toPath(), UserDefinedFileAttributeView.class);
for (Entry<String, String> fileAttEntry : metaData.getfileAtt().entrySet()) {
userDefView.write(fileAttEntry.getKey(),
Charset.defaultCharset().encode(fileAttEntry.getValue()));
}
}
public static void setMetaData(InputStream inputImage, OutputStream out, Metadata metdaData2Add)
throws IOException {
List<IIOMetadata> metadataList = new ArrayList<>();
List<BufferedImage> images = getImages(inputImage, metadataList);
// Obtain a TIFF writer
ImageWriter writer = ImageIO.getImageWritersByFormatName("TIFF").next();
try (ImageOutputStream output = ImageIO.createImageOutputStream(out)) {
writer.setOutput(output);
ImageWriteParam params = writer.getDefaultWriteParam();
params.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
// Compression: None, PackBits, ZLib, Deflate, LZW, JPEG and CCITT variants allowed
// (different plugins may use a different set of compression type names)
params.setCompressionType("Deflate");
IIOMetadata streamMetadata = writer.getDefaultStreamMetadata(params);
writer.prepareWriteSequence(streamMetadata);
for (int i = 0; i < images.size(); i++) {
BufferedImage image = images.get(i);
IIOMetadata imageMetadata = metadataList.get(i);
updateMetadata(imageMetadata, metdaData2Add.get());
writer.writeToSequence(new IIOImage(image, null, imageMetadata), params);
}
writer.endWriteSequence();
} finally {
writer.dispose();
}
}
private static void updateMetadata(IIOMetadata metadata, List<IIOMetadataNode> metdaData2AddList)
throws IOException {
if (SUN_TIFF_FORMAT.equals(metadata.getNativeMetadataFormatName())
|| SUN_TIFF_STREAM_FORMAT.equals(metadata.getNativeMetadataFormatName())) {
// wanted format
} else {
throw new IllegalArgumentException(
"Could not write tiff metadata, wrong format: " + metadata.getNativeMetadataFormatName());
}
IIOMetadataNode root = new IIOMetadataNode(metadata.getNativeMetadataFormatName());
IIOMetadataNode ifd;
if (root.getElementsByTagName("TIFFIFD").getLength() == 0) {
ifd = new IIOMetadataNode("TIFFIFD");
ifd.setAttribute("tagSets", TAG_SET_CLASS_NAME);
root.appendChild(ifd);
} else {
ifd = (IIOMetadataNode) root.getElementsByTagName("TIFFIFD").item(0);
}
for (IIOMetadataNode metdaData2Add : metdaData2AddList) {
ifd.appendChild(metdaData2Add);
}
metadata.mergeTree(metadata.getNativeMetadataFormatName(), root);
}
private static List<BufferedImage> getImages(final InputStream inputImage,
final List<IIOMetadata> metadata) throws IOException {
List<BufferedImage> images = new ArrayList<>();
ImageReader reader = null;
try (ImageInputStream is = ImageIO.createImageInputStream(inputImage)) {
Iterator<ImageReader> iterator = ImageIO.getImageReaders(is);
reader = iterator.next();
reader.setInput(is);
int numPages = reader.getNumImages(true);
for (int numPage = 0; numPage < numPages; numPage++) {
BufferedImage pageImage = reader.read(numPage);
IIOMetadata meta = reader.getImageMetadata(numPage);
metadata.add(meta);
images.add(pageImage);
}
return images;
} finally {
if (reader != null) {
reader.dispose();
}
}
}
public static class Metadata {
private final List<IIOMetadataNode> addList = new ArrayList<>();
private final Map<String, String> fileAtt = new TreeMap<>();
public Metadata() {}
private List<IIOMetadataNode> get() {
return addList;
}
private Map<String, String> getfileAtt() {
return fileAtt;
}
public void add(int exifTag, String exifTagName, Object val) {
IIOMetadataNode md;
if (val instanceof byte[]) {
md = createBytesField(exifTag, exifTagName, (byte[]) val);
} else if (val instanceof String) {
md = createAsciiField(exifTag, exifTagName, (String) val);
fileAtt.put(exifTagName, String.valueOf(val));
} else if (val instanceof Short) {
md = createShortField(exifTag, exifTagName, ((Short) val).intValue());
fileAtt.put(exifTagName, String.valueOf(val));
} else if (val instanceof Integer) {
md = createShortField(exifTag, exifTagName, ((Integer) val).intValue());
fileAtt.put(exifTagName, String.valueOf(val));
} else if (val instanceof Long) {
md = createLongField(exifTag, exifTagName, ((Long) val).longValue());
fileAtt.put(exifTagName, String.valueOf(val));
} else if (val instanceof Float) {
md = createFloatField(exifTag, exifTagName, ((Float) val).floatValue());
fileAtt.put(exifTagName, String.valueOf(val));
} else if (val instanceof Double) {
md = createDoubleField(exifTag, exifTagName, ((Double) val).doubleValue());
fileAtt.put(exifTagName, String.valueOf(val));
} else if (val instanceof Rational) {
md = createRationalField(exifTag, exifTagName, ((Rational) val));
fileAtt.put(exifTagName, String.valueOf(val));
} else if (val instanceof RationalNumber) {
md = createRationalField(exifTag, exifTagName, ((RationalNumber) val));
fileAtt.put(exifTagName, String.valueOf(val));
} else {
throw new IllegalArgumentException("unsupported value class: " + val.getClass().getName());
}
addList.add(md);
}
/**
*
* @param tagInfo {@link TiffTagConstants} like {@link TiffTagConstants#TIFF_TAG_XMP}
* @param val String, byte[], ...
*/
public void add(TagInfo tagInfo, Object val) {
if (tagInfo instanceof TagInfoBytes) {
if (!(val instanceof byte[])) {
throw new IllegalArgumentException("expecting byte[] value");
}
} else if (tagInfo instanceof TagInfoAscii) {
if (!(val instanceof String)) {
throw new IllegalArgumentException("expecting String value");
}
} else if (tagInfo instanceof TagInfoShort) {
if (val instanceof Short || val instanceof Integer) {
// ok
} else {
throw new IllegalArgumentException("expecting Short/Integer value");
}
} else if (tagInfo instanceof TagInfoLong) {
if (!(val instanceof Long)) {
throw new IllegalArgumentException("expecting Long value");
}
} else if (tagInfo instanceof TagInfoDouble) {
if (!(val instanceof Double)) {
throw new IllegalArgumentException("expecting double value");
}
} else if (tagInfo instanceof TagInfoFloat) {
if (!(val instanceof Float)) {
throw new IllegalArgumentException("expecting float value");
}
} else if (tagInfo instanceof TagInfoRational) {
if (val instanceof RationalNumber || val instanceof Rational) {
// ok
} else {
throw new IllegalArgumentException("expecting rational value");
}
}
add(tagInfo.tag, tagInfo.name, val);
}
private static IIOMetadataNode createBytesField(int number, String name, byte[] bytes) {
IIOMetadataNode field = new IIOMetadataNode("TIFFField");
field.setAttribute("number", Integer.toString(number));
field.setAttribute("name", name);
IIOMetadataNode arrayNode = new IIOMetadataNode("TIFFBytes");
field.appendChild(arrayNode);
for (byte b : bytes) {
IIOMetadataNode valueNode = new IIOMetadataNode("TIFFByte");
valueNode.setAttribute("value", Integer.toString(b));
arrayNode.appendChild(valueNode);
}
return field;
}
private static IIOMetadataNode createShortField(int number, String name, int val) {
IIOMetadataNode field, arrayNode, valueNode;
field = new IIOMetadataNode("TIFFField");
field.setAttribute("number", Integer.toString(number));
field.setAttribute("name", name);
arrayNode = new IIOMetadataNode("TIFFShorts");
field.appendChild(arrayNode);
valueNode = new IIOMetadataNode("TIFFShort");
arrayNode.appendChild(valueNode);
valueNode.setAttribute("value", Integer.toString(val));
return field;
}
private static IIOMetadataNode createAsciiField(int number, String name, String val) {
IIOMetadataNode field, arrayNode, valueNode;
field = new IIOMetadataNode("TIFFField");
field.setAttribute("number", Integer.toString(number));
field.setAttribute("name", name);
arrayNode = new IIOMetadataNode("TIFFAsciis");
field.appendChild(arrayNode);
valueNode = new IIOMetadataNode("TIFFAscii");
arrayNode.appendChild(valueNode);
valueNode.setAttribute("value", val);
return field;
}
private static IIOMetadataNode createLongField(int number, String name, long val) {
IIOMetadataNode field, arrayNode, valueNode;
field = new IIOMetadataNode("TIFFField");
field.setAttribute("number", Integer.toString(number));
field.setAttribute("name", name);
arrayNode = new IIOMetadataNode("TIFFLongs");
field.appendChild(arrayNode);
valueNode = new IIOMetadataNode("TIFFLong");
arrayNode.appendChild(valueNode);
valueNode.setAttribute("value", Long.toString(val));
return field;
}
private static IIOMetadataNode createFloatField(int number, String name, float val) {
IIOMetadataNode field, arrayNode, valueNode;
field = new IIOMetadataNode("TIFFField");
field.setAttribute("number", Integer.toString(number));
field.setAttribute("name", name);
arrayNode = new IIOMetadataNode("TIFFFloats");
field.appendChild(arrayNode);
valueNode = new IIOMetadataNode("TIFFFloat");
arrayNode.appendChild(valueNode);
valueNode.setAttribute("value", Float.toString(val));
return field;
}
private static IIOMetadataNode createDoubleField(int number, String name, double val) {
IIOMetadataNode field, arrayNode, valueNode;
field = new IIOMetadataNode("TIFFField");
field.setAttribute("number", Integer.toString(number));
field.setAttribute("name", name);
arrayNode = new IIOMetadataNode("TIFFDoubles");
field.appendChild(arrayNode);
valueNode = new IIOMetadataNode("TIFFDouble");
arrayNode.appendChild(valueNode);
valueNode.setAttribute("value", Double.toString(val));
return field;
}
private static IIOMetadataNode createRationalField(int number, String name, Rational rational) {
return createRationalField(number, name, rational.numerator(), rational.denominator());
}
private static IIOMetadataNode createRationalField(int number, String name,
RationalNumber rational) {
return createRationalField(number, name, rational.numerator, rational.divisor);
}
private static IIOMetadataNode createRationalField(int number, String name, long numerator,
long denominator) {
IIOMetadataNode field, arrayNode, valueNode;
field = new IIOMetadataNode("TIFFField");
field.setAttribute("number", Integer.toString(number));
field.setAttribute("name", name);
arrayNode = new IIOMetadataNode("TIFFRationals");
field.appendChild(arrayNode);
valueNode = new IIOMetadataNode("TIFFRational");
arrayNode.appendChild(valueNode);
valueNode.setAttribute("value", numerator + "/" + denominator);
return field;
}
}
}
Usage
byte[] bytes = create();
TiffMetadataExample.Metadata metaData = new TiffMetadataExample.Metadata();
metaData.add(TiffTagConstants.TIFF_TAG_SOFTWARE, "FUBAR");
// metaData.add(TiffMetadataExample.TIFF_TAG_XMP, TiffMetadataExample.TIFF_TAG_XMP_NAME, bytes);
metaData.add(TiffTagConstants.TIFF_TAG_XMP, bytes);
TiffMetadataExample.setMetaData(tiffIn, tiffOut, metaData);

Java - Clipboard modification: Works in Eclipse, but not as JAR

I created a program based on two sources (stated in the code below) which detects HTML copied to the clipboard and changes local images into base64 ones.
This code works perfectly when I run it in Eclipse, but not as a JAR.
Initially, I was not using the method getHtmlDataFlavor, but I added it when I tried the software as a JAR. Then, I had to ensure in HtmlSelection.getTransferData to have if (flavor.getRepresentationClass() == java.io.Reader.class), otherwise it would crash. But using the JAR, I'm only getting the plain text version! It still works when run in Eclipse, though.
Does someone have an idea?
I am running on Windows 10.
Executing on the command line using: java -jar ClipboardImageToBase64-1.0.0-jar-with-dependencies.jar
GitHub project:
https://github.com/djon2003/ClipboardImageToBase64
/**
* Library from HTML parsing : https://jsoup.org
*
* Code based on :
* - http://stackoverflow.com/a/14226456/214898
* - http://elliotth.blogspot.ca/2005/01/copying-html-to-clipboard-from-java.html
*/
import java.awt.Toolkit;
import java.awt.datatransfer.Clipboard;
import java.awt.datatransfer.ClipboardOwner;
import java.awt.datatransfer.DataFlavor;
import java.awt.datatransfer.Transferable;
import java.awt.datatransfer.UnsupportedFlavorException;
import java.io.File;
import java.io.IOException;
import java.io.StringReader;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.apache.commons.codec.binary.Base64;
import org.apache.commons.io.FileUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
public class ClipBoardListener extends Thread implements ClipboardOwner {
Clipboard sysClip = Toolkit.getDefaultToolkit().getSystemClipboard();
private static DataFlavor HTML_FLAVOR = new DataFlavor("text/html;class=java.io.Reader", "HTML");
private int nbImagesConverted = 0;
private Transferable currentTransferable;
@Override
public void run() {
Transferable trans = sysClip.getContents(this);
TakeOwnership(trans);
}
@Override
public void lostOwnership(Clipboard c, Transferable t) {
System.out.println("Copy to clipboard detected");
try {
ClipBoardListener.sleep(250); // waiting e.g for loading huge
// elements like word's etc.
} catch (Exception e) {
System.out.println("Exception: " + e);
}
Transferable contents = sysClip.getContents(this);
try {
process_clipboard(contents, c);
} catch (Exception ex) {
Logger.getLogger(ClipBoardListener.class.getName()).log(Level.SEVERE, null, ex);
}
TakeOwnership(currentTransferable);
}
void TakeOwnership(Transferable t) {
sysClip.setContents(t, this);
}
private void getHtmlDataFlavor(Transferable t) {
DataFlavor df = null;
for (DataFlavor tDf : t.getTransferDataFlavors()) {
if (tDf.getMimeType().contains("text/html")) {
if (tDf.getRepresentationClass() == java.io.Reader.class) {
df = tDf;
break;
}
}
}
HTML_FLAVOR = df;
}
public void process_clipboard(Transferable t, Clipboard c) {
String tempText = "";
Transferable trans = t;
currentTransferable = t;
getHtmlDataFlavor(t);
if (HTML_FLAVOR == null) {
System.out.println("No HTML flavor detected");
return;
}
nbImagesConverted = 0;
try {
if (trans != null ? trans.isDataFlavorSupported(HTML_FLAVOR) : false) {
if (trans.isDataFlavorSupported(DataFlavor.stringFlavor)) {
tempText = (String) trans.getTransferData(DataFlavor.stringFlavor);
}
java.io.Reader r = (java.io.Reader) trans.getTransferData(HTML_FLAVOR);
StringBuilder content = getReaderContent(r);
String newHtml = changeImages(content);
currentTransferable = new HtmlSelection(newHtml, tempText);
System.out.println("Converted " + nbImagesConverted + " images");
} else {
System.out.println("Not converted:" + trans.isDataFlavorSupported(HTML_FLAVOR));
System.out.println(trans.getTransferData(HTML_FLAVOR));
/*
for (DataFlavor tt : trans.getTransferDataFlavors()) {
if (tt.getMimeType().contains("text/html")) {
System.out.println("-------");
System.out.println(tt.toString());
}
}
*/
}
} catch (Exception e) {
currentTransferable = t;
System.out.println("Conversion error");
e.printStackTrace();
}
}
private String changeImages(StringBuilder content) throws RuntimeException, IOException {
Document doc = Jsoup.parse(content.toString());
Elements imgs = doc.select("img");
for (Element img : imgs) {
String filePath = img.attr("src");
filePath = filePath.replace("file:///", "");
filePath = filePath.replace("file://", "");
File file = new File(filePath);
if (file.exists()) {
String encoded = Base64.encodeBase64String(FileUtils.readFileToByteArray(file));
String extension = file.getName();
extension = extension.substring(extension.lastIndexOf(".") + 1);
String dataURL = "data:image/" + extension + ";base64," + encoded;
img.attr("src", dataURL); // or whatever
nbImagesConverted++;
}
}
String html = doc.outerHtml();
html = html.replaceAll("(?s)<!--.*?-->", ""); //Remove html comments
return html; // returns the modified HTML
}
private StringBuilder getReaderContent(java.io.Reader r) throws IOException {
char[] arr = new char[8 * 1024];
StringBuilder buffer = new StringBuilder();
int numCharsRead;
while ((numCharsRead = r.read(arr, 0, arr.length)) != -1) {
buffer.append(arr, 0, numCharsRead);
}
r.close();
return buffer;
}
private static class HtmlSelection implements Transferable {
private String html;
private String plainText;
public HtmlSelection(String html, String plainText) {
this.html = html;
this.plainText = plainText;
}
public DataFlavor[] getTransferDataFlavors() {
DataFlavor[] dfs = {HTML_FLAVOR, DataFlavor.stringFlavor};
return dfs;
}
public boolean isDataFlavorSupported(DataFlavor flavor) {
return flavor.getMimeType().contains("text/html") || flavor.getMimeType().contains("text/plain");
}
public Object getTransferData(DataFlavor flavor) throws UnsupportedFlavorException {
if (flavor.getMimeType().contains("text/html")) {
if (flavor.getRepresentationClass() == java.io.Reader.class) {
return new StringReader(html);
} else {
return html;
}
} else {
return plainText;
}
//throw new UnsupportedFlavorException(flavor);
}
}
}
I finally fixed all my problems. I created a GitHub project for those who are interested.
https://github.com/djon2003/ClipboardImageToBase64

Print MD5 checksum values using HashMap and strings

I have the following code for comparing the MD5 hash values of two folders, but I need to show the list of files and the hash value of each file. Can anyone please help me out with this? I just need to get the hash values for one folder only.
package com.example;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Set;
public class Compare
{
//This can be any folder locations which you want to compare
File dir1 = new File("/Users/Samip/Desktop/crypto");
File dir2 = new File("/Users/Samip/Desktop/crypto1");
public static void main(String ...args)
{
Compare compare = new Compare();
try
{
compare.getDiff(compare.dir1,compare.dir2);
}
catch(IOException ie)
{
ie.printStackTrace();
}
}
public void getDiff(File dirA, File dirB) throws IOException
{
File[] fileList1 = dirA.listFiles();
File[] fileList2 = dirB.listFiles();
Arrays.sort(fileList1);
Arrays.sort(fileList2);
HashMap<String, File> map1;
if(fileList1.length < fileList2.length)
{
map1 = new HashMap<String, File>();
for(int i=0;i<fileList1.length;i++)
{
map1.put(fileList1[i].getName(),fileList1[i]);
}
compareNow(fileList2, map1);
}
else
{
map1 = new HashMap<String, File>();
for(int i=0;i<fileList2.length;i++)
{
map1.put(fileList2[i].getName(),fileList2[i]);
}
compareNow(fileList1, map1);
}
}
public void compareNow(File[] fileArr, HashMap<String, File> map) throws IOException
{
for(int i=0;i<fileArr.length;i++)
{
String fName = fileArr[i].getName();
File fComp = map.get(fName);
map.remove(fName);
if(fComp!=null)
{
if(fComp.isDirectory())
{
getDiff(fileArr[i], fComp);
}
else
{
String cSum1 = checksum(fileArr[i]);
String cSum2 = checksum(fComp);
if(!cSum1.equals(cSum2))
{
System.out.println(fileArr[i].getName()+"\t\t"+ "different");
}
else
{
System.out.println(fileArr[i].getName()+"\t\t"+"identical");
}
}
}
else
{
if(fileArr[i].isDirectory())
{
traverseDirectory(fileArr[i]);
}
else
{
System.out.println(fileArr[i].getName()+"\t\t"+"only in "+fileArr[i].getParent());
}
}
}
Set<String> set = map.keySet();
Iterator<String> it = set.iterator();
while(it.hasNext())
{
String n = it.next();
File fileFrmMap = map.get(n);
map.remove(n);
if(fileFrmMap.isDirectory())
{
traverseDirectory(fileFrmMap);
}
else
{
System.out.println(fileFrmMap.getName() +"\t\t"+"only in "+ fileFrmMap.getParent());
}
}
}
public void traverseDirectory(File dir)
{
File[] list = dir.listFiles();
for(int k=0;k<list.length;k++)
{
if(list[k].isDirectory())
{
traverseDirectory(list[k]);
}
else
{
System.out.println(list[k].getName() +"\t\t"+"only in "+ list[k].getParent());
}
}
}
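// Computes the MD5 digest of a file and returns it as "0x" plus upper-case hex; returns null if the file cannot be read.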
public String checksum(File file)
{
try
{
InputStream fin = new FileInputStream(file);
java.security.MessageDigest md5er = MessageDigest.getInstance("MD5");
byte[] buffer = new byte[1024];
int read;
do
{
read = fin.read(buffer);
if (read > 0)
md5er.update(buffer, 0, read);
} while (read != -1);
fin.close();
byte[] digest = md5er.digest();
if (digest == null)
return null;
String strDigest = "0x";
for (int i = 0; i < digest.length; i++)
{
strDigest += Integer.toString((digest[i] & 0xff) + 0x100, 16).substring(1).toUpperCase();
}
return strDigest;
}
catch (Exception e)
{
return null;
}
}
}
In your main method, instead of using Compare.getDiff(dir1, dir2), you want to:
Get a file listing of your target directory
Invoke Compare.checksum(file) on each file and print the result
It looks like you already have all the code; you just need to reshape it a little, as in the sketch below.
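A minimal sketch of that reshaping, replacing the existing main method (it reuses the dir1 field and the checksum(File) method already present in Compare, and assumes dir1 exists and contains only regular files):
public static void main(String... args) {
    Compare compare = new Compare();
    // List the target directory and print one checksum per file
    for (File f : compare.dir1.listFiles()) {
        if (f.isFile()) { // skip subdirectories
            System.out.println(f.getName() + "\t" + compare.checksum(f));
        }
    }
}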
Consider this example. The hash-generating code has been taken from your previous question, and the same goes for the file-iteration code. Just replace the folder path to match yours.
import java.io.*;
import java.security.MessageDigest;
public class PrintChecksums {
public static void main(String[] args) {
String sourceDir = "/Users/Jan/Desktop/Folder1";
try {
new PrintChecksums().printHashs(new File(sourceDir));
} catch (Exception e) {
e.printStackTrace();
}
}
private void printHashs(File sourceDir) throws Exception {
for (File f : sourceDir.listFiles()) {
String hash = createHash(f); // That you almost have
System.out.println(f.getAbsolutePath() + " / Hashvalue: " + hash);
}
}
public String createHash(File datafile) throws Exception {
// SNIP - YOUR CODE BEGINS
MessageDigest md = MessageDigest.getInstance("SHA1");
FileInputStream fis = new FileInputStream(datafile);
byte[] dataBytes = new byte[1024];
int nread = 0;
while ((nread = fis.read(dataBytes)) != -1) {
md.update(dataBytes, 0, nread);
}
byte[] mdbytes = md.digest();
// convert the byte to hex format
StringBuffer sb = new StringBuffer("");
for (int i = 0; i < mdbytes.length; i++) {
sb.append(Integer.toString((mdbytes[i] & 0xff) + 0x100, 16).substring(1));
}
// SNAP - YOUR CODE ENDS
return sb.toString();
}
}
Please have a look at the code below. I have added a function printCheckSum() which iterates through the directory, scans each file, and prints its hash value.
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Set;
public class Compare
{
//This can be any folder locations which you want to compare
File dir1 = new File("D:\\dir1");
File dir2 = new File("D:\\dir2");
public static void main(String ...args)
{
Compare compare = new Compare();
try
{
compare.printCheckSum(compare.dir1);
}
catch(IOException ie)
{
ie.printStackTrace();
}
}
public void getDiff(File dirA, File dirB) throws IOException
{
File[] fileList1 = dirA.listFiles();
File[] fileList2 = dirB.listFiles();
Arrays.sort(fileList1);
Arrays.sort(fileList2);
HashMap<String, File> map1;
if(fileList1.length < fileList2.length)
{
map1 = new HashMap<String, File>();
for(int i=0;i<fileList1.length;i++)
{
map1.put(fileList1[i].getName(),fileList1[i]);
}
compareNow(fileList2, map1);
}
else
{
map1 = new HashMap<String, File>();
for(int i=0;i<fileList2.length;i++)
{
map1.put(fileList2[i].getName(),fileList2[i]);
}
compareNow(fileList1, map1);
}
}
public void compareNow(File[] fileArr, HashMap<String, File> map) throws IOException
{
for(int i=0;i<fileArr.length;i++)
{
String fName = fileArr[i].getName();
File fComp = map.get(fName);
map.remove(fName);
if(fComp!=null)
{
if(fComp.isDirectory())
{
getDiff(fileArr[i], fComp);
}
else
{
String cSum1 = checksum(fileArr[i]);
String cSum2 = checksum(fComp);
if(!cSum1.equals(cSum2))
{
System.out.println(fileArr[i].getName()+"\t\t"+ "different");
}
else
{
System.out.println(fileArr[i].getName()+"\t\t"+"identical");
}
}
}
else
{
if(fileArr[i].isDirectory())
{
traverseDirectory(fileArr[i]);
}
else
{
System.out.println(fileArr[i].getName()+"\t\t"+"only in "+fileArr[i].getParent());
}
}
}
Set<String> set = map.keySet();
Iterator<String> it = set.iterator();
while(it.hasNext())
{
String n = it.next();
File fileFrmMap = map.get(n);
map.remove(n);
if(fileFrmMap.isDirectory())
{
traverseDirectory(fileFrmMap);
}
else
{
System.out.println(fileFrmMap.getName() +"\t\t"+"only in "+ fileFrmMap.getParent());
}
}
}
public void traverseDirectory(File dir)
{
File[] list = dir.listFiles();
for(int k=0;k<list.length;k++)
{
if(list[k].isDirectory())
{
traverseDirectory(list[k]);
}
else
{
System.out.println(list[k].getName() +"\t\t"+"only in "+ list[k].getParent());
}
}
}
public String checksum(File file)
{
try
{
InputStream fin = new FileInputStream(file);
java.security.MessageDigest md5er = MessageDigest.getInstance("MD5");
byte[] buffer = new byte[1024];
int read;
do
{
read = fin.read(buffer);
if (read > 0)
md5er.update(buffer, 0, read);
} while (read != -1);
fin.close();
byte[] digest = md5er.digest();
if (digest == null)
return null;
String strDigest = "0x";
for (int i = 0; i < digest.length; i++)
{
strDigest += Integer.toString((digest[i] & 0xff) + 0x100, 16).substring(1).toUpperCase();
}
return strDigest;
}
catch (Exception e)
{
return null;
}
}
public void printCheckSum(File dir) throws IOException{
File[] fileList = dir.listFiles();
for(File file : fileList){
if(file.isDirectory()){
printCheckSum(file);
}else
System.out.println(file.getName() +"\t :: \t" + checksum(file));
}
}
}
Hope this helps. Cheers!
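As a side note, the manual read loop in checksum() can also be written with java.security.DigestInputStream, which updates the digest as a side effect of reading. A standalone sketch, assuming Java 7+ for try-with-resources (the class and method names here are illustrative, not from the original code):
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.DigestInputStream;
import java.security.MessageDigest;

public class Md5Util {
    // Streams the file through a DigestInputStream; reading the stream drives the digest.
    public static String md5Hex(File file) throws Exception {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = new DigestInputStream(new FileInputStream(file), md)) {
            byte[] buf = new byte[8192];
            while (in.read(buf) != -1) {
                // nothing to do here; the stream updates the digest
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b)); // two hex digits per byte
        }
        return sb.toString();
    }
}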

How to convert a DICOM file to JPG

How can we convert a DICOM file (.dcm) to a JPEG image using Java?
Here is my code:
import java.io.File;
import java.io.IOException;
import org.dcm4che2.tool.dcm2jpg.Dcm2Jpg;
public class MainClass {
public static void main(String[] args) throws IOException{
Dcm2Jpg conv = new Dcm2Jpg();
conv.convert(new File("C:\\Users\\lijo.joseph\\Desktop\\Dicom\\IM-0001-0001.dcm"), new File("C:\\Users\\lijo.joseph\\Desktop\\Dicom\\IM-0001-0001.jpg"));
}
}
and I am getting the following error while running the project:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/cli/ParseException
at MainClass.main(MainClass.java:7)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.cli.ParseException
at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 1 more
Please help. Thanks in advance.
Here is the link: Converting DICOM to JPEG using dcm4che 2.
Following is my code, which works perfectly. I have included it with its imports, so it might be useful.
import java.awt.image.BufferedImage;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;
import org.dcm4che2.imageio.plugins.dcm.DicomImageReadParam;
import com.sun.image.codec.jpeg.JPEGCodec;
import com.sun.image.codec.jpeg.JPEGImageEncoder;
public class Example1 {
static BufferedImage myJpegImage=null;
public static void main(String[] args) {
File file = new File("test5/12840.dcm");
Iterator<ImageReader> iterator = ImageIO.getImageReadersByFormatName("DICOM");
while (iterator.hasNext()) {
ImageReader imageReader = iterator.next();
DicomImageReadParam dicomImageReadParam = (DicomImageReadParam) imageReader.getDefaultReadParam();
try {
ImageInputStream iis = ImageIO.createImageInputStream(file);
imageReader.setInput(iis,false);
myJpegImage = imageReader.read(0, dicomImageReadParam);
iis.close();
if(myJpegImage == null){
System.out.println("Could not read image!!");
}
} catch (IOException e) {
e.printStackTrace();
}
File file2 = new File("/test.jpg");
try {
OutputStream outputStream = new BufferedOutputStream(new FileOutputStream(file2));
JPEGImageEncoder encoder = JPEGCodec.createJPEGEncoder(outputStream);
encoder.encode(myJpegImage);
outputStream.close();
} catch (IOException e) {
e.printStackTrace();
}
System.out.println("Completed");
}
}
}
Jars used to run it:
dcm4che-imageio-2.0.28.jar
dcm4che-image-2.0.28.jar
jai_imageio-1.1.jar
dcm4che-core-2.0.28.jar
slf4j-api-1.7.7.jar
slf4j-log4j12-1.7.7.jar
apache-logging-log4j.jar
Hope it helps.
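One caveat: com.sun.image.codec.jpeg is an internal Sun API and is not available on newer JDKs. If JPEGCodec cannot be resolved, the encoding step in the snippet above can be replaced with the standard ImageIO API (a sketch, not part of the original answer):
// Write the BufferedImage as a JPEG via javax.imageio instead of JPEGCodec
ImageIO.write(myJpegImage, "jpg", new File("test.jpg"));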
This code converts a DICOM image to a JPG image:
import java.io.File;
import java.io.IOException;
public class Dcm2JpgTest {
public static void main(String[] args) throws IOException {
try{
File src = new File("d:\\Test.dcm");
File dest = new File("d:\\Test.jpg");
Dcm2Jpeg dcm2jpg= new Dcm2Jpeg();
dcm2jpg.convert(src, dest);
System.out.println("Completed");
} catch(IOException e){
e.printStackTrace();
} catch(Exception e){
e.printStackTrace();
}
}
}
Dcm2Jpeg.java File
import java.awt.image.BufferedImage;
import java.io.BufferedOutputStream;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.util.Iterator;
import java.util.List;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;
import org.apache.commons.cli.CommandLine;
import org.apache.commons.cli.GnuParser;
import org.apache.commons.cli.HelpFormatter;
import org.apache.commons.cli.OptionBuilder;
import org.apache.commons.cli.Options;
import org.apache.commons.cli.ParseException;
import org.dcm4che2.data.DicomObject;
import org.dcm4che2.imageio.plugins.dcm.DicomImageReadParam;
import org.dcm4che2.io.DicomInputStream;
import org.dcm4che2.util.CloseUtils;
import com.sun.image.codec.jpeg.JPEGCodec;
import com.sun.image.codec.jpeg.JPEGImageEncoder;
public class Dcm2Jpeg {
private static final String USAGE =
"dcm2jpg [Options] <dcmfile> <jpegfile>\n" +
"or dcm2jpg [Options] <dcmfile>... <outdir>\n" +
"or dcm2jpg [Options] <indir>... <outdir>";
private static final String DESCRIPTION =
"Convert DICOM image(s) to JPEG(s)\nOptions:";
private static final String EXAMPLE = null;
private int frame = 1;
private float center;
private float width;
private String vlutFct;
private boolean autoWindowing;
private DicomObject prState;
private short[] pval2gray;
private String fileExt = ".jpg";
private void setFrameNumber(int frame) {
this.frame = frame;
}
private void setWindowCenter(float center) {
this.center = center;
}
private void setWindowWidth(float width) {
this.width = width;
}
public final void setVoiLutFunction(String vlutFct) {
this.vlutFct = vlutFct;
}
private final void setAutoWindowing(boolean autoWindowing) {
this.autoWindowing = autoWindowing;
}
private final void setPresentationState(DicomObject prState) {
this.prState = prState;
}
private final void setPValue2Gray(short[] pval2gray) {
this.pval2gray = pval2gray;
}
public final void setFileExt(String fileExt) {
this.fileExt = fileExt;
}
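// Reads one frame from the DICOM source, applying the configured windowing/LUT parameters, and writes it out as a JPEG.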
public void convert(File src, File dest) throws IOException {
Iterator<ImageReader> iter = ImageIO.getImageReadersByFormatName("DICOM");
ImageReader reader = iter.next();
DicomImageReadParam param =
(DicomImageReadParam) reader.getDefaultReadParam();
param.setWindowCenter(center);
param.setWindowWidth(width);
param.setVoiLutFunction(vlutFct);
param.setPresentationState(prState);
param.setPValue2Gray(pval2gray);
param.setAutoWindowing(autoWindowing);
ImageInputStream iis = ImageIO.createImageInputStream(src);
BufferedImage bi;
OutputStream out = null;
try {
reader.setInput(iis, false);
bi = reader.read(frame - 1, param);
if (bi == null) {
System.out.println("\nError: " + src + " - couldn't read!");
return;
}
out = new BufferedOutputStream(new FileOutputStream(dest));
JPEGImageEncoder enc = JPEGCodec.createJPEGEncoder(out);
enc.encode(bi);
} finally {
CloseUtils.safeClose(iis);
CloseUtils.safeClose(out);
}
//System.out.print('.');
}
public int mconvert(List<String> args, int optind, File destDir)
throws IOException {
int count = 0;
for (int i = optind, n = args.size() - 1; i < n; ++i) {
File src = new File(args.get(i));
count += mconvert(src, new File(destDir, src2dest(src)));
}
return count;
}
private String src2dest(File src) {
String srcname = src.getName();
return src.isFile() ? srcname + this.fileExt : srcname;
}
public int mconvert(File src, File dest) throws IOException {
if (!src.exists()) {
System.err.println("WARNING: No such file or directory: " + src
+ " - skipped.");
return 0;
}
if (src.isFile()) {
try {
convert(src, dest);
} catch (Exception e) {
System.err.println("WARNING: Failed to convert " + src + ":");
e.printStackTrace(System.err);
System.out.print('F');
return 0;
}
System.out.print('.');
return 1;
}
File[] files = src.listFiles();
if (files.length > 0 && !dest.exists()) {
dest.mkdirs();
}
int count = 0;
for (int i = 0; i < files.length; ++i) {
count += mconvert(files[i], new File(dest, src2dest(files[i])));
}
return count;
}
@SuppressWarnings("unchecked")
public static void main(String args[]) throws Exception {
CommandLine cl = parse(args);
Dcm2Jpeg dcm2jpg = new Dcm2Jpeg();
if (cl.hasOption("f")) {
dcm2jpg.setFrameNumber(
parseInt(cl.getOptionValue("f"),
"illegal argument of option -f",
1, Integer.MAX_VALUE));
}
if (cl.hasOption("p")) {
dcm2jpg.setPresentationState(loadDicomObject(
new File(cl.getOptionValue("p"))));
}
if (cl.hasOption("pv2gray")) {
dcm2jpg.setPValue2Gray(loadPVal2Gray(
new File(cl.getOptionValue("pv2gray"))));
}
if (cl.hasOption("c")) {
dcm2jpg.setWindowCenter(
parseFloat(cl.getOptionValue("c"),
"illegal argument of option -c"));
}
if (cl.hasOption("w")) {
dcm2jpg.setWindowWidth(
parseFloat(cl.getOptionValue("w"),
"illegal argument of option -w"));
}
if (cl.hasOption("sigmoid")) {
dcm2jpg.setVoiLutFunction(DicomImageReadParam.SIGMOID);
}
dcm2jpg.setAutoWindowing(!cl.hasOption("noauto"));
if (cl.hasOption("jpgext")) {
dcm2jpg.setFileExt(cl.getOptionValue("jpgext"));
}
final List<String> argList = cl.getArgList();
int argc = argList.size();
File dest = new File(argList.get(argc-1));
long t1 = System.currentTimeMillis();
int count = 1;
if (dest.isDirectory()) {
count = dcm2jpg.mconvert(argList, 0, dest);
} else {
File src = new File(argList.get(0));
if (argc > 2 || src.isDirectory()) {
exit("dcm2jpg: when converting several files, "
+ "last argument must be a directory\n");
}
dcm2jpg.convert(src, dest);
}
long t2 = System.currentTimeMillis();
System.out.println("\nconverted " + count + " files in " + (t2 - t1)
/ 1000f + " s.");
}
private static DicomObject loadDicomObject(File file) {
DicomInputStream in = null;
try {
in = new DicomInputStream(file);
return in.readDicomObject();
} catch (IOException e) {
exit(e.getMessage());
throw new RuntimeException();
} finally {
CloseUtils.safeClose(in);
}
}
private static short[] loadPVal2Gray(File file) {
BufferedReader r = null;
try {
r = new BufferedReader(new InputStreamReader(new FileInputStream(
file)));
short[] pval2gray = new short[256];
int n = 0;
String line;
while ((line = r.readLine()) != null) {
try {
int val = Integer.parseInt(line.trim());
if (n == pval2gray.length) {
if (n == 0x10000) {
exit("Number of entries in " + file + " > 2^16");
}
short[] tmp = pval2gray;
pval2gray = new short[n << 1];
System.arraycopy(tmp, 0, pval2gray, 0, n);
}
pval2gray[n++] = (short) val;
} catch (NumberFormatException nfe) {
// ignore lines where Integer.parseInt fails
}
}
if (n != pval2gray.length) {
exit("Number of entries in " + file + ": " + n
+ " != 2^[8..16]");
}
return pval2gray;
} catch (IOException e) {
exit(e.getMessage());
throw new RuntimeException();
} finally {
CloseUtils.safeClose(r);
}
}
private static CommandLine parse(String[] args) {
Options opts = new Options();
OptionBuilder.withArgName("frame");
OptionBuilder.hasArg();
OptionBuilder.withDescription(
"frame to convert, 1 (= first frame) by default");
opts.addOption(OptionBuilder.create("f"));
OptionBuilder.withArgName("prfile");
OptionBuilder.hasArg();
OptionBuilder.withDescription(
"file path of presentation state to apply");
opts.addOption(OptionBuilder.create("p"));
OptionBuilder.withArgName("center");
OptionBuilder.hasArg();
OptionBuilder.withDescription("Window Center");
opts.addOption(OptionBuilder.create("c"));
OptionBuilder.withArgName("width");
OptionBuilder.hasArg();
OptionBuilder.withDescription("Window Width");
opts.addOption(OptionBuilder.create("w"));
opts.addOption("sigmoid", false,
"apply sigmoid VOI LUT function with given Window Center/Width");
opts.addOption("noauto", false,
"disable auto-windowing for images w/o VOI attributes");
OptionBuilder.withArgName("file");
OptionBuilder.hasArg();
OptionBuilder.withDescription(
"file path of P-Value to gray value map");
opts.addOption(OptionBuilder.create("pv2gray"));
OptionBuilder.withArgName(".xxx");
OptionBuilder.hasArg();
OptionBuilder.withDescription(
"jpeg file extension used with destination directory argument,"
+ " default: '.jpg'.");
opts.addOption(OptionBuilder.create("jpgext"));
opts.addOption("h", "help", false, "print this message");
opts.addOption("V", "version", false,
"print the version information and exit");
CommandLine cl = null;
try {
cl = new GnuParser().parse(opts, args);
} catch (ParseException e) {
exit("dcm2jpg: " + e.getMessage());
throw new RuntimeException("unreachable");
}
if (cl.hasOption('V')) {
Package p = Dcm2Jpeg.class.getPackage();
System.out.println("dcm2jpg v" + p.getImplementationVersion());
System.exit(0);
}
if (cl.hasOption('h') || cl.getArgList().size() < 2) {
HelpFormatter formatter = new HelpFormatter();
formatter.printHelp(USAGE, DESCRIPTION, opts, EXAMPLE);
System.exit(0);
}
return cl;
}
private static int parseInt(String s, String errPrompt, int min, int max) {
try {
int i = Integer.parseInt(s);
if (i >= min && i <= max)
return i;
} catch (NumberFormatException e) {
// parameter is not a valid integer; fall through to exit
}
exit(errPrompt);
throw new RuntimeException();
}
private static float parseFloat(String s, String errPrompt) {
try {
return Float.parseFloat(s);
} catch (NumberFormatException e) {
exit(errPrompt);
throw new RuntimeException();
}
}
private static void exit(String msg) {
System.err.println(msg);
System.err.println("Try 'dcm2jpg -h' for more information.");
System.exit(1);
}
}
Jar files used to run this code:
dcm4che-core-2.0.23.jar
dcm4che-image-2.0.23.jar
dcm4che-imageio-2.0.23.jar
dcm4che-imageio-rle-2.0.23.jar
slf4j-log4j12-1.5.0.jar
slf4j-api-1.5.0.jar
log4j-1.2.13.jar
commons-cli-1.2.jar
If you don't want to include the Dcm2Jpeg.java file directly, you can instead add the jar below, which contains the tool class:
dcm4che-tool-dcm2jpg-2.0.23.jar
From this jar you can import org.dcm4che2.tool.dcm2jpg.Dcm2Jpg, and the original snippet from the question then works once commons-cli-1.2.jar is also on the classpath (the missing org.apache.commons.cli.ParseException class is what caused the NoClassDefFoundError).

FileNotFoundException when using Hadoop distributed cache

This time, someone should please reply.
I am struggling to run my code using the distributed cache. I already have the files on HDFS, but when I run this code:
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferByte;
import java.awt.image.Raster;
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URISyntaxException;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.imageio.ImageIO;
import org.apache.hadoop.filecache.*;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import java.lang.String;
import java.lang.Runtime;
import java.net.URI;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
public class blur2 {
public static class BlurMapper extends MapReduceBase implements Mapper<Text, BytesWritable, LongWritable, BytesWritable>
{
OutputCollector<LongWritable, BytesWritable> goutput;
int IMAGE_HEIGHT = 240;
int IMAGE_WIDTH = 320;
public BytesWritable Gmiu;
public BytesWritable Gsigma;
public BytesWritable w;
byte[] bytes = new byte[IMAGE_HEIGHT*IMAGE_WIDTH*3];
public BytesWritable emit = new BytesWritable(bytes);
int count = 0;
int initVar = 125;
public LongWritable l = new LongWritable(1);
byte[] byte1 = new byte[IMAGE_HEIGHT*IMAGE_WIDTH];
byte[] byte2 = new byte[IMAGE_HEIGHT*IMAGE_WIDTH];
byte[] byte3 = new byte[IMAGE_HEIGHT*IMAGE_WIDTH];
public void map(Text key, BytesWritable file,OutputCollector<LongWritable, BytesWritable> output, Reporter reporter) throws IOException {
goutput = output;
BufferedImage img = ImageIO.read(new ByteArrayInputStream(file.getBytes()));
Raster ras=img.getData();
DataBufferByte db= (DataBufferByte)ras.getDataBuffer();
byte[] data = db.getData();
if(count==0){
for(int i=0;i<IMAGE_HEIGHT*IMAGE_WIDTH;i++)
{
byte1[i]=20;
byte2[i]=125;
}
Gmiu = new BytesWritable(data);
Gsigma = new BytesWritable(byte1);
w = new BytesWritable(byte2);
count++;
}
else{
byte1 = Gmiu.getBytes();
byte2 = Gsigma.getBytes();
byte3 = w.getBytes();
for(int i=0;i<IMAGE_HEIGHT*IMAGE_WIDTH;i++)
{
byte pixel = data[i];
Double tempmiu=new Double(0.0);
Double tempsig=new Double(0.0);
double temp1=0.0; double alpha = 0.05;
tempmiu = (1-alpha)*byte1[i] + alpha*pixel;
temp1=temp1+(pixel-byte1[i])*(pixel-byte1[i]);
tempsig=(1-alpha)*byte2[i]+ alpha*temp1;
byte1[i] = tempmiu.byteValue();
byte2[i]= tempsig.byteValue();
Double w1=new Double((1-alpha)*byte3[i]+alpha*100);
byte3[i] = w1.byteValue();
}
Gmiu.set(byte1,0,IMAGE_HEIGHT*IMAGE_WIDTH);
Gsigma.set(byte2,0,IMAGE_HEIGHT*IMAGE_WIDTH);
w.set(byte3,0,IMAGE_HEIGHT*IMAGE_WIDTH);
}
byte1 = Gsigma.getBytes();
for(int i=0;i<IMAGE_HEIGHT*IMAGE_WIDTH;i++)
{
bytes[i]=byte1[i];
}
byte1 = Gsigma.getBytes();
for(int i=0;i<IMAGE_HEIGHT*IMAGE_WIDTH;i++)
{
bytes[IMAGE_HEIGHT*IMAGE_WIDTH+i]=byte1[i];
}
byte1 = w.getBytes();
for(int i=0;i<IMAGE_HEIGHT*IMAGE_WIDTH;i++)
{
bytes[2*IMAGE_HEIGHT*IMAGE_WIDTH+i]=byte1[i];
}
emit.set(bytes,0,3*IMAGE_HEIGHT*IMAGE_WIDTH);
}
@Override
public void close(){
try{
goutput.collect(l, emit);
}
catch(Exception e){
e.printStackTrace();
System.exit(-1);
}
}
}
// end of first job; this part is running perfectly
public static void main(String[] args) throws URISyntaxException {
if(args.length!=3) {
System.err.println("Usage: blurvideo input output");
System.exit(-1);
}
JobClient client = new JobClient();
JobConf conf = new JobConf(blur2.class);
conf.setOutputValueClass(BytesWritable.class);
conf.setInputFormat(SequenceFileInputFormat.class);
//conf.setNumMapTasks(n)
SequenceFileInputFormat.addInputPath(conf, new Path(args[0]));
TextOutputFormat.setOutputPath(conf, new Path(args[1]));
conf.setMapperClass(BlurMapper.class);
conf.setNumReduceTasks(0);
//conf.setReducerClass(org.apache.hadoop.mapred.lib.IdentityReducer.class);
client.setConf(conf);
try {
JobClient.runJob(conf);
} catch (Exception e) {
e.printStackTrace();
}
// exec("jar cf /home/hmobile/hadoop-0.19.2/imag /home/hmobile/hadoop-0.19.2/output");
JobClient client2 = new JobClient();
JobConf conf2 = new JobConf(blur2.class);
conf2.setOutputValueClass(BytesWritable.class);
conf2.setInputFormat(SequenceFileInputFormat.class);
//conf.setNumMapTasks(n)
SequenceFileInputFormat.addInputPath(conf2, new Path(args[0]));
SequenceFileOutputFormat.setOutputPath(conf2, new Path(args[2]));
conf2.setMapperClass(BlurMapper2.class);
conf2.setNumReduceTasks(0);
DistributedCache.addCacheFile(new URI("~/ayush/output/part-00000"), conf2);// these files are already on the hdfs
DistributedCache.addCacheFile(new URI("~/ayush/output/part-00001"), conf2);
client2.setConf(conf2);
try {
JobClient.runJob(conf2);
} catch (Exception e) {
e.printStackTrace();
}
}
public static class BlurMapper2 extends MapReduceBase implements Mapper<Text, BytesWritable, LongWritable, BytesWritable>
{
int IMAGE_HEIGHT = 240;
int T =60;
int IMAGE_WIDTH = 320;
public BytesWritable Gmiu;
public BytesWritable Gsigma;
public BytesWritable w;
byte[] bytes = new byte[IMAGE_HEIGHT*IMAGE_WIDTH];
public BytesWritable emit = new BytesWritable(bytes);
int initVar = 125;int gg=0;
int K=64;int k=0,k1=0,k2=0;
public LongWritable l = new LongWritable(1);
byte[] Gmiu1 = new byte[IMAGE_HEIGHT*IMAGE_WIDTH*K];
byte[] Gsigma1 = new byte[IMAGE_HEIGHT*IMAGE_WIDTH*K];
byte[] w1 = new byte[IMAGE_HEIGHT*IMAGE_WIDTH*K];
public Path[] localFiles=new Path[2];
private FileSystem fs;
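// Runs once per task before map(): resolves the local on-disk copies of the files registered in the DistributedCache.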
@Override
public void configure(JobConf conf2)
{
try {
fs = FileSystem.getLocal(new Configuration());
localFiles = DistributedCache.getLocalCacheFiles(conf2);
//System.out.println(localFiles[0].getName());
} catch (IOException ex) {
Logger.getLogger(blur2.class.getName()).log(Level.SEVERE, null, ex);
}
}
public void map(Text key, BytesWritable file,OutputCollector<LongWritable, BytesWritable> output, Reporter reporter) throws IOException
{
if(gg==0){
//System.out.println(localFiles[0].getName());
String wrd; String line;
for(Path f:localFiles)
{
if(!f.getName().endsWith("crc"))
{
// FSDataInputStream localFile = fs.open(f);
BufferedReader br = null;
try {
br = new BufferedReader(new InputStreamReader(fs.open(f)));
int c = 0;
try {
while ((line = br.readLine()) != null) {
StringTokenizer itr = new StringTokenizer(line, " ");
while (itr.hasMoreTokens()) {
wrd = itr.nextToken();
c++;
int i = Integer.parseInt(wrd, 16);
Integer I = new Integer(i);
byte b = I.byteValue();
if (c < IMAGE_HEIGHT * IMAGE_WIDTH) {
Gmiu1[k] = b;k++;
} else {
if ((c >= IMAGE_HEIGHT * IMAGE_WIDTH) && (c < 2 * IMAGE_HEIGHT * IMAGE_WIDTH)) {
Gsigma1[k1] = b; k1++; // index with k1, the Gsigma1 counter (was k)
} else {
w1[k2] = b; k2++; // index with k2, the w1 counter (was k)
}
}
}
}
} catch (IOException ex) {
Logger.getLogger(blur2.class.getName()).log(Level.SEVERE, null, ex);
}
} catch (FileNotFoundException ex) {
Logger.getLogger(blur2.class.getName()).log(Level.SEVERE, null, ex);
} finally {
try {
br.close();
} catch (IOException ex) {
Logger.getLogger(blur2.class.getName()).log(Level.SEVERE, null, ex);
}
}
}
}
gg++;
}
}
}
}
I have struggled a lot with this. Can anyone please tell me why I am getting this error:
java.io.FileNotFoundException: File does not exist: ~/ayush/output/part-00000
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:394)
at org.apache.hadoop.filecache.DistributedCache.getTimestamp(DistributedCache.java:475)
at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:676)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:774)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1127)
at blur2.main(blur2.java:175)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
The problem is with the filename you are using: "~/ayush/output/part-00000" relies on Unix shell (sh, bash, ksh) tilde expansion to replace the "~" with the pathname of your home directory.
Java (like C, C++, and most other programming languages) does not do tilde expansion. You need to provide the pathname as "/home/ayush/output/part-00000" ... or whatever absolute pathname the tilded form expands to.
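Applied to the job setup above, the fix is to register absolute paths (the exact home directory here is an assumption):
// Java performs no tilde expansion, so pass the expanded absolute path
DistributedCache.addCacheFile(new URI("/home/ayush/output/part-00000"), conf2);
DistributedCache.addCacheFile(new URI("/home/ayush/output/part-00001"), conf2);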
Strictly speaking, the URI should be created as follows:
new File("/home/ayush/output/part-00000").toURI()
not as
new URI("/home/ayush/output/part-00000")
The latter creates a URI without a "protocol", and that could be problematic.
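A quick way to see the difference (the expected output is shown in the comments):
System.out.println(new File("/home/ayush/output/part-00000").toURI());
// file:/home/ayush/output/part-00000  -> has a "file" scheme
System.out.println(new URI("/home/ayush/output/part-00000"));
// /home/ayush/output/part-00000       -> scheme-less URI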
