How to decode / get the encoding of a file (Power BI Desktop file) in Java

I have an internal file (DataMashup) from a Power BI Desktop report (.pbix), which I am trying to decode.
My aim is to create a Power BI Desktop report and data model from a programming language. I am starting with Java.
The files appear to be encoded in some way.
I tried to detect the encoding of the file and the detector returns windows-1254, but decoding with that charset does not produce readable text.
File f = new File("example.txt");
String[] charsetsToBeTested = {"UTF-8", "windows-1254", "ISO-8859-7"};
CharsetDetector cd = new CharsetDetector();
Charset charset = cd.detectCharset(f, charsetsToBeTested);
if (charset != null) {
try {
InputStreamReader reader = new InputStreamReader(new FileInputStream(f), charset);
int c = 0;
while ((c = reader.read()) != -1) {
System.out.print((char)c);
}
reader.close();
} catch (FileNotFoundException fnfe) {
fnfe.printStackTrace();
}catch(IOException ioe){
ioe.printStackTrace();
}
}else{
System.out.println("Unrecognized charset.");
}
Unzipping the file does not work either:
public void unZipIt(String zipFile, String outputFolder)
{
byte buffer[] = new byte[1024];
try
{
File folder = new File(outputFolder);
if(!folder.exists())
{
folder.mkdir();
}
ZipInputStream zis = new ZipInputStream(new FileInputStream(zipFile));
System.out.println(zis);
System.out.println(zis.getNextEntry());
for(ZipEntry ze = zis.getNextEntry(); ze != null; ze = zis.getNextEntry())
{
String fileName = ze.getName();
System.out.println(ze);
File newFile = new File((new StringBuilder(String.valueOf(outputFolder))).append(File.separator).append(fileName).toString());
System.out.println((new StringBuilder("file unzip : ")).append(newFile.getAbsoluteFile()).toString());
(new File(newFile.getParent())).mkdirs();
FileOutputStream fos = new FileOutputStream(newFile);
int len;
while((len = zis.read(buffer)) > 0)
{
fos.write(buffer, 0, len);
}
fos.close();
}
zis.closeEntry();
zis.close();
System.out.println("Done");
}
catch(IOException ex)
{
ex.printStackTrace();
}
}

The file contains a binary header and then XML with UTF-8 specified.
The header data seems to hold the file name (Config/Package.xml), so assuming a zip format is understandable. With a zip format there would also be binary data at the end of the file.
Maybe the file was downloaded using FTP, and a text conversion ("\n" to "\r\n") was done. Then the zip would be corrupted. Renaming the file to .zip might help with testing the file using zip tools.
Try the .tar format first. This would be logical as the XML file is not compressed. Append .tar to the file name.
Otherwise, if the content is always UTF-8 XML:
Path f = Paths.get("example.txt");
String start ="<?xml";
String end = ">";
byte[] bytes = Files.readAllBytes(f);
String s = new String(bytes, StandardCharsets.ISO_8859_1); // Single byte encoding.
int startI = s.indexOf(start);
int endI = s.lastIndexOf(end) + end.length();
//bytes = Arrays.copyOfRange(bytes, startI, endI);
String xml = new String(bytes, startI, endI - startI, StandardCharsets.UTF_8);

You can use the System.IO.Packaging library (.NET) to extract the Power BI data mashup. The file uses the OPC (Open Packaging Conventions) package standard.
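Since the question is about Java, the same idea can be tried with java.util.zip. The sketch below is only an illustration under assumptions: it supposes that the package embedded in the file is an ordinary ZIP archive that starts at the first "PK\3\4" local-file-header signature after the binary header, and that its entries can be read in streaming mode. The class and method names (EmbeddedZipLister, findZipStart, listEntries) are made up for this example and are not part of any library.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

class EmbeddedZipLister {

    /** Returns the offset of the first "PK\3\4" local-file-header signature, or -1 if none. */
    static int findZipStart(byte[] data) {
        for (int i = 0; i + 3 < data.length; i++) {
            if (data[i] == 'P' && data[i + 1] == 'K' && data[i + 2] == 3 && data[i + 3] == 4) {
                return i;
            }
        }
        return -1;
    }

    /** Lists the entry names of a ZIP archive embedded somewhere inside data. */
    static List<String> listEntries(byte[] data) throws IOException {
        int start = findZipStart(data);
        if (start < 0) {
            throw new IOException("No ZIP local file header found");
        }
        List<String> names = new ArrayList<>();
        // Skip the assumed binary header and read from the signature onwards.
        // ZipInputStream stops at the central directory, so trailing bytes are never read.
        try (ZipInputStream zis = new ZipInputStream(
                new ByteArrayInputStream(data, start, data.length - start))) {
            for (ZipEntry e = zis.getNextEntry(); e != null; e = zis.getNextEntry()) {
                names.add(e.getName());
            }
        }
        return names;
    }
}
```

To try it, read the file with Files.readAllBytes and pass the array to listEntries. If no signature is found, or the entries cannot be read, the assumption about the layout does not hold for that file.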

Related

How to read multiple-encoded zip file

In my Java web application, when I upload a zip file (thread dump), I get an InputStream in the servlet. I use the Zip4j library to unzip the file and then write it into a file. This zip file has content in multiple encodings (UTF-8, windows-1252, ISO-8859-1, ISO-8859-2, IBM424_rtl). When I open the output file, I see characters like this: Mac OS X 2 € ² ATTR ² ˜
Here is the sample code. Could you please tell me how I can fix this issue?
// Using Zip4j library to uncompress ZIP format
ZipInputStream zis = new ZipInputStream(iStream);
FileOutputStream zos = new FileOutputStream("output_file.txt");
ByteArrayOutputStream out = new ByteArrayOutputStream();
LocalFileHeader localFileHeader = zis.getNextEntry();
while (localFileHeader != null) {
if(localFileHeader.isDirectory()) {
localFileHeader = zis.getNextEntry();
continue;
}
IOUtils.copy(zis, out);
localFileHeader = zis.getNextEntry();
}
InputStreamReader isr = new InputStreamReader(new ByteArrayInputStream(out.toByteArray()));
BufferedReader reader = new BufferedReader(isr);
String str;
while ((str = reader.readLine()) != null) {
// Custom method that returns the charset of the input string using the Apache Tika library
String encoding = CharsetDetector.detectCharset(str);
zos.write(str.getBytes(encoding));
zos.write("\n".getBytes());
}
isr.close();
reader.close();
zos.close();
zis.close();
// Method is used to detect charset
public static String detectCharset(String text) throws IOException {
org.apache.tika.parser.txt.CharsetDetector detector = new org.apache.tika.parser.txt.CharsetDetector();
detector.setText(text.getBytes());
String charset = detector.detect().getName();
return charset;
}
Note: I am running the application on a Windows machine.
Thanks in advance!

How to get Java Resource file into byte[]?

I have a Java program that needs to read a file from a resource within the JAR and it only takes it through byte[]. My problem is converting the resource file from a folder within the project (i.e. tools/test.txt) into byte[]. I have tried the following (gave an "undefined for type" error):
final byte[] temp = new File("tools/test.txt").getBytes();
Another method I tried resulted in not being able to find the file:
FileOutputStream fos = new FileOutputStream("tools/test.txt");
byte[] myByteArray = null;
fos.write(myByteArray);
fos.close();
System.out.println("Results = " + myByteArray);
Lastly, I tried InputStream and BufferedReader. This actually gave the content of the file when running the program from Eclipse, but came out as null when running it as a jar (I am assuming that it is also not reading the file).
InputStream is = null;
BufferedReader br = null;
String line;
ArrayList list = new ArrayList();
try {
is = Main.class.getResourceAsStream("tools/test.txt");
br = new BufferedReader(new InputStreamReader(is));
while (null != (line = br.readLine())) {
list.add(line);
System.out.println("Output:" + line);
}
while (null == (line = br.readLine())) {
System.out.println("Error loading file:" + line);
}
}
catch (Exception ef) {
ef.printStackTrace();
System.out.println("Output:" + ef);
}
So my question is, if I have a folder named "tools" and have a file called "test.txt", what code would I use to turn it into byte[] and still work when compiled into a Jar file?
ByteArrayOutputStream baos = new ByteArrayOutputStream();
InputStream in = Main.class.getResourceAsStream("/tools/test.txt");
byte[] buffer = new byte[4096];
for (;;) {
int nread = in.read(buffer);
if (nread <= 0) {
break;
}
baos.write(buffer, 0, nread);
}
byte[] data = baos.toByteArray();
String text = new String(data, "Windows-1252");
Byte[] asByteObjects = new Byte[data.length];
for (int i = 0; i < data.length; ++i) {
asByteObjects[i] = data[i];
}
Without the leading slash, the path would be relative to the package of the class. A ByteArrayOutputStream serves to collect the bytes into a byte[].
If the bytes represent text in some encoding, one can turn them into a String; here with Windows-1252 (Windows Latin-1).
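On Java 9 or later the read loop can be shortened with InputStream.readAllBytes(). The following is a minimal sketch of that variant, not the only way to do it; the class name ResourceBytes and the method load are hypothetical names chosen for this example. It also checks for null, because getResourceAsStream returns null when the resource is not found.

```java
import java.io.IOException;
import java.io.InputStream;

class ResourceBytes {

    /**
     * Reads a classpath resource fully into a byte[].
     * A path starting with '/' is resolved from the classpath root,
     * otherwise relative to the package of the anchor class.
     */
    static byte[] load(Class<?> anchor, String resourcePath) throws IOException {
        try (InputStream in = anchor.getResourceAsStream(resourcePath)) {
            if (in == null) {
                // getResourceAsStream signals "not found" with null, not with an exception.
                throw new IOException("Resource not found: " + resourcePath);
            }
            return in.readAllBytes(); // available since Java 9
        }
    }
}
```

For the layout in the question the call would be something like ResourceBytes.load(Main.class, "/tools/test.txt"), assuming the tools folder ends up at the root of the jar.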
Have you tried Scanner.nextByte()? Make a new Scanner with the file you want to parse as the input and use a for loop to create your array.

How to read a jar file, convert it to a string and create a new jar file from that string?

I'm trying to implement an "over the air" update mechanism for OSGi bundles. For that, I need to be able to create a jar file from a String (basically the content of the jar file read by JarInputStream). The following example code should illustrate my needs:
//read bundle to be copied!
File originalFile = new File(
"/Users/stefan/Documents/Projects/OSGi/SimpleBundle_1.0.0.201404.jar");
JarInputStream fis = new JarInputStream(new FileInputStream(originalFile));
StringBuilder stringBuilder = new StringBuilder();
int ch;
while ((ch = fis.read()) != -1) {
stringBuilder.append((char) ch);
}
fis.close();
//Create content string
String content = stringBuilder.toString();
if (logger.isInfoEnabled()) {
logger.info(content);
}
//Init new jar input stream
JarInputStream jarInputStream = new JarInputStream(
new ByteArrayInputStream(content.getBytes()));
if (logger.isInfoEnabled()) {
logger.info("Save content to disc!");
}
File newFile = new File(
"/Users/stefan/Documents/Projects/OSGi/equinox/SimpleBundle_1.0.0.201404.jar");
//Init new jar output stream
JarOutputStream fos = new JarOutputStream(
new FileOutputStream(newFile));
if (!newFile.exists()) {
newFile.createNewFile();
}
int BUFFER_SIZE = 10240;
byte buffer[] = new byte[BUFFER_SIZE];
while (true) {
int nRead = jarInputStream.read(buffer, 0,
buffer.length);
if (nRead <= 0)
break;
fos.write(buffer, 0, nRead);
}
//Write content to new jar file.
fos.flush();
fos.close();
jarInputStream.close();
Unfortunately, the created jar file is empty and throws an "Invalid input file" error if I try to open it with JD-GUI. Is it possible to create a jar file from the String "content"?
Best regards and thank you very much
Stefan
Your jar is empty because you do not read anything from the JarInputStream. If you want to read JarInputStream, you should iterate its entries. If you want to change the Manifest, the first entry should be skipped, use the getManifest() of the jarInputStream and the constructor of the JarOutputStream, where Manifest can be specified. Based on your code (no manifest change but plain jar copy):
ZipEntry zipEntry = jarInputStream.getNextEntry();
while (zipEntry != null) {
fos.putNextEntry(zipEntry);
// Simple stream copy comes here
int BUFFER_SIZE = 10240;
byte buffer[] = new byte[BUFFER_SIZE];
int l = jarInputStream.read(buffer);
while(l >= 0) {
fos.write(buffer, 0, l);
l = jarInputStream.read(buffer);
}
zipEntry = jarInputStream.getNextEntry();
}
You only need this if you want to change the content (Manifest or entries) of the JAR file during the copy. Otherwise, simple InputStream and FileOutputStream will do the work (as Tim said).
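If the jar really has to travel as a String, one option that the answer above does not mention is to Base64-encode the raw bytes. Casting each byte to a char and calling getBytes() with the platform charset, as in the question, does not reproduce the original bytes in general. The sketch below is an illustration only; JarAsText and its method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

class JarAsText {

    /** Encodes raw jar bytes as an ASCII-only Base64 string. */
    static String toText(byte[] jarBytes) {
        return Base64.getEncoder().encodeToString(jarBytes);
    }

    /** Restores the exact original bytes from the Base64 string. */
    static byte[] fromText(String text) {
        return Base64.getDecoder().decode(text);
    }

    /** Copies a jar by way of a String, only to show that the round trip is lossless. */
    static void copyViaString(Path source, Path target) throws IOException {
        String content = toText(Files.readAllBytes(source));
        Files.write(target, fromText(content));
    }
}
```

The encoded string is roughly a third larger than the jar, which is the price for being safe to log or to send through text-only channels.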

Java Connecting URL and downloading a zip but when extracting the zip it's not properly downloaded

I am sending a request XML to a URL and receiving a zip file, which is saved to a given path.
When the bandwidth is low, this zip file (about 120 MB) is sometimes not downloaded properly, and I get an error when extracting it. The extraction is also done in code. With high bandwidth the file downloads without any issue.
I'm looking for a solution that does not require more bandwidth. Is there a way to download this zip file at the program level, perhaps part by part? Any other solution would be highly appreciated.
Downloading :
url = new URL(_URL);
sc = (HttpURLConnection) url.openConnection();
sc.setDoInput(true);
sc.setDoOutput(true);
sc.setRequestMethod("POST");
sc.connect();
OutputStream mOstr = sc.getOutputStream();
mOstr.write(request.getBytes());
InputStream in = sc.getInputStream();
FileOutputStream out = new FileOutputStream(path);
int count;
byte[] buffer = new byte[86384];
while ((count = in.read(buffer,0,buffer.length)) > 0)
out.write(buffer, 0, count);
out.close();
Extracting :
try {
ZipFile zipFile = new ZipFile(path+zFile);
Enumeration<?> enu = zipFile.entries();
while (enu.hasMoreElements()) {
ZipEntry zipEntry = (ZipEntry) enu.nextElement();
String name = path+"/data_FILES/"+zipEntry.getName();
long size = zipEntry.getSize();
long compressedSize = zipEntry.getCompressedSize();
System.out.printf("name: %-20s | size: %6d | compressed size: %6d\n", name, size, compressedSize);
File file = new File(name);
if (name.endsWith("/")) {
file.mkdirs();
continue;
}
File parent = file.getParentFile();
if (parent != null) {
parent.mkdirs();
}
InputStream is = zipFile.getInputStream(zipEntry);
FileOutputStream fos = new FileOutputStream(file);
byte[] bytes = new byte[86384];
int length;
while ((length = is.read(bytes)) >= 0) {
fos.write(bytes, 0, length);
}
is.close();
fos.close();
}
zipFile.close();
} catch (Exception e) {
log("Error in extracting zip file ");
e.printStackTrace();
}

Unzipping the content of a file

I have an application where Service A provides zipped data to Service B, and Service B needs to unzip it.
Service A exposes a method getStream, which returns a ByteArrayInputStream whose content is the zipped data.
However, passing that to GZIPInputStream throws a "Not in GZIP format" exception.
InputStream ins = method.getInputStream();
GZIPInputStream gis = new GZIPInputStream(ins);
This throws an exception. When the file is dumped in Service A, the data is zipped, so getInputStream returns the zipped data.
How do I process it and pass it to the GZIPInputStream?
Regards
Dheeraj Joshi
If it is zipped, then you must use ZipInputStream.
It does depend on the "zip" format. There are multiple formats that have the zip name (zip, gzip, bzip2, lzip) and different formats call for different parsers.
http://en.wikipedia.org/wiki/List_of_archive_formats
http://www.codeguru.com/java/tij/tij0115.shtml
http://docstore.mik.ua/orelly/java-ent/jnut/ch25_01.htm
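One way to find out which of these formats a stream holds is to look at its first bytes before choosing a parser. The sketch below checks three well-known signatures: gzip starts with 0x1F 0x8B, a non-empty zip archive with "PK" followed by 3 and 4, and bzip2 with "BZh". The class name CompressionSniffer and the returned labels are hypothetical and exist only for this example.

```java
class CompressionSniffer {

    /** Guesses the container format from the leading bytes of the data. */
    static String detect(byte[] head) {
        if (head.length >= 2 && (head[0] & 0xFF) == 0x1F && (head[1] & 0xFF) == 0x8B) {
            return "gzip";   // handle with java.util.zip.GZIPInputStream
        }
        if (head.length >= 4 && head[0] == 'P' && head[1] == 'K' && head[2] == 3 && head[3] == 4) {
            return "zip";    // handle with java.util.zip.ZipInputStream
        }
        if (head.length >= 3 && head[0] == 'B' && head[1] == 'Z' && head[2] == 'h') {
            return "bzip2";  // not supported by the JDK itself
        }
        return "unknown";
    }
}
```

With a ByteArrayInputStream, as in the question, the first four bytes can be read, passed to detect, and the stream reset() afterwards, since that stream type supports mark and reset.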
If you are using zip then try this code:
public void doUnzip(InputStream is, String destinationDirectory) throws IOException {
int BUFFER = 2048;
// make destination folder
File unzipDestinationDirectory = new File(destinationDirectory);
unzipDestinationDirectory.mkdir();
ZipInputStream zis = new ZipInputStream(is);
// Process each entry
for (ZipEntry entry = zis.getNextEntry(); entry != null; entry = zis
.getNextEntry()) {
File destFile = new File(unzipDestinationDirectory, entry.getName());
// create the parent directory structure if needed
destFile.getParentFile().mkdirs();
try {
// extract file if not a directory
if (!entry.isDirectory()) {
// establish buffer for writing file
byte data[] = new byte[BUFFER];
// write the current file to disk
FileOutputStream fos = new FileOutputStream(destFile);
BufferedOutputStream dest = new BufferedOutputStream(fos,
BUFFER);
// read and write until last byte is encountered
for (int bytesRead; (bytesRead = zis.read(data, 0, BUFFER)) != -1;) {
dest.write(data, 0, bytesRead);
}
dest.flush();
dest.close();
}
} catch (IOException ioe) {
ioe.printStackTrace();
}
}
is.close();
}
public static void main(String[] args) {
UnzipInputStream unzip = new UnzipInputStream();
try {
InputStream fis = new FileInputStream(new File("test.zip"));
unzip.doUnzip(fis, "output");
} catch (IOException e) {
e.printStackTrace();
}
}
