GLPK Java and .mod file - Java

I've got a .mod file and I can run it in Java (using NetBeans).
The file gets its data from another .dat file, because the developer who wrote it used GUSEK. Now we need to implement it in Java, but I don't know how to put data into the set K in the .mod file.
The way doesn't matter; it can be through database queries or file reading.
I don't know anything about mathematical programming; I just need to add values to the already-written GLPK model.
Here's the .mod file:
# OPRE
set K;
param mc {k in K};
param phi {k in K};
param cman {k in K};
param ni {k in K};
param cesp;
param mf;
var x {k in K} binary;
minimize custo: sum {k in K} (mc[k]*phi[k]*(1-x[k]) + cman[k]*phi[k]*x[k]);
s.t. recursos: sum {k in K} (cman[k]*phi[k]*x[k]) - cesp <= 0;
s.t. ocorrencias: sum {k in K} (ni[k] + (1-x[k])*phi[k]) - mf <= 0;
end;
And here's the Java code:
package br.com.genera.service.otimi;
import org.gnu.glpk.*;
public class Gmpl implements GlpkCallbackListener, GlpkTerminalListener {
private boolean hookUsed = false;
public static void main(String[] arg) {
String[] nomeArquivo = new String[2];
nomeArquivo[0] = "C:\\PodaEquipamento.mod";
System.out.println(nomeArquivo[0]);
GLPK.glp_java_set_numeric_locale("C");
System.out.println(nomeArquivo[0]);
new Gmpl().solve(nomeArquivo);
}
public void solve(String[] arg) {
glp_prob lp = null;
glp_tran tran;
glp_iocp iocp;
String fname;
int skip = 0;
int ret;
// listen to callbacks
GlpkCallback.addListener(this);
// listen to terminal output
GlpkTerminal.addListener(this);
fname = arg[0];
lp = GLPK.glp_create_prob();
System.out.println("Problem created");
tran = GLPK.glp_mpl_alloc_wksp();
ret = GLPK.glp_mpl_read_model(tran, fname, skip);
if (ret != 0) {
GLPK.glp_mpl_free_wksp(tran);
GLPK.glp_delete_prob(lp);
throw new RuntimeException("Model file not found: " + fname);
}
// generate model
GLPK.glp_mpl_generate(tran, null);
// build model
GLPK.glp_mpl_build_prob(tran, lp);
// set solver parameters
iocp = new glp_iocp();
GLPK.glp_init_iocp(iocp);
iocp.setPresolve(GLPKConstants.GLP_ON);
// do not listen to output anymore
GlpkTerminal.removeListener(this);
// solve model
ret = GLPK.glp_intopt(lp, iocp);
// postsolve model
if (ret == 0) {
GLPK.glp_mpl_postsolve(tran, lp, GLPKConstants.GLP_MIP);
}
// free memory
GLPK.glp_mpl_free_wksp(tran);
GLPK.glp_delete_prob(lp);
// do not listen for callbacks anymore
GlpkCallback.removeListener(this);
// check that the hook function has been used for terminal output.
if (!hookUsed) {
System.out.println("Error: The terminal output hook was not used.");
System.exit(1);
}
}
@Override
public boolean output(String str) {
hookUsed = true;
System.out.print(str);
return false;
}
@Override
public void callback(glp_tree tree) {
int reason = GLPK.glp_ios_reason(tree);
if (reason == GLPKConstants.GLP_IBINGO) {
System.out.println("Better solution found");
}
}
}
And I'm getting this in the console:
Reading model section from C:\PodaEquipamento.mod...
33 lines were read
Generating custo...
C:\PodaEquipamento.mod:24: no value for K
glp_mpl_build_prob: invalid call sequence
Hope someone can help, thanks.

The best way would be to read the data file the same way you read the model file.
ret = GLPK.glp_mpl_read_data(tran, fname_data, skip);
if (ret != 0) {
GLPK.glp_mpl_free_wksp(tran);
GLPK.glp_delete_prob(lp);
throw new RuntimeException("Data file not found: " + fname_data);
}
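For context, a minimal sketch of where that call fits in the usual GLPK MathProg call sequence (fname_data stands in for your own .dat path):
// read the model section from the .mod file
ret = GLPK.glp_mpl_read_model(tran, fname, skip);
// read the data section from the .dat file
ret = GLPK.glp_mpl_read_data(tran, fname_data, skip);
// only after both reads, generate and build the problem
GLPK.glp_mpl_generate(tran, null);
GLPK.glp_mpl_build_prob(tran, lp);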

I resolved it by just copying the data block from the .dat file into the .mod file.
Anyway, thanks puhgee.
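For anyone else taking this route: in GMPL the data block goes after the model statements, replacing the final end; of the model section. A sketch using this model's set and parameters, with values invented for illustration:
data;
set K := 1 2 3;
param mc := 1 10.0  2 12.5  3 9.0;
param phi := 1 0.3  2 0.4  3 0.2;
param cman := 1 5.0  2 6.0  3 4.5;
param ni := 1 1  2 2  3 1;
param cesp := 100;
param mf := 50;
end;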


Can you rebalance an unbalanced Spliterator of unknown size?

I want to use a Stream to parallelize processing of a heterogeneous set of remotely stored JSON files whose number is not known upfront. The files can vary widely in size, from 1 JSON record per file up to 100,000 records in some other files. A JSON record in this case means a self-contained JSON object represented as one line in the file.
I really want to use Streams for this and so I implemented this Spliterator:
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Spliterator;
import java.util.Spliterators.AbstractSpliterator;
import java.util.function.Consumer;
public abstract class JsonStreamSpliterator<METADATA, RECORD> extends AbstractSpliterator<RECORD> {
abstract protected JsonStreamSupport<METADATA> openInputStream(String path);
abstract protected RECORD parse(METADATA metadata, Map<String, Object> json);
private static final int ADDITIONAL_CHARACTERISTICS = Spliterator.IMMUTABLE | Spliterator.DISTINCT | Spliterator.NONNULL;
private static final int MAX_BUFFER = 100;
private final Iterator<String> paths;
private JsonStreamSupport<METADATA> reader = null;
public JsonStreamSpliterator(Iterator<String> paths) {
this(Long.MAX_VALUE, ADDITIONAL_CHARACTERISTICS, paths);
}
private JsonStreamSpliterator(long est, int additionalCharacteristics, Iterator<String> paths) {
super(est, additionalCharacteristics);
this.paths = paths;
}
private JsonStreamSpliterator(long est, int additionalCharacteristics, Iterator<String> paths, String nextPath) {
this(est, additionalCharacteristics, paths);
open(nextPath);
}
@Override
public boolean tryAdvance(Consumer<? super RECORD> action) {
if(reader == null) {
String path = takeNextPath();
if(path != null) {
open(path);
}
else {
return false;
}
}
Map<String, Object> json = reader.readJsonLine();
if(json != null) {
RECORD item = parse(reader.getMetadata(), json);
action.accept(item);
return true;
}
else {
reader.close();
reader = null;
return tryAdvance(action);
}
}
private void open(String path) {
reader = openInputStream(path);
}
private String takeNextPath() {
synchronized(paths) {
if(paths.hasNext()) {
return paths.next();
}
}
return null;
}
@Override
public Spliterator<RECORD> trySplit() {
String nextPath = takeNextPath();
if(nextPath != null) {
return new JsonStreamSpliterator<METADATA,RECORD>(Long.MAX_VALUE, ADDITIONAL_CHARACTERISTICS, paths, nextPath) {
@Override
protected JsonStreamSupport<METADATA> openInputStream(String path) {
return JsonStreamSpliterator.this.openInputStream(path);
}
@Override
protected RECORD parse(METADATA metaData, Map<String,Object> json) {
return JsonStreamSpliterator.this.parse(metaData, json);
}
};
}
else {
List<RECORD> records = new ArrayList<RECORD>();
while(tryAdvance(records::add) && records.size() < MAX_BUFFER) {
// loop
}
if(records.size() != 0) {
return records.spliterator();
}
else {
return null;
}
}
}
}
The problem I'm having is that while the Stream parallelizes beautifully at first, eventually the largest file is left processing in a single thread. I believe the proximal cause is well documented: the spliterator is "unbalanced".
More concretely, it appears that the trySplit method is not called after a certain point in the Stream.forEach lifecycle, so the extra logic to distribute small batches at the end of trySplit is rarely executed.
Notice how all the spliterators returned from trySplit share the same paths iterator. I thought this was a really clever way to balance the work across all spliterators, but it hasn't been enough to achieve full parallelism.
I would like the parallel processing to proceed first across files, and then when few large files are still left spliterating, I want to parallelize across chunks of the remaining files. That was the intent of the else block at the end of trySplit.
Is there an easy / simple / canonical way around this problem?
Your trySplit should output splits of equal size, regardless of the size of the underlying files. You should treat all the files as a single unit and fill up the ArrayList-backed spliterator with the same number of JSON objects each time. The number of objects should be such that processing one split takes between 1 and 10 milliseconds: lower than 1 ms and you start approaching the costs of handing off the batch to a worker thread, higher than that and you start risking uneven CPU load due to tasks which are too coarse-grained.
The spliterator is not obliged to report a size estimate, and you are already doing this correctly: your estimate is Long.MAX_VALUE, which is a special value meaning "unbounded". However, if you have many files with a single JSON object, resulting in batches of size 1, this will hurt your performance in two ways: the overhead of opening-reading-closing the file may become a bottleneck and, if you manage to escape that, the cost of thread handoff may be significant compared to the cost of processing one item, again causing a bottleneck.
Five years ago I was solving a similar problem; you can have a look at my solution.
After much experimentation, I was still not able to get any added parallelism by playing with the size estimates. Basically, any value other than Long.MAX_VALUE will tend to cause the spliterator to terminate too early (and without any splitting), while on the other hand a Long.MAX_VALUE estimate will cause trySplit to be called relentlessly until it returns null.
The solution I found is to internally share resources among the spliterators and let them rebalance amongst themselves.
Working code:
import java.io.InputStream;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators.AbstractSpliterator;
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.function.Consumer;
import java.util.function.Function;
import com.amazonaws.services.s3.model.S3ObjectSummary;
public class AwsS3LineSpliterator<LINE> extends AbstractSpliterator<AwsS3LineInput<LINE>> {
public final static class AwsS3LineInput<LINE> {
final public S3ObjectSummary s3ObjectSummary;
final public LINE lineItem;
public AwsS3LineInput(S3ObjectSummary s3ObjectSummary, LINE lineItem) {
this.s3ObjectSummary = s3ObjectSummary;
this.lineItem = lineItem;
}
}
private final class InputStreamHandler {
final S3ObjectSummary file;
final InputStream inputStream;
InputStreamHandler(S3ObjectSummary file, InputStream is) {
this.file = file;
this.inputStream = is;
}
}
private final Iterator<S3ObjectSummary> incomingFiles;
private final Function<S3ObjectSummary, InputStream> fileOpener;
private final Function<InputStream, LINE> lineReader;
private final Deque<S3ObjectSummary> unopenedFiles;
private final Deque<InputStreamHandler> openedFiles;
private final Deque<AwsS3LineInput<LINE>> sharedBuffer;
private final int maxBuffer;
private AwsS3LineSpliterator(Iterator<S3ObjectSummary> incomingFiles, Function<S3ObjectSummary, InputStream> fileOpener,
Function<InputStream, LINE> lineReader,
Deque<S3ObjectSummary> unopenedFiles, Deque<InputStreamHandler> openedFiles, Deque<AwsS3LineInput<LINE>> sharedBuffer,
int maxBuffer) {
super(Long.MAX_VALUE, 0);
this.incomingFiles = incomingFiles;
this.fileOpener = fileOpener;
this.lineReader = lineReader;
this.unopenedFiles = unopenedFiles;
this.openedFiles = openedFiles;
this.sharedBuffer = sharedBuffer;
this.maxBuffer = maxBuffer;
}
public AwsS3LineSpliterator(Iterator<S3ObjectSummary> incomingFiles, Function<S3ObjectSummary, InputStream> fileOpener, Function<InputStream, LINE> lineReader, int maxBuffer) {
this(incomingFiles, fileOpener, lineReader, new ConcurrentLinkedDeque<>(), new ConcurrentLinkedDeque<>(), new ArrayDeque<>(maxBuffer), maxBuffer);
}
@Override
public boolean tryAdvance(Consumer<? super AwsS3LineInput<LINE>> action) {
AwsS3LineInput<LINE> lineInput;
synchronized(sharedBuffer) {
lineInput=sharedBuffer.poll();
}
if(lineInput != null) {
action.accept(lineInput);
return true;
}
InputStreamHandler handle = openedFiles.poll();
if(handle == null) {
S3ObjectSummary unopenedFile = unopenedFiles.poll();
if(unopenedFile == null) {
return false;
}
handle = new InputStreamHandler(unopenedFile, fileOpener.apply(unopenedFile));
}
for(int i=0; i < maxBuffer; ++i) {
LINE line = lineReader.apply(handle.inputStream);
if(line != null) {
synchronized(sharedBuffer) {
sharedBuffer.add(new AwsS3LineInput<LINE>(handle.file, line));
}
}
else {
return tryAdvance(action);
}
}
openedFiles.addFirst(handle);
return tryAdvance(action);
}
@Override
public Spliterator<AwsS3LineInput<LINE>> trySplit() {
synchronized(incomingFiles) {
if (incomingFiles.hasNext()) {
unopenedFiles.add(incomingFiles.next());
return new AwsS3LineSpliterator<LINE>(incomingFiles, fileOpener, lineReader, unopenedFiles, openedFiles, sharedBuffer, maxBuffer);
} else {
return null;
}
}
}
}
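A possible way to drive this spliterator; objectSummaries, openS3Object and processLine here are hypothetical stand-ins for your own S3 listing, opening and processing code:
// assumption: you already have an Iterator<S3ObjectSummary> plus opener/reader functions
AwsS3LineSpliterator<String> split = new AwsS3LineSpliterator<>(
        objectSummaries.iterator(),
        summary -> openS3Object(summary),
        in -> readLine(in),
        100);
java.util.stream.StreamSupport.stream(split, true)   // true = parallel
        .forEach(input -> processLine(input.lineItem));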
This is not a direct answer to your question, but I think it is worth trying the Stream in the abacus-common library:
void test_58601518() throws Exception {
final File tempDir = new File("./temp/");
// Prepare the test files:
// if (!(tempDir.exists() && tempDir.isDirectory())) {
// tempDir.mkdirs();
// }
//
// final Random rand = new Random();
// final int fileCount = 1000;
//
// for (int i = 0; i < fileCount; i++) {
// List<String> lines = Stream.repeat(TestUtil.fill(Account.class), rand.nextInt(1000) * 100 + 1).map(it -> N.toJSON(it)).toList();
// IOUtil.writeLines(new File("./temp/_" + i + ".json"), lines);
// }
N.println("Xmx: " + IOUtil.MAX_MEMORY_IN_MB + " MB");
N.println("total file size: " + Stream.listFiles(tempDir).mapToLong(IOUtil::sizeOf).sum() / IOUtil.ONE_MB + " MB");
final AtomicLong counter = new AtomicLong();
final Consumer<Account> yourAction = it -> {
counter.incrementAndGet();
it.toString().replace("a", "bbb");
};
long startTime = System.currentTimeMillis();
Stream.listFiles(tempDir) // the file/data source could be local file system or remote file system.
.parallel(2) // thread number used to load the file/data and convert the lines to Java objects.
.flatMap(f -> Stream.lines(f).map(line -> N.fromJSON(Account.class, line))) // only certain lines (less 1024) will be loaded to memory.
.parallel(8) // thread number used to execute your action.
.forEach(yourAction);
N.println("Took: " + ((System.currentTimeMillis()) - startTime) + " ms" + " to process " + counter + " lines/objects");
// IOUtil.deleteAllIfExists(tempDir);
}
From start to finish, the CPU usage on my laptop stayed pretty high (about 70%), and it took about 70 seconds to process 51,899,100 lines/objects from 1000 files on an Intel(R) Core(TM) i5-8365U CPU with -Xmx256m of JVM memory. The total file size is about 4524 MB. If yourAction is not a heavy operation, a sequential stream could even be faster than a parallel one.
FYI: I'm the developer of abacus-common.

iocp.setPresolve(GLPKConstants.GLP_ON); raises an error

Here is my code:
public void solve(String[] arg) throws FileNotFoundException {
glp_prob lp = null;
glp_tran tran;
glp_iocp iocp;
String fname;
//String res = null;
int skip = 0;
int ret;
// listen to callbacks
GlpkCallback.addListener(this);
// listen to terminal output
GlpkTerminal.addListener(this);
fname = new String(arg[0]);
System.out.println(fname);
lp = GLPK.glp_create_prob();
System.out.println("Problem created");
tran = GLPK.glp_mpl_alloc_wksp();
ret = GLPK.glp_mpl_read_model(tran, fname, skip);
if (ret != 0) {
GLPK.glp_mpl_free_wksp(tran);
GLPK.glp_delete_prob(lp);
System.out.println(ret);
throw new RuntimeException("Model file not found: " + fname);
}
// generate model
GLPK.glp_mpl_generate(tran, null);
// build model
GLPK.glp_mpl_build_prob(tran, lp);
// set solver parameters
iocp = new glp_iocp();
GLPK.glp_init_iocp(iocp);
iocp.setPresolve(GLPKConstants.GLP_ON);
// do not listen to output anymore
GlpkTerminal.removeListener(this);
// solve model
ret = GLPK.glp_intopt(lp, iocp);
// postsolve model
if (ret == 0) {
GLPK.glp_mpl_postsolve(tran, lp, GLPKConstants.GLP_MIP);
write_lp_solution(lp);
}
// free memory
GLPK.glp_mpl_free_wksp(tran);
GLPK.glp_delete_prob(lp);
// do not listen for callbacks anymore
GlpkCallback.removeListener((GlpkCallbackListener) this);
// check that the hook function has been used for terminal output.
if (!hookUsed) {
System.out.println("Error: The terminal output hook was not used.");
System.exit(1);
}
}
When I run it I get this error:
Exception in thread "main" java.lang.UnsatisfiedLinkError: org.gnu.glpk.GLPKJNI.GLP_USE_AT_get()I
at org.gnu.glpk.GLPKJNI.GLP_USE_AT_get(Native Method)
at org.gnu.glpk.GLPKConstants.<clinit>(GLPKConstants.java:74)
at glpk.Optimisation.solve(Optimisation.java:58)
Line 58 corresponds to
iocp.setPresolve(GLPKConstants.GLP_ON);
This code worked well before, but since I changed the model to be solved I get this error.
But when I run it from my terminal:
glpsol --model mymodel.mod
all works well; the linear problem is solved.
I have to admit I have no idea where this error comes from.
If anyone can help me...
Looks like the JNI library is missing in the classpath. Have a look at http://glpk-java.sourceforge.net/architecture.html.
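For reference, a GLPK for Java program is typically launched with the JNI library directory on java.library.path. The paths and main class below are assumptions; point them at your own installation (the class name is taken from the stack trace above):
# hypothetical install locations; adjust to where libglpk_java actually lives
java -Djava.library.path=/usr/local/lib/jni \
     -cp /usr/local/share/java/glpk-java.jar:. \
     glpk.Optimisation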

How can a Java program list all partitions and get their free space on Linux?

I want the name of each partition with its total space, used space and free space on a Linux system, using a Java program.
I am getting correct values on a Windows system, but on Linux I am getting only one drive's information.
Here is what I have tried so far.
import java.io.File;
import javax.swing.filechooser.FileSystemView;
public class DiskSpace {
public static void main(String[] args) {
FileSystemView fsv = FileSystemView.getFileSystemView();
File[] drives = File.listRoots();
if (drives != null && drives.length > 0) {
for (File aDrive : drives) {
System.out.println("Drive Letter: " + aDrive);
System.out.println("\tType: " + fsv.getSystemTypeDescription(aDrive));
System.out.println("\tTotal space: " + aDrive.getTotalSpace());
System.out.println("\tFree space: " + aDrive.getFreeSpace());
System.out.println();
}
}
}
}
There are no drive letters on Linux. If you want to know which partitions there are, and where they are mounted, read /proc/mounts. When you have a mount point (2nd column in /proc/mounts), use new File(mountpoint).getTotalSpace() to get the total space.
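A minimal sketch of that approach (no external libraries; error handling kept deliberately thin):
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;
public class Mounts {
    public static void main(String[] args) throws Exception {
        // each line of /proc/mounts: device mountpoint fstype options dump pass
        for (String line : Files.readAllLines(Paths.get("/proc/mounts"))) {
            String[] fields = line.split(" ");
            File mountPoint = new File(fields[1]);
            System.out.println(fields[1]
                    + " total=" + mountPoint.getTotalSpace()
                    + " free=" + mountPoint.getFreeSpace());
        }
    }
}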
Linux, Unix, and Unix-like systems have one filesystem with one root, within which there may be multiple mount points at which partitions containing Unix filesystems, partial or complete, may be mounted. Non-Unix filesystems may also be mounted, given the appropriate software to handle the necessary transformations, but the unified, single-root filesystem model remains.
If the FileSystemView class you are using is from the javax.swing.filechooser package, don't expect too much:
FileSystemView is JFileChooser's gateway to the file system. Since the JDK1.1 File API doesn't allow access to such information as root partitions, file type information, or hidden file bits, this class is designed to intuit as much OS-specific file system information as possible.
Java Licensees may want to provide a different implementation of FileSystemView to better handle a given operating system.
That second paragraph is key.
Java's virtual machine implementation is meant to abstract away the very kind of platform specific things you want in this case. To be successful, you will need to write or find native call wrapper classes for the native system API of each platform you will support. "High-level" abstractions like the FileSystemView class are unlikely to be complete or reliable in providing the information you need.
This is a code snippet to display the names of mounted external media only.
String OS = System.getProperty("os.name");
if (OS.equals("Linux"))
{
StringBuilder s = new StringBuilder();
Runtime rt = Runtime.getRuntime();
int n;
Process ps = rt.exec("ls /run/media/Rancho"); // write your user name; mine is Rancho
InputStream in = ps.getInputStream();
while ((n = in.read()) != -1)
{
s.append((char) n);
}
System.out.println(s);
}
I know I'm late to the party here, but I wrote a class that just does its job. What it does is read the /proc/mounts file and parse each line into the block device (like /dev/sda), mount point, filesystem type and mount options. If you don't need the additional things that I wrote, you can exclude them, but for the project I am working on I wanted to include them, like whether the current user can write to or read from a given mount point.
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;
public class UnixPartition {
private final String block, mount, fsType, mntOptions;
private final long totalSpace, freeSpace, usableSpace;
private final File root;
protected UnixPartition(String block, String mount, String fsType, String mountOptions, long totalSpace, long freeSpace, long usableSpace, File root) {
this.block = block;
this.mount = mount;
this.fsType = fsType;
this.mntOptions = mountOptions;
this.totalSpace = totalSpace;
this.freeSpace = freeSpace;
this.usableSpace = usableSpace;
this.root = root;
}
public String getBlock() {
return block;
}
public String getMount() {
return mount;
}
public String getFilesystemType() {
return fsType;
}
public long getTotalSpace() {
return totalSpace;
}
public long getFreeSpace() {
return freeSpace;
}
public long getUsableSpace() {
return usableSpace;
}
public String getMountOptions() {
return mntOptions;
}
public boolean canCurrentUserWrite() {
return root.canWrite();
}
public boolean canCurrentUserRead() {
return root.canRead();
}
private String getLineSeparator(String contents) {
char[] chars = contents.toCharArray();
long r = 0;
long n = 0;
for (char c : chars) {
if (c == '\r')
r++;
if (c == '\n')
n++;
}
if (r == n)
return "\r\n";
else if (r >= 1 && n == 0)
return "\r";
else if (n >= 1 && r == 0)
return "\n";
else
return "";
}
public static List<UnixPartition> getPartitions(boolean filterSpecials) throws IOException {
// FileUtil.read is the author's own helper that returns the file contents as bytes
String contents = new String(FileUtil.read("/proc/mounts"));
String[] lines = contents.split(getLineSeparator(contents));
List<UnixPartition> partitions = new ArrayList<>();
for (String line : lines) {
StringTokenizer tokenizer = new StringTokenizer(line, " ");
String blk = tokenizer.nextToken(),
mnt = tokenizer.nextToken(),
type = tokenizer.nextToken(),
opt = tokenizer.nextToken();
if (filterSpecials) {
if ((blk.contains("proc") || mnt.contains("proc") || type.contains("proc")) ||
(blk.contains("systemd") || mnt.contains("systemd") || type.contains("systemd")) ||
(blk.contains("binmft_misc") || mnt.contains("binmft_misc") || type.contains("binmft_misc")) ||
(blk.contains("udev") || mnt.contains("udev") || type.contains("udev")) ||
(blk.contains("devpts") || mnt.contains("devpts") || type.contains("devpts")) ||
(blk.contains("fuse") || mnt.contains("fuse") || type.contains("fuse")) ||
(blk.contains("pstore") || mnt.contains("pstore") || type.contains("pstore")) ||
type.contains("tmp"))
continue;
if (blk.contains("none"))
continue;
}
File root = new File(mnt);
partitions.add(new UnixPartition(blk, mnt, type, opt,
root.getTotalSpace(), root.getFreeSpace(), root.getUsableSpace(), root));
}
return partitions;
}
}
Edit: You might want to filter out the special mount types yourself, since I just took them from another answer here.
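A possible usage sketch of the class above (output formatting is mine):
for (UnixPartition p : UnixPartition.getPartitions(true)) {
    System.out.println(p.getBlock() + " on " + p.getMount()
            + " (" + p.getFilesystemType() + "): "
            + p.getFreeSpace() + " / " + p.getTotalSpace() + " bytes free");
}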

What is the least painful way to extract data at the protocol layer in Java?

I'm trying to implement an Android application to connect to the open source software Motion. The goal is to be able to check the status of the application and get the last image captured.
I do not program in Java very much; my background is principally in C and Python. I've not had any real issues understanding the UI part of Android, but I've found it incredibly painful to work with any sort of byte buffer. The Motion software has an HTTP API that is very simple, and opening the URL connection is easy in Java. The response from the default page looks like this:
Motion 3.2.12 Running [4] Threads
0
1
2
3
For my purposes the first thing the application needs to do is parse out the number of threads. At some point I can also retrieve the version number from the first line, but that's not really important presently.
Here's my code
package com.hydrogen18.motionsurveillanceviewer;
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;
public class MotionHttpApi {
String host;
int port = 80;
boolean secure = false;
int numberOfThreads = -1;
String getBaseUrl()
{
StringBuilder sb = new StringBuilder();
sb.append(secure ? "https://" : "http://");
sb.append(host);
sb.append(':');
sb.append(port);
return sb.toString();
}
public int getNumberOfCameras() throws IOException
{
if(numberOfThreads == -1)
{
retrieveSplash();
}
if(numberOfThreads == 1)
{
return 1;
}
return numberOfThreads - 1;
}
void retrieveSplash () throws IOException
{
URL url = new URL(getBaseUrl());
HttpURLConnection conn = (HttpURLConnection)url.openConnection();
if(conn.getResponseCode()!=HttpURLConnection.HTTP_OK)
{
throw new IOException("Got response code" + conn.getResponseCode());
}
try{
Byte[] buffer = new Byte[512];
byte[] sbuf = new byte[128];
int offset = 0;
InputStream in = new BufferedInputStream(conn.getInputStream());
boolean foundInfoString= false;
while( ! foundInfoString)
{
//Check to make sure we have not run out of space
if(offset == buffer.length)
{
throw new IOException("Response too large");
}
//Read into the smaller buffer since InputStream
//can't write to a Byte[]
final int result = in.read(sbuf,0,sbuf.length);
//Copy the data into the larger buffer
for(int i = 0; i < result;++i)
{
buffer[offset+i] = sbuf[i];
}
//Add to the offset
offset+=result;
//Wrap the array as a list
List<Byte> list = java.util.Arrays.asList(buffer);
//Find newline character
final int index = list.indexOf((byte) '\n');
//If the newline is present, extract the number of threads
if (index != -1)
{
//Find the number of threads
//Thread number is in the first line, like "[X]"
final int start = list.indexOf((byte)'[');
final int end = list.indexOf((byte)']');
//Sanity check the bounds
if(! (end > start))
{
throw new IOException("Couldn't locate number of threads");
}
//Create a string from the Byte[] array subset
StringBuilder sb = new StringBuilder();
for(int i = start+1; i != end; ++i)
{
final char c = (char) buffer[i].byteValue();
sb.append(c);
}
String numThreadsStr = sb.toString();
//Try and parse the string into a number
try
{
this.numberOfThreads = Integer.valueOf(numThreadsStr);
}catch(NumberFormatException e)
{
throw new IOException("Number of threads is NaN",e);
}
//No more values to extract
foundInfoString = true;
}
//If the InputStream got EOF and the info string has not been found
//Then an error has occurred.
if(result == -1 && ! foundInfoString )
{
throw new IOException("Never got info string");
}
}
}finally
{
//Close the connection
conn.disconnect();
}
}
public MotionHttpApi(String host,int port)
{
this.host = host;
this.port = port;
}
}
The code works just fine when you call getNumberOfCameras(). But I think I must not be really understanding something in terms of Java, because the retrieveSplash method is far too complex. I could do the same thing in just 10 or so lines of C, or 1 line of Python. Surely there must be a saner way to manipulate bytes in Java?
I think there are some style issues, like I probably should not be throwing IOException whenever the integer fails to parse. But that's a separate issue.
Read the first line as Gautam Tandon suggested and then use a regex.
You can then check whether the regex matches and easily extract the number.
Regexes can be created at http://txt2re.com. I've already done that for you.
The page even creates Java, Python, C, etc. files for you to work with.
// URL that generated this code:
// http://txt2re.com/index-java.php3?s=Motion%203.2.12%20Running%20[4]%20Threads&-7&-19&-5&-20&-1&2&-22&-21&-62&-63&15
import java.util.regex.*;
class Main
{
public static void main(String[] args)
{
String txt="Motion 3.2.12 Running [4] Threads";
String re1="(Motion)"; // Word 1
String re2="( )"; // White Space 1
String re3="(3\\.2\\.12)"; // MMDDYY 1
String re4="( )"; // White Space 2
String re5="(Running)"; // Word 2
String re6="( )"; // White Space 3
String re7="(\\[)"; // Any Single Character 1
String re8="(\\d+)"; // Integer Number 1
String re9="(\\])"; // Any Single Character 2
String re10="( )"; // White Space 4
String re11="((?:[a-z][a-z]+))"; // Word 3
Pattern p = Pattern.compile(re1+re2+re3+re4+re5+re6+re7+re8+re9+re10+re11,Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
Matcher m = p.matcher(txt);
if (m.find())
{
String word1=m.group(1);
String ws1=m.group(2);
String mmddyy1=m.group(3);
String ws2=m.group(4);
String word2=m.group(5);
String ws3=m.group(6);
String c1=m.group(7);
String int1=m.group(8);
String c2=m.group(9);
String ws4=m.group(10);
String word3=m.group(11);
System.out.print("("+word1.toString()+")"+"("+ws1.toString()+")"+"("+mmddyy1.toString()+")"+"("+ws2.toString()+")"+"("+word2.toString()+")"+"("+ws3.toString()+")"+"("+c1.toString()+")"+"("+int1.toString()+")"+"("+c2.toString()+")"+"("+ws4.toString()+")"+"("+word3.toString()+")"+"\n");
}
}
}
//-----
// This code is for use with Sun's Java VM - see http://java.sun.com/ for downloads.
//
// Paste the code into a new java application or a file called 'Main.java'
//
// Compile and run in Unix using:
// # javac Main.java
// # java Main
//
String int1=m.group(8); gives you the desired integer. Of course you can simplify the above code; it's way too verbose right now.
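As a sketch of that simplification, a hand-written pattern is enough here (firstLine is assumed to hold the first response line, e.g. from BufferedReader.readLine()):
import java.util.regex.Matcher;
import java.util.regex.Pattern;
// matches the bracketed thread count in e.g. "Motion 3.2.12 Running [4] Threads"
Pattern p = Pattern.compile("\\[(\\d+)\\]");
Matcher m = p.matcher(firstLine);
if (m.find()) {
    int numberOfThreads = Integer.parseInt(m.group(1));
}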
You can simplify the retrieveSplash method considerably by using BufferedReader. Here's a simpler version of your function:
void retrieveSplash_simpler() throws IOException {
URL url = new URL(getBaseUrl());
HttpURLConnection conn = (HttpURLConnection)url.openConnection();
// open the connection
conn.connect();
// create a buffered reader to read the input stream line by line
BufferedReader reader = new BufferedReader(new InputStreamReader(conn.getInputStream()));
// find number of threads
String firstLine = reader.readLine();
int x = firstLine.indexOf("[");
int y = firstLine.indexOf("]");
if (x > 0 && y > 0 && x < y) {
try {
numberOfThreads = Integer.parseInt(firstLine.substring(x+1, y));
} catch (NumberFormatException nfe) {
// disconnect and throw exception
conn.disconnect();
throw new IOException("Couldn't locate number of threads");
}
} else {
// disconnect and throw exception
conn.disconnect();
throw new IOException("Couldn't locate number of threads");
}
// disconnect
conn.disconnect();
}
I'd further clean up the above method by using try/catch/finally blocks at the appropriate places so that I don't have to duplicate that "conn.disconnect()". But I didn't do that here to keep it simple (try/catch/finally do become tricky sometimes...).
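For completeness, one possible shape of that cleanup, with the disconnect in a finally block (a sketch under the same assumptions as the method above, not a drop-in):
void retrieveSplash_tryFinally() throws IOException {
    URL url = new URL(getBaseUrl());
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    try {
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(conn.getInputStream()));
        String firstLine = reader.readLine();
        if (firstLine == null) {
            throw new IOException("Empty response");
        }
        int x = firstLine.indexOf('[');
        int y = firstLine.indexOf(']');
        if (x < 0 || y < x) {
            throw new IOException("Couldn't locate number of threads");
        }
        try {
            numberOfThreads = Integer.parseInt(firstLine.substring(x + 1, y));
        } catch (NumberFormatException nfe) {
            throw new IOException("Number of threads is not a number", nfe);
        }
    } finally {
        conn.disconnect(); // runs on success and on every exception path
    }
}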

Filter (search and replace) array of bytes in an InputStream

I have an InputStream which takes an HTML file as its input parameter. I have to get the bytes from the input stream.
I have a string: "XYZ". I'd like to convert this string to byte format and check whether there is a match for it in the byte sequence I obtained from the InputStream. If there is, I have to replace the match with the byte sequence for some other string.
Is there anyone who could help me with this? I have used regex to find and replace; however, I am unaware of how to find and replace in a byte stream.
Previously, I used jsoup to parse the html and replace the string, but due to some UTF encoding problems the file appears corrupted when I do that.
TL;DR: My question is:
Is there a way to find and replace a string in byte format in a raw InputStream in Java?
Not sure you have chosen the best approach to solve your problem.
That said, I don't like to (and have as policy not to) answer questions with "don't" so here goes...
Have a look at FilterInputStream.
From the documentation:
A FilterInputStream contains some other input stream, which it uses as its basic source of data, possibly transforming the data along the way or providing additional functionality.
It was a fun exercise to write it up. Here's a complete example for you:
import java.io.*;
import java.util.*;
class ReplacingInputStream extends FilterInputStream {
LinkedList<Integer> inQueue = new LinkedList<Integer>();
LinkedList<Integer> outQueue = new LinkedList<Integer>();
final byte[] search, replacement;
protected ReplacingInputStream(InputStream in,
byte[] search,
byte[] replacement) {
super(in);
this.search = search;
this.replacement = replacement;
}
private boolean isMatchFound() {
Iterator<Integer> inIter = inQueue.iterator();
for (int i = 0; i < search.length; i++)
if (!inIter.hasNext() || search[i] != inIter.next())
return false;
return true;
}
private void readAhead() throws IOException {
// Work up some look-ahead.
while (inQueue.size() < search.length) {
int next = super.read();
inQueue.offer(next);
if (next == -1)
break;
}
}
@Override
public int read() throws IOException {
// Next byte already determined.
if (outQueue.isEmpty()) {
readAhead();
if (isMatchFound()) {
for (int i = 0; i < search.length; i++)
inQueue.remove();
for (byte b : replacement)
outQueue.offer((int) b);
} else
outQueue.add(inQueue.remove());
}
return outQueue.remove();
}
// TODO: Override the other read methods.
}
Example Usage
class Test {
public static void main(String[] args) throws Exception {
byte[] bytes = "hello xyz world.".getBytes("UTF-8");
ByteArrayInputStream bis = new ByteArrayInputStream(bytes);
byte[] search = "xyz".getBytes("UTF-8");
byte[] replacement = "abc".getBytes("UTF-8");
InputStream ris = new ReplacingInputStream(bis, search, replacement);
ByteArrayOutputStream bos = new ByteArrayOutputStream();
int b;
while (-1 != (b = ris.read()))
bos.write(b);
System.out.println(new String(bos.toByteArray()));
}
}
Given the bytes for the string "hello xyz world." it prints:
hello abc world.
The following approach will work, but I don't know how big the impact on performance is.
Wrap the InputStream with a InputStreamReader,
wrap the InputStreamReader with a FilterReader that replaces the strings, then
wrap the FilterReader with a ReaderInputStream.
It is crucial to choose the appropriate encoding, otherwise the content of the stream will become corrupted.
If you want to use regular expressions to replace the strings, then you can use Streamflyer, a tool of mine, which is a convenient alternative to FilterReader. You will find an example for byte streams on the webpage of Streamflyer. Hope this helps.
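A rough sketch of that three-step wrapping, assuming Apache Commons IO's ReaderInputStream as the Reader-to-InputStream bridge and a hypothetical MyReplacingReader as the FilterReader subclass that does the replacing:
import java.io.*;
import java.nio.charset.StandardCharsets;
import org.apache.commons.io.input.ReaderInputStream;
// 1. bytes -> chars, 2. replace in the char stream, 3. chars -> bytes again
Reader chars = new InputStreamReader(rawInputStream, StandardCharsets.UTF_8);
Reader replaced = new MyReplacingReader(chars, "XYZ", "ABC"); // hypothetical FilterReader subclass
InputStream result = new ReaderInputStream(replaced, StandardCharsets.UTF_8);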
I needed something like this as well and decided to roll my own solution instead of using the example above by @aioobe. Have a look at the code. You can pull the library from Maven Central, or just copy the source code.
This is how you use it. In this case, I'm using a nested instance to replace two patterns to fix DOS and Mac line endings.
new ReplacingInputStream(new ReplacingInputStream(is, "\n\r", "\n"), "\r", "\n");
Here's the full source code:
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import org.apache.commons.lang3.Validate;
/**
 * Simple FilterInputStream that can replace occurrences of bytes with something else.
 */
public class ReplacingInputStream extends FilterInputStream {
// while matching, this is where the bytes go.
int[] buf=null;
int matchedIndex=0;
int unbufferIndex=0;
int replacedIndex=0;
private final byte[] pattern;
private final byte[] replacement;
private State state=State.NOT_MATCHED;
// simple state machine for keeping track of what we are doing
private enum State {
NOT_MATCHED,
MATCHING,
REPLACING,
UNBUFFER
}
/**
* @param is input
* @return nested replacing stream that replaces \n\r (DOS) and \r (MAC) line endings with UNIX ones "\n".
*/
public static InputStream newLineNormalizingInputStream(InputStream is) {
return new ReplacingInputStream(new ReplacingInputStream(is, "\n\r", "\n"), "\r", "\n");
}
/**
* Replace occurrences of pattern in the input. Note: input is assumed to be UTF-8 encoded. If that is not the case, use the byte[]-based pattern and replacement.
* @param in input
* @param pattern pattern to replace
* @param replacement the replacement or null
*/
public ReplacingInputStream(InputStream in, String pattern, String replacement) {
this(in,pattern.getBytes(StandardCharsets.UTF_8), replacement==null ? null : replacement.getBytes(StandardCharsets.UTF_8));
}
/**
* Replace occurrences of pattern in the input.
* @param in input
* @param pattern pattern to replace
* @param replacement the replacement or null
*/
public ReplacingInputStream(InputStream in, byte[] pattern, byte[] replacement) {
super(in);
Validate.notNull(pattern);
Validate.isTrue(pattern.length>0, "pattern length should be > 0", pattern.length);
this.pattern = pattern;
this.replacement = replacement;
// we will never match more than the pattern length
buf = new int[pattern.length];
}
@Override
public int read(byte[] b, int off, int len) throws IOException {
// copy of parent logic; we need to call our own read() instead of super.read(), which delegates instead of calling our read
if (b == null) {
throw new NullPointerException();
} else if (off < 0 || len < 0 || len > b.length - off) {
throw new IndexOutOfBoundsException();
} else if (len == 0) {
return 0;
}
int c = read();
if (c == -1) {
return -1;
}
b[off] = (byte)c;
int i = 1;
try {
for (; i < len ; i++) {
c = read();
if (c == -1) {
break;
}
b[off + i] = (byte)c;
}
} catch (IOException ee) {
}
return i;
}
@Override
public int read(byte[] b) throws IOException {
// call our own read
return read(b, 0, b.length);
}
@Override
public int read() throws IOException {
// use a simple state machine to figure out what we are doing
int next;
switch (state) {
case NOT_MATCHED:
// we are not currently matching, replacing, or unbuffering
next=super.read();
if(pattern[0] == next) {
// clear whatever was there
buf=new int[pattern.length]; // clear whatever was there
// make sure we start at 0
matchedIndex=0;
buf[matchedIndex++]=next;
if(pattern.length == 1) {
// edgecase when the pattern length is 1 we go straight to replacing
state=State.REPLACING;
// reset replace counter
replacedIndex=0;
} else {
// pattern longer than one byte: keep matching
state=State.MATCHING;
}
// recurse to continue matching
return read();
} else {
return next;
}
case MATCHING:
// the previous bytes matched part of the pattern
next=super.read();
if(pattern[matchedIndex]==next) {
buf[matchedIndex++]=next;
if(matchedIndex==pattern.length) {
// we've found a full match!
if(replacement==null || replacement.length==0) {
// the replacement is empty, go straight to NOT_MATCHED
state=State.NOT_MATCHED;
matchedIndex=0;
} else {
// start replacing
state=State.REPLACING;
replacedIndex=0;
}
}
} else {
// mismatch -> unbuffer
buf[matchedIndex++]=next;
state=State.UNBUFFER;
unbufferIndex=0;
}
return read();
case REPLACING:
// we've fully matched the pattern and are returning bytes from the replacement
next=replacement[replacedIndex++];
if(replacedIndex==replacement.length) {
state=State.NOT_MATCHED;
replacedIndex=0;
}
return next;
case UNBUFFER:
// we partially matched the pattern before encountering a non matching byte
// we need to serve up the buffered bytes before we go back to NOT_MATCHED
next=buf[unbufferIndex++];
if(unbufferIndex==matchedIndex) {
state=State.NOT_MATCHED;
matchedIndex=0;
}
return next;
default:
throw new IllegalStateException("no such state " + state);
}
}
@Override
public String toString() {
return state.name() + " " + matchedIndex + " " + replacedIndex + " " + unbufferIndex;
}
}
There isn't any built-in functionality for search-and-replace on byte streams (InputStream).
And, a method for completing this task efficiently and correctly is not immediately obvious. I have implemented the Boyer-Moore algorithm for streams, and it works well, but it took some time. Without an algorithm like this, you have to resort to a brute-force approach where you look for the pattern starting at every position in the stream, which can be slow.
Even if you decode the HTML as text, using a regular expression to match patterns might be a bad idea, since HTML is not a "regular" language.
So, even though you've run into some difficulties, I suggest you pursue your original approach of parsing the HTML as a document. While you are having trouble with the character encoding, it will probably be easier, in the long run, to fix the right solution than it will be to jury-rig the wrong solution.
I needed a solution to this, but found the answers here incurred too much memory and/or CPU overhead. The below solution significantly outperforms the others here in these terms based on simple benchmarking.
This solution is especially memory-efficient, incurring no measurable cost even with >GB streams.
That said, this is not a zero-CPU-cost solution. The CPU/processing-time overhead is probably reasonable for all but the most demanding/resource-sensitive scenarios, but the overhead is real and should be considered when evaluating the worthiness of employing this solution in a given context.
In my case, our max real-world file size that we are processing is about 6MB, where we see added latency of about 170ms with 44 URL replacements. This is for a Zuul-based reverse-proxy running on AWS ECS with a single CPU share (1024). For most of the files (under 100KB), the added latency is sub-millisecond. Under high-concurrency (and thus CPU contention), the added latency could increase, however we are currently able to process hundreds of the files concurrently on a single node with no humanly-noticeable latency impact.
The solution we are using:
import java.io.IOException;
import java.io.InputStream;
public class TokenReplacingStream extends InputStream {
private final InputStream source;
private final byte[] oldBytes;
private final byte[] newBytes;
private int tokenMatchIndex = 0;
private int bytesIndex = 0;
private boolean unwinding;
private int mismatch;
private int numberOfTokensReplaced = 0;
public TokenReplacingStream(InputStream source, byte[] oldBytes, byte[] newBytes) {
assert oldBytes.length > 0;
this.source = source;
this.oldBytes = oldBytes;
this.newBytes = newBytes;
}
@Override
public int read() throws IOException {
if (unwinding) {
if (bytesIndex < tokenMatchIndex) {
return oldBytes[bytesIndex++];
} else {
bytesIndex = 0;
tokenMatchIndex = 0;
unwinding = false;
return mismatch;
}
} else if (tokenMatchIndex == oldBytes.length) {
if (bytesIndex == newBytes.length) {
bytesIndex = 0;
tokenMatchIndex = 0;
numberOfTokensReplaced++;
} else {
return newBytes[bytesIndex++];
}
}
int b = source.read();
if (b == oldBytes[tokenMatchIndex]) {
tokenMatchIndex++;
} else if (tokenMatchIndex > 0) {
mismatch = b;
unwinding = true;
} else {
return b;
}
return read();
}
@Override
public void close() throws IOException {
source.close();
}
public int getNumberOfTokensReplaced() {
return numberOfTokensReplaced;
}
}
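A small usage sketch of the class above (stream contents and token values invented for illustration):
byte[] data = "pay with token, token is king".getBytes("UTF-8");
TokenReplacingStream in = new TokenReplacingStream(
        new java.io.ByteArrayInputStream(data),
        "token".getBytes("UTF-8"),
        "cash".getBytes("UTF-8"));
java.io.ByteArrayOutputStream out = new java.io.ByteArrayOutputStream();
int b;
while ((b = in.read()) != -1) {
    out.write(b);
}
System.out.println(out);                            // pay with cash, cash is king
System.out.println(in.getNumberOfTokensReplaced()); // 2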
I came up with this simple piece of code when I needed to serve a template file in a Servlet, replacing a certain keyword with a value. It should be pretty fast and low on memory. Then, using piped streams, I guess you can use it for all sorts of things.
/JC
public static void replaceStream(InputStream in, OutputStream out, String search, String replace) throws IOException
{
Writer writer = new OutputStreamWriter(out);
replaceStream(new InputStreamReader(in), writer, search, replace);
writer.flush(); // OutputStreamWriter buffers internally; without this the bytes may never reach out
}
public static void replaceStream(Reader in, Writer out, String search, String replace) throws IOException
{
char[] searchChars = search.toCharArray();
int[] buffer = new int[searchChars.length];
int x, r, si = 0, sm = searchChars.length;
// read() returns -1 at end of stream; testing > 0 would also drop NUL characters
while ((r = in.read()) != -1) {
if (searchChars[si] == r) {
// The char matches our pattern
buffer[si++] = r;
if (si == sm) {
// We have reached a matching string
out.write(replace);
si = 0;
}
} else if (si > 0) {
// No match and buffered char(s): empty the buffer and pass the char forward.
// Caveat: the current char is not re-examined as a possible start of a new match,
// so self-overlapping patterns (e.g. "aab" in "aaab") can be missed.
for (x = 0; x < si; x++) {
out.write(buffer[x]);
}
si = 0;
out.write(r);
} else {
// No match and nothing buffered, just pass the char forward
out.write(r);
}
}
// Empty buffer
for (x = 0; x < si; x++) {
out.write(buffer[x]);
}
}
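A quick usage sketch of the stream variant (input and search values invented):
byte[] html = "<p>Hello XYZ</p>".getBytes("UTF-8");
ByteArrayOutputStream out = new ByteArrayOutputStream();
replaceStream(new ByteArrayInputStream(html), out, "XYZ", "ABC");
System.out.println(out); // <p>Hello ABC</p>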
