How to create a binary file?

I want to create my own binary file.
When I've opened a .3ds file, at the beginning I've found :
MM����
������==���>=
�����������������Standard_1������ ������ ����� ������0����� ������#����0����
�A����0������P����0������R����0������S����0�����������������0�������������
������?��3���0����d������LUMBERJA.PNG�Q������S�
These letters are unreadable. I would like to make my own file which no one can read.
I want to do this in Java on Android. Note that I don't want to use Cipher.

Those letters are not unreadable.
Let's say you don't speak French. At all.
That doesn't make French text 'unreadable'... it is merely unreadable to you. It is perfectly readable to just about everybody who lives in France, for starters.
Your example is no different. It may not be readable to you, but anybody who knows the file format can write software that can read this data.
Writing software that can read a format which no other software could ever be written to read is also impossible: your software runs on a device you do not control. The owner of that device can sandbox your application and figure out the format. Even if you try to encrypt the data, that won't work: the owner of the device can use tools to extract the key from your software.
If you want truly 'unreadable' data, either [A] have that data in a location where you can control who gets to read it (example: Host files on a server, not in the application), or [B] encrypt the data and have the user provide the key information for it (for example, a password).
If your aim is merely to produce a binary file, Java's OutputStream can do that. Write whatever bytes you like.
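As a rough sketch of that last point, the snippet below builds a small binary record with DataOutputStream and saves it through a FileOutputStream. The class name, the magic number, and the field layout (name, count, scale) are all made up for illustration; they are not part of the .3ds format or of any real format.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

final class BinaryFileSketch {
    // Hypothetical magic number so a reader can recognize the format.
    static final int MAGIC = 0x4D4D0001;

    // Encodes one record as raw bytes: magic, name, count, scale.
    static byte[] encode(String name, int count, float scale) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(MAGIC);    // 4 bytes, big-endian
            out.writeUTF(name);     // 2-byte length prefix + modified UTF-8
            out.writeInt(count);    // 4 bytes
            out.writeFloat(scale);  // 4 bytes, IEEE 754
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Writes the bytes to a file; any OutputStream works the same way.
    static void save(File target, byte[] bytes) throws IOException {
        try (OutputStream out = new FileOutputStream(target)) {
            out.write(bytes);
        }
    }
}
```

Opened in a text editor, such a file looks just as garbled as the .3ds sample, yet anyone who knows the layout can read it back with DataInputStream.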

Related

Can you translate data pulled with NFC to a different language?

Is there a way for the data I read from a NFC tag to then be able to translate that data into a choice of my own? For example, I scan a tag with direction data on it. How can I choose which language I wish to read that data in?

NFC tags don't have much memory. And the ones that do have more memory are much more expensive.
So data is sometimes represented with a short hexadecimal number. Take transit data, for instance: in San Francisco, an application called Farebot can tell you which BART station you entered and which BART station you exited (BART stands for Bay Area Rapid Transit, a medium-range subway train that connects multiple cities within the Bay Area).
But instead of storing "16th Street Mission" inside the transit card, it might only store a shorter hexadecimal number like B2.
Inside the Farebot application, there would already be an internal table that knows that B2 is equivalent to "16th Street Mission". Depending on the language setting of your app, there is no reason that you couldn't also have a French table that tells you that B2 corresponds to "16eme Rue Mission".
Does my explanation make sense? Unfortunately, with the limited memory of NFC tags, you can't store too much data within the tag/card itself, and so storing multiple translations inside the tag itself wouldn't make a lot of sense.
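A minimal sketch of that per-language table idea follows. The class name and the tables are hypothetical (this is not Farebot's actual code); the only entry comes from the B2 example above.

```java
import java.util.Locale;
import java.util.Map;

final class StationNames {
    // One lookup table per language, keyed by the short code read from the tag.
    private static final Map<Integer, String> ENGLISH = Map.of(0xB2, "16th Street Mission");
    private static final Map<Integer, String> FRENCH = Map.of(0xB2, "16eme Rue Mission");

    // Resolves a tag code to a display name in the requested language,
    // falling back to English for languages without a table.
    static String nameFor(int code, Locale locale) {
        Map<Integer, String> table = "fr".equals(locale.getLanguage()) ? FRENCH : ENGLISH;
        String fallback = "Unknown station 0x" + Integer.toHexString(code).toUpperCase(Locale.ROOT);
        return table.getOrDefault(code, fallback);
    }
}
```

The tag keeps storing only the short code; supporting another language means adding another table to the app, not rewriting tags.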

NFC tags can only hold limited amounts of data (think very small chunks of data), exactly as Stephan said above. To see an example of another app made using NFC tags, kindly go to this repo and see the MainActivity, more precisely what's inside the if-statement starting from line 114 (I would've pasted the code, but it's not formatted properly). What it does is get the data from the tag and essentially translate it into a string that, if I remember well, was only something like (LR1, LR2, LR3...) and so on. How you "interpret" this tiny bit of information should be entirely up to your app's backend code (e.g. we knew that LR1 was lecture room 1, so we had an idea of what to do).
What I'm trying to say is that you can encode hexadecimal strings of limited length on the NFC and then convert it back to whatever data structure you want to use, but the logic and interpretation take place on your app's side. The repo above shows an example on how to do this in java.

If I understand your question correctly:
NFC tags have very little memory, so you might have to build a dictionary in your application.
For example:
A1 is mapped to Right
B1 is mapped to Left, etc.
Then you store A1 or B1, for example, on your tags.
Then, once you have resolved A1 to Right, you need to call a translation API such as:
Google Translate
Bing (free)

Java - Extracting plaintext from web-page source code (getting massive quantities of lyrics from website)

O community, I'm in the process of writing the pseudocode for an application that extracts song lyrics from a remote host (web-server, not my own) by reading the page's source code.
This is assuming that:
Lyrics are being displayed in plaintext
Portion of source code containing lyrics is readable by Java front-end application
I'm not looking for source code to answer the question, but what is the technical term used for querying a remote webpage for plaintext content?
If I can determine the webpage naming scheme, I could set the pointer of the URL object to the appropriate webpage, right? The only limitations would be irregular capitalization, and would only be effective if the plaintext was found in EXACTLY the same place.
Do you have any suggestions?
I was thinking something like this for "Buck 65", singing "I look good"
URL url = new URL("http://www.elyrics.net/read/b/buck-65-lyrics/i-look-good-lyrics.html");
I could substitute "buck-65-lyrics" & "i-look-good-lyrics" to reflect user input?
Input re-directed to PostgreSQL table
Current objective:
User will request name of {song, artist, album}, Java front-end will query remote webpage
Full source code (containing plaintext) will be extracted with Java front-end
Lyrics will be extracted from source code (somehow)
If song is not currently indexed by PostgreSQL server, will be added to table.
Operations will be made on the plaintext to suit the objectives of the program
I'm only looking for direction. If I'm headed completely in the wrong direction, please let me know. This is only for the pseudocode. I'm not looking for answers, or hand-outs, I need assistance in determining what I need to do. Are there external libraries for extracting plaintext that you know of? What technical names are there for what I'm trying to accomplish?
Thanks, Tyler

This approach is referred to as screen or data scraping. Note that employing it often breaks the target service's terms of service. Usually, this is not a robust approach, which is why API-like services with guarantees about how they operate are preferable.
Your approach sounds like it will work for the most part, but a few things to keep in mind.
If the web service you're interacting with requires a very precise URL scheme, you should not feed your user-provided data directly into it, since it is likely to be muddied by missing words, abbreviations, or misspellings. You might be better off doing some sort of search, first, and using that search's best result.
Reading HTML data is more complicated than you think. Use an existing library like jsoup to assist you.
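To make the URL-scheme concern concrete, here is a small sketch that normalizes user input into the slug form seen in the question's example URL. The pattern (first letter of the artist, then "-lyrics" suffixes) is inferred from that single example and is only an assumption about the site; the class and method names are hypothetical.

```java
import java.util.Locale;

final class LyricsUrlSketch {
    // Lowercases the text and collapses anything that is not a letter
    // or digit into single hyphens, e.g. "Buck 65" -> "buck-65".
    static String slug(String text) {
        return text.trim()
                .toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9]+", "-")
                .replaceAll("^-+|-+$", "");
    }

    // Builds a page URL following the pattern assumed from the one example.
    static String pageUrl(String artist, String song) {
        String artistSlug = slug(artist);
        String songSlug = slug(song);
        if (artistSlug.isEmpty() || songSlug.isEmpty()) {
            throw new IllegalArgumentException("artist and song must contain letters or digits");
        }
        return "http://www.elyrics.net/read/" + artistSlug.charAt(0) + "/"
                + artistSlug + "-lyrics/" + songSlug + "-lyrics.html";
    }
}
```

Normalizing like this handles irregular capitalization and punctuation, but not misspellings or abbreviations, which is why searching first is the more robust route.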

The technical term for extracting content from a site is web scraping; you can google that. There are a lot of libraries available online; for Java there is jsoup. That said, it's easy to write your own regex.
The first thing I would do is use curl to get the content from the site, just for testing. This will give you a fair idea of what to do.

You will have to use an HTML parser. One of the most popular is jsoup.
Take care about the legal aspect of what you do ;)

Best file format regarding standard string and integer data?

For my project, I need to store info about protocols (the data sent (most likely integers) and in the order it's sent) and info that might be formatted something like this:
'ID' 'STRING' 'ADDITIONAL INTEGER DATA'
This info will be read by a Java program and stored in memory for processing, but I don't know what would be the most sensible format to store this data in?
EDIT: Here's some extra information:
1) I will be using this data in a game server.
2) Since it is a game server, speed is not the primary concern, since this data will primarily be read and utilized during startup, which shouldn't occur very often.
3) Memory consumption I would like to keep at a minimum, however.
4) The second data "example" will be used as a "dictionary" to look up names of specific in-game items, their stats and other integer data (and therefore might become very large, unlike the first data containing the protocol information, where each file will only note small protocol bites, like a login protocol for instance).
5) And yes, I would like the data to be "human-editable".
EDIT 2: Here's the choices that I've made:
JSON - For the protocol descriptions
CSV - For the dictionaries

There are many factors that could come to weigh--here are things that might help you figure this out:
1) Speed/memory usage: If the data needs to load very quickly or is very large, you'll probably want to consider rolling your own binary format.
2) Portability/compatibility: Balanced against #1 is the consideration that you might want to use the data elsewhere, with programs that won't read a custom binary format. In this case, your heavy hitters are probably going to be CSV, dBase, XML, and my personal favorite, JSON.
3) Simplicity: Delimited formats like CSV are easy to read, write, and edit by hand. Either use double-quoting with proper escaping or choose a delimiter that will not appear in the data.
If you could post more info about your situation and how important these factors are, we might be able to guide you further.
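For the delimited option in point 3, a minimal sketch of reading one line with double-quote escaping might look like this. The class name and the sample row are invented for illustration; a real project would more likely use an existing CSV library.

```java
import java.util.ArrayList;
import java.util.List;

final class CsvLineSketch {
    // Splits one CSV line on commas, honoring double-quoted fields
    // in which a literal quote is written as two quotes ("").
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"' && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"'); // escaped quote inside a quoted field
                    i++;
                } else if (c == '"') {
                    inQuotes = false;    // closing quote
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;         // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }
}
```

A row in the 'ID' 'STRING' 'ADDITIONAL INTEGER DATA' shape, such as `42,"Sword, ""rusty""",7,3`, then splits into the ID, the name, and the trailing integers, which can be passed to Integer.parseInt.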

How about XML, JSON or CSV ?

I've written a similar protocol specification using XML. (Available here.)
I think it is a good match, since it captures the hierarchical nature of specifying messages / network packages / fields etc. The order of fields is well defined, and so on.
I even wrote a code-generator that generated the message sending / receiving classes with methods for each message type in XSLT.
The only drawback as I see it is the verbosity. If you have a really simple structure of the specification, I would suggest you use some simple home-brewed format and write a parser for it using a parser-generator of your choice.

In addition to the formats suggested by others here (CSV, XML, JSON, etc.) you might consider storing the info in a Java properties file. (See the java.util.Properties class.) The code is already there for you, so all you have to figure out is the properties names (or name prefixes) you want to use.
The Properties class also provides for storing/loading properties in a simple XML format.
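A small sketch of the properties-file idea, assuming a hypothetical key scheme of `item.<id>.<field>` (the scheme and the class name are illustrative, not something the question prescribes):

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class ItemProperties {
    // Parses properties-file text, e.g. the contents of an items.properties file.
    static Properties parse(String text) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(text));
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new UncheckedIOException(e);
        }
        return properties;
    }

    // Reads an integer value, returning the fallback when the key is absent.
    static int intProperty(Properties properties, String key, int fallback) {
        String value = properties.getProperty(key);
        return value == null ? fallback : Integer.parseInt(value.trim());
    }
}
```

Given the text `item.42.name=Rusty Sword` and `item.42.damage=7` on separate lines, `getProperty("item.42.name")` returns the name and `intProperty(p, "item.42.damage", 0)` returns 7. Note that Properties keeps every value as a String in memory, which may matter for a very large dictionary.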

What technologies are there for formatted, structured data input and output?

I am working on a project here that ingests internal resumes from people at my company, strips out the skills and relevant content from them and stores it in a database. This was all done using docx4j and Grails. This required the resumes to first be submitted via a template that formatted everything just right so that the ingest tool knew what to look for to strip the data.
The second part is: what if we want to get a "reduced" resume out of the database? In other words, I want to search the uploaded content I now have and only print out new resumes for people who have, let's say, Java programming experience. So I can go into my database, find the people who originally had Java as a skill, and output a new set of resumes that are still in a nice templated format and only have the relevant info in them, instead of ALL the content.
I have been writing some software to do this in Java that will basically use a docx template, overwriting the items in customXML which are bound to the content controls in the doc, so the new data shows up and can be saved as a new docx with that custom data.
This seems really cumbersome to me, and has some limitations. For one, let's say my template has a place for 3 skills, and the particular person has 8 skills. There seems to be no good way to add those 5 additional skills to the docx other than painstakingly inserting the data with all of the formatting XML tags and such. This is a real pain, because if the template changes, I don't want to have to go back into my software and edit source code to change that additional data input XML tag to bold instead of italic.
I was doing some reading on using InfoPath to create a form that I could use to get the input, connecting to some SharePoint data source or something to store the stripped-out data. However, I can't seem to find out whether it is possible, using SharePoint, to get the data back out in a nicely formatted way. What would the general steps for this be? I couldn't find very much about this topic with some quick googling.
Thanks

You could set up the skills:
<skills>
  <skill>..</skill>
  <skill>..</skill>
</skills>
and use a "repeat" content control pointing to the container. This would handle any number of <skill> entries.

Detect if user changed file extension to upload?

Using a Java servlet, is it possible to detect the true file type of a file, regardless of its extension?
Scenario: You only allow plain text file uploads (.txt and .csv). The user takes the file mypicture.jpg, renames it to mypicture.txt, and proceeds to upload the file. Your servlet expects only text files and blows up trying to read the jpg.
Obviously this is user error, but is there a way to detect that it's not plain text and not proceed?

You can do this using the built-in URLConnection#guessContentTypeFromStream() API. However, it is pretty limited in the content types it can detect, so you may be better off using a 3rd party library like jMimeMagic.
See also:
Best way to determine file type in Java
When do browsers send application/octet-stream as Content-Type?
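A minimal sketch of the built-in approach, assuming the first few bytes of the upload are already in a byte array (the class and method names are hypothetical). The API needs a stream that supports mark/reset, which ByteArrayInputStream does.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URLConnection;

final class UploadSniffer {
    // Returns a guessed MIME type such as "image/jpeg" based on the leading
    // bytes, or null when the signature is not one the JDK recognizes.
    static String guessType(byte[] head) {
        try {
            return URLConnection.guessContentTypeFromStream(new ByteArrayInputStream(head));
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
    }
}
```

A null result does not prove the upload is plain text; it only means no known signature matched, so treating a recognized binary type as a reason to reject is the safer use of this check.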

No. There is no way to know in advance what type of file is being uploaded to you. You must perform all verification on the server before taking any action with the file.

I think you should consider why your program might blow up when given a JPEG (say) and make it defensive against this. For example, a JPEG file is likely to have apparently very long lines (any LF or CR LF will be somewhat randomly spread). But a so-called text file could equally have long lines that might kill your program.

What exactly do you mean by "plain text file"? Would a file consisting of Chinese text be a plain text file? If you assume English text in ASCII or ANSI encoding, you would have to read the full file as a binary file and check that, e.g., all byte values are between, say, 32 and 127, plus 13, 10, 9, maybe.
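That byte-range check could be sketched as follows, under the ASCII-only assumption stated above. The class name is hypothetical, and the upper bound is taken as 126 here because 127 is the DEL control character; adjust the range to whatever "plain text" means for your uploads.

```java
final class PlainTextCheck {
    // Returns true if every byte is printable ASCII (32..126) or one of
    // tab (9), line feed (10), carriage return (13).
    static boolean looksLikeAsciiText(byte[] data) {
        for (byte b : data) {
            int value = b & 0xFF; // treat the byte as unsigned
            boolean printable = value >= 32 && value <= 126;
            boolean whitespace = value == 9 || value == 10 || value == 13;
            if (!printable && !whitespace) {
                return false;
            }
        }
        return true;
    }
}
```

A renamed JPEG fails on its very first byte (0xFF), while a UTF-8 file with Chinese text would also be rejected, which is exactly the ambiguity the question about "plain text" points at.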
