Algorithm or API to create Intrusion Detection System inputs - Java

Hello, I want to develop an intrusion detection system using a neural network.
I know there are 41 inputs (I know this from the dataset I used to train the neural network).
I need help with how to capture these 41 inputs from a live connection. Please can somebody help me, or at least guide me in the right direction?
Thank you in advance for your answers.

What you are trying to do is feature extraction (or reduction) on your input data.
As input data I could imagine logs from a firewall, captured packets, and so on.
As features you could have things like failed login attempts per time unit, number of connections, and so on.
But if you want your system to work with the training you feed it, the features in the data you process need to have the same (or at least a very similar) distribution as the data you trained it on.
So, to keep it short and simple: if you want to use the training data you cite, you need to find out exactly which data was used to build that training set, and exactly how it was preprocessed.
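As a very rough illustration of what "feature extraction" can mean here (the log format and the two features below are made up for the example; your real features must match whatever the training data used), you could maintain counters over a sliding time window:

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Toy feature extractor: counts failed logins and connections inside a sliding
// time window. The events are hypothetical; in practice they would be parsed
// from firewall logs or captured packets, and fed in timestamp order.
public class ToyFeatureExtractor {
    private static final long WINDOW_MS = 60_000; // 1-minute window (arbitrary choice)

    private final Deque<Long> failedLogins = new ArrayDeque<>();
    private final Deque<Long> connections = new ArrayDeque<>();

    public void onFailedLogin(long timestampMs) { failedLogins.addLast(timestampMs); }
    public void onConnection(long timestampMs)  { connections.addLast(timestampMs); }

    /** Current feature vector: {failed logins in window, connections in window}. */
    public double[] features(long nowMs) {
        evict(failedLogins, nowMs);
        evict(connections, nowMs);
        return new double[] { failedLogins.size(), connections.size() };
    }

    private void evict(Deque<Long> events, long nowMs) {
        while (!events.isEmpty() && nowMs - events.peekFirst() > WINDOW_MS) {
            events.removeFirst();
        }
    }

    public static void main(String[] args) {
        ToyFeatureExtractor fx = new ToyFeatureExtractor();
        long now = System.currentTimeMillis();
        fx.onFailedLogin(now - 70_000); // outside the window, will be evicted
        fx.onConnection(now - 10_000);
        fx.onFailedLogin(now - 5_000);
        System.out.println(Arrays.toString(fx.features(now))); // [1.0, 1.0]
    }
}
```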

I have answered your other question (http://stackoverflow.com/questions/7587657/building-intrusion-detection-system-but-from-where-to-begin) more thoroughly, but I will repeat the key point here.
Read this article to learn how the KDD99 dataset was constructed:
Lee, W. & Stolfo, S. J. (2000). A framework for constructing features and models for intrusion detection systems. ACM Trans. Inf. Syst. Secur., 3, 227-261.
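For completeness, a KDD99 record is a comma-separated line of 41 features plus a label, where a few columns (protocol_type, service, flag) are symbolic rather than numeric. A rough parsing sketch, assuming that layout (verify the column positions against the kddcup.names file that ships with the dataset), could look like this:

```java
import java.util.HashMap;
import java.util.Map;

// Rough sketch: turn one KDD99 line into a 41-element double[] for a neural network.
// Symbolic columns are mapped to integer codes as they are first encountered;
// a real pipeline should follow the preprocessing described in the Lee & Stolfo paper.
public class Kdd99Parser {
    // Columns 1, 2, 3 (0-based) are protocol_type, service and flag in KDD99.
    private static final int[] SYMBOLIC = {1, 2, 3};

    private final Map<Integer, Map<String, Integer>> codebooks = new HashMap<>();

    public double[] parse(String line) {
        String[] fields = line.trim().split(",");
        double[] features = new double[41];
        for (int i = 0; i < 41; i++) {
            features[i] = isSymbolic(i) ? code(i, fields[i]) : Double.parseDouble(fields[i]);
        }
        return features; // fields[41] is the label ("normal.", "smurf.", ...)
    }

    private boolean isSymbolic(int i) {
        for (int s : SYMBOLIC) if (s == i) return true;
        return false;
    }

    private int code(int column, String value) {
        Map<String, Integer> book = codebooks.computeIfAbsent(column, c -> new HashMap<>());
        Integer id = book.get(value);
        if (id == null) {
            id = book.size();
            book.put(value, id);
        }
        return id;
    }
}
```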

Related

Comparing two skeleton trackings in Processing 3

I am currently doing my dissertation, which involves two people: a professional athlete and an amateur. First, using image-processing skeletonization, I would like to record the professional athlete while performing the squat exercise; then, when the amateur performs the exercise, I want to be able to compare the professional's skeleton with the amateur's to see if the movement is properly formed.
I am open to any suggestions and opinions, and would gladly appreciate some help.
Here lies your question:
properly formed.
What does properly formed actually mean? How can this be quantified?
Bear in mind I'm not athletic or experienced in this field.
If I were given the task I would, counter-intuitively, go in the opposite direction: moving away from Processing 3/Kinect/the computer. I would instead:
find a professional athlete
find a trainer skilled in functional mobility training
find an amateur (probably easiest)
Item 2 will be trickier. For example, FMS seems to put a lot of emphasis on correct exercising and mobility (to enhance performance and reduce the risk of injuries). I'm not sure if that's the only approach or the best one. You might want to check opinions on Physical Fitness, consult with people studying/teaching exercise science, etc. Do check credentials, as it feels like a field where everyone has an opinion/preference.
The idea is to understand how a professional, educated trainer assesses correct movement. Take note of how that works in the real world and try to systemise it.
What are the cues for a correct execution?
Is it the key poses?
The motion in between?
How the skeletal and muscular systems work together, the weights/forces applied, etc.?
Having a better understanding of how this works in the real world should lead you to things you can start quantifying/comparing numerically on a computer.
Try to make a checklist/score system manually using a pen and paper based on the information you gather. If this works you already have a system you can start programming.
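For example, a first pen-and-paper checklist translated into code could be as simple as this (the criteria are entirely hypothetical; the real list should come from the trainer):

```java
// Hypothetical squat checklist: each criterion is a yes/no observation a trainer
// (or later, a program) can make; the score is just the fraction of criteria met.
public class SquatChecklist {
    public boolean kneesTrackOverToes;
    public boolean backStaysNeutral;
    public boolean hipsReachParallelDepth;
    public boolean heelsStayOnGround;

    public double score() {
        boolean[] checks = { kneesTrackOverToes, backStaysNeutral,
                             hipsReachParallelDepth, heelsStayOnGround };
        int met = 0;
        for (boolean c : checks) if (c) met++;
        return (double) met / checks.length; // 1.0 = all criteria met
    }

    public static void main(String[] args) {
        SquatChecklist attempt = new SquatChecklist();
        attempt.kneesTrackOverToes = true;
        attempt.backStaysNeutral = true;
        attempt.hipsReachParallelDepth = false;
        attempt.heelsStayOnGround = true;
        System.out.println("Form score: " + attempt.score()); // 0.75
    }
}
```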
The next step is acquiring the data.
This is probably where the Kinect comes in, but bear in mind:
the second version of the Kinect is more precise than the first
there is a Kinect2 SDK wrapper for Processing 3: use that if you can (Windows only). There is a way to get libfreenect2 working with OpenNI on OS X/Linux, and therefore with SimpleOpenNI in Processing, but it's not straightforward and you won't get the same precision from the skeleton-tracking algorithm
use data that is as precise as possible:
you can get the accuracy of a tracked skeleton joint
use an environment that doesn't contain a complex background (this makes it easy to segment users and detect/track skeletons, with little chance of mistaking something else for a user). Prefer artificial, non-incandescent light (less of a problem with the Kinect v2, but you still want as little IR interference as possible).
comparing orientation matrices or joints on single poses might not be enough to get the full picture: how do you capture/quantify motion, taking into account the things the Kinect can't easily see (muscles flexing, forces applied, a moving centre of gravity, etc.)? See the joint-angle sketch below.
try to use a grid system that will make it simple to pair the digital values with real-world measurements. Check out how people used to study motion in the past, for example Étienne-Jules Marey or Eadweard Muybridge
Motion capture by Étienne-Jules Marey
Motion study by Eadweard Muybridge (notice the grid)
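Tying the points above together, one very small quantity you could compare numerically is a single joint angle (e.g. the knee angle formed by hip, knee and ankle) between the professional's and the amateur's skeleton at matching poses. A minimal sketch, assuming you already have 3D joint positions from whichever Kinect wrapper you end up using (the coordinates below are made up):

```java
// Compare a single joint angle between two skeletons (e.g. pro vs amateur).
// The joint positions are assumed to come from your Kinect wrapper of choice,
// already expressed in the same units and coordinate space.
public class JointAngleComparison {

    /** Angle (degrees) at joint B formed by the segments B->A and B->C. */
    static double angleAt(double[] a, double[] b, double[] c) {
        double[] ba = {a[0] - b[0], a[1] - b[1], a[2] - b[2]};
        double[] bc = {c[0] - b[0], c[1] - b[1], c[2] - b[2]};
        double dot = ba[0] * bc[0] + ba[1] * bc[1] + ba[2] * bc[2];
        double lenBa = Math.sqrt(ba[0] * ba[0] + ba[1] * ba[1] + ba[2] * ba[2]);
        double lenBc = Math.sqrt(bc[0] * bc[0] + bc[1] * bc[1] + bc[2] * bc[2]);
        return Math.toDegrees(Math.acos(dot / (lenBa * lenBc)));
    }

    public static void main(String[] args) {
        // Hypothetical hip/knee/ankle positions (metres) at the bottom of a squat.
        double proKnee     = angleAt(new double[]{0.0, 0.9, 2.0},   // hip
                                     new double[]{0.0, 0.5, 2.1},   // knee
                                     new double[]{0.0, 0.1, 2.0});  // ankle
        double amateurKnee = angleAt(new double[]{0.0, 1.0, 2.0},
                                     new double[]{0.0, 0.55, 2.15},
                                     new double[]{0.0, 0.1, 2.0});
        System.out.printf("Pro knee: %.1f deg, amateur knee: %.1f deg, difference: %.1f deg%n",
                proKnee, amateurKnee, Math.abs(proKnee - amateurKnee));
    }
}
```

Per-pose angle differences are only a starting point; comparing whole motions would need time alignment between the two recordings on top of this.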
It's a pretty full-on project to get right, involving bits of anatomy/physics/kinematics/etc.
Start with the research first:
how did people study this in the past ?
what are the current developments ?
how does it work in the real world (without computers) ?
Take your constraints into account:
what resources (people/gear/etc.) can you use ?
how much time do you have available ?
Given the above, decide which topic/section of the project can realistically be tackled to get useful results.
Overall probably something along these lines:
background research
real world studies
a comparison system whose features can be measured both with the Kinect and by a person
record data (real-world data + mobility comparison evaluation, and Kinect data + mobility comparison)
compare data
write an evaluation of the findings (how effective is the system? what are its limitations? what could be improved (future work)? etc.)
In short, be aware of the Kinect's limitations: skeleton tracking is probability-based, so it's not 100% accurate. Use data that's as clean/correct as possible to begin with (make it easy to acquire good data if you can control the capture environment). Of the things a real trainer would track, what could you track with a Kinect? Do a comparison of the intersecting measurements.

Is there a way to configure Google Project Tango's depth-scanning technique, and to limit the range to less than 4 meters?

I'm working with Project Tango (Java library), and after reading about the techniques Tango uses to capture depth data (structured light, time of flight, and stereo), I was wondering if there is a way to choose between them through the code.
Another thing: in my app I should only capture depth data within 0.5-2 meters, instead of 0.5-4 meters. I searched the whole Java API but couldn't find anything helpful. So, is there a way to do this?
Best Regards,
OK. I tried searching the classes and methods in the Tango Java API to specify the z range, but I couldn't find anything. So I added an if statement where I read the point buffer, which actually solved my issue. I also read that I could change the range by using a PCL pass-through filter, but I didn't need it in my case.
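The "if statement when reading the point buffer" approach might look roughly like this sketch. It assumes the depth points arrive as a FloatBuffer packed either as x,y,z or as x,y,z,confidence (depending on whether you read the older XyzIj data or the newer point-cloud data), so adjust FLOATS_PER_POINT to match what your callback actually delivers:

```java
import java.nio.FloatBuffer;
import java.util.ArrayList;
import java.util.List;

// Keep only depth points whose z (depth) lies within a custom range,
// e.g. 0.5-2.0 m instead of the full 0.5-4.0 m the sensor reports.
public class DepthRangeFilter {
    // 3 for x,y,z buffers, 4 for x,y,z,confidence buffers.
    private static final int FLOATS_PER_POINT = 4;
    private static final float MIN_Z = 0.5f;
    private static final float MAX_Z = 2.0f;

    public static List<float[]> filter(FloatBuffer points, int numPoints) {
        List<float[]> kept = new ArrayList<>();
        for (int i = 0; i < numPoints; i++) {
            int base = i * FLOATS_PER_POINT;
            float x = points.get(base);
            float y = points.get(base + 1);
            float z = points.get(base + 2);
            if (z >= MIN_Z && z <= MAX_Z) { // the "if statement" in question
                kept.add(new float[]{x, y, z});
            }
        }
        return kept;
    }
}
```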
Regarding the depth-capturing techniques, I'm still curious about them. I just want to understand which technique Tango uses by default. I'm really new to the field of sensors, 3D depth data and their techniques, and I would really like to understand this very well in Tango.

Converting an audio file into a text file using Java

I am developing a desktop application using Java. The application is for school kids learning English: the user can upload some English audio, which can be in any format, and it needs to be converted into a text file so that they can read the text.
I've found some APIs but I am not sure about them.
http://cmusphinx.sourceforge.net/wiki/
I've seen many questions on Stack Overflow regarding this, but none were helpful. If someone can help with this I will be very grateful.
Thank you
There are many technologies and services available to perform speech recognition. For an intro to some of the choices see https://stackoverflow.com/a/6351055/90236.
I'm not sure that the results will be acceptable for teaching children English as a second language, but it is worth trying.
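If you do go with CMUSphinx (linked in the question), a file-transcription sketch with Sphinx4 looks roughly like the following. The model paths are the ones bundled with the sphinx4-data artifact in the 5prealpha release and the recognizer expects 16 kHz, 16-bit mono WAV input, so double-check both against the CMUSphinx tutorial for the version you actually use:

```java
import java.io.FileInputStream;
import java.io.InputStream;

import edu.cmu.sphinx.api.Configuration;
import edu.cmu.sphinx.api.SpeechResult;
import edu.cmu.sphinx.api.StreamSpeechRecognizer;

// Rough Sphinx4 sketch: transcribe a WAV file to text and print the hypothesis.
public class TranscribeFile {
    public static void main(String[] args) throws Exception {
        Configuration configuration = new Configuration();
        configuration.setAcousticModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us");
        configuration.setDictionaryPath("resource:/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict");
        configuration.setLanguageModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us.lm.bin");

        StreamSpeechRecognizer recognizer = new StreamSpeechRecognizer(configuration);
        try (InputStream audio = new FileInputStream("lesson.wav")) { // 16 kHz mono WAV
            recognizer.startRecognition(audio);
            StringBuilder transcript = new StringBuilder();
            SpeechResult result;
            while ((result = recognizer.getResult()) != null) {
                transcript.append(result.getHypothesis()).append('\n');
            }
            recognizer.stopRecognition();
            System.out.println(transcript);
        }
    }
}
```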
What you seek is currently bleeding-edge technology. Tools like CMUSphinx can detect words from a dedicated, limited dictionary (so you can teach it to understand, say, 15 words and that's it - you can't teach it to understand English).
Basically, those tools try to find patterns in the sound waves that you feed them. They don't understand anything; they just run the same algorithm on everything and then try to find the closest match. This works well for small sets of words, but as the number of words increases, the difference between them shrinks and the job gets ever harder (without even getting started on words like whether and weather, or C and see).
What you might consider is "repeat after me" software. Here, you need to record all the words for the test as templates. Then you can record the words from the pupils and compute the difference. If the difference is not too large, the word is correct. But again: this is simple repetition to improve pronunciation - not English.
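One crude way to "compute the difference" for such a repeat-after-me comparison is dynamic time warping over a per-frame feature of the two recordings. The sketch below runs DTW over plain 1-D feature sequences (real systems would compare MFCC frames rather than raw amplitude envelopes, and you would tune the acceptance threshold yourself):

```java
import java.util.Arrays;

// Dynamic time warping distance between two 1-D feature sequences,
// e.g. smoothed amplitude envelopes of the template word and the pupil's attempt.
// Smaller distance = more similar.
public class DtwScore {

    static double dtw(double[] a, double[] b) {
        int n = a.length, m = b.length;
        double[][] d = new double[n + 1][m + 1];
        for (double[] row : d) Arrays.fill(row, Double.POSITIVE_INFINITY);
        d[0][0] = 0.0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double cost = Math.abs(a[i - 1] - b[j - 1]);
                d[i][j] = cost + Math.min(d[i - 1][j - 1], Math.min(d[i - 1][j], d[i][j - 1]));
            }
        }
        return d[n][m] / (n + m); // rough length normalisation
    }

    public static void main(String[] args) {
        double[] template = {0.1, 0.5, 0.9, 0.7, 0.2};      // teacher's recording
        double[] attempt  = {0.1, 0.4, 0.8, 0.8, 0.6, 0.2}; // pupil's recording (slower)
        System.out.println("DTW distance: " + dtw(template, attempt));
    }
}
```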
There is desktop software which can understand a lot of English (for example the products from Nuance, Dragon Naturally Speaking being one of the most prominent). They do offer server solutions but that software isn't free or cheap if you're on a tight budget.

Defining Trending Topics in a specific collection of tweets

I'm doing a Java application where I'll have to determine the Trending Topics from a specific collection of tweets obtained through the Twitter Search. While searching the web, I found out that the algorithm considers a topic trending when it has a large number of mentions in a specific time window, that is, right at that moment. So there must be a decay calculation so that the topics change often. However, I have another doubt:
How does Twitter determine which specific terms in a tweet should be the TT? For example, I've observed that most TTs are hashtags or proper nouns. Does this make any sense? Or do they analyse all words and determine the frequency?
I hope someone can help me! Thanks!
I don't think anyone knows except Twitter; however, it seems hashtags do play a big part, but there are other factors in play. I think mining the whole text would take more time than needed and would result in too many false positives.
Here is an interesting article from Mashable:
http://www.sparkmediasolutions.com/pdfs/SMS_Twitter_Trending.pdf
-Ralph Winters
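To make the decay calculation mentioned in the question concrete (this is a toy scheme, not Twitter's actual algorithm), you could keep an exponentially decayed count per hashtag, so that recent bursts of mentions outrank old, steady chatter:

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Toy trending score: every hashtag keeps a count that decays exponentially with
// a configurable half-life, so only recently "hot" tags rank highly.
public class TrendingTopics {
    private final double halfLifeMillis;
    private final Map<String, Double> score = new HashMap<>();
    private final Map<String, Long> lastSeen = new HashMap<>();

    public TrendingTopics(double halfLifeMillis) {
        this.halfLifeMillis = halfLifeMillis;
    }

    /** Call this for every hashtag occurrence found in the tweet collection. */
    public void recordMention(String hashtag, long timestampMillis) {
        score.put(hashtag, decayedScore(hashtag, timestampMillis) + 1.0);
        lastSeen.put(hashtag, timestampMillis);
    }

    /** The k hashtags with the highest decayed score "right now". */
    public List<String> top(int k, long nowMillis) {
        return score.keySet().stream()
                .sorted(Comparator.comparingDouble(
                        (String tag) -> decayedScore(tag, nowMillis)).reversed())
                .limit(k)
                .collect(Collectors.toList());
    }

    private double decayedScore(String hashtag, long nowMillis) {
        Double s = score.get(hashtag);
        if (s == null) return 0.0;
        double ageMillis = nowMillis - lastSeen.get(hashtag);
        return s * Math.pow(0.5, ageMillis / halfLifeMillis);
    }
}
```

Feed every hashtag from the collection through recordMention(...) and ask for top(10, System.currentTimeMillis()); the half-life controls how quickly topics fall off the list.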
You may be interested in meme tracking, which, as I recall, does interesting things with proper nouns, but basically identifies topics in a stream as they become more and less popular.
See also Eddi: interactive topic-based browsing of social status streams.

Relationship discovery using Neural networks

Suppose I am sampling a number of signals at a fixed rate (say once per second) and extracting some metrics from the signals, such as the ratio of one to another, the rate of change, the relative rate of change, etc.
I've heard that Neural Networks can be of use in discovering relationships. Is this true?
If so, what books/internet resources can I use to learn more about how to do this?
The processing is being done in Java, so a Java slant to all your answers would be most appreciated.
Thanks
Most likely you would need to determine some sort of a "window", like maybe the last 10 samples. You would normalize your signal into an array of 10 "doubles" normalized between -1 and 1. This would form the "input" into your neural network. So you would have 10 input neurons. Then you have to decide what you want the output to be. Maybe you have 100 different classifications that you may want to classify the signals into. If this is the case you would have 100 different output neurons that would each be trained to produce a higher output than the other output neurons when they recognize a specific signal.
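A minimal sketch of that windowing/normalisation step (pure Java, no NN library yet; the window size of 10 is just the example from above):

```java
import java.util.Arrays;

// Take the most recent WINDOW_SIZE samples of a signal and rescale them
// to the range [-1, 1], a typical input encoding for a neural network.
public class WindowNormalizer {
    static final int WINDOW_SIZE = 10;

    static double[] normalizeWindow(double[] signal) {
        double[] window = new double[WINDOW_SIZE];
        System.arraycopy(signal, signal.length - WINDOW_SIZE, window, 0, WINDOW_SIZE);

        double min = Double.POSITIVE_INFINITY, max = Double.NEGATIVE_INFINITY;
        for (double v : window) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double range = max - min;
        double[] normalized = new double[WINDOW_SIZE];
        for (int i = 0; i < WINDOW_SIZE; i++) {
            // Map [min, max] onto [-1, 1]; a flat window maps to all zeros.
            normalized[i] = range == 0 ? 0.0 : 2.0 * (window[i] - min) / range - 1.0;
        }
        return normalized;
    }

    public static void main(String[] args) {
        double[] samples = {3, 5, 2, 8, 7, 6, 9, 1, 4, 5, 10, 2}; // once-per-second readings
        System.out.println(Arrays.toString(normalizeWindow(samples)));
    }
}
```

Whether you normalise against the window's own min/max (as here) or against a fixed global range is a design choice; it just has to match what you do at training time.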
Between the input and output layers neural networks usually have one or more hidden layers. These just provide additional capability to the neural network.
For Java neural network programming, you might try the Encog project. There is also a .NET version of Encog.
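A bare-bones Encog 3 construction of the topology described above (10 inputs, one hidden layer, many output classes) looks roughly like this; the class and package names are from Encog 3.x, so verify them against the Encog release you actually pull in:

```java
import org.encog.engine.network.activation.ActivationTANH;
import org.encog.neural.networks.BasicNetwork;
import org.encog.neural.networks.layers.BasicLayer;

// Sketch of a 10-input, 100-output feed-forward network with one hidden layer,
// matching the windowed-input / one-output-neuron-per-class setup described above.
public class SignalNetwork {
    public static BasicNetwork build() {
        BasicNetwork network = new BasicNetwork();
        network.addLayer(new BasicLayer(null, true, 10));                   // input layer
        network.addLayer(new BasicLayer(new ActivationTANH(), true, 25));   // hidden layer (size is a guess)
        network.addLayer(new BasicLayer(new ActivationTANH(), false, 100)); // one output per classification
        network.getStructure().finalizeStructure();
        network.reset(); // randomise the initial weights
        return network;
    }
}
```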
That is true, you can discover relationships with a NN. The problem is that it's very hard to interpret the weights after calibration, so they are a bit of a black box (more so than other data-mining algorithms).
I would actually recommend exploring the neural net algorithm that comes with MS Analysis Services. It's a good way to learn about neural nets before you start programming anything (and since it's a server service, you can call it from Java).
