At https://www.conzoom.eu/find-dig-selv/?default there is a form to enter an address and a postcode. When you search, you get a code (A1, D3, E2, etc.) defining which segment the address is in. I have a lot of customer addresses in an Excel sheet that I would like to look up. Is there an easier way than doing this manually?
Selenium might be what you are looking for. It is made to simulate a user on a website, so it can fill in the inputs, read the outputs, and wait for the site to be ready before entering the next address.
The tricky part is reading the Excel sheet, depending on its format. But you can always write a macro to make the input more "readable".
It is not impossible to do this in Javascript if the data can be moved to a format that the Javascript on the page can parse.
This is not the best solution, but it is an approach I have taken in the past when the web server can only serve static files, with no server-side processing.
From what your question suggests about the data set, this may not be a practical solution, due to its size and complexity.
If the data was something like POSTCODE, LOCATION_CODE and there was a one-to-one mapping, such as all postcodes starting with MK having a LOCATION_CODE of 83, then the data can be serialised into JSON or XML (JSON preferred).
Now when the user enters the postcode on the form, the Javascript retrieves the data from the server as a static file and parses it. It then compares the postcode the user entered against the data and returns the corresponding LOCATION_CODE.
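The lookup itself is small. Here is a rough sketch of the prefix matching, written in Java rather than the page's Javascript just to show the logic. The class name PostcodeLookup and its methods are made up for illustration, and the only data is the MK/83 example from above:

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

class PostcodeLookup {
    // Hypothetical mapping: postcode prefix -> LOCATION_CODE.
    private final Map<String, String> codesByPrefix = new LinkedHashMap<>();

    void put(String prefix, String locationCode) {
        codesByPrefix.put(prefix.toUpperCase(Locale.ROOT), locationCode);
    }

    // Returns the code of the longest prefix matching the postcode, if any.
    Optional<String> find(String postcode) {
        String normalized = postcode.trim().toUpperCase(Locale.ROOT).replace(" ", "");
        String best = null;
        for (String prefix : codesByPrefix.keySet()) {
            boolean longer = best == null || prefix.length() > best.length();
            if (normalized.startsWith(prefix) && longer) {
                best = prefix;
            }
        }
        return best == null ? Optional.empty() : Optional.of(codesByPrefix.get(best));
    }

    public static void main(String[] args) {
        PostcodeLookup lookup = new PostcodeLookup();
        lookup.put("MK", "83");
        System.out.println(lookup.find("mk9 3ab").orElse("unknown")); // prints 83
    }
}
```

Picking the longest matching prefix is my own assumption, so that a more specific entry would win over a shorter one if both were present.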
This only works for simple data that changes very infrequently. Alternatively, you need a server backend that connects either to your Excel spreadsheet (not good practice) or to a central database, with the logic to perform the search running on the server. This logic will need something like what Todd Motto suggested, Java, or any number of other technologies such as C#, PHP or Perl.
If I want to save a response to a query on a website I'm coding to a server, how would I do that?
Here's an example. If I had a site with a "Rate us" form, and a person answered with a "AWFUL SITE!" how would I be able to save & retrieve that information?
There are several ways to do what you want to do. I'll describe two of them.
You could append each rating to the end of a file on the web server. This would be done in a server-side scripting language usually, such as PHP or ASP.NET, and you would probably want to set the permissions on the file so that it's not readable to everyone.
You could set up a table in a database (MySQL or otherwise) and add a new row for each rating given. Again, this would be done in something like PHP or ASP.NET and you would want to make sure you take precautions against SQL injection attacks (not much of a problem if you use PHP Data Objects rather than the deprecated mysql_* functions).
I would personally go for the second option as it's easier to manage and change, and it's easier to set it up so that you can store IP, name, optional email and message in every row. And like I said, you can add a new field later down the line without running into the obvious problems.
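For comparison, the first option (appending to a file) can be sketched in a few lines. This is in Java rather than PHP or ASP.NET, and the RatingLog class and its method names are hypothetical, so treat it as an illustration of the idea, not a finished implementation:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.List;

class RatingLog {
    private final Path file;

    RatingLog(Path file) {
        this.file = file;
    }

    // Appends one rating as a single line; line breaks inside the message
    // are flattened to spaces so each rating stays on its own line.
    void save(String rating) throws IOException {
        String line = rating.replaceAll("[\\r\\n]+", " ") + System.lineSeparator();
        Files.write(file, line.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    // Reads all ratings back, one per line.
    List<String> loadAll() throws IOException {
        if (!Files.exists(file)) {
            return Collections.emptyList();
        }
        return Files.readAllLines(file, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        RatingLog log = new RatingLog(Files.createTempFile("ratings", ".txt"));
        log.save("AWFUL SITE!");
        System.out.println(log.loadAll()); // prints [AWFUL SITE!]
    }
}
```

It does nothing about file permissions or concurrent writers, which is part of why the database option is easier to live with.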
O community, I'm in the process of writing the pseudocode for an application that extracts song lyrics from a remote host (web-server, not my own) by reading the page's source code.
This is assuming that:
Lyrics are being displayed in plaintext
Portion of source code containing lyrics is readable by Java front-end application
I'm not looking for source code to answer the question, but what is the technical term used for querying a remote webpage for plaintext content?
If I can determine the webpage naming scheme, I could set the pointer of the URL object to the appropriate webpage, right? The only limitations would be irregular capitalization, and would only be effective if the plaintext was found in EXACTLY the same place.
Do you have any suggestions?
I was thinking something like this for "Buck 65", singing "I look good"
URL url = new URL("http://www.elyrics.net/read/b/buck-65-lyrics/i-look-good-lyrics.html");
I could substitute "buck-65-lyrics" & "i-look-good-lyrics" to reflect user input?
Input re-directed to PostgreSQL table
Current objective:
User will request name of {song, artist, album}, Java front-end will query remote webpage
Full source code (containing plaintext) will be extracted with Java front-end
Lyrics will be extracted from source code (somehow)
If song is not currently indexed by PostgreSQL server, will be added to table.
Operations will be made on the plaintext to suit the objectives of the program
I'm only looking for direction. If I'm headed completely in the wrong direction, please let me know. This is only for the pseudocode. I'm not looking for answers, or hand-outs, I need assistance in determining what I need to do. Are there external libraries for extracting plaintext that you know of? What technical names are there for what I'm trying to accomplish?
Thanks, Tyler
This approach is referred to as screen or data scraping. Note that employing it often breaks the target service's terms of service. Usually, this is not a robust approach, which is why API-like services with guarantees about how they operate are preferable.
Your approach sounds like it will work for the most part, but a few things to keep in mind.
If the web service you're interacting with requires a very precise URL scheme, you should not feed your user-provided data directly into it, since it is likely to be muddied by missing words, abbreviations, or misspellings. You might be better off doing some sort of search, first, and using that search's best result.
Reading HTML data is more complicated than you think. Use an existing library like jsoup to assist you.
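If you do end up building the URL from user input as the question proposes, at least normalize the input first. A minimal sketch follows. The URL pattern is only inferred from the single example URL in the question, so it is an assumption about the site, and LyricsUrl, slug and build are made-up names:

```java
import java.util.Locale;

class LyricsUrl {
    // Lowercases the text and turns every run of non-alphanumeric
    // characters into a single hyphen, e.g. "Buck 65" -> "buck-65".
    static String slug(String text) {
        return text.trim().toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9]+", "-")
                .replaceAll("^-+|-+$", "");
    }

    // Assumed pattern: /read/<first letter of artist>/<artist>-lyrics/<song>-lyrics.html
    static String build(String artist, String song) {
        String artistSlug = slug(artist);
        String songSlug = slug(song);
        if (artistSlug.isEmpty() || songSlug.isEmpty()) {
            throw new IllegalArgumentException("artist and song must contain letters or digits");
        }
        return "http://www.elyrics.net/read/" + artistSlug.charAt(0) + "/"
                + artistSlug + "-lyrics/" + songSlug + "-lyrics.html";
    }

    public static void main(String[] args) {
        System.out.println(build("Buck 65", "I look good"));
    }
}
```

This fixes capitalization and punctuation only. Misspellings and abbreviations still need the search step described above.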
The technical term for extracting content from a site is web scraping; you can google that. There are a lot of libraries available online; for Java there is jsoup. Though it's easy to write your own regex.
The first thing I would do is use curl to get the content from the site, just for testing. This will give you a fair idea of what to do.
You will have to use a HTML parser. One of the most popular is jsoup.
Take care about the legal aspect of what you do ;)
Just wondering which is best here. I want to output data from a table in my DB, then put a lot of this data into an HTML table on the fly on my page. I'm working with Java on the server side. Basically I pull the results from the DB and have the raw data... just what next?
There is a chance I may want to take data from multiple tables in order to combine it into one table for my site.
I retrieve the results of the query from the DB. Do I now create text from it in the form of JSON, which I can parse using jQuery when the object is returned to my browser? (A sub-question: is just using a StringBuilder the correct way to make a JSON object to output?)
Or..
Should I build the HTML as a string and output that to the browser instead?
Which is better and why?
I've built entire pages from JSON data on the client. It reduces the redundancy of repeating HTML and can lead to better performance, depending on the complexity of your HTML.
I had a large catalog that used multiple tabs for different sections. Sending it all to the client as JSON and generating the resulting HTML was way faster than downloading the equivalent HTML.
What you lose, of course, is SEO. Search engines won't be able to see the Javascript-generated output. There are ways around this, using hash URL techniques.
I used to be in favor of generating HTML on the server so that the client can be dumb and simply inject dynamic content. The pragmatic real-world advantage for our small team was that we needed to be experts at fewer technologies. We focused on the middle tier and back end and spent less time on the front end.
Lately, with tools like jQuery, it is easier and easier to do more robust client stuff without having to increase the dev bandwidth much. From a client side, I can say building dynamic HTML from JSON using jQuery isn't that hard.
From the server side, I'm sure there are tools to serialize to JSON. I wouldn't roll your own with StringBuilder. Sorry, I'm not a Java guy so don't have a recommendation.
I'd go with JSON if I knew I had anything more than just static views of the data in mind later on.
But if it's just so that you can see what the result was, and don't care too much for the data then I'd go with the straight forward HTML output.
For actually generating the JSON server-side, there are a number of libraries you can use. org.json is the canonical one, but I prefer Stringtree personally.
I am working on a project here that ingests internal resumes from people at my company, strips out the skills and relevant content from them and stores it in a database. This was all done using docx4j and Grails. This required the resumes to first be submitted via a template that formatted everything just right so that the ingest tool knew what to look for to strip the data.
The second portion of this is: what if we want to get a "reduced" resume out of the database? In other words, I want to search the uploaded content I now have, and only print out new resumes for people who have, let's say, Java programming experience. So I can go into my database, find the people who originally had Java as a skill, and output a new set of resumes that are still in a nice templated format and only have the relevant info in them, instead of ALL the content.
I have been writing some software to do this in Java that will basically use a docx template, overwriting the items in customXML which are bound to the content controls in the doc, so the new data shows up and can be saved as a new docx with that custom data.
This seems really cumbersome to me, and has some limitations. For one, let's say my template has a place for 3 skills, and the particular person has 8 skills. There seems to be no good way to add those 5 additional skills to the docx other than painstakingly inserting the data with all of the formatting XML tags and such. This is a real pain, because if the template changes, I don't want to have to go back into my software and edit source code to change that additional data input XML tag to bold instead of italic.
I was doing some reading up on using InfoPath to create a form that I could use to get the input, connecting to some SharePoint data source or something to store the stripped-out data. However, I can't seem to find out if it is possible using SharePoint to get the data back out in a nicely formatted way. What would the general steps for this be? I couldn't find very much about this topic with some quick googling.
Thanks
You could set up the skills:
<skills>
<skill>..</skill>
<skill>..</skill>
</skills>
and use a "repeat" content control pointing to the container. This would handle any number of <skill> entries.
What's the best way to do spreadsheet-like calculations in a programming language? Example: A multi-user application needs to be available over the web that crunches columns and cells of numbers like a spread-sheet based on user submission. What are the best data structures/ database models/patterns to handle this type of work so that handling the different columns are done efficiently and easily in php, java, or even .Net. Is it better to use data structures within the language, or is it better to use a database? If using a database is the way, how does one go about doing this?
To do the actual calculation, look at graph theory. Basically you want to represent each cell as a node in a graph and each dependency as a directed edge. Next, do a topological sort to calculate the value of each cell in the right order.
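As a rough illustration of that ordering step, here is a minimal sketch in Java using Kahn's algorithm, which is one common way to do a topological sort. The CellOrder class and the cell names are made up for the example:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CellOrder {
    // dependsOn maps each cell to the cells its formula reads, e.g. C1 -> [A1, B1].
    // Returns the cells in an order where every cell comes after its dependencies.
    static List<String> evaluationOrder(Map<String, List<String>> dependsOn) {
        Map<String, Integer> pending = new HashMap<>();          // unmet dependencies per cell
        Map<String, List<String>> dependents = new HashMap<>();  // reverse edges
        for (Map.Entry<String, List<String>> entry : dependsOn.entrySet()) {
            pending.putIfAbsent(entry.getKey(), 0);
            for (String dep : entry.getValue()) {
                pending.putIfAbsent(dep, 0);
                pending.merge(entry.getKey(), 1, Integer::sum);
                dependents.computeIfAbsent(dep, k -> new ArrayList<>()).add(entry.getKey());
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : pending.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String cell = ready.poll();
            order.add(cell);
            for (String dependent : dependents.getOrDefault(cell, new ArrayList<>())) {
                if (pending.merge(dependent, -1, Integer::sum) == 0) {
                    ready.add(dependent);
                }
            }
        }
        // Cells left over can never become ready: they are part of a cycle.
        if (order.size() != pending.size()) {
            throw new IllegalStateException("circular reference between cells");
        }
        return order;
    }
}
```

With C1 depending on A1 and B1, and D1 depending on C1, the result lists A1 and B1 before C1, and C1 before D1. A leftover cell means a circular reference, which a spreadsheet would report as an error.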
Aspose.Cells (formerly Aspose.Excel.Web) is a good way to get the functionality you are looking for.
Unless you are asking more "How is it done?" than "I need to do it", in which case I would look at the other answers given.
Along the lines of "I need to do it"
Microsoft has Excel Services which does just what you want.
Spreadsheet operations on the server. It is available via a web services interface, so you can connect and drive calculations from Java, PHP, .NET, whatever.
Excel Services is part of Sharepoint 2007.
Resolver One is a Spreadsheet app made in IronPython.
There is an explanation [pythonology.org] of the overall calculation mechanism it uses for user-generated equations.
The relevant image showing Resolver One's overall algorithm.
I should note that users can write Python code to be interpreted both in the cells and in a special 'outside of sheet' place.
Look at another question here in SO, from where I reused my answer.
I can't tell you how to do it. But I would recommend you to look at the code of PHPExcel. PHPExcel is a library that allows you to create Excel files within PHP.
The workflow of PHPExcel is simplified like this:
1. Create an empty Excel file object
2. Add cells (with either data or formulas) to the "Excel file"
3. Call the create function, which generates the file itself
In your case you would have to replace step 3 with something like "Create web interface".
Therefore I would recommend that you look at the code of this open source project and see how it is structured in general. This should help you solve your problem.
I once used a binary tree to store the output of parsing a string using BODMAS. Each node was an operation between two other nodes, which could be a number, a variable or another operation.
So y = x * x + 2
became:
      +
     / \
    *   2
   / \
  x   x
Sadly this was at school in Pascal and is stored on a 5 1/4" disk, so you don't want it :)
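The same idea is easy to sketch in Java. This is not the original Pascal code, just a minimal illustration with made-up names (ExprTree, num, variable, op). It builds the tree for y = x * x + 2 by hand and leaves out the parsing step:

```java
import java.util.HashMap;
import java.util.Map;

class ExprTree {
    // Every node can compute its value given the current variable values.
    interface Node {
        double eval(Map<String, Double> vars);
    }

    // Leaf holding a constant.
    static Node num(double value) {
        return vars -> value;
    }

    // Leaf holding a variable looked up at evaluation time.
    static Node variable(String name) {
        return vars -> vars.get(name);
    }

    // Inner node: an operation between two other nodes.
    static Node op(char symbol, Node left, Node right) {
        return vars -> {
            double a = left.eval(vars);
            double b = right.eval(vars);
            switch (symbol) {
                case '+': return a + b;
                case '-': return a - b;
                case '*': return a * b;
                case '/': return a / b;
                default: throw new IllegalArgumentException("unknown operator " + symbol);
            }
        };
    }

    public static void main(String[] args) {
        // y = x * x + 2
        Node y = op('+', op('*', variable("x"), variable("x")), num(2));
        Map<String, Double> vars = new HashMap<>();
        vars.put("x", 3.0);
        System.out.println(y.eval(vars)); // prints 11.0
    }
}
```

Operator precedence is already encoded in the shape of the tree, so evaluation is a plain recursive walk.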
SpreadsheetGear for .NET will let you load Excel workbooks, plug in values, calculate and then get the results.
You can see a few simple ASP.NET calculation samples here, other ASP.NET samples here and download a free trial here.
Disclaimer: I own SpreadsheetGear LLC
I must point out that Google Spreadsheets already does this kind of stuff.