I have understood how to add XML files to Solr and can search them via the Solr admin interface...
What I need to know, however, is how to make Solr work with PHP and index MySQL records...
This is what I want to do:
I have a MySQL table which I would like to add to Solr (index it), so that instead of searching the MySQL table directly via PHP, I first take the query string and send it to Solr, Solr sends back the results as a list of IDs, and I then use those IDs to query MySQL and fetch the proper records...
I have no clue how to communicate with Solr using PHP; any help is appreciated!
Thanks
There's a good article here that will help you through the integration of PHP and Solr:
http://www.ibm.com/developerworks/opensource/library/os-php-apachesolr/
There are a number of PHP interfaces to Solr; that article references the Solr PHP Client:
http://code.google.com/p/solr-php-client/
but there's also this:
http://pecl.php.net/package/solr
I'd suggest that you start by using the DataImportHandler (http://wiki.apache.org/solr/DataImportHandler) to index the database, and use one of the many Solr PHP clients (see the SolrPHP wiki page). Note that Solr also emits JSON responses, so if you are familiar with JSON, that may be the easiest way to get started.
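To give a flavour of what that looks like, here is a minimal sketch of a DataImportHandler data-config.xml for a hypothetical MySQL table called products (the connection details, table and column names are made up; map them to your own schema):

<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/mydb" user="dbuser" password="dbpass"/>
  <document>
    <entity name="product" query="SELECT id, name, description FROM products">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
      <field column="description" name="description"/>
    </entity>
  </document>
</dataConfig>

You register the handler in solrconfig.xml and then trigger indexing with the full-import command described on the wiki page.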
I've been there too, and it was the first time I found the Internet to be annoying! Maybe that was because I was in such a hurry to learn it in under a minute. Here's what I suggest:
1. Don't panic. Understanding how it works, or even just the implementation, takes more than a few seconds, so keep some time aside for this.
2. Learn how to use JSON. You can use it to communicate across languages (see the sample response after this list).
3. Check the Apache Solr site.
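For instance, JSON output from Solr is just a query parameter away (wt=json); a request and a trimmed-down response might look roughly like this (the field names here are made up):

http://localhost:8983/solr/select?q=name:test&wt=json

{
  "responseHeader": {"status": 0, "QTime": 1},
  "response": {
    "numFound": 2,
    "start": 0,
    "docs": [
      {"id": "17", "name": "test product"},
      {"id": "42", "name": "another test"}
    ]
  }
}

From PHP you would json_decode() that, collect the ids, and fetch the full rows from MySQL.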
I've got a client who uses QuickBooks Online for accounts, and he wants to be able to read data from it programmatically. We're using Clojure, so any solution in Java will work, or HTTP GETs etc. can be made directly if necessary.
They've got what appears to be a nice RESTful interface to their stuff, and a Java library for accessing it, but I can't make head or tail of their documentation: https://developer.intuit.com/docs?redirectid=accounting, which all seems to be about web apps and OAuth and other stuff.
All I want to be able to do is get, say, a single customer record.
Can anyone point me to the simplest possible Hello World type program, in any language? (Preferably Java or something easy to read like python)
I'd imagine that what I'm looking for would look something like:
import quickbooksapi
username='fluffy'
password='doom'
cus=quickbooksapi.get_customer(username,password,id=4)
print(cus)
or something?
Or have I just got the wrong end of some gigantic stick here?
This looks like the best available documentation:
https://developer.intuit.com/hub/blog/2016/04/25/quick-start-to-quickbooks-online-rest-api-with-oauth1-0
It's a recent blog post by a QuickBooks developer, showing how to get OAuth keys and then use curl to access the REST API.
It seems that you have to pretend to make a SaaS app in order to get one bit of gubbins, and then there's another thing where you can get the rest of the gubbins.
After that you can use curl, putting all the gubbins in the headers. (The Postman extension for Chrome that he uses can generate curl commands and equivalents in many other languages.)
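To give a rough idea, the request you end up with is something like this (a sketch only: the company and customer IDs and all the OAuth values are placeholders, and oauth_signature has to be computed per request, which is what Postman does for you):

curl -H 'Accept: application/json' \
     -H 'Authorization: OAuth oauth_consumer_key="KEY", oauth_token="TOKEN", oauth_signature_method="HMAC-SHA1", oauth_timestamp="1470000000", oauth_nonce="abc123", oauth_version="1.0", oauth_signature="COMPUTED"' \
     https://quickbooks.api.intuit.com/v3/company/COMPANY_ID/customer/CUSTOMER_ID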
It works exactly as advertised (6th August 2016).
That's enough. I can take it from there.
I just can't manage to configure Solr to pull documents from an existing MongoDB. Even a Google search for this didn't give me anything worthwhile. Is there a tutorial or video on how to do this?
In a past life, we tailed the oplog and used that to feed data into Solr.
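A minimal sketch of that approach using the MongoDB Java driver and SolrJ (the database/collection names, Solr core, and field names are hypothetical, and a real version would batch commits rather than committing per document):

import com.mongodb.CursorType;
import com.mongodb.MongoClient;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoCursor;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;
import org.bson.Document;

public class OplogToSolr {
    public static void main(String[] args) throws Exception {
        // Tailing the oplog requires MongoDB to run as a replica set.
        MongoClient mongo = new MongoClient("localhost");
        MongoCollection<Document> oplog =
                mongo.getDatabase("local").getCollection("oplog.rs");

        SolrClient solr = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/mycore").build();

        try (MongoCursor<Document> cursor = oplog.find()
                .cursorType(CursorType.TailableAwait).iterator()) {
            while (cursor.hasNext()) {
                Document entry = cursor.next();
                // Forward only inserts ("i") on the collection we care about.
                if ("i".equals(entry.getString("op"))
                        && "mydb.documents".equals(entry.getString("ns"))) {
                    Document o = (Document) entry.get("o");
                    SolrInputDocument doc = new SolrInputDocument();
                    doc.addField("id", o.get("_id").toString());
                    doc.addField("text", o.getString("body")); // hypothetical field
                    solr.add(doc);
                    solr.commit(); // batch this in production
                }
            }
        }
    }
}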
I need to create a document store with search capabilities. Sounds simple...
That means I have documents which I need to store in a database. I thought about CouchDB, and about a few other document-oriented databases, but I'm still not sure what the best solution would be.
On the other side, I thought about integrating Solr into some kind of web application which I'm going to use for uploading, indexing, searching, updating, and deleting documents.
And, of course, the main problem is that most of these documents are written using Cyrillic characters.
Maybe I'm trying to combine things that do not match together.
Could someone give me some advice on what would be the best way to implement a solution like this?
Best,
Joksimovic
Brother Serb/Montenegrin :)
I suggest you use MongoDB as your database and use Solr to get index/search capability.
I used Solr in my previous (government tender) project and it's GREAT.
No bugs, easy to use when you get into it and it's blindingly fast.
It looks like Thinking Sphinx could help for your needs. You could store documents in any database (SQL-oriented or not) and search them with Sphinx.
Sphinx supports Cyrillic characters out of the box, and it's also possible to use stemming, faceted search, fuzzy search, etc. Maybe that helps you.
Read more about Sphinx here
I am also working on such a content management system. Until now my plan has been to use a database to store the metadata, and to store the documents themselves on the file system.
Don't go for storing documents in a database like SQL Server, since that has size limitations and licensing costs. For search you can use Solr (better than Sphinx in terms of support and acceptance in open source):
Choosing a stand-alone full-text search server: Sphinx or SOLR?
Either way you need to populate the indexes, then call API methods to search.
I want to know which query classes Solr uses for querying, and what the differences are between querying with Lucene and querying with Solr.
I am not sure what you are asking, but Solr is basically a search/indexing server. It has an external HTTP-based API for sending documents to be indexed and for searching them.
One of the core pieces of Solr is Lucene. This is the library that actually indexes/searches stuff.
If you need the API/query info for Solr (which should mirror that of Lucene very closely), look on lucene.apache.org
Solr allows you to have a distributed search engine that is exposed as a web service to your client application. If you are asking how to use it on the client side, just look at the SolrJ API. If you are asking about internal Solr APIs and classes, then you could start from the QueryComponent class, e.g. http://lucene.apache.org/solr/api/org/apache/solr/handler/component/QueryComponent.html.
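For the client-side case, a minimal SolrJ query looks something like this (a sketch for a recent SolrJ version; the core name, query string, and field names are made up):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

public class SolrQueryExample {
    public static void main(String[] args) throws Exception {
        SolrClient solr = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/mycore").build();
        // The query string uses the same syntax as the Lucene query parser.
        SolrQuery query = new SolrQuery("name:test");
        query.setRows(10);
        QueryResponse response = solr.query(query);
        for (SolrDocument doc : response.getResults()) {
            System.out.println(doc.getFieldValue("id"));
        }
        solr.close();
    }
}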
Lucene is the technology used by Solr to perform searches.
I'm not 100% sure what you are asking, but if it's "how do I query Solr", then you simply visit or curl a URL; the URL contains the Solr query, e.g.
price:[0 TO 1000]
or
name:test
The first part (before the colon) is the field, and the second part is the search, which can be text, a numeric range, etc.
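A complete request against a local Solr instance might look like this (the brackets and spaces URL-encoded):

http://localhost:8983/solr/select?q=price:%5B0%20TO%201000%5D

By default you get XML back; add wt=json for JSON.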
There is plenty of documentation regarding this on Solr's wiki.
Let me know what your actual problem is and I'll gladly help.
I am familiar with the Java programming language. I would like to extract data from a website and store it in a database running on my machine. Is that possible in Java? If so, which API should I use? For example, there are a number of schools listed on a website. How can I extract that data and store it in my database using Java?
What you're referring to is commonly called 'screen scraping'. There are a variety of ways to do this in Java; however, I prefer HtmlUnit. While it was designed as a way to test web functionality, you can use it to hit a remote web page and parse it out.
I would recommend using a good error-handling HTML parser like TagSoup to extract exactly what you're looking for from the HTML.
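A minimal HtmlUnit sketch of that idea (for a recent HtmlUnit version; the URL and the XPath are hypothetical placeholders for the page and markup you actually need):

import java.util.List;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlTableRow;

public class SchoolScraper {
    public static void main(String[] args) throws Exception {
        try (WebClient webClient = new WebClient()) {
            // Fetch the page as a browser would.
            HtmlPage page = webClient.getPage("http://example.com/schools");
            // Grab the rows of the (hypothetical) schools table via XPath.
            List<?> rows = page.getByXPath("//table[@id='schools']//tr");
            for (Object row : rows) {
                // From here, each row's text could be inserted into your database via JDBC.
                System.out.println(((HtmlTableRow) row).asText());
            }
        }
    }
}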
You definitely need a good parser like NekoHTML.
Here's an example of using NekoHTML, albeit using Groovy (a Java-based scripting language) rather than Java itself:
http://www.keplarllp.com/blog/2010/01/better-competitive-intelligence-through-scraping-with-groovy
You can use VietSpider XML from
http://sourceforge.net/projects/binhgiang/files/
Download VietSpider3_16_XML_Windows.zip or VietSpider3_16_XML_Linux.zip
VietSpider Web Data Extractor: the software crawls data from websites (data scraping), formats it to standard XML (Text, CDATA), and then stores it in a relational database. The product supports various RDBMSs such as Oracle, MySQL, SQL Server, H2, HSQL, Apache Derby, Postgres... VietSpider Crawler supports sessions (login, query by form input), multi-downloading, JavaScript handling, and proxies (including multi-proxy by auto-scanning proxies from websites)...
Depending on what you are really trying to do, you can use many different solutions.
If you just want to fetch the HTML code of a web page, then URL.getContent() may be your solution. Here is a little tutorial:
http://www.javacoffeebreak.com/books/extracts/javanotesv3/c10/s4.html
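For that simple case, here is a minimal sketch in plain Java (using URL.openStream(), which is slightly more convenient than getContent(); the URL is a placeholder):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class FetchPage {
    public static void main(String[] args) throws Exception {
        // Open a stream to the page and dump the raw HTML, line by line.
        URL url = new URL("http://example.com/");
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), "UTF-8"))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}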
EDIT: I didn't understand that he was searching for a way to parse the HTML code. Some tools have been suggested above. Sorry for that.