

Implementing the RAKE Algorithm with NLTK.

Source: src/main/scala/com/mycompany/solr4extras/funcquery/FuncQueryDataGenerator.scala

package ... import ... import ..._ import ... import ...SolrInputDocument object FuncQueryDataGenerator extends App

We also add the mscore and fscore field definitions to the fields block of the schema.xml file. The title field is already present in schema.xml with type="text_general", which works fine for us, since it tokenizes individual words (we want to be able to search on coffee, cocoa and sugar).

The dist function returns the distance between two vectors (points) in an n-dimensional space. For example, dist(2,x,y,0,0) calculates the Euclidean distance between (0,0) and (x,y) for each document.

What is the difference between geodist(sfield,x,y) and dist(2,x,y,a,b) in Apache Solr for geospatial searches?
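To make the dist versus geodist distinction concrete, here is a minimal sketch in plain Scala (no Solr involved). It assumes that dist with a power of 2 is an ordinary Euclidean distance on the raw field values, and that geodist is a great-circle (haversine) distance in kilometres between latitude/longitude pairs, which is my understanding of the two functions. The object name, the method names and the Earth radius of 6371.0 km are my own choices for illustration, not Solr identifiers.

```scala
object DistanceSketch {

  /** Minkowski (p-norm) distance between two points of equal dimension.
    * power = 1 gives the Manhattan distance, power = 2 the Euclidean distance.
    * Hypothetical helper that mirrors the idea of dist(power, ...), not Solr code.
    */
  def dist(power: Double, a: Seq[Double], b: Seq[Double]): Double = {
    require(a.length == b.length, "points must have the same dimension")
    require(power >= 1.0, "this sketch only handles power >= 1")
    val sum = a.zip(b).map { case (p, q) => math.pow(math.abs(p - q), power) }.sum
    math.pow(sum, 1.0 / power)
  }

  /** Haversine great-circle distance in kilometres between two points given
    * in degrees. The default radius is an assumed mean Earth radius.
    */
  def haversineKm(lat1: Double, lon1: Double, lat2: Double, lon2: Double,
                  earthRadiusKm: Double = 6371.0): Double = {
    val dLat = math.toRadians(lat2 - lat1)
    val dLon = math.toRadians(lon2 - lon1)
    val h = math.pow(math.sin(dLat / 2), 2) +
      math.cos(math.toRadians(lat1)) * math.cos(math.toRadians(lat2)) *
        math.pow(math.sin(dLon / 2), 2)
    2 * earthRadiusKm * math.asin(math.sqrt(h))
  }
}
```

The practical difference is the unit and the geometry: the Euclidean version treats x and y as flat coordinates and returns a value in the same units as the inputs, while the haversine version treats them as angles on a sphere and returns kilometres.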
Here is some Scala/SolrJ code that will generate and populate the data into a vanilla Solr 4.1.0 instance.
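As a rough stand-in for the data generation step, here is a minimal sketch in plain Scala that builds the records in memory and leaves out the SolrJ indexing calls entirely. It assumes the record layout used in this post (a title made of one of "coffee", "cocoa" or "sugar" followed by the two scores, and mscore and fscore drawn uniformly from 1 to 1000). The Doc case class, the id field and the fixed seed are my own additions for illustration and are not taken from the original listing.

```scala
import scala.util.Random

object DummyDataSketch {

  // One record with the three fields from the post, plus an assumed id.
  final case class Doc(id: Int, title: String, mscore: Int, fscore: Int)

  val Words: Vector[String] = Vector("coffee", "cocoa", "sugar")

  /** Generate n dummy records. A fixed seed keeps the output repeatable. */
  def generate(n: Int, seed: Long = 42L): Vector[Doc] = {
    val rnd = new Random(seed)
    Vector.tabulate(n) { i =>
      val word = Words(rnd.nextInt(Words.size))
      val mscore = 1 + rnd.nextInt(1000) // uniform in 1..1000 inclusive
      val fscore = 1 + rnd.nextInt(1000)
      // The scores are repeated in the title purely for visual feedback.
      Doc(i, s"$word $mscore $fscore", mscore, fscore)
    }
  }
}
```

Calling DummyDataSketch.generate(100000) yields the 100,000 records; each one would then be copied into a SolrInputDocument and sent to the server.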
A question that arose recently on the Freenode Solr IRC channel was about dist:geodist() failing to include a field named dist in the response, a field which would contain the return value.

When I read about Function Queries in the past, they seemed like a nice idea, but not interesting enough to pursue further. Most people get introduced to Function Queries through the bf parameter in the DisMax Query Parser or through the geodist function in Spatial Search. So far, I haven't had the opportunity to personally use either feature in a real application.

My introduction to Function Queries was through a problem posed to me by one of my coworkers. We want to be able to customize our search results based on what a (logged-in) user tells us about himself or herself via their profile. This could be gender, age, ethnicity and a variety of other things. On the content side, we can annotate each document with features corresponding to these profile features. For example, we can assign a score to a document that indicates its appeal or information value to males versus females, corresponding to the profile's gender.

So the idea is that if we know that the profile is male, we should boost the documents that have a high male appeal score and deboost the ones that have a high female appeal score, and vice versa if the profile is female. This idea can be easily extended to multi-category features such as ethnicity as well. In this post, I will describe a possible implementation that uses Function Queries to rerank search results using male/female appeal document scores.

For testing, I created some dummy data of 100,000 records with three fields: title, mscore and fscore. The mscore and fscore are random integers in the range 1-1000, and the title contains one of three strings, "coffee", "cocoa" and "sugar", plus the mscore and fscore values (primarily for visual feedback).
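The boost/deboost idea described above can be illustrated with a small client-side sketch of the arithmetic. This is not the Solr function query itself, and the formula is purely an assumption of mine: the base relevance score is scaled by a factor that grows with the appeal score matching the profile and shrinks with the opposite one. The names Hit, boost, rerank and the weight parameter are hypothetical.

```scala
object RerankSketch {

  sealed trait Gender
  case object Male extends Gender
  case object Female extends Gender

  // A search hit with its base relevance score and the two appeal scores (1..1000).
  final case class Hit(title: String, baseScore: Double, mscore: Int, fscore: Int)

  /** Assumed boost formula: scale the base score by 1 + weight * diff / 1000,
    * where diff is (matching appeal score - opposite appeal score).
    * With 0 <= weight < 1 the factor always stays positive.
    */
  def boost(hit: Hit, profile: Gender, weight: Double = 0.5): Double = {
    val diff = profile match {
      case Male   => hit.mscore - hit.fscore
      case Female => hit.fscore - hit.mscore
    }
    hit.baseScore * (1.0 + weight * diff / 1000.0)
  }

  /** Sort hits by boosted score, highest first. */
  def rerank(hits: Seq[Hit], profile: Gender): Seq[Hit] =
    hits.sortBy(h => -boost(h, profile))
}
```

For two hits with the same base score, one with mscore 900 and fscore 100 and the other with the scores swapped, a male profile ranks the first hit on top and a female profile ranks the second on top.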
Solr has had support for Function Queries since version 3.1, but before sometime last week, I did not have a use for it.

Subject: RE: Rentals API / Points Searching

It is utilizing a similar architecture and has exactly the same field names as we do (location_location_search and location_geohash_search). Validate that it is a polygon by making sure the last point in the array matches the starting point. Validate that the values being passed in are numbers, that latitude is not > 90 or < -90, and that longitude is not > 180 or < -180. Don't allow any additional precision beyond that above. It's already less than a millimeter at that precision. Jeff has put an algorithm to prevent "whole world" spatial queries in Gate. Get the parameters from him and implement them in the API. We will need to do multi-polygon searches at some point later in the year. Lowest priority.
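The validation steps in the note above (numeric input, latitude and longitude ranges, and a closed polygon) could look roughly like the following sketch. The Point type and the function names are hypothetical, and the requirement of at least four points for a closed ring (three distinct corners plus the repeated start) is my assumption rather than something stated in the note. The precision limit is left out because the note does not say what the limit is.

```scala
import scala.util.Try

object PolygonValidationSketch {

  final case class Point(lat: Double, lon: Double)

  /** Parse a lat/lon pair, returning None if either value is not a number
    * or falls outside the valid range (lat in [-90, 90], lon in [-180, 180]).
    */
  def parsePoint(lat: String, lon: String): Option[Point] =
    for {
      la <- Try(lat.trim.toDouble).toOption
      lo <- Try(lon.trim.toDouble).toOption
      if math.abs(la) <= 90.0 && math.abs(lo) <= 180.0
    } yield Point(la, lo)

  /** A polygon ring is treated as closed when its last point equals its first.
    * Assumption: a usable ring has at least 4 points.
    */
  def isClosedRing(points: Seq[Point]): Boolean =
    points.length >= 4 && points.head == points.last
}
```

A request handler would call parsePoint on every coordinate pair first and reject the query if any pair yields None, then check isClosedRing on the parsed points.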
