Filter query - algorithm

Discussion in 'HTML, Graphics & Programming' started by SoapSurgeon, 19 Jan 2012.

  1. SoapSurgeon

    Wise Guy

    Joined: 14 Apr 2003

    Posts: 1,101

    Hi,

    I have a series of records in MongoDB (although it could be anything for the purpose of this question). Let's say I have a number of products:

    Product:
    id
    name
    price
    date_added
    interestingness

    I want to list the 25 most interesting products ordered by date (latest added at the top). I always want to have some products listed, so if there aren't any good ones it should choose worse ones. How does this general kind of algorithm work, ideally at the database level?
     
  2. rickh

    Hitman

    Joined: 26 Dec 2008

    Posts: 618

    SELECT * FROM `products` ORDER BY `interestingness` DESC, `date_added` DESC LIMIT 25

    Assuming 'interestingness' is a double from 0 to 5 (or whatever, as long as it's a numerical scale), this will list the 25 most interesting products, and where 2 products share the same level of "interestingness", the latest one will be ranked higher.

    However, this will be quite a stale set of results - they won't change much as time goes on, assuming that the most interesting products are the most popular products and therefore more users rate those products as interesting as time goes on.

    A less stale set of results (one that changes more as time goes on) could be achieved by placing more importance on the date_added field. Rounding the interestingness value to 0 decimal places puts products with similar scores in the same bucket, so date_added decides the order within each bucket. For example:


    SELECT * FROM `products` ORDER BY ROUND(`interestingness`,0) DESC, `date_added` DESC LIMIT 25
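To see what the rounding does to the ordering, here is a small Python sketch of the same two-key sort. The sample products and the `rank` helper are made up for illustration, and `math.floor(x + 0.5)` is only a stand-in for `ROUND(x, 0)` on non-negative values (Python's built-in `round` rounds halves to even, which would muddy the comparison).

```python
import math
from datetime import date

# Hypothetical sample data: (name, interestingness on a 0-5 scale, date_added).
products = [
    ("old favourite", 4.4, date(2011, 1, 10)),
    ("new arrival", 3.8, date(2012, 1, 18)),
    ("last month", 4.1, date(2011, 12, 20)),
    ("dull but new", 1.2, date(2012, 1, 19)),
]

def rank(items, limit=25, rounded=True):
    """Sort by interestingness, then date_added, both descending."""
    def key(item):
        _, score, added = item
        if rounded:
            # Round half up so 3.8, 4.1 and 4.4 all land in the same bucket (4).
            score = math.floor(score + 0.5)
        return (score, added)
    return sorted(items, key=key, reverse=True)[:limit]

exact = [name for name, _, _ in rank(products, rounded=False)]
bucketed = [name for name, _, _ in rank(products, rounded=True)]
print(exact)
print(bucketed)
```

With exact scores the year-old product stays on top; with bucketed scores the three products that round to 4 are ordered by date, so the newest of them comes first.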
     
  3. SoapSurgeon

    Wise Guy

    Joined: 14 Apr 2003

    Posts: 1,101

    Hi,

    Thanks for your reply. There will be new 'products' added all the time and I want the most interesting recent ones to show. Being stale is a definite no-no - I don't want a really interesting product that was added a year ago to show on the front page (strange as it may seem!).

    I think this is very similar to the Facebook feed?

    I did think about weighting the products based on date_added: a product added today gets a weight of 1.0, dropping to 0 at 7 days old. This would remove any products over a week old. Just wondered if it was a standard problem with a standard solution before I start hacking away :D
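A minimal sketch of that weighting idea in Python, assuming the weight falls off linearly between the two end points (the post only gives the end points) and is multiplied into the interestingness value. The names `recency_weight`, `weighted_score` and `WINDOW_DAYS` are hypothetical.

```python
from datetime import datetime, timedelta

WINDOW_DAYS = 7.0  # products this old or older get a weight of 0

def recency_weight(date_added, now):
    """Linear decay: 1.0 for a product added right now, 0.0 at WINDOW_DAYS old."""
    age_days = (now - date_added).total_seconds() / 86400.0
    # Clamp so very old (or future-dated) products stay within [0, 1].
    return max(0.0, min(1.0, 1.0 - age_days / WINDOW_DAYS))

def weighted_score(interestingness, date_added, now):
    """Assumed combination: interestingness scaled by the recency weight."""
    return interestingness * recency_weight(date_added, now)

now = datetime(2012, 1, 19, 12, 0)
fresh = weighted_score(3.0, now - timedelta(days=1), now)
stale = weighted_score(5.0, now - timedelta(days=6), now)
expired = weighted_score(5.0, now - timedelta(days=8), now)
print(fresh, stale, expired)
```

Here a moderately interesting product from yesterday outranks a top-rated one from six days ago, and anything older than a week scores zero.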
     
  4. rickh

    Hitman

    Joined: 26 Dec 2008

    Posts: 618

    It's not similar to Facebook, because I think you also want to show old products that are rated very highly in amongst the new products?

    If you don't want to do that, just add a date limit to the query:

    SELECT * FROM `products` WHERE `date_added` > [last month timestamp]
    ORDER BY ROUND(`interestingness`,0) DESC, `date_added` DESC LIMIT 25

    If you're generating "interestingness" by user votes, you should read this:
    http://www.evanmiller.org/how-not-to-sort-by-average-rating.html
    You should read it anyway though.
     
  5. SoapSurgeon

    Wise Guy

    Joined: 14 Apr 2003

    Posts: 1,101

    Hi,

    Thanks, I've read that link. Very interesting, but not what I'm after; it might come in handy later though.

    What I require is a timeline of products. Let's imagine that 5000 products have been added today. I want to display a subset of those products hand-picked for a particular user (the interestingness). The most recent product should be at the top, and the user should be able to 'load more', which will select another set of interesting products. An old product should not appear on the list until the user has clicked 'load more' enough times.

    If no 'interesting' products have been added out of the 5000, I still want to show something, so inferior products (but still added today) will be shown.

    I can't limit the products to those added today, because in the morning you would have very few products, in which case I would want the end of yesterday's to be included...
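One way to read these requirements, sketched in Python: walk the products newest-first in fixed-size pools, keep the most interesting few from each pool, and show those in date order. This is only an illustration under assumed names (`timeline_page`, `pool_size`, `page_size`) and made-up data, not a standard algorithm from the thread. Because the pools are counted in products rather than days, there is no midnight cut-off, and weak products still fill a page when nothing better is in the pool.

```python
from datetime import datetime, timedelta

def timeline_page(products, page, pool_size=100, page_size=25):
    """Return one page of the timeline.

    products: list of dicts with 'date_added' and 'interestingness' keys.
    page: 0 for the first screen, 1 after one 'load more', and so on.
    """
    # Newest first, so each pool is a contiguous slice of recent history.
    by_date = sorted(products, key=lambda p: p["date_added"], reverse=True)
    pool = by_date[page * pool_size:(page + 1) * pool_size]
    # Keep the best of the pool; if nothing is good, weaker items still qualify.
    best = sorted(pool, key=lambda p: p["interestingness"], reverse=True)[:page_size]
    # Present the survivors as a timeline, latest at the top.
    return sorted(best, key=lambda p: p["date_added"], reverse=True)

# Hypothetical data: one product every 3 hours, going back from 09:00 today.
start = datetime(2012, 1, 19, 9, 0)
scores = [0.1, 0.9, 0.5, 0.2, 0.8, 0.3, 0.7, 0.4]
products = [
    {"id": i, "date_added": start - timedelta(hours=3 * i), "interestingness": s}
    for i, s in enumerate(scores)
]

first = [p["id"] for p in timeline_page(products, 0, pool_size=4, page_size=2)]
more = [p["id"] for p in timeline_page(products, 1, pool_size=4, page_size=2)]
print(first, more)
```

In this example the first page is drawn from the four newest products, and the 'load more' page is drawn entirely from yesterday's. The same slicing could presumably be pushed into the database as a sort on date_added with a skip and a limit, followed by the interestingness ranking.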
     
  6. visibleman

    Wise Guy

    Joined: 3 Jun 2005

    Posts: 1,911

    Location: The South

    What is the 'interestingness' value?
    Is that votes similar to 'hotness' you get on HDUK, Reddit etc where people rate a product? Or is the value something you assign to a product?


    Edit - Evan Miller also did an article on 'hotness' http://www.evanmiller.org/rank-hotness-with-newtons-law-of-cooling.html where he mentions exponential decay of products.
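As a rough illustration of exponential decay (a generic half-life sketch, not the formulation from the linked article), a product's score can be halved every fixed number of days. The two-day half-life and the name `decayed_score` are assumptions.

```python
import math
from datetime import datetime, timedelta

HALF_LIFE_DAYS = 2.0  # assumed: a product loses half its score every 2 days

def decayed_score(interestingness, date_added, now, half_life_days=HALF_LIFE_DAYS):
    """Exponential decay: interestingness * 0.5 ** (age / half_life)."""
    age_days = max(0.0, (now - date_added).total_seconds() / 86400.0)
    return interestingness * math.pow(0.5, age_days / half_life_days)

now = datetime(2012, 1, 19)
today = decayed_score(4.0, now, now)
two_days = decayed_score(4.0, now - timedelta(days=2), now)
year_old = decayed_score(5.0, now - timedelta(days=365), now)
print(today, two_days, year_old)
```

Unlike a hard cut-off, old products never drop to exactly zero, but a year-old product ends up with a score so small that any recent product outranks it.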
     
    Last edited: 19 Jan 2012
  7. SoapSurgeon

    Wise Guy

    Joined: 14 Apr 2003

    Posts: 1,101

    It is a value that is calculated on-the-fly based on a number of factors.

    For example, a product's interestingness will increase if the user has shown interest in a similar product, or if the product has been purchased/viewed a lot, etc.

    This will be calculated during the query. I was thinking a map reduce job might be the way forward, but it isn't quite right.