Google Mini

The Google Mini is great. I installed and configured one for our website at work about 9 months ago. It's extremely fast (think nearly every query under a second) and fairly good at getting results.

I haven't tested it against normal work documents yet, but we may be getting a second one, so I'll try to post if I get the chance. If you want to index a file system, get a version 2; the version 1 requires a "web interface" (i.e. you have to present the whole file system as a web page).
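
For anyone stuck with a version 1, here is a rough sketch of what "presenting the file system as a web page" could look like: one HTML page linking to every file, which a crawler can then follow. It assumes a web server already exposes the directory at `base_url`; the function name, the example URL and the page layout are hypothetical and not anything the Mini itself requires.

```python
import html
import os
from urllib.parse import quote


def build_index_page(root, base_url):
    """Return an HTML page with one link per file found under `root`.

    Assumption: the files are already reachable over HTTP at `base_url`.
    """
    base = base_url.rstrip("/")
    items = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # keep the traversal order deterministic
        for name in sorted(filenames):
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            rel = rel.replace(os.sep, "/")  # URLs always use forward slashes
            href = base + "/" + quote(rel)  # percent-encode spaces etc.
            items.append('<li><a href="%s">%s</a></li>' % (href, html.escape(rel)))
    return "<html><body><ul>\n%s\n</ul></body></html>" % "\n".join(items)
```

Write the returned string to an `index.html` in the shared directory and point the crawler's start URL at it.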

[edit]Yes, it is expensive at £2000, but compared to other similarly specified search engines (software & appliances) it's peanuts.[/edit]

Anything you were particularly interested in?

akakjs
 
We have one for our intranet and it's brilliant. Easy to set up and works exactly as well as you would expect it to. Also comes with a free Google t-shirt.
 
FishFluff said:
We have one for our intranet and it's brilliant. Easy to set up and works exactly as well as you would expect it to. Also comes with a free Google t-shirt.
Ours (aka mine) went missing :( We got our Google Mini delivered to our hosting centre, and the t-shirt was nowhere to be seen when I went to set up the server :(

akakjs
 
robc123 said:
What DB does google use to index files?
No idea. One of the downsides of the Google Mini is that it's a black-box solution, and the NDA is about an inch thick.

I suspect it's totally bespoke though, as document indexing is a highly specialised application.

akakjs
 
robc123 said:
What DB does google use to index files?
Google has its own filesystem - the GFS :cool:

I'd expect all their core software is totally bespoke, though for stuff like AdWords etc. I know they don't use expensive commercial software (they tried it, then reverted to what they were using before, which I think was MySQL).

No idea what the mini uses but I'd expect it's something similar to the core search setup.
 
I had heard that it wasn't as good for intranets as it is for the internet, due to the different way in which intranets are laid out and linked together. Could be wrong though.

And Google have their own webserver and filesystem, and I would be surprised if they hadn't developed their own RDB...
 
growse said:
I had heard that it wasn't as good for intranets as it is for the internet, due to the different way in which intranets are laid out and linked together. Could be wrong though.

And Google have their own webserver and filesystem, and I would be surprised if they hadn't developed their own RDB...
The only drawback for intranets I can think of is that you can't assign your own importance to documents (as you can with sitemaps), which in a highly structured intranet might be of significant benefit.

[edit]Actually, thinking about it, I can't see a reason Google would need an RDB for their primary search clusters. It's all about indexes and pre-processed page references; if it were relational, there would be a large overhead that they would never use. That's not to say that the googlebot/analysis systems wouldn't use RDBs.

Google have never been afraid of customizing everything to their needs. They even wrote their own programming language for highly parallel operations: http://labs.google.com/papers/sawzall-sciprog.pdf [/edit]
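
To illustrate the "indexes and pre-processed page references" point, here is a minimal inverted-index sketch. It is a textbook toy, not how Google or the Mini actually work (that is not public as far as this thread knows); the function names and the whitespace tokenising are assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lower-cased term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

All the work happens at index time; a query is just a few set lookups and intersections, with no joins or relational machinery involved.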

Google issued a research paper on the GFS, which makes for interesting reading: http://labs.google.com/papers/gfs-sosp2003.pdf

akakjs
 
Anyone still using Google mini? I'm having a look at it on a test install at work and have a question that I can't answer.

A search on the client site e.g. site:www.domain.gov.uk returns different (better) results than the Google mini search. The Google mini crawler is set to the same URL, anything else I need to check?
 
Don't Google set a limit on how many documents it will index? That was a stumbling block for a lot of my clients. The next one up costs way more.
 
Don't Google set a limit on how many documents it will index? That was a stumbling block for a lot of my clients. The next one up costs way more.

Yeah, for 300k documents it's £7,700, but if a company has that many documents to search, then surely that's not much for them?
 
Anyone still using Google mini? I'm having a look at it on a test install at work and have a question that I can't answer.

A search on the client site e.g. site:www.domain.gov.uk returns different (better) results than the Google mini search. The Google mini crawler is set to the same URL, anything else I need to check?

We found there was a lot of improvement to be had by having the Mini index specially tailored pages. These would only have the important content on them, without any user-friendly navigation. I'd also have a look at the meta data indexing options that could be used to improve your search as well. It's not perfect, but it might help you.
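
A rough sketch of how such a stripped-down page could be generated from an existing one, using only the Python standard library. It assumes the navigation lives in `<nav>`, `<header>` and `<footer>` elements, which will not be true of every site (ours used different markup), so treat the tag list and the names as placeholders.

```python
from html.parser import HTMLParser

# Assumption: everything inside these elements is navigation or page chrome.
SKIPPED_TAGS = {"nav", "header", "footer", "script", "style"}


class ContentExtractor(HTMLParser):
    """Collect text that appears outside the skipped elements."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._skip_depth == 0 and text:
            self.chunks.append(text)


def extract_content(page_html):
    """Return the page text with navigation and page chrome removed."""
    parser = ContentExtractor()
    parser.feed(page_html)
    parser.close()
    return " ".join(parser.chunks)
```

The extracted text can then be written out as a bare page for the crawler, while visitors keep seeing the full version.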

akakjs
 
Yeah, for 300k documents it's £7,700, but if a company has that many documents to search, then surely that's not much for them?

You'd be surprised how much data even a small company collects. I work for a recruitment software company with about 9 employees, and we have about 20GB of data in our database, not counting archives or indexes.
 