****Fresh random image thread (with rule addition) Every post MUST contain an image!****

Status
Not open for further replies.
agdsd.jpg


vYVAT.jpg


5WVyo.jpg


srrhp.jpg


xNqhw.png
 
A remote worker here has been battling broadband problems from her Northern site for many months now and just today I get an email with pics asking where she can get this kind of cable because no shop sells them.

Well, I think we have found out, after many days of complaining to BT, scratching heads and so on, what the problem is...

Someone has stripped and taped strands of a phone cable and ethernet cable together and used that going into the BT router....

IMG-20110927-00289.jpg


:rolleyes:
 
That's pretty terrible.

OCR is incredibly difficult for computers to do, especially when you're dealing with poor quality and/or dirty paper as is the case with the ReCaptcha system.

I used to install OCR systems, and had a really hard time managing people's expectations after the salesman had gone in and told them about how our system is 95% accurate. The customer used to think that 95% of the pages they put through would go through first time without human intervention.

What actually happens is that 95% of the characters are read accurately, so given an average word length of 5 characters, roughly one in 4 words would have an incorrect character. The system made attempts to fix this using judgement and a database of common English language constructs, but this often made the problem worse as it would throw other characters into doubt. And of course, if presented with a word or name that wasn't English, the system would really start to screw up. My favourite example was the name "Rajit Patel": the j was misread as a v and the i and t were low confidence due to a poor scan, with similar problems in the surname, so his name got changed to "Raved Paper".
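
The one-in-four figure can be checked with a quick calculation. This is only a rough sketch: it assumes five characters per word and that each character is misread independently of its neighbours, which a real scan won't strictly obey. The function name is made up for illustration.

```python
def word_error_rate(char_accuracy: float, word_length: int) -> float:
    """Chance that a word contains at least one misread character.

    Assumes every character is read independently with the same accuracy.
    """
    # A word is clean only if every one of its characters is read correctly.
    clean_word = char_accuracy ** word_length
    return 1.0 - clean_word


rate = word_error_rate(0.95, 5)
print(f"{rate:.1%} of 5-character words contain an error")  # prints 22.6%
```

So "95% accurate" at the character level leaves a bit under a quarter of the words with at least one mistake, which is where the one-in-four rule of thumb comes from.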

It was also painfully obvious that the 95% accuracy claim is only valid if you use this font:

UoTOI.gif


It's lower with normal type (e.g. Times or Helvetica) and even lower still with handwriting.

The ReCaptcha system only works because they can crowdsource the OCR corrections and people are forced to do it. The accuracy shown in that image is about what I would expect. It does look like that system doesn't attempt to correct based on language or context, but that would make sense because they need to isolate the problematic words and send them off for manual processing by us lot.

In short, I hate OCR systems.
 
What's funny about recaptcha is the distortion. If their systems can't read the word then why do they need to distort the words? If it's so bots can't read it then they're clearly employing the wrong people to write the original OCR software in the first place.

bJML6.jpg
 
The way it works is that one of the words is a "control" word, meaning the system already knows what it says, and the other word is worked out using a point system based on what people type for it.
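
Roughly, the control word plus point system could look something like the sketch below. To be clear, the weights, the threshold and all the names here are assumptions for illustration, not anything confirmed about how ReCaptcha is actually built.

```python
from collections import defaultdict

# All three values are assumed for illustration only.
HUMAN_VOTE = 1.0        # points for one human answer
OCR_VOTE = 0.5          # points for one OCR guess
ACCEPT_THRESHOLD = 2.5  # points needed before a reading is accepted


def control_matches(control_truth: str, control_typed: str) -> bool:
    """The user passes only if they typed the known control word correctly."""
    return control_typed.strip().lower() == control_truth.strip().lower()


class UnknownWord:
    """Collects points for each candidate reading of a word OCR couldn't read."""

    def __init__(self, ocr_guesses=()):
        self.points = defaultdict(float)
        for guess in ocr_guesses:
            self.points[guess.strip().lower()] += OCR_VOTE

    def record(self, control_truth, control_typed, unknown_typed) -> bool:
        """Count the answer for the unknown word only if the control word was right."""
        if not control_matches(control_truth, control_typed):
            return False  # failed the captcha, so the answer is ignored
        self.points[unknown_typed.strip().lower()] += HUMAN_VOTE
        return True

    def resolved(self):
        """Return the accepted reading, or None if nothing has enough points yet."""
        if not self.points:
            return None
        best = max(self.points, key=self.points.get)
        return best if self.points[best] >= ACCEPT_THRESHOLD else None


word = UnknownWord(ocr_guesses=["paper"])
for _ in range(3):
    word.record("morning", "morning", "Patel")
print(word.resolved())  # prints patel
```

The point is that nobody's answer for the unknown word is trusted on its own. It only counts if they got the control word right, and a reading is only accepted once enough people agree on it.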

landscape.jpg
 