You may have seen Google’s new No CAPTCHA reCAPTCHA system already – if not, it looks like this:

It replaces the old CAPTCHA system that has long frustrated web users. CAPTCHAs are the deliberately hard-to-read blocks of text or numbers used by websites to try to ensure that you’re not a software robot. They are not designed to be a form of identity, but they can be considered a form of low-level security in that they are designed to prevent things like forum spam.

Many websites, including GOV.UK, have banned CAPTCHAs as they are a great example of prioritising business needs (e.g. reducing spam) over user needs (a site that is easy to use). Amazon’s Mechanical Turk has been used to crowdsource correct responses that are then fed into the very bots CAPTCHAs were designed to prevent. Overall, the widespread conclusion is that it’s a piece of technology that has had its day.

Enter Google’s “No CAPTCHA reCAPTCHA” system. This uses undisclosed AI techniques to identify whether the user is a human, and only reverts to an old-style CAPTCHA – in their case house numbers from Google Street View – if the level of surety isn’t high enough.

Understandably, Google isn’t going to disclose what methods are used to assess the “humanness” of the user. People have theorised that it involves monitoring mouse movements, timing of actions, click locations and other similar human/computer interaction inputs. On top of this they will likely be using browser fingerprinting and, of course, whether you are logged into your Google account.
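To make the theory concrete, here is a toy sketch of how behavioural signals like these might be combined into a “humanness” score. Everything here is invented for illustration – the feature names, thresholds and weights bear no relation to whatever Google actually does:

```python
import statistics

def humanness_score(mouse_intervals_ms, click_offsets_px):
    """Toy heuristic: humans tend to produce irregular timing and
    imprecise clicks, while simple bots are often perfectly regular
    and pixel-exact. All weights and scales are invented."""
    # Variance in the gaps between mouse events: a scripted bot is often uniform.
    timing_var = statistics.pvariance(mouse_intervals_ms)
    # Average distance of clicks from the exact centre of the target.
    click_jitter = sum(click_offsets_px) / len(click_offsets_px)
    # Combine into a 0-1 score (arbitrary weighting for illustration).
    return min(1.0, timing_var / 500 * 0.5 + min(click_jitter / 5, 1.0) * 0.5)

# A bot clicking at perfectly regular intervals, dead-centre every time:
bot = humanness_score([100, 100, 100, 100], [0, 0, 0])      # 0.0
# A human: irregular timing, clicks a few pixels off-centre:
human = humanness_score([80, 210, 95, 400], [3, 7, 2])      # 1.0
```

A real system would presumably fuse dozens of such signals with identity evidence, but even this toy version shows how cheaply a “probably human” judgement can be made without any challenge being shown.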

This amounts to Google using two sets of analysis to decide on your human (or not) status: one based on who you are (identity) and the other on what you do. What I wonder is whether similar mechanisms could be used to identify internet trolls.

Wouldn’t it be great to be able to add an “I’m a nice person” button to a website that would use magic behind-the-scenes AI to decide whether you’re the kind of person who should be allowed to post in the forum? It could even be designed so that it gave an individual user a “niceness” rating that website owners could design their sites around (“you need to be this nice to enter here”).
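The “this nice to enter here” mechanic would be the simple part. A hypothetical sketch, with entirely made-up site names and thresholds:

```python
# Hypothetical per-site thresholds that a "niceness" service might publish.
SITE_THRESHOLDS = {
    "friendly-knitting-forum": 0.9,   # strict
    "general-discussion": 0.5,        # middling
    "anything-goes": 0.0,             # no bar at all
}

def may_post(user_niceness: float, site: str) -> bool:
    """'You need to be this nice to enter here.'"""
    return user_niceness >= SITE_THRESHOLDS[site]

may_post(0.7, "general-discussion")       # True
may_post(0.7, "friendly-knitting-forum")  # False
```

The hard part, of course, is where `user_niceness` comes from.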

Of course this idea is fraught with problems. The first obvious one is how the algorithm for “niceness” would be designed. It would need to remove biased / fallible human judgement as part of the assessment. It would have to remain neutral on opinions that many people may not agree with but that are still very much open to reasonable debate. It would also need advanced natural language processing to be able to understand quotations, sarcasm, irony, ad hominem attacks, bullying language, etc. That sounds extremely hard, but it is exactly the kind of thing Google, Apple and Microsoft are going to have to research in order to develop the next generation of intelligent personal agents. It’s exactly the kind of work that DeepMind or Watson should be very good at.
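To see why this is so hard, consider how far a naive keyword filter falls short of the language understanding the idea demands. In this toy Python sketch (the hostile-phrase list and scoring are entirely invented), sarcasm sails straight past the filter:

```python
def naive_niceness(text: str) -> float:
    """Toy keyword scorer: each hostile phrase found halves the score.
    The phrase list and weights are invented for illustration."""
    hostile_phrases = {"idiot", "stupid", "shut up"}
    lowered = text.lower()
    hits = sum(1 for phrase in hostile_phrases if phrase in lowered)
    return max(0.0, 1.0 - 0.5 * hits)

naive_niceness("You make a fair point")           # 1.0 - looks fine
naive_niceness("You idiot, that's stupid")        # 0.0 - caught
naive_niceness("Oh, what a *brilliant* idea...")  # 1.0 - sarcasm slips through
```

Closing that gap – distinguishing a quoted insult from an uttered one, or mock praise from real praise – is precisely the unsolved part.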

The second problem is how such a system would cope with the negative aspects of categorisation and with how people evolve over time. If a person is repeatedly judged as not being “nice” enough to mix with people on mainstream sites, will they adopt that as a badge of honour and slink off to self-reinforcing areas of the internet like 4chan and the Slymepit? I live in the hope that people can improve their level of discourse over time – would such a system be able to assess that improvement in some areas of the internet in order to open further sites to the person?

There are also issues with users who are just starting out. If there’s no corpus of their previous writing to analyse, would the system banish them to uncontrolled areas until they’ve proved that they are sufficiently “nice”?

Finally, there are substantial difficulties related to anonymity. The service could be set up to read everything you post when you’re logged into any site, but in order to do that it would need to categorically prove who you are, and that usually means having a whole set of trusted identity relationships (think “connect Twitter to this site”). This is deeply problematic when the site is dedicated to people being able to discuss issues they don’t want traced back to them in their “real” lives.

One potential solution to this would be to have everyone who wants to post to a site enter their comments and posts through the service. The user would still need an identity relationship with the “niceness” site, but they could then choose to post identified or anonymously to the forum in question. This version could even potentially help people write better posts, as a warning along the lines of “this post is insufficiently nice to be posted on the site you have selected” could be a real eye-opener.
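The flow described above can be sketched in a few lines. Again this is purely illustrative – `score_fn` stands in for the service’s (hypothetical) NLP model, and the field names are made up:

```python
def submit_post(text: str, site_threshold: float, anonymous: bool, score_fn) -> dict:
    """Route a post through a hypothetical 'niceness' service before it
    reaches the forum. score_fn stands in for the service's NLP model."""
    score = score_fn(text)
    if score < site_threshold:
        # The eye-opening warning suggested above.
        return {"accepted": False,
                "warning": ("this post is insufficiently nice to be posted "
                            "on the site you have selected")}
    # The service knows who you are, but the forum need not.
    return {"accepted": True,
            "author": "anonymous" if anonymous else "verified identity"}

# Rejected by a strict site:
submit_post("some draft", 0.5, True, lambda text: 0.2)
# Accepted, and forwarded without identity attached:
submit_post("some draft", 0.5, True, lambda text: 0.9)
```

The key property is that the identity proof and the published post are decoupled: the service vouches for the author’s “niceness” without the forum ever learning who they are.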

This is all a thought experiment, but as natural language processing gets better, ideas like this are going to come to the fore, and we should be ready to discuss whether we think they’re a good idea.