Big Malware: Why We Must All Treat Data as a Currency

Thursday 13 May 2021
Bob Leggitt
"In other words, we are buying online services for an unspecified and upwardly dynamic price. We are handing Big Tech a blank cheque."
Data havens
Photo by NASA on Unsplash

Imagine what it would be like if inflation topped 100%… Imagine pumping double the amount into your entertainment budget this year, as compared with last - and getting a worse experience. I'm sure you wouldn't be very happy. And yet that may be happening to you right now, without you even noticing.

In the data economy, 100%+ inflation is already a very real scenario. And the primary reason for this out-of-control data-sucking spiral is that most of us have no idea how valuable our data is. Indeed, critically, we have almost no control at all over the spread of our data, once we've given it up. Access to our data can be sold again, and again, and again. Far from being able to stop this happening, we're most unlikely even to know about it. In other words, we are buying online services for an unspecified and upwardly dynamic price. We are handing Big Tech a blank cheque.

Do you let out a sigh when you hear the word “telemetry”? What about “session recording”? “Data points”? “Data log”? “Canvas fingerprinting”? “Cloudflare”? The “third-party data-piggyback loophole”? “JavaScript enforcement”? “Unauthorised human A/B testing”? “Unauthorised human psychological profiling”? “Unauthorised human intelligence testing”? “Unauthorised human labour-tasking”?…

For many people, the language above has little significance, and if you're unfamiliar with any of the terms, don't worry - I'll give you a glossary at the end of this post. But all of the phrases describe ways in which we're paying for the use of online services today, and almost certainly were not ten years ago. In data, the price of using online services has skyrocketed, but we're not getting any more in return. We're routinely getting less.

"It has nothing to do with protecting you from smalltime baddies. Tech doesn't give a shit if you get hacked, scammed or robbed."

Have you ever looked at a supposed “privacy solution” and wondered: why can't I just run this myself, as desktop software, without any third-party involvement? There's only ever one answer: because data.

Have you wondered why you were required to solve four CAPTCHAs in succession when you were logged in and thus it was always known you were not some random bot? Answer? Because it was never about you being a “potential bot”. It was about you being used as human labour, to train an AI mechanism to oppress you at some point in the future - whilst at the same time assessing your own intelligence.

The cost, in non-financial currencies, of using online services, has increased by more than a thousand percent over the past decade, but the quality of most of those services has declined enormously. We have less choice, less access, less information, and that's not incompetence. It's a plan.

I regularly spot huge increases in data demand going hand in hand with reductions in usability. Most recently I observed this in WordPress.com's theme-selection process. One month ago, the privacy extension uBlock Origin would report between ten and twenty blocked items when I used the theme selector. Now it reports over a hundred. And the user experience has predictably and dramatically worsened, becoming more time-consuming, more restrictive and more oppressive, as has consistently been the case with WordPress.com over the past decade.
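To put those blocked-item counts into currency terms, here's a rough calculation of the increase they imply. It's only a sketch based on the figures quoted above (ten to twenty blocked items a month ago, over a hundred now). The function name is invented for illustration, and a blocker's counter is at best a crude proxy for data demand.

```python
def percent_increase(before: float, after: float) -> float:
    """Return the percentage increase from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (after - before) / before * 100.0


# Figures taken from the paragraph above. "Over a hundred" is treated as
# exactly 100, so both results are lower bounds (an assumption).
low_estimate = percent_increase(20, 100)   # starting from the high end: 20 items
high_estimate = percent_increase(10, 100)  # starting from the low end: 10 items

print(f"Increase in blocked items: at least {low_estimate:.0f}% to {high_estimate:.0f}%")
```

On those numbers, the data price of choosing a theme rose by somewhere between 400% and 900% in a month.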

In fact, we're now so used to having our user experience compromised to facilitate data collection that we've come to actually dread updates. And it's not just the interfaces that get worse, deliberately adding extra, unnecessary clicks so that our thinking processes can be monitored, etc. No, we're also steadily being cut off from the information we request, so that we can instead be fed what the Tech Collective wants to feed us. Put more simply, search engines are getting worse.

I'm not joking! Search engines are categorically getting worse. They were spammy a decade ago, but they did at least attempt to find what we asked for. Today, they behave much more like malware, frequently leading our attention towards results that don't even remotely resemble the phrases we typed in.

"Decentralised platforms can be just as oppressive as large, centralised platforms - sometimes more so."

We know there's a lot more online censorship than ever before, but it goes beyond that. Like the trojan malware toolbars of old, the search engines of today have become push mechanisms for preferred “organic” content, to the extent that only the use of advanced search tools will find us what we want.

TRUST A SEARCH ENGINE...

So what percentage of people actually use advanced search? I know the answer to this, because I can see in the analytics for various blogs whether or not people used advanced search methods to access the pages. But I thought I'd get a search engine to tell me all the same. Here's the fiasco that ensued…

I started with Google. And?…

Google - percentage of people who use advanced search

Top result is about search engine marketing. Of course it is. What else but search engine marketing would a search engine marketing company want me to know about? But did I ask a question about search engine marketing? No. Did I ask a question about any of the other results on the page? No. Every result is irrelevant. EVERY. RESULT. And in case you're wondering, that top result doesn't answer the question, and doesn't at any point mention advanced search. So, umm… In terms of fitness for purpose, epic fail.

So let's try Startpage… Same irrelevant dog-dump as Google.

Yahoo?… Nah, we're still obsessed with marketing.

Mojeek?… Top result: “percentage of people who use dating sites”. Nice job. But let's be fair. The second result is about advanced search… LET'S GO!…

Site cannot be reached

Oh. Site apparently inaccessible. Or is the browser just blocking it because the admin didn't suck Big Tech's webmaster advisories? Don't know. Haven't got time to find out. A quick glance at the rest of the results, and there's more dating site dross on the page, so let's not take this search engine seriously. Let's find another…

Okay, so let's stop messing about now. Who's a really serious player in the search market? Microsoft! Bing will know! Or will it?… Nope. Another bumper load of irrelevant marketing blab.

So I'm thinking by now, that no one has a post about the percentage of people who use advanced search. Because if they did, at least one of the above search engines would find it, right? One would think. But it turns out that if you enter this simple search on DuckDuckGo, top result: “What percentage of search users use advanced search?”. Bingo!…

DuckDuckGo search

So there was at least a relevant title after all. It was just that none of the search engines I tried, other than DuckDuckGo, actually wanted to show it to me. So now, hopefully, we can finally get our answer…

“Not quite that simple I'm afraid, Sir”, says my Chromium browser. “Privacy error!”, screams my Google-funded web-access tool. “Your connection is not private”.

Your connection is not private

Yeah. I want to read a piece of public information - not hand over my ID scans, banking details and entire password list via a contact form. I do not need a private connection. And on the screen by default, there is no option to proceed. It looks like I'm blocked.

To move forward I have to click the Advanced button - which many people, stricken with fear due to this dire (yet completely unnecessary) safety warning, will not do. Most people will instead click the reassuringly blue “Back to safety” button, and the browser will have blocked someone from visiting a site that would not play ball with Big Tech's drive to encrypt the entire Web and police how it's encrypted. And don't worry, I'll explain the real reason why Big Tech wants the whole Web encrypted before this diatribe is out - and it's nothing to do with protecting you from the baddies.

Let's be real. The "privacy error" on Chromium is not a measured warning for people who may intend to hand over private info. It's clearly intended as a block. For all traffic. Why are we, as a collective public, accepting these oppressive, anti-competitive, totalitarian tactics, which are obviously mandated by the wealthy tech organisations who fund browsers? Why?…

Because we don't think we're paying, and we're brainwashed into believing that we have no right to criticise or complain about something that's “free”. Only by establishing data as a real currency will we begin to stand up against these abuses of power, on the basis that we're not getting what we pay for.

But to deal with things in order, I was trying to get an answer to a question, via a search engine. And courtesy of DuckDuckGo, and only DuckDuckGo, I found a lead. Unfortunately, it turns out that whilst the site was perfectly safe, it didn't answer the question. Right keywords, but wrong answer. At least DDG showed me what I asked it to show me.

So what now? The only solution is to do what I always do in these circumstances. Go back to Google, out-think the machine, and do an advanced search. If I enter the line: “of people use advanced search”, in quotes, I'm now forcing the search engine to look for that exact phrase. Because I've removed the specific variables, I should find an instance or two of someone saying those actual words. And that's very likely to tell me what percentage of people use advanced search.
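The quoting trick described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of how any search engine works internally: the helper names are invented, and the base URL and `q` parameter are assumptions about the usual shape of a search URL. The last function is a toy, local stand-in for what a phrase search asks for: a literal match, rather than a loose collection of keywords.

```python
from urllib.parse import urlencode


def exact_phrase(phrase: str) -> str:
    """Wrap a phrase in double quotes so it is searched as a single unit."""
    cleaned = phrase.strip().strip('"')
    return f'"{cleaned}"'


def search_url(base_url: str, query: str) -> str:
    """Build a search URL, passing the query in an assumed `q` parameter."""
    params = urlencode({"q": query})
    return f"{base_url}?{params}"


def contains_exact_phrase(text: str, phrase: str) -> bool:
    """Toy local check: does the text contain the phrase word for word?"""
    return phrase.lower() in text.lower()


# The specific variable (the percentage) is left out of the phrase on purpose.
phrase = "of people use advanced search"
query = exact_phrase(phrase)
url = search_url("https://www.google.com/search", query)
print(url)

# A page that answers the question matches; a marketing page does not.
print(contains_exact_phrase("Only 1% of people use advanced search.", phrase))
print(contains_exact_phrase("Advanced search engine marketing for people.", phrase))
```

The second sample sentence contains most of the same words, which is roughly why it can pass as a "best match" in an ordinary keyword search, yet it fails the word-for-word test.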

Google advanced search

It worked. The blog Geeking With Greg tells me in a 2006 post that only 1% of people use advanced search. Thanks Greg. What a shame the search engines were so resolute in trying to bury your answer to my question under a ton of totally irrelevant faff. But importantly, we now know that most search engines are serving duff results to 99% of the public.

You'd think, given that 99% is NEARLY EVERYONE WHO USES THE SERVICE, that they might try slightly harder to serve a matching result - especially since they are evidently capable of finding matches when they do try. But then, why bother when people will accept duff results, and obligingly read whatever self-serving drone-fodder the search engines feed them?

RESULT

The takeaway is that with a pretty straightforward question, most search engines showed me something that met three criteria.

  • It was not what I searched for.
  • It was similar enough to what I searched for to persuade me I was seeing a best match (even though in truth it was not a best match).
  • It was more beneficial to the search engine's interests than to my actual, expressed interests.

I could go back to the year 2001, download some grim, cowboy search toolbar, and describe its function in exactly the same way. And the major search engines really haven't always been this bad. They've progressively got worse, and will continue to get worse until we start to wake up. We are PAYING, with our data, to use these services. But we still consider them to be free, which makes us very lenient on them when we get poor value. Recognising data as a currency would help raise our expectations in line with the real costs we pay.

FEDIVERSE 'FUSCATION

One of the problems with the online world is that the biggest players can so easily prioritise their own brainwashing and drown out the truth. And this is a core reason why we've lost sight of what real privacy is. We think the Fediverse is a privacy solution. We do!

I'm not an advocate of the current “decentralisation” genre, because at the end of the day, unless every individual member of the public can run their own access means, it's no different from centralisation. I'm calling decentralisation a genre, by the way, because it's certainly not at the present time a separate concept from centralised tech.

Dividing up a large network into ten smaller components doesn't change the reality of what's happening. The average user is still giving data to someone they don't know, and the person who is receiving the data is still a central information-sponge and spy-log. It's the policy and design that matters. Not the number and size of the sponges in the ring.

Decentralised platforms can be just as oppressive as large, centralised platforms - sometimes more so. For example, diaspora* is a walled garden. No one describes it as such, because they've all been blinded by the decentralisation hype. But the fact is that if you want to explore a diaspora* pod, you have to sign up, and you have to log in. And you have to enable JavaScript and cookies. Whatever that is, it is not privacy. It's the opposite of privacy. If you were asked to come up with a basic conceptual design with a brief of awful privacy, that is what you'd come up with.

These things are not designed as privacy-first arrangements. They're designed to do exactly the same as centralised platforms do. Except they're less reliable, more limited in almost every way, and the components can shut down out of the blue.

And yet most of the diaspora* pods either pester for donations or require money up front. Exactly what you'd be paying for when you get vastly less than Twitter offers, and still no privacy whatsoever, I have no idea. But this is why the monetary value of data needs to be recognised. As a society, supported by law, we need to understand that real privacy is zero connection activity, and place a burden of debt on people who in any way at all require us to give up basic privacy freedoms.

The providers can negate that burden by offering us something of equal value. What they shouldn't be able to do is guilt us into making donations on the basis that we're supposedly getting something for free. If they have the wherewithal to spy on our every move, log what we look at, record the time we looked at it, etc, THE PRODUCT IS NOT FREE. If they want to call it free, they need to redesign the protocol so our privacy is as well-protected as it was in the days when we bought software from high street retailers, in cash, installed the product, and used it completely severed from the Internet. That's the benchmark for privacy.

Are we not noticing that even the supposed “privacy” options require us to be online and in the grip of a third-party data log as a condition of use? Are we really so stupid as to consider that privacy? EVERYTHING is online. No one makes offline-first utils anymore. My criterion for monetary exchange regarding tech products is that if I get to own it, I pay for it. If I'm just a lab rat in someone else's spyware regime, I don't. We should all see the tech world that way, and until we do, people are going to keep paying twice. Once in data, and once in cash.

"What it's really all about is making sure that Tech's key, high-paying customers and potential customers cannot bypass the current data gatekeepers and mine big data directly, for free. THAT's why your browser is supporting the authorised-HTTPS-only bully-fest."

WHAT IS OUR DATA WORTH?

Long gone are the days when the only value in data was its indication of which type of breakfast cereal we're more likely to buy, or whether we need gardening products, or how often we eat out. But tech companies love us to believe in this quaint vision of data being used purely to guide a shopping cart, and being worth next to nothing.

Your local shop is not paying much for data, and probably doesn't even want to manage any more data than a customer database. Your local Police or Government, on the other hand, can pay stupendous amounts of money for access to big data banks.

Data is a market, and the Government, Police et al are not only customers, but extremely high-paying customers. If the Government, Police and other high-value data-seekers gain direct access to your data, the tech industry loses a phenomenally large potential revenue stream.

And this is what Almighty Tech's heavy-handed quest to deprecate unencrypted connections is really about. It has nothing to do with protecting you from smalltime baddies. Tech doesn't give a shit if you get hacked, scammed or robbed. Facebook thought its incompetent transfer of more than half a billion private emails and phone numbers into the hands of criminals was so trivial that it didn't even warrant a notification. Google thought filming your house and putting it online for burglars to assess was just something that would be cool for a search engine to do. That's how much they care whether or not you get hacked, scammed or robbed.

What it's really all about is making sure that Tech's key, high-paying customers and potential customers cannot bypass the current data gatekeepers and mine big data directly, for free. THAT's why your browser is supporting the authorised-HTTPS-only bully-fest. That's why the Tech-funded EFF lobbies against Police and Government data access on a loop. It's not to protect your privacy. It's to protect a vast, multi-£billion data access market.

Think about it. How does reading published information from an unencrypted Web page threaten you or put you at risk? It doesn't. It can't. No more so than visiting a random, unrecognised site in the first place would. Obviously, if you're sending information through a form on an unencrypted page, that's different. But merely reading public information from an unencrypted page? Find me ONE example of someone ever falling victim to hackers because they read a page of Wikipedia in its pre-encryption days. I'll wait.

Data is so valuable and exploitable that the Big Tech Collective are going to the end of the Earth to stop their customers sourcing it at source. And to do that, they need a lockout mechanism, i.e. HTTPS.

How much do you think large data companies can make from the Police and Government for keeping an open data door when the law doesn't require it? In fact, individual Police forces have paid seven digits for access to data banks. Imagine the collective reward for the gatherers of big data that the Police find useful.

And data companies have also infiltrated the multi-£billion health business. Research, and particularly medical research, will pay breathtaking fees for the high-volume data it wants.

Most of the data “shared” with research facilities is cited as “aggregated” and/or “anonymised”, which seeks to persuade us of two things…

  • That the data can't be linked with the individual people it came from.
  • That the data is barely worth anything.

Both of these notions are false. Almost any company that already holds significant data can easily reverse engineer “anonymised” records to identify specific people. This is so much the case, that some bodies acquiring “anonymised” data even provide a route back to the data origin.

For example, The Boston Collaborative Drug Surveillance Program - a US buyer of UK medical data - cites a means to validate the data via the source doctor's surgery. Which means the source of the data is retained in records, and enough information remains present in each record for the doctor to categorically pinpoint the patient. That is very far from “aggregate” data. This is personal information masquerading as aggregate.
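To see why stripping the name from a record doesn't make it anonymous, consider this toy linkage sketch. Everything in it is invented for illustration: the records, the field names and the choice of postcode, birth year and sex as the leftover details. It isn't a description of any named organisation's methods. It only shows the general principle that a party already holding named data can match it against "anonymised" records that share the same details.

```python
# Toy "anonymised" medical records: names removed, other details kept.
anonymised_records = [
    {"postcode": "AB1 2CD", "birth_year": 1975, "sex": "F", "diagnosis": "asthma"},
    {"postcode": "AB1 2CD", "birth_year": 1982, "sex": "M", "diagnosis": "diabetes"},
    {"postcode": "XY9 8ZW", "birth_year": 1975, "sex": "F", "diagnosis": "migraine"},
]

# Toy data a company might already hold, with names attached.
known_people = [
    {"name": "Person A", "postcode": "AB1 2CD", "birth_year": 1975, "sex": "F"},
    {"name": "Person B", "postcode": "XY9 8ZW", "birth_year": 1975, "sex": "F"},
]

# Hypothetical set of leftover details used to link the two datasets.
QUASI_IDENTIFIERS = ("postcode", "birth_year", "sex")


def linkage_key(record: dict) -> tuple:
    """Reduce a record to the details shared by both datasets."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(anonymised: list, known: list) -> list:
    """Return (name, diagnosis) pairs where the shared details match one person."""
    names_by_key = {}
    for person in known:
        names_by_key.setdefault(linkage_key(person), []).append(person["name"])

    matches = []
    for record in anonymised:
        names = names_by_key.get(linkage_key(record), [])
        if len(names) == 1:  # only an unambiguous match counts
            matches.append((names[0], record["diagnosis"]))
    return matches


matches = reidentify(anonymised_records, known_people)
for name, diagnosis in matches:
    print(f"{name}: {diagnosis}")
```

Two of the three "anonymised" records are tied back to a named person, using nothing more than details the records were allowed to keep.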

And if you look carefully at online service providers' Terms of Service, you'll see that even the brands who lead on “We Do Not Sell Your Personal Data!” commonly sell what they describe as “aggregate”. We have to stop believing that these slippery terms have a real meaning. We have to stop believing companies are not selling our data when they are.

Every time we use the Internet, we're giving away something of tremendous value. And if you have never wondered why these companies are not giving us tools we can use on the desktop without their involvement, ask yourself that question. Ask it every day. Even the ones you think are not interested in your data somehow can't manage to operate without you pinging their data-logging system.

The compromise alone of being spied upon in this way entitles you to compensation. Even when the gatherer does nothing with that data beyond looking at it themselves. If they're selling access to the data, or exploiting it in other ways, they owe you more.

Obviously, if they're giving you back equivalent value, then cool, you're square. But all of us should be evaluating our use of the Internet in this way. Reminding ourselves that we are paying, and making sure we do get back good value.

We must remind ourselves that “telemetry” is a fancy word for spyware, used by people who don't want their products to be considered malware. Everything with telemetry is spyware.

We must remind ourselves that “session recording” is a fancy term for spyware.

“Data points”? Dossier notes. A company with 5,000 data points has 5,000 dossier notes on each person it profiles.

“Data log”? Dossier that preserves details of your every login, typically forever.

“Canvas fingerprinting”? A means of identifying you even when you've expressly disabled cookies and asserted that you don't want to be identified.

“Cloudflare”? A control-freak organisation seeking to own the Internet by infiltrating every Web connection in existence and then deciding who can access it. If you've used a communal access point or proxy and encountered the dreaded "One more step" screen, you've experienced Cloudflare's antics already.

The “third-party data-piggyback loophole”? A sneaky system in which a website can claim it harvests no data, then allow a third party to harvest data from the site via an embed, then acquire the data secondhand, from the third party. Google notably wants to shut this racket down and is updating Chrome in a bid to do so - but more for its own benefit than ours, I suspect.

“JavaScript enforcement”? Designing and developing a site so it won't work unless the visitor or user enables JavaScript. JavaScript is then used to power the spyware, so the user has no way to opt out of micro-monitoring.

“Unauthorised human A/B testing”? Monitoring users' decision-making likelihoods by placing half in one situation, and half in another - without telling them you're doing it.

“Unauthorised human psychological profiling”? Building psychological profiles on users - without telling them you're doing it.

“Unauthorised human intelligence testing”? Gauging the intelligence of users, without telling them you're doing it.

“Unauthorised human labour-tasking”? Duping users into performing valuable labour on the pretext of another purpose. For example, serving unnecessary CAPTCHAs to force human labour on the pretext that the person is “proving they're not a bot”.

Data and tech companies don't like us to think about any of this, because it reveals how poorly they compensate us with their increasingly annoying, controlling and deliberately restrictive or inept tools. But we must not forget that enormous value is being squeezed out of us, on a daily basis, in all of the above ways. And then turned directly into cash. It's already a transaction. Please stop imagining you can possibly owe any of these providers money, or that they're doing you a favour in not asking for financial payment. Or, indeed, that they're entitled to a free pass on criticism.

Data is a currency. There is no reason why we shouldn't regard data as an equivalent to the money it generates.