Tuesday, May 28, 2013

Privacy and Browsing: Does Google Know You Too Well?


Recently a colleague asked if I had any recommendations for maintaining some semblance of privacy when online. His specific concerns were web browsing, search, and email. In each of these cases, one or two well-known names have a reputation of knowing their users a little too well. How often do you see advertisements that seem to read your mind? Have you ever researched or purchased a product, only to see lots of advertisements for a related product or accessory?

Google is the natural scapegoat, given its well-known penchant for collecting and using data. So much so, in fact, that The Onion (beloved for its tongue-in-cheek satire at the expense of anyone and everyone) once did a “news report” on a “Google Opt Out” feature whereby users could escape Google’s eye by moving to an approved remote village. Google is not the only offender, though. Microsoft (the owner of Skype) is now getting considerable flak for eavesdropping on Skype IM conversations to intercept URLs and login credentials.

Similarly, many sites have begun using “Facebook Connect” to manage logins. While this vastly simplifies a web application developer’s job, it also means Facebook knows what you do and post on these sites. Notice the comments box that looks suspiciously like a mini Facebook on so many pages? So-called “federated authentication” significantly blurs the lines between various sites.

The challenge is that if you are not paying for a service, you are not the customer. You (or more precisely, your behavior and your data) are the product being sold. Google, Microsoft, Facebook, Yahoo, etc. have all the incentive in the world to know as much as they can about you and your online habits, because it translates into dollars for them. The better they can fine-tune their advertising audiences, and the better they can correlate data with future actions, the more likely you are to act upon an advertisement, and thus the more they can charge their actual customers for that data.

I encountered a slightly shocking (until I thought about it and could clearly see the flow) example when I started using an Android smartphone. I use a variety of web browsers; when I use Chrome, I don’t log into Google because I have no desire to have my browsing patterns associated with an account (I know I can still be identified, but let’s at least not make it too easy). Imagine my surprise when I looked up a restaurant on my laptop, and subsequently my phone prompted me with driving directions to said location! Evidently since I had previously logged into a Gmail account on my laptop, Chrome quietly kept me logged on, and shared my browsing history with my Google-based Android phone. Convenient? Heck yeah. But also a stark reminder of how much Google knows about me.

How much of a concern is this, really? Well, it depends. It depends on what sort of activity you engage in online. It depends on how much you value personal privacy. It depends on your own personal level of paranoia. Below are a few ideas of varying degrees of complexity. Take one, or take them all – or ignore them all – I won’t be offended :-)

Some of these suggestions stray deeply into the realms of paranoia, some are rather complicated, and all come with trade-offs. These are NOT “recommended basic steps” but rather possible ways to deal with specific concerns.

Scenario 1: You don’t want your email provider to read your mail.
Freemail services will scan your email for keywords, and display advertisements that match their interpretation of your interests. A solution is to pay for email. Businesses don’t provide a service out of the goodness of their heart – they are in business to make money. As long as you are using a free email service (Gmail, Yahoo, Hotmail, etc.), if you aren’t paying, someone else is – and what they are paying for is you. I can’t vouch for any individual email service (read the terms of service closely), but your ISP will usually provide an email address, and since you pay for it, you get to be the customer. Depending on your affiliations, you may also be able to get email through a university, a professional association, or your employer.

The trade-off is portability. If your email is through affiliation with an organization and that affiliation changes (by moving to a home not serviced by the same ISP, or by graduating from a university that does not provide alumni email, or ending membership in an association), then your email address may also have to change.

Scenario 2: You worry about prying eyes while email is in transit
By default, email is not a secure channel. Much like a traditional postcard could be read by the mail carrier, an employee in the sort office, or even a snooping neighbor, standard email can be read on any network it passes through (the Internet is after all the interconnection of millions of individual networks). A solution is to use encrypted email (such as PGP). By using an encrypted mail product, the content of your email cannot be read by anyone except the recipient (or whoever has the recipient’s encryption key, but that’s another story).

Email encryption works through a concept known as “asymmetric encryption.” In simple terms, we use a well-defined mathematical algorithm in which each person has two “keys” – a public key and a matching private key. The public key allows encrypting a message in such a way that it can only be decrypted using the matching private key. The public key can be freely shared (it is only useful for encrypting the message – it cannot be used to reverse the process); only the holder of the private key can read the message. See the self-proclaimed “Best Free Ways to Send Encrypted Email and Secure Messages” for some options (note that some of the products work through the clunky method of encrypting text separately and then attaching it to the email, whereas others actually integrate with your mail client and encrypt the message itself).
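For the curious, here is a minimal sketch of the public/private key idea in Python, using textbook RSA with tiny numbers. This is purely an illustration of the concept: the primes, the exponent, and the function names are my own choices for the example, real products such as PGP use enormous keys plus additional safeguards, and nothing this small is remotely secure.

```python
# Toy RSA with tiny textbook primes: an illustration only, NOT secure,
# and NOT how PGP or any real mail product is implemented.
p, q = 61, 53                 # two (far too small) secret primes
n = p * q                     # the modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                        # public exponent, chosen for the example
d = pow(e, -1, phi)           # matching private exponent (needs Python 3.8+)

public_key = (e, n)           # safe to hand to anyone
private_key = (d, n)          # must never leave the owner's hands


def encrypt(message: int, key: tuple) -> int:
    """Scramble a number (smaller than n) with the recipient's public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, key: tuple) -> int:
    """Recover the original number with the matching private key."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


secret = 65
scrambled = encrypt(secret, public_key)
print(scrambled, decrypt(scrambled, private_key))
```

Anyone can run encrypt with the public key, but only the holder of the private exponent can undo it. With numbers this small an attacker could simply factor n, which is why real keys are hundreds of digits long.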

The downside of encrypted email is that it only works when both parties use encryption. You can only send encrypted mail to someone who has set up encryption on their side, and you have no way of requiring others to encrypt messages they send to you.

Scenario 3: You want full control over your email
For about $10 a year, you can register a domain name (such as myname.net), then set up a free mail server on a computer in your home or office (sendmail is freely available for Linux, and Linux will run on very lightweight systems – you may even have a network hard drive with an embedded Linux operating system already). While this doesn’t do anything about mail as it crosses the Internet, once it is on your server it is away from the prying eyes of the freemail services.

Understand though that when you run your own email server, you become your own technical support. In exchange for the added privacy, you take on the responsibility of maintaining your mail server. Read this very involved Linux Mail Server guide for some insight into what this might involve.

Scenario 4: You worry that websites are tracking you
Web sites use several means of customizing content for each user, the most common of which are called “cookies.” Cookies are little bits of data that might record your username (used by the “remember me” feature on many sites), or your location (so your weather forecast site can always give you your local forecast), or your preferences (so a shoe store might pre-populate an order form with your shipping address). Think of the barista at your favorite coffee shop who knows what you like to order, and prepares it as soon as you walk in the shop before you even have to ask. In theory, a cookie can only be read by the site that created it, but in practice advertising firms embed their content in hundreds or thousands of sites, so the same advertiser’s cookie follows you across all of them. A way around this is to disable cookies in your browser.

Understand that some sites will not work properly, and some features will not work. Cookies enable customization, and without that customization, you lose the ability to save preferences. In the coffee shop example, it is as if the barista is fired every evening and a new person is behind the counter every morning. Some shopping cart apps rely on cookies to keep track of the items you select before you check out – you may not be able to purchase things online with cookies disabled.
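To make the cookie mechanism a little more concrete, here is a small sketch using Python’s standard http.cookies module. It shows roughly what a site sends to ask your browser to remember something, and how the site reads it back on your next visit. The cookie names and values (“city”, “theme”) and the two helper functions are made up for the example.

```python
from http.cookies import SimpleCookie


def make_set_cookie(name: str, value: str, max_age_seconds: int) -> str:
    """Build the value of a Set-Cookie header, as a web site would send it."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["max-age"] = max_age_seconds   # how long the browser keeps it
    cookie[name]["path"] = "/"                  # send it back for every page
    return cookie[name].OutputString()


def parse_cookie_header(header: str) -> dict:
    """Read the Cookie header a browser sends back on a later visit."""
    cookie = SimpleCookie()
    cookie.load(header)
    return {key: morsel.value for key, morsel in cookie.items()}


# The site asks the browser to remember a (hypothetical) city for one hour.
print(make_set_cookie("city", "Denver", 3600))

# On the next request the browser volunteers everything it stored for that site.
print(parse_cookie_header("city=Denver; theme=dark"))
```

Disabling cookies amounts to the browser ignoring the first header and never sending the second, which is exactly why “remember me” and shopping carts stop working.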

Another option is to make use of the “private browsing” feature offered by modern browsers. Chrome calls it “Incognito,” Internet Explorer calls it “InPrivate,” and Firefox calls it a Private Window. Private browsing is not as draconian as disabling cookies altogether, but it prevents cookies from being saved between sessions. You can launch a web site and it will work normally, but when you close the browser, everything in that session is wiped clean. Keep in mind that this does nothing for cookies already stored; it simply ensures nothing from that session is saved. Continuing the coffee shop theme, it is as if the barista suffered from short-term amnesia a la “50 First Dates” and remembers everything from the past, but cannot remember any new information.

Scenario 5: Your search engine knows a little too much about you
Most search engines such as Google pass along your search query terms to the web site when you click through, and store information about what you have searched for in the past. They do that so they can improve “predictive search” algorithms. Start typing “ski...” in the search box and, depending on what you have searched for in the past, Google may offer links to the rock band “Skillet” instead of links to the nearest ski resort or water ski supplier. One way to avoid this data collection is to use an alternate search engine such as DuckDuckGo or Ixquick that does not collect personal information. The tradeoff is that they may not have the same breadth of search index as the “big guys,” and sometimes predictive search is quite helpful.
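As a rough illustration of how search terms can reach the site you click through to: the browser tells the destination which page you came from (the “Referer” header), and if that address contains your query, the site can simply read it. The sketch below assumes a hypothetical search engine at search.example that puts the query in a parameter named “q”; actual engines vary, and some deliberately strip this information.

```python
from urllib.parse import urlparse, parse_qs


def search_terms_from_referrer(referrer_url: str):
    """Pull the search query out of a referring URL, if one is present.

    Assumes the query lives in a parameter named "q" (an assumption for
    this example; not every search engine uses that name).
    """
    query = parse_qs(urlparse(referrer_url).query)
    return query.get("q", [None])[0]


# What a site might see after a click-through from the hypothetical engine.
print(search_terms_from_referrer("https://search.example/search?q=water+ski+supplier"))

# A referrer with no query string reveals nothing about what you searched for.
print(search_terms_from_referrer("https://example.com/page"))
```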

Scenario 6: You don’t want your physical location known
Just as in real life, on the Internet you have an address. When you browse to a web site, that web site needs to know where to send the page you ask for – it needs to know the address of your computer. That address is associated with your ISP, which in turn can associate it with you individually (or at least with your household or business). Try this: go to http://whatismyipaddress.com/. Depending on your ISP’s configuration, you will likely get an accurate response as to what city you are in.
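Seeing your address takes no special effort on the web site’s part. Here is a minimal sketch of a web application in Python (using the standard WSGI convention) that simply reports back the address each visitor connected from. The function name and the wording of the reply are my own; the point is only that the address arrives with every request.

```python
def show_address_app(environ, start_response):
    """A tiny WSGI web app that tells each visitor the address it sees."""
    # The web server fills in REMOTE_ADDR for every incoming connection.
    visitor_ip = environ.get("REMOTE_ADDR", "unknown")
    body = f"You connected from {visitor_ip}\n".encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

To try it on your own machine, hand the function to make_server from Python’s wsgiref.simple_server module and browse to the port you chose. From there, turning an address into a city is just a lookup in a geolocation database, which is what sites like the one above do.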

If you need to hide your identity and location, two options are a VPN service or a TOR service. In both cases, you reroute your browser traffic so it appears to come from another location. A VPN (often used as a way to access your employer’s business network from offsite) is a virtual private network – essentially a tunnel from your location to another network, such as your business network; from there, anything you do appears to originate from that network. TOR, or “The Onion Router” (completely unrelated to The Onion News), is a network of relays run by volunteers worldwide. Your connection is routed through a series of randomly chosen TOR relays before being handed off to the final destination; when properly used, the destination only knows the “exit point” and not your true location.
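The “onion” in the name refers to layers, and a toy model may help. In the sketch below, a message is wrapped in nested envelopes, one per relay, and each relay opens only its own envelope to learn where to pass the rest. The relay names, the destination, and the functions are all invented for the example, and there is no encryption here at all; in the real network each layer is encrypted so a relay cannot peek inside the inner envelopes.

```python
def build_onion(message: str, destination: str, relays: list) -> dict:
    """Wrap a message in nested envelopes, one per hop (toy model, no crypto)."""
    # Innermost envelope: what the exit relay hands to the destination.
    packet = {"deliver_to": destination, "payload": message}
    # Wrap outward, so the first relay's envelope ends up on the outside.
    for relay in reversed(relays[1:]):
        packet = {"deliver_to": relay, "payload": packet}
    return packet


def route(sender: str, first_relay: str, packet: dict):
    """Pass the packet along, recording what each hop can observe."""
    hops = []
    previous, current = sender, first_relay
    while isinstance(packet, dict):
        # Each relay sees only who handed it the packet and where it goes next.
        hops.append((current, previous, packet["deliver_to"]))
        previous, current = current, packet["deliver_to"]
        packet = packet["payload"]
    # At the destination: (who I am, who delivered it, the message itself).
    return hops, (current, previous, packet)


relays = ["relay-A", "relay-B", "relay-C"]   # hypothetical relay names
onion = build_onion("hello", "example.org", relays)
hops, arrival = route("my-laptop", relays[0], onion)
for hop in hops:
    print(hop)
print(arrival)
```

Only the first relay ever sees “my-laptop”, and the destination sees only “relay-C”, the exit point. That separation is the whole idea; the real system adds the encryption that makes it trustworthy.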