Archive for the ‘Technology’ Category

Public Databases and Massive Aggregation of Data

Tuesday, July 22nd, 2008

This is just a really quick thought that I wanted to toss out.

I perceive a problem associated with the digitization of public records: such digitization allows business interests to gather aggregate data on large collections of people while retaining identifiable characteristics. This allows for a phenomenal sorting potential. At the same time, we might ask, “is there anything we can, or really want to, do about this?”

Paradigm Shift
I hear this a lot - ‘Chris, you have to understand that things are different now. The paradigm is shifting towards transparency, and there’s nothing wrong with that, and you’re being a pain in the ass suggesting that there is anything wrong with transparency. Do you have something to hide, or something like that?’ This particular line bothers the hell out of me, because I shouldn’t have to expose myself without giving my consent, especially when I previously enjoyed a greater degree of privacy as a consequence of obscurity and/or the costs involved with copying, sorting, and analyzing analogue records. I fail to see why I have to give up past nascent rights and expectations just because we can mine data more effectively (hell, that would have been a meaningless statement around the time that I was born…). Efficiency is not the same as superior, better, or (necessarily) wanted.

Solution One: Creative Commons
I (generally) don’t mind people reading what I’ve written, or about various facets of my life. Were I in court for some reason, part of the justice system really does entail other people being able to read court records so that they can identify with the law as it was dispensed by and for the people (this is one of the areas where Hegel certainly puts an explanation of the legal system far more eloquently than Kant ever did, though both argue this point along dramatically different avenues). Perhaps some version of the Creative Commons could be developed so that designated uses can automatically search public databases, whereas other uses (such as corporate interests in some cases) would be restricted in the information they could collect per day or have access to in aggregate. Using a spider-like text file (along the lines of the robots.txt files that web crawlers consult), and legislating that businesses are required to abide by these files, might be one way of dealing with this.
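To make the "spider-like text file" idea a little more concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The policy text, the agent names (`court-research-bot`, `data-broker-bot`) and the paths are all invented for illustration; nothing like this is mandated for public databases today, and a real scheme would need the legal backing described above to mean anything.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical access policy for a public-records site. The agent names and
# paths are assumptions made up for this sketch.
POLICY = """\
User-agent: court-research-bot
Allow: /records/

User-agent: data-broker-bot
Request-rate: 100/86400
Disallow: /records/bulk/

User-agent: *
Disallow: /records/
"""

parser = RobotFileParser()
parser.parse(POLICY.splitlines())


def may_fetch(agent, path):
    """Return True if the policy lets this agent request this path."""
    return parser.can_fetch(agent, path)


# A designated use gets the records; the broker is barred from bulk access
# and is told it may make at most 100 requests per 86400 seconds (one day).
print(may_fetch("court-research-bot", "/records/case-123"))
print(may_fetch("data-broker-bot", "/records/bulk/all.csv"))
print(parser.request_rate("data-broker-bot"))
```

Note that a file like this only states the rules. It does nothing to stop a crawler that ignores it, which is why the legislative half of the suggestion matters.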

Solution Two: Limited Access Points
This won’t win me friends with advocates of ‘openness’, so get ready. Hell, I don’t know that *I* like this idea, and think that it sacrifices a bit much on the altar of the past. Be that as it may …

What if, to access public databases, you had to have an IP that located you within a particular geographic range? Say you had to be within 50 km of the hosting location/location you presume it should be hosted at to get full access (e.g. if you are accessing information that the Ontario government holds onto, you need to be within 50 km of the parliament, even though the databases might actually be housed in Yellowknife). Perhaps, instead of this location based access, documents should have to be manually saved somehow, with the method used for displaying and saving documents intentionally randomized to prevent mass-saving and aggregation. In essence, why not implement some kind of technology that either correlates geographic location with the ease or difficulty of accessing documents, or applies quasi-DRM solutions (that felt dirty to suggest…) to limit the easy aggregation of public records?
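The 50 km rule can be sketched in a few lines. This is only an illustration under some loud assumptions: the client's coordinates are taken as given (in practice they would come from an IP geolocation lookup, which is imprecise and easy to evade with a proxy), the coordinates below are approximate, and `has_full_access` is a hypothetical helper name.

```python
import math

EARTH_RADIUS_KM = 6371.0
MAX_DISTANCE_KM = 50.0  # the radius suggested in the post

# Approximate coordinates, for illustration only.
QUEENS_PARK = (43.662, -79.392)   # Ontario legislature, Toronto
MISSISSAUGA = (43.589, -79.644)
YELLOWKNIFE = (62.454, -114.372)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def has_full_access(client, anchor, limit_km=MAX_DISTANCE_KM):
    """True if the client's (assumed) location is within limit_km of the anchor."""
    return haversine_km(*client, *anchor) <= limit_km


print(has_full_access(MISSISSAUGA, QUEENS_PARK))
print(has_full_access(YELLOWKNIFE, QUEENS_PARK))
```

The anchor here is the legislature rather than the data centre, which matches the "location you presume it should be hosted at" reading of the proposal.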

Thoughts?

Mac Preview: Towards Breaching Bill C-61 (Copyright in Canada)

Wednesday, July 2nd, 2008

If you’re Canadian, and haven’t exiled yourself from society for the past several weeks, then you’ve heard about the Federal Conservative Party’s ‘dreaded’ Bill C-61, “An Act to amend the Copyright Act”. While a lot of people have been talking somewhat broadly about the issues of digital locks, and posing their own examples about how Canadians will be criminalized when they use media in sensible ways, I wanted to talk about how Mac Preview threatens to criminalize a lot of Mac users.

Mac Preview
I’ll start with a quick quotation of how Apple describes Preview:

If you’ve got PDFs to read, or images to view, Preview makes it easy. This built-in PDF file viewer allows you to view, work with, and print PDF files; view and edit images (including JPEG, TIFF, GIF, PICT, and other image file formats). (Source)

Preview is an awesome integrated part of OS X, and it makes my daily life a lot nicer - no longer is Adobe something that I have to put up with on a regular basis! Another great feature of Preview is the ability to print .PDF files that you already have opened. This might seem stupid to bring up, but it turns out that this feature is pretty important in the present computing environment that I find myself in.

Why Print a .PDF … to a .PDF?
There are great reasons to print a .PDF, and they range from a personal fear or hatred of the Earth’s pollen-bearing agents (such as trees), to wanting a physical copy of a document to make notes on, and even using the print function to create another .PDF of the .PDF you have opened. You might be wondering whether you really just read that you might want to print a .PDF file to a .PDF file - you did, and I really do mean it.

There are some .PDF files that are laced with Digital Rights Management (DRM) technology. This technology prevents you from manipulating the content in any fashion that isn’t pre-approved by the content’s creator. Inserting DRM on a file is oftentimes done to avoid legal issues, but more often than not it is set into a file so that users can only use content in a particular way, as identified by the content creator. While it might make sense to stop someone from making changes to a contract that has gone through a lengthy process with a lawyer, it makes less sense in other cases, such as publicly available documents and (in the more widely known case) purchased music files.

Let’s take Anagran’s white paper “Eliminating Network Congestion Anywhere with Fast Flow Technology from Anagran” as an example - this file (which you can only download after providing Anagran with a load of personal information) is coded so that you can’t make modifications to the file. This might not sound so bad (who really reads white papers, you might say), but if you want to keep notes in a digital format, and attached to Anagran’s .PDF, then by default Preview won’t let you save the document with your changes. The DRM in this .PDF actively prevents the user from saving the .PDF if any modifications or additions have been made to the file. This is a problem if you don’t want to quickly develop a growing pile of printed white papers, where they were printed for the sole purpose of making notes to the document. You’ll note that there isn’t a technology that prevents me from writing on the paper - DRM is special in that it actually takes away your right to use something, when in the thing’s previous technological format nothing prevented you from freely manipulating the content in a wide variety of ways.

Evading .PDF DRM in Preview
Say that you had downloaded Anagran’s aforementioned whitepaper, had made notes throughout the document, and only then discovered that the .PDF didn’t allow you to save the document if any modifications were made to it. You could just give up and print the document off… or you could do something particularly simple and effective that would evade and ultimately break the digital lock on the document.

After making the notes to the .PDF, you could do the following:

  1. Open the print menu in Preview
  2. Click on the PDF button in the print menu
  3. From the drop down menu, click Save as PDF and save the file to the location you desire

Congratulations! If you just followed the steps above, you have just bypassed/broken a digital lock. If you performed this operation after C-61 were made into law, you would have broken the law by writing on a .PDF and saving it.

A Sensible Copyright Bill?
It’s not unreasonable for me to want to make comments on a document for personal use - I do it all the time, when I mark up a newspaper, write in the margins of a book, or scribble directions on the back of a napkin. These mediums’ digital counterparts, however, might make it impossible to make those changes depending on whether or not the content creators use DRM to lock down their communication mediums. Does a bill that would make using digital media as we do analogue media illegal sound like a sensible copyright reform bill to you? I certainly don’t think so, and I hope that you don’t either. Contact your MP and demand that they take up the task of remedying the clear deficiencies in Bill C-61 as it has been presented in parliament.

I See Your DPI and Raise You a SSL

Sunday, June 29th, 2008

A little while ago I was talking about network neutrality and Deep Packet Inspection (DPI) technologies with a person interested in the issue (shocking, I know), and one of the comments that I made went something like this: given the inability of DPI technologies to effectively crack encrypted payloads, it’s only a matter of time until websites start to move towards secure transactions - in other words, it’s only a matter of time until accessing websites will involve sending encrypted data between client computers and servers.
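A toy illustration of why encrypted payloads frustrate this kind of inspection: a signature matcher that recognises plaintext traffic has nothing to match once the same bytes are encrypted. Everything here is an assumption made for the sketch. The signature table is invented, and `toy_encrypt` is a deliberately simple XOR keystream standing in for real SSL/TLS; it is not secure and not how SSL actually works.

```python
import hashlib

# Hypothetical payload signatures a DPI box might look for.
SIGNATURES = {
    b"BitTorrent protocol": "bittorrent",
    b"HTTP/1.1": "http",
}


def classify(payload):
    """Label a payload by the first known signature found inside it."""
    for signature, label in SIGNATURES.items():
        if signature in payload:
            return label
    return "unknown"


def toy_encrypt(key, data):
    """XOR data with a SHA-256 based keystream. Illustration only, NOT secure.

    Applying the function twice with the same key returns the original data.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


PLAINTEXT = b"GET /index.html HTTP/1.1\r\nHost: example.org\r\n\r\n"
KEY = b"shared secret between client and server"
CIPHERTEXT = toy_encrypt(KEY, PLAINTEXT)

print(classify(PLAINTEXT))   # the inspector recognises the application
print(classify(CIPHERTEXT))  # the same request, encrypted, matches nothing
```

An inspector can still see that encrypted traffic exists, where it is going, and how much of it there is. What it loses is the content.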

The Pirate Bay and Beyond
Recently, Sweden passed a bill that allows for the wiretapping of electronic communications without a court order. This caused The Pirate Bay, a well-known BitTorrent index site, to announce that it was adding SSL encryption to its website as well as VPN solutions for native Swedes who wanted to avoid the possibility of having their network traffic surveilled. Recently, isohunt.com has done the same, and other major torrent sites are expected to follow their lead. The groups who are running these websites are technically savvy, allowing them to implement encrypted access rapidly and with little technical difficulty, but as more and more sites move to SSL there will be an increasing demand amongst tech-savvy users that their favorite sites similarly protect them from various corporate and government oversight methods.

The Open Web: Closing for Repair
John Gilmore’s famous line, “The Internet interprets censorship as damage and routes around it” seems to be a little less true now than it was when he proclaimed it. Rather than ‘routing around’ damage brought on by censorship/surveillance that is enabled by DPI technologies, packets charge right through the offending hardware having hardened their skins to avoid the penetrating gaze of their surveyors. The open web of the past, where most application traffic was available for inspection, where you could identify it at a glance, is gradually being abandoned and replaced with a web of fear, where individuals slowly move towards securing even their routine content.

I take Gilmore’s quote as an optimistic expression of what would happen on the open web - when a particular brand/node of the ‘net was found to be censoring groups, that particular node would be cut out of available routing addresses and packets would carry along the network with few concerns. As we pass from the open web to the web of fear, entering an electronic environment where an increasing number of the primary routing hosts are inspecting traffic and preventing/hindering packets from traversing the globe, a cautionary mindset that accords with the ‘security state’ sets in; while the security state sees citizens abandon/lose core rights and freedoms in the name of national and personal security without significant concerns, that same culture of security may allow for the easy adoption of encrypted data traffic on the basis of it maximally securing personal (though potentially not state) security. It will be interesting to see how these two modes of approaching security develop and play out against one another.

Transparency and *My* Click-Stream

Wednesday, June 25th, 2008

I get strange looks from some of my friends and colleagues sometimes. On the one hand, I strongly advance the idea that people’s privacy should be protected, by default, and at the same time I blog, use social networking sites (though somewhat uncomfortably), own a cell phone, use credit cards, etc. This week I’ve ‘stepped things up’ by syndicating my del.icio.us bookmarks with my blog - you’ll now be treated (or spammed, I guess, depending on how you see things) with the articles that I’ve tagged in the past 24 hours that I think are interesting.

SPAM Ahoy!
I’ll start by stating this: I don’t think that the links you’ll be seeing are Spam. I think that I’m tagging good, solid, helpful links for people that might be interested in surveillance, privacy, and (typically) how either of those topics intersects with technology in some fashion. You’ll note that, for the next little while at least, you’ll see links to articles on Deep Packet Inspection (DPI) and behavioral advertising. I expect some WiMAX stuff as well. There are a couple reasons why I’m syndicating this kind of content:

  1. I think that it’s important, and posting links here increases the chances of people reading about these topics. More attention needs to be given to them.
  2. I don’t get a chance to blog as often as I’d like, but I’m always finding stuff that other people might find interesting. While my short descriptors of links lack the comprehensiveness of a blog post, they’re enough that they might point people interested in these topics to articles they will find useful.
  3. It will potentially increase my own page rank in Google’s analytics, increasing the chance that people can find this webspace by searching for privacy- and surveillance-related issues. (This is a fairly selfish reason, I admit, but since it’s my space, and attached to my name, selfish seems OK here *grin*)

Stop It, Stop It!
If all of the del.icio.us links are REALLY annoying you, let me know. Alternately, if there are particular links that you find interesting/want to know more about, let me know - there is a decent chance that I will have something more to say on the topic of any of the links beyond the 250 character limit that del.icio.us holds me to, and I may have links around that I haven’t tagged yet.

So…We Finally Learn Who Christopher Parsons Is
You might be thinking to yourself, “this seems particularly transparent for Chris. Given his focus on privacy, and that he at least claims to like personal privacy, what the hell is he doing releasing some information on his own click-stream?” This is a good question, but I’m not actually releasing the majority of what I’m searching - things that I don’t think are appropriate for the theme of this blog, or that are personal enough in nature that I don’t feel comfortable discussing them (here), won’t be added to the blog. Thus, I’m distinguishing between what I want you to see, and what I want to keep between myself, behavioral advertising groups, and my search engines.

This concludes today’s explanation of the click stream. Back to Google!

DPI, Employees, and Proper Inspection

Monday, June 23rd, 2008

In my last post I alluded to the fact that Deep Packet Inspection (DPI) technologies could be used by businesses to try and reduce the possibility of ‘inappropriate’ employee use of bandwidth and wrongful or accidental transmissions of confidential IP. In that last post I was talking about IT security, and this post will continue to reflect on DPI technologies’ applications and benefits to and for corporate environments.

A Quick Refresher on DPI
From ArsTechnica:

The “deep” in deep packet inspection refers to the fact that these boxes don’t simply look at the header information as packets pass through them. Rather, they move beyond the IP and TCP header information to look at the payload of the packet. The goal is to identify the applications being used on the network, but some of these devices can go much further; those from a company like Narus, for instance, can look inside all traffic from a specific IP address, pick out the HTTP traffic, then drill even further down to capture only traffic headed to and from Gmail, and can even reassemble e-mails as they are typed out by the user. (Source)

For a slightly longer discussion/description of DPI I suggest that you look at the wiki page that I’m gradually putting together on the topic of Deep Packet Inspection.
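The difference between looking at headers and looking at payloads can be shown with a small sketch. It is a simplification under stated assumptions: `Packet` is a made-up structure rather than a real packet parser, the addresses come from the documentation ranges, `mail.example.com` stands in for a webmail host, and real DPI equipment reassembles streams rather than reading single packets.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Packet:
    src_ip: str
    dst_port: int
    payload: bytes


def shallow_inspect(pkt: Packet) -> dict:
    """Header-only view: addressing information, nothing about content."""
    return {"src_ip": pkt.src_ip, "dst_port": pkt.dst_port}


def http_host(pkt: Packet) -> Optional[str]:
    """Deep step: read the Host header out of a plaintext HTTP payload."""
    if not pkt.payload.startswith((b"GET ", b"POST ")):
        return None
    for line in pkt.payload.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"host":
            return value.strip().decode("ascii", "replace")
    return None


def drill_down(packets: List[Packet], watched_ip: str, host_suffix: str) -> List[Packet]:
    """All traffic from one IP -> only HTTP -> only requests to one host."""
    hits = []
    for pkt in packets:
        if pkt.src_ip != watched_ip:
            continue
        host = http_host(pkt)
        if host is not None and host.endswith(host_suffix):
            hits.append(pkt)
    return hits


packets = [
    Packet("192.0.2.10", 80, b"GET /inbox HTTP/1.1\r\nHost: mail.example.com\r\n\r\n"),
    Packet("192.0.2.10", 80, b"GET /news HTTP/1.1\r\nHost: www.example.org\r\n\r\n"),
    Packet("192.0.2.10", 6881, b"\x13BitTorrent protocol"),
    Packet("192.0.2.99", 80, b"GET /inbox HTTP/1.1\r\nHost: mail.example.com\r\n\r\n"),
]

print(shallow_inspect(packets[0]))
print(len(drill_down(packets, "192.0.2.10", "mail.example.com")))
```

The first two packets look identical to `shallow_inspect` (same source, same port); only reading the payload tells the webmail request apart from the news request.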

Employers and Data Breaches
We often hear about the loss of personal information in the news these days - it seems that almost every day another few tens of thousands of records are lost, often because a database was poorly secured, or because a laptop was lost or stolen. What isn’t covered in the news as often as it once was, is that breaches of confidential information also are (still) caused by email that is sent by employees with access to that confidential information. Indeed, a recent (and somewhat sensationalized) article by SmartCompany.com outlines that 40% of the companies that they surveyed are already watching their employees’ email. (It should be noted that, to date, I haven’t found the raw data that these statistics are based on, so take them with a grain of salt!) Quoted below are the reasons why:

  • 40.6% say it is to ensure they are doing their job properly.
  • 47% say they are worried about too much personal use of email.
  • 40.6% say they only do it if they have a problem with a staff member (such as bullying or stealing). (Source)

In addition, the article notes that “[e]mployers are very concerned that IP, customers and other information might be stolen and either passed on to competitors or used to set up other businesses in competition.”

Scaling and Cost-Effectiveness
One of the issues with having people actually read email before it is delivered past a corporate network’s perimeter is that people cost money. A lot of it. In addition to this financial disincentive to monitor email (though it is only a disincentive when the costs of reading email exceed those of preventing IP and data breaches that would cost the corporation money), people get incredibly antsy when they find out that their email is being read. In particular, they become increasingly guarded against their corporation - why, if they (the employee) slave for the corporation in good faith, should the corporation be hiring people to double-check employee loyalty? As your workforce increasingly feels monitored and untrusted, it reflects this lack of trust towards the corporation and sheds the devotion to the corporate brand (and potentially principles) that are so helpful in raising morale.

In addition to these problems, as your corporation expands it gets increasingly expensive to monitor the email sent from your company. What if there was a way of easily scaling your monitoring system, easily monitoring your employees, and ‘tricking’ them into believing that you trust them and simply run routine operations on all email?

Monetizing ISP-Level DPI
I won’t lie: I don’t particularly like DPI technology. It strikes me as a sneaky way of spying on your users. Moreover, I don’t particularly like the idea that I’m about to suggest, but think that it’s interesting enough that others might be able to run with it in helpful ways for their own work.

As it stands, ISPs use DPI to look at the payload of packets - this lets them evaluate what is inside packets and prioritize traffic as per their traffic shaping rules. Now, when you send an email from your corporate email account it moves from your corporate email server (assuming that you haven’t outsourced your email to a third party, such as Yahoo!, Microsoft, or Google) to your ISP’s network, to the Internet at large. When you send email from your corporate account, right now, it passes through the ISP’s DPI system.

What if a corporation could invest/pay their ISP some money, and have SMTP (email) traffic that leaves corporate servers be inspected by applying corporate-inspired heuristics? This would let the corporation automate their surveillance of email, and have ‘flagged’ email brought to the attention of system administrators before the mail could be passed forward. Moreover, depending on the legality, the corporation could have all email, including personal web email, scanned using their ISP’s DPI technology, letting them identify any and all possibilities of data breaches.

This holds a series of benefits for corporations:

  1. Enterprise-level heuristic analysis, retention, and flagging;
  2. (Presumably) easily updatable heuristics, allowing for improved surveillance as time passes;
  3. Impersonal, insofar as a computer rather than a person is responsible for email screening;
  4. Better allocation of resources - a smaller number of people will have to be retained to analyze email, letting you hire IP creators, rather than IP defenders

At the same time, there are some downsides:

  1. Employees may not share the corporate mantra that ‘impersonal’ scanning is less intrusive than ‘personalized’ scanning;
  2. It will take time to weed out heuristics that persistently result in false positives;
  3. If employees learn to bypass in-place heuristics, then the ‘stop before sending’ aspect of this system may fail;
  4. ISPs must develop a corporate-cost model;
  5. Corporate heuristics would have to, presumably, remain secret (i.e. codenames, upcoming trademarks and IP could not be accessible/known by the ISP network admins without them signing confidentiality agreements)
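The ‘flag before forwarding’ idea described above can be sketched with the standard library alone. The heuristics, the codename "Project Bluebird" and the function names are all hypothetical, and the sketch only handles simple, non-multipart messages; an ISP-side system would have to reassemble SMTP sessions from packets before anything like this could run.

```python
import re
from email import message_from_string

# Hypothetical corporate heuristics. "Project Bluebird" is an invented codename.
HEURISTICS = {
    "codename": re.compile(r"\bproject\s+bluebird\b", re.IGNORECASE),
    "confidential-marker": re.compile(r"\bconfidential\b", re.IGNORECASE),
}


def flag_message(raw):
    """Return the sorted names of every heuristic that the message trips."""
    msg = message_from_string(raw)
    # Simple messages only: for multipart mail get_payload() returns a list.
    body = "" if msg.is_multipart() else msg.get_payload()
    text = f"{msg.get('Subject', '')}\n{body}"
    return sorted(name for name, pattern in HEURISTICS.items() if pattern.search(text))


def route(raw):
    """Hold flagged mail for an administrator; pass everything else along."""
    return "hold-for-review" if flag_message(raw) else "deliver"


CLEAN = "From: a@example.com\nTo: b@example.org\nSubject: Lunch\n\nSee you at noon.\n"
LEAK = ("From: a@example.com\nTo: rival@example.net\nSubject: FYI\n\n"
        "Attached are the confidential specs for Project Bluebird.\n")

print(route(CLEAN))
print(route(LEAK), flag_message(LEAK))
```

Even this toy version shows two of the downsides listed above: the word "confidential" in a harmless email signature would trip a false positive, and an employee who knows the codename list can simply avoid the words on it.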

In a forthcoming post I’ll talk briefly (again) about why I think that this mode of sorting is a questionable practice, but given the present legal attitudes surrounding email it seems like corporations should, in some jurisdictions, be permitted to filter email in this fashion without falling prey to legal concerns related to inspecting employee email. This, of course, is a somewhat scary and censoring use of DPI technologies - it acts as a nice way of filtering out conversation that once took place around the watercooler as people become increasingly mindful of what they are saying. Given that a substantial amount of personal development almost of necessity has to happen at work, given the periods of time that are spent there, DPI applied to corporate email threatens to totally remove the ‘private-personal’ from the ‘private-workplace’ environment by potentially publicizing ‘private-personal’ interactions and disciplining those who engage in such activity in their workplaces.
