I would like to thank the committee for the invitation today to discuss the privacy implications of online platforms and appropriate legislative responses to the concerns of citizens about how their personal information is being used.
As you are aware, I received a complaint about this matter and announced some weeks ago that my office is conducting a formal investigation into how the personal information of Canadians has been affected by the activities of Facebook and AggregateIQ.
Due to my confidentiality obligations under the law, I'm not in a position to discuss the details of this investigation with you today. I cannot prejudge our findings.
What I can share with you, however, is some perspective on the wider context that may assist you as you begin your study.
Canadians want to enjoy the many benefits of the digital economy, but they rightly expect they can do so without fear that their rights will be violated and their personal information will be used against them. They want to trust that rules, legislation, and government will protect them from harm.
In the recent Facebook matter, what happened, as acknowledged by CEO Mark Zuckerberg, was, quote, a “major breach of trust”. As recognized by the CEO of another giant tech company, Tim Cook of Apple, the situation is so dire that it is now time to develop well-crafted legislation to regulate the digital economy. The time of self-regulation is over.
In Canada, we of course have privacy legislation, but it is quite permissive and gives companies wide latitude to use personal information for their own benefit. Under PIPEDA, organizations have a legal obligation to be accountable, but Canadians cannot rely exclusively on companies to manage their information responsibly. Transparency and accountability are necessary, but they are not sufficient.
To be clear, it is not enough to simply ask companies to live up to their responsibilities. Canadians need stronger privacy laws that will protect them when organizations fail to do so. This was a major conclusion of my annual report to Parliament last year, and a point I made during your recent study of PIPEDA, Canada's private sector privacy law.
Significantly, given the opaqueness of business models and complexity of data flows, the law should allow my office to go into an organization to independently confirm that the principles in our privacy laws are being respected—without necessarily suspecting a violation of the law.
The time has also come to provide my office with the power to make orders and issue financial penalties, helping us to more effectively deal with those who refuse to comply with the law.
Strengthened legislation does not need to be an impediment to innovation. We know that personal information plays a key role in the digital economy, including advances in the field of artificial intelligence, which are necessary for Canada's social and economic development. We need legislation that ensures, as a general rule, that Canadians provide meaningful, informed consent for the collection and use of their personal information. But consent will not always be possible in the world of big data and artificial intelligence, where personal information may be used for multiple purposes not always known when it is collected.
This is why we recommended that Parliament examine exceptions to consent. We believe such exceptions, subject to conditions that would offer other forms of privacy protection, are preferable to relying on an interpretation of consent that is so broad as to become meaningless. We prefer narrower, specific exceptions, but we recognize that one option could be a European-style legitimate interest exception.
I'm of course very pleased that your committee recently issued a report calling for comprehensive changes to the federal private sector privacy law, which included several recommendations I had made but also others that would significantly improve the privacy rights of Canadians. Your report has shown that you are attuned to the issues stemming from the dated state of federal privacy laws in Canada, and you have actively called upon the government to make comprehensive changes.
Many in society, particularly in the last few weeks, are making similar calls. Even leaders of the tech industry now see the need for enhanced regulations.
If there was ever a time for action, I think, frankly, this is it.
Another area ripe for action concerns privacy protections and political parties.
As you are aware, no federal privacy law applies to political parties; British Columbia is the only province with legislation that covers them.
This is not the case in many other jurisdictions. The UK, much of the EU and New Zealand all cover political organizations with their laws.
In point of fact, in many EU states, information about political views and membership is considered highly sensitive, even within existing data protection regimes, requiring additional protections.
There are also now—in the digital environment—so many more actors involved: data brokers, analytics firms, social networks, content providers, digital marketers, telecom firms and so forth.
So while I am currently investigating commercial organizations such as Facebook and AggregateIQ, I am unable to investigate how political parties use the personal information they may receive from corporate actors.
In my view, this is a significant gap.
Some independent authority needs to have the ability to review the practices of political parties and to assess whether privacy rights are being truly respected by all relevant players.
This gap requires addressing in one statutory form or another, either in privacy laws, in the Canada Elections Act or in a specific statute.
In conclusion, I would again highlight the urgency to act, as well as the stakes involved.
The integrity of our democratic processes—as well as trust in our digital economy—are both clearly facing significant risks.
I cannot think of more relevant questions for legislators to confront, and I applaud you for doing so.
Thank you again for your invitation, and I would welcome your questions.
It is a pleasure to be appearing before you. I am grateful for the opportunity. I believe the matter before us is one of very great importance. Facebook is certainly one of the core elements involved, but I would urge all of you to keep an eye on the very focused efforts of others who rely on Facebook as a pillar of their operations, though not solely on Facebook; others whose efforts, I believe, tend toward direct harm to the institution of democracy itself as the end goal of what they're working towards.
In case you don't know anything about me, I am somewhat uniquely situated to speak on the topic. The majority of my work can be described as hunting down data breaches; I openly call myself a “data breach hunter”. Over the last several years, I have come to be regarded as a leading authority on the prevalence and causes of data breaches, as well as on common patterns of incident response by the affected entities. Please note, though, that the data breaches I locate and secure are not the result of actual computer exploitation or malicious acts. This is simply data that has been left out in the open for whatever reason, and nobody realized it until I came along and found it. You might think there wouldn't be much of that, but you'd be surprised. There is quite an epidemic of misconfigurations out on the Internet.
Some examples of data that I've secured come from Verizon; Viacom; Microsoft; Hewlett-Packard; the United States Department of Defense; Mexico's national institute of elections, the INE; a couple of international terrorism blacklists; and the 2016 Trump presidential campaign website, which was leaking a bit of information as well.
The sum total of the efforts I've undertaken has resulted in the safeguarding of nearly two billion records containing private information, so I am well versed in this stuff. I look forward to answering any questions you may have.
More on point, I would like to point out that two data breaches I came across in December 2015 each involved the United States voter registration in its entirety, all 50 states plus DC. The second one, found that December, was more enhanced: it had private details about people, with various personality and behavioural attributes, such as whether or not somebody was a gun owner and whether or not they lived a biblical lifestyle.
Six months later, in 2016, I came across another nationwide U.S. voter registration database, this one even more enhanced, with details on whether somebody watched NASCAR, whether they held anti-abortion sentiments, or whether they likely owned a gun.
Then another set of nationwide records came to my attention, which I downloaded after finding them in June 2017. This was the third round of complete U.S. voter registration records that I came across: 198 million records, ranking as the largest U.S. voter data breach in known history. I would like to point out that at the time of discovery, not a single one of these exposed databases was protected with even a username or a password. They were simply out in the open. Anyone in the entire world who knew where to look could find them.
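[Editor's note: the exposures described in this testimony amounted to services that answered requests without any authentication challenge at all. As a purely illustrative sketch, under assumptions not drawn from the testimony (the function name, the status-code heuristic, and the examples are invented), an administrator auditing their own servers might classify responses roughly like this:]

```python
# Illustrative sketch only: classify an HTTP status code as indicating an
# openly accessible resource versus one that demands credentials. This is
# a simplified heuristic for auditing one's own infrastructure; it is not
# how the breaches described in the testimony were actually discovered.

def classify_exposure(status_code: int) -> str:
    """Map an HTTP status code to a rough exposure category."""
    if status_code == 200:
        return "open"            # resource served with no credentials at all
    if status_code in (401, 403):
        return "auth required"   # server challenged or refused the request
    if status_code == 404:
        return "not found"
    return "other"

# A voter-roll export served at status 200 with no login would be flagged
# as "open", i.e. visible to anyone on the Internet who knew the address.
print(classify_exposure(200))   # open
print(classify_exposure(401))   # auth required
```

The point of the sketch is simply that "protected" and "exposed" are machine-checkable states: a database answering every anonymous request is, by definition, readable by the entire world.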
The AggregateIQ situation that brings me here today first started on March 20 of this year—not that long ago. I didn't know who AggregateIQ was until March 20. I was fiddling around on an open public website called GitHub, where developers collaborate and publish open-source code.
I saw a reference to @aggregateiq.com in relation to some SCL Group code that was out there, available to the public. I followed the bread crumbs, figured out what AggregateIQ was, and noticed they had a subdomain called GitLab. When I viewed gitlab.aggregateiq.com, I realized that registration was open: they were in essence inviting the entire world to register for an account on their collaboration portal.
I proceeded to register an account, and it let me in. Tools, utilities, credentials, scripts, employee notes and issues, and merge requests were all laid out before me. I very quickly realized the importance of this, and that there would likely be heavy interest from regulators, governments, and the populations of several nations, so I began downloading. Normally I go to great efforts to protect anybody who may be affected by this type of thing, but the overwhelming public interest in knowing the truth behind what Cambridge Analytica, AggregateIQ, and SCL Group have been doing is a compelling factor in this particular situation. I don't want you to think I just run out there and hand out everyone's dirty laundry when these things are found. This is a different situation.
Again, keep in mind that anyone in the entire world with an Internet connection could have found the same thing, registered an account the same way I did, and downloaded the exact same material, regardless of what nation they were in or what loyalties they might hold. This was completely exposed, with no manner of protection whatsoever. A malicious actor could have taken it a step further: there were, and are, database passwords, usernames, credentials, keys, and authentication methods documented in these files. I did download the files, but I did not go the extra step and use those passwords to access the additional databases.
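[Editor's note: repositories exposed in the way described here often leak credentials as plain text inside ordinary files. As a hedged illustration, with patterns and sample contents invented for this note rather than taken from the AggregateIQ material, a minimal scanner for credential-like strings might look like this:]

```python
import re

# Illustrative sketch only: flag lines that look like hard-coded credentials.
# The patterns below reflect common conventions (key=value secrets, API keys,
# PEM private-key headers); they are not the actual repository contents.
SECRET_PATTERNS = [
    re.compile(r"password\s*[:=]\s*\S+", re.IGNORECASE),
    re.compile(r"api[_-]?key\s*[:=]\s*\S+", re.IGNORECASE),
    re.compile(r"-----BEGIN (RSA |EC )?PRIVATE KEY-----"),
]

def find_credential_lines(text: str) -> list[str]:
    """Return the lines of `text` that match a credential-like pattern."""
    hits = []
    for line in text.splitlines():
        if any(p.search(line) for p in SECRET_PATTERNS):
            hits.append(line.strip())
    return hits

# Invented sample file contents, for illustration only.
sample = "host=db.example.com\npassword = hunter2\ntimeout=30\n"
print(find_credential_lines(sample))  # ['password = hunter2']
```

Even so simple a pass over an exposed repository would surface working database passwords, which is why open registration on a code-hosting portal is so dangerous.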
If it had been found by someone else of the persuasion to take advantage of it, this could have been, and may yet be, a much more serious data breach than has been reported. They could be completely infiltrated. Every bit of data that has ever crossed through AggregateIQ's hands could be in the hands of anyone who found this same exposure.
There are a few remaining questions that I have not been able to decipher fully, and that I believe your investigation should figure out. While I am still looking into quite a bit of the data, I have not reached a final conclusion as to what AggregateIQ's relationship is to SCL Group and Cambridge Analytica. The walls of separation between those entities are very porous. It's clear that code, access permissions, and data have traversed among the three of them, and other groups as well, so I would implore you to get to the bottom of that.
The second question is to what extent, if any, restricted political and private data has been utilized by AggregateIQ or its employees for commercial profit-seeking ventures. I have found evidence of ad networks being developed under the same domain, one notably called the Ad*Reach network—there are a few Ad*Reach networks on the Internet, so make sure you're looking at the right one before questioning anybody—as well as aq-reach. One of the employees working at AIQ was doing simultaneous work for an ad company called easyAd Group AG, which is based in Switzerland and has subsidiaries in the U.S. and in Russia. I would love to know what work was being done, and whether any of the data travelling through AIQ was utilized in any of the ad campaigns or set-ups that this employee was working on at the same time.
I would say probably both, actually. The situation currently is that most federal political parties have privacy policies—internal codes of conduct, so to speak, in their relationship with the people with whom they interact and from whom they collect information. That's a start.
I think, first of all, the substance of these policies could be improved, from what we have seen. One common element missing from the privacy policies of federal parties is the right of individual electors to have access to the information that parties have about them. That's a huge flaw. There is, then, the issue of the substance. But these are voluntary codes, and no one independent of the parties examines whether the parties actually live up to the promise they're making in these policies. That leads me to a very important reason that political parties should be governed by legislation: to ensure that whatever substantive rules exist, hopefully better than what they are now, are verified by an independent third party.
Should that independent third party be the Privacy Commissioner, the Chief Electoral Officer, a third person? That can be discussed, but I think that what this leads to—leads me to, at least—is that there are at least two types of issues at play here. There's the issue of privacy and whether parties treat the personal information of individuals properly, which is a privacy issue that would make me, perhaps, the best person to look at the question. Then, the allegations that we have been seeing in the past few weeks lead to a mix of the use of personal information and privacy on one hand and political purposes on the other, which is more the domain of the Chief Electoral Officer. Ideally, I would say, the two institutions would be able to verify what is happening so that the expertise of each is put in common.
Thank you, gentlemen. This has been very, very eye-opening.
I'd like to start with you, Mr. Therrien. In 2008, CIPPIC, the Canadian Internet Policy and Public Interest Clinic, filed its complaint about Facebook with your predecessor. At that time, they identified the issue of third party applications as a threat to privacy.
In the world of 2008, there was very much a feeling, and I was very much in that world, of a deregulated Internet—you know, let them build—and Facebook was a fun place to meet former classmates from high school. Ten years later, it has morphed into the primary source of news—false news, real news—and has become the dominant player in many elections around the world.
The European Data Protection Supervisor says that the results of Facebook's dominant control are growing political extremism and the isolation of political points of view. Looking back on that 2008 review and those third party applications, I want to ask you: would it have made a difference if the Privacy Commissioner had come down harder? Did you have the tools at that time to address those breaches? And now, in light of what we're seeing with Cambridge Analytica, do we need much stronger tools to address these issues?
My question is for you, Mr. Therrien.
About two weeks ago, in an interview that you gave to a national French-language media outlet, I heard you say that a gray area surrounds the information collected by the political parties. I would like some clarification on that.
All politicians and political parties receive the voters' list, which includes each citizen's first and last name, full address, permanent voter number, and polling station location. Everyone has access to that, not only the political parties, but also the candidates running in a constituency, whether they are independent candidates or not. However, to the great dismay of all those politicians, the list has no phone numbers.
When we were all younger, it was relatively easy to find a phone number using a phone book, because 80% of subscribers to a fixed telephone network were listed in it. When we wanted to call someone, we just had to look up their name in the directory. Then we could add their phone numbers to the voters' list.
Mr. Therrien, are Canadians' telephone numbers now considered personal information protected by privacy law? Should they not be accessible to political parties, or is this an example of a gray area? We have fixed telephone networks and we also have cellular networks. However, cellphone numbers are becoming more difficult to find. A phone number on a fixed network is public, but a cellphone number is not.
Thank you, Mr. Therrien and Mr. Vickery, for being here.
It seems that we have a fundamental question to ask as a society, and that is, when we come to data, what will we allow and what will we not allow? It really falls on the government to make the rules and not allow each company to decide how and when they use data in whatever manner.
I'm going to put that question to both of you, in order to understand. Before I do so, I'll say that we've always had targeted marketing. I was just reading about Michael Dell of Dell Computers. Before he became a computer mogul, he used to sell newspapers. He would look at databases of people who were newly married or who had just moved, and he was extremely successful as a teenager doing that. I could get that data now from, say, Facebook: if I wanted to sell newspapers the old-fashioned way, it could tell me who has just moved and who has just married. We've allowed targeted marketing before.
Now, selling of data—nothing to do with Facebook, again; I give charitable donations, and I know that some of these charities share my data with other charities, because it's a good way to hit someone up again. Sometimes they ask permission to share the data, and sometimes they don't. That sharing of data for commercial reasons, that targeting, has been allowed. Both of these things have been allowed in the past; Facebook makes it far more efficient. If I were a political party, let's say the Green Party, I'd say that whoever's posting a lot about environmental issues might be a good person for me to target to get a donation or to convert.
I want to ask you this fundamental question. What should we allow, knowing that these things have already happened, and what should we not allow? How should we as a government put parameters around this behaviour?
I'll start off with you, Mr. Therrien, and then we'll go to Mr. Vickery.
Thank you, Commissioner, for noting this committee's unanimous report and recommendations to the government in February. We hope that the government has taken it in as you did.
One recommendation in that report, one that you have made in a variety of rather tangential ways, is to work with the European Union privacy regulators. In just a couple of weeks the new EU GDPR, the general data protection regulation, comes into effect. It protects virtually every data element of citizens across the EU, from their basic information—social insurance number, in the Canadian context—to all of their social media activity, all of their personal information, the computers they own, their telephone numbers, and so forth.
Given this Facebook scandal, the Cambridge Analytica scandal, AIQ, all of the things we're talking about today, and the fact that artificial intelligence—which has generated magnificent benefits for society and for mankind—has seen a rush to develop new programs without any consideration for protections and precautions, is it time for Canada to consider something like the GDPR to protect privacy, from the most basic level up to the most complicated, when it gets to algorithms and stereotyping and exploitation?
Gentlemen, thank you for being here today.
Mr. Therrien, you have become a regular. It's like you're a favourite on Tout le monde en parle and can come whenever you like. I will begin with you, because I really want to understand the exercise we are doing now and, most importantly, the one you are doing on your side.
As you know, the committee also unanimously decided to investigate the apparent breach of Facebook data by Cambridge Analytica, but without compromising your own investigation.
I am curious to know how you characterize the breach of privacy in this case. If I have understood the comments you have made recently, it is your belief that the regulations in force have left too much leeway for Facebook in collecting personal data and that this has created the right conditions for Cambridge Analytica to use that information in an illegal or unethical manner.
Could you characterize the breach of privacy that you are currently studying?
Because of the ongoing investigation and our legal obligations, the most important of which is not to draw any conclusions before completing this investigation, I would like to qualify your remarks slightly.
The conclusions you attribute to me would be more a function of what we generally see, as representatives of a regulatory agency, with the behaviour of all companies and the legislation that applies to them. Every day, we see that privacy policies are very permissive in that they allow for a very broad use of information, which is not always consistent with informed consent.
Can we say that Facebook violated privacy based on the facts alleged? We will certainly be looking into it. Our investigation is ongoing and we cannot draw conclusions yet. I can tell you what issues we will be looking into, but we are not going to talk about any conclusions in this case.
In general, we will be asking ourselves whether the two companies we are investigating, Facebook and AggregateIQ, have violated the federal privacy legislation and, in the case of British Columbia, the provincial legislation.
More specifically, we will be examining whether Facebook's privacy policies actually were too permissive and whether they played a role in the subsequent use of the information by analytics firms to give advice that may or may not have been useful to political parties, among other things.
We will also be trying to determine, as I said earlier, whether the recommendations made by the Office before I arrived in 2009 are still applicable in 2018.
Finally, we will be looking at the role played by AggregateIQ in all this and how the company collected the information. Was it done in accordance with the legislation? We will mainly consider the type of data analysis that was done. Did the final product, as communicated to the political parties, comply with privacy protection laws?
All those questions are relevant, and we will examine them. Obviously, I cannot draw any conclusions right now.
I'd like to thank both of you for being here. There's a lot of information here. What we really want to get down to is what the remedies are, what things we can do as legislators, because I think a lot of this is very alarming to our constituents and to Canadians.
I want to be sure that I'm understanding exactly what the problem is. You mentioned, Mr. Vickery, that data multiplies, so you can't actually prevent it, but you can contain it, and that these spills are happening all the time. This to me is a very bad combination. On the one hand, you have the issue of legitimate use of data. Let's say a political party is going door to door, and they run into somebody who says, “I really like your child care platform. I'm going to vote for you because of that.” They make a note of that so that the next time they do something on child care, they can let them know. Even if the person gives consent and says, “Yes, please keep me updated about that”, you've got that. Then that goes into a database. The issue to me is not so much whether the candidate goes back to the person and says, “Hey, look at this great policy we have”, but whether it is then shared, either accidentally or maliciously, with, say, Toys “R” Us, who says, “Ah, they're concerned about child care, therefore let's sell them toys.”
Is that where we're looking? Is that the problem? Or is it vice versa: that Toys “R” Us might somehow be accessing this political data, or that you're looking at what Toys “R” Us is selling to kids, or at somebody on Facebook who shows they have kids, and inferring it? It's the cross-purposes of data: is that where the issue resides?
I would agree with what Mr. Vickery has said as to some of the solutions. Given the mandate you have, perhaps you want to look at what might be some legitimate uses of the information to communicate legitimately with electors. I agree that there's a concern both ways—information collected for political purposes being used commercially, or vice versa—and that needs to be looked at. What I'm perhaps adding to the table is that this flow of information is certainly worth looking at, but it may not all be inappropriate. If you take your example of the family that buys toys, and accept that political parties need to communicate with electors, to convince them properly, knowing who the electors are, is it necessarily a bad thing that the commercial habits of the family are part of what is assessed?
I'm not an expert in elections and in what goes against the integrity of an election or does not, but I'm looking at it conceptually. As there is a need for parties to communicate with electors intelligently, knowing who the electors are, some of the data analysis may be okay, but certainly not all of it, and the allegations in the case of Facebook and Cambridge Analytica certainly suggest an inappropriate use of information for political purposes. I'm just saying that there might be some legitimate uses.
As for legitimate interests, we're not in the world now of Facebook and Cambridge Analytica. We're more in the world in which, if the privacy laws are strengthened, there is a legitimate concern that the rules, or some would say restrictions, should not inhibit legitimate, responsible innovation. In answer to Mr. Baylis, I said that the value at stake for the most part is consent—control by individuals over their personal information. In the modern world, however, information may be used for several purposes, and it may not always be possible to inform the holder of that data of all the purposes to which the information will be put. The information is properly put to use in certain artificial intelligence initiatives, for instance.
Part of the challenge is to have strong rules that generally ensure that consent is respected, but in the world of big data and artificial intelligence, it may be that there's a need for an exception to consent. The Europeans use this exception of legitimate business interest as a way to ground lawful processing of data without consent. I think a balanced piece of legislation would enhance consent, on one hand, but also needs to consider what we do as a country with proper business or social concerns—it may be in the health sector—that need to have information without necessarily the consent of the individual, for a true benefit for society.
My question is for Mr. Vickery.
In the last U.S. presidential election, which was relatively tight, the candidate who won the popular vote lost, while the candidate who, in some minds, should have come second in the voting managed to win by collecting a majority in the electoral college, perhaps through more targeted advertising.
Last week, the founder of Facebook explained that his company's raison d’être, its business model, is to sell advertising. And Facebook does it very well, being particularly able to target regions, even streets or buildings: if someone lives in building X, they will receive advertising Y.
As an example, I own a Mazda, and, as if by chance, Facebook sends me a Mazda advertisement every day on my Facebook feed. So we see that Facebook targets ads in an extremely effective way. It is likely that American political parties use Facebook to advertise in certain sectors, states, or parts of states where voters are more likely to be supportive and therefore to vote for them.
Do you think that American political parties, both Democrats and Republicans, have done any electoral profiling or used the services of companies that have analyzed the best way to target advertisements or influence Americans in certain states? Would it be possible to conclude that the person or party that was most effective in its Facebook advertising campaign won the U.S. election?
I would like to come back to the theme of our study, that is, the information that we describe as personal and what we do with it. The different possible scenarios aside, I believe that the use of this information by a company is only an ancillary dimension of the essential problem that we have to study.
I have two questions, which I will illustrate with two scenarios. Based on those scenarios, I would like your comments on my understanding of the problem.
My questions are as follows. Is the government's role to define in detail what constitutes personal information? Or would the role of the government be to ban any transaction that contains this personal information?
Here are my two scenarios.
In the first, I do business with a book supplier: Amazon, as it happens. I find it normal and expected that, when I purchase my first book or on a subsequent visit, Amazon will suggest a number of other books based on the preferences of other readers, or buyers, or just simply based on my own history of buying books from Amazon. In establishing my relationship with the company, I provided it with a certain amount of personal information, so that it can provide me with a service based on its expertise in this area.
Here is my other scenario. I am naive enough to announce that, in a month, I will be going on a cruise for a week. It would not be surprising if a user who reads my Facebook feed and works in a travel agency contacts me to let me know about some cruise-related deals. Nor should I be surprised at the risk of my house being broken into during the one-week absence I announced. Both the criminal and the travel agent used my personal information, but I was the one who made it public. This is personal information that I shared on Facebook with my friends and followers, which is the service that the social network offers. So I made that information public.
Let me go back to my questions. Both scenarios describe realistic situations. Who is responsible for defining the granularity of personal information? Each type of company requires different categories of information. In addition, to the extent that a transaction depends on the expertise of the company—such as Amazon—of which I am a customer, I do not expect that company to sell my personal information to another company for purposes, including commercial solicitation, other than those established in my relationship with Amazon, that is, buying books.
Which role do you think is better, or should we consider a mix of both?
Perhaps Mr. Therrien could answer first.
I'm going to expand on your questions. If I misinterpret them, please tell me.
Basically, individuals give certain information in order to get a service. One of the consequences is that information is communicated at the time the service is provided. In the case of Amazon, for example, the company uses the information of people who are like you or who share your interests, that is, people who have liked a particular book. I would say that too is personal information to an extent, within the meaning of the definition of the term.
The conclusion that Amazon takes from your interests, for example, that you like detective novels, is actually the result of your personal information, but that conclusion itself becomes your personal information too: your actual or potential interest in detective novels is personal information about you.
The role of the state is to define what personal information is. In that respect, I think that the legislation is doing a good job, because its very broad definition allows the interpretation that I am giving to you.
Is it the role of the government to prohibit the use of personal information? No. The use should be regulated, but it should not be prohibited.
Have I answered your question?