Mapping the Doctrine of Discovery

S06E10: How Colonial Law Shaped Modern Data Extraction

The Doctrine of Discovery Project Season 6 Episode 10



Clicking “I agree” can feel harmless until you hear what it echoes. We follow a striking thread from the Doctrine of Discovery and Terra Nullius to the digital present, where human attention and behavior are often treated as if they belong to no one, ready to be “discovered” and taken. Our guest, a mixed-heritage settler Mennonite and Taino scholar who teaches AI ethics and policy at Queen’s University, opens with a jarring comparison between colonial “terms of subjugation” and today’s terms of service agreements.

From there, we map Data Nullius: the idea that platform capitalism converts lived experience into corporate assets by first making it legible. With Édouard Glissant as a guide, we dig into how Western “comprehension” can operate like seizure, reducing relationships into measurable objects. We connect papal bulls and Johnson v. McIntosh to the property grammar of res nullius and show how that grammar resurfaces when data becomes metadata, signals, and behavioral traces that AI systems can ingest.

The conversation moves through the history of statistics and state counting, Quetelet’s “average man,” Galton’s ranking, WWII-era computing, and the rise of surveillance infrastructure. We also keep Indigenous resistance in view, including the right to opacity and community-forward governance like the OCAP principles, which predate many mainstream data protections. Finally, we confront modern data ownership fights around large language models, genomic data, privacy law, contract law, and the hypocrisy of platforms claiming “our data” while individuals are left with little real control.

If this reframed how you think about data sovereignty, share the episode with a friend, subscribe, and leave a review on Apple Podcasts or Spotify. What would meaningful consent and accountability look like to you?


View the transcript and show notes at podcast.doctrineofdiscovery.org. Learn more about the Doctrine of Discovery on our site, DoctrineofDiscovery.org.

Welcome And Land Acknowledgement

Jordan Loewen-Colón

Hello and welcome to the Mapping the Doctrine of Discovery podcast. The producers of this podcast would like to acknowledge with respect the Onondaga Nation, firekeepers of the Haudenosaunee, the Indigenous peoples on whose ancestral lands Syracuse University now stands. I had to move all this stuff over here because I'm a wanderer and I know if I don't have this podium, I'm going to be back and forth. So stay right here. And it's inspiring to see how this stuff is being used on the ground. My talk is a little bit different. We're going to go through a bit of the history that may have gotten us to this point. So Miyari Daire La Kola Bri Dawari Koka Wiho Taino Landorike Aricomabu Bia Kaya Yowahi Fu Ataru Kina Bu Daromata Tainka Huya Antira Taika. Hello everybody. I'm Jordan Loewen-Colón. I'm a mixed-heritage settler Mennonite and Taino scholar who teaches AI ethics and policy at Queen's University at the Smith School of Business. I'm also a part of... oh no. Can you click on the PowerPoint? There we go. Hold on. There we go. I'm also a member of the Indigenous Values Initiative, and this research is a part of our grant-funded work, which is called Mapping the Doctrine of Discovery. Our work aims to center Indigenous perspectives and sovereignty, interrogate religion, law, and white supremacy, and promote healing, justice, and alternative futures. So my paper today is called Data Res Nullius: Mapping Data and the Doctrine of Discovery. And it actually started with this. It was one of life's wild synchronicities. I happened to be scrolling through a terms of service agreement while doing some podcast editing for the Discovery project. And I wasn't reading very closely, as I haven't had much of a masochist kink since I was in graduate school, and I don't often read through legal documents or submit myself to the pain of them.
However, just as the speakers in the podcast mentioned the Spanish Requerimiento, I scrolled onto this all-caps section. This is from the OpenAI Terms of Service Agreement. You can see it at the top. It says, you accept and agree that any use of any outputs from a service is your sole risk, and you will not rely on output as a source of truth, and it goes on and on and on. The visual shouting, you'll notice it's in all caps, coupled with the Requerimiento reference, must have opened up some sort of metaphysical portal, as I was transported into a vision, hands bound and clasped behind my back among my fellow Taino, as a priest decked in the black and white cassock held the hempen document aloft and read to me my terms of subjugation. Terms not nearly so mundane, but perhaps equally violent given the systems that sustain them. I was reminded of Patrick Wolfe's line that invasion is a structure rather than an event. If participating in digital life requires that you submit yourself to terms you cannot negotiate, then we really shouldn't be calling what we have agency. Our choices are already structurally conditioned, and just because we continue to use these tools and find them useful doesn't make them any less coercive. In fact, it often deepens it. That is where this paper begins. The argument's pretty simple, though the history underneath it is not. The Doctrine of Discovery empowered early empires with terra nullius and now empowers the empires of platform capitalism with data nullius, the treatment of human attention, behavior, and engagement as if they were ownerless resources waiting to be stolen, I mean, discovered. Because this research is still very much under development, this paper shifts between two approaches that I hope aren't too confusing. On the one hand, my claim is genealogical.
There are identifiable transmission mechanisms through which colonial property law's treatment of lands and peoples as ownerless things migrated through colonial US law into the legal frameworks that now govern data. The res nullius concept's journey from Roman law through papal bulls into contemporary property theory is just one path. My other claim is structural or analogical. The Doctrine of Discovery and data nullius share a formal grammar, the rendering of existing relationships as legally empty in order to authorize colonial theft. In the big picture, and what requires more time than I can give today, understanding how the doctrine of Christian discovery justified the seizure of the world requires unpacking the very nature of Western understanding or comprehension itself. As the Caribbean poet-philosopher Édouard Glissant observes, the West operates through the logic of comprendre, literally to take or to seize. So comprehension is to take or to seize. This grasping form of knowledge requires the other to become transparent and reducible, effectively transforming the all of world diversity into the one of a singular linear history. Legitimacy in this tradition is embedded with an ontology of fixed singular identity over shifting dynamic relation. As history shows, any culture not inscribed in this specific linear history was a non-entity or must become a non-entity through destruction. The Doctrine of Discovery ultimately transformed a philosophical preference into a divine mandate for global taking. I lean on Glissant's theorizing, which allows us to identify the epistemological precondition for every juridical act of taking that I will mention today. To claim the lands, papal bulls first had to render non-Christian peoples epistemically transparent, or things to be discovered, and therefore available for legal action. The same operation recurs in the data context.
Before human experience can be claimed as a corporate asset, it must first be rendered legible, decomposed into behavioral traces, metadata, and quantifiable signals that can be comprehended by algorithmic systems. In both cases, the act of knowing and the act of taking are inseparable. This is what Glissant's comprendre illuminates for us. Understanding information in the Western tradition has always been a form of taking. The doctrine as we know it emerges primarily from three papal documents: Dum Diversas, Romanus Pontifex, and Inter Caetera. You're welcome to dig in more. Or you can check out our podcast, quick plug, Mapping the Doctrine of Discovery. In the U.S., the doctrine would move from international claim into domestic property doctrine in the Johnson v. McIntosh case in 1823, where the Supreme Court established a chain-of-title principle that validated settler grants while foreclosing Indigenous conveyance except through the sovereign. The court affirmed that, quote, discovery gave an exclusive right to extinguish the Indian title of occupancy, either by purchase or by conquest, while vesting the sovereign, the U.S., with the power to grant the soil, leaving Indigenous peoples with a right of occupancy that the U.S. can end at any time. And this law still stands today, being cited by Ruth Bader Ginsburg back in 2005 in the City of Sherrill v. Oneida Indian Nation court case, and a bunch of other cases you're welcome to check today. So literally, this 15th-century papal document informs and is the legal foundation for modern property law, globally. This is not just the US, it's all over, it's wild. Oh. There we go. Undergirding these land claims are two Roman principles, res nullius and terra nullius. So res nullius, or an ownerless thing, was based on the Roman legal principle that objects with no obvious owner, including property, could be acquired by the first possessor.
Terra nullius, meanwhile, is the later territorial analog, land belonging to no one. As Lauren Benton and Benjamin Straumann have argued, res nullius had a real foundation in Roman law, but terra nullius was derived from it by analogy rather than by being the same concept. And this distinction matters because the Doctrine of Discovery was about sovereignty, title, and jurisdiction over lands and peoples, not just about taking ownerless objects. So in our framing, it's important to highlight how res nullius sits inside the Doctrine of Discovery as a supporting legal concept, not as the whole doctrine itself, even if, as we argue, it's at the core of contemporary data sovereignty as we know it. What this means is that res nullius is one juristic building block in the genealogy of the Doctrine of Discovery, but the doctrine's central colonial move was broader and harsher. It fused Christian civilization, hierarchy, papal and royal claims, conquest, and later secular property laws, all those things together. Res nullius helped supply the property grammar, and terra nullius became the territorial version. And then the Doctrine of Discovery used both kinds of reasoning to rationalize dispossession. In 1823, Chief Justice John Marshall adopted the doctrine into U.S. federal law, identifying the royal charters of Great Britain as the documentary source of the government's title. And this court ruling gave the government absolute title by discovery, despite the occupancy of the natives, who were dismissed as heathens. Again, the doctrine is not historical, it is active and present. And for more, I listed a bunch of authors that are just so amazing writing on this topic. So now, part two, I want to talk about data specifically. So with data, the original definition of data wasn't actually tied to information in the modern loose sense that we have it, but rather to things given. It comes from the Latin datum, something given, from dare, to give.
So we already have this really interesting relational element here that is missing from our contemporary understanding. In classical usage, it referred especially to things granted, as in premises in geometry or population, but in English, datum appears by the early 1600s. So it took almost 2,000 years before it gets into this English usage. And by the mid-1600s, in the 17th and 18th centuries, the concept began to widen. It started appearing in scientific writing and first circulated as a technical term in mathematics and astronomy and then spread through other sciences. Specifically, there's a document called the Philosophical Transactions, which shows exactly this pattern. The term enters through mathematics and later into other fields like physics and chemistry. So even at this early stage, though, the etymology encodes a tension. Things given implies a relationship, something offered, received, situated within an exchange. Here, Glissant reminds us that the modern transformation of data into things taken represents not just a semantic shift, but an ontological one. Information is severed from the relational context that produced it. This severance, like the one between encountered peoples and the land they inhabited, is the epistemological precondition for treating data as res nullius, as ownerless, precisely because its origins in human relationships have been rendered invisible. No longer things given from one thing to another. So where and when did this shift occur? In How Data Happened... oh no, we'll keep going. In How Data Happened, historians Wiggins and Jones argue that data, as we understand it today, had its origin in the collection of numerical medical data emerging in 17th-century England. Examples like the London Bills of Mortality represented an early effort to render human life legible through quantification by documenting causes of death in parishes.
They don't really comment on why things like census data or records of grain and trade, et cetera, don't count, because obviously there were censuses happening throughout most of human history. The one argument we might have is that all that potentially pre-data that was being collected in terms of census was usually about folks in power just caring about who else was in power. But there's a shift that happens in the 1800s where it's not just about those in power, because those in power realize, oh wait, we can take from everybody. And so it shifts to something broader. Okay. In the nine... yeah, where were we? Yes. Okay, so then in the 19th century, this quantifying impulse accelerated, ultimately becoming what we now call the science of statistics, derived from the German Statistik, which is descriptive knowledge of a state's resources, its people, land, taxes, and military capacity. As European states competed for industrial and martial dominance, they triggered what Ian Hacking has called an avalanche of numbers. And then, using these population counts as a measure of state vigor, they started going to war. All the states at this time are really interested: how do we start getting all this information? How can we use this to see how good we are and then compare ourselves to others? So data gets tied to statistics and it's all wrapped up in there. Wiggins and Jones go on to argue that the biggest shift occurs when the Belgian astronomer Adolphe Quetelet developed what he called social physics. He began applying statistical methods originally designed for astronomical observation to human populations. And his concept of the homme moyen, the average man, was an attempt to convert the many to the one by demonstrating that large aggregates of human behavior exhibited statistical regularities.
Doing so quickly established the foundation for treating people not as irreducible individuals, but as predictable, quantifiable givens or data. So close. Soon after, the notorious Francis Galton, who I'm sure many of you have heard of, of eugenics fame, implicitly recognizing the ridiculousness of data being about singularity and not relation, attempts to revivify data's relationality by coining the statistical term co-relation, correlation. So again, we know now the history, data being given, being inherently relational. Here comes Francis Galton. Goal. Oh my colonizer saying I've discovered data is relational, it's co-relations. Have you ever heard that term? This is a co-relation, that's Francis Galton. So where Quetelet sought the average, Galton was obsessed with deviation from it and with ranking individuals within a distribution. This statistical project was undeniably tainted, as most things in the late 19th and early 20th centuries were, by biological determinism, the idea that individual capacities for intelligence, beauty, and other measures were connected to innate biological factors. Galton's work laid the statistical groundwork for mental testing, the eugenics movement, and ultimately the modern capacity to micro-target individuals based on mass data. This marked an early manifestation of what Gilles Deleuze later theorizes as the dividual. But I won't get into that. It's me being too philosophical. I'm a philosopher, I like getting into it. Do you need help catching up on today's topic? Or do you want to learn more about the resources mentioned? If so, please check our website at podcast.doctrineofdiscovery.org for more information. And if you like this episode, review it on Apple, Spotify, or wherever you listen to podcasts. And now, back to the conversation. Moving past Jones and Wiggins, we argue that this Quetelet-Galton arc is the moment at which res nullius grammar first attaches to human information.
When Quetelet demonstrated that individual behavior could be dissolved into statistical aggregates, and when Galton showed that individuals could be decomposed into ranked data points, they created a technical and conceptual apparatus for treating human attributes as separable from the persons who bore them. No one asked whether the thousands of people measured in Galton's anthropometric laboratories owned the data extracted from their bodies. In fact, the question wasn't just unasked, it was unintelligible in their framework. Human data, like the territories mapped by colonial surveyors, was simply there for the taking by those with the institutional capacity to take it. Fast forward to the 20th century, and data capture and statistical coercion took on a radically deeper power during World War II. A largely female workforce of mathematicians and cryptanalysts built the Colossus computing machines and deployed Bayesian statistical methods to break the German Enigma codes. This was data's first truly big moment, the moment at which information processing became an industrial operation capable of handling mass volumes of data at high speeds in ways that made large nations take notice. The war made clear that whoever controlled the flow and analysis of data controlled the outcome of conflict. Soon after the war, the US recognized the potential for control and established the National Security Agency, or the NSA, in 1952, tasking it with signals intelligence and electronic surveillance of both foreign and domestic communications. So, yeah, birth of the NSA, but it was interesting how the US government started working with private sector companies, and you start having the birth of the computer and the collection of data all happening in this realm. And I want to go back... oh, can you click on it? Is it gonna work? For each of these slides, I want to show where Indigenous resistance was also happening at the time, too.
I won't be able to get to the previous slide. But in this slide, as you're starting to see this data collection and scientific work, there are all sorts of experiments happening in Borikén, my Boricua sisters and ancestors. And at the time, these women were resisting this data capture by actually doing what we now call poisoning the data. So if you've ever heard of these artists who are putting broken, false claims into the stuff that the LLMs are searching up, these Boricua women were doing that with the scientists who were extracting data from their bodies. They were lying about the effects that were happening on them as they were being tested on by the modern birth control project. And previously, in the earlier era, when governments were really interested in doing statistical gathering, in the US you had the Dawes Act, and there was tons of Indigenous resistance to their information being collected, right? We might now say they were trying to push back and establish their right to opacity, their right to not be known. Because by being known, they were pushed off and their lands were taken. Okay, now moving a little bit closer to our contemporary era. Through the 80s and 90s, data became big as we know it. And new analytical techniques and tools grew up in response, like knowledge discovery in databases, which emerged as a field to extract non-trivial and actionable knowledge from the massive high-dimensional data accumulation happening in corporate and scientific warehouses. After the World Wide Web went mainstream, it began generating data at unprecedented scales, though it mostly remained unstructured and therefore uncapitalized. And as we know, uncapitalized is a problem because money needs to be made. Then data had its Oppenheimer moment.
Stanford researchers Sergey Brin and Larry Page began capturing and using human data to form an organic structure, creating the PageRank algorithm and founding Google. The attention economy was born, and the continuous extraction of user data turned into behavioral micro-targeting became the primary engine of corporate profitability. Almost all the money made in the Valley today revolves around the attention economy, around how do we capture human attention, how do we create ads, all this, and it happened in the 90s. The internet did not have to go that way, and it did. 20 years later, the EU, finally paying attention to what's happening, drafts and implements the General Data Protection Regulation, or GDPR, attempting to protect data as a fundamental human right. It emphasizes individual consent, the right to know how data is used, and the right to erasure. However, what's not talked about in this history, and often remains hidden in the general stories that are told, is that just as the internet began to take shape, the First Nations in Canada developed the OCAP principles. I don't think people realize how prescient this was. No one else was talking or thinking in this way. It took the EU 20 years to establish and codify these laws. So again, Indigenous resistance at the forefront at every step. The intellectual sophistication and political foresight of OCAP cannot be overstated. At a moment when mainstream policy discourse had not yet grasped the political stakes of data, Indigenous communities were already articulating a comprehensive governance framework rooted in self-determination. OCAP represents more than a policy alternative to the extractive model. It embodies a fundamentally different ontology of the relationship between persons, communities, and information. Where the res nullius framework treats data as an ownerless object awaiting individual appropriation through first control, OCAP begins from the premise that data is about community.
It's constitutively related to that community. It is not a separable object, but an expression of collective life. Ownership under OCAP is not individual possession, but collective stewardship. Control means that the community determines the terms of access. Access ensures that the community itself benefits from the data, and possession ensures that the physical infrastructure of storage and management remains under community authority. Read through Glissant's framework, OCAP and the later CARE principles represent the assertion of relation against filiation in the domain of data governance. Where the Western legal tradition's res nullius grammar severs data from its relational context, rendering it available for seizure, for taking, Indigenous data sovereignty insists on the irreducibility of that relational context. Data about Indigenous peoples is not a free-floating object. It is an expression of community experience shaped by historical trauma, collective memory, and ongoing political relationships. To abstract it from that context and treat it as res nullius is to perform in the information domain the same operation that terra nullius performed in the territorial domain, the erasure of Indigenous bodies. Now we're going to look a bit at the legality of it. So moving from culture and politics to law. Today, the driving concern is large language models and predictive AI, which are finally heating up debate and public knowledge around the concept of data ownership and not just privacy. Unfortunately, the legal and philosophical frameworks for addressing that ownership remain profoundly contested. The three primary areas and approaches involve contract law, like terms of service agreements; intellectual property and databases; and privacy law, the rights over personal data. I want to make a quick disclaimer here.
The next section is definitely a work in progress and pulls thinking from multiple legal theories, which is interesting broadly, but not necessarily methodologically sharp, so please bear with me. On the one hand, a cross-jurisdictional approach might actually be relevant given the global operating framework of data analysis, but I'm not trying to harmonize these systems. So, here. Writing for Harvard Law School, the settler colonial South African lawyer Donrich Thaldar uses South African law as a case study to show how genomic data is the crucible for legal data rights around the globe. Harvard Law is one of the most important organizations for establishing debate around legal theory and often brings together the top minds in law. And Donrich Thaldar is one of those. He was invited to be there and started writing about this data ownership conversation. I let Thaldar have a voice not because I agree with him, but because I think he says the quiet part out loud. According to Thaldar, no one domain of the law holds exclusive sway over human genomic data. Instead, the current structure involves a multifarious weaving of property law, privacy law, contract law, and intellectual property law, with the conclusion that who ultimately has the rights requires interpretation of the rules of each domain. However, since the right of ownership is essentially tied to property law, in the absence of specific legislation, general property law rules must be applied. So in order to be owned, data must be a legal object. And because genetic information in its natural form, which is encoded in DNA, is neither capable of human control nor useful in conscious activity, it does not have the quality of a legal object. To become an object under these legal frameworks, it must be controllable and useful to conscious human activity, such as genomics research.
So control leads to ownership, and because data files on a computer can be controlled, they can be owned. That's the basic summary. But control and ownership do not equal acquisition. According to Thaldar, there are only two original modes of acquisition that are potentially relevant to genomic data: acquisition by fruit or acquisition of res nullius. Thaldar argues that there are underlying problems with the fruit approach, and that the only remaining option is that a newly minted genomic data instance belongs to no one. It is res nullius. As you saw here before, genomic data is this interesting crucible for all kinds of conversations and property laws around data as we know it, which is why it's worth talking about. So here you have one of the leading legal scholars specifically naming that our approach to data must be res nullius. But what about the person to whom the genomic data relates, the data subject? Should this person not automatically be deemed the owner? Thaldar's answer is an unequivocal no. Having a personal connection with an object is no legal ground for claiming any property rights in such an object. Thaldar's analysis is rigorous within its own terms, and this essay draws on it extensively. But the res nullius construction he describes as quite workable deserves more critical scrutiny than Thaldar himself provides. I'm realizing it is 11:30 already, so I'm gonna have to skip through a bunch more. It's all legal stuff we can maybe chat about later. I do want to talk about this slide though. I think this is important. So we have the story: the Doctrine of Discovery setting out the kind of theological property rights of how data is owned. We have this establishment of the legal term res nullius, and then we get to our contemporary time where LLMs are running out, soaking up all the data.
And then you have these companies like Reddit, Twitter, Facebook, and so many more that have been collecting and acquiring our data, freaking out because now AI is doing the same thing to them. And what is their response? Enclosure. Their response is, no, it's not res nullius. We were here. This is our data. And so they shut off the LLMs. They get that right because they have digital sovereignty under the gray areas of our law today, but we do not. And that's the hypocrisy here. To conclude, I want to end with the irony that in early modern scholastic debate, right around the time the Taino cacique Urayoán was initiating rebellion at Borikén, res nullius was actually used against imperial seizure. Benton and Straumann show that in 1539, Francisco de Vitoria, who's here on the right, and Urayoán's on the left, so this monastic scholar was fighting back against Spain, explicitly saying that discovery gave Spain no title to the Indies precisely because those lands were not res nullius. They already had owners. And Indigenous peoples possessed true dominion in both public and private affairs. One important strand of the tradition thus used res nullius to limit discovery claims and not validate them. That this counter-tradition was ultimately overridden by the dominant logic of seizure makes the genealogy all the more interesting for the contemporary data question. Like all histories, data's history isn't neutral. It's just the latest conceptual object to be folded into an older colonial grammar that renders relations inert so they can be seized and owned and operationalized. Data nullius is not an accident of the digital age, it is a continuation of a much longer project, making the world legible in order to take it. This is where Glissant's intervention becomes critical. Against the logic of comprehension, where to know is to grasp, to reduce, he offers a way of thinking grounded in relation and opacity.
Applied to data, this requires a fundamental shift. Data is lived experience, collective histories, and situated encounters. To render it fully transparent, fully extractable, is to repeat the colonial gesture at the level of information. A relational approach to data, by contrast, accepts limits, recognizes partiality, and refuses the demand that all knowledge be made available for capture. If data is treated as something that can be abstracted from the conditions of its emergence, it will continue to be governed as a resource. If it is understood instead as an expression of ongoing relationships, it demands a different form of accountability altogether. Under data relationality, it moves from mere asset to responsibility. Thank you. Haha. The producers of this podcast were Adam DJ Brett and Jordan Loewen-Colón. Our intro and outro is Social Dancing Music by Oris Edwards and Regis Cook. This podcast is funded in collaboration with the Henry Luce Foundation, Syracuse University and Hendricks Chapel, and the Indigenous Values Initiative.