When Databases Get to Define Family

“Error: Unmarried Mother” flashed across the computer screen as 30-year-old Riz began the process of renewing his Pakistani Computerized National ID Card (CNIC), a compulsory identification document that functions like a social security number, driver’s license, and passport all rolled into one. Riz’s parents have been married for 31 years, but the database did not agree, and there was no way to proceed without passing this validation check. Every visit to the registration office ended with an officer saying, “Sorry, sir, the computer doesn’t allow it.”

Without a renewed CNIC, Riz could not even buy a bus ticket. In Pakistan, access to sectors and services as diverse as telecom, banking, health records, social welfare, voting, and employment has been made contingent on having a verified record with the National Database and Registration Authority (NADRA).

Riz’s identity validation problem was not caused by a glitch in the system. The requirement of having two married parents is, instead, an example of the social judgments encoded within Pakistan’s digital ID database design. It turned out that, to avoid taking on her husband’s family name, Riz’s mother had never updated her marital status with NADRA. In the analog Pakistan of the early 1990s, she had gotten by without issue. Thirty years later, social expectations had become embedded into databases, and Riz would be unable to access basic services unless a query on his mother’s marital status returned “TRUE.”

Riz’s experience tells the larger story of how Pakistan chose to structure its digital ID system. The system places each individual within a comprehensive digital family tree. Digital households are built up of pre-encoded, socially and legally approved relationships, and can be connected to other households through similar socially and legally approved relationships. Each registered individual is required to prove ties of blood or marriage to another verified Pakistani citizen. Marriages (state-approved) create a link between two households, and children (only through marriage) create a continuing link with both households’ genealogies.

Pakistan’s experience with creating databases that encode kinship reveals important lessons about the complexities of building digital ID systems. Database design is not just computational. At every step, social, political, and technical decisions coalesce.

In 1973, Pakistan was fresh out of a war of independence; two years earlier, East Pakistan had become Bangladesh. Pakistan, having suffered a blow to its legitimacy, now wanted “a full statistical database of the people of this country.” Parliament created an agency responsible for providing every citizen a state-issued ID, conducting statistical analysis of the population, and building rules around the identification of citizens.

Who counts as a citizen is a politically fraught question for any nation, but particularly for a country with a complex relationship with migration. After the 1947 partition between India and Pakistan, many hundreds of thousands of individuals born in land accorded to Pakistan migrated to India, and vice versa. Citizenship rules became a tricky dance between ensuring that descendants of these migrants to Pakistan received citizenship while not setting a precedent for later waves of migrants to lay claims on the state. Citizenship was thus accorded to those who were born in Pakistan after 1951, and to the descendants of those who migrated to Pakistan before 1951. (This cut-off date was later changed to 1971 to accommodate the wave of migration after Bangladesh’s independence.) As Pakistan faced more waves of migration, largest among them from Afghanistan, rules for citizenship and identification merged. Evidence of identity, like citizenship, was tied to family and descent.

In 2001 the National Database and Registration Authority was created to digitize all this preexisting paper-based citizen data. Six years later, partially due to NADRA’s ongoing attempt at separating “genuine Pakistani citizens” from migrants, digital biometrics were introduced.

NADRA’s family-based system design is a legacy of Pakistan’s past, says Zehra Hashmi, an anthropologist and historian of identification technologies in South Asia. She explains that identification is always an act of verification, which requires linking the individual to something else. Today, many databases link the individual to their body via biometric information. NADRA, by contrast, inherited Pakistan’s preexisting ID verification logics, which were based on paper family registries and evidence of familial relations such as the Child Registration Certificate. Hashmi says that while other countries do use family relationships for ID verification (e.g., the US immigration system), Pakistan might be one of the only countries in the world to have family relationships underpin their national, comprehensive, and centralized digital ID system.

Ranjit Singh, a postdoctoral fellow at Data & Society, agrees that NADRA’s unique kinship-based digital ID system is representative of the foundational logics of the system merging with the historical moment in which NADRA was being designed. Countries trying to create digital ID systems today are able to capture biometric data and thus do not need to introduce other forms of verification into the database. On the other hand, NADRA, unlike other systems, also uses its ID system to verify citizenship—and Singh adds that if these other systems are used to resolve questions of citizenship, they will start needing to consider these same questions of mapping genealogies.

Like other large-scale data management systems around the world, NADRA uses relational databases. The structure of the relational database allows all data points to be tied to each other in pre-defined relationships with set rules for what can or cannot be stored in each data field. This blueprint, called a schema, keeps data entry reliable, search efficient, and the system parsimonious.

It is in the design of the schema that the assumptions of NADRA’s system architects quietly creep in. By requiring all individuals to be verified through their family structures, NADRA’s database schema encodes judgment on what counts as a legitimate relationship. For example, if designers think most children in Pakistan should only be born out of marriage between a man and a woman, they might choose to avoid writing additional code that would allow for an unmarried, single woman’s record to be linked to a child’s. Other ideas about family built into the system include that children should not be born out of wedlock, that all individuals will have two known parents, and that families are headed by a male citizen.
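To make the point concrete, here is a minimal sketch of how such an assumption can end up in a schema. It is an illustration only, not NADRA’s actual design: the table names, column names, and helper functions are all hypothetical, and it uses Python’s built-in sqlite3 module for convenience. The one assumption it encodes is that a child record must reference a marriage record.

```python
import sqlite3

# Hypothetical schema for illustration; not NADRA's real tables.
SCHEMA = """
CREATE TABLE person (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE marriage (
    id         INTEGER PRIMARY KEY,
    husband_id INTEGER NOT NULL REFERENCES person(id),
    wife_id    INTEGER NOT NULL REFERENCES person(id)
);
-- Every child row must point at a marriage. There is no column for an
-- unmarried parent and no way to leave the link empty.
CREATE TABLE child (
    person_id           INTEGER PRIMARY KEY REFERENCES person(id),
    parents_marriage_id INTEGER NOT NULL REFERENCES marriage(id)
);
"""

def build_registry():
    """Create an in-memory database with the hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn

def add_person(conn, name):
    with conn:  # commit on success
        cur = conn.execute("INSERT INTO person (name) VALUES (?)", (name,))
        return cur.lastrowid

def add_marriage(conn, husband_id, wife_id):
    with conn:
        cur = conn.execute(
            "INSERT INTO marriage (husband_id, wife_id) VALUES (?, ?)",
            (husband_id, wife_id),
        )
        return cur.lastrowid

def register_child(conn, name, parents_marriage_id):
    """Return the new person id, or None if the schema rejects the record."""
    try:
        with conn:  # both inserts commit together or roll back together
            cur = conn.execute("INSERT INTO person (name) VALUES (?)", (name,))
            child_id = cur.lastrowid
            conn.execute(
                "INSERT INTO child (person_id, parents_marriage_id) VALUES (?, ?)",
                (child_id, parents_marriage_id),
            )
        return child_id
    except sqlite3.IntegrityError:
        return None
```

In this sketch, calling register_child with a valid marriage id succeeds, while passing None for the marriage fails the NOT NULL constraint and the whole registration is rolled back. Nobody wrote a rule saying “reject children of unmarried mothers”; the rejection simply follows from how the tables were laid out.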

The sanctity of the biological household is so well-established in NADRA’s relational database schema that when the conception of a specific kind of family breaks down, halfway solutions (instead of a system overhaul) are used to create digital substitutions for kin and familial relationships.

In 2009 the Supreme Court of Pakistan ordered NADRA to include an expanded list of gender identities within their registration system, in part to accommodate citizens who identify as part of Pakistan’s khawaja sira community: a “third-gender” identity indigenous to South Asia. The historic judgement stated that forcing khawaja sira citizens to choose from two genders was restricting their access to a CNIC, and thus prevented them from exercising their rights.

Updating the gender schema for the database was an easy fix. Databases around the world have run into the problem of gender being coded as a boolean field, that is, a field that can take only two possible values. Activism by the trans community globally has forced governments to change definitions of the gender field in databases to include multiple gender identities.

For NADRA, the real and unanticipated challenge emerged in the field for parentage. Many khawaja sira in Pakistan leave their families of origin and enter into something akin to a disciple-mentor relationship with a guru. As the khawaja sira community replaces the biological family, bringing proof of parentage to register for CNIC proves difficult. NADRA thus made a policy to add the guru as a parent of each registrant, digitizing these otherwise adopted families that are based on allegiance and affinity rather than biological descent. This meant, though, that many in the khawaja sira community could not get a CNIC unless the gurus (who also did not have parents to testify for them) were registered in NADRA’s system.

Similar edge cases abound. Orphans, too, cannot meet the requirement of presence or documentation of parents or guardians. There is no override that allows a citizen’s record to exist alone without a family to be connected to. NADRA’s solution was to pick a family at random and connect the orphan’s record to them. This “database adoption” mostly happens without the knowledge of any of the parties involved.

For single mothers or children born out of wedlock, NADRA officers reportedly ask for a fake marriage certificate. A few years ago, one Pakistani woman reportedly became pregnant through sperm donation and could not get her daughter registered because she was unable to prove the father’s existence. Thus, the seemingly innocuous “technical” rules of the database become political by restricting whose citizenship, marriage, family, or identity will be recognized by the state. Political wins can even be nullified by technical oversight: If same-gender marriage is legalized but the state’s databases are designed to do a gender check when creating marriage records, same-gender marriages will not be registered.
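The last scenario can be sketched in a few lines. This is a hypothetical illustration, not code from any real registry: the Person class, the gender codes, and both function names are assumptions made for the example. It shows how a single validation line in the registration routine decides which marriages the system is able to record, whatever the law says.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Person:
    name: str
    gender: str  # hypothetical codes: "M" or "F"

class RegistrationError(Exception):
    """Raised when the system refuses to create a record."""

def register_marriage_with_gender_check(a: Person, b: Person) -> tuple:
    # The embedded assumption: a marriage is one "M" and one "F".
    if {a.gender, b.gender} != {"M", "F"}:
        raise RegistrationError("spouses must be one male and one female")
    return (a.name, b.name)

def register_marriage(a: Person, b: Person) -> tuple:
    # Same routine with the gender check removed; it only requires
    # two distinct people.
    if a == b:
        raise RegistrationError("a person cannot marry themselves")
    return (a.name, b.name)
```

With the first function, two spouses recorded with the same gender code raise a RegistrationError; with the second, the same pair is registered. The legal change and the code change are separate events, and the first has no effect until someone makes the second.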

Riz was eventually able to convince NADRA and its database of his parents’ three-decade-long partnership. Old certificates were dug up; Riz’s father took multiple 16-hour flights to the country to verify his marriage. After many months of limbo, his mother’s record was at last tied to his father’s in the relational schema of the database and Riz’s existence was validated.

For Singh from Data & Society, Riz’s experience and that of others in similar situations are a reflection of the “computational troubles of a world where citizenship is organized by descent and countries try to digitize family trees to validate someone’s identity.”

Are we destined, then, to have our identities forever essentialized by rigid demands of the relational database?

Jehangir Amjad, an MIT software engineer who works on large-scale database system design, believes that the solution lies in more creative future-proof system architecture. “Large-scale database infrastructure should not be relying on patchwork solutions—not scalable—or updating schemas on a case-by-case basis—risky,” he says. “Good database design does not need to predict what will happen 20 years in the future, but needs to acknowledge there will be change.”

Take gender, discussed earlier as a boolean field with only two possible values. A more robust database design would have encoded gender as a flexible ENUM (enumerated) data type that can hold any value from a list. To update the field, the database manager would need to simply include more options in the list, as opposed to recoding the entire field. Amjad argues it may even be time for NADRA to reconsider how they are building relationships between households. If, in reality, every atomic structure is no longer a biological, nuclear family, not every record needs to exist in a familial relational structure. Often, though, such overhauls do not happen in government systems, which are built atop legacy structures. The hurdle here is not computational, he emphasizes. Instead, it lies in the lack of design thinking before a system is implemented, which leaves the system inflexible and harder to update.
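A rough sketch of what these two suggestions could look like, written in Python rather than in any particular database’s syntax. Everything here is an assumption for illustration: the Gender values, the Record fields, and the parse_gender helper are hypothetical and are not drawn from NADRA’s system.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Gender(Enum):
    # Extending the list means adding a line here; existing records
    # and the code that reads them stay as they are.
    FEMALE = "female"
    MALE = "male"
    KHAWAJA_SIRA = "khawaja sira"

@dataclass
class Record:
    name: str
    gender: Gender
    # Optional link: a record may belong to a household or stand alone.
    household_id: Optional[int] = None

def parse_gender(raw: str) -> Gender:
    """Map form input to a Gender; raises ValueError for unlisted values."""
    return Gender(raw.strip().lower())
```

The enumerated type still rejects values that are not on the list, so it keeps data entry reliable, but the list itself can grow. Making household_id optional is the smaller change on the page and the larger one in practice, since it drops the requirement that every citizen be attached to a family.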

The answer from Hashmi, the anthropologist, focuses not on the database but on the security politics within which the database is embedded. It is not the edge cases that concern her. After all, these solutions and the half-way hacks of NADRA officers, Hashmi argues, are a sign of flexibility in the system. Instead, Hashmi questions the increasing centralization of the CNIC to civic life in Pakistan.

Over time, NADRA’s demands for citizenship verification have become more intensive, with cases of blocked cards increasing, Hashmi says. Cards can get blocked for any number of reasons outside the control of the citizen: suspicion of your citizenship status, of your recorded family genealogy, of someone in your extended family tree requiring re-verification. Until re-verification is complete, the CNIC remains blocked, leaving the citizen in limbo. Hashmi tells the story of Hamida Bibi, who threw a party when her CNIC was unblocked after a re-verification process that took five years.

She sees this shift in NADRA as a reflection of the increasing centralization of security concerns for the Pakistani state, particularly when it comes to the more than 1 million Afghan refugees living in the country. There was more flexibility possible before 2011, when you could escape being linked to a known family. After 2016, with more data being collected by NADRA and more sophisticated biometric identification techniques being used, this window has all but closed.

While “NADRA sees this as an important step toward heightened accuracy in identification,” there has been little reflection on “what this means for those whose circumstances do not meet the normative standard of biological kinship,” Hashmi says.

As more countries move toward deploying biometric ID databases, NADRA’s case is instructive. Imposing computational structures on complex social phenomena like identity, family, and relationships will necessarily be a fraught process. There are no neat solutions.

ID databases are socio-technical objects and their design requires imagination, both sociological and technological. For instance, how a database engineer chooses the data-type for a field recording gender is informed by a specific understanding of how gender manifests in the world, and an implicit acceptance of gender needing to be a category of identification.

In Pakistan, minute choices made by NADRA’s system architects 15 years ago have today become the difference between a citizen accessing their rights or not. Considerations for ID databases, then, cannot just be computational parameters. Designers must also engage with questions about the shifting concept of identity and the politics of ID databases. Doing so would require breaking down the silos that make designing for social realities the sole purview of computer scientists and engineers. We must recognize that, as technologies proliferate across our social, economic, and political lives, banal “unexamined” technical choices have far-reaching social consequences, especially as they make our basic existence contingent upon them.

