The Invisible Herd
India's dairy sector supports 80 million farmers and 5% of GDP. The state cannot tell you which farmer owns which animal, where the animal is today, or whether it has been vaccinated. Ninth in a serie
In October 2024, in a revenue village called Hadapsar on the outskirts of Pune, a dairy farmer named Ramakrishna Patil owned seventy-three buffalo. He knew this because he had counted them himself — a tally he updated every morning after milking, scratched into a ledger that his father had started in 1987. The evening milk collection truck came at 6:30 p.m. By 7:15 p.m., milk from Ramakrishna's herd — along with that from 212 other dairy farmers in the village — had been bulked, weighed, tested for fat and protein content, recorded in the driver's handheld device, and transmitted to the dairy cooperative's central server in Satara district. By 8 p.m., Ramakrishna had received an SMS confirming the weight, the price, and the amount that would be deposited in his account by Wednesday morning.
Two days later, a field enumerator arrived at Ramakrishna's house as part of the 21st National Livestock Census. The exercise was a quinquennial affair — one of the largest statistical operations in the world, involving 87,000 government enumerators fanned out across India to visit 30 crore households. The enumerator had a tablet running a digital form called eLISS. He asked Ramakrishna how many buffalo he owned. Ramakrishna said seventy-three. The enumerator entered seventy-three into the tablet, added some category markers about the animals' use (draft, dairy, breeding, others), confirmed Ramakrishna's name and the village code, and moved on. The data went into a mobile application, was transmitted to an NIC server, and eventually became a data point in a national count of livestock that, for the purposes of understanding India's economy, would be precisely as useful as a fish trap had been in the 1940s.
This is the ninth essay in a continuing series on India's data infrastructure. The previous eight essays have shown, at different scales and across different sectors, that the country is trying to run a 21st-century economy on 20th-century data. This essay concerns a sector where that metaphor barely holds: the livestock economy is not running on 20th-century data. It is running on 1940s data overlaid with 2020s transaction data, and it is the collision between those two layers that reveals how far behind the country actually is.
THE SCALE
Start with the numbers, because they are startling if you are not in the sector and quietly known if you are.
India's livestock sector contributes 5.5 percent of the national GVA — the gross value added to the economy. In absolute terms, as of 2024, that is roughly ₹9 lakh crore. The dairy component alone — which accounts for about half of livestock sector output — contributes 5 percent to the national economy and is the largest agricultural product in the country by value. India is the world's largest milk producer, with a production of 239.3 million tonnes in 2023-24, up from 146.31 million tonnes in 2014-15. This represents 24.76 percent of global milk production. A country that is one-sixth of the world's population produces one-quarter of the world's milk.
The livestock sector supports the livelihood of around 8 crore farmers — 80 million people — and provides 60 to 65 percent of the annual income of marginal and small-scale farmers, who account for more than 80 percent of India's farm households. It supports the livelihoods of around 66 percent of rural families. The sector is, by any measure, foundational to India's rural economy. It is also, by any measure of data infrastructure, nearly invisible to the state.
The total livestock population of India, as of the 20th Livestock Census in 2019, was 535.78 million animals across 16 species. Cattle formed the largest component at 192.49 million, followed by goats at 148.88 million, buffaloes at 109.85 million, and sheep at 74.26 million. These are enormous numbers. They are also largely inert for the purposes of governance, risk management, or policy.
WHAT THE CENSUS RECORDS AND WHAT IT MISSES
The Livestock Census is conducted every five years by the Department of Animal Husbandry & Dairying, under the Ministry of Fisheries, Animal Husbandry and Dairying. The methodology has remained essentially unchanged since its inception in 1919, though the data collection mechanism has been modernised. The 20th Census, concluded in 2019 but not released in final form until 2021, was the first to use a digital enumeration system: enumerators visited households with a mobile application, eLISS, and data was transmitted in near-real time to NIC servers.
The scope of the census is comprehensive on paper. The enumerators are meant to visit every household, every household enterprise, every non-household institution, and every community entity that might own livestock, in both rural and urban areas. They are meant to record, for each animal species, the count of animals by age and sex, and the primary use — draft, dairy, breeding, meat, wool, others. They are meant to capture the full diversity of India's livestock practices, including those of nomadic and transhumant communities. The exercise engages over 80,000 field personnel, mostly veterinarians and veterinary para-professionals, deployed across 28 states and 8 union territories.
By raw operational metrics, the 20th Census succeeded. Data was collected from more than 27 crore households. The count was completed, compiled, and released. The enumeration staff were equipped with modern technology. And the headline figure — 535.78 million livestock — is almost certainly not wildly off the true number.
But here is where the data infrastructure becomes visible: almost everything of operational value to the state, to the private sector, and to the farmer lies in the details that the census cannot capture or does not bother to record.
Take the simplest question: ownership and identity. The census records Ramakrishna Patil as owning seventy-three buffalo in October 2024. The number is technically accurate for that one moment. By the time the census data is compiled, processed, and released (a process that typically takes two to three years), several things have occurred. Two calves born in late October 2024 are now two-year-old breeding animals. One elderly buffalo has died. One animal has been sold to a dairy farmer in the next district. One has been stolen. One has given birth to twins (rare, but it happens). The census figure for Ramakrishna is no longer a description of his herd. It is a historical snapshot, frozen at a specific moment and increasingly obsolete from the instant it is taken.
Now scale that temporal decay across 30 crore households and 536 million animals. The 21st Census was conducted from October 2024 through February 2025, and its results are expected by July 2026. On the day they are published, the numbers will already be close to two years old, and they will go on ageing until the next census replaces them. For a sector where animals are constantly being born, dying, sold, stolen, moved between districts, and changing their use status, a dataset with a multi-year lag is barely a dataset at all. It is a historical record. It is not useful for governance.
THE TRANSACTION LAYER VS. THE CENSUS LAYER
Here is where the collision becomes visible. Ramakrishna's milk gets weighed, tested, and recorded by a dairy cooperative's system in real time. That transaction data is current, granular, and operationally useful. The cooperative knows exactly how much milk Ramakrishna is producing, what time of day it is arriving, what the fat and protein content is, whether there are signs of infection or disease, and whether his animals are producing at expected levels for their age and breed.
But that transaction data is private. It sits in the cooperative's database. It is not linked to any government registry. It is not aggregated. The state does not have access to it. The veterinary services do not know who owns what animals in the district. The disease surveillance programme does not know which herds have been vaccinated against FMD. The animal welfare authority does not have a map of where animals are being held or in what conditions. The insurance companies do not have a basis for microinsurance on dairy animals. The credit system does not have a way to use animal ownership as collateral or as a signal of creditworthiness.
What the state has instead is a census number, three years out of date, that tells it there are 109.85 million buffalo in the country and not much else.
This is a fundamental design problem in how India's data infrastructure has developed. The transaction layer (the actual movement of milk, the actual exchange of money, the actual vaccination records, the actual disease surveillance) has developed within siloed systems. The census layer, the statistical backbone that is meant to be the public registry of who owns what animals, has remained stationary since 1919 in its core logic and has only recently been modernised in its mechanics. The two layers speak different languages, use different identifiers, and are not linked.
DISEASE SURVEILLANCE IN THE BLIND
The collision becomes acute when you look at disease surveillance and risk management.
In September 2019, the Department of Animal Husbandry & Dairying launched the National Animal Disease Control Programme (NADCP). The goal is to control and eventually eradicate two major diseases: Foot and Mouth Disease (FMD) and Brucellosis. FMD causes severe economic losses because it reduces milk production, reduces weight gain, and causes lameness — an infected cow can lose up to 30 percent of its productive capacity for months. Brucellosis is a zoonotic disease that causes abortion in cattle and buffalo and can affect humans as well. Both diseases, if uncontrolled, spread rapidly through herds and across regions. Both are preventable through vaccination. Both have been the subject of coordinated vaccination drives in India since the launch of NADCP.
The programme introduced a traceability system called "Pashu Aadhar" — an animal Aadhaar, similar to the human unique identification system. Under the scheme, eligible cattle, buffalo, sheep, and goats are tagged with 12-digit UID tags. These tags are meant to be affixed to each animal's ear, and the owner is provided with a card recording the animal's UID, age, breed, vaccination status, and other metadata. The idea is sound: a unique ID per animal allows the state to track which animals have been vaccinated, which have contracted disease, which have been moved between districts, and which are part of high-risk clusters. A linked database of animals, ownership, vaccination status, and disease occurrence would give the state real-time visibility of disease risk. It would allow rapid containment during outbreaks. It would allow insurance and credit mechanisms to be built on top.
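To make the idea concrete, here is a minimal sketch of what a per-animal record of this kind might hold. It is an illustration only: the class name, the field names, and the sample values are assumptions drawn from the description above (a 12-digit UID, age, breed, vaccination status), not the actual Pashu Aadhar data model.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class AnimalRecord:
    """Hypothetical per-animal record; every field name is an assumption."""
    uid: str          # 12-digit ear-tag number, kept as a string
    species: str      # e.g. "buffalo"
    breed: str
    birth_year: int
    owner_id: str     # hypothetical owner identifier
    vaccinations: dict[str, date] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # The only rule taken from the text: the UID has exactly 12 digits.
        if not (len(self.uid) == 12 and self.uid.isascii() and self.uid.isdigit()):
            raise ValueError(f"UID must be exactly 12 digits: {self.uid!r}")

    def record_vaccination(self, disease: str, given_on: date) -> None:
        """Store the most recent dose date for a disease."""
        self.vaccinations[disease] = given_on

    def is_vaccinated(self, disease: str) -> bool:
        """True if any dose for this disease has been recorded."""
        return disease in self.vaccinations


# Made-up example values.
animal = AnimalRecord("123456789012", "buffalo", "Murrah", 2020, "OWNER-001")
animal.record_vaccination("FMD", date(2025, 3, 1))
```

Keeping the UID as a string rather than an integer preserves any leading zeros printed on the tag.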
The implementation is, as so often, the problem.
As of early 2026, roughly 9 crore cattle and buffalo have been tagged under Pashu Aadhar. That is about 24 percent of the bovine population. Coverage is uneven: some states have achieved 60+ percent tagging, while others remain below 10 percent. The tagged animals are registered in the INAPH portal — an APHIS-like database maintained by the Department. Vaccination records are being added. But the tagging process is slow, manual, and resource-intensive. The tags cost ₹8 each, which is cheap, but the labour to affix tags, record them, and enter them into the database costs more. The database itself, while it exists, is not integrated with the dairy cooperative transaction systems. A dairy cooperative in Maharashtra knows which milk is coming from which farmer and has some idea of which animals are producing, but that dairy cooperative's system is not connected to the state's Pashu Aadhar registry. The veterinary department has vaccination records in one system, the cooperative has transaction data in another, the census has a three-year-old population count in a third, and the animal welfare records exist in a fourth. None of them talk to each other.
The result is that India's disease surveillance system is blind. The NADCP has conducted massive vaccination campaigns — millions of doses administered each season. But the state cannot see the actual coverage in real time, cannot see which herds remain unvaccinated, cannot see whether new cases are clustering in areas of low vaccination, and cannot respond to outbreaks with the surgical precision that real-time data would allow. When FMD breaks out in a district, the response is reactive, based on veterinary department reports and rumour, not based on a real-time map of which animals were vaccinated and which were not.
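What that missing visibility amounts to can be sketched in a few lines. The example below assumes a linked table of tagged animals with a vaccination flag and a village code, plus a table of reported cases. The village codes, the records, and the 50 percent threshold are all made up for illustration; no such linked table is described as existing today.

```python
from collections import defaultdict

# Made-up records: (village code, animal UID, vaccinated against FMD?)
animals = [
    ("V001", "100000000001", True),
    ("V001", "100000000002", True),
    ("V001", "100000000003", False),
    ("V002", "100000000004", False),
    ("V002", "100000000005", False),
    ("V002", "100000000006", True),
    ("V002", "100000000007", False),
]

# Made-up case reports: village code -> confirmed FMD cases this season
reported_cases = {"V001": 0, "V002": 2}


def coverage_by_village(records):
    """Return {village: share of its animals that are vaccinated}."""
    total = defaultdict(int)
    vaccinated = defaultdict(int)
    for village, _uid, is_vaccinated in records:
        total[village] += 1
        vaccinated[village] += int(is_vaccinated)
    return {v: vaccinated[v] / total[v] for v in total}


def flag_villages(coverage, cases, min_coverage=0.5):
    """Villages that have reported cases and coverage below the threshold."""
    return sorted(
        v for v, share in coverage.items()
        if share < min_coverage and cases.get(v, 0) > 0
    )


coverage = coverage_by_village(animals)
flagged = flag_villages(coverage, reported_cases)
```

With this sample data, only village V002 is flagged: a quarter of its animals are vaccinated and it has reported cases. The query is trivial once the records are linked; the difficulty lies in the linking, not in the arithmetic.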
THE INFRASTRUCTURE THAT EXISTS, PARTIAL
Some pieces of what would need to exist are being built, but they remain partial and unconnected.
The first piece is the census layer, which has been modernised. The 21st Census, conducted in late 2024 and early 2025, used a more robust digital system than the 20th. Data collection was decentralised to individual enumerators' tablets, with better sync protocols and data validation at the point of entry. The final dataset, when it arrives in 2026, will be more reliable than the 2019 census. But it will still be a periodic snapshot, updated every five years, with a two-to-three-year lag to compilation and release. For a sector with constant animal churn, that is a structural limitation that no amount of modernisation of the enumeration process can overcome.
The second piece is the transaction layer in organised dairy. India's dairy cooperatives — particularly the large federations like the Gujarat Cooperative Milk Marketing Federation (GCMMF), the Karnataka Milk Federation, and others — have built transaction systems that rival those of any developed country. Milk is tested, weighed, priced, and recorded at point of collection. Data flows from collection centres through district networks to state-level aggregation hubs in near real time. Some of the larger cooperatives have begun investing in animal-health monitoring: milk composition data is used to infer disease status, and alerts are sent to farmers if indicators suggest infection. This is genuinely sophisticated data infrastructure.
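How a cooperative might turn composition readings into an alert can be sketched with a simple rule: compare today's reading with the farmer's own recent baseline. This is a hypothetical rule for illustration, not a description of any federation's actual method. The choice of fat percentage as the indicator, the three-standard-deviation threshold, and the sample readings are all assumptions.

```python
from statistics import mean, stdev


def flag_deviation(history, today, k=3.0):
    """Return True if today's reading lies more than k sample standard
    deviations from the mean of the farmer's recent readings."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return today != mu
    return abs(today - mu) > k * sigma


# Made-up fat-percentage readings for one farmer over the past week.
fat_history = [7.0, 7.2, 6.9, 7.1, 7.0, 7.1, 6.8]

sudden_drop = flag_deviation(fat_history, 5.9)    # far below the baseline
ordinary_day = flag_deviation(fat_history, 7.05)  # within normal variation
```

A per-farmer baseline matters because herds differ by breed and feed; a single national cut-off would flag healthy herds and miss sick ones.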
But it covers only the organised dairy sector — roughly 20-30 percent of India's milk production. The remaining 70-80 percent — milk produced by smallholders, pastoralists, and farmers in regions without developed cooperative infrastructure — is sold through informal channels: directly to neighbours, to local vendors, to small-scale processors. There is no transaction data. There is no visibility. There is no record.
The third piece is the traceability layer, Pashu Aadhar, which is partial. As noted, about 24 percent of eligible bovines are tagged as of early 2026. The database exists. But it is not integrated with the census data, not integrated with the cooperative transaction data, and not accessible to most farmers for their own record-keeping. A farmer with a tagged animal does not have an easy way to access her own animal's record or to share it with a veterinarian. The system is top-down, state-administered, not bottom-up and farmer-facing.
Each of these three pieces — the census, the transaction data, the traceability system — would be individually useful if it were sufficiently complete and current. Taken together, and connected, they would form a working livestock data infrastructure. Kept separate, updated at different cadences, and speaking different languages, they form a fragmented picture that serves no one.
WHAT A WORKING LIVESTOCK DATA LAYER WOULD LOOK LIKE
A functioning livestock data infrastructure for India is not technically mysterious. It would require four components.
The first is a unique animal identifier and registry. Every animal that enters the formal production system — every animal that is registered with a dairy cooperative, every animal that is vaccinated, every animal that is sold at a regulated mandi — would have a 12-digit UID. The Pashu Aadhar system is the right foundation. But it needs to be deployed to universal coverage (100% of dairy animals, not 24%), linked to farmer identity through the Aadhaar-based unique owner ID so that farmers can access and update animal records, and integrated with the transaction layer so that every milk collection event is recorded against the animal's UID.
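A minimal sketch of that linkage follows, assuming a registry that maps each animal UID to an owner ID and a stream of milk collection events recorded against UIDs. The identifiers, field names, and quantities are hypothetical; the point is only to show what a shared key makes possible.

```python
# Hypothetical registry: animal UID -> owner ID
animal_owner = {
    "100000000001": "OWNER-A",
    "100000000002": "OWNER-A",
    "100000000003": "OWNER-B",
}

# Hypothetical milk collection events recorded against an animal UID
milk_events = [
    {"uid": "100000000001", "litres": 6.5},
    {"uid": "100000000002", "litres": 5.0},
    {"uid": "100000000003", "litres": 7.5},
    {"uid": "999999999999", "litres": 4.0},  # tag not present in the registry
]


def link_events(events, registry):
    """Split events into those that match a registered animal and those
    that do not, attaching the owner ID to each matched event."""
    linked, unlinked = [], []
    for event in events:
        owner = registry.get(event["uid"])
        if owner is None:
            unlinked.append(event)
        else:
            linked.append({**event, "owner_id": owner})
    return linked, unlinked


def litres_by_owner(linked):
    """Total litres per owner across the linked events."""
    totals = {}
    for event in linked:
        owner = event["owner_id"]
        totals[owner] = totals.get(owner, 0.0) + event["litres"]
    return totals


linked, unlinked = link_events(milk_events, animal_owner)
totals = litres_by_owner(linked)
```

The unlinked list is as informative as the linked one: every event that fails to match is an animal the registry has not yet reached.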
The second is a disease surveillance and vaccination registry. This is partially under way with Pashu Aadhar and NADCP, but it needs to be real-time and integrated with veterinary field operations — when a veterinarian vaccinates an animal, that event is recorded immediately in the system, not weeks later. It needs to be linked to diagnostic capacity, with samples tested and results recorded against the animal's UID. And it needs to be accessible to farmers, so each farmer can see her own animals' records.
The third is a real-time census alternative. The state has poured enormous resources into conducting a livestock census every five years. Those resources could instead be deployed to building a continuous-flow livestock registry, where every animal that is bought, sold, vaccinated, medically treated, or moved between districts is recorded in a central system. A transaction-based registry would be far more current than a periodic census.
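The difference between a snapshot and a continuous-flow registry can be shown with Ramakrishna's herd. The sketch below starts from the census count of seventy-three and replays the changes described earlier in this essay (two calves, a death, a sale, a theft, a pair of twins). The event vocabulary and the dates are made up for illustration.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical event kinds and their effect on herd size
DELTA = {"birth": +1, "purchase": +1, "death": -1, "sale": -1, "theft": -1}


@dataclass(frozen=True)
class HerdEvent:
    on: date
    kind: str


def herd_size(baseline, events, as_of):
    """Baseline count plus the net effect of events dated on or before as_of."""
    return baseline + sum(DELTA[e.kind] for e in events if e.on <= as_of)


# Dates are invented; the events follow the essay's example.
events = [
    HerdEvent(date(2024, 10, 28), "birth"),
    HerdEvent(date(2024, 10, 30), "birth"),
    HerdEvent(date(2025, 2, 11), "death"),
    HerdEvent(date(2025, 6, 3), "sale"),
    HerdEvent(date(2025, 9, 19), "theft"),
    HerdEvent(date(2026, 1, 7), "birth"),  # twins: two birth events
    HerdEvent(date(2026, 1, 7), "birth"),
]

census_count = 73
at_new_year_2025 = herd_size(census_count, events, date(2025, 1, 1))
at_release = herd_size(census_count, events, date(2026, 7, 1))
```

With these events the herd stands at 75 on 1 January 2025 and 74 on 1 July 2026, while the census still says 73. The registry can answer for any date; the census can answer for only one.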
The fourth is a public, animal-level data repository. Every transaction involving a livestock animal — every milk collection, every vaccine dose, every disease diagnosis, every purchase or sale — would be recorded in a database that is publicly queryable at the appropriate level of aggregation. Aggregate data — herd-level, village-level, district-level — would be public. How much milk is being produced in each district? What is the disease burden? What is the vaccination coverage? What is the productivity trend for each breed in each region? These are not secrets.
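One way to read "the appropriate level of aggregation" is sketched below: herd-level rows are rolled up to village or district, and any group with too few herds is withheld so that no single farmer can be read off the published figure. The suppression rule and its threshold of three herds are assumptions added for illustration, as are the codes and quantities.

```python
from collections import defaultdict

# Made-up herd-level rows: (district, village, animals, litres per day)
herds = [
    ("D1", "V001", 73, 410.0),
    ("D1", "V001", 12, 60.0),
    ("D1", "V001", 5, 22.0),
    ("D1", "V002", 40, 200.0),
    ("D2", "V003", 8, 35.0),
    ("D2", "V003", 9, 41.0),
    ("D2", "V003", 6, 30.0),
]


def aggregate(rows, level, min_herds=3):
    """Roll herd rows up to 'village' or 'district'. Groups with fewer
    than min_herds herds are dropped from the public output."""
    key_index = {"district": 0, "village": 1}[level]
    groups = defaultdict(lambda: {"herds": 0, "animals": 0, "litres": 0.0})
    for row in rows:
        group = groups[row[key_index]]
        group["herds"] += 1
        group["animals"] += row[2]
        group["litres"] += row[3]
    return {k: v for k, v in groups.items() if v["herds"] >= min_herds}


by_village = aggregate(herds, "village")
by_district = aggregate(herds, "district")
```

In this sample, village V002 contains a single herd and is withheld at village level, but its animals still count toward the district D1 total.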
None of this is technically mysterious. Every component has been piloted somewhere in India or elsewhere. The pieces exist. What is missing is the conviction to wire them together into an operating system that is, by design, more current and more useful than a census conducted every five years.
THE UNWRITTEN NUMBER
Like every essay in this series, this one ends with a number that the state has never calculated.
What is the annual productivity loss to India's livestock sector from vaccine-preventable disease in animals that go unvaccinated because there is no real-time map of which animals have been covered? FMD alone is commonly estimated to cost India on the order of ₹20,000 crore a year. How much of that loss traces back to animals missed because no one could see the coverage gap has not been estimated by any government agency.
What is the cost of disease outbreaks that spread unchecked because the state cannot detect clusters in real time? What is the food-safety cost of untraced milk in the informal sector? What is the credit cost of smallholder farmers being locked out of formal lending because there is no way to assess their productive assets or their creditworthiness?
What is the cost, in farmer distress, in disease burden, in lost productivity, and in economic growth, of 80 million farmers whose primary productive asset — their animals — remains, by the state's official measure, barely visible?
Ramakrishna Patil's dairy cooperative knows his herd's productivity to the kilogram, the day, and the test result. The state knows he owns some buffalo. The difference between those two pictures, multiplied across 30 crore households, is the cost that India's livestock data infrastructure does not measure.
The census happens again in 2029. The data will arrive around 2031 or 2032. By then, perhaps the components — the tagging, the transaction data, the disease surveillance — will have advanced enough to wire together. Perhaps not. Until then, India's largest source of rural income, its largest agricultural product, the foundation of 80 million livelihoods, remains officially half-visible, managed by transaction data held in private systems and by intuition held in farmers' hands and old ledgers.
The herd moves forward in real time. The state's eye on the herd moves in five-year increments. The gap between those two rhythms is where India loses sight of itself.

