Data is essentially a gathering and a counting, not a thing-in-itself. It does not exist independently of us, as physical objects and scientific processes do. It is called into being by its compilers according to their predetermined criteria and categories, which are as prone to assumption and prejudice as any other thought process.
Data gathering is immensely complicated. Even the narrowest, simplest investigation has an enormous number of variables. All sorts of factors can limit a study's validity, or invalidate it altogether: the size of the data set (often too small or lacking in variety), the study's built-in limitations, how subjects are selected or self-select, and the questions asked to gain the data. Asking people who have already caught Covid how they behaved gives you different data from asking the wider population. Asking the former group whether they'd taken a bath or shower around the time they were infected may yield the conclusion that contact with water gives you the disease.
More importantly, data must be analysed and interpreted to have any significance at all. When dealing with populations numbered in millions, gathering billions of thoughts and attitudes and behaviours and circumstances will lead to millions of tenuous correlations, apparently supporting hundreds of thousands of theories, which self-appointed spokespeople and experts can use to back up whatever theory they like.
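The flood of tenuous correlations described above is the well-known statistical problem of multiple comparisons: test enough variable pairs and some will look significant by pure chance. A minimal sketch, with entirely hypothetical survey sizes and thresholds (nothing here comes from any real study), shows random coin-flip answers producing dozens of apparent "findings":

```python
import random

random.seed(0)

N_SUBJECTS = 200   # hypothetical number of survey respondents
N_QUESTIONS = 40   # hypothetical yes/no questions, all answered at random

# Every answer is a coin flip, so no real relationship exists anywhere.
answers = [[random.randint(0, 1) for _ in range(N_QUESTIONS)]
           for _ in range(N_SUBJECTS)]

def correlation(i, j):
    """Pearson correlation between two questions across all subjects."""
    xs = [row[i] for row in answers]
    ys = [row[j] for row in answers]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Count question pairs whose correlation clears a conventional
# significance threshold (|r| > ~1.96/sqrt(n), i.e. p < 0.05 for n = 200)
# despite being pure noise.
pairs = N_QUESTIONS * (N_QUESTIONS - 1) // 2
spurious = sum(1 for i in range(N_QUESTIONS)
                 for j in range(i + 1, N_QUESTIONS)
                 if abs(correlation(i, j)) > 0.14)

print(f"{pairs} pairs tested, {spurious} spurious 'correlations' found")
```

With 780 pairs tested at a 5% false-positive rate, one should expect on the order of forty chance correlations from data containing no information at all, which is exactly the raw material a motivated spokesperson needs.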
This is, of course, confirmation bias: the questions you ask, and the people you decide to ask them to, are designed to confirm what you already think. You pick the statistics that support your claims and ignore any to the contrary.