The vast majority of computer vision research leads to technology that surveils human beings, a new preprint study that analyzed more than 20,000 computer vision papers and 11,000 patents spanning three decades has found. Crucially, the study found that computer vision papers often refer to human beings as “objects,” a convention that both obfuscates how common surveillance of humans is in the field, and objectifies humans by definition.
Conversely, the fairly banal language used to describe the tech is how the structures that make technology evil are concealed.
Calling humans human rather than objects, even if object detection is what AI does, re-instills certain objects with a whole host of features that distinguishes them from other objects. It won’t matter for the AI, obviously. But it will matter for the people involved with creating and using it.
I mean, imagine if Tesla shows “Object Identified” as it barrels over a misplaced jaywalker. My previous sentence buries the horror of someone being murdered. Similarly, humans are understood to have rights, thoughts, feelings, whole worlds that exist\ inside of their heads, and they exist within a social ecosystem where their presence is fundamental to its health. “Object” capture none of that. But identifying human objects as human does.
Relabeling human objects as human reintroduces all the associated values of being human into AI object detection discussions. And so it becomes easier to see how the evils of technology are acting on us rather than concealing it.
This still just feels like a muddying of technical language. If you were to write an article about autopilot killing somebody and use object to refer to them, that’s certainly dehumanization, but saying that an object detection algorithm performs poorly on humans doesn’t feel like it is.
Part of the problem is that in general we aren’t talking about specialized human detection models that incorporate things like pose estimation. Instead it is almost always a general object detection alg, and referring to the same models differently based on the subject just adds muddiness.
I’m mostly familiar with AI within healthcare, and in my workplace, any released model is going to have a number of conversations and evaluations about the technical performance, practical impact on patients, and general ethics of the model. Those conversations blend, but it’s harmful to make the language less clear in any one of those contexts.