July 19, 2024

New Mechanism Maximizes Data Acquisition Without Compromising User Privacy

With demand for data growing in artificial intelligence (AI) and machine learning applications, platforms need ways to incentivize data sharing while protecting user privacy. Ali Makhdoumi, an associate professor of decision sciences at Duke University’s Fuqua School of Business, and co-authors from the University of California, Berkeley, the University of Toronto, and the Massachusetts Institute of Technology propose a mechanism that measures users’ privacy sensitivity and compensates them for sharing personal data. Their findings are detailed in a paper soon to be published in the journal Operations Research.

The researchers argue that privacy-sensitive users should be compensated for relinquishing their personal data. They propose using differential privacy, a widely adopted approach in the tech industry, which involves adding noise to the data to make it less revealing. For example, if a company queries hospital records to determine the percentage of individuals in a certain zip code with a medical condition, adding calibrated noise to the answer limits what it reveals about any one patient.
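The hospital query can be sketched with the Laplace mechanism, a standard way to achieve differential privacy. This is a minimal illustration, not code from the paper: the function names, the privacy parameter `epsilon`, and the sample records are all assumptions made for the example.

```python
import random
from typing import Sequence


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_fraction(records: Sequence[bool], epsilon: float,
                     rng: random.Random) -> float:
    """Noisy fraction of records that are True (hypothetical helper)."""
    n = len(records)
    true_fraction = sum(records) / n
    # Changing one person's record moves the fraction by at most 1/n,
    # so the noise scale is (1/n) / epsilon.
    noisy = true_fraction + laplace_noise((1.0 / n) / epsilon, rng)
    # Clamping is post-processing and does not weaken the guarantee.
    return min(1.0, max(0.0, noisy))


if __name__ == "__main__":
    rng = random.Random(0)
    # Made-up data: 300 of 1,000 residents have the condition.
    records = [True] * 300 + [False] * 700
    print(private_fraction(records, epsilon=1.0, rng=rng))
```

A smaller `epsilon` means more noise and stronger privacy; a larger one means a more accurate but more revealing answer.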

Differential privacy can be delivered in two ways: locally or centrally. In the local setting, data is randomized on the user’s device before being shared with the processing entity. Because every individual record is randomized before it is collected, statistical estimates built from it are less accurate. In the central setting, users share raw data with companies, which then add noise to the results.
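The accuracy gap between the two settings can be seen in a small simulation. The sketch below uses randomized response for the local setting and Laplace noise for the central one; both are standard textbook techniques chosen here for illustration, and the article does not say which techniques the paper uses. All names and numbers are hypothetical.

```python
import math
import random


def local_estimate(bits, epsilon, rng):
    """Local setting: each user randomizes their own bit before sharing."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)  # chance of telling the truth
    reports = [b if rng.random() < p else 1 - b for b in bits]
    observed = sum(reports) / len(reports)
    # Undo the bias: E[observed] = (1 - p) + (2p - 1) * true_mean.
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


def central_estimate(bits, epsilon, rng):
    """Central setting: the collector sees raw bits and adds noise once."""
    n = len(bits)
    scale = 1.0 / (n * epsilon)  # the mean changes by at most 1/n per user
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return sum(bits) / n + noise


if __name__ == "__main__":
    rng = random.Random(1)
    bits = [1] * 3000 + [0] * 7000  # made-up population, true mean 0.3
    print("local  :", local_estimate(bits, 1.0, rng))
    print("central:", central_estimate(bits, 1.0, rng))
```

With the same `epsilon`, the local estimate typically lands much farther from the true mean, because noise from every user accumulates instead of being added once to the aggregate.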

The researchers designed a new data acquisition mechanism that considers the privacy sensitivity of users and assigns a value to it. This mechanism determines the optimal incentive for data-dependent platforms and compensates users for sharing their personal information. By considering both the price for users’ privacy loss and the utility they derive from the service, companies can collect data while compensating users adequately.
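To make the trade-off concrete, here is a toy participation rule. It is not the mechanism from the paper, whose details the article does not give. It assumes, purely for illustration, that a user's privacy cost is linear in the privacy loss `epsilon` and that the user shares data when service utility plus payment covers that cost. The `User` class and `min_payment` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class User:
    sensitivity: float      # assumed cost per unit of privacy loss
    service_utility: float  # assumed benefit the user gets from the service


def min_payment(user: User, epsilon: float) -> float:
    """Smallest payment at which the user is willing to share (toy model)."""
    privacy_cost = user.sensitivity * epsilon  # linear cost is an assumption
    # If the service alone is worth more than the privacy cost, no payment is needed.
    return max(0.0, privacy_cost - user.service_utility)


if __name__ == "__main__":
    for u in (User(sensitivity=2.0, service_utility=0.5),
              User(sensitivity=0.1, service_utility=0.5)):
        print(u, min_payment(u, epsilon=1.0))
```

Under these assumptions, a highly privacy-sensitive user needs a positive payment, while a user who values the service more than their privacy loss needs none.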

The research emphasizes the importance of data centralization for efficient data collection, because it yields more precise results for business analysis. It also raises concerns about the potential risks of AI and machine learning, such as price discrimination and manipulation that harms users.

Makhdoumi acknowledges that there is still much to learn about the societal harms of data collection. However, this research represents a significant step forward in understanding the complexities of the data market and finding ways to protect user privacy while maximizing data acquisition for AI and machine learning applications.
