Crowd Computing is a computational paradigm that advocates for the adoption of human intelligence at scale, to improve the performance of computational systems.
My research effort is devoted to the creation of mathematical models and computational methods for Crowd Computing, to address both problems of analysis and design of this class of computational systems.
My team and I study and build novel Crowd Computing methods and tools that combine the cognitive and reasoning abilities of individuals and crowds, with the computational powers of machines, and the value of big amounts of heterogeneous data.
What is Crowd Computing?
Crowd Computing advocates for human-in-the-loop systems, where computational tasks are outsourced to a crowd of individuals, or to communities. Examples of relevant computational tasks include data creation, analysis, or interpretation.
In recent years, several successful examples have shown the potential and effectiveness of Crowd Computing: Wikipedia, the ESP game, Galaxy Zoo, online labor markets such as Amazon’s Mechanical Turk, the DARPA Balloon Challenge, FoldIt (a protein folding game), Google Maps, Duolingo, Waze, and others.
At the same time, the Crowd Computing paradigm enabled (and it is enabling) significant advances in many fields such as Information Retrieval and Machine Learning, with applications, e.g., to computer vision, natural language understanding. ImageNet is the quintessential example, as its availability is considered to be a significant enabler for the “deep learning revolution” of this decade.
Economic and Societal Relevance
Crowd Computing is a research field in Computer Science that is bound to become increasingly relevant in the near future, as testified by many compelling and converging trends.
The recently gained awareness of online content quality issues combined with the limitation of current artificial intelligence techniques (e.g., the “Fake News” and Russian meddling with election processing) has put Crowd Computing in the spotlight, highlighting its central role in providing reliable, transparent, and safe content exploration and consumption experience.
Several reports project the size of the crowd work market to grow between $16b and $47 billion by 2020. Gartner predicts that by 2020, more than 40% of data science tasks will be automated, resulting in increased productivity and broader usage by citizen data scientists; 20% of companies will dedicate workers to monitor and guide neural networks. In recent events, researchers from leading Web companies estimated at around 1 billion dollars the resources spent on a yearly bases by industries all around the world on human data collection.
Despite the successful application of the Crowd Computing paradigm to a variety of domains, we still have a poor understanding of Crowd Computing from a design and engineering standpoint.
What is missing is a broad theoretical foundation for Crowd Computing, where successful abstractions and well-understood and tested design principles allow the design, creation, and deployment of robust, efficient, and scalable hybrid systems.
The involvement of humans in computation activities is a fundamental scientific challenge that requires obtaining the best from human abilities, while effectively embedding them into traditional computational processes and systems. Examples of the computer science challenges that are relevant for Crowd Computing systems, and that are currently the object of my research include Abstractions, models and methods for Efficient and Effective Crowd Computing, and strategies for sustainability.
Crowd Computing systems combine traditional computational and networking resources with human intelligence and contributions. The challenge is to define a level of abstraction able to minimally yet exhaustively capture the essential properties of electronic and human computers. Such abstractions should also allow designing and experimentally validating methods able to capture and measure (at scale) “functional” and “non-functional” properties of crowd computing tasks (e.g. granularity, complexity, clarity) human computational units (e.g. expertise, availability, trustworthiness, bias), and crowd computing systems (e.g. batch/stream processing).
Efficiency and Effectiveness
Humans are naturally slower than machines in terms of information processing; also, while machines deterministically compute, humans behavior may be unpredictable, ambiguous, possibly malicious. The challenge is to develop models and methods able to optimally tap human processing to minimize computational latency and cost while maximizing the quality, generalisability, and explainability of human processing outcomes.
Human participation is a crucial constraint for the success and long-term sustainability of Crowd Computing. The challenge is to design and experimentally test different computer-mediated engagement and retainment strategies, based on incentive schemes developed in fields such as behavioral economics and behavioral psychology.
My Research Goals
The long-term objective of my research is to contribute to a foundational theory for Crowd Computing. To this end, both mathematical and experimental research is needed.
Mathematical models and analysis enable reasoning on Crowd Computing systems’ properties and behavior, to enable rigorous design and testing. On the other hand, experimental work is essential to inform the development of such models and to test their validity.
I organized my research activities around 3 application domains, each addressing high-relevance and high-impact societal and industrial needs:
- Crowd Computing for Knowledge Generation, which focuses on the design and validation of Crowd Computing methods and tools for knowledge generation in the context of online communities (e.g., online crowd-work markets, Q&A systems) and content digitalization and access (e.g., cultural heritage and digital libraries);
- Crowd Computing for Urban Data Science, which addresses Crowd Computing methods and tools in the context of urban data and intelligent cities applications; and
- Enterprise Crowd Computing, which addresses Crowd Computing in the context of enterprise-class applications.
My short term plan targets new classes of Crowd Computing problems characterized by higher levels of task complexity, physical work environment, structured work organization, and financial/educational incentives.
This work addresses novel and relevant open research problems that, once tackled, will bring Crowd Computing into the level of maturity required by a discipline that holds such broad societal and economic impact.