Can crowdsourcing create the missing crash data?

  • Sveta Milusheva
  • Robert Marty
  • Guadalupe Bedoya
  • Elizabeth Resor
  • Sarah Williams
  • Arianna Legovini

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

UPDATED - June 1, 2020. Road traffic crashes (RTCs) are the primary cause of death among children and young adults. Yet data on RTCs are incomplete, hindering effective road safety policymaking in many developing countries where mortality is purportedly highest. We web-scrape 850,000 tweets to create crash data and develop a machine learning algorithm to geolocate RTCs. Our algorithm is nearly twice as precise as a standard geoparsing algorithm in identifying the set of locations that include the crash location. Beyond that, it identifies the unique location of a crash from the set of possible locations in a majority of cases. We dispatch a set of motorcycle drivers to the site of the presumed crash in real time to verify the validity of the crowdsourced data and document the performance of the algorithm. The study can serve as a proof of concept for countries interested in improving RTC data at low cost through a machine learning approach, substantially increasing the data available to analyze RTCs and prioritize road safety policies.

Original language: English
Title of host publication: COMPASS 2020 - Proceedings of the 2020 3rd ACM SIGCAS Conference on Computing and Sustainable Societies
Publisher: Association for Computing Machinery (ACM)
Pages: 305-306
Number of pages: 2
ISBN (Electronic): 9781450371292
Publication status: Published - 15 Jun 2020
Externally published: Yes
Event: 3rd ACM SIGCAS Conference on Computing and Sustainable Societies, COMPASS 2020 - Guayaquil, Ecuador
Duration: 15 Jun 2020 to 17 Jun 2020

Publication series

Name: COMPASS 2020 - Proceedings of the 2020 3rd ACM SIGCAS Conference on Computing and Sustainable Societies

Conference

Conference: 3rd ACM SIGCAS Conference on Computing and Sustainable Societies, COMPASS 2020
Country/Territory: Ecuador
City: Guayaquil
Period: 15/06/20 to 17/06/20

Keywords

  • geoparse
  • natural language processing
  • road safety
  • twitter
