I heard about this project, EMBERS (an acronym for Early Model Based Event Recognition using Surrogates), at a DC Data Science meetup. The speaker was Naren Ramakrishnan from Virginia Tech.
To me, it is a real big data project. It is software that forecasts large-scale societal events, particularly civil unrest (mainly in Latin America and the Middle East). It uses open-source indicators, such as tweets, Facebook events, news, blog posts, and open economic figures, to predict the outbreak of big events with advanced mathematical models. It is a collaborative project involving nine universities and private corporations.
EMBERS ingests a large amount of unstructured data around the clock, so natural language processing (NLP) techniques are evidently involved. Besides English, the system handles at least Spanish and Arabic. Making predictions in real time on such data is very challenging.
System architecture of EMBERS
Output screenshot of EMBERS
The system performance is quite good. Over a 24-month period, it achieved a recall of 0.65 and a precision of 0.94.
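To make these two numbers concrete, here is a minimal sketch of how precision and recall are computed from forecast counts, using the standard textbook definitions. The counts below are hypothetical, chosen only so that the results round to the reported figures. They are not taken from the EMBERS paper, and how EMBERS actually matches a forecast to a real event is not covered here.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) from raw counts.

    true_pos:  forecasts that matched a real event
    false_pos: forecasts with no matching real event
    false_neg: real events that were never forecast
    """
    issued = true_pos + false_pos      # all forecasts issued
    occurred = true_pos + false_neg    # all events that actually happened
    precision = true_pos / issued if issued else 0.0
    recall = true_pos / occurred if occurred else 0.0
    return precision, recall


# Hypothetical counts (an assumption for illustration, not EMBERS data).
precision, recall = precision_recall(true_pos=94, false_pos=6, false_neg=51)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

Read this way, a precision of 0.94 means that almost every alert issued corresponded to a real event, while a recall of 0.65 means that roughly one in three real events had no alert.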
Who needs EMBERS? Governments must be big customers. Not surprisingly, some travelers, social scientists, and corporations also find it useful, since the safety, available information, and business environment of various countries are their main concerns. Of course, the software is not free. It is undeniably a lucrative project.
- Publication: ‘Beating the news’ with EMBERS: Forecasting Civil Unrest using Open Source Indicators [arXiv:1402.7035]
- A related project: Early Warning Project