Sherlock Semantic Type Detection Demo

Sherlock is a neural network trained on real-world datasets collected from the web. Unlike rule-based approaches that rely on hard-coded regular expressions and value lists, Sherlock detects semantic types using word embeddings and character distributions.
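To make "distributions of characters" concrete, here is a minimal sketch of how per-character statistics could be computed for one column. It is an illustration only, not Sherlock's actual feature extractor: the function name `char_distribution_features`, the tracked character set, and the choice of statistics (mean and max) are all assumptions made for this example.

```python
def char_distribution_features(values, chars="0123456789./-?"):
    """Summarize, for each tracked character, how often it occurs per value.

    Hypothetical helper for illustration; the character set and the
    statistics used here are assumptions, not Sherlock's real features.
    """
    features = {}
    if not values:
        return features
    for ch in chars:
        # Count occurrences of this character in every cell of the column.
        counts = [value.count(ch) for value in values]
        features[f"n[{ch}]-mean"] = sum(counts) / len(counts)
        features[f"n[{ch}]-max"] = max(counts)
    return features


# A column that mixes several date formats.
as_of = ["Aug 19 2018", "2018.08.19", "08/19/2018", "2018-?-?"]
features = char_distribution_features(as_of)
```

Statistics like these describe a column as a whole, so a handful of oddly formatted cells shifts the numbers slightly instead of causing an outright mismatch, as a single strict pattern would.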

Because Sherlock is trained on real data, it is robust to messy entries (e.g. blanks, misspellings, malformed values) and recognizes over 20 semantic types. See for yourself by changing table values:

| Country | Head of Government | Capital City | Latitude | Longitude | Continent | Birth of Current Form of Government | One Day Δ in USD Exchange Rate | As Of |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Japan | | Tokyo | 35.68333333 | 139.75 | Asia | 194? | 0.0001 | Aug 19 2018 |
| | Theresa May | London | 51.5 | -0.083333 | Europe | 18XX | 0 | 2018.08.19 |
| SWE | Alain Berset | Bern | 46.91666667 | LONGITUDE | Europe | 1947 | ? | 08/19/2018 |
| Canada | TODO | TODO | TODO | TODO | North America | 1867 | -0.0005 | 08/19/2018 |
| Australia | TODO | TODO | TODO | TODO | OCE | 1900 +/ 1 | .2 pct | 08/19/2018 |
| New Zealand | Jacinda Ardern | \t | \t | \t | OCE | 1840 | -0.36% | 08/19/2018 |
| Sweden | Stefan Löfven | Stockholm | 59.33333333 | 18.05 | Europe | 1974 | 0.0016 | |
| Norway | Erna Solberg | Oslo | 59.91666667 | 10.75 | | 1814 | 0.0008 | 08/19/2018 |
| 中国 | Xi Jinping | Beijing | 39 +/ 10 | 116.383333 | Asia | 194? | | 08/19/2018 |
| Росси́я | | Moscow | 55.75 | 37.6 | Europe | 1993 | 0.59% | 08/19/2018 |
| IND | Narendra Modi | New Delhi | 28.6 | 77.2 | Asia | 20th century | 0.0026 | 2018-?-? |
| Turkey | Recep Tayyip Erdoğan | Ankara | 39.93333333 | 32.866667 | | | 0.0092 | 08/19/2018 |
| | Prayut Chan-o-cha | Bangkok | 13.75 | 100.516667 | Asia | | 0.44% | 08/19/2018 |
| Indonesia | Joko Widodo | Jakarta | -6.166666667 | 106.816667 | Asia | | -0.0009 | 08/19/2018 |
| Myanmar | \n | \n | | | Asia | 2011 | 0.001 | 2018-08-19 |
| Mexico | Enrique Peña Nieto | ? | 19.43333333 | -99.133333 | CA | 1917 | 0.0017 | 08/19/2018 |
| Argentina | Mauricio Macri | ? | -34.58333333 | -58.666667 | SA | 1853 | .11 pct | Aug 19 2018 |
| | Lars Løkke Rasmussen | | 55.66666667 | 12.583333 | EU | 1953 | 0.0025 | 08/19/2018 |
| Israel | Benjamin Netanyahu | Jerusalem | | | Asia | 1948 +/- 5 | ? | August 2018 |
| Philippines | Rodrigo Duterte | Manila | 14.6 | 120.966667 | | 1835 | 0.0012 | 08/19/2018 |
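The "As Of" column in the table mixes several date formats, which makes it a handy test case for the rule-based approach mentioned in the introduction. The sketch below applies one hard-coded regular expression to those values. The rule `DATE_RULE` and the helper `rule_based_is_date` are hypothetical, written only for this illustration; they are not part of Sherlock or of any particular rule-based tool.

```python
import re

# Hypothetical hard-coded rule: accept only MM/DD/YYYY.
DATE_RULE = re.compile(r"\d{2}/\d{2}/\d{4}")


def rule_based_is_date(value):
    """Return True only when the whole value matches the fixed pattern."""
    return DATE_RULE.fullmatch(value) is not None


# Distinct formats taken from the "As Of" column.
as_of = ["Aug 19 2018", "2018.08.19", "08/19/2018", "2018-?-?", "August 2018"]
matches = {value: rule_based_is_date(value) for value in as_of}

# Fraction of the distinct formats that the single rule recognizes.
coverage = sum(matches.values()) / len(matches)
```

Only the `08/19/2018` format passes this rule. A rule-based detector would need a new pattern for each additional format, and a partly unknown value such as `2018-?-?` would still need special handling.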