Software and Data
I maintain datasets and software tools to support research in finance and economics.
-
CIK to CUSIP Mapping
Linking files that map CIK to CUSIP, built from 13G and 13F filings.
-
USPTO Full Text Database
OCR full-text data for pre-1975 USPTO patents, with better quality and coverage than Google Patents.
-
Name Matching
Algorithm for matching firm names based on string similarity.
-
Replace and Delete (rd)
Extremely fast command-line utility to replace and delete strings in text files.
-
Fuzzy Process (fuzzprocess)
Deep-learning approach for finding the K nearest matches between two sets of names.
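-
Example: string-similarity name matching
A minimal sketch of the general idea behind matching firm names by string similarity and returning the K best candidates. It is not the algorithm used by either tool listed above: the suffix list, the function names (normalize, similarity, top_k_matches), and the use of Python's standard difflib in place of a learned model are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical set of legal suffixes to drop before comparing names.
SUFFIXES = {"inc", "corp", "corporation", "co", "company", "ltd", "llc"}


def normalize(name):
    """Lowercase, strip punctuation, and remove common legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in SUFFIXES)


def similarity(a, b):
    """Similarity in [0, 1] between two normalized firm names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def top_k_matches(query, candidates, k=2):
    """Return the k candidates most similar to the query name."""
    ranked = sorted(candidates, key=lambda c: similarity(query, c), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    firms = ["International Business Machines", "Apple Inc", "Microsoft Corporation"]
    print(top_k_matches("Intl Business Machines Corp", firms, k=1))
```

Normalizing before scoring keeps differences such as "Inc." versus "INC" from lowering the score. Pairwise comparison like this scales poorly to large name sets, which is one reason to prefer an approximate nearest-neighbor or learned approach there.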