Research

Market Microstructure for Blockchain Assets

I continue to investigate the microstructure of markets for blockchain-based assets: (1) cryptocurrencies, and (2) smart contracts and their important implementations, e.g., the NFT markets. Two things make this project particularly interesting:

  1. The cryptocurrency and smart contract exchanges provide mountains of data that can be used, applying methods analogous to those in analysis of traditional asset markets, to determine wealth distribution, transaction protocols and wealth concentration (which affects liquidity and price stability).
  2. The issue of wealth concentration in individual cryptocurrencies is the subject of much theorization by computer scientists, but little actual evidence or analysis. Most cryptocurrencies start with a Gini coefficient of 1 (e.g., Satoshi Nakamoto mined the first million Bitcoin, and is still the largest single owner). The top four cryptocurrencies each have 2 to 4 owners who possess more than 50% of that cryptocurrency. As adoption spreads, any single currency’s Gini coefficient is theorized to decrease naturally. The question is “how fast?” Existing studies of wallets for the three major cryptocurrencies, Bitcoin, Ethereum and Ripple, report Gini coefficients of ~0.99 for all three. I strongly believe that this will be an interesting, productive, and relatively high-profile stream of research for me to pursue over the next few years.
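As a minimal sketch of the measurement discussed above, the Gini coefficient of a set of wallet balances can be computed from the sorted balances. The function name `gini` and the toy balances are illustrative assumptions, not taken from any of the studies mentioned.

```python
def gini(balances):
    """Gini coefficient of non-negative holdings.

    0 means perfectly equal holdings; values near 1 mean that
    almost everything is held by a single owner.
    """
    values = sorted(balances)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero balance")
    # Rank-weighted form over ascending values x_1..x_n:
    #   G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical toy data: four wallets with equal balances,
# then four wallets where one owner holds everything.
equal_holdings = gini([1, 1, 1, 1])
single_owner = gini([0, 0, 0, 10])
```

With one owner holding everything among n wallets, this formula gives (n - 1) / n, which approaches 1 as the number of wallets grows.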

Prescriptive Analytics

Prescriptive analytics based on applications of graph theory to comorbidities, chronic diseases and healthcare management

I am continuing my research into the empirical comorbidity ‘structure’ of chronic diseases, with the acquisition of significant new datasets through my European Union collaborations. Chronic diseases are costly because they demand long-term care. Management of chronic disease is estimated to account for 78% of health expenditures in the United States, and patients with more than one chronic condition are estimated to account for 95% of all Medicare spending. One of every ten Americans has a chronic condition that causes a major limitation in activity and quality of life. In addition, 70% of annual deaths in the US are from a chronic disease, and more than 70% of the annual healthcare bill relates to chronic diseases and conditions. This research will attempt to provide a single empirically based method to identify comorbidities and their strength. Currently, three overlapping but contradictory definitions, based on subjective consensus among doctors, confound research on “comorbidity” in the extant literature. A comorbidity may be defined as one of:

  1. a medical condition existing simultaneously but independently with another condition in a patient,
  2. a medical condition in a patient that causes, is caused by, or is otherwise related to another condition in the same patient,
  3. two or more medical conditions existing simultaneously regardless of their causal relationship.

As a result, there are multiple coding tables of comorbidities that differ from each other, e.g., ICD-10-CM and Elixhauser.

This stream of research addresses the ability of empirical graph-theoretic classifications and metrics for chronic disease comorbidity to: (1) provide finer granularity of diagnosis than canonical diagnostic codes; (2) improve cost control of diagnosis and treatment; and (3) provide assurance that treatment decisions are evidence-based. So far the research has applied three classes of graph metrics to comorbidity data at both micro (patient) and macro (population) levels: (1) topology metrics; (2) hub metrics; and (3) clustering metrics. I am working with Erasmus University faculty to gain access to European Union data on the curated OHDSI platform for a broad collection of datasets to which I can apply these methods (so far, they have been used with two Chinese databases, from the Sichuan and Heilongjiang provincial hospital systems).
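To illustrate the kind of graph metrics described above, the following sketch builds a comorbidity co-occurrence graph from patient records and computes node degree (one possible hub metric) and the local clustering coefficient (one possible clustering metric). The specific metrics, function names, and patient records are assumptions chosen for illustration; the text does not say which metrics within each class were used.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(patients):
    """Undirected co-occurrence graph of conditions.

    Edge weight = number of patients diagnosed with both conditions.
    """
    weights = defaultdict(int)
    for conditions in patients:
        for a, b in combinations(sorted(set(conditions)), 2):
            weights[(a, b)] += 1
    neighbors = defaultdict(set)
    for a, b in weights:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return dict(weights), dict(neighbors)


def degree(neighbors):
    """Number of distinct comorbid conditions per condition."""
    return {node: len(adj) for node, adj in neighbors.items()}


def local_clustering(neighbors, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    adj = neighbors[node]
    k = len(adj)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(sorted(adj), 2) if b in neighbors[a])
    return 2.0 * links / (k * (k - 1))


# Hypothetical toy records: one list of chronic conditions per patient.
patients = [
    ["diabetes", "hypertension", "ckd"],
    ["diabetes", "hypertension"],
    ["hypertension", "copd"],
]
weights, neighbors = build_graph(patients)
degrees = degree(neighbors)
```

In this toy graph, hypertension has the highest degree, so it would be flagged as the hub; a real analysis would run the same computation over coded diagnoses at patient and population level.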

Security Audits

This research predicts firm-level security threats using published financial data from the US SEC EDGAR repository, asset exchanges, and additional derivative metrics, especially the Fama-French risk metrics. Firm-level security threats from foreign and domestic agents operating on the Internet have driven an exponential increase in losses to criminals. This has become a significant problem on cryptocurrency trading platforms, where I will conduct some of this research through my association with Rayleigh Research in Finland. I am looking specifically at cryptocurrency thefts and abuse of wallet, trading and clearing platforms, credit card fraud, hacking incidents, data theft and unauthorized disclosures, which evidence suggests are responsible for billions of dollars of corporate loss annually.

Because the relationships I have been analyzing are typically highly non-linear, with stochastic distributions that are highly non-Gaussian, traditional regression statistics perform poorly. To address these problems in predicting security breaches, I have developed software that systematically tunes hyperparameters for a suite of machine learning models: a random forest, an extremely-randomized forest, a random grid of gradient boosting machines (GBMs), a random grid of deep neural nets, and a fixed grid of general linear models. These are then assembled into two trained stacked ensemble models optimized for F1 performance. I have gathered and curated three new datasets that I will be analyzing with my machine learning suite, with the intention of completing one or two new papers.
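The F1 criterion mentioned above can be sketched as follows. This is not the software described in the text: it is a small stand-alone illustration of how F1 is computed and how a decision threshold might be chosen to maximize it. The names `f1_score` and `best_threshold` and the toy labels and scores are hypothetical.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)


def best_threshold(y_true, scores, thresholds):
    """Pick the score cutoff whose predictions give the highest F1."""
    def f1_at(cutoff):
        preds = [1 if s >= cutoff else 0 for s in scores]
        return f1_score(y_true, preds)
    return max(thresholds, key=f1_at)


# Hypothetical toy data: 1 = firm suffered a breach, 0 = no breach.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
always_negative = [0] * len(y_true)

# Predicting "no breach" everywhere is 80% accurate here but has F1 of 0,
# which is why F1 is a more useful target when breaches are rare.
baseline_f1 = f1_score(y_true, always_negative)

# Hypothetical model scores for a smaller sample.
chosen = best_threshold([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8], [0.3, 0.5])
```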

Auditing

My recent work in auditing data analytics, SOX auditing with machine learning, and assessment of compliance and internal controls is becoming recognized in the industry. In particular, the Certified Public Accountant’s examination will in 2024 move to a data analytics centric focus where all CPAs are expected to demonstrate on the exam a deep knowledge in one of the following three primary disciplines:

  • Business Analysis and Reporting (BAR) – a continuation of the Accounting core;
  • Information Systems and Controls (ISC) – a continuation of the Auditing core; or
  • Tax Compliance and Planning (TCP) – a continuation of the Tax core.

I plan to continue to promote my book Audit Analytics and continue with the Python version of the book. I will work on outreach to the Chicago audit community. In addition, there are product-oriented opportunities to pursue, and I expect to devote some energy to these pursuits.

I have wrapped up all of my ongoing ‘blockchain’ projects with the submission of my NFT paper to Scientific Reports. I currently have no further plans in this area.

In my teaching, I will continue to work on expanding the use of simulations in courses. This reflects my strong belief that simulations, not ‘talking heads’, are the best way to teach marketable skills that prepare students for the workplace. Using simulations is difficult, and sometimes thankless, as students may be frustrated at their inability to ‘game’ the system, and simulations themselves don’t always work as they should.