The Center for Data Innovation spoke with Justin Abold-LaBreche, acting director of the Office of Compliance Analytics at the Internal Revenue Service (IRS). Abold-LaBreche discussed how the IRS uses analytics to enforce tax compliance, as well as the value of third-party reporting for closing the tax gap in the United States.
This interview has been lightly edited.
Joshua New: What is the goal of the Office of Compliance Analytics? Is it simply cracking down on tax fraud with analytics, or something broader?
Justin Abold-LaBreche: Protecting taxpayers from identity theft, in collaboration with our IRS, industry, and state and local business partners, is an extremely important goal for the Office of Compliance Analytics, but our mission is indeed broader than that. Our office was established in 2011 to materially enhance compliance outcomes across the IRS and to continue to build a culture of data-driven decision-making and innovation. Over the past five years we have collaborated with our IRS business operating divisions to tackle a wide range of issues such as small business cash under-reporting, enhancing online authentication, developing preventative treatments to help businesses meet their federal tax deposit obligations, and improving the detection of tax noncompliance. As you can see, we have had a broad mission and haven’t been shy about partnering with our operating divisions to tackle some of the most challenging issues facing U.S. tax administration.
We are now entering a new phase in which we are forming a central analytics, statistics, and research organization within the IRS that combines the Office of Compliance Analytics and the Research, Analysis and Statistics office. Dr. Ben Herndon, formerly a professor at Georgia Tech’s School of Business, has joined us to head up this new organization. The missions of the Office of Compliance Analytics will continue as part of this new organization.
New: What kind of tools have advancements in analytics technologies, such as machine learning, made possible for the IRS in recent years?
Abold-LaBreche: The budget has been very tight at the IRS, but we have used a combination of open source analytic technologies and a lot of ingenuity to remain at the cutting edge of applying data analytics to some of the toughest challenges facing government agencies, including protecting citizens from identity theft, reducing fraud, and innovating service offerings to better meet citizen needs effectively and efficiently. As more and more advanced analytic tools have become available as open source, we have moved over the past three years toward a richer use of advanced data analytic approaches like neural networks, advanced clustering, and a range of anomaly detection approaches to help us find hard-to-detect areas of tax noncompliance. We are also experimenting with massively parallel processing, both to make population-inclusive analytics possible (performing analytics on the full population, rather than just a sample) and to make it computationally feasible to work with highly linked data.
New: I imagine investigating issues such as fraud and identity theft requires examining data from a variety of sources within the IRS, as well as from external sources, including other government agencies. What are some of the challenges in pulling all of this data together?
Abold-LaBreche: The IRS faces the typical challenges of working with big datasets: there is a considerable investment upfront in data cleansing, validation, and joining, as well as the computational challenges of processing big data quickly. We are tackling these challenges both on the submission-processing side, such as by processing returns in real time and risk-assessing them for identity theft, and on our analytics side, such as by building models that will eventually run in production. We are meeting the challenge by applying the best practices in use across industry, which is why we adopted massively parallel processing. The IRS is also very careful about protecting the data and ensuring it is accessed only by those with a need to know and only for appropriate purposes.
New: In 2012, payment processing companies such as PayPal were required to report transaction data to the IRS, which allowed the IRS to tap third-party data to analyze business transactions for the first time. How does this help? How else could private sector data be used to improve tax compliance?
Abold-LaBreche: The Housing Assistance Tax Act of 2008 was a milestone in providing the IRS with the data needed to help address small business cash under-reporting. Based on the tax gap analysis conducted by the Office of Research, small business cash under-reporting is a very substantial contributor to the overall tax gap, which is the difference between what is owed to the U.S. government in taxes and what is reported on tax returns. The legislation provides third-party reporting, called the 1099-K, which enables the IRS to use analytics to identify business returns with elevated risk of having under-reported cash income. We know from other experiences with third-party reporting that when a third party reports on income, businesses report that income significantly more completely on the tax return. We are seeing that same effect here for cash reporting. In addition, the IRS has tested a number of treatments that make use of 1099-K data to identify potential under-reporting, to contact taxpayers to help them correct discrepancies, and to enhance our ability to detect cash under-reporting while limiting the burden on compliant taxpayers.
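To make the idea of screening returns against 1099-K data concrete, here is a minimal Python sketch. It is an illustration only and does not describe the IRS's actual models. It assumes that the 1099-K covers payment card and third-party network payments, so a return whose reported gross receipts are almost entirely accounted for by 1099-K payments has reported little or no cash. The `BusinessReturn` fields, the `card_share` and `flag_elevated_risk` helpers, the 0.9 threshold, and the sample figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BusinessReturn:
    """Hypothetical record; field names are assumptions for illustration."""
    return_id: str
    gross_receipts: float   # total receipts reported on the tax return
    payments_1099k: float   # payments reported by third parties on Form 1099-K


def card_share(r: BusinessReturn) -> float:
    """Fraction of reported receipts accounted for by 1099-K payments."""
    if r.gross_receipts <= 0:
        # No receipts reported: any 1099-K amount is entirely unexplained.
        return float("inf") if r.payments_1099k > 0 else 0.0
    return r.payments_1099k / r.gross_receipts


def flag_elevated_risk(returns, threshold=0.9):
    """Return IDs whose 1099-K share exceeds a (hypothetical) threshold."""
    return [r.return_id for r in returns if card_share(r) > threshold]


# Made-up sample data, not real taxpayer figures.
sample = [
    BusinessReturn("A", gross_receipts=200_000, payments_1099k=90_000),
    BusinessReturn("B", gross_receipts=100_000, payments_1099k=98_000),
    BusinessReturn("C", gross_receipts=50_000, payments_1099k=60_000),
]

flagged = flag_elevated_risk(sample)
print(flagged)
```

In this sketch, returns B and C are flagged because nearly all (or more than all) of their reported receipts appear on the 1099-K, while return A is not. A real screen would presumably compare each business against peers in the same industry rather than rely on a single fixed cutoff.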
New: You’ve worked at the IRS for close to five years now. How has the agency changed over the years in terms of how it uses data to solve problems? What major obstacles remain?
Abold-LaBreche: The IRS is an amazing place to work, with some of the most talented and dedicated people I have seen in government. Over the past five years, we have worked together to continuously enhance our use of rigorous research and analytics to inform decision-making and accelerate innovation while managing risks. This wasn’t new for the IRS (there has always been a strong capability in research here), but the combination of faster and cheaper analytics tools has meant we can widen the application of analytics across the IRS. And there is now broad recognition across the IRS that analytics and test-and-learn approaches to innovation are very powerful tools as we seek to manage risks such as fraud and improve customer service. I think the largest change I have seen is the movement to mainstream these approaches so that it isn’t just a small “analytics group” that is using analytics and testing, but instead employees and managers at all levels asking “what is the problem I am trying to solve?” and “what is my hypothesis?” and “how do I test it?”
The IRS is truly a government leader in the application of data science to help identify and accelerate progress toward new approaches to enhance taxpayer service and increase voluntary compliance with the nation’s tax laws.
We anticipate recruiting for entry-level positions this fall (for college students graduating with quantitative degrees) and have a continual need for experienced quantitative analysts from a variety of backgrounds. We’re an inclusive, diverse team that loves intellectually challenging problems and working collaboratively with one another to find the best solutions. If this appeals to you, we’d love to have you come work with us.