March 03, 2025
Assessing AI From a Tax Perspective, Part 2
By William M. VanDenburgh, Ph.D.; Kimberly J. Tribou, Ph.D., CPA; and James M. Braswell, Ph.D.
While artificial intelligence (AI) can support the filing of tax returns, its uncontrolled use in an accounting practice poses significant risks. The wording of the AI prompt can greatly affect the quality of its response. AI’s accuracy improves with prompt specificity, but divulging a client’s privileged tax information in a prompt could violate the AICPA’s Statement on Standards for Tax Services (SSTS) and the IRS’s Circular 230: Regulations Governing Practice before the Internal Revenue Service.
Consistent with these standards, accounting professionals need to verify the output of these tools in the same way they would verify someone else’s work. Our prior companion piece, “Assessing AI From a Tax Perspective, Part 1,” strongly indicates that current AI tools are error-prone and provide limited value to preparers. In this article, we discuss the development and utilization of AI, as well as ethical/practice considerations for tax preparers.
AI History and Explosion
Chatbots, which utilize natural language processing and interact with users through prompt-based commands, have been used in computer applications since the mid-1960s. Companies began experimenting with machine learning and generative AI, the data-intensive technologies powering chatbots like ChatGPT and Microsoft Copilot, in the early 2000s.
Machine learning identifies patterns within its training data and uses these patterns to automate tasks and formulate predictions. Generative AI creates new content – such as computer coding, images or written works – in response to specific user prompts.
OpenAI, the company that developed and released ChatGPT, was founded in 2015 as a not-for-profit research company committed to the ethical development of artificial intelligence. At its founding, OpenAI pledged to make its AI technology freely available. Keeping this pledge, OpenAI released ChatGPT 3.0 to the public in November 2022.1
Other tech firms released competing chat-based AI tools in the months following ChatGPT’s release. Both Microsoft and Google subsequently launched sophisticated AI tools (Copilot and Gemini, respectively) in late 2023.2 Companies have also adopted generative AI technology specific to their organizations or adapted existing AI tools for limited purposes. For example, KPMG partnered with Microsoft to incorporate OpenAI technology atop its proprietary systems.3
Please see Exhibit 1 at the end of this article for a summary of the evolution of chat-based technology as a means of accessing artificial intelligence.
Comparing Tools
Currently, four major AI tools are publicly available: Microsoft Copilot, Google Gemini, Perplexity and ChatGPT. Other specialized tools, such as TaxGPT, are available on a fee basis.4 Table 1 compares the costs of three tools that may support tax research and preparation.
Table 1. Comparison of Costs of Chatbot AI Tools
Tool | Microsoft Copilot | Perplexity.ai | TaxGPT
Underlying Technology | GPT-4o (OpenAI) | GPT-4 Omni and Claude 3 | GPT-4o (OpenAI) and Claude 3.5
Per User Cost* | Copilot Pro for Individuals: $20/month; Copilot for Microsoft 365: $30/month | Free users can conduct 5 free daily searches; subscribers (/month) can conduct 600 daily searches | $1,000 per seat
* As of the writing of this article
Microsoft’s Copilot, which uses OpenAI’s GPT-4 technology, was included in Windows 11 updates beginning in November 2023. Initially tested in companies with 300 or more employees, the tool became widely available in early 2024. Microsoft currently charges corporate subscribers $30 per user per month for its Copilot for Microsoft 365 tool.
The Perplexity.ai tool, launched in September 2023, utilizes artificial intelligence to provide direct and factual answers to internet search questions.5 While ChatGPT and Copilot rely on their large language model (LLM) training to generate responses, Perplexity uses AI (GPT-4 Omni and Claude 3) to tailor real-time search prompts, summarize results and possibly return more accurate results.
TaxGPT utilizes both OpenAI GPT-4o and Claude 3.5 generative AI technologies to support tax research, tax preparation and tax communication. TaxGPT claims that its single focus on tax issues and its ability to search tax law in real time significantly reduce the risk of AI hallucinations, which occur when the system interprets the prompt incorrectly and provides logical but incorrect responses.6
Initial results were mixed as to whether the workplace enhancements created by AI tools are worth the investment. For example, Copilot is known to make mistakes when asked to “crunch numbers” in Excel, the fundamental financial spreadsheet. Juniper Networks’ CIO Sharon Mandell said, “I wouldn’t say we’re ready to spend $30 per user for every user in the company.”7
However, chemical company Chemours provided Copilot access and AI training to over 1,000 professionals, finding it reduced the time required to compile and analyze financial data. For tax professionals, the monthly or annual per user costs could add up, but the investment could be worth it if it expedites addressing inevitable tax questions that occur during the tax preparation process.
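To make the "could add up" arithmetic concrete, the short sketch below annualizes the monthly per-user prices listed in Table 1 for a practice of a given size. The ten-preparer headcount and the function name are hypothetical and chosen only for illustration. TaxGPT is left out because the table does not state the billing period for its per-seat price.

```python
# Monthly per-user prices taken from Table 1 (as of the writing of the article).
MONTHLY_PRICE_PER_USER = {
    "Copilot Pro for Individuals": 20,
    "Copilot for Microsoft 365": 30,
}


def annual_cost(monthly_price: int, users: int) -> int:
    """Return the annualized subscription cost in dollars for `users` seats."""
    return monthly_price * 12 * users


if __name__ == "__main__":
    preparers = 10  # hypothetical firm size, not a figure from the article
    for plan, price in MONTHLY_PRICE_PER_USER.items():
        print(f"{plan}: ${annual_cost(price, preparers):,} per year")
```

Under these assumptions, the subscription outlay scales linearly with headcount, so the comparison that matters is against the preparer time the tool actually saves.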
Asking the Chatbots
How AI is utilized and the way the questions are asked impact the results. Since each of these tools differs slightly in the underlying AI technology and the LLMs used to train it, we used Copilot, Perplexity and TaxGPT to answer tax preparation questions. The results of these queries are presented in Exhibit 2 at the end of this article.
While chat-based AI tools can automate routine tasks and support tax decision-making, using these tools without training and professional judgment can result in error-prone tax returns and could leave tax professionals susceptible to penalties. As seen in our queries, answers were often misleading or wrong.
At times, all three tools appeared to answer prompts with hedged or statistically likely responses, rather than technically or factually accurate responses. Copilot could provide disparate answers to the same tax questions depending on whether the (1) More Creative, (2) More Balanced, or (3) More Precise mode was selected. Additionally, Copilot did not provide consistent answers if the question was asked multiple times.
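One rough way to quantify the inconsistency described above is to pose the same question several times, record each answer, and measure how often the answers agree. The sketch below assumes the answers have already been collected by hand as short strings. The function name and the sample answers are hypothetical, and exact-match comparison is a crude stand-in for a preparer actually reading each response.

```python
from collections import Counter


def agreement_rate(answers: list[str]) -> float:
    """Share of answers that match the most common answer.

    Answers are compared after lowercasing and collapsing whitespace,
    so this only detects agreement between near-identical short answers.
    """
    if not answers:
        raise ValueError("need at least one answer")
    normalized = [" ".join(a.lower().split()) for a in answers]
    most_common_count = Counter(normalized).most_common(1)[0][1]
    return most_common_count / len(normalized)


if __name__ == "__main__":
    # Hypothetical short answers recorded from three runs of one prompt.
    runs = ["Yes", "yes ", "No"]
    print(f"Agreement: {agreement_rate(runs):.0%}")
```

A low agreement rate flags a question for manual research. A high rate shows only that the tool is consistent, not that it is correct.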
The AI tools used in our comparison have time limitations. Copilot crafted most of its responses from its training database, which was current through 2021. As federal, state and local laws are constantly changing, this limitation could significantly impact tax preparation.
The GPT-4 technology embedded in each tool can execute real-time internet searches; in our tests, the tools did so with limited success. Only Perplexity could correctly determine the percentage of government securities within Fidelity’s Government Money Market Fund (SPAXX), possibly because we queried it at a later date. Further, TaxGPT’s internet search ability appears to be limited to tax preparation, not the valuations that support it.
AI’s current limitations can result in responses that are vague, hedged or patently incorrect. Our findings are consistent with other tax-based experiments using ChatGPT.8 Without a proper understanding of the underlying tax rules, a tax professional who follows the advice of chatbots risks providing incorrect information to their clients or the IRS.
Tax Data Confidentiality Risks
To avoid violating AICPA and IRS standards of professional conduct, tax professionals must apply caution when using AI to support their work. Both AICPA and IRS standards require tax professionals to protect privileged tax information. Per OpenAI’s privacy policy, the application collects user-provided information to support its research, train the bot and improve the quality of future responses.9
Prompting the bot with privileged client information may inadvertently result in divulging this information in future prompts; however, TaxGPT’s privacy settings reportedly mitigate this risk. Unauthorized disclosures of privileged tax return information could be subject to strict and severe IRS penalties.
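As an illustration of the caution described above, the sketch below strips formatted taxpayer identifiers from a prompt before it is sent to any chatbot. It assumes Social Security numbers appear as 123-45-6789 and employer identification numbers as 12-3456789. The function name and placeholder text are hypothetical. Pattern matching of this kind does not catch names, addresses or unformatted numbers, so it is a first filter rather than a substitute for professional judgment about what may be disclosed.

```python
import re

# Assumed formats: SSN written as 123-45-6789, EIN written as 12-3456789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EIN_PATTERN = re.compile(r"\b\d{2}-\d{7}\b")


def redact_identifiers(prompt: str) -> str:
    """Replace formatted SSNs and EINs in `prompt` with placeholders."""
    redacted = SSN_PATTERN.sub("[SSN REDACTED]", prompt)
    redacted = EIN_PATTERN.sub("[EIN REDACTED]", redacted)
    return redacted


if __name__ == "__main__":
    # Made-up identifiers used purely for demonstration.
    draft = "Client 123-45-6789 owns a business with EIN 12-3456789."
    print(redact_identifiers(draft))
```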
Additionally, Circular 230 requires due diligence from the tax professional and this due diligence extends to reliance on others. Using a chat-based AI tool to support tax preparation thus requires “engaging, supervising, training, and evaluating” the processes and the output provided by the bot.
Circular 230 imposes an even higher standard for written tax advice, forbidding tax professionals from relying on the advice of others if “the practitioner knows or reasonably should know that the opinion of the other person should not be relied on.”10 AI’s time limitations and admonition to “always consult with a tax professional or the IRS for the most accurate and up-to-date information” cast significant doubt on the reliability of the tool for many tax questions.
Implications of Errors on Returns
Our study illustrates how utilizing AI in tax preparation can yield incorrect answers to complex tax questions. It is possible that AI can be used to support the preparation of “simple” 1040 returns, freeing preparers to focus on complex tax matters that require significant judgment and reducing preparation errors that may occur near the filing deadline.
Our companion article ("Assessing AI From a Tax Perspective, Part 1") observed widely variable estimations of the tax basis used to calculate taxable income from investments. Reliance on AI’s bogus estimations could widen the tax gap, the difference between taxes owed and taxes actually paid.
Tax Note: The IRS is utilizing AI to combat the estimated $700 billion annual tax gap. In October 2024, Martin Fiore, deputy vice chair of tax at EY America, stated, “With AI, millions of lines of information can be looked at in minutes or hours instead of weeks or months.”11 As early as 2020, the IRS, states and foreign governments were utilizing AI to improve tax audits.12
Conclusion: Utilize AI Platforms With Extreme Care
Based on our tax scenario questions arising from the 2023 tax filing season, chat-based AI tools cannot fully synthesize constantly evolving federal and state laws, regulations and court opinions that drive tax judgments. While it will be interesting to see if AI tools’ tax answers improve in the coming years, tax law could be difficult or impossible for AI to truly master, as tax law is often full of gray areas, legal interpretations and changing rules.
AI answers are based largely on available reams of historical data that take time to incorporate into an AI platform. For certain tax prompts, Copilot’s database only included updates through December 2021.
Tax professionals should utilize AI platforms with extreme care, as answers are often outdated, wrong, misleading and/or incomplete (so-called AI hallucinations). Further, tax queries may be incorporated into future updates to AI platforms, which can raise material data confidentiality concerns under the AICPA’s SSTS and IRS Circular 230.
As Beverly Goodman, a tax manager quoted in the Washington Post, said, “I feel that my job as a tax professional is very secure.” The results of our study strongly support this statement.
Exhibit 1. Evolution of Chat-Based AI
1966: First chatbot, ELIZA, is programmed to identify key terms in user prompt and respond with sympathetic rephrasing.
1980s: Chatbot programming incorporates statistical modeling to forecast the most likely conversational response.
Early 2000s: E-commerce gives rise to new chatbot-driven digital assistants, such as Alaska Airlines’ Jenn.
2011: Apple phones begin featuring Siri, which uses a large language model to respond to user prompts and automate tasks.
2014: Amazon launches Alexa, its digital assistant powered by large language models.
2015: OpenAI founded as a not-for-profit organization devoted to making artificial intelligence technology widely and freely available.
2022: ChatGPT 3.0 released to the public.
2023: (March) ChatGPT 3.5 released to the public and ChatGPT 4.0 made available to paid subscribers. (September) Perplexity launches its AI-assisted search tool. (November) Microsoft begins incorporating Copilot, its GPT-4 powered AI tool, within Windows 11 updates. (December) Google launches its upgraded AI tool, Gemini.
2024: Microsoft promotes its Copilot tool in a Super Bowl advertisement.
Exhibit 2. Comparing GPT Tool Responses to Tax Prompts
In late April 2024, we asked AI platforms Perplexity and TaxGPT the questions below and compared their results with those we had obtained earlier from Copilot during the 2024 tax season. We did not observe significant improvements in quality or accuracy using these tools.
Question: SPAXX % of US government securities held in 2023 (41.18 percent)
Copilot: Provided 2022 rates and recommended checking with Fidelity for updated information.
Perplexity: Correct.
TaxGPT: Could Not Determine.
Question: Closing value of Coca-Cola stock on October 5, 1995 (It was $17.66 per share, split-adjusted)
Copilot: The earliest recorded stock quote was $9.30 on December 29, 1995. Copilot recommended checking a “detailed historical stock price database.”
Perplexity: Could Not Determine. Like Copilot, the closest information available was the closing price for Coca-Cola (KO) on December 29, 1995, which was $12.92.
TaxGPT: Could Not Determine. It referred us to financial databases, stock market archives or historical stock price services.
Question: Per the IRC, does the cost of a computer qualify for education credits?
Copilot: Responded vaguely that “…the cost of a computer may qualify for education credits under certain conditions.” None of Copilot’s three precision modes answered completely correctly.
Perplexity: Incorrect. Perplexity’s search concluded that “the cost of a computer qualifies for education credits as long as it is used primarily by the beneficiary…”
TaxGPT: Incorrectly reported that a computer can be a qualifying expense (if required by the institution) under both AOTC and the LLC.
About the Authors: William M. VanDenburgh, Ph.D., is an Associate Professor of Accounting at the College of Charleston (vandenburghbm@cofc.edu). Kimberly Jane Tribou, Ph.D., CPA, is an Assistant Professor of Accounting at the College of Charleston (triboukj@cofc.edu). James M. Braswell, Ph.D., is an Associate Professor of Accounting at the College of Charleston (braswelljm@cofc.edu).
Footnotes
1. OpenAI blog, 12/11/2015: https://openai.com/blog/introducing-openai
2. Microsoft Copilot announcement: https://blogs.microsoft.com/blog/2023/09/21/announcing-microsoft-copilot-your-everyday-ai-companion/; Google Gemini (Will Knight, Wired Magazine): https://www.wired.com/story/google-gemini-generative-ai-boom/
4. https://copilot.microsoft.com/, https://blog.google/technology/ai/google-gemini-ai/#sundar-note, https://www.perplexity.ai/ and https://chat.openai.com/auth/login
5. Forbes Magazine on Perplexity.ai: https://www.forbes.com/sites/joannechen/2023/09/06/how-perplexityai-is-pioneering-the-future-of-search/?sh=5dcde902ad91
6. https://www.cpajournal.com/2023/09/19/chatgpt-for-legal-and-tax-professionals/, September 2023; https://www.taxgpt.com/blog/taxgpt-vs-chatgpt
8. For a sampling of these experiments, see https://www.journalofaccountancy.com/news/2023/jun/can-chatgpt-answer-clients-questions.html, https://www.taxnotes.com/tax-notes-talk/podcast/chatgpt-takes-tax/7g8z and https://wapo.st/43hBwaJ-
9. https://www.cpajournal.com/2023/09/19/chatgpt-for-legal-and-tax-professionals/, September 2023
10. IRS Circular 230, §10.22(a)(1), §10.22(b) and §10.37(b)(1) and https://www.cpajournal.com/2023/09/19/chatgpt-for-legal-and-tax-professionals/, September 2023
11. https://www.barrons.com/articles/the-irs-will-use-ai-to-do-more-tax-audits-its-not-all-bad-b3c03bf1?mod=Searchresults, 10/12/24
12. https://www.wsj.com/articles/ai-comes-to-the-tax-code-11582713000, 2/26/20
Exhibit 1 Links
Evolution of Chat-Based AI: https://www.cnn.com/2022/08/20/tech/chatbot-ai-history/index.html
2011, Apple phones: https://www.linkedin.com/pulse/from-siri-alexa-brief-history-conversational-ai-raghu/
2014, Amazon launches Alexa: https://www.linkedin.com/pulse/from-siri-alexa-brief-history-conversational-ai-raghu/
2015, OpenAI founded: https://openai.com/blog/introducing-openai
2019, OpenAI enters into licensing agreement with Microsoft: https://openai.com/blog/microsoft-invests-in-and-partners-with-openai
2022, ChatGPT 3.0 released to the public: https://www.pcmag.com/news/the-new-chatgpt-what-you-get-with-gpt-4-vs-gpt-35
2023, ChatGPT 4.0 made available to paid subscribers: https://www.pcmag.com/news/the-new-chatgpt-what-you-get-with-gpt-4-vs-gpt-35
2023, Perplexity launches its AI-assisted search tool: https://www.forbes.com/sites/joannechen/2023/09/06/how-perplexityai-is-pioneering-the-future-of-search/?sh=5dcde902ad91
2023, Microsoft begins incorporating Copilot: https://blogs.microsoft.com/blog/2023/09/21/announcing-microsoft-copilot-your-everyday-ai-companion/
2023, Google launches its upgraded AI tool, Gemini: https://www.wired.com/story/google-gemini-generative-ai-boom/
Thanks to the Sponsors of Today's CPA Magazine
This content was made possible by the sponsors of this issue of Today's CPA Magazine: