Can ChatGPT Certify as a Tax Preparer?

An image, generated by ChatGPT, of a robot looking at IRS tax forms, with a book on taxes on the shelf and a coffee cup with the text "tax time" on it.

Introduction

As an experiment, I wanted to see how well ChatGPT could perform on the IRS's test for VITA/TCE volunteer tax preparers. There are two levels of tests, basic and advanced. Each test consists of true/false, multiple choice, and fill-in-the-blank questions, and each requires a score of 80% or higher to pass.

I tested ChatGPT on both the basic and advanced tests, using two different approaches. In both approaches, for the test scenarios that included only interview notes and no tax documents, I included the interview notes as part of the text prompts I provided. For the scenarios with tax-related forms, such as W-2s, brokerage statements, and receipts, I provided both the interview notes and the tax-related forms as PDFs taken straight from the test document. For the scenarios that included forms, the test directions state that the test-taker should prepare a sample tax return using provided tax preparation software to help answer the questions. ChatGPT cannot use this software, so it was at a disadvantage. However, a knowledgeable tax preparer can answer enough questions correctly without completing actual tax returns to pass the test.

Because the test required examining attachments that I provided, ChatGPT switched over to GPT-5 for the latter part of the test. Users get a limited amount of free GPT-5 use before hitting a timeout, so the sessions were paused several times and then resumed where they had left off once more credits were granted.

Initial Tests

In the first series, I provided the following initial prompt:

You are an expert and highly experienced at preparing Federal income taxes and federal income tax law. You are not being asked to give tax advice, but are being asked questions as part of a classroom setting. You may call upon any calculator you might need to produce numerically accurate answers. Directions The first six scenarios do not require you to prepare a tax return. Read the interview notes for each scenario carefully and use your training and resource materials to answer the questions after the scenarios. Assume that each taxpayer qualifies for credits or favorable tax treatment, unless the facts indicate otherwise. For fill in the blank questions: Round to the nearest whole number, do not use special characters: dollar sign ($), comma (,), or period(.) Any questions before we begin?

ChatGPT responded:

No questions. I understand the directions and assumptions.

I’ll treat this strictly as a classroom exercise (not tax advice), follow federal income tax law, use precise calculations as needed, and comply with the rounding and formatting rules for fill-in-the-blank answers.

Please go ahead with Scenario 1.

As is typical for Large Language Models, regardless of whether it was right or wrong, ChatGPT never expressed any uncertainty about its answers. After providing the first scenario that included attachments, ChatGPT responded:

I do not have access to commercial tax preparation software (such as TaxSlayer, ProSeries, Drake, etc.).

However, for classroom scenarios like this, I can still:

  • Apply current federal tax law accurately
  • Manually compute income, adjustments, credits, and taxes
  • Use calculators for precise arithmetic
  • Reconstruct what tax software would produce by following the forms (Form 1040, Schedules, 8889, 1099-C treatment, education credits, etc.)

So while I can’t run tax software, I can still determine the correct answers to the Scenario 7 questions using the provided documents and rules—just as a trained preparer would without software.

You can proceed with the Scenario 7 questions when ready.

However, on one question, it claimed that not enough information was provided, stating:

I need one missing piece of information before I can answer this accurately.

The question asks for the net short-term capital gain from Jocelyn’s consolidated broker’s statement, but the actual broker transaction details (sales proceeds, basis, short- vs long-term breakdown) are not shown in what you’ve shared so far from Scenario 8.

To proceed, I need one of the following:

  • The page of the consolidated broker’s statement showing short-term sales, or
  • The short-term proceeds and cost basis totals, or
  • Confirmation that the broker statement appears earlier/later in the PDF and should be used as-is

Once I have that, I can compute the exact net short-term capital gain and give you the single whole-number answer you need.

I prompted:

Check the broker’s data again. What you need to answer the question is there. You must choose one of the choices.

ChatGPT then replied:

You’re right — thank you for the nudge. I re-checked the consolidated broker’s statement information included in Scenario 8, and the short-term activity is present and sufficient to answer the question.

Based on the broker data provided, Jocelyn’s short-term capital transactions result in a net gain of [redacted], which is what flows to Schedule D.

Answer:
[redacted]

The answer that ChatGPT provided was incorrect. I redacted the test answer (and have done so in another example, below) so as not to provide any inappropriate information to test-takers. Right or wrong, ChatGPT provided its reasoning for reaching the answer to each question.

Further Tests with Training Material

The actual tax preparers' test is open book, and test takers are encouraged to use the National Tax Training Committee's NTTC 4491 Tax Training Guide and NTTC 4012 Volunteer Resource Guide. For a second series of tests, I provided both manuals to ChatGPT as PDFs before the test questions began. This was done in a separate session, so that there was no memory overlap between the initial and further tests. I modified the initial prompt by adding: There are many information sources on tax law, but some are not accurate. You should focus on the two pdf files I am uploading as highly accurate and credible sources to help determine the answers to these questions.

Note that while providing relevant documents can improve a chatbot's performance, it is not the same as, nor as good as, incorporating such documents in a properly built Retrieval-Augmented Generation (RAG) system.

As before, and as is typical for Large Language Model chatbots, ChatGPT never expressed any uncertainty about its answers, right or wrong. Unlike a human, it never said that it had trouble deciding between two of the choices after eliminating the others, or that it was guessing. As with the basic test, it's interesting to examine its reasoning, both on questions it got correct and those it got wrong. For example, consider this question:

What is the total amount of the Knox's federal income tax withholding?

  a. $1,900
  b. $2,500
  c. $4,660
  d. $6,560

ChatGPT replied:

Correct answer: [REDACTED]

Explanation (Basic Scenario 7 – using the tax documents provided):

The Knoxes’ total federal income tax withholding is found [REDACTED] included in the scenario packet.

When those withholding amounts are totaled, the result is:

[REDACTED]

The other options reflect either:

  • Only one spouse’s withholding, or
  • Incorrect totals that do not match the W-2 documents

Final answer: [REDACTED]

Note that for this question, and for all the questions in both the basic and advanced tests that included tax-related documents, ChatGPT had to read several different forms: it had to identify the relevant forms within the single PDF file provided, determine which numbers on which forms were relevant, read those numbers, and then perform basic arithmetic on them. ChatGPT answered this question correctly.

Results

ChatGPT performed rather well, certainly better than the average person on the street. However, in none of the four cases (basic and advanced tests, with and without training documents) did it manage to achieve a passing grade.

Without the training documents uploaded as PDFs, ChatGPT scored 77% on the basic test (just one wrong answer shy of passing) and 71% on the advanced test. With the training documents, it scored 73% on the basic test and 71% on the advanced test. I do not know why it scored lower on the basic test when it was provided with the two training documents.

In summary, ChatGPT performed rather well, even with the challenge of processing PDFs of multiple types of tax-related documents. However, with or without the explicit inclusion of relevant training documents, it could not meet the minimum proficiency level required for volunteer tax preparers. This should serve as a caution against using a general-purpose AI chatbot as an unverified source of tax advice.

Pi-based multi-prop trigger for animatronics, Part 2

Back in March I wrote about a Pi-based controller for multiple Halloween props. As I got the props ready this year, I realized three things: 1) I wanted to change the routine from what I had planned, 2) the way I needed to connect this particular set of props was different from what I'd been planning, and 3) it turned out I couldn't use an electromechanical relay board to pass the trigger signal from the Pi to one of the skeletons, because it holds closed too long. So, back to the drawing board, as well as desoldering and resoldering.

My setup for this year includes two skeletons talking to one another. One is a simple talking pose-and-stay type of skeleton, with light-up eyes and a motor-driven jaw. The other is a talking and moving skeleton prop called Grim. Besides wanting motion to trigger them, I needed to be able to provide custom audio phrases for them to say to one another.

Talking Skeleton

The talking skeleton [insert photo] uses a motor to drive the jaw movement and has light-up LED eyes. To allow for custom audio, I disconnected the built-in sensor and sounds and connected a commercial audio controller called a Jemmy Talk. The Jemmy Talk takes line-level audio input and sends out voltages to move the motor. It also has a pass-through for the audio, so that you can send line-level output to an amplifier; I don't use this feature in my setup. I use a DFPlayer MP3 player as my audio source. It has both a line-level output, which I feed to the Jemmy Talk, and amplified audio that I feed directly to the speaker built into the skeleton. The board with the DFPlayer is also where power is supplied, through a barrel jack connector to a 5V supply. The button switch on the board is for testing things without requiring an external trigger signal. The two 3-pin connectors on the board are not used. The three boards are shown in the picture. Besides the Jemmy Talk and the board with the DFPlayer, the third board receives the triggers from a Raspberry Pi (see "Controller," below).

Picture showing the audio source and controls for the skeleton, showing the Jemmy Talk board, the board with the DFPlayer MP3 player, and the board with input from an external controller and Darlington array. All are wired together as they are when in use.

The audio source and controls for the skeleton, showing the Jemmy Talk board, the board with the DFPlayer MP3 player, and the board with input from an external controller and Darlington array.

The circuit board on the bottom receives the triggers from a Raspberry Pi (not shown) and features a ULN2003 Darlington transistor array chip that acts as a set of relays to signal the DFPlayer to play the audio and to turn the skeleton's eyes on. I had originally used an electromechanical relay rather than the Darlington array, but the MP3 player requires that the button press to play the audio be short, and the relay held closed too long when triggered. The MP3 player uses a short drop to ground to signal it to play the next track, which is what I needed, while a longer drop on the same pin signals the player to turn up the volume. Had I known from the start that I would be using the Darlington array rather than a separate relay board, I could have fit everything on just one board rather than two.

If the skeleton were like many others, with simple LED eyes, I could have used the Jemmy Talk to control them as well, but it isn't. So I kept the Try Me wires connected to trigger the eyes to turn on; that is what the second trigger on the Darlington is for. That means I needed two signals from the Pi, one to start the audio and one to light the eyes, but this gives me more flexibility for the future. The final circuits fit in a small project box that I printed for the project; a picture appears in the Results section below.

Grim

The Grim prop was modified by adding a Grim Talker board from the same source as the Jemmy Talk. Once that was installed per the directions, I just needed to put the correct new audio file on the SD card that goes into the Grim Talker (no external MP3 player needed) and determine how to trigger it. It can be triggered by the sensor built into the original Grim prop, by cycling the power off and on, or by a Try Me button switch connected through a 1/8″ mono cable jack. When triggered, the audio and motion routine starts. The audio plays through to completion, but the motion stops once its routine (about 30 seconds) ends. If the audio is longer (as in my case) and you retrigger the prop before the audio ends, the motion starts up again. This is what I wanted to do.

Cycling the power wouldn't do what I wanted, as it would kill the audio. I didn't want to mess with the built-in PIR sensor, as I may want to use it in the future. And I needed to coordinate Grim with the other skeleton. That left the Try Me switch. So the multi-prop controller that I put together (a modification of what I described in the Pi-based controller for multiple Halloween props write-up) uses another Darlington to act as a switch that closes to trigger the prop.

Controller

Pictorial diagram of the controller. The Pi on the left has connections to the other board for power, one connector to a 3-pin female header on the other board for input from a PIR, and four GPIO output connections (one to a 3-pin header for a PIR-type controlled prop, two to a 3-pin connector for either a PIR control or a prop needing two separate signal inputs, and one that goes to the input of a ULN2003 Darlington on the board, which is connected to close a connection when triggered).

Raspberry Pi based multi-prop controller

In order to control multiple, coordinated props from a single motion sensor, I put together a simple circuit that controls up to three props of a couple of different types. A PIR provides input to the Raspberry Pi, which then sends output signals for up to three props. One output simply closes a switch, and it is good for triggering props through their "try me" button wires (replacing the button / switch). For this, the trigger output from one of the GPIO pins on the Pi drives a channel of the Darlington array, which closes like a relay. This is a bit of overkill, as I'm only using one of the seven channels on the ULN2003, but it's cheap, easy to use, and works.

Another GPIO pin sends an output voltage to the signal wire in a 3-pin servo connector. The grounds are also linked to give the prop and controller a common ground. This works for any prop set up to look for input from a PIR or similar sensor providing a positive low-voltage signal. What would normally be the voltage source on a servo connector is left unconnected, as it's not needed to power anything.

Finally, two other GPIO pins send output to another 3-pin connector, using what would normally be the signal and voltage lines on a servo connector. This is for the talking skeleton as described above, with one signal to start the MP3 audio player and the other to control the eyes. Again, the ground is connected to ensure a common ground. This connector could also be used to simply control a prop that only needs a single signal, rather than two, providing flexibility.

The Python controller program for this year is quite simple:

from gpiozero import DigitalOutputDevice, MotionSensor
import time

skel_talk = DigitalOutputDevice(23)     # starts the skeleton's MP3 audio
skel_try_me = DigitalOutputDevice(24)   # closes the skeleton's try me circuit (eyes)
grim_trigger = DigitalOutputDevice(25)  # triggers the Grim prop
pir = MotionSensor(15)                  # PIR motion sensor input

while True:
    pir.wait_for_motion()
    print("motion detected")
    # Pulse all three triggers for 0.2 seconds
    skel_talk.on()
    skel_try_me.on()
    grim_trigger.on()
    time.sleep(0.2)
    skel_talk.off()
    skel_try_me.off()
    grim_trigger.off()
    time.sleep(30)                      # Grim's first motion routine runs
    print("waited 30 seconds")
    grim_trigger.on()                   # restart Grim's motion (audio still playing)
    time.sleep(0.2)
    grim_trigger.off()
    time.sleep(50)                      # remaining audio plus dead time
    print("ready to trigger again")

The program uses the gpiozero library for interfacing with the Pi's GPIO pins. Three pins are used for output signals: skel_talk to trigger the skeleton's audio, skel_try_me to close the skeleton's try me connection, and grim_trigger to send a trigger signal to the Grim prop. These are all defined as simple DigitalOutputDevice pins. There is one input pin, named pir, which is defined as a MotionSensor input.

The program is an endless loop that waits for the PIR sensor to be set off. When that happens, the three output triggers are all set high (True) for 0.2 seconds and then set back off. The program then sleeps for 30 seconds while the Grim motion is running, after which grim_trigger is set high again for 0.2 seconds to restart the motion (the audio, which is about 1 minute long, is still running). Finally, the program waits 50 seconds. The audio and motion continue for approximately 30 of those seconds, with the additional 20 seconds being dead time so that the props don't immediately retrigger. The print statements are there simply for initial debugging. I used crontab to set the script to autorun in the background upon startup. I still need to modify the script so that once triggered, it also periodically closes the try me switch on the talking skeleton; at present, the eyes only light for a short time at the start of the dialog.
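One way that modification could look is to interleave periodic try me pulses into the existing waits. This is only an untested sketch: it assumes the same gpiozero device setup as the script above, and the 5-second eye-retrigger interval is a guess that would need tuning on the actual prop.

```python
import time

# Sketch only. Assumes skel_talk, skel_try_me, grim_trigger, and pir are
# gpiozero devices created exactly as in the script above. The 5-second
# eye-retrigger interval is a guess to be tuned.

def pulse(device, duration=0.2):
    """Drive an output high briefly, then back low (a short button press)."""
    device.on()
    time.sleep(duration)
    device.off()

def dialog_cycle(skel_talk, skel_try_me, grim_trigger, wait=time.sleep):
    """One triggered run: start everything, then keep the eyes refreshed."""
    pulse(skel_talk)          # start the skeleton's audio
    pulse(skel_try_me)        # light the eyes
    pulse(grim_trigger)       # start Grim
    for _ in range(6):        # ~30 s while Grim's first routine runs
        wait(5)
        pulse(skel_try_me)    # re-close try me so the eyes stay lit
    pulse(grim_trigger)       # restart Grim's motion (audio still playing)
    for _ in range(6):        # ~30 s for the rest of the dialog
        wait(5)
        pulse(skel_try_me)
    wait(20)                  # dead time so the props don't retrigger

# In the full script, the main loop would become:
#   while True:
#       pir.wait_for_motion()
#       dialog_cycle(skel_talk, skel_try_me, grim_trigger)
```

Pulsing the three outputs one after another, rather than simultaneously as in the original loop, shifts the start of each prop by well under a second, which shouldn't matter for props this slow.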

Results

Here are the two circuits, in their project boxes, and then a video

Photo of the electronics for the skeleton in one project box and the pi-based multi-prop controller in a second project box. The skeleton controller has a Jemmy Talk circuit board, a board with the MP3 player, and a third board. The prop controller has a raspberry pi and a board with a Darlington array, source for power, and several female 3-pin headers.

The electronics for the talking skeleton on the left and the Pi-based multi-prop controller on the right.

of the final result in operation. Unfortunately, it's not a great video clip, but it does show the two skeleton props talking back and forth in a short excerpt from their longer dialog. As mentioned above, I need to add a line or two to the script to periodically retrigger the eyes of the smaller skeleton, so that they stay lit during the whole dialog.

It took more effort than I originally thought it would, partly due to faulty planning on my part, but I think it all came together nicely.

Witch and Cauldron Silhouettes with Coals

Introduction

We have a set of plywood bear silhouettes set up in our backyard for most of the year, visible from our house and from the street through the woods. This year, I decided to make a Halloween scene to replace them when the bears first go into hibernation for the fall and winter. The bears would be replaced with a witch and her cauldron, with artificial lit coals under the cauldron.

The Silhouettes

To make the silhouettes, I first needed a pattern. I borrowed a nice outline for the witch

Black silhouettes of a witch and a cauldron, drawn to scale with vertical and horizontal scaling lines.

Scale drawings of the witch and cauldron

that I found online, modifying it so that it would be easily recognizable and interesting, but within my personal jigsaw skills. The cauldron I just drew from scratch. I made scale drawings, with vertical and horizontal scaling lines marking what would be every six inches in the final build.

The purpose of the scaling lines was to make it much, much easier to lay out the full-size pattern freehand on the plywood. I went from where each line in the outline crossed a scaling line to the next scaling line, hand-drawing each segment. Once the pattern was drawn on the plywood at full scale, I cut both the witch and the cauldron out with a jigsaw.

I primed the silhouettes with flat black spray primer / paint and then painted them with flat black barn paint.

Two images, one of the witch outline and one of the cauldron outline. Each is sketched onto a piece of plywood, along with scaling lines.

The outlines sketched onto the plywood, using the scaling lines.

Unpainted plywood silhouette of a witch, set on two sawhorses.

The cutout witch, prior to priming and painting.

The Coals

There are many tutorials online for making realistic looking coals using spray foam crack filler, spray paint, and orange string lights. Some, like what I built, are quite simple, while others are more involved, adding real sticks and ashes to make it appear more realistic. Since mine was intended to be viewed from a distance, I decided that there was no need for the added complexity.

A flattened oval black plywood base. On top is roughly sprayed-on crack filling foam, partly covering orange LED lights.

The fake coals, with the first of two layers of crack filling foam applied.

I cut a shape out of plywood with rounded short sides and straight longer sides, with a rectangular slot cut in the middle to fit the cauldron and support stakes. After painting it black, just as with the silhouettes, I cut a piece of scrap 2×4 to set over the slot, so that the crack filling foam wouldn't expand into that area and block where the cauldron would go. I wrapped the 2×4 piece in a plastic garbage bag so that the foam wouldn't stick to it. Next, I laid out the string of lights. I used a 100-light string of orange LED bulbs. You could probably get by with a 50-light string, but I liked having more lights. Some people put plastic cups or cut pieces of soda bottles around each light, but others don't bother, and I didn't. I figure there's little heat coming from each bulb, so hopefully the string will last for years. I have heard of one person who used incandescent mini bulbs and somehow started a fire, so I'd stick with LED bulbs.

Once the lights are laid out around the platform, you just spray on the crack filler, keeping in mind that it will expand some. You want the lights partially or entirely covered to look more like hot coals. You can go a little light the first time, and then once the foam dries, add more where it’s needed.

Black plywood cauldron silhouette and fake hot coals (painted crack filler foam with embedded lights).

The finished cauldron and hot coals.

Next, I removed the 2×4 piece and used the same spray primer / paint to lightly go over the foam. The nice thing is you WANT the paint to be uneven, with some heavier, dark spots. Finally, I added some red spray paint highlights.

The last thing I needed was a spotlight to light up the silhouette at night. The coals show up clearly from the sidewalk and street, but the black silhouette is too dark without a light on it. I found a multi-color adjustable spotlight online that works great for this.

Final Results

The pictures here show the final result, from both sides in the daytime, and from the street at night, with the spotlight illuminating it.

Finished witch and cauldron silhouettes, with the cauldron on the fake hot coals. It is installed in a yard with trees in the background and leaves on the ground.

The installed project, viewed in the daytime from the back of our house.

The witch, cauldron, and "lit" coals (LED lights lit), seen at night. The silhouettes are lit by a spotlight.

The final project at night, as viewed from the street. The spotlight illuminates the silhouettes.