Zip is committed to being the one-stop shop for all your procurement needs. As a part of this mission, we recently launched our Vendor Cards product to empower customers to efficiently pay for and manage their expenses. While our Vendor Cards product launched with a robust feature set, we’re continuously looking for ways to innovate and make our customers’ lives easier. To this end, we’re excited to introduce AI-powered receipt matching, a powerful tool for helping admins manage spend and streamline the transaction management workflow.
Currently, after an employee makes a purchase on their vendor card, they’ll typically be required to upload a receipt, add a descriptive memo, and submit it to their admin for review in Zip. This process can be a source of friction, with cardholders being required to transfer receipts between multiple systems and admins often needing to chase down “delinquent” cardholders. With this launch, cardholders now have the ability to email their receipts and memos straight to Zip.
This might seem like magic at first, so let’s jump in and see what goes on behind the scenes.
The journey
The journey through our infrastructure begins with a user sending their receipt to Zip via email. Once the receipt lands in our Google Workspace, we forward it to Amazon’s Simple Email Service (SES) so that we can save the email in S3. S3 then triggers an AWS Lambda function that notifies Zip and provides the details needed to retrieve the file from S3.
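To make the S3-to-Lambda hop concrete, here is a minimal sketch of what such a handler could look like. The event shape follows the standard S3 event notification format; the `notify` callback is a hypothetical stand-in for however the Lambda actually calls Zip, which this post does not describe.

```python
from urllib.parse import unquote_plus


def extract_s3_locations(event: dict) -> list:
    """Pull bucket/key pairs out of an S3 event notification payload."""
    locations = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        locations.append({
            "bucket": s3_info["bucket"]["name"],
            # Object keys arrive URL-encoded in S3 event notifications.
            "key": unquote_plus(s3_info["object"]["key"]),
        })
    return locations


def handler(event, context, notify=print):
    """Lambda entry point. `notify` is a placeholder (assumption) for the call to Zip."""
    locations = extract_s3_locations(event)
    for location in locations:
        notify(location)
    return {"notified": len(locations)}
```

Keeping the event parsing separate from the notification call makes the parsing easy to test locally with a hand-written event.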
After Zip receives the S3 receipt information, preprocessing kicks off. This involves fetching the email from S3, parsing it, extracting the attachments, and converting the email body from HTML to PDF to handle in-email receipts. We then send the attachments in bulk to an external Optical Character Recognition (OCR) service, which translates the receipts from images and PDFs into textual data.
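As a rough illustration of the parsing step, the sketch below uses Python’s standard `email` package to separate the body from the attachments. The function name `split_email` is made up for this example, and the HTML-to-PDF conversion and the OCR call are deliberately left out since they depend on external tools.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def split_email(raw: bytes):
    """Return (body_text, attachments) for a raw RFC 5322 email.

    attachments is a list of (filename, content_type, payload_bytes).
    """
    msg: EmailMessage = message_from_bytes(raw, policy=policy.default)

    # Prefer the HTML body (in-email receipts), fall back to plain text.
    body = msg.get_body(preferencelist=("html", "plain"))
    body_text = body.get_content() if body is not None else None

    attachments = [
        (part.get_filename(), part.get_content_type(), part.get_payload(decode=True))
        for part in msg.iter_attachments()
    ]
    return body_text, attachments
```

In a real pipeline the body would then go to an HTML-to-PDF converter and be sent for OCR alongside the attachments.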
Matching a receipt
At this stage, we have everything we need to pinpoint the transaction the receipt was sent for. Our matching process starts by retrieving the transactions that belong to the receipt sender. We initially explored a solution that would go through the different filters (card’s last four digits, the date of transaction, the transaction amount, and the currency it was transacted in), keep transactions that successfully match the information returned by OCR, and throw out the ones that don’t.
We wish it were that simple! To illustrate the complexity, let’s take a closer look at one of the filters: the transaction date. It’s quite hard to match accurately because the data returned by OCR can be inaccurate and the transaction date doesn’t always match the receipt date (for example, merchants can settle transactions days after the receipt is issued). There are also time zones to take into account, and invoices can have multiple dates on them.
To tackle all these issues, we use a window of dates instead of just the date that was returned by OCR, and we keep the transactions that fall within the window. The window’s direction and size vary depending on whether the receipt is, say, a restaurant receipt or an annual subscription invoice. If the filtered transactions don’t satisfy a set of heuristics, we iteratively increase the size of the window and try again.
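The expanding window can be sketched roughly as follows. Everything here is illustrative: the `Transaction` type, the one-day growth step, the `max_rounds` cap, and the default heuristic ("at least one candidate") are assumptions for the example, not our production values.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Transaction:
    id: str
    posted_on: date


def find_in_date_window(transactions, receipt_date, days_before, days_after,
                        max_rounds=5, is_acceptable=lambda found: len(found) > 0):
    """Return transactions inside a date window, widening it until the heuristic passes."""
    # Only grow the side(s) of the window that were open to begin with,
    # so the window keeps its original direction.
    grow_before = 1 if days_before > 0 else 0
    grow_after = 1 if days_after > 0 else 0

    for _ in range(max_rounds):
        start = receipt_date - timedelta(days=days_before)
        end = receipt_date + timedelta(days=days_after)
        candidates = [t for t in transactions if start <= t.posted_on <= end]
        if is_acceptable(candidates):
            return candidates
        days_before += grow_before
        days_after += grow_after

    # No window within the allowed rounds satisfied the heuristic.
    return []
```

A restaurant receipt might start with a forward-only window (`days_before=0, days_after=1`), since settlement tends to trail the purchase, while an invoice might open the window in both directions.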
All these adaptations enable us to match transactions on date more accurately. Speaking of which, the date is one of the most important filters to match on. One can imagine that matching a transaction date is much more helpful than matching the currency of the transaction. To this end, we assign each filter a weight that reflects its importance, and we only consider a transaction matched when enough total weight has been matched.
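A minimal sketch of weighted matching is shown below. The weights and the threshold are invented for illustration (integers are used to avoid floating-point comparison surprises); they only encode the idea that a date match counts for far more than a currency match.

```python
# Hypothetical weights: higher means the filter is stronger evidence of a match.
FILTER_WEIGHTS = {"date": 40, "amount": 35, "last_four": 20, "currency": 5}
MATCH_THRESHOLD = 70  # Hypothetical minimum total weight to call it a match.


def match_score(matched_filters) -> int:
    """Sum the weights of the filters that agreed with the OCR output."""
    return sum(weight for name, weight in FILTER_WEIGHTS.items() if name in matched_filters)


def is_match(matched_filters, threshold: int = MATCH_THRESHOLD) -> bool:
    """A transaction counts as matched once enough total weight has matched."""
    return match_score(matched_filters) >= threshold
```

With these numbers, date plus amount (75) is enough on its own, while amount, last four digits, and currency together (60) are not.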
From good to great
Now that we have our base algorithms in place, we need to tune filter weights and confidence thresholds and iterate on the matching algorithm itself. To do this, we built an internal framework to test our system end-to-end on 500+ internally-matched receipts.
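The core of such a framework can be quite small. The sketch below is not our internal tool; it only shows the general shape, assuming a labeled set of `(receipt, expected_transaction_id)` pairs and a hypothetical `make_matcher(threshold)` factory that returns a matching function.

```python
def match_rate(cases, matcher) -> float:
    """Fraction of labeled receipts for which the matcher returns the expected transaction.

    An expected value of None means the receipt should not match anything.
    """
    if not cases:
        return 0.0
    correct = sum(1 for receipt, expected in cases if matcher(receipt) == expected)
    return correct / len(cases)


def best_threshold(cases, make_matcher, thresholds):
    """Pick the candidate threshold with the highest match rate on the labeled set."""
    return max(thresholds, key=lambda t: match_rate(cases, make_matcher(t)))
```

Including receipts that should not match anything in the labeled set keeps the sweep from simply rewarding the most permissive threshold.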
We also expected many edge cases, so we first launched to a small subset of customers: those for whom this feature was deal-blocking and those who had explicitly requested it.
Completing the user feedback loop
We’ve now built a working receipt matching experience, but something is still missing… there is no UI! There’s no way for the user to track where their receipt is. Has it matched? Did it match to the correct transaction? Or did it fail completely?
To bring visibility into this process, we send a notification to the receipt sender as soon as we finish the matching process. If we fail to match, we notify the user of our best guess as to why it failed and link them to the transactions page so they can match it manually. If we match successfully, we send a success notification that includes a one-click submit experience to fully close out the transaction lifecycle.
Getting ahead of possible issues
Three components of the process were tricky to work with and needed extra attention before we fully rolled out to all our customers.
Firstly, emails are notoriously finicky to parse due to a lack of accepted standards. To this end, we paid special attention to catching and surfacing email-parsing errors, and we added a generic catch-all that logs every unknown failure case in detail.
Secondly, AI systems tend to be black boxes, so we added logs and metrics that explicitly display every field returned by OCR, including fields that were filtered out due to confidence thresholds. This allowed us to adjust the thresholds quickly and precisely.
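One way to get this kind of visibility is to log every OCR field at the point where the confidence threshold is applied. The sketch below assumes the OCR output has been normalized into a `{field: (value, confidence)}` dictionary; that shape, the logger name, and the default threshold are assumptions for the example, not the format of any particular OCR provider.

```python
import logging

logger = logging.getLogger("receipt_matching.ocr")  # Hypothetical logger name.


def filter_ocr_fields(fields: dict, thresholds: dict, default_threshold: float = 0.5):
    """Split OCR fields into (kept, dropped) by confidence, logging every field either way."""
    kept, dropped = {}, {}
    for name, (value, confidence) in fields.items():
        threshold = thresholds.get(name, default_threshold)
        is_kept = confidence >= threshold
        (kept if is_kept else dropped)[name] = value
        # Log dropped fields too, so threshold changes can be evaluated after the fact.
        logger.info(
            "ocr_field name=%s value=%r confidence=%.2f threshold=%.2f kept=%s",
            name, value, confidence, threshold, is_kept,
        )
    return kept, dropped
```

Returning the dropped fields alongside the kept ones also makes it straightforward to emit per-field metrics.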
Lastly, a good portion of the flow is not within Zip at all and has very poor built-in visibility. For example, S3 doesn’t provide a way to sort or search through bucket contents, and SES doesn’t have any email viewing capabilities. To help developers debug, we built internal tooling to view the emails that cause errors.
This culminated in a dashboard with around 20 metrics and many different logs. Building this dashboard before we launched to any live customers significantly helped us identify and debug issues quickly.
Measuring success
Typically, when thinking about success, engineers gravitate towards concrete data — the more numbers and percents, the better.
I personally was not an ideal vendor cards user; I frequently ignored the reminders to upload receipts. Although not an objective measure of success, a north star that I worked towards during this project was: could I build something so simple to use that even I would use it?
More concretely, we wanted a smooth process with a high match rate and high adoption. As of the time of writing, excluding user errors, we’ve reached a great receipt match success rate and have received lots of positive user feedback. We’ve also seen adoption from a majority of vendor cards customers prior to any dedicated marketing push for the feature.
What’s next
- Evaluate OCR companies (especially those built specifically for receipts and invoices)
- Continue monitoring match rates, errors, and user satisfaction
- Automatically submit receipt for admin review