Smart Transaction Labeler
AI-powered labeling with prior-year learning, confidence scoring, and 50+ known vendor patterns.
How Smart Labeling Works
Your Input File
Upload an Excel workbook (.xlsx) from the Bank Statement Processor. Each sheet needs a Payee (or Payor) column and an Amount column. Sheet names are auto-detected — "Deposits & Additions", "Electronic Wdrwls", "Debits & Wdrwls", "Checks Paid", "Fees", "CC_XXXX", etc. Deposit sheets are treated separately — expense categories are never applied to income/deposit transactions.
Prior Year Data (Optional but Powerful)
Upload a previously labeled Excel file and the system learns from it. It indexes every payee + label combination, so if "ABC Consulting" was labeled "Legal & Professional" 8 out of 10 times last year, this year it auto-labels with high confidence. PY labels are automatically sanitized — inconsistent names like "Auto-Fuel" or "Supplies & Materials" are normalized to canonical forms.
Tip: Use last year's Smart Labeler output as your PY file — the more consistent your labels, the higher the confidence.
Classification Hierarchy
Each transaction runs through these steps in order. It stops at the first confident match:
- 1. Your vendor rules — keyword → category, with optional split or dollar threshold (100%)
- 2. Check number detection → flagged for manual review (upload check images for OCR)
- 3. Built-in threshold rules — Uber/Lyft, fast food, restaurant, pharmacy amounts
- 4. Prior-year data matches (70-100%)
- 5. 600+ known vendor patterns
- 6. Claude AI classification (if API key configured)
- 7. AI review pass — catches obvious misclassifications (e.g., restaurant labeled as fuel)
- 8. "NEEDS REVIEW" if nothing matches
After all labeling, Post-Label Rules run as a final pass — reclassify or split entire categories by dollar threshold (e.g., all "Meals" ≤ $100 → "Non-deductible Meal"). Category names are auto-normalized so "Auto-Fuel", "Auto_Fuel", and "Auto - Fuel" all merge into one group.
Tip: Vendor rules are checked in the order you set them — drag rules up or down to control priority. The first matching rule wins.
What You Get Back
Your original file plus 5 new columns, with sorted tabs shown first in the workbook:
- Sorted Tabs — shown first; grouped by category with subtotals and collapsible row groups (use the +/− buttons or 1/2 outline controls in Excel to expand/collapse). Split transactions appear as two rows with proportional amounts.
- Tax-Description — the assigned label (e.g., "Auto - Fuel", "Legal & Professional", "Not Income")
- Confidence — how sure the system is (0-100%)
- PY_Instances — prior year match ratio (e.g., "8/10" = labeled the same way 8 of 10 times)
- Match_Source — what matched: py_data, known_vendor, vendor_rule, threshold_rule, claude_ai, etc.
- Review_Notes — explanation of the label, split details, or why it needs review
Color Coding in Excel
Tax-Description and Confidence cells are color-coded so you can scan quickly:
- Green — High confidence (80%+), safe to accept
- Yellow — Moderate (50-79%), worth a quick review
- Orange — Low confidence (30-49%), needs review
- Red — Could not classify (<30%), requires manual review
- Blue — Classified by Claude AI
- Purple — Corrected by AI review pass
Post-Label Rules
Post-label rules apply after all classification is complete. Two modes:
- Reclassify — change a category based on dollar threshold. E.g., all "Meals" above $100 stay as "Meals" (deductible), at or below $100 become "Non-deductible Meal".
- Split % — split transactions above a threshold into two categories by percentage. E.g., all "Meals" above $50 become 50% Meals / 50% Non-deductible Meal. The sorted tab creates two rows with proportional amounts.
Tip: Use reclassify mode for meals deductibility thresholds. Use split mode for mixed-use expenses like a vehicle that's 70% business / 30% personal.
Tips for Best Results
- Set up vendor rules — add rules for common vendors (Amazon, meals, etc.) using the "Add Rule" button or templates dropdown. Use split for mixed-use vendors and threshold for amount-based decisions.
- Drag rules to set priority — rules are checked top to bottom; the first match wins. Put specific rules above general ones.
- Use post-label rules for meals — reclassify all "Meals" at or below your threshold as "Non-deductible Meal" in one click, instead of editing each row manually.
- Set your account type correctly — "Pure Business" never labels personal; "Commingled" allows personal/business split; "Mostly Business" is the default.
- Set your business type — restaurants auto-classify grocery stores as COGS instead of "NEEDS REVIEW".
- Use PY data — even a rough prior year file dramatically improves accuracy. The system learns from your corrections over time.
- Review yellows first — green items are almost always correct. Focus your time on yellow and red flagged rows.
- Upload check images — check payees change every time, so PY matching can't help. Upload check PDFs/images for OCR-based payee extraction.
- Run fresh for best results — uncheck "Only fill empty" to re-classify all transactions from scratch with your latest rules and PY data.