
My CSV had 50K rows. Row 23,487 broke everything.

Built a data processor for vendor CSV exports. Tested with sample files, looked good, pushed to production. Two weeks later: 3 AM alert.

The vendor had changed their export format. Not the whole thing, just ONE field on random rows. They added an unquoted comma.

```
# Expected format
product_name,price,stock
Widgets,12.99,45

# Row 23,487
Widgets, Premium Edition,12.99,45
```

That comma shifted the columns. The parser read "Premium Edition" as the price, tried a float conversion, and crashed.

Spent an hour thinking it was my regex. Nope. Thought maybe it was an encoding issue. Nope. Finally printed the raw row. There it was: an unquoted comma in the product name field.

My tests missed it because the first 20K rows were clean. The formatting bug only appeared when product names contained certain keywords, and my test CSVs had none of those.

No validation existed. Python's csv module parsed the row without error, so I assumed the data was valid. Bad assumption.

Fixed it:

```python
import csv

def validate_row(row, expected_cols):
    if len(row) != expected_cols:
        raise ValueError(
            f"Expected {expected_cols} columns, got {len(row)}: {row!r}"
        )
```




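One way to extend that column-count check into a full loader (a hedged sketch; `load_csv` and the quarantine approach are my own names, not necessarily the article's final code) is to validate each row's shape and types, collect the bad rows with their line numbers, and keep processing instead of crashing at 3 AM:

```python
import csv

def validate_row(row, expected_cols, line_no):
    """Return a cleaned (name, price, stock) tuple, or raise ValueError."""
    if len(row) != expected_cols:
        raise ValueError(
            f"line {line_no}: expected {expected_cols} columns, got {len(row)}"
        )
    name, price, stock = (field.strip() for field in row)
    # Type checks catch shifted columns even when the count happens to match.
    return name, float(price), int(stock)

def load_csv(path, expected_cols=3):
    """Parse a vendor export, quarantining malformed rows instead of crashing."""
    good, bad = [], []
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        for line_no, row in enumerate(reader, start=2):
            try:
                good.append(validate_row(row, expected_cols, line_no))
            except ValueError as err:
                bad.append((line_no, row, str(err)))  # quarantine for review
    return good, bad
```

The quarantine list is the important part: the clean 49,999 rows still get processed, and the one row the vendor mangled ends up in a report with its line number instead of taking the whole job down.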