
Building a Multilingual Name Database: 2000+ Names Across 46 Cultures
When I started BabyNamePick, the name database had maybe 200 entries. Today it has over 2,000 names spanning 46 cultural origins. Here's what I learned scaling a multilingual name dataset.

The Data Model

Each name entry is surprisingly simple:

    {
      "name": "Sarangerel",
      "meaning": "Moonlight",
      "gender": "girl",
      "origin": "mongolian",
      "style": ["nature", "elegant"],
      "popularity": "rare",
      "startLetter": "S"
    }

Seven fields. That's it. But getting those seven fields right across 46 cultures is where the complexity lives.

Challenge 1: Cultural Accuracy

Names carry deep cultural significance. A name's meaning in one culture might be completely different in another. "Kai" means "sea" in Hawaiian/Polynesian but "forgiveness" in Japanese.

We handle this by treating the name and its origin together as a composite key, rather than the name alone. The same spelling can exist in multiple origins with different meanings.

Challenge 2: Balanced Representation

Early on, our database was heavily skewed: 100+ American
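
Looking back at Challenge 1, the name-plus-origin key can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`entry_key`, `build_index`, `meanings_by_origin`) are hypothetical, the two "Kai" entries are trimmed to the fields mentioned above, and none of this is BabyNamePick's actual code.

```python
def entry_key(entry):
    """Composite key: the same spelling may appear under several origins."""
    return (entry["name"].casefold(), entry["origin"].casefold())


def build_index(entries):
    """Index entries by (name, origin) and reject true duplicates."""
    index = {}
    for entry in entries:
        key = entry_key(entry)
        if key in index:
            raise ValueError(f"duplicate entry for {key}")
        index[key] = entry
    return index


def meanings_by_origin(index, name):
    """Return {origin: meaning} for every origin that has this spelling."""
    wanted = name.casefold()
    return {
        origin: entry["meaning"]
        for (indexed_name, origin), entry in index.items()
        if indexed_name == wanted
    }


# Illustrative data: the "Kai" entries are shortened, hypothetical records.
entries = [
    {"name": "Kai", "meaning": "sea", "origin": "hawaiian"},
    {"name": "Kai", "meaning": "forgiveness", "origin": "japanese"},
    {"name": "Sarangerel", "meaning": "Moonlight", "origin": "mongolian"},
]

index = build_index(entries)
kai_meanings = meanings_by_origin(index, "Kai")
```

With the key defined this way, both "Kai" entries coexist without overwriting each other, while adding a second Hawaiian "Kai" would raise an error instead of silently replacing the first.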
Continue reading on Dev.to Webdev



