FlareStart
Building a Tokenizer from Scratch [part 2]

via Dev.to · Jocer Franquiz · 4h ago

From FSM to PDA: Q/A with Claude Opus

In part 1, we built a working FSM that recognizes <div>text</div> using just 7 primitives mapped 1:1 to assembly opcodes. But FSMs have a hard limit: they can't handle nested structures like <div><div>hello</div></div>. In this post, we climb the Chomsky hierarchy from finite state machines to pushdown automata, build a PDA that recognizes nested <div> tags, and then turn it into a transducer that emits tokens. In other words, we are building the core of a lexer.

Q: Why can't FSMs handle nested structures?

Because an FSM has a fixed number of states, and that's all the memory it has. Consider nested divs:

<div><div><div>hello</div></div></div>

To correctly match closing tags, you need to count how many <div>s you've opened so you know how many </div>s to expect. An FSM with, say, 12 states can handle nesting up to some fixed depth, but someone can always write HTML nested one level deeper than your states can track. Put simply: 1 level deep
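The depth limit described above can be illustrated with a small sketch. This is not the article's implementation (which the excerpt says is built from assembly-like primitives); it is a hedged Python illustration, and the names `fsm_accepts`, `pda_accepts`, `TAG`, and the `max_depth` parameter are all hypothetical. The first function models a machine with a fixed, finite set of depth states; the second adds a stack, which is the extra memory a pushdown automaton has.

```python
import re

# Hypothetical helper: finds opening and closing div tags, ignoring other text.
TAG = re.compile(r"</?div>")


def fsm_accepts(html: str, max_depth: int = 2) -> bool:
    """Bounded recognizer: the state is one of 0..max_depth, a finite set.

    This stands in for an FSM whose states can only track nesting up to
    max_depth (an assumed limit chosen for illustration).
    """
    state = 0
    for match in TAG.finditer(html):
        if match.group() == "<div>":
            if state == max_depth:
                return False  # no state left to remember one more open tag
            state += 1
        else:
            if state == 0:
                return False  # closing tag with nothing open
            state -= 1
    return state == 0


def pda_accepts(html: str) -> bool:
    """Stack-based recognizer: the stack grows with the input, so depth is unbounded."""
    stack = []
    for match in TAG.finditer(html):
        if match.group() == "<div>":
            stack.append("div")  # push on every opening tag
        elif not stack:
            return False  # closing tag with an empty stack
        else:
            stack.pop()  # pop on every matching closing tag
    return not stack  # accept only if every opening tag was closed


if __name__ == "__main__":
    deep = "<div><div><div>hello</div></div></div>"
    print(fsm_accepts(deep, max_depth=2))
    print(pda_accepts(deep))
```

Raising `max_depth` only moves the failure point one level deeper, whereas the stack version has no such constant to outgrow.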

Continue reading on Dev.to

