every npm install parses version strings. thousands of them. marvin hagemeister found that installing preact calls semver 21,000+ times. one package. 21 thousand version comparisons.
the semver package has 150 million weekly downloads. it's the backbone of npm's dependency resolution. and it's doing way more work than it needs to.
parsing 1.2.3 should be trivial. three numbers, two dots. but the official package:
- creates class instances for every single parse. new SemVer() everywhere.
- uses regex for parsing. yeah.
- has this fancy Map-based LRU cache with access tracking, eviction policies, the works
- re-parses the range string every time you call satisfies()
this is the kind of code that looks reasonable until you realize it runs in a loop 21,000 times.
the dumb idea that worked
version numbers aren't random. you're not going to see 847293.1847.29847 in the wild. react is at 18.something. node is at 22.something. most packages never go past major version 50.
what if we just... precompute all of them?
```js
const cache = Object.create(null);
for (let M = 0; M < 50; M++) {
  for (let m = 0; m < 50; m++) {
    for (let p = 0; p < 20; p++) {
      const s = M + '.' + m + '.' + p;
      cache[s] = { major: M, minor: m, patch: p, pre: null };
    }
  }
}
```
50,000 version objects. allocated once at module load. takes maybe 2-3ms on startup and a few hundred KB of memory.
now parsing 1.0.0 or 16.14.2 or 4.17.21 is just a hash lookup. no regex. no string splitting. no nothing. just cache["1.0.0"] and you're done.
this handles probably 95% of real-world versions. the remaining 5% (prereleases, build metadata, absurdly large numbers) still need proper parsing, but they're the exception.
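the hot path then collapses to a lookup with a fallback. here's a sketch — parseSlow is a hypothetical stand-in for whatever full parser handles the uncached cases:

```js
// sketch: hit the precomputed table first, fall back to a real parser.
// parseSlow is a hypothetical name for the full parsing path.
function parse(str) {
  const hit = cache[str];
  if (hit !== undefined) return hit; // ~95% of real-world versions end here
  return parseSlow(str); // prereleases, build metadata, huge numbers
}
```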
Map is slow, actually
everyone says Map is fast. it is, for certain things. but for string key lookups, a plain object beats it.
```js
// Map version
const cached = cache.get(str);

// Object version
const cached = cache[str];
```
the difference? .get() is a method call. object bracket notation is a direct property access that V8 can inline and optimize down to a raw hash lookup. no function call overhead between you and the table.
the trick is using Object.create(null) instead of {}. regular objects have prototype pollution issues and slightly slower lookups because V8 has to check the prototype chain. null-prototype objects are pure hash tables.
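a quick console check shows the difference:

```js
const dict = {};
dict['constructor']; // [Function: Object] — inherited from Object.prototype
const clean = Object.create(null);
clean['constructor']; // undefined — nothing to inherit, nothing to pollute
```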
compiling ranges to functions
this is where it gets fun.
a version range like ^1.2.3 means "anything >= 1.2.3 and < 2.0.0". every time you call satisfies("1.5.0", "^1.2.3"), the old package would:
- parse the range string
- build some internal representation
- parse the version string
- compare them
but the range is always the same. ^1.2.3 will always mean the same thing. why not compile it once to a function?
```js
// ^1.2.3 compiles to:
(M, m, p, pr) => !pr && M === 1 && ((m - 2) || (p - 3)) >= 0
```
this closure gets cached. next time you check any version against ^1.2.3, it's just a function call with four integers. no parsing. no comparisons against parsed objects. just raw arithmetic that V8 can optimize to near-native speed.
the ((m - 2) || (p - 3)) >= 0 trick is cute. it's a compact way to check if minor > 2 OR (minor === 2 AND patch >= 3). the || operator returns its first truthy operand, so if m - 2 is nonzero its sign decides the whole comparison; only when it's exactly zero do we fall through to p - 3.
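here's roughly how compile-and-cache can look. this is a sketch with illustrative names — compileCaret and rangeCache aren't the library's actual internals, and it only handles the ^M.m.p case with a nonzero major:

```js
// illustrative sketch — only the ^M.m.p case with major > 0
const rangeCache = Object.create(null);

function compileCaret(M, m, p) {
  // >= M.m.p and < (M+1).0.0, prereleases excluded
  return (major, minor, patch, pre) =>
    !pre && major === M && ((minor - m) || (patch - p)) >= 0;
}

// compile once, then every later check is a single call with four ints
const test = rangeCache['^1.2.3'] ||
  (rangeCache['^1.2.3'] = compileCaret(1, 2, 3));
test(1, 5, 0, null); // true  — 1.5.0 satisfies ^1.2.3
test(2, 0, 0, null); // false — major bumped
```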
goodbye regex, hello charCodeAt
regex is powerful but expensive. for something as simple as parsing 1.2.3, it's overkill. manual character-by-character parsing wins:
```js
// assumes s is the version string; i walks it one character at a time
let i = 0;
let c = s.charCodeAt(i);
let major = 0;
while (c >= 48 && c <= 57) { // '0' to '9'
  major = major * 10 + (c - 48);
  c = s.charCodeAt(++i);
}
```
this is classic number parsing. read digits, multiply by 10, add the new digit. stop when you hit a non-digit. no regex engine startup, no backtracking, no capture groups. just a tight loop that probably fits in the L1 cache.
charCodeAt returns raw integer ASCII codes, so comparing c >= 48 is faster than char >= '0' (no string comparison). it's the kind of micro-optimization that only matters when you're doing it millions of times.
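stitched together, the cache-miss parser looks something like this. a sketch — parsePlain is an illustrative name, it skips semver's leading-zero rule, and it returns null so a prerelease-aware path can take over:

```js
// sketch of a plain X.Y.Z parser; returns null when the string needs
// real parsing (prereleases, build metadata, leading-zero validation)
function parsePlain(s) {
  let i = 0;
  const nums = [0, 0, 0];
  for (let n = 0; n < 3; n++) {
    let c = s.charCodeAt(i);
    if (!(c >= 48 && c <= 57)) return null; // each component starts with a digit
    let v = 0;
    while (c >= 48 && c <= 57) {
      v = v * 10 + (c - 48);
      c = s.charCodeAt(++i);
    }
    nums[n] = v;
    if (n < 2) {
      if (c !== 46) return null; // 46 === '.'
      i++;
    }
  }
  if (i !== s.length) return null; // trailing characters → slow path
  return { major: nums[0], minor: nums[1], patch: nums[2], pre: null };
}
```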
function calls are not free
the original semver package has a compare() function, and then gt(), lt(), gte(), lte() all call it:
```js
function gt(a, b) {
  return compare(a, b) > 0;
}
```
elegant! DRY! and slow.
every function call has overhead. pushing arguments to the stack, creating a new execution context, popping the return value. for hot code, you want to inline everything:
```js
function gt(a, b) {
  const va = parse(a), vb = parse(b);
  return va.major > vb.major ||
    (va.major === vb.major && (va.minor > vb.minor ||
      (va.minor === vb.minor && va.patch > vb.patch)));
}
```
uglier? yes. faster? also yes. V8 can sometimes inline for you, but "sometimes" isn't good enough when you're chasing 30x speedups.
the results
| Operation | semver | pico-semver | Speedup |
|---|---|---|---|
| parse | 6M ops/s | 208M ops/s | 35x |
| satisfies | 2.4M ops/s | 70M ops/s | 30x |
| valid | 6M ops/s | 186M ops/s | 31x |
| maxSatisfying | 700K ops/s | 20M ops/s | 29x |
| compare | 7M ops/s | 105M ops/s | 15x |
| gt | 7M ops/s | 108M ops/s | 15x |
| inc | 11M ops/s | 64M ops/s | 6x |
208 million parses per second. version strings. javascript.
this doesn't matter for most applications. if you're parsing 100 versions, the difference is like 50 microseconds. who cares.
but if you're npm, or a bundler, or a monorepo tool resolving dependencies across 500 packages? those 21,000 calls per install add up. every developer's machine, every CI run, every deploy.
things that didn't work
not every idea panned out. optimization is humbling.
**hash-based cache with fixed array:** i thought replacing the object cache with a fixed-size array and a hash function would be faster. less memory allocation! no hash table resizing! except computing the hash cost more than just doing the object lookup. V8's internal hash tables are really good.

**fast path for simple versions:** i tried adding a tryFastParse() that would quickly check if a string matches the simple X.Y.Z pattern before hitting the cache. sounds smart, right? why do a cache lookup if you can parse directly? but the cache lookup is so fast that the extra check just added overhead.

**caching inc() results:** inc("1.2.3", "minor") returns "1.3.0". why not cache it? because building the cache key (version + '\0' + release) costs more than just computing the result. some operations are so cheap that caching is overhead.

**tagged unions instead of closures:** i tried representing compiled ranges as data structures instead of functions, thinking a switch statement over a type tag would be faster than closure dispatch. it wasn't. V8 optimizes closures really well. like, really really well. the indirection of going through a type tag actually made things slower.
the lesson, as always: benchmark everything. your intuitions are probably wrong. mine certainly were.
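for what it's worth, the shape of a micro-benchmark is simple. a sketch using node's process.hrtime.bigint() — the numbers in the table above came from a proper harness, this just shows the idea:

```js
// rough micro-benchmark shape — warm up first so V8 has optimized the
// function before the timed loop; a real harness does much more than this
function bench(name, fn, iters = 1e7) {
  for (let i = 0; i < 1e5; i++) fn(); // warm-up
  const t0 = process.hrtime.bigint();
  for (let i = 0; i < iters; i++) fn();
  const ns = Number(process.hrtime.bigint() - t0);
  console.log(`${name}: ${(iters / ns * 1000).toFixed(1)}M ops/s`);
}

bench('parse', () => parse('1.2.3'));
```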
the final stats
- 6.4 kB gzipped (semver is ~17 kB)
- zero dependencies
- ~500 lines of typescript
could probably be smaller. but at some point you're just code golfing.
drop it in
```js
// before
import semver from 'semver';

// after
import * as semver from 'pico-semver';
```
same API. parse, valid, satisfies, compare, gt, lt, gte, lte, eq, neq, maxSatisfying, minSatisfying, inc, diff, coerce, sort, rsort. all the stuff you'd expect.
get it
```sh
npm install pico-semver
```
code is on github: github.com/Lulzx/pico-semver
will this make your app faster? probably not. npm installs slightly less annoying? maybe. fun weekend staring at V8 profiler output? yes.
there's something satisfying about taking code that 150 million people depend on and making it fast. even if most of them never notice.