V8's Garbage Collection

In day-to-day development, we rarely run into memory leaks or V8's memory limit. Even when we do, we can usually work around the limit by moving data into Buffers, which are allocated outside the V8 heap, or by raising V8's memory limit (which is strongly discouraged). Still, as a Node.js developer, I think it's essential to understand how Node's V8 engine handles garbage collection, which algorithms it uses, and how it differs from Java's JVM or the engines of other languages.

Generational Garbage Collection

Generational garbage collection is a very common mechanism. Over years of development, people realized that no single algorithm handles every situation well. Objects are therefore grouped by how long they have survived, and each group is collected with the algorithm that suits it best. "Generational" refers to dividing memory into a new generation and an old generation.

By default, older versions of V8 limit the heap to roughly 1.4 GB on 64-bit systems and 0.7 GB on 32-bit systems (newer versions may pick different defaults). Within that limit, the old generation gets about 1400 MB on 64-bit systems and 700 MB on 32-bit systems, while the new generation is much smaller: 32 MB on 64-bit systems and 16 MB on 32-bit systems. Now that we've distinguished between the new and old generations and know their sizes, let's look at how objects in these two regions are handled.
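
The figures above are defaults, so it can be worth checking what a given Node.js process actually reports. Here is a minimal sketch that uses the built-in `v8` module; the helper names `describeHeap` and `toMB` are my own, and the exact list of spaces varies between V8 versions.

```javascript
// Inspect the heap limit and per-space sizes of the running process.
const v8 = require('v8');

const MB = 1024 * 1024;
const toMB = (bytes) => (bytes / MB).toFixed(1);

// Hypothetical helper: collects the numbers we care about into one object.
function describeHeap() {
  const { heap_size_limit: limit } = v8.getHeapStatistics();
  const spaces = v8.getHeapSpaceStatistics().map((space) => ({
    name: space.space_name, // e.g. new_space, old_space
    sizeMB: toMB(space.space_size),
    usedMB: toMB(space.space_used_size),
  }));
  return { limitMB: toMB(limit), spaces };
}

const report = describeHeap();
console.log(`heap size limit: ${report.limitMB} MB`);
for (const { name, sizeMB, usedMB } of report.spaces) {
  console.log(`${name}: ${usedMB} MB used of ${sizeMB} MB`);
}
```

Starting the script with a flag such as `node --max-old-space-size=2048 heap.js` (the file name is just an example) should report a larger limit, which is the "lifting the memory limit" workaround mentioned at the beginning.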

New Generation Memory Handling

Scavenge Algorithm

In the new generation, V8 primarily uses the Scavenge algorithm for garbage collection. It splits the new generation into two equal halves called semispaces: one is in use (From) and the other is idle (To). New objects are allocated in the From space. When garbage collection begins, V8 finds the surviving objects in the From space and copies them to the To space; whatever is left in the From space is garbage and is freed. Finally, the From and To spaces swap roles.

Clearly, Scavenge is a classic space-for-time algorithm. It is fast because it only touches surviving objects, but it requires half of the memory to sit idle, sacrificing 50% of the space. That makes it suitable only for the new generation, where object lifetimes are short and the space is small.
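
To make the copy-and-swap idea concrete, here is a toy simulation. It is only a sketch under simplifying assumptions: objects are plain records with a `refs` array, each "space" is a JavaScript array, and the function name `scavenge` is mine. Real V8 works on raw memory, not on arrays like these.

```javascript
// Toy semispace collector: objects are plain records, "spaces" are arrays.
function scavenge(roots, fromSpace) {
  const toSpace = [];
  const forwarded = new Map(); // original object -> its copy in the To space

  // Copy an object into the To space once and remember where it went.
  function copy(obj) {
    if (!forwarded.has(obj)) {
      const clone = { name: obj.name, refs: obj.refs };
      forwarded.set(obj, clone);
      toSpace.push(clone);
    }
    return forwarded.get(obj);
  }

  const newRoots = roots.map(copy);

  // Scan the To space in order; it doubles as the work queue, so copies
  // added during the scan are visited later in the same loop.
  for (let scan = 0; scan < toSpace.length; scan++) {
    toSpace[scan].refs = toSpace[scan].refs.map(copy);
  }

  fromSpace.length = 0; // everything that was not copied is garbage

  // Swap roles: the filled To space becomes the new From space.
  return { roots: newRoots, fromSpace: toSpace, toSpace: fromSpace };
}

const a = { name: 'a', refs: [] };
const b = { name: 'b', refs: [a] };
const c = { name: 'c', refs: [] }; // nothing points to c

const result = scavenge([b], [a, b, c]);
console.log(result.fromSpace.map((obj) => obj.name)); // [ 'b', 'a' ]
```

Notice that the cost depends on the number of survivors (`b` and `a`), not on how much garbage was lying around, which is why this approach pays off when most objects die young.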

At this point, you might wonder: if an object survives multiple collections, isn't copying it back and forth wasted effort? Indeed. Keeping a long-lived object in the frequently collected new generation is a waste, so this is when we promote it: an object that has already survived one Scavenge collection is copied to the old generation instead of the To space.

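But there's another issue: the size of the To space. A single garbage collection doesn't collect just one object, and many objects may be copied from the From space to the To space. If the To space fills up, that's a problem, because after the swap it becomes the From space and must still have room for new allocations. So there is an additional constraint: if the To space is already more than 25% full, the object is promoted directly to the old generation.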
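
The promotion decision can be sketched as a small function. The two rules here follow the way classic V8 is usually described: an object that has already survived a Scavenge is promoted, and so is any object copied while the To space is more than 25% full. Treat the 25% threshold, the `survivedScavenges` field, and the function name as assumptions for illustration; none of them is read from V8 itself.

```javascript
// Assumed threshold, taken from common descriptions of older V8 versions.
const TO_SPACE_PROMOTION_RATIO = 0.25;

// Decide where a surviving object should be copied during a Scavenge.
// obj.survivedScavenges and toSpace.{used, capacity} are hypothetical fields.
function destinationFor(obj, toSpace) {
  // Rule 1: the object has already survived an earlier Scavenge.
  if (obj.survivedScavenges >= 1) return 'old';

  // Rule 2: the To space is already too full to keep absorbing survivors.
  if (toSpace.used / toSpace.capacity > TO_SPACE_PROMOTION_RATIO) return 'old';

  return 'to';
}

const veteran = { survivedScavenges: 1 };
const newcomer = { survivedScavenges: 0 };

console.log(destinationFor(veteran, { used: 0, capacity: 8 }));  // old
console.log(destinationFor(newcomer, { used: 1, capacity: 8 })); // to
console.log(destinationFor(newcomer, { used: 3, capacity: 8 })); // old
```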

Old Generation Memory Handling

In the old generation, surviving objects make up a larger share and the space itself is much bigger. Scavenge is a poor fit here: copying that many survivors is slow, and keeping half of such a large space idle is too wasteful. Different algorithms are needed.

Mark-Sweep Algorithm

The Mark-Sweep algorithm has two phases: marking and sweeping. In the marking phase, it traverses the objects in the heap and marks the live ones, meaning those still reachable from the roots. In the sweeping phase, it frees every object that was not marked. Because dead objects make up only a small share of the old generation, this is quite efficient.
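
The two phases can be sketched in a few lines. This is a toy model, not V8's implementation: the heap is an array, each object has a `refs` array, and a freed slot is simply set to `null`. The function names are mine.

```javascript
// Marking phase: collect every object reachable from the roots.
function mark(roots) {
  const marked = new Set();
  const stack = [...roots];
  while (stack.length > 0) {
    const obj = stack.pop();
    if (marked.has(obj)) continue; // already visited
    marked.add(obj);
    stack.push(...obj.refs);
  }
  return marked;
}

// Sweeping phase: dead objects leave a hole (null); survivors do not move.
function sweep(heap, marked) {
  return heap.map((obj) => (obj !== null && marked.has(obj) ? obj : null));
}

const a = { name: 'a', refs: [] };
const b = { name: 'b', refs: [] };
const c = { name: 'c', refs: [] };
const d = { name: 'd', refs: [c] }; // points to c, but nothing points to d
a.refs.push(c);

const heap = [a, b, c, d];
const afterSweep = sweep(heap, mark([a]));
console.log(afterSweep.map((obj) => obj && obj.name)); // [ 'a', null, 'c', null ]
```

The survivors stay exactly where they were, which is what makes sweeping cheap and also what leaves holes behind.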

However, after sweeping, there's a problem: memory space becomes fragmented. This can prevent large objects from being allocated, triggering premature garbage collection. To solve this, the Mark-Compact algorithm was introduced.
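
A small worked example shows why fragmentation hurts. In this sketch each array slot stands for one unit of memory and `null` marks a free slot; the layout and the helper name `freeStats` are made up for illustration.

```javascript
// Count the total free slots and the longest contiguous run of free slots.
function freeStats(slots) {
  let total = 0;
  let largestRun = 0;
  let run = 0;
  for (const slot of slots) {
    if (slot === null) {
      total++;
      run++;
      largestRun = Math.max(largestRun, run);
    } else {
      run = 0;
    }
  }
  return { total, largestRun };
}

// A heap after sweeping: live objects with holes between them.
const swept = ['a', null, 'b', null, null, 'c', null];
const stats = freeStats(swept);

// An allocation needs one contiguous block, not just enough free slots overall.
const canAllocate = (size) => stats.largestRun >= size;

console.log(stats);          // { total: 4, largestRun: 2 }
console.log(canAllocate(3)); // false
```

Four slots are free in total, yet an object that needs three contiguous slots cannot be placed anywhere.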

Mark-Compact Algorithm

Mark-Compact marks objects the same way Mark-Sweep does. The difference comes afterwards: instead of freeing dead objects in place, it moves the live objects toward one end of the space and then clears everything beyond the boundary.
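
The compaction step can be sketched as sliding the survivors toward the start of the space. As before, this is a toy model where each array slot is one unit of memory and `null` is free space; a real collector also has to update every pointer to the objects it moves, which this sketch leaves out.

```javascript
// Slide live objects to the low end of the space, in place.
// Returns the boundary: every slot from that index onward is free.
function compact(slots) {
  let next = 0; // next position to fill at the low end
  for (let i = 0; i < slots.length; i++) {
    if (slots[i] !== null) {
      slots[next] = slots[i];
      if (next !== i) slots[i] = null; // the old position becomes free
      next++;
    }
  }
  return next;
}

const space = ['a', null, 'b', null, null, 'c', null];
const boundary = compact(space);

console.log(space);    // [ 'a', 'b', 'c', null, null, null, null ]
console.log(boundary); // 3
```

After compaction the four free slots form one contiguous block, so a large allocation that failed before now fits.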

Combined Usage

Both algorithms have their pros and cons. Mark-Compact can solve memory fragmentation, but it's less efficient than Mark-Sweep because it needs to move memory objects. Therefore, V8 primarily uses Mark-Sweep and only switches to Mark-Compact when memory space is insufficient for object allocation.
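
The "sweep first, compact only when needed" policy might look like this in a toy model. The trigger used here, a failed search for a contiguous free block, is my own simplification of "insufficient space for allocation", and all function names and return values are hypothetical.

```javascript
// Longest contiguous run of free (null) slots.
function largestFreeRun(slots) {
  let largest = 0;
  let run = 0;
  for (const slot of slots) {
    run = slot === null ? run + 1 : 0;
    largest = Math.max(largest, run);
  }
  return largest;
}

// Try the cheap strategy first and fall back to the expensive one.
// `live` is the set of objects found by an earlier marking phase.
function collectOldGeneration(slots, live, requestSize) {
  // Mark-Sweep: free dead objects in place.
  for (let i = 0; i < slots.length; i++) {
    if (slots[i] !== null && !live.has(slots[i])) slots[i] = null;
  }
  if (largestFreeRun(slots) >= requestSize) return 'mark-sweep';

  // Mark-Compact: move survivors to one end to merge the holes.
  const survivors = slots.filter((slot) => slot !== null);
  slots.fill(null);
  survivors.forEach((obj, i) => {
    slots[i] = obj;
  });
  return largestFreeRun(slots) >= requestSize ? 'mark-compact' : 'out of memory';
}

const live = new Set(['a', 'b', 'c']);
const small = collectOldGeneration(['a', 'x', 'b', 'y', 'c', null], live, 1);
const large = collectOldGeneration(['a', 'x', 'b', 'y', 'c', null], live, 2);
console.log(small, large); // mark-sweep mark-compact
```

A request for one slot is satisfied right after sweeping, while a request for two contiguous slots only succeeds once the survivors have been compacted.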

Additional Optimizations

The algorithms and strategies above cover basic garbage collection, but V8's GC is more sophisticated. Garbage collection normally requires pausing the JavaScript application logic; otherwise the application and the collector might see inconsistent views of the objects in memory. In the new generation this pause is negligible, because the space is small and few objects survive. In the old generation, with its larger space and more surviving objects, marking, sweeping, and compacting take much longer, and the pause becomes unacceptable. To address this, V8 switched from marking everything in one go to incremental marking, which breaks the work into many small steps interleaved with application code to reduce pause time. Similarly, V8 introduced optimizations like lazy sweeping and incremental compaction...
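
The idea of incremental marking can be sketched with a generator that marks a few objects and then hands control back. This is only an illustration: the `budget` parameter and the function name are assumptions, and the sketch ignores the hard part, namely that the application may change references between steps, which a real collector has to track.

```javascript
// Mark reachable objects, pausing after every `budget` objects.
function* incrementalMark(roots, budget) {
  const marked = new Set();
  const worklist = [...roots];
  let processed = 0;
  while (worklist.length > 0) {
    const obj = worklist.pop();
    if (marked.has(obj)) continue;
    marked.add(obj);
    worklist.push(...obj.refs);
    processed++;
    // Hand control back so application code can run between steps.
    if (processed % budget === 0) yield marked.size;
  }
  return marked;
}

// Build a chain of 10 objects: n0 -> n1 -> ... -> n9.
let head = { name: 'n9', refs: [] };
for (let i = 8; i >= 0; i--) {
  head = { name: `n${i}`, refs: [head] };
}

const marker = incrementalMark([head], 3);
let pauses = 0;
let step = marker.next();
while (!step.done) {
  pauses++; // application code would run here
  step = marker.next();
}

console.log(pauses, step.value.size); // 3 10
```

Instead of one long pause that marks all 10 objects, the work is split into several short steps of at most 3 objects each.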

Conclusion

V8's garbage collection is far more complex than what we've covered here, but this gives us a basic understanding of the underlying memory management algorithms.