# V Benchmarks

Unless a section shows a different command, all benchmarks were compiled with `v -prod` and run on an Apple M5 with 16 GB RAM under macOS (arm64).
V version: 0.5.1.

## GC: Boehm vs VGC

Compares Boehm GC (`-gc boehm`) against V's built-in concurrent tri-color
mark-and-sweep (`-gc vgc`).
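Each test runs 5 iterations and the median is reported. The `ratio` column is
VGC time divided by Boehm time, so values above 1.00x mean Boehm is faster.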

```
v run bench/bench_gc.v
```

```
  test                                             boehm       vgc     ratio
  ———————————————————————————————————————————— ————————— ————————— —————————
  small allocs (1000000x string)                   39 ms     48 ms    1.23x
  tree build+walk (depth=18, 10x)                  48 ms    118 ms    2.46x
  array grow (100x 100000 pushes)                   9 ms     26 ms    2.89x
  map insert (20x 10k entries)                     20 ms     27 ms    1.35x
  mixed workload (50 rounds)                       10 ms     16 ms    1.60x

  heap usage:
    boehm: 29856 KB allocated, 29296 KB free
    vgc:   131072 KB allocated, 0 KB free
```

Boehm is still 1.2x-2.9x faster across these workloads and allocates ~4.4x less heap (29,856 KB vs 131,072 KB).
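
The measurement procedure described above (5 iterations per test, median reported, ratio of VGC time to Boehm time) can be sketched roughly as follows. This is not the code in `bench/bench_gc.v`: `time_ms`, `median`, `ratio`, and the `alloc_strings` workload are hypothetical names chosen for illustration. The only library call assumed is `time.new_stopwatch` from V's standard `time` module.

```v
import time

// median returns the middle value of an odd-length sample.
fn median(samples []i64) i64 {
	mut sorted := samples.clone()
	sorted.sort()
	return sorted[sorted.len / 2]
}

// ratio expresses the VGC time as a multiple of the Boehm time.
fn ratio(boehm_ms i64, vgc_ms i64) f64 {
	return f64(vgc_ms) / f64(boehm_ms)
}

// time_ms runs `work` once and returns the elapsed wall-clock milliseconds.
fn time_ms(work fn ()) i64 {
	sw := time.new_stopwatch()
	work()
	return sw.elapsed().milliseconds()
}

// alloc_strings is a hypothetical stand-in for a small-allocation workload.
fn alloc_strings() {
	mut parts := []string{cap: 100_000}
	for i in 0 .. 100_000 {
		parts << i.str()
	}
}

fn main() {
	mut samples := []i64{cap: 5}
	for _ in 0 .. 5 {
		samples << time_ms(alloc_strings)
	}
	println('median: ${median(samples)} ms')
	// Ratio for the "tree build+walk" row in the table above (48 ms vs 118 ms).
	println('ratio: ${ratio(48, 118):.2f}x')
}
```

Reporting the median rather than the mean keeps a single slow iteration (for example, one that coincides with a collection cycle) from skewing the result.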

## Closures

Measures closure creation, invocation, multi-threaded creation, and memory overhead.

```
v -prod -o /tmp/bench_closure bench/bench_closure.v && /tmp/bench_closure
```

```
| Test Name                 | Iterations | Time(ms) | Ops/sec      |
|---------------------------|------------|----------|--------------|
| Normal Function Call      |  100000000 |        0 |  +inf Mop/s  |
| Small Closure Creation    |   10000000 |      188 | 53.19 Mop/s  |
| Medium Closure Creation   |   10000000 |      376 | 26.60 Mop/s  |
| Large Closure Creation    |    1000000 |      121 |  8.26 Mop/s  |
| Small Closure Call        |  100000000 |      136 | 735.29 Mop/s |
| Medium Closure Call       |  100000000 |      133 | 751.88 Mop/s |
| Large Closure Call        |   10000000 |       16 | 625.00 Mop/s |
| Multi-threaded Creation   |    1000000 |       95 | 10.53 Mop/s  |
```

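Memory overhead is ~69 bytes per closure (medium, 4 captured vars). Closure calls run at ~625-750 Mop/s.
The normal function call finishes in under 1 ms at this timer resolution, so its throughput is reported as `+inf`.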
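
For readers unfamiliar with V closures, here is a minimal sketch of what "closure creation" and "closure call" refer to, plus the arithmetic behind the `Ops/sec` column. `make_adder` and `mops` are hypothetical names, not taken from `bench/bench_closure.v`, and the captured-variable counts used by the real benchmark are not reproduced here.

```v
// make_adder creates a closure; `base` is captured by value via the capture list.
fn make_adder(base int) fn (int) int {
	return fn [base] (x int) int {
		return base + x
	}
}

// mops converts an iteration count and elapsed milliseconds into
// millions of operations per second.
fn mops(iterations i64, elapsed_ms i64) f64 {
	return f64(iterations) / f64(elapsed_ms) / 1000.0
}

fn main() {
	add5 := make_adder(5) // closure creation
	println(add5(37)) // closure call
	// "Small Closure Creation" row: 10,000,000 iterations in 188 ms.
	println('${mops(10_000_000, 188):.2f} Mop/s')
}
```

With an elapsed time of 0 ms the same division has a zero denominator, which would be consistent with the `+inf` entry in the first row of the table.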

## String Deduplication

Compares four deduplication strategies on 10,000 strings with ~30% duplicates.

```
v -prod -o /tmp/bench_string_dedup bench/bench_string_dedup.v && /tmp/bench_string_dedup
```

```
Method 1 (basic array)          33 ms   7000 unique
Method 2 (pre-allocated array)  27 ms   7000 unique
Method 3 (map)                   0 ms   7000 unique
Method 4 (set)                   0 ms   7000 unique
```

Maps and sets finish in under 1 ms, versus 27-33 ms for linear array search: each lookup is a hash probe rather than a scan of every element seen so far.
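
The difference between the array and map strategies can be illustrated with the sketch below. It is an assumption about the general shape of the two approaches, not a copy of `bench/bench_string_dedup.v`; `dedup_array` and `dedup_map` are hypothetical names.

```v
// dedup_array keeps first occurrences using a linear scan.
// Each `!in` check walks the whole `unique` array, so the total cost is quadratic.
fn dedup_array(items []string) []string {
	mut unique := []string{}
	for s in items {
		if s !in unique {
			unique << s
		}
	}
	return unique
}

// dedup_map keeps first occurrences using a hash map for membership tests.
// Each `!in` check is a hash lookup, so the total cost is roughly linear.
fn dedup_map(items []string) []string {
	mut seen := map[string]bool{}
	mut unique := []string{cap: items.len}
	for s in items {
		if s !in seen {
			seen[s] = true
			unique << s
		}
	}
	return unique
}

fn main() {
	items := ['a', 'b', 'a', 'c', 'b']
	println(dedup_array(items))
	println(dedup_map(items))
}
```

Both functions return the same elements in first-seen order; only the cost of the membership test differs.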

## Vectors (Boids Simulation)

N-body boids simulation with 10,000 entities: cohesion, separation, and alignment.

```
v -prod -o /tmp/bench_vectors bench/vectors/vectors.v && /tmp/bench_vectors
```

```
~50 ms per run (after warmup)
```

## Crypto: ECDSA

Key generation, signing, and verification (1,000 iterations each).

```
v -prod -o /tmp/bench_ecdsa bench/crypto/ecdsa/ecdsa.v && /tmp/bench_ecdsa
```

```
Average key generation time:   9 µs
Average sign time:            11 µs
Average verify time:          30 µs
```

## SOA Structs (V2 cleanc only)

Compares Array-of-Structs (AOS) and Struct-of-Arrays (SOA) memory layouts for a 16-field
particle system (500k particles). It uses V2's `@[soa]` attribute, which auto-generates
separate contiguous arrays per field for better cache utilization.

```
./cmd/v2/v2 -prod -backend cleanc bench/bench_soa_structs.v -o bench/bench_soa_structs
./bench/bench_soa_structs
```

```
build particles
  aos: 20 ms    soa push: 125 ms    soa indexed: 37 ms

sum x only
  aos: 19 ms    soa: 12 ms    speedup: 1.58x

sum x/y/z/life (4 of 16 fields)
  aos: 14 ms    soa: 10 ms    speedup: 1.40x

sum all 16 fields
  aos: 11 ms    soa: 16 ms    speedup: 0.69x

integrate position/velocity/life
  aos: 10 ms    soa: 13 ms    speedup: 0.77x
```

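SOA is 1.4x-1.6x faster for partial field access (fewer cache lines touched).
When all fields are accessed or mutated, AOS wins due to less pointer indirection.
Building is also slower with SOA: ~6x slower via push (125 ms vs 20 ms) and ~1.9x slower via indexed writes (37 ms vs 20 ms).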
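
To make the two layouts concrete, the sketch below writes them out by hand in plain V. It does not use the `@[soa]` attribute (whose generated code is not shown in this document), it uses 4 fields instead of 16 for brevity, and `Particle`, `ParticlesSoa`, `build`, `sum_x_aos`, and `sum_x_soa` are hypothetical names.

```v
// Particle is one element of the Array-of-Structs layout:
// all fields of a particle sit next to each other in memory.
struct Particle {
	x    f32
	y    f32
	z    f32
	life f32
}

// ParticlesSoa is the Struct-of-Arrays layout:
// each field is stored in its own contiguous array.
struct ParticlesSoa {
mut:
	x    []f32
	y    []f32
	z    []f32
	life []f32
}

// build fills both layouts with the same n particles.
fn build(n int) ([]Particle, ParticlesSoa) {
	mut aos := []Particle{cap: n}
	mut soa := ParticlesSoa{
		x:    []f32{len: n}
		y:    []f32{len: n}
		z:    []f32{len: n}
		life: []f32{len: n}
	}
	for i in 0 .. n {
		aos << Particle{
			x:    f32(i)
			life: 1.0
		}
		soa.x[i] = f32(i)
		soa.life[i] = 1.0
	}
	return aos, soa
}

// sum_x_aos strides over whole structs even though only `x` is needed.
fn sum_x_aos(ps []Particle) f32 {
	mut total := f32(0)
	for p in ps {
		total += p.x
	}
	return total
}

// sum_x_soa reads one densely packed array.
fn sum_x_soa(ps ParticlesSoa) f32 {
	mut total := f32(0)
	for v in ps.x {
		total += v
	}
	return total
}

fn main() {
	aos, soa := build(1000)
	println(sum_x_aos(aos))
	println(sum_x_soa(soa))
}
```

In the "sum x only" case the SOA loop touches only the `x` array, while the AOS loop pulls every field of each particle through the cache. When a loop needs all fields, the SOA version has to follow one array per field, which matches the direction of the results above.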