Wildvine API — Performance Analysis & Optimization

v3 endpoint latency teardown · data-flow breakdown · applied optimization · generated 2026-05-27

TL;DR

~2.0s
Fixed Athena floor paid by every data endpoint, even a 109-byte response
23.5%
of production requests were cold starts (7-day window), adding ~1.75s init each
~30–58%
latency cut after the applied query-layer optimization
1 of 13
v3 endpoints aggregates in SQL; the rest pull raw rows and process in pandas

The single most important finding

Latency is not dominated by data volume or pandas processing — it is dominated by Athena's fixed per-query overhead. Proof: /v3/bds/years returns a 109-byte payload from a trivial GROUP BY, yet still took ~2.0s. A big driver was that the awswrangler client polled Athena for completion only once per second (athena_query_wait_polling_delay = 1.0), so each query wasted up to a full second just waiting.

Confirming the hypothesis: raw-fetch vs straight-query

The question was whether "some endpoints take raw data and process it, some just straight query." Confirmed — and the split is lopsided:

Note: species metadata (name, group, redlist, timber group) is enriched locally from species.csv rather than via an Athena JOIN — a deliberate choice to avoid INNER-JOIN row loss.

Per-endpoint breakdown — how each gets & processes data

EndpointTypeAthena queriesHow it gets & processes data
/v3/overview raw + model 2 Pulls raw per-tree rows (filtered by year/loc/species/dbh) → enrich species → per-ha summary + group-by species_group & DBH class. Then a 2nd query fetches baseline-year rows to run the forecast projection. The double query is why it is the slowest.
/v3/trends raw 1 (large) Always pulls ALL years (skip_year_filter) because census metrics compare consecutive censuses. compute_census_metrics() does per-location, per-year Python set-diffs for growth/mortality/recruitment, then aggregates per year × metric × group.
/v3/trends-v2 raw 1 Same shape as trends but gated by metrics. Only pulls all years if a census metric is requested; otherwise filters years in SQL. Faster than trends for non-census metrics.
/v3/species-breakdown raw 1 Raw rows → enrich → groupby('species') → per-species per-ha metrics → top-N ranking per metric + a red-list breakdown by DBH class.
/v3/flora raw 1 Pulls all matching rows (incl. coordinates), enriches, then paginates in memory (df.iloc[start:end]). Fetches the full result set even for page 1 — inefficient for large filters.
/v3/summary raw 1 Raw rows (year-filtered unless census) → enrich → optional census → groupby('dbh_size_class') per requested metric, with percentage-of-total.
/v3/forecast-by-class-view model 1 Fetch baseline-year raw rows → build_stand_table()run_projection() iterates +5y…+30y applying DIF growth / mortality / ingrowth coefficients in NumPy. Large payload (~230 KB).
/v3/forecast-by-summary-view model 1 Same projection, then groupby(forecast_year, species_group, is_protected) and sum to stand totals.
/v3/forecast model 1 Legacy alias → forecast-by-class-view.
/v3/bds/simulation raw 1 Fetch one location+year raw → enrich → price → pre-felling filter (DBH ≥ 30 cm) → distributions + candidate-tree logic (commercial groups, cutting limits, red-list exclusion).
/v3/bds/years SQL 1 The only SQL-aggregated endpoint. SELECT year … GROUP BY year ORDER BY year — returns just the distinct census years.
/v3/bds/compartments none 0 Pure Python constant from LOCATION_AREA. No I/O.

Measured latency — local, against real Athena

3 runs each; cold-start & API-Gateway/Cognito overhead excluded (this isolates app + Athena time). "Before" = original query layer; "After" = optimized query layer (see below). Values are the fastest of 3 runs, in seconds.

EndpointBeforeAfterImprovementPayload
/ (no Athena)0.0030.00350 B
/species (CSV)0.0150.013189 KB
/v3/overview (loc 1, 2 ha)4.112.49 −39%6.8 KB
/v3/overview (loc 2, 50 ha)5.844.18 −28%7.3 KB
/v3/species-breakdown2.081.35 −35%6.5 KB
/v3/trends (all years, census)3.072.01 −35%12.8 KB
/v3/trends-v2 (no census)2.080.98 −53%353 B
/v3/trends-v2 (census)2.601.95 −25%332 B
/v3/flora (page 10)2.111.07 −49%4.6 KB
/v3/summary (basal_area)2.111.33 −37%637 B
/v3/summary (census)2.571.62 −37%641 B
/v3/forecast-by-class-view2.121.39 −34%230 KB
/v3/forecast-by-summary-view2.121.35 −36%11.4 KB
/v3/bds/simulation2.051.14 −44%1.7 KB
/v3/bds/years1.970.83 −58%109 B

Payload size barely correlates with latency — further proof the cost is Athena's fixed per-query overhead, not data transfer or pandas.

Deep dive: why /v3/trends is the slowest — and will a bigger Lambda help?

Trend is the worst case. Profiled against compartment 50 (the 50 ha plot), pulling all census years:

StageTimeCPU/mem sensitive?
Rows fetched215,92650 ha × 6 census years, raw per-tree rows
1 — Athena fetch5.26sNo — server-side query + paginated row transfer
2 — enrich + price0.31ssingle-threaded pandas
3 — census metrics0.83sBarely — single-threaded pandas
4 — aggregation loop1.90sBarely — single-threaded pandas
TOTAL8.30s

Two root causes: (1) /v3/trends forces skip_year_filter=True — it pulls every census year regardless of the year picker, because census metrics compare consecutive censuses tree-by-tree; (2) the aggregation is a triple nested loop (~6 years × 23 metrics × 8 groups ≈ 1,100 passes) that re-filters and re-sums slices of a 215k-row DataFrame.

Will bumping CPU / memory help? Barely — poor ROI.

Lambda CPU scales with memory: the function is at 3008 MB ≈ 1.78 vCPU; the ceiling is 10240 MB ≈ 6 vCPU. But 63% of the time is the Athena fetch, which is server-side and completely insensitive to Lambda size. The remaining 37% is single-threaded pandas — and at 3008 MB a single thread already gets a full core, so adding cores doesn't make it faster. Estimate: ~3.4× the cost for maybe 10–20% latency improvement. Not worth it.

What actually fixes Trend, in order of value-for-effort:

After the vectorization (Trend, 50 ha, all census years)

StageBeforeAfter
Athena fetch5.26s~5.0s
census metrics0.83s~0.8s
aggregation loop1.66–1.90s0.19s
TOTAL8.30s6.45s

The optimization that was applied

A two-line change in the shared query_athena() helper (src/query.py), affecting all endpoints at once:

Controlled micro-benchmark (same 1,811-row query, 3 runs)

ConfigurationMin latency
A — current (polling=1.0, no reuse)2.01s
B — polling=0.251.55s
C — polling=0.25 + result reuse1.02s

Staleness note

Result reuse means a re-run of the ETL won't be reflected until the reuse window expires (default 60 min). If you need fresh data immediately after re-ingest, lower ATHENA_RESULT_REUSE_MINUTES or set it to 0 to disable. The poll-interval change has no such tradeoff.

Production reality (CloudWatch, last 7 days)

937
invocations
1.94s
p50 execution
8.21s
p95 execution
12.58s
max execution
220
cold starts (23.5%)
1.75s
avg cold-start init

Most-called endpoints: /v3/overview (184), /v3/species-breakdown (183), /years-by-location (146), /v3/trends (106), /v3/flora (103). A handful of 500s were seen on /v3/forecast-by-*.

Further recommendations (prioritized)

ImpactRecommendationNotes
HIGH Drop Athena from the hot path — bake the dataset into the image as Parquet and query with DuckDB/pandas, or pre-compute endpoint responses to S3/DynamoDB. Data is tiny (2 ha + 50 ha plots) and static. Removing the ~2s Athena floor entirely would put most endpoints in the tens-to-low-hundreds of ms.
HIGH Address cold starts (23.5% of traffic, ~1.75s each). Provisioned concurrency, or slim the package / lazy-import awswrangler (pandas+pyarrow dominate init).
MED /v3/overview issues two Athena queries (main + forecast). Make the forecast block lazy/optional, or reuse already-fetched rows. Would roughly halve overview latency — the most-called endpoint.
MED /v3/flora fetches the full result set then paginates in memory. Push LIMIT/OFFSET (and a COUNT(*)) into SQL so page 1 doesn't scan everything.
MED Push aggregation into SQL for overview / summary / species-breakdown. GROUP BY in Athena instead of pulling raw rows. Smaller transfer; complements (not replaces) caching.
LOW Steer clients from /v3/trends to /v3/trends-v2 with an explicit metrics list. trends always pulls all years; trends-v2 only does so when a census metric is requested.
LOW Investigate the 500s on /v3/forecast-by-*. Seen in the 7-day window; likely a param/edge-case path.