【Practical How-To】Build a “Complete” 2-Year, ~400 Million Tick FX Data Infrastructure on OptiMax VPS and Brute-Force Optimize Strategies with 64 Cores

The resolution of a backtest is set by the resolution of its data. The structure of price action that disappears on daily or hourly candles (movement inside the spread, the direction in which prices lean, simultaneous spikes) survives only in tick data, the record of every individual quote update. But tick data is a different order of magnitude: 6 FX currency pairs over a full 2 years comes to more than 400 million records. Collecting it without gaps, loading it into memory, and optimizing over it by brute force is more than an ordinary small VPS can handle. In this article we walk through the whole process on OptiMax, our FX-dedicated VPS: from building a complete 2-year tick-data foundation from scratch to brute-force optimizing 2,430 strategy-parameter combinations across 64 cores, with measured numbers at every stumbling point.

Test environment: OptiMax VPS (64 vCPU / 251GB RAM / Ubuntu 24.04)
Measurement period: 2023–2024 (full 2 years) / 6 currency pairs. Every figure in this article is a measured value.
Scope: general engineering techniques only (building the data foundation, avoiding rate limits, parallel optimization). None of our proprietary signal research is included.

OptiMax VPS — details & sign-up here

Why “tick data × high-spec VPS”

A tick is the smallest unit of market data: one record per price update (in the feed used here, a timestamped bid/ask quote with volumes). What goes on inside the spread and which way prices lean during a sudden move is microstructure that vanishes the moment ticks are aggregated into candles. That is exactly why verifying short-term strategies requires ticks. Yet tick data is on a different scale, and collection, storage, and computation all test a machine’s raw power. This article shares the pitfalls everyone hits in tick-data collection, including our own failure of thinking we had collected everything when about 80% was actually missing, and how we fixed it.

What we built (the result)

| Metric | Value |
| --- | --- |
| Period × currency pairs | Full 2 years (2023–2024) × 6 pairs |
| Total ticks | 405,616,079 (≈406 million) |
| Coverage | ≈100% (zero misses across trading hours; of each pair’s 15,048 hours: data ≈12,473 + market-closed (404) ≈2,574, unfetched ≤6) |
| 1-second bars | 141 million |
| Optimization trials | 2,430 combinations (signal × holding time × threshold × pair, each with a bootstrap CI) |

Time for the whole process (measured)

| Stage | Time | Notes |
| --- | --- | --- |
| ① Full tick collection (split across 2 clean IPs · gap-filling) | ≈3 hours | 10,856 s for 3 pairs per collector, 2 machines in parallel |
| ② Transfer to the compute machine (OptiMax), ≈1GB | 51 s | ≈13.6 MB/s |
| ③ Format conversion for analysis | 116 s | npz → parquet |
| ④ Brute-force optimization of 2,430 combos on 64 cores | 581 s | grid build 111 s + sweep 470 s, ≈60 cores occupied |

From zero to “an analyzable, complete 2-year tick foundation + optimization results” took, in effect, a little over 3 hours.
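As a quick sanity check on that total, the four measured stage times can simply be summed. The dictionary keys below are labels chosen for this sketch; the numbers are the measured values from the table.

```python
# Measured stage durations in seconds, copied from the timing table.
stage_seconds = {
    "collect": 10_856,   # full tick collection, 2 collectors running in parallel
    "transfer": 51,      # copy to the compute machine
    "convert": 116,      # npz -> parquet
    "optimize": 581,     # grid build 111 s + sweep 470 s
}

total_seconds = sum(stage_seconds.values())
total_hours = total_seconds / 3600
print(f"total: {total_seconds} s = {total_hours:.2f} h")

# Share of the elapsed time spent in each stage.
for name, sec in stage_seconds.items():
    print(f"{name:>9}: {sec / total_seconds:6.1%}")
```

Collection accounts for well over 90% of the elapsed time, so the download path, not the compute step, sets the overall schedule.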

STEP 1: Data collection ― a free feed and “two pitfalls”

Historical ticks can be pulled from Dukascopy’s free data feed (.bi5 = LZMA-compressed, 20-byte fixed-length records). Each file covers one hour, and it can be decoded in a few lines of Python using the standard-library lzma module and NumPy.

# .bi5 = LZMA stream of 20-byte big-endian records (ms_offset, ask, bid, ask_vol, bid_vol); one file per hour, fetched in parallel
import lzma, urllib.request, concurrent.futures as cf
import numpy as np
REC = np.dtype([("t",">u4"),("ask",">u4"),("bid",">u4"),("av",">f4"),("bv",">f4")])
def fetch_hour(sym, y, m, d, h):
    # the month in the URL path is zero-based (January = 00), hence m-1
    url = f"https://datafeed.dukascopy.com/datafeed/{sym}/{y:04d}/{m-1:02d}/{d:02d}/{h:02d}h_ticks.bi5"
    with urllib.request.urlopen(url, timeout=30) as resp:   # close the connection promptly
        raw = resp.read()
    if not raw:                                              # empty body: nothing to decompress, return zero ticks
        return np.empty(0, dtype=REC)
    return np.frombuffer(lzma.decompress(raw), dtype=REC)    # price = points/(1000 for JPY, 100000 otherwise)
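The decoded records still hold integer points and millisecond offsets within the hour. A minimal sketch of the conversion to absolute timestamps and float prices follows; the function name `to_prices` and the `is_jpy` flag are our own illustrative choices, and the divisors follow the comment in the fetch code (1000 for JPY pairs, 100000 otherwise).

```python
import numpy as np

# Same 20-byte big-endian record layout as REC in the fetch code.
REC = np.dtype([("t", ">u4"), ("ask", ">u4"), ("bid", ">u4"), ("av", ">f4"), ("bv", ">f4")])

def to_prices(rec, hour_start, is_jpy):
    """Convert raw records to absolute timestamps and float prices.

    rec        : structured array with dtype REC
    hour_start : np.datetime64 marking the start of the hour the file covers
    is_jpy     : True for JPY pairs (divide by 1000), False otherwise (divide by 100000)
    """
    scale = 1_000.0 if is_jpy else 100_000.0
    offsets = rec["t"].astype(np.int64).astype("timedelta64[ms]")
    ts = hour_start.astype("datetime64[ms]") + offsets
    ask = rec["ask"].astype(np.float64) / scale
    bid = rec["bid"].astype(np.float64) / scale
    return ts, ask, bid

# Usage with two hand-made records (illustrative values, not real market data).
demo = np.array(
    [(250, 110250, 110245, 1.0, 1.5), (1250, 110251, 110246, 0.8, 1.2)],
    dtype=REC,
)
ts, ask, bid = to_prices(demo, np.datetime64("2023-01-02T10:00"), is_jpy=False)
```

Keeping the raw integers on disk and converting only at analysis time avoids accumulating floating-point rounding in the stored data.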

Pitfall ①: HTTP 503 (rate limiting) is triggered by “bursts,” not “total volume”

To go faster we spread the work across several machines and repeatedly stopped and immediately restarted the downloaders. As a result, Dukascopy temporarily blocked our IP: first HTTP 503 responses, then no response at all. The conclusions from our testing:

  • The 503 trigger is not “total request volume” but “a burst of connections in a short window” (= repeated stop/restart, excessive concurrent connections).
  • Even once blocked, stopping requests entirely lets it recover on its own in 5–10 minutes.
  • The rule is “one run, modest concurrency, no stopping partway.” Here, 24 threads per pair × 2 pairs in parallel (= 48 connections) was the stable range.
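The “one run, modest concurrency” rule can be sketched with a fixed-size thread pool. `run_single_pass`, `fetch_one`, and the default of 24 workers are names and values chosen for illustration (24 mirrors the per-pair thread count quoted above); the fake fetcher at the bottom exists only to show that the concurrency cap holds.

```python
import concurrent.futures as cf
import threading
import time

def run_single_pass(jobs, fetch_one, max_workers=24):
    """Run fetch_one(job) for every job with at most max_workers in flight.

    Returns {job: result}. Exceptions are captured per job instead of aborting
    the pass, so one bad hour never forces a stop/restart of the whole run.
    """
    results = {}
    with cf.ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_one, job): job for job in jobs}
        for fut in cf.as_completed(futures):
            job = futures[fut]
            try:
                results[job] = fut.result()
            except Exception as exc:   # keep going; failed hours can be retried in a later pass
                results[job] = exc
    return results

# Demonstration with a fake fetcher that records the peak number of concurrent calls.
stats = {"in_flight": 0, "peak": 0}
stats_lock = threading.Lock()

def fake_fetch(job):
    with stats_lock:
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
    time.sleep(0.005)                  # stand-in for network latency
    with stats_lock:
        stats["in_flight"] -= 1
    return job

demo_results = run_single_pass(range(100), fake_fetch, max_workers=8)
```

Because the pool size is the only knob, the number of simultaneous connections can never exceed `max_workers`, however many hours are queued.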

Pitfall ② (most important): the “silent drop” ― you think you collected it, but 80% is missing

The first collection looked like it had “finished,” but when we later measured coverage it was only about 22%. The cause: the downloader silently discarded HTTP 503 responses, treating them the same as “no data” (404). Even when we were not blocked, soft 503s cropped up during congestion, and those hours quietly went missing. “Data is flowing” does not mean “data is complete.”
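One way to make the drop impossible to miss is to give every fetch an explicit outcome. The sketch below is not our production downloader: `classify_fetch`, the three status labels, and the injectable `opener` parameter are illustrative assumptions. The point is only that 404 and 503 end up in different buckets.

```python
import io
import urllib.error
import urllib.request

DATA, CLOSED, RETRY = "data", "closed", "retry"   # labels chosen for this sketch

def classify_fetch(url, opener=urllib.request.urlopen, timeout=30):
    """Fetch one URL and return (status, payload).

    DATA   : payload holds the raw bytes
    CLOSED : HTTP 404, treated as a legitimate "no data" hour (do not refetch)
    RETRY  : HTTP 503, any other HTTP error, timeouts and connection errors
    """
    try:
        with opener(url, timeout=timeout) as resp:
            return DATA, resp.read()
    except urllib.error.HTTPError as err:   # must come before OSError (it is a subclass)
        if err.code == 404:
            return CLOSED, None
        return RETRY, None
    except OSError:                         # URLError, timeouts, connection resets
        return RETRY, None

# Offline demonstration with stand-in openers (no network access needed).
def _raises(code):
    def opener(url, timeout):
        raise urllib.error.HTTPError(url, code, "demo", None, None)
    return opener

demo_url = "http://example.invalid/demo.bi5"
status_404, _ = classify_fetch(demo_url, opener=_raises(404))
status_503, _ = classify_fetch(demo_url, opener=_raises(503))
status_ok, body = classify_fetch(demo_url, opener=lambda url, timeout: io.BytesIO(b"x"))
```

Anything that is not a clean 404 falls into `RETRY` by default, so an unexpected error code shows up as a gap to refetch rather than as a missing hour.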

🛠 How we caught it and how we fixed it (a reproducible checklist)

  1. Verify completeness by “coverage / loss rate,” not by “is it flowing.” If the “nonempty rate” falls well below the trading-hours expectation (≈80%), suspect gaps. This time it surfaced when downstream analysis came out extremely thin (a 5-minute panel that should have had 150,000 bars had only around 200).
  2. Distinguish 404 from 503. 404 = a legitimate market close (do not refetch); 503/timeout = needs refetching. The moment you treat them as the same, the gaps become invisible.
  3. Multi-pass · gap-filling. Record the result of each (day, hour), and refetch only the failed hours with exponential backoff → repeat until failures hit 0.
  4. Close with a coverage report. Always output “total / data / closed(404) / unfetched,” and only trust the data once you’ve confirmed unfetched ≈0 and ticks/year match expectation.
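Steps 3 and 4 of the checklist can be sketched as a small loop plus a report. The function names, the `"data" / "closed" / "retry"` labels, the pass limit, and the definition of coverage as (data + closed) / total are assumptions made for this example; the fake source at the bottom returns a soft failure on the first request for every hour.

```python
import time
from collections import Counter

def fill_gaps(hours, fetch_status, max_passes=6, base_delay=1.0, sleep=time.sleep):
    """Multi-pass gap filling.

    hours        : iterable of hashable keys, e.g. (day, hour) tuples
    fetch_status : callable(key) -> "data" | "closed" | "retry"
    Returns {key: final_status}; keys still failing after max_passes stay "retry".
    """
    status = {key: "retry" for key in hours}
    for attempt in range(max_passes):
        pending = [key for key, st in status.items() if st == "retry"]
        if not pending:
            break
        if attempt > 0:
            sleep(base_delay * 2 ** (attempt - 1))   # exponential backoff between passes
        for key in pending:
            status[key] = fetch_status(key)
    return status

def coverage_report(status):
    """Summarise total / data / closed(404) / unfetched."""
    counts = Counter(status.values())
    total = len(status)
    resolved = counts["data"] + counts["closed"]
    return {
        "total": total,
        "data": counts["data"],
        "closed": counts["closed"],
        "unfetched": counts["retry"],
        "coverage": resolved / total if total else 0.0,
    }

# Demonstration: every hour fails once (a soft 503), then resolves on the second request.
seen = Counter()

def flaky(key):
    seen[key] += 1
    if seen[key] == 1:
        return "retry"
    return "closed" if key[1] >= 22 else "data"   # pretend hours 22-23 are market-closed

hours = [(day, hour) for day in range(2) for hour in range(24)]
report = coverage_report(fill_gaps(hours, flaky, sleep=lambda seconds: None))
```

A downloader that stopped after the first pass here would report zero usable hours; the second pass resolves all 48, and the report makes the difference visible as `unfetched`.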

As a result, coverage went from 22% to ≈100% (unfetched ≤6 hours out of 15,048 for each pair), and we reliably obtained 406 million ticks for the full 2 years.

STEP 2: Pitfall ③ memory ― tick processing is “RAM-bound”

Decoding expands every record in the target range into memory before anything is written out. In other words, required RAM is proportional to period × number of pairs.

In testing, a small helper VPS with 1.9GB RAM hit OOM immediately (the process was killed for lack of memory) on multiple years × multiple pairs. By contrast, OptiMax (251GB RAM) loaded the full 2 years × 6 pairs entirely into memory with room to spare, even at the analysis peak.
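A back-of-the-envelope estimate shows why the small machine had no chance. The tick count and the 20-byte record size come from this article; the 24-byte “working” layout (int64 timestamp plus float64 ask and bid) is an assumption for illustration, and real peak usage also depends on intermediate copies.

```python
TOTAL_TICKS = 405_616_079      # total from the results table
RECORD_BYTES = 20              # .bi5 fixed-length record size

raw_bytes = TOTAL_TICKS * RECORD_BYTES
# Assumed working layout (not stated in the article): int64 timestamp + float64 ask + float64 bid.
working_bytes = TOTAL_TICKS * (8 + 8 + 8)

GIB = 1024 ** 3
print(f"raw records : {raw_bytes / GIB:.1f} GiB")
print(f"working copy: {working_bytes / GIB:.1f} GiB")
```

Even the raw decoded records for all six pairs come to roughly 7.6 GiB, several times the 1.9GB of the helper VPS, before any derived arrays are built.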

In tick-data processing, RAM becomes the wall before the CPU does. OptiMax’s large memory exists precisely for this use.

STEP 3: The design key ― separate the “download machine” from the “compute machine”

  • Downloading is dominated by line quality (the route to Dukascopy).
  • Analysis and optimization are dominated by RAM and core count.

These are strengths of different machines, so we split the roles. This time we gathered data with two collectors on clean IPs; because the rate limit applies per IP, two collectors double the headroom under it, which is the real benefit of splitting. OptiMax served as the dedicated “compute mothership” that received the transfer and ran aggregation and optimization. Transferring ≈1GB took just 51 seconds, so neither bottleneck drags the other down.

STEP 4: “Brute-force optimize strategies” on 64 cores

This is where OptiMax is in its element. With the same idea as MT5’s Strategy Tester optimization, we assign 1 trial = 1 core and use every core to the full.

The key is the unit of parallelism. Parallelizing by “currency pair” alone only spins up 6 cores. Make every combination of (pair × signal × holding time × entry threshold) a single trial, and the trial count jumps so that all 64 cores stay busy. The heavy preprocessing (the 1-second grid) is built once and shared.

# Build the heavy preprocessing (1-second grid) once → share via a fork pool (copy-on-write) → brute-force all combinations
# build_grid, eval_combo, PAIRS, SIGNALS, HORIZONS and THRESHOLDS are not shown here
import multiprocessing as mp
GRIDS = {p: build_grid(p) for p in PAIRS}            # built once in the parent process (shared to children via COW)
tasks = [(p,sig,H,thr) for p in PAIRS for sig in SIGNALS for H in HORIZONS for thr in THRESHOLDS]
with mp.get_context("fork").Pool(64) as pool:        # request fork explicitly so children inherit GRIDS instead of rebuilding it
    results = pool.map(eval_combo, tasks)            # each combo = walk-forward + cost-aware + bootstrap CI
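The snippet above does not show `build_grid`. One plausible minimal form of a 1-second grid, assuming sorted millisecond timestamps and taking the last mid-price of each second that contains at least one tick, might look like this; it is a sketch, not the implementation used for the measurements.

```python
import numpy as np

def build_second_grid(ts_ms, bid, ask):
    """Collapse ticks into 1-second bars holding the last mid-price of each second.

    ts_ms : int64 array of tick timestamps in milliseconds, sorted ascending
    Returns (second_index, last_mid) for the seconds that contain at least one tick.
    """
    sec = ts_ms // 1000
    mid = (bid + ask) / 2.0
    # A tick is the last one of its second when the next tick falls in a different second.
    is_last = np.empty(len(sec), dtype=bool)
    is_last[:-1] = sec[1:] != sec[:-1]
    is_last[-1:] = True
    return sec[is_last], mid[is_last]

# Five hand-made ticks (illustrative values): two in second 0, one in second 1, two in second 3.
ts_ms = np.array([100, 900, 1500, 3200, 3900], dtype=np.int64)
bid = np.array([1.1000, 1.1001, 1.1002, 1.1003, 1.1004])
ask = bid + 0.0002
seconds, last_mid = build_second_grid(ts_ms, bid, ask)
```

The whole step is vectorized, so it runs once in the parent process and the resulting arrays can be read by every worker without being copied.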

Measured: on 141 million 1-second bars, all 2,430 combinations finished in 581 seconds (grid build 111 s + sweep 470 s) with ≈60 cores occupied. Each trial is a full verification: walk-forward evaluation (a time-ordered train → validate split), cost-aware PnL with the round-trip spread subtracted, and block-bootstrap confidence intervals.
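Two of those ingredients, the cost subtraction and the block-bootstrap interval, can be sketched as follows. This is a simplified stand-in, not our evaluation code: the function names, the block length of 50, the 0.8-pip cost, and the synthetic PnL series are all assumptions for illustration, and the walk-forward split is omitted.

```python
import numpy as np

def net_pnl(gross_pnl, spread_cost):
    """Per-trade net PnL for a taker: gross move minus the round-trip spread paid."""
    return np.asarray(gross_pnl, dtype=float) - np.asarray(spread_cost, dtype=float)

def block_bootstrap_ci(x, block_len=50, n_boot=1000, alpha=0.05, seed=0):
    """Moving-block bootstrap confidence interval for the mean of a time series x."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    block_len = min(block_len, n)
    n_blocks = -(-n // block_len)                        # ceil(n / block_len)
    rng = np.random.default_rng(seed)
    # Draw random block start positions, then expand each start into a run of indices.
    starts = rng.integers(0, n - block_len + 1, size=(n_boot, n_blocks))
    idx = (starts[:, :, None] + np.arange(block_len)).reshape(n_boot, -1)[:, :n]
    means = x[idx].mean(axis=1)
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

# Synthetic example (random numbers, not market data): a gross edge smaller than the cost.
rng = np.random.default_rng(1)
gross = rng.normal(loc=0.2, scale=1.0, size=2000)       # gross PnL per trade, in pips
net = net_pnl(gross, spread_cost=0.8)                    # assumed 0.8 pip round-trip cost
ci_gross = block_bootstrap_ci(gross)
ci_net = block_bootstrap_ci(net)
```

Resampling whole blocks rather than single observations preserves short-range autocorrelation, which matters for overlapping short-horizon returns. In this toy case the gross interval sits above zero while the net interval sits entirely below it.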

“You can’t use all cores because it’s a backtest” is a misconception. Take the “parameter combination” as the unit of parallelism and, as long as there are many more trials than cores, the sweep speeds up roughly in proportion to core count. Here the sweep kept ≈60 of OptiMax’s 64 cores busy.
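The measured numbers allow a rough estimate of what the same sweep would cost with less parallelism. The calculation assumes that about 60 cores were fully busy for the whole 470 s and that the work could be spread evenly over fewer workers, which is optimistic for the pair-only case.

```python
N_TRIALS = 2_430
SWEEP_SECONDS = 470       # measured wall-clock time of the parallel sweep
GRID_SECONDS = 111        # measured grid build, done once before the sweep
CORES_BUSY = 60           # approximate occupancy reported above

core_seconds = SWEEP_SECONDS * CORES_BUSY          # total CPU work in the sweep
per_trial = core_seconds / N_TRIALS                # average core-seconds per trial
single_core_hours = core_seconds / 3600            # the same sweep on one core

# Parallelising by currency pair only: at most 6 workers can be busy at once.
pair_only_seconds = core_seconds / 6

# Share of the 581 s spent in the one-off grid build.
serial_share = GRID_SECONDS / (GRID_SECONDS + SWEEP_SECONDS)

print(f"{per_trial:.1f} core-s per trial, {single_core_hours:.1f} h on one core")
print(f"pair-level parallelism only: {pair_only_seconds / 60:.0f} min")
print(f"grid build share of total: {serial_share:.0%}")
```

Under these assumptions the sweep represents about 7.8 core-hours, and parallelizing by pair alone would take around 78 minutes instead of under 8. The grid build is roughly a fifth of the total, so if it runs serially in the parent process it becomes the next thing to speed up once the sweep itself is parallel.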

Results (the general lessons)

The quantitative results over a full 2 years · 406 million ticks · 2,430 optimization trials:

  • The signal itself does “exist”: ignoring cost (GROSS), the Sharpe ratio reached up to +429. Short-term price action genuinely has structure.
  • But a taker (market order) can’t capture it: once the round-trip spread is subtracted, not a single one of the 2,430 combinations is positive (NET > 0 in 0/2,430). Moreover, the upper bound of the 95% confidence interval is positive in 0/2,430 trials, so statistically, too, no combination “could be positive.” (Completing the data also wiped out the noise-driven “lucky positives.”)
  • This re-confirms, on large-scale, complete data, the textbook market-microstructure result: a short-term edge fits inside the bid-ask spread, and the taker, who pays the spread, cannot recover it.

Practical implication: research into short-term edges only means anything “after costs are subtracted.” And to run that large-scale verification in realistic time you need complete data · RAM · core count.

Summary

| What we did | What made it work |
| --- | --- |
| Collecting a full 2 years · 406 million ticks | Modest parallelism + 404/503 distinction + multi-pass gap-filling took coverage from 22% to 100% |
| Expanding and processing 400 million ticks in memory | 251GB RAM (a small VPS OOMs) |
| Brute-force optimizing 2,430 combinations | 64 cores + “parallelism at the combination level” |
| Whole process | A little over 3 hours in practice |

Tick-level large-scale backtesting and optimization no longer demands monopolizing a special computer. With OptiMax VPS you can secure exactly the RAM and cores you need, when you need them, and run verification at this scale in realistic time. The data-collection pitfalls (503 bursts · silent drops · the RAM ceiling) can all be avoided with this article’s checklist.

This entire test was run on OptiMax VPS (64 vCPU / 251GB RAM).
Large memory · many cores · low latency, exactly as much as you need, when you need it. From FX automated trading to quant research.

Updated on June 21, 2026