Literature Review

Author: Lucas Luhur

Last updated: 2026-06-12

> For each paper, the key takeaways (with the important maths explained simply) and how it matters for this project.

Index of sources

#	Source	Relevance
1	Logos Blockchain Documentation	(primary system) Defines the exact system we model: a stake-weighted, per-slot PoS lottery over $N$ nodes whose winner broadcasts a proposal, with Blend as the anonymisation layer in between. Its three Blend knobs - path length, per-hop delay, cover-traffic rate, all quota-bounded - are the README’s “configurable black box”. Names the two adversary classes we implement (global + local/compromised-node) and makes the coupling between stake-privacy and unlinkability explicit and mechanistic.
2	Ouroboros Crypsinous	(theoretical anchor) Proves the cryptography leaves only one leak - block-production frequency ($P[\text{win}]=\varphi_f(\text{stake})$) - then explicitly declares the network layer out of scope and assumes a plain broadcast. That out-of-scope gap is the whole project. Its footnote (“an anonymous broadcast would remove the submitter leak”) points directly at Blend, and the stake-weighted frequency signal motivates our frequency -> stake estimator whose variance-shrinkage is the time-to-link curve.
3	Hiding Routing Information (Onion Routing)	(anonymisation layer) The “Tor-style” baseline to implement: layered encryption + a path of relays + low latency, but no deliberate delay or cover. Its candid §5 vulnerabilities (end-to-end timing correlation, the constant-traffic trade-off) are our adversary’s playbook and motivate the timing-correlation module. Also fixes the key contrast: it hides who-talks-to-whom (unicast), whereas our problem hides the originator of a broadcast.
4	Tor	(anonymisation layer) The deployed low-latency reference and the “no mixing, no cover” endpoint of our comparison. Its threat model is the citable formalisation of our partial-view adversary; its $(m/N)^2$ relay-compromise result is a clean analytic baseline to extend to broadcast; website fingerprinting is the template for a single-end proposer fingerprint. Cover traffic was an open problem here in 2004 - situating Blend/Loopix as the later answers.
5	Chaum Mixes	(anonymisation layer) The mix-net-style high-latency baseline (batch + reorder + uniform size). Strikingly, it already contains two of Blend’s three defining features - cover traffic and broadcast/flood delivery - so Blend is essentially a modern Chaum mix tuned for scarcity. Chaum’s own caveat (economising on dummies “may open the door to statistical attack”) is our research question verbatim, and his global-active adversary is the strongest model in our suite.
6	Loopix	(anonymisation layer · keystone) The keystone for modelling Blend: supplies the exact distributions (exponential per-hop delays, Poisson cover in three streams, stratified $l\ge3$), the security knob $\lambda/\mu$, and two ready-made metrics (pool entropy, end-to-end likelihood difference $\varepsilon$). Its memoryless timing-unlinkability guarantee assumes healthy cover rates - which Blend’s quotas starve - so the regime where its assumptions break is the thesis’s central experiment.
7	Anonymity Trilemma	(impossibility bound) Gives the theoretical ceiling: with cover ($\beta$) and latency ($\ell$) both quota-bounded, the advantage $\delta$ cannot be negligible - so the README’s “any regularity is exploitable” is provable, not just plausible. Makes $\beta$ and $\ell$ the master axes of the parameter sweep (overlay the $2\beta\ell=1$ frontier), gives $\delta$ a theoretical floor for the unlinkability metric, and shows each compromised relay cancels one round of delay budget ($\ell\to\ell-c$).
8	Anonymous Communication (Piotrowska lecture)	(design-space map) Places Blend in the taxonomy (stratified + continuous-time + loop cover = Loopix/Nym lineage, not Chaum-cascade or Tor-circuit) and explains mechanistically why scarce traffic kills anonymity - our core regime. Exposes two exploitable cracks in the memoryless guarantee (the “since last empty” clause collapses under sparsity; the per-packet guarantee ignores repeated cross-epoch observation) and introduces Nym as Blend’s closest real cousin, with 3 hops as the empirical sweet spot.
9	Comprehensive Trilemma	(impossibility bound) Pins Blend on the unfavourable side of the trilemma: its independent cover traffic earns no DC-net “user-coordination” bonus, so it pays the full price. The unified floor $\hat\ell(p'+\beta)<1$ is the single cleanest equation for our sweep, with $p'$ = scarce proposal rate. Proves any compromise ($c>0$) makes constant latency fatal, and confirms our statistical estimator is exactly the stronger probabilistic adversary its (weaker, possibilistic) bound under-states.
10	Dandelion++	(architectural sibling) The closest architectural analogue to Blend - a deployed, Tor-free, broadcast-source anonymity layer on a blockchain P2P graph. Hands us the precision/recall metric pair (with floors recall $\ge p$, precision $\ge p^2$), the first-spy + Galton-Watson analysis toolkit, and - crucially - the intersection attack that is our thesis already demonstrated (~10 observations deanonymise at 30% spies). Its fixed-path “cable” defence is exactly what Blend’s fresh random path per message may forgo.
11	De-Anonymisation Attacks on Tor (survey)	(adversary playbook) The menu of attacks for the adversary module, plus the vocabulary (traffic analysis vs confirmation; correlation/timing/watermarking). Names the concrete statistical tools (Spearman, mutual information, cross-correlation, learned DeepCorr) and the website-fingerprinting classifier ladder. DeepCorr’s “accuracy improves with observation time” is direct empirical support for time-to-link, and the “network size kills node-compromise” lesson cuts toward Blend, whose small network makes compromised relays more feasible.
12	Continuous stop-and-go mixnets	(Blend mixing model) The first formal end-to-end model of Blend’s exact mechanism (continuous exponential delays, stratified, source-routed = the MCM model). Gives parameterised $\delta$ bounds in our sweep knobs and the inherent FIFO timing leak that survives even an ideal mix. Its decisive result - strong user-unlinkability needs $\lambda_u\propto\lambda$ - is the rigorous form of “scarcity kills anonymity”, and it leaves cover traffic as an explicit open problem (our headline extension).
13	Logos Anonymity analysis (Mozeika)	(Blend mixing model · first-party) Logos’ own discrete-time (geometric-delay) companion to #12, in the queueing idiom our discrete-event simulator uses directly. Hands us a closed-form single-node mixing meter, a ready-to-code Monte-Carlo FIFO estimator, and the three tunable “levers” ($k$, $\rho=q_S/q_M$, link-latency spread). The cleanest first milestone: reproduce a first-party result, then extend it with the cover traffic it omits.
14	Statistical Disclosure Attacks	(methodological keystone) The methodological backbone - the “patient observer” recast as signal detection, with a closed-form observations-to-link bound that is the time-to-link curve in algebra. Proves uniform cover is defeated by averaging (it buys time, not safety), gives the susceptibility condition $m<N/(b-1)$, and generalises to any abstract anonymity system. We adapt it from recipient- to sender-side and test whether Blend’s cover violates its known-stationary-mean assumption.
15	Towards Measuring Anonymity	(metric) The standard normalised anonymity metric, degree of anonymity $d=H(X)/H_M\in[0,1]$ - natively sender-side, so it fits the broadcast-originator problem with no adaptation and unifies every attack module under one comparable output. It literally names this dissertation’s two contributions as open problems: the evolution of $d$ over observation time and measuring the effect of dummy traffic. We also report the un-normalised $H(X)$ to avoid the small-set illusion in the scarce regime.
16	Failures in mix networks (Mozeika)	(compromise model · first-party) Logos’ own statistical analysis of node-compromise in Blend’s stratified topology. Gives $\alpha$ = the fraction of fully-compromised paths as a structural deanonymisation floor (trivially deanonymised regardless of timing defences), and shows the naïve $(M/N)^L$ formula is only the infinite-width limit - finite, scarce mixnets are ~2.2× more compromised. Supplies the hypergeometric compromise model and a layer-optimisation algorithm for the “best Blend configuration” deliverable.
17	Practical Traffic Analysis	(methodological) The experimental template: it points the SDA estimator at increasingly Blend-like systems (pooling, mix networks, dummies, partial views) and shows the intersection attack is slowed but never stopped. Its closing question - “how long can designs defend which senders against an adversary who sees how much” - is the project’s research question almost verbatim, and its median-rounds-to-link protocol is our experimental loop. Key finding: padding works only by denying the attacker a picture of the network in the target’s absence.
18	Information-Theoretic Anonymity Metric	(metric) The absolute companion to #15: effective size $S=-\sum p_u\log_2 p_u$ (“bits the attacker still needs”) and $2^S$ (effective anonymity-set size), the honest headline number in the scarce regime where raw set-size lies. Provides a pool-mix closed form to validate the simulator against, a composition law for multi-hop paths, and - most directly - the route-length attack, which weaponises exactly the kind of publicly-known path-length quota Blend imposes.
19	The Nym Network	(deployed-system comparison) The deployed, incentivised Loopix and Blend’s closest real-world cousin - its parameter set (stratified $L\in\{3,4\}$, stake-weighted per-epoch node selection, exponential delays, Poisson loop cover, epochs) is the defensible “deployed baseline” configuration. Its $\alpha^L$ compromise claim is the primary-source target #16 corrects, its stake-weighted selection creates the stake–topology–signal coupling no source models, and its load-bearing “anonymity loves company” assumption runs exactly backwards in Blend’s scarce regime.
20	Do Dummies Pay Off?	(cover-traffic keystone) The cover-traffic result in closed form: profiling error decays as $1/\rho$ and dummies change only the constant - cover buys time, never safety. It is the LSDA attack engine for our module, and its optimal-cover result ($\delta_i\propto\lambda_i$, i.e. cover proportional to sending rate) combines with #1 and #2 into the project’s headline finding: Blend’s flat per-node cover quota under-protects exactly the high-stake proposers an adversary most wants to deanonymise.

Dictionary

Recurring concepts, in roughly the order they become relevant.

Metadata - everything about a message except its content: who sent it, when, how big, how often, by what route. Encryption hides content; it does not hide metadata. This whole field exists to protect metadata.
Traffic analysis - inferring who-sent-what (or who-talks-to-whom) from metadata alone. This is the adversary’s job, and the thing the project measures.
Traffic confirmation vs traffic analysis - confirmation verifies a suspected pair (“is node X the proposer?”); analysis discovers an unknown one (“who is the proposer?”). Confirmation is easier and is what end-to-end timing attacks do.
Anonymity set - the set of participants who could plausibly have originated a given message. Bigger and more uniform = more anonymous. Shrinks toward one as the adversary learns more.
Sender anonymity - hiding who originated a message. This is the project’s core problem (the originator of a block proposal). Contrast receiver anonymity (who received it), which is less relevant because broadcasts go to everyone.
Unlinkability - can’t tie a message to its sender (or tie two messages to the same sender).
Unobservability - can’t even tell whether a node is sending at all (a stronger property, provided by always-on cover traffic).
Mix / mixnet - a relay (or network of relays) that hides the correspondence between its inputs and outputs, either by batching + reordering or by random per-message delays.
Layered (“onion”) encryption - wrap the message in one encryption layer per hop; each relay peels exactly one layer and learns only its immediate predecessor and successor. One honest relay on the path is enough to break the chain of custody.
Padding - making every message a uniform size (fixed-size onions / cells, e.g. Tor’s 512-byte cells), so message length leaks nothing - not the hop count, not the payload size. Distinct from cover traffic: padding normalises the size of real messages; cover injects fake ones. Both deny the adversary a per-message feature to correlate on.
Cover / dummy traffic - fake messages, indistinguishable from real ones, injected to manufacture a crowd and hide real sending patterns.