DIMACS Working Group on Measuring Anonymity Notes from Session 3: Information Theoretic and Language-based Approaches

Scribe: Matthew Wright

In this session, we had three 15-minute talks based on submitted abstracts and about 45 minutes of panel discussion with the three speakers as panelists. The focus of the session was on information theoretic approaches to measuring anonymity, plus a novel language-based approach to using anonymity.

Talk #1: Parv Venkitasubramaniam. Anonymity in Tor-like systems under Timing Analysis: An Information Theoretic Perspective. (15 min.)

Basic Idea/Issue
- End-to-end timing attacks (confirmation)
- Adversary monitors all the links? What is the likelihood of this?
- Need an objective to evaluate the building blocks = the routing/mixing

Adversary assumptions
- Prior knowledge: traffic statistics, likelihood of a link
- System observation: timing on all links
- Network strategy: (observation points? missed this one)
- Observation includes all timing (past, present, future)
- Adversary is aware of the strategy

Information theoretic approach: JOINT distribution
- Pr(these are the specific sources of the packets obs., etc.)
- Entropic measure is useful
- Fano's inequality: Shannon entropy provides lower bound on probability of error
- Can integrate prior information using Bayes' Theorem.

Our contributions
- We use Poisson processes for arrivals and create optimal reordering strategies for a latency- and buffer-constrained system.
- Fundamental trade-off between anonymity and QoS.
- Anonymity of a mix-net is a linear fn. of anon. of the individual mixes.

Packets vs. Streams
- For long streams, need dummies. Lots!, e.g. for DLP.
- What if the stream is short lived? Maybe less padding?
- Admissible length: how long can the stream live w/ perfect anon?

Poisson assumption? Maybe this is not so realistic -- neither users nor websites are memoryless. Would be interesting to see this approach using other assumptions.

Reordering: is this really possible?

What is important is that order is hidden. [Matt's note: burstiness in traffic makes this hard, too]

Talk #2: Kostas Chatzikokolakis. Information theory and decision theory to measure information leakage [using gain functions]. (15 min.)

Information theoretic definitions must say something meaningful about your application to be viable.

Quantitative information flow: measure how much information is leaked by the system. [think covert channels or data query privacy]

Model
- Channel C, input x, output y -- C[x, y]: prob. of y given x
- Inputs governed by a prior probability. Apply Bayes.

Leakage
- Attacker tries to guess the secret (x) in one try. Chance of success?
- Prior vuln: just given the prior prob.
- Post: add the channel leakage
- Leakage: difference from Post and Prior (min-entropy leakage)

Limitations
- What about partial guessing, a property/part of a secret, multiple tries, or other aspects?
- Example: a toy channel that reveals exactly one sender (no more or less)
  - If attacker needs to guess the whole list: lost 10 of 10240 bits
  - Guess the receiver of a particular sender: lost 1.016 out of 10 bits
  - Guess just the receiver of any sender: lost all bits!

Gain functions
- Attacker makes one guess about the secret
- The benefit is a gain function
- Success measure: The expected gain of a best guess.
- Benefit: Model a variety of attackers and operational scenarios. e.g.: can do approximate guessing, property of a secret (gender/country), part of a secret (part of a location, IP), multiple tries.
- Theorem: g-capacity <= min-capacity for all gain functions g
- Min-capacity is an upper bound on Shannon capacity

This essentially brings the existing work in decision theory to anonymity.
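
[Editor's sketch, not from the talk] The leakage and gain-function definitions in the Talk #2 notes can be tried out on a small channel. The Python below is a minimal sketch that assumes the usual formulation: the vulnerability is the expected gain of the best single guess, and the leakage in bits is log2 of posterior over prior vulnerability (for the identity gain function this is the difference between prior and posterior min-entropy). The 4-secret channel, the two example gain functions, and all function names are hypothetical and chosen only for illustration.

```python
import math

def prior_g_vulnerability(prior, gain):
    """Expected gain of the best single guess using only the prior.

    gain[w][x] is the benefit of guess w when the secret is x.
    """
    return max(
        sum(p * g for p, g in zip(prior, gain_row))
        for gain_row in gain
    )

def posterior_g_vulnerability(prior, channel, gain):
    """Expected gain of the best guess after seeing the channel output.

    channel[x][y] is the probability of output y given input x.
    """
    n_outputs = len(channel[0])
    total = 0.0
    for y in range(n_outputs):
        # Joint probability of (x, y) for this output (Bayes, unnormalized).
        joint = [prior[x] * channel[x][y] for x in range(len(prior))]
        total += max(
            sum(j * g for j, g in zip(joint, gain_row))
            for gain_row in gain
        )
    return total

def g_leakage(prior, channel, gain):
    """Leakage in bits: log2(posterior / prior g-vulnerability)."""
    return math.log2(
        posterior_g_vulnerability(prior, channel, gain)
        / prior_g_vulnerability(prior, gain)
    )

def identity_gain(n):
    """Gain 1 for guessing the whole secret exactly, 0 otherwise."""
    return [[1.0 if w == x else 0.0 for x in range(n)] for w in range(n)]

# Hypothetical toy system: 4 equally likely secrets (2 bits each);
# the channel deterministically reveals the low bit of the secret.
prior = [0.25, 0.25, 0.25, 0.25]
channel = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]

# Attackers who only care about a property of the secret.
low_bit_gain = [[1.0 if (x & 1) == w else 0.0 for x in range(4)] for w in range(2)]
high_bit_gain = [[1.0 if (x >> 1) == w else 0.0 for x in range(4)] for w in range(2)]

min_entropy_leak = g_leakage(prior, channel, identity_gain(4))
low_bit_leak = g_leakage(prior, channel, low_bit_gain)
high_bit_leak = g_leakage(prior, channel, high_bit_gain)

print("min-entropy leakage (whole secret):", min_entropy_leak)
print("g-leakage (guess low bit):", low_bit_leak)
print("g-leakage (guess high bit):", high_bit_leak)
```

In this toy case the attacker who wants the whole secret and the attacker who wants only the low bit each gain 1 bit, while the attacker who wants the high bit gains nothing. All three values stay within the min-entropy leakage, which is consistent with the theorem in the notes.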

Talk #3: Aslan Askarov and Stephen Chong. Towards Language-Based Network Anonymity. (15 min.)

Applications (browser, ssh, etc.)
- What if the application knows about anonymity underneath?
- App may realize that anonymity isn't needed
- App may realize that direct communication is required

Why anonymity?
- Want to hide communications from network (e.g. ISP)
- Want to remain anonymous from the receiver

Programming languages techniques
- ISP case: soundly infer such messages
- Receiver case:... [what was this?]

Model
- Anonymous communications as a primitive
- Ensure that they are used securely
- Ex: online auction. Participation is public, winner is secret. Can infer using information flow that the winner declaration must be anonymous.
- Ex2: EasyChair. Author button should be anonymous connection.
- App figures out when you need privacy and uses an anon connection then (and only then) to improve performance.

Measuring anonymity
- Given a network anonymity metric X
- Can we be sure that the anonymity doesn't go down?

Panel Discussion (45 min.)

Note: Discussion participants are labeled A to Z for each question.

To Aslan: Suppose I don't want NYT advertisers to know who I am, but NYT is OK. Can I do this without a leak given possible side channels due to simultaneous loading?
A: [yes] The inputs are governed by a prior probability. Apply Bayes.

To Aslan: If you weren't able to split anon and non-anon, this wouldn't help much, right?
A: Yes. Reframed to be positive -- it's an anonymity-preserving optimization of your application.
B: Does the programmer need to know something about the anonymity service?
A: Yes.
C: There can be different specialized applications requiring different anonymity levels, and this could be taken to the transport level too.

[To Aslan] If all the traffic is especially sensitive, did we make it a bigger target to attack? e.g. in the auction protocol, if you look at the timing, you can see the auction winner is the only person who's going to use anonymous channels.
A: To some degree, people already do this. You don't just turn on Tor 100% of the time. May need to move away from this idealistic goal.
B: An example of this is Tor set up w/ DNS not Tor-ized. This obviously leaks your connections to the DNS (e.g. your ISP).
C: If you start at 100% Tor, and you run our idea on your application, you can back off without losing some anonymity.
D: Isn't the point of something like Tails that you can be sure to run everything over Tor without screwing up? This approach seems backwards relative to that.
E: When crypto was weak, you had to use message discipline and be cautious about what you sent. Same here -- the more you use the system, the bigger/clearer the fingerprint you leave behind.
F: That depends on the attacker model.
G: e.g. if you have two gmail accounts, one is public and one is sensitive, it's better to access the public one without using Tor.
A: You have an anonymity budget. The more you use, the more you lose, and then you need a refresh.
H: Trying to do this [what?] in the Dissent project.

To Kostas: Is there a reason to use expectation in the gain functions (vs. min)?
A: This depends on the application. Very unlikely, but revealing, events are OK (or we can live with them). Differential privacy will not say that. It will say the unlikely event is very bad. This is an extension of information theoretic definitions.
B: (to Kostas) You should not use putting in your password as an example.

Should we really be developing an anonymity-usability metric?
A: I agree with this idea. There is no free lunch. Where on the spectrum do you land [b/w high anonymity and high usability]?
B: Anonymity depends on other users' activity. If you have a system with very sensitive users, the anonymity set is small. You need to make the cost for users as low as possible to get users who care less.
C: Sure, if you divide people up, it's a problem.
B: The system can never get started.
A: What about a model in which the users also provide the resources (P2P anonymity)?
D: Oh, that's a world of pain.
E: And not so friendly to many users. Ref. to the many broken designs in P2P anonymity.

None of the metrics we've talked about today are any good at dealing w/ active adversaries. This is a big problem.

A: That's not entirely true. Strategies have some tolerance to some level of active attacks. That tolerance is a measure.*
B: What about DoSing, degrading, etc.? These don't account for that.
C: If it is assuming a distribution in advance, then yes. But if the metric can be done based on what is observed, then it can be OK.

* Diaz+ have a HotPETs paper that examines this: http://www.cosic.esat.kuleuven.be/publications/article-2320.pdf