A New Design of Private Information Retrieval for Storage Constrained Databases

Nicholas Woolsey, Rong-Rong Chen, and Mingyue Ji
Department of Electrical and Computer Engineering, University of Utah
Salt Lake City, UT, USA
Email: {nicholas.woolsey@utah.edu, rchen@ece.utah.edu, mingyue.ji@utah.edu}

arXiv:1901.07490v1 [cs.IT] 22 Jan 2019

Abstract: Private information retrieval (PIR) allows a user to download one of $K$ messages from $N$ databases without revealing to any database which of the $K$ messages is being downloaded. In general, the databases can be storage constrained, where each database can only store up to $\mu K L$ bits, where $\frac{1}{N} \le \mu \le 1$ and $L$ is the size of each message in bits. Let $t = \mu N$. A recent work showed that the capacity of Storage Constrained PIR (SC-PIR) is $\left(1 + \frac{1}{t} + \frac{1}{t^2} + \cdots + \frac{1}{t^{K-1}}\right)^{-1}$, which is achieved by a storage placement scheme inspired by the content placement scheme in the literature of coded caching and the original PIR scheme. Not surprisingly, this achievable scheme requires that each message is $L = \binom{N}{t} t^K$ bits in length, which can be impractical. In this paper, without trying to make the connection between SC-PIR and coded caching problems, based on a general connection between the Full Storage PIR (FS-PIR) problem ($\mu = 1$) and the SC-PIR problem, we propose a new SC-PIR design idea using novel storage placement schemes. The proposed schemes significantly reduce the message size requirement while still meeting the capacity of SC-PIR. In particular, the proposed SC-PIR schemes require the size of each file to be only $L = N t^{K-1}$ compared to the state-of-the-art $L = \binom{N}{t} t^K$. Hence, we conclude that PIR may not meet coded caching when the size of $L$ is constrained.

I. INTRODUCTION

Recent works have taken an information theoretic approach to solve the private information retrieval (PIR) problem [1], [2] originally introduced by Chor et al. [3], [4]. In the PIR problem, a user desires to privately download one of $K$ messages from $N$ non-colluding databases. In this context, privacy means that the identity of the message desired by the user is not revealed to any database.
Ensuring privacy relies on the concept that a user will request sub-messages from all $K$ messages as opposed to just the message that the user desires. To efficiently download the desired message, the user strategically generates database queries that utilize undesired but downloaded sub-messages for coding opportunities. The rate of a PIR scheme is defined as the ratio of desired bits, $L$, or the size of each message, to the total number of downloaded bits, $D$. The capacity (optimal rate) is defined as the maximum achievable rate.

Previously, Sun and Jafar [1] derived the capacity of the Full Storage PIR (FS-PIR) problem where a user privately downloads one of $K$ messages from $N$ databases that each store all $K$ messages. In this case, the capacity is $C = \left(1 + \frac{1}{N} + \frac{1}{N^2} + \cdots + \frac{1}{N^{K-1}}\right)^{-1}$, which was achieved by a PIR scheme requiring $L = N^K$. This result was further generalized by Attia et al. [2] for the Storage Constrained PIR (SC-PIR) problem where each database can only store $\mu K L$ uncoded bits, where $\frac{1}{N} \le \mu \le 1$. In this case, both a storage placement scheme and a PIR scheme (querying and decoding) need to be designed. Let $t = \mu N$. The capacity of SC-PIR is $\left(1 + \frac{1}{t} + \frac{1}{t^2} + \cdots + \frac{1}{t^{K-1}}\right)^{-1}$ under an uncoded storage placement constraint and was achieved by a storage placement scheme inspired by the coded caching problem [5] and a PIR scheme based on [1]. One of the limitations of this scheme is the requirement of a large message size, $L = \binom{N}{t} t^K$ [2], which is due to the fact that the storage placement is designed based on the cache placement in the coded caching problem [5]. Hence, the proposed PIR scheme of [2] can be impractical for a large number of databases. This achievable scheme was generalized to decentralized storage placement in [6]. Furthermore, Tian et al. [7] used a Shannon-theoretic approach to analyze the SC-PIR problem for the canonical case of $K = 2$ and $N = 2$ and proposed the optimal linear scheme. More interestingly, they also showed that a non-linear scheme can use less storage than the optimal linear scheme.

In this paper, we aim to find SC-PIR schemes that achieve the capacity of SC-PIR while requiring a significantly smaller message size $L$.
In order to achieve this goal, for the storage placement, we abandon the idea of using the cache placement of the coded caching problem and design it from scratch. In fact, our proposed SC-PIR schemes achieve the capacity and require only $L = N t^{K-1}$, which is significantly less than $L = \binom{N}{t} t^K$ in [2]. More specifically, our contributions are as follows.

Our Contributions:
1) We provide a general design methodology for the SC-PIR problem by establishing a generic connection between the FS-PIR and SC-PIR problems. Based on this connection, a SC-PIR scheme can be readily designed from any given FS-PIR scheme.
2) We propose a simple storage placement when $N/t$ is an integer. By adopting the achievable scheme based on [1], the capacity of SC-PIR can be achieved with $L = N t^{K-1}$. This serves as a base case for the more general scenario when $N/t$ is not an integer.
3) When $N/t$ is not an integer, we propose a novel storage placement, which in conjunction with the FS-PIR scheme of [8], achieves the capacity of SC-PIR and only requires $L = N t^{K-1}$. The reduction in $L$ comes from the proposed novel storage placement.
4) We present a set of sufficient conditions under which the proposed SC-PIR schemes are capacity-achieving.

Notation Convention: We use $|\cdot|$ to represent the cardinality of a set or the length of a vector and $[n] := [1, 2, \ldots, n]$.

II. PROBLEM FORMULATION

There are $K$ independent messages, $W_1, \ldots, W_K$, each of size $L$ bits. The messages are collectively stored in an uncoded fashion among $N$ non-colluding databases that each have a storage capacity of $\mu K L$ bits, where $\frac{1}{N} \le \mu \le 1$. We define $Z_n$ as the storage contents of database $n \in [N]$. Also, we define $t \triangleq \mu N$ as the average number of times each bit of the messages is stored among the databases. A user makes a request $W_k$ and sends a query $Q_n^{[k]}$, which is independent of the messages, to each database $n \in [N]$, which then sends an answer $A_n^{[k]}$ such that

$$H\left(A_n^{[k]} \mid Z_n, Q_n^{[k]}\right) = 0, \quad \forall k \in [K]. \tag{1}$$

Furthermore, given the answers from all the databases, the user must be able to recover the requested message with a small probability of error. Therefore,

$$H\left(W_k \mid A_1^{[k]}, \ldots, A_N^{[k]}, Q_1^{[k]}, \ldots, Q_N^{[k]}\right) = 0. \tag{2}$$

The user generates queries in a manner that ensures privacy such that no database has insight into which message the user desires, i.e.,

$$I\left(k;\, Q_n^{[k]}, A_n^{[k]}, W_1, \ldots, W_K, Z_1, \ldots, Z_N\right) = 0. \tag{3}$$

Let $D$ be the total number of downloaded bits. Given $\mu$, we say that a pair $(D, L)$ is achievable if there exists a SC-PIR scheme with rate $R = L/D$ that satisfies (1)-(3). The SC-PIR capacity is defined as

$$C(\mu) = \max\{R : (D, L) \text{ is achievable}\}. \tag{4}$$

III. THE PROPOSED SC-PIR SCHEME WHEN $N/t \in \mathbb{Z}^+$

In order to present the proposed scheme, we need to establish a connection between the FS-PIR and SC-PIR problems. This connection is vital to reduce the required minimum size of messages from $\binom{N}{t} t^K$, as in the state-of-the-art scheme of [2], to $N t^{K-1}$ without affecting the optimal rate. We show that an achievable SC-PIR scheme can be derived from any general achievable scheme for the FS-PIR problem. Hence, by using the proposed storage placement, the achievable scheme in [1] can be used to obtain a new SC-PIR scheme.
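The capacity expression and the two message-size requirements quoted above are easy to evaluate numerically. The following is a minimal sketch, not part of the original paper; the function names `sc_pir_capacity`, `message_size_proposed`, and `message_size_prior` are hypothetical and chosen only for illustration. It assumes integer $t$ and uses exact rational arithmetic.

```python
from fractions import Fraction
from math import comb


def sc_pir_capacity(t: int, K: int) -> Fraction:
    """Return (1 + 1/t + ... + 1/t^(K-1))^(-1) for integer t (hypothetical helper)."""
    return 1 / sum(Fraction(1, t**k) for k in range(K))


def message_size_proposed(N: int, t: int, K: int) -> int:
    """Message size L = N * t^(K-1) claimed for the proposed schemes."""
    return N * t ** (K - 1)


def message_size_prior(N: int, t: int, K: int) -> int:
    """Message size L = C(N, t) * t^K quoted for the scheme of [2]."""
    return comb(N, t) * t**K


if __name__ == "__main__":
    # The two parameter sets used in the paper's examples.
    for N, t, K in [(4, 2, 3), (5, 3, 2)]:
        print(N, t, K, sc_pir_capacity(t, K),
              message_size_proposed(N, t, K), message_size_prior(N, t, K))
```

For the parameters of the two worked examples in this paper, the sketch gives message sizes of 16 versus 48 bits and 15 versus 90 bits, matching the figures stated in the text.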
To illustrate our idea, we first present an example as follows.

A. A Storage Constrained PIR Example when $N/t \in \mathbb{Z}^+$

Consider $N = 4$ databases labeled DB1 through DB4. Collectively the databases store $K = 3$ messages, denoted by $A$, $B$ and $C$. Each message is comprised of $L = 16$ bits.

1) Storage placement scheme: We split each message as follows.

$$A = \{a_i^j : i \in [2], j \in [8]\} \tag{5}$$
$$B = \{b_i^j : i \in [2], j \in [8]\} \tag{6}$$
$$C = \{c_i^j : i \in [2], j \in [8]\}. \tag{7}$$

Each database has a storage capacity of up to 24 bits, or half of all 3 messages ($\mu = \frac{1}{2}$). The storage contents of the databases are defined to be

$$\begin{aligned} Z_1 = Z_2 &= \{a_1^j : j \in [8]\} \cup \{b_1^j : j \in [8]\} \cup \{c_1^j : j \in [8]\}, \\ Z_3 = Z_4 &= \{a_2^j : j \in [8]\} \cup \{b_2^j : j \in [8]\} \cup \{c_2^j : j \in [8]\}. \end{aligned} \tag{8}$$

2) PIR Scheme: Each database stores 8 out of 16 bits of each message. Databases 1 and 2 have the same storage contents, but do not have any storage contents in common with databases 3 and 4. Likewise, databases 3 and 4 have the same storage contents. In this way, we essentially reduce a SC-PIR problem into two independent FS-PIR problems; one consists of databases 1 and 2, and the other consists of databases 3 and 4. Subsequently, we can simply adopt the achievable FS-PIR scheme of [1] to generate the queries for each pair of databases separately. The queries of a user that desires message $A$ are shown in Table I.

TABLE I
STORAGE CONSTRAINED PIR, $N = 4$, $K = 3$, $\mu = \frac{1}{2}$

| DB1 | DB2 | DB3 | DB4 |
|---|---|---|---|
| $a_1^5$, $b_1^8$, $c_1^6$ | $a_1^1$, $b_1^3$, $c_1^1$ | $a_2^5$, $b_2^7$, $c_2^4$ | $a_2^2$, $b_2^6$, $c_2^2$ |
| $a_1^6 + b_1^3$ | $a_1^3 + b_1^8$ | $a_2^1 + b_2^6$ | $a_2^7 + b_2^7$ |
| $a_1^7 + c_1^1$ | $a_1^8 + c_1^6$ | $a_2^6 + c_2^2$ | $a_2^8 + c_2^4$ |
| $b_1^6 + c_1^5$ | $b_1^7 + c_1^3$ | $b_2^3 + c_2^6$ | $b_2^8 + c_2^7$ |
| $a_1^2 + b_1^7 + c_1^3$ | $a_1^4 + b_1^6 + c_1^5$ | $a_2^3 + b_2^8 + c_2^7$ | $a_2^4 + b_2^3 + c_2^6$ |

3) Achievable Rate: The total number of downloaded bits is $D = 28$. Thus, we have for this scheme

$$\frac{L}{D} = \frac{16}{28} = \frac{4}{7}, \tag{9}$$

which achieves the capacity of $\left(1 + \frac{1}{t} + \frac{1}{t^2}\right)^{-1} = \left(1 + \frac{1}{2} + \frac{1}{2^2}\right)^{-1} = \frac{4}{7}$. Compared to the SC-PIR scheme of [2] that requires $L = \binom{N}{t} t^K = \binom{4}{2} \cdot 2^3 = 48$ bits, the proposed SC-PIR requires only $L = 16$ bits.

4) Privacy Constraint: Privacy is ensured since the FS-PIR scheme of [1] is used to privately download half of message $A$ from DB1 and DB2 and the other half from DB3 and DB4.
The query to each database is symmetric such that for each bit of $A$ that is requested, a bit each from $B$ and $C$ is also requested. All coded pairs of bits from the 3 messages are requested an equal number of times. Ultimately, the user can decode all bits of message $A$, because downloaded bits of $B$ and $C$ can be used for decoding (see Table I).

In the following, we will first formalize the connection between the FS-PIR and SC-PIR problems and then generalize this example.
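The decoding claim for this example can be checked mechanically. The sketch below is an illustration only, not the authors' code: it transcribes the queries of Table I (with the dropped indices restored from context, which is an assumption about the original table), draws random message bits, lets each database answer with the XOR of the requested bits, and then decodes every bit of $A$ using only the received answers. The names `TABLE_I`, `parse`, and `run_example` are hypothetical.

```python
import random
from fractions import Fraction

# Table I, transcribed: database -> (subscript i of the bits it stores, queries).
# At a database with subscript i = 1, "a6+b3" denotes a_1^6 XOR b_1^3.
TABLE_I = {
    1: (1, ["a5", "b8", "c6", "a6+b3", "a7+c1", "b6+c5", "a2+b7+c3"]),
    2: (1, ["a1", "b3", "c1", "a3+b8", "a8+c6", "b7+c3", "a4+b6+c5"]),
    3: (2, ["a5", "b7", "c4", "a1+b6", "a6+c2", "b3+c6", "a3+b8+c7"]),
    4: (2, ["a2", "b6", "c2", "a7+b7", "a8+c4", "b8+c7", "a4+b3+c6"]),
}


def parse(query, i):
    """Turn 'a6+b3' into a frozenset of (message, subscript, superscript) terms."""
    return frozenset((term[0], i, int(term[1:])) for term in query.split("+"))


def run_example(seed=0):
    rng = random.Random(seed)
    # 16 random bits per message: subscript i in [2], superscript j in [8].
    bits = {(m, i, j): rng.randint(0, 1)
            for m in "abc" for i in (1, 2) for j in range(1, 9)}

    # Each database answers every query with the XOR of the requested bits.
    received = {}
    for _, (i, queries) in TABLE_I.items():
        for q in queries:
            terms = parse(q, i)
            value = 0
            for term in terms:
                value ^= bits[term]
            received[terms] = value

    # User side: strip the undesired part of each answer using side information
    # (a single bit or a b+c sum) that was received from another database.
    decoded = {}
    for terms, value in received.items():
        wanted = [x for x in terms if x[0] == "a"]
        if wanted:
            rest = terms - {wanted[0]}
            decoded[wanted[0]] = value ^ (received[rest] if rest else 0)

    truth = {k: v for k, v in bits.items() if k[0] == "a"}
    return decoded == truth, len(received)


if __name__ == "__main__":
    ok, downloads = run_example()
    print(ok, downloads, Fraction(16, downloads))
```

The sketch checks correctness of decoding and the download count only; it does not test the privacy property, which rests on the symmetry argument given in the text.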
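
B. The general connection between the FS-PIR and SC-PIR

Define a vector $\boldsymbol{\alpha} = [\alpha_1, \ldots, \alpha_F]$, where $F \in \mathbb{Z}^+$, $\sum_{i=1}^{F} \alpha_i = 1$, and $\alpha_f$, $f \in [F]$, is a rational number such that $\alpha_f L \in \mathbb{Z}^+$. For all $k \in [K]$, we divide message $W_k$ into $F$ disjoint sub-messages $W_k = \{W_{k,1}, \ldots, W_{k,F}\}$ such that for all $f \in [F]$, $|W_{k,f}| = \alpha_f L$ bits. For all $f \in [F]$, let

$$M_f \triangleq \bigcup_{k \in [K]} W_{k,f}, \tag{10}$$

and let $\mathcal{N}_f \subseteq [N]$ be a non-empty subset of databases which have the sub-messages in $M_f$ locally available to them. The storage contents of database $n \in [N]$ are

$$Z_n = \{M_f : f \in [F],\, n \in \mathcal{N}_f\}, \tag{11}$$

where we have the requirement that for any $n \in [N]$,

$$\sum_{\{f : f \in [F],\, n \in \mathcal{N}_f\}} \alpha_f \le \mu. \tag{12}$$

Given that a user requests file $W_\theta$ for some $\theta \in [K]$, we do the following. For all $f \in [F]$, using a FS-PIR scheme, the user generates a query to privately download $W_{\theta,f}$ from the databases in $\mathcal{N}_f$. In other words, a SC-PIR scheme can be found by applying a FS-PIR scheme to each set of databases $\mathcal{N}_f$. Changing the choice of the FS-PIR scheme or the definitions of $\mathcal{N}_f$ will result in new SC-PIR schemes. The rate of the SC-PIR scheme as a function of the rate of the implemented FS-PIR scheme is given in the following theorem.

Theorem 1: Given $N, K, F \in \mathbb{Z}^+$ and $\boldsymbol{\alpha}$, split each of the $L$-bit messages $W_1, \ldots, W_K$ into $F$ sub-messages of size $\alpha_1 L, \ldots, \alpha_F L$ and store them at sets of databases $\mathcal{N}_1, \ldots, \mathcal{N}_F \subseteq [N]$, respectively. Given a set of FS-PIR schemes with achievable rates $R_1, \ldots, R_F$, the achievable rate of privately downloading $W_\theta$, $\theta \in [K]$, from the storage constrained databases is

$$R = \left(\frac{\alpha_1}{R_1} + \frac{\alpha_2}{R_2} + \cdots + \frac{\alpha_F}{R_F}\right)^{-1}. \tag{13}$$

Proof: We first count the number of downloaded bits. For all $f \in [F]$, $R_f = \frac{\alpha_f L}{D_f}$, where $D_f$ is the number of downloaded bits necessary to privately download $W_{\theta,f}$ of size $\alpha_f L$ bits from the databases in $\mathcal{N}_f$. Therefore, the total number of bits required to privately download the entirety of $W_\theta$ is

$$D = D_1 + D_2 + \cdots + D_F = L\left(\frac{\alpha_1}{R_1} + \frac{\alpha_2}{R_2} + \cdots + \frac{\alpha_F}{R_F}\right).$$

Since $R = \frac{L}{D}$, we obtain (13).

C. General Achievable Storage Constrained PIR Scheme When $N/t \in \mathbb{Z}^+$

1) Storage Placement Scheme: Given $N \in \mathbb{Z}^+$ and $t \in [N]$ such that $N/t \in \mathbb{Z}^+$, let $F = N/t$ and for each $k \in [K]$, split message $W_k$ into $N/t$ disjoint, equal-size sub-messages, $W_{k,1}, \ldots, W_{k,N/t}$. Furthermore, split the databases into $N/t$ disjoint groups of size $t$ labeled as $\mathcal{N}_1, \ldots, \mathcal{N}_{N/t}$.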
For each $f \in [N/t]$, the sub-messages of

$$M_f = \bigcup_{k \in [K]} W_{k,f} \tag{14}$$

are stored at every database of $\mathcal{N}_f$.

2) PIR Scheme: A user desires to privately download message $W_\theta$ for some $\theta \in [K]$. For each $f \in [N/t]$, the user generates a query using the scheme of [1] to privately download $W_{\theta,f}$ from the $t$ databases in $\mathcal{N}_f$. The user combines the downloaded sub-messages, $W_{\theta,1}, \ldots, W_{\theta,N/t}$, to recover the desired message $W_\theta$.

To implement this SC-PIR scheme, each message is split into $N/t$ equal-size, disjoint sub-messages. Furthermore, the adaptation of the FS-PIR scheme of [1] requires that each sub-message is further split into $t^K$ equal-size, disjoint sub-messages. The resulting SC-PIR scheme requires a total of $L = \frac{N}{t} t^K = N t^{K-1}$ bits. An example of this SC-PIR scheme is described in Section III-A.

3) Achievable Rate: The achievable rate of this scheme is summarized as follows.

Theorem 2: Given $N$, $K$, and $\mu \in \left[\frac{1}{N}, 1\right]$, such that $t = \mu N \in [N]$, $N/t \in \mathbb{Z}^+$ and $L = N t^{K-1}$, for a user to privately download one of $K$ $L$-bit messages from $N$ databases with a storage capacity of $\mu K L$ bits, the achievable rate is

$$\left(1 + \frac{1}{t} + \frac{1}{t^2} + \cdots + \frac{1}{t^{K-1}}\right)^{-1}. \tag{15}$$

Moreover, it was shown in [2] that (15) is the capacity of SC-PIR for $t \in \mathbb{Z}^+$. While we do not directly prove Theorem 2 here, in Section V we present a set of sufficient conditions, which this scheme satisfies, for an SC-PIR scheme to meet the capacity.

IV. THE PROPOSED SC-PIR SCHEME WHEN $N/t \notin \mathbb{Z}^+$

In Section III, we established a general connection between the SC-PIR and FS-PIR problems. We showed that by properly splitting messages and allocating sub-messages to different groups of databases, a SC-PIR scheme can be derived by applying a separately designed FS-PIR scheme to each group of databases. In particular, when choosing the FS-PIR scheme to be the one in [1], we obtain a SC-PIR scheme that achieves capacity while requiring $N/t \in \mathbb{Z}^+$. In order to remove this restriction, in this section, we propose a new storage placement and use it in conjunction with the achievable FS-PIR scheme of [8] to obtain a new SC-PIR scheme. This scheme achieves capacity while requiring only $L = N t^{K-1}$, which is the same as the scheme of Section III-C when $N/t \in \mathbb{Z}^+$.

A. A Storage Constrained PIR Example when $N/t \notin \mathbb{Z}^+$

In this example, $N = 5$ databases, labeled DB1 through DB5, collectively store $K = 2$ messages, $A$ and $B$, and each has a size of $L = 15$ bits. Each database stores a $\mu = \frac{3}{5}$ fraction of the 2-message library ($t = \mu N = 3$).
1) Storage Placement Scheme: Each message is split as follows.

$$A = \{a_i^j : i \in [5], j \in [3]\}, \quad B = \{b_i^j : i \in [5], j \in [3]\}. \tag{16}$$

By this labeling, we have essentially split the messages in two phases. The first splitting phase, denoted by the subscript, determines which databases store these bits. The second splitting, denoted by the superscript, is necessary to perform the FS-PIR scheme. For all $f \in [5]$, define

$$M_f = \bigcup_{j \in [3]} \left\{a_f^j, b_f^j\right\} \tag{17}$$

and let the set of databases $\mathcal{N}_f = [-2 : 0] \oplus f$ locally store the bits of $M_f$ (the $\oplus$ notation is defined in the footnote). Note that as opposed to the SC-PIR scheme described in Section III-A, where the sets of databases $\{\mathcal{N}_f, f = 1, \ldots, F\}$ are mutually exclusive, here we allow them to overlap, hence removing the integer constraint $N/t \in \mathbb{Z}^+$. As a result, the bits of message $A$ stored at DB $n \in [5]$ are

$$Z_n = \left\{a_i^j : i \in [0 : 2] \oplus n,\ j \in [3]\right\}. \tag{18}$$

Message $B$ is stored among the databases in a similar manner. For instance, DB2 stores all bits $a_i^j$ and $b_i^j$ such that $i \in [2 : 4]$ and DB5 stores all bits $a_i^j$ and $b_i^j$ such that $i \in \{5, 1, 2\}$.

TABLE II
STORAGE CONSTRAINED PIR, $N = 5$, $K = 2$, $\mu = \frac{3}{5}$

| DB1 | DB2 | DB3 | DB4 | DB5 |
|---|---|---|---|---|
| (1,2,3) | (2,3,4) | (3,4,5) | (4,5,1) | (5,1,2) |
| $a_1^3$, $b_1^2$ | $a_2^3$, $b_2^2$ | $a_3^1$, $b_3^3$ | $a_4^2$, $b_4^3$ | $a_5^2$, $b_5^1$ |
| $a_2^1 + b_2^2$ | $a_3^3 + b_3^3$ | $a_4^3 + b_4^3$ | $a_5^1 + b_5^1$ | $a_1^2 + b_1^2$ |
| $a_3^2 + b_3^3$ | $a_4^1 + b_4^3$ | $a_5^3 + b_5^1$ | $a_1^1 + b_1^2$ | $a_2^2 + b_2^2$ |

2) PIR Scheme: The queries of a user that desires to privately download message $A$ are shown in Table II. The top row of the table contains database labels and the 3-tuple below each database label defines the subscripts of the bits that are locally available to that database. The remaining three rows of the table show the queries of the user. The user adopts the FS-PIR scheme of [8] to design queries. For instance, to obtain bits $\{a_1^j, j \in [3]\}$, the user applies the FS-PIR scheme to DB1, DB4, and DB5. In the first round, the user obtains $a_1^3$ from DB1. In the second round, the user can decode $a_1^1$ from DB4's transmission of $a_1^1 + b_1^2$ because the user had already received $b_1^2$ from the first-round transmission of DB1. Similarly, the user decodes $a_1^2$ from DB5's transmission of $a_1^2 + b_1^2$. These transmissions are the entries with subscript 1 in Table II.
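The cyclic placement and the queries of Table II can be checked with a short script. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes $a \oplus b = ((a + b - 1) \bmod N) + 1$ as in the paper's footnote, and it uses a transcription of Table II in which the dropped indices were restored from context. The names `oplus`, `databases_storing`, `stored_indices`, `TABLE_II`, and `check_table_ii` are hypothetical.

```python
import random

N, t = 5, 3  # parameters of the example in Section IV-A


def oplus(a, b, n=N):
    """Cyclic index shift: a (+) b = ((a + b - 1) mod n) + 1, with values in 1..n."""
    return (a + b - 1) % n + 1


def databases_storing(f):
    """N_f = [-(t-1) : 0] (+) f, the databases that store sub-message f."""
    return {oplus(a, f) for a in range(-(t - 1), 1)}


def stored_indices(n):
    """[0 : t-1] (+) n, the sub-message subscripts stored at database n."""
    return {oplus(a, n) for a in range(0, t)}


# Table II, transcribed: "a2^1+b2^2" denotes a_2^1 XOR b_2^2.
TABLE_II = {
    1: ["a1^3", "b1^2", "a2^1+b2^2", "a3^2+b3^3"],
    2: ["a2^3", "b2^2", "a3^3+b3^3", "a4^1+b4^3"],
    3: ["a3^1", "b3^3", "a4^3+b4^3", "a5^3+b5^1"],
    4: ["a4^2", "b4^3", "a5^1+b5^1", "a1^1+b1^2"],
    5: ["a5^2", "b5^1", "a1^2+b1^2", "a2^2+b2^2"],
}


def parse(query):
    terms = set()
    for term in query.split("+"):
        sub, sup = term[1:].split("^")
        terms.add((term[0], int(sub), int(sup)))
    return frozenset(terms)


def check_table_ii(seed=0):
    rng = random.Random(seed)
    # 15 random bits per message: subscript i in [5], superscript j in [3].
    bits = {(m, i, j): rng.randint(0, 1)
            for m in "ab" for i in range(1, N + 1) for j in range(1, 4)}
    received, feasible = {}, True
    for db, queries in TABLE_II.items():
        for q in queries:
            terms = parse(q)
            # A database may only be asked for bits it actually stores.
            feasible &= all(i in stored_indices(db) for _, i, _ in terms)
            value = 0
            for term in terms:
                value ^= bits[term]
            received[terms] = value
    # User side: remove the known b bit from each coded answer.
    decoded = {}
    for terms, value in received.items():
        wanted = [x for x in terms if x[0] == "a"]
        if wanted:
            rest = terms - {wanted[0]}
            decoded[wanted[0]] = value ^ (received[rest] if rest else 0)
    truth = {k: v for k, v in bits.items() if k[0] == "a"}
    return feasible, decoded == truth, len(received)


if __name__ == "__main__":
    print(databases_storing(1), stored_indices(5), check_table_ii())
```

With these definitions, sub-message 1 is stored at databases 1, 4 and 5, and DB5 stores subscripts 5, 1 and 2, in agreement with the text. As before, the sketch verifies storage feasibility and decodability only, not privacy.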
To ensure privacy, the queries are symmetric and no bit is requested more than once from any one database. In this example, $D = 20$ bits are downloaded and the rate is $\frac{3}{4}$. Compared to the state-of-the-art SC-PIR scheme of [2], the rate is the same, but $L$ has been reduced from $\binom{N}{t} t^K = \binom{5}{3} \cdot 3^2 = 90$ to $N t^{K-1} = 5 \cdot 3^{2-1} = 15$.

Footnote: We impose the following notation: $a \oplus b = \left((a + b - 1) \bmod N\right) + 1$ and $[a_1 : a_2] \oplus b = \{a \oplus b : a \in [a_1 : a_2]\}$.

B. General Achievable SC-PIR Scheme When $N/t \notin \mathbb{Z}^+$

1) Storage Placement Scheme: For each $k \in [K]$, message $W_k$ is split into $N$ disjoint equal-size sub-messages $W_{k,1}, \ldots, W_{k,N}$. For all $f \in [N]$, define a set of sub-messages $M_f = \bigcup_{k \in [K]} W_{k,f}$ which is locally stored at the set of databases $\mathcal{N}_f = [-(t-1) : 0] \oplus f$.

2) PIR Scheme: A user desires to privately download message $W_\theta$ for some $\theta \in [K]$. For each $f \in [N]$, the user generates a query using the scheme of [8] to privately download $W_{\theta,f}$ from the $t$ databases in $\mathcal{N}_f$. The user combines the downloaded sub-messages, $W_{\theta,1}, \ldots, W_{\theta,N}$, to recover the desired message $W_\theta$. Furthermore, if desired, to obtain symmetry across the databases, i.e., each database sends the same amount of coded bit combinations from each file, the user can choose database $f$ to start the query process when privately downloading $W_{\theta,f}$. For more details on the query generation process, see [8].

3) Achievable Rate: The achievable rate of this SC-PIR scheme is summarized in the following theorem.

Theorem 3: Given $N$, $K$, and $\mu \in \left[\frac{1}{N}, 1\right]$, such that $t = \mu N \in [N]$ and $L = N t^{K-1}$, for a user to privately download one of $K$ $L$-bit messages from $N$ databases, each with a storage capacity of $\mu K L$ bits, the rate is

$$\left(1 + \frac{1}{t} + \frac{1}{t^2} + \cdots + \frac{1}{t^{K-1}}\right)^{-1}. \tag{19}$$

The results of Section V demonstrate that this SC-PIR scheme satisfies the sufficient conditions to meet the capacity. This proves Theorem 3.

V. SUFFICIENT CONDITIONS TO ACHIEVE CAPACITY FOR SC-PIR

In this section, we provide two sufficient conditions for a storage placement scheme to achieve the SC-PIR capacity.

Theorem 4: Given $N, K, F \in \mathbb{Z}^+$ and $\boldsymbol{\alpha}$, split each of the $L$-bit messages $W_1, \ldots, W_K$ into $F$ sub-messages of size $\alpha_1 L, \ldots, \alpha_F L$ and store them at sets of databases $\mathcal{N}_1, \ldots, \mathcal{N}_F \subseteq [N]$ according to equations (10)-(12).
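Each database has a storage capacity of $\mu K L$ bits, $\frac{1}{N} \le \mu \le 1$, where $t = \mu N \in [1, N]$. Assume that a user requests file $W_\theta$ for some $\theta \in [K]$. A SC-PIR scheme is obtained if for all $f \in [F]$, the user generates a query to privately download $W_{\theta,f}$ from the databases in $\mathcal{N}_f$ using a capacity-achieving FS-PIR scheme. The resulting SC-PIR scheme is capacity-achieving if the sub-message storage placement satisfies one of the following two conditions:
1) If $t \in \mathbb{Z}^+$, $|\mathcal{N}_f| = t$ for all $f \in [F]$.
2) If $t \notin \mathbb{Z}^+$, $|\mathcal{N}_f| \in \{\lfloor t \rfloor, \lceil t \rceil\}$ for all $f \in [F]$ such that

$$\sum_{f : |\mathcal{N}_f| = \lfloor t \rfloor} \alpha_f = \lceil t \rceil - t \tag{20}$$

and

$$\sum_{f : |\mathcal{N}_f| = \lceil t \rceil} \alpha_f = t - \lfloor t \rfloor. \tag{21}$$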
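Theorem 1 and the conditions of Theorem 4 can be exercised numerically. The sketch below is illustrative only. It assumes that each group $\mathcal{N}_f$ runs a capacity-achieving FS-PIR scheme with rate $\left(1 + 1/x + \cdots + 1/x^{K-1}\right)^{-1}$ for $x = |\mathcal{N}_f|$, and that conditions (20)-(21) assign total weights $\lceil t \rceil - t$ and $t - \lfloor t \rfloor$ to groups of size $\lfloor t \rfloor$ and $\lceil t \rceil$. The function names and the non-integer example with $t = 5/2$ are hypothetical and do not appear in the paper.

```python
from fractions import Fraction
from math import ceil, floor


def r_fs(x: int, K: int) -> Fraction:
    """Rate of a capacity-achieving FS-PIR scheme on x databases."""
    return 1 / sum(Fraction(1, x**k) for k in range(K))


def sc_pir_rate(alphas, group_sizes, K):
    """Theorem 1: R = (sum_f alpha_f / R_f)^(-1), with R_f = r_fs(|N_f|, K)."""
    assert sum(alphas) == 1, "sub-message fractions must sum to 1"
    return 1 / sum(a / r_fs(n, K) for a, n in zip(alphas, group_sizes))


def interpolated_rate(t: Fraction, K: int) -> Fraction:
    """Rate whose reciprocal linearly interpolates 1/r_fs at floor(t) and ceil(t)."""
    lo, hi = floor(t), ceil(t)
    if lo == hi:
        return r_fs(lo, K)
    return 1 / ((hi - t) / r_fs(lo, K) + (t - lo) / r_fs(hi, K))


if __name__ == "__main__":
    half, fifth = Fraction(1, 2), Fraction(1, 5)
    # Section III-A: two disjoint groups of size 2, K = 3.
    print(sc_pir_rate([half, half], [2, 2], 3))
    # Section IV-A: five overlapping groups of size 3, K = 2.
    print(sc_pir_rate([fifth] * 5, [3] * 5, 2))
    # Hypothetical non-integer case t = 5/2, K = 2: half of each message on
    # groups of size 2 and half on groups of size 3, matching (20)-(21).
    print(sc_pir_rate([half, half], [2, 3], 2), interpolated_rate(Fraction(5, 2), 2))
```

In the hypothetical $t = 5/2$ case the two computations agree, which is the content of condition 2): the placement weights are exactly the interpolation weights between $\lfloor t \rfloor$ and $\lceil t \rceil$.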
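Proof: Define $R_{FS}(x)$ as the rate of a capacity-achieving FS-PIR scheme to privately download one of $K$ messages from $x$ nodes. Furthermore,

$$R_{FS}(x) = \left(1 + \frac{1}{x} + \cdots + \frac{1}{x^{K-1}}\right)^{-1} \tag{22}$$

as was shown in [1]. For $t \in \mathbb{Z}^+$, it follows from Theorem 1 that the rate of the SC-PIR scheme is

$$\left(\frac{\alpha_1}{R_{FS}(t)} + \cdots + \frac{\alpha_F}{R_{FS}(t)}\right)^{-1} = R_{FS}(t) \tag{23}$$

$$= \left(1 + \frac{1}{t} + \cdots + \frac{1}{t^{K-1}}\right)^{-1} \tag{24}$$

which is the capacity of SC-PIR [2]. For $t \notin \mathbb{Z}^+$, it follows from Theorem 1 that

$$R = \left(\sum_{f : |\mathcal{N}_f| = \lfloor t \rfloor} \frac{\alpha_f}{R_{FS}(\lfloor t \rfloor)} + \sum_{f : |\mathcal{N}_f| = \lceil t \rceil} \frac{\alpha_f}{R_{FS}(\lceil t \rceil)}\right)^{-1} \tag{25}$$

$$= \left(\frac{\lceil t \rceil - t}{R_{FS}(\lfloor t \rfloor)} + \frac{t - \lfloor t \rfloor}{R_{FS}(\lceil t \rceil)}\right)^{-1} \tag{26}$$

and thus

$$\frac{1}{R} = (\lceil t \rceil - t)\frac{1}{R_{FS}(\lfloor t \rfloor)} + (t - \lfloor t \rfloor)\frac{1}{R_{FS}(\lceil t \rceil)}. \tag{27}$$

Note that the point $\left(t, \frac{1}{R}\right)$ is simply a linear interpolation of the two points $\left(\lfloor t \rfloor, \frac{1}{R_{FS}(\lfloor t \rfloor)}\right)$ and $\left(\lceil t \rceil, \frac{1}{R_{FS}(\lceil t \rceil)}\right)$, where the capacity of SC-PIR for $t = x \in \mathbb{Z}^+$ is precisely $R_{FS}(x)$. Moreover, it was shown in [2] that the set of achievable points $\left(t, \frac{1}{R}\right)$ is the lower convex hull of the set of points $\left\{\left(t, \frac{1}{R_{FS}(t)}\right) : t \in [N]\right\}$. Therefore, (26) meets the SC-PIR capacity.

VI. DISCUSSION AND FUTURE WORK

Recent works on SC-PIR suggest that coded caching meets PIR [2], [9]; that is, the file placement solutions of coded caching [5] are useful for the SC-PIR sub-message placement problem. In this work, we show that coded caching placement techniques are not necessary for SC-PIR by proposing two novel sub-message placement schemes which achieve the capacity. In the coded caching problem, assigning different files to an exponentially large number of overlapping user groups is necessary to create multicasting opportunities such that a user can cancel interference from a received coded transmission which also serves other users. The SC-PIR problem is less complex in that only one user is being served. In fact, as was demonstrated with our first proposed scheme, it is not necessary for the sub-message placement groups to overlap at all. Moreover, the file (or sub-message) placement paradigms of coded caching and SC-PIR are inherently different. In coded caching, files are being placed among users that wish to download content, while in SC-PIR, sub-messages are being placed among databases which are serving one user's request. Therefore, it is not surprising that the two problems could have different solutions for the storage/file placement problem.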
The results of Section V show that there exist simple SC-PIR solutions for non-integer $t$. For example, the databases could be split into two disjoint groups, one in which sub-messages are assigned to sub-groups of $\lfloor t \rfloor$ databases, and another where sub-messages are assigned to sub-groups of $\lceil t \rceil$ databases. This is contrary to the solution for non-integer $t$ of the coded caching problem, where the storage of every user is split into two parts to essentially create two coded caching networks that both span across all users [5]. While this coded caching method was proposed to solve the non-integer $t$ SC-PIR problem in [2], we have shown that this is not necessary.

This work presents several interesting directions for future work. First, it remains an open problem to determine the minimum message size $L$ for a given set of SC-PIR parameters. Using a definition of the retrieval rate that is slightly different from that of [8], it was shown in [10] that the minimum $L$ of an FS-PIR problem can be reduced significantly from $N^{K-1}$ in [8] to $N - 1$. The new FS-PIR scheme [10] can be readily adapted to our proposed SC-PIR to reduce the message size. Furthermore, the proof techniques therein may be useful to derive the minimum $L$ for a SC-PIR problem. Second, another work [6] has considered random placement among databases where a database stores a bit of a given message with probability $\mu$. Interestingly, this placement method was also used in [11] for the coded caching problem. It will be meaningful to examine alternative random placement strategies for the SC-PIR problem where messages are split into a finite number of sub-messages.

REFERENCES

[1] H. Sun and S. A. Jafar, "The capacity of private information retrieval," IEEE Transactions on Information Theory, vol. 63, no. 7, pp. 4075-4088, 2017.
[2] M. A. Attia, D. Kumar, and R. Tandon, "The capacity of private information retrieval from uncoded storage constrained databases," arXiv preprint arXiv:1805.04104, 2018.
[3] B. Chor, O. Goldreich, E. Kushilevitz, and M. Sudan, "Private information retrieval," in Foundations of Computer Science, 1995. Proceedings., 36th Annual Symposium on. IEEE, 1995, pp. 41-50.
[4] B. Chor, E. Kushilevitz, O. Goldreich, and M. Sudan, "Private information retrieval," J. ACM, vol. 45, no. 6, pp. 965-981, 1998.
[5] M. A. Maddah-Ali and U. Niesen, "Fundamental limits of caching," IEEE Transactions on Information Theory, vol. 60, no. 5, pp. 2856-2867, 2014.
[6] Y.-P. Wei, B. Arasli, K. Banawan, and S. Ulukus, "The capacity of private information retrieval from decentralized uncoded caching databases," arXiv preprint arXiv:1811.11160, 2018.
[7] C. Tian, H. Sun, and J. Chen, "A Shannon-theoretic approach to the storage-retrieval tradeoff in PIR systems," in 2018 IEEE International Symposium on Information Theory (ISIT), June 2018, pp. 1904-1908.
[8] H. Sun and S. A. Jafar, "Optimal download cost of private information retrieval for arbitrary message length," IEEE Transactions on Information Forensics and Security, vol. 12, no. 12, pp. 2920-2932, 2017.
[9] R. Tandon, M. Abdul-Wahid, F. Almoualem, and D. Kumar, "PIR from storage constrained databases-coded caching meets PIR," in 2018 IEEE International Conference on Communications (ICC). IEEE, 2018, pp. 1-7.
[10] C. Tian, H. Sun, and J. Chen, "Capacity-achieving private information retrieval codes with optimal message size and upload cost," arXiv preprint arXiv:1808.07536, 2018.
[11] M. A. Maddah-Ali and U. Niesen, "Decentralized coded caching attains order-optimal memory-rate tradeoff," IEEE/ACM Transactions on Networking, vol. 23, no. 4, pp. 1029-1040, Aug 2015.