Location Privacy Protection on Social Networks
Justin Zhan and Xing Fang
Department of Computer Science, North Carolina A&T State University
{Zzhan,xfang}@ncat.edu

Abstract. Location information is considered private in many scenarios. Protecting location information on mobile ad-hoc networks has attracted much research in past years. However, location information protection on social networks has received little attention. In this paper, we present a novel location privacy protection approach based on user messages in social networks. Our approach grants flexibility to users by offering them multiple protection options. To the best of our knowledge, this is the first attempt to protect social network users’ location information via text messages. We propose five algorithms for location privacy protection on social networks.
Keywords: Location information; privacy; social networks.

1 Introduction
Online social networks have gained remarkable popularity since their debut. One of the most popular, Facebook, has reached more than 500 million active users [4]. Social network users can post words, pictures, videos, and so on. They can also write comments, share personal preferences, and make friends. Facebook even integrates a messenger into its webpages, which lets users chat instantly with friends who are currently online. Online social networks typically deal with large amounts of user data, which may lead to the revelation of privacy-related user data [3, 14, 16]. Personal information, such as personal interests, contact information, photos, activities, associations, and interactions, once revealed, may have different levels of impact, ranging from unexpected embarrassment or reputational damage [11] to identity theft [15].

Furthermore, effectively managing privacy on social networks can be quite tricky. One reason is that different individuals have different privacy expectations for their information. For instance, some may be glad to publish their personal profiles in the hope of developing additional friendships, while others may worry about exposing their identities and reveal their profiles only to selected friends. In other situations, people are even willing to disclose their personal information to anonymous strangers rather than to acquaintances [6]. Unfortunately, people often pay little attention to their privacy. By analyzing the online behavior of 4000 students at Carnegie Mellon University and the amount of private information they disclosed, Gross and Acquisti [6] found that only a few students change the default privacy preferences on Facebook.

Location information may be sensitive because it makes it possible to trace someone. Motivated by this, we propose location privacy protection algorithms that use encryption, k-anonymity, and noise injection techniques. The rest of this paper is organized as follows: In section 2, we briefly review some related work on location privacy protection. In section 3, we introduce our location privacy protection approach. We conduct experimental evaluation and present the result in section 4. We conclude the paper in section 6.

J. Salerno et al. (Eds.): SBP 2011, LNCS 6589, pp. 78–85, 2011. © Springer-Verlag Berlin Heidelberg 2011

2 Related Work
2.1 Location Privacy Protection on Mobile Networks
Location privacy protection has gained much attention on mobile networks. With the rapid development of computing technology, the computational capability of mobile phone hardware makes it possible to run powerful software applications. Location-based service (LBS) is one such application, in which the service provider requires users to provide their location information in order to respond to their requests [2]. Schilit et al. [12] pointed out that location-based services are prone to privacy revelation, and that these potential privacy risks can even cause economic and reputational damage. Amoli et al. [2] classified privacy protections for LBS into two categories: policy-based approaches and data hiding techniques. In the policy-based approaches, a set of pre-defined policies informs users when and how their data can be stored, used, and revealed. K-anonymity, the dummy-based approach, and obfuscation are three data hiding techniques. K-anonymity works when a user’s location can be hidden within a set of k members who possess the same location information. In the dummy-based approach, a user sends dummy location information to the service provider instead of the real location information, while obfuscation allows users to generalize their actual locations before sending them out. Since the aforementioned approaches have shortcomings, Amoli et al. proposed 2Ploc, a privacy-preserving protocol for LBS based on one-time tickets [2].

Location protections also apply to mobile ad hoc routing protocols. Kamat et al. [8] introduced the idea of using fake routing sources, which lead adversaries to fake locations. Kong and Hong [9] proposed ANODR, an anonymous routing protocol, which suggests using route pseudonyms instead of node IDs during the routing process.
ARM [13] is another anonymous routing protocol, in which two nodes share a secret key and a pseudonym. Based on ANODR and ARM, Taheri et al. [18] further proposed RDIS, an anonymous routing protocol with destination location privacy protection.

2.2 Location Privacy Protection on Social Networks
Location information on social networks is commonly treated as private data and is only available to a certain group of people. For instance, users of online social networks can configure their privacy settings so as to reveal their locations only to friends. Unfortunately, the majority of social network users change their default privacy settings only after something bad has happened, and the existing privacy settings are perplexing. Lipford et al. [10] presented a new privacy setting interface based on Facebook that significantly improves users' understanding of the settings as well as their performance. The interface provides a set of HTML tabs, each showing how a different viewer sees the Facebook user's account information, in order to let users properly manage their privacy settings.

Ho et al. [7] analyzed the most popular social networks and discovered three privacy problems. First, users are not notified by social networks when their personal information is at risk. Second, existing privacy protection tools in social networks are not flexible enough. Third, users cannot prevent other users from uploading information that may reveal their private information. To solve these problems, Ho et al. [7] designed a privacy management framework that provides data levels, privacy levels, and tracking levels. By designating data levels, users can sort their data into different privacy-related levels. Different amounts of user data are available to different viewers, such as visitors, casual friends, normal friends, and best friends, according to their privacy levels. Finally, tracking levels let users decide how much of their account information can be tracked.

Content on social networks is usually tagged by its poster. This incurs privacy threats when location-related content is tagged via spatial-temporal tags. Freni et al. [5] summarized such threats as threats to location privacy and to absence privacy. The former makes it possible to infer other users’ presence at specific locations during a given period of time. The latter makes it possible to infer other users’ absence from specific locations during a given period of time. Freni et al. then introduced generalization methods to prevent these privacy threats, primarily space-based generalization and time-based generalization.
Although all the aforementioned location privacy protection schemes have their advantages, they are not designed to prevent the revelation of location information in user text messages. In this paper, we propose a location protection approach based on user text messages.

3 Our Approach
3.1 Preliminaries
Social network users frequently share text information with others. Abrol and Khan [1] presented a location information extraction approach based on users’ messages. In their approach, the location information of a message is tagged and ranked based on its matches with their geo-location database entries. The candidate with the highest ranking score is extracted. In our approach, we maintain a similar geo-location database for the purpose of verifying whether a given user message contains location information. We narrow the geo-location information in our database down to city level. Hence, each data entry includes city, zip code, state, and country.

The three location information protection techniques provided in our approach are encryption, k-anonymity, and noise injection. Users can choose among them based on their concerns. Encryption is recommended when a user wants to share her location information with designated individuals who are her friends. In this case, the entire user message is encrypted with the receiver’s public key. Both k-anonymity and noise injection are suggested when a user wants to publicly post a message that incorporates her location information.
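The database lookup described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Place` type, the three sample entries, and the functions `tag_locations` and `matching_entries` are hypothetical names chosen for illustration, and the tagger only handles single-word city names with exact, case-sensitive matching.

```python
from typing import NamedTuple


class Place(NamedTuple):
    city: str
    zip_code: str
    state: str
    country: str


# Toy stand-in for the geo-location database (hypothetical sample entries).
DB = [
    Place("Madison", "57042", "SD", "US"),
    Place("Madison", "53703", "WI", "US"),
    Place("Greensboro", "27411", "NC", "US"),
]


def tag_locations(message, db=DB):
    """Collect the words of the message that equal a city, zip code, or state in db."""
    words = {w.strip(".,!?;:") for w in message.split()}
    tags = {"city": set(), "zip_code": set(), "state": set()}
    for place in db:
        for field in tags:
            value = getattr(place, field)
            if value in words:
                tags[field].add(value)
    return tags


def matching_entries(tags, db=DB):
    """Return the entries consistent with every non-empty tag field."""
    if not any(tags.values()):
        return []  # nothing was tagged, so nothing matches
    return [p for p in db
            if all(not vals or getattr(p, f) in vals for f, vals in tags.items())]
```

For example, `tag_locations("Flying home to Madison, SD 57042 tonight.")` tags one city, one state, and one zip code, and `matching_entries` on that result returns the single South Dakota entry. A realistic tagger would also need to handle multi-word city names and ordinary words that collide with state abbreviations.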


In the rest of this subsection, we formally introduce some notation. Our geo-location database, DB, is a universe of location information of the United States, specified to city level. Each data entry takes the form {city, zip code, state, country}. u stands for a user whose location information is tagged as T(u). A user message is defined as m, and its tagged location information as T(m). By |T(m)| ≠ 0, we claim that m contains user location information. By |T(m) ∩ DB| ≠ 0, we claim that the tagged location information matches the database entries. Every user maintains a list of her friends’ public keys, PK.

3.2 Encryption Approach
The encryption technique applies when a user wants to share her location information with her friends. Algorithm 1 is the pseudo code of the encryption algorithm.

Algorithm 1. The Encryption Approach
Input: user message m
Output: the ciphertext of m, or m unchanged
1: for every word w in m do
2:   tag the location information in w
3: end for
4: if m contains no location information then
5:   return m
6: else if (Encrypt == 1 and the tagged location information matches the DB) then
7:   return m encrypted with the receiver's public key
8: else
9:   return m
10: end if
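The decision logic of the encryption approach can be sketched as follows. This is only an illustration under stated assumptions: `KNOWN_CITIES` stands in for the database check, `protect_by_encryption` is a hypothetical name, and the cipher is passed in as a callable because the paper does not name a particular public-key scheme.

```python
from typing import Callable, Union

# Hypothetical stand-in for "the tagged location matches the database".
KNOWN_CITIES = {"Madison", "Greensboro"}


def contains_known_location(message: str) -> bool:
    """True when at least one word of the message is a known city name."""
    words = {w.strip(".,!?;:") for w in message.split()}
    return bool(words & KNOWN_CITIES)


def protect_by_encryption(message: str,
                          encrypt_enabled: bool,
                          encrypt: Callable[[bytes], bytes]) -> Union[str, bytes]:
    """Encrypt the whole message only if it names a known location and the
    user has enabled encryption; otherwise return it unchanged."""
    if not contains_known_location(message):
        return message
    if encrypt_enabled:
        # `encrypt` is assumed to wrap encryption under the receiver's public key.
        return encrypt(message.encode("utf-8"))
    return message
```

In a real deployment, `encrypt` would presumably wrap a vetted public-key library keyed with the chosen friend's public key. Keeping it as a parameter keeps the sketch independent of any specific cryptosystem.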

Algorithm 1 takes the user message m as its input and outputs the encrypted ciphertext, which can be decrypted with the receiver’s private key. The location information of m is tagged by the tagging function. Encrypt is a Boolean variable. When its value equals one, the user has enabled the encryption technique; otherwise, the encryption technique is disabled. This leaves the user flexibility in choosing location protection techniques.

3.3 K-anonymity Approach
The k-anonymity technique is recommended when a user wants to publicly share a message that incorporates her location information.

Algorithm 2. The k-anonymity Approach
Input: user message m
Output: m, with its location information k-anonymized when ka is enabled
1: for every word w in m do
2:   tag the location information in w
3: end for
4: if m contains no location information that matches the DB then
5:   return m
6: else if (ka == 1) then
7:   k-anonymize the location tags of m (Algorithm 3)
8: end if
9: return m

Algorithm 3. The k-anonymize algorithm
Input: the location tags of m, DB
Output: the k-anonymized location tags
1: for every location tag do
2:   if (the tag matches at least 2 entries of the DB) then
3:     return …
4:   else
5:     return …
6:   end if
7: end for
8: …
9: return …
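Suppression-based k-anonymization of a location tag can be sketched in a few lines. This is a hedged illustration, not the paper's Algorithm 3: the toy database, the function names, and the suppression order (zip code, then state, then city) are all assumptions.

```python
# Hypothetical toy database; entries are (city, state, zip_code).
DB = [
    ("Madison", "SD", "57042"),
    ("Brookings", "SD", "57006"),
    ("Madison", "WI", "53703"),
]
FIELDS = ("city", "state", "zip_code")


def is_suppressed(value):
    """A field counts as suppressed when it consists only of asterisks."""
    return set(value) == {"*"}


def matches(tag, db):
    """Entries that agree with every unsuppressed field of the tag."""
    return [entry for entry in db
            if all(is_suppressed(t) or t == v for t, v in zip(tag, entry))]


def k_anonymize(tag, db=DB, k=2, order=("zip_code", "state", "city")):
    """Suppress fields in the given order until the tag matches at least k entries."""
    tag = list(tag)
    for field in order:
        if len(matches(tag, db)) >= k:
            break
        i = FIELDS.index(field)
        tag[i] = "*" * len(tag[i])
    return tuple(tag)


print(", ".join(k_anonymize(("Madison", "SD", "57042"))))  # prints: Madison, **, *****
```

With this toy database, suppressing the zip code alone still leaves a unique match, so the state is suppressed as well. If the database held no second Madison, the loop would go on to suppress the city too.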

Each data entry in the DB takes the form {city, zip code, state, country}. Since we only consider geo-location information within the United States, a data entry can be reduced to {city, zip code, state} without losing any information. A complete tagging of a data entry therefore takes the form {city, zip code, state}. We treat a zip code as a unique identifier for a city. Therefore, the code in line 4 of the k-anonymity function can remove the zip code tag from the tagged location information of a given message. The collection of tags then becomes {city, state}. The tag {city, state} is a tagging of the quasi-identifier of our DB, and it can be at least 2-anonymized via suppression after the 4th step of the k-anonymity function. According to Sweeney [17], k-anonymity of a given message’s location information is achieved as long as the quasi-identifier of the location information is k-anonymized.

As an example, suppose the information “Madison, SD, 57042” is tagged from a message. The k-anonymity algorithm first suppresses the zip code tag, and the message becomes “Madison, SD, *****”. The quasi-identifier tag still uniquely matches our database entry, because there is only one city named Madison in South Dakota. Therefore, two kinds of suppression can be applied to anonymize the quasi-identifier tag. One is to completely anonymize the tag, which leads to “*******, **, *****”. The alternative is to anonymize only the state tag in order to achieve city level anonymity.

3.4 Noise Injection Approach
The noise injection technique aims to protect user location information by injecting location noise into the tagged location information. City level and state level anonymity can be achieved via noise injection. In the noise injection algorithm, we still leave the user to decide which level of anonymity is most appropriate for her.


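Algorithm 4. The Noise Injection Approach
Input: user message m
Output: m, with location noise injected when ni is enabled
1: for every word w in m do
2:   tag the location information in w
3: end for
4: if m contains no location information that matches the DB then
5:   return m
6: else if (ni == 1) then
7:   inject noise into the location tags of m (Algorithm 5)
8: end if
9: return m

Algorithm 5. The Noise-Inject algorithm
Input: the location tags of m, DB
Output: the noise-injected location tags
1: if city level anonymity is selected then
2:   for every city and zip code tag do
3:     inject a noise city and zip code from the same state
4:   end for
5: else if state level anonymity is selected then
6:   for every city and zip code tag do
7:     suppress the tag
8:   end for
9:   for every state tag do
10:    inject at least one noise state
11:  end for
12: else return the location tags unchanged
13: end if
14: …
15: return the noise-injected location tags

Recall that k-anonymity is achieved when the tagged location information matches at least two entries of the DB. Consider a tag {city, zip code, state} and noise {city', zip code', state}. The tag with injected noise is denoted as {city || city', zip code || zip code', state}. Therefore, a noise-injected tag matches at least two entries of the DB. Since the noise city and zip code both belong to the same state as the original city and zip code, city level k-anonymity is accomplished, and the original city can be inferred with probability 1/k. For the state level, the city and zip code tags are first suppressed, which leaves only the state tag unsuppressed. Hence, state level k-anonymity can be accomplished by injecting at least one noisy state. Similarly, under state level k-anonymity the original state can be inferred with probability 1/k.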
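The two noise injection levels can be sketched as below. This is an illustrative sketch under assumptions rather than the paper's Algorithm 5: the toy database, the function names, uniform sampling of the noise, and shuffling of the candidates are all choices made here for illustration.

```python
import random

# Hypothetical toy database; entries are (city, state, zip_code).
DB = [
    ("Madison", "SD", "57042"),
    ("Brookings", "SD", "57006"),
    ("Sioux Falls", "SD", "57104"),
    ("Fargo", "ND", "58102"),
]


def inject_city_noise(tag, k=2, db=DB, rng=random):
    """City level: return the real tag mixed with k-1 noise entries from the same state."""
    state = tag[1]
    pool = [entry for entry in db if entry[1] == state and entry != tag]
    if len(pool) < k - 1:
        raise ValueError("not enough same-state entries for the requested k")
    candidates = [tag] + rng.sample(pool, k - 1)
    rng.shuffle(candidates)  # position must not reveal the real entry
    return candidates


def inject_state_noise(tag, k=2, db=DB, rng=random):
    """State level: drop city and zip code, then mix the real state with k-1 noise states."""
    state = tag[1]
    others = sorted({entry[1] for entry in db} - {state})
    if len(others) < k - 1:
        raise ValueError("not enough other states for the requested k")
    states = [state] + rng.sample(others, k - 1)
    rng.shuffle(states)
    return states
```

If an observer has no side information and guesses uniformly among the k returned candidates, the original city (or state) is picked with probability 1/k. The `rng` parameter accepts a seeded `random.Random` instance, which makes the sketch reproducible in tests.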


4 Conclusion
Location privacy protection is an important research topic on mobile ad-hoc networks. However, location privacy protection on social networks has not been well studied. In this paper, we present our approach for protecting location information based on the text messages of social network users. Three techniques, namely encryption, k-anonymity, and noise injection, are provided, and users are given the right to select among them. Our experimental results show that our approach can significantly reduce location privacy revelation and can be efficiently deployed.

Although our location protection approach performs well, it still has some limitations. First, the geo-location database has shortcomings. Currently, we have only collected city level geo-locations in the United States. However, social network users are scattered around the world, which requires us to expand the database entries in order to protect users outside the U.S. We also need to expand our database to include geo-location information below city level. For instance, if a user message mentions locations such as districts, streets, or buildings, our current database cannot deal with the situation. Second, our approach grants users flexibility in selecting protection techniques, but it also imposes a burden on them, since they need to understand those techniques. As future work, we will integrate linguistic analysis of user messages so that the system can automatically select the best protection technique for each user.

References
1. Abrol, S., Khan, L.: TweetHood: Agglomerative Clustering on Fuzzy k-Closest Friends with Variable Depth for Location Mining. In: Proceedings of the IEEE International Conference on Privacy, Security, Risk and Trust, Minneapolis, MN, USA, pp. 153–160 (August 2010)
2. Amoli, A., Kharrazi, M., Jalili, R.: 2Ploc: Preserving Privacy in Location-Based Services. In: Proceedings of the IEEE International Conference on Privacy, Security, Risk and Trust, Minneapolis, MN, USA, pp. 707–712 (August 2010)
3. Blakely, R.: Does Facebook’s Privacy Policy Stack Up?, http://business.timesonline.co.uk/tol/business/industry_sectors/technology/article2430927.ece
4. Facebook Statistics, http://www.facebook.com/press/info.php?statistics
5. Freni, D., Vicente, C., Mascetti, S., Bettini, C., Jensen, C.: Preserving Location and Absence Privacy in Geo-Social Networks. In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management, Toronto, Ontario, Canada (October 2010)
6. Gross, R., Acquisti, A.: Information Revelation and Privacy in Online Social Networks. In: Proceedings of the 2005 ACM Workshop on Privacy in the Electronic Society, Alexandria, Virginia, USA, pp. 71–80 (November 2005)
7. Ho, A., Maiga, A., Aimeur, E.: Privacy Protection Issues in Social Networking Sites. In: Proceedings of IEEE/ACS International Conference on Computer Systems and Applications, Rabat, Morocco, pp. 271–278 (May 2009)
8. Kamat, P., Zhang, Y., Trappe, W., Ozturk, C.: Enhancing Source-Location Privacy in Sensor Network Routing. In: Proceedings of the 25th IEEE International Conference on Distributed Computing Systems, Columbus, Ohio, USA (June 2005)
9. Kong, J., Hong, X.: ANODR: Anonymous on Demand Routing with Untraceable Routes for Mobile ad-hoc Networks. In: Proceedings of the 4th ACM International Symposium on Mobile ad hoc Networking and Computing, New York, NY, USA, pp. 291–302 (June 2003)
10. Lipford, H., Besmer, A., Watson, J.: Understanding Privacy Settings in Facebook with an Audience View. In: Proceedings of the 1st Conference on Usability, Psychology, and Security, San Francisco, USA, pp. 1–8 (April 2008)
11. Rosenblum, D.: What Anyone Can Know: The Privacy Risks of Social Networking Sites. IEEE Security and Privacy 5(3), 40–49 (2007)
12. Schilit, B., Hong, J., Gruteser, M.: Wireless Location Privacy Protection. Computer 36(12), 135–137 (2003)
13. Seys, S., Preneel, B.: ARM: Anonymous Routing Protocol for Mobile ad hoc Networks. International Journal of Wireless and Mobile Computing 3(3) (October 2009)
14. Soghoian, C.: The Next Facebook Privacy Scandal, http://news.cnet.com/8301-13739_3-9854409-46.html
15. Strater, K., Lipford, H.: Strategies and Struggles with Privacy in an Online Social Networking Community. In: Proceedings of the 22nd British HCI Group Annual Conference, Liverpool, UK, pp. 111–119 (September 2008)
16. Stross, R.: When Everyone’s a Friend, Is Anything Private?, http://www.nytimes.com/2009/03/09/business/worldbusiness/09iht-08digi.20688637.html
17. Sweeney, L.: k-anonymity: A Model for Protecting Privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems 10(5), 557–570 (2002)
18. Taheri, S., Hartung, S., Hogrefe, D.: Achieving Receiver Location Privacy in Mobile ad hoc Networks. In: Proceedings of the IEEE International Conference on Privacy, Security, Risk and Trust, Minneapolis, MN, USA, pp. 800–807 (August 2010)