Crowd-Sourced Sensing and Collaboration Using Twitter more

Crowd-Sourced Sensing and Collaboration Using Twitter Murat Demirbas, Murat Ali Bayir, Cuneyt Gurcan Akcora, Yavuz Selim Yilmaz Computer Science & Engineering Dept., University at Buffalo, SUNY Email: {demirbas, mbayir, cgakcora, yavuzsel}@cse.buffalo.edu Hakan Ferhatosmanoglu Computer Science & Engineering Dept., The Ohio State University Email: hakan@cse.ohio-state.edu Abstract—Despite the availability of the sensor and smartphone devices to fulfill the ubiquitous computing vision, thestate-of-the-art falls short of this vision. We argue that the reason for this gap is the lack of an infrastructure to task/utilize these devices for collaboration. We propose that microblogging services like Twitter can provide an “open” publish-subscribe infrastructure for sensors and smartphones, and pave the way for ubiquitous crowd-sourced sensing and collaboration applications. We design and implement a crowd-sourced sensing and collaboration system over Twitter, and showcase our system in the context of two applications: a crowd-sourced weather radar, and a participatory noise-mapping application. Our results from real-world Twitter experiments give insights into the feasibility of this approach and outline the research challenges in sensor/smartphone integration to Twitter. I. I NTRODUCTION The ubiquitous systems vision [1] of embedding and weaving abundantly available tiny-computers to the fabric of our daily lives is close to fruition. With the advances in MEMS technology in the previous decade, it has become feasible to produce various types of sensors (such as magnetometers, accelerometers, passive-infrared based proximity, acoustics, light, heat) inexpensively, in very small-form factor, and in low-power usage. Furthermore, cellphone technology has seen an adoption rate faster than any other technology in human history [2]: as of 2009, the number of cellphone subscribers has exceeded 3.3 billion users. The rate of innovation in this field has been head-spinning. Nokia, Google, Microsoft, and Apple have all introduced cellphone operating systems (Symbian, Android, Windows Mobile, iPhoneOS) and provided APIs for enabling open application development on the cellphones. These modern cellphones, which are dubbed as smartphones, enable location-aware services as well as empowering the users to generate and access multimedia content. Despite the availability of the devices to fulfill the ubiquitous computing vision, the-state-of-the-art falls short of this vision. We argue that the reason for this gap is the lack of an infrastructure to task/utilize these devices for collaboration and coordination. In the absence of such an infrastructure, the state-of-the-art today is for each device to connect to Internet to download/upload data and accomplish an individual task that does not require collaboration and coordination. Providing 978-1-4244-7265-9/10/$26.00 c 2010 IEEE an infrastructure for publish/subscribe and tasking of these devices enables any node to search the data published by several nodes in one region to aggregate and decide on a question, as well as task several nodes in one region to acquire the needed data (if the data is not already being published to the infrastructure). We propose that Twitter [3] can provide an “open” publishsubscribe infrastructure for sensors and smartphones and pave the way for ubiquitous crowd-sourced sensing and collaboration applications. The open publish-subscribe system of Twitter implies that different actors may integrate user data differently. Moreover, third parties can use the gathered data in unanticipated ways to offer new services with them. In addition to this open publish-subscribe infrastructure, the social networks angle of Twitter also provides a useful feature for the crowd-sourced sensing and collaboration applications. Finally, the wide popularity of Twitter and the big community behind it (more than 30 million users in US), is an important reason to target our crowd-sourcing system for Twitter: It is easier to give the community a tool than to give the tool a community. More specifically, we provide the following contributions. • In Section II, we provide a detailed survey of Twitter with existing application domains on news and alert systems. In Section III, we present emerging application domains for Twitter: including crowd-sourcing, participatory sensing and social collaboration. • We discuss sensor integration to Twitter in Section III-A and smartphone integration in Section III-B. We point to a potential new architectural trend in sensor integration, that of inexpensive sensors using cellular data network to reach Internet in one hop. • In Section IV, we present our design and implementation of a crowd-sourced sensing and collaboration system over Twitter. Central to our system is a Twitter-bot (with an integrated database system) that accepts questions, crowdsources them, and aggregates the answers to reply back to the querier. The system also includes a smartphone client for automatically pushing sensor reading information to Twitter. • In Sections V and VI-A, we showcase and evaluate the performance of our crowd-sourced sensing and collab- • oration system on two case-studies. The first one is a crowd-sourced weather radar 1 , which help monitor finegranularity weather conditions and act as a ground-truth. Our second application is noise mapping of a region by aggregating the automatic noise-sensing updates from smartphones. We present an analysis of our real-world Twitter experiments to give insights for the feasibility of our approach. We find that although we do not offer the user any incentives to reply, our queries receive at least 15% reply ratios. Surprisingly, 50% of the total replies arrive within the first 10 minutes of our query, and 80% of the replies arrives within the first 2 hours, enabling lowlatency operations for crowd-sourcing applications. Our experiments also found that consistently the majority of replies come from users that access Twitter from their mobile phones. II. T WITTER A. Twitter in a Nutshell A Web 2.0 project, Facebook.com established a status update field in June 2006, but it was Twitter.com that took status sharing between people to mobile phones four months later. First named as “Status”, then as “Twttr”, Twitter has gone beyond status sharing and became a Web 2.0 microblogging site for information sharing and news reporting. Twitter started its journey in 2006, but its fame started to spread after the South by Southwest festival in 2007. In the event, company set up user accounts for the participants, and used big screens streaming tweets from them in the conference simultaneously. The effect of the conference was huge for Twitter. According to a report by HubSpot in 2008 [4], despite being functional since 2006, Twitter had its 70% of users joined in 2008, and reached around 4-5 million users, making it a top 1000 website in web traffic. The trend of growth for Twitter has continued since then. In 2009, HubSpot [5] reported an astonishing 18.000% growth rate of Twitter users. The report gives the total number of Twitter users in USA as 27 million. 55% of these users are male, 48% is between 18 and 34 ages. Twitter has seen a rapid growth in the western sphere, and cities like London, New York and San Francisco generate the largest traffic on the site. Top 100 cities list [6] is dominated by US, with the first non-USA city being Toronto, Canada. Tehran, Iran is 18th on the list, making it the first non-western city. Tokyo, Japan, once in top 10 in 2008, now enters the list as the 21st city. B. Beneath the Hood Twitter’s success can be attributed to two main factors; elegance in design, and simplicity in adding third party improvements to Twitter. Elegance is due to the character limit. Twitter names microblog posts from users as tweets. 1 You can visit our weather radar @rainradar on Twitter. We display the answers to our weather radar on a map at http://ubicomp.cse.buffalo.edu/ rainradar. The map is configurable to show results from previous days, and also is zoomable to show fine-grain locations of the replies. Each tweet has a 140 character limit which is inherited from text messaging. (The original 160 character SMS limit was reorganized into 20 character username and 140 character post fields.) A comparison between blogging and microblogging gives us a good understanding of the reason behind Twitter’s popularity. While blogging requires good writing skills and large content to fill pages, Twitter restricts posts to 140 characters, which encourages many more people to post. Simplicity is due to an early decision by Twitter to provide an HTTP based open source API and share posts with third party applications. Twitter’s API consists of two different parts: Search API and REST API. REST API enables Twitter developers to access the core Twitter data. This data includes tweets, timelines, and user data. Search API enables the developer to query the tweets. It also provides information about the trending topics. The usage of both APIs is subject to rate limiting, however, based on the request of the third party, Twitter may add these applications to its whitelist and remove request limitations. The ease and flexible usage of Twitter API encouraged several developers to write applications. Starting with Twitterrific in January 2007, many applications have been created for Twitter [7]. C. Existing Twitter Application Domains 1) News: Twitter is becoming regarded as the fastest way to reach to breaking news. Users’ collaboration has given Twitter a clear edge over news centers and recently news centers have set up Twitter accounts and encouraged users to interact with these accounts in order to capture breaking news. CNN maintains 45 official Twitter accounts with more than 5 million followers. During election periods in the world countries, Twitter played a greater role than news centers, and attracted attention. The information flow on Twitter has been observed with great interest, and in some cases governments reportedly warned Twitter.com to not to undergo maintenance for it would break the information stream [8]. In recent Mumbai attacks in India, just minutes after the attacks, Twitter was the major source until news sites caught up with updates. As well as posts, information flow to Twitter consists of pictures, links and videos. Demonstration pictures from the world and the first picture from US Airways plane in the Hudson River [9] increased Twitter’s popularity in the public. The Economist declared Twitter a winner in this information race [10]. 2) Alert Systems: Twitter provides a system that can connect residents of a city with virtually no cost. It also increases the abilities of an alert system by inputting more user generated data. Some cities already opted for Twitter to alert their residents [11]. The Virginia Tech incident in 2007 highlighted the security issues on university campuses. To integrate e2Campus emergency notification network with popular social networks, Pacific University of Forest Grove, Oregon, implemented a Twitter based alert system for its students, and the trend is likely to grow [12]. With this system, universities can send e2Campus alerts to popular networks. 2 III. R ESEARCH D IRECTIONS USING T WITTER In this section we discuss sensor and smartphone integration to Twitter and identify research directions and emerging applications for these domains. A. Integrating Sensors With the advances in MEMS technology in the previous decade, it has become feasible to produce various types of sensors (such as magnetometers, accelerometers, passive-infrared based proximity, acoustics, light, heat) inexpensively, in very small-form factor, and in low-power usage. Moreover there has been nearly a decade of research in wireless sensor networks (WSNs) and some real-world deployments of WSNs have been successfully demonstrated [13]–[16]. As such, WSNs offer an untapped source of information about our physical world. However, WSNs have not achieved the broad impact and visibility it deserves. Not only are we far away from “a central nervous system for earth”, there is no significant market penetration for WSNs yet. Arguably the greatest barrier against wider adoption of WSNs is the difficulty in locating sensors and subscribing to them. We propose that Twitter can provide an “open” publish-subscribe infrastructure for sensors, as well as the search/discovery of sensors with certain attributes. Moreover, having access to a lot of sensors is also valuable in that it would be possible to reduce false-positives from sensors by cross-checks. Below we list some ideas we are pursuing for sensor integration to Twitter. Sensor tweet standards. In order to search and process sensor values on Twitter, we need to agree upon a standard for publishing these sensor readings. We offer a biography format on Twitter that describes a sensor in detail in Section IV-B. The bio-code makes sensors easy to find. By just searching for the desired sensor functionalities using the Twitter API over the bios, one can reach all sensors within a locality that provides the desired functionalities. We are currently developing a standard, TweetML, for tweeting sensor values. We will make use of the built-in hashtags feature in Twitter for easier accessibility and searchability of sensor value fields. As part of our current work, we are publishing data to Twitter from some existing WSNs deployments. One of these is the wine-cellar monitoring WSN deployment, and another is personnel tracking WSN deployment. New WSN architectures. The popularity of Twitter have already resulted in the production of inexpensive specialized devices for microblogging. TwitterPeek [17] is a good example of this trend. TwitterPeek enables the user to tweet and follow Twitter from anywhere (no WiFi necessary) using the cellular data network to connect to Twitter. One can buy TwitterPeek for a low, one-time fee and get connectivity service for the lifetime of the device –without any bills ever. The reason TwitterPeek is able to offer a powerful device at a low price is because of the benefits of mass production. TwitterPeek may signal a new direction for WSN devices. Instead of using low communication range devices that incur the challenges/complexity of maintaining a multihop network 3 and still require a basestation to access Internet, TwitterPeeklike sensors can directly reach Internet at one hop. These devices may not only tweet their sensor readings, but can also be easily controlled over Twitter to reconfigure their sensing schedules and tune their parameters. B. Integrating Smartphones Smartphones provide significant advantages over traditional wireless sensor nodes. Firstly, smartphones are mobile. Wherever a smartphone user goes, smartphone can take sensor readings (with built-in sensors for acoustic, image, video, accelerometer, tilt, magnetometer, and potentially with other integrated custom sensors). The dynamic geolocation feature of smartphones enables these readings to be location and timestamped. Thus, in contrast to WSN nodes that are tied to static locations, and do not scale for coverage of large areas, smartphones cover large areas due to their mobility. Secondly, smartphones are personal and administrated by their users. In contrast to sensor networks where energy-efficiency of utmost importance, smartphones are recharged by their users and it is not necessary to try to squeeze every bit of energy. Moreover, since smartphones are personal, they provide the potential of interacting with the phone user for tasks requiring human intelligence and intervention, such as taking a picture of a requested location, answering a question for which the user is well-equipped. Below, we identify 3 new application domains for smartphone integration to Twitter, with increasing level of complexity. 1) Participatory Sensing: Participatory sensing is the use of volunteering smartphones to collect data from a large region. Although there has been significant work on participatory sensing [18], using Twitter opens up novel improvements on this application domain. Twitter’s open publish-subscribe system enables others to use the gathered data in unanticipated ways and offer new services over them. Moreover Twitter’s social network aspect enables new features to be added to participatory sensing. For example, when one of the users have performed significant amount of participatory sensing but her friend and competitor (Twitter enables using lists for followers/friends) have not done anything for that week, our system can send a reminder message for that friend. There is already a good support for enabling participatory sensing applications over Twitter. Some Twitter third party applications (including Twittervision17, Twittearth, Twitter Atlas, Twibs20) use maps to show status posts, and can be configured to show posts only from certain regions. 2) Crowd-Sourcing: Crowd-sourcing means distributing a query to several Twitter users in order to gather and aggregate the results and exploit the wisdom-of-crowds effect. Examples of crowd-sourcing may be a weather/rainradar (with better precision and ground-truth than meteorological weather radars), and polling for the best restaurant entree in town. Web offers a rich variety of successful crowdsourcing applications. Renta-coder 2 facilitates assigning programming tasks to freelance 2 http://www.rentacoder.com/ |LO:?43.003,-78.787|N97:NO| UTC-5|UB:CSE:CSS| |43.003509,-78.787079|noise:H |11/27/2009 16:48:26| Fig. 1. Crowd-sourcing System Architecture e-Government projects to win a prize, and as the students competed for the best applications, the government benefited from receiving applications for free [22]. IV. O UR C ROWD - SOURCING S YSTEM A RCHITECTURE In this section, we present the design of our crowd-sourced sensing and collaboration system over Twitter. Figure 1 illustrates the high level architecture of our crowd-sourcing system. Twitter acts a middleware for publish/subsribe as well as search & discovery. Our system is composed of three components namely Askweet, Sensweet and Twitter clients. Sensweet is a smartphone application that publishes real-time readings from the integrated-sensors to Twitter. Askweet is a program that listens to its Twitter account for questions and processes the questions and aggregates the replies it receives to these questions from Sensweet and the Twitter clients. We discuss the design of the Askweet and Sensweet components in more detail below. A. Askweet Askweet accepts a question, and tries to answer the question using the data on Twitter, potentially data published by Sensweets. If it is not possible to answer the question with existing data and/or if the question requires interaction, Askweet finds experts on Twitter (potentially using information retrieval techniques) and forwards the question to these experts. After obtaining answers from the experts, it replies the answers back to the asker. Askweet accepts a certain syntax from queries and replies, but it can also be extended and generalized to adopt modern natural language processing techniques. The Askweet components of two case studies in this paper run on a dedicated server, and keep all relevant data in a database to process questions and replies in a coordinated manner. Due to the parallelizable nature of processing queries and replies (a thread is assigned to each reply), it is easy to deploy 4 3 http://www.openmind.org/ 4 http://seeclickfix.com/   ©¥2  ¥& ) !§¡¤¥ Sensweet 0§'$ ©¨ £¥#"¥!  ¥ ©¥¦ ©¥ ¡¥§ ©¥¡¡§¦¨ ¥¤ £ ¡ ¢ ¤©!¤¥1 ©¥¦¤( ¤¤¥$! © 13 ¥& ) !§¡¤¥ ! $%£   © !§¡$§¢ programmers, and with Open Mind 3 non-expert internet users collaborate to create intelligent software. Crowd-sourcing depends on user participation. With Twitter’s popularity, finding a user to ask a question is not a problem, and we find that users are willing to participate and answer questions. In our experiments up to 1/6th of our queries got answered, although we did not provide any incentive for answering. We think this is due to the sharing and participatory nature of Twitter culture. It is possible and easy to provide incentives for encouraging participation. Using Twitter’s list functionality a group of users might be classified as experts of a topic. Each topic may have multiple user groups with different expert levels. Upon answering questions, the users can get promoted to a higher level. Visibility of these lists to the public would will be a great incentive for users to collaborate. Another way to incentivize users is to give the users that answer more questions the right to send more questions to our crowd-sourcing engine. For developing countries of the world, crowdsourcing can utilize interesting incentives. Eagle [19] developed the txtagle system to crowdsource translation, transcription and survey tasks to mobile phone users in Nigeria. With txteagle, users earn mobile phone airtime or mobile money upon completing tasks that are sent to them via text messages. Citizens’ interest in shaping their own city is also a strong incentive. Brabham [20] proposes harnessing creative ideas for city planning from web users. Another platform, SeeClickFix4 , creates a vital link for city inhabitants to report the problems to the government. The social network nature of Twitter can be exploited to provide an extra incentive for crowd-sourcing. It is also possible to provide useful feedback to crowd-source participant based on others answers. For example, the participant may get to see how her answer fares with other answers. In the “best restaurant” query, participants may get to learn which other participants also favorite their restaurant of choice. 3) Social Collaboration: Social collaboration applications are more sophisticated than crowd-sourcing applications in that they require back-and-forth interaction in contrast to the asymmetric one-shot interaction involved in crowd-sourcing. Examples of social collaboration applications include pick-up soccer games, arranged ride-sharing, community-organization events, support groups for addicts, and support groups for exercising and weight-watching. Recently, cultural institutions are developing platforms where users collaborate on creating rich media for an art exhibit. In the m-Dvara project by Coppola et al [21] visitors can record media and comment on the art pieces, and the new visitors can surf internet to read comments and see which art pieces are most recommended. Governments can greatly benefit from the synergistic effect of large scale collaboration. In a test case in India, 5000 students from more than 100 Indian institutions worked on Askweet B. Sensweet A Sensweet application uses the smartphones’ ability to work in the background without distracting the mobile user. Sensweet applications sense the surrounding environment and send these data to the Twitter. While sending the data to Twitter, the Sensweet client formats the data according to the bio-code it advertises in the Biography section of its Twitter account. The main idea of using a bio-code is to allow worldwide users to search for the sensors they are looking for on-the-fly and enjoy a plug-and-play sensor network without registering through dedicated sites. Here we provide a standard for a bio-code for Twitter to encode the values published by the sensor. To illustrate with an example, the Bio section of our noise-sensing application reads as: |LO :?43.003, −78.787|N 97 : N O|U T C − 5|U B : CSE : CSS|. This bio-code consists of tuples separated with a vertical bar (|). In each tuple, descriptive fields are separated with a colon (:). The values that are separated with commas describe the phenomena the sensor(s) captures. The first tuple is always the location parameter: longitude and latitude (obtained from the built-in GPSs or entered manually). If the sensor is mobile (e.g., smartphone), a question mark will precede the longitude value. Even for mobile sensors a default location is added to give the queriers an idea of the region the sensor operates. The question mark hints that a more accurate location is included in the tweets. The second tuple explains the manufacturer of the sensor, product ID (if possible) and the sensor type(s) the sensor provides. The third tuple is optional, and describes the time zone that the sensor uses and can also include a timestamp. Although Twitter provides timestamping of tweets, this extra timestamp becomes important in case when a sensor need to store readings and send them later when it can connect to the Internet. The fourth tuple involves identification of the company/project that deploys the sensor, and defines a group id to locate other sensors that are part of that project. Thus, the above bio-code is decoded as: Location is dynamic, but default location is UB North Campus Bell hall, Nokia N97 is used to capture GPS and accelerometer values in NY time zone for UB CSE Crowd-Sourced Sensing (CSS) Project. V. C ASE S TUDY: C ROWD -S OURCED W EATHER R ADAR In this section we explain our crowd-sourced weather radar application. For the sake of simplicity, we choose a topic where everybody in Twitter can be an expert: the current weather condition. Our application contains two subapplications, one of them obtains the current weather condition from users, and the other one obtains guesses from the users about the next day’s weather condition. 5 Question Handler Thread Poll Thread Database Fig. 2. State transition diagram for Askweet component Weather radar application has its own question and answer format. The question messages sent by query owners are in the form of “?[Application Name] Loc:Location” where application name is either Weather or WeatherGuess. For instance “?Weather Loc:Buffalo,NY” might be a valid question for asking weather condition in Buffalo,NY. The forwarded query to the Twitter users is of the form: “How is the weather there now? reply 0 for sunny, 1 for cloudy, 2 for rainy, and 3 for snowy” Our weather radar application account can be visited at rainradar on Twitter. We display the answers to our weather radar on a map at http://ubicomp.cse.buffalo.edu/rainradar. The map is configurable to show results from previous days, and is also zoomable to show fine-grain locations of the replies. We have implemented only the Askweet component of the crowd-sourced system since the Sensweet component can be any smartphone Twitter application. The Askweet component of our weather radar application is written in Java Programming language by using Twitter4J open source API library and total size of the source is about 2KLOC. Askweet listens to the incoming messages to its Twitter account and processes them with respect to their message types. The main function of Askweet component is to get a question, process it and/or forward this query to the multiple users who can answer it. After obtaining answers from Twitter users, Askweet sends the reply to the original querier. DABFP QY 9BBFE4 Weather Condition Application Thread Weather Guess Application Thread DCB9SB9BR ABFPQY 4 DAB99GFE 77654 D @9 @R b6a4 D Q6G9PBIH `PY 4 D@9@R b6a 4 DCB9A@98 77654 DCB9SB9BR Q6G9PBIH PPBIX ABW9@BV4 D CB9SB9BR Q6G9PBIH4 D @9 @R b6a4 D Q6G9PBIH `PY4 D CBAGU ABTGE4 D CB9SB9BR Q6G9PBIH AB W9 @B V4 DdaRc4 Twitter Rate-Limit Checker Thread Askweet on a cloud computing platform for elastic scalability. Since Askweet accounts have been recently whitelisted by Twitter and hourly request limits removed, it is possible to implement Askweet over Hadoop Map/Reduce framework to handle millions of queries and replies daily. Main Thread Process Answer Thread Cumulative Distribution for Response Time User Participation Client Type Percentage (%) Percentage (%) Percentage (%) (a) CDF for response time Fig. 3. Cumulative Distribution for Response Time (b) User participation Experimental results for NYC in different time slices User Participation (c) Client types Client Type Percentage (%) Percentage (%) Percentage (%) (a) CDF for response time Fig. 4. (b) User participation Experimental results for 3 cities (c) Client types Our Askweet implementation is multithreaded for scalability, with each thread implementing a specific functionality. When the Askweet application is launched (Figure 2), it starts the poll thread that polls the Twitter account and gets the messages. Then the thread detects whether the message is a question or answer. Depending on the message type, it starts either a question handle thread or a process answer thread. Poll thread keeps on checking the account every minute continuously to get the new messages addressed to itself. Question handle thread receives the question from the poll thread and detects if it is weather guess question or weather condition question. Depending on the question type it starts either a weather condition application thread or a weather guess application thread. Question handle thread also starts Twitter rate-limit checker thread in order to ensure that Askweet stays within Twitter’s request limits. After this step, the question handle thread is terminated. Weather guess application and weather condition application threads have almost the same functionality. Both of them get the question and parse the location from question text and search through Twitter to find users for the specified location. Then they send the question to the selected qualifying Twitter users. After that these application threads are terminated. Both of the applications keep all the relevant data in a database in order to observe the social collaboration and attendance. This database also helps the program not to spam any Twitter user with multiple requests within a week. As we have a query count restriction on Twitter, we need a thread that checks Twitter to see if the system can proceed to post questions and inform the query owner about the received answers. If the system exceeds the rate limit, the thread locks 6 question asking permit and releases the lock if otherwise. Process answer thread gets the answers from the poll thread and tweets the answer to Twitter. It also selects five of the answers to forward to the original querier. A. Experiment Results In this section, we present our experimental results for the weather radar application. We performed three types of experiments using weather radar. In the first one, we compare the user responses in different time slices of day for New York City (NYC). In the second, we compare user responses from three different cities: NYC, Toronto and Montreal. In the last one, we analyze the correlation of answers from our users with data from weather.com for one day (December 6, 2009). In the first experiment, we compare the user response behaviors in NYC at different time slices. We observed that the response times in the afternoon and in the evening are better than those in the morning and at night (Figure 3(a)). An interesting phenomenon is that on the average 50% of the answers are received within the first ten minutes (Figure 3(a)). Figure 3(b) shows the user contribution to our experiments. We observe that Twitter user contribution to the experiment is highest in the morning which is nearly 20% (Figure 3(b)); we get a response from 20% of the queried users. For the other time slices, the contribution is around 15% (Figure 3(b)). Figure 3(c) shows the user distribution with respect to Twitter client types. At night time, an overwhelming majority of people use mobile Twitter clients to send their responses (Figure 3(c)). Overall, mobile client users consistently dominate over desktop/laptop users (Figure 3(c)). In the second experiment, we compare the user responses ËÊÀÄÈÇÃÉ ÃÈÇÃÄÃÆ ÀËÏÎÃÉ ÍÌ ÅÄàÁÀ¿ µ´ª®²±­³ ¸ª®ª«·±¶ ­²±­®­° ¸ª®ª«·±¶ ²­© ¯®­¬ «ª© Maximum Response Time y€€yx~ƒ‚ {yzy€ x‡z†€ ~}{z| …„ {yzyxwv ¹¼ ¹» ¹½ ¹º ¹¹¾ ¹ ˜ii˜j—glk d˜™˜jih o—j—nm˜k o—j—nm˜k gie gfd™e d˜™˜—–• Maximum Response Time pr pt ps pq p ppu ‰‘ £¦ £¥ ‰“ ‰’ ‰ £§ £¤ ££¨ ‰ £ ‰‰” ur ’ up ’ yƒƒy„x†ˆ‡ ‘Œ  ‘‰  ti p tf p ¢¡”˜œŸ—  ‘Œ ti †t€… —œŸ—˜—ž ‘Š tg tf qes qep qr ‘‰ Žˆ Žˆ Ž y€y„ƒ‚ œ›š ™˜—– •”“ y€yxwv މ qf Ž qp eep ˆˆ e ˆ ei eg ˆŒ ˆŠ ef ˆ‹ ˆ‰ eh Montreal whereas it is 29% in Toronto. VI. C ASE S TUDY: S MARTPHONE E NABLED N OISE M AP In this application, we measure the noise level of the surrounding environment via GPS enabled smartphones and provide a noise level querying service over Twitter. Here, the noise corresponds to all sound frequencies in the environment. We describe our implementations of the Askweet and Sensweet components for this application below. Askweet component. We implemented the Askweet component similar to that of the weather radar application. The noise map application has its own query format of “?Noise Loc:Location”. Any Twitter user can send a question to the Twitter account of Askweet (twitter.com/askweet) in order to query the noise level of a specific location. For example “?Noise Loc:Student Union, UB, Buffalo, NY” queries for the noise level of the Student Union at the University at Buffalo. When Askweet gets a new query, it automatically tries to resolve the location by using Google’s Geocoding Service (http://code.google.com/apis/maps/documentation/). After getting the latitude and longitude information from Google’s Geocoding Service, Askweet searches previously known Sensweet clients in the database in proximity of the specified location. If Askweet finds a local client, it returns the latest noise level obtained from that client. If multiple Sensweet clients are found, the noise value with the latest timestamp is returned to the querier. Sensweet component. We implemented a Sensweet client for the Nokia N97 Smartphone series. For implementing the Sensweet client we used Carbide C++ version 2.0.2, Nokia N97 Symbian S60 SDK V1.0 and Qt Tower 4.5.2. The total size of the source code for this Sensweet component is more than 1500 lines of code. The Sensweet client detects the noise level of the surrounding environment and forwards this data to Twitter using our TweetML format mentioned in Section IV-B. The specific TweetML format (|Loc|N oise : V al|T imestamp|) for Noise Map application includes ordered values for location, sensor reading and timestamp. An example sensor reading can be “Noise:H” denoting that the current noise reading is “High”. Since Nokia N97 smartphones do not provide the noise level in decibel format, we implemented our own noise sensor driver to map noise samples into three categories: L as Low, M as Medium and H as High. Our Sensweet client implements a timer for reading the GPS coordinates and using the microphone to record a one second noise sample in “Windows WAV” file format. Then, Sensweet client parses this WAV file to obtain the mean value for the amplitude of signals in the sample. In order to map the current sample into one of the noise categories {Low, Medium, High}, we used three normal distributions. For a given mean value x of amplitudes obtained from a one second sample, we calculate the following probability density function (pdf(x)) for each of the predefined three normal distributions: (x − µ)2 1 ) exp(− pdf (x) = √ 2σ 2 2πσ 2 7 (1) Rainy! Time : 2010-01-05 00:52:41 Fig. 5. Screenshot of the Weather Radar Web Application from different cities. We observe that users in NYC respond quicker than those in Toronto and Montreal, which have almost the same response patterns (Figure 4(a)). In Figure 4b, we compare the participation ratio of the users in these three cities. We see that users in NYC participate more than those in Toronto and Montreal (Figure 4(b)). In all these three cities, mobile Twitter client users dominate over desktop/laptop users and this ratio is highest in NYC (Figure 4(c)). A screen shot of the weather radar map application for all cities is given in Figure 5. TABLE I C OMPARISON OF USER RESPONSES WITH WEATHER . COM City New York City Toronto Montreal Match for Current Day 89% 79% 88% Match for Next Day 56% 29% 54% In the final experiment, we analyze the correlation of answers from our users with data from Weather.com. Since it is not practical to validate Twitter user responses with various fine-grain spatial (latitude, longitude) and temporal dimensions, the correlation is based on course-grain city wide level weather data for the entire day. In the first column of Table I, we list the correlation of user responses with the data from weather.com for the current day (the weather.com data and user responses are collected in the same day). If the weather.com reports “snowy” for the day, all responses except “snowy” are counted as “unmatched”. If the weather.com reports a fuzzy condition such as “partly cloudy”, all responses including “sunny” and “cloudy” are counted as “matched”. In this experiment, we observe that for each city at least 79% of the answers match with the data from weather.com. In the second column of Table I, we list the correlation of user predictions for the next day with the data from weather.com. Here we collect the predictions of users in previous day (December 6) and find the correlations of those predictions with weather.com data collected on the next day (December 7). We observe that at least 50% of the user predictions match with weather.com for New York City and (a) Low level noise Fig. 6. 100 80 60 40 Signal Values 20 0 -20 -40 -60 -80 -100 0 500 1000 1500 2000 2500 Samples 3000 3500 4000 4500 -6000 -8000 0 Signal Values 2000 0 -2000 -4000 (b) Medium level noise Normal distributions for different noise levels 6000 4000 30000 20000 10000 0 -10000 -20000 -30000 500 1000 1500 2000 2500 Samples 3000 3500 4000 4500 0 (c) High level noise Signal Values 500 1000 1500 2000 Samples 2500 3000 (a) Low level noise Fig. 7. (b) Medium level noise Representative samples for different noise levels (c) High level noise Percentage (%) Fig. 8. State transition diagram for Sensweet client Fig. 9. Daily noise fluctuation graph The state diagram of the Sensweet client for noise map application is given in Figure8. When the phone is started the Sensweet application is also launched as a background process and waits in the “idle” state. The GPS based location, noise level, and current timestamp is logged to the flash memory when the sensor timer is fired. We also keep another timer for forwarding sensor readings to Twitter. When the Internet timer is fired, main application reads the latest sensor readings from the flash disk and tweets it (http://twitter.com/Sensweet). A. Experiment Results Here we provide our experimental results for the noise map application. 8 We assign the “Low” category to the samples that we obtained during the silence in home and computer lab locations. The amplitude distribution for “Low” level noise is given in Figure 6(a). Here the amplitude (absolute value of signal values) of low level noise mostly fluctuates between [0,100], which also implies that signal values mostly fluctuate between [-100,100] (Figure 7(a)). For the “Medium” category we collect samples from the Student Union at UB and various meeting rooms at the CSE department where people talk to each other (noise mostly includes human voice). The “High” category is collected in bars and clubs in Buffalo with loud background music. The normal distribution of amplitudes for qy o™mxl qy oxm–dl wut q~t}hz q p o–d mnl | k{ qp onm—l Flash Memory ujtjikz gwutv ujtjhsr jkkjihgfe • •– •— Main Process „ƒwx€‚ xw€y x‰ˆuw‡s Tweet Sensor Readings Log Noise Sample And GPS ••d The µ in the formula represents the mean of the corresponding distribution and σ 2 represents the variance. The assignment is based on the highest value. Since there is no gain setting for the microphone of Nokia N97, our mapping is valid for any Nokia N97 smartphone device. For the smartphones having adjustable microphone gain, our mapping can be easily adapted by dividing signal values by the gain factor. „’‘ts In order to determine the normal distributions representing the “Low”, “Medium”, and “High” categories for noise levels, we performed experiments in six different locations with varying noise levels. In each location, we recorded more than 200 noise samples with a duration of one second. 3333A 33335 rqpihgfed cb 33339 3 933343 8333343 3 7333343 6333343 5333343 33339@ 3500 B C D E F G H I P Q R S R T U T V W X Y ` a T V W øøø øøø© •™ •˜ øøøý 210)('&%$ #" øøøû ø ýøøøùø üøøøùø ûøøøùø úøøøùø ø øøøûÿ þøøøùø ¡   ¢ £ ¤ ¥ ¦ § ¨            !    „“‰‘s „ux†vw…s ÐÖÓ ÐÔÓ ÐÐÓ ÷öõôóòñðï îí „ƒwx€‚ xw€y vwuxwvuts ÐØ „ux†vw…s Ð× „ƒ”w…s ÐÖ ÐÔ Ð ÕÐÑÐ ÒÔÐÑÐ ÒÐÐÑÐ Ù Ú Û Ü Ý Þ ÓÐÑÐ ß à á â ã ä ÒÓÐÑÐ ãåæ ç èå é ê ÔÐÑÐ ë ì ç èå Ð “Medium” and “High” categories are given in Figure 6(b) and Figure 6(c). Representative samples for these two categories are also given in Figure 7(b) and Figure 7(c) respectively. In another experiment, we measure the noise fluctuation of our case study user for one weekend day over different time slices starting from Saturday 4.00 pm until Sunday 8.00 am (Figure 9). By analyzing the temporal noise fluctuation, it can be possible to predict some of the activities of the user during the day time. In the afternoon period the noise level fluctuates between “Low” and “Medium” level. During this time the user was at home and meeting with his friends. In the evening period the ratio of “Low” level decreases and ratio of other two levels increase. In this period, the user was having dinner with his/her friends in some place and going to a bar/club after that. In the night period the noise level is mostly “High” and the user was visiting a club. The noise level in the morning period is “Low” mostly since the case user was sleeping at home. VII. C ONCLUDING R EMARKS We presented a crowd-sourcing system architecture over Twitter, and demonstrated this system with two case studies: weather radar and noise mapping. Our experiments with crowd-sourcing on Twitter are promising. Even without an incentive structure, Twitter users volunteer to participate in our crowd-sourcing experiments (with around 15% reply rates) and the latency of the replies are low (50% replies arrive in 30 minutes and 80% replies arrive in 2 hours). Another promising finding is that a majority of replies were tweeted from smartphones. Our experiments suggest that Twitter provides a suitable open publish-subscribe infrastructure for tasking/utilizing sensors and smartphones and can pave the way for ubiquitous crowd-sourced sensing and social collaboration applications. There are several open research questions remaining for fulfilling this vision. Security and trust issues remain as significant challenges. In our future work we will consider mining of tweets and exploiting of social networks structures in Twitter to deploy expert finding and social collaboration applications. We will also experiment with adding various incentive schemes to our crowd-sourcing system, caching replies, and deploying our system on a cloud computing platform. R EFERENCES [1] M. Weiser, “The future of ubiquitous computing on campus,” Commun. ACM, vol. 41, no. 1, pp. 41–42, 1998. [2] N. Eagle and A. Pentland, “Social serendipity: Mobilizing social software,” IEEE Pervasive Computing, vol. 04-2, pp. 28–34, 2005. [3] “www.twitter.com.” [4] “State of twittersphere,” http://cdnqa.hubteam.com/State of the Twittersphere by HubSpot Q42008.pdf, December 2008. [5] “State of twittersphere,” http://blog.hubspot.com/Portals/249/sotwitter09.pdf, June 2009. [6] “http://twitter.grader.com/top/cities.” [7] “https://www.technologyreview.com/files/18810/forward.pdf,” October 2008. [8] “http://www.washingtonpost.com/wp-dyn/content/article/2009/06/16 /ar2009061603391.html,” June 2009. [9] “http://twitpic.com/135xa,” January 2009. [10] “http://www.economist.com/world/middleeastafrica/displaystory.cfm?story id=13856224,” June 2009. [11] “http://www.maine.gov/portal/cas/index.shtml.” [12] “http://www.e2campus.com/pr081203-pacific facebook twitter.htm,” December 2008. [13] A. Arora, P. Dutta, S. Bapat, V. Kulathumani, H. Zhang, V. Naik, V. Mittal, H. Cao, M. Demirbas, M. Gouda, Y.-R. Choi, T. Herman, S. S. Kulkarni, U. Arumugam, M. Nesterenko, A. Vora, and M. Miyashita, “A line in the sand: A wireless sensor network for target detection, classification, and tracking,” Computer Networks (Elsevier), vol. 46, no. 5, pp. 605–634, 2004. [14] A. Arora, R. Ramnath, E. Ertin, P. Sinha, S. Bapat, V. Naik, V. Kulathumani, H. Zhang, H. Cao, M. Sridharan, S. Kumar, N. Seddon, C. Anderson, T. Herman, N. Trivedi, C. Zhang, M. Nesterenko, R. Shah, S. Kulkarni, M. Aramugam, L. Wang, M. Gouda, Y. Choi, D. Culler, P. Dutta, C. Sharp, G. Tolle, M. Grimmer, B. Ferriera, and K. Parker, “Exscal: Elements of an extreme scale wireless sensor network,” Int. Conf. on Embedded and Real-Time Computing Systems and Applications, 2005. [15] R. Szewczyk, E. Osterweil, J. Polastre, M. Hamilton, A. M. Mainwaring, and D. Estrin, “Habitat monitoring with sensor networks,” Commun. ACM, vol. 47, no. 6, pp. 34–40, 2004. [16] G. Werner-Allen, K. Lorincz, J. Johnson, J. Lees, and M. Welsh, “Fidelity and yield in a volcano monitoring sensor network,” in in 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2006), 2006. [17] “www.twitterpeek.com.” [18] J. Burke and et al., “Participatory sensing,” in ACM Sensys World Sensor Web Workshop, 2006. [19] N. Eagle, “txteagle: Mobile crowdsourcing,” in IDGD ’09: Proceedings of the 3rd International Conference on Internationalization, Design and Global Development. Berlin, Heidelberg: Springer-Verlag, 2009, pp. 447–456. [20] D. C. Brabham, “Crowdsourcing the public participation process for planning projects.” Planning Theory, vol. 8, pp. 242–262, 2009. [21] P. Coppola, R. Lomuscio, S. Mizzaro, and E. Nazzi, “m-dvara 2.0: Mobile & web 2.0 services integration for cultural heritage,” in SWKM, 2008. [22] N. Shah, A. Dhanesha, and D. Seetharam, “Crowdsourcing for egovernance: case study,” in ICEGOV ’09: Proceedings of the 3rd International Conference on Theory and Practice of Electronic Governance. New York, NY, USA: ACM, 2009, pp. 253–258. 9
x

Log In

or reset password

Reset Password

Enter the email address you signed up with, and we'll send a reset password email to that address

Academia © 2012