Social Data Analytics Tool (SODATO)

Abid Hussain 1 and Ravi Vatrapu 1,2

1 CSSL, Department of IT Management, Copenhagen Business School, Denmark
2 MOTEL, Norwegian School of Information Technology (NITH), Norway
{ah.itm,vatrapu}@cbs.dk

Abstract. This paper presents the Social Data Analytics Tool (SODATO) that is designed, developed and evaluated to collect, store, analyze, and report big social data emanating from the social media engagement of and social media conversations about organizations.

Keywords: social media, data science, computational social science, big data analytics.

1 Introduction

Big social data that is generated by the social media engagement and social software utilization of a company results in both operational issues and managerial challenges [5, 8]. With the widespread adoption of social media for organizational purposes, there is a clear need for developing concepts, methods, and tools that systematically analyze big social data. In recent years, researchers have emphasized that technical advancements are required to deal with the situation [10]. Such requirements include the modeling of social data on individual and collective levels as well as identification of unified methods to process social data [1, 4, 9]. In this paper, we report on the design, development and features provided by a theoretically informed and methodologically grounded IT artefact (SODATO) that addresses the diverse but interrelated issues associated with social data.

The remainder of the paper is organized as follows. Section 2 describes the design of the IT artefact in terms of the problem statement, use cases, intended user groups, and technical architecture. Section 3 presents the Action Design Research (ADR) method [7] and describes how ADR informed the design, development and evaluation of SODATO. Sections 4 and 5 report on the significance of the IT artefact to researchers and practitioners respectively. Section 6 summarizes the ADR evaluation of SODATO.
M.C. Tremblay et al. (Eds.): DESRIST 2014, LNCS 8463, pp. 368–372, 2014. © Springer International Publishing Switzerland 2014

2 Design of the IT Artefact

2.1 Problem Statement

From an academic standpoint, Zeng et al. [10] identified the unique technical challenges in social media analytics due to unstructured data across networks and observed that such data has not been treated systematically in the data and text mining literature. Wang et al. [9] argued that social computing is changing the way we interact, identified a need for modeling social data at both the individual and the collective level, and called for new analytical techniques and IT artefacts for social data analytics. Quite a few organizations have been falling behind in adopting social media due to a lack of understanding of its diverse scope [3]. Kietzmann et al. [4] urged organizations to take social media engagement seriously. They argued that since social media sites are diverse in functionality and scope, there is a need to identify uniform and generic methods to analyze social data from different sites. Finally, Chen et al. [1] describe the state of data analytics and argue that social media analytics is essentially different from traditional Business Intelligence and that more research is needed in order to design methods and techniques for drawing insights from social data.

From an industry standpoint, many commercial software vendors (such as Radian6 1, IBM Cognos Customer Insight 2, SAS 3, Social Bakers 4) provide software applications to monitor, measure, and manage social data. However, an important problem with existing commercial applications is that there is little-to-no empirical research on the data provenance, efficacy, effectiveness, and impact of the different social media metrics and key performance indicators they employ. Further, these applications neither provide the raw data nor offer transparency about their algorithms, formulas, and metrics, whether from the technical perspective of data science or the organizational perspective of business analytics. We term this lacuna in the current theoretical knowledge, empirical findings, and industry practice the Gulf of Social Media Analytics.
The primary objective of this paper is to briefly present the design, development and evaluation of an IT artefact (Social Data Analytics Tool, SODATO) to bridge the Gulf of Social Media Analytics. To achieve this, we have developed a unified framework [6] consisting of a theory of social data, a conceptual model, a formal model, a technological architecture and, finally, the software tool itself, which is the focus of this paper.

2.2 Use Cases

One use case is that the campaign strategist of a political party can use SODATO to fetch Facebook walls. A second use case is that the social media manager at an organization can use SODATO to fetch the organization's social data, extract meaningful and actionable insights on content performance (such as which post types are most popular amongst the users), and explore correlations with sales and other in-house data from ERP and CRM systems.

2.3 Intended User Groups

Based on the use cases above, the target end users of SODATO are researchers, analysts, social media managers, chief listening officers, and trainee analysts (for example, students in social media management and social data analytics courses).

1 www.radian6.com
2 http://www-01.ibm.com/software/analytics/cognos/
3 http://www.sas.com/software/customer-intelligence/social-media-analytics/
4 www.socialbakers.com
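As a minimal sketch of the second use case in Section 2.2, the following Python snippet ranks post types by average engagement. It is not SODATO code: the record fields (`post_type`, `likes`, `comments`, `shares`), the engagement definition (a plain sum of the three counts), and the sample posts are all hypothetical and serve only to illustrate the kind of content performance question a social media manager might ask of exported data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical, made-up sample of exported posts (not real data or results).
posts = [
    {"post_type": "photo", "likes": 120, "comments": 14, "shares": 6},
    {"post_type": "status", "likes": 40, "comments": 9, "shares": 1},
    {"post_type": "photo", "likes": 80, "comments": 6, "shares": 4},
    {"post_type": "link", "likes": 25, "comments": 3, "shares": 2},
]


def engagement(post):
    """Assumed metric: total number of interactions on a post."""
    return post["likes"] + post["comments"] + post["shares"]


def rank_post_types(posts):
    """Return (post_type, mean engagement) pairs, most engaging type first."""
    by_type = defaultdict(list)
    for post in posts:
        by_type[post["post_type"]].append(engagement(post))
    averages = {post_type: mean(values) for post_type, values in by_type.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


for post_type, average in rank_post_types(posts):
    print(f"{post_type}: {average:.1f}")
```

The same grouping could be applied per week or per campaign. Correlating the result with in-house sales data would additionally require a shared key such as a date, which is outside the scope of this sketch.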
2.4 Technical Architecture

Please refer to [6] for the unified framework of theory, conceptual model, and formal model of social data together with an illustrative example and a demonstrative case study. Figure 1 presents a schematic of SODATO. SODATO can be accessed from http://cssl.cbs.dk/sodato

Fig. 1. Schematic of the Social Data Analytics Tool (SODATO)

Technically, SODATO utilizes the APIs provided by the social network vendors, for example the Graph API provided by Facebook. SODATO combines a web application with Windows-based console applications that run in batches to fetch data and prepare it for analysis. The web part of the tool is developed using HTML, JavaScript, Microsoft ASP.NET and C#. The console applications are developed using C#. Microsoft SQL Server is used for data storage and data pre-processing.

SODATO provides a generic method for retrieving, storing and analyzing social data. SODATO can be utilized by practitioners as well as researchers in order to obtain a detailed understanding of trends, dynamics, and mechanisms, currently in the domain of Facebook (and scalable to other online social media platforms). Specifically, SODATO supports descriptive, prescriptive and/or predictive analytics in terms of the social graph (actors involved, artefacts created, actions taken, activities engaged) and social text (sentiments expressed, topics discussed, pronouns addressed, and keywords mentioned) [8].

3 Design Science Methodology

SODATO has been developed based on the design principles defined by the Action Design Research (ADR) methodology [7]. The development process started in 2011, when the alpha version of SODATO was released under the name SOGATO [2]. The development process began with the basic problem formulation jointly informed by the literature in the domain of social media analytics. Developers, researchers, trainee social media analysts and organizations constituted the ADR team of SODATO.
There have been multiple iterations informed by the second phase of ADR, Building, Intervention, and Evaluation (BIE). Each iteration contributed to the IT artefact as well as to the body of knowledge within the domain of social media analytics. Finally, reflections and learnings from the ADR process are currently being reported through both academic publications and workshops. Table 1 presents the first BIE cycle of ADR [7] for SODATO.

Table 1. First BIE Cycle for SODATO (SODATO: Release One)

Actor type: Researchers
Evaluation example: Danish election 2011 social media analysis
Feedback example: Large walls cannot be fetched using the web version
Contribution to ensemble artefact: Batch processing introduced

Actor types: Trainee analysts; Practitioners
Evaluation examples: Social media projects with real-world case companies; Interaction with trainee analysts
Feedback examples: Statistics are performing slowly; Multiple requests for walls for multiple users; User interface issues; Statistics needed to be developed that suited the case company's industry sector and social media use
Contribution to ensemble artefact: Pre-processing of fetched data built; Authorization system developed; User interface improved; Custom metric module developed

Contribution to science: Empirically informed modifications to the descriptive model of social data

4 Significance to Research

SODATO incorporates innovative features that provide value to researchers. The development of the tool is theoretically grounded [6] and follows industry standards in software engineering. SODATO addresses commercial tools' lack of attention to data fetch procedures by implementing systematic data collection procedures, logging, and error recovery options. Because Facebook's data structure is more complex than Twitter's, there has been far less research using Facebook data, and SODATO enables researchers to fill this knowledge gap by provisioning Facebook social data. As far as we know, no social media analytics tool in the literature has been designed and developed using design science principles. Nor could we find any generic method that could be applied uniformly to build other applications in the same domain of knowledge.
Hence we believe that this artefact is a contribution to information systems in general and design science research in particular. At the operational level, SODATO provides researchers with features that are, as yet, unique:

- SODATO can fetch and store historic data right from the start of Facebook time (we are yet to find this in a commercial tool)
- SODATO provides a very high level of transparency in data fetching and calculates data provenance
- Researchers can fetch data using the tool, export large data sets, and analyze them using modelling, statistical and/or coding tools of their choice, such as Microsoft Excel, IBM SPSS, R, MATLAB, etc.
- Domain-specific features for projects in fields such as Political Science, Marketing, Finance, and Sustainability

5 Significance to Practice

SODATO provides state-of-the-art functionality for descriptive, predictive, and prescriptive analytics of big social data for organizations. It differs from existing commercial tools in that it can be used as a strategic tool for fetching the complete online social data record of an organization on Facebook. Different insights can be generated from the different analysis methods provided by the tool, such as sentiment analysis, keyword analysis, actor attribute analysis, content performance analysis, social influencer analysis and integrative analytics with in-house data from web analytics, ERP and CRM systems.

6 Evaluation

As mentioned earlier, we adopted the ADR model [7] for design and development. The current state of the tool is informed by iterations over three years in which trainee analysts, researchers and practitioners have been evaluating the tool (see Table 1 for the first BIE cycle).

References

1. Chen, H., Chiang, R.H., Storey, V.C.: Business intelligence and analytics: from big data to big impact. MIS Quarterly 36, 1165–1188 (2012)
2. Hussain, A., Vatrapu, R.: SOGATO: A Social Graph Analytics Tool. In: The 12th European Conference on Computer Supported Cooperative Work 2011 (2011)
3. Kaplan, A.M., Haenlein, M.: Users of the world, unite! The challenges and opportunities of Social Media. Business Horizons 53, 59–68 (2010)
4. Kietzmann, J.H., Hermkens, K., McCarthy, I.P., Silvestre, B.S.: Social media? Get serious! Understanding the functional building blocks of social media. Business Horizons 54, 241–251 (2011)
5. Lovett, J.: Social Media Metrics Secrets. Wiley (2011)
6. Mukkamala, R., Hussain, A., Vatrapu, R.: Towards a Formal Model of Social Data. IT University Technical Report Series TR-2013-169 (2013), https://pure.itu.dk/ws/files/54477234/itu_tr_54472013_54477169.pdf
7. Sein, M., Henfridsson, O., Purao, S., Rossi, M., Lindgren, R.: Action design research. MIS Quarterly 35, 37–56 (2011)
8. Vatrapu, R.: Understanding Social Business. In: Akhilesh, K.B. (ed.) Emerging Dimensions of Technology Management, pp. 147–158. Springer, Heidelberg (2013)
9. Wang, F.-Y., Carley, K.M., Zeng, D., Mao, W.: Social computing: From social informatics to social intelligence. IEEE Intelligent Systems 22, 79–83 (2007)
10. Zeng, D., Chen, H., Lusch, R., Li, S.-H.: Social media analytics and intelligence. IEEE Intelligent Systems 25, 13–16 (2010)