The paper aims to explore the value generated by a specific configuration of a smart city's infrastructure by comparing a silos configuration with a crowd configuration at the data storage and processing level.
A system dynamics simulation is adopted to determine and compare the value created by the two configurations of the smart city's infrastructure. The simulation outlines the flow of data and the positive and negative feedback loops that reinforce and hinder smart city value generation.
The results demonstrate the substantial impact of data availability for app developers when a crowdsourcing configuration is adopted. Furthermore, the results reveal the value generation potential of a crowdsourcing smart city platform configuration compared to a silos architecture.
The authors have proposed a comparison between two alternative smart city digital platform configurations. The paper seeks to test the magnitude of the pros and cons of a crowdsourcing approach in setting up a smart city digital platform. The paper provides new guidelines for improving the data management of smart cities.
1. Introduction
Smart cities have been one of the most researched topics in management studies during the last decade (Anttila and Jussila, 2018; Caragliu and Del Bo, 2019; Ciasullo et al., 2020). The OECD (2019) defines a smart city as “initiatives or approaches that effectively leverage digitalization to boost citizen well-being and deliver more efficient, sustainable and inclusive urban services and environments as part of a collaborative, multi-stakeholder process”. The concept had already emerged years earlier as a way to address two pressing issues of the twentieth century, pollution and urbanization, simultaneously and with the support of information and communication technology (Cocchia, 2014; Essuman-Quainoo et al., 2019; Nilssen, 2019). Nevertheless, an overlap of terms and meanings has led scholars to revisit the smart city topic in search of a better understanding of the phenomenon. Indeed, the term “smart city” has recurred together with “digital city”, “wired city”, “knowledge city” and “green city”, linking technological and informational transformations with economic, political and socio-cultural change (Hollands, 2008; Lytras and Visvizi, 2018).
A broad body of literature has focused on the social, economic and environmental aims of the smart city to explain how different participants and groups of stakeholders create value for citizens (e.g. Zygiaris, 2013; Nilssen, 2019; Caragliu and Del Bo, 2019; Ciasullo et al., 2020). For instance, in Italy, the city of Trento is regarded as a smart community that manages the different resources, value propositions and co-creation practices arising from actors' engagement by adopting an ecosystems approach (Ciasullo et al., 2020). Other scholars focus on the physical and technological infrastructure of cities, describing in particular the key role of ICT (Al-Hader et al., 2009; Serrano, 2018) or the role of intelligent technologies in improving the quality of services and information supplied to citizens (Gutiérrez et al., 2013; Sun et al., 2016; Szum, 2021). Digital infrastructures are a fundamental element in providing smart IT solutions to citizens and supporting companies in designing better customer experiences (Caporuscio et al., 2021a, b; Krishnan et al., 2020; Kumar et al., 2020; Szum, 2021). However, few authors combine the strand of literature on the value creation process with that on technological infrastructures.
This paper addresses this gap in the literature as follows. Following Kumar et al. (2016) and Ciasullo et al. (2020), this study treats the analysis of the smart city's digital infrastructure as representative of the interconnections among smart city stakeholders and of the way stakeholders create value. More precisely, the study narrows the research lens to the digital infrastructure that handles citizen data and proposes a crowdsourcing perspective to explore the value generated by a crowd configuration of the smart city's infrastructure at the data storage and processing level.
The authors draw on the literature on crowdsourcing, which has become an emerging data collection paradigm for smart city implementation (Huang et al., 2016; Breetzke and Flowerday, 2016; Staletić et al., 2020). However, to the best of the authors' knowledge, no studies apply the crowdsourcing paradigm to data management, especially to explore a smart city's digital infrastructure.
Previous research analyzes the advantages of crowdsourcing in gathering data for managing anomalies in cities (e.g. noise, illegal use of public facilities, urban infrastructure functions) (Huang et al., 2016), or in involving citizens to capture new ideas (Schuurman et al., 2012). Instead, this paper clarifies how a crowdsourcing approach influences smart city value by enabling cross-data management with a specific configuration of the smart city's infrastructure. Notably, this study addresses the following research question: “In which way may a crowd-based configuration of a smart city's infrastructure generate value for a smart city?” The study adopts a system dynamics simulation (Luna-Reyes and Andersen, 2003), simulating the flow of data and the positive and negative feedback loops that reinforce and hinder smart city value generation. This kind of modeling permits handling a higher degree of causality among many heterogeneous variables within complex systems such as a smart city's digital infrastructure. More specifically, system dynamics modeling makes it possible to handle the complex circular feedback among data gathering, platform data management and value creation for citizens (Black, 2013). Most importantly, the system dynamics approach makes it possible to observe how the value generation process varies when a crowdsourcing perspective configures the smart city's infrastructure and the data become increasingly cross-sectional.
Therefore, to show the potential of the crowdsourcing approach, the paper contributes to the management literature through a simulated comparison between two different configurations of the smart city: a silos configuration and a crowd configuration. The novelty of this comparison lies in indicating which infrastructural design of a smart city is more suitable for creating value for smart city users and citizens. Indeed, the results demonstrate that, in the long run, the positive effects of the crowdsourcing paradigm outweigh those of the silos infrastructure configuration. In this vein, the crowd-based design improves the data-sharing mechanisms. It offers advantages in reducing providers' costs and increasing the smart city's adoption rate. This paper also provides implications for practitioners and policy-makers. Indeed, cross-data management and processing may require additional and specific provider capabilities, increasing R&D costs for the smart city providers. Thus, the crowdsourcing approach seems to be an effective means for competent city providers to offer integrated services for citizens.
The paper is organized as follows. Section 2 presents the theoretical background on the smart city's digital infrastructure and on the crowdsourcing approach within and beyond the smart city domain. Section 3 introduces the methodology adopted, explaining the system dynamics modeling and the experimental setup used in our simulation. Section 4 illustrates the findings, and Section 5 presents the discussion. Finally, Section 6 presents conclusions emphasizing the implications and limitations of the research.
2. Theoretical background
2.1 Smart city's digital infrastructure
The smart city concept has typically been associated with an ecosystem where technology is embedded everywhere and represents an integral aspect of the functioning of smart city dynamics (Aguilar et al., 2017; Carvalho et al., 2014; Ciasullo et al., 2020). As a matter of fact, the technological infrastructure of a smart city improves services offered by the city (such as traffic, water, sewage, energy and commerce), exploiting the interconnected information that deployed devices provide. Several smart city definitions are mainly based on technological or infrastructural elements that characterize cities. Lee et al. (2008) define a smart city in terms of the convergence of IT services within an urban space. Batty et al. (2012) more precisely describe a smart city as a city in which ICT is merged with classic infrastructures, coordinated and integrated using new digital technologies. In analyzing the technological infrastructure, scholars identify different levels of technological architecture through which the smart city is realized (Cocchia, 2014; Essuman-Quainoo et al., 2019; Nilssen, 2019).
Commonly, the smart city's infrastructure is linked to the type and the mix of technology deployed. Generally, the levels analyzed are four and correspond to the following basic stages: data collection, data storage, data management and processing, and information deployment. Some scholars focus on the first level, exploring the technologies and configurations most suitable for collecting different data types. Other scholars focus on data storage and management, investigating which configuration is most effective in avoiding data dispersion and using data appropriately (Krishnan et al., 2020; Kumar et al., 2020; Szum, 2021). Another group of researchers focuses on the information products that directly impact the level of services provided for citizens to achieve urban innovation (Caragliu and Del Bo, 2019).
Gutiérrez et al. (2013) focus on the first stage, data collection. They are interested in capturing and transmitting a wide range of data types (e.g. image, audio and location) to improve smart city platform usability. The authors present a smart city architecture adapted to implement and test Internet of Things (IoT) and augmented reality services. They identify three smart city infrastructure levels: IoT node, gateway (GW) and server.
The IoT node tier embraces most devices deployed in the smart city physical infrastructure. It comprises diverse heterogeneous devices, including miscellaneous sensor platforms, tailor-made devices for specific services, and Radio Frequency Identification (RFID) and Near Field Communication (NFC) tags. These devices are typically resource-constrained and host a range of sensors and, in some cases, actuators.
The GW tier links the IoT devices on the edges of the capillary network to the core network infrastructure. IoT nodes are grouped in clusters, each of which depends on a GW device. This device locally gathers and processes the information retrieved by the IoT devices within its cluster. It also manages them (transmission/reception of commands), thus scaling and easing the management of the whole network. The GW tier devices are typically more powerful than IoT nodes in memory and processing capabilities, providing faster and more robust communication interfaces.
The server tier provides more powerful computing platforms with high availability and directly connects to the core network. The servers are used to host IoT data repositories and application servers. Server tier devices receive data from all GW tier nodes.
Al-Hader et al. (2009) build a model for operating the infrastructure frameworks of a smart city to manage energy consumption. The model comprises four components: smart infrastructure, smart database, smart management system and smart interface. The smart infrastructure is a group of devices and operational sensors that collect data and information. The smart database concerns the resources that store data and information reflecting the existing/proposed infrastructure networks. The smart management systems are responsible for processing data. The smart interface comprises the dashboards, operational platforms or web services that deploy information. Smart databases and smart management systems represent a crucial part of the digital infrastructure. They represent the so-called system administration that manages server applications, database servers and communication servers. Al-Hader et al. (2009) propose centralized operational platforms that provide a single management system for collective processing and management across multiple sub-systems, applications and controllers. The authors stress the need to integrate data with enterprise data-warehouse management solutions to provide a unified city solution.
Serrano (2018) proposes a broad overview of smart city infrastructure configurations, considering the effect of modern digital technologies such as cloud, blockchain, big data analysis and AI. Following Sun et al. (2016), he affirms that digital advances have changed the concept of the smart city. Serrano states that the smart city has become a more connected community based on the IoT, crowdsensing and cyber-physical cloud computing, which provide a comprehensive network of connected devices. In describing the smart city architecture and the impact of digitalization, Serrano (2018) traces the shift from a silos architecture to a shared server architecture and identifies four levels of the smart city's infrastructure: sensor, network, server and workstation.
In the silo approach, each digital system is independent and dedicated to a function with its own communications infrastructure, server and workstation. In the shared network, enabled by the Internet Protocol (IP) and Local Area Networks (LANs), information is transmitted over shared switches and routers, where each digital system has an associated Virtual LAN.
In the shared architecture, workstations are combined into a single management desktop using system integrator software that merges the data feeds from the different systems, showing the user a single Graphical User Interface (GUI).
Finally, the shared server consists of CPU or memory shared in virtual private or public cloud applications hosted in data centers. Server virtualization eliminates the need to deploy dedicated servers physically installed in the city. Shared servers can be privately hosted within the smart city's infrastructure in dedicated rooms or remotely installed in data centers for the smart city. The benefits of shared servers are: reduced operational cost; reduced capital expenditure; server usage optimized according to user demand; and high levels of integration and systems interoperability, providing an overall view of several industrial sectors in one system platform.
2.2 The crowdsourcing approach
Crowdsourcing is a form of outsourcing directed to a large set of anonymous individuals. It leads firms to collaborate with heterogeneous individuals (external to organizations) to support innovation and problem-solving. Acar and van den Ende (2016) highlight that crowdsourcing is open not only to experts from within a problem domain but also to outsiders, such as scientists from other domains or hobbyists, who may contribute fresh ideas and perspectives. The concept arose with Web 2.0, when improved Internet capabilities connected net users bidirectionally and facilitated the rise of crowdsourcing platforms (Behl et al., 2021a, b).
The phenomenon has been theorized by many scholars belonging to different research fields, from information systems to management. Information systems scholars (Doan et al., 2011; Blohm et al., 2018) focus on the Web processes and technologies that enable crowd contribution, while management scholars (Zwass, 2010; Tian et al., 2021) deal with a broad spectrum of themes such as governance, the nature of the tasks outsourced (Geiger et al., 2011), the level of collaboration (Bogers et al., 2017), the motivation for participating in crowdsourcing challenges (Martinez, 2017; Acar, 2019; Sharma et al., 2021) and so on. However, crowdsourcing remains an umbrella term covering a set of practices sometimes related to complementary phenomena such as open innovation, user innovation and open source (Schenk and Guittard, 2011; Camacho et al., 2019). Howe (2009) defines crowdsourcing as a practice that depends on some contribution from the crowd, although the nature of those contributions can differ tremendously. To delineate the boundaries of crowdsourcing, Schenk and Guittard (2011) define the phenomenon by referring to the categories of actors involved: 1) the individuals forming the crowd, who are the providers of ideas or data; 2) the companies directly benefiting from the crowd input, that is, the client companies; 3) an intermediation platform that builds a link between providers and client companies.
On the other hand, Geiger et al. (2011) depict a prototypical crowdsourcing approach applicable to all crowdsourcing processes. Geiger et al.'s prototype aggregates one or several kinds of contributions from the crowd (the crowdsourcing process), starting from fixed organizational goals. However, following Schenk and Guittard (2011), the authors distinguish two ways of aggregating crowd contributions: the integrative and the selective.
The integrative crowdsourcing process offers access to multiple and complementary information and data. It is named integrative since the aim is to pool complementary input from the crowd. Individual elements have very little value per se, but the amount of complementary input brings value to the firm. Unless they fail to meet specific quality requirements, all contributions are reused for the outcome. The integrative process is relevant when the client firm seeks to build data or information bases. Schenk and Guittard (2011) consider it a form of content crowdsourcing. That focus on content has inspired recent definitions of crowdsourcing as a form of online content creation (Behl et al., 2020, 2021a, b). While gathering information or data at the level of an individual can be unproblematic, building a database generally requires significant amounts of resources. Therefore, the rationale of integrative crowdsourcing lies in the cost of building large data or information bases. Since individuals within the crowd are heterogeneous, crowdsourcing enables the client firm to gather varied content. In that case, a relevant role is played by the engagement strategies adopted by the crowdsourcing platform designers (Behl et al., 2021a, b). However, a firm seeking to implement integrative crowdsourcing should be aware of integration issues. Data or information stemming from various origins might be incompatible or redundant if no precaution is taken. Precautions include defining a data format and the sound selection of data sources.
On the other hand, the selective process will be relevant to face specific needs; it allows a crowdsourcing organization “to choose an input from among a set of options that the crowd has provided” (Schenk and Guittard, 2011). For instance, a firm facing an R&D problem may rely on the crowd's competencies to solve the problem. When there is no identified in-house solution to a given problem, selective crowdsourcing may be a way to find candidate solutions. However, selective crowdsourcing processes follow a competitive approach to achieve the outcome since individual contributions are compared and the “best” one(s) is selected.
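The distinction between the two aggregation modes can be illustrated with a minimal sketch. It is not drawn from Schenk and Guittard (2011): the `Contribution` record, the numeric `quality` score, the `min_quality` threshold and both function names are hypothetical, introduced only to contrast pooling all acceptable inputs (integrative) with picking the single best one (selective).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Contribution:
    provider: str   # member of the crowd who supplied the input
    payload: str    # the data item or idea contributed
    quality: float  # hypothetical quality score in [0, 1]


def integrate_contributions(contributions: List[Contribution],
                            min_quality: float = 0.5) -> List[Contribution]:
    """Integrative mode: keep every contribution that meets the quality
    requirement, dropping redundant payloads (first occurrence wins)."""
    seen = set()
    pooled = []
    for c in contributions:
        if c.quality >= min_quality and c.payload not in seen:
            seen.add(c.payload)
            pooled.append(c)
    return pooled


def select_best(contributions: List[Contribution]) -> Contribution:
    """Selective mode: contributions compete and only the best is retained."""
    return max(contributions, key=lambda c: c.quality)


if __name__ == "__main__":
    crowd = [
        Contribution("p1", "noise@park", 0.9),
        Contribution("p2", "noise@park", 0.7),      # redundant payload
        Contribution("p3", "pothole@main", 0.4),    # below the threshold
        Contribution("p4", "traffic@bridge", 0.8),
    ]
    print([c.provider for c in integrate_contributions(crowd)])
    print(select_best(crowd).provider)
```

In the integrative case the value comes from the size and complementarity of the pooled set, which is why the redundancy filter and the quality threshold stand in for the precautions mentioned above.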
2.3 Crowdsourcing for smart city
Crowdsourcing could become an emerging data and idea collection paradigm for smart city applications (Huang et al., 2016; Breetzke and Flowerday, 2016; Staletić et al., 2020). Two emerging streams of literature debate the link between crowdsourcing and the smart city. The first stream focuses on the role of crowdsourcing in improving data collection for managing city anomalies (Huang et al., 2016). These scholars describe a new category of crowdsourcing-based urban anomaly reporting systems developed to enable pervasive and real-time reporting of anomalies in cities (e.g. noise, illegal use of public facilities, urban infrastructure malfunctions). An exciting challenge in these applications is accurately predicting an anomaly in a given city region before it happens. The second stream of literature considers crowdsourcing an opportunity to derive innovative ideas from citizens (Schuurman et al., 2012). However, unlike the existing literature, the authors propose crowdsourcing as an effective means for smart city providers to offer integrated services for citizens. Crowdsourcing practices intervene in data management, particularly integrative crowdsourcing, which gathers and mixes various usable contents from app providers to offer well-integrated services.
3. Method
This study compares the two theoretical frameworks using a system dynamics simulation. The authors selected this simulation method because of its adaptability in managing feedback structures and complex system behavior patterns (Wang et al., 2021).
A system dynamics simulation is suited to managing data flows and the positive and negative feedback loops that reinforce and hinder innovative city value generation. This kind of modeling permits handling a higher degree of causality among many heterogeneous variables within complex systems such as a smart city's digital infrastructure. Indeed, owing to new disruptive technologies, smart cities are increasing the level of interdependence among people and digital infrastructures. Moreover, the system dynamics model permits comparing two different smart infrastructures by means of precisely defined rules governing their functions (Rezchikov et al., 2017). System dynamics simulation supports the understanding of complex systems by focusing on feedback loops that reveal the causes of, and the reasons for, the events observed (Wolstenholme and Coyle, 1983).
To achieve the research purpose, the authors followed three essential steps suggested by the most relevant system dynamics studies (Luna-Reyes and Andersen, 2003; Black, 2013): 1) the conceptualization and formulation of the research hypothesis; 2) the definition of the rules of the system dynamics simulation; 3) the testing of the model and the analysis of the results.
3.1 System dynamics simulation: silos infrastructure configuration
To date, the digital platform architecture of several smart cities is framed by a silos infrastructure paradigm. In other terms, the process of data collection from devices, data collection through servers and data exploitation by app developers is configured as a silos infrastructure categorized according to each industry segment (Caporuscio et al., 2021a, b). This simulation attempts to run the feedback loops at the base of smart city digital platform infrastructures, where citizens' data are drawn by urban devices and flow through silos servers segmented according to the features of developers. For the sake of simplicity, the simulation (Figure 1) is framed by three kinds of infrastructures linked to three types of data (Data A, Data B, Data C) and three generic app developers (App 1, App 2, App 3). The smart city value is the sum of the value generated by each app developer. The system dynamics feedback loops operate at three levels: the impact on the smart city's infrastructure, the level of R&D data cost and the cost tied to data storage. The simulation does not consider data sharing or the capacity of app developers to innovate in a more participatory way. In short, the circular process of data, from extraction to exploitation, is conveyed only within the same silos infrastructure without generating new added value.
The flowchart illustrates interconnected components and labeled data flows within a smart city system. At the top center, the text “Adoption rate” points downward toward a rectangular box labeled “Urban user”. To the left of this box is another rectangle labeled “Citizens”, connected to “Urban user” by a right-pointing arrow. From “Urban user”, a downward arrow leads to a rectangular box labeled “Device”. Below the “Device” box are three vertically aligned groups labeled “Silos Infrastructure A”, “Silos Infrastructure B”, and “Silos Infrastructure C”. Each silo is enclosed within a rectangular boundary. Inside “Silos Infrastructure A”, three stacked boxes read “Data A”, “Server A”, and “App 1”, connected by downward arrows. Inside “Silos Infrastructure B”, the stacked boxes are labeled “Data B”, “Server B”, and “App 2”, also connected by downward arrows. Inside “Silos Infrastructure C”, the stacked boxes are labeled “Data C”, “Server C”, and “App 3”, with downward arrows between them. Curved arrows extend from the “Device” box toward each of the three silo infrastructures. Additional curved arrows connect “App 1”, “App 2”, and “App 3” across silos, converging toward the text “Smart city value” shown below the three infrastructures. On the left side of the flowchart, the label “Quality of smart city’s infrastructure” appears, with a diagonal arrow extending from it toward the label “R and D Data Cost”. From “R and D Data Cost”, arrows point toward “App 1”, “App 2”, and “App 3”. From “Quality of smart city’s infrastructure”, curved arrows extend toward each silo, and it receives an arrow from the label “Smart city value”. On the right side, the label “Storage Data Cost” appears, with curved arrows connecting back toward “App 1”, “App 2”, and “App 3”. A diagonal arrow also extends from “Device” to “Storage data cost”.
Figure 1. A silos infrastructure configuration
3.2 System dynamics simulation: a crowdsourcing configuration
The data management style of the smart city's infrastructure does not directly affect the way value is generated for smart cities. Rather, the platform configuration and the server management roles are responsible for triggering and shaping the feedback loop structure. This loop passes through app developers by modifying their primary input, namely data resources. The server is an intermediate construct that links the technical, strategic and economic domains. As a combination of technologies and markets, a crowdsourcing configuration of servers may add value to the apps' value-generating processes. In other words, adopting a crowdsourcing approach to data management permits generating additional value from new combinations of data types. The conceptual model underlying our system dynamics simulation (Figure 2) concerns the balancing effects caused by a crowdsourcing configuration of the server. In a nutshell, the urban server can supply all app developers with a massive amount of extra data, classified as cross-sectional. This group of data unlocks alternative urban functionalities that span different scopes.
The flowchart shows labeled elements and directional connections related to a smart city system. At the top center, two rectangular boxes labeled “Citizens” and “Urban Users” are connected by a right-pointing arrow. Curved arrows labeled “Adoption rate” extend from the left side toward the central box “Urban users”. Below “Urban Users”, a downward arrow leads to a rectangular box labeled “Device”, which is positioned inside a larger rectangular boundary labeled “Smart city infrastructure”. Within the “Smart city infrastructure” boundary, three rectangular boxes labeled “Data A”, “Data B”, and “Data C” are arranged horizontally below the box “Device”. The downward arrows from “Device” connect to each data box. Downward arrows from these data boxes converge into a central rectangle labeled “Server”. Beneath the server, a label reads “Cross-sectional data (data A B C)”, with arrows extending downward to three horizontally aligned rectangles labeled “App 1”, “App 2”, and “App 3”. Multiple curved arrows connect “App 1”, “App 2”, and “App 3” to each other and to surrounding labels. At the bottom center, the text “Smart city value” appears, with several arrows pointing toward it from the application boxes. On the left side of the flowchart, the labels “Quality of smart city’s infrastructure” and “R and D Data Cost” are shown, each connected by curved arrows to the central infrastructure, application boxes, and the label “Adoption rate”. On the right side, the labels “Network” and “Storage data cost” appear, with curved arrows linking them back to the device, server, and application boxes. All elements are drawn as rectangles connected by blue arrows, forming a dense network of labeled paths within and around the “Smart city infrastructure” boundary.
Figure 2. A crowdsourcing configuration
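As a small worked illustration of why a shared server widens the inputs available to app developers, the sketch below enumerates the data views reachable under each configuration for the three data types used in the simulation. The function names are hypothetical, and equating "cross-sectional data" with every combination of two or more data types is an assumption made here for illustration, not a definition taken from the model.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

DATA_TYPES = ("A", "B", "C")


def silo_views(data_types: Sequence[str]) -> List[Tuple[str, ...]]:
    """Silos configuration: each app reaches only its own silo's data type."""
    return [(t,) for t in data_types]


def crowd_views(data_types: Sequence[str]) -> List[Tuple[str, ...]]:
    """Crowd configuration: the shared server exposes every non-empty
    combination of data types to every app."""
    return [
        combo
        for size in range(1, len(data_types) + 1)
        for combo in combinations(data_types, size)
    ]


if __name__ == "__main__":
    silos = silo_views(DATA_TYPES)
    crowd = crowd_views(DATA_TYPES)
    cross_sectional = [c for c in crowd if len(c) >= 2]
    print(len(silos))            # 3
    print(len(crowd))            # 7
    print(cross_sectional)       # [('A', 'B'), ('A', 'C'), ('B', 'C'), ('A', 'B', 'C')]
```

With three data types, the silos setting offers three single-type views, whereas the shared server adds four cross-sectional combinations on top of them.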
3.3 Experimental setup
The system dynamics model captures these effects by simulating the positive and negative feedback loops that reinforce and hinder smart city value generation. The simulation is run for both the silos infrastructure and the crowdsourcing configuration so that the two scenarios can be compared. For the silos case, the setting is framed according to the following assumptions:
The value-generating process of smart city's digital infrastructure is a positive function of the app value.
For clarity, the authors assume that the smart city has just three kinds of Apps (App 1, App 2, App 3). Such segmentation reflects the several scopes of Smart city Apps.
The value-generating process of the app has a positive correlation with the amount and quality of data.
Within smart cities, the amount of data positively correlates with the number of users joining the urban devices.
For simplicity, the data are supposed to be of three types: data A, B and C.
Data storage and processing are run by a separate infrastructure for each data type.
The data storage cost affects each silo's infrastructure.
Each type of data requires a shorter processing time than crowdsourced data.
The following assumptions drive the simulation of the crowdsourcing setting:
The value-generating process of a smart city's digital infrastructure is a positive function of the app value.
For clarity, the authors assume that the smart city has just three kinds of Apps (App 1, App 2, App 3). Such segmentation reflects the several scopes of Smart city Apps.
The value-generating process of apps positively correlates with the amount and quality of data.
Within smart cities, the amount of data positively correlates with the number of users joining the urban devices.
For the sake of simplicity, the data are supposed to be of three types: data A, B and C.
The effectiveness of data is related to their capacity to be processed in an increasingly cross-sectional way (data combining A, B and C have higher quality than data combining A and B, which in turn have higher quality than data A alone).
A crowd-based configuration of smart city platforms generates cross-sectional data.
Cross-sectional data take longer than traditional data to become available after a request.
The temporal lag for processing cross-sectional data decreases over time.
The time that data take to become available negatively affects the app value-generating function.
The first five points are common to both simulations so that they share homogeneous starting points. Indeed, the size of the smart city is 500,000 citizens in both configurations, and the adoption rate is set to the same pace. The model is developed to replicate only the early stage of smart city platform infrastructure implementation. Additionally, for the sake of simplicity, the simulation focuses mainly on the technological configuration in order to better capture the feedback loop effects, and this limitation carries over to this research. Nevertheless, the model is well suited to studying the focal research question as it (1) captures the data pathway within the smart city's digital infrastructure; (2) tests alternative ways of combining and processing urban data; (3) illustrates the new possibilities of cross-sectional data exploitation by app developers; and (4) handles the causality among direct and retroactive feedback generated by different server configurations. The model also allows new simulations to be implemented in order to design alternative digital smart city infrastructures or to test new intervention policies by municipalities.
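To make the two sets of assumptions more concrete, the following is a minimal stock-and-flow sketch in Python. It is not the authors' model: only the city size (500,000 citizens) and the 100-month horizon come from the text, while the names `Params` and `simulate` and every rate (`adoption_rate`, `data_per_user`, `sharing_gain`, `initial_lag`, `lag_decay`) are hypothetical values chosen for illustration. The sketch tracks a single data stock and adds, for the crowdsourcing case only, a reinforcing data-sharing loop and an availability lag that fades over time.

```python
import math
from dataclasses import dataclass


@dataclass
class Params:
    citizens: int = 500_000      # city size stated in the text
    months: int = 100            # horizon matching the time axis of the figures
    adoption_rate: float = 0.02  # ASSUMED monthly share of remaining citizens who adopt
    data_per_user: float = 1.0   # ASSUMED data units produced per user per month
    sharing_gain: float = 0.05   # ASSUMED strength of the data-sharing reinforcing loop
    initial_lag: float = 0.5     # ASSUMED share of crowd data delayed at month 0
    lag_decay: float = 12.0      # ASSUMED number of months over which the lag fades


def simulate(p: Params, crowdsourcing: bool) -> list[float]:
    """Integrate the data stock month by month with a simple Euler step."""
    users, data, series = 0.0, 0.0, []
    for month in range(p.months):
        # Flow from citizens to urban users (same pace in both configurations).
        users += p.adoption_rate * (p.citizens - users)
        inflow = p.data_per_user * users
        if crowdsourcing:
            # Cross-sectional data arrive with a lag that decreases over time ...
            availability = 1.0 - p.initial_lag * math.exp(-month / p.lag_decay)
            # ... and pooled data feed back into further data availability.
            inflow = availability * inflow + p.sharing_gain * data
        data += inflow
        series.append(data)
    return series


params = Params()
silos = simulate(params, crowdsourcing=False)
crowd = simulate(params, crowdsourcing=True)
print(f"month 1:   silos={silos[0]:.0f}  crowd={crowd[0]:.0f}")
print(f"month 100: silos={silos[-1]:.3g}  crowd={crowd[-1]:.3g}")
```

Under these assumed rates the crowdsourcing stock starts below the silos stock and ends above it, which is the qualitative pattern the assumptions are meant to produce; the magnitudes carry no empirical meaning.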
4. Findings
Figure 3 shows the amount of Data A captured by devices and made available to App A developers. The results demonstrate the strong impact of the crowdsourcing approach in raising the amount of data per developer segment. The simulations have been set with the same number of citizens and devices. However, the positive feedback effect of infrastructure quality is increasingly larger for the crowdsourcing approach than for the silos infrastructure. It is worth pointing out, however, that the amount of data in the crowdsourcing configuration is smaller than in the silos simulation during the first ten months. This trend confirms that the mechanisms underlying crowdsourcing approaches are hard to implement in the short term and become much more effective and transparent in the long run. The shape of the curves provides another interesting piece of evidence. In the silos infrastructure simulation (Figure 3a), the increasing trend is roughly linear, indicating no appreciable variations in the data flow process. By contrast, the crowdsourcing configuration (Figure 3b) shows an increasing, convex curve, which indicates how the reinforcing effect of data sharing affects data availability.
The illustration contains two line graphs arranged vertically, each titled “Data A”. In both graphs, the horizontal axis is labeled “Time (Month)” and ranges from 0 to 100 with an interval of 5, and the vertical axis is labeled “Amount of Data”. In the top graph (a), the vertical axis ranges from 0 to 180000 with an interval of 20000. A single blue line labeled “Current” starts near the origin at month 0 and extends linearly upward to approximately 163000 at month 100. In the bottom graph (b), the vertical axis ranges from 0 to 5 T with an interval of 0.5 T (500 B). Two lines are present: a blue line labeled “base” and a red line labeled “Current”, as shown in the legend at the bottom. The blue line rises from near 0 at month 0, increases in a curved pattern, and ends around 4.9 T at month 100, while the red line remains close to the baseline across the time range. Note: All numerical data values are approximated.
Data flow comparison
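The distinction drawn above between a roughly linear curve and a convex one can be checked numerically with discrete second differences. The sketch below is illustrative only: the function name `curve_shape`, the tolerance and the two synthetic series are assumptions, not outputs of the authors' simulation.

```python
def curve_shape(series: list[float], tol: float = 1e-9) -> str:
    """Classify a time series as linear, convex, concave or mixed.

    Uses the discrete second difference y[t+1] - 2*y[t] + y[t-1]:
    zero everywhere means linear, non-negative means convex,
    non-positive means concave.
    """
    second = [
        series[i + 1] - 2 * series[i] + series[i - 1]
        for i in range(1, len(series) - 1)
    ]
    # Scale the tolerance to the size of the data to absorb rounding noise.
    scale = max(abs(x) for x in series) or 1.0
    if all(abs(d) <= tol * scale for d in second):
        return "linear"
    if all(d >= -tol * scale for d in second):
        return "convex"
    if all(d <= tol * scale for d in second):
        return "concave"
    return "mixed"


# Synthetic stand-ins: a constant monthly inflow versus compounding growth.
linear_like = [1630 * t for t in range(101)]
convex_like = [1.1 ** t for t in range(101)]
print(curve_shape(linear_like), curve_shape(convex_like))
```

A convex result is what a reinforcing loop would be expected to produce, since each month's increment is larger than the previous one.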
Figures 4a and 4b show the evolution of the smart city's infrastructure value. The authors compared a single silos infrastructure against a crowdsourcing infrastructure to understand whether significant differences in the curve shape exist. The evolution trends of value generation are broadly similar. In the crowdsourcing simulation, the increase is more abrupt in the phase when crowdsourcing is implemented. The relevant difference concerns the value generated. Indeed, the sum of the three single silos infrastructures cannot reach the value generated by the crowdsourcing infrastructure.
The illustration contains two vertically stacked line graphs labeled “(a)” and “(b)”. In both graphs, the horizontal axis is labeled “Time (Month)” and spans from 0 to 100 with an interval of 5. In graph “(a)”, the vertical axis is labeled “Silos Infrastructure A value” and ranges from 0 to 2 with an interval of 0.2. A single smooth blue line starts near 0.85 at month 0, rises steeply during the early months, and then gradually flattens as time increases, passing through the value of 1.8 around 37 months and approaching a value close to 2 by month 100. A small legend below the plot shows a blue line labeled “Current”. In graph “(b)”, the vertical axis is labeled “Smart city infrastructure value” and ranges from 0 to 4.5 with an interval of 0.5. A single smooth blue line begins near a value of about 2 at month 0, increases sharply within the first 2 months and reaches just above the value of 4, and then slowly levels off, approaching a value just above 4.4 by month 100. A legend beneath this graph shows a blue line labeled “base”. Note: All numerical data values are approximated.
Smart city's infrastructure value
Figures 5a and 5b highlight the large difference between the two infrastructures in data processing. The silos servers can handle a large amount of data, whereas the crowdsourcing smart city platform setting can scale up abruptly, reaching exponentially greater data processing capability. Furthermore, the positive convexity of the curve confirms a positive marginal trend for the crowdsourcing server process.
The illustration contains two vertically stacked line graphs labeled “(a)” and “(b)”. In both graphs, the horizontal axis is labeled “Time (Month)” and spans from 0 to 100 with an interval of 5. In panel “(a)”, the vertical axis is labeled “Data A server volume” and ranges from 0 to 1200 with an interval of 200. A single blue line begins at a low value near the origin and increases steadily in a linear pattern across the entire time range, reaching a value slightly above 1100 at month 100. A legend beneath the plot shows a blue line labeled “Current”. In panel “(b)”, the vertical axis is labeled “Data server volume” and uses a logarithmic scale from 2e+37 to 3.4e+38 with an increment of 2e+37. A single blue line starts extremely close to zero and remains low until about month 35, then rises sharply after around 60 months, forming a steep upward curve toward the upper end of the vertical scale and ending just above 3.2e+38 at month 100. A legend beneath this plot shows a blue line labeled “base”. Note: All numerical data values are approximated.
Comparison in the data processing
Figures 6a and 6b demonstrate the potential in value generation of a crowdsourcing smart city platform configuration. Indeed, the amount of value generated by the second approach is considerably larger than that of the first. Crowdsourced data can support a vast spectrum of new services for citizens by enabling innovation in app development. An important recurring point is the time horizon. The silos infrastructure configuration has a consolidated process; for this reason, it can generate higher value than crowdsourcing approaches within the first three years. After that period, the urban ecosystem of app developers may unleash the potential triggered by the exploitation of crowdsourced data. The effect of cross-sectional data on app value-generating processes is positive. The higher the number of users joining devices, the higher the amount of cross-sectional data, and the higher the app value generation. However, a crowdsourcing platform configuration needs more time to process cross-sectional data, and this temporal lag is responsible for the initially slow increase of the app value curve. This simulation highlights that a crowdsourcing configuration needs a large base of users to realize its potential.
The illustration contains two vertically stacked line graphs titled “Smart city value”, labeled “(a)” for the top panel and “(b)” for the bottom panel. In both panels, the horizontal axis is labeled “Time (Month)” and spans from 0 to 100 with an interval of 5. In panel “(a)”, the vertical axis is labeled “value for citizens” and ranges from 0 to 1800 with an interval of 200. A single blue line begins from the origin and increases smoothly across the full time span, forming a gently accelerating curve that reaches a value close to 1600 by month 100. A legend below the graph shows a blue line labeled “Current”. In panel “(b)”, the vertical axis is labeled “Value for citizens” and uses a logarithmic scale from 0 to 1.4e+36 with an interval of 2e+35. The blue line remains very close to zero until about month 50 and then rises sharply, producing a steep upward curve that ends just above 1.3e+36 at month 100. A legend beneath this plot shows a blue line labeled “base”. Note: All numerical data values are approximated.
Comparison in value generation
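The time-horizon argument above (silos ahead at first, crowdsourcing ahead later) amounts to locating a crossover month between two value curves. A minimal sketch follows, assuming two monthly series of equal meaning; the function name `crossover_month` and the toy series are hypothetical and are not taken from the authors' simulation.

```python
from typing import Optional, Sequence


def crossover_month(silos: Sequence[float], crowd: Sequence[float]) -> Optional[int]:
    """Return the first index from which crowd stays above silos until the end.

    Returns None if the crowdsourcing series is not above the silos series
    at the end of the horizon.
    """
    month: Optional[int] = None
    for t in range(min(len(silos), len(crowd))):
        if crowd[t] > silos[t]:
            if month is None:
                month = t  # candidate start of a lasting lead
        else:
            month = None   # lead was lost, so discard the candidate
    return month


# Toy series: flat silos value versus a slowly starting crowd value.
silos_value = [5.0] * 10
crowd_value = [float(t) for t in range(10)]
print(crossover_month(silos_value, crowd_value))
```

Requiring the lead to persist until the end of the horizon avoids reporting a temporary early advantage as the crossover point.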
Figures 7a and 7b illustrate the evolution of R&D data cost for each App developer over time. R&D data cost is the expenditure that an App developer has to bear for processing, elaborating and exploiting data and converting them into new added value through a new App. This system dynamics simulation shows how the feedback loop tied to the availability of new cross-sectional data generates a considerable opportunity for App developers and dramatically affects the cost. The R&D data cost curves have the same shape and trend but differ in magnitude. The generation of additional cross-sectional data requires a new, larger investment for exploitation. This trend is linked to the initially slow increase of new additional value when a crowdsourcing configuration is implemented. In fact, in the early stage of platform configuration, the app developers do not yet activate economies of scale, and the impact of R&D data cost is quite relevant. Gradually, the R&D data costs are covered by the full exploitation of cross-sectional data.
The illustration shows two vertically stacked line graphs, with panel (a) at the top and panel (b) at the bottom, each titled “R and D Data Cost”. In both graphs, the horizontal axis is labeled “Time (Month)” and spans from 0 to 100 with an interval of 5, and the vertical axis is labeled “R&D Data Cost”. In panel (a), the vertical axis ranges from 0 to 90 with an interval of 10. A blue curve labeled “base” begins from a cost of around 19 at month 0, rises sharply within the first 5 months to just above 70, then continues upward with a progressively flatter curvature, reaching just under 89 near month 100. A red curve labeled “Current” appears as a thin line close to the bottom of the chart, running nearly flat along the horizontal axis across the full time range at a cost of 3. In panel (b), the vertical axis ranges from 0 to 2 with an interval of 0.2. A single blue curve labeled “Current” starts around 0.85 at month 0, increases steeply at first, then continues upward with a smooth, concave-down shape, approaching a value close to 2 near month 100. Note: All numerical data values are approximated.
Comparison in R&D data cost
5. Discussion
The urgent issue of city overcrowding is one of the challenges of the next two decades. City governments are concerned about urban service management and quality of life (Prandi et al., 2017; Masik et al., 2021). Designing a smart city has become increasingly relevant to facing those issues (Anttila and Jussila, 2018; Caragliu and Del Bo, 2019). In this regard, considering that a smart city's physical and digital infrastructure may represent how stakeholders create and distribute value for citizens (Kumar et al., 2016; Ciasullo et al., 2020), the authors have proposed an integrated crowdsourcing approach at the data storage and processing level.
Data management is a key factor for successfully configuring a smart city's digital infrastructure (Krishnan et al., 2020; Kumar et al., 2020; Szum, 2021). Indeed, data collected by smart city platforms represent an opportunity to develop innovation strictly linked to citizens' needs (Nilssen, 2019; Caragliu and Del Bo, 2019). However, these opportunities depend on the smart city's infrastructure configuration.
This work shows that a crowdsourcing approach, unlike a silos approach, allows app providers to use a single server and draw on cross-data originating from different sources and devices, developing integrated and value-added services for citizens. Moreover, the paper provides guidelines to improve the data management of smart cities. Crowdsourcing does not solely produce operational cost advantages; it can also: 1) offer an effective configuration of the smart city's infrastructure; 2) improve the quality of existing services; 3) stimulate new integrated service development; 4) enable urban experimentation for technology and service providers; 5) increase the smart city value.
This paper seeks to provide a long-run vision to help public administrations set their digital transformation strategy and evaluate the advantages of a crowdsourcing platform configuration. In this sense, the results provide evidence of the proliferation of new value when cross-sectional data sharing is activated at the urban level. A crowdsourcing configuration might stimulate the emergence of an urban App developer ecosystem thanks to a large amount of new data. Following this line, the findings attest that smart cities are environments dominated by complexity and heterogeneity in terms of services, users, actors and infrastructures. However, a crowd approach may enable network connectivity that affects the search for innovative solutions in a real-time problem-solving context (Jiao et al., 2021). Consequently, smart cities may be considered living labs where app developers are stimulated by different needs and problems to address, such as pollution, transport and the quality of citizens' life (Caporuscio et al., 2021a, b). This simulation enables us to compare the silos infrastructure configuration with the crowdsourcing configuration.
The findings demonstrate that, in the long run, the positive effects of the crowdsourcing paradigm exceed those of the silos infrastructure configuration. Furthermore, this study shows the impact of learning on cross-sectional data provision. A crowdsourcing platform configuration makes it possible to improve data sharing mechanisms. This simulation enables comparison between the crowdsourcing platform configuration and the traditional one, which is characterized by a unidirectional sharing of data between a certain type of user and the same type of app developer.
The cross-sectional data curve shows another piece of evidence: its trend is positive, but its increments diminish asymptotically. This tendency confirms the role of the crowdsourcing configuration from the very beginning of its adoption. However, the marginal contribution becomes less and less significant for app developers. Accordingly, the app value generation processes for each type of app (see App 1, App 2, App 3) display a positive marginal trend.
The paper's novelty concerns the assessment of crowdsourcing data not only at the level of the gathering phase but also during the stage of data management. Smart city digital infrastructures are characterized by a lack of understanding of platform configurations, which still need to be studied. This issue remains unclear in the line of literature concerning smart cities' digital infrastructure. The paper aims to close that gap by adopting a well-established methodology for dealing with complex systems and by testing future scenarios under two alternative architectural configurations of a smart city platform.
6. Implications, limitations and conclusions
This paper discusses a smart city digital architecture that enables the creation of a crowd-based data management platform to improve city service efficiency (Huang et al., 2016; Breetzke and Flowerday, 2016; Staletić et al., 2020). Smart cities and digital platforms are two of the most relevant lines of management studies in recent years (Kumar et al., 2016; Ciasullo et al., 2020). The literature has increasingly debated the capacity of smart communities to reconfigure the urban environment to tackle city challenges (Nilssen, 2019; Caragliu and Del Bo, 2019). The study seeks to contribute to increasing the knowledge about different ways of setting up a digital platform within smart cities, combining the strand of literature on the value creation process with that on technological infrastructures, particularly by applying the crowdsourcing paradigm to data management. In this vein, this paper offers theoretical implications by simulating the positive and negative effects of a crowdsourcing configuration of smart city digital platforms. This study has adopted this perspective to estimate the impact and to compare quantitatively the most significant differences with the traditional digital smart city platform configuration, otherwise called the silos infrastructure configuration. Thus, this study attempts to fill the literature gap about which infrastructural configuration of a smart city is more suitable to create value for smart city users and citizens. No previous studies provided an overall overview from a quantitative perspective. Furthermore, the extant literature did not address the most relevant consequences of moving from a silos infrastructure to a crowdsourcing configuration.
Regarding the practical implications, the results show that a single server offers advantages in reducing costs for providers and increasing the adoption rate for the smart city. However, cross-data management and processing may require additional and specific capabilities on the part of providers. Furthermore, App development is expected to be a relevant industry in the future, and smart cities are called to act as living laboratories. Data might be regarded as the raw material for App developers, who need increasingly sophisticated and customized data. This is even more evident in contexts characterized by great complexity and heterogeneity, because data refinement is much more useful there for providing new solutions, business models and value generation processes. Under this scenario, municipal governments should implement new digital technologies.
On the other hand, although the paper highlights the relevance of the digital component, the study looks at hardware and software tools in terms of the value they provide for citizens. In fact, according to Gutiérrez et al. (2013), it is important to avoid focusing only on the technology and overlooking the engagement of society in smart city analysis. The authors state that users' privacy and transparency in the use of users' data should consistently be recognized. Citizens should be aware of the type of information they are sharing and the contribution they are giving to improve the value of the city, thereby increasing R&D costs. Therefore, the authors can argue that a critical mass of users is helpful for running the platform efficiently. In other words, the crowdsourcing configuration may unleash all its potential when the digital infrastructure exceeds a specific user threshold. This critical threshold is useful for practitioners, especially in municipalities that invest substantial public financial resources to improve urban life quality and sustain the city's economic growth.
This pioneering study focuses on an overall overview, treating the theme from a systemic perspective. This lens of analysis makes it possible to capture several feedback loops, but at the cost of specificity on certain aspects, such as the technological endowment of the smart city or who could be considered an App 1 developer in relation to Data A. Although these limitations affect this research, they are inherent in all simulations. The system dynamics setting needs a certain degree of simplification through several assumptions. The paper opens the way to a new line of study dedicated to zooming in on each part of this general simulation. Future researchers may carry out a quantitative investigation to verify the magnitude of the trends that the paper has simulated.
Furthermore, this analysis set the configuration on a city size of 500,000 citizens; it might be relevant to investigate other city sizes or to include other city dimensions such as infrastructural endowment. Fortunately, the amount of data and the capacity to capture them are dramatically increasing, and the number of devices, sensors and other artificial intelligence technologies is growing exponentially. A concrete problem is the smart city platform configuration. In other words, the different segments of city life, such as transport, healthcare or waste management, can collect large amounts of data for developing and improving their apps. Still, they are not adopting a crowdsourcing configuration of smart city platforms. Cross-sectional data should stimulate the smart city developer ecosystem to provide new solutions for urban life problems. However, implementing a smart digital platform with a crowdsourcing configuration might generate several negative and positive feedback loops.
