The need to digitise is recognised across our community globally, yet resources, expertise and institutions rarely come together as readily as that shared awareness would suggest. What is needed is a strategic view towards the long-term goal of cultivating and digitally upskilling the younger generation, building a community, and creating awareness through digital activities that benefit cultural heritage.
The work involves distributing tasks between stakeholders and local volunteers. It uses close-range photogrammetry to reconstruct the entire heritage site in 3D, and outlines achievable digitisation activities in the crowdsourced, close-range photogrammetry of the 19th-century Cheah Kongsi clan temple in George Town, a UNESCO World Heritage Site in Penang, Malaysia.
The research explores whether loosely distributing photogrammetry work, in a way that partially simulates an unorganised crowdsourcing activity, can generate complete models of a site that meet the criteria set by the needs of the clan temple. The data acquired provided a complete visual record of the site, but the 3D models generated through the distributed task revealed gaps that needed further measurements.
The key lessons learned in this activity are transferable. Furthermore, the involvement of volunteers can raise awareness of identity, ownership, cultural understanding and care for local cultural heritage.
The semi-formal activities showed that set goals can be achieved through crowdsourcing, that the new generation can be taught to care for their heritage, and that the transfer of digital skills is made possible through such activities. The mass crowdsourcing activity is the first of its kind that attempts to completely digitise a cultural heritage site in 3D via distributed activities.
1. Introduction
Crowdsourcing arose as a response to the need to solve significant problems that are deemed too difficult for an individual or a single agency. The difficulty may lie in an overall task that cannot easily be replicated or automated by algorithms or machine-based systems, especially one that requires physical human effort, or human senses and mental faculties and their ability to identify, differentiate, evaluate or make judgements. Solutions to such problems may involve simple tasks distributed in parallel across a group of participants, with some form of reward in the pleasure of the work, the experience or value gained, or incentives offered in exchange for the collective effort.
The rise of crowdsourcing is without doubt attributable to the connectivity of the Internet and the accessibility of digital devices that approach the quality of elite equipment at a fraction of the cost, providing greater access to economic profits (see, for example, Howe, 2006). Since the first recorded crowdsourcing activity in 1714, major crowdsourcing activities have increased globally towards the beginning of the new century, as depicted in a graph that charts the density of occurrences (Ch’Ng et al., 2016, Figure 1). Crowdsourcing work in relation to cultural heritage and museums is invariably associated with data that can lead to the cohesion of information, which can advance the goals of the institutions that curate cultural heritage objects and sites. Dunn and Hedges (2013) highlighted the categories of ‘scholarly primitives’, processes common to scholarly activities that create and enhance digital assets and that are specific to crowdsourcing in the humanities. As demonstrated in their survey, data is core to the processes associated with museum assets, with the outcome of constructing usable information from such activities. While these are posited within the academic framework, activities linked to cultural heritage cannot really be dissociated from such processes, for they all involve the administering of data and metadata, with information and knowledge as the intended outcome. In fact, the majority of the literature dealing with definitions, typology and categorisation (Bonney et al., 2009; Oomen and Aroyo, 2011; Estellés-Arolas and González-Ladrón-De-Guevara, 2012; Bonacchi et al., 2019) enumerates tasks that manipulate data within activities that make use of digital technology. An extensive survey (Estellés-Arolas and González-Ladrón-De-Guevara, 2012) provides a good definition and literature review of crowdsourcing.
Figure 1. Four panels labelled “GOOD”, “BAD”, “RESEARCH” and “AIM”, each plotting “Effort” against “Time” from “Crowd” to “Lab”, with “AI” marked between the dashed lines in the last two panels. The first diagram depicts a good model where the majority of the effort is assigned to the crowds (before the blue dotted line), with the specialist putting little effort into completing the model
The crowdsourcing activity reported in this article arose from the need to solve a particular problem associated with a 19th-century Chinese clan temple within George Town, a UNESCO World Heritage Site on the island of Penang, Malaysia. The project results from an awareness of the need to provide a digital record of the largely wooden structure, architectural ornaments and contents, which are at risk from fire (Yen and Mydin, 2013; Mydin et al., 2014) and flooding (Chiam, 2017; Osborne, 2017). The digital documentation would act as a 3D visual record from which reconstruction or repair work can be made possible. The need for local communities to document cultural heritage by digitising it is a shared awareness that spans regions and countries globally. However, the resources and expertise needed to carry out such work can be challenging for many institutions in terms of access to technology and expert guidance. The intersection of human resources and expertise that leadership across institutions could facilitate is seldom realised, and the challenge therefore remains very real across the globe. Even museums at the top of their class have found it difficult to invest in and mobilise resources for digitisation, and the British Museum is a case in point. Thus far, it has digitised and published over 270 high-quality 3D models from its collection and, in total, gained over 1.5 million views and over 10,000 likes (see “Sketchfab@BritishMuseum”). Despite the British Museum's collaboration with the world's largest repository of 3D models, the digitisation has been relatively slow.
There is now an awareness that those with the knowledge and expertise need to engage with cultural institutions at various levels in a facilitating role. This calls for a strategic view towards the long-term goal of cultivating and digitally upskilling the younger generation, building a community of participants, and creating awareness through digital activities that benefit cultural heritage, whilst also carrying out the digital documentation and publication of models.
In this research, two supporting organisations took part in the study, which provided the opportunity to investigate the nature of crowdsourcing 3D cultural heritage in the real world. At its core, the research aims to answer the question: “can loosely distributing photogrammetry work that partially simulates an unorganised crowdsourcing activity generate complete models of a site?” A logical measure of success for a site would be the amount of work needed to process the image data and the effort required to merge components into a complete model, in the form of a weighted measure discussed previously (Ch'ng et al., 2019c). It can be projected that mass photogrammetry work involving distributed volunteers would be disorganised, and that the amount of manual effort needed to complete the 3D modelling would be disproportionate to the number of persons contributing to the work. This article aims to test these assumptions and communicate lessons learned from such activities.
The article is structured as follows: the subsequent section provides a background of the project and the collaborative partnership that was formed as a result. This is followed by the methods section describing a crowdsourcing approach, which was adopted for the project. The results section outlines lessons learned and a summary of how value can be created in such partnerships. Finally, the discussions section summarises the value and contribution of the project.
2. Background
This section introduces the site and contextualises the project within the crowdsourcing and close-range photogrammetry domain. It covers the cultural heritage site where the activity took place, the initiation of the project through mutual awareness and dialogue, the activity itself, and related literature and parallel works.
2.1 Seh Tek Tong Cheah Kongsi
The Seh Tek Tong Cheah Kongsi is an ancestral temple belonging to the Cheah clan association. Built from 1858 and completed in 1873, it has its origin in the formation of the oldest Hokkien Kongsi (clan) in Penang in 1810. Cheah Kongsi is “an eclectic mix of a Chinese mansion, Chinese temple, and European bungalow considered most charming clan temples” (Yen and Mydin, 2013). The 19th-century clan temples were generally built by experts contracted from Fujian, Southern China. In the case of Cheah Kongsi, three main types of skilled artisans, namely carpenters, masons and “jian nian” artisans (剪粘, who break coloured bowls and stick the pieces onto the figurines), were tasked with the architectural design and construction. At the time, the forefathers had “built on a voluntary basis not discounting risk of pirates and diseases” in the journey to Malaysia. They built without any blueprints, with only a projection of the architecture in their well-trained minds. The buildings are therefore without blueprints apart from a later architectural survey conducted several decades ago, according to Dato’ Peter Swee Huat Cheah, who showed the author the documents at the Interpretation Centre in February 2020. A special collection housed at the Interpretation Centre is “the 1,200 manuscripts and old title deeds, blueprints, plans, chops and gifts relating to the Kongsi's history which have digitised, transcribed and preserved in a physical and an electronic archive” (Chung, 2021).
Clan associations such as Cheah Kongsi were formed to look after the welfare of clansmen who had emigrated from China, in this case the Cheah clansmen from the Sek Tong Seah ancestral village in Fujian. The term Kongsi “refers to an autonomous organization of shared interests based upon blood ties and geographical affinity among its members” (Tuan, 2015). Clan temples were built for the veneration of ancestors and patron deities, and as sites for the gathering of clansmen. The temple has been renovated several times. The first major restoration was undertaken in 2013 (Constructionplusasia, 2017) and involved the artistic fencing, the ornate entrance arch, the surrounding houses along Armenian Street and the Interpretation Centre. The restoration returned the temple to its most accurate historical state. Cheah Kongsi is a Category 1 conservation building (GTWHIBuilding, 2009) within George Town, which was inscribed as a UNESCO World Heritage Site in 2008 (GeorgeTownUNESCO, 2008). Category 1 buildings and sites are important as they reflect the authenticity of the cultural landscape and are of outstanding universal value; Cheah Kongsi is thus a site of exceptional cultural significance.
2.2 Project initiation and collaboration
The crowdsourcing project began as a result of the Conference on Managing Urban Cultural Heritage (MUCH'18), initiated by George Town World Heritage Incorporated (GTWHI) to commemorate the 10th anniversary of George Town and Melaka's joint inscription as a UNESCO World Heritage Site. The author presented the article “Crowdsourcing for 3D Cultural Heritage for George Town UNESCO World Heritage Site” (Ch'ng, 2018) to argue for the use of 3D technologies in combination with crowdsourcing mechanisms for the UNESCO site, and to demonstrate the benefits that should follow if George Town's cultural heritage is digitised and subsequently digitalised. At the conference, Cheah Kongsi's present chairman, Dato’ Alan Teik Cheng Cheah, approached the author and initiated a conversation. After a series of meetings with the then Chairman, Dato’ Peter Swee Huat Cheah, which included access to the Interpretation Centre where old building plans were archived, the project began in the summer of 2019 as a collaboration between Cheah Kongsi, the NVIDIA Joint-Lab on Mixed Reality at the University of Nottingham Ningbo China, and the Equator Academy of Art. Cheah Kongsi would provide full access to the site, resources and support for documenting the exteriors and interiors of the buildings. The Equator Academy of Art would provide staff and student resources, and the NVIDIA Joint-Lab on Mixed Reality, directed by the author, would ensure the provision of digital expertise in 3D scanning, compute resources and knowledge transfer throughout the project.
2.3 Crowdsourcing for cultural heritage
Crowdsourced activities that benefit cultural heritage have been explored. These made use of the human willingness and cognitive surplus (Shirky, 2010) of volunteers to achieve goals that would otherwise not have been possible, given the limited resources that often characterise 21st-century cultural institutions. The article by Bonacchi et al. (2019) provides a recent glimpse of centralised crowdsourcing activities in the sector, using as a case study MicroPasts (Bevan et al., 2014), a multi-application crowdsourcing project that enables both community-led and massive online contributions to high-quality research in archaeology, history and heritage. Crowdsourced works can fall within one of three categories (Bonney et al., 2009).
A literature review of formal studies on the use of “mass photogrammetry” yielded few results. Mass photogrammetry is a term first used in Granshaw's editorial “Imaging Technology 1430–2015: Old Masters to Mass Photogrammetry” (Granshaw, 2015). It was later expanded into a guideline (Ch'ng et al., 2019a), and further experimented with in more formal settings (Cheng and Ch'ng, 2022). In terms of the application of 3D crowdsourcing, one of the more ambitious projects that made use of past crowdsourced images for constructing 3D models was the AHRC-funded Curious Travellers project (Wilson et al., 2019). The project, of which the author is a co-principal investigator, made use of big data technologies in combination with Structure-from-Motion (SfM) for reconstructing models. The goal of the project was to repurpose imagery to manage and interpret threatened monuments, sites and landscapes. The ancient Temple of Bel in Palmyra, Syria, which was partly destroyed by the Islamic State of Iraq and Syria in August 2015, was one of the reconstructions (TEMPLEOFBEL, 2017), built using tourist photographs that were crawled and scraped from search engines.
The recent popularity and ease of use of close-range photogrammetry have prompted the cultural heritage community to involve volunteers in digitising activities that produce 3D models of various levels of quality. The research community that focuses on digital heritage has initiated experimental studies centred on mass photogrammetry (Granshaw, 2015; Ch'ng et al., 2019a; Cheng and Ch'ng, 2022). Close-range photogrammetry models have been very useful for both research and production on the effects of communication and the learning of cultural heritage via immersive virtual environments and augmented reality devices (Cai et al., 2018; Li et al., 2018, 2021; Ch'ng et al., 2019b). However, designing a crowdsourced project can be challenging on many fronts due to the unpredictable demographics, personalities, attitudes and skillsets of the volunteers that the project may attract. As most crowdsourcing work within the cultural heritage sector is voluntary, the positive aspect is that interested parties are likely to be zealous and enthusiastic. While the call may be open globally, those who eventually sign up are likely to be a minority, particularly for projects in the cultural sector, and, even within the volunteer pool, the Pareto principle (Pareto, 1896) may be at work, where only 20% of the participants contribute 80% of the effective work. Indeed, most successful participation projects are “not about crowds” but a mere continuation of the volunteering tradition of “inviting participation from interested and engaged members of the public” (Owens, 2012). However, we felt that the extent of the work that could be accomplished for our world cultural heritage would be too limited if only a small pool of volunteers participates.
Therefore, the research reported in this article would serve as the beginning of many crowdsourcing activities that involve crowds rather than a minority of enthusiasts.
In general, crowdsourcing within the heritage sector will involve any of the processes in the typology that contribute to or manipulate data and metadata in a way that collectively consolidates or produces new information. In the case of this project, the goal is to digitally record the structure, architectural ornaments and the site in 3D using close-range photogrammetry techniques, as a basis for repair work that mitigates the risk of fire, flood and other environmental and anthropogenic hazards. As with any digitisation work within the cultural heritage domain, questions on what we can do with the data often relate to the use of resources. As such, the practical, technical and philosophical questions on what we can digitally capture should be asked prior to initiating a project (Ch'ng, 2019). In terms of the technicality of digital capture, close-range photogrammetry is now relatively well known, and the technical details will not be covered in this article. For the reader, the foundations of close-range photogrammetry are accessible (Luhmann et al., 2006; Mudge et al., 2010; Luhmann et al., 2014), and best practice for mass photogrammetry using heterogeneous devices and software, within environments that do not favour photography, is available (Ch'ng et al., 2019c).
3. Methodology
The present activity is a “collaborative project” that simulates mass photogrammetry. Projects of this nature are generally designed by facilitators; members of the public contribute data but may also help to refine project design, analyse data or disseminate findings. This section communicates the experimental setup, the survey and measurement of the site, the crowdsourcing activity and the data that were acquired.
3.1 The need to explore mass photogrammetry
Crowdsourcing 3D models of cultural heritage remains an area that is relatively new and therefore unexplored. Apart from the cooperative and collaborative aspect of crowdsourcing, the nature of the work that this article investigates has not been understood in respect of the seven elements which the author has observed in prior experiments:
The uncertainty of the angle and completeness of capture for a single object or site, with a single component or multiple components
The variability and complexity of an object or a site's shape, form, details and textures
The unpredictability of the environment, weather situation and lighting conditions in the schedule
The number of participants that may volunteer and how the size of the group would affect performance
The attitude, ability and thoroughness of each participant in a given task
The types of heterogeneous imaging devices and the variability of image resolution that a pool of participants may use
The asynchronous nature of the work, which depends on when volunteers are available to carry out a task
These seven elements are seen to be important towards the collective performance of acquiring 3D models from cultural heritage sites, measured in terms of the quality of the model and the speed of production.
The quality of photogrammetric models that can be sourced from crowdsourcing work (Ch'ng et al., 2019a) was formulated as $M = w_1\mu + w_2\varepsilon + w_3\delta + w_4\zeta$, where $\mu$ is the object's material and form; $\varepsilon$ represents the object's environment; $\delta$ is the devices used; $\zeta$ is the skill of the photographer; and $M$ represents the standard practice in the range [0,1], with the arbitrary weights $w_1, \ldots, w_4$ indicating how much control a volunteer has over a given variable. If the goal is to increase the quality of 3D reconstruction measured as a performance factor, there is the need to increase the weights for all the variables. The seven elements together form the first part of the work; the second part is where the image sequences are processed into 3D models. In other words, the measure is applied to the on-site activity, from which a 3D model is produced in the laboratory where specialists work. Therefore, the more crowdsourced work the volunteers accomplish to the required standard, the less work the professional end will need to do to produce a good quality model. Figure 1 is a diagrammatic indicator of the different types of models that measure performance and effort. A good model (first diagram) would have the majority of the effort assigned to the crowds (before the blue dotted lines), with the specialist putting little effort into completing the model. A bad model of crowdsourcing work (second diagram) would be a case where the crowd has contributed little or has produced low-quality image sequences, leaving the majority of the effort to the laboratory. In the present research, we aim to achieve the fourth model (fourth diagram) through conducting research (third diagram), in which we aim to devise a mechanism whereby the first and the intermediate works are accomplished by a combination of the crowd and the machines, thus leaving very little effort to the professional end, i.e. conservators, digital teams, museums, etc.
We have made efforts to create mechanisms through which 3D models can be made more complete in the first part of mass photogrammetry. For example, our prior study demonstrated that self-organised teams working in synchrony effectively facilitate the crowdsourcing of photogrammetric 3D models in terms of dedication, contribution and 3D model completeness (Cheng and Ch'ng, 2022).
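The weighted measure M introduced in this section can be illustrated with a minimal sketch. It is not the authors' implementation: the class and function names are hypothetical, each variable is assumed to be scored in [0, 1], and the weights are assumed to be non-negative and to sum to 1 so that M stays in [0, 1]. The source only states that the weights are arbitrary.

```python
from dataclasses import dataclass, astuple
from typing import Sequence


@dataclass(frozen=True)
class CaptureScores:
    """Hypothetical per-capture scores, each assumed to lie in [0, 1]."""
    material_form: float  # mu: the object's material and form
    environment: float    # epsilon: the object's environment
    device: float         # delta: the devices used
    skill: float          # zeta: the skill of the photographer


def model_quality(scores: CaptureScores, weights: Sequence[float]) -> float:
    """Return M = w1*mu + w2*epsilon + w3*delta + w4*zeta.

    Assumption (not from the source): weights are non-negative and sum
    to 1, which keeps M within [0, 1] when every score is within [0, 1].
    """
    values = astuple(scores)
    if len(weights) != len(values):
        raise ValueError("expected exactly four weights")
    if any(not 0.0 <= v <= 1.0 for v in values):
        raise ValueError("scores must lie in [0, 1]")
    if any(w < 0.0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * v for w, v in zip(weights, values))


if __name__ == "__main__":
    # Illustrative numbers only; they are not measurements from the project.
    example = CaptureScores(material_form=0.8, environment=0.5,
                            device=0.6, skill=0.4)
    print(f"M = {model_quality(example, (0.1, 0.2, 0.3, 0.4)):.2f}")
```

In this sketch a larger weight on a variable means that variable contributes more to M, which is one way to read the statement that the weights reflect how much control a volunteer has over each variable.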
3.2 The site and built area
The Cheah Kongsi site has a gross floor area of 1,410 m² within a site of 0.89 acres. The main gate is located at Beach Street, accompanied by a side entrance at Armenian Street (Figure 2). The main materials used in the building are timber, stone, lime and clay. Timber, the main component at risk of fire, is used for the structural beams, piles, carved screens, staircase, terrace structure and floor. There are glass installations for some furniture, but the main materials do not present challenges for close-range photogrammetry. The buildings adjacent to the site were ideal for drone mapping as they were at most four storeys high. Trees in the vicinity are of medium height and were sparsely planted along the edges of buildings. In retrospect, this is an ideal site for photogrammetry work.
Figure 2. Oblique aerial view of the heritage building complex, with labels for the “Main Building”, “Interpretation Centre”, two “Courtyard” spaces, the “Armenian Street Entrance”, “Armenian Street” and “Beach Street”. Inset: photogrammetry reconstructed site model
3.3 Recruitment and participation
The activity was opened to all Equator College students and staff. A total of eight students were recruited initially from the college photography club, as the basic operation of digital SLR (DSLR) cameras is essential. Photography club members would also have DSLR cameras in their possession, which would save considerable resources and keep the project lean as a case demonstration of sustainability for sites within the UNESCO zone. The final participants were gathered from across three departments: graphics design, 3D design and interior design. As an incentive, since all third-year students of the college require a total of 200 credit hours to complete their studies, they were given six credit hours per day for four days, amounting to 24 credit hours each, as a part of the “Beyond Class Experience” module, a programme subject within their study “to expose students to think smart, and to obtain real-world experiences as well as social responsibility such as in community services”. The hours would include the copying and backing up of data and project meetings. The two instructors and supervisors accompanying the students (Ms. Lim Hui Chyi and Mr. Jason Chuah Kay Kuan) spent four days on the same activity and, upon realising that there were gaps in the images, one of the instructors spent an extra three days photographing the site. In combination with the author's seven days of six hours of work, the total time spent taking photos and copying, categorising and backing up photos was 276 h, an average of 25 h per person.
On 26th June 2019, the author gave a briefing session on the principles of photogrammetry. This was followed by an open lecture on 28th June titled “The Crowdsourcing of Cultural Heritage and the Role of VR in Archiving Memories”. The talk covered the importance of cultural heritage and its relation to our identity and memory, and the role of VR in cultural heritage. The crowdsourcing work was carried out between 29th June and 1st July. Two more weekends of work were carried out by Ms. Lim Hui Chyi, a supervisor for the group. The extended work was needed because some images were of inadequate quality, i.e. out of focus, underexposed or lacking the required overlap between photos. A particular area of the site was assigned to the author for professional photogrammetry work. This was deliberate, so that professional and amateur photogrammetry could be compared for research purposes. The datasets taken by specialists and amateurs can complement each other in the 3D modelling phase. The areas assigned to the author were the site surveys with drone, the balcony, parts of the main temple, and the two courtyards and their entrances.
3.4 Equipment and image data
The loose specification of imaging devices used in the project is a necessary component of the crowdsourcing study. It simulates the diversity of imaging devices that would be expected in the possession of volunteers. The data were processed at the NVIDIA Joint-Lab on Mixed Reality, equipped with purpose-built hardware and software supported by CapturingReality, Epic Games. All data were made anonymous, and no individual names were associated with the camera models or the work, as part of our commitment to ethics and professionalism.
3.4.1 Digital SLR cameras and phototaking
The crowdsourcing work made use of cameras that were already in the possession of participants (Table 1). The author used a Sony a7R III (42.4 megapixels), while the instructors and students used Nikon D5100 and D7000 (both 16.2 megapixels), Canon 70D (20.9 megapixels), Canon 550D, 1300D, 600D and 700D (18 megapixels), Canon 1100D (12.2 megapixels) and Fujifilm X-S1 (12 megapixels) cameras. All photos were taken in the cameras' native RAW formats (Sony ARW, Nikon NEF, Canon CR2, Fujifilm RAF, DJI Mavic Pro DNG), except for a small dataset taken in the JPG format.
Table 1. Image file types and file sizes captured across ten photographic devices
| Image type | File size | Photographic equipment | Pixel resolution |
|---|---|---|---|
| ARW | 451.18 GB | Sony | 42.4 megapixels |
| CR2 | 675.99 GB | Canon | 70D (20.9 megapixels); 550D, 1300D, 600D, 700D (18 megapixels); 1100D (12.2 megapixels) |
| DNG | 117.93 GB | DJI Mavic Pro | 12.35 megapixels |
| NEF | 201.15 GB | Nikon | 16.2 megapixels |
| RAF | 2.72 GB | Fujifilm | 12 megapixels |
| JPG | 0.48 GB | Miscellaneous devices | |
Source(s): Table created by the author
Work was loosely distributed: groups of 2–3 students covered the larger, more complex areas and rooms, the entrance, and the flooring and walls of the site, while individuals covered the smaller rooms.
3.4.2 Drone mapping and survey
The DJI Mavic Pro (12.35 megapixels), using its native DNG RAW format, was used for the 3D survey and mapping. Overlapping top views of the site were captured, with multiple rotations around the site, the main building and the details of the ornamental roof. Drone work was also conducted within the two courtyards, extending vertically upwards to the enclosing roof.
3.4.3 Data processing
At the end of the survey, the datasets were processed from the native RAW format of each camera (Table 1). A total of 32,138 photos were taken, with a total file size of 1449.46 gigabytes (451.18 GB of ARW, 675.99 GB of CR2, 117.93 GB of DNG, 201.15 GB of NEF, 2.72 GB of RAF, 0.48 GB of JPG). Adjustments were made within Adobe Lightroom to remove tints and to brighten the darker areas of each photograph. Exterior photographs were captured in bright sunlight, whereas some interior spaces were dark, lit by a mixture of tungsten and partial natural sunlight. All photographs had to be adjusted before being exported as JPEG files, which were stored in a named folder for each area of the site. Data processing was one of the most time-consuming tasks in the photogrammetry pipeline.
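As a quick arithmetic check on the figures above, the following minimal Python sketch sums the per-format sizes reported in Table 1 and derives each format's share of the dataset and the mean size per photo. The function name `summarise_dataset` and the decimal conversion (1 GB = 1000 MB) are assumptions made for illustration, not part of the original workflow. The rounded per-format values sum to about 1449.45 GB, within 0.01 GB of the reported total, a difference that is presumably due to rounding.

```python
# Per-format dataset sizes in GB, as reported in Table 1.
SIZES_GB = {
    "ARW": 451.18,
    "CR2": 675.99,
    "DNG": 117.93,
    "NEF": 201.15,
    "RAF": 2.72,
    "JPG": 0.48,
}
TOTAL_PHOTOS = 32_138  # number of photos reported in the text


def summarise_dataset(sizes_gb, photo_count):
    """Return (total size in GB, percentage share per format, mean MB per photo)."""
    total_gb = sum(sizes_gb.values())
    shares = {fmt: 100.0 * gb / total_gb for fmt, gb in sizes_gb.items()}
    # Assumption: decimal units, i.e. 1 GB = 1000 MB.
    mean_mb = total_gb * 1000.0 / photo_count
    return total_gb, shares, mean_mb


total_gb, shares, mean_mb = summarise_dataset(SIZES_GB, TOTAL_PHOTOS)
print(f"Total size: {total_gb:.2f} GB")
for fmt, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"  {fmt}: {pct:.1f}% of the dataset")
print(f"Mean size per photo: {mean_mb:.1f} MB")
```

Under these assumptions the mean works out at roughly 45 MB per photo, a figure that mixes RAW files from sensors of very different resolutions and should be read only as an order of magnitude.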
3.4.4 Hardware and software for 3D reconstruction
RealityCapture (version 1.2.0.16813) was used as the photogrammetry software for creating 3D models from the unordered photographs that were captured. The images associated with individual areas were first imported and aligned for inspection. Images of adjoining areas were processed within the same session. Where components were separated in the alignment process, control points were added manually to merge them into a single model, e.g. the point cloud model in Figure 2.
The hardware was purpose-built by the author for 3D data processing. The system has an Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40 GHz with 6 cores and 12 logical processors. The workstation has 64 GB of DDR4 RAM and runs the Windows 10 64-bit OS. Two NVIDIA Quadro M6000 12 GB cards connected via SLI were used for real-time visualisation.
4. Results
This section presents the results of the experiment. The reconstructed 3D models provide summary evidence of the work conducted by contributors, their photographic ability and their thoroughness with the task. These can be read from the EXIF metadata in the images, the quality of the capture (such as the clarity of the shots and the exposure shown by the image histogram), and the thoroughness of the angles and density of the shots as seen in 3D space. The number of control points needed to join areas together can also be a measure of the thoroughness of the crowdsourcing work. Together with the final 3D reconstruction, these provide a performance measure of the crowdsourcing work.
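One of the quality indicators mentioned above, exposure as read from the image histogram, can be illustrated with a minimal sketch. It assumes that an image has already been decoded into a flat list of 8-bit luminance values; the function name `exposure_report` and the thresholds `dark_max`, `bright_min` and `clip_fraction` are hypothetical choices for illustration and are not the criteria used in the study.

```python
def exposure_report(luma, dark_max=50, bright_min=205, clip_fraction=0.5):
    """Classify exposure from 8-bit luminance values (integers in 0..255).

    All thresholds are illustrative assumptions: an image is flagged when at
    least `clip_fraction` of its pixels fall in the dark or bright tail.
    """
    if not luma:
        raise ValueError("luma must contain at least one value")

    # Build a 256-bin histogram of the luminance values.
    hist = [0] * 256
    for value in luma:
        hist[value] += 1

    n = len(luma)
    dark_fraction = sum(hist[: dark_max + 1]) / n
    bright_fraction = sum(hist[bright_min:]) / n

    if dark_fraction >= clip_fraction:
        label = "under-exposed"
    elif bright_fraction >= clip_fraction:
        label = "over-exposed"
    else:
        label = "acceptable"

    return {
        "dark_fraction": dark_fraction,
        "bright_fraction": bright_fraction,
        "label": label,
    }


# Synthetic example: 80% of pixels are very dark.
dark_interior = [10] * 80 + [128] * 20
print(exposure_report(dark_interior)["label"])
```

A check of this kind could be run per contributor or per area to flag images that may need brightening or re-shooting, although a histogram alone says nothing about focus or overlap.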
4.1 The site model
Figure 1 shows the 3D model of the site captured through drone imaging, with ground photography contributed by participants. The main architectural features of the site are the main building, the interpretation centre, the main gate and the Armenian Street entrance. While the main building is the focus of the documentation, the context of the building is equally important, hence the need to capture the entire site.
The drone mapping produced four separate components of the site (Figure 3): the major part of the site (2,588 photos), a disjointed partial roof component (34 photos) and the gate at Beach Street, which was broken into two components (162 and 119 photos). Control points P5-P11 were added manually to merge components 2–4 with the site component. The Armenian Street entrance was generated separately and later merged with the main component without the use of control points.
Figure 3. Control points within RealityCapture used for merging separate components. The main view, labelled “Component 1: Site”, shows the aerial 3D model of the site with control-point markers P0 to P11; three insets show “Component 2: Partial Roofing”, “Component 3: Partial Gate” and “Component 4: Partial Gate 2”.
4.2 Results from interior spaces within the main building
Of the 19 rooms within the main building, only nine were reconstructed without the need for control points, a success rate of 47.37% for the interior of the building. Figure 4 shows the quality of the 3D model components reconstructed using RealityCapture. The successfully reconstructed models were the Typical Chinese Hall (A), the Deities room (B), the Main Temple (C), the Baba Nyonya Room (D), the Patron Deities room (E), the Ancestor Worship room on the ground floor (F), the Furniture Room (H), the Balcony (I) immediately outside the main temple and the Ancestor and Worship room on the first floor (J). Failed components, as seen in (G), generally have very sparsely positioned cameras. The images in (G) were meant to join two adjacent areas but failed to do so, and additional work with control points is required to combine the separated components. The participants working on the Furniture Room (H) were very enthusiastic. While their work was thorough, the photos overlapped too much, i.e. too many photos were taken too close together, and alternate photos had to be removed for the 3D model to generate properly.
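The two calculations described above can be sketched as follows: the share of rooms reconstructed without control points (9 of 19, roughly 47%) and the thinning of an over-dense capture by keeping alternate photos. The helper names `success_rate` and `thin_alternate`, the file names, and the assumption that photos are listed in capture order are illustrative only; how the photos were actually selected for removal is not specified in the text.

```python
ROOMS_TOTAL = 19   # rooms within the main building
ROOMS_AUTO = 9     # rooms reconstructed without control points


def success_rate(successes, total):
    """Percentage of rooms reconstructed without manual control points."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * successes / total


def thin_alternate(photos):
    """Keep every second photo, starting with the first.

    Assumes `photos` is ordered by capture sequence, so that neighbouring
    entries are the ones most likely to overlap excessively.
    """
    return photos[::2]


rate = success_rate(ROOMS_AUTO, ROOMS_TOTAL)
print(f"Interior success rate: {rate:.2f}%")

# Hypothetical file names standing in for an over-dense room capture.
capture = ["IMG_0001.jpg", "IMG_0002.jpg", "IMG_0003.jpg",
           "IMG_0004.jpg", "IMG_0005.jpg"]
print(thin_alternate(capture))
```

Keeping alternate frames roughly halves the image count while preserving coverage along the capture path, which is why it is a common first remedy when neighbouring photos are nearly identical.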
Figure 4. Photogrammetry reconstructed rooms within the main building. The collage shows ten panels in three rows, labelled A–C (top), D–G (middle) and H–J (bottom), each presenting the reconstruction of one room with its point cloud; panel G is darker and only sparsely reconstructed.
5. Discussions
Crowdsourcing for cultural heritage can be a viable approach that is both cost-efficient and effective for the task to be accomplished, provided that stakeholders actively support such activities. Crowdsourcing alone would not be viable without institutional stakeholder involvement, and the tasks accomplished would not reach a level at which value can be created. Studies that have evaluated participatory activities in real-world scenarios have stressed the importance of stakeholder involvement where cultural heritage is concerned. For example, Snis et al. (2021) highlighted the opportunities that lie in the dynamics of interaction between the spirit of cultural heritage and the body of participatory management. As the research reported here has found, at least for cultural heritage preservation activities, stakeholder involvement is critical for several reasons. Firstly, if value is to be created for a cultural heritage site, the organisation managing the site has the authoritative knowledge of its needs, and its consent is required for access to the site. A crowdsourcing activity initiated by an independent party would be invasive, and the effort exerted would amount to little value.
In addition, an initial pool of human resources is often necessary, and academic researchers have underlined such a need. Alam and Campbell (2017) highlighted the challenges of initiating crowdsourcing work, especially in projects without monetary compensation, and summarised the implications for practice: volunteerism can be encouraged by intrinsic and extrinsic motivations and by crowd-like or collective-like participation. A body of research has shown that there is always uncertainty about volunteer motivations, and about whether anyone will turn up at all. In the case of this crowdsourcing exercise, an educational institution, e.g. a local college, was necessary as a stakeholder. Digital volunteerism, to which this research belongs, is often without financial incentives and is not contractual (Benkler and Nissenbaum, 2006; Dutton, 2010; Owens, 2013; Auferbauer and Tellioğlu, 2017), yet the involvement of the college provided the young public with the incentive of a real-world experience of preserving local cultural heritage as part of their social responsibility and the acquisition of transferable digital skills, on top of the 24 credit hours added to their schedule. Finally, the author's institution was the third stakeholder, carrying the responsibility of providing the technical knowledge and skills that link the volunteers to the goal of the institution owning the site, thereby accomplishing the task in a three-party collaborative project. The stakeholder with the digital skills would be the “organizing body that coordinates and manages activities and flows of information and communication as well as acts as a driving force for digitalization” (Snis et al., 2021).
In cities where cultural heritage is the driving force for socioeconomic benefits, the role of governments in facilitating sustainable heritage projects is critical. Although the municipality was not a stakeholder in the sense of being an active contributor to the project, its primary role was to facilitate the connection between institutions, as described in Section 2, through the 2018 conference organised by George Town World Heritage Incorporated (GTWHI) that marked the 10th anniversary of George Town and Melaka's joint inscription as a UNESCO World Heritage Site.
The remainder of this section summarises the lessons learned from crowdsourcing 3D cultural heritage. Through the participation of the stakeholders, a tremendous amount of data processing work has been accomplished. In terms of the actual work, it is beneficial to reflect on three questions. What have we learned from the crowdsourcing experiment, and how has each stakeholder benefited from the collaboration? Can loosely distributed photogrammetry work that partially simulates an unorganised crowdsourcing activity generate complete models of a site that meet the criteria set by the needs of the clan temple? How sustainable are such projects in terms of cost and effort?
The data and metadata gathered through the crowdsourcing experiment suggest that, as a collective unit with distributed tasks, the captured image data can indeed generate complete models of a site of this size, with 1,410 m² of built area across 0.89 acres, provided that there are three chained phases of work:
The initial loosely distributed task across a range of participants with variable skills and thoroughness,
The secondary work of filling gaps after the initial dataset has been inspected for adequacy, and
Further image processing work within the software systems, leveraging the automated features of photogrammetry software and the accompanying visual tools such as control points.
Crowdsourcing work involving distributed volunteers is indeed disorganised and needs facilitators to bring order and structure into each phase to ensure data quality and performance. The facilitator(s) would be the “organizing body … that coordinates and manages activities” and “a driving force for digitalization” (Snis et al., 2021). As we have learned in this exercise, and as observed in similar literature, the driving force for digitalisation is usually the stakeholder that carries out the digitisation work. This is evident in the need to carry out the work, the facilitation of the transfer of technical knowledge and the completion of the digital work. We can also see that, owing to the variability of skills and thoroughness, the amount of manual effort needed to complete the 3D modelling process is not proportional to the number of persons contributing to the work.
Within crowdsourcing, it can be observed that the variability in the skillsets and thoroughness of the pool of workers averages out in the quality and completeness of the model. We can therefore project that the higher the skillsets and thoroughness of the pool of workers, the less work would be required in the second and third phases. This can perhaps be used to extrapolate the performance that one would expect in a much larger crowdsourcing effort involving the entirety of the George Town UNESCO World Heritage Site, or any site of equivalent or larger size globally.
Stakeholders collaborating towards a common goal benefit from achieving it. At the beginning of this section, we discussed the important roles of institutional stakeholders in making crowdsourcing feasible. Here, we summarise how each stakeholder has benefited from the partnership. The Seh Tek Tong Cheah Kongsi achieved its goal of digitally documenting the entire site in 3D as mitigation against fire and natural disasters. Equator College has, in the past, contributed to many volunteering activities in the local cultural heritage domain, but this was its first experience with 3D crowdsourcing. The response in the interview was that new knowledge and digital techniques were gained, and that participants were educated on the state of the art in digital heritage technologies for preserving local cultural heritage. Similarly, the two instructors would use the experience and skills as a reference for teaching and learning, thus helping to transfer knowledge to a broader student base for future work. One of the instructors used the technique learned to create 3D prints, and this may lead to other creative avenues later. The college suggested initiating activities for the other clan temples, churches and mosques within the heritage zones, and that the scanned monuments and objects could be pathways for promoting the UNESCO site. The author, as an active researcher in the digital heritage domain, has extended knowledge through the research questions addressed in the experiment, from which valuable insights were gained. In particular, a pattern can be obtained from the data and metadata that reveals the variability, behaviour and thoroughness of the specific combination of volunteers in the pool. Such patterns can then be used to inform the design of crowdsourcing mechanisms for future work throughout the UNESCO zone, and in many other cultural heritage environments that become of interest.
In terms of cost, Cheah Kongsi supported a subsistence of 500 Ringgit (∼USD120) for the students' food and drinks, and the cost of flights and accommodation was covered by the author's host institution. The imaging equipment used was owned by individual participants (Plate 1). The rewards obtained by the participating institutions were the value, experience and skillsets earned, the digital data documented, the lessons learned and the human bonds made in such a relationship. The provision of resources by the individuals is perhaps the clearest defining identity of this crowdsourcing collaboration.
Plate 1. The group of volunteers from Equator College with the two instructors and the author. The photo shows the group standing in a line in front of the building façade, each holding a camera, with all faces intentionally blurred.
The collaboration between Cheah Kongsi, Equator College and the NVIDIA Joint-Lab on Mixed Reality has been a constructive and beneficial activity. The partnership between the three institutional stakeholders, formed as a result of the 2018 UNESCO conference hosted by George Town World Heritage Incorporated, has been a positive one. The awareness of the need to digitise and preserve the 19th century clan temple within George Town's UNESCO heritage zone, and the executive decision to carry out the actual work, are the first in the heritage city. 3D crowdsourcing work of this nature may also be the very first for the heritage community. The awareness of the need to digitise, and the courage of the present Chairman of Cheah Kongsi, Dato’ Alan Teik Cheng Cheah, to initiate conversations, is perhaps the starting point of the many benefits that we have witnessed in the present project. Perhaps the nature of the three institutions and the relationship that was formed earlier is key to the synergy and success of the collaboration – Cheah Kongsi as a category 1 building within a UNESCO site in need of digital preservation, Equator College as an educational institution that aims to cultivate digital skills for the workforce, and the NVIDIA Joint-Lab on Mixed Reality as a possessor of state-of-the-art VR and 3D technologies, with leading research and development in the digital heritage domain. But what prompted the project was the keen awareness of the need to digitally document a site that is at risk of fire hazards and the impacts of flooding. Decisions by leaders of any cultural institution, and the openness to embrace digital technology, can open possibilities that go far beyond a single institution. The synergy of the work, its intended and even unintended positive outcomes, the lessons learned and the positive relationships formed can extend into other untapped areas of opportunity.
In an age where new media and entertainment, globalised in nature and delivered on the hand-sized displays of smartphones, have usurped the attention of society, the outmoded and ignored importance of cultural heritage to our sense of identity, our bond to the past and our collective ownership is in need of rekindling. Perhaps crowdsourcing activities that make use of engaging user interfaces, the very thing that has usurped our attention, and that produce solid outcomes can be a catalyst for engaging our younger generation in an awareness of the importance of our cultural heritage. This could also be extended to certain social demographics that are detached from museums (Sandell, 2003; Bennett et al., 2009).
Collaborative partnerships between stakeholders, achieved through crowdsourcing work that utilises digital technology and aims toward a common goal, can be highly beneficial to each institution. The value of such partnerships indicates that set goals can indeed be achieved, that the new generation can be taught to care for their heritage, and that the transfer of digital skills is made possible through such activities.
