This study intends to unfold the potential of multimodal pedagogy for fostering inclusivity and engagement in English language learning in higher education.
Using a mixed-method approach, data were collected through classroom observations, teacher interviews and social media analysis.
The findings indicate that multimodality provides more opportunities for structured, inclusive and engaging teaching. Despite efforts to enhance multimodality in the three contexts, many of the learners’ multimodal activities remain a topic of contention, raising questions about how to measure learners’ multimodal outcomes.
The main caveat stems from the small sample size, which restricts the generalizability of the findings.
The study proposes implications for advancing multimodal practices in teaching and assessing the English language across contexts. It proposes practical tools such as grading contracts, digital badges and adaptable assessment frameworks that promote inclusivity and engagement while allowing for flexible evaluation of students’ multimodal competencies.
While inclusivity is recognized as a central pillar in education reform and development, linking it to multimodal pedagogy is a gap that leaves an important opportunity for the present endeavor to advance research in this area.
Introduction
In recent years, a shift from the traditional alphabetical literacy to multimodality has been suggested for more inclusive learning practices that engage all the senses of learners (Diamantopoulou & Sigrid, 2022; Jakobsen, 2022; Nwachukwu, Lazarus, Asuzu, Ubani, & Wei, 2024). Cope and Kalantzis (2016) and Kress (2010) argued that multimodality, which integrates various modes of communication (text, images, sound and spatial elements), offers enriched opportunities for communication. It marks a shift from a purely alphabetic mode to one that draws on both linguistic and non-linguistic elements for making meaning (Cope & Kalantzis, 2016; Diamantopoulou & Sigrid, 2022; Kress, 2010). In multimodal pedagogy, learners are positioned as active users, designers and analysts of multimodal content. Nwachukwu et al. (2024) speculated that multilingual-oriented pedagogy creates language learning ecologies that embrace linguistic diversity and support active participation. It expands the scope of teaching strategies and techniques (Lim et al., 2021, 2022; Lourenço & Melo-Pfeifer, 2024). Whitney (2016) found that multimodal approaches in diverse classrooms promote social and academic inclusion. These arguments emphasize the importance of inclusivity in educational practices.
The term “multimodal pedagogy” has been subject to varied interpretations, and this complicates its application in language education (Cohn & Schilperoord, 2024; Diamantopoulou & Sigrid, 2022). Perhaps this is because multimodality is conceptually fluid, which, in turn, makes it open to interpretations, leading to inconsistent applications across contexts (Cohn & Schilperoord, 2024). What is not uncommon, however, is integrating multimodal practices in language learning and teaching from different perspectives (Diamantopoulou & Sigrid, 2022; Jakobsen, 2022; Lim et al., 2022; Tour & Barnes, 2022). While multimodal pedagogy has been widely explored in Western contexts, its application in the Middle East is, on the whole, shaped by sociocultural dynamics (Thomas, 2025), including classroom hierarchical structures, exam-oriented curricula and limited access to digital tools (Al-Kadi, 2022; Bailey & Damerow, 2014; Modhish & Al-Kadi, 2016; Moqbel & Al-Kadi, 2023). These factors influence both the feasibility and reception of multimodal strategies in countries in the Middle East, such as Jordan and Yemen.
Multimodal pedagogy, which has potential for learners’ engagement and inclusivity, remains open to further inquiry. There is a shortage of studies on assessing multimodal learning (Garvey, 2022; West-Puckett, 2016), and this gap is particularly evident in contexts where multimodal approaches are not formally integrated into curricula, as in the Arab context of English education (Moqbel & Al-Kadi, 2023; Thomas, 2025). Given the centrality of inclusivity in current educational reforms (Nwachukwu et al., 2024; Whitney, 2016), this study intends to unfold the potential of multimodal pedagogy for fostering inclusivity and engagement in English learning in higher education. Drawing on prior research (Bailey & Damerow, 2014; Hasumi & Chiu, 2024; Lim et al., 2022; Lourenço & Melo-Pfeifer, 2024; Whitney, 2016), this work examines the affordances of multimodality for more inclusive language teaching practices and assessment. It consolidates global perspectives on multimodal pedagogy in teaching English and on multimodal literacy learning across the world (Lim et al., 2022) to generate insights for local contexts where multimodality is not yet widely adopted.
Research questions
How do multimodal approaches enhance inclusivity and engagement in English classrooms across diverse contexts?
What scalable assessment methods can effectively evaluate learners’ multimodal competencies?
Literature review
Theoretical background
This inquiry takes its theoretical underpinning from multimodal literacy, the theory of situated learning and multimodal pedagogy. The concept of multimodal literacy, as articulated by Cope and Kalantzis (2016), emphasizes the integration of various modes, such as text, images, sound and spatial elements, in communication. This framework emphasizes the necessity for learners to develop skills that enable them to interpret and create meaning across multiple modes, reflecting the complex nature of contemporary communication (Walsh, 2010). Lave and Wenger's (1991) theory of situated learning posits that learning is inherently tied to the context in which it occurs. By introducing the concept of legitimate peripheral participation, they argue that learners acquire knowledge and skills through active involvement in authentic activities within a community of practice. This perspective emphasizes the pivotal role of contextual and social factors in shaping the learning process.
Complementing this view, Halliday and Matthiessen's (1999) work on social semiotics explicates how meaning is constructed and interpreted within specific social and cultural contexts. Their framework examines various semiotic resources, such as language, images and gestures, in communication, offering valuable insights into how multimodal texts are produced and comprehended. This study applies multimodal literacy (Walsh, 2010) to examine how learners interpret and produce meaning in diverse EFL contexts.
The importance of learner agency and active participation is further explored in Nguyen (2022) and Lourenço and Melo-Pfeifer (2024). These studies highlight how multimodal pedagogy empowers learners to take ownership of their educational experiences, fostering engagement and motivation by engaging them in the creation and analysis of multimodal content. This empowerment aligns with the broader goals of inclusive education, as it recognizes and values the diverse contribution learners bring to the classroom. Meanwhile, Garvey (2022), Reid, Paster, and Abramovich (2015) and West-Puckett (2016) explored innovative assessment methods tailored to multimodal learning environments. They propose assessment measures, such as grading contracts, digital badges and open-ended criteria, providing inclusive and flexible alternatives to traditional assessments that better capture the multidimensional nature of multimodal competencies.
Together, these theoretical frameworks provide a comprehensive foundation for understanding the intricacies of implementing multimodal pedagogy in foreign language education. They highlight the dynamic interplay among contextual learning, social meaning-making, learner agency and innovative assessment practices, underscoring the transformative potential of multimodal approaches to foster inclusivity, engagement and equitable learning outcomes.
Multimodal pedagogy
Multimodal pedagogy is closely associated with the concept of multiliteracies (Cope & Kalantzis, 2016; Kress, 2010), both of which have shaped an emerging framework for modern teaching and learning (Lim et al., 2021). This approach utilizes culturally diverse teaching materials to enrich learning and promote inclusivity. By fostering an environment that accommodates and values the unique needs and perspectives of all students, multimodal pedagogy inherently supports inclusivity (Sanger & Gleason, 2020).
At its core, multimodal pedagogy views learners as active users, designers, analysts and creators of multimodal content. Studies exploring its application have consistently highlighted its potential to enhance language learning (Diamantopoulou & Sigrid, 2022; Lim et al., 2022; Lourenço & Melo-Pfeifer, 2024; Ørevik, 2024; Whitney, 2016). For instance, Tour and Barnes (2022) demonstrated the effectiveness of multimodal composition in engaging learners, while Jakobsen (2022) examined its role in literacy practices within English language education. Other studies have explored themes, such as multimodality’s intersection with social justice and inclusive educational practices (Lim et al., 2021; Nwachukwu et al., 2024).
Despite a sizeable volume of research, key issues remain unresolved. The term multimodality lacks a universally accepted definition, leading to divergent interpretations of what constitutes multimodal pedagogy (Cohn & Schilperoord, 2024; Diamantopoulou & Sigrid, 2022). Additionally, much of the existing literature focuses on integrating multimodality into teaching practices, whereas research on multimodal assessment and on the impact of multimodality on inclusivity and engagement remains notably limited.
Inclusivity and engagement in EFL classrooms
Inclusivity has gained significant ground in modern education alongside the principles of equity and diversity. For this study, the terms inclusivity and engagement are approached holistically, reflecting their interrelated nature in the context of an English as a Foreign Language (EFL) classroom (Al-Kadi, 2022; Bailey & Damerow, 2014; Thomas, 2025). These concepts can be explored through three interconnected dimensions: academic inclusion, linguistic inclusion and social inclusion. Academic inclusion emphasizes strategies that address diverse learning needs and promote equitable access to education. Techniques such as Universal Design for Learning (UDL), fostering positive teacher–student relationships and encouraging collaborative learning have proven effective in creating inclusive environments (Nwachukwu et al., 2024).
These approaches ensure that learners with varying abilities and backgrounds can actively engage in classroom activities, supporting their academic growth. Linguistic inclusion involves recognizing and valuing students’ linguistic and cultural diversity. Practices such as translanguaging, code-switching and culturally responsive teaching are key strategies for fostering linguistic inclusivity (Nwachukwu et al., 2024). These strategies enable students to draw on their linguistic repertoire as tools for learning, creating a dynamic and inclusive language-learning environment. Social inclusion involves fostering a positive classroom climate through student-centered teaching practices (Modhish & Al-Kadi, 2016). Strategies such as promoting peer collaboration, creating inclusive assessment methods and encouraging active participation ensure that all learners feel valued and connected to the classroom community (Whitney, 2016; Sanger & Gleason, 2020). This dimension is integral to enhancing learner engagement and reducing social barriers in EFL classrooms.
Among the components of inclusivity, assessment is particularly underexplored in EFL contexts. Inclusive assessment practices go beyond traditional tests and exams, focusing instead on evaluating students’ ability to actualize their knowledge and skills in authentic, real-world contexts. Alternative assessment methods, such as multimodal portfolios, digital badges and grading contracts, provide more nuanced and flexible means to capture the complexity of learners’ progress while accommodating diverse needs (Garvey, 2022; West-Puckett, 2016; Reid et al., 2015). Moqbel and Al-Kadi (2023) investigated the challenges of assessing foreign language learning outcomes in the context of advanced AI tools such as ChatGPT. Their study highlights the need to adapt assessment strategies to accommodate technological advancements, ensuring that they remain relevant and inclusive.
Lim et al. (2022) identified five common themes in multimodal pedagogy in EFL classrooms: engagement with texts rooted in students’ lived experiences, the use of critical and culturally responsive pedagogies, explicit instruction in multimodal literacy, the impact of these practices on student learning and challenges surrounding assessment. This study builds on their findings by focusing on three core areas: involvement with diverse media, the use of culturally responsive teaching methods and difficulties in evaluating these practices. While Lim et al. (2022) explored multimodal pedagogy’s effects on learner engagement, they did not address the challenges of assessment in resource-constrained environments.
The major gap the study strives to cover lies in understanding how multimodality fosters inclusivity and engagement in the language classroom. While inclusivity is recognized as a central pillar in education reform and development (Lourenço & Melo-Pfeifer, 2024; Nwachukwu et al., 2024), linking it to multimodal pedagogy is a gap that leaves an important opportunity for the present endeavor to cover. It advances research in this area by digging into how multimodal pedagogy promotes inclusivity and engagement in English language programs. It contributes to a broader understanding of how multimodality supports diverse educational environments and the development of inclusive, effective assessment strategies that enhance learner engagement and inclusivity.
Method
Following the guidelines of Brinkmann (2018) and Creswell and Creswell (2018), a mixed-method approach was employed to collect data from classroom observations, teacher interviews and social media, providing a comprehensive understanding of the research questions. This triangulation facilitated the exploration of the interplay between multimodal pedagogy, learner engagement and inclusivity in the three comparable settings wherein the researchers served as university lecturers: the United States, Jordan and Yemen. These three contexts were chosen for their pedagogical contrasts: the US represents full integration of multimodal pedagogy, Jordan reflects partial and informal adoption, and Yemen illustrates a traditional, monomodal approach. This range enables meaningful comparison. As reflected in course syllabi and classroom practices, multimodal pedagogy could be described as “widely integrated” into English programs in the US and “less explicitly integrated” in Jordan, where informal implementation was observed in some teaching practices. In Yemen, however, the teaching paradigm is noticeably “traditional,” wherein monomodal teaching methods dominate, with little formal acknowledgment of multimodal pedagogy in English programs.
Participants
The study involved 15 English teachers. They were purposefully sampled from their respective teaching settings so as to identify individuals with relevant expertise and contextual insights into multimodal pedagogy. These teachers, all of whom hold at least a master’s degree in ELT or related fields, had extensive experience teaching English to undergraduate students. Their professional backgrounds included varying levels of multimodal resources and support. During this study, the teachers in focus were engaged in teaching English in their teaching contexts (Table 1), and their responsibilities extended beyond instruction to address the unique challenges under the current demands for active learning and inclusivity. Hence, their insights and background were crucial in understanding the effectiveness of different teaching strategies, including multimodal teaching and the impact of educational contexts on language learning outcomes.
Basic information about the contexts of data collection
| Contexts | Participants | Observation time |
|---|---|---|
| The US | 5 Teachers | 1.5 hours (each) = 7.5hrs |
| Jordan | 5 Teachers | 1.5 hours (each) = 7.5hrs |
| Yemen | 5 Teachers | 1.5 hours (each) = 7.5hrs |
Instruments
The datasets were collected through classroom observations, semi-structured teacher interviews and social media analysis. The seating arrangements of the observed classes included clusters in the US, mixed rows and circles in Jordan and traditional rows in Yemen. Activities included group projects, multimedia tasks and teacher-led discussions. The observations aimed to explore how teachers employed multimodal pedagogy and its impact on learner engagement and inclusivity in teaching English. Each teacher was observed for 1.5 hours, during which multimodal practices, teaching materials and student participation were documented. The integration of various modes (text, visuals, audio and digital tools) into teaching activities was also recorded, along with indicators of learner engagement and inclusion. Following the classroom observations, interviews were conducted with the participants one by one to gain deeper insights. This was necessary because the specific timing or context of the observed classes might not fully capture the broader picture of multimodal teaching practices. A list of interview questions was devised after Brinkmann (2018) to explore (1) how multimodal approaches were integrated into their teaching, (2) the challenges they faced and (3) the strategies they used to foster inclusivity and engagement.
In Yemen and Jordan, where multimodal pedagogy is not formally included in curricula, informal multimodal activities were observed on departmental Facebook pages. These pages were identified as central hubs where students and teachers shared posts, particularly those with multimodal elements. Posts were selected based on their relevance to classroom activities and inclusion of multimodal elements such as videos, images, infographics or audio. These platforms were selected to obtain additional insights into how students and teachers engage with multimodal content outside of formal classroom settings and their relevance to formal classroom settings. A dataset was collected from those posts during the academic year 2023–2024. These posts were created or co-created by students or teachers in the relevant Facebook groups and demonstrate multimodal practices such as combining text with visuals, videos or other non-textual elements (Figure 1).
The photograph shows three project boards placed in a horizontal series. The left board displays a mind map with a rectangular box at the center titled, “Parts of Speech”. This box is surrounded by nine cloud shapes illustrating parts of speech. These nine clouds are surrounded by nine rectangular boxes. Each box has some text in it. The center board is a list of terms contained in multicolored bubble shapes. The right board is a three-dimensional, black cross-section model of a head or vocal tract with connected labels. Note: The text on almost all three boards is not readable.
Illustrations of multimodal content on social media of the depts. Source: Screenshots by authors
Data analysis
All qualitative data (observation notes, interview transcripts and Facebook posts) were analyzed using a systematic, iterative coding process supported by NVivo software. The researchers began by reviewing the data to identify initial patterns and categories (Creswell & Creswell, 2018). Open coding was then applied to label relevant segments with provisional codes, e.g. technology integration, assessment challenges and inclusive strategies, followed by axial coding to group related codes into broader thematic categories. Finally, selective coding refined these categories to highlight key themes that explained the interplay between multimodal practices, institutional support, resource availability and learner engagement.
To enhance credibility, the researchers independently coded a subset of the data and resolved discrepancies through discussion until they reached a consensus. This intercoder reliability process increased the trustworthiness of the qualitative analysis. Brief member checking involved sharing preliminary interpretations with two teachers, who gave reflective feedback on the emerging themes so that the themes accurately reflected their lived experiences with multimodal pedagogy. Presented with a summary of the initial thematic categories derived from the data, the teachers in question provided feedback that confirmed the relevance and accuracy of the themes, particularly regarding institutional support, learner engagement and the challenges of implementing multimodal strategies.
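The study reports consensus through discussion rather than an agreement statistic. As an illustration only, agreement between two coders on the independently coded subset could be quantified with Cohen's kappa before discrepancies are discussed. The sketch below is a minimal, assumption-laden example: the function name `cohen_kappa` and the code assignments are hypothetical, and only the three code labels are taken from the provisional codes named in the text.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders labelling the same segments."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of segments")
    n = len(coder_a)
    # Observed agreement: share of segments given the same code by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of each coder's marginal proportions per code.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used one identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical code assignments for four segments (not the study's data).
coder_1 = ["technology integration", "technology integration",
           "assessment challenges", "inclusive strategies"]
coder_2 = ["technology integration", "assessment challenges",
           "assessment challenges", "inclusive strategies"]
print(round(cohen_kappa(coder_1, coder_2), 2))
```

A statistic of this kind would complement, not replace, the discussion-based resolution of discrepancies described above.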
Triangulation across multiple data sources (classrooms, interviews and social media) and consistent use of NVivo for transparent analysis supported the transferability, dependability and confirmability of the findings. This reflective approach ensured that the conclusions drawn were contextually sensitive and aligned with the study’s goal of understanding how multimodal pedagogy can enhance inclusivity and engagement in diverse educational settings. Data from observations, teacher interviews and social media analyses were triangulated to support key findings, enhancing the reliability and depth of the results. These sources were cross-analyzed using NVivo, allowing the researchers to identify converging patterns and validate findings across contexts. For example, a teacher’s claim in an interview about using videos to enhance engagement was supported by both classroom observations and the presence of related video content on social media. Likewise, observational data showing limited use of technology in Yemen was corroborated by the absence of digital tools in social media posts and interview comments on resource scarcity. This triangulation strengthened the credibility and depth of the findings by ensuring that themes were not based on a single data stream but were consistently reflected across multiple forms of evidence.
While the qualitative methods captured the nuanced experiences and perspectives of participants, the quantitative data provided measurable insights into the prevalence and impact of multimodal practices across the three contexts in question. Basic descriptive statistics were used to quantify patterns in participation and engagement. Engagement scores were derived from observation notes using a five-point Likert scale rubric. Each observed session was rated based on indicators such as student attentiveness, participation in activities and responsiveness to materials. Scores from all sessions in each context were averaged to produce the final engagement score per context.
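The averaging step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' actual procedure: the function name `mean_engagement`, the context labels and the session ratings are hypothetical placeholders.

```python
from statistics import mean


def mean_engagement(session_ratings):
    """Average per-session rubric ratings (1 to 5) for each context."""
    scores = {}
    for context, ratings in session_ratings.items():
        # Reject ratings that fall outside the five-point rubric.
        if any(not 1 <= r <= 5 for r in ratings):
            raise ValueError(f"rating out of range in {context}")
        scores[context] = round(mean(ratings), 1)
    return scores


# Hypothetical ratings, one per observed session (not the study's data).
example = {
    "Context A": [4, 5, 4, 5, 4],
    "Context B": [3, 3, 4, 3, 3],
}
print(mean_engagement(example))
```

With five sessions per context, each session contributes equally to the context-level score, so a single unusually high or low session shifts the average by at most 0.8 points.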
For analyzing the content on social media, each post was manually reviewed for its content and relevance. The posts were categorized into student-created content and teacher-led initiatives. The former includes videos and presentations shared by students, and the latter includes educational resources or materials shared by teachers, such as videos or annotated slides. The engagement metrics (e.g. number of likes, comments and shares) were documented to gauge the popularity and reach of each post.
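The categorization and tallying described above could be recorded along the following lines. This is a sketch under assumptions: the `Post` structure, the `summarize` helper and the sample values are hypothetical, and the study itself reviewed posts manually.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Post:
    category: str  # "student-created" or "teacher-led"
    likes: int
    comments: int
    shares: int


def summarize(posts):
    """Count posts and total likes, comments and shares per category."""
    totals = defaultdict(
        lambda: {"posts": 0, "likes": 0, "comments": 0, "shares": 0}
    )
    for post in posts:
        row = totals[post.category]
        row["posts"] += 1
        row["likes"] += post.likes
        row["comments"] += post.comments
        row["shares"] += post.shares
    return dict(totals)


# Hypothetical posts for illustration only (not the study's data).
sample = [
    Post("student-created", likes=10, comments=2, shares=1),
    Post("student-created", likes=5, comments=0, shares=0),
    Post("teacher-led", likes=7, comments=3, shares=2),
]
print(summarize(sample))
```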
Results
This section integrates qualitative and quantitative data about multimodal practices, their impact on learner engagement and inclusivity, and the challenges associated with assessing multimodal activities. The results are presented in tables and as short quotations and themes. They are arranged in two parts. The first corresponds to the first research question and the second corresponds to the second question.
Part 1:
This part addresses multimodality across the three contexts. For clarity, the results are organized under three subheadings: multimodal integration across contexts; impact on engagement and inclusivity; and participation rates and collaborative activities.
Multimodal integration across contexts
The findings from classroom observations, teacher interviews and social media analysis demonstrate significant variations in the implementation and impact of multimodal pedagogy across the three contexts. Table 2 outlines the patterns in multimodal implementation and impact that emerged from the thematic analysis, including key thematic variations across contexts in inclusivity, collaboration and technology integration. As the data in the table indicate, in the US, multimodal pedagogy was highly institutionalized, with practices embedded in classroom teaching. Teachers utilized a range of multimodal resources, such as slides, videos, online platforms and collaborative projects, to create dynamic, interactive learning environments. These tools were accessible, seamlessly integrated into the curriculum and supported by institutional policies that provided clear guidelines for their implementation. Observations revealed that teachers were well-prepared to utilize these resources, benefiting from professional development initiatives that equipped them with the skills necessary to integrate multimodality effectively.
Themes of multimodal pedagogy across three contexts
| Theme | Description | The United States (US) | Jordan | Yemen |
|---|---|---|---|---|
| Participation | The extent of student involvement in classroom activities | Structured activities such as group projects and digital presentations fostered consistent participation. Example: “Students love collaborating on online projects for that keeps them motivated” | Participation was teacher-driven and sporadic. Example: “I try to use videos to make my lessons more interactive, but it's not something I can do regularly” | Participation was minimal, relying on lectures with brief moments of activity through poster-making. Example: “We have limited tools, so most of my teaching is lecture-based” |
| Engagement | Reduced boredom and increased motivation through interactive tools | Engagement driven by multimodal tools like videos, group work and digital platforms. Example: “Using visuals and videos keeps students interested and improves their learning outcomes” | Engagement was moderate, with videos and diagrams stimulating interest. Example: “When I use visuals, the students respond better, but I can't do this for every class” | Engagement was low, as traditional lecture methods dominated. Example: “I don't have adequate tools or materials to make lessons more interactive” |
| Tech Integration | Using digital tools to enhance learning experiences | Consistent use of devices and online tools as part of the curriculum. Example: “Each classroom is equipped with projectors and access to digital tools for lessons” | Technology was limited, depending on availability and teacher familiarity. Example: “I sometimes use my own laptop to show videos because not every classroom has a projector” | Technology use was minimal due to resource constraints. Example: “We don't have access to computers, so I rely on textbooks and handouts” |
In comparison, multimodal pedagogy in the Jordanian context was applied in a less formalized manner, driven largely by the personal initiative of individual teachers rather than institutional directives. While course descriptions and syllabi rarely mandated the use of multimodal methods, the teachers in focus independently incorporated tools such as videos, diagrams and visual aids to enrich their lessons and make them more engaging and interactive. These efforts, though well-intentioned, were inconsistently applied and varied significantly across classrooms. The teachers in question expressed a strong interest in expanding their use of multimodal methods but cited several barriers. A Jordanian teacher posted a slide deck explaining rhetorical devices, which students referenced in one of the classes that was observed. Such instances highlight the potential for integrated multimodal learning when digital content is purposefully aligned with instructional goals.
Similarly, implementing multimodal pedagogy in Yemen was minimal, with classrooms largely reliant on monomodal methods such as lecture-based instruction and text-heavy materials. Observations highlighted a pervasive dependence on traditional teaching approaches, with little integration of visual, auditory or interactive elements. While some teachers made isolated efforts to incorporate multimodal activities, such as student-created posters or group discussions on their Dept. Facebook pages (e.g. Figure 1), these practices lacked depth or continuity. Noticeably, the students created posters summarizing grammatical concepts, which were shared on Facebook. Although visually engaging and well-received online, these materials were optional and disconnected from formal classroom instruction, limiting their pedagogical impact. The posts were generally created on an ad hoc basis rather than as part of a structured curricular framework. While some activities (e.g. student posters) were tied to classroom assignments, many were self-initiated. For example, a Yemeni teacher explained that a poster summarizing a novel was an optional project shared online to encourage creativity but was not formally assessed in the classroom. Another example of engagement in the Jordanian context is a video summarizing the themes of a poem. It received 120 likes and over 20 comments, reflecting its impact on the online audience.
Impact on engagement and inclusivity
The data suggest that fully integrated multimodal strategies enhance both learner engagement and inclusivity. Class observations in the US context show that students enthusiastically participated in group projects and digital presentations. By offering multiple avenues for expression beyond traditional text-based tasks, multimodal approaches accommodate diverse learning preferences and abilities, thereby fostering a more inclusive learning environment. As one teacher noted, “When I use group projects and multimedia tools, students who usually struggle with text-based tasks find ways to shine.” This example shows how such practices can boost student confidence, enhance motivation and encourage equitable participation.
In contrast, the less systematic application of multimodal methods in Jordan and Yemen limited their overall impact on inclusivity and engagement. A Jordanian teacher noted, “I use videos and visual aids, but it depends on what we can afford or what's available in the classroom,” highlighting the reliance on individual initiative rather than institutional support. Similarly, in Yemen, the dominance of monomodal lecture-based instruction presented challenges in meeting the diverse needs of learners, as one teacher observed, “We want to use more visuals or interactive methods, but we don't have the resources.”
To support the qualitative findings, engagement levels were quantified using a five-point scale (1 = low engagement and 5 = high engagement). As shown in Table 3, the US averaged an engagement score of 4.5, aligning with its integrated multimodal practices, while Jordan’s moderate score of 3.2 mirrored its intermittent use of videos and visuals. Yemen’s score of 2.1 reflected the limitations of monomodal teaching methods and emphasized the need for resources and institutional support.
Engagement in the observed classes
| Context | Engagement score (average) | Observed patterns |
|---|---|---|
| The US | 4.5 | Active discussions, enthusiasm in group work |
| Jordan | 3.2 | Attention during visual/video tasks |
| Yemen | 2.1 | Minimal engagement, reliance on lectures |
Participation rates and collaborative activities
Analysis of participation rates, defined as the proportion of students actively involved in classroom activities, underlines the disparities across the three contexts. As shown in Table 4, the US recorded the highest participation rate at 83%, with learners frequently engaged in group projects, interactive discussions and digital presentations. In contrast, Jordanian classrooms achieved a moderate participation rate of 57%, where student involvement increased notably during activities incorporating video-based materials. Yemeni classrooms exhibited the lowest participation rate (33%), characterized by limited opportunities for interactive learning and a reliance on lecture-based instruction. In the Yemeni context, occasional use of visual aids or small-group discussions offered temporary boosts in engagement but failed to yield sustained improvement. Teachers expressed a strong desire to adopt more multimodal strategies but acknowledged that the scarcity of resources and training hindered their efforts.
Participation rate in the observed classes
| Context | Participation rate | Observed activities |
|---|---|---|
| The US | 83% | Group discussions and collaborative projects |
| Jordan | 57% | Class discussions and video-based activities |
| Yemen | 33% | Lecture-heavy sessions with brief group tasks |
Collaboration and group work played a critical role in enhancing participation and communication skills, particularly in the US. One teacher remarked, “I assign tasks for students to work in pairs and create presentations; it's very effective,” illustrating how structured teamwork activities supported learner engagement. In the Jordanian context, collaboration occurred when teachers proactively sought to engage students, though such efforts depended on the topic and available materials. As one Jordanian teacher explained, “I ask students to work in pairs for discussions, but it depends on the topic.” In Yemen, however, collaborative initiatives remained sporadic and resource-constrained. A Yemeni teacher stated, “There isn't enough time or resources to organize group tasks regularly,” which highlights the systemic challenges that prevented consistent use of collaborative strategies. Together, these findings emphasize the importance of institutional support, training and resources for sustained and meaningful participation through multimodal and collaborative learning experiences.
Regarding technology, the results in Table 5 reflect the availability of infrastructure, institutional support and professional development in the US, where digital resources were incorporated into approximately 92% of the observed sessions. This enabled learners to create presentations, engage in online collaborations and access a range of multimedia materials. One US teacher noted, “Students are more motivated when they can use technology and express themselves creatively; it keeps them involved and excited to learn.” This high degree of technological support and readily available equipment contributed to more interactive and engaging learning experiences. In contrast, technology use in Jordan occurred in about 48% of the sessions, depending on the initiative of individual teachers who relied on personal devices rather than institutional provisions. In Yemen, only 18% of sessions featured digital tools, reflecting the scarcity of basic equipment such as computers and projectors, and of reliable electricity. One Yemeni teacher recounted having to borrow a colleague’s laptop to show a grammar video, forcing students to crowd around a small screen. These constraints severely limit the feasibility and sustainability of multimodal teaching strategies in resource-scarce settings. These stark differences highlight how systemic factors, including resource allocation, institutional policies and infrastructure, influence the extent to which technology can be harnessed to support multimodal teaching and learning.
Technology integration in the observed classes
| Context | Using technology | Examples |
|---|---|---|
| The US | 92% | Digital presentations, online collaboration |
| Jordan | 48% | Teacher’s personal devices and videos |
| Yemen | 18% | Occasional videos on shared devices |
Part 2:
This part corresponds to the second research question. It relates to the assessment of multimodal practices in English classes. The focus is on scalable methods to assess multimodal competencies. The results revealed a set of common challenges across the three contexts. In the US, teachers demonstrated a willingness to experiment with innovative approaches, such as grading contracts and digital portfolios. These strategies showed promise in capturing a broader range of student abilities, including collaboration and multimodal skills. However, while teachers in the US were relatively well-positioned to explore these methods, owing to better resource availability and institutional support, they pointed to the lack of standardized criteria and the considerable time required to implement such assessments effectively. One teacher explained, “We need rubrics and clear criteria to make these assessments reliable, especially if they're going to be adopted widely.” This comment stresses the need for scalable frameworks that can offer consistency and efficiency, ensuring that multimodal assessments can be integrated sustainably into established educational practices. The Jordanian and Yemeni contexts presented more pronounced challenges. In Jordan, multimodal assessment remained largely absent, with educators continuing to rely on traditional testing methods that are ill-suited for evaluating tasks involving visual, auditory and interactive elements. Similarly, Yemeni teachers were constrained by limited resources and an entrenched reliance on conventional examinations, making it difficult to fairly and comprehensively evaluate multimodal work. Teachers in both contexts (Jordan and Yemen) expressed uncertainty about robust guidelines to assess the full spectrum of skills that multimodal assignments could potentially develop.
These findings align with prior research on the difficulties of implementing new assessment forms in under-resourced environments. Although innovative tools like portfolios and grading contracts have demonstrated promise in principle (Garvey, 2022; Reid et al., 2015), their use has remained limited to informal or pilot initiatives. Teachers interviewed across the three contexts consistently voiced a need for professional development opportunities to enhance their capacity to design and administer effective multimodal assessments. This need is especially pressing in Jordan and Yemen, where limited access to technology, lack of professional development and institutional constraints pose substantial obstacles. In these settings, educators must be equipped not only with theoretical knowledge of multimodal assessment practices but also with practical, context-sensitive tools that can operate under resource constraints.
Moreover, the persistent lack of standardized frameworks and criteria for evaluating multimodal outputs reflects broader systemic issues. West-Puckett's (2016) observations on resource limitations and the absence of institutional structures mirror the challenges identified in this study, while Newfield's (2011) insights into the difficulty of integrating innovative pedagogies into traditional assessment cultures further illuminate the complexity of the problem. Without institutional backing and clear benchmarks for success, teachers find themselves navigating a landscape where new teaching methods and learning modalities remain undervalued or misunderstood. As a result, the transformative potential of multimodal pedagogy is undermined by the inability to systematically and equitably measure its effectiveness.
Discussion
Findings from the first part reaffirm the potential of multimodal pedagogy to enhance inclusivity in English language teaching, an approach that goes beyond the traditional teaching of the English language (Bailey & Damerow, 2014), echoing insights from prior research (e.g. Whitney, 2016; Nwachukwu et al., 2024). This reflects Cope and Kalantzis' (2016) assertion that multimodal literacy embraces the diverse semiotic repertoire learners bring to the classroom. Moreover, this aligns with the UDL framework (Nwachukwu et al., 2024), which advocates for providing multiple means of representation, engagement and expression to address learner variability. These findings are consistent with Kress (2010), who emphasizes that multimodality allows for a richer communication landscape, facilitating learning for students with diverse backgrounds and abilities. By incorporating various modes of representation, teachers in the US were able to cater to different learning styles and preferences, thereby promoting inclusivity. This is particularly important in heterogeneous classrooms where students may have varying levels of proficiency, have learning disabilities or come from different cultural backgrounds (Ajayi, 2011). The success observed in the US context underlines the effectiveness of multimodal pedagogy when it is supported by institutional frameworks and resources. These findings align with Halliday and Matthiessen's (1999) emphasis on the role of multimodality in bridging cultural and linguistic diversity but reveal systemic barriers that prevent this potential from being fully realized in resource-limited settings.
Regarding technology integration (Table 5), which largely determines the magnitude of multimodality, teachers in the US used a variety of tools, including diagrams, slides, videos and interactive discussions, to cater to students with different learning styles. For instance, visual learners benefited from the extensive use of visual aids, such as charts and multimedia presentations, which helped them grasp complex concepts more easily. Auditory learners, in contrast, benefited from spoken content and video materials, which enhanced their engagement and comprehension. The variety of modalities (Table 3) allowed all students to participate meaningfully in classroom activities, creating a sense of belonging and inclusivity that extends to learners who may struggle with traditional, text-heavy methods. This holistic approach to learning not only enhances student engagement but also empowers students by providing equitable opportunities to succeed, regardless of their preferred learning style.
In Jordan and Yemen, while individual teachers took the initiative to incorporate videos, diagrams and visual aids into their lessons, these efforts were inconsistent across classrooms and largely dependent on personal motivation. In this regard, Lim et al. (2022) asserted that real change often emerges from grassroots classroom practices. Observations revealed that students engaged more actively in lessons when multimodal elements were included, as these tools made the material more accessible and relatable (Table 2). For example, using video content often captured students’ attention and spurred discussions, while visual representations helped clarify abstract concepts. However, these benefits were often short-lived due to the absence of a system-wide approach to multimodal integration. Without clear policies or frameworks to standardize these practices, their effectiveness was limited and their impact on engagement and inclusivity was uneven. Nurlely (2023) identified comparable challenges in low-tech contexts, where limited access to technology poses significant barriers for teachers attempting to implement multimodal practices effectively. The disparities observed highlight the critical role of institutional support and resources in enabling teachers to adopt inclusive pedagogies.
The study also demonstrates that multimodal pedagogy enhances learner engagement, particularly when it is supported by systemic implementation. This finding resonates with situated learning theory (Lave & Wenger, 1991), which emphasizes the importance of authentic, contextually relevant learning experiences. By integrating real-world tools and practices, multimodal approaches positioned learners as active participants in their educational journey, fostering deeper engagement. This is further supported by Jewitt (2008), who argues that multimodal pedagogies can transform the learning environment by making it more interactive and student-centered. The observed active participation aligns with Vygotsky's (1978) social constructivist theory, which posits that learning takes place via collaboration and social interaction. By facilitating collaborative and creative tasks, multimodal pedagogy leverages social learning processes that enhance engagement and comprehension. However, in Jordan and Yemen, engagement was lower due to the limited and inconsistent use of multimodal practices.
Teachers reported that the occasional use of videos, diagrams or posters temporarily boosted participation, but these efforts lacked the sustainability observed in the US. One Jordanian teacher remarked, “When I use a video, students pay attention, but I don't have time to do this for every lesson.” Similarly, a Yemeni teacher noted, “visual aids help students engage better, but we lack the resources to use them regularly.” These responses highlight the importance of a structured approach to multimodal integration. Sporadic implementation is insufficient to achieve lasting improvements in learner motivation, underscoring the need for systemic support to standardize multimodal practices. This is consistent with Walsh (2010), who suggests that without institutional backing and adequate resources, teachers may be unable to sustain multimodal interventions. The lack of consistency in implementing multimodal strategies can lead to fragmented learning experiences that do not fully engage students. The necessity for a coherent and sustained approach is evident, pointing to the need for curriculum development and teacher training that prioritize multimodal engagement strategies.
Findings from the second part, which concern assessment, a persistent theme across all three contexts, mirror broader challenges in the literature (e.g. Garvey, 2022; Lim et al., 2022). This aligns with Reid et al.'s (2015) argument that alternative assessments require clear guidelines to ensure validity and reliability. Moreover, Serafini (2012) highlights that assessing multimodal texts necessitates new frameworks that can account for the different modes of communication involved. In the US context, refining and standardizing innovative practices, such as digital portfolios, could facilitate broader adoption. Teachers could benefit from professional development focused on designing and implementing reliable assessment rubrics for multimodal tasks. Such rubrics would provide clear criteria for both teachers and students, enhancing the reliability and validity of assessments. Meanwhile, in Jordan and Yemen, efforts should focus on developing low-cost, scalable rubrics aligned with local resources and teaching practices. As one Yemeni teacher suggested, “If we had simple tools for grading group work or creativity, it would make a big difference.” This statement captures the urgent need for accessible, low-cost tools tailored to the constraints of under-resourced environments. These measures would enable teachers to bridge the gap between pedagogical innovation and assessment, ensuring that multimodal practices are both effective and sustainable. This approach is supported by Hafner (2014), who argues that practical assessment tools are necessary for the successful integration of multimodal literacy in classrooms.
The findings also suggest that social media platforms can foster multimodal expression, but their ad hoc uses limit their educational impact. The analysis of Facebook pages revealed insights into informal multimodal practices in both Jordan and Yemen. However, social media activities lacked alignment with class objectives. Without instructional alignment or guidelines, these activities remained disconnected from formal curricular goals. In Yemen, student-created posters on Facebook were independent projects, with no direct connection to lesson plans or assessments. In Jordan, video projects were occasionally referenced in class but were not consistently evaluated or incorporated into lesson objectives.
Overall, the findings support Bailey and Damerow's (2014) call for shifting away from lecture-based instruction to more interactive student-centered learning. Implementing and enhancing a multimodal approach to English teaching in the Middle East holds significant potential for improving learning outcomes. While multimodal approaches can enhance inclusivity and engagement, their effectiveness varies based on resource availability. In the current study, structured multimodal practices supported by policies and resources in the US were more effective than those in Jordan and Yemen, where inconsistent application by individual teachers shows both potential and limitations. This implies the need for institutional support, resource allocation and the inclusion of multimodal approaches in national curricula. It also stresses the importance of training and developing inclusive assessment techniques to capture a wide range of competencies and address diverse educational needs. Without formal guidelines, multimodal practices resulted in uneven impacts on student engagement. Training in developing inclusive assessment techniques such as grading contracts (Garvey, 2022), digital badges (Reid et al., 2015) and open-ended evaluation criteria can help educators capture the full spectrum of multimodal competencies. These techniques could include rubrics tailored to the needs of resource-constrained environments that evaluate multimodal learning within a framework aligned with existing teaching practices, which could help bridge the gap between multimodal pedagogy and assessment. In addition to teacher training, policy-level reforms are essential for institutionalizing multimodal pedagogy and ensuring its scalability and sustainability. Without such systemic changes, the benefits of multimodal approaches may remain inaccessible to learners in under-resourced environments.
Conclusion
The study examined how multimodal pedagogy enhances inclusivity and engagement across the contexts of the US, Jordan and Yemen. While in resource-rich environments (the US being a case in point) multimodal practices are evident in formal settings, resource-limited settings (Jordan and Yemen) have relied on individual teachers' initiatives, leading to inconsistent and less effective application of multimodal practices. Classrooms in the American context implemented multimodal pedagogy in a consistent and well-integrated manner and, in turn, showed higher levels of learner engagement and inclusivity. However, a shared challenge across all three contexts is the lack of standardized criteria for evaluating multimodal competencies. Multimodal assessment tools and criteria are therefore necessary to facilitate the broader adoption of multimodal pedagogy and ensure scalability across diverse educational environments.
While the study identifies pathways for contributing to the global discourse on inclusive and multimodal foreign language education, it is important to acknowledge its limitations. The main caveat stems from the small sample size, which restricts the generalizability of the findings. This mixed-methods study leans toward descriptive insights into the affordances and constraints of multimodal pedagogy, drawing primarily on observational data and interview accounts. Despite in-depth analysis combining data from classroom observations, interviews and social media analysis, there is a need for longitudinal data to better understand the sustained impact of multimodal practices on engagement and inclusivity. Future research may build on the findings and expand the scope of investigation to examine the long-term effects of multimodal approaches across different levels and settings. There is also a need for standardized, context-sensitive assessment tools for multimodal competencies. Addressing these gaps can help unlock the full potential of multimodal pedagogy, demonstrating how it can transform language education into a more inclusive, engaging and effective learning experience.

