This study aims to develop and validate a scale to measure commercial friendship quality (CFQ) between hospitality staff and guests. In addition, this study provides quantitative validation of a conceptual model previously developed through qualitative research.
Drawing on friendship theory and prior qualitative research, an initial item pool was generated based on three core dimensions: disclosure, social support and activities. Following a widely recognized scale development procedure, the item pool was systematically refined through expert review, cognitive interviews and pilot testing. Data were collected from 562 guests of Dutch pubs. Subsequently, exploratory and confirmatory factor analyses were conducted, followed by tests of convergent and nomological validity.
The results reveal a three-factor structure that largely corresponds to the proposed dimensions. The final scale showed strong reliability and validity, including nomological validity for loyalty and word of mouth.
The CFQ scale provides hospitality managers with a practical tool to evaluate and improve guest interactions, offering insights into the social dynamics that underpin repeat patronage. In doing so, it helps translate academic findings into actionable strategies to enhance loyalty and positive word of mouth.
This study substantiates earlier qualitative work with quantitative evidence and introduces the first validated scale for measuring CFQ. It advances research on the social component of service experiences and highlights the novelty of capturing the unique dimensions of commercial friendship in the hospitality sector.
1. Introduction
Loyalty benefits hospitality businesses by reducing costs, improving retention and stimulating positive word of mouth (Bilgihan et al., 2025; Kandampully et al., 2015). In hospitality, loyalty is fostered by delivering satisfaction through added value (Han et al., 2019; Kandampully and Suhartanto, 2000), much of which emerges in host–guest interactions (Veloso and Gomez-Suarez, 2023; Wang, 2019). Repeated encounters may become personal relationships (António and Rita, 2023; Velthuis, 2022) and develop into friendship-like social bonds (Velthuis et al., 2024); when such bonds form in a service environment, they are commonly termed commercial friendships (Banerji et al., 2020).
Commercial friendship is widely regarded as consequential in hospitality (Lashley and Morrison, 2003). Emotional bonds can create loyalty that is more resistant to competitive pressures (Mattila, 2001), and commercial friendship has been linked to both business performance and guest well-being (Lee and Kim, 2022). Economically, it increases willingness to pay and strengthens behavioral loyalty (Pamacheche and Duh, 2021), enhances satisfaction and retention and encourages recommendation intentions (So et al., 2019). It also stimulates positive word of mouth, a powerful form of peer-to-peer marketing (Storr et al., 2023). Socially, commercial friendship can create memorable and meaningful experiences and a sense of belonging (Velthuis, 2022), contribute to supportive networks and reduce feelings of isolation in contexts where guests seek social contact or respite from loneliness (Ashida and Heaney, 2008; Buz et al., 2014). It is further associated with psychological outcomes such as happiness and life satisfaction (Pezirkianidis et al., 2023).
Despite its numerous advantages, commercial friendship also entails risks. The dual roles of friend and business associate can generate conflict when self-interested motives are perceived, leading to damaged relationships, negative word of mouth or retaliatory behavior (Grayson, 2007; Oppen, 2020). For employees, displaying emotions that diverge from genuine feelings may undermine well-being and work performance (Hochschild, 2012). These mixed outcomes underscore the construct’s complexity, yet much prior work treats commercial friendship as binary − either present or absent (Pamacheche and Duh, 2021). Only recently, Velthuis et al. (2024) showed that commercial friendship is multidimensional, changes over time and varies in quality, opening several avenues for further research.
To build on these opportunities, the literature needs a validated commercial friendship quality (CFQ) measure. Without such a measure, researchers cannot rigorously test how commercial friendship forms and drives outcomes (i.e. loyalty and word of mouth), compare CFQ across settings or cultures or model trajectories of development over time. Given the substantial benefits and risks, commercial friendship warrants more precise conceptualization. A validated CFQ scale will enable rigorous hypothesis testing.
Commercial friendship sits at the intersection of business exchanges and social relationships, which might suggest that existing measures can simply be applied to it. However, relationship quality scales in marketing primarily assess evaluative judgments about an exchange relationship (e.g. trust, commitment, satisfaction) and are typically modeled as antecedents or outcomes in performance models (Crosby et al., 1990), offering limited insight into how host–guest interactions become personal and friend-like. Noncommercial friendship quality scales were developed for reciprocal, voluntary ties in which both parties can escalate disclosure, support and shared activities without professional constraints. Hospitality relationships differ on precisely these points: they originate in a paid service role, are bounded by professional norms and often develop asymmetrically (Velthuis et al., 2024). CFQ, therefore, requires a dedicated operationalization of friendship-like qualities. In line with Nunkoo et al.’s (2025) argument that interdisciplinary research can open new avenues and generate fresh knowledge, this study answers that call by integrating insights from relationship marketing, friendship theory and hospitality research to operationalize CFQ.
This research builds on Velthuis et al. (2024) and pursues two objectives: to examine whether their qualitative findings are supported by quantitative evidence, and to introduce the first validated scale to capture and quantify CFQ. We first situate commercial friendship in marketing and hospitality literature and define the construct, arguing that it develops through stages in which quality changes rather than constituting a uniform state between purely personal and purely business interaction. We then show how commercial friendship differs from both noncommercial friendship and relationship quality in marketing, justifying the development of a new scale. The subsequent sections describe the scale development procedure, data analyses, model tests and the theoretical and practical implications of our findings.
2. Literature review: Commercial friendship quality in hospitality
2.1 Commercial friendship within the frame of relationship marketing and hospitality literature
Commercial friendship can be situated within relationship marketing and hospitality scholarship, where value creation is inherently relational. In services, relationship marketing emphasizes how ongoing customer–supplier relationships are strengthened through social bonds that support retention and loyalty (Berry, 1983). These bonds are created and maintained during the service encounter: social interaction structures the exchange and becomes a key locus for co-producing value (Carvalho and Alves, 2023; Grönroos, 2006; Solomon et al., 1985; Surprenant and Solomon, 1987). Hospitality research similarly treats social relating as constitutive of the domain, describing hospitality as the “human exchange around products and services” (Brotherton, 1999).
In hospitality, these relational processes are enacted primarily at the interpersonal level. Hostmanship captures the interactional work through which guests are made to feel welcome (Gunnarsson et al., 2011). Feeling welcome involves experiences of genuine connection, appreciation and belonging (Medema and de Zwaan, 2020). Conceptually, these experiences can be understood as communicated valuing, which is central to philosophical accounts of friendship (Leibowitz, 2018). When communicated valuing recurs across encounters, guests may interpret it as exceeding professional courtesy and as reflecting a more personal bond. We therefore position commercial friendship as the actual social bond within relationship marketing and the friendship-like component of hospitality encounters.
2.2 Defining commercial friendship
Commercial friendship was introduced as a distinctive relationship formed within service encounters (Price and Arnould, 1999). According to these scholars, repeated interactions can produce personal and emotional connections, and when these are experienced as enjoyable and accompanied by liking or trust, individuals adjust their behavior toward a friendship-like relationship (Banerji et al., 2020). Subsequent work broadened the concept to include ties among coworkers, business representatives and guests (Storr et al., 2023), while Banerji et al. (2020) defined it more narrowly as a personal relationship between business representatives and customers that originated in a service context. Much prior research treats commercial friendship as a fixed state between purely transactional business relationships and purely relational noncommercial friendships. In contrast, Velthuis et al. (2024) recently showed that commercial friendships differ in quality and develop through levels.
Based on this literature, we define commercial friendship in a hospitality setting as a social relationship between a host and a guest that resembles noncommercial friendship, whose quality varies along a continuum from non-existent (e.g. first-time guest) to extremely high (e.g. close friend). Importantly, even when host and guest interact purely professionally, the relationship is still classified as a commercial friendship in its minimal form.
This study builds on Velthuis et al. (2024). We summarize their CFQ dimensions and developmental levels, incorporate their finding that development can be asymmetric (differences between hosts and guests in what they say and do), and use their conceptual model as the starting point for the current study.
2.3 Dimensions of commercial friendship quality
CFQ comprises three core dimensions: disclosure, social support and activities (Velthuis et al., 2024). Disclosure concerns the breadth and depth of sharing personal thoughts, feelings and experiences. Social support includes emotional and instrumental assistance; emotional support involves empathic listening, encouragement and concern, whereas instrumental support involves practical help (e.g. advice, assistance, access to resources) (Mendelson and Aboud, 1999). Activities capture what hosts and guests do together, from sharing a drink to attending private events. Velthuis et al. (2024) further distinguished subdimensions within activities: type (what is done), place (where it occurs) and planning (planned vs spontaneous).
2.4 Levels of commercial friendship
Velthuis et al. (2024) described six levels through which the three dimensions intensify. At Level 1 (Guest), interaction is largely transactional: activities are confined to the service encounter, disclosure is formulaic and support is absent. Level 2 (Regular) remains venue-bound but includes greater familiarity and professional attention; disclosure stays superficial, and support remains limited. Level 3 (Casual friend) broadens interactions (including a broader range of activities), while disclosure typically remains focused on public, positive topics.
A substantive shift occurs at Level 4 (Friend): disclosure deepens to private and sometimes negative experiences; activities become more planned and personal; and emotional support becomes more salient through empathic listening. Levels 1–4, therefore, mark the transition from role-based familiarity to a recognizably personal relationship, driven primarily by deeper disclosure and the emergence of support. At Levels 5 (Close friend) and 6 (Best friend), engagement across all three dimensions reaches its maximum: activities extend to planned private events and home visits, and disclosure reaches full breadth and depth.
In this article, we focus on quality rather than discrete levels. The levels are an analytically useful typology derived from thematic analysis, yet everyday relationship judgments often have fuzzy boundaries (e.g. differentiating acquaintance, friend or pal). Treating CFQ as a continuous latent construct better captures incremental change, retains information and enables more precise comparisons over time and across groups.
2.5 Asymmetry of commercial friendship development
Social penetration theory conceptualizes relationship deepening as reciprocal increases in disclosure breadth and depth, with partners matching and gradually extending intimacy (Altman and Taylor, 1973). Noncommercial friendship is likewise often framed as reciprocal, with mutual support and shared interests (Rubin and Bowker, 2018). In hospitality, however, role requirements introduce asymmetries: service providers must maintain service standards while responding to guest needs, creating tension between authentic expression and professionalism (Wirtz and Lovelock, 2016). Hosts may therefore engage in emotional labor (Hochschild, 2012), adhere to scripts and withhold genuine emotion (Ahmad et al., 2023). Velthuis et al. (2024) reported asymmetry in disclosure and social support at Levels 2–4 (but not at Levels 1, 5 or 6). Guests disclose more than bartenders, and guests typically receive support from bartenders, whereas bartenders rarely rely on guests for comparable support. Measures are therefore needed to describe both perspectives within these dimensions.
Based on these findings, we present the following conceptual model for testing (Figure 1).
Figure 1. Conceptual model. Commercial friendship quality is linked to three components: activities (type, spontaneity and location), disclosure (by guest and by bartender) and social support (from guest and from bartender).
Source: Created by the authors
2.6 Perception of friendship quality
Because we later justify appropriateness rather than frequency as response anchors, we clarify how people judge CFQ. First, friendship quality is partly inferred from the other person’s behavior because friendship is socially learned and guided by expectations about acceptable behaviors at different relationship stages; individuals evaluate quality by assessing fit with norms of what is appropriate (Xue and Silk, 2012). These norms imply an “appropriateness range” that varies with friendship quality. Second, friendship quality is reflected in behavioral matching and mirroring (Hugh-Jones and Ooi, 2023): people learn what is appropriate by observing actions and responses and adjusting accordingly. Measuring the appropriateness of specific behaviors during interaction, considering both self and other, therefore provides a suitable proxy for friendship quality and a defensible scale anchor.
2.7 Positioning CFQ relative to relationship marketing and friendship scales
Although commercial friendship sits at the interface of market exchange and social relations, existing measures from these domains do not map cleanly onto CFQ. Available instruments largely fall into two categories: relationship marketing/relationship quality measures that operationalize exchange relationships via evaluative judgments and performance consequences, and friendship quality measures that operationalize voluntary, reciprocal personal ties. CFQ overlaps with both yet adds two ingredients that these constructs do not jointly capture: (a) role-bound host–guest asymmetry and (b) judgments about the appropriateness of friendship-like behaviors within commercial service encounters (Velthuis et al., 2024).
Relationship marketing is an organization-level strategic framework for managing customer relationships (Berry, 1983; Grönroos, 2006). When operationalizing relational value, it commonly relies on trust, satisfaction and commitment to explain outcomes such as loyalty and word of mouth (Crosby et al., 1990; Hennig-Thurau et al., 2002; Palmatier et al., 2006). These measures are valuable for exchange relationships but are not designed to capture the interactional content through which host–guest ties become personal in hospitality settings (Price and Arnould, 1999). Moreover, commercial friendships can be interpreted as more or less genuine depending on perceived motives (Grayson, 2007). Thus, standard relationship marketing metrics locate “relationship quality” in evaluations and outcomes, whereas CFQ targets friendship-like behaviors (disclosure, support, activities) and whether they are appropriate given service roles (Velthuis et al., 2024).
Friendship quality scales share CFQ’s behavioral content (disclosure, support, activity) (Bukowski et al., 1994; Parker and Asher, 1993) but were developed for reciprocal, noncommercial relationships and typically do not incorporate role expectations and structural asymmetries that shape hospitality ties (Armsden and Greenberg, 1987; Bukowski et al., 1994; Mendelson and Aboud, 1999). In commercial friendships, relationships originate transactionally, are bounded by professional norms and frame social support as part of paid service (Grayson, 2007), implying that identical behaviors can carry different meanings by role and context.
Taken together, these measurement precedents are well-suited to their intended constructs, but neither is designed to capture CFQ’s role-bound asymmetry and appropriateness-based evaluation of friendship-like behaviors in hospitality service encounters. Accordingly, this study develops and validates a context-specific psychometric scale for CFQ. It also tests whether the proposed qualitative model of Velthuis et al. (2024) is supported by quantitative evidence.
3. Methods
We followed De Vellis and Thorpe’s (2022) scale development framework. Because the construct exploration and conceptual model were established in prior qualitative work (Velthuis et al., 2024) and summarized in the literature review, we briefly report that phase here and focus on item generation, iterative refinement, response anchors, data collection and the analytic approach. In line with De Vellis and Thorpe (2022), the initial item-generation phase prioritized breadth over parsimony by generating multiple, partly overlapping items per dimension and allowing redundancy. This section outlines the steps taken to arrive at the final set of items and describes data collection. For readability, the data analyses, including data cleaning, checks for common-method bias, exploratory and confirmatory factor analysis (EFA/CFA) and tests of convergent and nomological validity, are reported in the results section.
3.1 Exploration of the concept
The foundational qualitative phase was completed in earlier research (Velthuis et al., 2024). That study conducted in-depth, semistructured interviews with bartenders and pub guests in The Netherlands and linked interview insights to theory through thematic analysis (Braun and Clarke, 2006), producing a data-driven conceptualization (Figure 1). For full qualitative procedures and analysis, see Velthuis et al. (2024).
3.2 Item generation
We integrated qualitative and psychometric phases through an explicit chain of evidence. Starting from the conceptual model (Velthuis et al., 2024), we generated an overinclusive item pool covering each dimension. Following De Vellis and Thorpe (2022), items were drawn from three sources. First, based on Alsarrani et al.’s (2022) overview, we identified four widely used friendship quality instruments: the McGill Friendship Questionnaire (Mendelson and Aboud, 1999), the Friendship Qualities Scale (Bukowski et al., 1994), the Friendship Quality Questionnaire (Parker and Asher, 1993) and the Inventory of Parent and Peer Attachment (Armsden and Greenberg, 1987). We adapted their items by substituting “friend” with “bartender” and contextualizing them for commercial interactions. Second, we generated items inductively from Velthuis et al.’s (2024) transcripts by translating thematic codes and participant phrasing into candidate items per dimension. Third, we used ChatGPT-4.0 (OpenAI) to propose candidate items aligned with the construct to broaden wording and content coverage (Wang et al., 2025). All suggestions were reviewed and edited by the author team before inclusion in the item pool. For convergent validity, we included the highest-loading items from established friendship scales; for nomological validity, we included Price and Arnould’s (1999) items to test links with loyalty and positive word of mouth.
3.3 Refinement of the item pool
The initial pool comprised more than 300 items and was refined in five iterative stages. First, we removed duplicates and used more abstract wording to reduce length and response fatigue (e.g. replacing “going to a concert/museum” with “an activity lasting part of a day”). Second, the author team reviewed dimension coverage and revised or removed items that did not align with the framework. Third, we interviewed experts who had published on commercial friendship and revised the pool based on their feedback (simplifying wording, adding control items and verifying coverage of all subdimensions). Fourth, the first author conducted 10 cognitive interviews with diverse participants using think-aloud and cognitive probing (Willis, 2005) to assess the interpretation of items and anchors (e.g. clarity of items; distinguishing “not really appropriate” from “somewhat appropriate”); issues were corrected iteratively until no clarity problems remained. Fifth, a small pilot study with university students and teachers (n = 17) confirmed sufficient variance, acceptable completion time and no residual interpretation issues. The refined pool contained 37 items (disclosure = 10; social support = 12; activities = 15), approximately twice the minimum required to retain at least three indicators per dimension (Hair, 2019).
3.4 Response anchors
Items were rated on a six-point Likert scale capturing appropriateness of disclosure, social support and activities; we omitted a neutral midpoint to encourage more decisive responses (De Vellis and Thorpe, 2022). Cognitive interviews supported this choice: with an initial five-point format, respondents frequently selected the midpoint to avoid taking a stance or to respond in socially desirable ways. We also adopted appropriateness rather than frequency as the response anchor. In cognitive interviews, “how often” was interpreted inconsistently: some respondents used a relative frame (e.g. “often” because it occurs every visit, even if visits are rare), whereas others used an absolute frame (e.g. “not often” because they visit infrequently). Frequency-based disclosure items also showed limited variance because even close ties rarely discuss severely negative topics (e.g. tragedies, family illness), leading to clustering around low-frequency options. Appropriateness anchors additionally reduced recall bias.
3.5 Data collection
We recruited 1,067 guests from Dutch pubs using convenience sampling (n = 383) via LinkedIn posts, newsletters, personal networks, trade magazines and intermediaries such as bar owners and staff, and Prolific (n = 684; www.prolific.com), a panel platform that connects researchers with respondents and is associated with high data quality (Douglas et al., 2023). Prior work indicates that Prolific samples pass more attention checks and yield higher-quality responses than MTurk, Qualtrics panels or undergraduate pools (Uittenhove et al., 2023). Data were collected via Qualtrics between November 3, 2024 and January 31, 2025. This study was reviewed and approved by the Ethical Review Board of the Tilburg University School of Social and Behavioral Sciences (approval code: TSB_RP1168). All procedures complied with institutional and national ethical guidelines for research involving human participants. Prior to participation, respondents received written information explaining the purpose of the study, the voluntary nature of participation and their right to withdraw at any time without consequences. Participants were informed that all data were collected anonymously, that only the research team had access to the data, and that data were stored and processed in accordance with the General Data Protection Regulation. No foreseeable risks were associated with participation. Informed consent was obtained electronically by requiring participants to actively agree to the study conditions before proceeding with the questionnaire. After consent, eligibility was verified: participants had to (a) have visited the same pub at least twice in the past year and (b) be able to recall a specific bartender from that venue. Only eligible participants proceeded.
3.6 Sample description
Data quality procedures included three attention checks, removal of incomplete responses and exclusion of surveys completed in under 4 min (the first researcher’s completion time plus 30 s). We computed Mahalanobis distance values (MDVs) for the CFQ items and the convergent/nomological validity items, following Kılıçhan et al. (2022), to detect multivariate outliers. Using the t-test approach (Tabachnick et al., 2007), we flagged MDVs exceeding the t-value at p < 0.01, based on the number of items. Flagged cases were inspected manually and removed when response patterns were highly inconsistent (e.g. rating “having a drink” as 1 − highly inappropriate − while rating “spending time in the bartender’s home” as 6 − highly appropriate). Four cases were removed.
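The multivariate outlier screen described above can be illustrated with a minimal sketch. It computes squared Mahalanobis distances from the sample covariance matrix and flags cases above an analyst-supplied cutoff. The function names, the example ratings and the `cutoff` parameter are hypothetical and are not taken from the study; the sketch does not reproduce the authors' cutoff derivation or their manual inspection of flagged cases.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def mahalanobis_squared(rows):
    """Squared Mahalanobis distance of each row from the sample centroid."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (denominator n - 1).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    distances = []
    for c in centered:
        y = solve(cov, c)  # y = cov^{-1} (x - mean)
        distances.append(sum(ci * yi for ci, yi in zip(c, y)))
    return distances


def flag_outliers(rows, cutoff):
    """Indices of cases whose squared distance exceeds the supplied cutoff."""
    return [i for i, d in enumerate(mahalanobis_squared(rows)) if d > cutoff]


# Hypothetical ratings on two items for five respondents.
ratings = [[1, 2], [2, 1], [3, 4], [4, 3], [10, 10]]
candidates = flag_outliers(ratings, cutoff=3.0)  # cutoff chosen for illustration only
```

Flagged indices are candidates for inspection only; in the procedure described above, a case was removed only when its response pattern was also highly inconsistent.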
After applying these criteria, the final sample comprised 562 valid responses (Prolific: 363; convenience: 199). Comparing the Prolific and convenience samples, 13 of the 37 CFQ items showed slightly higher means in the convenience sample (p < 0.01), consistent with recruitment via bar owners/online channels that likely reached guests with closer bartender relationships than a general population panel. The sample was diverse (Table 1): ages ranged from 18 to 73 (M = 32.5, SD = 11.6); 54.5% of respondents were men and 41.1% were women; and about two-thirds of the recalled bartenders were men (67.8%). Dyads were predominantly male–male (38.6%), followed by female–male (26.3%; guest woman, bartender man). Respondents were distributed across Dutch regions proportional to population, indicating regional representativeness.
Table 1. Descriptive statistics
| Variable | Frequency | % |
|---|---|---|
| Age group | ||
| <18 | 1 | 0.1 |
| 18–24 | 145 | 25.8 |
| 25–34 | 215 | 38.3 |
| 35–44 | 87 | 15.5 |
| 45–54 | 56 | 10.0 |
| 55–64 | 24 | 4.3 |
| 65> | 8 | 1.4 |
| NA | 26 | 4.6 |
| Total | 562 | 100.0 |
| Gender respondent | ||
| Male | 306 | 54.5 |
| Female | 231 | 41.1 |
| Other | 2 | 0.4 |
| Don’t want to say | 1 | 0.2 |
| NA | 22 | 3.9 |
| Total | 562 | 100.0 |
| Gender bartender | ||
| Male | 381 | 67.8 |
| Female | 174 | 31.0 |
| Other | 3 | 0.5 |
| Don’t want to say | 4 | 0.7 |
| Total | 562 | 100.0 |
| Region in the country | ||
| North | 146 | 26.0 |
| West | 188 | 33.5 |
| South | 139 | 24.7 |
| East | 76 | 13.5 |
| NA | 13 | 2.3 |
| Total | 562 | 100.0 |
| Dyad: Respondent − Bartender | ||
| Female−female | 81 | 14.4 |
| Female−male | 148 | 26.3 |
| Male−male | 217 | 38.6 |
| Male−female | 76 | 13.5 |
| NA | 30 | 5.3 |
| Total | 562 | 100.0 |
Note: Missing values labeled “NA” result from early termination, whereas “don’t want to say” denotes an intentional selection.
4. Analysis and results
Following Wang (2025), we assessed common method bias using Harman’s single-factor test and a one-factor CFA. For Harman’s test, all 49 survey items (including removed items and validity measures) were analyzed with principal axis factoring; a single unrotated factor explained 46.46% of the variance, below the commonly used 50% heuristic (Podsakoff et al., 2012), suggesting that no single factor accounted for most covariance. Consistent with Kao et al. (2020), we then estimated a one-factor CFA using only the focal CFQ items. The one-factor model fit poorly (χ2 = 965.782, df = 54, χ2/df = 17.885, GFI = 0.956, RMSEA = 0.280, CFI = 0.57, TLI = 0.47, NFI = 0.94) compared with the theoretically specified three-factor model (χ2 = 73.018, df = 50). Because estimation used the weighted least squares mean and variance adjusted (WLSMV) estimator, we compared the models with the robust DIFFTEST, which confirmed that the one-factor model fit significantly worse (Δχ2 = 299.1, Δdf = 4, p < 0.001). Together, these diagnostics suggest that common method bias is unlikely to be driving the factor structure.
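The logic of the single-factor heuristic can be sketched as follows. The example approximates the share of variance attributable to one unrotated factor by the largest eigenvalue of a correlation matrix divided by the number of items, found with power iteration. This is a principal-component simplification and not the principal axis factoring used in the study; the function names and the example matrix are hypothetical.

```python
def first_factor_share(corr, iters=500, tol=1e-12):
    """Largest eigenvalue of a correlation matrix as a share of total variance.

    Uses power iteration from a uniform start vector, which is adequate for
    the mostly positive correlations typical of survey items (assumption).
    """
    p = len(corr)
    v = [1.0 / p ** 0.5] * p
    eigenvalue = 0.0
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        w = [x / norm for x in w]
        # Rayleigh quotient of the normalized vector estimates the eigenvalue.
        estimate = sum(w[i] * sum(corr[i][j] * w[j] for j in range(p))
                       for i in range(p))
        converged = abs(estimate - eigenvalue) < tol
        eigenvalue, v = estimate, w
        if converged:
            break
    return eigenvalue / p


def exceeds_heuristic(share, threshold=0.50):
    """True when a single factor accounts for more than the threshold share."""
    return share > threshold


# Hypothetical correlation matrix for three items, all correlated at 0.5.
example_corr = [[1.0, 0.5, 0.5],
                [0.5, 1.0, 0.5],
                [0.5, 0.5, 1.0]]
share = first_factor_share(example_corr)
```

A share below the threshold, as reported above, is read as weak evidence against a dominant method factor; it is a heuristic and not a formal test.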
Respondents were randomly split into an EFA subsample (n = 291) and a CFA subsample (n = 271) in R (RStudio version 2024.12.0+467). Item-level t-tests indicated no significant differences between subsamples at the 0.01 significance level.
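A random split of this kind can be sketched in a few lines. The study performed the split in R; the Python version below is only an illustration, and the function name, the seed and the use of integer respondent identifiers are assumptions.

```python
import random


def split_sample(respondent_ids, n_first, seed=2025):
    """Shuffle respondent ids reproducibly and cut them into two subsamples."""
    rng = random.Random(seed)  # fixed seed (hypothetical) so the split is repeatable
    shuffled = list(respondent_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_first], shuffled[n_first:]


# 562 valid responses split into an EFA part (291) and a CFA part (271).
efa_ids, cfa_ids = split_sample(range(562), n_first=291)
```

Because every respondent lands in exactly one subsample, the exploratory structure is later confirmed on cases that played no part in deriving it.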
4.1 Exploratory factor analysis
We evaluated suitability for factor analysis. Two highly collinear items were removed (r > 0.90). Sampling adequacy was excellent (KMO = 0.94), and Bartlett’s test was significant (χ2 = 8,776.864, p < 0.001), supporting factorability (Hair, 2019; Kaiser and Rice, 1974). Because items were ordinal and non-normal (Shapiro–Wilk, p < 0.01), we used polychoric correlations and principal axis factoring (Osborne et al., 2008). Given expected relatedness among disclosure, social support and activities, we used oblique rotation (Hayton et al., 2004; Reise et al., 2000), consistent with modeling a higher-order construct (Kwon, 2023).
Extraction followed an iterative, evidence-guided routine (Lee and Chiang, 2017; Pijls et al., 2017). A five-factor solution was initially suggested by the scree plot and parallel analysis (Hayton et al., 2004). Parallel analysis retains factors when observed eigenvalues exceed those from random data with equivalent dimensions and sample size. After each estimation, we removed one problematic item and re-estimated, applying three a priori criteria: highest loading on an unintended factor, loading < 0.60 and cross-loading > 0.20 (Hair, 2019). As items were pruned, the fifth factor fell below the parallel-analysis criterion after 13 removals. We then repeated the routine on the full pool with four factors, and the fourth factor fell below the criterion after nine removals. The resulting three-factor solution remained stable after all criteria had been applied.
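The three item-level criteria can be expressed as a simple check on a loading matrix. The sketch below is an illustration under stated assumptions: loadings are compared by absolute value, the function names are hypothetical, and the rule for choosing which flagged item to drop first (the weakest primary loading) is an assumption, since the text does not specify the order of removal.

```python
def flag_item(loadings, intended, min_primary=0.60, max_cross=0.20):
    """Return the criteria an item violates, given its loadings on each factor."""
    magnitudes = [abs(x) for x in loadings]
    strongest = max(range(len(magnitudes)), key=magnitudes.__getitem__)
    reasons = []
    if strongest != intended:
        reasons.append("highest loading on unintended factor")
    if magnitudes[intended] < min_primary:
        reasons.append("primary loading below threshold")
    if any(m > max_cross for i, m in enumerate(magnitudes) if i != intended):
        reasons.append("cross-loading above threshold")
    return reasons


def next_item_to_remove(loading_table, intended_factor):
    """Pick one flagged item per round (assumed rule: weakest primary loading)."""
    flagged = [name for name, row in loading_table.items()
               if flag_item(row, intended_factor[name])]
    if not flagged:
        return None
    return min(flagged,
               key=lambda name: abs(loading_table[name][intended_factor[name]]))


# Hypothetical loadings on (disclosure, social support, activities).
table = {"item_a": [0.91, 0.05, 0.10],
         "item_b": [0.55, 0.30, 0.02],
         "item_c": [0.30, 0.70, 0.10]}
intended = {"item_a": 0, "item_b": 0, "item_c": 0}
candidate = next_item_to_remove(table, intended)
```

In an iterative routine, the model would be re-estimated after each removal and the check repeated until no item is flagged.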
The theorized three-layer structure with lower-order factors within each main factor did not emerge. Within activities, retained items reflected only activity type; spontaneity and location did not form distinct lower-order factors, and no lower-order factor structure emerged within disclosure or social support. A plausible explanation is that these proposed subdimensions act as contextual descriptors rather than separable latent dimensions in guests’ evaluations: when judging the appropriateness of friendship-like behaviors in a role-bound service encounter, respondents may integrate place and planning into a single appraisal. This is especially likely for activities because many activity items implicitly encode planning and public/private settings. For example, activities lasting a day or multi-day activities typically occur in the public domain and require planning, leaving little conceptual room to treat “place” or “planning” as separable latent subdimensions. For construct definition, CFQ is therefore best represented, at least from the guest perspective, in this pub context, as three broad reflective dimensions.
To improve parsimony without compromising reliability, we examined ordinal alpha and “alpha if item deleted,” removed two additional items while keeping more than three indicators per factor (Hair, 2019), and reran the EFA to confirm the structure. The final EFA retained 13 items: disclosure (5), social support (4) and activities (4). The solution explained 75.96% of the variance (31.08%, 24.89%, 17.99%; eigenvalues 7.223, 1.608, 0.967). Although the third eigenvalue was marginally below 1, it still met the parallel analysis retention criterion. Communalities ranged from 0.584 to 0.873. Full loadings are reported in Table 2.
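As a rough illustration of the “alpha if item deleted” check, the sketch below applies the standardized alpha formula, α = k·r̄ / (1 + (k − 1)·r̄), to a correlation matrix. Ordinal alpha is commonly obtained by applying this formula to a polychoric correlation matrix; the polychoric estimation step is not shown, and the matrix below is hypothetical rather than taken from the study.

```python
from typing import List

Matrix = List[List[float]]


def standardized_alpha(corr: Matrix) -> float:
    """Alpha from a correlation matrix: k * mean_r / (1 + (k - 1) * mean_r)."""
    k = len(corr)
    off_diag = [corr[i][j] for i in range(k) for j in range(k) if i != j]
    mean_r = sum(off_diag) / len(off_diag)
    return k * mean_r / (1 + (k - 1) * mean_r)


def alpha_if_deleted(corr: Matrix) -> List[float]:
    """Alpha recomputed after dropping each item in turn."""
    k = len(corr)
    result = []
    for drop in range(k):
        keep = [i for i in range(k) if i != drop]
        sub = [[corr[i][j] for j in keep] for i in keep]
        result.append(standardized_alpha(sub))
    return result


# Hypothetical 4-item matrix; the fourth item is only weakly related to the rest.
corr = [
    [1.00, 0.70, 0.70, 0.30],
    [0.70, 1.00, 0.70, 0.30],
    [0.70, 0.70, 1.00, 0.30],
    [0.30, 0.30, 0.30, 1.00],
]
alpha_all = standardized_alpha(corr)
alpha_drop = alpha_if_deleted(corr)
print(round(alpha_all, 3), [round(a, 3) for a in alpha_drop])
```

In this toy matrix, dropping the weakly related fourth item raises alpha, whereas dropping any of the first three lowers it, which is the pattern such a check is meant to surface.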
Exploratory factor analysis results
| Factors | Disclosure | Social support | Activities | Communalities | Explained variance (%) | Eigenvalues | Ordinal alpha |
|---|---|---|---|---|---|---|---|
| Disclosure | | | | | 28.12 | 7.223 | 0.925 |
| Things that make the bartender sad | 0.912 | | | 0.751 | | | |
| Things that make me sad | 0.870 | | | 0.833 | | | |
| Things I am angry about | 0.840 | | | 0.832 | | | |
| Things the bartender is angry about | 0.823 | | | 0.750 | | | |
| My opinion on recent events | 0.648 | | | 0.540 | | | |
| Social support | | | | | 23.97 | 1.608 | 0.928 |
| The bartender shows a personal interest | | 0.904 | | 0.773 | | | |
| The bartender gives me compliments | | 0.866 | | 0.781 | | | |
| The bartender makes me feel important | | 0.794 | | 0.793 | | | |
| The bartender shows compassion when I talk about problems | | 0.759 | | 0.695 | | | |
| Activities | | | | | 23.28 | 0.967 | 0.927 |
| Having a drink | | | 0.969 | 0.861 | | | |
| Eating together | | | 0.873 | 0.813 | | | |
| Going out | | | 0.771 | 0.718 | | | |
| Activity lasting a day | | | 0.759 | 0.704 | | | |
Factor extraction method: principal axis factoring; rotation method: oblimin; KMO = 0.94; total variance explained: 75.37%; Bartlett’s test of sphericity: χ2 = 8,776.864, p < 0.001; ordinal alpha for the entire model: 0.94; RMSR: 0.0282; response options: 1 = totally inappropriate, 6 = completely appropriate
4.2 Confirmatory factor analysis
Using the CFA subsample, we tested a second-order model in which CFQ is reflected by three first-order factors (disclosure, social support, activities). The model was estimated using the WLSMV estimator, which is appropriate for ordinal indicators (Kline, 2023). Following Hair (2019), we required standardized loadings ≥ 0.60. We inspected fit and modification indices cautiously and implemented only theoretically defensible changes, which resulted in removing one item (“things that make the bartender sad”) and freeing one residual covariance between two closely related indicators. The model met the thresholds for perfect fit under the adopted standards: χ2/df = 1.507; RMSEA = 0.049, 90% CI [0.017, 0.073] (Hu and Bentler, 1999); SRMR = 0.033; CFI = 0.987; TLI = 0.984 (Bollen, 1989); see Figure 2 and Table 3.
The diagram illustrates relationships between three constructs and commercial friendship quality. Disclosure is represented by my opinion on recent events, with a value of 0.67, things that make me sad, with a value of 0.93, things that make me angry, with a value of 0.93, and things that make the bartender angry, with a value of 0.80. Social support includes bartender shows a personal interest, with a value of 0.68, bartender gives me compliments, with a value of 0.92, bartender makes me feel important, with a value of 0.80, and bartender shows compassion, with a value of 0.95. Activities include eating together, with a value of 0.86, going out, with a value of 0.80, activity lasting a day, with a value of 0.94, and multi day activity, with a value of 0.92. These constructs connect to commercial friendship quality, with values of 0.92 for disclosure, 0.69 for social support, and 0.65 for activities.
Factor loadings from the confirmatory factor analysis
Source: Created by the authors
Goodness-of-fit indices of the final three-factor model
| Goodness-of-fit indices | Perfect fit indices* | Calculated values | Interpretation |
|---|---|---|---|
| CMIN/DF | 0 ≤ χ2/df ≤ 2 | 1.510 | Perfect fit |
| RMSEA | 0.00 ≤ RMSEA ≤ 0.05 | 0.049 | Perfect fit |
| SRMR | 0.00 ≤ SRMR ≤ 0.05 | 0.034 | Perfect fit |
| CFI | 0.95 ≤ CFI ≤ 1.00 | 0.998 | Perfect fit |
| TLI/NNFI | 0.95 ≤ TLI ≤ 1.00 | 0.984 | Perfect fit |
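The comparison in this table amounts to a simple range check, sketched below. The cut-offs and values are copied from the “Perfect fit indices” and “Calculated values” columns; the dictionary and function names are illustrative assumptions.

```python
from typing import Dict, Tuple

# Cut-offs copied from the "Perfect fit indices" column of the table.
CRITERIA: Dict[str, Tuple[float, float]] = {
    "CMIN/DF": (0.0, 2.0),
    "RMSEA": (0.0, 0.05),
    "SRMR": (0.0, 0.05),
    "CFI": (0.95, 1.0),
    "TLI": (0.95, 1.0),
}

# Values copied from the "Calculated values" column.
reported = {"CMIN/DF": 1.510, "RMSEA": 0.049, "SRMR": 0.034, "CFI": 0.998, "TLI": 0.984}


def check_fit(values: Dict[str, float]) -> Dict[str, bool]:
    """Return, per index, whether the value lies inside its cut-off range."""
    return {name: low <= values[name] <= high for name, (low, high) in CRITERIA.items()}


result = check_fit(reported)
print(result)
```

Note that RMSEA = 0.049 sits very close to the 0.05 boundary, so this index in particular is sensitive to rounding and sample fluctuation.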
Indicators were strong (Table 4): all loadings were significant (p < 0.001) and ranged from 0.654 to 0.949; squared multiple correlations ranged from 0.427 to 0.901. Reliability was high (McDonald’s ω = 0.890–0.912) and convergent validity was supported (AVE = 0.705–0.778; Hair et al., 2019). First-order factors loaded strongly on the second-order CFQ factor (disclosure = 0.922; social support = 0.687; activities = 0.654), with moderate inter-factor correlations (0.449–0.634). Discriminant validity met the Fornell–Larcker criterion: for each construct, √AVE (diagonal of Table 5) exceeded its correlations with other constructs.
Confirmatory factor analysis results
| Factors | Standardized factor loadings | Squared multiple correlation | Composite reliability (McDonald’s omega) | Average variance extracted |
|---|---|---|---|---|
| Disclosure | | | 0.890 | 0.705 |
| My opinion on recent events | 0.672 | 0.452 | | |
| Things that make me sad | 0.926 | 0.857 | | |
| Things I am angry about | 0.933 | 0.870 | | |
| Things the bartender is angry about | 0.803 | 0.645 | | |
| Social support | | | 0.910 | 0.708 |
| The bartender shows a personal interest | 0.681 | 0.464 | | |
| The bartender gives me compliments | 0.916 | 0.839 | | |
| The bartender makes me feel important | 0.792 | 0.628 | | |
| The bartender shows compassion when I talk about problems | 0.949 | 0.901 | | |
| Activities | | | 0.912 | 0.778 |
| Having a drink | 0.858 | 0.737 | | |
| Eating together | 0.802 | 0.643 | | |
| Going out | 0.937 | 0.877 | | |
| Activity lasting a day | 0.924 | 0.854 | | |
During CFA, the item “Things that make the bartender sad” was removed
Discriminant validity
| Construct | Disclosure | Social support | Activities |
|---|---|---|---|
| Disclosure | 0.840* | ||
| Social support | 0.634 | 0.842* | |
| Activities | 0.603 | 0.449 | 0.882* |
*The main diagonal contains the square root of the average variance extracted of every multi-item construct
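The link between the loadings, AVE and the Fornell–Larcker check can be sketched as follows. The loadings and inter-factor correlations are copied from the CFA and discriminant validity tables; AVE is computed here as the mean of the squared standardized loadings, which is the usual definition but an assumption about how the reported values were obtained. The recomputed AVEs agree with the reported ones to within about 0.001.

```python
from math import sqrt
from typing import Dict, List, Tuple

# Standardized loadings as reported in the CFA results table.
loadings: Dict[str, List[float]] = {
    "disclosure": [0.672, 0.926, 0.933, 0.803],
    "social_support": [0.681, 0.916, 0.792, 0.949],
    "activities": [0.858, 0.802, 0.937, 0.924],
}

# Inter-factor correlations as reported in the discriminant validity table.
correlations: Dict[Tuple[str, str], float] = {
    ("disclosure", "social_support"): 0.634,
    ("disclosure", "activities"): 0.603,
    ("social_support", "activities"): 0.449,
}


def ave(lams: List[float]) -> float:
    """Average variance extracted: mean of squared standardized loadings."""
    return sum(lam * lam for lam in lams) / len(lams)


ave_by_factor = {name: ave(lams) for name, lams in loadings.items()}
sqrt_ave = {name: sqrt(value) for name, value in ave_by_factor.items()}


def fornell_larcker_ok() -> bool:
    """True if every correlation is below the sqrt(AVE) of both factors involved."""
    return all(
        abs(r) < min(sqrt_ave[a], sqrt_ave[b]) for (a, b), r in correlations.items()
    )


for name in loadings:
    print(name, round(ave_by_factor[name], 3), round(sqrt_ave[name], 3))
print("Fornell-Larcker criterion met:", fornell_larcker_ok())
```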
Substantively, this pattern indicates that the three dimensions function as interrelated manifestations of an overarching appraisal of CFQ (i.e. how friendship-like and appropriate the host–guest relationship feels), expressed through what is shared, how support is exchanged and what is done together. The stronger second-order loading for disclosure suggests that conversational self-disclosure is a salient cue for CFQ. In contrast, social support and shared activities may be more bound by service roles and setting constraints. The higher-order model therefore supports interpretation at the dimension level for diagnostic use and as an overall CFQ score for parsimonious comparisons.
4.3 Convergent validity
Convergent validity was assessed via correlations between CFQ factor scores and adjacent friendship-quality measures (Table 6). Consistent with DeVellis and Thorpe (2022), we expected positive associations with conceptually similar constructs, but not redundancy, because CFQ is role-bound and appropriateness-based. We used Spearman’s rho because the factor scores and comparator measures were ordinal and did not consistently meet the normality/linearity assumptions for Pearson correlations (Kline, 2023); Spearman’s rho provides a robust estimate of monotonic association (Puth et al., 2015). Correlations for first-order factors were moderate: disclosure (ρ = 0.619), social support (ρ = 0.448) and activities (ρ = 0.611). The second-order CFQ factor correlated moderately to strongly with the total comparator score (ρ = 0.681), supporting convergent validity while indicating non-redundancy.
Spearman correlations between latent factors and gold-standard measures
| Latent factor | Gold-standard measure | ρ (rho) |
|---|---|---|
| Disclosure | Intimate exchange | 0.619 |
| Social support | Help | 0.448 |
| Activities | Companionship | 0.611 |
| Commercial friendship quality (second-order factor) | Total gold-standard score | 0.681 |
ρ = Spearman correlation coefficient. All correlations are significant at p < 0.001
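For readers unfamiliar with the statistic, Spearman’s rho is the Pearson correlation of the rank-transformed scores, with tied observations given their average rank. The sketch below is a plain standard-library implementation; the two score vectors are hypothetical and are not the study’s data.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical factor scores and comparator scores (both contain ties).
cfq_scores = [1.2, 2.5, 2.5, 3.1, 4.0, 4.8, 5.5]
comparator = [10, 14, 12, 15, 15, 19, 22]
rho = spearman_rho(cfq_scores, comparator)
print(round(rho, 3))
```

Because only ranks enter the calculation, any strictly increasing transformation of either variable leaves rho unchanged, which is why the coefficient is suited to ordinal, non-normal scores.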
4.4 Nomological validity
Nomological validity was tested by relating CFQ to customer loyalty and positive word of mouth (PWOM) (Table 7); we hypothesized positive effects on both outcomes. To mitigate potential non-normality and outliers in summed Likert-type outcomes, we estimated robust linear regressions (rlm; MASS, R). CFQ predicted loyalty [β = 0.629, SE = 0.044, t(270) = 14.27] and PWOM [β = 0.357, SE = 0.046, t(270) = 7.75]. Complementary OLS models were consistent, explaining 32.6% of the variance in loyalty (R2 = 0.326) and 11.9% in PWOM (R2 = 0.119). For managerial interpretation, a +1-point increase in CFQ corresponded to +2.61 points in loyalty (95% CI [2.16, 3.06]) and +0.74 points in PWOM (95% CI [0.50, 0.98]).
Nomological validity
| Dependent variable | Predictor | β | SE | t(df) | Residual SD |
|---|---|---|---|---|---|
| Loyalty (z-score) | CFQ_z | 0.629 | 0.044 | 14.27 (270) | 0.76 |
| Positive word of mouth (z-score) | CFQ_z | 0.357 | 0.046 | 7.75 (270) | 0.68 |
β = standardized regression coefficient; SE = standard error; t = robust t-value; CFQ = commercial friendship quality. Models were estimated using robust linear regression (rlm() in R) to account for potential non-normality and outliers
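To show why a robust estimator is less sensitive to outliers than OLS, the sketch below fits a one-predictor line by Huber M-estimation using iteratively reweighted least squares. This mirrors what we understand to be the default behavior of rlm() (Huber weights with k = 1.345 and a MAD-based scale); whether the analysis above used these defaults is an assumption, and the data are hypothetical.

```python
from statistics import median
from typing import List, Sequence, Tuple


def weighted_line(x: Sequence[float], y: Sequence[float],
                  w: Sequence[float]) -> Tuple[float, float]:
    """Weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return my - slope * mx, slope


def huber_line(x: Sequence[float], y: Sequence[float], k: float = 1.345,
               max_iter: int = 100, tol: float = 1e-10) -> Tuple[float, float]:
    """Huber M-estimate via IRLS, starting from the OLS fit."""
    intercept, slope = weighted_line(x, y, [1.0] * len(x))
    for _ in range(max_iter):
        resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        scale = median([abs(r) for r in resid]) / 0.6745  # MAD-based scale
        if scale == 0.0:
            break
        # Huber weights: 1 inside the threshold, shrinking with |residual| outside.
        weights: List[float] = [
            1.0 if abs(r) <= k * scale else k * scale / abs(r) for r in resid
        ]
        new_intercept, new_slope = weighted_line(x, y, weights)
        converged = abs(new_intercept - intercept) + abs(new_slope - slope) < tol
        intercept, slope = new_intercept, new_slope
        if converged:
            break
    return intercept, slope


# Hypothetical data: y = 1 + 2x, with the last observation replaced by an outlier.
x = list(range(10))
y = [1 + 2 * xi for xi in x]
y[9] = 60

ols_intercept, ols_slope = weighted_line(x, y, [1.0] * len(x))
rob_intercept, rob_slope = huber_line(x, y)
print("OLS slope:", round(ols_slope, 3), "Huber slope:", round(rob_slope, 3))
```

In this toy example the single outlier pulls the OLS slope well away from 2, whereas the Huber fit downweights it and stays close to the slope of the remaining points.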
4.5 Conclusion of results
This study addresses a measurement gap by introducing a validated instrument for CFQ and testing whether the qualitative CFQ model of Velthuis et al. (2024) holds quantitatively. Across EFA and CFA, CFQ was best represented as a second-order construct with three interrelated first-order dimensions (disclosure, social support, activities), capturing an overarching appraisal of friendship-like, appropriate host–guest relationships expressed through what is shared, how support is exchanged and what is done together. The theorized subdimensions (e.g. planning and location within activities) did not materialize as separable latent factors, plausibly because such features are embedded in type-of-activity items and integrated into a single appropriateness judgment. Likewise, no separate latent dimensions explicitly captured asymmetry; in the data we gathered only from guests, behaviors directed at and received from the bartender loaded onto the same constructs, suggesting fuzzy role boundaries in guests’ cognitive representations.
Guided by this structure, we retained a 12-item CFQ scale (four items per dimension). The final EFA explained 75.96% of variance, and CFA confirmed the second-order model with excellent fit (χ2/df = 1.51; RMSEA = 0.049, 90% CI [0.017, 0.073]; SRMR = 0.033; CFI = 0.987; TLI = 0.984). Reliability and convergent validity were high (ω = 0.890–0.912; AVE = 0.705–0.778); correlations with adjacent friendship-quality measures supported alignment without redundancy; and CFQ predicted loyalty (β = 0.629) and PWOM (β = 0.357) in this Dutch pub sample.
5. Discussion
5.1 Theoretical contribution
This study provides a validated CFQ measure and quantifies the qualitative model of Velthuis et al. (2024). Prior research on commercial friendship in retail and service contexts (e.g. Oppen, 2020; Price and Arnould, 1999; Rosenbaum, 2009; Rosenbaum et al., 2018; Seger-Guttmann and Medler-Liraz, 2018) has lacked an instrument to quantify CFQ, constraining (a) tests of formation and its effect on outcome mechanisms, (b) systematic comparisons across settings and cultures and (c) modeling trajectories of relationship development. The present scale addresses this lacuna by operationalizing CFQ as a coherent, measurable construct, enabling theory testing, including longitudinal modeling.
CFQ is best represented as a second-order appraisal expressed through three interrelated dimensions: disclosure, social support and activities. Lower-order factors did not materialize. Many activity descriptors already embed planning and public/private settings (e.g. day or multi-day trips), leaving limited conceptual space to treat “planning” or “place” as distinct subdimensions. CFQ also appears heterogeneous in level and dimensional profiles, implying that benefits and risks are unlikely to be uniform across relationships.
Qualitative work suggested asymmetry (Velthuis et al., 2024), but in guest data, indicators of giving and receiving loaded on the same constructs. This guest-perception pattern should not be interpreted as behavioral symmetry; testing asymmetry requires dyadic or multi-informant designs, matched host–guest data, and behavioral indicators.
5.2 Practical implications
The CFQ scale offers a structured diagnostic tool for assessing friendship-like relational quality in hospitality encounters. In our data, a +1-point increase in CFQ corresponded to approximately +2.61 points in loyalty and +0.74 points in PWOM, providing a practical “points-per-point” benchmark. Translating the dimensions into hostmanship practice, training can focus on calibrating appropriate self-disclosure (what to share and when), recognizing and responding supportively to guests’ cues and initiating small shared activities (e.g. brief rituals or personalized touchpoints) without crossing professional boundaries. Repeated CFQ measurement can then be used both to monitor relational performance and to evaluate whether training and service-design interventions produce sustained improvements over time.
Used alongside satisfaction and loyalty metrics, CFQ can track relational dynamics across locations, teams and customer segments. Dimension-level diagnostics can reveal where relational gaps emerge (e.g. low perceived social support or inappropriate disclosure) and inform targeted coaching beyond generic hospitality training. High-scoring employees can serve as internal exemplars that support peer learning and knowledge diffusion, and team-level monitoring can help sustain guest relations despite staff turnover (e.g. structured handovers that preserve knowledge of guest preferences and prior disclosures).
CFQ can also guide experience design by embedding opportunities for shared activities, appropriate self-disclosure prompts and empathetic responses to emotional cues; these levers can be framed as testable mechanisms (e.g. calibrated disclosure and supportive responses increase perceived CFQ, which in turn predicts loyalty and PWOM). Such interventions should avoid over-scripting that undermines perceived sincerity and should specify and monitor ethical and professional boundary risks when encouraging friendship-like behaviors.
5.3 Limitations and future research
Several limitations qualify the interpretation and suggest avenues for research. First, during consultations with experts from various countries, it became evident that culture strongly influences how people perceive and define friendship. Because data were collected in The Netherlands, replication across cultures is needed; future studies should test the measurement invariance of the second-order structure and examine whether the relative salience of disclosure, social support and activities shifts across settings and cultures.
Second, data were collected only from the guest perspective. This single-informant design limits inference about dyadic processes and asymmetry and does not directly validate whether the same structure holds for employees. Future work should develop an employee-referent version with reworded items and role-specific indicators. Subsequently, matched host–guest data could be used to test configural, metric and scalar invariance across guest and staff forms.
Third, the scale was developed in pubs and bars. Transfer is most plausible in contexts with recurring encounters, staff discretion and opportunities for personal recognition, whereas other settings differ in cadence, dwell time, intimacy and social distance. We anticipate lower average CFQ in high-tier, more scripted settings, with activities potentially less central and disclosure and social support comparatively stable. Adaptation should retain the three core dimensions while adjusting role labels (e.g. from bartender to server/front desk/concierge) and replacing examples with setting- and socially feasible encounters (e.g. a welcome ritual, a discreet check-in, a post-stay note). Because the CFQ scale uses appropriateness rather than frequency as anchors, respondents can be instructed to evaluate the person they interact with in a service encounter as if they were in a social setting. Amendments should be made in line with scale development guidance (DeVellis and Thorpe, 2022): changes should be minimal and verified through cognitive interviews and a small pilot.
Fourth, reliance on self-reports increases common-method bias and social-desirability concerns. Although we used two diagnostics and procedural choices that reduce recall artifacts, method effects cannot be ruled out. Future studies may address this issue by using complementary methods, such as observational techniques or physiological measures, to capture more objective indicators of emotional connection.
Finally, future research should test predictive validity for additional outcomes (e.g. willingness to pay, risk tolerance, guest and host well-being) and examine moderators (e.g. culture, staff personality, brand image). Longitudinal, vignette-based and experimental designs, such as manipulating self-disclosure or presenting vignettes that depict different levels of commercial friendship, can capture change over time and provide causal evidence.
In summary, our validated scale for commercial friendship provides a novel framework for scholars and practitioners to understand, measure and facilitate meaningful host–guest relationships. While the study focuses on pubs and bars, we believe the model’s fundamentals may be generalizable within the hospitality industry and beyond. Ensuring that commercial friendship is cultivated thoughtfully and ethically can yield compelling competitive advantages, reinforcing the integral human element that characterizes exceptional hospitality experiences.

