1 Introduction

Virtual reality (VR) bears the potential to disrupt today's education (Allcoat and Von Mühlenen 2018; Checa and Bustillo 2020; Radianti et al. 2020). It allows students to experience and engage in virtual scenarios as if they were there (Jerald 2016; Rupp et al. 2019). VR allows educators to deliver learning content in a more immersive way than slides, texts, or 2D videos. A promising VR application for students in higher education is to increase awareness of possible biases in potential future workplace situations (Innocenti et al. 2012; Salmanowitz 2018), such as judging applicants’ competences in a recruitment interview. The main task of the rater in a recruitment interview is to judge the competences of an applicant objectively (Dipboye 2017). Research on situations other than job interviews has shown that people are influenced by their environment (Cialdini 2001). One important aspect is social influence (e.g., Cialdini and Goldstein 2004, for an overview of social influence), which often occurs through nonverbal behavior (Guyer et al. 2019). To our knowledge, no previous study has investigated whether the nonverbal behavior of interviewers influences observers who must rate the competences of a job applicant in a recruitment interview. To close this gap, we conducted a study in which participants observed a recruitment interview and rated three competences of the interviewed person (customer care, behavior in a team, and sales skill). We used a 2 × 2 design with the nonverbal behavior (positive vs. negative) of the interviewer and display type (VR-cardboard vs. 2D screen display) as between-subjects factors. In the remaining introduction, we present the educational background of this study and summarize previous research on job interviews, social influence through nonverbal behavior, and immersive VR versus 2D displays before explaining the current study and the expected results.

1.1 Educational background of this study

Traditional classroom-based lecturing uses slides, texts, and 2D videos. The use of VR can increase engagement and enjoyment, promote self-learning, and enable multisensory learning (Papanastasiou et al. 2019). This is particularly useful for theories and concepts that are related to perception, action, and decision making in real-life situations (for further potentials of VR use in education see, e.g., Freina and Ott 2015; Martín-Gutiérrez et al. 2017; Jensen and Konradsen 2018; Markowitz et al. 2018; Radianti et al. 2020; Wu et al. 2020). We applied a total of four interactive VR modules in a lecture on general psychology for first-semester undergraduate psychology students. With the first VR module, the students were introduced to VR technology by interactively mixing colored lights in a dark room and viewing different colored objects in a virtual environment. In a session two weeks later, we introduced two subsequent VR modules to demonstrate biases of distance perceptions under various environmental conditions (e.g., daylight vs. night; detailed vs. limited visual context). The current study evaluates the fourth VR module, which was deployed another two weeks later. Its goal was to raise awareness of social influence bias by nonverbal behavior in a job recruitment interview. This module combines the topic of cognitive interpretations of perceived visual and auditory stimuli with a topic relevant to the students’ future careers (occupational, organizational, and business psychology students). Furthermore, the VR module set up for the study served multiple purposes in the class-based lecture. The students’ ratings of the evaluated competences were summarized as graphs in real time and used to discuss rating biases and their implications. Additionally, the disclosure and explanation of the study design introduced the students to key elements of experimental designs. Finally, students received first-hand experience using VR in a non-game application.

1.2 Job recruitment interviews and nonverbal behavior

Job recruitment interviews continue to be one of the most widely used methods for evaluating the fit between potential applicants’ competences and job requirements (Arvey and Campion 1982; Schmidt and Hunter 1998; Ryan et al. 1999; Salgado et al. 2004; Macan 2009). In fact, “the interview is the most-used procedure in personnel selection across all countries. Job applicants expect to be interviewed; managers expect to interview some or all candidates” (Viswesvaran and Ones 2018, p. 460). For example, a survey conducted among human resources (HR) personnel in the German-speaking part of Switzerland revealed that 99% of the respondents included a form of interview in their selection processes, while 71% of these were at least semistructured (König et al. 2010, p. 23). A meta-analysis by Hausknecht et al. (2004) revealed that candidates also perceive interviews favorably compared to alternative selection methods (see also Dipboye et al. 2012). However, job recruitment interviews differ in several aspects. One important aspect is standardization across candidates regarding the questions asked and the evaluation (Campion et al. 1997). A structured interview enhances reliability and predictive validity (e.g., Campion et al. 1988; McDaniel et al. 1994; Schmidt and Hunter 1998). Therefore, structured interviews are often the preferred choice in practice. Interview formats with two or more interviewers/observers (also called panel interviews) enhance the accuracy of the assessment (Huffcutt et al. 2013). Moreover, applicants perceive that interviews with more than one interviewer are more fair (Blackman 2017, p. 187).

An essential part of communication and human interaction is nonverbal (see Bonaccio et al. 2016, for an overview in the workplace context). Nonverbal behavior is an effective means of social influence: it affects recipients’ attitudes, behaviors, and beliefs (e.g., Guyer et al. 2019). Nonverbal cues are mostly emitted unintentionally and unconsciously. Visually observable positive nonverbal behaviors are related, for example, to eye gaze, nodding of the head in agreement, and smiling, while negative nonverbal cues can be expressed with a furrowed brow, stern look, and tightening of the mouth (Butler and Geis 1990, p. 49; Owen et al. 2012, p. 366). Whereas many studies have analyzed the effects of applicants’ nonverbal behavior on interviewers’ ratings (e.g., Parsons and Liden 1984; Gifford et al. 1985; DeGroot and Gooty 2009; Huffcutt et al. 2011, for a more advanced model), only a few studies have investigated the effects of interviewers’ behavior on the applicant (e.g., Liden et al. 1993). To our knowledge, no study has investigated whether social influence through the nonverbal behavior of interviewers influences fellow interviewers or raters.

1.3 Display types and perceived immersion

VR enables users to immerse themselves in and experience virtual content as if they were there (Jerald 2016; Lanier 2017; Slater 2018). VR offers new ways of studying human perception and behavior (e.g., Blascovich et al. 2002; Pan and Hamilton 2018). Highly realistic VR content includes 360-degree videos, also called spherical or immersive videos (e.g., Violante et al. 2019). This type of content has limited natural interaction capabilities typically associated with immersive VR. However, 360-degree videos allow users to look around at all angles from the location where the 360-degree camera captured the footage. In addition to experiencing 360-degree videos on an immersive VR headset (e.g., VR-cardboard), users can view the content on a 2D screen display (e.g., notebook or tablet display). Instead of simply turning the head naturally to change perspective in VR, the viewing angle is controlled on a 2D screen display using the computer mouse (notebook computer) or a touch screen (tablet computer). Immersive VR has the advantage that visual distractions from the user’s real environment are not perceived. The user’s visual focus remains on the virtual world.

Only a few studies have compared immersive VR and traditional 2D screen displays. For example, Schutte and Stilinović (2017) found that VR evoked significantly more empathy than 2D screen displays when participants watched a documentary about a girl in a refugee camp. Niu et al. (2019) compared subjective emotions evoked by short film sequences in immersive VR and 2D screen displays and found that immersive VR caused significantly higher arousal (intensity of the emotion) than the same sequences viewed on a 2D screen display. Research comparing VR to real-world experiences showed high ecological validity of VR. For example, watching a 360-degree video of a scenic landscape in immersive VR evoked similar emotions as seeing it in reality (Chirico and Gaggioli 2019). Kinateder and Warren (2016) studied socially influenced human behavior in evacuation scenarios. Their findings also yielded similar behaviors in immersive VR as in real life, supporting the ecological validity of immersive VR compared to real-life behavior. Other studies compared job interviews conducted as videoconferences on 2D screen displays and conducted face-to-face. For example, Sears et al. (2013) and Blacksmith et al. (2016) found that applicants received better interview ratings in face-to-face interviews. These authors argue that technology use limits the perception of interpersonal skills, such as effectively reading nonverbal cues. Overall, studies suggest that nonverbal cues are less effective in 2D screen video-mediated communications than the face-to-face equivalent (e.g., Doherty-Sneddon et al. 1997; Bos et al. 2002; Proost et al. 2020).

One reason for the better performance of VR compared to 2D screen displays could be the degree of immersion (e.g., Kumar et al. 2017). From a technical perspective, immersion in VR can be described as “the extent to which the computer displays are capable of delivering an inclusive, extensive, surrounding and vivid illusion of reality to the senses of a human participant” (Slater and Wilbur 1997, p. 604ff). However, more recent research characterizes immersion as a psychological state that “people attain when they are intensely involved in an activity and are experiencing a high level of enjoyment and fulfillment” (Cheng et al. 2015, p. 234ff). Based on game immersion theory, Brown and Cairns (2004) operationalized immersion on three levels (engagement, engrossment, and total immersion). Each level has specific barriers that players need to overcome before reaching the next stage. Cheng et al. (2015) refined this concept for video games and learning applications and shaped subfactors for each of the three levels. They described engagement as the first stage of immersion: users are interested in the topic and style of the application. The second stage is engrossment, in which users focus on the application’s content and gradually lose awareness of their environment and self-perception. Total immersion represents the third and final stage: users experience feelings of presence (a sense of actually being there) and detach entirely from reality (Cheng et al. 2015, p. 235). Existing questionnaires measuring aspects of immersion for VR were typically designed to evaluate fully interactive VR applications (e.g., Sutcliffe et al. 2005) in immersive VR only (see also Suh and Prophet 2018; Schwind et al. 2019, for overviews of different scales). Only a few studies have evaluated aspects of immersion based on game immersion theory using VR or comparing VR to other media. However, a few recent publications have successfully applied Georgiou and Kyza’s (2017) augmented reality immersion (ARI) questionnaire (Voit et al. 2019) or parts of it (Gallup et al. 2019; Tennant et al. 2020). Thus, in the current study, we slightly adapted and applied the ARI questionnaire (see Method for details).

1.4 Current study

We investigated whether the nonverbal behavior of an interviewer influences the competence ratings of observers in a recruitment interview using 360-degree videos. We used a 2 × 2 design with the nonverbal behavior (positive vs. negative) of the interviewer and display type (VR-cardboard vs. 2D screen display) as between-subjects factors. Participants acted as interview observers who had to rate the applicant’s competences (behavior in a team, customer care, sales skill) in the recruitment interview. To assess perceived immersion depending on the display type (VR-cardboard vs. 2D screen), participants completed the ARI questionnaire (Georgiou and Kyza 2017). Additionally, they rated their overall satisfaction with the experience.

1.5 Research questions and expected results

Based on the literature reviewed above, we defined our research questions and hypotheses. First, does the nonverbal behavior of interviewers influence the competence ratings of observers in recruitment interviews? We expected that positive nonverbal behavior of the interviewer would lead to higher competence ratings than negative nonverbal behavior of the interviewer (and vice versa; e.g., Cialdini and Goldstein 2004; Cialdini 2007). Second, does display type (VR-cardboard vs. 2D screen display) influence the competence ratings of observers in recruitment interviews? Previous studies reported higher performance ratings for face-to-face interviews compared to technology-mediated interviews (Sears et al. 2013; Blacksmith et al. 2016). However, to our knowledge, no previous study has examined whether display type (VR-cardboard vs. 2D screen display) influences the competence ratings of observers in recruitment interviews. Therefore, we did not formulate a hypothesis for the second research question. Third, is there an interaction between nonverbal behavior and display type on competence ratings of observers in recruitment interviews? Because research on this question is also scarce, we did not formulate a hypothesis for this third research question. Fourth, does VR result in higher immersion than 2D screen displays? The first two stages of immersion (engagement, engrossment) focus mainly on content- and application-related topics (Brown and Cairns 2004; Cheng et al. 2015), which were identical for the immersive VR and 2D screen display conditions. Thus, we expected to find higher perceived immersion on the subscales of flow and presence (both related to total immersion) for VR because the technology is capable of making users experience a feeling of being there and detaching from reality (Slater and Wilbur 1997). Fifth, can the ARI questionnaire (Georgiou and Kyza 2017) be applied to compare 360-degree videos displayed with VR versus displaying them on 2D screens? We expected that applying the ARI scales would differentiate between immersive VR and 2D screen display (construct validity) on the relevant subscales of total immersion and that the ARI scales would show good reliability.

2 Method

2.1 Participants

One hundred and seventeen (117) first-semester psychology students simultaneously participated in the experiment in the General Psychology I lecture. Participation was voluntary and took place during the scheduled lecture time. All participants gave informed consent before starting the experiment. Seven students did not complete the questionnaire due to technical issues during the experiment (e.g., smartphone incompatibility with VR-cardboard, low battery); thus, we excluded their data. The final sample consisted of 110 participants (74% female; age: M = 24.1 years, SD = 4.6). All had previously experienced VR-cardboard by completing a cardboard VR module on color and two modules on distance perception in the same lecture two weeks before the experiment. The study complied with the American Psychological Association Code of Ethics and was conducted in accordance with the European General Data Protection Regulation.

2.2 Experimental design

We used a 2 × 2 design, with the interviewer’s nonverbal behavior (positive vs. negative) and display type (VR-cardboard vs. 2D screen display) as between-subjects factors. The participants rated three competences of the applicant (behavior in a team, customer care, and sales skill) based on 360-degree videos of a job recruitment interview. Finally, the participants completed the ARI questionnaire (Georgiou and Kyza 2017) on perceived immersion and answered one item on overall satisfaction with the experience.

2.3 Procedure

The experiment was conducted in a lecture. We welcomed the students and explained that they would be observers of a job interview to provide competence ratings of a job applicant. We told the participants that they would be randomly distributed to the VR-cardboard or the 2D screen condition and that they could experience the experiment later at home using both display types if they wished to do so. All participants were asked to remain seated at their table in the classroom during the experiment. We explained to the participants that they could stop the experiment anytime and should stop the experiment immediately if they felt any discomfort (this was important for the immersive VR conditions because, in rare cases, participants can feel adverse effects such as sensation disorientation, oculomotor issues, or visually induced motion sickness; e.g., Lubeck et al. 2015; Kim et al. 2018). To our knowledge, this was not the case; all participants without technical difficulties completed the experiment. Six persons were available on-site to help in case of technical issues. After the instructions, the participants logged into a web-based survey and education platform (developed by Waimanoo 2020) using their notebook or tablet computer and an anonymous ID. The platform randomly distributed the participants to four groups: 2D screen display with positive nonverbal behavior, 2D screen display with negative nonverbal behavior, VR-cardboard with positive nonverbal behavior, and VR-cardboard with negative nonverbal behavior. All participants were then asked to connect their headphones to the device (notebook, tablet, or smartphone) and adjust the audio volume to a comfortable level based on an audio sample. Participants in the immersive VR condition then mounted their smartphone on their VR-cardboard (all students had received a Google cardboard V2, Google 2020a, in the previous lecture two weeks earlier). The experiment started for all students simultaneously (Fig. 1).

Fig. 1 Photograph of the experiment in the classroom lecture

In the VR-cardboard conditions, participants could change the viewpoint by moving their head. Interaction elements, such as buttons, were activated either by maintaining the look-at position for three seconds (also called “fusing”; with a cursor animation; Google 2020b) or by clicking a physical button positioned at the top right-hand side of the VR-cardboard. To ensure easy usability for both display types and to keep the interaction modalities comparable, participants in the 2D screen display condition could move their viewpoint with a mouse or touchscreen (click-and-drag) and select buttons with a screen-centered cursor. A button was activated by a click on the left mouse button, a tap on the screen (tablets), or the keyboard’s enter key.

Once participants were ready, a click on an “OK” button started the experiment with the first 360-degree video (introduction video; Fig. 2a), which showed the applicant being welcomed into the room and introduced the interview setting and the persons involved. Then, three video sequences were displayed (one for each competence; see Fig. 2b for an example); the order of these sequences was counterbalanced across participants using a Latin square (Jones and Kenward 2015, p. 140). After each interview sequence, the video faded out, and instructions appeared to rate the corresponding competence (see Fig. 2c for an example). Once the rating was given, a click on an “OK” button started the next video sequence until all three competences were rated. A brief closing video marked the end of the interview setting. Participants in the VR-cardboard condition were then asked to remove the VR-cardboard and continue on their notebook or tablet computer. Subsequently, all participants completed the ARI questionnaire (Georgiou and Kyza 2017) and rated an extra item on the overall satisfaction with the experience on their notebook or tablet computer.
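The counterbalancing of the three competence sequences can be illustrated with a minimal sketch. It assumes a simple cyclic 3 × 3 Latin square and a round-robin allocation of participants to its rows; the function names and the allocation rule are hypothetical and are not taken from the platform used in the study.

```python
# Hypothetical sketch: cyclic Latin square for ordering the three sequences.
COMPETENCES = ["behavior in a team", "customer care", "sales skill"]


def latin_square(items):
    """Return a cyclic Latin square: row i is the item list rotated left by i."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def assign_orders(participant_ids, items):
    """Assumed rule: cycle through the rows so each order is used about equally often."""
    square = latin_square(items)
    return {pid: square[i % len(square)] for i, pid in enumerate(participant_ids)}


# Six illustrative participant IDs (0..5); each row of the square is used twice.
orders = assign_orders(range(6), COMPETENCES)
for pid, order in orders.items():
    print(pid, order)
```

In such a square, every competence appears exactly once at each serial position, so position effects are balanced across the three orders.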

Fig. 2 Screenshots of the 360-degree videos used in the study. (a) Picture of the introduction video showing the job applicant entering the room. (b) Example picture of the interview. (c) Question and ten-point Likert scale for rating the competence behavior in a team

2.4 Materials

As visual and auditory stimuli, multiple 360-degree videos were created in cooperation with Waimanoo (2020) to represent realistic video sequences of a recruiting interview. The 360-degree camera captured the videos from the natural viewing position of the rater, that is, of the actual participant (Fig. 2). We kept the camera position fixed across all clips to reduce the risk of simulator sickness (e.g., Lubeck et al. 2015). To retain the participant’s attention on the relevant actions, we kept the job interview setting’s surroundings simple (Fig. 3) with a minimum of visual distractions (Pirker et al. 2020). The interview setting is a generic office environment consisting of a table, a fictional company logo, a plant, and a round table with the three protagonists: the interviewer, the applicant, and the rater. From the rater’s perspective, this setup allowed the participants to observe the interviewer and applicant simultaneously, even with the limited horizontal field of view of a VR-cardboard (between 60° and 100°, depending on the specific smartphone used; Powell et al. 2016).

Fig. 3 Layout of the interview setting. The interviewer, the job applicant, and the rater were located around a table. The 360-degree camera was placed at the position of the rater’s chair. The room’s setting was kept simple, with a minimum of visual distractors

We captured the video footage with a dual-lens GoPro Fusion camera and then stitched the content to monoscopic, spherical 360-degree videos (for a general overview of typical capturing workflows, see Feurstein 2019). An Olympus LS-P4 recorded the audio. The separate video and audio materials were merged and edited (for illustration, see Fig. 4) using Adobe Premiere Pro version 13.1.3. Furthermore, we aligned the center of the “primary action” (Pirker et al. 2020, p. 323) to the viewer’s forward view. To minimize potential technical issues with the Wi-Fi bandwidth in the classroom and limited cache on the students’ devices, we lowered the resolution to 1920 × 960 pixels and set the target bitrate to 5 Mbps, which was a compromise for good image quality combined with high functionality on the variety of student devices used during the experiment.
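The bandwidth consideration can be made concrete with a rough back-of-the-envelope sketch. It uses the 5 Mbps target bitrate stated above together with the total video length (4 min 18 s) and the number of students (117) reported elsewhere in this paper. It assumes, purely for illustration, that all devices stream the video track at exactly the target bitrate at the same time and that audio and protocol overhead are ignored; the function name is hypothetical.

```python
# Hypothetical estimate; real traffic depends on caching, audio, and overhead.
def stream_load(bitrate_mbps, duration_s, viewers):
    """Return (megabytes per viewer, aggregate megabits per second)."""
    per_viewer_mb = bitrate_mbps * duration_s / 8  # 8 bits per byte
    aggregate_mbps = bitrate_mbps * viewers        # all viewers at once (assumed)
    return per_viewer_mb, aggregate_mbps


per_viewer_mb, aggregate_mbps = stream_load(bitrate_mbps=5, duration_s=258, viewers=117)
print(f"{per_viewer_mb:.2f} MB per viewer, {aggregate_mbps} Mbps in aggregate")
```

Under these assumptions, each device would transfer roughly 161 MB, and the classroom network would need to sustain several hundred Mbps in aggregate, which illustrates why a moderate resolution and bitrate were a reasonable compromise.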

Fig. 4 Editing process of the 360-degree video footage. All scenes were recorded multiple times. The content was then assembled and edited using Adobe Premiere Pro’s masking functions (red frame; depending on the nonverbal behavior). In the final videos, only the video and audio footage of the interviewer differed between the positive and negative nonverbal behavior conditions

We implemented the scenario of a job applicant applying for a salesperson position in an electronic retail store. We chose this scenario because first-semester students in higher education have had experiences with salespersons in such a store. Thus, it could be assumed that they could imagine an appropriate job requirement profile. In addition to technical knowledge, salespersons should also exhibit competences in other categories, such as working well in a team, interacting with customers, and sales skills (for a meta-analysis of the determinants of salespersons’ performance, see Churchill et al. 1985; for a study analyzing situational interviews for a sales-related job, see Weekley and Gier 1987; for a detailed job analysis for sales-related jobs, see Phillips 1992; for a study evaluating a skill set of sales-related jobs, see Rentz et al. 2002). Accordingly, we chose one specific competence for each of these three categories: behavior in a team, customer care, and sales skill.

The first 360-degree video served as an introduction (the same video for all test conditions) and aimed to familiarize the rater with the virtual scene and build rapport in the interview setting (e.g., Barrick et al. 2010; Swider et al. 2016). The interviewer (female; professional theater actress) welcomed the participant and explained the interview procedure. Then, a door buzzer sounded, the door opened, and the interviewer welcomed the applicant (male; professional theater actor). After they both sat down at a table, the interviewer briefed the applicant and the rater on the procedure. This introduction video ended after 1 min 32 s. The 360-degree content blurred out, and written text appeared with instructions for the participant to focus on the upcoming competence rating. A click on the “OK” button started the video sequence for the first competence rating. For each of the three competences (each video sequence was 42 s), the interviewer asked one of the following questions.

Behavior in a team: For this competence, the interviewer asked a situational question derived from Campion et al. (1994, p. 999): “Suppose you had an idea for a change in work procedure to enhance quality, but there was a problem in that some members of your work team were against any type of change. What would you do in this situation?”

Customer care: This question was derived from an example noted in Weekley and Gier (1987, p. 485): “A customer comes into the store to pick up a watch he had left for repair. The repair was supposed to have been completed a week ago, but the watch is not yet back from the repair shop. The customer becomes very angry. How would you handle this situation?” For a better fit in the chosen scenario, we replaced the term “watch” with “smartphone.”

Sales skill: The interviewer’s question posed for this competence was derived from Campion et al. (1997, p. 668): “Suppose you were giving a sales presentation and a difficult technical question arose that you could not answer. What would you do?”

The applicant’s reply to the question was the same in all four conditions within every competence. We defined and evaluated the replies to each competence question with HR experts. The applicant’s replies corresponded to answers rated as average and contained no salient cues related to high or low performers. For each competence, we edited two versions (positive nonverbal behavior of the interviewer, negative nonverbal behavior of the interviewer) using Adobe Premiere Pro’s masking functions. Only the interviewer’s video and audio differed between the two versions, while all other content remained the same. The interviewer responded to the applicant’s reply by giving nonverbal cues only. For the positive nonverbal behavior condition, the interviewer showed facial expressions (for an overview of facial expressions, see Cohn et al. 2007), such as cheek raising, head up, and head down, along with other nonverbal visual cues of pleasure (for studies applying these cues, see Butler and Geis 1990; Owen et al. 2012), such as friendly eye gaze to the rater, nodding, and authentic smiles (see Fig. 4, top right). For the negative condition, the interviewer showed facial expressions such as brow lowering, chin raising, and lip tightening, along with other nonverbal cues of displeasure, such as giving stern looks to the rater, frowning, or humming quietly (negative nonverbal behavior; see Fig. 4, bottom right). To keep the credibility of the interviewer as perceived by the observer consistent (Williamson et al. 2013), we maintained the intensity and number of nonverbal indications for all conditions (Borkenau and Liebler 1995; Larsen and Shackelford 1996; for examples, see Fig. 5 in the Appendix).

In the closing video (length: 30 s), the interviewer said goodbye to the applicant, who left the room. The interviewer told the participant (rater) to continue on his/her notebook or tablet computer for the final questionnaire. The total length of the 360-degree videos per participant was 4 min 18 s. For the immersive VR conditions, the total time wearing a VR-cardboard was under 8 min (avoiding potential adverse effects on the participants’ ocular system; see, e.g., Turnbull and Phillips 2017; Elias et al. 2019), including reading instructions and rating the competences.

All video sequences were evaluated by three experts (one professor and two research scientists) working in the field of personnel assessment. They unanimously confirmed that the videos corresponded to a realistic interview setting, the interviewer asked realistic questions, the applicant’s replies were short but realistic, and that the nonverbal behavior of the interviewer was clearly identifiable as either positive or negative. The experts also mentioned that professional interviewers would often pose follow-up questions and that their facial expressions would be less pronounced.

2.5 Measures

Competence ratings: We asked the participants to rate the applicant’s competences (behavior in a team, customer care, sales skill) directly after the presentation of the corresponding video sequence on a ten-point Likert scale ranging from “low” (1) to “high” (10); see Fig. 2c for an example.

Immersion questionnaire: We applied the ARI questionnaire (Georgiou and Kyza 2017) to measure perceived immersion aspects based on game immersion theory (Brown and Cairns 2004; Cheng et al. 2015). Cronbach’s \(\alpha\)-values reported by Georgiou and Kyza (2017) are shown in Table 1. The model consists of three main factors, each with two subscales: engagement is composed of usability and interest, engrossment consists of emotional attachment and focus of attention, and total immersion includes flow and presence.
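For readers unfamiliar with the reliability coefficient, Cronbach’s \(\alpha\) for a scale with \(k\) items can be computed from the item variances and the variance of the summed score. The following is a minimal sketch of the standard formula using sample variances; the function name and the toy data are hypothetical and do not come from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (one per respondent, k item scores each).

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    item_columns = list(zip(*responses))               # one tuple per item
    item_var_sum = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Made-up ratings of four respondents on a two-item scale (illustration only).
toy = [[1, 2], [2, 1], [3, 3], [4, 4]]
print(round(cronbach_alpha(toy), 3))
```

The coefficient approaches 1 when the items covary strongly, so that most of the variance of the summed score is shared across items.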

Table 1 Cronbach’s \(\alpha\) Values of the ARI Scales and Subscales Reported by Georgiou and Kyza (2017, p. 33)

We used the items of the ARI questionnaire (Georgiou and Kyza 2017) and replaced the term “AR” with “VR” in the four items mentioning AR. A native speaker of both English and German translated the items into German (initial translation; see, e.g., Beaton et al. 2000). The items were then translated back to English by another native speaker of both languages (back-translation), followed by a review by the authors. Table 6 in the Appendix contains the English and German items (the latter were used in the current study). We used a seven-point Likert scale ranging from “totally disagree” (1) to “totally agree” (7), identical to Georgiou and Kyza (2017).

Overall satisfaction: To gather an overall impression of whether the participants liked the experience, we added one extra item at the end of the survey: “How did you like the VR module [with VR-cardboard/on the 2D screen display]?” The text indicated in square brackets changed depending on the display type condition. A seven-point Likert scale ranging from “very bad” (1) to “very good” (7) was applied.

2.6 Analysis

For psychometric and descriptive statistics, analyses of variance (ANOVAs), and post hoc comparisons, we used the free and open statistical software JAMOVI version 1.2.27.0 (The Jamovi Project 2020) with alpha set to 0.05. We report the effect sizes of ANOVAs using partial eta-squared (\(\eta_p^2\)) and interpret \(\eta_p^2\) values of .01, .06, and .14 as small, medium, and large effects, respectively (Cohen 1988, p. 368). Significant interaction effects of the ANOVAs were further investigated with post hoc tests using Holm–Bonferroni corrections (Holm 1979).
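The two quantities named above can be sketched in a few lines. Partial eta-squared is the effect sum of squares divided by the sum of the effect and error sums of squares, and the Holm–Bonferroni procedure multiplies the ordered p values by a decreasing factor. This is a generic illustration, not the implementation used by the statistics software; the function names, the treatment of the benchmarks as lower cutoffs, and the label “negligible” for values below .01 are assumptions made here.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Partial eta-squared: SS_effect / (SS_effect + SS_error)."""
    return ss_effect / (ss_effect + ss_error)


def effect_size_label(eta_p2):
    """Map a value to the benchmarks .01 / .06 / .14 (cutoff use is an assumption)."""
    if eta_p2 >= 0.14:
        return "large"
    if eta_p2 >= 0.06:
        return "medium"
    if eta_p2 >= 0.01:
        return "small"
    return "negligible"  # label assumed for values below the smallest benchmark


def holm_adjust(p_values):
    """Holm step-down adjusted p values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted


# Made-up numbers for illustration only.
print(effect_size_label(partial_eta_squared(ss_effect=10.0, ss_error=90.0)))
print(holm_adjust([0.01, 0.04, 0.03]))
```

A comparison is then declared significant when its adjusted p value is below the chosen alpha level.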

3 Results

All analyses refer to the same data set; thus, the participants per condition remained the same for all analyses. No outliers were detected or removed.

3.1 Effects of nonverbal behavior and display type on competence ratings

Table 2 shows the number of participants per condition, means, standard deviations, and results of two-way ANOVA statistics (according to the American Psychological Association 2020) for the competence ratings as dependent variables.

Table 2 Number of participants, means, standard deviations, and two-way ANOVA statistics for evaluation of competences

We conducted two-way between-subject ANOVAs to explore the impact of nonverbal behavior and display type on the three competence ratings (Table 2). We found significant main effects of the interviewer’s nonverbal behavior on the competence ratings of the job applicant with regard to behavior in a team and customer care. For the competence behavior in a team, participants who experienced positive nonverbal behavior gave higher ratings (M = 5.78, SD = 1.49) than those who experienced negative nonverbal behavior (M = 5.00, SD = 2.08). The competence customer care showed a similar effect: the participants in conditions with positive nonverbal behavior (M = 5.72, SD = 1.87) rated the applicant higher than those in conditions with negative nonverbal behavior (M = 5.02, SD = 1.77). However, these ratings did not depend on the display type; we found no significant effects of display type and no significant interaction of nonverbal behavior with the display type. For the competence sales skill, neither the main effects nor the interaction of the display type with nonverbal behavior reached statistical significance. Considering the ten-point Likert scale used for the three competence ratings (behavior in a team, customer care, sales skill), the means of each group (ranging from M = 4.44 to M = 5.83) shown in Table 2 indicate average ratings with no signs of ceiling or floor effects.

3.2 Effects of nonverbal behavior and display type on perceived immersion

Table 3 shows the psychometric properties of the immersion scales we used based on the ARI questionnaire of Georgiou and Kyza (2017). We found very good reliability (DeVellis 2016, p. 136) for all scales, with Cronbach’s \(\alpha\) ranging from .79 to .91.
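For readers unfamiliar with the reliability coefficient reported here, the following minimal Python sketch shows the standard computation of Cronbach's \(\alpha\) from a respondents-by-items matrix. It is not the jamovi routine used in the study, the function name `cronbach_alpha` is hypothetical, and the ratings in the usage note are invented for illustration.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a scale.

    item_scores: list of respondents, each a list with one rating per item.
    Uses sample variances (n - 1 in the denominator).
    """
    k = len(item_scores[0])  # number of items in the scale
    # Transpose so that each tuple holds all respondents' ratings for one item.
    items = list(zip(*item_scores))
    sum_item_variances = sum(variance(col) for col in items)
    # Variance of each respondent's total score across the k items.
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum_item_variances / total_variance)
```

As a usage example with invented ratings, `cronbach_alpha([[1, 2], [2, 1], [3, 3]])` yields approximately 0.67, whereas perfectly consistent items such as `[[1, 1, 1], [2, 2, 2], [3, 3, 3]]` yield 1.0.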

Table 3 Psychometric properties for ARI scales and subscales

We conducted two-way between-subject ANOVAs to explore the impact of nonverbal behavior and display type on the immersion scales (see Table 4). For the immersion variable engagement, the analysis showed a significant interaction between nonverbal behavior and display type on the subscale interest. Post hoc analysis revealed significantly higher interest ratings in the VR-cardboard condition with negative nonverbal behavior than in the 2D screen display condition with negative nonverbal behavior, t(106) = 2.83, p < 0.05.

Table 4 Means, standard deviations, and two-way ANOVA statistics for the ARI subscales

The ANOVAs also yielded a significant main effect of the display type on both subscales of total immersion: flow (VR-cardboard: M = 4.21, SD = 1.61; 2D screen display: M = 3.43, SD = 1.46) and presence (VR-cardboard: M = 3.53, SD = 1.53; 2D screen display: M = 2.67, SD = 1.31). The VR-cardboard condition led to higher ratings of both subscales compared to the 2D screen display condition.

3.3 Effects of nonverbal behavior and display type on overall satisfaction with the experience

We further analyzed overall satisfaction with a two-way between-subject ANOVA (see Table 5). The ANOVA suggested a significant main effect of the display type (VR-cardboard: M = 5.37, SD = 1.46; 2D screen display: M = 4.68, SD = 1.43). Thus, experiencing 360-degree videos with a VR-cardboard led to higher overall satisfaction than viewing the videos on a 2D screen display.

Table 5 Means, standard deviations, and two-way ANOVA statistics for overall satisfaction

4 Discussion

We studied whether an interviewer’s nonverbal behavior (positive vs. negative) impacts the competence ratings of observers in a job recruitment interview using 360-degree videos experienced with two different display types (VR-cardboard vs. 2D screen display). First-semester psychology students (n = 110) in the role of raters observed a recruitment interview and assessed three competences (behavior in a team, customer care, and sales skill) of an applicant. We found that positive nonverbal behavior of the interviewer resulted in higher ratings for two of three competences (behavior in a team and customer care) compared to negative nonverbal behavior by the interviewer. This effect was similar for 360-degree videos viewed with immersive VR and on 2D screen displays; we found no effect of display type on the three competence ratings. Our data did not show an interaction of nonverbal behavior and display type on the competence ratings. However, 360-degree videos resulted in higher immersion when experienced with VR than when viewed on 2D screen displays with regard to flow and presence. Thus, the two relevant ARI subscales of total immersion (flow and presence) differentiated between VR and 2D screen display (construct validity). All applied immersion scales showed good reliability. In the following sections, we discuss each key result in the context of the research questions, hypotheses, and relevant literature.

4.1 Effects of nonverbal behavior on competence ratings

The interviewer’s nonverbal behavior influenced observers’ competence ratings in the job recruitment interview. Overall, the results support our first hypothesis on social influence through nonverbal behavior. The participants, acting as interview observers, rated the job applicant’s competence behavior in a team and customer care significantly higher when the interviewer reacted with positive nonverbal behavior to the job applicant’s replies than when the interviewer’s nonverbal behavior was negative. This effect of social influence can be explained by social proof (also called informational influence, Cialdini and Goldstein 2004; Cialdini 2007), a form of seeking accuracy for judgments in uncertain situations (Wood 2000). These results have highly relevant practical implications. To increase the validity of interview ratings (e.g., Huffcutt et al. 2013; Blackman 2017; and further practical reasons), recruitment interviews often include multiple persons from the organization’s side. However, these persons may unintentionally influence others and can unknowingly be socially influenced, which is by no means intended in job interviews with multiple interviewers or raters. In contrast to the other two competences, our results show no significant difference between positive and negative nonverbal behavior for the competence sales skill. A possible explanation may be that the participants felt more competent in judging this specific competence because it can be considered part of the everyday experience if a salesperson’s reaction leads to real sales success. We will return to this point in the section on limitations and further research.

From an educational perspective, our study has shown that 360-degree videos are useful to increase students’ awareness regarding the effects of social influence through nonverbal behavior in job application interviews. With the integration of didactical aspects (e.g., Fowler 2015; with focus on active learning: e.g., Theobald et al. 2020), such as the subsequent real-time visualization of the experiment’s data and class discussion, this procedure can lead to the desired reflections by students. On the other hand, these achievements can only be seen as an initial step to reduce the risk of biases such as these occurring in job interviews. Before individuals act as interviewers or observers in real recruitment interviews, thorough training on interviewing and rating applicants’ competences is highly recommended (Alcañiz et al. 2018, for leadership training in VR; Doll 2018, for training on conducting interviews). Our study suggests that such training would benefit from increasing awareness of possible social influence effects through nonverbal behavior that could result in biases when judging job applicants’ competence.

4.2 Effects of display type (VR-cardboard vs. 2D screen display) on competence ratings

The effect of social influence through nonverbal behavior was similar for 360-degree videos viewed in VR-cardboard and 2D screen displays. To our knowledge, no previous research has investigated the effects of nonverbal behavior on observers of job recruitment interviews. Therefore, we did not formulate a hypothesis on whether the effect of social influence depends on the display type used to experience 360-degree videos. Our results suggest that the display type (VR-cardboard vs. 2D screen display) does not significantly impact competence ratings. Further, we did not formulate a hypothesis for interaction effects of nonverbal behavior and display type on competence ratings at job interviews. On all three competence ratings, the ANOVAs did not yield an interaction effect. These findings may imply that using 2D screen displays for awareness-raising experiences (or training) on social influence biases could be sufficient in practice. Currently, the use of immersive VR in large higher education courses still demands extra effort. Although the use of low-cost VR-cardboards, combined with the use of students’ smartphones, does not necessarily result in high hardware acquisition costs, the complete implementation requires more effort. Securing full technical functionality and compatibility on all device types (e.g., different screen sizes, operating systems, browser types, and versions) demands additional testing and, thus, potentially more development resources. The alternative, purchasing an adequate set of homogenous stand-alone, untethered VR devices, is costly and involves additional necessary cleaning measures to meet hygiene standards. In contrast, viewing (360-degree) videos on notebook/tablet computers can be achieved relatively quickly. However, immersive VR in a class-based scenario has several benefits compared to 2D screen displays. First, immersive VR automatically absorbs potential visual distractors (e.g., neighboring students). 
Second, as higher education prepares students for future needs at work (e.g., twenty-first-century skills), such non-game applications are feasible ways to familiarize students with VR’s emerging trends and technology. Third, the overall satisfaction ratings showed that the students liked using immersive VR (with VR-cardboard) significantly better than 2D screen displays for viewing 360-degree video content. Finally, technology improves rapidly, and VR solutions become more affordable every year.

4.3 Effects of nonverbal behavior and display type on perceived immersion

360-degree videos produce higher immersion when viewed in VR than when viewed on 2D screen displays with regard to flow and presence. Game immersion theory (Brown and Cairns 2004) suggests that players need to overcome specific barriers on an immersion level before reaching the next stage (for learning context, see also Cheng et al. 2015). The first stage (engagement) and the second stage (engrossment) of perceived immersion focus more on the content (e.g., the story) and the application’s use (e.g., interaction modes). These were kept nearly identical for the VR-cardboard and 2D screen display conditions. As expected, the subscale ratings on the first two stages did not differ with regard to the display type. The subscale means ranged from M = 4.46 to M = 6.00 (see Table 4); given the seven-point Likert scale, these values were high, indicating that the two stages were achieved. The analysis of the subscales of total immersion (presence and flow) showed that they were both significantly higher for the VR-cardboard condition than for the 2D screen display condition. These findings align with our expectations as immersive VR has the advantage of providing a higher perception of total immersion (subscales of presence and flow) than 2D screen displays.

An interesting finding was the interaction between nonverbal behavior and display type on the subscale of interest. The VR-cardboard group with negative nonverbal behavior perceived the experience with significantly higher interest than the group with positive nonverbal behavior. One possible explanation is the evocation of emotions through nonverbal behavior or social influence. To our knowledge, however, no previous research has examined whether social influence triggers emotions (i.e., their nature, valence, and intensity) that in turn influence aspects of perceived immersion. We will return to this point in the section on limitations and further research.

4.4 Adapted ARI questionnaire

As mentioned in the Introduction and Method sections, we decided to slightly adapt the ARI questionnaire (Georgiou and Kyza 2017) to be used with 360-degree videos presented using a VR-cardboard and a 2D screen display. Our results suggest that the adapted ARI questionnaire can be used to reliably measure different aspects of immersion when viewing 360-degree videos in VR-cardboard and on a 2D screen display. The psychometric properties of the adapted ARI questionnaire showed very good reliability with our data (Cronbach’s \(\alpha\)-values ranging from .79 to .91). Remarkably, with the exception of the subscales usability and focus of attention, the reliabilities with our sample were even higher than the reported values of the original ARI questionnaire that was used in another context (Georgiou and Kyza 2017).

4.5 Limitations and further research

As mentioned in the Method section, all video sequences were evaluated by three experts working in the field of personnel assessment. They unanimously confirmed that the interview setting, the interviewer’s questions, and the applicant’s answers were realistic, but noted that professional interviewers would show less pronounced emotional expressions. Our study highlights the importance of conducting job assessment interviews in a benevolent manner without showing pronounced facial expressions. We tested lay raters (first-semester psychology students) in our study. It is unclear whether our findings of social influence bias through nonverbal behavior would also affect professional raters in job recruitment interviews. It would be interesting to investigate this aspect in a future study and compare students’ ratings to those of experienced job interviewers. Furthermore, our data relied on the judgment of three competences based on situational questions only. A more comprehensive range of job-relevant aspects (e.g., Lambert et al. 2014, for sales-related jobs) and the inclusion of behavior description questions (e.g., Taylor and Small 2002; Huffcutt et al. 2004; Klehe and Latham 2006; Culbertson et al. 2017; Hartwell et al. 2019) would allow further conclusions about which types of competences or questions are especially likely to socially influence raters. Our results are very promising regarding the application of a slightly adapted ARI questionnaire (Table 6 in the Appendix) to measure the three levels of perceived immersion (as defined in Brown and Cairns 2004). However, further research with more participants would be beneficial to validate the model structure of game immersion theory (Cheng et al. 2015) in more detail. Finally, the observed interaction of nonverbal behavior and the display type on the immersion scale of interest potentially opens further research questions. 
Why and to what extent do immersive VR experiences with negative nonverbal behavior compared to positive nonverbal behavior arouse interest, but not when displayed on a 2D screen? A future study analyzing viewers’ emotions (e.g., Diemer et al. 2015; Chirico and Gaggioli 2019; Niu et al. 2019) induced by nonverbal behavior (positive vs. negative) could more precisely reveal the reasons for the observed interaction effect on interest.

5 Conclusion

Recruitment interviews are a widely used method for recruiting new personnel and often include multiple persons from the hiring organization. This study evaluated whether an interviewer’s nonverbal behavior (positive vs. negative) impacts observers’ competence ratings of an applicant using 360-degree videos on two different display types (VR-cardboard vs. 2D screen display). The participants observed interview sequences in the role of a rater and assessed three competences (behavior in a team, customer care, and sales skill) of the applicant. The interviewer’s nonverbal behavior influenced these ratings on two out of three evaluated competences (behavior in a team and customer care). Overall, these findings confirm a social influence bias and are therefore highly relevant for practice, particularly because interviews with multiple interviewers/raters are generally considered valid. Additionally, our study showed that 360-degree videos of recruitment interviews can be used for awareness-raising experiences on social influence through nonverbal behavior. Interestingly, the effect of social influence was independent of the display type: watching the 360-degree videos with immersive VR and on 2D screen displays resulted in similar ratings of competence. This raises the question of whether it is worth the extra effort of using immersive VR in a classroom to demonstrate the effect of social influence through nonverbal behavior. However, the participants in the immersive VR condition rated the overall satisfaction significantly higher than the participants in the 2D screen display condition. Furthermore, the participants perceived a higher level of flow and presence (defined as the third level of immersion, total immersion) when experiencing the 360-degree videos in immersive VR compared to 2D screen displays. This finding is in line with the underlying game immersion theory. 
The first and second levels of immersion (engagement with the subscales of usability and interest; engrossment with the subscales of emotional attachment and focus of attention) did not differ between the display types because the constructs relate mainly to the content and the implementation (which were similar for both display conditions). Thus, the evaluated scales of the ARI questionnaire (Georgiou and Kyza 2017, with the term “AR” modified to “VR”; see Table 6 in the Appendix) demonstrated construct validity and high reliability. Accordingly, our results suggest that the ARI questionnaire is an appropriate instrument to rate different aspects of immersion when assessing 360-degree videos using immersive VR and 2D screen displays.