Learning in collaborative virtual environments
- Impressions from a trial using the Dovre framework

BY OLA ØDEGÅRD AND KARL ANDERS ØYGARD 

This paper focuses on aspects of collaborative virtual environments (CVE) or Televirtuality. The first part will be related to interaction issues and the results of a field trial based on Televirtuality. The next part of this paper will focus on technical aspects of a generic framework for such systems, and how these utilise the network and the technical platforms in an optimal way. 

Part I 

This part deals with the subject from the perspective of other conferencing systems. It pays special attention to Virtual Reality (VR) as a medium and how it can be used to support social interaction and collaboration between persons located in remote areas. A networked application, based on the idea of city planning (The Little Houses on the Cyber Prairie), has been used in a trial consisting of several VR stations linked in a local area network. In each group, three persons entered the city and were confronted with tasks which they had to solve in collaboration with each other. This paper presents some of the findings and the experiences of the test users. The experiences so far seem promising for using this new medium in distance education and remote work. 

Part II 

This part gives an introduction to Dovre, which has been used to build the application for city planning. This system makes it possible for several users to enter a shared virtual space and interact with enhanced functionality. The part then briefly describes the Dovre framework, the design goals it is based on and some of the technical solutions we used when implementing a first version of the framework. 

Part I 

1 Introduction 

Over the last 20 years, the evolution of telecommunications has found its form in a huge range of application areas and services. This evolution has coexisted with the information revolution and the media merger between telecommunication technology, computer technology and content providers. We have seen the birth of LANs and WANs, connected through modems, leased lines, ISDN and ATM, and the wide spread of IP-based services like the Internet, email and the World Wide Web. We can use services like videoconferencing and multimedia conferences. The networked personal computer seems to be the winning medium, as its performance and rendering capabilities seem to double every two years. This is the background for the next generation of communication medium: networked virtual reality. The advantages seem quite obvious, as it can overcome the obstacles of traditional telephone conferences, videoconferences, multimedia conferences and computer conferences. These obstacles are related to limitations in the various media and, first of all, to their limited ability to make the participants fully engaged in the conferences. To feel and sense objects, to be able to determine one's own viewpoint and appearance, and to utilise a variety of functionality seems to be what we can expect of a future teleconferencing system. The alienating nature of many traditional media can be overcome in virtual reality conferencing. The participants can experience a new kind of presence with each other and with the three-dimensional objects and documents they collaborate on, as they are present in the same virtual room. 

2 Field trial: Interaction in The Cyber Prairie 

We initiated the networked trial because we wanted to know more about the following topics 

About the application 

Based on the platform development (see the second part of the paper), an application scenario was developed. The idea was to give the users the possibility to co-operate on a city planning process in a distributed environment. The virtual world consisted of a piece of virtual landscape with several objects like houses, a hospital, a (historical) ruin, streets, a gas station, etc. The objects were scattered around in disorder, and the objective for the participants was to arrange the objects into an orderly city. The system was also designed so that it would require co-operation. Before entering the virtual environment, the participants were given a list of tasks that they should try to perform. 

Avatars 

The three users were represented in the 3D environment by so-called avatars or Virtual Identities (VID) (see figure 1). The use of avatars is crucial in collaborative virtual environments, as they represent the point of view of each participant. The avatars used in the Cyber Prairie were designed as humanoids, but they could have been without resemblance to humans. To a certain extent we were also interested in examining the effect of using avatars, and how they were perceived. 

Interface 

The users could move their avatars, and with them their subjective point of view, with six degrees of freedom in space. The application did not use collision detection against objects or the surface. The main input device was a joystick for direction and movement along the X- and Z-axes, equipped with buttons for object (house) selection and rotation about the Y-axis. 

Figure 1. Avatars on the Cyber Prairie 

2.1 Technical configuration and method 

Each of the three VR stations consisted of a Pentium 200 with a Head Mounted Display (HMD) with head tracking and a simple joystick. The Cyber Prairie application ran at a frame rate of around 10 frames per second at a resolution of 640 by 400 pixels. The three users were placed in separate rooms. Each VR station ran a client application, and the stations were connected through a Local Area Network to a server which ran on a Silicon Graphics workstation. This server also ran a client application, which made it possible for the researchers to observe the interaction in the virtual world. The VR stations were equipped with microphones which were connected to a mixer that sent sound back to the HMDs. 

In this way, the researchers could follow the interaction both by choosing their own point of view and by listening to the conversation between the test persons. The idea was not to interrupt the test users unnecessarily, and to let them learn by doing. On a couple of occasions one of the client applications crashed, or a test person got lost in the virtual world. In those cases it was easy for us to help. 

In addition to the trial a questionnaire was developed. The test persons were given a hands-on experience using the VR equipment. Then they were introduced to some tasks which they should attempt to solve. These tasks were related to environment issues, like where to put the hospital and where to put the villas. 

The background for the test was based on the following assumption: 

It is easier to collaborate and learn in networked virtual environments than in other teleconferencing environments. 

We wanted to gain experience with the use of avatars and with the participants' ability to choose their own point of view. We also wanted to observe the shared awareness of the participants. In traditional videoconferencing and multimedia conferencing, difficulties have been recorded in sharing objects and in knowing who is doing and seeing what. We wanted to observe whether these problems were more easily overcome in virtual environments. 

Figure 2. Three VR stations configuration 

2.2 Interface, avatars and co-operation 

The tests were carried out with three groups of three persons each. Two of the groups consisted of high school students around 16 to 17 years of age. The last group consisted of three researchers in their thirties. After the test a structured interview was conducted with each group. 

Interface: 

In general, the youngsters had more experience in using the joystick. None of the test persons had any significant experience in moving in virtual space or using an HMD. Here are some quotes about the interface: 

"I had no control of the joystick, the houses just flew back and forth, then I had to learn how to use the joystick." 

"In the beginning the head tracking did not work, but later it did, and then it became much more alive." 

"I thought we just should watch it in the head mount. I did not know that we were able to go around and do things." 

"Some limitations on how and where to move had made it easier." 

Each group spent around sixty minutes inside the virtual environment. All groups spent around half of this time figuring out how to use the joystick and the HMD. There were two ways to choose the direction of movement: either by using the joystick, or by turning the head, wearing the HMD, in the desired direction. 

Using an avatar: 

The avatars had different colours, which were the only identifier distinguishing them. They were static in the sense that their limbs did not move. The participants could, however, see from the orientation of an avatar which direction its user was looking. Here are some quotes: 

"It was no problem using an avatar; however, you did not see yourself, just the other avatars." 

"Even though I was flying around, I did not feel that I had any different body." 

One observation we made was that none of the groups had any difficulties in using avatars. To most of the participants the avatar was just a natural extension of themselves in the virtual environment. Some of them expressed a special kind of freedom in virtual space: 

"I felt as if I were identical to my avatar, but I felt a bit handicapped not being able to use my hands." 

"This is a real experience, different to a movie, because you are in the movie." 

"It would be real nice to travel into space with your friends." 

Co-operation: 

After getting used to the joystick and HMD, the next thing for the participants to do was to figure out how to select objects (houses) and move them. Once they had managed this, the next step was to move the objects to the desired positions. The participants had to discuss among themselves to reach an agreement on the desired positions, or the desired plan of the city. There was no city plan that was "right" or "wrong". The main idea was that they should co-operate and organise themselves in the planning and put the plan into (virtual) life. Here are some quotes: 

"In the beginning I had quite enough moving and orienting, but after a while we started co-operating." 

"We did not have any leader, but that worked fine." 

"It is easier to work face to face, but if I had this system at home for one month, it would have been easier." 

"I experienced it more intense to co-operate in VR than in real life. Because you were there all the time." 

"More fun and more interesting than actual co-operation!" 

"It was really important for co-operation that we could both see the avatars and speak to each other." 

Reflections on VR as a medium: 

"This is a different experience; I have been some place else, I have escaped from the normal world." 

"You have a TV screen in front of your eyes, and you move into these pictures, there is nothing dangerous there, you can be driven over by a car and nothing happens, go through walls, go underground and into space." 

"We were in for 60 minutes, but the time just disappeared." 

"Good for emergency meetings." 

Reflections on VR in schools: 

"Go to school in VR? Then there would be a lot of dropouts!" 

"In schools first of all, the pupils should have their own PC, then the books should be on CD-ROM and on the Internet, then they would be less expensive and more up-to-date." 

"The teacher in future VR schools will be the little avatar that flies around" 

"I sometimes felt that it was the computer that had control over me, I hope this is not how the school of the future will be." 

Figure 3. The Cyber Prairie 

3 Conclusions on Part I 

It became apparent to us that all test users enjoyed this experiment. All three test groups managed, within the period of sixty minutes, to agree on some plan for the city, and to some extent to realise this plan. Here we observed that all groups were good at organising themselves. In order to get the best result, all groups had to place one of the avatars above the city (bird's-eye perspective). This user had to guide the other users as they moved the houses. Most of the groups took turns in performing the various tasks. One of them always took on this leadership role. 

We were also a bit astonished by how naturally they related to using avatars. All participants got used to the joystick and HMD, but they expressed the need for a better 3D input device, like a glove for grasping. After sixty minutes of using an HMD, we expected that some of them might feel sick, but this proved not to be the case, although some of them did feel a bit dizzy. The VR application Cyber Prairie was just a research prototype, and could have been improved in many ways to make interaction easier, but it still proved promising for this kind of collaboration. 

The three groups gave many similar views on the experience. The greatest difference was between the youngsters and the researchers. The researchers seemed much more analytic in their approach to the tasks they were to perform, but they were not as good at performing them. The youngsters were more spontaneous and explorative in their approach to the virtual world. In this test we observed two kinds of learning: first the participants learned how to move and orient themselves in the environment, then they learned how to collaborate on specific tasks in a virtual environment. The results of this trial have given us more confidence in Collaborative Virtual Environments as a useful way to learn and interact. 

Part II 

4 Design issues 

4.1 Rationale 

When we set out to write our first applications that would demonstrate some of the principles of Televirtuality, we soon found that there existed no frameworks for developing distributed Virtual Reality applications on low end computer equipment. While solutions such as DIVE existed for Silicon Graphics class machines, we thought these unsuitable for consumer Windows PCs. Also, many Virtual Reality application frameworks were strongly biased towards visualisation of the virtual worlds, neglecting basic features such as interaction with the virtual world itself. 

In the end, we decided that in order to easily implement the application we wanted, we needed to develop a new framework. To this end, we started work on Dovre. 

4.2 Design goals 

When we designed the Dovre framework, we had the following goals in mind: 

Active objects 

During the design phase of Dovre, we took much inspiration from the real world. Most notably, this has resulted in the introduction of active objects. Rather than being passive objects that must be manipulated by higher level algorithms, Dovre objects are active and have their own behaviour and state. The active object paradigm is supported by a strongly object oriented line of thought and greatly simplifies tasks such as implementing agents with artificial intelligence. 

Biased towards ATM and TCP/IP 

One of our main goals was that Dovre should utilise the ATM network technology to its fullest. Since ATM is inherently connection oriented, this effectively excludes connection-less protocols. Standards for multicasting over ATM are still rudimentary, so the initial implementation was based on a symmetrical client/server model. Future versions of Dovre, however, may be extended to use multicasting in a server-less environment. 

The TCP/IP protocol is a standard component of nearly all networking capable systems and works with most network layers, so we chose to base the initial implementation on TCP/IP. Note that the I/O system of Dovre has been completely abstracted, so that implementing support for other network protocols is completely invisible to applications and most of the framework itself. 

Figure 4. Features of the Dovre framework 

High degree of abstraction 

In order to construct a system that is architecture and operating system independent, a high degree of abstraction is required. In the Dovre framework we have used abstraction to hide operating system, rendering engine and sound system details from the application. 

The current implementation supports the following operating systems and rendering engines: 

Sound is supported on all platforms. 

The framework may be used without visualisation and sound support, which can be highly useful for creating automated agents. Such agents are entirely computer driven and act only on interaction or triggers from other participants. They have no use for visualisation or sound locally, but may otherwise use all the functionality that the framework offers. 

Support advanced sensors and interaction 

In order to support interaction with the virtual world, Dovre supports sensors such as gloves, motion trackers, joysticks, etc. The sensors may be used for navigating, triggering actions and generally interfacing with the computer. 

Sensors used in Virtual Reality applications exhibit different behaviours and have their own output formats. While a mouse gives merely two degrees of freedom, with no point of reference, a Polhemus FASTRAK allows for six degrees of freedom in a fixed space. In order to support the various sensors in a generic manner, a scheme for querying the sensors for their capabilities has been implemented. 
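
The following is a minimal sketch, in Python, of what such a capability query could look like. It is not Dovre's actual interface: the names Capabilities, Sensor, Mouse, SixDofTracker and pick_navigation_sensor are hypothetical, and the selection rule (most degrees of freedom first, absolute reference as tie-breaker) is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Capabilities:
    degrees_of_freedom: int  # number of independent axes the sensor reports
    absolute: bool           # True if readings refer to a fixed reference frame


class Sensor:
    """Hypothetical base class: every sensor can describe itself and be read."""

    def capabilities(self) -> Capabilities:
        raise NotImplementedError

    def read(self) -> Tuple[float, ...]:
        raise NotImplementedError


class Mouse(Sensor):
    """Two relative axes, no point of reference."""

    def capabilities(self) -> Capabilities:
        return Capabilities(degrees_of_freedom=2, absolute=False)

    def read(self) -> Tuple[float, ...]:
        return (0.0, 0.0)  # placeholder values for dx, dy


class SixDofTracker(Sensor):
    """Position and orientation in a fixed space (six axes)."""

    def capabilities(self) -> Capabilities:
        return Capabilities(degrees_of_freedom=6, absolute=True)

    def read(self) -> Tuple[float, ...]:
        return (0.0,) * 6  # placeholder values for x, y, z, yaw, pitch, roll


def pick_navigation_sensor(sensors: List[Sensor]) -> Optional[Sensor]:
    """Assumed rule: prefer more degrees of freedom, then an absolute reference."""
    if not sensors:
        return None
    return max(
        sensors,
        key=lambda s: (s.capabilities().degrees_of_freedom, s.capabilities().absolute),
    )


mouse, tracker = Mouse(), SixDofTracker()
chosen = pick_navigation_sensor([mouse, tracker])
```

Because the application only inspects the reported capabilities, a new device can be added without changing the navigation code.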

Currently, Dovre supports the following sensors: 

Dovre offers full object and ray intersection detection in the object hierarchy. 

Efficiency 

To maximise throughput and to construct a system that scales well, we have gone to great lengths to streamline and fine tune Dovre. The framework is based on the hierarchical world model, which allows for many optimisations. We use our own highly optimised math library and advanced template mechanisms that, when inlined properly, generate code that rivals hand coded assembly (see for example [8]). 

The OpenGL implementation uses many of the most common optimisation techniques in order to achieve high frame rates, as well as special features of more expensive SGI hardware, such as anti-aliasing (multisampling) and real time shadows. 

As high end computers with multiple processors become cheaper, it is also important to consider how performance may be increased by parallelising accesses to the hierarchy. For example, on high end computers much of the rendering engine has been implemented in hardware, which may execute in parallel with the main CPU. On these systems, parallelisation is not only desirable but imperative in order to get reasonable performance. 

5 Object organisation 

The Dovre framework is strongly based on the hierarchical model. This allows for many optimisations, and is used extensively for scene culling, fast intersection detection, efficient structure manipulation, etc. 
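
As an illustration of why a hierarchy makes intersection detection fast, here is a minimal Python sketch of ray picking over a tree of bounding spheres. It is not taken from Dovre: the Node type, the use of bounding spheres and the function names are assumptions chosen for brevity. The point it shows is that a whole subtree is skipped as soon as the ray misses the subtree's bounding volume.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vec = Tuple[float, float, float]


def sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


@dataclass
class Node:
    name: str
    centre: Vec
    radius: float  # bounding sphere enclosing this node and all its children
    children: List["Node"] = field(default_factory=list)


def ray_hits_sphere(origin: Vec, direction: Vec, centre: Vec, radius: float) -> bool:
    """Ray/sphere test; `direction` must be a unit vector."""
    oc = sub(centre, origin)
    dist2 = dot(oc, oc)
    if dist2 <= radius * radius:
        return True  # the ray starts inside the sphere
    t = dot(oc, direction)  # distance along the ray to the closest approach
    if t < 0:
        return False  # the sphere lies entirely behind the ray origin
    return dist2 - t * t <= radius * radius


def pick(node: Node, origin: Vec, direction: Vec, visited: List[str]) -> List[str]:
    """Return the names of leaf objects hit by the ray, pruning missed subtrees."""
    visited.append(node.name)  # record which nodes were actually tested
    if not ray_hits_sphere(origin, direction, node.centre, node.radius):
        return []  # nothing below this node can be hit
    if not node.children:
        return [node.name]
    hits: List[str] = []
    for child in node.children:
        hits.extend(pick(child, origin, direction, visited))
    return hits


world = Node("world", (0.0, 0.0, 0.0), 100.0, [
    Node("east", (10.0, 0.0, 0.0), 3.0, [
        Node("house", (10.0, 0.0, 0.0), 1.0),
        Node("hospital", (12.0, 0.0, 0.0), 1.0),
    ]),
    Node("west", (-10.0, 0.0, 0.0), 3.0, [
        Node("ruin", (-10.0, 0.0, 0.0), 1.0),
    ]),
])

visited: List[str] = []
hits = pick(world, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), visited)
```

In this example the ray points east, so the "west" container is rejected with a single test and the ruin inside it is never examined. The same pruning idea applies to view culling, with the view volume in place of the ray.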

The world is partitioned into domains, which are separate hierarchies built up of objects and containers. Domains may be connected with portals, allowing for very large worlds. 

All hosts (i.e. both clients and servers) have at least one domain. This domain is entirely built up of real objects, objects which are local and belong to that host. All objects that are imported from other hosts become virtual objects, objects which are mirrored from another host. 

When two hosts connect, they will attempt to exchange object hierarchies with each other. This is first done by the hosts exchanging any portals they might have with each other. If the hosts have portals which may connect, they will then proceed with distributing the hierarchies of those portals to each other. Eventually, all hosts in the network will have a local copy of the part of the hierarchy that is relevant to that particular host. All these imported objects are virtual objects. Any changes to the hierarchy or its objects will be distributed to all hosts that need to know about it. 
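
The exchange described above can be sketched roughly as follows in Python. This is an illustration under stated assumptions, not Dovre's protocol: the Host class and the connect function are hypothetical, each portal is reduced to a name with a flat list of object names behind it, and two portals are assumed to "connect" when they carry the same name (the text does not say how matching is decided).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Host:
    name: str
    # portal name -> names of the real (locally owned) objects behind that portal
    portals: Dict[str, List[str]] = field(default_factory=dict)
    # imported virtual objects: object name -> name of the owning host
    virtual: Dict[str, str] = field(default_factory=dict)

    def offer_portals(self) -> Set[str]:
        """Step 1: tell the peer which portals this host has."""
        return set(self.portals)

    def export(self, portal: str) -> List[str]:
        """Step 2: hand over the hierarchy behind one portal."""
        return list(self.portals[portal])

    def import_objects(self, owner: str, names: List[str]) -> None:
        """Imported objects become virtual objects mirrored from `owner`."""
        for name in names:
            self.virtual[name] = owner


def connect(a: Host, b: Host) -> Set[str]:
    """Exchange portals, then the hierarchies of the portals that match."""
    shared = a.offer_portals() & b.offer_portals()  # assumed matching rule
    for portal in sorted(shared):
        a.import_objects(b.name, b.export(portal))
        b.import_objects(a.name, a.export(portal))
    return shared


host_a = Host("A", {"gate": ["house"], "cellar": ["barrel"]})
host_b = Host("B", {"gate": ["hospital"]})
shared = connect(host_a, host_b)
```

After the call, each host holds virtual copies only of what lies behind the shared portal; the objects behind the unmatched "cellar" portal stay local to host A, which mirrors the idea that a host receives only the part of the hierarchy relevant to it.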

All this is automatically done by Dovre, and is completely transparent to the programmer. 

5.1 Real objects 

Real objects are, as the name implies, the real objects in the hierarchy. They may receive messages and act upon them. If a real object chooses to act on a request, it reissues the request as a command to itself, which is also distributed to all its virtual counterparts on other hosts. This implies that the real object and all its virtual objects have the same state, so integrity is ensured. 

5.2 Virtual objects 

Virtual objects are mirrors of real objects. They may receive requests, but may not themselves take any action; rather, they must pass the requests on to their real object. Virtual objects receive commands from their real object, which they must then do their best to fulfil. 

6 The message system 

The Dovre framework is an event-driven system, in that all interaction between objects must be done with messages. Indeed, the very core of Dovre is the message system. The message queue is basically a distributed queue with messages that can be updated if they lose scope, not entirely unlike the updatable queue described in [6]. The message system may change, however, with the advent of MPEG4. 

Messages may be sent directly to the receiving object for time critical purposes, effectively bypassing the message system, or they may be queued for delivery when the system has time, optionally with a time delay. 
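
The two delivery paths can be sketched in Python as below. This is a minimal, single-host illustration and not Dovre's implementation: the MessageQueue class and its method names are hypothetical, and the rule that a newer queued message replaces an older pending one for the same receiver and message kind is an assumed reading of "messages that can be updated if they lose scope".

```python
import heapq
import itertools
from typing import Any, Callable, Dict, List, Tuple


class MessageQueue:
    """Direct delivery plus a time-ordered queue whose entries can be superseded."""

    def __init__(self, deliver: Callable[[str, str, Any], None]) -> None:
        self._deliver = deliver
        self._heap: List[Tuple[float, int, str, str]] = []  # (due, seq, receiver, kind)
        # (receiver, kind) -> (seq of the newest pending message, its payload)
        self._latest: Dict[Tuple[str, str], Tuple[int, Any]] = {}
        self._seq = itertools.count()

    def send_now(self, receiver: str, kind: str, payload: Any) -> None:
        """Time critical path: bypass the queue entirely."""
        self._deliver(receiver, kind, payload)

    def post(self, receiver: str, kind: str, payload: Any,
             now: float, delay: float = 0.0) -> None:
        """Queue a message; it supersedes any pending message with the same key."""
        seq = next(self._seq)
        self._latest[(receiver, kind)] = (seq, payload)
        heapq.heappush(self._heap, (now + delay, seq, receiver, kind))

    def pump(self, now: float) -> int:
        """Deliver every message that is due; return how many were delivered."""
        delivered = 0
        while self._heap and self._heap[0][0] <= now:
            _, seq, receiver, kind = heapq.heappop(self._heap)
            latest = self._latest.get((receiver, kind))
            if latest is None or latest[0] != seq:
                continue  # superseded by a newer message, drop silently
            del self._latest[(receiver, kind)]
            self._deliver(receiver, kind, latest[1])
            delivered += 1
        return delivered


log: List[Tuple[str, str, Any]] = []
queue = MessageQueue(lambda receiver, kind, payload: log.append((receiver, kind, payload)))

queue.send_now("house", "select", 1)                    # delivered immediately
queue.post("house", "move", (1, 0), now=0.0)            # will be superseded
queue.post("house", "move", (2, 0), now=0.0)            # newest position wins
queue.post("avatar", "wave", None, now=0.0, delay=5.0)  # delayed delivery
first = queue.pump(now=1.0)
second = queue.pump(now=5.0)
```

Dropping superseded messages keeps the queue short when an object is updated faster than the system can deliver, which matters most for frequently changing values such as positions.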

6.1 Messages 

Any real object may issue a message to another object. The message will then be delivered to the real object. The receiving object may choose to act on the message or ignore it. Note that the receiving object may not update internal state according to the message, as this would introduce inconsistencies in the hierarchy. This has to be done with commands. 

6.2 Commands 

Commands are used for updating internal state in an object. When a real object needs to update internal state, it issues a command, which is distributed to all its virtual objects, as well as itself. In this way, the hierarchies of all hosts are updated simultaneously, and consistency is ensured. Note that only real objects are allowed to issue commands. 
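
The division of labour between requests and commands can be illustrated with a small Python sketch. It is a simplification under stated assumptions rather than Dovre's code: the class and method names are hypothetical, network transport is replaced by direct method calls, and the "locked" flag stands in for whatever rule a real object uses to decide whether to act on a request.

```python
from typing import List, Tuple

Position = Tuple[float, float]
Message = Tuple[str, Position]


class RealObject:
    """Owns the authoritative state; the only party allowed to issue commands."""

    def __init__(self, name: str, position: Position) -> None:
        self.name = name
        self.position = position
        self.locked = False  # assumed acceptance rule, for illustration only
        self.mirrors: List["VirtualObject"] = []

    def receive(self, message: Message) -> None:
        """A message is only a request; the object may act on it or ignore it."""
        kind, value = message
        if kind == "move_request" and not self.locked:
            self.issue(("set_position", value))

    def issue(self, command: Message) -> None:
        """Apply the command locally and distribute it to every virtual copy."""
        self.apply(command)
        for mirror in self.mirrors:
            mirror.apply(command)

    def apply(self, command: Message) -> None:
        kind, value = command
        if kind == "set_position":
            self.position = value


class VirtualObject:
    """Mirror of a real object on another host."""

    def __init__(self, real: RealObject) -> None:
        self.real = real
        self.position = real.position
        real.mirrors.append(self)

    def receive(self, message: Message) -> None:
        """Never change local state on a request; pass it on to the real object."""
        self.real.receive(message)

    def apply(self, command: Message) -> None:
        """State changes arrive only as commands from the real object."""
        kind, value = command
        if kind == "set_position":
            self.position = value


house = RealObject("house", (0.0, 0.0))
copy_1, copy_2 = VirtualObject(house), VirtualObject(house)

copy_1.receive(("move_request", (3.0, 4.0)))   # accepted: all copies move together
after_accept = (house.position, copy_1.position, copy_2.position)

house.locked = True
copy_2.receive(("move_request", (9.0, 9.0)))   # ignored: no copy changes
after_reject = (house.position, copy_1.position, copy_2.position)
```

Since every state change flows through the real object's issue method, a request that is ignored leaves all copies untouched, and one that is accepted changes all of them in the same way.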

7 Object types 

The following is a short description of the basic object types in Dovre. These object types are the building blocks for virtual worlds, and may be subclassed and overloaded to get new behaviour. 

8 Future development 

Dovre is a platform for continuous research and development. Some of the most important features that we are currently working on include: 

8.1 Inverse kinematics 

In order to simulate the real world as closely as possible, natural and realistic movement of avatars is necessary. Inverse kinematics provides a solution to this problem, but is computationally intensive and currently difficult to use. We are undertaking research on how to optimise such algorithms, so they may be applied in real time simulation. 

8.2 Dynamic models and MPEG4 

Currently, Dovre does not have the capability of coding models and distributing them to other hosts. Work is being done that will change this. We will proceed with implementing support for audio and video streams, most likely through support for MPEG4. 

8.3 Graceful degradation of network services 

With ATM comes a wealth of new possibilities for controlling the quality of the network services provided. In order to handle the highly variable nature of the data streams associated with Dovre, we need to support QoS management and graceful degradation of network services. The current implementation does not do this, but we are actively working on supporting it. The MPEG4 SNHC group is performing research in this field, which we hope to put to use in Dovre. 

8.4 Java 

Java is making a great impact on the computer world, and is particularly interesting in the context of active objects. We are looking into how Java could be used for, amongst other things, agents with programmable intelligence. 

9 Conclusion on Part II 

So far, we have had very good experiences with Dovre. The framework provides us with a platform for further Televirtuality research and a quick and simple way of implementing new concepts and ideas for distributed virtual reality applications. 

Performance is excellent, and Dovre has no problems handling relatively large worlds. We have more ideas for optimisations that should increase performance even further, and we will continue adding support for new technology. 

The object oriented, active object point of view provides a good basis for developing distributed virtual reality applications. Since objects have their own behaviour and state, they can virtually act on their own. In the future we will continue to explore how the active object paradigm can be used to bring virtual environments to life and to implement intelligent user interfaces. 

References 

1 Bowers, J, Pycock, J, O'Brien, J. Talk and embodiment in collaborative virtual environments. In: Proceedings of CHI'96, New York, ACM Press, 1996. 

2 Bowers, J, Pycock, J, O'Brien, J. Practically accomplishing immersion : cooperation in and for virtual environments. In: Proceedings of CSCW'96, New York, ACM Press, 1996. 

3 Pycock, J, Bowers, J. Getting others to get it right : an ethnography of design work in the fashion industry. In: Proceedings of CSCW'96, New York, ACM Press, 1996. 

4 Søby, M. Cyborg and the prosthesis : about socialisation in Cyberspace. Paper to the 5th National Conference on Pedagogic, Stavanger Nov. 1996. 

5 Ødegård, O. Social interaction in televirtuality. Paper for Virtual Reality World '96, Stuttgart, Germany, 13-15 February 1996. http://televr.fou.telenor.no/papers.html 

6 Kessler, D G, Hodges, L F. A network communication protocol for distributed virtual environment systems. 1996. Available at ftp://ftp.gvu.gatech.edu/pub/gvu/tr/96-03.ps.Z. 

7 Luebke, D P, Georges, C. Portals and mirrors : simple, fast evaluation of potentially visible sets. Proceedings of the 1995 Symposium on Interactive 3-D Graphics, 1996. Available at http://www.cs.unc.edu/~luebke/publications/portals.html. 

8 Veldhuizen, T. Expression templates. C++ Report, 7, (5),26-31, 1995. Available at http://monet.uwaterloo.ca/~tveldhui/papers/Expression-Templates/exprtmpl.html.