Modeling Augmented Reality System, Image Guided Surgery Case Study
Daniela Gorski Trevisan (1,2), Christian Raftopoulos (3), Benoît Macq (1), Jean Vanderdonckt (2)

(1) Université catholique de Louvain, TELE, Place du Levant, 2, B-1348 Louvain-la-Neuve, Belgium
(2) BCHI, Institut d'Administration et de Gestion, Place des Doyens, 1, B-1348
(3) Faculté de Médecine, Département de neurologie et de psychiatrie, Saint Luc, NEURO 1200, Bruxelles
{trevisan,macq},

1. Introduction

An Augmented Reality (AR) system supplements the real world with virtual (computer-generated) objects that appear to coexist in the same space as the real world [2]. The main interactive characteristic that emerges from these systems is that the user's focus is shared between the real and virtual worlds. This limitation results in two different, possibly inconsistent interfaces: one to deal with the physical space and another for the digital one. Consequently, interaction discontinuities break the natural workflow, forcing the user to switch between operation modes. For these reasons, methods that support guidance during the design of conventional interfaces are no longer valid for modeling, analyzing and designing virtual and augmented reality systems. Our work addresses this gap by focusing on the modeling and analysis of continuous interaction and ergonomic integrity. The methodology explores synchronization and integration characteristics that result from relationships between the entities involved in an interactive AR system. As a result, designers can discover errors as early as possible in the development process and then provide effective collaboration between the user and the system through efficient interfaces and interaction.

In recent years, Augmented Reality (AR) and Image Guided Surgery (IGS) systems have received considerable attention in many works, e.g. [4,7].
With an IGS system, a complex surgical procedure can be navigated visually with great precision by overlaying on an image of the patient a colour-coded preoperative plan specifying details such as the locations of incisions, areas to be avoided and the diseased tissue. This is a typical application of AR systems. The virtual world corresponds to the preoperative information, the real world corresponds to the intra-operative information, and both should be correctly aligned in real time. Therefore we have studied the AR paradigm applied to IGS, more specifically to neurosurgery, as a case study*. Section 2 describes the problem and the methodology used to model AR systems. Section 3 shows how it can be applied to the IGS system to analyze interaction, and Section 4 presents the conclusion and future work.

*This work is conducted in collaboration with the MRI intra-operative group of the Neurology department at Hospital Saint Luc, Brussels.

2. Modeling AR systems

A distinguishing and highly desired feature of many of these emerging technologies is continuous interaction between system and user. Input and output devices establish the boundaries between the real and virtual worlds, and for this reason the user’s interaction with them is the key to maintaining continuity in AR systems. Here we define continuity as the capability of the system to promote a smooth interaction scheme with the user during task accomplishment, considering perceptual, cognitive and functional aspects. This definition is expanded and revised from [8]. Perceptual continuity is verified if the user directly and smoothly perceives the different representations of a given entity. Cognitive continuity is verified if the cognitive processes that are involved in the interpretation of the different perceived representations are similar. Functional discontinuities can occur between different functional workspaces, forcing the user to change modes of operation.
This continuity property is not specific to AR systems, but its importance becomes vital, if not safety-critical, for AR systems in general and for IGS in particular. Some classifications have been proposed in order to enhance understanding of these interaction spaces. The classification space suggested by [3] uses the ASUR modeling to provide a systematic classification process of augmented reality systems, describing the physical properties of components and their relationships. The Dimension Space methodology proposed in [5] classifies entities in the system according to the attention received, the role, the manifestation, the I/O capacities, and the informational density. Taking into account the vast number of possible events and their combinations based on user interaction, the number of entities that have to be managed in an AR system is considerable. Our approach proposes an analysis based on synchronization and integration properties that result from interaction and relationships between the entities involved in the domain. To address the continuity problem the methodology is centred on:
• Object modelling compatible with the presentation specifications and requirements;
• Specification of the objects in terms of spatial integration and temporal synchronization;
• Specification of the application functionality, which should be interactive and inserted in the context of the user’s task.
The goal is to provide a well-structured approach to the design of interaction with these artifacts and technologies, and thereby to discover errors as early as possible in the development process, where the problems we are trying to eliminate concern usability and the user’s interaction with the system.

2.1 Entities and relationships

Entities in an AR environment consist of systems (computational and non-computational), output and input devices, objects, users, and tasks. Figure 1 shows the class diagram for all involved entities and their relationships.
Objects can be real (e.g. patients, a paper document, a pen, etc.) or digital (media such as images, sounds, etc.). A sensor is a kind of input device that can track them. A real object is any object that has an actual objective existence and can be observed directly. Digital objects can be either real or virtual. Digital virtual objects are objects that exist in essence or effect, but not formally or actually. In order for a digital virtual object to be viewed, it must be simulated, since in essence it does not exist. This entails the use of some sort of description or model of the object, such as a rendered volume model of the brain. In contrast, a video image of a real scene is an example of a digital real object.

A task is composed of one or many subtasks and may have its focus in the real, virtual or mixed world.

The device class consists of two categories: input and output devices. Each device may be manipulated by zero or many users and/or by zero or many systems. The device class holds information about the synchronization type, the list of objects that may be synchronized, and the media type that the device can present according to its physical capabilities (such as resolution, size, mobility, etc.).

The system entity can be composed of one or many computer-based systems and/or one or more non-computer-based systems. Synchronization between systems can be necessary to exchange information. The system entity may synchronize events from devices according to the performed task and also integrate different media sources into one or more displays, mixing information about both worlds.

Figure 1. Class Diagram for a generic Augmented Reality System.

2.2 Synchronization of entities in AR systems

Synchronization is an event controlled by the system that should be analysed between media, user-device interaction and tasks.
Basically there are two types of temporal synchronization: sequential (the before relation) and simultaneous, which can be one of the equal, meets, overlaps, during, starts, or finishes relations. A complete description of synchronization types can be found in [1].

2.2.1 Media synchronization

With respect to media synchronization, it is possible to find all these kinds of temporal relationships. We can consider the start and end points of events and distinguish whether the end of an event is natural (i.e., when the media object finishes its presentation) or forced (i.e., when an event explicitly stops the presentation of a media object).

2.2.2 Devices Synchronization

Devices synchronization describes how devices are made available for the user to interact with at a specific time. It raises the question of how the user interacts with multiple devices. This issue often occurs in systems with multimodal interfaces [6]. For example, if the system permits selecting one object using a data glove and another with speech recognition at the same time, then there is simultaneous synchronization. If the user interacts with one device at a time, we have sequential synchronization.

2.2.3 Tasks synchronization

Tasks synchronization can be simultaneous or sequential, and tasks can be performed by one user, by various users, by the system, by a third party, or by any combination. It is possible to find all the kinds of temporal relations discussed above.

2.3 Integration in AR systems

We have considered three aspects of integration in AR systems: physical integration, spatial integration and the insertion context of devices.

2.3.1 Physical Integration

Physical integration is controlled by the system and describes how the user will receive feedback and how the media are distributed among output devices. This means that each medium can be displayed on a different display or integrated within the same display.
Examples are overlapping real and virtual images in a head-mounted display, or showing sequences of images on a multiscreen device.

2.3.2 Spatial Integration

Another aspect of integration concerns the spatial ordering and topological features of the participating visual objects. The spatial composition aims at representing three aspects:
• topological relationships between the objects (disjoint, meet, overlap, etc.);
• directional relationships between the objects (left, right, above-left, etc.);
• distance/metric relationships between the objects (outside 5 cm, inside 2 cm, etc.).
In many AR applications, the directional and metric relationships between the objects come from the registration procedures that correctly mix both worlds, real and digital. There is spatial integration between media entities only when there is some kind of simultaneous synchronization between them.

2.3.3 Insertion context of devices

A device can be peripheral or central depending on the user and his/her task focus when a specific task is carried out. If the device is inserted in the central context of the user’s task, the user does not need to change attention focus to perform the task. Otherwise, if the user is changing attention focus all the time, the device is inserted in the peripheral context of use.

2.4 Perceptual and cognitive characteristics

By exploring the Theory of Action suggested by [9], it is possible to identify two main levels in the execution cycle of a task: the execution and evaluation flows. The execution level consists of how the user will accomplish the task, involving the temporal interaction synchronization and the insertion context of input devices in the environment. The evaluation level consists of three phases: the user’s perception, interpretation and evaluation.
Perception corresponds to how the user perceives the system state, involving the temporal interaction synchronization, the spatial and physical media-device integration, and the insertion context of output devices in the environment. The interpretation phase consists of how much cognitive effort the user needs to understand the system state. It depends on which communication language or media type the system uses to provide feedback to the user. The last phase corresponds to the evaluation of the system state by the user with respect to the goals.

3. Case study

Our case study concerns microscope-assisted guided neurosurgery, which keeps the task target in the physical world: reaching the tumor in the brain. The user's interaction with the patient is augmented by the computer, which gives the user extra information about the surgical planning in order to assist the user during execution of her task. Basically this system can be divided into two main phases: (1) pre-operative, which includes selecting the tumor shape, planning the path and registration; and (2) intra-operative, which corresponds to neuronavigation during the surgical intervention. In the planning procedure the task focus is concentrated on the pre-operative images; in the registration procedure the task focus is shared between the patient and the pre-operative information; and in the neuronavigation procedure the focus is concentrated on the patient with augmented information. During the intra-operative procedure the surgeon uses the microscope display to see the planned path coming from the digital world overlaid on the real image of the patient. The microscope has a camera to monitor the surgical procedures and to display them on the TV monitor, which is accessible to all people involved.
Figure 2 shows all entities involved in the scenario according to the class diagram presented in Figure 1.

[Figure 2 legend]
WS = Workstation System (System class)
WD = Workstation Display (Device class)
MS = Microscope System (System class)
MD = Microscope Display (Device class)
TV = TV Monitor (Device class)
Surgeon(s), Nurses and Auxiliaries (User class)
Patient (Object class)
Mouse, buttons, star point, sensor, camera (Device class)
Links: Interacts with (according to relationships described in Figure 1)

Figure 2. Entities involved in the domain of Image Guided Surgery System.

3.1 Analyzing synchronization and integration

Some useful analysis for evaluating interaction continuity can be obtained by applying the synchronization and integration characteristics to the device, media and task entities involved in the IGS scenario. The abbreviations used are: TS for temporal synchronization, PI for physical integration, SItop for topological spatial integration, and SImet_direc for metric and directional spatial integration.

3.1.1 Devices

Relation 1 - x is the workstation display (WD) or the TV monitor (TV) and y is the microscope display (MD) ⇒ TS = ‘x equal y’, and x is in the peripheral context and y is in the central context of use considering the surgical task (Figure 3 (a) and (b)).

Relation 2 - x is the surgical tool and y represents the microscope buttons ⇒ TS = ‘x before y’, and x and y are located in the central context of task accomplishment.

[Figure 3 labels: Peripheral context, Central context, Task focus]

Figure 3. Insertion context of devices. (a) Peripheral Context. (b) Central and Peripheral Context.

3.1.2 Media

Relation 1 - x and y are the pre-op images from the patient but in different views or in different dimensions (e.g. x = 2D image and y = 3D image) ⇒ TS = ‘x equal y’, SItop = disjoint, SImet_direc = defined by designer or by user, PI = displayed in the same output device (WD) (Figure 4.a).

Relation 2 - x is the image from the patient and y is the pre-op image from the patient ⇒ TS = ‘x equal y’, SItop = disjoint, SImet_direc = not required because the media are displayed in different output devices, PI = x is displayed in the MD and TV (Figure 4.b) and y in the WD (Figure 4.a).

Relation 3 - x is the real direction and orientation of the microscope and y is the digital image from the patient ⇒ TS = ‘x equal y’, SItop = x overlaps y, SImet_direc = defined by the registration phase, and PI = displayed in the same output device (WD) (Figure 4.a).

Figure 4. (a) Media displayed in the WD. (b) Media displayed in the MD.

3.1.3 Tasks

Relation 1 - x and y are sub-tasks in the pre-operative and in the intra-operative phases, respectively ⇒ TS = ‘x before y’.

Relation 2 - x is the neuronavigation task in the intra-operative phase and y is the main task (surgical procedure) ⇒ TS = ‘x during y’.

3.2 Evaluating

Considering task relation 2 and device relation 2, the task synchronization is simultaneous and the device synchronization is sequential. For this reason, the user needs to stop one interaction before starting another. This characterizes a discrete, or non-continuous, interaction.

Another relevant point is related to media relations 2 and 3. Different information is available on each device: the WD (Figure 4.a) shows real information, namely the microscope direction and position, overlaid on the pre-operative images; the MD (Figure 4.b) shows digital information (the path line) overlaid on the real image of a patient. This forces the user to change focus between the devices, given that they have different contexts of insertion due to device relation 1. As both are required to guide the surgeon during the surgery, we have an interaction classified as partial continuity.
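The evaluation above can be sketched as a small check over the device and task relations of Section 3.1. The following Python fragment is only an illustration under stated assumptions, not part of the original study: the names `DeviceRelation`, `TaskRelation`, `continuity_issues` and the issue strings are hypothetical, and every relation other than 'before' is treated as simultaneous, following the grouping given in Section 2.2.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Context(Enum):
    CENTRAL = "central"
    PERIPHERAL = "peripheral"


# Temporal relations grouped as in Section 2.2: 'before' is sequential,
# all the other listed relations count as simultaneous.
SIMULTANEOUS = {"equal", "meets", "overlaps", "during", "starts", "finishes"}


def is_simultaneous(ts: str) -> bool:
    """Return True for a simultaneous relation, False for 'before'."""
    if ts == "before":
        return False
    if ts in SIMULTANEOUS:
        return True
    raise ValueError(f"unknown temporal relation: {ts}")


@dataclass(frozen=True)
class DeviceRelation:  # hypothetical container for a relation of Section 3.1.1
    x: str
    y: str
    ts: str
    x_context: Context
    y_context: Context


@dataclass(frozen=True)
class TaskRelation:  # hypothetical container for a relation of Section 3.1.3
    x: str
    y: str
    ts: str


def continuity_issues(task: TaskRelation, device: DeviceRelation) -> List[str]:
    """List the continuity problems found for one task/device pairing."""
    issues = []
    # Simultaneous tasks served by sequentially used devices force the user
    # to stop one interaction before starting another.
    if is_simultaneous(task.ts) and not is_simultaneous(device.ts):
        issues.append("sequential devices for simultaneous tasks")
    # A device outside the central context forces a change of focus.
    if Context.PERIPHERAL in (device.x_context, device.y_context):
        issues.append("device in peripheral context")
    return issues


# Relations transcribed from Sections 3.1.1 and 3.1.3.
device_rel_1 = DeviceRelation("WD or TV", "MD", "equal",
                              Context.PERIPHERAL, Context.CENTRAL)
device_rel_2 = DeviceRelation("surgical tool", "microscope buttons", "before",
                              Context.CENTRAL, Context.CENTRAL)
task_rel_1 = TaskRelation("pre-operative sub-task", "intra-operative sub-task",
                          "before")
task_rel_2 = TaskRelation("neuronavigation", "surgical procedure", "during")

for device in (device_rel_1, device_rel_2):
    print(device.x, "/", device.y, "->", continuity_issues(task_rel_2, device))
```

Returning a list of issues rather than a single label keeps the two observations of the evaluation separate: the sequential device use found with device relation 2, and the change of focus caused by the peripheral displays of device relation 1.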
When there is simultaneous task synchronization, the synchronization between the devices or tools required to perform those tasks should be simultaneous too, and they should be inserted in the central context of the user’s task focus. To choose how a device should be synchronized, we need to consider the task synchronization and the location of the task focus. Regarding spatial and physical integration, we need to consider which media channels are required to perform certain tasks and whether the device should be inserted in the central or peripheral context of task accomplishment.

4. Conclusion and future works

We have described a model for designing AR systems that considers the characteristics of synchronization and integration resulting from relationships between all entities involved in the domain of discourse. At a high level of abstraction, the methodology applied here proves useful for designing AR systems. It coherently integrates constraints that favor continuous interaction over discrete interaction with potential inconsistencies. In fact, these characteristics are important when the task requires fusion and alignment of different objects and when the interactive objects and information should be available in the central context of the user’s task. In particular, we are interested in exploring the synchronization and integration characteristics presented in this work together with model-based approaches [10] for developing AR systems [3,11] that allow users to dynamically reshuffle interaction devices at run-time and that provide better continuity during the interaction. For this purpose we intend to run a series of usability experiments to identify the potential combinations of input and output modalities that users can accommodate while preserving a certain cognitive load.

5. References

[1] Allen, J. F., Maintaining knowledge about temporal intervals. Communications of the ACM, Vol. 26, No. 11, pp. 832-843, November 1983.
[2] Azuma, R. T., A survey of Augmented Reality. Presence: Teleoperators and Virtual Environments, Vol. 6, No. 4 (August), pp. 355-385, 1997.
[3] Dubois, E., Silva, P. P. and Gray, P., Notational Support for the Design of Augmented Reality Systems. Proceedings of DSV-IS'2002, Rostock, Germany, June 12-14, 2002.
[4] Edwards, P. J., King, P. A., Maurer, R. C., Jr., D. A. de Cunha, D. J. Hawkes, D. L. G. Hill, R. P. Gaston, M. R. Fenlon, A. Jusczyzck, A. J. Strong, C. L. Chandler, M. J. Gleeson, "Design and Evaluation of a System for Microscope-Assisted Guided Interventions (MAGI)", IEEE Transactions on Medical Imaging, 19(11):pp-pp, November 2000.
[5] Graham, T. C. N., Watts, L. A., Calvary, G., Coutaz, J., Dubois, E., Nigay, L., "A Dimension Space for the Design of Interactive Systems within their Physical Environments", DIS2000, 17-19 August 2000, ACM Publ., New York, USA, pp. 406-416, 2000.
[6] Hiroshi Ishii and Brygg Ullmer, "Tangible Bits: Towards Seamless Interfaces between People, Bits and Atoms". Conference on Human Factors in Computing Systems, Atlanta, Georgia, USA, 1997.
[7] King, P. Edwards, C. R. Maurer, Jr., D. A. de Cunha, R. P. Gaston, M. Clarkson, D. L. G. Hill, D. J. Hawkes, M. R. Fenlon, A. J. Strong, T. C. S. Cox, M. J. Gleeson, "Stereo Augmented Reality in the Surgical Microscope", Presence: Teleoperators and Virtual Environments, 9(4):360-368, 2000.
[8] Nigay, L., Dubois, E., Troccaz, J., Compatibility and Continuity in Augmented Reality Systems. I3 Spring Days Workshop, Continuity in Future Computing Systems, Porto, Portugal, April 23-24, 2001.
[9] Norman, D. A. and Draper, S. W. (Eds.), "User centered system design: New perspectives on human-computer interaction". Hillsdale, NJ: Lawrence Erlbaum Associates, 1986.
[10] Paternò, F., Model-based Design and Evaluation of Interactive Applications, Springer-Verlag, Berlin, 2000.
[11] Tanriverdi, V. and Jacob, R. J. K., "VRID: A Design Model and Methodology for Developing Virtual Reality Interfaces", Proc. ACM VRST 2001 Symposium on Virtual Reality Software and Technology, ACM Press, Banff, Canada, 2001.