UNSW CSE Technical Reports


Send queries to reports@cse.unsw.edu.au.

TR # Title Author(s) and Affiliation(s) Abstract
201901 Localization of Lumbar and Thoracic Vertebrae in 3D CT Datasets by Combining Deep Reinforcement Learning with Imitation Learning Sankaran Iyer
School of Computer Science and Engineering,
University of New South Wales
Sankaran.iyer@student.unsw.edu.au

Arcot Sowmya
School of Computer Science and Engineering,
University of New South Wales
a.sowmya@unsw.edu.au

Alan Blair
School of Computer Science and Engineering,
University of New South Wales
alan@blair.name

Christopher White
Department of Endocrinology and Metabolism
Prince of Wales Hospital, NSW, Australia
Christopher.White@health.nsw.gov.au

Laughlin Dawes
Department of Medical Imaging
Prince of Wales Hospital, NSW, Australia
Laughlin.Dawes@health.nsw.gov.au

Daniel Moses
Department of Medical Imaging
Prince of Wales Hospital, NSW, Australia
Daniel.Moses@health.nsw.gov.au
Landmark detection and 3D localization are often important steps in the analysis of medical images. These tasks are challenging due to the natural variability of human anatomical structures. We present a novel approach to lumbar and thoracic vertebrae localization by combining Deep Reinforcement Learning with Imitation Learning. The method involves navigating a 3D bounding box to the target landmark, followed by adjustment of the bounding box dimensions to enclose the region of interest (ROI). Two different 3D Convolutional Neural Networks (CNNs) are used, one for learning the navigation in the coordinate directions, the other for predicting the bounding box dimensions. The algorithm is a modification of Deep Reinforcement Learning (Deep Q Networks), with the random search for navigation replaced by guiding the movement in an optimal coordinate direction using Imitation Learning. To improve the accuracy of detection, three different CNN architectures are used and the combined results are provided to the next stage for analysis. Threefold cross-validation is used to evaluate localization performance on two separate datasets, one each for the lumbar and thoracic spine. The method achieves a mean 3D Jaccard Index of 76.96% (Dice Coefficient 85.92%) on the lumbar spine dataset after training on 115 Computed Tomography (CT) images and testing on 29. The corresponding figures for the thoracic spine are a Jaccard Index of 74.39% (Dice Coefficient 85%) after training on 105 and testing on 27. The results for this new approach are promising and the method is applicable to localization of any ROI in a 3D dataset.
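
The guided navigation replaces Deep Q-learning's random exploration with a teacher that always moves the bounding box one step along the coordinate axis that most reduces the distance to the target landmark. Below is a minimal sketch of such a teacher policy and its greedy roll-out; the state encoding, step size, and the 3D CNN that learns to imitate the teacher are assumptions, not the authors' implementation.

    import numpy as np

    def teacher_action(box_center, target_center):
        """Imitation-learning label: the coordinate move (axis, direction)
        that most reduces the distance to the target landmark."""
        delta = np.asarray(target_center, float) - np.asarray(box_center, float)
        axis = int(np.argmax(np.abs(delta)))        # x, y or z with largest error
        return axis, (1 if delta[axis] > 0 else -1)

    def navigate(box_center, target_center, step=1.0, max_steps=1000):
        """Greedy roll-out of the teacher; at test time a trained 3D CNN
        would predict (axis, direction) from the image patch instead."""
        center = np.asarray(box_center, float)
        target = np.asarray(target_center, float)
        for _ in range(max_steps):
            if np.all(np.abs(target - center) < step):
                break                               # converged on the landmark
            axis, direction = teacher_action(center, target)
            center[axis] += direction * step
        return center
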
201808 Prostate Cancer Recognition by Region-based Heterogeneity Analysis of T2w-MRI Gihan Samarasinghe
School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
gihan.samarasinghe@unsw.edu.au

Arcot Sowmya
School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
arcot.sowmya@unsw.edu.au

Daniel A. Moses
Prince of Wales Clinical School, Prince of Wales Hospital, Faculty of Medicine, University of New South Wales, Sydney, Australia. daniel.moses@unsw.edu.au
Recognition of prostate cancer is important prior to treatment, especially inside the prostate peripheral zone where the majority of tumours occur. A classification framework for recognition of suspicious peripheral zone lesions by region-based heterogeneity analysis in prostate T2w-Magnetic Resonance Images (MRI) is developed and evaluated. The most critical component in the proposed framework is the feature extraction method. Four different novel features are derived for a selected peripheral zone lesion, based on the heterogeneity of the whole peripheral zone on the corresponding 2D MRI slice. When deriving these features, in addition to the relative intensity distribution of regions within the remaining peripheral zone, a distance function that measures the distance of a region from the lesion is taken into account. This guarantees that adjacent regions, where the influence of the selected lesion is higher, are assigned a lower weight in the computation of intensity discrimination. The developed features were used to build a probabilistic Naive Bayes classifier using 108 peripheral zone lesions in 3.0T T2-weighted MRI datasets for 56 patients, and evaluated against the ground truth provided by an expert radiologist. Quantitative results obtained using 5-fold cross-validation show that the classification performance depends on the distance function used in feature extraction. A second-order distance function achieves the best classification results (90.8% sensitivity, 92.3% specificity, 91.5% accuracy, and 91.6 AUC), and significantly outperforms traditional image features found in the literature that are based on true intensities and specific intensity within the peripheral zone.
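
The core idea is that peripheral-zone regions far from the candidate lesion should contribute more to the intensity-discrimination features than adjacent regions, which the lesion itself influences. A minimal sketch of one such distance-weighted feature is below, with a second-order weighting; the region representation and the exact feature definition are illustrative assumptions, not the paper's formulas.

    import numpy as np

    def weighted_intensity_discrimination(lesion_mean, regions, order=2):
        """One scalar heterogeneity feature for a candidate lesion.
        `regions` is a list of (mean_intensity, distance_to_lesion) pairs
        covering the rest of the peripheral zone; weights grow as d**order,
        so nearby regions (most influenced by the lesion) count less."""
        weights = np.array([d ** order for _, d in regions], float)
        weights /= weights.sum()                  # normalize to a distribution
        diffs = np.array([lesion_mean - m for m, _ in regions], float)
        return float(np.dot(weights, diffs))

    # A lesion darker than the distant peripheral zone scores strongly negative.
    print(weighted_intensity_discrimination(
        lesion_mean=80.0, regions=[(85.0, 1.0), (120.0, 4.0), (130.0, 6.0)]))
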
201807 Deep Learning for Volumetric Segmentation in Spatio-temporal Data: Application to Segmentation of Prostate in DCE-MRI Jian Kang
School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
jian.kang@student.unsw.edu.au

Gihan Samarasinghe
School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
gihan.samarasinghe@unsw.edu.au

Upul Senanayake
School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
upul.senanayake@unsw.edu.au

Sailesh Conjeti
German Center for Neurodegenerative Diseases, Bonn, Germany and
Computer Aided Medical Procedures, Technische Universität München, Munich, Germany
sailesh.conjeti@tum.de

Arcot Sowmya
School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
arcot.sowmya@unsw.edu.au
Segmentation of the prostate in MR images is an essential step that underpins the success of subsequent analysis methods, such as cancer lesion detection and registration between different modalities. This work focuses on leveraging deep learning for the analysis of longitudinal volumetric datasets, particularly for the task of segmentation, and presents a proof-of-concept for segmentation of the prostate in 3D+T DCE-MRI sequences. A two-stream processing pipeline is proposed for this task, comprising a spatial stream modelled using a volumetric fully convolutional network and a temporal stream modelled using recurrent neural networks with Long Short-Term Memory (LSTM) units. The predictions of the two streams are fused using deep neural networks. The proposed method has been validated on a public benchmark dataset of 17 patients, each with 40 temporal volumes. When averaged over three experiments, a highly competitive Dice overlap score of 0.8688 and sensitivity of 0.8694 were achieved. As a spatio-temporal segmentation method, it can easily migrate to other datasets.
201806 Robust CNNs for detecting collapsed buildings with crowd-sourced data Matthew Gibson
School of Computer Science and Engineering,
University of New South Wales
matthew.gibson1@student.unsw.edu.au

Dhruv Kaushik
School of Computer Science and Engineering,
University of New South Wales
dhruv.kaushik@student.unsw.edu.au


Arcot Sowmya
School of Computer Science and Engineering,
University of New South Wales
sowmya@cse.unsw.edu.au
Wildfires are increasingly common and responsible for widespread property damage and loss of life. Rapid and accurate identification of damage to buildings and other infrastructure can heavily affect the efficacy of disaster response during and after a wildfire. We have developed a dataset and a convolutional neural network-based object detection model for rapid identification of collapsed buildings from aerial imagery. We show that a baseline model built with crowd-sourced data can achieve better-than-chance mean average precision of 0.642, which can be further improved to 0.733 by constructing a new, more robust loss function.
201805 Convolutional Neural Networks for Automated Fetal Cardiac Assessment using 4D B-mode Ultrasound Manna E. Philip
School of Computer Science and Engineering,
University of New South Wales, Australia
mannaelizabeth.philip@unsw.edu.au

Arcot Sowmya
School of Computer Science and Engineering,
University of New South Wales, Australia
a.sowmya@unsw.edu.au

Hagai Avnet
Institute of Obstetrics and Gynecological Imaging and Fetal Therapy,
Sheba Medical Center, Israel

Ana Ferreira
School of Women's and Children's Health,
Faculty of Medicine,
University of New South Wales, Australia

Gordon Stevenson
School of Women's and Children's Health,
Faculty of Medicine,
University of New South Wales, Australia

Alec Welsh
School of Women's and Children's Health,
Faculty of Medicine,
University of New South Wales, Australia
and
Department of Maternal-Fetal Medicine,
Royal Hospital for Women,
Randwick, Australia
Structural and functional assessment of the fetal heart is currently performed using ultrasound biomarkers that require annotation by a trained clinician. This requires experience and expertise and is subject to inter- and intra-observer variability. In this work, a Convolutional Neural Network (CNN) was implemented for segmentation of the fetal annulus, and automated measurements of the excursion of the mitral and tricuspid valve annular planes (TAPSE/MAPSE) were evaluated against manual annotation. After training, the network was able to achieve a Dice score of 0.78 and the excursion measures had an RMSE of < 0.16 cm. Results show the feasibility of using a CNN to detect the fetal annulus and subsequently to measure these imaging biomarkers for cardiac functional assessment.
201804 Correct high level synthesis of triple modular redundant user circuits for FPGAs Michael Bernardi
School of Computer Science and Engineering,
University of New South Wales, Australia
mrmbernardi@gmail.com

Ediz Cetin
School of Engineering,
Macquarie University
ediz.cetin@mq.edu.au

Oliver Diessel
School of Computer Science and Engineering,
University of New South Wales, Australia
o.diessel@unsw.edu.au
This report outlines the high-level synthesis (HLS) tool TLegUp, an extension of LegUp that produces circuits with triple modular redundancy (TMR) for use in space-based applications. We cover the background and motivation for the project, and the need to show that the software produces correct output, both to verify the current implementation and to provide a framework for future development. A three-tiered approach for validating correctness is described and the top two tiers are applied to TLegUp. Finally, the report explores the results of the validation, as well as future goals and considerations for the project. This work is substantially based on the 4th year Bachelor of Engineering thesis by Michael Bernardi.
201803 Localization of Lumbar and Thoracic Vertebrae in 3D CT Datasets by Combining Deep Reinforcement Learning with Imitation Learning Sankaran Iyer
School of Computer Science and Engineering,
University of New South Wales
Sankaran.iyer@student.unsw.edu.au

Arcot Sowmya
School of Computer Science and Engineering,
University of New South Wales
a.sowmya@unsw.edu.au

Alan Blair
School of Computer Science and Engineering,
University of New South Wales
alan@blair.name

Christopher White
Department of Endocrinology and Metabolism
Prince of Wales Hospital, NSW, Australia
Christopher.White@health.nsw.gov.au

Laughlin Dawes
Department of Medical Imaging
Prince of Wales Hospital, NSW, Australia
Laughlin.Dawes@health.nsw.gov.au

Daniel Moses
Department of Medical Imaging
Prince of Wales Hospital, NSW, Australia
Daniel.Moses@health.nsw.gov.au
Landmark detection and 3D localization in CT datasets is challenging due to the natural variability of human anatomical structures. We present a novel approach to lumbar and thoracic vertebrae localization that combines Deep Reinforcement Learning with Imitation Learning. The method involves navigating a 3D bounding box to the target landmark, followed by adjustment of the bounding box dimensions to enclose the region of interest. Two different 3D Convolutional Neural Networks (CNNs) were utilized, one for learning the navigation in the coordinate directions and the other for predicting the bounding box dimensions. Deep Reinforcement Learning was used to learn the direction of navigation, with random search replaced by guided search using Imitation Learning. The method achieved a mean 3D Jaccard Index of 69.96% for the lumbar spine (training on 62 datasets, testing on 20) and 67.75% for the thoracic spine (training on 74 datasets, testing on 20).
201801 A Bandwidth-Aware Authentication Design for Packet Integrity Detection in Untrusted Third-party NoCs Mubashir Hussain
School of Computer Science and Engineering,
University of New South Wales
mhussain@unsw.edu.au

Hui Guo
School of Computer Science and Engineering,
University of New South Wales
huig@unsw.edu.au
Bandwidth is a fundamental issue in network-on-chip (NoC) design, and the increasing use of third-party NoC IPs adds security as another design concern. A third-party NoC may contain hardware Trojans (small malicious hardware modifications) that can break the confidentiality and integrity of the data transferred over the NoC, thereby exposing the system to varied attacks. Encryption can be applied to protect data confidentiality; for data integrity, authentication is often used. A general authentication approach for network communication uses a tag to identify the data: the tag is attached to the data at the source, and the data is verified against the tag at the destination. If the data is altered, its tag becomes invalid and the change can be detected. For the authentication to be effective, the tag should be sufficiently large. But large tags, when transferred over the NoC, consume considerable network bandwidth, potentially conflicting with the system bandwidth requirement. In this work, we propose a packet data authentication design in which a packet is progressively authenticated against a set of small tag segments, such that the authentication overhead is low yet the effective tag size is large. Our experiments on an 8x8 mesh NoC for a set of applications show that with an 8-bit tag size, our authentication design can achieve a detection rate of 99.99%, better than the traditional design with 32-bit tags, while avoiding the significant overheads incurred by the traditional design: about 36% on bandwidth, 12% on packet latency, and 56% on area.
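
The idea can be illustrated in software: attach a small cumulative tag segment to each flit of a packet and check segments as they arrive, so each transfer carries only a few tag bits while a forgery must survive every check. A minimal sketch with HMAC standing in for the (unspecified) on-chip tag function; the 8-bit segment size matches the evaluated configuration, everything else is an assumption.

    import hmac, hashlib

    def segment(key, flits_so_far):
        """8-bit tag segment over the packet prefix received so far."""
        return hmac.new(key, b"".join(flits_so_far), hashlib.sha256).digest()[0]

    def send(key, flits):
        """Source: each flit carries one cumulative tag segment."""
        return [(f, segment(key, flits[:i + 1])) for i, f in enumerate(flits)]

    def receive(key, tagged_flits):
        """Destination: progressive authentication. Altering an earlier flit
        invalidates all later segments, so the effective tag grows with the
        packet length even though each individual check is only 8 bits."""
        seen = []
        for flit, seg in tagged_flits:
            seen.append(flit)
            if segment(key, seen) != seg:
                return False                     # tampering detected early
        return True

    print(receive(b"k", send(b"k", [b"flit0", b"flit1", b"flit2"])))   # True
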
201710 An MPSoC Based Embedded System Solution for Short Read Genome Alignment Vikkitharan Gnanasambandapillai
Arash Bayat
Sri Parameswaran
School of Computer Science and Engineering,
University of New South Wales
vikki.g@unsw.edu.au
a.bayat@unsw.edu.au
sri.parameswaran@unsw.edu.au
Computational needs for genome processing are often met by enormous servers or by cloud computers. Even though there has been some work on implementing aspects of genome processing in GPUs and FPGAs, these are often accelerators for such servers and not stand-alone systems. In this paper, for the first time, we present a method to move the alignment process entirely to embedded processors. Such a system is useful in a variety of situations where significant networked infrastructure is not available, and where privacy is a concern. A ring pipelined processor architecture for short read alignment, based on partitioned genome references, is shown. A timeout-based alignment method is proposed to prevent unnecessary exhaustive search. The proposed partitioning method allows an entire human genome to be processed using small embedded processors. Experimental results show that the proposed solution speeds up performance by approximately seven times with 16 embedded processors when compared to a linear pipelined system. It is expected that the proposed solution will lead to a portable genome analysis device, significantly reducing cost and testing time.
201708 A Formal Description Language for General Epistemic Games Michael Thielscher
School of Computer Science and Engineering,
University of New South Wales
mit@unsw.edu.au
GDL-III, a description language for general game playing with imperfect information and introspection, supports the specification of epistemic games. These are characterised by rules that depend on the knowledge of players. GDL-III provides a simpler language for representing actions and knowledge than existing formalisms: domain descriptions require neither explicit axioms about the epistemic effects of actions, nor explicit specifications of accessibility relations. We develop a formal semantics for GDL-III and demonstrate that this language, despite its syntactic simplicity, is expressive enough to model the famous Muddy Children domain. We also show that it significantly enhances the expressiveness of its predecessor GDL-II by formally proving that termination of games becomes undecidable, and we present experimental results with a reasoner for GDL-III applied to general epistemic puzzles. This report is an extended version of "GDL-III: A Description Language for Epistemic General Game Playing," which will be published at the International Joint Conference on Artificial Intelligence (IJCAI-17).
201707 Effective and Efficient Dynamic Graph Coloring Long Yuan
School of Computer Science and Engineering,
University of New South Wales
longyuan@cse.unsw.edu.au

Lu Qin
University of Technology, Sydney
lu.qin@uts.edu.au


Xuemin Lin
School of Computer Science and Engineering,
University of New South Wales
lxue@cse.unsw.edu.au

Lijun Chang
School of Computer Science and Engineering,
University of New South Wales
ljchang@cse.unsw.edu.au

Wenjie Zhang,
School of Computer Science and Engineering,
University of New South Wales
zhangw@cse.unsw.edu.au
Graph coloring is a fundamental graph problem that is widely applied in a variety of applications. The aim of graph coloring is to minimize the number of colors used to color the vertices in a graph such that no two adjacent vertices have the same color. Existing solutions for graph coloring mainly focus on computing a good coloring for a static graph. However, since many real-world graphs are highly dynamic, in this paper we aim to incrementally maintain the graph coloring as the graph is dynamically updated. Our proposal has two goals: high effectiveness and high efficiency. To achieve high effectiveness, we maintain the graph coloring in such a way that the coloring result is consistent with one of the best static graph coloring algorithms. To achieve high efficiency, we investigate efficient incremental algorithms that update the graph coloring by exploring a small number of vertices. The algorithms are designed based on the observation that the number of vertices whose colors change after a graph update is usually very small. We design a color-propagation based algorithm which only explores the vertices within the 2-hop neighbors of the color-changed vertices. We then propose a novel color index to maintain some summary color information and thus bound the explored vertices within the neighbors of the color-changed vertices. Moreover, we derive some effective pruning rules to further reduce the number of propagated vertices. The results from extensive performance studies on real and synthetic graphs from various domains demonstrate the high effectiveness and efficiency of our approach.
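
The flavour of incremental maintenance can be sketched for an edge insertion: a conflict can only arise at the two endpoints, and repairing it touches only the local neighbourhood. The sketch below restores a proper coloring with a greedy repair; the paper's contribution, propagating color changes through 2-hop neighbours (with a color index and pruning rules) so the result matches a good static coloring, is deliberately not reproduced here.

    def smallest_free_color(adj, color, v):
        """Smallest color not used by any neighbour of v."""
        used = {color[n] for n in adj[v]}
        c = 0
        while c in used:
            c += 1
        return c

    def insert_edge(adj, color, u, v):
        """Repair the coloring after adding edge (u, v); only the endpoint
        and its neighbours are inspected, not the whole graph."""
        adj[u].add(v)
        adj[v].add(u)
        if color[u] == color[v]:
            color[v] = smallest_free_color(adj, color, v)

    adj = {0: {1}, 1: {0}, 2: set()}
    color = {0: 0, 1: 1, 2: 0}
    insert_edge(adj, color, 0, 2)     # 0 and 2 both had color 0
    print(color)                      # {0: 0, 1: 1, 2: 1}
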
201706 A Semantically Motivated Approach to Compute ROUGE Scores Elaheh ShafieiBavani
School of Computer Science and Engineering,
University of New South Wales
Data61 CSIRO
elahehs@cse.unsw.edu.au

Mohammad Ebrahimi
School of Computer Science and Engineering,
University of New South Wales
Data61 CSIRO
mohammade@cse.unsw.edu.au

Raymond Wong
School of Computer Science and Engineering,
University of New South Wales
Data61 CSIRO
wong@cse.unsw.edu.au

Fang Chen
School of Computer Science and Engineering,
University of New South Wales
Data61 CSIRO
fang@cse.unsw.edu.au
ROUGE is one of the first and most widely used evaluation metrics for text summarization. However, its assessment relies merely on surface similarities between peer and model summaries. Consequently, ROUGE is unable to fairly evaluate abstractive summaries, which include lexical variations and paraphrasing. Exploring the effectiveness of lexical resource-based models to address this issue, we adopt a graph-based algorithm into ROUGE to capture the semantic similarities between peer and model summaries. Our semantically motivated approach computes ROUGE scores based on both lexical and semantic similarities. Experimental results on the TAC AESOP datasets indicate that exploiting the lexico-semantic similarity of the words used in summaries significantly helps ROUGE correlate better with human judgments.
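
The change can be illustrated with ROUGE-1 recall: instead of crediting only exact unigram matches, each model-summary token is credited with its best lexical or semantic match in the peer summary. A minimal sketch; the toy synonym table stands in for the report's graph-based, lexical-resource similarity.

    def semantic_rouge1_recall(peer_tokens, model_tokens, sim):
        """ROUGE-1 recall where a model token scores 1.0 for an exact match
        in the peer summary, and otherwise its best sim(w, v) in [0, 1]."""
        if not model_tokens:
            return 0.0
        total = sum(max((1.0 if w == v else sim(w, v)) for v in peer_tokens)
                    for w in model_tokens)
        return total / len(model_tokens)

    SYN = {("car", "automobile"): 0.9}             # stand-in similarity table
    sim = lambda a, b: SYN.get((a, b), SYN.get((b, a), 0.0))

    peer = "the car stopped".split()
    model = "the automobile stopped".split()
    print(semantic_rouge1_recall(peer, model, sim))   # ~0.97 vs 0.67 for plain ROUGE-1
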
201705 Scheduling Considerations for Voter Checking in FPGA-based TMR Systems Nguyen T. H. Nguyen
School of Computer Science and Engineering,
University of New South Wales
h.nguyentran@unsw.edu.au

Ediz Cetin
School of Engineering,
Macquarie University
ediz.cetin@mq.edu.au

Oliver Diessel
School of Computer Science and Engineering,
University of New South Wales
o.diessel@unsw.edu.au
Field-Programmable Gate Arrays (FPGAs) are susceptible to radiation-induced Single Event Upsets (SEUs). A common technique for dealing with SEUs is Triple Modular Redundancy (TMR) combined with Module-based configuration memory Error Recovery (MER). By triplicating components and voting on their outputs, TMR helps localize configuration memory errors, and by reconfiguring the faulty component, MER swiftly corrects the errors. However, the order in which the voters of TMR components are checked has an inevitable impact on the overall system reliability. In this paper, we outline an approach for computing the reliability of TMR-MER systems that consist of finitely many components. Using the derived reliability models, we demonstrate that system reliability is improved when the critical components are checked for configuration memory errors more frequently than when they are checked in round-robin order. We propose a genetic algorithm for finding a voter checking schedule that maximizes system reliability for systems consisting of finitely many TMR components. Simulation results indicate that the mean time to failure of TMR-MER systems can be increased by up to 100% when Variable-Rate Voter Checking (VRVC), rather than round-robin checking, is used. We show that the power used to eliminate configuration memory errors in an exemplar TMR-MER system employing VRVC is reduced while system reliability remains high. We also demonstrate that errors can be detected 30% faster on average when the system employs VRVC instead of round-robin voter checking.
201704 Using architectural modelling and simulation to predict latency of blockchain-based systems Rajitha Yasaweerasinghelage
School of Computer Science and Engineering, University of New South Wales
Data61, CSIRO, Australia
Rajitha.Yasaweerasinghelage@data61.csiro.au

Mark Staples
School of Computer Science and Engineering, University of New South Wales
Data61, CSIRO, Australia
Mark.Staples@data61.csiro.au

Ingo Weber
School of Computer Science and Engineering, University of New South Wales
Data61, CSIRO, Australia
Ingo.Weber@data61.csiro.au
Blockchain is an emerging technology for sharing transactional data and computation without using a central trusted third party. Using a blockchain instead of traditional databases or protocols is an architectural choice that creates trade-offs between non-functional requirements such as performance, cost, and security. However, little is known about predicting the behaviour of blockchain-based systems. This paper shows the feasibility of using architectural performance modelling and simulation tools to predict the latency of blockchain-based systems. We use established tools and techniques, but explore new blockchain-specific issues such as the configuration of the number of confirmation blocks and inter-block times. We report on a lab-based experimental study using an incident management system, showing predictions of median system-level response time with a relative error mostly under 10%. We discuss how the approach can be used to support architectural decision-making during the design of blockchain-based systems.
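
The blockchain-specific part of such a latency model is the wait for a transaction to be included and then confirmed by a configured number of blocks, with random inter-block times. A minimal Monte Carlo sketch under common simplifying assumptions (exponentially distributed inter-block times, inclusion in the next mined block); the study itself uses architectural performance-modelling tools rather than this direct simulation.

    import random

    def tx_latency(mean_block_time=600.0, confirmations=6):
        """Seconds from submission until the block containing the transaction
        plus (confirmations - 1) further blocks are mined. Mining is modelled
        as memoryless, so each inter-block wait is exponentially distributed."""
        return sum(random.expovariate(1.0 / mean_block_time)
                   for _ in range(confirmations))

    def median_latency(runs=10_000, **kw):
        samples = sorted(tx_latency(**kw) for _ in range(runs))
        return samples[len(samples) // 2]

    # Median minutes to 6 confirmations with 10-minute blocks (~57 minutes).
    print(median_latency(mean_block_time=600.0, confirmations=6) / 60.0)
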
201703 DominoHash - a fast hash function for bioinformatic applications suitable for custom hardware acceleration Arash Bayat
School of Computer Science and Engineering,
University of New South Wales
a.bayat@unsw.edu.au

Aleksandar Ignjatovic
School of Computer Science and Engineering,
University of New South Wales
ignjat@cse.unsw.edu.au

Bruno Gaeta
School of Computer Science and Engineering,
University of New South Wales
bgaeta@unsw.edu.au

Sri Parameswaran
School of Computer Science and Engineering,
University of New South Wales
sridevan@cse.unsw.edu.au
The hash-table is a widely used data structure in bioinformatics, for database searching as well as DNA-read mapping. Due to the increasing growth in the size of sequenced data, hardware acceleration has been used to speed up related algorithms. We have developed an alternative hash function to MurmurHash, a hash function commonly used in bioinformatic applications. The main advantage of the proposed hash function (DominoHash) is its suitability for acceleration by custom-designed hardware. Software and hardware implementations, as well as the dataset used in our evaluation, are available at sites.google.com/site/dmhashf.
201701 Fast accurate sequence alignment using Maximum Exact Matches Arash Bayat
School of Computer Science and Engineering,
University of New South Wales
a.bayat@unsw.edu.au

Bruno Gaeta
School of Computer Science and Engineering,
University of New South Wales
bgaeta@unsw.edu.au

Aleksandar Ignjatovic
School of Computer Science and Engineering,
University of New South Wales
ignjat@cse.unsw.edu.au

Sri Parameswaran
School of Computer Science and Engineering,
University of New South Wales
sridevan@cse.unsw.edu.au
Sequence alignment is a central technique in biological sequence analysis, and dynamic programming is widely used to perform an optimal alignment of two sequences. While efficient, dynamic programming is still costly in terms of time and memory when aligning very large sequences. We describe MEM-Align, an optimal alignment algorithm that operates on Maximal Exact Matches (MEMs) between two sequences, instead of processing every symbol individually. In its original definition, MEM-Align is guaranteed to find the optimal alignment, but its execution time is not manageable unless optimisations are applied that decrease its accuracy. However, it is possible to configure these optimisations to balance speed and accuracy. The resulting algorithm outperforms existing solutions such as GeneMyer and Ukkonen. MEM-Align can replace edit distance-based aligners or provide a faster alternative to Smith-Waterman alignment for most of their applications, including the final stage of short read mapping.
201618 Deep Learning in Medical Imaging: A Review Upul Senanayake
School of Computer Science and Engineering,
University of New South Wales
upul.senanayake@unsw.edu.au

Jian Kang
School of Electronics and Electrical Engineering,
University of New South Wales
jian.kang@unsw.edu.au

Matthew Gibson
School of Computer Science and Engineering,
University of New South Wales
matthew.gibson1@unsw.edu.au

Arathy Satheesh Babu
School of Computer Science and Engineering,
University of New South Wales
a.satheeshbabu@unsw.edu.au

Anastasia Levenkova
School of Computer Science and Engineering,
University of New South Wales
a.levenkova@unsw.edu.au

Gihan Samarasinghe
School of Computer Science and Engineering,
University of New South Wales
gihan.samarasinghe@unsw.edu.au

Arcot Sowmya
School of Computer Science and Engineering,
University of New South Wales
a.sowmya@unsw.edu.au
Computer vision and machine learning techniques have naturally found their way into medical image analysis, as there are strong synergies between the two fields. Medical imaging professionals typically draw inspiration from computer vision techniques to register images from multiple modalities, segment regions of interest (ROIs) and measure ROIs to streamline the analysis pipeline, and from machine learning techniques to build models that are capable of integral tasks such as anomaly detection, progression tracking and recognition. Historically, computer vision techniques have been adopted quickly, and with the recent significant improvements demonstrated by deep learning in other fields, we believe that medical imaging professionals can benefit from it as well. Deep learning is a collection of techniques that consists of new as well as old algorithms from the neural networks community. Ideas such as convolutional neural networks and autoencoders go back to the 1980s and 1990s, while concepts such as deep belief networks and long short-term memory in recurrent neural networks are relatively new. Since its advent, deep learning has mostly been perceived as a black box, and its adoption in medical imaging has been relatively slow. In this report, we attempt to untangle the techniques collectively known as deep learning and present their building blocks concisely. We also present a comprehensive review of state-of-the-art applications of deep learning techniques in medical imaging. It is our belief that we can inspire more medical professionals to adopt deep learning techniques in their analysis pipelines and further refine the techniques that have surfaced so far.
201617 Data Curation APIs Seyed-Mehdi-Reza (Amin) Beheshti
School of Computer Science and Engineering,
University of New South Wales
sbeheshti@cse.unsw.edu.au

Alireza Tabebordbar
School of Computer Science and Engineering,
University of New South Wales
a.tabebordbar@cse.unsw.edu.au

Boualem Benatallah
School of Computer Science and Engineering,
University of New South Wales
boualem@cse.unsw.edu.au

Reza Nouri
School of Computer Science and Engineering,
University of New South Wales
snouri@cse.unsw.edu.au
Understanding and analyzing big data is firmly recognized as a powerful and strategic priority. For deeper interpretation of and better intelligence with big data, it is important to transform raw data (unstructured, semi-structured and structured data sources, e.g., text, video and image data sets) into curated data: contextualized data and knowledge that is maintained and made available for use by end-users and applications. In particular, data curation acts as the glue between raw data and analytics, providing an abstraction layer that relieves users from time-consuming, tedious and error-prone curation tasks. In this context, the data curation process becomes a vital analytics asset for increasing added value and insights. In this paper, we identify and implement a set of curation APIs and make them available (as an open source project on GitHub) to researchers and developers to assist them in transforming their raw data into curated data. The curation APIs enable developers to easily add features into their applications, such as: extracting keywords, parts of speech, and named entities such as Persons, Locations, Organizations, Companies, Products, Diseases and Drugs; providing synonyms and stems for extracted information items, leveraging lexical knowledge bases for the English language such as WordNet; linking extracted entities to external knowledge bases such as Google Knowledge Graph and Wikidata; discovering similarity among extracted information items, such as calculating similarity between string, number, date and time data; classifying, sorting and categorizing data into various types, forms or any other distinct classes; and indexing structured and unstructured data.
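
As a flavour of the kind of curation step being bundled, the snippet below uses NLTK's WordNet interface and Porter stemmer to enrich an extracted keyword with synonyms and a stem; it is an illustrative stand-in, not the project's own API.

    import nltk
    from nltk.corpus import wordnet
    from nltk.stem import PorterStemmer

    nltk.download("wordnet", quiet=True)        # one-off corpus fetch

    def enrich(keyword):
        """Attach WordNet synonyms and a stem to an extracted item."""
        synonyms = {lemma.name().replace("_", " ")
                    for synset in wordnet.synsets(keyword)
                    for lemma in synset.lemmas()}
        return {"keyword": keyword,
                "stem": PorterStemmer().stem(keyword),
                "synonyms": sorted(synonyms - {keyword})}

    print(enrich("company"))
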
201616 Leveraging Set Correlations in Exact Set Similarity Join Xubo Wang
School of Computer Science and Engineering,
University of New South Wales
xwang@cse.unsw.edu.au

Lu Qin
The Centre for Quantum Computation & Intelligent Systems (QCIS),
University of Technology Sydney
lu.qin@uts.edu.au

Xuemin Lin
School of Computer Science and Engineering,
University of New South Wales
lxue@cse.unsw.edu.au

Ying Zhang
The Centre for Quantum Computation & Intelligent Systems (QCIS),
University of Technology Sydney
ying.zhang@uts.edu.au

Lijun Chang
School of Computer Science and Engineering,
University of New South Wales
ljchang@cse.unsw.edu.au
Exact set similarity join, which finds all the similar set pairs from two collections of sets, is a fundamental problem with a wide range of applications. Existing solutions for set similarity join follow a filtering-verification framework, which generates a list of candidate pairs by scanning indexes in the filtering phase, and reports the similar pairs in the verification phase. Though much research has been conducted on this problem, set correlations, which we find to be quite effective in improving algorithm efficiency through computational cost sharing, have never been studied. Therefore, in this paper, instead of considering each set individually, we explore set correlations at different levels to reduce the overall computational costs. First, it has been shown that most of the computational time is spent on the filtering phase, which can be quadratic in the number of sets in the worst case for the existing solutions. We thus explore index-level correlations to reduce the filtering cost to be linear in the size of the input while keeping the same filtering power. We achieve this by grouping related sets into blocks in the index and skipping useless index probes in joins. Second, we explore answer-level correlations to further improve the algorithm, based on the intuition that if two sets are similar, their answers may have a large overlap. We derive an algorithm that incrementally generates the answer of one set from the already computed answer of another similar set, rather than computing the answer from scratch, to reduce the computational cost. Finally, we conduct extensive performance studies using 20 real datasets with various data properties from a wide range of domains. The experimental results demonstrate that our algorithm outperforms all the existing algorithms across all datasets and can achieve more than an order of magnitude speed-up over the state-of-the-art algorithms.
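
The filtering-verification baseline the paper improves on can be sketched with the classic prefix filter for Jaccard similarity: two sets can only reach threshold t if their short prefixes (under a global element order) intersect, so an inverted index over prefixes generates candidates and only those pairs are verified. A minimal sketch of that baseline; the paper's cost-sharing across correlated sets is not shown.

    from collections import defaultdict

    def jaccard(a, b):
        a, b = set(a), set(b)
        return len(a & b) / len(a | b)

    def ssjoin(sets, t):
        """All pairs (i, j), i < j, with Jaccard similarity >= t."""
        sets = [sorted(s) for s in sets]       # one global element order
        index = defaultdict(list)              # element -> ids with it in prefix
        result = []
        for j, s in enumerate(sets):
            prefix = s[:len(s) - int(len(s) * t) + 1]    # prefix filter
            candidates = {i for e in prefix for i in index[e]}
            result += [(i, j) for i in candidates
                       if jaccard(sets[i], s) >= t]      # verification
            for e in prefix:
                index[e].append(j)
        return result

    print(ssjoin([{1, 2, 3, 4}, {1, 2, 3, 5}, {6, 7}], t=0.5))   # [(0, 1)]
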
201615 Exploiting multiple side channels for secret key agreement in Wireless Networks Hailun Tan
School of Computer Science and Engineering,
University of New South Wales
thailun@cse.unsw.edu.au

Vijay Sivaraman
School of Electronics and Electrical Engineering,
University of New South Wales
vijay@unsw.edu.au

Diet Ostry
School of Computer Science and Engineering,
University of New South Wales
diet.ostry@csiro.au

Sanjay Jha
School of Computer Science and Engineering,
University of New South Wales
sanjay@cse.unsw.edu.au
Generating a secret key between two wireless devices without any a priori information is a challenging problem. Extracting a shared secret from a wireless fading channel has proven to be an effective solution, but the unreliable wireless channel results in significant communication overhead, and most related work focuses on minimizing the impact of channel unreliability on the key agreement process. In this paper, we explore another direction, multiple side channels, to establish the shared key. One side channel is packet transmission power: by switching among multiple transmission power levels, the receiver is able to decode bits by comparing the Received Signal Strength of the current packet with that of the previous one. However, a side channel of transmission power changes alone is not secure enough, as an adversary could intercept the packets and infer the transmission power change pattern. Therefore, we employ another side channel, swapping the source and destination addresses of the packets. We show that an adversary is able to extract shared bits when only one of these side channels is deployed, but cannot when both side channels are utilized. We show that our approach can establish an N-bit shared key with O(N) packets.
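
The transmission-power side channel can be illustrated directly: the sender encodes each secret bit by switching or holding its power level relative to the previous packet, and the receiver recovers the bit by differencing consecutive RSS readings, so only relative changes matter. A minimal sketch; the power levels, noise model and decision threshold are illustrative assumptions.

    import random

    LEVELS = (0.0, 6.0)                   # two TX power levels, 6 dB apart

    def send_powers(bits):
        """One reference packet, then per bit: 1 = switch level, 0 = hold."""
        powers, level = [LEVELS[0]], 0
        for b in bits:
            level ^= b
            powers.append(LEVELS[level])
        return powers

    def channel(powers, path_loss=60.0, noise_db=0.4):
        """RSS readings: TX power minus path loss plus measurement noise."""
        return [p - path_loss + random.gauss(0.0, noise_db) for p in powers]

    def decode(rss, threshold=3.0):
        """Bit = 1 iff consecutive RSS readings differ by more than half the
        level spacing; the absolute signal strength is never needed."""
        return [1 if abs(b - a) > threshold else 0 for a, b in zip(rss, rss[1:])]

    bits = [1, 0, 1, 1, 0]
    print(decode(channel(send_powers(bits))), bits)   # equal w.h.p. at this SNR
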
201614 Latency and Lifetime-Aware Clustering and Routing in Wireless Sensor Networks Chuanyao Nie
School of Computer Science and Engineering,
University of New South Wales
cnie@cse.unsw.edu.au

Hui Wu
School of Computer Science and Engineering,
University of New South Wales
huiw@cse.unsw.edu.au

Wenguang Zheng
School of Computer Science and Engineering,
University of New South Wales
wenguangz@cse.unsw.edu.au
Clustering is an effective technique for improving both the network lifetime and the robustness of a WSN (Wireless Sensor Network). We investigate the following latency and network lifetime-aware clustering problem for data collection: given a WSN with one base station and a natural number k, construct a set of disjoint clusters for the WSN and a routing tree for inter-cluster communication such that the network lifetime is maximized and the maximum hop distance from each cluster to the base station is at most k. We propose a novel approach to this problem, consisting of a polynomial-time heuristic for constructing clusters, and a polynomial-time heuristic and an ILP (Integer Linear Programming) algorithm for constructing a routing tree for inter-cluster communication. We have performed extensive simulations on network instances with 200, 400, 600, 800 and 1000 sensor nodes under both uniform and random distributions, and compared our approach with two state-of-the-art approaches, namely MR-LEACH and DSBCA. In terms of network lifetime, the average improvements of our approach using the ILP algorithm for constructing a routing tree over MR-LEACH and DSBCA are 35% and 62%, respectively. We have also compared the heuristic and the ILP algorithm for constructing a routing tree: the ratios of the average lifetimes achieved by the heuristic to those achieved by the ILP algorithm are 93% and 90% for the uniform and random distributions, respectively.
201612 Data-Space Relocation for High Data Cache Performance in MT-Sync Embedded Multi-Threaded Processor Mahanama Wickramasinghe
School of Computer Science and Engineering,
University of New South Wales
mahanamaw@cse.unsw.edu.au

Hui Guo
School of Computer Science and Engineering,
University of New South Wales
huig@cse.unsw.edu.au
Multi-threaded processor execution is a design strategy for performance improvement and energy reduction. With multi-threaded execution, the pipeline's idle time during one thread's execution can be hidden by executing other threads, so that the overall execution time (and hence the energy consumption) can be reduced. One typical issue with multi-threaded processor design is the cache. Caches reduce long and power-consuming memory accesses, and have become an essential component in modern processor systems. However, multi-threaded execution can interfere with cache access behaviour, potentially causing more cache misses and degraded cache performance. This work proposes an off-line thread data-space relocation approach for our MT-Sync multi-threaded processor design to reduce such data cache misses. The approach does not introduce any performance or hardware overhead to the existing processor. Experimental results on a set of applications show that our design achieves 17.5 times more performance gain compared to baseline multi-threaded execution and saves 2.5 times more energy.
201611 Appraising UMLS Coverage for Summarizing Medical Evidence Elaheh ShafieiBavani
School of Computer Science and Engineering,
University of New South Wales & Data61 CSIRO
elahehs@cse.unsw.edu.au

Mohammad Ebrahimi
School of Computer Science and Engineering,
University of New South Wales & Data61 CSIRO
mohammade@cse.unsw.edu.au

Raymond Wong
School of Computer Science and Engineering,
University of New South Wales & Data61 CSIRO
wong@cse.unsw.edu.au

Fang Chen
School of Computer Science and Engineering,
University of New South Wales & Data61 CSIRO
fang@cse.unsw.edu.au
When making clinical decisions, practitioners need to rely on the most relevant evidence available. However, accessing a vast body of medical evidence while confronting the issue of information overload can be challenging and time consuming. Automatic text summarization is a natural language processing technique that addresses this issue. While most top-performing summarizers remain largely extractive (i.e., they extract a group of sentences and concatenate them), this paper proposes an abstractive query-focused summarization framework for evidence-based medicine (EBM). Given a clinical query and a set of relevant medical evidence, our aim is to generate a fluent, well-organized, and compact summary that answers the query. The quality of biomedical summaries is also enhanced by appraising the applicability of both general-purpose (WordNet) and domain-specific (UMLS) knowledge sources for concept discrimination. We first perform iterative random walks over the graph representations of both WordNet and UMLS to capture sentence-to-query and sentence-to-sentence semantic similarities. We then construct a similarity graph, filter out less query-relevant sentences, and cluster the relevant sentences. Finally, a word graph is constructed for each cluster, and the most abstractive summary sentences are obtained by re-ranking k-shortest paths. Analysis via ROUGE metrics shows that using WordNet as a general-purpose lexicon helps to capture the concepts not covered by the UMLS Metathesaurus, and hence significantly increases summarization performance. The effectiveness of our proposed framework is demonstrated by a set of experiments over a specialized EBM corpus, which has been gathered and annotated for the purpose of biomedical text summarization.
201610 Collaborative Topic Regression for Predicting Topic-Based Social Influence Asso Hamzehei
School of Computer Science and Engineering,
University of New South Wales
a.hamzehei@unsw.edu.au

Shanqing Jiang
School of Computer Science and Engineering,
University of New South Wales
shanqing.jiang@student.unsw.edu.au

Danai Koutra
School of Computer Science and Engineering,
University of Michigan, MI, USA
dkoutra@umich.edu

Raymond Wong
School of Computer Science and Engineering,
University of New South Wales
wong@cse.unsw.edu.au

Fang Chen
School of Computer Science and Engineering,
University of New South Wales
and Data61-CSIRO, Sydney, Australia
fang.chen@data61.csiro.au
Social science studies have acknowledged that the social influence of individuals is not identical. Social network structure and shared text can reveal immense information about users, their interests, and topic-based influence. Although some studies have considered measuring user influence, less attention has been paid to measuring and estimating topic-based user influence. In this paper, we propose an approach that incorporates both network structure and user-generated content for topic-based influence measurement and prediction. We predict topic-based individual influence on unobserved topics, based on the observed influence of users on the topics in which they have shown interest by posting about them in a social network. A collaborative topic-based social influence model is proposed to learn user and topic latent spaces for estimating each user's social influence on an unobserved topic. We perform experimental analysis on Twitter data and show that our model outperforms benchmarks on accuracy, recall, and precision for predicting topic-based user influence.
201609 Using Blockchain to Enable Untrusted Business Process Monitoring and Execution Ingo Weber
Data61, CSIRO, Australia
School of Computer Science and Engineering,
University of New South Wales
ingo.weber@data61.csiro.au

Xiwei Xu
Data61, CSIRO, Australia
School of Computer Science and Engineering,
University of New South Wales
xiwei.xu@data61.csiro.au

Regis Riveret
Data61, CSIRO, Australia
regis.riveret@data61.csiro.au

Guido Governatori
Data61, CSIRO, Australia
guido.governatori@data61.csiro.au

Alexander Ponomarev
Data61, CSIRO, Australia
alexander.ponomarev@data61.csiro.au

Jan Mendling
Wirtschaftsuniversitat Wien, Vienna, Austria
jan.mendling@wu.ac.at
The integration of business processes across organizations is typically beneficial for all involved parties. However, the lack of trust is often a roadblock. Blockchain is an emerging technology for decentralized and transactional data sharing across a network of untrusted participants. It can be used to find agreement about the shared state of collaborating parties without trusting a central authority or any particular participant. Some blockchain networks also provide a computational infrastructure to run autonomous programs called smart contracts. In this paper, we address the fundamental problem of trust in collaborative process execution using blockchain. We develop a technique to integrate blockchain into the choreography of processes in such a way that no central authority is needed, but trust is maintained. Our solution comprises the combination of an intricate set of components, which allow monitoring or coordination of business processes. We implemented our solution and demonstrate its feasibility by applying it to three use case processes. Our evaluation includes the creation of more than 500 smart contracts and the execution of over 8,000 blockchain transactions.
201608 Aggregated Search over Personal Process Description Graph Jing Ouyang Hsu
School of Computer Science and Engineering,
University of New South Wales
jxux494@cse.unsw.edu.au

Hye-young Paik
School of Computer Science and Engineering,
University of New South Wales
hpaik@cse.unsw.edu.au

Liming Zhan
School of Computer Science and Engineering,
University of New South Wales
zhanl@cse.unsw.edu.au

Anne H. H. Ngu
Department of Computer Science,
Texas State University, Austin, Texas, USA
angu@txstate.edu
People share various processes from daily life online in natural language form (e.g., cooking recipes, “how-to guides” on eHow). We refer to them as personal process descriptions. Previously, we proposed the Personal Process Description Graph (PPDG) to concretely represent personal process descriptions as graphs, along with query processing techniques that conduct exact as well as similarity search over PPDGs. However, both techniques fail if no single personal process description satisfies all constraints of a query. In this paper, we propose a new approach, built on our previous query techniques, to query personal process descriptions by aggregation: composing fragments from different PPDGs to produce an answer. We formally define PPDG Aggregated Search and present a general framework for performing aggregated searches over PPDGs. Comprehensive experiments demonstrate the efficiency and scalability of our techniques.
201607 Big Data Analytics Using Cloud and Crowd Mohammad Allahbakhsh
School of Computer Science and Engineering,
University of New South Wales
mallahbakhsh@unsw.edu.au

Saeed Arbabi
Department of Computer Engineering,
University of Zabol, Iran
sarbabi@uoz.ac.ir

Hamid-Reza Motahari-Nezhad
IBM Almaden Research Center,
San Jose, CA, USA
motahari@us.ibm.com

Boualem Benatallah
School of Computer Science and Engineering,
University of New South Wales
boualem@cse.unsw.edu.au
The increasing application of social and human-enabled systems in people's daily lives on the one hand, and the rapid growth of mobile and smartphone technologies on the other, have resulted in the generation of tremendous amounts of data, also referred to as big data, and a need for analyzing these data, i.e., big data analytics. Recently a trend has emerged to incorporate human computing power into big data analytics, to overcome some shortcomings of existing big data analytics such as dealing with semi-structured or unstructured data. Including the crowd in big data analytics creates new challenges, such as security, privacy and availability issues. In this paper we study hybrid human-machine big data analytics and propose a framework for studying these systems from the point of view of crowd involvement. We identify some open issues in the area and propose a set of research directions for the future of big data analytics.
201606 Scalable Distributed Subgraph Enumeration Longbin Lai
School of Computer Science and Engineering,
University of New South Wales
llai@cse.unsw.edu.au

Lu Qin
Quantum Computation and Intelligent Systems,
University of Technology Sydney,
lu.qin@uts.edu.au

Ying Zhang
Quantum Computation and Intelligent Systems,
University of Technology Sydney,
ying.zhang@uts.edu.au

Xuemin Lin
School of Computer Science and Engineering,
University of New South Wales
lxue@cse.unsw.edu.au

Lijun Chang
School of Computer Science and Engineering,
University of New South Wales
ljchang@cse.unsw.edu.au
Subgraph enumeration aims to find all the subgraphs of a large data graph that are isomorphic to a given pattern graph, and is a fundamental graph problem with a wide range of applications. As the subgraph isomorphism operation is computationally intensive, research has recently focused on solving this problem in distributed environments, such as MapReduce and Pregel. Among them, the state-of-the-art algorithm, TwinTwigJoin, is proven to be instance optimal based on a left-deep-join framework. However, it is still not scalable to very large graphs because of the constraints of the left-deep-join framework and the requirement that each join unit be a star. In this paper, we propose SEED, a scalable subgraph enumeration approach for the distributed environment. Compared to TwinTwigJoin, SEED returns an optimal solution in a generalized join framework without the constraints of TwinTwigJoin. In addition to stars, we use cliques as join units, and design an effective distributed graph storage mechanism to support this extension. We develop a comprehensive cost model that estimates the number of matches of any given pattern graph by considering the power-law degree distribution of the data graph. We then generalize the left-deep-join framework and develop a dynamic-programming algorithm to compute an optimal bushy join plan. We also consider overlaps among the join units. Finally, we propose clique compression to further improve the algorithm by reducing the number of intermediate results. Extensive performance studies are conducted on several real graphs, one containing billions of edges. The results demonstrate that our algorithm is more than one order of magnitude faster than all other state-of-the-art algorithms for most queries across all datasets.
201605 Finding Hierarchically Correlated Heavy Hitters Summary from Two Dimensional Data Streams Zubair Shah
School of Engineering and IT (SEIT),
University of New South Wales
zubair.shah@student.adfa.edu.au

Abdun Naser Mahmood
School of Engineering and IT (SEIT),
University of New South Wales
a.mahmood@adfa.edu.au

Michael Barlow
School of Engineering and IT (SEIT),
University of New South Wales
m.barlow@adfa.unsw.edu.au
While most applications work on traditional "flat" data, many domains contain hierarchical data, such as time, geographic locations, and IP addresses. Flat methods are generally not suitable for hierarchical data, and existing hierarchical approaches, such as hierarchical heavy hitters and multilevel and cross-level association rules, cannot capture the semantics we require when monitoring data in the form of hierarchically correlated pairs. Therefore, in this work, we introduce the concept of Hierarchically Correlated Heavy Hitters (HCHH), which captures the sequential nature between pairs of hierarchical items at multiple concept levels. Specifically, the approach finds the correlation between items corresponding to hierarchically discounted frequency counts. We provide a formal definition of the proposed concept, and develop algorithmic approaches for computing HCHH efficiently in data streams. The proposed HCHH algorithms have deterministic error guarantees and space bounds: they require O(eta/(epsilon_p * epsilon_s)) memory, where eta is a small constant and epsilon_p, epsilon_s in [0,1] are user-defined parameters. We have compared the proposed concept of HCHH with other existing similar hierarchical notions; experimental analysis shows that HCHH identifies interesting patterns that other hierarchical notions cannot capture. Furthermore, experimental results demonstrate that the proposed HCHH algorithm is much more efficient in terms of memory usage and output quality than the benchmark algorithm.
201602 SEMON: Sensorless Event Monitoring in Self-Powered Wireless Nanosensor Networks Eisa Zarepour
School of Computer Science and Engineering,
University of New South Wales, Australia
ezarepour@cse.unsw.edu.au

Mahbub Hassan
School of Computer Science and Engineering,
University of New South Wales, Australia
mahbub@cse.unsw.edu.au

Chun Tung Chou
School of Computer Science and Engineering,
University of New South Wales, Australia
ctchou@cse.unsw.edu.au

Adesoji A. Adesina
ATODATECH LLC, Brentwood, CA 94513 USA,
ceo@atodatech.com
A conventional wireless sensor network node consists of a number of components: microprocessor, memory, sensor and radio. Advances in nanotechnology have enabled the miniaturization of these components, thus enabling wireless nanoscale sensor networks (WNSNs). Due to their small size, WNSN nodes are expected to be powered by harvesting energy from the environment. Unfortunately, there is a mismatch between the energy that can be harvested and the energy required to power all the aforementioned components in a WNSN node. In this paper, we propose a simplified sensor node architecture for event detection, called SEMON, which stands for Sensorless Event MONitoring in self-powered WNSNs. A SEMON node consists of only an energy harvester and a radio with minimal processing capacity. We assume that each event to be monitored releases a different amount of energy, which can therefore be used as the signature of the event. When an event occurs, a SEMON node harvests the energy released by the event and turns it into a radio pulse with an amplitude proportional to the harvested energy. A remote station decodes the amplitude of the pulse to recognize the event that has occurred. We propose two methods for the remote station to decode the events. The first method is based on thresholds. The second method makes use of an event model that gives the probability that a sequence of events will occur, enabling us to formulate the decoding problem using Hidden Markov Models. We study the decoding performance of both methods. Finally, we provide a case study on using the SEMON architecture to monitor the chemical reactions inside a reactor.
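
The threshold-based decoding method is straightforward to sketch: each event type has a known energy signature, the node emits a pulse with amplitude proportional to the harvested energy, and the station maps the received amplitude back to the nearest signature. The signature values and gain below are made-up illustrations.

    # Hypothetical per-event energy signatures (arbitrary units).
    SIGNATURES = {"reaction_A": 1.0, "reaction_B": 2.5, "reaction_C": 4.0}

    def node_pulse(event_energy, gain=0.8):
        """SEMON node: turn harvested event energy into a pulse amplitude."""
        return gain * event_energy

    def decode_pulse(amplitude, gain=0.8):
        """Remote station: invert the gain and pick the nearest signature
        (the threshold method; the report's second method replaces this
        per-pulse decision with an HMM over event sequences)."""
        energy = amplitude / gain
        return min(SIGNATURES, key=lambda e: abs(SIGNATURES[e] - energy))

    print(decode_pulse(node_pulse(2.4)))   # -> 'reaction_B'
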
201601 Improved VCF Normalization for Evaluation of DNA Variant Calling Algorithms Arash Bayat
School of Computer Science and Engineering,
University of New South Wales
a.bayat@unsw.edu.au

Bruno Gaeta
School of Computer Science and Engineering,
University of New South Wales
bgaeta@unsw.edu.au

Aleksandar Ignjatovic
School of Computer Science and Engineering,
University of New South Wales
ignjat@cse.unsw.edu.au

Sri Parameswaran
School of Computer Science and Engineering,
University of New South Wales
sridevan@cse.unsw.edu.au
Variant Call Format (VCF) is widely used to store data about genetic variations. Applications include the evaluation of variant calling workflows and the study of the similarity of individual variations, where two sets of variants must be compared against each other. However, finding concordance between VCF files is a complicated task, as the same variant can be represented in several different ways and is therefore not necessarily reported in a unique way by different software. In this paper, we introduce a VCF normalization method that results in more accurate comparison. In our proposed normalization procedure, we apply all variations in a VCF file to the reference genome to create a mutated genome, and then recall variants by aligning this mutated genome back to the reference genome. The normalized VCF is not necessarily closer to the truth, but it is suitable for comparison purposes. The results show over 34 times less disagreement when comparing VCF files normalized by our method relative to unnormalized files. Our method mostly relies on available validated software.
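
The procedure is a round trip through the reference: apply every variant to build a mutated genome, align it back to the reference, and call variants from that alignment, yielding one canonical representation per variant. A minimal single-contig sketch handling SNVs only (real VCFs require indel handling and alignment, for which the method relies on validated software).

    def apply_snvs(ref, variants):
        """variants: list of (pos, ref_base, alt_base) with 0-based positions."""
        seq = list(ref)
        for pos, ref_base, alt in variants:
            assert seq[pos] == ref_base, "VCF does not match the reference"
            seq[pos] = alt
        return "".join(seq)

    def recall_snvs(ref, mutated):
        """Re-derive variants by comparing the mutated genome back to the
        reference; a real pipeline aligns first so indels stay in register."""
        return [(i, r, m) for i, (r, m) in enumerate(zip(ref, mutated)) if r != m]

    ref = "ACGTACGT"
    mutated = apply_snvs(ref, [(2, "G", "T"), (5, "C", "A")])
    print(recall_snvs(ref, mutated))   # [(2, 'G', 'T'), (5, 'C', 'A')]
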
201518 On an asymptotic equality for reproducing kernels and sums of squares of orthonormal polynomials Aleksandar Ignjatovic
School of Computer Science and Engineering,
University of New South Wales
ignjat@cse.unsw.edu.au

Doron Lubinsky
School of Mathematics,
Georgia Institute of Technology
Atlanta, GA 30332-0160, USA.
lubinsky@math.gatech.edu
In a recent paper, the first author considered orthonormal polynomials p_n associated with a symmetric measure with unbounded support, and with recurrence relation x p_n(x) = A_n p_{n+1}(x) + A_{n-1} p_{n-1}(x), n > 0. Under appropriate restrictions on the coefficients A_n, the first author established the equality of the limits of (\sum_{k=0}^{n} p_k^2(x)) / (\sum_{k=0}^{n} 1/A_k) and of (p_{2n}^2(x) + p_{2n+1}^2(x)) / (1/A_{2n} + 1/A_{2n+1}), uniformly for x in compact subsets of the real line. In this paper, we establish and evaluate this limit for a class of even exponential weights, and also investigate analogues for weights on a finite interval, and for some non-even weights.
201517 On Improving Informativity and Grammaticality for Multi-Sentence Compression Elahe Shafiei
School of Computer Science and Engineering,
University of New South Wales
ATP Laboratory, National ICT Australia
elahehs@cse.unsw.edu.au

Mohammad Ebrahimi
School of Computer Science and Engineering,
University of New South Wales
ATP Laboratory, National ICT Australia
mohammade@cse.unsw.edu.au

Raymond Wong
School of Computer Science and Engineering,
University of New South Wales
ATP Laboratory, National ICT Australia
wong@cse.unsw.edu.au

Fang Chen
School of Computer Science and Engineering,
University of New South Wales
ATP Laboratory, National ICT Australia
fang@cse.unsw.edu.au
Multi-Sentence Compression (MSC) is of great value to many real-world applications, such as guided microblog summarization, opinion summarization and newswire summarization. Recently, word graph-based approaches have been proposed and become popular in MSC. Their key assumption is that redundancy among a set of related sentences provides a reliable way to generate informative and grammatical sentences. In this paper, we propose an effective approach to enhance word graph-based MSC and tackle an issue that most state-of-the-art MSC approaches are confronted with: improving both informativity and grammaticality at the same time. Our approach consists of three main components: (1) a merging method based on Multiword Expressions (MWE); (2) a mapping strategy based on synonymy between words; (3) a reranking step, using a POS-based language model (POS-LM), to identify the best compression candidates. We demonstrate the effectiveness of this novel approach using a dataset made of clusters of English newswire sentences. The observed improvements in informativity and grammaticality of the generated compressions show that our approach is superior to state-of-the-art MSC methods.
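As a toy illustration of the word-graph idea this line of work builds on (not the authors' system, which adds MWE merging, synonym mapping and POS-LM reranking), the sketch below merges identical words across sentences and takes a cheap path from start to end, using the networkx library. Real systems add constraints such as minimum path length and verb inclusion.

    import networkx as nx

    def compress(sentences):
        g = nx.DiGraph()
        for s in sentences:
            tokens = ["<START>"] + s.lower().split() + ["<END>"]
            for a, b in zip(tokens, tokens[1:]):
                count = g.get_edge_data(a, b, {"count": 0})["count"] + 1
                g.add_edge(a, b, count=count, weight=1.0 / count)  # frequent edges get cheaper
        path = nx.shortest_path(g, "<START>", "<END>", weight="weight")
        return " ".join(path[1:-1])

    sentences = [
        "the prime minister announced a new policy on monday",
        "the prime minister announced the policy in parliament",
        "a new policy was announced by the prime minister",
    ]
    print(compress(sentences))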
201516 I/O Efficient ECC Graph Decomposition via Graph Reduction Long Yuan
School of Computer Science and Engineering,
University of New South Wales
longyuan@cse.unsw.edu.au

Lu Qin
University of Technology, Sydney
Lu.Qin@uts.edu.au

Xuemin Lin
School of Computer Science and Engineering,
University of New South Wales
lxue@cse.unsw.edu.au

Lijun Chang
School of Computer Science and Engineering,
University of New South Wales
ljchang@cse.unsw.edu.au

Wenjie Zhang
School of Computer Science and Engineering,
University of New South Wales
zhangw@cse.unsw.edu.au
The problem of computing k-edge connected components (k-ECCs) of a graph G for a specific k is a fundamental graph problem and has been investigated recently. In this paper, we study the problem of ECC decomposition, which computes the k-ECCs of a graph G for all k values. ECC decomposition can be widely applied in a variety of applications such as graph-topology analysis, community detection, Steiner component search, and graph visualization. A straightforward solution for ECC decomposition is to apply the existing k-ECC computation algorithm to compute the k-ECCs for all k values. However, this solution is not applicable to large graphs for two challenging reasons. First, all existing k-ECC computation algorithms are highly memory intensive due to the complex data structures used in the algorithms. Second, the number of possible k values can be very large, resulting in a high computational cost when each k value is independently considered. In this paper, we address the above challenges, and study I/O efficient ECC decomposition via graph reduction. We introduce two elegant graph reduction operators which aim to reduce the size of the graph loaded in memory while preserving the connectivity information of a certain set of edges to be computed for a specific k. We also propose three novel I/O efficient algorithms, Bottom-Up, Top-Down, and Hybrid, that explore the k values in different orders to reduce the redundant computations between different k values. We analyze the I/O and memory costs for all proposed algorithms. In our experiments, we evaluate our algorithms using seven real large datasets with various graph properties, one of which contains 1.95 billion edges. The experimental results show that our proposed algorithms are scalable and efficient.
201515 A Survey and Tutorial on the Possible Applications and Challenges of Wearable Visual Lifeloggers Eisa Zarepour
School of Computer Science and Engineering,
University of New South Wales, Sydney, New South Wales 2052, Australia
ezarepou@cse.unsw.edu.au

Mohammadreza Hosseini
School of Computer Science and Engineering,
University of New South Wales, Sydney, New South Wales 2052, Australia
mhosseini@cse.unsw.edu.au

Salil S. Kanhere
School of Computer Science and Engineering,
University of New South Wales, Sydney, New South Wales 2052, Australia
salilk@cse.unsw.edu.au

Arcot Sowmya
School of Computer Science and Engineering,
University of New South Wales, Sydney, New South Wales 2052, Australia
sowmya@cse.unsw.edu.au
Advances in the manufacture of miniaturized low-power embedded systems are paving the way for ultralight-weight wearable cameras that can be used for visual lifelogging of minute details of people's daily lives. The ability of wearable cameras to continuously capture the first-person viewpoint with minimal user interaction has made them very attractive in many application domains. Although wearable cameras are available and useful today, they are not widely used and accepted, due to various challenges such as privacy concerns and technical limitations. In this paper, possible industrial, medical, martial, educational, personal and media applications of wearable cameras are highlighted. The main challenges in realizing the full potential of wearable cameras are outlined, and current state-of-the-art proposals for addressing these challenges are reviewed.
201514 A Model-Driven Framework for Interoperable Cloud Resources Management Denis Weerasiri
School of Computer Science and Engineering,
University of New South Wales
denisw@cse.unsw.edu.au

Boualem Benatallah
School of Computer Science and Engineering,
University of New South Wales
boualem@cse.unsw.edu.au

Moshe Chai Barukh
School of Computer Science and Engineering,
University of New South Wales
mosheb@cse.unsw.edu.au

Cao Jian
Shanghai Jiaotong University, Shanghai, China
cao-jian@cs.sjtu.edu.cn
The proliferation of tools for different aspects of cloud resource Configuration and Management (C&M) processes encourages DevOps practitioners to design end-to-end, automated C&M tasks that span a selection of best-of-breed tools. However, heterogeneities among the resource description models and management capabilities of such C&M tools pose fundamental limitations when managing complex and dynamic cloud resources. We propose Domain-specific Models, a model-driven approach for describing elementary and federated cloud resources as reusable knowledge artifacts over existing C&M tools. We also propose a pluggable architecture to translate these artifacts into resource descriptions and management rules that can be interpreted by external C&M tools such as Juju and Docker. The paper describes the concepts, techniques and current implementation of the proposed system. Experiments on a real-world federated cloud resource show significant improvements in productivity and usability achieved by our approach compared to traditional techniques.
201513 Four-fold Auto-scaling for Docker Containers Philipp Hoenisch
Institute of Information Systems
Distributed Systems Group
TU Wien, Austria
p.hoenisch@infosys.tuwien.ac.at

Ingo Weber
Software Systems Research Group,
NICTA, Sydney, Australia
ingo.weber@nicta.com.au

Stefan Schulte
Institute of Information Systems
Distributed Systems Group
TU Wien, Austria
s.schulte@infosys.tuwien.ac.at

Liming Zhu
Software Systems Research Group,
NICTA, Sydney, Australia
liming.zhu@nicta.com.au

Alan Fekete
School of Information Technologies,
University of Sydney, Australia
alan.fekete@sydney.edu.au
Virtual machines (VMs) are quickly becoming the default method of hosting Web applications (apps), whether operating in public, private, or hybrid clouds. Hence, in many cloud-based systems, auto-scaling of VMs has become a standard practice. However, VMs suffer from several disadvantages, e.g., the overhead of the resources needed to start a full operating system (OS), a degree of vendor lock-in, and their relatively coarse-grained nature. These can be overcome by using lightweight container technologies like Docker: the OS is not included in a container; instead, the host machine's OS is used. Like VMs, containers offer resource elasticity, isolation, flexibility and dependability. On the one hand, containers need to run on a compatible OS and share resources through the host OS; on the other hand, exactly this fact leads to benefits such as faster start-up times and less overhead in terms of used resources. A common approach is to run containers on top of VMs, e.g., in a public cloud. Doing so increases the flexibility for auto-scaling, since VMs can then be sub-divided. However, the additional freedom also means that scaling decisions become more complex: considering horizontal and vertical scaling on both the container and the VM level, auto-scaling is now four-fold. We address four-fold auto-scaling by (i) capturing the decision space as a multi-objective optimization model, (ii) solving instances of that model dynamically to find an optimal solution, and (iii) executing the dynamic scaling decision through a scaling platform for managing Docker containers on top of VMs. We evaluated our approach with realistic apps, and found that, using our approach, the average cost per request is about 20-28% lower.
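The four-fold decision space can be pictured with a deliberately tiny brute-force sketch; the capacity model, prices and limits below are invented for illustration and are far simpler than the multi-objective optimization model the paper proposes.

    import itertools

    def best_plan(demand, vm_sizes, container_sizes, max_vms=4, max_containers=4):
        # Enumerate (VM count, VM size, containers per VM, container size) and
        # keep the cheapest combination that covers the demand.
        best = None
        for n_vm, vm, n_c, c in itertools.product(
                range(1, max_vms + 1), vm_sizes,
                range(1, max_containers + 1), container_sizes):
            if n_c * c > vm["cpu"]:            # containers must fit on their VM
                continue
            capacity = n_vm * n_c * c * 100    # assumed: 100 req/s per container CPU
            cost = n_vm * vm["price"]
            if capacity >= demand and (best is None or cost < best[0]):
                best = (cost, n_vm, vm, n_c, c)
        return best

    vm_sizes = [{"cpu": 2, "price": 0.10}, {"cpu": 4, "price": 0.18}]
    print(best_plan(demand=900, vm_sizes=vm_sizes, container_sizes=[0.5, 1, 2]))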
201512 Optimizing HTTP-Based Adaptive Streaming in Vehicular Environment using Markov Decision Process Ayub Bokani
Mahbub Hassan
Salil S. Kanhere
School of Computer Science and Engineering,
The University of New South Wales,
Sydney, NSW 2052, Australia
{abokani, mahbub, salilk}@cse.unsw.edu.au

Xiaoqing Zhu
Chief Technology and Architecture Office,
Cisco Systems, San Jose, CA USA
xiaoqzhu@cisco.com
Hypertext transfer protocol (HTTP) is the fundamental mechanism supporting web browsing on the Internet. An HTTP server stores large volumes of content and delivers specific pieces to clients when requested. There is a recent move to use HTTP for video streaming as well, which promises seamless integration of video delivery with existing HTTP-based server platforms. This is achieved by segmenting the video into many small chunks and storing these chunks as separate files on the server. For adaptive streaming, the server stores different quality versions of the same chunk in different files to allow real-time quality adaptation of the video in response to the network bandwidth variation experienced by a client. Which quality version to download for each chunk of the video therefore becomes a major decision-making challenge for the streaming client, especially in vehicular environments with significant uncertainty in mobile bandwidth. In this paper, we demonstrate that for such decision making, the Markov decision process (MDP) is superior to previously proposed non-MDP solutions. Using publicly available video and bandwidth datasets, we show that MDP achieves up to a 15x reduction in playback deadline misses compared to a well-known non-MDP solution when the MDP has prior knowledge of the bandwidth model. We also consider a model-free MDP implementation that uses Q-learning to gradually learn the optimal decisions by continuously observing the outcome of its decision making. We find that MDP with Q-learning significantly outperforms MDP that uses bandwidth models.
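A minimal sketch of the model-free variant follows: tabular Q-learning over (buffer, bandwidth) states. The state buckets, reward and parameters are illustrative assumptions, not the paper's formulation.

    import random
    from collections import defaultdict

    ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1
    QUALITIES = [0, 1, 2]                    # low / medium / high bitrate
    Q = defaultdict(float)                   # Q[(state, action)]

    def choose(state):
        # Epsilon-greedy exploration over chunk quality levels.
        if random.random() < EPS:
            return random.choice(QUALITIES)
        return max(QUALITIES, key=lambda a: Q[(state, a)])

    def update(state, action, reward, next_state):
        # Standard one-step Q-learning update.
        best_next = max(Q[(next_state, a)] for a in QUALITIES)
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

    # state = (buffer-level bucket, bandwidth bucket); a real reward would trade
    # off chunk quality against the risk of missing the playback deadline.
    state = (3, 1)
    action = choose(state)
    update(state, action, reward=float(action), next_state=(2, 1))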
201511 Robust Subspace Clustering for Multi-view Data by Exploiting Correlation Consensus Yang Wang
Xuemin Lin
School of Computer Science and Engineering,
The University of New South Wales,
Sydney, NSW 2052, Australia
{wangy, lxue}@cse.unsw.edu.au

Lin Wu
Australian Centre for Visual Technologies, The University of Adelaide,
Australian Centre for Robotic Vision,
lin.wu@adelaide.edu.au

Wenjie Zhang
School of Computer Science and Engineering,
The University of New South Wales,
Sydney, NSW 2052, Australia
zhangw@cse.unsw.edu.au

Xiaodi Huang
Charles Sturt University, Australia.
xhuang@csu.edu.au

Qing Zhang
Australian E-Health Research Centre
qing.zhang@csiro.edu.au
More often than not, multimedia data described by multiple features, such as color and shape features, can be naturally decomposed into multiple views. Since multiple views provide complementary information to each other, great endeavors have been dedicated to leveraging multiple views instead of a single view to achieve better clustering performance. To effectively exploit data correlation consensus among multiple views, in this paper we study subspace clustering for multi-view data while keeping individual views well encapsulated. To characterize data correlations, we generate a similarity matrix in such a way that high affinity values are assigned to data objects within the same subspace across views, while the correlations among data objects from distinct subspaces are minimized. Before generating this matrix, however, we should consider that multi-view data in practice might be corrupted by noise, and corrupted data will significantly downgrade clustering results. Towards these ends, we first present a novel objective function coupled with an angular-based regularizer. By minimizing this function, multiple sparse vectors are obtained for each data object as its multiple representations; these sparse vectors result from reaching data correlation consensus on all views. To tackle noise corruption, we present a sparsity-based approach that refines the angular-based data correlation. Using this approach, a more reliable data similarity matrix is generated for multi-view data. Spectral clustering is then applied to the similarity matrix to obtain the final subspace clustering. Extensive experiments validate the effectiveness of our proposed approach.
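The final step, spectral clustering over a consensus similarity matrix, can be sketched as follows. The plain average of per-view RBF similarities is a naive stand-in for the paper's angular, sparsity-refined consensus, and the data is synthetic.

    import numpy as np
    from sklearn.cluster import SpectralClustering
    from sklearn.metrics.pairwise import rbf_kernel

    rng = np.random.default_rng(0)
    X_color = rng.normal(size=(60, 8))       # hypothetical view 1 features
    X_shape = rng.normal(size=(60, 5))       # hypothetical view 2 features

    # Naive consensus: average the per-view similarity matrices.
    S = (rbf_kernel(X_color) + rbf_kernel(X_shape)) / 2
    labels = SpectralClustering(n_clusters=3, affinity="precomputed",
                                random_state=0).fit_predict(S)
    print(labels[:10])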
201510 Design and Analysis of a Wireless Nanosensor Network for Monitoring Human Lung Cells Eisa Zarepour
School of Computer Science and Engineering,
University of New South Wales, Australia
ezarepour@cse.unsw.edu.au

Najmul Hassan
School of Computer Science and Engineering,
University of New South Wales, Australia
nhassan@cse.unsw.edu.au

Mahbub Hassan
School of Computer Science and Engineering,
University of New South Wales, Australia
mahbub@cse.unsw.edu.au

Chun Tung Chou
School of Computer Science and Engineering,
University of New South Wales, Australia
ctchou@cse.unsw.edu.au

Majid Ebrahimi Warkiani
Laboratory of Microfluidics Biomedical Microdevices
School of Mechanical and Manufacturing Engineering
University of New South Wales, Sydney Australia
m.warkiani@unsw.edu.au
Thanks to nanotechnology, it is now possible to fabricate sensor nodes below 100 nanometers in size. Although wireless communication at this scale has not been successfully demonstrated yet, simulations confirm that these sensor nodes would be able to communicate in the terahertz band using graphene as a transmission antenna. These developments suggest that deployment of wireless nanoscale sensor networks (WNSNs) inside the human body could be a reality one day. In this paper, we design and analyse a WNSN for monitoring human lung cells. We find that respiration, i.e., the periodic inhalation and exhalation of oxygen and carbon dioxide, is the major process that influences the terahertz channel inside lung cells. The channel is characterised as a two-state channel, where it periodically switches between good and bad states. Using real human respiratory data, we find that the channel absorbs the terahertz signal much faster in the bad state than in the good state. Our simulation experiments confirm that we could reduce the transmission power of the nanosensors, and hence the electromagnetic radiation inside the lungs due to deployment of the WNSN, by a factor of 20 if we could schedule all communication during good channel states. We propose two duty cycling protocols, along with a simple channel estimation algorithm, that enable nanosensors to achieve such scheduling.
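A small sketch of the duty-cycling idea, with a made-up respiration model: the channel alternates between good and bad states over each breathing cycle, and nodes transmit only while the channel is good. The period and good-state fraction are illustrative assumptions.

    def channel_is_good(t, period=4.0, good_fraction=0.4):
        # Hypothetical respiration model: the first part of each cycle is 'good'.
        return (t % period) < good_fraction * period

    def schedule(event_times, period=4.0, good_fraction=0.4):
        # Transmit immediately in a good state, otherwise defer the packet.
        sent, deferred = [], []
        for t in event_times:
            (sent if channel_is_good(t, period, good_fraction) else deferred).append(t)
        return sent, deferred

    sent, deferred = schedule([0.3, 1.9, 2.5, 4.1, 6.0, 7.8])
    print("sent in good state:", sent, "| deferred:", deferred)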
201508 Computing Connected Components with Linear Communication Cost in Pregel-like Systems Xing Feng
School of Computer Science and Engineering,
University of New South Wales
xingfeng@cse.unsw.edu.au

Lijun Chang
School of Computer Science and Engineering,
University of New South Wales
ljchang@cse.unsw.edu.au

Xuemin Lin
School of Computer Science and Engineering,
University of New South Wales
lxue@cse.unsw.edu.au

Lu Qin
University of Technology, Sydney
Lu.Qin@uts.edu.au

Wenjie Zhang
School of Computer Science and Engineering,
University of New South Wales
zhangw@cse.unsw.edu.au
This paper studies two fundamental problems in graph analytics: computing Connected Components (CCs) and computing BiConnected Components (BCCs) of a graph. With the recent advent of Big Data, developing efficient distributed algorithms for computing the CCs and BCCs of a big graph has received increasing interest. In line with existing research efforts, in this paper we focus on the Pregel programming model, though the techniques may be extended to other programming models including MapReduce and Spark. The state-of-the-art techniques for computing CCs and BCCs in Pregel incur O(m * #supersteps) total cost for both data communication and computation, where m is the number of edges in a graph and #supersteps is the number of supersteps. Since the network communication speed is usually much slower than the computation speed, communication costs dominate the total running time of the existing techniques. In this paper, we propose a new paradigm based on graph decomposition that reduces the total communication cost from O(m * #supersteps) to O(m), for both computing CCs and computing BCCs. Moreover, the total computation costs of our techniques are smaller than those of the existing techniques in practice, though theoretically they are almost the same. Comprehensive empirical studies demonstrate that our approaches can outperform the existing techniques by one order of magnitude in terms of total running time.
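For reference, the baseline vertex-centric ("hash-min") connected-components algorithm that such work improves on can be sketched as below. This toy runs the supersteps sequentially in one process; the superstep count it returns is the #supersteps factor in the O(m * #supersteps) cost above.

    def hash_min_cc(adj):
        label = {v: v for v in adj}                   # start with each vertex's own id
        changed, supersteps = True, 0
        while changed:
            changed, supersteps = False, supersteps + 1
            msgs = {v: [] for v in adj}
            for v, neighbours in adj.items():         # each vertex sends its label
                for u in neighbours:
                    msgs[u].append(label[v])
            for v in adj:                             # keep the smallest label seen
                m = min(msgs[v], default=label[v])
                if m < label[v]:
                    label[v], changed = m, True
        return label, supersteps

    adj = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
    print(hash_min_cc(adj))   # two components: {0,1,2} -> 0 and {3,4} -> 3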
201506 Secret Key Generation for Body-worn Devices by Inducing Artificial Randomness in the Channel Girish Revadigar 1,2
Chitra Javali 1,2
Hassan Jameel Asghar 2
Kasper B. Rasmussen 3
Sanjay Jha 1


1 School of Computer Science & Engineering
UNSW Australia, Sydney, Australia
{girishr,chitraj,sanjay}@cse.unsw.edu.au
2 National ICT Australia (NICTA), ATP Sydney, Australia
hassan.asghar@nicta.com.au
3 Dept. of Computer Science, University of Oxford, Oxford, UK
kasper.rasmussen@cs.ox.ac.uk
Security in Wireless Body Area Networks (WBAN) is of major concern, as miniature personal health-care devices need to protect the sensitive health information transmitted over the wireless medium. It is essential for these devices to periodically regenerate the shared secret key used for data encryption. Recent studies have exploited wireless channel characteristics, e.g., the received signal strength indicator (RSSI), to dynamically derive the shared secret key, a.k.a. the session key. These schemes have very low bit rate capacity and, in the absence of node mobility, fail to derive keys with good entropy, which is a serious threat to security. In this work, we study the effectiveness of combining dual antennas and frequency diversity for obtaining uncorrelated channel samples to improve key entropy and bit rate in static channel conditions. We propose iARC, a novel mobility-independent RSSI-based secret key generation protocol for WBAN. iARC induces artificial randomness in the channel by employing dual antennas and dynamic frequency hopping effectively on resource-constrained devices. We conduct an extensive set of experiments in real environments on sensor platforms to validate the performance of iARC. To the best of our knowledge, iARC is the first WBAN protocol to extract secret keys with good entropy and high bit rate in static channel conditions. iARC has a secrecy capacity of 800 bps and generates a 128-bit key in only 160 ms.
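As a generic illustration of the RSSI-quantization step used by RSSI-based key generation schemes (this is not iARC itself): samples well above or below the mean become 1 or 0 bits, and ambiguous samples are dropped so that both parties can reconcile matching indices. The threshold parameter and sample values are assumptions.

    import statistics

    def quantize(rssi, alpha=0.5):
        mu = statistics.mean(rssi)
        sd = statistics.stdev(rssi)
        hi, lo = mu + alpha * sd, mu - alpha * sd
        bits, kept = [], []
        for i, r in enumerate(rssi):
            if r > hi:
                bits.append(1); kept.append(i)
            elif r < lo:
                bits.append(0); kept.append(i)
        return bits, kept        # 'kept' indices are reconciled with the peer

    rssi = [-48, -60, -44, -63, -50, -41, -66, -52]
    print(quantize(rssi))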
201505 Method for Providing Secure and Private Fine-grained Access to Outsourced Data Mosarrat Jahan 1,2
Mohsen Rezvani 1
Aruna Seneviratne 1,2
Sanjay Jha 1,2
1 School of Computer Science and Engineering,
The University of New South Wales
2 National ICT Australia (NICTA)
Sydney, NSW, Australia
{mjahan, mohsin, sanjay}@cse.unsw.edu.au
Aruna.Seneviratne@nicta.com.au
Outsourcing data to the cloud for computation and storage has been on the rise in recent years. In this paper, we investigate the problem of supporting write operations on outsourced data for clients using mobile devices. Due to security concerns, data in the cloud is expected to be stored in encrypted form and associated with an access control mechanism. In this work, we consider the Attribute-based Encryption (ABE) scheme, as it is well suited to support access control in outsourced cloud environments. Currently there is a gap in the literature on providing write access to data encrypted with ABE. Moreover, since ABE is computationally expensive, it imposes a processing burden on resource-constrained mobile devices. Our work has twofold advantages. Firstly, we extend the single-authority Ciphertext Policy Attribute-based Encryption (CP-ABE) scheme to support write operations: we define a group among the set of authorized users for a ciphertext that can perform the write operation, while the remaining users can perform only read operations. Secondly, in achieving this goal, we move some of the expensive computations to a manager and a remote cloud server by exploiting their high-end computational power. Our security analysis demonstrates that the security properties of the system are not compromised.
201504 TEXUS: A Task-based Approach for Table Extraction and Understanding Roya Rastan
School of Computer Science and Engineering,
University of New South Wales, Australia
rrastan@cse.unsw.edu.au

Hye-Young Paik
School of Computer Science and Engineering,
University of New South Wales, Australia
hpaik@cse.unsw.edu.au

John Shepherd
School of Computer Science and Engineering,
University of New South Wales, Australia
jas@cse.unsw.edu.au
In this paper, we propose a precise, comprehensive model of table processing which aims to remedy some of the problems in the discussion of table processing in the literature. The model targets application-independent, end-to-end table processing, and thus encompasses a large subset of the work in the area. The model can be used to aid the design of table processing systems (and we provide an example of such a system), can be considered as a reference framework for evaluating the performance of table processing systems, and can assist in clarifying terminological differences in the table processing literature.
201503 A Markovian Approach to the Optimal Demodulation of Diffusion-based Molecular Communication Networks Chun Tung Chou
School of Computer Science and Engineering,
University of New South Wales, Australia
ctchou@cse.unsw.edu.au
In a diffusion-based molecular communication network, transmitters and receivers communicate by using signalling molecules (or ligands) in a fluid medium. This paper assumes that the transmitter uses different chemical reactions to generate different emission patterns of signalling molecules to represent different transmission symbols, and the receiver consists of receptors. When the signalling molecules arrive at the receiver, they may react with the receptors to form ligand-receptor complexes. Our goal is to study demodulation in this setup, assuming that the transmitter and receiver are synchronised. We derive an optimal demodulator using the continuous history of the number of complexes at the receiver as the input to the demodulator. We do that by first deriving a communication model which includes the chemical reactions in the transmitter, diffusion in the transmission medium and the ligand-receptor process in the receiver. This model, which takes the form of a continuous-time Markov process, captures the noise in the receiver signal due to the stochastic nature of chemical reactions and diffusion. We then adopt a maximum a posteriori framework and use Bayesian filtering to derive the optimal demodulator. We use numerical examples to illustrate the properties of this optimal demodulator.
201502 An Iterative Algorithm for Reputation Aggregation in Multi-dimensional and Multinomial Rating Systems Mohsen Rezvani
School of Computer Science and Engineering,
University of New South Wales, Australia
mrezvani@cse.unsw.edu.au

Mohammad Allahbakhsh
University of Zabol, Iran
allahbakhsh@uoz.ac.ir

Aleksandar Ignjatovic
School of Computer Science and Engineering,
University of New South Wales, Australia
ignjat@cse.unsw.edu.au

Sanjay Jha
School of Computer Science and Engineering,
University of New South Wales, Australia
sanjay@cse.unsw.edu.au
Online rating systems are widely accepted as a means for quality assessment on the web, and users increasingly rely on these systems when deciding to purchase an item online. This fact motivates people to manipulate rating systems by posting unfair rating scores for fame or profit. Therefore, both building realistic rating scores and detecting unfair behaviours are of very high importance. Existing solutions are mostly majority-based, also employing temporal analysis and clustering techniques; however, they are still vulnerable to false ratings. They also ignore the distance between options, the provenance of information and the different dimensions of cast rating scores when building trust and rating scores. In this paper, we propose a robust iterative algorithm which leverages the information in the profiles of raters, the provenance of information and a decay function for the distance between options to build sound rating scores for items and trust ranks for users. We have implemented and tested our rating method using simulated data as well as three real-world datasets. Our tests demonstrate that our model calculates realistic rating scores even in the presence of massive false ratings and outperforms well-known algorithms in the area.
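A bare-bones sketch of the iterative idea, a simplification that omits the profile, provenance and option-distance terms described above: item scores are trust-weighted averages, and a rater's trust shrinks with its deviation from the consensus. The rating matrix is synthetic.

    import numpy as np

    def aggregate(R, iters=20):
        # R[i, j] = rating of item j by rater i (NaN where unrated).
        trust = np.ones(R.shape[0])
        for _ in range(iters):
            w = np.where(np.isnan(R), 0.0, trust[:, None])
            scores = (np.where(np.isnan(R), 0.0, R) * w).sum(axis=0) / w.sum(axis=0)
            err = np.nanmean((R - scores[None, :]) ** 2, axis=1)  # per-rater deviation
            trust = 1.0 / (err + 1e-6)
        return scores, trust

    R = np.array([[4.0, 5.0, np.nan],
                  [4.5, 5.0, 2.0],
                  [1.0, 1.0, 5.0]])   # the last rater disagrees strongly
    scores, trust = aggregate(R)
    print(np.round(scores, 2), np.round(trust, 2))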
201501 Personal Process Description Graph for Describing and Querying Personal Processes Jing Xu
Hye-young Paik
School of Computer Science and Engineering,
The University of New South Wales,
Sydney, NSW 2052, Australia
{jxux494, hpaik}@cse.unsw.edu.au

Anne H. H. Ngu
Department of Computer Science,
Texas State University,
Austin, Texas, USA
angu@txstate.edu
Unlike business processes, which are template-driven, personal processes are ad hoc to the point where each personal process may have a unique structure, and they are certainly not as strictly defined as business processes. In order to describe, share and analyze personal processes more effectively, in this paper we propose the Personal Process Description Graph (PPDG). Based on the proposed model, a personal process query approach is developed to support different types of graph queries over a personal process graph repository. The approach follows a filtering-and-refinement framework to speed up query computation. We conduct experiments on real and synthetic datasets to demonstrate the efficiency of our techniques.
201425 Identification of Transition-Based Models of Biological Systems using Logic Programming Ashwin Srinivasan
Department of Computer Science,
IIIT-D,
New Delhi, India
ashwin@iiitd.ac.in

Michael Bain
School of Computer Science and Engineering,
The University of New South Wales,
Sydney, NSW 2052, Australia
mike@cse.unsw.edu.au
Transition systems like Petri nets have been widely used to model networks and to capture the dynamics of system behaviour. To date, these models have mostly been specified using specialised languages and associated simulators. In this paper we adopt the representation of first-order logic to specify a very general class of transition systems. Logical Guarded Transition Systems (or LGTSs) are characterised by the use of transitions that combine the usual linear numerical constraints associated with Petri nets with logical constraints (the "guard") expressed as a first-order formula. Our interest here is that this class of transition systems allows a very flexible way of specifying complex systems, such as large-scale biological networks. Using LGTSs, we define the system-identification task in terms of logical consequence-finding, given domain-specific background knowledge and data on the system's behaviour. Consequence-finding by a logic-programming system is used to determine whether there exists a finite-state automaton (FSA) that accepts a sequence of observational data S, given the background knowledge B. The output symbols of the FSA specify an LGTS consistent with the data. This basic approach is adequate to handle a number of situations like the hierarchical construction of large networks, the use of domain knowledge to constrain answers, and the hypothesis of missing states. We also describe how the deductive machinery can be augmented using abduction and induction to deal with deficiencies in the data and background knowledge. Using a number of classical networks from the literature, we demonstrate the identification of: pure Petri nets; extended Petri nets; networks that re-use sub-nets; networks from data with missing values; and networks requiring new transitions. The results suggest that LGTSs and the logical formulation of system-identification can be used to obtain qualitative network models for a wide variety of biological systems, and that the approach is sufficiently general to apply to areas other than biology.
201423 Self-Powered Wireless Nano-scale Sensor Networks within Chemical Reactors Eisa Zarepour
School of Computer Science and Engineering,
University of New South Wales, Australia
ezarepour@cse.unsw.edu.au

Mahbub Hassan
School of Computer Science and Engineering,
University of New South Wales, Australia
mahbub@cse.unsw.edu.au

Chun Tung Chou
School of Computer Science and Engineering,
University of New South Wales, Australia
ctchou@cse.unsw.edu.au

Adesoji A. Adesina
ATODATECH LLC, Brentwood, CA 94513 USA,
ceo@atodatech.com
Because of their small size and unique nanomaterial properties, nano-scale sensor networks (NSNs) can be applied in many chemical applications to monitor and control chemical processes at the molecular level. Nano-sensors can take advantage of the temperature variation during a chemical synthesis to harvest thermoelectric energy from each individual reaction. In this study, we demonstrate how the thermal properties of chemical reactions could be used as a practical energy-harvesting source for the nanosensor nodes deployed at the catalyst sites to form a self-powered NSN.
201421 Asymptotic behaviour of some families of orthonormal polynomials and an associated Hilbert space Aleksandar Ignjatovic
School of Computer Science and Engineering,
The University of New South Wales, Australia
and National ICT Australia (NICTA)
ignjat@cse.unsw.edu.au
We characterise the asymptotic behaviour of families of orthonormal polynomials whose recursion coefficients satisfy certain conditions, satisfied for example by the Hermite polynomials and, more generally, by families with recursion coefficients of the form c(n+1)^p for 0 < p < 1. We then use this result to show that, in a Hilbert space associated with a family of orthonormal polynomials whose recursion coefficients satisfy such conditions, every two sinusoids of unequal positive frequencies are mutually orthogonal.
201420 Mining Processes with Multi-Instantiation: Discovery and Conformance Checking Ingo Weber
Software Systems Research Group, NICTA, and
School of Computer Science and Engineering,
The University of New South Wales, Australia
Ingo.Weber@nicta.com.au

Mostafa Farshchi
Software Systems Research Group, NICTA, and
Department of Computer Science and Software Engineering
Swinburne University of Technology, Hawthorn, Australia
mfarshchi@swin.edu.au

Jan Mendling
Department of Information Systems and Operations
Wirtschaftsuniversitaet Wien, Vienna, Austria
jan.mendling@wu.ac.at

Jean-Guy Schneider
Department of Computer Science and Software Engineering
Swinburne University of Technology, Hawthorn, Australia
jschneider@swin.edu.au
Process mining is becoming a widely adopted practice. However, when the underlying process contains multi-instantiation of sub-processes, classical process mining techniques that assume a flat process are not directly applicable. Their application can cause one of two problems: either the mined model is overly general, allowing arbitrary order and execution frequency of activities in the sub-process, or it lacks fitness by capturing only a single instantiation of sub-processes. For conformance checking, this results in a rate of false positives or false negatives, respectively, that is too high. In this report, we propose an extension to well-known process mining techniques, adding the capability of handling multi-instantiated sub-processes to discovery and conformance checking. We evaluate the approach with two independent data sets taken from real-world applications.
201419 EPLA: Energy-balancing Packets Scheduling for Airborne Relaying Networks Kai Li
School of Computer Science and Engineering,
The University of New South Wales, Australia
kail@cse.unsw.edu.au

Wei Ni
CSIRO, Australia
wei.ni@csiro.au

Xin Wang
Fudan University, China
xwang11@fudan.edu.cn

Ren Ping Liu
CSIRO, Australia
ren.liu@csiro.au

Salil S. Kanhere
School of Computer Science and Engineering,
The University of New South Wales, Australia
salilk@cse.unsw.edu.au

Sanjay Jha
School of Computer Science and Engineering,
The University of New South Wales, Australia
sanjay@cse.unsw.edu.au
Airborne relaying has great potential to extend the coverage of wireless sensor networks (WSNs), relaying sensed data from remote, human-unfriendly terrains. However, the challenges of lossy airborne relaying channels and short lifetime arise, due to the high mobility and limited battery capacity of unmanned aerial vehicles (UAVs). We propose an energy-efficient relaying scheme which is able to overcome the lossy channels and substantially extend the lifetime of cooperative UAVs. The key idea is to employ a swarm of UAVs to listen to a remote sensor from distributed locations, thereby improving packet reception over lossy channels. UAVs report their reception qualities to the base station, which then schedules the UAVs' forwarding with guaranteed success rates and balanced energy consumption. Such scheduling is an NP-hard binary integer programming problem and is intractable in WSNs where there can be a large number of packets. We develop a practical suboptimal solution by decoupling the processes of energy balancing and modulation selection. The decoupled processes are carried out in an alternating manner, achieving fast convergence. Simulation results confirm that our method is nearly indistinguishable from the NP-hard optimal solution in terms of network yield (throughput) and lifetime, while the complexity of our method is lower by orders of magnitude. Simulations also reveal that our scheme can save energy by 50%, increase network yield by 15%, and extend network lifetime by 33%, compared to existing greedy algorithms.
201417 Optimal Enumeration: Efficient Top-k Tree Matching Lijun Chang
School of Computer Science and Engineering,
University of New South Wales, Australia
ljchang@cse.unsw.edu.au

Xuemin Lin
School of Computer Science and Engineering,
University of New South Wales, Australia
lxue@cse.unsw.edu.au

Wenjie Zhang
School of Computer Science and Engineering,
University of New South Wales, Australia
zhangw@cse.unsw.edu.au

Jeffrey Xu Yu
The Chinese University of Hong Kong, China
yu@se.cuhk.edu.hk

Ying Zhang
University of Technology, Sydney, Australia
Ying.Zhang@uts.edu.au

Lu Qin
University of Technology, Sydney, Australia
Lu.Qin@uts.edu.au
Driven by many real applications, graph pattern matching has attracted a great deal of attention recently. Twig-pattern matching may result in an extremely large number of matches in a graph; this may not only confuse users by providing too many results but also lead to high computational costs. In this paper, we study the problem of top-$k$ tree pattern matching; that is, given a rooted tree $T$, compute its top-$k$ matches in a directed graph $G$ based on the twig-pattern matching semantics. We first present a novel and optimal enumeration paradigm based on the principle of Lawler's procedure. We show that our enumeration algorithm runs in $O (n_T + \log k)$ time in each round, where $n_T$ is the number of nodes in $T$. Considering that the time complexity to output a match of $T$ is $O(n_T)$ and $n_T \geq \log k$ in practice, our enumeration technique is optimal. Moreover, the cost of generating the top-$1$ match of $T$ in our algorithm is $O (m_R)$, where $m_R$ is the number of edges in the transitive closure of a data graph $G$ involving all nodes relevant to $T$. $O (m_R)$ is also optimal in the worst case without pre-knowledge of $G$. Consequently, our algorithm is optimal with the running time $O (m_R + k (n_T + \log k))$, in contrast to the time complexity $O (m_R \log k + k n_T (\log k + d_T))$ of the existing technique, where $d_T$ is the maximal node degree in $T$. Secondly, a novel priority-based access technique is proposed, which greatly reduces the number of edges accessed and results in a significant performance improvement. Finally, we apply our techniques to the general form of the top-$k$ graph pattern matching problem (i.e., the query is a graph) to improve the existing techniques. Comprehensive empirical studies demonstrate that our techniques may improve the existing techniques by orders of magnitude.
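Lawler's procedure can be illustrated on a much simpler toy problem than tree matching: enumerate, in non-increasing score order, joint selections where each position independently picks one option from a descending-sorted list. The partition-by-successors scheme below is in the spirit of Lawler's paradigm; it is not the paper's algorithm.

    import heapq

    def top_k(options, k):
        # options[i] = list of (score, label) sorted by score descending.
        best = tuple(0 for _ in options)          # index vector of the best joint pick
        def score(ix):
            return sum(options[i][j][0] for i, j in enumerate(ix))
        heap = [(-score(best), best)]
        seen = {best}
        for _ in range(k):
            if not heap:
                return
            s, ix = heapq.heappop(heap)
            yield -s, [options[i][j][1] for i, j in enumerate(ix)]
            for i in range(len(ix)):              # split: advance one coordinate
                if ix[i] + 1 < len(options[i]):
                    nxt = ix[:i] + (ix[i] + 1,) + ix[i + 1:]
                    if nxt not in seen:
                        seen.add(nxt)
                        heapq.heappush(heap, (-score(nxt), nxt))

    options = [[(5, "a1"), (2, "a2")], [(4, "b1"), (3, "b2")], [(6, "c1"), (1, "c2")]]
    for s, pick in top_k(options, 4):
        print(s, pick)    # 15, 14, 12, 11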
201416 Plane-based Object Categorisation using Relational Learning: Implementation Details and Extension of Experiments Reza Farid
School of Computer Science and Engineering,
University of New South Wales, Australia
rezaf@cse.unsw.edu.au

Claude Sammut
School of Computer Science and Engineering,
University of New South Wales, Australia
claude@cse.unsw.edu.au
Object detection, classification and manipulation are some of the capabilities required by autonomous robots. The main steps in object recognition and object classification are: segmentation, feature extraction, object representation and learning. To address the problem of learning object classification using multi-view range data, we use a relational approach. The first step is to decompose a scene into shape primitives such as planes. A set of higher-level, relational features is extracted from the segmented regions. Thus, features are presented at three different levels: single-region features, pair-region relationships and features of all regions forming an object instance. The extracted features are represented as predicates in Horn clause logic. Positive and negative examples are produced for learning by the labelling and training facilities developed in this research. Inductive Logic Programming (ILP) is used to learn relational concepts from instances taken by a depth camera. As a result, a human-readable representation for each object class is created. The methods developed in this research have been evaluated in experiments on data captured from a real robot designed for urban search and rescue, as well as on standard data sets. The results show that ILP is successful in recognising objects encountered by a robot and is competitive with other state-of-the-art methods. In this report, we provide details of this plane-based object categorisation using relational learning, including the segmentation method developed for producing high-quality planar segments used for learning object classes, the implementation of feature extraction, and the specification needed for learning. We also perform experiments to evaluate the new features and compare our method with a state-of-the-art non-relational object classifier.
201415 Qualitative Simulation with Answer Set Programming Timothy Wiley
Claude Sammut
School of Computer Science and Engineering,
The University of New South Wales,
Sydney, NSW 2052, Australia
{timothyw, claude}@cse.unsw.edu.au

Ivan Bratko
Faculty of Computer and Information Science,
University of Ljubljana,
Trzaska 25, 1000 Ljubljana, Slovenia
bratko@fri.uni-lj.si
Qualitative Simulation (QSIM) reasons about the behaviour of dynamic physical systems as they evolve over time. The system is represented by a coarse qualitative model rather than precise numerical models. However, for large complex domains, such as robotics for Urban Search and Rescue, existing QSIM implementations are inefficient. ASPQSIM is a novel formulation of the QSIM algorithm in Answer Set Programming that takes advantage of the similarities between qualitative simulation and constraint satisfaction problems. ASPQSIM is compared against an existing QSIM implementation on a variety of domains; the comparison demonstrates that ASPQSIM provides a significant improvement in efficiency, especially on complex domains, and produces simulations in domains that are not solvable by the procedural implementation.
201414 Credibility Propagation for Robust Data Aggregation in WSNs Mohsen Rezvani
School of Computer Science and Engineering,
University of New South Wales, Australia
mrezvani@cse.unsw.edu.au

Aleksandar Ignjatovic
School of Computer Science and Engineering,
University of New South Wales, Australia
ignjat@cse.unsw.edu.au

Elisa Bertino
Department of Computer Science,
Purdue University
bertino@cs.purdue.edu

Sanjay Jha
School of Computer Science and Engineering,
University of New South Wales, Australia
sanjay@cse.unsw.edu.au
Trust and reputation systems are widely employed in WSNs in order to help decision-making processes by assessing the trustworthiness of sensor nodes in a data aggregation process. However, in unattended and hostile environments, sophisticated malicious attacks such as collusion attacks can distort the computed trust scores, lead to low-quality or deceptive services, and undermine the aggregation results. Thus, taking collusion attacks into account when developing a secure trust-based data aggregation scheme for unattended environments has become an important research issue. In this paper, we address this problem by proposing a novel collaborative trust framework for WSNs based on the introduced concept of credibility propagation, in which the trustworthiness of a sensor node is evaluated from the amount of credibility that the node collects from other nodes. Moreover, we obtain the statistical parameters of sensor errors, including the sensors' variances, from such credibility values. Accordingly, we propose an iterative filtering algorithm to recursively compute the credibility and variance of all sensors. Following this algorithm, an estimate of the true value of the signal can be effectively obtained through maximum likelihood estimation. Furthermore, we augment the proposed trust framework with a collusion detection and revocation method, as well as a data streaming algorithm. Extensive experiments across a wide variety of configurations over both real-world and synthetic datasets demonstrate the efficiency and effectiveness of our approach.
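A compact sketch of plain iterative filtering, simpler than the credibility-propagation framework above: estimate the signal, re-estimate each sensor's error variance, and re-weight sensors by inverse variance, which is the maximum-likelihood weighting under Gaussian errors. All data below is synthetic.

    import numpy as np

    def iterative_filter(readings, iters=10):
        # readings[s, t] = reading of sensor s at time t.
        w = np.ones(readings.shape[0])
        for _ in range(iters):
            w = w / w.sum()
            estimate = w @ readings                    # weighted signal estimate
            var = ((readings - estimate) ** 2).mean(axis=1)
            w = 1.0 / (var + 1e-9)                     # ML weights ~ inverse variance
        return estimate, w / w.sum()

    rng = np.random.default_rng(1)
    truth = np.sin(np.linspace(0, 3, 50))
    readings = truth + rng.normal(0, [[0.05], [0.1], [0.8]], size=(3, 50))
    estimate, weights = iterative_filter(readings)
    print(np.round(weights, 3))   # the noisy third sensor gets a small weight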
201412 CLAMS: Cross-Layer Multi-Cloud Application Monitoring-as-a-Service Framework Khalid Alhamazani
School of Computer Science and Engineering
University of New South Wales, Australia
ktal130@cse.unsw.edu.au

Rajiv Ranjan
CSIRO Computational Informatics
CSIRO, Australia
rajiv.ranjan@csiro.au

Karan Mitra
Luleå University of Technology
Skellefteå Campus, 93187 Skellefteå, Sweden
karan.mitra@ltu.se

Prem Prakash Jayaraman
CSIRO Computational Informatics
CSIRO, Australia
prem.jayaraman@csiro.au

Zhiqiang (George) Huang
CSIRO Computational Informatics
CSIRO, Australia
zhiqiang.huang@csiro.au

Lizhe Wang
Chinese Academy of Sciences, Beijing, China
lizhe.wang@gmail.com

Fethi Rabhi
School of Computer Science and Engineering
University of New South Wales, Australia
Fethir@cse.unsw.edu.au
Cloud computing provides on-demand access to affordable hardware (e.g., multi-core CPUs, GPUs, disks, and networking equipment) and software (e.g., databases, application servers, data processing frameworks, etc.) platforms. Application services hosted on single or multiple cloud provider platforms have diverse characteristics that require extensive monitoring mechanisms to aid in controlling run-time quality of service (e.g., access latency and the number of requests being served per second). To provide the essential real-time information for effective and efficient cloud application quality of service (QoS) monitoring, in this paper we propose, develop and validate CLAMS, a Cross-Layer Multi-Cloud Application Monitoring-as-a-Service Framework. The proposed framework is capable of: (a) performing QoS monitoring of application components (e.g., database, web server, application server, etc.) that may be deployed across multiple cloud platforms (e.g., Amazon and Azure); and (b) giving visibility into the QoS of individual application components, which is something not supported by current monitoring services and techniques. We conduct experiments on real-world multi-cloud platforms such as Amazon and Azure to empirically evaluate our framework, and the results validate that CLAMS efficiently monitors applications running across multiple clouds.
201411 Unified Representation and Reuse of Federated Cloud Resources Configuration Knowledge Denis Weerasiri
School of Computer Science and Engineering,
University of New South Wales, Australia
denisw@cse.unsw.edu.au

Boualem Benatallah
School of Computer Science and Engineering,
University of New South Wales, Australia
boualem@cse.unsw.edu.au

Jian Yang
Department of Computing
Macquarie University
Sydney NSW 2109, Australia
jian.yang@mq.edu.au
Current cloud resource delivery models force cloud resource consumers to bear the burden of leveraging existing cloud resource configuration management knowledge to satisfy their federated cloud application and resource requirements. This is because the support offered by current cloud resource configuration management techniques is mostly limited to segregated cloud infrastructure or platform functionalities, which prevents any coordinated combination of on-premise and off-premise applications and resources. In this paper, we propose an embryonic data model for unified cloud resource configuration knowledge representation. We also propose a rule-based recommender system, which allows consumers to declaratively specify requirements and get recommendations of configuration management knowledge that satisfies the given requirements. We implemented a proof-of-concept prototype to test our approach.
201409 Trust and Privacy Considerations in Participant Selection for Social Participatory Sensing Haleh Amintoosi
School of Computer Science and Engineering,
University of New South Wales, Australia
haleha@cse.unsw.edu.au

Salil S. Kanhere
School of Computer Science and Engineering,
University of New South Wales, Australia
salilk@cse.unsw.edu.au

Mohammad Allahbakhsh
School of Computer Science and Engineering,
University of New South Wales, Australia
mallahbakhsh@cse.unsw.edu.au
The main idea behind social participatory sensing is to leverage social friends to participate in mobile sensing tasks. A main challenge, however, is the identification and recruitment of a sufficient number of well-suited participants. This becomes even more challenging for large-scale online social networks, where the network topology and friendship relations are not known to the applications. Moreover, the potential sparseness of the friendship network may result in insufficient participation, thus reducing the validity of the obtained information. In this paper, we propose a participant selection framework which aims to address the aforementioned limitations. The framework has two main modules. The nomination module makes use of a customized random surfer to crawl the social graph and identify suitable nominees among the requester's friends and friends-of-friends; nominees are determined as a function of members' suitability scores and the pairwise trust perception among members. The selection module is responsible for selecting the required participants from the set of nominees, based on each nominee's timeliness, the number of participants selected so far and the task's remaining time. Moreover, to prevent possible collusion, a further check is performed to determine whether the selection of a new participant may result in the formation of a colluding group among the selected participants. Simulation results demonstrate the efficacy of our proposed participant selection framework in selecting a large number of reputable participants with high suitability scores, in comparison with state-of-the-art methods.
201408 Contents and Time Sensitive Document Ranking of Scientific Literature Han Xu
School of Computer Science and Engineering,
University of New South Wales, Australia
hanx@cse.unsw.edu.au

Eric Martin
School of Computer Science and Engineering,
University of New South Wales, Australia
emartin@cse.unsw.edu.au

Ashesh Mahidadia
School of Computer Science and Engineering,
University of New South Wales, Australia
ashesh@cse.unsw.edu.au
A new link-based document ranking framework is devised with, at its heart, a contents- and time-sensitive random literature explorer designed to more accurately model the behaviour of readers of scientific documents. In particular, our ranking framework dynamically adjusts its random walk parameters according to both the contents and the age of encountered documents, thus incorporating the diversity of topics, and how they evolve over time, into the score of a scientific publication. Our random walk framework results in a ranking of scientific documents which is shown to be more effective in facilitating literature exploration than PageRank, measured against a proxy of ground truth based on papers' potential usefulness in facilitating later research. One of its many strengths lies in its practical value in reliably retrieving and placing promisingly useful papers at the top of its ranking.
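A toy version of a time-sensitive random walk, for illustration only: ordinary PageRank over a citation graph, except that the walker follows a reference with probability that decays with the age gap to the cited paper. The exponential decay form and all data are assumptions, not the paper's model.

    import numpy as np

    def time_sensitive_rank(cites, year, iters=50, d=0.85, tau=5.0):
        papers = sorted(year)
        idx = {p: i for i, p in enumerate(papers)}
        n = len(papers)
        P = np.zeros((n, n))
        for p, refs in cites.items():
            if not refs:
                continue
            # Transition weights decay with the age gap between citing and cited.
            w = np.array([np.exp(-abs(year[p] - year[r]) / tau) for r in refs])
            P[idx[p], [idx[r] for r in refs]] = w / w.sum()
        r = np.full(n, 1.0 / n)
        for _ in range(iters):
            dangling = r[P.sum(axis=1) == 0].sum()     # redistribute dead-end mass
            r = (1 - d) / n + d * (r @ P + dangling / n)
        return dict(zip(papers, np.round(r, 3)))

    year = {"A": 2000, "B": 2005, "C": 2012}
    cites = {"C": ["A", "B"], "B": ["A"], "A": []}
    print(time_sensitive_rank(cites, year))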
201407 Could Message Ferrying be a Viable Technology for Small Cell Backhaul? Mahbub Hassan
School of Computer Science and Engineering,
University of New South Wales, Australia
mahbub@cse.unsw.edu.au

Chun Tung Chou
School of Computer Science and Engineering,
University of New South Wales, Australia
ctchou@cse.unsw.edu.au
Small cells are seen as key to combating the looming capacity crisis in next generation mobile networks. Backhaul, however, is proving very costly, especially when it comes to connecting outdoor small cells to the core network. We analyse the viability of message ferrying as a low-cost option for small cell backhaul. The idea is that the smartphones of vehicle occupants could work as an army of ferries to transfer data between a small cell and another location that already has existing backhaul infrastructure. We analyse the potential capacity of ferry-based backhaul using real road traffic data and the capacity of two different types of phone storage, RAM and internal memory. We find that, even with only a 5% use of the phone storage, message ferrying could deliver gigabits per second of capacity and transfer tens of petabytes per week. Our analysis also reveals that the choice of storage type, which may be influenced by privacy concerns, has a significant effect on capacity. Operators choosing to use the internal memory can expect to increase the ferrying capacity by nearly an order of magnitude compared to a RAM-only solution.
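The flavour of the capacity analysis can be reproduced with back-of-the-envelope arithmetic; every number below is an assumption for illustration, not a figure from the paper.

    vehicles_per_hour = 2000              # assumed traffic past the small cell
    phone_storage_gb  = 64                # assumed internal memory per phone
    share             = 0.05              # 5% of storage donated to ferrying
    bytes_per_vehicle = share * phone_storage_gb * 1e9

    capacity_bps   = vehicles_per_hour * bytes_per_vehicle * 8 / 3600
    bytes_per_week = vehicles_per_hour * bytes_per_vehicle * 24 * 7
    print(f"{capacity_bps / 1e9:.1f} Gbps, {bytes_per_week / 1e15:.2f} PB/week")
    # -> 14.2 Gbps, 1.08 PB/week under these assumed numbers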
201406 Interdependent Security Risk Analysis of Hosts and Flows Mohsen Rezvani
School of Computer Science and Engineering,
University of New South Wales, Australia
mrezvani@cse.unsw.edu.au

Verica Sekulic
School of Computer Science and Engineering,
University of New South Wales, Australia
vericas@cse.unsw.edu.au

Aleksandar Ignjatovic
School of Computer Science and Engineering,
University of New South Wales, Australia
ignjat@cse.unsw.edu.au

Elisa Bertino
Department of Computer Science,
Purdue University, USA
bertino@cs.purdue.edu

Sanjay Jha
School of Computer Science and Engineering,
University of New South Wales, Australia
sanjay@cse.unsw.edu.au
Detection of high-risk hosts and flows continues to be a significant problem in security monitoring of high-throughput networks. A comprehensive risk assessment method should take into account risk propagation among risky hosts and flows. In this paper, this is achieved by introducing two novel concepts. The first is an interdependency relationship among the risk scores of a network flow and its source and destination hosts: on the one hand, the risk score of a host depends on the risky flows it initiates and is targeted by; on the other hand, the risk score of a flow depends on the risk scores of its source and destination hosts. The second concept, which we call flow provenance, represents risk propagation among network flows and takes into account the likelihood that a particular flow is caused by other flows. Based on these two concepts, we develop an iterative algorithm for computing the risk level of hosts and network flows. We give a rigorous proof that our algorithm rapidly converges to unique risk estimates, and provide an extensive empirical evaluation using two real-world datasets. Our evaluation demonstrates that our method is effective in detecting high-risk hosts and flows and is sufficiently efficient to be deployed in high-throughput networks.
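A simplified sketch of the interdependency iteration (omitting the flow-provenance component): flow risk is driven by an intrinsic alert score and its endpoint hosts, host risk by the flows that touch the host, iterated to a fixed point with per-round normalisation. The alert scores are hypothetical.

    def risk_scores(flows, alert, iters=30):
        hosts = {h for f in flows for h in f}
        h_risk = {h: 1.0 for h in hosts}
        f_risk = [0.0] * len(flows)
        for _ in range(iters):
            for i, (src, dst) in enumerate(flows):
                f_risk[i] = alert[i] * (h_risk[src] + h_risk[dst]) / 2
            for h in hosts:
                touching = [f_risk[i] for i, f in enumerate(flows) if h in f]
                h_risk[h] = sum(touching) / len(touching)
            m = max(h_risk.values())          # normalise so scores do not decay
            h_risk = {h: v / m for h, v in h_risk.items()}
        return h_risk, f_risk

    flows = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "a")]
    alert = [0.1, 0.9, 0.9, 0.1]              # hypothetical per-flow IDS scores
    print(risk_scores(flows, alert))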
201405 Representing and Reasoning about Game Strategies Dongmo Zhang
School of Computing and Mathematics,
University of Western Sydney, Australia
dongmo@scm.uws.edu.au

Michael Thielscher
School of Computer Science and Engineering,
University of New South Wales
mit@cse.unsw.edu.au
As a contribution to the challenge of building game-playing AI systems, we develop and analyse a formal language for representing and reasoning about strategies. Our logical language builds on the existing general Game Description Language (GDL) and extends it by a standard modality for linear time along with two dual connectives to express preferences when combining strategies. The semantics of the language is provided by a standard state-transition model. As such, problems that require reasoning about games can be solved by the standard methods for reasoning about actions and change. We also endow the language with a specific semantics by which strategy formulas are understood as move recommendations for a player. To illustrate how our formalism supports automated reasoning about strategies, we demonstrate two example methods of implementation: first, we formalise the semantic interpretation of our language in conjunction with game rules and strategy rules in the Situation Calculus; second, we show how the reasoning problem can be solved with Answer Set Programming.
201403 Reconfigurable Convolutional Codec Architecture and its Security Application Liang Tang
School of Computer Science and Engineering,
University of New South Wales, Australia
to.liang.tang@gmail.com

Jude Angelo Ambrose
School of Computer Science and Engineering,
University of New South Wales, Australia
ajangelo@cse.unsw.edu.au

Sri Parameswaran
School of Computer Science and Engineering,
University of New South Wales, Australia
sridevan@cse.unsw.edu.au
Wireless communication is an indispensable tool in our daily life. Due to the open nature of wireless channels, wireless communication is more vulnerable to attacks than wired communication, and security is paramount in overcoming these attacks. A reconfigurable convolutional encoder/decoder based physical layer security mechanism, named ReConv, is proposed in this paper. ReConv provides an extra level of security at the physical layer by dynamically updating baseband convolution parameters for secure packets. ReConv is expected to interleave normal packets along with secure packets. An eavesdropper will see packets with changed convolution parameters as packets containing errors and will drop them, whereas the rightful receiver will be able to decode the secure packets without error. Since the secure packets are dropped at the eavesdropper's physical layer, there will be no further processing of the secure packets by the eavesdropper. ReConv allows the use of 63.5 billion different convolution parameter combinations, and the attack complexity increases exponentially with the extended byte-level ReConv. Meanwhile, the low hardware overhead of ReConv makes it a low-cost way to add security to existing wireless systems. Furthermore, the orthogonality between ReConv and existing security algorithms such as AES makes it easy to integrate ReConv into existing wireless systems.
201402 ElasticCopyset: An Elastic Replica Placement Scheme for High Durability Han Li
School of Computer Science and Engineering,
University of New South Wales, Australia
hli@cse.unsw.edu.au

Srikumar Venugopal
School of Computer Science and Engineering,
University of New South Wales, Australia
srikumarv@cse.unsw.edu.au
Distributed key-value stores (KVSs) are a standard component for data management for applications in Infrastructure-as-a-Service (IaaS) clouds. Replica placement schemes for KVSs on IaaS have to adapt to on-demand node addition and removal, as well as handle correlated failures of the physical hardware underlying the cloud. Currently, while placement strategies exist for handling correlated failures, they tend to rely on a static mapping of data to nodes, which is inefficient for an elastic system. This paper presents ElasticCopyset, a novel replica placement scheme that fills this gap by providing efficient elasticity while maintaining high data durability when multiple nodes fail simultaneously. We experimentally demonstrate that ElasticCopyset maintains a close-to-minimal probability of data loss under correlated failures in different scenarios, and exhibits better scalability and elasticity than state-of-the-art replica placement schemes.
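For background, the sketch below shows permutation-based copyset generation in the spirit of the static schemes ElasticCopyset improves upon; the elastic construction itself is described in the report, and all parameters here are illustrative:

    import random

    # Sketch of permutation-based copyset generation. Chopping random
    # permutations of the node list into groups of size r bounds the
    # combinations of nodes whose simultaneous failure can lose data.
    def make_copysets(nodes, r=3, scatter_width=4):
        n_perms = max(1, scatter_width // (r - 1))
        copysets = []
        for _ in range(n_perms):
            perm = nodes[:]
            random.shuffle(perm)
            copysets += [tuple(perm[i:i + r])
                         for i in range(0, len(perm) - r + 1, r)]
        return copysets

    print(make_copysets([f"n{i}" for i in range(9)]))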
201401 An Empirical Study of On-Line Models for Relational Data Streams Ashwin Srinivasan
Department of Computer Science,
IIIT, New Delhi, India
ashwin@iiitd.ac.in

Michael Bain
School of Computer Science and Engineering,
University of New South Wales, Australia
mike@cse.unsw.edu.au
To date, Inductive Logic Programming (ILP) systems have largely assumed that all data needed for learning have been provided at the onset of model construction. Increasingly, for application areas like telecommunications, astronomy, text processing, financial markets and biology, machine-generated data arrive continuously and on a vast scale. We see at least four kinds of problems that this presents for ILP: (1) It may not be possible to store all of the data, even in secondary memory; (2) Even if it were possible to store the data, it may be impractical to construct an acceptable model using partitioning techniques that repeatedly perform expensive coverage or subsumption-tests on the data; (3) Models constructed at some point may become less effective, or even invalid, as more data become available (exemplified by the ``drift'' problem when identifying concepts); and (4) The representation of the data instances may need to change as more data become available (a kind of ``language drift'' problem). In this paper, we investigate the adoption of a stream-based on-line learning approach to relational data. Specifically, we examine the representation of relational data in both an infinite-attribute setting, and in the usual fixed-attribute setting, and develop implementations that use ILP engines in combination with on-line model-constructors. The behaviour of each program is investigated using a set of controlled experiments, and performance in practical settings is demonstrated by constructing complete theories for some of the largest biochemical datasets examined by ILP systems to date, including one with a million examples - to the best of our knowledge, the first time this has been empirically demonstrated with ILP on a real-world data set.
201335 Improving GA-based mapping algorithm of NoC using a formal model Vinitha A Palaniveloo
School of Computer Science and Engineering,
University of New South Wales, Australia
vinithaap@cse.unsw.edu.au

Arcot Sowmya
School of Computer Science and Engineering,
University of New South Wales, Australia
sowmya@cse.unsw.edu.au
Network on Chip (NoC) is a sophisticated communication infrastructure designed to interconnect components in a complex system on chip (SoC). NoC provides quality of service (QoS) guarantees to the applications mapped onto it. QoS depends on the NoC router architecture, the NoC communication scheme, application traffic characteristics, and the application mapping strategy. Applications are mapped to the NoC using mapping algorithms that satisfy power/latency/bandwidth constraints. The effect of different mapping algorithms on QoS parameters such as average latency, throughput, power and area is evaluated using NoC simulators. A suitable QoS parameter for evaluating mapping algorithms that satisfy bandwidth constraints and minimise average communication delay is worst-case communication latency. Worst-case latency is a measure of the latency upper bound; it provides insight into the latency that the NoC guarantees to the application. However, worst-case latency cannot be measured using an NoC simulator, so analytical models are used instead. The formal model previously proposed for measuring worst-case latency is used here to improve mapping algorithms constrained by bandwidth and latency.
201334 Using Column Generation for Solving Large Scale Concrete Dispatching Problems Mojtaba Maghrebi
School of Civil and Environmental Engineering,
University of New South Wales, Australia
maghrebi@unsw.edu.au

Vivek Periara
Department of Systems and Industrial Engineering,
The University of Arizona, Tucson, AZ, USA
vivek.periaraj@gmail.com

S. Travis Waller
School of Civil and Environmental Engineering,
University of New South Wales, Australia
s.waller@unsw.edu.au

Claude Sammut
School of Computer Science and Engineering,
University of New South Wales, Australia
claude@cse.unsw.edu.au
Ready Mix Concrete (RMC) dispatching forms a critical component of the construction supply chain. However, optimization approaches within RMC dispatching continue to evolve due to the specific size, constraints and objectives of the application domain. In this paper, we develop a column generation algorithm for Vehicle Routing Problems (VRP) with time window constraints as applied to RMC dispatching problems and examine the performance of the approach for this specific application domain. The objective of the problem is to find the minimum cost routes for a fleet of capacitated vehicles serving concrete to customers with known demand from depots within the allowable time window. The VRP is specialized to cover the concrete delivery problem by adding constraints that reflect real situations. The introduced model is amenable to the Dantzig-Wolfe reformulation for solving pricing problems using the two-staged methodology proposed in this paper. Further, under the mild assumption of homogeneous vehicles, the pricing sub-problem can be viewed as a minimum-cost multi-commodity flow problem (MMCF) and solved in polynomial time using efficient network simplex implementations. A large-scale field-collected dataset is used for evaluating the model and the proposed solution method, with and without time window constraints. In addition, the method is compared with the exact solution found via enumeration. The results show that on average the proposed methodology attains near-optimal solutions for many of the large-sized models while being 10 times faster than branch-and-cut.
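As a toy illustration of viewing the pricing subproblem as a minimum-cost flow, the snippet below uses the network simplex implementation in networkx; the node names, capacities and reduced costs are invented for the example:

    import networkx as nx

    # Toy pricing subproblem as a min-cost flow; all numbers are hypothetical.
    G = nx.DiGraph()
    G.add_node("depot", demand=-2)            # two truckloads to dispatch
    G.add_node("site_a", demand=1)            # each site needs one load
    G.add_node("site_b", demand=1)
    G.add_edge("depot", "site_a", capacity=2, weight=4)   # reduced costs
    G.add_edge("depot", "site_b", capacity=2, weight=7)
    G.add_edge("site_a", "site_b", capacity=1, weight=2)

    flow = nx.min_cost_flow(G)   # solved by network simplex
    # Routes priced with negative reduced cost would enter the master
    # problem as new columns.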
201333 Big Data and Cross-Document Coreference Resolution: Current State and Future Opportunities Seyed-Mehdi-Reza (Amin) Beheshti
School of Computer Science and Engineering,
University of New South Wales, Australia
sbeheshti@cse.unsw.edu.au

Srikumar Venugopal
School of Computer Science and Engineering,
University of New South Wales, Australia
srikumarv@cse.unsw.edu.au

Seung Hwan Ryu
School of Computer Science and Engineering,
University of New South Wales, Australia
seungr@cse.unsw.edu.au

Boualem Benatallah
School of Computer Science and Engineering,
University of New South Wales, Australia
boualem@cse.unsw.edu.au

Wei Wang
School of Computer Science and Engineering,
University of New South Wales, Australia
weiw@cse.unsw.edu.au
Information Extraction (IE) is the task of automatically extracting structured information from unstructured/semi-structured machine-readable documents. Among various IE tasks, extracting actionable intelligence from an ever-increasing amount of data depends critically upon Cross-Document Coreference Resolution (CDCR) - the task of identifying entity mentions across multiple documents that refer to the same underlying entity. Recently, document datasets of the order of peta-/tera-bytes have raised many challenges for performing effective CDCR, such as scaling to large numbers of mentions and limited representational power. Analysing datasets at this scale is commonly referred to as a "big data" problem. The aim of this paper is to provide readers with an understanding of the central concepts, subtasks, and the current state-of-the-art in the CDCR process. We provide an assessment of existing tools/techniques for CDCR subtasks and highlight big data challenges in each of them to help readers identify important and outstanding issues for further investigation. Finally, we provide concluding remarks and discuss possible directions for future work.
201332 Higher-order Multidimensional Programming (Revised) John Plaice
School of Computer Science and Engineering,
University of New South Wales, Australia
plaice@cse.unsw.edu.au

Jarryd P. Beck
School of Computer Science and Engineering,
University of New South Wales, Australia
jarrydb@cse.unsw.edu.au
We present a higher-order functional language, called TransLucid, in which expressions and variables denote intensions, which are arbitrary-dimensional arrays in which any atomic value may be used as a dimension, and a multidimensional runtime context is used to index the intensions. In addition to atomic objects, the first-class objects of TransLucid are contexts, intensions and functions. We give an intuitive presentation of the core principles of TransLucid, present its denotational semantics, and then develop a number of programming techniques, typically avoiding recursive function calls, taking advantage of the features of the language.
201330 Securing Networks Using Software Defined Networking: A Survey Syed Taha Ali
School of Computer Science and Engineering,
University of New South Wales, Australia
taha@unsw.edu.au

Vijay Sivaraman
School of Electrical Engineering and Telecommunications,
University of New South Wales, Australia
vijay@unsw.edu.au

Adam Radford
Cisco Systems,
Australia
aradford@cisco.com

Sanjay Jha
School of Computer Science and Engineering,
University of New South Wales, Australia
sanjay.jha@unsw.edu.au
Software Defined Networking (SDN) is rapidly emerging as a new paradigm for managing and controlling the operation of networks ranging from the data center to core, enterprise and home. The logical centralization of network intelligence presents exciting challenges and opportunities to enhance security in such networks, including new ways to prevent, detect and react to threats, as well as innovative security services and applications that are built upon SDN capabilities. In this paper we undertake a comprehensive survey of recent works that apply SDN to security, and identify promising future directions that can be addressed by such research.
201329 Target-Dependent Sentiment Analysis for Hashtags on Twitter Zhiwei YU
School of Computer Science and Engineering,
University of New South Wales, Australia
zhiweiyu@cse.unsw.edu.au

Raymond K. Wong
School of Computer Science and Engineering,
University of New South Wales, Australia
wong@cse.unsw.edu.au

Chi-Hung Chi
Intelligent Sensing & Sys. Lab,
CSIRO, Tasmania, Australia
chihung.chi@csiro.au
Microblogging services, such as Twitter, allow Internet users to exchange short messages easily. Users express their feelings on various topics as status messages or opinions. Hashtags are usually used in these services to mark essential words or phrases as a means of grouping topics. Thus, sentiment analysis on hashtags has become a popular method for determining user opinions on microblogs. In this paper, an effective approach to determining target-dependent hashtag sentiments is proposed. For a given tweet, a hashtag may carry different or even opposite opinions for different targets. Therefore, we aim to identify sentiment along two dimensions, namely (target, hashtag) pairs. We first build a target-dependent tweet-level sentiment classifier based on target position sensitive features. A majority voting strategy for hashtag-level sentiment classification is then proposed as a baseline method. Finally, we show that its performance is significantly improved by propagation on a hyper relationship graph containing both target and hashtag nodes.
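A minimal sketch of the majority-voting baseline, aggregating hypothetical tweet-level classifier outputs per (target, hashtag) pair:

    from collections import Counter

    # Majority-voting baseline: aggregate tweet-level (target, hashtag)
    # sentiment labels per pair. Labels are hypothetical classifier outputs.
    tweet_labels = [
        (("iphone", "#apple"), "pos"),
        (("iphone", "#apple"), "pos"),
        (("iphone", "#apple"), "neg"),
        (("android", "#apple"), "neg"),
    ]

    votes = {}
    for pair, label in tweet_labels:
        votes.setdefault(pair, Counter())[label] += 1
    hashtag_sentiment = {p: c.most_common(1)[0][0] for p, c in votes.items()}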
201328 Social Media Epidemiology (I) - Microblog Emerging Outbreak Monitoring Victor W. Chu
School of Computer Science and Engineering,
University of New South Wales, Australia
wchu@cse.unsw.edu.au

Raymond K. Wong
School of Computer Science and Engineering,
University of New South Wales, Australia
wong@cse.unsw.edu.au
A recent study on collective attention in Twitter shows that epidemic spreading of hashtags plays only a minor role in hashtag popularity, which is predominantly driven by exogenous factors. Although a standard epidemic model is insufficient to explain the diffusion patterns of hashtags, we show that a time-series form of the susceptible-infectious-recovered (SIR) model can be extended to monitor emerging outbreaks in microblogs. In particular, we focus on disturbance analysis in Twitter. Different from other research work on hashtag analysis, we introduce a notion of disturbance, defined as a probability distribution over a common vocabulary. We investigate disturbances that have already been identified by a community, e.g., topics learned from hashtagged messages, to focus on interpretable results. The probabilistic definition of disturbances overcomes the small usable sample space problem in hashtag analysis, as related tweets can be included by inference. This report presents a Bayesian online parameter mining method to monitor the diffusion of emerging disturbances in Twitter by combining a semi-supervised topic learning model with an enhanced SIR time-series model, which covers both endogenous and exogenous factors. By monitoring the estimated effective reproduction number of disturbances, one can profile and categorize the disturbances based on their levels of contagiousness and generate alerts on potential outbreaks.
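For illustration, a minimal discrete-time SIR step with an exogenous forcing term, in the spirit of the extended model described above; all parameter values are invented:

    # Discrete-time SIR with an exogenous term; parameters are hypothetical.
    def sir_step(S, I, R, beta, gamma, exo=0.0):
        """beta: infection rate, gamma: recovery rate, exo: exogenous
        (non-epidemic) arrivals. Proportions of a unit population."""
        new_inf = beta * S * I + exo
        new_rec = gamma * I
        return S - new_inf, I + new_inf - new_rec, R + new_rec

    S, I, R = 0.99, 0.01, 0.0
    gamma = 0.2
    for t in range(30):
        beta = 0.4 if t < 10 else 0.15        # time-varying contagiousness
        S, I, R = sir_step(S, I, R, beta, gamma, exo=0.001)
        r_eff = beta * S / gamma              # effective reproduction number
        print(t, round(r_eff, 2), round(I, 4))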
201327 A Partition-Based Approach to Structure Similarity Search Xiang Zhao
The University of New South Wales, Australia
xzhao@cse.unsw.edu.au

Chuan Xiao
Nagoya University, Japan
chuanx@nagoya-u.jp

Xuemin Lin
The University of New South Wales, Australia
lxue@cse.unsw.edu.au

Qing Liu
CSIRO, Australia
q.liu@csiro.au

Wenjie Zhang
The University of New South Wales, Australia
zhangw@cse.unsw.edu.au
Graphs are widely used to model complex data in many applications, such as bioinformatics, chemistry, social networks, pattern recognition, etc. A fundamental and critical query primitive is to efficiently search similar structures in a large collection of graphs. This paper studies graph similarity queries with edit distance constraints. Existing solutions to the problem utilize fixed-size overlapping substructures to generate candidates, and thus become susceptible to large vertex degrees or large distance thresholds. In this paper, we present a partition-based approach to tackle the problem. By dividing data graphs into variable-size non-overlapping partitions, the edit distance constraint is converted to a graph containment constraint for candidate generation. We develop efficient query processing algorithms based on the new paradigm. A candidate pruning technique and an improved graph edit distance algorithm are also developed to further boost the performance. In addition, a cost-aware graph partitioning technique is devised to optimize the index. Extensive experiments demonstrate our approach significantly outperforms existing approaches.
201326 k-FSOM: Fair Link Scheduling Optimization for Energy-Aware Data Collection in Mobile Sensor Networks Kai Li
School of Computer Science and Engineering,
University of New South Wales, Australia
kail@cse.unsw.edu.au

Branislav Kusy
Autonomous Systems Lab,
CSIRO ICT Centre, Australia
brano.kusy@csiro.au

Raja Jurdak
Autonomous Systems Lab,
CSIRO ICT Centre, Australia
raja.jurdak@csiro.au

Aleksandar Ignjatovic
School of Computer Science and Engineering,
University of New South Wales, Australia
ignjat@cse.unsw.edu.au

Salil S. Kanhere
School of Computer Science and Engineering,
University of New South Wales, Australia
salilk@cse.unsw.edu.au

Sanjay Jha
School of Computer Science and Engineering,
University of New South Wales, Australia
sanjay@cse.unsw.edu.au
We consider the problem of data collection from a continental-scale network of mobile sensors, specifically applied to wildlife tracking. Our application constraints favor a highly asymmetric solution, with heavily duty-cycled sensor nodes communicating with a network of powered base stations. Individual nodes move freely in the environment, resulting in low-quality radio links and hot-spot arrival patterns with the available data exceeding the radio link capacity. We propose a novel scheduling algorithm, the k-Fair Scheduling Optimization Model (k-FSOM), that maximizes the amount of collected data under the constraints of radio link quality and energy, while ensuring fair access to the radio channel. We show the problem is NP-complete and propose a heuristic to approximate the optimal scheduling solution in polynomial time. We use empirical link quality data to evaluate the k-FSOM heuristic in a realistic setting and compare its performance to other heuristics. We show that the k-FSOM heuristic achieves high data reception rates under different fairness and node lifetime constraints.
201325 Scalable Protein Sequence Similarity Search using Locality-Sensitive Hashing and MapReduce Freddie Sunarso
School of Computer Science and Engineering,
University of New South Wales, Australia
basems@cse.unsw.edu.au

Srikumar Venugopal
School of Computer Science and Engineering,
University of New South Wales, Australia
srikumarv@cse.unsw.edu.au

Federico Lauro
School of Biotechnology and Biomolecular Science,
University of New South Wales, Sydney, Australia
and
Singapore Centre on Environmental Life Sciences Engineering,
Nanyang Technological University, Singapore
Metagenomics is the study of environments through genetic sampling of their microbiota. Metagenomic studies produce large datasets that are estimated to grow at a faster rate than the available computational capacity. A key step in the study of metagenome data is sequence similarity searching, which is computationally intensive over large datasets. Tools such as BLAST require large dedicated computing infrastructure to perform such analysis and may not be available to every researcher. In this paper, we propose a novel approach called ScalLoPS that performs searching on protein sequence datasets using Locality-Sensitive Hashing (LSH), implemented using the MapReduce distributed framework. ScalLoPS is designed to scale across computing resources sourced from cloud computing providers. We present the design and implementation of ScalLoPS followed by evaluation with datasets derived from both traditional and metagenomic studies. Our experiments show that with this method the scalability of protein sequence search is significantly improved, with reasonable performance and quality.
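As an illustration of the general idea (not ScalLoPS's exact scheme), the sketch below computes a MinHash-style LSH signature over protein k-mers, the kind of per-sequence signature a map phase could emit before grouping by bucket in a reduce phase:

    import hashlib

    # MinHash-style LSH over protein k-mers; parameters are illustrative.
    def kmers(seq, k=3):
        return {seq[i:i + k] for i in range(len(seq) - k + 1)}

    def signature(seq, n_hashes=32, k=3):
        sig = []
        for h in range(n_hashes):
            salt = str(h).encode()           # one salted hash per position
            sig.append(min(hashlib.md5(salt + m.encode()).hexdigest()
                           for m in kmers(seq, k)))
        return sig

    a = signature("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")
    b = signature("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVG")
    # Matching signature positions estimate k-mer Jaccard similarity.
    jaccard_estimate = sum(x == y for x, y in zip(a, b)) / 32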
201324 Efficient Node Bootstrapping for Decentralised Shared-nothing Key-value Stores Han Li
School of Computer Science and Engineering,
University of New South Wales, Australia
hli@cse.unsw.edu.au

Srikumar Venugopal
School of Computer Science and Engineering,
University of New South Wales, Australia
srikumarv@cse.unsw.edu.au
Distributed key-value stores (KVSs) have become an important component for data management in cloud applications. Since resources can be provisioned on demand in the cloud, there is a need for efficient node bootstrapping and decommissioning, i.e. to incorporate or eliminate the provisioned resources as members of the KVS. This requires that data be handed over and load be shifted across nodes quickly. However, the data partitioning schemes in current shared-nothing KVSs do not support quick bootstrapping efficiently. In this paper, we design a middleware layer that provides a decentralised scheme of auto-sharding with two-phase bootstrapping. We experimentally demonstrate that our scheme reduces bootstrap time and improves load-balancing, thereby increasing the scalability of the KVS.
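For background, the sketch below shows a consistent-hashing ring with virtual nodes, the kind of auto-sharding that lets a bootstrapping node take over bounded key ranges; the two-phase handover protocol itself is described in the report:

    import bisect, hashlib

    # Consistent-hashing ring with virtual nodes (a common auto-sharding
    # building block; details here are illustrative).
    class Ring:
        def __init__(self, nodes, vnodes=8):
            self.ring = sorted((self._h(f"{n}#{v}"), n)
                               for n in nodes for v in range(vnodes))

        @staticmethod
        def _h(key):
            return int(hashlib.md5(key.encode()).hexdigest(), 16)

        def owner(self, key):
            i = bisect.bisect(self.ring, (self._h(key), ""))
            return self.ring[i % len(self.ring)][1]

    ring = Ring(["n1", "n2", "n3"])
    print(ring.owner("user:42"))  # adding "n4" remaps only ~1/4 of the keys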
201323 Modeling Performance of Elasticity Rules for Cloud-based Applications Basem Suleiman
School of Computer Science and Engineering,
University of New South Wales, Australia
basems@cse.unsw.edu.au

Srikumar Venugopal
School of Computer Science and Engineering,
University of New South Wales, Australia
srikumarv@cse.unsw.edu.au
Many IaaS providers, e.g., Amazon Web Services, allow cloud consumers to define elasticity (or auto-scaling) rules to dynamically allocate and release computing resources on demand and at per-unit-of-time costs. Modern enterprises are increasingly deploying their applications, e.g., internet banking and financial services, on such IaaS clouds so that their applications inherently become self-elastic in meeting variable workloads. Defining elasticity rules for such applications, however, remains a key challenge for cloud consumers, as it requires choosing appropriate threshold values to satisfy desired application and resource metrics. Achieving this empirically is expensive, as it requires a large amount of testing and analysis in real cloud environments. In this paper we propose novel analytical models that capture core elasticity thresholds and emulate how elasticity works. The proposed models also approximate primary metrics, including CPU utilization, application response time and server usage cost, for evaluating the performance of elasticity rules. Based on our models, we develop algorithms that decide when and how to scale out and scale in based on CPU utilization and other thresholds, and that estimate the server costs resulting from scaling actions. We validate the simulation of our elasticity models and algorithms, with different elasticity rule thresholds, against empirical data from experiments with the same thresholds using the TPC-W application on the Amazon cloud. The simulation results demonstrate reasonable accuracy of our elasticity models and algorithms in approximating CPU utilization, application response time, number of servers and server usage costs.
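As a minimal illustration of the kind of elasticity rule the models emulate (thresholds and the toy utilisation function are hypothetical):

    # Threshold-based elasticity rule; all constants are hypothetical.
    def scaling_decision(cpu_util, n, up=0.75, down=0.30, lo=1, hi=10):
        """Scale out above `up`, scale in below `down`, else hold."""
        if cpu_util > up and n < hi:
            return n + 1
        if cpu_util < down and n > lo:
            return n - 1
        return n

    n = 2
    for load in [120, 250, 400, 380, 150, 60]:   # requests/s, synthetic
        cpu = min(1.0, load / (200.0 * n))       # toy per-server capacity
        n = scaling_decision(cpu, n)
        print(load, round(cpu, 2), n)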
201322 Causal Time-Varying Dynamic Bayesian Networks Victor W. Chu
School of Computer Science and Engineering,
University of New South Wales, Australia
wchu@cse.unsw.edu.au

Raymond K. Wong
School of Computer Science and Engineering,
University of New South Wales, Australia
wong@cse.unsw.edu.au

Wei Liu
National ICT Australia
wei.liu@nicta.com.au

Fang Chen
National ICT Australia
fang.chen@nicta.com.au
Causal network structure learning methods, e.g., IC*, FCI and MBCS*, have been investigated in recent times, but none of them takes possible time-varying network structure, such as time-varying dynamic Bayesian networks (TV-DBN), into consideration. In this paper, the notions of relaxed TV-DBN (RTV-DBN) and causal TV-DBN (CTV-DBN), as well as a definition of causal boundary, are introduced. RTV-DBN is a generalized version of TV-DBN, whilst CTV-DBN is a causally compliant version. CTV-DBN is constructed using an asymmetric kernel, versus the symmetric kernel in TV-DBN, to address the problem of sample scarcity and to better fit within the causal boundary, while maintaining a similar level of variance and bias trade-off. Once the causal Markov assumption is satisfied, causal inference can be made based on the manipulation rule. We explore spatio-temporal data, which is known to exhibit heterogeneous patterns, data sparseness and distribution skewness. In contrast to a naive method that divides a space by grids, we capture the moving objects' view of space by using clustering to overcome data sparseness and skewness issues. In our experiments, we use RTV-DBN and CTV-DBN to reveal the time-varying structure of interesting regions from the transformed data.
201321 Top-Down XML Keyword Query Processing Junfeng Zhou
Yanshan University, China
zhoujf@ysu.edu.cn

Xingmin Zhao
Yanshan University, China
zxm@ysu.edu.cn

Wei Wang
University of New South Wales, Australia
weiw@cse.unsw.edu.au

Ziyang Chen
Yanshan University, China
zychen@ysu.edu.cn

Jeffrey Xu Yu
Chinese University of Hong Kong, China
yu@se.cuhk.edu.hk

Xian Tang
Yanshan University, China
txianz@ysu.edu.cn
Efficiently answering XML keyword queries has attracted much research effort in the last decade. The key factors resulting in the inefficiency of existing methods are the common-ancestor-repetition (CAR) and visiting-useless-nodes (VUN) problems. To address the CAR problem, we propose a generic top-down processing strategy to answer a given keyword query w.r.t. LCA/SLCA/ELCA semantics. By ``top-down'', we mean that we visit all common ancestor (CA) nodes in a depth-first, left-to-right order; by ``generic'', we mean that our method is independent of the query semantics. To address the VUN problem, we propose to use child nodes, rather than descendant nodes to test the satisfiability of a node v w.r.t. the given semantics. We propose two algorithms that are based on either traditional inverted lists or our newly proposed LLists to improve the overall performance. We further propose several algorithms that are based on hash search to simplify the operation of finding CA nodes from all involved LLists. The experimental results verify the benefits of our methods according to various evaluation metrics.
201320 A Social Network-based Process-aware Task Management System Seyed Alireza Hajimirsadeghi
School of Computer Science and Engineering,
University of New South Wales, Australia
seyedh@cse.unsw.edu.au

Hye-Young Paik
School of Computer Science and Engineering,
University of New South Wales, Australia
hpaik@cse.unsw.edu.au

John Shepherd
School of Computer Science and Engineering,
University of New South Wales, Australia
jas@cse.unsw.edu.au

Anthony Kosasih
School of Computer Science and Engineering,
University of New South Wales, Australia
ajkosasih@gmail.com
In modern society, we are frequently required to perform administrative or business processes in order to achieve our personal goals. We call these kinds of ad hoc processes, carried out towards a personal goal, personal processes. For almost all of our personal goals, from applying for a research position in academia or a job in industry to organising a marriage ceremony or buying a house, we seek help from social networks. Social networks are the prevalent medium for sharing notes, documents, images, videos, etc., and they also provide the basic infrastructure for exchanging opinions. However, we believe social networks are not particularly helpful when it comes to personal process management (PPM), as there remain significant problems in discovering and integrating the sets of tasks that are typically required to achieve many useful outcomes. This is mainly because social networks do not possess an integrated and structured framework for sharing ``process knowledge''. In this paper, we propose Processbook, a social network-based management system for personal processes. A simple modelling interface based on ToDoLists is introduced to help users plan towards their goals. We describe how the system can capture a user's experience in managing their ToDoList and the associated personal process, how this information can be shared with other users, and how the system can use this information to recommend process strategies. We exemplify the approach with a sample administrative process inside the University of New South Wales.
201319 Secure Data Aggregation Technique for Wireless Sensor Networks in the Presence of Collusion Attacks Mohsen Rezvani
School of Computer Science and Engineering,
University of New South Wales, Australia
mrezvani@cse.unsw.edu.au

Aleksandar Ignjatovic
School of Computer Science and Engineering,
University of New South Wales, Australia
ignjat@cse.unsw.edu.au

Elisa Bertino
Department of Computer Science,
Purdue University
bertino@cs.purdue.edu

Sanjay Jha
School of Computer Science and Engineering,
University of New South Wales, Australia
sanjay@cse.unsw.edu.au
At present, due to the limited computational power and energy resources of sensor nodes, aggregation of data from multiple sensor nodes at the aggregating node is usually accomplished by simple methods such as averaging. However, such aggregation is known to be highly vulnerable to node compromising attacks. Since WSNs are usually unattended and without tamper-resistant hardware, they are highly susceptible to such attacks. Thus, ascertaining the trustworthiness of data and the reputation of sensor nodes has become crucially important for WSNs. As the performance of very low power processors dramatically improves and their cost is drastically reduced, future aggregator nodes will be capable of performing more sophisticated data aggregation algorithms, which will make WSNs less vulnerable to the severe impact of compromised nodes. Iterative filtering algorithms hold great promise for such a purpose. Such algorithms simultaneously aggregate data from multiple sources and provide trust assessments of these sources, usually in the form of corresponding weight factors assigned to the data provided by each source. In this paper we demonstrate that a number of existing iterative filtering algorithms, while significantly more robust against collusion attacks than simple averaging methods, are nevertheless susceptible to a novel sophisticated collusion attack that we introduce. To address this security issue, we propose an improvement for iterative filtering techniques by providing an initial approximation for such algorithms which makes them not only collusion-robust, but also more accurate and faster converging. We believe that the modified iterative filtering algorithms have great potential for deployment in future WSNs.
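For background, the sketch below shows basic iterative filtering with reciprocal-squared-error weights; the collusion-robust initial approximation proposed in the report is replaced by a plain mean, so this illustrates only the baseline scheme:

    # Basic iterative filtering: aggregate sensor readings with weights
    # inversely related to each sensor's squared error from the estimate.
    def iterative_filter(readings, n_iters=20, eps=1e-9):
        """readings[s][t] = value reported by sensor s at time t."""
        T = len(readings[0])
        est = [sum(r[t] for r in readings) / len(readings) for t in range(T)]
        for _ in range(n_iters):
            w = [1.0 / (eps + sum((r[t] - est[t]) ** 2 for t in range(T)))
                 for r in readings]
            tot = sum(w)
            est = [sum(wi * r[t] for wi, r in zip(w, readings)) / tot
                   for t in range(T)]
        return est, w   # aggregate values plus per-sensor trust weights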
201318 Nano Sensor Networks for Tailored Operation of Highly Efficient Gas-To-Liquid Fuels Catalysts Eisa Zarepour
School of Computer Science and Engineering,
University of New South Wales, Australia
ezarepour@cse.unsw.edu.au

Mahbub Hassan
School of Computer Science and Engineering,
University of New South Wales, Australia
mahbub@cse.unsw.edu.au

Chun Tung Chou
School of Computer Science and Engineering,
University of New South Wales, Australia
ctchou@cse.unsw.edu.au

Adesoji A. Adesina
School of Chemical Engineering,
University of New South Wales, Australia,
a.adesina@unsw.edu.au
Fischer-Tropsch synthesis, a major process for converting natural gas to liquid hydrocarbons (GTL), suffers from selectivity limitations. While the GTL reactor produces highly useful hydrocarbons in the form of liquid fuels such as gasoline, it also produces low-grade hydrocarbons such as methane. Selectivity refers to the ratio of highly useful hydrocarbons to the total product output. The literature is replete with catalyst formulations that seek to improve selectivity towards a specific product spectrum via, for example, molecular size or shape exclusion using zeolites, or control of the growing surface chain length with particle (site) geometry. Existing strategies for selectivity improvement, such as manipulation of reactor operating factors (temperature, pressure, etc.) and catalyst design and preparation variables, may be classified as top-down approaches. In this work, a bottom-up approach is proposed in which surface processes are controlled via a Nano Sensor Network (NSN), involving the turning on or off of elementary steps that involve undesired species, and the redirection of surface effort to the step(s) leading to the wanted products. The overall effect of this nano-level communication is superior selectivity to what was hitherto possible, by reducing the rate of Hydrogenation To Paraffin (HTP) reactions. Our numerical and simulation results reveal an exponential relationship between the reduction in the rate of HTP reactions and selectivity. They also confirm a considerable improvement in overall selectivity for a catalyst equipped with a highly reliable NSN, in comparison with extant catalyst technologies and current commercial Fischer-Tropsch reactors.
201317 "The tail wags the dog": A study of anomaly detection in commercial application performance Richard Gow
School of Computer Science and Engineering, University of New South Wales, Australia.
Email: richard.gow@iag.com.au
Srikumar Venugopal
School of Computer Science and Engineering, University of New South Wales, Australia.
Email: srikumarv@cse.unsw.edu.au
Pradeep Ray
Asia-Pacific Ubiquitous Healthcare Research Centre,
University of New South Wales, Australia
Email: p.ray@unsw.edu.au
The IT industry needs systems management models that leverage available application information to detect quality of service, scalability and health of service. Ideally, such a technique would be common to varying application types with different n-tier architectures under normal production conditions of varying load, user session traffic, transaction type, transaction mix, and hosting environment. This paper shows that a whole-of-service measurement paradigm, utilizing a black-box M/M/1 queuing model and auto-regressive curve fitting of the associated CDF, is an accurate model for characterizing system performance signatures. This modeling method is also used to detect application slow-down events. The technique was shown to work for a diverse range of workloads, ranging from 76 Tx/5min to 19,025 Tx/5min. The method did not rely on customizations specific to the n-tier architecture of the systems being analyzed, and so the performance anomaly detection technique was shown to be platform- and configuration-agnostic.
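As an illustration of the black-box M/M/1 idea, the sketch below fits the rate of an exponential CDF to a synthetic latency distribution; a sustained change in the fitted rate is the kind of signature shift that would flag a slow-down event:

    import numpy as np
    from scipy.optimize import curve_fit

    # Black-box M/M/1 view: response times are (approximately) exponential,
    # so fit the rate of an exponential CDF to the observed distribution.
    def mm1_cdf(t, rate):
        return 1.0 - np.exp(-rate * t)        # P(response time <= t)

    latencies = np.sort(np.random.exponential(scale=0.25, size=5000))
    empirical = np.arange(1, latencies.size + 1) / latencies.size
    (rate,), _ = curve_fit(mm1_cdf, latencies, empirical, p0=[1.0])
    # A falling fitted rate (a heavier latency tail) signals a slow-down.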
201316 HTTP-based Adaptive Streaming for Mobile Clients using Markov Decision Process Ayub Bokani
School of Computer Science and Engineering,
University of New South Wales, Australia
abokani@cse.unsw.edu.au

Mahbub Hassan
School of Computer Science and Engineering,
University of New South Wales, Australia
mahbub@cse.unsw.edu.au

Salil Kanhere
School of Computer Science and Engineering,
University of New South Wales, Australia
salilk@cse.unsw.edu.au
Due to its simplicity at the server side, HTTP-based adaptive streaming has become a popular choice for streaming on-line content to a wide range of user devices. In HTTP-based streaming systems, the server simply stores the video segmented into a series of small chunks coded at many different qualities and sizes, and leaves to the client the decision of which chunk to download next to achieve a high quality viewing experience. This decision making is a challenging task, especially in mobile environments, due to unexpected changes in network bandwidth as the user moves through different regions. In this paper, we consider the Markov Decision Process (MDP) as a framework to optimise three dimensions of streaming performance: picture quality, deadline misses, and the frequency of quality changes. We highlight that MDP has a high overhead arising from frequent strategy updates as a moving client attempts to learn the statistical parameters of the underlying bandwidth. We propose and evaluate three approaches to reduce MDP overhead, considering both online and offline optimisation. For online optimisation, we propose a k-chunk update approach (k-MDP) which recomputes the optimal strategy after downloading every k chunks. For offline optimisation, we propose two approaches: a single MDP strategy (s-MDP) and an x-meter MDP strategy (x-MDP). s-MDP uses the global statistics of a given region to compute an optimal MDP strategy which is used throughout the video session, while x-MDP recomputes the optimal strategy for every x meters of travel using offline statistics for each x-meter stretch of the road. We have evaluated the performance of the proposed approaches using simulation driven by real-world 3G bandwidth and vehicular mobility traces. We find that k-MDP yields a linear trade-off between performance and overhead. Interestingly, although the offline approaches have zero online computation overhead, they both outperform the online approach. The best performance is achieved with x-MDP.
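For illustration, a finite-horizon value-iteration sketch for next-chunk quality selection; the two-state bandwidth chain, rewards and penalties are hypothetical simplifications of the MDP formulated in the paper:

    # Value iteration for chunk quality selection; all numbers hypothetical.
    QUALITIES = [0, 1, 2]             # bitrate levels (low..high)
    BW = [0, 1]                       # slow / fast bandwidth regimes
    P = [[0.8, 0.2], [0.3, 0.7]]      # bandwidth transition probabilities

    def reward(q, prev_q, bw):
        miss = 1.0 if (q == 2 and bw == 0) else 0.0     # deadline-miss proxy
        return q - 2.0 * miss - 0.5 * abs(q - prev_q)   # quality/miss/switch

    def plan(horizon=20):
        V = {(pq, bw): 0.0 for pq in QUALITIES for bw in BW}
        for _ in range(horizon):
            V = {(pq, bw): max(reward(q, pq, bw) +
                               sum(P[bw][b2] * V[(q, b2)] for b2 in BW)
                               for q in QUALITIES)
                 for pq in QUALITIES for bw in BW}
        return V

    V = plan()   # the maximising q per state is the chunk quality to fetch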
201315 Making Context-Sensitive Inclusion-based Pointer Analysis Practical for Compilers Using Parameterised Summarisation Yulei Sui
School of Computer Science and Engineering,
University of New South Wales, Australia
ysui@cse.unsw.edu.au

Sen Ye
School of Computer Science and Engineering,
University of New South Wales, Australia
yes@cse.unsw.edu.au

Jingling Xue
School of Computer Science and Engineering,
University of New South Wales, Australia
jingling@cse.unsw.edu.au
Due to its high precision as a flow-insensitive pointer analysis, Andersen's analysis has been deployed in some modern optimizing compilers. To obtain improved precision, we describe how to add context sensitivity on top of Andersen's analysis. The resulting analysis, called ICON, is efficient enough to analyse large programs while being sufficiently precise to drive compiler optimisations. Its novelty lies in summarising the side effects of a procedure by using one transfer function on virtual variables that represent fully parameterised locations accessed via its formal parameters. As a result, a good balance between efficiency and precision is achieved, making ICON more powerful than a 1-callsite-sensitive analysis and less so than a call-path-sensitive analysis (when the recursion cycles in a program are collapsed in all cases). We have compared ICON with FULCRA, a state-of-the-art Andersen's analysis that is context-sensitive by acyclic call paths, in Open64 (with recursion cycles collapsed in both cases) using the 16 C/C++ benchmarks in SPEC2000 (totalling 600 KLOC) and 5 C applications (totalling 2.1 MLOC). Our results demonstrate the scalability of ICON and the lack of scalability of FULCRA. FULCRA spends over 2 hours analysing SPEC2000 and fails to run to completion within 5 hours for two of the five applications tested. In contrast, ICON spends just under 7 minutes on the 16 benchmarks in SPEC2000 and just under 26 minutes on the same two applications. For the 19 benchmarks analysable by FULCRA, ICON is nearly as accurate as FULCRA in terms of the quality of the built SSA form and the precision of the discovered alias information.
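For background, a naive fixpoint sketch of Andersen's inclusion-based analysis over the four constraint kinds; ICON's context-sensitive virtual-variable summaries are beyond this sketch:

    # Naive fixpoint version of Andersen's analysis.
    def andersen(addr_of, copy, load, store, variables):
        pts = {v: set() for v in variables}
        for p, a in addr_of:                  # p = &a
            pts[p].add(a)
        changed = True
        while changed:
            changed = False
            def include(dst, src):
                nonlocal changed
                if not pts[src] <= pts[dst]:
                    pts[dst] |= pts[src]
                    changed = True
            for p, q in copy:                 # p = q
                include(p, q)
            for p, q in load:                 # p = *q
                for x in list(pts[q]):
                    include(p, x)
            for p, q in store:                # *p = q
                for x in list(pts[p]):
                    include(x, q)
        return pts

    pts = andersen(addr_of=[("p", "a")], copy=[("q", "p")],
                   load=[], store=[], variables=["p", "q", "a"])
    # pts == {"p": {"a"}, "q": {"a"}, "a": set()}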
201314 Accelerating Inclusion-based Points-to Analysis on Heterogeneous CPU-GPU Systems Yu Su
School of Computer Science and Engineering,
University of New South Wales, Australia
ysu@cse.unsw.edu.au

Ding Ye
School of Computer Science and Engineering,
University of New South Wales, Australia
dye@cse.unsw.edu.au

Jingling Xue
School of Computer Science and Engineering,
University of New South Wales, Australia
jingling@cse.unsw.edu.au
This paper describes the first implementation of Andersen's inclusion-based pointer analysis for C programs on a heterogeneous CPU-GPU system, where both its CPU and GPU cores are used. As an important graph algorithm, Andersen's analysis is difficult to parallelise because it makes extensive modifications to the structure of the underlying graph, in a way that is highly input-dependent and statically hard to analyse. Existing parallel solutions run on either the CPU or GPU but not both, rendering the underlying computational resources underutilised and the ratios of CPU-only over GPU-only speedups for certain programs (i.e., graphs) unpredictable. We observe that a naive parallel solution of Andersen's analysis on a CPU-GPU system suffers from poor performance due to workload imbalance. We introduce a solution that is centered around a new dynamic workload distribution scheme. The novelty lies in prioritising the distribution of different types of workloads, i.e., graph-rewriting rules in Andersen's analysis, to the CPU or GPU according to the degree of each processing unit's suitability for processing them. This scheme is effective when combined with synchronisation-free execution of tasks (i.e., graph-rewriting rules) and difference propagation of points-to information between the CPU and GPU. For a set of seven C benchmarks evaluated, our CPU-GPU solution outperforms (on average) (1) the CPU-only solution by 50.6%, (2) the GPU-only solution by 78.5%, and (3) an oracle solution that behaves as the faster of (1) and (2) on every benchmark by 34.6%.
201312 Trust-based Recruitment in Multi-hop Social Participatory Sensing Haleh Amintoosi
School of Computer Science and Engineering,
University of New South Wales, Australia
haleha@cse.unsw.edu.au

Salil S. Kanhere
School of Computer Science and Engineering,
University of New South Wales, Australia
salilk@cse.unsw.edu.au
The idea of social participatory sensing provides a substrate for exploiting friendship relations to recruit a critical mass of participants willing to take part in a sensing campaign. However, selecting suitable participants who are trustworthy and provide high quality contributions is challenging. In this paper, we propose a recruitment framework for social participatory sensing. Our framework leverages multi-hop friendship relations to identify and select suitable and trustworthy participants among friends or friends of friends, and finds the most trustable paths to them. The framework also includes a suggestion component which provides a cluster of suggested friends along with the paths to them, which can be further used for recruitment or friendship establishment. Simulation results demonstrate the efficacy of our proposed recruitment framework in selecting a large number of well-suited participants and providing contributions with high overall trust, in comparison with a one-hop recruitment architecture.
201311 Efficient Recovery of Missing Events Jianmin Wang
School of Software,
Tsinghua University, China
jimwang@tsinghua.edu.cn

Shaoxu Song
School of Software,
Tsinghua University, China
sxsong@tsinghua.edu.cn

Xiaochen Zhu
School of Software,
Tsinghua University, China
zhu-xc10@mails.tsinghua.edu.cn

Xuemin Lin
School of Computer Science and Engineering,
University of New South Wales, Australia
lxue@cse.unsw.edu.au
Owing to various data entry and transmission issues caused by humans or systems, missing events often occur in event data, which record the execution logs of business processes. Without recovering these missing events, applications such as provenance analysis or complex event processing built upon event data are not reliable. Following the minimum change discipline in improving data quality, it is rational to find a recovery that minimally differs from the original data. Existing recovery approaches fall short of efficiency owing to enumerating and searching over all possible sequences of events. In this paper, we study efficient techniques for recovering missing events. According to our theoretical results, the recovery problem is proved to be NP-hard. Nevertheless, we are able to concisely represent the space of event sequences in a branching framework. Advanced indexing and pruning techniques are developed to further improve recovery efficiency. Our proposed techniques also make it possible to find top-k recoveries. The experimental results demonstrate that our minimum recovery approach achieves high accuracy, and significantly outperforms the state-of-the-art technique with up to 5 orders of magnitude improvement in time performance.
201310 Multi-Threading Processor Design for Embedded Systems Ran Zhang
School of Computer Science and Engineering,
University of New South Wales, Australia
cliffran87@hotmail.com

Hui Guo
School of Computer Science and Engineering,
University of New South Wales, Australia
huig@cse.unsw.edu.au
Multi-threading processor design enables high performance in a single processor core by exploiting both thread-level and instruction-level parallelism. This performance gain, however, comes at the cost of increased energy consumption, which is not desirable for embedded systems. This paper investigates multi-threaded designs with varied thread numbers under two different thread switching schemes: fine-grained and coarse-grained. Based on our experiments with a 6-stage PISA processor, we found that in terms of energy efficiency the coarse-grained designs are better than the fine-grained designs. For the coarse-grained design, the optimal thread number is closely related to the memory access delay: when memory access latency is small, a low-thread processor is more energy efficient than a processor with a high thread number, but as memory delay increases the high-thread processor becomes superior.
201309 Noise properties of linear molecular communication networks Chun Tung Chou
School of Computer Science and Engineering,
University of New South Wales, Australia
ctchou@cse.unsw.edu.au
Molecular communication networks consist of transmitters and receivers distributed in a fluid medium. Communication in these networks is realised by the transmitters emitting signalling molecules, which diffuse through the medium to reach the receivers. This paper investigates the properties of noise, or the variance of the receiver output, in molecular communication networks. The noise in these networks comes from multiple sources: stochastic emission of signalling molecules by the transmitters, diffusion in the fluid medium, and stochastic reaction kinetics at the receivers. We model these stochastic fluctuations by using an extension of the master equation. We show that, under certain conditions, the receiver outputs of linear molecular communication networks are Poisson distributed. The derivation also shows that noise in these networks is a nonlinear function of the network parameters and is non-additive. Numerical examples are provided to illustrate the properties of this type of Poisson channel.
201308 Efficient Probabilistic Supergraph Search over Uncertain Graphs Ke Zhu
School of Computer Science and Engineering,
University of New South Wales, Australia
kez@cse.unsw.edu.au

Gaoping Zhu
School of Computer Science and Engineering,
University of New South Wales, Australia
gzhu@cse.unsw.edu.au

Xuemin Lin
School of Computer Science and Engineering,
University of New South Wales, Australia
lxue@cse.unsw.edu.au
Given a query graph q, retrieving the data graphs g from a set D of data graphs such that q contains g, namely supergraph containment search, is fundamental in graph data analysis with a wide range of real applications. It is very challenging due to the NP-Completeness of subgraph isomorphism testing. In many applications, graph data are often uncertain for various reasons. In this paper, we study the problem of probabilistic supergraph search; that is, given a set D of uncertain data graphs, a certain query graph q and a probability threshold θ, we retrieve the data graphs gu from D such that the probability of q containing gu is not smaller than θ. We show that besides the NP-Completeness of subgraph isomorphism testing, the problem of calculating probabilities is NP-Complete; thus it is even more challenging than supergraph containment search. To tackle the computational hardness, we first propose two novel and effective pruning rules to efficiently prune non-promising data graphs. Then, efficient verification algorithms are developed with the aim of sharing computation and terminating non-promising computation as early as possible. Extensive performance studies on both real and synthetic data demonstrate the efficiency and effectiveness of our techniques in practice.
201307 Nano-scale Sensor Networks for Chemical Catalysis Eisa Zarepour
School of Computer Science and Engineering,
University of New South Wales, Australia
ezarepour@cse.unsw.edu.au

Mahbub Hassan
School of Computer Science and Engineering,
University of New South Wales, Australia
mahbub@cse.unsw.edu.au

Chun Tung Chou
School of Computer Science and Engineering,
University of New South Wales, Australia
ctchou@cse.unsw.edu.au

Adesoji A. Adesina
School of Chemical Engineering,
University of New South Wales, Australia,
a.adesina@unsw.edu.au
Following the success of conventional macro-scale wireless sensor networks, researchers are now investigating the viability of nano-scale sensor networks (NSNs), which are formed by establishing communication between devices made from nanomaterials. Due to unusual properties of nanomaterials, it is envisaged that such NSNs can support completely new types of applications beyond what could be achieved with conventional sensor networks. In this paper, we propose and investigate a novel application of NSNs in the field of chemical catalysis. More specifically, our goal is to explore an NSN architecture to improve product selectivity of Fischer-Tropsch (FT) reaction, a major class of chemical reactions for converting natural gas to liquid fuel. Given that reliable wireless communication at nano-scale is at very early stages of development, we investigate FT selectivity as a function of communication reliability of the underlying NSN. Our simulation study reveals that FT selectivity has a logarithmic dependence on NSN communication reliability in terms of packet loss probability. The logarithmic dependence implies that (1) FT selectivity improvement is sensitive to both small and large packet loss probabilities supporting tangible impact of NSN even when communication reliability is at a developing stage, (2) there is room for further improvement in FT selectivity even when NSN reliability reaches a high level, and (3) to achieve a linear improvement in FT selectivity, we need an exponential improvement in NSN reliability.
201306 Multidimensional Infinite Data in the Language Lucid Jarryd P. Beck
School of Computer Science and Engineering,
University of New South Wales, Australia
jarrydb@cse.unsw.edu.au

John Plaice
School of Computer Science and Engineering,
University of New South Wales, Australia AND
Department of Computer Science and Software Engineering,
Concordia University, Canada
plaice@cse.unsw.edu.au

William W. Wadge
Department of Computer Science,
University of Victoria, Canada
wwadge@cs.uvic.ca
Although the language Lucid was not originally intended to support computing with infinite data structures, the notion of (infinite) sequence quickly came to the fore, together with a demand-driven computation model in which demands are propagated for the values of particular variables at particular index points. This naturally generalized to sequences of multiple dimensions so that a programmer could, for example, write a program that could be understood as a (nonterminating) loop in which one of the loop variables is an infinite vector. Programmers inevitably found use for more and more dimensions, which led to a problem that is fully solved for the first time in this paper. The problem is that the implementation's cache requires some estimate of the dimensions actually used to compute a value being fetched. This estimate can be difficult or (if dimensions are passed as parameters) impossible to obtain, and the demand-driven evaluation model for Lucid breaks down. We outline the evolution of Lucid which gave rise to this problem, and outline the solution, as used in the implementation of TransLucid, the latest descendant of Lucid.
201305 Iterative Security Risk Analysis for Network Flows Based on Provenance and Interdependency Mohsen Rezvani
School of Computer Science and Engineering,
University of New South Wales, Australia
mrezvani@cse.unsw.edu.au

Aleksandar Ignjatovic
School of Computer Science and Engineering,
University of New South Wales, Australia
ignjat@cse.unsw.edu.au

Sanjay Jha
School of Computer Science and Engineering,
University of New South Wales, Australia
sanjay@cse.unsw.edu.au
Discovering high risk network flows and hosts in a high throughput network is a challenging task in network monitoring. Emerging complicated attack scenarios such as DDoS attacks increase the complexity of tracking malicious and high risk network activities within the huge number of monitored network flows. To address this problem, we propose an iterative framework for assessing risk scores of hosts and network flows. To obtain the risk scores of flows, we take into account two properties: flow attributes and flow provenance. Our iterative risk assessment also measures the risk scores of hosts and flows based on an interdependency property, whereby the risk score of a flow influences the risk of its source and destination hosts, and the risk score of a host is evaluated from the risk scores of flows initiated by or terminated at the host. Moreover, the update mechanism in our framework allows flows to keep streaming into the system while our risk assessment method performs online monitoring. The experimental results show that our approach is effective in detecting high risk hosts and flows, and is sufficiently efficient to be deployed in high throughput networks, compared to other algorithms.
201304 Providing Trustworthy Contributions via a Reputation Framework in Social Participatory Sensing Systems Haleh Amintoosi
School of Computer Science and Engineering,
University of New South Wales, Australia
haleha@cse.unsw.edu.au

Salil S. Kanhere
School of Computer Science and Engineering,
University of New South Wales, Australia
salilk@cse.unsw.edu.au
Social participatory sensing is a newly proposed paradigm that tries to address the limitations of participatory sensing by leveraging online social networks as an infrastructure. A critical issue for the success of this paradigm is assuring the trustworthiness of the contributions provided by participants. In this paper, we propose an application-agnostic reputation framework for social participatory sensing systems. Our framework considers both the quality of a contribution and the trustworthiness level of the participant within the social network. These two aspects are combined via a fuzzy inference system to arrive at a final trust rating for a contribution. A reputation score is also calculated for each participant as a result of the trust ratings assigned to his or her contributions. We adopt the PageRank algorithm as the building block for our reputation module. Extensive simulations demonstrate the efficacy of our framework in achieving high overall trust and assigning accurate reputation scores.
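As an illustration of the PageRank building block (graph data and damping value are hypothetical):

    import networkx as nx

    # PageRank as the reputation building block: participants are nodes and
    # weighted edges carry trust ratings of contributions.
    G = nx.DiGraph()
    G.add_weighted_edges_from([
        ("alice", "bob", 0.9),    # alice rated a contribution by bob 0.9
        ("carol", "bob", 0.8),
        ("bob", "carol", 0.4),
    ])
    reputation = nx.pagerank(G, alpha=0.85, weight="weight")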
201303 Human Activity Recognition for Indoor Positioning using Smartphone Accelerometer Sara Khalifa
School of Computer Science and Engineering,
University of New South Wales, Australia
National ICT Australia, Locked Bag 9013, Alexandria, NSW 1435, Australia
sarak@cse.unsw.edu.au and sara.khalifa@nicta.com.au

Mahbub Hassan
School of Computer Science and Engineering,
University of New South Wales, Australia
National ICT Australia, Locked Bag 9013, Alexandria, NSW 1435, Australia
mahbub@cse.unsw.edu.au and mahbub.hassan@nicta.com.au

Aruna Seneviratne
School of Electrical and Telecommunication Engineering
University of New South Wales, Australia
National ICT Australia, Locked Bag 9013, Alexandria, NSW 1435, Australia
a.seneviratne@unsw.edu.au and aruna.seneviratne@nicta.com.au
With indoor maps showing facility locations, the activity context of the user, such as riding an escalator, could be used to determine user position in the map without any external aid. Human activity recognition (HAR), therefore, could become a potential aid for indoor positioning. In this paper, we propose to use the smartphone accelerometer for HAR of two key indoor positioning activities, riding an escalator (E) and riding a lift (L). However, since users do not actually perform any specific physical activity during E and L (they typically stand still in the escalator or lift), HAR of these two activities is a challenging problem. We conjecture that the smartphone accelerometer would capture the characteristic vibrations of escalators and lifts, making it possible to distinguish them from each other with reasonable accuracy. We collect a total of 177 accelerometer traces from different individuals riding different lifts and escalators in different indoor complexes under natural conditions, and apply different combinations of noise filtering, feature selection, and classification algorithms to these traces. We find that using only the raw accelerometer data, the E and L activities can be recognized with 90% accuracy, but a simple moving average filter would increase the accuracy to 97%. We, however, discover that a third indoor activity, standing still on the floor (S), which could be confused with E and L, reduces recognition accuracy noticeably from 97% to 94% for the filtered data. An interesting finding is that the moving average filter leads to simpler features for classification, which may ultimately compensate for any increase in HAR overhead due to filtering.
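For illustration, a sketch of the filtering and feature-extraction steps described above; the window length, feature set and sample trace are illustrative choices:

    import statistics

    # Moving-average filtering followed by simple window features.
    def moving_average(signal, w=5):
        return [sum(signal[i:i + w]) / w for i in range(len(signal) - w + 1)]

    def window_features(window):
        return {"mean": statistics.mean(window),
                "std": statistics.pstdev(window),
                "range": max(window) - min(window)}

    # Synthetic accelerometer magnitudes (m/s^2) for one window:
    raw = [9.7, 9.9, 10.4, 9.6, 9.8, 10.1, 9.5, 10.2, 9.9, 9.8, 10.0, 9.7]
    feats = window_features(moving_average(raw))
    # feats would feed a standard classifier labelling the window as
    # escalator (E), lift (L) or standing still (S).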
201302 Can Mobile-to-Mobile Browser Cache Cooperation Reduce Energy Consumption of Internet Access? Abdul Alim Abd Karim
School of Computer Science and Engineering,
University of New South Wales, Australia
aaak656@cse.unsw.edu.au

Ajay Sharma
School of Computer Science and Engineering,
University of New South Wales, Australia
ajays@cse.unsw.edu.au

Mahbub Hassan
School of Computer Science and Engineering,
University of New South Wales, Australia
mahbub@cse.unsw.edu.au

Aruna Seneviratne
University of New South Wales, Australia
a.seneviratne@unsw.edu.au
This paper investigates the possibility of reducing the 3G browsing energy consumption of a mobile device by opportunistically fetching browser cache contents of a nearby device over a low-energy Bluetooth connection. By analysing a generic model of component-based web pages, we show that energy reduction depends on the fraction of dynamic components of a loaded page (the dynamic page factor), which cannot be cached effectively. We find that energy reduction for a given page is possible only if the dynamic page factor is below a given threshold. We have designed and implemented an Android-based cooperative browser caching prototype and collected energy data in two different locations for 16 popular pages whose dynamic page factors ranged from zero to 0.6. Our experimental results confirm that cooperative browsing reduces energy cost only if the dynamic page factor is below a threshold, and that this threshold is very small (about 0.03). This finding suggests that cooperative browsing may be counterproductive if browsing has a bias toward pages with large dynamic page factors, and vice versa. For unbiased browsing, however, we find that short-range cache cooperation can reduce 3G browsing energy consumption by 13%. Finally, we propose a dynamic decision-making algorithm that switches between non-cooperative and cooperative browsing adaptively based on the dynamic page factor. For our empirical data, the proposed algorithm could potentially achieve a 17% reduction in browsing energy consumption.
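A minimal sketch of the switching idea, using the roughly 0.03 threshold reported above; the component representation and function names are hypothetical.

```python
# Illustrative sketch: cooperate over Bluetooth only when the page's
# dynamic page factor is below the empirically observed threshold.

DYNAMIC_FACTOR_THRESHOLD = 0.03  # approximate threshold from the report

def dynamic_page_factor(components):
    """Fraction of page bytes that are dynamic (uncacheable).
    `components` is a list of (size_bytes, is_dynamic) pairs."""
    total = sum(size for size, _ in components)
    dynamic = sum(size for size, is_dyn in components if is_dyn)
    return dynamic / total if total else 1.0

def choose_fetch_mode(components):
    """Return 'cooperative' (peer cache over Bluetooth) or '3g'."""
    if dynamic_page_factor(components) < DYNAMIC_FACTOR_THRESHOLD:
        return "cooperative"
    return "3g"

page = [(120_000, False), (3_000, True)]  # mostly static page
print(choose_fetch_mode(page))            # -> 'cooperative'
```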
201228 A Framework and a Language for Analyzing Cross-Cutting Aspects in Ad-hoc Processes Seyed-Mehdi-Reza Beheshti
School of Computer Science and Engineering,
University of New South Wales, Australia
sbeheshti@cse.unsw.edu.au

Boualem Benatallah
School of Computer Science and Engineering,
University of New South Wales, Australia
boualem@cse.unsw.edu.au

Hamid Reza Motahari-Nezhad
HP Labs Palo Alto
CA 94304, USA
hamid.motahari@hp.com
Ad-hoc processes have a flexible underlying process definition. The semi-structured nature of ad-hoc process data requires organizing process entities, people and artifacts, and the relationships among them, in graphs. The structure of process graphs, describing how the graph is wired, helps in understanding, predicting and optimizing the behavior of dynamic processes. In many cases, however, process artifacts evolve over time as they pass through the business's operations. Consequently, identifying the interactions among people and artifacts over time becomes challenging and requires analyzing the cross-cutting aspects of process artifacts: process artifacts, like code, have cross-cutting aspects such as versioning and provenance. Analyzing these aspects will expose many hidden interactions among entities in process graphs. We present a framework, simple abstractions and a language for analyzing cross-cutting aspects in ad-hoc processes. We introduce two concepts: timed folders, to represent the evolution of artifacts over time, and activity paths, to represent the process which led to artifacts. The approaches presented in this paper have been implemented on top of FPSPARQL, a Folder-Path enabled extension of SPARQL, and experimentally validated.
201227 Joint Channel and Delay Aware User Scheduling for Multiuser MIMO system over LTE-A Network Jayeta Biswas
School of Computer Science and Engineering,
University of New South Wales, Australia
jbiswas@cse.unsw.edu.au

Ren Ping Liu
ICT Centre, CSIRO, Australia
ren.liu@csiro.au

Wei Ni
ICT Centre, CSIRO, Australia
wei.ni@csiro.au

Iain B. Collings
ICT Centre, CSIRO, Australia
iain.collings@csiro.au

Sanjay K. Jha
School of Computer Science and Engineering,
University of New South Wales, Australia
sanjay@cse.unsw.edu.au
The increasing popularity of mobile video applications has generated huge demand for higher throughput and better quality of service (QoS) in wireless networks. Multiuser multiple input multiple output (MU-MIMO) is the most promising technology to boost throughput in Long Term Evolution - Advanced (LTE-A) networks. However, most existing LTE-A proposals achieve the MU-MIMO capacity gain by selecting orthogonal user sets without considering QoS requirements, such as delay. Such user selection may seriously degrade the QoS performance of delay-sensitive applications, such as mobile video. In this paper, we propose a joint channel- and delay-aware user scheduling algorithm, which selects users based on both channel conditions and weighted delay. The weight factor is designed to capture the historic rate and maximum coding rate of a video stream. Simulation results show that the proposed algorithm reduces the average delay by up to 30% with a marginal 2% sacrifice of throughput, compared to previous work. It also reduces delay variations and improves fairness among the users.
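The following sketch illustrates one plausible form of a joint channel- and delay-aware score; the exact weight design in the report is not reproduced, so the formula, names and numbers here are assumptions.

```python
# Illustrative sketch only: score each user by channel rate times a
# delay term, where the delay weight grows as the user's served rate
# lags the video's maximum coding rate. Not the report's formula.

def user_score(channel_rate, head_of_line_delay, historic_rate, max_coding_rate):
    """Higher score = scheduled first."""
    weight = max_coding_rate / max(historic_rate, 1e-9)
    return channel_rate * weight * head_of_line_delay

users = {
    "u1": dict(channel_rate=12e6, head_of_line_delay=0.040,
               historic_rate=2e6, max_coding_rate=4e6),
    "u2": dict(channel_rate=20e6, head_of_line_delay=0.005,
               historic_rate=4e6, max_coding_rate=4e6),
}
scheduled = max(users, key=lambda u: user_score(**users[u]))
print(scheduled)  # delay-starved u1 wins despite the weaker channel
```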
201226 Towards Automatic Undo for Cloud Management via AI Planning Ingo Weber (1,2)
Hiroshi Wada (1,2)
Alan Fekete (1,3)
Anna Liu (1,2)
Len Bass (1,2)

1 NICTA, Sydney, Australia
(firstname.lastname)@nicta.com.au

2 School of Computer Science and Engineering,
University of New South Wales, Australia

3 School of Information Technologies
University of Sydney, Australia
The facility to undo a collection of changes, reverting to a previous acceptable state, is widely recognised as valuable support for dependability. In this paper, we consider the particular needs of the user of cloud computing resources, who wishes to manage the resources available to them, for example by installing and configuring virtual machine instances. The restricted management interface provided to the outside by a cloud platform prevents the usual rollback techniques: there is no command to save a complex state or configuration, and each action may have non-obvious constraints and side-effects. We propose an approach which is based on an abstract model of the effects of each available operation. Using this model, those forward operations that are truly irrevocable are replaced by alternatives (such as 'pseudo-delete'), and an AI planning technique automatically creates an inverse workflow to take the system back to the desired earlier state, using the available operations. We demonstrate the feasibility and scalability of the approach.
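To illustrate the planning idea, here is a minimal sketch: operations are modelled abstractly by preconditions and effects, and a breadth-first search finds a workflow back to a desired earlier state. The launch/pseudo-delete/restore operation set is an invented toy example, not the paper's cloud model.

```python
# Illustrative sketch: search for an inverse workflow over an abstract
# operation model. States are sets of facts; operations are invented.

from collections import deque

OPS = {
    # name: (precondition, effect) over a frozenset-of-facts state
    "launch_vm":        (frozenset(),              {"add": {"vm"}}),
    "pseudo_delete_vm": (frozenset({"vm"}),        {"add": {"vm_hidden"},
                                                    "remove": {"vm"}}),
    "restore_vm":       (frozenset({"vm_hidden"}), {"add": {"vm"},
                                                    "remove": {"vm_hidden"}}),
}

def apply_op(state, op):
    """Apply an operation if its precondition holds, else return None."""
    pre, eff = OPS[op]
    if not pre <= state:
        return None
    return frozenset((state - eff.get("remove", set())) | eff.get("add", set()))

def plan(start, goal):
    """Breadth-first search for an operation sequence from start to goal."""
    queue, seen = deque([(start, [])]), {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for op in OPS:
            s2 = apply_op(state, op)
            if s2 is not None and s2 not in seen:
                seen.add(s2)
                queue.append((s2, path + [op]))
    return None

# Undo a pseudo-delete: find a workflow back to the state with the VM.
print(plan(frozenset({"vm_hidden"}), frozenset({"vm"})))  # ['restore_vm']
```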
201225 A Space-Efficient Indexing Algorithm for Boolean Query Processing Jianbin Qin
School of Computer Science and Engineering,
University of New South Wales, Australia
jqin@cse.unsw.edu.au

Chuan Xiao
Nagoya University, Japan
chuanx@itc.nagoya-u.ac.jp

Wei Wang
School of Computer Science and Engineering,
University of New South Wales, Australia
weiw@cse.unsw.edu.au

Xuemin Lin
School of Computer Science and Engineering,
University of New South Wales, Australia
lxue@cse.unsw.edu.au
Boolean search is the most common information retrieval technique; it requires users to type the exact words they are looking for into a text field. Existing algorithms use inverted list indexing schemes to answer Boolean search queries. These approaches incur a large index size, as the inverted lists are highly overlapping and redundant. In this paper, we propose a novel approach that reduces the size of the inverted lists while retaining time efficiency. Our solution is based on merging inverted lists that bear high overlap with each other. A new algorithm is designed to discover overlapping inverted lists and construct a condensed index for a given dataset. We conduct extensive experiments on several publicly available datasets. The proposed algorithm delivers a considerable improvement in space usage while exhibiting only tolerable time penalties.
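A minimal sketch of the merging idea: lists with high pairwise overlap share one physical posting list, and queries verify the few extra candidates this introduces. The greedy strategy and the 0.8 threshold are assumptions, not the paper's algorithm.

```python
# Illustrative sketch: greedily merge inverted lists with high Jaccard
# overlap to shrink the index. Queries on a merged term then return a
# superset of the true postings and require a verification step.

def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def condense(index, threshold=0.8):
    """Returns (merged posting lists, mapping term -> merged-list id)."""
    lists, term_of = [], {}
    for term, postings in index.items():
        for i, merged in enumerate(lists):
            if jaccard(postings, merged) >= threshold:
                merged.update(postings)       # share one physical list
                term_of[term] = i
                break
        else:
            term_of[term] = len(lists)
            lists.append(set(postings))
    return lists, term_of

index = {"data": [1, 2, 3, 4], "database": [1, 2, 3, 4, 5], "xml": [7, 8]}
lists, term_of = condense(index)
print(len(lists), term_of)   # 'data' and 'database' share a list
```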
201224 Loyalty-based Retrieval of Objects That Satisfy Criteria Persistently Zhitao Shen
School of Computer Science and Engineering,
University of New South Wales, Australia
shenz@cse.unsw.edu.au

Muhammad Aamir Cheema
School of Computer Science and Engineering,
University of New South Wales, Australia
macheema@cse.unsw.edu.au

Xuemin Lin
School of Computer Science and Engineering,
University of New South Wales, Australia
lxue@cse.unsw.edu.au
A traditional query returns a set of objects that satisfy user-defined criteria at the time the query was issued. The results are based on the values of the objects at query time and may be affected by outliers. Intuitively, an object better meets the user's needs if it persistently satisfies the criteria, i.e., it satisfies the criteria for the majority of the time in the past T time units. In this paper, we propose a measure named loyalty that reflects how persistently an object satisfies the criteria. Formally, the loyalty of an object is the total time (in the past T time units) for which it satisfies the query criteria. We study top-k loyalty queries over sliding windows, which continuously report the k objects with the highest loyalties. Each object issues an update when it starts or stops satisfying the criteria. We show that the lower bound cost of updating the results of a top-k loyalty query is O(log N) per object update, where N is the number of updates issued in the last T time units. We conduct a detailed complexity analysis and show that our proposed algorithm is optimal. Moreover, effective pruning techniques are proposed that not only reduce the communication cost of the system but also improve efficiency. We experimentally verify the effectiveness of the proposed approach by comparing it with a classic sweep line algorithm.
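The loyalty measure itself is easy to state in code. The sketch below computes, from an object's update stream, its total satisfying time within the last T time units; the event representation is an assumption.

```python
# Illustrative sketch of the loyalty measure: total time within
# [now - T, now] during which an object satisfied the query criteria.
# `events` is the object's update stream: (timestamp, now_satisfies).

def loyalty(events, now, T):
    window_start = now - T
    total, sat_since = 0.0, None
    for t, satisfies in sorted(events):
        t = max(t, window_start)              # clip to the window
        if satisfies and sat_since is None:
            sat_since = t
        elif not satisfies and sat_since is not None:
            total += max(0.0, t - sat_since)
            sat_since = None
    if sat_since is not None:                 # still satisfying at `now`
        total += now - max(sat_since, window_start)
    return total

# Object satisfied the criteria during [2, 5] and [8, now], with T = 10.
events = [(2, True), (5, False), (8, True)]
print(loyalty(events, now=12, T=10))          # -> 7.0
```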
201223 ExCaD: Exploring Last-level Cache to Improve DRAM Energy Efficiency Su Myat Min
School of Computer Science and Engineering,
University of New South Wales, Australia
sumyatmins@cse.unsw.edu.au

Haris Javaid
School of Computer Science and Engineering,
University of New South Wales, Australia
harisj@cse.unsw.edu.au

Sri Parameswaran
School of Computer Science and Engineering,
University of New South Wales, Australia
sridevan@cse.unsw.edu.au
Embedded systems with high energy consumption often exploit the idleness of DRAM to reduce energy consumption by putting the DRAM into its deepest low-power mode (self-refresh power-down mode) during idle periods. DRAM idle periods heavily depend on the last-level cache, and in this paper we propose the exploration of last-level cache configurations to improve DRAM energy efficiency, by selecting the configuration which maximally reduces the total energy consumption of the last-level cache and DRAM. To facilitate fast exploration, we propose a novel, simple yet high-fidelity DRAM energy reduction estimator. Our framework, ExCaD, combines the estimator with a cache simulator and a novel cache profile transformation technique to avoid slow cycle-accurate processor-memory simulations. Our experiments with eight different applications from mediabench, two different DRAM sizes and 330 last-level cache configurations show that the cache configuration selected by ExCaD reduces DRAM energy consumption by at least 96% and 34% over systems without a last-level cache and with the largest last-level cache, respectively. Use of the self-refresh power-down mode saved at least 93% more DRAM energy compared to a system where it was not used. These results indicate that a suitable last-level cache configuration with self-refresh power-down mode can significantly improve DRAM energy efficiency. ExCaD took only a few hours to explore the last-level cache configurations, compared to several days for cycle-accurate processor-memory simulations.
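As a back-of-the-envelope illustration of why the last-level cache shapes DRAM energy, the sketch below charges active energy per miss and lets long idle gaps drop into self-refresh; every power number and the power-down threshold are invented for illustration and are not the ExCaD estimator.

```python
# Illustrative toy model: misses keep the DRAM active, while long idle
# gaps can be spent in self-refresh power-down. All constants invented.

P_ACTIVE, P_IDLE, P_SELF_REFRESH = 300e-3, 90e-3, 8e-3   # watts (assumed)
T_ACCESS, T_THRESHOLD = 60e-9, 2e-6   # access time; min gap worth powering down

def dram_energy(miss_times, use_self_refresh=True):
    """Energy (joules) for servicing DRAM accesses at the given times."""
    energy, prev_end = 0.0, 0.0
    for t in sorted(miss_times):
        gap = t - prev_end
        if use_self_refresh and gap > T_THRESHOLD:
            energy += gap * P_SELF_REFRESH    # long gap: power down
        else:
            energy += gap * P_IDLE            # short gap: stay idle
        energy += T_ACCESS * P_ACTIVE         # service the miss
        prev_end = t + T_ACCESS
    return energy

# A bigger last-level cache -> fewer, sparser misses -> more self-refresh.
dense  = [i * 1e-6 for i in range(1000)]   # miss every 1 us
sparse = [i * 1e-4 for i in range(10)]     # miss every 100 us
print(dram_energy(dense), dram_energy(sparse))
```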
201222 Reconfigurable Pipelined Coprocessor for Multi-mode Communication Application Liang Tang
School of Computer Science and Engineering,
University of New South Wales, Australia
liangt@cse.unsw.edu.au

Jude Angelo Ambrose
School of Computer Science and Engineering,
University of New South Wales, Australia
ajangelo@cse.unsw.edu.au

Sri Parameswaran
School of Computer Science and Engineering,
University of New South Wales, Australia
sridevan@cse.unsw.edu.au
The need to integrate multiple wireless communication protocols into a single low-cost, low-power hardware platform is prompted by the increasing number of emerging communication protocols and applications. This paper presents an efficient design methodology for integrating multiple wireless communication baseband protocols in a pipelined coprocessor which can be programmed to support various baseband protocols. This coprocessor can dynamically select the suitable pipeline stages for each baseband protocol. Moreover, each carefully designed stage is able to perform a certain signal processing function in a reconfigurable fashion. The proposed method is flexible (compared to ASICs) and suitable for mobile applications (compared to FPGAs). The area of the coprocessor is smaller than an ASIC or FPGA implementation of multiple individual protocols, while the overheads of timing delay (40% worse than ASICs and 30% better than FPGAs) and power consumption (6X worse than ASICs, 100X better than FPGAs on average) are kept within reasonable levels. Moreover, fast protocol switching is supported. Wireless LAN (WLAN) 802.11a, WLAN 802.11b and Ultra Wide Band (UWB) transmission circuits are developed and mapped to the pipelined coprocessor to prove the efficacy of our proposal.
201221 Automating Form-based Processes through Annotation Sung Wook Kim
School of Computer Science and Engineering,
University of New South Wales, Australia
skim@cse.unsw.edu.au

Hye-Young Paik
School of Computer Science and Engineering,
University of New South Wales, Australia
hpaik@cse.unsw.edu.au

Ingo Weber
School of Computer Science and Engineering,
University of New South Wales, Australia
ingo.weber@cse.unsw.edu.au
Despite all efforts to support processes through IT, processes based on paper forms are still prevalent. While they are easy to create, using paper-based forms puts a burden of tedious manual work on the end users. Automating these processes often requires heavy customisation of commercial software tools, and building new systems to replace form-based processes may not be cost-effective. In this paper, we propose a pragmatic approach to enable end users to automate form-based processes. The approach builds on several types of annotations: to help collect and distribute information for form fields; to choose appropriate process execution paths; and to support email distribution or approval for filled forms. We implemented the approach in a prototype, called EzyForms. On this basis we conducted a user study with 15 participants, showing that people with little technical background were able to automate the existing form-based processes efficiently.
201220 Detecting, Representing and Querying Collusion in Online Rating Systems Mohammad Allahbakhsh
School of Computer Science and Engineering,
University of New South Wales, Australia
mallahbakhsh@cse.unsw.edu.au

Aleksandar Ignjatovic
School of Computer Science and Engineering,
University of New South Wales, Australia
ignjat@cse.unsw.edu.au

Boualem Benatallah
School of Computer Science and Engineering,
University of New South Wales, Australia
boualem@cse.unsw.edu.au

Seyed-Mehdi-Reza Beheshti
School of Computer Science and Engineering,
University of New South Wales, Australia
sbeheshti@cse.unsw.edu.au

Norman Foo
School of Computer Science and Engineering,
University of New South Wales, Australia
norman@cse.unsw.edu.au

Elisa Bertino
Purdue University
West Lafayette, Indiana, USA
bertino@cs.purdue.edu
Online rating systems are subject to malicious behaviors, mainly through the posting of unfair rating scores. Users may try to individually or collaboratively promote or demote a product. Collaborative unfair rating, known as collusion, is more damaging than individual unfair rating. Although collusion detection in general has been widely studied, identifying collusion groups in online rating systems is less studied and needs more investigation. In this paper, we study the impact of collusion in online rating systems and assess their susceptibility to collusion attacks. The proposed model uses a frequent itemset mining algorithm to detect candidate collusion groups. Then, several indicators are used for identifying collusion groups and for estimating how damaging such colluding groups might be. We also propose an algorithm for finding possible collusive subgroups inside larger groups which are not identified as collusive. The model has been implemented and we present the results of an experimental evaluation of our methodology.
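To illustrate the first stage, here is a minimal frequent-itemset sketch over product-to-raters transactions: groups of reviewers who co-rate suspiciously many of the same products become collusion candidates. The support threshold and data are invented; a real system would follow up with the collusion indicators described above.

```python
# Illustrative sketch: frequent itemset mining over (product -> raters)
# transactions to find candidate collusion groups. Toy enumeration only;
# a real system would use Apriori/FP-growth plus collusion indicators.

from itertools import combinations

def frequent_rater_groups(ratings, min_support=2, max_size=3):
    """ratings: dict product -> set of rater ids. Returns groups of
    raters (size >= 2) that co-rate at least `min_support` products."""
    groups = {}
    for raters in ratings.values():
        for size in range(2, min(max_size, len(raters)) + 1):
            for group in combinations(sorted(raters), size):
                groups[group] = groups.get(group, 0) + 1
    return {g: n for g, n in groups.items() if n >= min_support}

ratings = {
    "p1": {"a", "b", "c"},
    "p2": {"a", "b", "d"},
    "p3": {"a", "b"},
}
print(frequent_rater_groups(ratings))  # ('a', 'b') co-rate 3 products
```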
201218 DIMSim: A Rapid Two-level Cache Simulation Approach for Deadline-based MPSoCs Mohammad Shihabul Haque
National University of Singapore, Singapore
email: elemsh@nus.edu.sg

Roshan Ragel
University of Peradeniya, Sri Lanka
email: ragelrg@gmail.com

Angelo Ambrose
University of New South Wales, Australia
email: ajangelo@cse.unsw.edu.au

S. Radhakrishnan
University of Peradeniya, Sri Lanka
email: swarna.radhakrishnan@gmail.com

Sri Parameswaran
University of New South Wales, Australia
email: sridevan@cse.unsw.edu.au
It is of critical importance to satisfy deadline requirements for an embedded application to avoid undesired outcomes. Multiprocessor System-on-Chips (MPSoCs) play a vital role in contemporary embedded devices to satisfy timing deadlines. Such MPSoCs include two-level cache hierarchies which have to be dimensioned carefully to support the timing deadlines of the application(s) while consuming minimum area and therefore minimum power. Given the deadline of an application, it is possible to systematically derive the maximum time that could be spent on memory accesses, which can then be used to dimension suitable cache sizes. As the dimensioning has to be done rapidly to satisfy time-to-market requirements, we choose a well-acclaimed rapid cache simulation strategy, single-pass trace-driven simulation, for estimating the cache dimensions. For the first time, we address the two main challenges, coherency and scalability, in adapting a single-pass simulator to an MPSoC with a two-level cache hierarchy. The challenges are addressed through a modular bottom-up simulation technique where L1 and L2 simulations are handled in independent communicating modules. In this paper, we present how the dimensioning is performed for a two-level inclusive data cache hierarchy in an MPSoC. With the proposed rapid simulation, estimates are produced within an hour (worst case on the application benchmarks considered). We evaluated our approach with task-based MPSoC implementations of JPEG and H264 benchmarks and achieved timing deviations of 16.1% and 7.2% respectively, on average, against the requested data access times. The deviation numbers are always positive, meaning that our simulator guarantees the requested data access time is satisfied. In addition, we generated a set of synthetic memory traces and used them to extensively analyse our simulator. For the synthetic traces, our simulator provides cache sizes that always guarantee the requested data access time, deviating below 14.5% on average.
201215 Higher-order Multidimensional Programming John Plaice and Jarryd P. Beck
School of Computer Science and Engineering,
University of New South Wales, Australia
{plaice,jarrydb}@cse.unsw.edu.au
We present a higher-order functional language in which variables define arbitrary-dimensional entities, where any atomic value may be used as a dimension, and a multidimensional runtime context is used to index the variables. We give an intuitive presentation of the language, present the denotational semantics, and demonstrate how function applications over these potentially infinite data structures can be transformed into manipulations of the runtime context. At the core of the design of functions is the intension abstraction, a parameterless function whose body is evaluated with respect to the context in which it is used and to part of the context in which it is created. The multidimensional space can be used for both programming and implementation purposes. At the programming level, the informal presentation of the language gives many examples showing the utility of describing common computing entities as infinite multidimensional data structures. At the implementation level, the main technical part of the paper demonstrates that the higher-order functions over infinite data structures---even ones that are curried---can be statically transformed into equivalent functions directly manipulating the context, thereby replacing closures over parts of the environment by closures over parts of the context.
201214 Online Analytical Processing on Graphs (GOLAP): Model and Query Language Seyed-Mehdi-Reza Beheshti
School of Computer Science and Engineering,
University of New South Wales, Australia
sbeheshti@cse.unsw.edu.au

Boualem Benatallah
School of Computer Science and Engineering,
University of New South Wales, Australia
boualem@cse.unsw.edu.au

Hamid Reza Motahari-Nezhad
HP Labs Palo Alto
CA 94304, USA
hamid.motahari@hp.com

Mohammad Allahbakhsh
School of Computer Science and Engineering,
University of New South Wales, Australia
mallahbakhsh@cse.unsw.edu.au
Graphs are essential modeling and analytical objects for representing information networks, such as Web, social, and crowdsourcing graphs, which may involve millions of nodes and relationships. In recent years, a new stream of work has focused on On-Line Analytical Processing (OLAP) on graphs. Although this line of work took the first step toward putting graphs in a rigid multi-dimensional and multi-level framework, none of it provides a semantic-driven framework and a language to support n-dimensional computations on graphs. The major challenge here is how to extend decision support on multidimensional networks considering both data objects and the relationships among them. Moreover, one of the critical deficiencies of graph query languages, e.g. SPARQL, is the lack of support for n-dimensional computations, which are frequent in OLAP environments. Traditional OLAP technologies were conceived to support multidimensional analysis; however, they cannot recognize patterns among graph entities, and analyzing multidimensional graph data (from multiple perspectives and granularities) may become complex and cumbersome. In this paper, we present a framework, simple abstractions and a language to apply OLAP-style queries on graphs in an explorative manner and from various user perspectives. We redefine OLAP data elements (e.g., dimensions, measures, and cubes) by considering the relationships among graph entities as first-class objects. We have implemented the approach on top of FPSPARQL, a Folder-Path enabled extension of SPARQL. The evaluation shows the viability and efficiency of our approach.
201213 Strategic and Epistemic Reasoning for the Game Description Language GDL-II: The Technical Report Ji Ruan
School of Computer Science and Engineering,
University of New South Wales, Australia
jiruan@cse.unsw.edu.au

Michael Thielscher
School of Computer Science and Engineering,
University of New South Wales, Australia
mit@cse.unsw.edu.au
The game description language GDL has been developed as a logic-based formalism for representing the rules of arbitrary games in general game playing. A recent language extension called GDL-II allows the description of nondeterministic games with any number of players who may have incomplete, asymmetric information. In this paper, we show how the well-known Alternating-time Temporal Epistemic Logic (ATEL) can be adapted for strategic and epistemic reasoning about general games described in GDL-II. We provide a semantic characterisation of GDL-II descriptions in terms of ATEL models. We also provide a syntactic translation of GDL-II descriptions into ATEL formulas, and we prove that these two characterisations are equivalent. We show that model checking in this setting is decidable by giving an algorithm, and we demonstrate how our results can be used to verify strategic and epistemic properties of games described in GDL-II.
201212 ServiceBase: A Web 2.0 Service-Oriented Backend Programming Platform Moshe Chai Barukh
School of Computer Science and Engineering,
University of New South Wales, Australia
mosheb@cse.unsw.edu.au

Boualem Benatallah
School of Computer Science and Engineering,
University of New South Wales, Australia
boualem@cse.unsw.edu.au
With the growing success of modern web technology and the service-oriented paradigm, the Internet continues to flourish with a large and interesting plethora of web services. What started as the supply of simple reusable application functions has rapidly evolved to far more powerful service-enabled technologies, such as Software-, Platform- and Infrastructure-as-a-Service. While these services may expose their functionality in the form of an API, integrating them into everyday application development often remains a manual, complex and repetitive task: it demands technical knowledge to connect to services, as well as the ability to interpret and manipulate run-time data to drive the application logic. More importantly, applications need to be constantly maintained to keep up with the ongoing evolution of the service providers' APIs. In this paper, we propose to address these challenges by abstracting and simplifying access to web services, so that a common programming interface can be used to search, explore and also interact with services. A framework is also proposed for decomposing and mapping raw service messages into more common data constructs, thus further simplifying the interpretation, manipulation and chaining of services despite their underlying heterogeneity. Furthermore, unlike existing systems, we implement a Web2.0-oriented ServiceBus platform that fosters a community between service curators, service consumers and end-users. In this manner, knowledge about services can be incrementally registered, shared and reused by distributed application developers.
201211 Reliable Communications in Aerial Sensor Networks by Using A Hybrid Antenna Kai Li
School of Computer Science and Engineering,
University of New South Wales, Australia
kail@cse.unsw.edu.au

Nadeem Ahmed
School of Computer Science and Engineering,
University of New South Wales, Australia
nahmed@cse.unsw.edu.au

Salil S. Kanhere
School of Computer Science and Engineering,
University of New South Wales, Australia
salilk@cse.unsw.edu.au

Sanjay Jha
School of Computer Science and Engineering,
University of New South Wales, Australia
sanjay@cse.unsw.edu.au
An Aerial Wireless Sensor Network (AWSN), composed of bird-sized Unmanned Aerial Vehicles (UAVs) equipped with sensors and wireless radios, enables low-cost, high-granularity, three-dimensional sensing of the physical world. The sensed data is relayed in real time over a multi-hop wireless communication network to ground stations. The following characteristics of an AWSN make effective multi-hop communication challenging: (i) frequent link disconnections due to the inherent dynamism; (ii) significant inter-node interference; and (iii) the three-dimensional motion of the UAVs. In this paper, we investigate the use of a hybrid antenna to accomplish efficient neighbor discovery and reliable communication in AWSNs. We propose the design of a hybrid Omni Bidirectional ESPAR (O-BESPAR) antenna, which combines the complementary features of an isotropic omni radio (360 degree coverage) and directional ESPAR antennas (beamforming and reduced interference). Control and data messages are transmitted separately over the omni and directional modules of the antenna, respectively. Moreover, a communication protocol is presented to perform fast neighbor discovery and beam steering. We present results from extensive simulations, three different real-world AWSN application scenarios and an empirical aerial link characterization, showing that the proposed antenna design and protocol reduce the packet loss rate and end-to-end delay by up to 54% and 49 seconds respectively, and increase the goodput by up to 33%, compared to a single omni or ESPAR antenna.
201210 An Artifact-Centric Activity Model for Analyzing Knowledge Intensive Processes Seyed-Mehdi-Reza Beheshti
School of Computer Science and Engineering,
University of New South Wales, Australia
sbeheshti@cse.unsw.edu.au

Boualem Benatallah
School of Computer Science and Engineering,
University of New South Wales, Australia
boualem@cse.unsw.edu.au

Hamid Reza Motahari-Nezhad
HP Labs Palo Alto
CA 94304, USA
hamid.motahari@hp.com
Many processes in organizations involve knowledge workers. Understanding and analyzing knowledge-intensive processes is a challenge for organizations today. As knowledge-intensive processes involve human judgement in the selection of the activities that are performed, the process execution path can change in a dynamic and ad-hoc manner. Case management is a common approach to support knowledge-intensive processes and almost always involves the collection and presentation of a diverse set of artifacts. In case scenarios, understanding ad-hoc processes entails identifying the interactions among people and artifacts, where artifacts are developed and changed gradually over a long period of time, as a case is long running and changes hands over time. We present a framework, simple abstractions and a language for the explorative querying and understanding of knowledge-intensive processes. Analyzing the set of activities on artifacts helps in understanding the case process. We introduce two concepts: timed folders, to represent the evolution of artifacts over time, and activity paths, to represent the activities that led to them. We have implemented the approach on top of FPSPARQL, a graph query language for analyzing business process execution. The evaluation shows the viability and efficiency of our approach.
201209 A Cost-Effective Tag Design for Memory Data Authentication in Embedded Systems Mei Hong
School of Computer Science and Engineering,
University of New South Wales, Australia
meihong@cse.unsw.edu.au

Hui Guo
School of Computer Science and Engineering,
University of New South Wales, Australia
huig@cse.unsw.edu.au
This paper presents a tag design approach for memory data integrity protection. The approach is highly area, energy and memory efficient, making it well suited to embedded systems with stringent resource constraints. Experiments comparing our approach with state-of-the-art designs show the effectiveness of our design.
201208 Work Efficient Higher-Order Vectorisation (Unabridged) Ben Lippmeier
School of Computer Science and Engineering,
University of New South Wales, Australia
benl@cse.unsw.edu.au

Manuel M. T. Chakravarty
School of Computer Science and Engineering,
University of New South Wales, Australia
chak@cse.unsw.edu.au

Gabriele Keller
School of Computer Science and Engineering,
University of New South Wales, Australia
keller@cse.unsw.edu.au

Roman Leshchinskiy
Unaffiliated
rl@cse.unsw.edu.au

Simon Peyton Jones
Microsoft Research Ltd.
Cambridge, U.K.
simonpj@microsoft.com
Existing approaches to higher-order vectorisation, also known as flattening nested data parallelism, do not preserve the asymptotic work complexity of the source program. Straightforward examples, such as sparse matrix-vector multiplication, can suffer a severe blow-up in both time and space, which limits the practicality of this method. We discuss why this problem arises, identify the mis-handling of index space transforms as the root cause, and present a solution using a refined representation of nested arrays. We have implemented this solution in Data Parallel Haskell (DPH) and present benchmarks showing that realistic programs, which used to suffer the blow-up, now have the correct asymptotic work complexity. In some cases, the asymptotic complexity of the vectorised program is even better than the original.
201206 Dynamic Encryption Key Design and Management for Embedded Systems with Insecure External Memory Mei Hong
School of Computer Science and Engineering,
University of New South Wales, Australia
meihong@cse.unsw.edu.au

Hui Guo
School of Computer Science and Engineering,
University of New South Wales, Australia
huig@cse.unsw.edu.au
To effectively encrypt the memory contents of an embedded processor, multiple keys which are dynamically changed are necessary. However, the resources required to store and manage these keys on-chip (so that they are secure) can be extensive. This paper examines novel methods to improve the efficiency of encryption (by reducing the amount of re-encryption due to key changes, hence improving the overall encryption speed), and to reduce the required memory resources (by using a special key construction and an implementation scheme). Experiments on a set of applications show that, on average, 95% of memory and on-chip cost can be saved compared to the state-of-the-art approach.
201205 Quality Control in Crowdsourcing Systems Mohammad Allahbakhsh
School of Computer Science and Engineering,
University of New South Wales, Australia
mallahbakhsh@cse.unsw.edu.au

Boualem Benatallah
School of Computer Science and Engineering,
University of New South Wales, Australia
boualem@cse.unsw.edu.au

Hamid Reza Motahari-Nezhad
HP Labs Palo Alto, CA, USA
and
School of Computer Science and Engineering,
University of New South Wales, Australia
hamid.motahari@hp.com

Aleksandar Ignjatovic
School of Computer Science and Engineering,
University of New South Wales, Australia
ignjat@cse.unsw.edu.au

Elisa Bertino
Purdue University
West Lafayette, Indiana, USA
bertino@cs.purdue.edu
Crowdsourcing, as a new model of distributed computing, enables people to leverage the intelligence and wisdom of the crowd toward solving problems. Quality control is a critical aspect of crowdsourcing. This article proposes a framework for characterizing the various dimensions of quality control in crowdsourcing systems. We also review some existing crowdsourcing systems and identify open issues.
201204 An Analytic Approach to People Evaluation in Crowdsourcing Systems Mohammad Allahbakhsh
School of Computer Science and Engineering,
University of New South Wales, Australia
mallahbakhsh@cse.unsw.edu.au

Aleksandar Ignjatovic
School of Computer Science and Engineering,
University of New South Wales, Australia
ignjat@cse.unsw.edu.au

Boualem Benatallah
School of Computer Science and Engineering,
University of New South Wales, Australia
boualem@cse.unsw.edu.au

Seyed-Mehdi-Reza Beheshti
School of Computer Science and Engineering,
University of New South Wales, Australia
sbeheshti@cse.unsw.edu.au

Norman Foo
School of Computer Science and Engineering,
University of New South Wales, Australia
norman@cse.unsw.edu.au

Elisa Bertino
Purdue University
West Lafayette, Indiana, USA
bertino@cs.purdue.edu
Worker selection is a significant and challenging issue in crowdsourcing systems. Such selection is usually based on an assessment of the reputation of the individual workers participating in such systems. However, assessing the credibility and adequacy of such calculated reputation is a real challenge. In this paper, we propose an analytic model which leverages the value of the tasks completed, the credibility of the evaluators of the task results, and the time of evaluation, in order to calculate an accurate and credible reputation rank for participating workers and a fairness rank for evaluators. The model has been implemented and experimentally validated.
201203 Flexible Resource Allocation for Multicast in OFDMA based Wireless Networks Sanjay Jha
School of Computer Science and Engineering,
University of New South Wales, Australia
sanjay@cse.unsw.edu.au

Xin Zhao
School of Computer Science and Engineering,
University of New South Wales, Australia
xinzhao@cse.unsw.edu.au
This paper studies an efficient resource allocation scheme for multicast in OFDMA-based wireless networks. Unlike conventional resource allocation schemes for multicast, which allocate exactly the same subcarriers to all users in a multicast group, this paper proposes a more flexible scheme that divides the multicast group members into different subgroups by exploiting the diversity of the channel coefficients of different users. We first formulate an optimisation problem to maximise the overall transmission rate. Given the NP-hardness of the problem, we design a low-complexity heuristic, Flexible Resource Allocation with Geometric programming (FRAG). FRAG is a two-step heuristic that subdivides the multicast groups and allocates resources to the corresponding subgroups. In the first step, we propose a greedy algorithm to subdivide groups and allocate subcarriers under the assumption of even power distribution. Then we use geometric programming (GP) to solve the optimal power allocation problem. Numerical results show that FRAG is able to allocate subcarriers and power efficiently and effectively, achieving up to 33% improvement in aggregated throughput.
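A minimal sketch of the greedy first step under even power distribution: for each subcarrier, serve the subgroup whose worst channel rate maximises the aggregate multicast rate. The data and function names are hypothetical; the report's GP-based power refinement is not shown.

```python
# Illustrative sketch: per-subcarrier subgrouping for multicast. The
# subgroup on a subcarrier is served at the worst rate of its members,
# so the aggregate rate of the top-k users is k times the k-th rate.

def greedy_allocate(channel_rate, subcarriers):
    """channel_rate[user][sc] -> achievable rate for that user on sc.
    Assign each subcarrier the user subset maximising aggregate rate."""
    allocation = {}
    users = list(channel_rate)
    for sc in subcarriers:
        rates = sorted(((channel_rate[u][sc], u) for u in users), reverse=True)
        best, best_set = 0.0, []
        for k in range(1, len(rates) + 1):
            aggregate = rates[k - 1][0] * k   # k users at the k-th rate
            if aggregate > best:
                best, best_set = aggregate, [u for _, u in rates[:k]]
        allocation[sc] = (best_set, best)
    return allocation

rates = {"u1": {0: 4.0, 1: 1.0}, "u2": {0: 3.5, 1: 0.5}, "u3": {0: 1.0, 1: 2.0}}
print(greedy_allocate(rates, subcarriers=[0, 1]))
```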
201202 A multi-transmitter multi-receiver model for molecular communication networks Chun Tung Chou
School of Computer Science and Engineering,
University of New South Wales, Australia
ctchou@cse.unsw.edu.au
We consider molecular communication networks consisting of transmitters and receivers distributed in a fluidic medium. In such networks, a transmitter sends one or more signalling molecules, which are diffused over the medium, to the receiver to realise the communication. In order to be able to engineer synthetic molecular communication networks, a mathematical model for these networks is required. This paper proposes a new stochastic model of molecular communication networks called reaction-diffusion master equation with exogenous input (RDMEX). An advantage of RDMEX is that it can readily be used to model molecular communication networks with multiple transmitters and receivers. For the case where the reaction kinetics at the receivers are linear, we show how RDMEX can be used to determine the mean and covariance behaviour of molecular communication networks, and derive closed-form expressions for the mean behaviour of the RDMEX model. These closed-form expressions offer insight into how the transmitters and receivers interfere with each other. Numerical examples are provided to demonstrate the properties of the model.
201201 Towards Service-Oriented Middleware for Cyber-Physical Systems Dat Dac Hoang
School of Computer Science and Engineering,
University of New South Wales, Australia
ddhoang@cse.unsw.edu.au

Hye-Young Paik
School of Computer Science and Engineering,
University of New South Wales, Australia
hpaik@cse.unsw.edu.au

Chae-Kyu Kim
IT Convergence Technology Research Lab
Electronics and Telecommunications Research Institute, Korea
kyu@etri.re.kr
We propose middleware, named WebMed, designed from a service-oriented viewpoint to support Cyber-Physical Systems (CPS) applications. WebMed enables access to the underlying smart devices and the integration of their device-specific functionality with other software services. It consists of five components: WebMed node, Web service enabler, service repository, engine, and application development. With WebMed, interacting with physical devices becomes as easy as invoking a computation service. By following basic service-oriented guidelines, we can build a loosely coupled infrastructure that exposes the functionality of physical devices to the Web for application development.
1118 A HW/SW Checkpoint and Recovery Scheme for Embedded Processors Tuo Li
School of Computer Science and Engineering,
University of New South Wales, Australia
tuol@cse.unsw.edu.au

Roshan Gabriel Ragel
Department of Computer Engineering,
University of Peradeniya, Sri Lanka
roshanr@pdn.ac.lk

Sri Parameswaran
School of Computer Science and Engineering,
University of New South Wales, Australia
sridevan@cse.unsw.edu.au
Checkpoint and Recovery (CR) allows computer systems to operate correctly even when compromised by transient faults. While many software and hardware systems for CR do exist, they are usually too large, too slow, require major modifications to the software, or require extensive modifications to the caching schemes. In this report, we propose a novel error-recovery management scheme based upon re-engineering the instruction set. We take the native instruction set of the processor and enhance the microinstructions with additional micro-operations which enable checkpointing. The recovery mechanism is implemented by three custom instructions, which restore the registers, the data memory values and the special registers (PC, status registers, etc.) that were changed. The checkpointing storage is adapted to the benchmark executed. Results show that our method degrades performance by just 1.45% under fault-free conditions, and incurs an area overhead of 45% on average and 79% in the worst case. Recovery takes just 62 clock cycles (worst case) in the examples which we examined.
1116 Temporal Provenance Model (TPM): Model and Query Language Seyed-Mehdi-Reza Beheshti
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: sbeheshti@cse.unsw.edu.au

Hamid Reza Motahari-Nezhad
HP Labs Palo Alto
CA 94304, USA
E-mail: hamid.motahari@hp.com

Boualem Benatallah
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: boualem@cse.unsw.edu.au
Provenance refers to the documentation of an object's lifecycle. This documentation (often represented as a graph) should include all the information necessary to reproduce a certain piece of data or the process that led to it. In a dynamic world, as data changes, it is important to be able to get a piece of data as it was, and its provenance graph, at a certain point in time. Supporting time-aware provenance querying is challenging and requires: (i) explicitly representing the time information in the provenance graphs, and (ii) providing abstractions and efficient mechanisms for time-aware querying of provenance graphs over an ever-growing volume of data. Existing provenance models treat time as a second-class citizen (i.e. as an optional annotation). This makes time-aware querying of provenance data inefficient and sometimes inaccessible. We introduce an extended provenance graph model to explicitly represent time as an additional dimension of provenance data. We also provide a query language, novel abstractions and efficient mechanisms to query and analyze timed provenance graphs. The main contributions of the paper are: (i) a Temporal Provenance Model (TPM), a timed provenance model; and (ii) two concepts for analyzing and querying TPM graphs: timed folders, containers of a related set of objects and their provenance relationships over time, and timed paths, which represent the evolution of object tracing information over time. We have implemented the approach on top of FPSPARQL, a query engine for large graphs, and evaluated it for querying TPM models. The evaluation shows the viability and efficiency of our approach.
1115 CIPARSim: Cache Intersection Property Assisted Rapid Single-pass FIFO Cache Simulation Technique Mohammad Shihabul Haque, Jorgen Peddersen and Sri Parameswaran
School of Computer Science and Engineering,
University of New South Wales, Australia
{mhaque, jorgenp, sridevan}@cse.unsw.edu.au
An application's cache miss rate is used in timing analysis, system performance prediction and in deciding the best cache memory for an embedded system to meet tighter constraints. Single-pass simulation allows a designer to find the number of cache misses quickly and accurately on various cache memories. Such single-pass simulation systems have previously relied heavily on cache inclusion properties, which allowed rapid simulation of cache configurations for different applications. Thus far the only inclusion properties discovered were applicable to caches based on the Least Recently Used (LRU) replacement policy. However, LRU-based caches are rarely implemented in real life due to their circuit complexity at larger cache associativities. Embedded processors typically use a FIFO replacement policy in their caches instead, for which there are no full inclusion properties to exploit. In this paper, for the first time, we introduce a cache property called the "Intersection Property" that helps to reduce single-pass simulation time in a manner similar to an inclusion property. An intersection property defines conditions that, if met, prove a particular element exists in larger caches, thus avoiding further search time. We discuss three such intersection properties for caches using the FIFO replacement policy, and propose a rapid single-pass FIFO cache simulator, "CIPARSim". CIPARSim is the first single-pass simulator dependent on the FIFO cache properties to reduce simulation time significantly. CIPARSim's simulation time was up to 5 times faster (on average 3 times faster) compared to the state-of-the-art single-pass FIFO cache simulator for the cache configurations tested. CIPARSim produces the cache hit and miss rates of an application accurately on various cache configurations. During simulation, CIPARSim's intersection properties alone predict up to 90% (on average 65%) of the total hits, reducing simulation time immensely.
1114 VChunkJoin: An Efficient Algorithm for Edit Similarity Joins Wei Wang, Jianbin Qin, Chuan Xiao, Xuemin Lin

School of Computer Science and Engineering,
University of New South Wales, Australia
{weiw,jqin,chuanx,lxue}@cse.unsw.edu.au


Heng Tao Shen

University of Queensland, Australia
shenht@itee.uq.edu.au
Similarity joins play an important role in many application areas, such as data integration and cleaning, record linkage, and pattern recognition. In this paper, we study efficient algorithms for similarity joins with an edit distance constraint. Currently, the most prevalent approach is based on extracting overlapping grams from strings and considering only strings that share a certain number of grams as candidates. Unlike these existing approaches, we propose a novel approach to edit similarity join based on extracting non-overlapping substrings, or chunks, from strings. We propose a class of chunking schemes based on the notion of a tail-restricted chunk boundary dictionary. A new algorithm, VChunkJoin, is designed by integrating existing filtering methods and several new filters unique to our chunk-based method. We also design a greedy algorithm to automatically select a good chunking scheme for a given dataset. We demonstrate experimentally that the new algorithm is faster than alternative methods yet occupies less space.
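To illustrate the chunking idea, the toy sketch below cuts strings after characters from a small boundary dictionary (a crude stand-in for a tail-restricted CBD) and uses shared chunks as a candidate filter before exact edit-distance verification; the dictionary and threshold are assumptions.

```python
# Illustrative sketch: non-overlapping chunking plus a shared-chunk
# candidate filter. Not the paper's CBD selection algorithm.

def chunks(s, boundary={"a", "e", "i", "o", "u"}):
    """Cut after every boundary character (toy boundary dictionary)."""
    out, start = [], 0
    for i, ch in enumerate(s):
        if ch in boundary:
            out.append(s[start:i + 1])
            start = i + 1
    if start < len(s):
        out.append(s[start:])
    return out

def candidate(s, t, min_shared=1):
    """Filter: strings within small edit distance must share chunks."""
    return len(set(chunks(s)) & set(chunks(t))) >= min_shared

print(chunks("vacations"))                  # ['va', 'ca', 'ti', 'o', 'ns']
print(candidate("vacations", "vocations"))  # True -> verify edit distance
```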
1113 Link Characterization for Aerial Wireless Sensor Networks Nadeem Ahmed
School of Computer Science and Engineering,
University of New South Wales, Australia
nahmed@cse.unsw.edu.au

Salil S. Kanhere
School of Computer Science and Engineering,
University of New South Wales, Australia
salilk@cse.unsw.edu.au

Sanjay Jha
School of Computer Science and Engineering,
University of New South Wales, Australia
sanjay@cse.unsw.edu.au
Characterization of communication links in Aerial Wireless Sensor Networks (AWSN) is of paramount importance for achieving acceptable network performance. Protocols based on an arbitrary link performance threshold may exhibit inconsistent behavior due to link behavior not considered during the design stage. It is thus necessary to account for factors that affect the link performance in real deployments. This report details observations from an extensive experimental campaign designed to characterize the behavior of communication links in AWSN. We employ the widely used TelosB sensor platform for these experiments. The experimental results highlight the fact that apart from the usual outdoor environmental factors affecting the link performance, two major contributors to the link degradation in AWSN are the antenna orientation and the multi-path fading effect due to ground reflections. Based on these observations, we recommend measures that can help alleviate the effect of these potential sources of performance degradation in AWSN in order to achieve acceptable network performance.
1112 A Form Annotation Approach to Long-Tail Process Automation Sung Wook Kim
School of Computer Science and Engineering,
University of New South Wales, Australia
skim@cse.unsw.edu.au

Hye-Young Paik
School of Computer Science and Engineering,
University of New South Wales, Australia
hpaik@cse.unsw.edu.au
Simple forms of annotation, such as tagging, have proven helpful to end-users in organising and managing large amounts of resources (e.g., photos, documents). In this paper, we take a first step in applying annotation to forms, one of the main artefacts that make up the long tail of processes, to explore the potential benefits of helping people with little or no technical background automate such processes. An analysis of real-world forms was conducted to design algorithms for tag recommendation. Our initial evaluation suggests that useful tag recommendations can be generated based on the contents and the metadata of the forms. We also briefly present FormSys+, a framework for supporting form-based processes. The architecture supports the end-to-end lifecycle of forms, from creation and annotation through to execution in a process.
1111 Forms-based Service Composition for Domain Experts Ingo Weber
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: ingo.weber@cse.unsw.edu.au

Hye-Young Paik
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: hpaik@cse.unsw.edu.au

Boualem Benatallah
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: boualem@cse.unsw.edu.au
In many cases, it is not cost-effective to automate business processes which affect a small number of people and/or change frequently. We present a novel approach for enabling domain experts to model and deploy such processes from their respective domain as Web service compositions. The approach is based on user-editable service naming, a graphical composition language where Web services are represented as forms, a targeted restriction of control-flow expressivity, automated process verification mechanisms, and code generation for executing orchestrations. A Web-based service composition prototype implements this approach, including a WS-BPEL code generator.
1110 Probabilistic and Max-margin structured learning in Human Action Recognition Tuan Hue Thi
NICTA and School of Computer Science and Engineering,
University of New South Wales, Australia
TuanHue.Thi@nicta.com.au

Li Cheng
Bioinformatics Institute, A*STAR, Singapore
chengli@bii.a-star.edu.sg

Jian Zhang
School of Computer Science and Engineering,
University of New South Wales, Australia
Jian.Zhang@nicta.com.au

Li Wang
Nanjing Forestry University, China
wang.li.seu.nj@gmail.com

Shinichi Satoh
National Institute of Informatics, Japan
satoh@nii.ac.jp
Human action recognition is a promising yet non-trivial computer vision field with many potential applications. Current advances in bag-of-feature approaches have brought significant insights into recognizing human actions for various practical purposes. It is, however, a common practice in the literature to consider a set of local feature descriptors with uniform contributions. This assumption has been shown to be oversimplified, which limits the robust deployment of these works in real-life video content retrieval. In this work, we propose and show that, by taking into account the global configuration of local features, we can greatly improve recognition performance. A novel feature selection process is also devised with the help of the Sparse Hierarchical Bayes Filter, an additional process to boost traditional bag-of-feature learning. We further introduce the usage of structured learning for the problem of human action recognition. That is, by representing one human action as a complex set of local features, a set of feature functions can be utilized to discriminatively infer the structured output for action classification and action localization. In particular, we tackle the problem of action localization in video using structured learning and compare two options: one is the Dynamic Conditional Random Field, from the probabilistic principle; the other is the Structured Support Vector Machine, from the max-margin principle. We evaluate our modular classification-localization framework on various testbeds, where it demonstrates competitive performance compared with state-of-the-art methods.
1109 Automatic Image Capturing and Processing for PetrolWatch Yi Fei Dong
School of Computer Science and Engineering,
University of New South Wales, Australia
ydon@cse.unsw.edu.au

Salil Kanhere
School of Computer Science and Engineering,
University of New South Wales, Australia
salilk@cse.unsw.edu.au

Chun Tung Chou
School of Computer Science and Engineering,
University of New South Wales, Australia
ctchou@cse.unsw.edu.au

Ren Ping Liu
ICT Centre
CSIRO, Australia
ren.liu@csiro.au
In our previous work [5], we proposed a Participatory Sensing (PS) architecture called PetrolWatch to collect fuel prices from camera images of the road-side price boards (billboards) of service (or gas) stations. A key part of the PetrolWatch architecture, and the main focus of this paper, is automatic billboard image capture from a moving car without user intervention. Capturing a clear image with an unassisted mobile phone from a moving car proved to be a challenge in our street driving experiments. In this paper, we design camera control and image pre-selection schemes to address this challenge. In particular, we leverage the GPS (Global Positioning System) and GIS (Geographic Information System) capabilities of modern mobile phones to derive an acceptable camera triggering range and set the camera focus accordingly. Experimental results show that our design improves the fuel price extraction rate by more than 40%. To deal with blurred images caused by vehicle vibrations, we design a set of pre-selection thresholds based on measurements from the embedded accelerometer of the mobile phone. Our experiments show that our pre-selection improves system efficiency by eliminating 78.57% of the blurred images.
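A minimal sketch of the GPS/GIS triggering idea: fire the camera only while the phone's fix is within some triggering range of a station's known location. The 50 m range and all names here are invented, not the values derived in the paper.

```python
# Illustrative sketch: GPS-based camera triggering against a station
# location taken from a GIS database. Range value is an assumption.

import math

TRIGGER_RANGE_M = 50.0  # assumed acceptable camera triggering range

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in metres."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def should_trigger(phone_fix, station_location):
    """True while the phone is inside the triggering range."""
    return haversine_m(*phone_fix, *station_location) <= TRIGGER_RANGE_M

print(should_trigger((-33.9170, 151.2310), (-33.9172, 151.2312)))  # True
```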
1108 Constraint-Based Multi-robot Path Planning with Subgraphs Malcolm Ryan
School of Computer Science and Engineering,

University of New South Wales, Australia

malcolmr@cse.unsw.edu.au
Coordinating a group of robots as they independently navigate a shared map without collision is a difficult planning problem. Traditional approaches have scaled badly as they pay little attention to the structure of the underlying search space and waste time exploring parts of the space that are never going to yield a solution. We would like to be able to eliminate these branches early in the search, but a naive representation of the problem does not provide enough information to allow this. We present an alternative formulation of the task as a constraint satisfaction problem with a temporally and spatially abstract representation that allows search to focus on critical decisions early on and to recognise untenable branches sooner. This representation is based on a partitioning of the map into subgraphs of particular structure, cliques and halls, which place easily representable constraints on the movement of their occupants. We plan first at the level of subgraph transitions and only choose concrete moment-to-moment positions once this abstract plan is complete. Experimental evaluation shows that this allows us to create plans for many more robots than a traditional approach. Further, we show how these map partitions can be automatically generated using the betweenness property of the vertices to detect bottlenecks in the graph and turn them into hall subgraphs. This method is evaluated on 100 different maps and shown to be most effective on indoor maps with long branching corridors.
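As a sketch of the bottleneck-detection step, the example below computes vertex betweenness with networkx and flags the highest-betweenness vertices as candidate hall (corridor) subgraphs; the quantile cutoff and the toy map are assumptions, and the paper's actual partitioning is more involved.

```python
# Illustrative sketch: betweenness centrality flags bottleneck vertices
# (candidate halls) on a toy map of two rooms joined by a corridor.

import networkx as nx

def hall_candidates(G, quantile=0.8):
    """Return vertices whose betweenness is in the top (1 - quantile)."""
    bc = nx.betweenness_centrality(G)
    cutoff = sorted(bc.values())[int(quantile * (len(bc) - 1))]
    return {v for v, b in bc.items() if b >= cutoff and b > 0}

G = nx.Graph()
G.add_edges_from(nx.complete_graph(4).edges)          # room A: nodes 0-3
G.add_edges_from((u + 10, v + 10)
                 for u, v in nx.complete_graph(4).edges)  # room B: 10-13
G.add_edges_from([(3, 5), (5, 6), (6, 10)])           # connecting corridor
print(hall_candidates(G))   # corridor vertices, e.g. {3, 5, 6, 10}
```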
1107 Auto-scaling Emergency Call Centres using Cloud Resources to Handle Disasters Srikumar Venugopal, Han Li
School of Computer Science and Engineering
University of New South Wales, Australia
E-mail: {hli,srikumarv}@cse.unsw.edu.au

Pradeep Ray
School of Information Systems, Technology and Management
University of New South Wales, Australia
Email: p.ray@unsw.edu.au
The fixed-line and mobile telephony network is one of the crucial elements of an emergency response to a disaster event. However, in such situations the phone network is frequently overwhelmed and disrupted. It is not cost-effective to maintain an over-provisioned IT infrastructure for such rare events. Cloud computing allows users to create resources on demand and can enable an IT infrastructure that scales in response to the demands of disaster management. In this paper, we introduce a system that uses the Amazon EC2 service to automatically scale up a software telephony network in response to a large volume of calls, and to scale it down in normal times. We demonstrate the efficacy of this system through experiments based on real-world data.
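The scaling logic can be pictured as a simple control loop; launch_node and terminate_node below are hypothetical stand-ins for the cloud API calls, and the per-node call capacity is an assumed figure.

    # Illustrative auto-scaling loop: size the telephony network to the
    # observed call rate, growing during a disaster surge and shrinking later.
    import time

    CALLS_PER_NODE = 50                          # assumed node capacity

    def autoscale(get_call_rate, launch_node, terminate_node, nodes, period=60):
        while True:
            needed = max(1, -(-get_call_rate() // CALLS_PER_NODE))  # ceiling
            while len(nodes) < needed:
                nodes.append(launch_node())      # scale up on demand
            while len(nodes) > needed:
                terminate_node(nodes.pop())      # scale down in normal times
            time.sleep(period)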
1106 An Envelope Detection based Broadband Ultrasonic Ranging System for Improved Indoor/Outdoor Positioning Prasant Misra
School of Computer Science and Engineering
The University of New South Wales
NSW 2052, Australia
E-mail: pkmisra@cse.unsw.edu.au

Sanjay Jha
School of Computer Science and Engineering
The University of New South Wales
NSW 2052, Australia
E-mail: sanjay@cse.unsw.edu.au

Diet Ostry
CSIRO, Australia
E-mail: Diet.Ostry@csiro.au

Navinda Kottege
CSIRO, Australia
E-mail: Navinda.Kottege@csiro.au
Fine-grained location information at long range can benefit many applications of embedded sensor networks and robotics. In this paper, we focus on range estimation - an important prerequisite for fine-grained localization - in the ultrasonic domain for both indoor and outdoor environments, and make three contributions. First, we evaluate the characteristics of broadband signals, and provide useful statistics for their design and engineering to achieve a good trade-off between range and accuracy. Second, to overcome the inaccuracies due to correlation sidelobes, we propose a signal detection technique that estimates the envelope of the correlated pulse using a simple least-squares approximation, and undertake a simulation study to verify its ranging efficiency on linear chirps. Third, leveraging the insights obtained from our initial study, we present the design and implementation of two different ultrasonic broadband ranging systems based on linear chirps: (1) a PC-based system using the most basic commodity hardware and custom-designed units, and (2) a mote-based system using the CSIRO Audio nodes, which comprise a Fleck-3z mote along with audio codecs and a Blackfin DSP. Our evaluation results for both systems indicate that they are precise enough to support source localization applications: a reliable operational range of 45m and 20m (outdoor) respectively, and an average accuracy of approximately 2cm.
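A minimal ranging sketch with a linear chirp is shown below; the least-squares parabola fit around the correlation peak is one simple variant of peak/envelope estimation, and all signal parameters are illustrative.

    # Matched-filter a delayed linear chirp, then least-squares-fit a parabola
    # around the correlation peak to interpolate the arrival time.
    import numpy as np
    from scipy.signal import chirp, correlate

    fs = 96000
    t = np.arange(0, 0.01, 1.0 / fs)
    tx = chirp(t, f0=20000, f1=40000, t1=t[-1])        # linear chirp pulse
    delay = 300                                        # true delay in samples
    rx = np.concatenate([np.zeros(delay), tx, np.zeros(200)])
    rx += 0.05 * np.random.randn(rx.size)              # measurement noise

    xc = np.abs(correlate(rx, tx, mode='valid'))       # matched filter output
    k = int(np.argmax(xc))
    a, b, c = np.polyfit(np.arange(-2, 3), xc[k - 2:k + 3], 2)
    peak = k - b / (2 * a)                             # sub-sample peak position
    print("range ~ %.2f m" % (peak / fs * 343.0))      # speed of sound in air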
1105 A TPM-enabled Remote Attestation Protocol in Wireless Sensor Networks Hailun Tan
School of Computer Science and Engineering,
University of New South Wales, Australia
thailun@cse.unsw.edu.au

Wen Hu
ICT Centre, CSIRO
wen.hu@csiro.au

Sanjay Jha
School of Computer Science and Engineering,
University of New South Wales, Australia
sanjay@cse.unsw.edu.au
Given the limited resources and computational power of current embedded sensor devices, memory protection is difficult to achieve and generally unavailable. Hence, run-time buffer overflows of the kind used by Internet worm attacks could easily be exploited to inject malicious code into Wireless Sensor Networks (WSNs). Previous software-based remote code verification approaches such as SWATT and SCUBA have been shown in recent work to be difficult to deploy. In this paper, we propose and implement a remote attestation protocol for detecting unauthorized tampering with the application code running on sensor nodes, with the assistance of Trusted Platform Modules (TPMs), tiny, cost-effective cryptographic micro-controllers. In our design, each sensor node is equipped with a TPM, and the firmware running on the node can be verified by the other sensor nodes in a WSN, including the sink. Specifically, we present a hardware-based remote attestation protocol, discuss the potential attacks an adversary could launch against it, and provide comprehensive system performance results for the protocol in a multi-hop sensor network testbed.
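The attestation exchange can be sketched as a nonce-based challenge-response over a firmware digest; the key, image and HMAC construction below are illustrative assumptions, and a real TPM would compute and sign the digest in hardware.

    import hashlib, hmac, os

    KEY = b"per-node-attestation-key"        # assumed provisioned key
    GOOD_FIRMWARE = b"\x00" * 1024           # verifier's reference image

    def node_attest(firmware, nonce):
        # node side: keyed digest over nonce || firmware
        return hmac.new(KEY, nonce + firmware, hashlib.sha256).digest()

    nonce = os.urandom(16)                   # fresh nonce defeats replay
    report = node_attest(GOOD_FIRMWARE, nonce)
    expected = hmac.new(KEY, nonce + GOOD_FIRMWARE, hashlib.sha256).digest()
    print("attested:", hmac.compare_digest(report, expected))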
1104 Using Reinforcement Learning for Controlling an Elastic Web Application Hosting Platform Han Li
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: hli@cse.unsw.edu.au

Srikumar Venugopal
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: srikumarv@cse.unsw.edu.au
Web applications have stringent performance requirements that are sometimes violated during periods of high demand due to lack of resources. Infrastructure as a Service (IaaS) providers have made it easy to provision and terminate compute resources on demand. However, there is a need for a control mechanism that is able to provision resources and create multiple instances of a web application in response to excess load events. In this paper, we propose and implement a reinforcement learning-based controller that is able to respond to volatile and complex arrival patterns through a set of simple states and actions. The controller is implemented within a distributed architecture that is able to not only scale up quickly to meet rising demand but also scale down by shutting down excess servers to save on ongoing costs.
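A minimal Q-learning controller in this spirit is sketched below; the coarse load states, reward shape and capacity figures are assumptions, not the paper's design.

    import random
    from collections import defaultdict

    ACTIONS = (-1, 0, 1)                     # remove, keep, add a server
    Q = defaultdict(float)
    alpha, gamma, eps = 0.1, 0.9, 0.1

    def choose(state):
        if random.random() < eps:            # epsilon-greedy exploration
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: Q[(state, a)])

    servers, state = 1, 0
    for _ in range(10000):
        load = random.randint(0, 9)          # synthetic request arrivals
        a = choose(state)
        servers = max(1, servers + a)
        # penalise SLA violations (load beyond capacity) and idle servers
        reward = -2.0 * max(0, load - 3 * servers) - 0.1 * servers
        nxt = min(load // 2, 4)              # coarse load-level state
        Q[(state, a)] += alpha * (reward + gamma *
                                  max(Q[(nxt, b)] for b in ACTIONS) - Q[(state, a)])
        state = nxt
    print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(5)})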
1103 FPSPARQL: A Language for Querying Semi-Structured Business Process Execution Data Seyed-Mehdi-Reza Beheshti
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: sbeheshti@cse.unsw.edu.au

Boualem Benatallah
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: boualem@cse.unsw.edu.au

Hamid Reza Motahari-Nezhad
HP Labs Palo Alto
CA 94304, USA
E-mail: hamid.motahari@hp.com

Sherif Sakr
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: ssakr@cse.unsw.edu.au
Business processes (BPs) in today's enterprises are realized over multiple IT systems and services. Understanding the execution of a BP in terms of its scope and details is challenging, especially as it is subjective: it depends on the perspective of the person looking at the BP execution. Existing business process querying and visualization tools assume a pre-defined model of BPs. However, models and documentation of BPs, or the correlation rules for process events across various IT systems, often do not exist or are outdated. In this paper, we present a framework and a language that provide abstractions and methods for the explorative querying and understanding of business process execution from the event logs of workflows, IT systems and services. We propose a query language for analyzing event logs of process-related systems based on the two concepts of folders and paths, which enable an analyst to group related events in the logs or find paths among events. Folders and paths can be stored for use in future analysis, enabling progressive and explorative analysis. We have implemented the proposed techniques in a graph processing engine called FPSPARQL by extending the SPARQL graph query language. We present evaluation results on the performance and the quality of the results using a number of process event logs.
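The abstract does not show FPSPARQL syntax, so the folder and path concepts are illustrated below on a small event graph in plain Python; the event names and the grouping rule are invented for illustration.

    import networkx as nx

    g = nx.DiGraph()
    g.add_edges_from([("submit", "approve"), ("approve", "invoice"),
                      ("invoice", "ship"), ("approve", "ship"),
                      ("submit", "reject")])

    # Folder node: a stored group of related events (assumed grouping rule).
    folder = {e for e in g if e in {"submit", "approve", "reject"}}

    # Path node: all paths between two events, kept for later analysis.
    paths = list(nx.all_simple_paths(g, "submit", "ship"))
    print(folder, paths)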
1102 Efficient Resource Allocation for Delay Sensitive Multicast in Future WiMAX Systems Xin Zhao
School of Computer Science and Engineering,
University of New South Wales, Australia
xinzhao@cse.unsw.edu.au

Joo Ghee Lim
School of EEE
Singapore Polytechnic, Singapore
lim_joo_ghee@sp.edu.sg

Sanjay Jha
School of Computer Science and Engineering,
University of New South Wales, Australia
sanjay@cse.unsw.edu.au
This paper studies efficient resource (radio spectrum and transmission power) allocation in orthogonal frequency-division multiple-access (OFDMA) based multicast wireless systems under guaranteed QoS to users. Since most multicast applications are delay sensitive (e.g. Voice over IP, video gaming, online conferencing), this paper takes minimizing average transmission delay, subject to individual delay requirements, as the objective of resource allocation. We first formulate an optimization problem to minimize the system delay with an individual delay bound for each multicast group. Given the NP-hardness of the problem, we design two algorithms to solve the optimization problem effectively. The first is an efficient resource allocation algorithm using geometric programming (GP), and the second is a low-complexity heuristic that allocates subcarriers and power separately. Numerical results show that the proposed algorithms are able to allocate subcarriers and power efficiently and effectively, and also achieve low system delay.
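A toy geometric program in the flavour of the first algorithm is sketched below; the monomial delay model L/(c_g p_g) and all constants are illustrative stand-ins for the paper's formulation.

    import cvxpy as cp
    import numpy as np

    c = np.array([1.0, 0.6, 0.3])           # assumed effective channel gains
    L, P_total = 1.0, 10.0
    p = cp.Variable(3, pos=True)            # per-group transmit power

    delay = cp.sum(L / cp.multiply(c, p))   # posynomial total delay
    prob = cp.Problem(cp.Minimize(delay), [cp.sum(p) <= P_total])
    prob.solve(gp=True)                     # solve as a geometric program
    print(p.value, prob.value)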
1101 Cartesian Programming John Plaice
School of Computer Science and Engineering,
University of New South Wales, Australia
plaice@cse.unsw.edu.au
We present a new form of declarative programming inspired by the Cartesian coordinate system. This Cartesian programming, illustrated by the TransLucid language, assumes that all programmed entities vary with respect to all possible dimensions, or degrees of freedom. This model is immediately applicable to areas of science, engineering and business in which working with multiple dimensions is common. It is also well suited to specification and programming in situations where a problem is not fully understood, and, with refinement, more parameters will need to be taken into consideration as time progresses. In the Cartesian model, these dimensions include, for all entities, all possible parameters, be these explicit or implicit, visible or hidden. As a result, defining the aggregate semantics of an entire system is simplified, much as the use of the universal relation simplifies the semantics of a relational database system. Evolution through time is handled through the use of a special time dimension that does not allow access to the future. In TransLucid, any atomic value may be used as a dimension. A context maps each dimension to its corresponding ordinate. A context delta is the partial equivalent. An expression is evaluated in a given context, and this context may be queried, dimension by dimension, or perturbed by a context delta. A variable in TransLucid may have several definitions and, given a current context, the best-fit (most specific) definitions with respect to that context are chosen and evaluated separately, and the results are combined. The set of definitions for a variable defines a hyperdaton, which can be understood as an arbitrary-dimensional array of arbitrary extent. Functional abstraction in TransLucid requires two kinds of parameters: value parameters, with call-by-value semantics, are used to pass dimensions and constants; named parameters, with call-by-name semantics, are used to pass hyperdatons. Where clauses, used for local definitions, define both new variables and new dimensions of variance. This thesis presents the full development of Cartesian programming and of TransLucid, complete with a historical overview leading to their conception. Both the denotational and operational semantics are presented, as is the implementation, designed as a library. One important result is that the operational semantics requires only the evaluation and caching of relevant dimensions, thereby ensuring that space usage is kept to a minimum. Two applications using the TransLucid library are presented, one a standalone interpreter, the other an interactive code browser and hyperdaton visualizer. The set of equations defining a TransLucid system can vary over time, a special dimension. At each instant, the set of equations may be modified, but in so doing can only affect the present and future of the system. Timing semantics is always synchronous. There are several possible ways for multiple TransLucid systems to interact. The caching mechanism provided in the operational semantics allows for the efficient implementation of systems whose calling structure is highly irregular. For more regular structures, it is possible to create even more efficient bottom-up solutions, in which recursive instantiations of functions are eliminated, with clear bounds on memory usage and computation. Cartesian programming is not just designed as a standalone paradigm, but as a means of better understanding other paradigms.
We examine imperative programming and side-effects, and show that these, under certain conditions, can be translated into TransLucid, thereby allowing the design of new imperative constructs in the original language.
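The context machinery can be mimicked in a few lines (illustrative Python, not TransLucid syntax): a context maps dimensions to ordinates, evaluation queries it, and a context delta perturbs it.

    def temperature(ctx):
        # a variable that varies over the 'city' and 'time' dimensions
        table = {("sydney", 0): 22, ("sydney", 1): 24, ("paris", 0): 11}
        return table[(ctx["city"], ctx["time"])]

    ctx = {"city": "sydney", "time": 0}            # a context
    delta = {"time": 1}                            # a context delta
    print(temperature(ctx), temperature({**ctx, **delta}))   # 22 24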
1023 An Adaptive Algorithm for Compressive Approximation of Trajectory (AACAT) for Delay Tolerant Networks Rajib Rana, Chun Tung Chou
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
Emails: {rajibr, ctchou}@cse.unsw.edu.au

Wen Hu, Tim Wark
CSIRO ICT Centre, Australia
Emails: firstname.lastname@csiro.au
Highly efficient compression provides a promising approach to address the transmission and computation challenges imposed by moving-object tracking applications on resource-constrained Wireless Sensor Networks (WSNs). In this paper, we propose and design a Compressive Sensing (CS) based trajectory approximation algorithm, the Adaptive Algorithm for Compressive Approximation of Trajectory (AACAT), which performs trajectory compression so as to maximize the information about the trajectory subject to limited bandwidth. Our extensive evaluation using real trajectories of three different object groups (animals, pedestrians and vehicles) shows that CS-based trajectory compression reduces transmission overheads by up to 30%, for given information loss bounds, compared to state-of-the-art trajectory compression algorithms. We implement AACAT on resource-impoverished sensor nodes, and show that AACAT achieves high compression performance with very limited resource (computation power and energy) overheads.
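The compressive-sensing idea behind this style of trajectory approximation can be sketched on a 1-D coordinate trace: sample at random instants and recover a sparse DCT representation; the basis choice and sparsity level are illustrative assumptions, not AACAT's tuned design.

    import numpy as np
    from scipy.fft import idct
    from sklearn.linear_model import OrthogonalMatchingPursuit

    n, m = 256, 64
    t = np.linspace(0, 1, n)
    x = np.sin(2 * np.pi * 1.5 * t) + 0.3 * np.sin(2 * np.pi * 4 * t)  # smooth path

    Psi = idct(np.eye(n), axis=0, norm="ortho")     # DCT synthesis basis
    rows = np.sort(np.random.choice(n, m, replace=False))
    A, y = Psi[rows], x[rows]                       # m random samples

    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=10, fit_intercept=False)
    x_hat = Psi @ omp.fit(A, y).coef_               # reconstructed trajectory
    print("rmse: %.4f" % np.sqrt(np.mean((x - x_hat) ** 2)))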
1022 A Decade of Database Research Publications Sherif Sakr
NICTA and School of Computer Science and Engineering,
University of New South Wales, Australia
ssakr@cse.unsw.edu.au

Mohammad Alomari
School of Information Technologies,
University of Sydney, Australia
miomari@it.usyd.edu.au
We analyze the database research publications of four major core database technology conferences (SIGMOD, VLDB, ICDE, EDBT), two main theoretical database conferences (PODS, ICDT) and three database journals (TODS, VLDB Journal, IEEE TKDE) over 10 years (2001 - 2010). Our analysis considers only regular papers; we do not include short papers, demo papers, posters, tutorials or panels in our statistics. We rank research scholars according to their number of publications in each conference/journal separately and in combination. We also report on the growth in the number of research publications and the size of the research community over the last decade.
1021 Robust Gait Recognition Based on Procrustes Shape Analysis of Pairwise Configuration Worapan Kusakunniran
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: wkus036@cse.unsw.edu.au

Qiang Wu
School of Computing and Communications
University of Technology Sydney
NSW 2007, Australia
E-mail: qiang.wu@uts.edu.au

Jian Zhang
National ICT Australia
NSW 2033, Australia
E-mail: jian.zhang@nicta.com.au

Hongdong Li
Research School of Information Sciences and Engineering
Australian National University
ACT 0200, Australia
E-mail: hongdong.li@anu.edu.au
In this paper, we further develop Procrustes Shape Analysis (PSA) for robust gait recognition, and present significantly improved performance. Within the framework of PSA, the Procrustes Mean Shape (PMS) has been widely applied in the literature to extract the signature of a gait sequence; the similarity of gaits is then measured using the Procrustes Distance (PD). To describe the shape, conventional PSA uses the Centroid Shape Configuration (CSC) as the descriptor, which embeds global shape information. However, in our observation CSC significantly limits the performance of PSA for gait recognition. Its global representation cannot address the challenges of gait recognition, which involves significant shape (i.e. body pose) changes caused by various factors such as the dressing of individuals, changes of walking speed, changes of walking direction, etc. In this paper, a novel Pairwise Shape Configuration (PSC) is proposed as a new shape descriptor in PSA. The proposed PSC embeds local gait information as well as relevant global information, so it is robust to the challenges mentioned above. In our view, a reliable PSC depends on consistent shape re-sampling, which is relevant to the gait similarity measure using PD in the later stage. In this paper, the context of body pose through the gait sequence is introduced, which significantly assists body shape re-sampling. Our method has been tested on large gait databases under various walking environments. Extensive experimental results show that the proposed method outperforms several other methods in the literature by a large margin, including existing PSA-based methods.
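For readers unfamiliar with PSA, the sketch below computes a Procrustes disparity between two 2-D shape configurations with SciPy; in gait recognition the configurations would be re-sampled silhouette boundaries, whereas the toy contours here are invented.

    import numpy as np
    from scipy.spatial import procrustes

    theta = np.linspace(0, 2 * np.pi, 40, endpoint=False)
    shape1 = np.c_[np.cos(theta), np.sin(theta)]            # reference contour
    shape2 = np.c_[1.2 * np.cos(theta + 0.1) + 3.0,         # shifted, scaled,
                   1.2 * np.sin(theta + 0.1)]               # slightly rotated

    _, _, disparity = procrustes(shape1, shape2)   # removes translation,
    print(disparity)                               # scale and rotation first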
1020 Stochastic Skyline Operator Xuemin Lin, Ying Zhang, Wenjie Zhang, Muhammad Aamir Cheema
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
Emails: {lxue, yingz, zhangw, macheema}@cse.unsw.edu.au
In many applications involving multi-criteria optimal decision making, users may often want to make a personal trade-off among all optimal solutions. As a key feature, the skyline in a multi-dimensional space provides the minimum set of candidates for such purposes by removing all points not preferred by any (monotonic) utility/scoring function; that is, the skyline removes all objects not preferred by any user, no matter how their preferences vary. Driven by many applications with uncertain data, the probabilistic skyline model has been proposed to retrieve uncertain objects based on skyline probabilities. Nevertheless, skyline probabilities cannot capture the preferences of monotonic utility functions. Motivated by this, in this paper we propose a novel skyline operator, namely the stochastic skyline. In the light of the expected utility principle, the stochastic skyline is guaranteed to provide the minimum set of candidates for the optimal solutions over all possible monotonic utility functions. In contrast to conventional skyline or probabilistic skyline computation, we show that the problem of stochastic skyline is NP-complete with respect to the dimensionality. Novel and efficient algorithms are developed to compute the stochastic skyline over multidimensional uncertain data, which run in polynomial time if the dimensionality is fixed. We also show, by theoretical analysis and experiments, that the size of the stochastic skyline is quite similar to that of the conventional skyline over certain data. Comprehensive experiments demonstrate that our techniques are efficient and scalable regarding both CPU and IO costs.
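As a baseline for the certain-data case, a conventional skyline is just dominance filtering (minimisation convention assumed here); the stochastic skyline generalises this to uncertain objects.

    import numpy as np

    def dominates(a, b):
        # a dominates b: no worse everywhere, strictly better somewhere
        return np.all(a <= b) and np.any(a < b)

    def skyline(points):
        return [p for p in points
                if not any(dominates(q, p) for q in points if q is not p)]

    pts = list(np.random.rand(200, 3))
    print("skyline size:", len(skyline(pts)))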
1019 Extending SPARQL to Support Entity Grouping and Path Queries Seyed-Mehdi-Reza Beheshti
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: sbeheshti@cse.unsw.edu.au

Sherif Sakr
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: ssakr@cse.unsw.edu.au

Boualem Benatallah
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: boualem@cse.unsw.edu.au

Hamid Reza Motahari-Nezhad
HP Labs Palo Alto
CA 94304, USA
E-mail: hamid.motahari@hp.com
The ability to efficiently find subgraphs and paths in a large graph relevant to a given query is important in many applications including scientific data analysis, social networks, and business intelligence. Currently, there is little support and there are no efficient approaches for expressing and executing such queries. This paper proposes a data model and a query language to address this problem. The contributions include supporting the construction and selection of: (i) folder nodes, representing a set of related entities, and (ii) path nodes, representing a set of paths in which a path is the transitive relationship of two or more entities in the graph. Folders and paths can be stored and used for future queries. We introduce FPSPARQL, an extension of SPARQL that supports folder and path nodes. We have implemented a query engine that supports FPSPARQL, and the evaluation results show its viability and efficiency for querying large graph datasets.
1018 Personal Process Management: Design and Execution for End-Users Ingo Weber
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: ingo.weber@cse.unsw.edu.au

Hye-Young Paik
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: hpaik@cse.unsw.edu.au

Boualem Benatallah
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: boualem@cse.unsw.edu.au

Corren Vorwerk
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia

Zifei Gong
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia

Liangliang Zheng
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia

Sung Wook Kim
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
In many cases, it is not cost-effective to automate business processes. Such processes often affect a small number of people and/or change frequently. In this report, we present a novel approach that enables end-users to model and deploy processes they encounter in their daily work. We describe the current status of our research and prototype development on personal process management. In our approach, processes are modelled exclusively from the viewpoint of a single user, which avoids many complicated constructs. The modelling can therefore rely on simple process representations, which can be understood as easily as a cooking recipe or an audio playlist. This simplicity is achieved by allowing only a few activity types in the process: filling in forms and manual tasks. The process models can be translated to an executable format and deployed, including an automatically generated Web interface for user interaction.
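A personal process in this style is little more than an ordered list of form and manual steps; the schema below is invented for illustration, not the prototype's actual format.

    process = [
        {"type": "form",   "name": "Leave request", "fields": ["from", "to"]},
        {"type": "manual", "name": "Hand form to supervisor"},
        {"type": "form",   "name": "Travel booking", "fields": ["destination"]},
    ]

    for step in process:                     # playlist-style walk-through
        if step["type"] == "form":
            print("Fill form:", step["name"], "- fields:", step["fields"])
        else:
            print("Manual task:", step["name"])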
1017 Influence Zone and Its Applications in Reverse k Nearest Neighbors Processing Muhammad Aamir Cheema
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: macheema@cse.unsw.edu.au

Xuemin Lin
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: lxue@cse.unsw.edu.au

Wenjie Zhang
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: zhangw@cse.unsw.edu.au

Ying Zhang
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: yingz@cse.unsw.edu.au
Given a set of objects and a query q, a point p is called a reverse k nearest neighbor (RkNN) of q if q is one of the k closest objects of p. In this paper, we introduce the concept of the influence zone, the area such that every point inside it is an RkNN of q and every point outside it is not. The influence zone has several applications in location-based services, marketing and decision support systems. It can also be used to efficiently process RkNN queries. First, we present an efficient algorithm to compute the influence zone. Then, based on the influence zone, we present efficient algorithms to process RkNN queries that significantly outperform the best known existing techniques for both snapshot and continuous RkNN queries. We also present a detailed theoretical analysis of the area of the influence zone and the IO costs of our RkNN processing algorithms. Our experiments demonstrate the accuracy of our theoretical analysis.
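The RkNN definition itself is easy to state in code; the brute-force check below is only the naive baseline that the influence zone is designed to avoid.

    import numpy as np

    def is_rknn(p, q, objects, k):
        # p is an RkNN of q iff fewer than k objects are closer to p than q is
        d_pq = np.linalg.norm(p - q)
        closer = sum(np.linalg.norm(p - o) < d_pq for o in objects)
        return closer < k

    rng = np.random.default_rng(0)
    objs, cands = rng.random((100, 2)), rng.random((50, 2))
    q = np.array([0.5, 0.5])
    print(sum(is_rknn(p, q, objs, k=3) for p in cands))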
1016 Characterization of Asymmetry in Low-Power Wireless Links: An Empirical Study Prasant Misra
School of Computer Science and Engineering
The University of New South Wales
NSW 2052, Australia
E-mail: pkmisra@cse.unsw.edu.au

Nadeem Ahmed
School of Computer Science and Engineering
The University of New South Wales
NSW 2052, Australia
E-mail: nahmed@cse.unsw.edu.au

Sanjay Jha
School of Computer Science and Engineering
The University of New South Wales
NSW 2052, Australia
E-mail: sanjay@cse.unsw.edu.au
Experimental studies in wireless sensor networks (WSNs) have shown that asymmetry in low-power wireless links has a significant effect on the performance of WSN network protocols. Protocols that work in simulation studies often fail when link asymmetry is encountered in real deployments. Characterization of link asymmetry is thus important for the design and operation of resilient WSN protocols in real scenarios. This paper details an empirical study to characterize link asymmetry in WSNs. It presents a systematic approach to measuring the effects of hardware performance, environmental factors and temporal properties on link asymmetry, using off-the-shelf WSN devices. It shows that differences in the reception power of WSN devices operating within the receiver's critical sensitivity band of approximately [-80dBm, -90dBm], transmit-receive switching operation, environmental factors, and high traffic load are the major factors responsible for asymmetry in low-power wireless links, while frequency misalignment in the transmitter and power variations in the antenna are unlikely causes.
1015 Ear-Phone: A context-aware End-to-End Participatory Urban Noise Mapping System Rajib Rana
School of Computer Science and Engineering
The University of New South Wales
NSW 2052, Australia
E-mail: rajibr@cse.unsw.edu.au

Chun Tung Chou
School of Computer Science and Engineering
The University of New South Wales
NSW 2052, Australia
E-mail: ctchou@cse.unsw.edu.au

Wen Hu
CSIRO, Australia
E-mail: wen.hu@csiro.au

Nirupama Bulusu
Department of Computer Science
Maseeh College of Engineering and Computer Science
Portland State University, Portland
Email: nbulusu@cs.pdx.edu

Salil Kanhere
School of Computer Science and Engineering
The University of New South Wales
NSW 2052, Australia
E-mail: salilk@cse.unsw.edu.au
A noise map facilitates monitoring of environmental noise pollution in urban areas. It can raise citizen awareness of noise pollution levels, and aid in the development of mitigation strategies to cope with the adverse effects. However, state-of-the-art techniques for rendering noise maps in urban areas are expensive and rarely updated (months or even years), as they rely on population and traffic models rather than on real data. Participatory urban sensing can be leveraged to create an open and inexpensive platform for rendering up-to-date noise maps. In this paper, we present the design, implementation and performance evaluation of an end-to-end participatory urban noise mapping system called Ear-Phone. Ear-Phone, for the first time, leverages Compressive Sensing to address the fundamental problem of recovering the noise map from incomplete and random samples obtained by crowdsourcing data collection. Ear-Phone, implemented on Nokia N95, N97 and HP iPAQ mobile devices, also addresses the challenge of collecting accurate noise pollution readings at a mobile device. Ear-Phone also leverages context-aware sensing, and we study the impact of using data from different contexts upon noise map reconstruction. Extensive simulations and outdoor experiments demonstrate that Ear-Phone is a feasible platform to assess noise pollution, incurring reasonable system resource consumption at mobile devices and providing high reconstruction accuracy of the noise map.
1013 Multi-hop Performance Analysis of Whisper Cognitive Radio Networks Quanjun Chen, Chun Tung Chou, Salil S. Kanhere, Wei Zhang(*), Sanjay K. Jha
School of Computer Science and Engineering
University of New South Wales, Australia
E-mail: {quanc, ctchou, salilk, sanjay}@cse.unsw.edu.au


(*)School of Electrical Engineering and Telecommunications,
The University of New South Wales, Sydney, Australia,
Email: w.zhang@unsw.edu.au
Spectrum scarcity has been one of the main challenges facing wireless communications. Cognitive Radio Networks (CRNs) allow secondary users to opportunistically utilize the licensed spectrum dedicated to primary users. In 2003, the Federal Communications Commission (FCC) released a spectrum policy called the "Interference Temperature model". Under this model, secondary users are allowed to access the licensed spectrum simultaneously with the primary users, provided that the interference at the primary receiver stays below a certain threshold. We refer to Cognitive Radio Networks that employ this model as "Whisper CRNs" (since secondary users have to use lower transmission power to satisfy the interference constraint). In this work, we analyze the performance of multi-hop whisper CRNs and aim to answer the fundamental question: what throughput can whisper CRNs achieve, and what factors affect it? We consider a set of realistic network protocols, including two-ray and fading radio models at the physical layer and a geographic routing protocol at the network layer. The results quantitatively show that, while the primary users are busy, secondary users using whisper CRNs can achieve a considerably high end-to-end throughput in some cases (compared to the zero throughput in conventional CRNs, where secondary users are prohibited from using the channel when primary users are busy). We also show that the radio propagation characteristics and the node density of secondary users have a significant impact on the performance of whisper CRNs.
1012 Integrating local Action Elements using Implicit Shape Model for Action Matching Tuan Hue Thi
NICTA and School of Computer Science and Engineering,
University of New South Wales, Australia
TuanHue.Thi@nicta.com.au

Jian Zhang
School of Computer Science and Engineering,
University of New South Wales, Australia
Jian.Zhang@nicta.com.au
In this paper we propose an approach for analyzing human action in video, and demonstrate its application to several related tasks: action retrieval, action classification and action localization. In our work, actions are represented as a set of local space-time features, or Action Elements; an Implicit Shape Model of the action is then built to integrate the spatial and temporal correlations of these local Action Elements. In particular, we propose two different approaches to extracting Action Elements: a discriminative and a generative approach. Action Elements detected by either approach are then used to build an Implicit Shape Model for the current action of interest. We apply our action matching algorithm to action recognition and carry out thorough experiments on two well-known action datasets, KTH and Weizmann. The results demonstrate the highly competitive performance of our action matching algorithm compared to the state of the art, and also give evaluative insights into the two proposed feature extraction approaches.
1011 Disjunctive logic programs, answer sets, and the cut rule Eric A. Martin
School of Computer Science and Engineering
University of New South Wales, Australia
E-mail: emartin@cse.unsw.edu.au
Minker has proposed a semantics for negation-free disjunctive logic programs that offers a natural generalization of the fixed point semantics for definite logic programs. We show that this semantics can be further generalized for disjunctive logic programs with classical negation, in a constructive modal-theoretic framework where rules are built from assertions and hypotheses, namely formulas of the form □F and ◇□F respectively, where F is a literal, yielding a "base semantics" for general disjunctive logic programs. Model-theoretically, this base semantics is expressed in terms of a classical notion of logical consequence. It has a complete proof procedure based on a general form of the cut rule. Usually, alternative semantics of logic programs amount to a particular interpretation of nonclassical negation as "failure to derive." The counterpart in our framework is to complement the original program with a set of hypotheses required to satisfy specific conditions, and apply the base semantics to the resulting set. We demonstrate the approach for the answer-set semantics. The proposed framework is purely classical in mainly three ways. First, it uses classical negation as the unique form of negation. Second, it advocates the computation of logical consequences rather than of particular models. Third, it makes no reference to a notion of preferred or minimal interpretation.
1010 On Measuring Fidelity of Estimation Models Haris Javaid
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: harisj@cse.unsw.edu.au

Sri Parameswaran
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: sridevan@cse.unsw.edu.au
Estimation models play a vital role in many aspects of day-to-day life. Extremely complex estimation models are employed in the design space exploration of SoCs, and the efficacy of these estimation models is usually measured by the absolute error of the model compared to a known actual result. Such absolute-error-based metrics can often result in over-designed estimation models, and a number of researchers have suggested that the fidelity of an estimation model should be examined in addition to, or instead of, the absolute error. In this paper, for the first time, we propose four metrics to measure the fidelity of an estimation model, in particular for use in design space exploration. The first two are based on two well-known rank correlation coefficients. The other two are weighted versions of the first two metrics, giving importance to points nearer the Pareto front. The proposed fidelity metrics were calculated for a single-processor estimation model and a multiprocessor estimation model to observe their behavior, and were compared against the models' absolute error.
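The two unweighted metrics build on well-known rank correlation coefficients; the sketch below scores a model's fidelity with Kendall's tau and Spearman's rho on invented numbers, omitting the paper's Pareto-front weighting.

    import numpy as np
    from scipy.stats import kendalltau, spearmanr

    actual = np.array([10.0, 12.5, 9.1, 20.3, 15.2])      # measured costs
    estimated = np.array([11.0, 12.0, 9.5, 18.9, 16.0])   # model estimates

    print("Kendall tau: %.2f" % kendalltau(actual, estimated)[0])
    print("Spearman rho: %.2f" % spearmanr(actual, estimated)[0])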
1009 HUBCODE: Hub-based Forwarding Using Network Coding in Delay Tolerant Networks Shabbir Ahmed
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: shabbira@cse.unsw.edu.au

Salil S. Kanhere
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: salilk@cse.unsw.edu.au
Most people-centric delay tolerant networks have been shown to exhibit power-law behavior. Analysis of the temporal connectivity graph of such networks reveals the existence of hubs, a fraction of the nodes which are collectively connected to the rest of the nodes. In this paper, we propose a novel forwarding strategy called HubCode, which seeks to use the hubs as message relays. The hubs employ random linear network coding to encode multiple messages addressed to the same destination, thus reducing the forwarding overheads. Further, the use of the hubs as relays ensures that most messages are delivered to their destinations. Two versions of HubCode are presented, with each scheme exhibiting contrasting behavior in terms of computational costs and routing overheads. We formulate a mathematical model for the message delivery delay and present a closed-form expression for it. We validate our model and demonstrate the efficacy of our solutions in comparison with other forwarding schemes by simulating a large-scale vehicular DTN using empirically collected movement traces of a city-wide public transport network. Under pragmatic assumptions, which account for short contact durations between nodes, our schemes outperform comparable strategies by more than 20%.
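Random linear network coding at a hub can be sketched over GF(2), where coding is plain XOR; the field size and packet counts are illustrative simplifications, not HubCode's parameters.

    import numpy as np

    rng = np.random.default_rng(1)
    k, n_bytes = 4, 8
    msgs = rng.integers(0, 256, (k, n_bytes), dtype=np.uint8)  # same destination

    def encode():
        c = rng.integers(0, 2, k, dtype=np.uint8)     # random GF(2) coefficients
        if not c.any():
            c[rng.integers(k)] = 1
        return c, np.bitwise_xor.reduce(msgs[c == 1], axis=0)

    coded = [encode() for _ in range(6)]              # hub emits combinations

    def gf2_rank(mat):                                # Gaussian elimination
        rows = [int("".join(map(str, r)), 2) for r in mat]
        rank = 0
        while rows:
            r = rows.pop()
            if r:
                rank += 1
                msb = 1 << (r.bit_length() - 1)
                rows = [x ^ r if x & msb else x for x in rows]
        return rank

    C = np.array([c for c, _ in coded])
    print("decodable:", gf2_rank(C) >= k)             # need k independent combos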
1008 A Study of Spatial Packet Loss Correlation in 802.11 Wireless Networks Zhe Wang
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: zhewang@cse.unsw.edu.au

Mahbub Hassan
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: mahbub@cse.unsw.edu.au

Tim Moors
School of Electrical Engineering and Telecommunications
University of New South Wales
NSW 2052, Australia
E-mail: moors@ieee.org
This report examines the spatial correlation of packet loss events in IEEE 802.11 wireless networks. We confirm previous experimental studies that showed low correlation of loss for receivers far from the transmitter, but we find that the correlation falls as the distance between the correlated receivers and the transmitter increases; that is, loss can be highly correlated for receivers close to the transmitter.
1007 A Brinkmanship Game Theory Model for Competitive Wireless Networking Environment Jahan A. Hassan, Mahbub Hassan, and Sajal K. Das Mobile handset manufacturers are introducing new features that allow a user to configure the same handset for seamless operation with multiple wireless network providers. As competitiveness in the wireless network service market intensifies, such products will give mobile users greater freedom to switch providers dynamically for a better price or quality of experience. For example, when faced with an unexpected wireless link quality problem, the user could choose to physically switch providers, or she could be more strategic and use her freedom to switch providers as a 'psychological weapon' to force the current provider to upgrade the link quality without delay. In this paper, we explore the latter option, where users threaten to quit the current provider unless it takes immediate action to improve the link quality. By threatening the provider, the user must accept the risk of having to disconnect from the current provider and reconnect to another in the middle of a communication session, should the provider defy the threat. The user therefore has to carefully assess the merit of issuing such threats. To analyze the dynamics of this scenario, we formulate the problem as a brinkmanship game theory model. As a function of the user's and provider's payoff or utility values, we derive conditions under which the user can expect to gain from adopting the brinkmanship strategy. The effect of uncertainties in payoff values is analyzed using Monte Carlo simulation, which confirms that brinkmanship can be an effective strategy under a wide range of scenarios. Since user threats must be credible to the provider for the brinkmanship model to work, we discuss possible avenues for achieving threat credibility in the context of mobile communications.
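The core comparison can be sketched as a Monte Carlo estimate of the expected gain from threatening versus tolerating the bad link; every payoff value and distribution below is hypothetical, not taken from the paper.

    import random

    def mean_gain(trials=100000):
        total = 0.0
        for _ in range(trials):
            p_yield = random.uniform(0.3, 0.7)     # uncertain provider response
            u_upgrade = random.uniform(6, 10)      # payoff if link is fixed
            u_switch = random.uniform(0, 4)        # payoff after mid-call switch
            u_stay = 5.0                           # payoff of tolerating the link
            threat = p_yield * u_upgrade + (1 - p_yield) * u_switch
            total += threat - u_stay
        return total / trials

    print("mean gain from threatening: %.2f" % mean_gain())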
1006 An Energy Efficient Instruction Prefetching Scheme for Embedded Processors Ji Gu, Hui Guo
School of Computer Science and Engineering,
University of New South Wales, Australia
{jigu,huig}@cse.unsw.edu.au
Existing instruction prefetching schemes improve performance with significant energy sacrifice, making them unsuitable for embedded and ubiquitous systems where high performance and low energy consumption are both demanded. In this paper, we reduce the energy overhead of instruction prefetching by using a simple prefetching hardware/software design and an efficient prefetching operation scheme. Two approaches are investigated: first, Decoded Loop Instruction Cache based Prefetching (DLICP), which is most effective for loop-intensive applications; and second, DLICP enhanced with the popular existing Next Line Prefetching (NLP), for applications with a moderate number of loops. Our experimental results show that both DLICP and enhanced DLICP deliver improved performance at greatly reduced energy overhead: the enhanced DLICP improves performance by up to 21% at about 3.5% energy overhead, compared to the maximum 11% performance improvement and 49% energy overhead of NLP.
1005 A Unified Framework for Computing Best Pairs Queries Muhammad Aamir Cheema
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: macheema@cse.unsw.edu.au

Xuemin Lin
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: lxue@cse.unsw.edu.au

Haixun Wang
Microsoft Research Asia
Hai Dian District
100190 Beijing, China
E-mail: haixunw@microsoft.com

Jianmin Wang
School of Software
Tsinghua University
100084 Beijing, China
E-mail: jimwang@tsinghua.edu.cn

Wenjie Zhang
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: zhangw@cse.unsw.edu.au
Top-k pairs queries have many real applications. k closest pairs queries, k furthest pairs queries and their bichromatic variants are a few examples of top-k pairs queries that rank pairs on distance functions. While these queries have received significant research attention, there is no unified approach that can efficiently answer all of them. Moreover, no existing work supports top-k pairs queries based on generic ranking functions. In this paper, we present a unified approach that supports a broad class of top-k pairs queries including those mentioned above. Our approach allows users to define a local scoring function for each attribute involved in the query and a global scoring function that computes the final score of each pair by combining its scores on different attributes. The proposed framework also supports skyline pairs queries; that is, it returns the pairs that are not dominated by any other pair. We propose efficient internal and external memory algorithms, and our theoretical analysis shows that the expected performance of the algorithms is optimal when two or fewer attributes are involved. Our approach does not require any pre-built indexes and is parallelizable.
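The local/global scoring interface can be shown with a brute-force top-k pairs query; the scoring functions and data are illustrative, and the paper's optimised algorithms are not reproduced here.

    import heapq
    from itertools import combinations

    points = [(1, 9), (4, 2), (7, 5), (3, 3), (8, 1)]
    local = lambda a, b: abs(a - b)                  # per-attribute score
    combine = lambda s1, s2: s1 + s2                 # global scoring function

    def score(p, q):
        return combine(local(p[0], q[0]), local(p[1], q[1]))

    top3 = heapq.nsmallest(3, combinations(points, 2), key=lambda pq: score(*pq))
    print(top3)                                      # k closest pairs under score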
1004 Rapid Runtime Estimation Methods for Pipelined MPSoCs targeting Streaming Applications Haris Javaid
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: harisj@cse.unsw.edu.au

Sri Parameswaran
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: sridevan@cse.unsw.edu.au
The pipelined Multiprocessor System on Chip (MPSoC) paradigm is well suited to the data flow nature of streaming applications, specifically multimedia applications. A pipelined MPSoC is a system where processors are connected in a pipeline. To balance the pipeline for high throughput and reduced area footprint, Application Specific Instruction set Processors (ASIPs) are used as the building blocks. Each ASIP in the system has a number of configurations which differ by instruction sets and cache sizes. The design space of a pipelined MPSoC is all the possible permutations of the ASIP configurations. To estimate the runtime of a pipelined MPSoC with one combination of ASIP configurations, designers typically perform cycle-accurate simulation of the whole pipelined MPSoC. Since the number of possible combinations of ASIP configurations (design points) can be in the order of billions, estimation methods are necessary. In this paper, we propose two methods to estimate the runtime of a pipelined MPSoC, minimizing the use of slow cycle-accurate simulations. The first method performs cycle-accurate simulations of individual ASIP configurations rather than the whole system, and then utilizes an analytical model of the pipelined MPSoC to estimate its runtime. In the second method, runtimes of individual ASIP configurations are estimated using an analytical processor model. These estimated runtimes of individual ASIP configurations are then used in the pipelined MPSoC's analytical model to estimate its runtime. By evaluating our approach on four benchmarks, we show that the maximum estimation error is 5.91% and 13.21%, with an average estimation error of 2.28% and 5.91% for the first and second method respectively. The time to cycle-accurately simulate the whole design space of a pipelined MPSoC is in the order of years, as design spaces with at least 10^10 design points are considered in this paper. However, the time for cycle-accurate simulations of individual ASIP configurations (first method) is days, while the time to simulate a subset of ASIP configurations and estimate their runtimes (second method) is only several hours. Once these simulations are done, the runtime of each design point can simply be estimated using the pipelined MPSoC's analytical estimation equation.
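A common analytical model for such a pipeline, shown below, charges the fill time once and then one slowest-stage latency per remaining iteration; it is an illustrative stand-in for the paper's estimation equation, using per-stage runtimes obtained from individual ASIP simulation or estimation.

    def pipelined_runtime(stage_cycles, iterations):
        fill = sum(stage_cycles)                       # first item fills the pipe
        steady = max(stage_cycles) * (iterations - 1)  # bottleneck-stage rate
        return fill + steady

    # three ASIP stages at 120/200/150 cycles; the 200-cycle stage dominates
    print(pipelined_runtime([120, 200, 150], iterations=1000))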
1003 Video Quality Prediction in the Presence of MAC Contention and Wireless Channel Error Werayut Saesue
School of Computer Science and Engineering,
University of New South Wales, Australia
wsae207@cse.unsw.edu.au

Chun Tung Chou
School of Computer Science and Engineering,
University of New South Wales, Australia
ctchou@cse.unsw.edu.au

Jian Zhang
National ICT Australia, Sydney 1466, Australia
jian.zhang@nicta.com.au
In order to provide adequate QoS for both multimedia and data traffic in a wireless network, it is necessary to develop models that can predict the quality of both multimedia delivery (measured by decoded video quality) and data traffic (measured by throughput) in a wireless environment. This paper proposes an analytical model to predict the quality of video, expressed in terms of the mean square error (MSE) of the received video frames, in an IEEE 802.11e wireless network. The proposed model takes into account contention at the MAC layer, wireless channel error, queueing at the MAC layer, the parameters of different 802.11e access categories, and the video characteristics of different H.264 data partitions. To the best of the authors' knowledge, this is the first model that takes these network and video characteristics into consideration to predict video quality in an IEEE 802.11e network. The proposed model consists of two components. The first component predicts the packet loss rate of each H.264 data partition by using a multi-dimensional discrete-time Markov chain coupled to an M/G/1 queue. The second component uses these packet loss rates and the video characteristics to predict the MSE of each received video frame. We verify the accuracy of our analytical model using discrete event simulation and real H.264 coded video sequences.
1002 Mobility and Traffic Adaptive Position Update for Geographic Routing Quan Jun Chen, Salil S. Kanhere, Mahbub Hassan
School of Computer Science and Engineering
University of New South Wales, Australia
E-mail: {quanc, salilk, mahbub}@cse.unsw.edu.au
In geographic routing, nodes need to maintain up-to-date positions of their immediate neighbors to make effective forwarding decisions. Periodic broadcasting of beacon packets containing the geographic location coordinates of the nodes is a popular method used by most geographic routing protocols to maintain neighbor positions. We contend and demonstrate that periodic beaconing, regardless of node mobility and traffic patterns in the network, is not attractive from either the update cost or the routing performance point of view. We propose the Adaptive Position Update (APU) strategy for geographic routing, which dynamically adjusts the frequency of position updates based on the mobility dynamics of the nodes and the forwarding patterns in the network. APU is based on two simple principles: (i) nodes whose movements are harder to predict update their positions more frequently (and vice versa), and (ii) nodes closer to forwarding paths update their positions more frequently (and vice versa). Our theoretical analysis, which is validated by NS2 simulations of a well-known geographic routing protocol, the Greedy Perimeter Stateless Routing (GPSR) protocol, shows that APU can significantly reduce the update cost and improve routing performance in terms of packet delivery ratio and average end-to-end delay, in comparison with periodic beaconing and other recently proposed updating schemes. The benefits of APU are further confirmed by evaluations in realistic network scenarios, which account for localization error, realistic radio propagation, and a practical vehicular ad-hoc network that exhibits realistic movement patterns of public transport buses in a metropolitan city.
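The first principle can be sketched as dead-reckoning-triggered beaconing: broadcast only when the true position drifts from what neighbours would predict; the drift threshold is an assumed parameter, not APU's derived value.

    import numpy as np

    def needs_beacon(last_pos, last_vel, elapsed, true_pos, threshold=10.0):
        predicted = last_pos + last_vel * elapsed   # neighbours' linear prediction
        return np.linalg.norm(true_pos - predicted) > threshold

    last_pos, last_vel = np.array([0.0, 0.0]), np.array([5.0, 0.0])
    print(needs_beacon(last_pos, last_vel, 4.0, np.array([18.0, 9.0])))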
1001 Dynamic Video Buffer Compensation Evan Tan
NICTA and School of Computer Science and Engineering,
University of New South Wales, Australia
evan.tan@nicta.com.au

Chun Tung Chou
School of Computer Science and Engineering,
University of New South Wales, Australia
ctchou@cse.unsw.edu.au
Decoder buffer underflow is one of the main issues in video streaming, brought about mainly by time-varying network conditions. It affects the user-perceived video quality, as playback must stop to allow rebuffering. A large enough buffer would, in theory, solve this issue, but it introduces delays that might violate the application's delay requirement. One way to tackle this issue is buffer compensation: allowing more video data to be transmitted and stored in the buffer. We propose to increase the encoding frame rate at the encoder (so more frames are sent) and to decrease the playout frame rate at the decoder (so fewer frames are played), in order to reduce the probability of decoder underflow. However, adjusting these two parameters has an adverse effect on both the frame quality and the playout quality. We therefore further propose an optimization framework based on Lyapunov drift analysis that adjusts the encoding and playout frame rates to maximize the frame and playout qualities while maintaining the buffer at an acceptable level. Simulation results show that our proposed framework improves on a typical setup with fixed encoding and playout frame rates by up to 70% in video utility.
0921 Modeling and Verification of NoC Communication Interfaces Vinitha Palaniveloo
School of Computer Science and Engineering
The University of New South Wales
NSW 2052, Australia
E-mail: vinithaap@cse.unsw.edu.au

Arcot Sowmya
School of Computer Science and Engineering
The University of New South Wales
NSW 2052, Australia
E-mail: sowmya@cse.unsw.edu.au

Sridevan Parameswaran
School of Computer Science and Engineering
The University of New South Wales
NSW 2052, Australia
E-mail: sridevan@cse.unsw.edu.au
The concept of Network on Chip (NoC) addresses the communication requirements on-chip and decouples communication from computation. One of the challenges faced by designers of NoCs is verifying the correctness of the communication scheme for a NoC architecture. NoCs borrow networking concepts from computer networks to interconnect complex Intellectual Property (IP) cores on chip. The applications on IP cores communicate with peer applications through a communication architecture that consists of a layered communication protocol, routers and switches. The absence of an integrated architectural model poses the challenge of performing end-to-end verification of the communication scheme. The formal models of NoCs proposed so far in the literature focus on modeling parts of the communication architecture, such as specific layers of the communication protocol, routers or the network topology, but not an integrated architectural model. This is attributed to the absence of an expressive modeling language able to model all the modules of a NoC. The NoC communication architecture is heterogeneous, as it consists of both synchronous and asynchronous nodes and communication pipelines. In this project, we propose a heterogeneous formalism for the modeling and verification of NoCs. The proposed modeling language is based on formal methods, as they provide precise semantics and mathematics-based tools and techniques for specification and verification.
0920 Ear-Phone: An End-to-End Participatory Urban Noise Mapping System Rajib Kumar Rana
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: rajibr@cse.unsw.edu.au

Chun Tung Chou
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: ctchou@cse.unsw.edu.au

Salill Kanhere
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: salilk@cse.unsw.edu.au

Nirupama Bulusu
Department of Computer Science
Portland State University, USA
E-mail: nbulusu@cs.pdx.edu

Wen Hu
CSIRO ICT Centre, Australia
E-mail: wen.hu@csiro.au
A noise map facilitates monitoring of environmental noise pollution in urban areas. It can raise citizen awareness of noise pollution levels, and aid in the development of mitigation strategies to cope with the adverse effects. However, state-of-the-art techniques for rendering noise maps in urban areas are expensive and rarely updated (months or even years), as they rely on population and traffic models rather than on real data. Participatory urban sensing can be leveraged to create an open and inexpensive platform for rendering up-to-date noise maps. In this paper, we present the design, implementation and performance evaluation of an end-to-end participatory urban noise mapping system called Ear-Phone. Ear-Phone, for the first time, leverages Compressive Sensing to address the fundamental problem of recovering the noise map from incomplete and random samples obtained by crowdsourcing data collection. Ear-Phone, implemented on Nokia N95 and HP iPAQ mobile devices, also addresses the challenge of collecting accurate noise pollution readings at a mobile device. We evaluate Ear-Phone with extensive simulations and outdoor experiments, which demonstrate that it is a feasible platform to assess noise pollution, with reasonable system resource consumption at mobile devices and high reconstruction accuracy of the noise map.
0919 Spreadsheet-based Complex Data Transformation Regis Saint-Paul
CREATE-NET, Italy
regis.saint-paul@create-net.org

Hung Vu
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: vthung@cse.unsw.edu.au

Ghazi Al-Naymat
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: ghazi@cse.unsw.edu.au

Boualem Benatallah
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: boualem@cse.unsw.edu.au
Spreadsheets are used by millions of users as a routine all-purpose data management tool. It is now increasingly necessary for external applications and services to consume spreadsheet data. In this paper, we investigate the problem of transforming spreadsheet data to the structured formats required by these applications and services. Unlike prior methods, we propose a novel approach in which the transformation logic is embedded into a familiar and expressive spreadsheet-like formula mapping language. All transformation patterns commonly provided by popular transformation languages and mapping tools are supported in the language. Consequently, the language avoids cluttering the source document with transformations and turns out to be helpful when multiple schemas are targeted. Furthermore, the language supports the generalization of a mapping from instance-level to template-level elements. This enables the language to transform a large number of naturally occurring spreadsheets, which cannot be effectively handled by alternative approaches. We implemented a prototype and evaluated the benefits of our approach via experiments in two real applications.
0918 OpenPEX: An Open Provisioning and EXecution System for Virtual Machines Srikumar Venugopal
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: srikumarv@cse.unsw.edu.au

James Broberg
Department of Computer Science and Software Engineering,
The University of Melbourne,
VIC 3010, Australia
E-mail: brobergj@csse.unimelb.edu.au

Rajkumar Buyya
Department of Computer Science and Software Engineering,
The University of Melbourne,
VIC 3010, Australia
E-mail: raj@csse.unimelb.edu.au
Virtual machines (VMs) have become capable enough to emulate full-featured physical machines in all aspects. Therefore, they have become the foundation not only for flexible data center infrastructure but also for commercial Infrastructure-as-a-Service (IaaS) solutions. However, current providers of virtual infrastructure offer simple mechanisms through which users can ask for immediate allocation of VMs. More sophisticated economic and allocation mechanisms are required so that users can plan ahead and IaaS providers can improve their revenue. This paper introduces OpenPEX, a system that allows users to provision resources ahead of time through advance reservations. OpenPEX also incorporates a bilateral negotiation protocol that allows users and providers to come to an agreement by exchanging offers and counter-offers. These functions are made available to users through a web portal and a REST-based Web service interface.
0917 Performance of Multi-hop Whisper Cognitive Radio Networks Quanjun Chen, Chun Tung Chou, Salil S. Kanhere, Wei Zhang(*), Sanjay K. Jha
School of Computer Science and Engineering
University of New South Wales, Australia
E-mail: {quanc, ctchou, salilk, sanjay}@cse.unsw.edu.au


(*)School of Electrical Engineering and Telecommunications,
The University of New South Wales, Sydney, Australia,
Email: w.zhang@unsw.edu.au
In 2003, the Federal Communications Commission (FCC) released a new spectrum policy called the "Interference Temperature model" to improve the channel-access opportunities of secondary users. Under this model, secondary users are allowed to access the licensed spectrum simultaneously with the primary users, provided that the interference at the primary receiver stays within a certain threshold. We refer to Cognitive Radio Networks (CRNs) that employ this model as "Whisper CRNs", since secondary users must use smaller transmission power to satisfy the interference constraint. In this work, we systematically analyze the performance of whisper CRNs and compare it with that of traditional CRNs, in which secondary users are not allowed to transmit while primary users are busy. We aim to answer the fundamental question: what is the performance trade-off in switching from traditional CRNs to whisper CRNs? Based on the performance analysis, we also propose an efficient channel assignment scheme with small channel-switching overhead. The results show that whisper CRNs can improve the connectivity and end-to-end throughput of secondary users considerably (by more than a factor of two in some scenarios), but at the cost of increased end-to-end delay. We also show that the node density of secondary users has a significant impact on the performance of whisper CRNs.
0916 Characterizing Human Effort in Wireless Voice Over IP Jahan A. Hassan
School of Computer Science and Engineering
University of New South Wales
Kensington, Sydney 2052, Australia
E-mail: jahan@cse.unsw.edu.au

Mahbub Hassan
School of Computer Science and Engineering
University of New South Wales
Kensington, Sydney 2052, Australia
E-mail: mahbub@cse.unsw.edu.au

Sajal K. Das
Department of Computer Science and Engineering
University of Texas at Arlington
P.O. Box 19015
Arlington, TX 76019, USA
Skype Voice Over IP (VoIP) traces from an experimental WiFi network were analyzed to detect and characterize the user effort that goes into these calls. Our analysis shows that users have a very low tolerance threshold when it comes to putting in effort to keep the conversation going (they prefer effort-less conversations). A wireless VoIP session is highly likely to be abandoned prematurely by the user if the effort threshold is exceeded during the call. Our results also suggest that after exceeding the effort threshold, users are likely to spend quite some time in the call before finally abandoning it. These effort patterns are found to be consistent across multiple users, although the actual value of the effort threshold is user-specific. An important outcome is that it is possible to reliably generate warnings for calls that will end prematurely, simply by monitoring the number of times the user has put effort into the call. Besides being reliable, these warnings can be generated well in advance, giving network controllers plenty of time to repair the wireless connection and avoid premature call endings. Using the effort data captured from our experimental network, we conduct discrete-event simulations of a WiFi VoIP network to evaluate the effectiveness of dynamic resource allocation in addressing such warnings. The experiments show that resource allocation schemes capable of exploiting the long warning lead times of effort-based predictions can find additional resources with high probability.
0915 Efficient computation of robust average in wireless sensor networks using compressive sensing Chun Tung Chou
School of Computer Science and Engineering,
University of New South Wales, Australia
ctchou@cse.unsw.edu.au

Aleksandar Ignjatovic
School of Computer Science and Engineering,
University of New South Wales, Australia
ignjat@cse.unsw.edu.au

Wen Hu
Autonomous Systems Laboratory,
CSIRO ICT Centre, Brisbane, Australia
Wen.Hu@csiro.au
Wireless sensor networks (WSNs) enable the collection of physical measurements over a large geographic area. It is often the case that we are interested in computing and tracking the spatial average of the sensor measurements over a region of the WSN. Unfortunately, the standard average operation is not robust, because it is highly susceptible to sensor faults (e.g. offset or stuck-at errors) and to variations in sensor measurement noise. In this paper, we propose a method to compute a robust average of sensor measurements, one that appropriately takes sensor faults and sensor noise into consideration, in a bandwidth- and computation-efficient manner. At the same time, the proposed method can determine which sensors are likely to be faulty. Our method achieves bandwidth efficiency by exploiting compressive sensing: instead of sending a block of sensor readings to the data fusion centre, each sensor performs random projections on its data block (as in compressive sensing) and sends the results of the projections (which we refer to as the compressed data) to the data fusion centre. At the data fusion centre, we achieve computational efficiency by working directly with the compressed data, whose dimension is only a fraction of that of the original block of sensor data; in other words, the proposed method works on the compressed data without decompressing it. From the compressed data, the method determines which sensors are likely to be faulty, as well as a robust average of the compressed data which, after decompression (compressive sensing reconstruction), yields an approximation of the robust average of the original sensor readings. The data fusion centre therefore needs to perform decompression only once to obtain the robust average, rather than decompressing the compressed data from every sensor, thus achieving computational efficiency. We apply the proposed method to data collected from a number of WSN deployments to demonstrate its efficiency and accuracy.
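A toy sketch of the compressed pipeline described above, with all details assumed for illustration: each sensor projects its data block with a shared random matrix, and the fusion centre computes a robust average directly in the compressed domain. A coordinate-wise median stands in for the paper's robust estimator, and the single final reconstruction step is omitted.

# Toy sketch of robust averaging in the compressed domain (illustrative only;
# the median stands in for the paper's robust estimator, and the final single
# CS reconstruction of the robust compressed vector is omitted).
import numpy as np

rng = np.random.default_rng(1)
n_sensors, block, m = 20, 128, 32

# Each sensor observes a common smooth field plus noise; two sensors are
# faulty (stuck-at / offset), which an ordinary mean does not tolerate.
t = np.linspace(0, 1, block)
field = np.sin(2 * np.pi * 3 * t)
readings = field + 0.05 * rng.normal(size=(n_sensors, block))
readings[0, :] = 5.0          # stuck-at fault
readings[1, :] = -7.0         # offset fault

# Shared random projection (as in compressive sensing); each sensor transmits
# only its m-dimensional compressed vector, not the full block of readings.
Phi = rng.normal(size=(m, block)) / np.sqrt(m)
compressed = readings @ Phi.T                  # shape (n_sensors, m)

# Fusion centre: robust average computed directly on the compressed data.
robust_compressed = np.median(compressed, axis=0)

# The robust compressed average tracks the projection of the true field,
# while the naive mean is pulled far off by the two faulty sensors.
print("robust error:", np.linalg.norm(robust_compressed - Phi @ field))
print("naive mean error:", np.linalg.norm(compressed.mean(axis=0) - Phi @ field))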
0914 SuSeSim: A Fast Simulation Strategy to Find Optimal L1 Cache Configuration for Embedded Systems Mohammad Shihabul Haque
School of Computer Science and Engineering,
University of New South Wales, Australia
mhaque@cse.unsw.edu.au

Andhi Janapsatya
School of Computer Science and Engineering,
University of New South Wales, Australia
andhij@cse.unsw.edu.au

Sri Parameswaran
School of Computer Science and Engineering,
University of New South Wales, Australia
sridevan@cse.unsw.edu.au
Simulation of an application is a popular and reliable approach to finding the optimal configuration of L1 cache memory for an application-specific embedded system processor. However, long simulation time is one of the main disadvantages of simulation-based approaches. In this paper, we propose a new and fast simulation method, the Super Set Simulator (SuSeSim). While previous methods use a Top-Down search strategy, SuSeSim utilizes a Bottom-Up search strategy along with an elaborate new data structure that reduces the search space needed to determine a cache hit or miss. SuSeSim can simulate hundreds of cache configurations simultaneously by reading an application's memory request trace just once. The total numbers of cache hits and misses are accurately recorded. Depending on the cache block sizes and benchmark applications, SuSeSim can reduce the number of tags to be checked by up to 43% compared to the existing fastest simulation approach (the CRCB algorithm). With the help of a faster search and an easy-to-maintain data structure, SuSeSim can be up to 94% faster in simulating memory requests than the CRCB algorithm.
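For contrast with SuSeSim's single-pass approach, the following baseline sketch (configurations and trace are assumptions for illustration, not the paper's data structure) shows ordinary trace-driven simulation of several set-associative LRU configurations. Note that this naive baseline re-reads the trace once per configuration, which is precisely the cost SuSeSim's shared Bottom-Up data structure avoids.

# Baseline trace-driven cache simulation (not SuSeSim): count hits and misses
# for several L1 configurations, re-reading the trace once per configuration.
from collections import OrderedDict

def simulate(trace, sets, ways, block):
    """Set-associative LRU cache: return (hits, misses) for one configuration."""
    cache = [OrderedDict() for _ in range(sets)]   # one LRU dict per set
    hits = misses = 0
    for addr in trace:
        line = addr // block
        s, tag = line % sets, line // sets
        if tag in cache[s]:
            cache[s].move_to_end(tag)              # LRU update on a hit
            hits += 1
        else:
            misses += 1
            cache[s][tag] = True
            if len(cache[s]) > ways:               # evict least recently used
                cache[s].popitem(last=False)
    return hits, misses

trace = [0x1000, 0x1004, 0x2000, 0x1008, 0x3000, 0x1000, 0x2004] * 100
for sets, ways, block in [(64, 1, 32), (32, 2, 32), (16, 4, 64)]:
    print((sets, ways, block), simulate(trace, sets, ways, block))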
0913 Phylogeny, Genealogy, and the Linnaean Hierarchy: Formal Proofs Rex Kwok
School of Computer Science and Engineering
University of New South Wales, Australia
E-mail: rkwok@cse.unsw.edu.au
Phylogenetic terms (monophyly, polyphyly, and paraphyly) were first used in the context of a phylogenetic tree. However, the only possible source for a phylogeny is a genealogy. This paper presents formal definitions for phylogenetic terms in a genealogical context and shows that their properties match their intuitive meanings. Moreover, by presenting the definitions in a genealogical context, a firm connection between genealogy and phylogeny is established. To support the correctness of the definitions, results show that they satisfy the appropriate properties in the context of a phylogenetic tree. Ancestors in a phylogenetic tree are viewed as theoretical entities, since no means exist for proving ancestral relationships; as such, groups of terminal species are often considered instead, which impacts phylogenetic concepts. Results are presented showing that monophyly and polyphyly have reasonable interpretations in this context, while the notion of paraphyly becomes degenerate. The vigorous debate about whether biological taxa should be monophyletic is also addressed, with results showing why the monophyletic condition makes a Linnaean classification entirely monotypic.
0912 Safety Assurance and Rescue Communication Systems in High-Stress Environments - A Mining Case Study Prasant Misra
School of Computer Science and Engineering
The University of New South Wales
NSW 2052, Australia
E-mail: pkmisra@cse.unsw.edu.au

Salil Kanhere
School of Computer Science and Engineering
The University of New South Wales
NSW 2052, Australia
E-mail: salilk@cse.unsw.edu.au

Diet Ostry
CSIRO, Australia
E-mail: Diet.Ostry@csiro.au

Sanjay Jha
School of Computer Science and Engineering
The University of New South Wales
NSW 2052, Australia
E-mail: sanjay@cse.unsw.edu.au
Effective communication is critical for response and rescue operations; however, the capricious behaviour of communication devices in high-stress environments is a significant obstacle to effectiveness. High-stress environments in which disaster response and recovery operations are needed include natural calamities such as earthquakes, tsunamis, hurricanes, tornadoes, fires and floods or flash floods in urban areas; salvage, search and rescue operations underwater, in urban war zones, mountainous terrain, avalanches, underground mine disasters, volcanic eruptions, plane crashes, high-rise building collapses and nuclear facility malfunctions; and deep-space communication in outer-space exploration. We have observed a correlation between the channel characteristics that affect the performance of communication devices across these extreme environments. The contribution of this article is three-fold. First, it derives a list of characteristics that affect communication in all high-stress environments and evaluates it with respect to the underground mine environment. Second, it discusses current underground mine communication techniques and identifies potential problems. Third, it explores the design of a wireless sensor network (WSN) based communication and location-sensing system that could potentially meet the current challenges. Finally, we discuss preliminary results of an empirical study of the wireless communication characteristics of off-the-shelf MicaZ wireless sensor nodes in an underground mine in Parkes, NSW, Australia.
0911 Synthesis of Application Specific Heterogeneous Pipelined Multiprocessor Systems Haris Javaid
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: harisj@cse.unsw.edu.au

Sri Parameswaran
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: sridevan@cse.unsw.edu.au
This paper describes a rapid design methodology for creating a pipeline of processors to execute streaming applications. The methodology has two separate phases. The first phase uses a heuristic to rapidly search a large number of processor configurations (configurations differ in the base processor, the additional instructions and the cache sizes) to find a near-Pareto front. The second phase uses either the same heuristic or an Integer Linear Programming (ILP) formulation to search a smaller design space for an appropriate final implementation. By applying the fast heuristic with differing runtime constraints in the first phase, we rapidly find the near-Pareto front; the second phase then provides an optimal or near-optimal solution. Both the ILP formulation and the heuristic find the system with the smallest area within a designer-specified runtime constraint. The methodology has efficiently explored design spaces with over 10^12 design points. We integrated it into a commercial design flow and evaluated our approach on several benchmarks (JPEG Encoder, JPEG Decoder and MP3 Encoder). For each benchmark, the near-Pareto front was found in a few hours using the heuristic (the ILP took several days). The results show that the average area error of the heuristic is within 2.5% of the optimal design points (obtained using ILP) for all benchmarks.
0910 Underground Mine Communication and Tracking Systems : A Survey Prasant Misra
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: pkmisra@cse.unsw.edu.au

Diet Ostry
CSIRO, Australia
E-mail: Diet.Ostry@csiro.au

Sanjay Jha
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: sanjay@cse.unsw.edu.au
This article presents a survey of state-of-the-art underground mine communication and tracking systems. Underground mines are extensive labyrinths in which hundreds of mining personnel work under extreme conditions at any given time. To ensure the safety of workers and coordinate tasks, a communication and tracking system is among the most important infrastructure to be deployed, and it is expected to deliver satisfactory performance in both routine and rescue operations. To develop an engineering and scientific foundation for such systems, we need to understand the underground channel characteristics along with the challenges faced by wired and wireless communication solutions.
0909 Cross-layer interactions in energy efficient information collection in wireless sensor networks with adaptive compressive sensing Chun Tung Chou
School of Computer Science and Engineering,
University of New South Wales, Australia
ctchou@cse.unsw.edu.au

Rajib Rana
School of Computer Science and Engineering,
University of New South Wales, Australia
rajibr@cse.unsw.edu.au

Wen Hu
Autonomous Systems Laboratory,
CSIRO ICT Centre, Brisbane, Australia
Wen.Hu@csiro.au
We consider the problem of using Wireless Sensor Networks (WSNs) to measure the temporal-spatial field of some scalar physical quantity. Our goal is to obtain a sufficiently accurate approximation of the temporal-spatial field with as little energy as possible. We propose an adaptive algorithm, based on the recently developed theory of adaptive compressive sensing, to collect information from WSNs in an energy-efficient manner. The key idea of the algorithm is to perform "projections" iteratively so as to maximise the information gain per unit of energy expenditure. We prove that this maximisation problem is NP-hard and propose a number of heuristics to solve it. The maximisation problem can also be viewed as a routing problem whose metric is a non-linear function of the data field; the problem is therefore cross-layer in nature, requiring information from both the application and network layers. We evaluate the performance of the proposed algorithms using data from both simulation and an outdoor WSN testbed. The results show that our algorithms give a more accurate approximation of the temporal-spatial field for a given energy expenditure.
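One plausible greedy heuristic for the gain-per-energy maximisation is sketched below. The gains, costs and selection rule are assumptions for illustration; the paper proves the exact problem NP-hard and proposes its own heuristics.

# Illustrative greedy heuristic for "information gain per unit energy"
# projection selection (assumed rule, not the paper's algorithms).
import numpy as np

rng = np.random.default_rng(2)
n_candidates, budget = 50, 10.0

# Each candidate projection has an (assumed) information gain, e.g. expected
# reduction in reconstruction uncertainty, and an energy cost, e.g. the radio
# energy of the routing path that gathers it.
gain = rng.uniform(0.1, 1.0, n_candidates)
cost = rng.uniform(0.2, 2.0, n_candidates)

chosen, spent = [], 0.0
remaining = set(range(n_candidates))
while remaining:
    # Pick the affordable candidate with the best gain-to-energy ratio.
    affordable = [i for i in remaining if spent + cost[i] <= budget]
    if not affordable:
        break
    best = max(affordable, key=lambda i: gain[i] / cost[i])
    chosen.append(best)
    spent += cost[best]
    remaining.remove(best)
    # In the real algorithm the gains of the remaining candidates would be
    # re-estimated after each projection; here they are held fixed.

print("projections chosen:", chosen, "energy spent:", round(spent, 2))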
0908 Fixed Point Method for Voting Chung Tong Lee
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: ctlee@cse.unsw.edu.au

Aleksandar Ignjatovic
School of Computer Science and Engineering
University of New South Wales, and NICTA
NSW 2052, Australia
E-mail: ignjat@cse.unsw.edu.au
Question answering (Q&A) community sites, such as MSN QnA and Yahoo! Answers, facilitate question answering by a community of users. However, the quality of the answers provided by users varies. To determine the best answer, vote counts, sometimes with extra weight given to the asker, are commonly used. This makes the result vulnerable to tainted votes, as opinions from "bad" voters carry the same weight as those from "good" ones. We propose a new method that determines the best answer by the sum of voters' reliability scores, which are calculated from voters' behaviour. The more often a voter chooses the best answer, the more reliable the voter is, and the more weight the voter's opinion should carry. This is a circular definition, similar to the reputation-score evaluation in [2]. Our method does not require the identification of anomalous voting behaviour to reduce reliability scores. Instead, we employ the Brouwer Fixed Point Theorem [1] to show the existence of an assignment of reliability scores that satisfies the axiomatic description of the system. An iterative method is used for the actual evaluation. To demonstrate robustness, simulations are designed with data that match real-life situations, augmented with various forms of anomalous behaviour.
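A minimal sketch of the iterative fixed-point evaluation, with an assumed update rule (the report's axiomatic system differs in detail): voters who agree with the reliability-weighted consensus gain reliability, answers are re-scored, and the process repeats until the scores stop changing.

# Minimal fixed-point reliability iteration (assumed update rule).
import numpy as np

# votes[v][q] = index of the answer voter v picked for question q.
votes = np.array([
    [0, 1, 0],   # voter 0
    [0, 1, 0],   # voter 1
    [0, 1, 1],   # voter 2
    [2, 0, 2],   # voter 3 (erratic)
])
n_voters, n_questions = votes.shape
n_answers = votes.max() + 1

reliability = np.ones(n_voters)
for _ in range(50):
    # Best answer per question = largest reliability-weighted support.
    scores = np.zeros((n_questions, n_answers))
    for v in range(n_voters):
        for q in range(n_questions):
            scores[q, votes[v, q]] += reliability[v]
    best = scores.argmax(axis=1)
    # New reliability = fraction of questions where the voter picked the best
    # answer, smoothed so that no voter's weight drops exactly to zero.
    agree = (votes == best).mean(axis=1)
    new_rel = 0.1 + 0.9 * agree
    if np.allclose(new_rel, reliability):
        break
    reliability = new_rel

print("reliability scores:", reliability.round(3))
print("best answers per question:", best)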
0907 LOP: A Novel SRAM-based Architecture for LOw Power Packet Classification XIN HE
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: xinhe@cse.unsw.edu.au

Jorgen Peddersen
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: jorgenp@cse.unsw.edu.au

Sri Parameswaran
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: sridevan@cse.unsw.edu.au
The performance of packet classification algorithms is an important concern in modern networks. Algorithms for matching incoming packets against pre-defined rules have been proposed by a number of researchers. Current software-based packet classification techniques have low performance, so many researchers have shifted their focus to new architectures encompassing both software and hardware components. Some of the newer hardware architectures rely exclusively on Ternary Content Addressable Memory (TCAM) to improve the performance of rule matching. However, this results in systems with very high power consumption: TCAM consumes a great deal of power because the entire memory array is read on every access, much of which may be unnecessary. In this paper, we propose a novel SRAM-based architecture (named LOP) in which incoming packets are compared against parts of all rules simultaneously until a single matching rule is found for the compared bits, significantly reducing power consumption (only a segment of the memory is compared against the incoming packet). This comes with a time penalty for matching a single packet, but multiple packets can be compared in parallel to push throughput beyond the levels of TCAM approaches. Nine different benchmarks were tested with two classification systems, and the results show that LOP architectures provide high lookup rates and throughput while consuming low power. Compared with a low-power commercial TCAM approach, LOP achieves a power reduction of more than 60% with equivalent throughput, and a power reduction of about 20% with higher throughput (220 million searches per second (Msps) compared to 66 Msps). Furthermore, energy can be reduced by up to 75% compared with commercial TCAMs in 0.18µm CMOS technology.
0905 Lazy Updates: An Efficient Technique to Continuously Monitoring Reverse kNN Muhammad Aamir Cheema
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: macheema@cse.unsw.edu.au

Xuemin Lin
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: lxue@cse.unsw.edu.au

Ying Zhang
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: yingz@cse.unsw.edu.au

Wei Wang
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: weiw@cse.unsw.edu.au
In the past few years, continuous monitoring of spatial queries has received significant attention from the database research community. In this paper, we study continuous monitoring of reverse k nearest neighbor (RkNN) queries. Existing continuous reverse nearest neighbor monitoring techniques are sensitive to the movement of objects and queries; for example, the results of a query must be recomputed whenever the query changes its location. We present a framework for continuous RkNN queries that assigns each object and query a rectangular safe region, so that the expensive recomputation is not required as long as the query and objects remain within their respective safe regions. This significantly reduces the computation cost. As a by-product, our framework also reduces the communication cost in client-server architectures, because an object does not report its location to the server unless it leaves its safe region or the server sends a location update request. We also conduct a rigorous cost analysis to guide the effective selection of rectangular safe regions. Extensive experiments demonstrate that our techniques outperform existing techniques by an order of magnitude in both computation and communication cost.
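The safe-region idea can be sketched in a few lines. The rectangle bounds and client protocol below are assumptions for illustration: the client stays silent while it remains inside its rectangle, so the server neither recomputes results nor receives messages.

# Sketch of the safe-region client check (illustrative parameters).
from dataclasses import dataclass

@dataclass
class SafeRegion:
    xlo: float
    ylo: float
    xhi: float
    yhi: float

    def contains(self, x: float, y: float) -> bool:
        return self.xlo <= x <= self.xhi and self.ylo <= y <= self.yhi

def client_tick(region: SafeRegion, x: float, y: float) -> bool:
    """Return True if a location update must be sent to the server."""
    return not region.contains(x, y)

# Example: the object moves inside its region (no update), then leaves it.
region = SafeRegion(0.0, 0.0, 10.0, 10.0)
for pos in [(2, 3), (9, 9.5), (11, 4)]:
    print(pos, "-> send update" if client_tick(region, *pos) else "-> stay silent")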
0904 Using Agile Practices in Global Software Development: A Systematic Review Emam Hossain
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: meh@cse.unsw.edu.au

Muhammad Ali Babar
Lero, University of Limerick
International Science Centre, Limerick
Ireland
E-mail: malibaba@lero.ie

Hye-young Paik
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: hpaik@cse.unsw.edu.au
There is a growing interest in applying agile approaches in Global Software Development (GSD) projects. Recently, some studies have reported the use of Scrum practices in distributed development projects. However, little is known about how these practices are carried out in reality and what the outcomes are. We have conducted a systematic literature review to identify, synthesize and present the findings from primary studies that report using Scrum practices in GSD projects. Our search strategy identified 583 papers, of which 20 were identified as primary papers relevant to our research. We extracted data from these papers to identify the various challenges of using Scrum in GSD, along with the current strategies for dealing with them. This paper presents the review's findings, which are expected to help researchers and practitioners understand the current state of use of Scrum practices in GSD.
0903 Modeling and Verification of NoC Communication Interfaces Vinitha Palaniveloo
School of Computer Science and Engineering
The University of New South Wales
NSW 2052, Australia
E-mail: vinithaap@cse.unsw.edu.au

Arcot Sowmya
School of Computer Science and Engineering
The University of New South Wales
NSW 2052, Australia
E-mail: sowmya@cse.unsw.edu.au

Sridevan Parameswaran
School of Computer Science and Engineering
The University of New South Wales
NSW 2052, Australia
E-mail: sridevan@cse.unsw.edu.au
The concept of Network on Chip (NoC) addresses the communication requirements on chip and decouples communication from computation. One of the challenges faced by designers of NoC integrated circuits is verifying the correctness of the communication scheme of an NoC architecture. NoCs are on-chip communication networks that borrow networking concepts from computer networks to interconnect complex Intellectual Property (IP) cores on chip; applications on IP cores therefore communicate with peer applications through a communication architecture consisting of a layered communication protocol, routers and switches. The absence of an integrated architectural model makes end-to-end verification of the communication scheme challenging. The formal models of NoCs proposed so far in the literature model parts of the communication architecture, such as specific layers of the communication protocol, routers, or the network topology, but not an integrated architectural model. This is attributable to the absence of a modelling language expressive enough to capture all modules of the NoC communication architecture, which is heterogeneous: it consists of synchronous and asynchronous IP cores communicating through heterogeneous communication pipelines. In this project we propose a heterogeneous modelling language for the modelling and verification of NoC communication architectures. The proposed modelling language is based on formal methods, as they provide precise semantics and mathematically based tools and techniques for specification and verification.
0902 An Aspect-oriented Approach for Service Adaptation Woralak Kongdenfha
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: woralakk@cse.unsw.edu.au

Hamid Motahari
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: hamidm@cse.unsw.edu.au

Boualem Benatallah
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: boualem@cse.unsw.edu.au

Fabio Casati
Department of Information and Communication
University of Trento, Italy
E-mail: casati@dit.unitn.it

Regis Saint-Paul
CREATE-NET International Research Center
Trento, Italy
E-mail: regis.saint-paul@create-net.org
Standardization in Web services simplifies integration. However, it does not remove the need for adapters, owing to possible heterogeneity among service interfaces and protocols. In this paper, we characterize the problem of Web service adaptation, focusing on adapters for business interfaces and protocols. Our study shows that many of the differences between business interfaces and protocols are recurring. We introduce mismatch patterns to capture these recurring differences and to provide solutions for resolving them. We leverage mismatch patterns for service adaptation in two ways: by developing standalone adapters and via service modification. We then develop the notion of adaptation aspects, which, following the aspect-oriented programming paradigm and the service modification approach, allow for rapid development of adapters. We present a study showing that this is the preferable approach in many cases. The proposed approach is implemented in a proof-of-concept prototype tool, and we explain how it simplifies adapter development through a case study.
0824 Hop Count Analysis for Greedy Geographic Routing Quanjun Chen, Salil S. Kanhere, Mahbub Hassan
School of Computer Science and Engineering
University of New South Wales, Australia
E-mail: {quanc, salilk, mahbub}@cse.unsw.edu.au
Hop count is a fundamental metric in multi-hop wireless ad-hoc networks. It has a determinative effect on network performance, including throughput, end-to-end delay and energy consumption. Characterizing the hop count metric, including its distribution function and mean value, is therefore vital for analyzing wireless network performance. This paper proposes a theoretical model that accurately captures the hop count distribution and its mean. For a given communication pair, the hop count depends on the routing protocol and on the network topology determined by the physical radio model. At the routing layer, our model focuses on the widely used greedy routing. At the physical layer, the model investigates the ideal radio model as well as a more realistic one, the log-normal shadowing model. We conduct a rich set of simulations to validate our analytical model; the simulation results closely match the analytical results. The analytical model is further validated through a trace-driven simulation of a practical vehicular ad-hoc network that exhibits the realistic topologies of public transport buses in a metropolitan city.
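A Monte Carlo sketch of the quantity being modelled (node density, radio range and source-destination distance are assumed; the paper derives the distribution analytically): simulate greedy geographic forwarding over random node placements with an ideal disk radio model and tabulate the resulting hop counts.

# Monte Carlo sketch of hop count under greedy geographic routing with an
# ideal disk radio model (illustrative parameters only).
import numpy as np

rng = np.random.default_rng(3)
radio_range, density, dist = 1.0, 8.0, 6.0   # range, nodes/unit area, src-dst distance

def greedy_hops(trials=2000):
    counts = []
    for _ in range(trials):
        # Drop relays as a Poisson field in a corridor around the src-dst line.
        n = rng.poisson(density * dist * 2 * radio_range)
        if n == 0:
            continue
        xs = rng.uniform(0, dist, n)
        ys = rng.uniform(-radio_range, radio_range, n)
        x, y, hops = 0.0, 0.0, 0
        while np.hypot(dist - x, y) > radio_range:
            # Greedy rule: forward to the in-range neighbour closest to the destination.
            in_range = np.hypot(xs - x, ys - y) <= radio_range
            progress = np.where(in_range, np.hypot(dist - xs, ys), np.inf)
            nxt = int(np.argmin(progress))
            if progress[nxt] >= np.hypot(dist - x, y):
                hops = -1                      # routing void: no positive progress
                break
            x, y, hops = xs[nxt], ys[nxt], hops + 1
        if hops >= 0:
            counts.append(hops + 1)            # plus the final hop to the destination
    return np.array(counts)

h = greedy_hops()
table = {int(k): round(int(v) / len(h), 3) for k, v in zip(*np.unique(h, return_counts=True))}
print("mean hop count:", round(float(h.mean()), 2), " distribution:", table)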
0823 A Formal Analysis of Phylogenetic Definitions in the PhyloCode Rex Kwok
School of Computer Science and Engineering
University of New South Wales, Australia
E-mail: rkwok@cse.unsw.edu.au
Three main codes currently govern biological nomenclature: (i) the International Code of Botanical Nomenclature (ICBN), (ii) the International Code of Zoological Nomenclature (ICZN), and (iii) the International Code of Nomenclature of Bacteria (ICNB). Recently, the PhyloCode, a code based on phylogenetic nomenclature, has been presented as an alternative. To facilitate a comparison between the various codes, this paper presents a formal study of the properties of phylogenetic nomenclature as presented in the PhyloCode. While much of the PhyloCode necessarily deals with procedures for publishing and registering names, an important component deals with phylogenetic definitions, and it is this component that is studied in detail here. The various types of phylogenetic definition are formalised in a mathematical setting. Results show that, for phylogenetic trees, much of the intuition surrounding phylogenetic definitions matches the formalisation. However, ambiguity in the meaning of such definitions arises in the more general case where a phylogenetic hypothesis is allowed to be a rooted directed acyclic graph, a situation expressly allowed by the PhyloCode; solutions to these problems are presented. The issue of semantic stability, an often-stated desirable property of a nomenclatural system, is also examined, and conditions are presented showing how stability can be improved for phylogenetic definitions. Two new types of phylogenetic definition, the minimality-based definition and the maximality-based definition, are presented as generalisations of the PhyloCode definitions, and it is shown from this new perspective how the PhyloCode definitions relate to each other.
0821 Logic Programming Revisited from a Classical Standpoint Eric A. Martin
School of Computer Science and Engineering
University of New South Wales, Australia
E-mail: emartin@cse.unsw.edu.au
Logic programming has developed as a rich field, built over a logical substratum whose main constituent is a nonclassical form of negation, sometimes coexisting with classical negation. The field has seen the advent of a number of alternative semantics, with Kripke-Kleene semantics, the well-founded semantics, the stable model semantics, and answer-set programming standing out as the most successful. We show that, using classical negation only, all the aforementioned semantics are particular cases of a single semantics applied to a general notion of logic program, possibly transformed by a simple procedure. The notions and results presented in this paper give a classical perspective on the field of logic programming and broaden its scope: the procedure suggests a number of possible transformations of a logic program, which can be classified into families, with some members of some families matching particular paradigms in the field. The paper demonstrates that logic programming can be developed so that negation is not an intrinsically complex operator that is hard to interpret properly and requires a complicated formal apparatus to be fully apprehended, while still accommodating the semantics that have placed nonclassical negation at the centre of their investigations.
0820 Distributed Optimization For Location Refinement in Ad-hoc Sensor Networks Sarfraz Nawaz
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: mnawaz@cse.unsw.edu.au

Chun Tung Chou
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: ctchou@cse.unsw.edu.au

Sanjay Jha
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: sanjay@cse.unsw.edu.au
A number of range-based sensor network localization systems form a rough layout of the network and then gradually refine the location coordinates of the sensor nodes starting from this layout. We model this coordinate refinement as an unconstrained non-linear optimization problem and show that current heuristic approaches, which require empirical tuning of parameters, cannot guarantee convergence in ad-hoc network deployments. We then present a completely distributed algorithm for location refinement and show that the problem can be solved by iteratively performing aggregate sum computations of certain locally computed values over the entire sensor network. Our proposed algorithm requires no empirical tuning and thus works in ad-hoc network deployments. We show through simulations and real experiments that our algorithm converges faster than current empirically tuned approaches, with similar overhead.
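A centralised sketch of the refinement objective appears below; the network size, step size and noise levels are assumptions, and the report's contribution is to distribute the equivalent computation as network-wide aggregate sums rather than run it at one node. The sketch performs gradient descent on the squared range-error stress, starting from a rough initial layout near the true positions.

# Centralised gradient-descent sketch of location refinement from ranges
# (illustrative only; positions are determined up to rigid motion, so the
# error below is meaningful only because the initial layout is near truth).
import numpy as np

rng = np.random.default_rng(4)
n = 10
truth = rng.uniform(0, 10, (n, 2))
edges = [(i, j) for i in range(n) for j in range(i + 1, n)
         if np.linalg.norm(truth[i] - truth[j]) < 5.0]             # who can range whom
ranges = {e: np.linalg.norm(truth[e[0]] - truth[e[1]]) for e in edges}

pos = truth + rng.normal(0, 1.0, (n, 2))       # rough initial layout
step = 0.05
for _ in range(500):
    grad = np.zeros_like(pos)
    for (i, j), dij in ranges.items():
        diff = pos[i] - pos[j]
        d = np.linalg.norm(diff) + 1e-12
        g = 2 * (d - dij) * diff / d           # gradient of (||xi - xj|| - dij)^2
        grad[i] += g
        grad[j] -= g
    pos -= step * grad

print("mean position error after refinement:",
      round(float(np.linalg.norm(pos - truth, axis=1).mean()), 3))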
0818 GATE: A Novel Robust Object Tracking Method Using the Particle Filtering and Level Set Method Cheng Luo
School of Computer Science and Engineering
University of New South Wales
National ICT Australia
NSW 2052, Australia
E-mail: luo.cheng@nicta.com.au

Xiongcai Cai
School of Computer Science and Engineering
University of New South Wales
National ICT Australia
NSW 2052, Australia
E-mail: xcai@cse.unsw.edu.au

Jian Zhang
School of Computer Science and Engineering
University of New South Wales
National ICT Australia
NSW 2052, Australia
E-mail: jian.zhang@nicta.com.au
This technical report presents a novel algorithm for robust object tracking, based on the particle filtering method employed in recursive Bayesian estimation and on the image segmentation and optimisation techniques employed in active contour models and level set methods. The proposed Geometric Active contour-based Tracking Estimation method, GATE, enables particle filters to track objects of interest in complex environments using merely a simple feature. GATE creates a spatial prior in the state space using shape information about the tracked object. The spatial prior is then used to filter particles in the state space, reshaping and refining the observation distribution of the particle filter. This improves the performance of the likelihood model and thereby significantly improves the particle filter overall. Promising performance of our method is demonstrated on real sequences.
0816 Probabilistic Reverse Nearest Neighbor Queries on Uncertain Data Muhammad Aamir Cheema
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: macheema@cse.unsw.edu.au

Xuemin Lin
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: lxue@cse.unsw.edu.au

Wei Wang
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: weiw@cse.unsw.edu.au

Wenjie Zhang
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: zhangw@cse.unsw.edu.au

Jian Pei
School of Computer Science
Simon Fraser University
Burnaby, BC Canada
E-mail: jpei@cs.sfu.ca
Uncertain data is inherent in many important applications where exact data values are not known. While many types of queries on uncertain data have been studied, the reverse nearest neighbor query on uncertain data has remained an open problem. In this paper, we formalize the probabilistic reverse nearest neighbor query based on possible worlds semantics. We propose a method that processes such queries efficiently; the key technical innovation is a set of novel pruning methods that exploit various properties of the problem. Extensive experiments demonstrate that our algorithm is highly efficient and scalable.
0815 Context-Aware Channel Coordination for Vehicular Communications Zhe Wang
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: zhewang@cse.unsw.edu.au

Mahbub Hassan
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: mahbub@cse.unsw.edu.au
Vehicular communication could be the much-anticipated breakthrough against the unabated fatal and near-fatal accidents that continue to threaten public safety on our roads. The same technology is also expected to concurrently support a range of non-safety applications, including real-time traffic information, mobile entertainment, and access to the Internet. The standard specifies an explicit multi-channel structure whereby safety and non-safety transmissions occur on different channels. Consequently, a vehicle with a conventional single-radio transceiver must continuously switch between the safety and non-safety modes of operation. The interval spent in the safety mode (the safety interval) in each cycle is a critical parameter that directly limits the availability of the technology for commercial use. Using simulation, we show that the safety interval required to satisfy the reliability of safety applications is a function of the traffic density on the road. Given that traffic density on most roads varies during the day, we propose dynamic adjustment of the safety interval based on the traffic context. To further motivate the concept of traffic-aware vehicular communications, we evaluate the performance of three dynamic channel coordination algorithms using empirical traffic data collected from roads around Sydney, Australia. A key finding is that the time of day is an effective context that can prevent a vehicular radio from running in the safety mode unnecessarily, thereby enhancing the commercial opportunity of the technology. We further demonstrate that using the location context can dramatically improve the performance of the basic time-of-day algorithms.
0814 How Much of DSRC is Available for Non-Safety Use? Zhe Wang
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: zhewang@cse.unsw.edu.au

Mahbub Hassan
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: mahbub@cse.unsw.edu.au
The Dedicated Short Range Communication (DSRC) technology is currently being standardized by the IEEE to enable a range of communication-based automotive safety applications. However, for DSRC to be cost-effective, it is important to accommodate commercial non-safety use of the spectrum as well. The co-existence of safety and non-safety applications is achieved through a periodic channel switching scheme whereby access to DSRC alternates between the two classes of application. In this paper, we propose a framework that links the non-safety share of DSRC, as determined by the channel switching, to the performance requirements of safety applications. Using simulation experiments, we analyze the non-safety opportunity in DSRC under varied road traffic conditions. We find that non-safety use of DSRC may have to be severely restricted during peak traffic hours to ensure that automotive safety is not compromised. Our study also sheds interesting light on how simple strategies, e.g., optimizing the message generation rate of the safety applications, can significantly increase the commercial opportunities of DSRC. Finally, we find that adaptive schemes that dynamically adjust the switching parameters in response to observed traffic conditions may help maximize the commercial use of DSRC.
0812 A Measurement Study of Bandwidth Predictability in Mobile Communication Networks Jun Yao, Salil S. Kanhere, Mahbub Hassan
School of Computer Science and Engineering,
University of New South Wales,
Australia
Email: {jyao, salilk, mahbub}@cse.unsw.edu.au
While bandwidth predictability has been well studied in static environments, it remains largely unexplored in the context of mobile computing. To gain a deeper understanding of this important issue in the mobile environment, we conducted an eight-month measurement study consisting of 71 repeated trips along a 23 km route in Sydney under typical driving conditions. To account for network diversity, we measured bandwidth from two independent cellular providers implementing the popular High-Speed Downlink Packet Access (HSDPA) technology at two different peak access rates (1.8 and 3.6 Mbps). Interestingly, we observe no significant correlation between the bandwidth signals at different points in time within a given trip. This observation leads to the revelation that the popular time series models, e.g. the Autoregressive and Moving Average models typically used to predict network traffic in static environments, are not as effective in capturing the regularity in mobile bandwidth. Although the bandwidth signal in a given trip appears to be random white noise, we are able to detect the existence of patterns by analyzing the distribution of the bandwidth observed over the repeated trips. We quantify the bandwidth predictability reflected by these patterns using tools from information theory, entropy in particular. The entropy analysis reveals that the bandwidth uncertainty may be reduced by as much as 46% when observations from past trips are taken into account. We further demonstrate that bandwidth in mobile computing appears more predictable when location is used as a context. All these observations are consistent across multiple independent providers offering different data transfer rates using possibly different networking hardware.
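The entropy analysis can be illustrated with synthetic stand-in data (all values below are assumed; the study uses its measured traces): quantise bandwidth into bins and compare the marginal entropy H(B) with the conditional entropy H(B|L) given a location bin, the difference being the uncertainty reduction from the location context.

# Entropy sketch on synthetic stand-in data: how much does knowing the
# location bin reduce uncertainty about the bandwidth bin?
import numpy as np

rng = np.random.default_rng(5)
trips, points = 71, 200

# Stand-in measurements: bandwidth depends on location (road segment) plus
# per-trip noise, then is quantised into 8 bins.
loc = np.tile(np.arange(points), trips)
bw = 3.0 + 2.0 * np.sin(loc / 15.0) + rng.normal(0, 0.8, trips * points)
b = np.digitize(bw, np.quantile(bw, np.linspace(0, 1, 9)[1:-1]))
l = loc // 25                                   # coarse location bins

def entropy(labels):
    p = np.bincount(labels) / len(labels)
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

H_b = entropy(b)
H_b_given_l = sum((l == v).mean() * entropy(b[l == v]) for v in np.unique(l))
print(f"H(B) = {H_b:.2f} bits, H(B|L) = {H_b_given_l:.2f} bits, "
      f"reduction = {100 * (1 - H_b_given_l / H_b):.0f}%")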
0811 Interference Analysis and Channel Assignment in Multi-Radio Multi-Channel Wireless Mesh Networks Anjum Naveed
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: anaveed@cse.unsw.edu.au


Salil Kanhere
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: salilk@cse.unsw.edu.au
In a typical Wireless Mesh Network (WMN), the links that interfere with a particular link can be broadly classified into two categories based on their geometric relationships: coordinated and non-coordinated links. In this paper, we analytically quantify the impact of both kinds of interfering links on transmission losses. Our analysis shows that, compared to coordinated links, non-coordinated links result in significantly lower throughput and an unfair distribution of channel capacity amongst interfering links. We hypothesize that channel assignment in multi-radio multi-channel WMNs can significantly reduce the interference caused by non-coordinated links, and we prove that the channel assignment problem based on this hypothesis is NP-hard. We propose a novel two-phase heuristic channel assignment protocol, referred to as the Cluster-Based Channel Assignment Protocol (CCAP), which logically partitions the network into non-overlapping clusters. In the first phase, nodes within a cluster are assigned a common channel, with orthogonal channels used in adjacent clusters, and the inter-cluster links are assigned channels so as to minimize non-coordinated interference. The second phase exploits channel diversity to sub-divide each cluster into multiple interference domains, thereby increasing the capacity of individual links. Simulation-based evaluations demonstrate that CCAP achieves twice the aggregate network throughput of existing channel assignment protocols, while ensuring a fair distribution of capacity amongst the links.
0810 Mashups for Data Integration: An Analysis Giusy Di Lorenzo
Dipartimento di Informatica e Sistemistica
University Federico II
Via Claudio, 21
80125 Napoli, Italy
E-mail: giusy.dilorenzo@unina.it

Hakim Hacid
School of Computer Science and Engineering
University of New South Wales
Sydney, NSW 2052, Australia
E-mail: hakimh@cse.unsw.edu.au


Hye-young Paik
School of Computer Science and Engineering
University of New South Wales
Sydney, NSW 2052, Australia
E-mail: hpaik@cse.unsw.edu.au

Boualem Benatallah
School of Computer Science and Engineering
University of New South Wales
Sydney, NSW 2052, Australia
E-mail: boualem@cse.unsw.edu.au
Mashup is a new application development approach that allows users to aggregate multiple services, each serving its own purpose, to create a service that serves a new purpose. Even though the Mashup approach opens new and broader opportunities for data/service consumers, the development process still requires users not only to understand how to write code, but also to know how to use the different Web APIs of all the services involved. The objective of this study is to analyze the strengths and weaknesses of current Mashup tools. In particular, we identify the behaviours and characteristics of typical Mashup applications and analyze the tools with respect to these key aspects. We believe this kind of study is important to drive future contributions in this emerging area, where many research and application fields, such as databases and human-machine interaction, can meet.
0809 Mix and Test Counting in Preferential Electoral Systems Roland Wen, Richard Buckland
School of Computer Science and Engineering
University of New South Wales
Sydney, NSW 2052, Australia
E-mail: {rolandw,richardb}@cse.unsw.edu.au
Although there is a substantial body of work on online voting schemes that prevent bribery and coercion of voters, as yet there are few suitable schemes for counting in the alternative vote and single transferable vote preferential systems. Preferential systems are prone to bribery and coercion via signature attacks. This is an issue for online elections in Australia, where all parliamentary elections use these preferential systems. We present the Mix and Test Counting scheme, a preferential counting protocol that is resistant to signature attacks. For the alternative vote, it reveals no information apart from the identity of the winning candidate. For the single transferable vote, it reveals additional anonymised counting information. However the only candidates identified are the winning candidates.
0808 Herbrand Analysis of Some Second-order Theories with Weak Set Existence Principles Chung Tong Lee
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: ctlee@cse.unsw.edu.au

Aleksandar Ignjatovic
School of Computer Science and Engineering
University of New South Wales, and
NICTA
NSW2052, Australia
Email: ignjat@cse.unsw.edu.au
We present a proof-theoretic analysis of some second-order theories of binary strings which were introduced in [5]. The core of these theories contains, besides finitely many open axioms for basic operations on strings, only a weak comprehension axiom schema. In such theories, a collection W can be defined to play the role of natural numbers. W is given as the intersection of all sets containing the empty string and closed under the two successor functions S0 and S1. We characterize the classes of functions which provably map W into itself and whose graphs are defined by formulas of an appropriate bounded quantifier complexity. For theories with weak comprehension schemas, this notion corresponds naturally to that of provably recursive functions for arithmetic theories. The techniques of Herbrand analysis developed by Sieg in [8] and [9] allow us to prove that these classes match up with levels of the polynomial-time hierarchy.
0807 Masked Ballot Receipt-Free Elections Roland Wen
School of Computer Science and Engineering
University of New South Wales
Sydney, NSW 2052, Australia
E-mail: rolandw@cse.unsw.edu.au
Online election schemes mitigate bribery and coercion by precluding the generation of receipts that can prove how voters voted. In order to guarantee that each voter's public election data appears ambiguous, existing approaches to receipt-free schemes rely on problematic assumptions when voters cast ballots. We take a new approach by using the novel properties of the Damgard-Jurik cryptosystem to construct Masked Ballot, a receipt-free scheme that avoids such assumptions during the election. The Masked Ballot scheme assumes the existence of untappable channels for a trusted registrar to send private masking values to voters before the election, but does not require these channels during the election. Voters cast ballots over completely public channels without relying on untappability, anonymity or trusted devices.
0806 Towards Agile Service-oriented Business Systems: A Directive-oriented Pattern Analysis Approach Soo Ling Lim
School of Computer Science and Engineering
University of New South Wales, and
NICTA, Australian Technology Park
Sydney, Australia
Email: slim@cse.unsw.edu.au

Fuyuki Ishikawa, Eric Platon
National Institute of Informatics
Tokyo, Japan
Email: {f-ishikawa,platon}@nii.ac.jp

Karl Cox
NICTA, Australian Technology Park
Sydney, Australia
Email: Karl.Cox@nicta.com.au
Volatile requirements should be managed such that changes can be introduced into the system in a quick and structured way. This paper presents Directive-oriented Pattern Analysis (DoPA), a requirements engineering approach that handles volatile requirements by managing the coupling between business intentions and service integration. The key insight is to utilise services as commodities via service choreography patterns. DoPA captures differentiating enterprise intentions as Directives, while using patterns to handle common business needs. This enables the notion of declarative configuration of services to achieve business agility.
0805 Adapting the Weighted Backtrack Estimator to Conflict Driven Search Shai Haim
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: shaih@cse.unsw.edu.au

Toby Walsh
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: tw@cse.unsw.edu.au
Modern SAT solvers present several challenges for estimating search cost, including nonchronological backtracking and learning. We present a method that adapts an existing algorithm for estimating the size of a search tree to deal with these challenges. We show the effectiveness of this method on random and structured problems.
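For intuition, the following is a sketch of a Knuth-style random-probe estimator of search-tree size, the family to which the Weighted Backtrack Estimator belongs. The tree and probe rule below are assumptions for illustration; the report's adaptation additionally handles nonchronological backtracking and learning, which this sketch does not.

# Knuth-style random-probe estimator of search-tree size (illustrative tree).
import random

random.seed(6)
MAX_DEPTH = 6

def children(path):
    # Deterministic pseudo-random tree: the child count of a node is a fixed
    # function of its path, so every probe explores the same tree.
    if len(path) >= MAX_DEPTH:
        return 0
    return 1 + hash(path) % 2          # each internal node has 1 or 2 children

def true_size(path=()):
    return 1 + sum(true_size(path + (i,)) for i in range(children(path)))

def knuth_probe():
    # Follow one random root-to-leaf path; the running product of branching
    # factors gives an unbiased estimate of the total number of tree nodes.
    path, prod, estimate = (), 1, 1
    while True:
        b = children(path)
        if b == 0:
            return estimate
        prod *= b
        estimate += prod
        path = path + (random.randrange(b),)

print("true tree size:", true_size())
samples = [knuth_probe() for _ in range(20000)]
print("mean of 20000 probe estimates:", round(sum(samples) / len(samples), 1))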
0804 Analysis of Per-node Traffic Load in Multi-hop Wireless Sensor Networks Quan Jun Chen, Salil S. Kanhere, Mahbub Hassan
School of Computer Science and Engineering
University of New South Wales, Australia
E-mail: {quanc, salilk, mahbub}@cse.unsw.edu.au
The energy expended by sensor nodes in the reception and transmission of data packets makes up a significant portion of their total energy consumption. Consequently, models that can accurately predict the communication traffic load of a sensor node are critical for designing effective and efficient sensor network protocols. In this paper, we present an analytical model for estimating the per-node traffic load in a multi-hop wireless sensor network, where the nodes sense the environment periodically and forward the data packets to the sink using greedy geographic routing. The analysis incorporates the idealistic circular-coverage radio model as well as a realistic model, log-normal shadowing. Our results confirm that, irrespective of the radio model, the traffic load generally increases with the node's proximity to the sink. In the immediate vicinity of the sink, however, the two radio models yield quite contrasting results. The ideal radio model reveals the existence of a volcano region near the sink, where the traffic load drops significantly. With the log-normal shadowing model, on the contrary, the traffic load actually increases at a much higher rate as one approaches the sink, forming a mountain peak. The results of our analysis are validated by extensive simulations. The simulations also demonstrate that our results apply in more realistic environments that do not conform to the simplifying assumptions made in the analysis for mathematical tractability.
0803 Automatic collection of fuel prices from a network of mobile cameras Yi Fei Dong, Salil Kanhere, Chun Tung Chou
School of Computer Science and Engineering
The University of New South Wales
Sydney, NSW 2052, Australia
Email: {ydon,salilk,ctchou}@cse.unsw.edu.au

Nirupama Bulusu
Portland State University, USA
Email: nbulusu@cs.pdx.edu
It is an undeniable fact that people want information. Unfortunately, even in today's highly automated society, much of the information we desire is still collected manually. An example is fuel prices: websites providing fuel price information either send their workers out to collect the prices manually or depend on volunteers relaying the information by hand. This paper proposes a novel application of wireless sensor networks that automatically collects fuel prices from camera images of the road-side price boards (billboards) of service (or gas) stations. Our system exploits the ubiquity of camera-equipped mobile phones and the willingness of users to contribute and share data. In our proposed system, the cameras of contributing users are automatically triggered when they approach a service station. The images are then processed by computer vision algorithms to extract the fuel prices. In this paper, we describe the system architecture and present results from our computer vision algorithms. On a set of 52 images, our system achieves a hit rate of 92.3% for correctly detecting the fuel price board against the image background, and reads the prices correctly in 87.7% of the detected boards. To the best of our knowledge, this is the first instance of a sensor network being used to collect consumer pricing information.
0802 ERTP: Energy-efficient and Reliable Transport Protocol for Data Streaming in Wireless Sensor Networks Tuan Le
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: dtle@cse.unsw.edu.au

Wen Hu
CSIRO ICT Centre
Brisbane, Australia
E-mail: wen.hu@csiro.au

Peter Corke
CSIRO ICT Centre
Brisbane, Australia
E-mail: peter.corke@csiro.au

Sanjay Jha
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: sanjay@cse.unsw.edu.au
Emerging data streaming applications in Wireless Sensor Networks require reliable and energy-efficient transport protocols. Our recent Wireless Sensor Network deployment in the Burdekin delta, Australia, for water monitoring is one such example. Our application involved streaming sensed data such as pressure readings, water flow rate, and salinity readings periodically from many scattered sensors to the sink node, which in turn relayed them via an IP network to a remote site for archiving, processing and presentation. While latency is not a primary concern in this class of application (the sample rate is usually in terms of minutes or hours), energy efficiency is. Long-term operation and reliable delivery of the sensed data to the sink are also desirable. In this paper, we discuss ERTP, an Energy-efficient and Reliable Transport Protocol for Wireless Sensor Networks. ERTP is designed for data streaming applications, in which sensor readings are transmitted from one or more sensor sources to a base station (or sink). ERTP uses a statistical reliability metric which ensures that the number of data packets delivered to the sink exceeds a defined threshold. Using a statistical reliability metric when designing a reliable transport protocol guarantees the delivery of adequate information to users, and reduces the number of transmissions compared to absolute reliability. To reduce energy consumption, ERTP uses hop-by-hop Implicit Acknowledgment with a dynamically updated retransmission timeout for loss recovery. In multihop wireless networks, the transmitter can overhear a forwarding transmission and interpret it as an Implicit Acknowledgment. However, the Implicit Acknowledgment timeout depends on the time taken for a packet to be forwarded by the downstream node, so dynamic retransmission timeout estimation is crucial for this class of hop-by-hop Implicit Acknowledgment transport protocol. By combining statistical reliability and hop-by-hop Implicit ACK loss recovery, ERTP provides reliability to application users with minimal energy expense. Our extensive discrete-event simulations and experimental evaluations show that ERTP is significantly more energy-efficient than current approaches and can reduce energy consumption by more than 50%. Consequently, sensors are more energy-efficient and the lifespan of the unattended WSN is increased.
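The dynamically updated retransmission timeout can be sketched with a TCP-style estimator. The smoothing constants below are assumed; ERTP's exact update rule is specified in the report. The idea is to smooth the overheard forwarding delay and its deviation, and set the timeout a few deviations above the mean.

# TCP-style sketch of a dynamic retransmission timeout for implicit ACKs
# (assumed constants; not ERTP's exact rule).
class RtoEstimator:
    def __init__(self, alpha=0.125, beta=0.25, k=4, rto_init=1.0):
        self.alpha, self.beta, self.k = alpha, beta, k
        self.srtt = None      # smoothed forwarding delay
        self.rttvar = None    # smoothed deviation
        self.rto = rto_init

    def on_implicit_ack(self, sample):
        """Update the timeout with one measured forwarding delay (seconds):
        the time between sending a packet and overhearing its retransmission
        by the downstream node."""
        if self.srtt is None:
            self.srtt, self.rttvar = sample, sample / 2
        else:
            self.rttvar = (1 - self.beta) * self.rttvar + self.beta * abs(self.srtt - sample)
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * sample
        self.rto = self.srtt + self.k * self.rttvar
        return self.rto

est = RtoEstimator()
for delay in [0.20, 0.22, 0.35, 0.21, 0.19]:   # overheard forwarding delays
    print(f"sample {delay:.2f}s -> RTO {est.on_implicit_ack(delay):.3f}s")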
0801 Analytical Evaluation of the 802.11 Wireless Broadcast under Saturated Conditions Zhe Wang and Mahbub Hassan
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: {zhewang, mahbub}@cse.unsw.edu.au
The popular IEEE 802.11 wireless networking standard has traditionally been used for unicast communications. There is, however, a recent trend towards harnessing the broadcast capability of the standard in a range of monitoring and safety related applications, e.g., transportation safety and vehicular traffic management. As this trend continues, models that accurately predict the performance of 802.11 broadcast will be increasingly useful. Although unicast modelling has received considerable attention, analytical evaluation of the broadcast protocol has remained relatively unexplored. Using a simple one-dimensional discrete time Markov chain, we analyse the reliability and throughput performance of IEEE 802.11 broadcast communications under saturated traffic conditions. The model is validated by means of an independent commercial simulator. Using the proposed model, we provide an extensive performance analysis of 802.11 broadcast communications. The analysis allows us to study the tradeoff between communication reliability and system throughput in a local area wireless broadcast network. The throughput performance of broadcast is compared with that of unicast under different network sizes and different combinations of 802.11 protocol parameters.
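
For intuition, a much-simplified closed form captures why broadcast reliability degrades with network size: under saturation with a fixed contention window W and no retransmissions (broadcast frames are never acknowledged), a node transmits in a given slot with probability roughly tau = 2/(W+1), and a frame is received collision-free only if none of the other n-1 nodes transmits in that slot. This back-of-the-envelope sketch is far coarser than the report's Markov chain model:

    # Saturated 802.11 broadcast reliability under strong simplifying
    # assumptions (fixed contention window, no retransmissions). This is
    # an illustration only, not the report's Markov-chain analysis.

    def broadcast_success_probability(n, W=32):
        tau = 2.0 / (W + 1)               # per-slot transmit probability
        return (1 - tau) ** (n - 1)       # no collision from the other n-1 nodes

    for n in [2, 5, 10, 20, 50]:
        print(n, round(broadcast_success_probability(n), 3))
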
0723 Tool support for verifying trace inclusion with Uppaal Timothy Bourke
NICTA, Kensington Laboratory, and
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: tbourke@cse.unsw.edu.au

Arcot Sowmya
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: sowmya@cse.unsw.edu.au
Trace inclusion against a deterministic Timed Automaton can be verified with the Uppaal model checking tool by constructing a test automaton that snares illegal synchronisation possibilities. Constructing the automaton manually is tedious and error prone. This paper presents a tool that does it automatically for a subset of Uppaal models. Certain features of Uppaal, namely selection bindings and channel arrays, complicate the construction. We first formalise these features, and then show how to incorporate them directly in the testing construction. To do so, we limit the forms of subscript that can be used to specify synchronisations, striving for a balance between practicability and program complexity. Unfortunately, some combinations of selection bindings and universal quantifiers cannot be effectively manipulated. The tool does not yet validate the determinism requirements, nor handle committed states or broadcast channels.
0722 Time-Aware Content Summarization of Data Streams Quang-Khai Pham, Régis Saint-Paul, Boualem Benatallah
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia

Guillaume Raschia, Noureddine Mouaddib
Atlas Group
LINA at University of Nantes
Nantes, France
Major media companies such as The Financial Times, the Wall Street Journal or Reuters generate huge amounts of textual news data on a daily basis. Mining frequent patterns in this mass of information is critical for knowledge workers such as financial analysts, stock traders or economists. Using existing frequent pattern mining (FPM) algorithms for the analysis of news data is difficult because of the size and lack of structure of the free text news content. In this article, we propose a Time-Aware Content Summarization algorithm to support FPM in financial news data. The summary allows a concise representation of large volumes of data while taking into account the expert's particular interests. The summary also preserves the news arrival time information, which is essential for FPM algorithms. We evaluated the proposed approach on Reuters news data and integrated it into the Streaming TEmporAl Data (STEAD) analysis framework for interactive discovery of frequent patterns.
0721 Process Spaceship: Discovering Process Views in Process Spaces Hamid Reza Motahari-Nezhad, Régis Saint-Paul, Boualem Benatallah
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia

Fabio Casati, Periklis Andritsos
University of Trento
Trento, Italy
Process and service execution analysis is a key endeavour for enterprises. Such analysis requires observing and correlating messages related to process and service executions, that is, identifying whether messages belong to the same process instance or service execution. A first challenge is that message correlation is subjective, i.e., it depends on the purpose of the analysis and on the perspective of the analyst. Another challenge lies in the huge space of possible correlations between messages, which can be built from different combinations of message attributes. In this paper, we consider process and service execution data as a process space, and different ways of performing correlations as process views, that is, views over the process space. We propose methods and heuristics, adopting a level-wise approach, to identify the set of interesting process views, and present a visual, interactive environment that allows users to efficiently navigate the views identified over a process space. The experiments show the viability and efficiency of the approach on both synthetic and real-world service logs.
0719 Experience and Trust: A Systems-Theoretic Approach Rex Kwok and Norman Foo
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {rkwok|norman}@cse.unsw.edu.au

Abhaya Nayak
Department of Computing
Division of Information and Communication Sciences
Macquarie University, NSW 2109, Australia
Email: abhaya@ics.mq.edu.au
At the core of scientific theories are laws. These laws often make use of theoretical terms, linguistic entities which do not refer directly to observables. There is therefore no direct way of determining which theoretical assertions are true. This suggests that multiple theories may exist which are incompatible with one another but compatible with all possible observations. Since such theories make the same empirical claims, empirical tests cannot be used to differentiate or rank them. One property that has been suggested for evaluating rival theories is coherence. This was investigated only qualitatively until we introduced a coherence measure based on the average use of formulas in support sets for observations. Our idea was to identify highly coherent theories as those whose formulas are tightly coupled in accounting for observations, while low coherence theories contain many disjointed and isolated statements. The present paper generalizes that measure to accommodate fundamental intuitions from the philosophy of science and to better mirror scientific practice. Moreover, the new approach is neutral with respect to the philosophy and practice of science, and is able to explain notions like modularization using coherence.
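
One plausible reading of such a support-set based measure, purely for illustration (the report's actual definition may well differ), scores each formula by how often it occurs in the support sets of the observations and averages over the theory:

    # Illustrative support-set coherence score (hypothetical reading, not
    # necessarily the report's definition). Tightly coupled theories reuse
    # formulas across many support sets; theories with many isolated
    # statements score low.

    def coherence(theory, support_sets):
        # support_sets: list of sets of formulas, one or more per observation
        if not support_sets:
            return 0.0
        usage = {f: sum(f in s for s in support_sets) for f in theory}
        return sum(usage.values()) / (len(theory) * len(support_sets))

    theory = {"A", "B", "C", "D"}
    supports = [{"A", "B"}, {"A", "C"}, {"A", "B"}]   # hypothetical support sets
    print(coherence(theory, supports))                # higher = more coherent
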
0718 Protocol Compatibility and Automatic Converter Synthesis Karin Avnit
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: kavnit@cse.unsw.edu.au

Vijay D'Silva
Department of Computer Science,
ETH Zurich
CH-8092 Zurich, Switzerland
E-mail: vdsilva@inf.ethz.ch

Arcot Sowmya
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: sowmya@cse.unsw.edu.au

S. Ramesh
GM India Science Lab
Bangalore India
E-mail: rameshari1958@gmail.com

Sri Parameswaran
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: sridevan@cse.unsw.edu.au
Hardware module reuse is a standard solution to deal with the increasing complexity of chip architectures and growing pressure to reduce time to market. In the absence of a single module interface standard, using pre-designed modules in a “plug and play” fashion usually requires a converter between incompatible interface protocols. Current approaches to automatic synthesis of protocol converters mostly lack formal foundations, and either employ abstractions far removed from implementation or grossly simplify the structure of the protocols considered. In this work, we present a state-machine based formalism for modeling bus based communication protocols, a notion of protocol compatibility, and a notion of correct conversion between incompatible protocols. Using this formalism, we derive algorithms for checking protocol compatibility and for automatic converter synthesis. We report our experience with automatic converter synthesis between different configurations of widely used commercial bus protocols, such as AMBA AHB, ASB APB, and the Open Core Protocol (OCP). The presented work is unique in its combination of a complete formal approach with a low abstraction level that enables precise modeling of protocol characteristics and simple translation to HDL.
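
The flavour of compatibility checking can be conveyed by a toy product-automaton exploration: two protocol state machines are flagged incompatible if a reachable joint state deadlocks. This sketch is heavily simplified relative to the report's bus-protocol formalism, which models channels, data and control behaviour precisely; all names are hypothetical.

    # Toy compatibility check by exploring the product of two protocol
    # state machines: incompatible if some reachable joint state has no
    # common action while at least one side still wants to move.

    def compatible(p, q):
        # p, q: dicts state -> {action: next_state}; both start in "s0".
        seen, stack = set(), [("s0", "s0")]
        while stack:
            sp, sq = stack.pop()
            if (sp, sq) in seen:
                continue
            seen.add((sp, sq))
            joint = set(p[sp]) & set(q[sq])        # actions both can take now
            if not joint and (p[sp] or q[sq]):     # someone wants to move, nobody can
                return False
            stack.extend((p[sp][a], q[sq][a]) for a in joint)
        return True

    master = {"s0": {"req": "s1"}, "s1": {"ack": "s0"}}
    slave  = {"s0": {"req": "s1"}, "s1": {"ack": "s0"}}
    print(compatible(master, slave))   # True for this toy handshake
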
0717 Experience and Trust: A Systems-Theoretic Approach Norman Foo
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: norman@cse.unsw.edu.au

Jochen Renz
Research School of Information Science and Engineering
Australian National University
Canberra ACT 2600, Australia
E-mail: jochen.renz@anu.edu.au
An influential model of agent trust and experience is that of Jonker and Treur. In that model, an agent uses its experience of the interactions of another agent to assess its trustworthiness. We show here that a key property of that model is subsumed by a result of classical mathematical systems theory. Using the latter theory, we also clarify the issue of when two experience sequences may be regarded as equivalent. An intuitive feature of the Jonker and Treur model is that experience sequence orderings are respected by functions that map such sequences to trust orderings. We raise a question about another intuitive property -- that of continuity of these functions, viz. that they map experience sequences that resemble each other to trust values that also resemble each other. Using fundamental results on the relationship between partial orders and topologies, we show that these two intuitive properties are essentially equivalent.
0716 Localized Minimum-Latency Broadcasting in Multi-Radio Multi-Channel Multi-Rate Wireless Mesh Networks J. Qadir, C.T. Chou, J.G. Lim
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {junaidq,ctchou,jool}@cse.unsw.edu.au

Archan Misra
Research Staff Member,
Next-Generation Web Infrastructure Dept.
IBM T J Watson Research Center
19 Skyline Drive,
Hawthorne, NY 10532, USA
Email: archan@us.ibm.com
We address the problem of minimizing the worst-case broadcast delay in ``multi-radio multi-channel multi-rate wireless mesh networks'' (MR2-MC WMN) in a distributed and localized fashion. Efficient broadcasting in such networks is especially challenging due to the desirability of exploiting the ``wireless broadcast advantage'' (WBA), the interface-diversity, the channel-diversity and the rate-diversity offered by these networks. We propose a framework that calculates a set of forwarding nodes, and the transmission rate at each forwarding node, irrespective of the broadcast source. Thereafter, a forwarding tree is constructed taking into consideration the source of the broadcast. Our broadcasting algorithms are distributed and utilize locally available information. To the best of our knowledge, this work constitutes the first contribution in the area of distributed broadcast in multi-radio multi-rate wireless mesh networks. We present a detailed performance evaluation of our distributed and localized algorithms and demonstrate that they can greatly improve broadcast performance by exploiting the rate, interface and channel diversity of MR2-MC WMNs, matching the performance of centralized algorithms proposed in the literature while utilizing only limited two-hop neighborhood information.
0715 Solving the expression problem in Haskell with true separate compilation Sean Seefried
Formal Methods
NICTA
Email: sean.seefried@nicta.com.au

Manuel M. T. Chakravarty

Programming Languages and Systems
School of Computer Science & Engineering
University of New South Wales
Email: chak@cse.unsw.edu.au
We present a novel solution to the expression problem which offers true separate compilation and can be used in existing Haskell compilers that support multi-parameter type classes and recursive dictionaries. The solution is best viewed as both a programming idiom, allowing a programmer to implement open data types and open functions, and the target encoding of a translation from Haskell augmented with syntactic sugar.
0714 Resource-aware Broadcast and Multicast in Multi-rate Wireless Mesh Networks Bao Hua Liu
Thales Australia - Joint Systems, Garden Island, NSW 2011, Australia

Chun Tung Chou
School of Computer Science and Engineering,
University of New South Wales, Australia
ctchou@cse.unsw.edu.au

Archan Misra
IBM T J Watson Research Center,
Hawthorne, New York, USA
archan@us.ibm.com

Sanjay Jha
School of Computer Science and Engineering,
University of New South Wales, Australia
This paper studies some of the fundamental challenges and opportunities associated with network-layer broadcast and multicast in a multihop multirate wireless mesh network (WMN). In particular, we focus on exploiting the ability of nodes to perform link-layer broadcasts at different rates (with correspondingly different coverage areas). We first show how, in the broadcast wireless medium, the available capacity at a mesh node for a multicast transmission is not just a function of the aggregate pre-existing traffic load of other interfering nodes, but is intricately coupled to the actual (sender, receiver) set and the link-layer rate of each individual transmission. We then present and study six alternative heuristic strategies for computing a broadcast tree that not only factors in a flow's traffic rate but also exploits the wireless broadcast advantage (WBA). Finally, we demonstrate how our insights can be extended to multicast routing in a WMN, and present results that show how a tree-formation algorithm that combines contention awareness with transmission rate diversity can significantly increase the total amount of admissible multicast traffic load in a WMN.
0713 A Stronger Notion of Equivalence for Logic Programs Ka-Shu Wong
School of Computer Science and Engineering
and National ICT Australia
University of New South Wales
NSW 2052, Australia
E-mail: kswong@cse.unsw.edu.au
Several different notions of equivalence have been proposed for logic programs with answer set semantics, most notably strong equivalence. However, strong equivalence is not preserved by certain logic program operators, such as the strong and weak forgetting operators of Zhang and Foo, in the sense that two programs which are strongly equivalent may no longer be strongly equivalent after the same operator is applied to both. We propose the stronger notion of T-equivalence, which is designed to be preserved by logic program operators such as strong and weak forgetting. We give a syntactic definition of T-equivalence and provide a model-theoretic characterisation using what we call T-models. We show that strong and weak forgetting do preserve T-equivalence and, using this, arrive at a model-theoretic definition of the strong and weak forgetting operators in terms of T-models.
0712 Protocol Compatibility and Automatic Converter Synthesis Karin Avnit
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia.
E-mail: kavnit@cse.unsw.edu.au

Vijay D'Silva
Department of Computer Science,
ETH Zurich
CH-8092 Zurich, Switzerland.
E-mail: vdsilva@inf.ethz.ch

Arcot Sowmya
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia.
E-mail: sowmya@cse.unsw.edu.au

S. Ramesh
GM India Science Lab
Bangalore, India.
E-mail: rameshari1958@gmail.com

Sri Parameswaran
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: sridevan@cse.unsw.edu.au
Hardware module reuse is common practice to deal with the increasing complexity of chip architectures and growing pressure to reduce time to market. In the absence of module interface standards, use of pre-designed modules in a "plug and play" fashion usually requires a converter between incompatible interface protocols. Though several approaches to such mediation have been proposed in the past, automation of protocol converter synthesis is yet to be realized. In this work, we present a state-machine based formalism for modeling bus based communication protocols, a notion of protocol compatibility and of protocol conversion. Using this formalism, we devise algorithms for checking protocol compatibility and for automatic converter synthesis. We report our experience with automatic converter synthesis for commercial bus protocols. The presented work is unique in its low abstraction level that enables precise modeling of protocol characteristics and simple translation to HDL.
0711 Topology Control and Channel Assignment in Multi-Radio Multi-Channel Wireless Mesh Networks Anjum Naveed
School of Computer Science and Engineering
The University of New South Wales,
NSW 2052, Australia
E-mail: anaveed@cse.unsw.edu.au

Salil S. Kanhere
School of Computer Science and Engineering
The University of New South Wales,
NSW 2052, Australia
E-mail: salilk@cse.unsw.edu.au

Sanjay K. Jha
School of Computer Science and Engineering
The University of New South Wales,
NSW 2052, Australia
E-mail: sjha@cse.unsw.edu.au
The aggregate capacity of wireless mesh networks can be improved significantly by equipping each node with multiple interfaces and by using multiple channels to reduce the effect of interference. Efficient channel assignment is required to ensure the optimal use of the limited channels in the radio spectrum. In this paper, a Cluster-based Multipath Topology control and Channel assignment scheme (CoMTaC) is proposed, which explicitly separates the channel assignment and topology control functions, thus minimizing flow disruptions. A cluster-based approach is employed to ensure basic network connectivity. Intrinsic support for broadcasting with minimal overheads is also provided. CoMTaC also takes advantage of the inherent multiple paths that exist in a typical WMN by constructing a spanner of the network graph and using the additional node interfaces. The second phase of CoMTaC proposes a dynamic distributed channel assignment algorithm, which employs a novel interference estimation mechanism based on the average link-layer queue length within the interference domain. Partially overlapping channels are also included in the channel assignment process to enhance network capacity. Extensive simulation-based experiments have been conducted to test various parameters and the effectiveness of the proposed scheme. The experimental results show that the proposed scheme outperforms existing dynamic channel assignment schemes by at least a factor of 2.
0710 Generative Code Specialisation for High-Performance Monte-Carlo Simulations Don Stewart(1)
Hugh Chaffey-Millar(2)
Gabriele Keller(1)
Manuel M. T. Chakravarty(1)
Christopher Barner-Kowollik(2)

(1) Programming Languages and Systems
School of Computer Science & Engineering
University of New South Wales

(2) Centre for Advanced Macromolecular Design
School of Chemical Sciences and Engineering
University of New South Wales
We address the tension between software generality and performance in the domain of scientific and financial simulations based on Monte-Carlo methods. To this end, we present a novel software architecture, centred around the concept of a specialising simulator generator, that combines and extends methods from generative programming, partial evaluation, runtime code generation, and dynamic code loading. The core tenet is that, given a fixed simulator configuration, a generator in a functional language can produce low-level code that is more highly optimised than a manually implemented generic simulator. We also introduce a skeleton, or template, capturing a wide range of Monte-Carlo methods and use it to explain how to design specialising simulator generators and how to generate parallelised simulators for multi-core and distributed-memory multiprocessors. We evaluated the practical benefits and limitations of our approach by applying it to a highly relevant problem in computational chemistry. More precisely, we used a Markov-chain Monte-Carlo method for the study of advanced forms of polymerisation kinetics. The resulting implementation executes faster than all competing software products, while at the same time also being more general. The generative architecture allows us to cover a wider range of chemical reactions and to target a wider range of high-performance architectures (such as PC clusters and SMP multiprocessors). We show that it is possible to outperform low-level languages with functional programming in domains with very stringent performance requirements if the domain also demands generality.
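
The core architectural idea, generating specialised code once the simulator configuration is fixed and compiling it at run-time, can be shown in miniature. The report's generator is written in a functional language and emits optimised low-level code for Monte-Carlo polymerisation kinetics; the sketch below only mirrors the architecture, and the reaction rates are hypothetical.

    # Toy specialising simulator generator: bake a fixed rate table into
    # straight-line source code, then compile it at runtime (runtime code
    # generation + dynamic loading, in miniature).

    import random

    def generate_simulator(rates):
        # Unroll the fixed reaction-rate table into specialised source code.
        body = "\n".join(
            f"    if r < {sum(rates[:i+1])}: return {i}"
            for i in range(len(rates)))
        src = (f"def step(rng):\n"
               f"    r = rng.random() * {sum(rates)}\n{body}\n"
               f"    return {len(rates) - 1}\n")
        namespace = {}
        exec(src, namespace)          # compile the specialised simulator
        return namespace["step"]

    step = generate_simulator([0.5, 0.3, 0.2])   # hypothetical reaction rates
    rng = random.Random(42)
    print([step(rng) for _ in range(10)])
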
0709 Message Correlation for Conversation Reconstruction in Service Interaction Logs Hamid Reza Motahari-Nezhad, Régis Saint-Paul, Boualem Benatallah
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia

Fabio Casati, Periklis Andritsos
University of Trento
Trento, Italy
The problem of understanding the behavior of business processes and of services is rapidly becoming a priority in medium and large companies. To this end, analysis tools as well as variations of data mining techniques have recently been applied to process and service execution logs to perform OLAP-style analysis and to discover behavioral (process and protocol) models from execution data. All these approaches are based on one key assumption: events describing executions and stored in process and service logs include identifiers that allow associating each event with the process or service execution it belongs to (e.g., so that one can correlate all events related to the processing of a certain purchase order or to the hiring of a given employee). In reality, however, such information rarely exists. In this paper, we present a framework for discovering correlations among messages in service logs. We characterize the problem of message correlation and propose novel algorithms and techniques based on well-founded principles and on heuristics about the characteristics of conversations and of the message attributes that can act as identifiers for such conversations. As we will show, there is no right or wrong way to correlate messages, and such correlation is necessarily subjective. To account for this subjectiveness, we propose an approach where algorithms suggest candidate correlators, provide measures that help users understand the implications of choosing a given correlator, and organize candidate correlators in such a way as to facilitate visual exploration. The approach has been implemented, and experimental results show its viability and scalability on large synthetic and real-world datasets. We believe that message correlation is a very important and challenging area of research that will witness many contributions in the near future, due to the pressing industry need for process and service execution analysis.
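
As a minimal illustration of candidate correlators, one can group messages by each candidate attribute and compute a simple statistic such as average conversation length; the report's algorithms and measures are considerably richer, and the attributes and data below are hypothetical.

    # Score candidate correlator attributes by grouping messages on each
    # attribute's value and reporting the average group ("conversation")
    # size. Illustration only; not the report's actual measures.

    from collections import defaultdict

    def score_correlators(messages, attributes):
        scores = {}
        for attr in attributes:
            groups = defaultdict(list)
            for m in messages:
                if attr in m:
                    groups[m[attr]].append(m)
            sizes = [len(g) for g in groups.values()]
            if sizes:
                scores[attr] = sum(sizes) / len(sizes)   # avg conversation length
        return scores

    log = [{"orderId": 1, "ip": "10.0.0.1"}, {"orderId": 1, "ip": "10.0.0.2"},
           {"orderId": 2, "ip": "10.0.0.1"}, {"orderId": 2, "ip": "10.0.0.2"}]
    print(score_correlators(log, ["orderId", "ip"]))
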
0708 SPARK: Top-k Keyword Query in Relational Databases Yi Luo (1), Xuemin Lin (1), Wei Wang (1), and Xiaofang Zhou (2)

1: School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {luoyi, lxue, weiw}@cse.unsw.edu.au

2: School of Information Technology and Electrical Engineering
University of Queensland
Australia
E-mail: zxf@itee.uq.edu.au
With the increasing amount of text data stored in relational databases, there is a demand for RDBMSs to support keyword queries over text data. As a search result is often assembled from multiple relational tables, traditional IR-style ranking and query evaluation methods cannot be applied directly. In this paper, we study the effectiveness and efficiency issues of answering top-k keyword queries in relational database systems. We propose a new ranking formula by adapting existing IR techniques, based on a natural notion of virtual document. Compared with previous approaches, our new ranking method is simple yet effective, and agrees with human perception. We also study efficient query processing methods for the new ranking method, and propose algorithms that make minimal accesses to the database. We have conducted extensive experiments on large-scale real databases using two popular RDBMSs. The experimental results demonstrate significant improvements over alternative approaches in terms of retrieval effectiveness and efficiency.
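
The virtual-document idea can be illustrated with a stock TF-IDF scorer applied to the concatenated text of joined tuples. SPARK's actual ranking formula additionally normalises for the size of the joined tuple tree and combines further factors; this sketch, with made-up data, only conveys the basic notion.

    # Score a "virtual document" (text of tuples joined via foreign keys)
    # with a standard TF-IDF formula. Illustration only; not SPARK's
    # actual ranking function.

    import math

    def tfidf_score(virtual_doc, query_terms, corpus):
        n = len(corpus)
        score = 0.0
        for t in query_terms:
            tf = virtual_doc.count(t)
            df = sum(1 for d in corpus if t in d) or 1
            score += (1 + math.log(1 + tf)) * math.log(1 + n / df) if tf else 0
        return score

    corpus = [["netflix", "dvd", "rental"], ["dvd", "player"], ["rental", "car"]]
    joined = corpus[0] + corpus[1]            # two tuples joined on a key
    print(tfidf_score(joined, ["dvd", "rental"], corpus))
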
0707 Adapting Web Service Interfaces and Protocols Using Adapter Simulation Hamid R. Motahari Nezhad, Boualem Benatallah
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia

Axel Martens, Francisco Curbera
IBM TJ Watson Research Center
New York, USA

Fabio Casati
University of Trento
Trento, Italy
In today’s Web, many functionality-wise similar Web services are offered through heterogeneous interfaces (operation definitions) and business protocols (ordering constraints defined on legal operation invocation sequences). The typical approach to enabling interoperation in such a heterogeneous setting is to develop adapters. There have been approaches to classifying the possible mismatches between service interfaces and business protocols to facilitate adapter development. However, the hard job is identifying, given two service specifications, the actual mismatches between their interfaces and business protocols. In this paper we present novel techniques and a tool that provide semi-automated support for identifying and resolving mismatches between service interfaces and protocols, and for generating adapter specifications. We make the following main contributions: (i) we identify mismatches between service interfaces, covering mismatches of signature type, merge/split, and extra/missing messages; (ii) we identify all ordering mismatches between service protocols and generate a tree, called the mismatch tree, for mismatches that require developers’ input for their resolution. In addition, we provide semi-automated support for analyzing the mismatch tree to help resolve such mismatches. We have implemented the approach in a tool inside IBM WID (WebSphere Integration Developer). Our experiments with real-world case studies show the viability of the proposed approach. The methods and tool are significant in that they considerably simplify the problem of adapting services so that interoperation is possible.
0706 The Operational Semantics of Dual Logic Programming Eric A. Martin
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
Many developments in the field of Logic programming reveal poor choices of basic concepts, and misleading views on negation. We show that Logic programming can enjoy a general, simple and clean foundation, provided that the basic concepts are revisited, and that any nonclassical form of negation, in particular negation as finite failure, is disregarded. We propose a framework that starts with a rule-based notion of dual logic program, allowing negation to appear in the heads as well as in the bodies of rules. Dual logic programs can be treated as very general logic programs, but it is conceptually more beneficial to view them as redefined logic programs. Classical semantics, such as the Kripke-Kleene semantics, the well-founded semantics, and the stable model semantics, are redefined as natural program transformations of the same kind, resulting in dual logic programs whose behaviors can be described in terms of the same notion of logical consequence in a classical semantics. A kind of inference `in the style of Logic programming' from arbitrary theories is presented, together with a natural method to transform arbitrary theories into dual logic programs.
0705 WS-Policy4MASC – A WS-Policy Extension Used in the Manageable and Adaptable Service Compositions (MASC) Middleware Vladimir Tosic, Abdelkarim Erradi, Piyush Maheshwari
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
WS-Policy4MASC is a new XML language that we have developed for policy specification in the Manageable and Adaptable Service Compositions (MASC) middleware. It can also be used for other Web service middleware. It extends the Web Services Policy Framework (WS-Policy) by defining new types of policy assertions. Goal policy assertions specify requirements and guarantees (e.g., maximal response time) to be met in desired normal operation. They guide monitoring activities in MASC. Action policy assertions specify actions to be taken if certain conditions are or are not met (e.g., some guarantees were not satisfied). They guide adaptation and other control actions. Utility policy assertions specify monetary values assigned to particular situations (e.g., execution of some action). They can be used by MASC for billing and for selection between alternative action policy assertions. Meta-policy assertions can be used to specify which action policy assertions are alternatives and which conflict resolution strategy (e.g., profit maximization) should be used. In addition to these four new types of policy assertions, WS-Policy4MASC enables specification of additional information that is necessary for run-time policy-driven management. This includes information about the conditions under which policy assertions are evaluated/executed, the parties performing this evaluation/execution, the party responsible for meeting a goal policy assertion, ontological meaning, monitored data items, states, state transitions, schedules, events, and various expressions. We have evaluated the feasibility of the WS-Policy4MASC solutions by implementing a policy repository and other modules in MASC. Further, we have examined their usefulness on a set of realistic stock trading scenarios.
0704 Localized Minimum-Latency Broadcasting in Multi-rate Wireless Mesh Networks Junaid Qadir (^1), Chun Tung Chou (^1), Archan Misra (^2) and Joo Ghee Lim (^1)

1: School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: [junaidq, ctchou, jool]@cse.unsw.edu.au

2: IBM T J Watson Research Center
19 Skyline Drive,
Hawthorne, NY 10532, USA
E-mail: archan@us.ibm.com
We address the problem of minimizing the worst-case broadcast delay in multi-rate wireless mesh networks (WMN) in a distributed and localized fashion. Efficient broadcasting in such networks is especially challenging due to the multi-rate transmission capability of WMN nodes and the interference between their wireless transmissions. We propose a connected dominating set (CDS) based broadcast routing approach which calculates the set of forwarding nodes, and the transmission rate at each forwarding node, independently of the broadcast source. Thereafter, a forwarding tree is constructed taking into consideration the source of the broadcast. In this paper, we propose three distributed and localized rate-aware broadcast algorithms. We compare the performance of our distributed and localized algorithms with previously proposed centralized algorithms and observe that the performance gap is not large. We show that our algorithms greatly improve on the performance of rate-unaware broadcast algorithms by incorporating rate-awareness into the broadcast tree construction process.
0703 Sensing Data Market: Architecture, Applications and Challenges Chun Tung Chou
School of Computer Science and Engineering,
University of New South Wales, Australia
ctchou@cse.unsw.edu.au

Nirupama Bulusu
Department of Computer Science,
Portland State University,
USA
nbulusu@cs.pdx.edu

Salil Kanhere
School of Computer Science and Engineering,
University of New South Wales, Australia
salilk@cse.unsw.edu.au
With the rapid development of the Internet and various forms of wireless communication technology, information-on-demand has become a reality; people today routinely receive news items, stock prices, etc. via their mobile phones or RSS feeds. We believe there is a real demand from users for all kinds of information, and especially for sensing data. This paper proposes a new network-based service called the Sensing Data Market (SenseMart). The defining characteristic of SenseMart is that users share their sensing data among themselves. In other words, SenseMart facilitates the exchange (in the sense of a marketplace) of sensing data and can be viewed as the "Napster" of sensing data. This paper discusses possible architectures for SenseMart and the research challenges in realising it.
0702 WS-Policy4MASC Version 0.8 Vladimir Tosic, Abdelkarim Erradi, Piyush Maheshwari
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
WS-Policy4MASC is a new XML language that we have developed for policy specification in the Manageable and Adaptable Service Compositions (MASC) middleware. It can also be used for other Web service middleware. It extends the Web Services Policy Framework (WS-Policy) by defining new types of policy assertions. Goal policy assertions specify requirements and guarantees (e.g., maximal response time) to be met in desired normal operation. They guide monitoring activities in MASC. Action policy assertions specify actions to be taken if certain conditions are or are not met (e.g., some guarantees were not satisfied). They guide adaptation and other control actions. Utility policy assertions specify monetary values assigned to particular situations (e.g., execution of some action). They can be used by MASC for billing and for selection between alternative action policy assertions. Meta-policy assertions can be used to specify which action policy assertions are alternatives and which conflict resolution strategy (e.g., profit maximization) should be used. In addition to these four new types of policy assertions, WS-Policy4MASC enables specification of additional information that is necessary for run-time policy-driven management. This includes information about the conditions under which policy assertions are evaluated/executed, the parties performing this evaluation/execution, the party responsible for meeting a goal policy assertion, ontological meaning, monitored data items, states, state transitions, schedules, events, and various expressions. This research report provides technical details of the WS-Policy4MASC solutions. First, we summarize the need for the language. Next, we list the requirements we have identified for a policy language to support middleware for QoS-aware and adaptive Web service composition. Then, we explain and discuss many of the architectural decisions in the language, illustrated with diagrams (XmlSpy diagrams of XML Schemas and UML diagrams) and examples. The Appendices contain XML files of detailed examples and the XML Schemas of the WS-Policy4MASC language grammar.
0701 AC-Index: An Efficient Adaptive Index for Branching XML Queries Bo Zhang (^1), Wei Wang (^2), Xiaoling Wang (^1) and Aoying Zhou (^1)

1. Department of Computer Science and Engineering
Fudan University
Shanghai 200433, P. R. China
E-mail: {zhangbo, wxling, ayzhou}@fudan.edu.cn

2. School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: weiw@cse.unsw.edu.au
Query-adaptive XML indexing has been proposed and shown to be an efficient way to accelerate XML query processing, because the index dynamically adapts to the workload. However, existing adaptive indexes suffer from a number of issues, such as lack of support for general types of XML queries and unsatisfactory query and update performance. In this paper, we propose a new query-adaptive index named AC-Index. It is designed to support XML path queries with branching predicates. We propose efficient index construction, query processing, and index adaptation algorithms for the AC-Index, together with a number of optimizations that further boost the performance of the index. Our experimental results demonstrate that the AC-Index significantly outperforms previous approaches in terms of query processing and adaptation efficiency.
0625 Protocol Compatibility and Converter Synthesis Karin Avnit
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: kavnit@cse.unsw.edu.au

Vijay D'Silva
Department of Computer Science,
ETH Zurich
CH-8092 Zurich, Switzerland
E-mail: vdsilva@inf.ethz.ch

Arcot Sowmya
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: sowmya@cse.unsw.edu.au

S. Ramesh
Department of Computer Science and Engineering
Indian Institute of Technology Powai, Bombay 400 076
E-mail: ramesh@cse.iitb.ac.in

Sri Parameswaran
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: sridevan@cse.unsw.edu.au
With the increasing complexity of chip architectures and growing pressure for short time to market, hardware module reuse is common practice. However, in the absence of module interface standards, use of pre-designed modules in a "plug and play" fashion usually requires a mediator between mismatched interface protocols. Though several approaches to such mediation have been proposed, automation of protocol converter synthesis is yet to be realized. In this work we focus on the framework of state-based protocol models. We present a formalism for modeling bus-based communication protocols, together with notions of protocol compatibility and of a protocol converter. Using this formalism, we provide algorithms for checking compatibility and demonstrate the process of automatic converter synthesis for commercial bus protocols.
0624 System F with Type Equality Coercions Martin Sulzmann
National University of Singapore
Email: sulzmann@comp.nus.edu.sg

Manuel M. T. Chakravarty
Computer Science and Engineering
University of New South Wales
Email: chak@cse.unsw.edu.au

Simon Peyton Jones and Kevin Donnelly
Microsoft Research Ltd
Cambridge, England
Email: {simonpj,t-kevind}@microsoft.com
We introduce System FC, which extends System F with support for non-syntactic type equality. There are two main extensions: (i) explicit witnesses for type equalities, and (ii) open, non-parametric type functions, given meaning by top-level equality axioms. Unlike System F, FC is expressive enough to serve as a target for several different source-language features, including Haskell's newtype, generalised algebraic data types, associated types, functional dependencies, and perhaps more besides.
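
The essence of extension (i) is the cast construct: a coercion witness gamma proving the equality of two types lets an expression of the first type be used at the second. In notation close to (though not necessarily identical with) the paper's:

    \[
      \frac{\Gamma \vdash e : \sigma_1 \qquad
            \Gamma \vdash \gamma : \sigma_1 \sim \sigma_2}
           {\Gamma \vdash (e \triangleright \gamma) : \sigma_2}
      \;(\textsc{Cast})
    \]

Because coercions are erasable witnesses rather than computations, such casts carry no run-time cost.
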
0623 A Framework for Protocol Discovery from Real-World Service Conversation Logs Hamid Reza Motahari-Nezhad, Régis Saint-Paul, Boualem Benatallah
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia

Fabio Casati
University of Trento
Trento, Italy
Understanding the business (interaction) protocol supported by a service is very important for both clients and service providers: it allows developers to know how to write clients that interact with a service, and it allows development tools and runtime middleware to deliver functionality that simplifies the service development lifecycle. It also greatly facilitates the monitoring, visualization, and aggregation of interaction data. This paper presents a framework for discovering protocol definitions from real-world service interaction logs. It first describes the challenges of protocol discovery in such a context. It then presents a novel discovery approach which is widely applicable, robust to the different kinds of imperfection often present in real-world service logs, and, thanks to heuristics, helps to derive protocol models of small size. Since finding the most precise and smallest model from imperfect service logs is algorithmically infeasible, the paper finally presents an approach to refining the discovered protocol via user interaction, to compensate for possible imprecision introduced in the discovered model. The approach has been implemented, and experimental results show its viability on both synthetic and real-world datasets.
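
At its simplest, protocol discovery folds observed conversations into a transition relation; the bare-bones sketch below builds a prefix-tree acceptor from message sequences. The report's contribution lies precisely in what this sketch omits: robustness to log imperfections, keeping the model small, and user-driven refinement.

    # Build a prefix-tree acceptor from observed conversations: states are
    # identified by the message history seen so far. Illustration only;
    # the report's discovery algorithm is far more sophisticated.

    def discover_protocol(conversations):
        transitions = {}                      # (state, message) -> state
        fresh = iter(range(1, 10**6))
        for conv in conversations:
            state = 0
            for msg in conv:
                if (state, msg) not in transitions:
                    transitions[(state, msg)] = next(fresh)
                state = transitions[(state, msg)]
        return transitions

    logs = [["login", "search", "logout"],
            ["login", "search", "search", "logout"]]
    for (s, m), t in sorted(discover_protocol(logs).items()):
        print(f"{s} --{m}--> {t}")
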
0622 Securing Channel Assignment in Multi-Radio Multi-Channel Wireless Mesh Networks Aftabul Haq, Anjum Naveed and Salil Kanhere
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {ahaq,anaveed,salilk}@cse.unsw.edu.au
In order to fully exploit the aggregate bandwidth available in the radio spectrum, future Wireless Mesh Networks (WMN) are expected to take advantage of multiple orthogonal channels, where nodes have the ability to communicate with multiple neighbours simultaneously using multiple radios (NICs) over orthogonal channels. Dynamic channel assignment is critical for ensuring effective utilization of the non-overlapping channels, and several algorithms aiming to achieve this have been proposed in recent years. However, all these schemes inherently assume that the mesh nodes are well-behaved, without malicious intentions. Recent work has exposed vulnerabilities in channel assignment algorithms. In this paper, a mechanism is proposed to secure channel assignment algorithms, addressing the security vulnerabilities in the existing algorithms. The proposed mechanism successfully protects the WMN against the recently exposed attacks. Simulation-based experiments show the effectiveness of the proposed solution, and also show that the overhead incurred by the security mechanism is negligible.
0620 Patterns and the B Method: Bridging Formal and Informal Development Edward Chan
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: ekfchan@gmail.com

Brett Welch
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: brett.welch@gmail.com

Ken Robinson
School of Computer Science and Engineering
University of New South Wales
NSW 2052, Australia
E-mail: k.robinson@unsw.edu.au
In a world increasingly dependent on software-controlled systems, the need for verification of software safety and correctness has never been greater. Traditional software development methods leave much to be desired in this respect, relying heavily on testing, which can be costly and time-inefficient. A more efficient and less error-prone approach is to use formal methods, in particular the B method, to develop software. This thesis explores concepts and methods to assist developers in using formal methods by borrowing concepts from the Object Oriented world of software development. Previous attempts have tried to adapt the B method to the Object Oriented paradigm. This thesis presents an alternative approach that adapts concepts borrowed from the Object Oriented paradigm to the B method. By concentrating on commonly occurring patterns in software development and drawing inspiration from the traditional Gang of Four design patterns, this thesis presents a series of patterns adapted to and specialised for the B method, demonstrating how the beginnings of complex and significant systems can be modelled in B.
0619 Energy Driven Application Self-Adaptation at Run-time Jorgen Peddersen and Sri Parameswaran
School of Computer Science and Engineering
National ICT Australia
University of New South Wales
Sydney 2052 Australia
E-mail: {jorgenp, sridevan}@cse.unsw.edu.au
Until recently, there has been a lack of methods to trade off energy use against quality of service at run-time in stand-alone embedded systems. Such methods are motivated by the need to increase the apparent available battery energy of portable devices with minimal compromise in quality. The available systems either drew too much power or added considerable overheads due to task swapping. In this paper we demonstrate a feasible method to perform these trade-offs. This work is enabled by a low-impact power/energy estimating processor which utilizes counters to estimate power and energy consumption at run-time. Techniques are shown that modify multimedia applications to vary the fidelity of their output to optimize the energy/quality trade-off. Two adaptation algorithms are applied to multimedia applications, demonstrating the efficacy of the method. The method increases code size by 1% and execution time by 0.02%, yet is able to produce acceptable output while processing up to double the number of frames.
0618 CLIPPER: Counter-based Low Impact Processor Power Estimation at Run-time Jorgen Peddersen and Sri Parameswaran
School of Computer Science and Engineering
National ICT Australia
University of New South Wales
Sydney 2052 Australia
E-mail: {jorgenp, sridevan}@cse.unsw.edu.au
Numerous dynamic power management techniques have been proposed which utilize knowledge of processor power/energy consumption at run-time. So far, no efficient method to provide run-time power/energy data has been presented. Current measurement systems draw too much power to be used in small embedded designs, and existing performance counters cannot provide sufficient information for run-time optimization. This paper presents a novel methodology to solve the problem of run-time power optimization by designing a processor that estimates its own power/energy consumption. Estimation is performed by the addition of small counters that tally events which consume power. This methodology has been applied to an existing processor, resulting in an average power estimation error of 2% and an energy estimation error of 1.5%. The system has little impact on the design, with only a 4.9% increase in chip area and a 3% increase in average power consumption. A case study of an application that utilizes the processor showcases the benefits the methodology enables in dynamic power optimization.
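
The underlying estimation model is essentially a weighted sum of event counts. A minimal sketch, with hypothetical events, weights and clock frequency rather than CLIPPER's fitted model:

    # Counter-based power estimation: total energy is approximated as a
    # weighted sum of event counts (weights fitted offline against a
    # gate-level or measured power trace), then divided by elapsed time.
    # All events and numbers below are hypothetical.

    def estimate_power(counters, weights, base_power, cycles, f_hz=50e6):
        energy = sum(weights[e] * counters.get(e, 0) for e in weights)  # joules
        return base_power + energy * f_hz / cycles                     # watts

    weights  = {"alu_op": 2.1e-9, "mem_access": 9.5e-9, "branch": 1.4e-9}  # J/event
    counters = {"alu_op": 420_000, "mem_access": 61_000, "branch": 83_000}
    print(estimate_power(counters, weights, base_power=0.015, cycles=1_000_000))
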
0617 Modeling Path Length in Wireless Ad-hoc Network Quan Jun Chen, Salil S. Kanhere, Mahbub Hassan
School of Computer Science and Engineering
University of New South Wales, Australia
E-mail: {quanc, salilk, mahbub}@cse.unsw.edu.au
Path length (i.e., the number of hops), a fundamental metric of multi-hop wireless networks, has a decisive effect on network performance measures such as throughput, end-to-end delay and energy consumption. In this paper, we propose a stochastic-process based mathematical model to analyze the shortest path length. The model is based on the observation that greedy forwarding in geographic routing can approximately find the shortest path in reasonably dense networks. We present a formula for the probability mass function of the path length, given the distance between source and destination. In addition, we propose a simple but efficient formula to estimate the mean path length. Our analytical results are well supported by a rich set of simulations, in which both random and realistic mobility scenarios have been investigated.
0615 GOLD: An Overlay Multicast Tree with Globally Optimized Latency and Out-Degree Jun Guo, Sanjay Jha
School of Computer Science and Engineering
The University of New South Wales, Australia
Email: {jguo,sjha}@cse.unsw.edu.au

Suman Banerjee
Department of Computer Sciences
University of Wisconsin-Madison, USA
Email: suman@cs.wisc.edu
End-to-end delay and interface bandwidth usage are two important performance metrics in overlay multicast networks. This paper demonstrates that, by jointly optimizing the placement of multicast service nodes and the routing strategy for overlay multicast networks, it is possible to find an overlay multicast tree that both minimizes the maximum end-to-end delay between the source and the destinations, and balances the interface bandwidth usage within the set of multicast service nodes. Motivated by this important observation, we propose in this paper a joint optimization problem for overlay multicast networks, in which we wish to find a globally optimized overlay multicast tree with minimum average end-to-end delay, subject to two stringent constraints: 1) the maximum end-to-end delay is bounded by the unicast latency from the source to the farthest destination in the physical topology; 2) the interface bandwidth usage is balanced within the set of multicast service nodes. This problem is shown to be NP-hard. We present a low-complexity greedy algorithm that obtains good-quality approximate solutions to the problem. We further show how the greedy algorithm enables the design of a weight-coded genetic algorithm that achieves closer-to-optimal solutions with reasonable computational complexity.
0614 System F with Type Equality Coercions (superseded by TR 0624) Martin Sulzmann
National University of Singapore
Email: sulzmann@comp.nus.edu.sg

Manuel M. T. Chakravarty
Computer Science and Engineering
University of New South Wales
Email: chak@cse.unsw.edu.au

Simon Peyton Jones and Kevin Donnelly
Microsoft Research Ltd
Cambridge, England
Email: {simonpj,t-kevind}@microsoft.com
We introduce System FC, which extends System F with support for non-syntactic type equality. There are two main extensions: (i) explicit witnesses for type equalities, and (ii) non-parametric type functions, given meaning by top-level equality axioms. Unlike System F, FC is expressive enough to serve as a target for several different source-language features, including Haskell's newtype, generalised algebraic data types, associated types, functional dependencies, and perhaps more besides. FC can therefore serve as a typed intermediate language in a compiler that supports these features.
0613 An Interim Report of XML Process Models Tu Tak Tran
School of Computer Science and Engineering
University of New South Wales
Sydney NSW 2052 Australia
Email: ttt@cse.unsw.edu.au
This report presents preliminary models for XML processes, and a general framework for their usage. The models are at an early stage of development and are not yet complete. This report serves as an interim record for further development in XML process modeling.
0611 A Graph Drawing Approach to Sensor Network Localization Muhammad S. Nawaz and Sanjay K. Jha
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {mnawaz,sjha}@cse.unsw.edu.au
In this work, we present a centralized localization mechanism for wireless sensor networks. Our method is based on a graph drawing algorithm and utilizes inter-node distances to localize sensor nodes in a local coordinate system, up to a global translation, rotation and reflection, without any absolute reference positions such as GPS or other anchor nodes. We show through simulations that it is possible to avoid folds and flips in the localized network layout if the entire topology is considered as a whole, as opposed to considering distances to immediate neighbors only. We assess the effect of different parameters, such as scale, node degree and ranging noise, on our algorithm. Finally, we propose a hierarchical approach to make the algorithm scalable for large networks, which we would like to pursue as future work.
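
For intuition, classical multidimensional scaling (MDS) recovers coordinates from a complete inter-node distance matrix up to exactly the ambiguities mentioned above (translation, rotation, reflection). The report's graph-drawing algorithm differs in detail, e.g. in coping with missing and noisy ranges; this sketch shows only the core idea.

    # Classical MDS: recover 2D coordinates from pairwise distances, up to
    # translation, rotation and reflection. Illustration of the anchor-free
    # setting only; not the report's algorithm.

    import numpy as np

    def classical_mds(d, dim=2):
        n = d.shape[0]
        j = np.eye(n) - np.ones((n, n)) / n      # centring matrix
        b = -0.5 * j @ (d ** 2) @ j              # double-centred squared distances
        vals, vecs = np.linalg.eigh(b)
        idx = np.argsort(vals)[::-1][:dim]       # top-`dim` eigenpairs
        return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))

    pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    d = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
    print(np.round(classical_mds(d), 3))         # unit square, possibly rotated
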
0610 Patching Approximate Solutions in Reinforcement Learning Min Sub Kim
ARC Centre of Excellence for Autonomous Systems
School of Computer Science and Engineering
University of New South Wales
Sydney NSW 2052 Australia
msk@cse.unsw.edu.au

William Uther
National ICT Australia
Sydney NSW 2052 Australia
william.uther@nicta.com.au
This report introduces an approach to improving an approximate solution in reinforcement learning by augmenting it with a small overriding patch. Many approximate solutions are smaller and easier to produce than a flat solution, but the best solution within the constraints of the approximation may fall well short of global optimality. We present an algorithm for efficiently learning a small patch to reduce this gap. Empirical evaluation demonstrates the effectiveness of patching, producing combined solutions that are much closer to global optimality.
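
A minimal sketch of the overriding-patch idea: a small table of learned Q-values takes precedence in the few states where it has entries, and the approximate base solution applies everywhere else. How the patch states are chosen and trained is the substance of the report and is not reproduced here; all names are hypothetical.

    # Overriding patch over an approximate base policy: consult the patch
    # table first, fall through to the base solution otherwise.

    class PatchedPolicy:
        def __init__(self, base_policy, actions):
            self.base = base_policy        # state -> action (approximate solution)
            self.actions = actions
            self.patch_q = {}              # (state, action) -> learned Q-value

        def act(self, state):
            qs = {a: self.patch_q[(state, a)]
                  for a in self.actions if (state, a) in self.patch_q}
            if qs:
                return max(qs, key=qs.get)        # patch overrides the base
            return self.base(state)               # fall through otherwise

    policy = PatchedPolicy(base_policy=lambda s: "left", actions=["left", "right"])
    policy.patch_q[(3, "right")] = 1.0            # hypothetical learned patch entry
    print(policy.act(0), policy.act(3))           # -> left right
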
0609 On building 3D maps using a Range Camera: Applications to Rescue Robotics Raymond Sheh, M. Waleed Kadous and Claude Sammut
ARC Centre of Excellence for Autonomous Systems
School of Computer Science and Engineering
The University of New South Wales
Sydney, NSW, 2052, Australia
[rsheh|waleed|claude]@cse.unsw.edu.au
It is critical in many mobile robotics applications to characterise the presence and position of objects around the robot. This is the case whether the mobile robot is under autonomous or teleoperative control. In this paper, we examine the use of a CSEM SwissRanger SR-2 3D range camera, which allows the generation of dense, accurate 3D point clouds around a mobile robot. Combined with other data sources, such as video cameras, this allows the creation of 3D maps that can be used for 'fly-throughs'. Furthermore, the same technique allows a teleoperator to very accurately place landmarks within the 3D maps. As this device is still somewhat prototypical, we also discuss some of the issues associated with its use. The test application was the 2005 RoboCup Rescue Robot League, a competition that simulates robot-assisted Urban Search and Rescue (USAR) tasks and places great importance on effectively generating maps. Novel techniques for processing the raw measurements from the sensor, and its use to create maps of mock disaster sites, are discussed. The maps generated, part of Team CASualty's entry, were received very well by the judges of the competition and were unique in their combination of 3D, colour and thermal information, and in the automated way in which the placement of landmarks and other annotations was performed. The maps were instrumental in the team's achievement of 3rd place.
0608 Minimum Latency Broadcasting in Multi-Radio Multi- Channel Multi-Rate Wireless Mesh Networks Junaid Qadir, Chun Tung Chou, Archan Misra*
School of Computer Science and Engineering, University of New South Wales, Australia
IBM T J Watson Research Center, Hawthorne, New York, USA*
Email: {junaidq, ctchou}@cse.unsw.edu.au, archan@us.ibm.com
We address the problem of minimizing the worst-case broadcast delay in multi-radio multi-channel multi-rate (MR2-MC) wireless mesh networks (WMN). The problem of `efficient' broadcast in such networks is especially challenging due to the numerous inter-related decisions that have to be made. The multi-rate transmission capability of WMN nodes, the interference between wireless transmissions, and the hardness of optimal channel assignment add complexity to the problem considered. We present four heuristic algorithms to solve the minimum latency broadcast problem in such settings and show that the `best' performing algorithms usually adapt themselves to the available radio interfaces and channels. We also study the effect of channel assignment on broadcast performance and show that channel assignment can affect broadcast performance substantially. More importantly, we show that a channel assignment that performs well for unicast does not necessarily perform well for broadcast/multicast. To the best of our knowledge, this work constitutes the first contribution in the area of broadcast routing for MR2-MC WMN.
0607 Security Vulnerabilities in Channel Assignment of Multi-Radio Multi-Channel Wireless Mesh Networks Anjum Naveed and Salil Kanhere
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {anaveed,salilk}@cse.unsw.edu.au
In order to fully exploit the aggregate bandwidth available in the radio spectrum, future Wireless Mesh Networks (WMN) are expected to take advantage of multiple orthogonal channels, with nodes having the ability to communicate with multiple neighbors simultaneously using multiple radios (NICs) over orthogonal channels. Dynamic channel assignment is critical for ensuring effective utilization of the non-overlapping channels. Several algorithms have been proposed in recent years, which aim at achieving this. However, all these schemes inherently assume that the mesh nodes are well-behaved without any malicious intentions. In this report, we expose the vulnerabilities in channel assignment algorithms and unveil three new security attacks: Network Endo-Parasite Attack (NEPA), Channel Ecto-Parasite Attack (CEPA) and low-cost ripple effect attack (LORA). These attacks can be launched with relative ease by a malicious node and can cause significant degradation in the network performance. We also evaluate the effectiveness of these attacks through simulation based experiments and briefly discuss possible solutions to counter these new threats.
0606 Piggy Back Challenge Based Security Mechanism for IEEE 802.11i Wireless LAN CCMP Protocol M. Junaid Hussain, M. Akbar, Muid Mufti & Salil Kanhere
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: junaid-mcs@nust.edu.pk, makber59@hotmail.com, muid@uetaxila.edu.pk, salilk@cse.unsw.edu.au
Counter mode is used for data confidentiality within IEEE 802.11 Wireless LANs. Counter mode utilizes a temporal key and a counter value for encryption. The temporal key is derived as a result of successful authentication. It is shown in this paper that Counter mode is vulnerable to attacks by intruders. This paper presents a piggy-back challenge based security mechanism. In the proposed mechanism, the nonce and initial counter are derived from the session key and are kept secret. The same nonce is used as a challenge text from authenticator to supplicant. The supplicant utilizes the nonce as the encryption key for subsequent packets. The proposed challenge-response mechanism is a continuous process and thus provides freshness, a per-packet encryption key and unpredictability of the counter value. The freshness provides protection against replay attacks, the unpredictability of the counter value prevents precomputation attacks, and the per-packet challenge-response mechanism, using a separate encryption key for each packet, strengthens the security of the connection against unauthorized access by immediately discarding a packet if per-packet authentication fails. Our piggy-back challenge based security mechanism provides a fundamental base for strengthening the security of WLANs.
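The abstract does not specify the key-derivation details, but the general pattern of deriving fresh per-packet keys from a secret session key and an evolving challenge can be sketched as follows. This is an illustrative sketch only; the function names, the use of HMAC-SHA-256, and the packet-labelling scheme are assumptions, not the authors' construction.

```python
import hmac, hashlib

# Illustrative sketch (assumptions throughout): each packet's key depends on
# the previous challenge, which is itself derived from the secret session
# key, so counter/key material stays fresh and unpredictable per packet.

def next_challenge(session_key: bytes, prev: bytes) -> bytes:
    return hmac.new(session_key, prev, hashlib.sha256).digest()

session_key = b"\x01" * 32                       # placeholder session key
challenge = next_challenge(session_key, b"initial")
for pkt_no in range(3):
    # Per-packet encryption key bound to the current challenge.
    packet_key = hmac.new(challenge, b"pkt" + bytes([pkt_no]),
                          hashlib.sha256).digest()
    challenge = next_challenge(session_key, challenge)   # piggy-backed update
    print(pkt_no, packet_key.hex()[:16])
```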
0605 Lock Selection in Practice Abdelsalam Shanneb and John Potter
{shanneba,potter}@cse.unsw.edu.au

Programming Languages and Compilers Group
School of Computer Science and Engineering
The University of New South Wales
Sydney, NSW 2052, Australia
This report is a continuation of our last report (UNSW-CSE-TR-604). Here, we present our algorithm for offering lock choices. We show all the details involved in making lock choices for programmers. We offer different approaches to lock selection; we propose top-down and bottom-up strategies, as well as additive and subtractive techniques. These lock choices are based on the previously established Galois connection between the layers of a composite. We also present several examples that demonstrate our prototype tool for lock selection, the output of which is in Appendix A.
0604 A Galois Connection in Composite Objects Abdelsalam Shanneb and John Potter
{shanneba,potter}@cse.unsw.edu.au

Programming Languages and Compilers Group
School of Computer Science and Engineering
The University of New South Wales
Sydney, NSW 2052, Australia
With the advent of multiprocessors on the desktop, software applications are increasingly likely to adopt multithreaded architectures. To cope with the complexity of concurrent systems, programmers build systems from thread-safe components. This produces excessive and redundant locking, restricting the potential for concurrency within the system. Rather than deploying individual thread-safe components, we advocate deferring the deployment of locks until the code dependencies are known. This avoids redundant locking, and allows the granularity of concurrency to be chosen in a flexible way. In earlier work we identified a formal relationship, known as a Galois connection, between the potential for concurrency in a composite system and the locking requirements for its components. This report highlights the role of fixpoints for lock selection. The subsequent report (UNSW-CSE-TR-605) will investigate strategies for selecting locks in a composite system.
0603 COMMA: A Communications Methodology for Dynamic Module-based Reconfiguration of FPGAs Shannon Koh and Oliver Diessel
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: shannonk@cse.unsw.edu.au and odiessel@cse.unsw.edu.au
On-going improvements in the scaling of FPGA device sizes and time-to-market pressures motivate the adoption of a module-oriented design flow for the development of applications. At the same time, economic factors encourage the reuse of smaller devices for high performance computational tasks. Like other researchers, we therefore envisage a need for dynamic reconfiguration of FPGAs at the module level. However, proposals to date have not focussed on communications issues, have advocated use of specific protocols, cannot be readily implemented, and/or do not support current device architectures. This paper proposes a methodology for the rapid deployment of a communications infrastructure that efficiently supports the communications needs of a collection of dynamic modules when these are known at design time. The methodology also provides a degree of flexibility to allow a range of unknown communication requirements to be met at run time. Our aim is to support new tiled dynamically reconfigurable architectures such as Virtex-4, as well as mature device families. We assess a prototype of the communications infrastructure and outline opportunities for automating the design flow.
0601 OCEAN -- Scalable and Adaptive Infrastructure for On-board Information Access Boualem Benatallah, Mahbub Hassan, Lavy Libman*, Aixin Sun
School of Computer Science and Engineering
University of New South Wales
Sydney, NSW 2052, Australia
Email: {boualem,mahbub,aixinsun}@cse.unsw.edu.au

*National ICT Australia
Bay 15, Australian Technology Park
Eveleigh, NSW 1430, Australia
Email: Lavy.Libman@nicta.com.au
The idea of providing seamless connectivity and information access to users on-board public transport vehicles has attracted increasing popularity in recent years, as is evidenced by several commercially available systems that have attempted to implement it. In this article, we overview the specific technological challenges, research issues, as well as opportunities, that arise in the context of providing communication and information access on public transport. We focus on both the networking perspective --- in particular, discussing extensions required to existing TCP/IP mechanisms to support the moving on-board networks --- and the data management perspective, e.g. personalization of information and caching/pre-fetching for a highly dynamic and heterogeneous population of users. We contend that, to offer the best performance and flexibility, the design of public transport information systems ought to take advantage of the synergy between these two areas.
0524 Validation of SECI Model in Education Cat Kutay
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: ckutay@cse.unsw.edu.au

Aybuke Aurum
School of Information Systems, Technology and Management
University of New South Wales
Sydney 2052 Australia
E-mail: aybuke@unsw.edu.au
The use of Knowledge Management (KM) is increasingly relevant to education as our knowledge of the factors influencing the effective management of information and knowledge resources grows. It is important that educational organizations understand the application of strategies for managing knowledge resources and providing appropriate access to this information within the University context. This article examines data collected from students in the Software Engineering Program at UNSW. The data are used to analyse the industrial SECI model of KM as applied to the educational domain, with a focus on the validity of the empirical evidence. Results indicate that the data provide a valid study of KM in this context.
0523 Redback: A Low-Cost Advanced Mobility Robot Raymond Sheh
School of Computer Science and Engineering
and The National ICT Australia
University of New South Wales
Sydney 2052 Australia
E-mail: rsheh@cse.unsw.edu.au
Many of the more interesting applications of mobile robotics involve robots that are able to traverse unstructured environments. Unfortunately, currently available robots for such a purpose tend to be limited in their mobility (very few can climb stairs, for instance), are excessively heavy, must be custom built or are very expensive. This article describes preliminary work on the construction of an advanced mobility robot based on a Tarantula radio-controlled toy, sold by MGA Entertainment. Despite being very low in cost (sub-$200AUD), this toy can easily climb stairs and overcome obstacles that challenge much larger robots. It can easily carry an onboard computer which can directly interface with the existing motor controllers as well as additional cameras, small laser rangefinders and other sensors. The full cost of parts for the basic robot, including computer and simple sensors, is expected to be around the $1,000AUD mark, with basic computer-less versions starting below $300AUD. The modified robot has been dubbed Redback, after the Australian spider of the same name.
0522 The Shadow Knows: Refinement of ignorance in sequential programming Carroll Morgan
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: carrollm@cse.unsw.edu.au
Sequential-program state can be separated into "visible" and "hidden" parts in order to allow knowledge-based reasoning about the hidden values. Ignorance-preserving refinement should ensure that observing the visible part of an implementation can reveal no more about the hidden part than could be revealed by its specification. Possible applications are zero-knowledge protocols, or security contexts where the "high-security" state is considered hidden and the "low-security" state is considered visible. Rather than checking for ignorance preservation at each stage, we suggest program-algebraic "refinement rules" that preserve ignorance by construction. The Dining Cryptographers is a motivating example, in which ignorance of certain variables (coins) is intended to contribute to an ignorance property of the overall protocol. Our algebra is powerful enough to derive the Dining Cryptographers' Protocol, while retaining soundness by avoiding (for example) the Refinement Paradox. We formulate and justify general principles about which refinement rules should be retained, for algebraic power and utility, and which should be discarded, for soundness.
0521 ASEHA: A Web Services Protocol Modelling Formalism Pemadeep Ramsokul and Arcot Sowmya
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
and
National ICT Australia(NICTA)
Locked Bag 6016, NSW 1466, Australia
E-mail: {pkramsok, sowmya}@cse.unsw.edu.au
Agents require standard and reliable protocols to interact with different service providers in order to provide high quality service to customers over the web. Many useful protocols are coming onto the market, but they are ambiguously specified by protocol designers and are not fully verified. This can lead to interoperability problems among implementations of the same protocol and to high software maintenance costs. In this paper, we propose a hierarchical automata-based framework to model the necessary features of protocols in order to verify their correctness. Our experience shows that the graphical models produced provide invaluable insights and can be used to complement specifications, drastically reducing, if not eliminating, ambiguities. We illustrate our formalism with a version of the WS-AtomicTransaction protocol.
0520 Formal Methods in the Enhancement of the Data Security Protocols of Mobile Agents Raja Al-Jaljouli
School of Computer Science and Engineering
University of New South Wales, Australia
E-mail: rjaljoli@cse.unsw.edu.au
The security of data gathered by mobile agents is crucial to the success of e-commerce, and formal methods play an important role in the verification of data security protocols. This paper demonstrates the effectiveness of formal methods in the analysis and design of security protocols. We apply a formal method to the analysis and rectification of a recent mobile agent security protocol by Maggi and Sisto [14], named "Configurable Mobile Agent Data Protection Protocol". We use STA (Symbolic Trace Analyzer), a formal verification tool based on symbolic techniques, in the analysis of the protocol. The analysis revealed a flaw in the security protocol: an adversary can impersonate the genuine initiator, and thus breach the privacy of the gathered data, and the protocol does not detect the malicious act. In addition, [14] states that the protocol does not achieve strong data integrity in case two hosts conspire or a malicious host is visited twice. We rectify the protocol so that it prevents these malicious acts, and then use the STA tool to analyze a reasonably small instance of the repaired protocol in key configurations. The analysis shows that the repaired protocol is free of flaws. Moreover, we reason about the security of a general model of the repaired protocol. To our knowledge, we are the first to repair the protocol and analyze it formally.
0519 Adaptive Position Update in Geographic Routing Quan Jun Chen, Salil S. Kanhere, Mahbub Hassan
School of Computer Science and Engineering
University of New South Wales, Australia
E-mail: {quanc, salilk, mahbub}@cse.unsw.edu.au

Kun-Chan Lan
National ICT Australia, Sydney, Australia,
E-mail: Kun-Chan.Lan@nicta.com.au
In geographic routing, nodes need to maintain up-to-date positions of their immediate neighbours for making effective forwarding decisions. Periodic broadcasting of beacon packets that contain the geographic location coordinates of the nodes is a popular method used by most geographic routing protocols to maintain neighbour positions. We contend that periodic beaconing, regardless of network mobility and traffic pattern, does not make optimal utilisation of the wireless medium and node energy. For example, if the beacon interval is too small compared to the rate at which a node changes its current position, periodic beaconing will create many redundant position updates. Similarly, when only a few nodes in a large network are involved in data forwarding, the resources spent by all other nodes in maintaining their neighbour positions are largely wasted. To address these problems, we propose the Adaptive Position Update (APU) strategy for geographic routing. Based on mobility prediction, APU enables nodes to adapt their position updates to node mobility and traffic pattern. We embed APU into the well-known Greedy Perimeter Stateless Routing protocol (GPSR) and compare it with the original GPSR in the ns-2 simulation platform. We conducted several experiments with randomly generated network topologies and mobility patterns. The results confirm that APU significantly reduces beacon overhead without any noticeable impact on the data throughput of the network. This result is further validated through a trace-driven simulation of a practical vehicular ad-hoc network topology that exhibits realistic movement patterns of public transport buses in a metropolitan city.
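The core of a mobility-prediction-based update rule can be sketched briefly: a node stays silent while the position its neighbours would extrapolate from its last beacon remains close to its true position. This is a minimal sketch in the spirit of the abstract; the linear predictor, the threshold value, and all names are assumptions, not the authors' exact rules.

```python
import math

THRESHOLD = 10.0   # metres of tolerated prediction error (assumed value)

class Node:
    def __init__(self, x, y, vx, vy):
        self.x, self.y = x, y          # true position
        self.vx, self.vy = vx, vy      # current velocity
        self.bx, self.by = x, y        # position announced in last beacon
        self.bvx, self.bvy = vx, vy    # velocity announced in last beacon
        self.btime = 0.0               # time of last beacon

    def predicted_position(self, t):
        # Neighbours extrapolate linearly from the last beacon.
        dt = t - self.btime
        return self.bx + self.bvx * dt, self.by + self.bvy * dt

    def maybe_beacon(self, t):
        px, py = self.predicted_position(t)
        err = math.hypot(self.x - px, self.y - py)
        if err > THRESHOLD:            # prediction has drifted too far
            self.bx, self.by = self.x, self.y
            self.bvx, self.bvy = self.vx, self.vy
            self.btime = t
            return True                # broadcast a beacon now
        return False                   # stay silent; prediction still holds

n = Node(0.0, 0.0, 1.0, 0.0)
n.x = 50.0                             # node moved unexpectedly
print(n.maybe_beacon(t=1.0))           # True: 49 m from predicted position
```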
0518 Reducing File Download Delay by QoS Driven Parallelization of Resources. Shaleeza Sohail, Sanjay Jha, Salil S. Kanhere and Chun Tung Chou
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: (sohails,sjha,salilk,ctchou)@cse.unsw.edu.au
In this paper, we propose a novel approach for reducing the download time of large files over the Internet. Our approach, known as Parallelized File Transport Protocol (P-FTP), proposes simultaneous downloads of disjoint file portions from multiple file servers. The P-FTP server selects file servers for the requesting client on the basis of a variety of QoS parameters, such as available bandwidth and server utilisation. The sensitivity analysis of our file server selection technique shows that it performs significantly better than random selection. The scalability study of the P-FTP server shows that it can handle queries from a large number of P-FTP clients without becoming a bottleneck. During the file transfer, the P-FTP client monitors the file transfer flows to detect slow servers and congested links, and adjusts the file distribution accordingly. P-FTP is evaluated with simulations and a real-world implementation. The results show at least a 50% reduction in download time compared to the traditional file-transfer approach. Moreover, we have also carried out a simulation-based study to investigate the issues related to large-scale deployment of our approach on the Internet. Our results demonstrate that a large number of P-FTP users has no adverse effect on the performance perceived by non-P-FTP users. In addition, the file servers and network are not significantly affected by large-scale deployment of P-FTP.
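The bandwidth-proportional split behind such a parallelised download is easy to illustrate: disjoint byte ranges are sized so that, ideally, all transfers finish at about the same time. This is a hedged sketch, not P-FTP's actual allocation rule; the function name and the use of bandwidth alone (rather than the full QoS parameter set) are assumptions.

```python
# Sketch: split a file into disjoint byte ranges proportional to each
# selected server's available bandwidth (an assumed simplification of the
# QoS-driven selection described in the abstract).

def split_file(file_size, server_bandwidths):
    """Return disjoint (start, end) byte ranges, one per server."""
    total_bw = sum(server_bandwidths)
    ranges, start = [], 0
    for i, bw in enumerate(server_bandwidths):
        if i == len(server_bandwidths) - 1:
            end = file_size                     # last server takes the rest
        else:
            end = start + round(file_size * bw / total_bw)
        ranges.append((start, end))
        start = end
    return ranges

# Example: a 100 MB file over servers with 10, 30 and 60 Mbps available.
print(split_file(100_000_000, [10, 30, 60]))
# [(0, 10000000), (10000000, 40000000), (40000000, 100000000)]
```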
0517 B-SCP: a requirements analysis framework for validating strategic alignment of organizational IT based on strategy, context, and process Steven J. Bleistein, Karl Cox, June Verner
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: stevenb@cse.unsw.edu.au

Keith T. Phalp
Empirical Software Engineering Research Group
Bournemouth University, United Kingdom
Ensuring that organizational IT is in alignment with and provides support for an organization's business strategy is critical to business success. Despite this, business strategy and strategic alignment issues are all but ignored in the requirements engineering research literature. We present B-SCP, a requirements engineering framework for organizational IT that directly addresses an organization's business strategy and the alignment of IT requirements with that strategy. B-SCP integrates the three themes of strategy, context, and process using a requirements engineering notation for each theme. We demonstrate a means of cross-referencing and integrating the notations with each other, enabling explicit traceability between business processes and business strategy. In addition, we show a means of defining requirements problem scope by applying a business modeling framework as a Jackson problem diagram. Our approach is illustrated via application to an exemplar. The case example demonstrates the feasibility of B-SCP, and we present a comparison with other approaches.
0516 A Competitive Learning Algorithm for Checking Sensor Data Integrity in Unknown Environments Tatiana Bokareva.
School of Computer Science and Engineering.
The University of NSW.
tbokareva@cse.unsw.edu.au

Nirupama Bulusu.
School of Computer Science.
Portland State University.
nbulusu@cs.pdx.edu

Sanjay Jha.
School of Computer Science and Engineering.
The University of NSW.
sjha@cse.unsw.edu.au
Ad-hoc wireless sensor networks derive much of their promise from their potential for autonomously monitoring remote or physically inaccessible locations. As we begin to deploy sensor networks in real-world applications, ensuring the integrity of sensor data is of paramount importance. In this paper, we motivate, propose, evaluate and analyze an online algorithm for modeling and validating sensor data in an unknown physical environment. Previous work on checking sensor data integrity, developed within the context of process control systems, uses an a priori characterization of sensor data. In contrast, our approach leverages the concept of competitive learning for online characterization of a dynamic, unknown environment and the derivation of conditions for verifying sensor data integrity over time. Moreover, to scale to very large sensor networks, our algorithm leverages in-network processing in a hierarchical, tiered sensor network by executing on the distributed cluster heads, rather than at a central base station. We prove the convergence properties of our algorithm through theoretical analysis. Furthermore, we implement our algorithm on a real physical sensor network of motes and Stargates, and demonstrate that it successfully learns real-world environmental data characteristics and filters anomalous data in a sensor network.
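The competitive-learning ingredient can be sketched in a few lines: prototypes track the observed readings, and a reading far from every prototype is flagged. This is a minimal winner-take-all sketch under assumed parameter values, not the report's algorithm or its convergence-analysed form.

```python
# Sketch: winner-take-all competitive learning for online characterisation
# of 1-D sensor readings. K, the learning rate and the anomaly threshold
# are assumptions chosen for illustration.

K, LEARNING_RATE, ANOMALY_DIST = 4, 0.05, 5.0
prototypes = []

def observe(value):
    """Update the model with one reading; return True if it looks anomalous."""
    if len(prototypes) < K:                   # bootstrap from first K readings
        prototypes.append(float(value))
        return False
    dists = [abs(p - value) for p in prototypes]
    winner = min(range(K), key=lambda i: dists[i])
    if dists[winner] > ANOMALY_DIST:
        return True                           # too far from every prototype
    # Competitive update: move only the closest prototype towards the reading.
    prototypes[winner] += LEARNING_RATE * (value - prototypes[winner])
    return False

for v in [20.1, 20.4, 19.8, 20.0, 95.0, 20.2]:    # 95.0 is an injected fault
    print(v, "anomalous" if observe(v) else "ok")
```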
0515 A Dynamic Caching Algorithm Based on Internal Popularity Distribution of Streaming Media Jiang Yu,
Dept. of Electronics and Information Engineering, Huazhong University of Science & Technology, China
frankyu@263.net

Chun Tung Chou
School of Computer Science and Engineering, University of New South Wales, Australia
ctchou@cse.unsw.edu.au
Most proxy caches for streaming videos do not cache the entire video but only a portion of it. This is partly due to the large size of video objects. Another reason is that the popularity of different parts of a video can differ, e.g. the prefix is generally more popular. Therefore, the development of efficient cache mechanisms requires an understanding of the internal popularity characteristics of streaming videos. This paper has two major contributions. Firstly, we analyze two 6-month long traces of RTSP video requests recorded at different streaming video servers of an entertainment video-on-demand provider, and show that the traces provide evidence that the internal popularity of the majority of the most popular videos obeys a k-transformed Zipf-like distribution. Secondly, we propose a caching algorithm that exploits this empirical internal popularity distribution. We find that this algorithm has similar performance compared with fine-grained caching but requires significantly less state information.
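The shape of such a distribution, and the caching decision it suggests, can be sketched directly: segment popularity falls off as 1/(i + k)^a, so the earliest segments (the prefix) dominate and are the natural ones to cache under a budget. The parameter values and the greedy budget rule below are assumptions for illustration, not the paper's fitted values or algorithm.

```python
# Sketch: a k-transformed Zipf-like internal popularity, p_i ~ 1/(i + k)^a
# over segment index i, and a greedy cache that keeps the most popular
# segments within a fixed segment budget (assumed parameters).

def ktransformed_zipf(n_segments, a=0.8, k=5):
    weights = [1.0 / (i + k) ** a for i in range(1, n_segments + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def segments_to_cache(popularity, budget):
    """Greedily cache the most popular segments within the budget."""
    ranked = sorted(range(len(popularity)), key=lambda i: -popularity[i])
    return sorted(ranked[:budget])

pop = ktransformed_zipf(n_segments=20)
print(segments_to_cache(pop, budget=6))   # the 6 earliest segments: [0..5]
```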
0514 Low Latency Broadcast in Multi-Rate Wireless Mesh Networks Chun Tung Chou
School of Computer Science and Engineering,
University of New South Wales, Australia
ctchou@cse.unsw.edu.au

Archan Misra
IBM T J Watson Research Center,
Hawthorne, New York, USA
archan@us.ibm.com

Junaid Qadir
School of Computer Science and Engineering,
University of New South Wales, Australia
junaidq@cse.unsw.edu.au
In a multi-rate wireless network, a node can dynamically adjust its link transmission rate by switching between different modulation schemes. For the current IEEE 802.11a/b/g standards, this rate adjustment is limited to unicast traffic. In this paper, we consider a novel type of multi-rate mesh network where a node can dynamically adjust its link-layer multicast rates to its neighbours. In particular, we consider the problem of realising low latency network-wide broadcast in this type of multi-rate wireless mesh. We first show that the multi-rate broadcast problem is significantly different from the single-rate case. We then present two algorithms for achieving low latency broadcast in a multi-rate mesh which exploit both the wireless broadcast advantage and the multi-rate nature of the network. Simulations based on current 802.11 parameters show that multi-rate multicast can reduce broadcast latency by 3-6 times compared with using the lowest rate alone. In addition, we show the significance of the product of transmission rate and transmission coverage area in designing multi-rate wireless mesh networks for broadcast.
0513 Toward a Framework for Capturing and Using Architecture Knowledge Muhammad Ali Babar, Ian Gorton, Ross Jeffery
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {malibaba, iango, rossj}@cse.unsw.edu.au
Management of architecture knowledge is vital for improving an organization's architectural capabilities. Despite the recognition of the importance of capturing and reusing architecture knowledge, there is no suitable support mechanism. We propose a conceptual framework for providing appropriate guidance and tool support for making tacit or informally described architecture knowledge explicit. This framework identifies different approaches to capturing implicit architecture knowledge. We discuss different usages of the captured knowledge to improve the effectiveness of the architecting process. The report also presents a brief description of a prototype of a web-based architecture knowledge management tool to support the storage and retrieval of the captured knowledge. The report concludes with open issues that we plan to address in order to successfully transfer this support mechanism for capturing and using architecture knowledge to industry.
0511 Secure Untrusted Binaries --- Provably! Simon Winwood, Manuel M. T. Chakravarty
University of New South Wales and
National ICT Australia
E-mail: {sjw, chak}@cse.unsw.edu.au
A standard method for securing untrusted code is code rewriting, whereby operations that might compromise a safety policy are secured by additional dynamic checks. In this paper, we propose a novel approach to sandboxing that is based on a combination of code rewriting and hardware-based memory protection. In contrast to previous work, we perform rewriting on raw binary code and provide a machine-checkable proof of safety that includes the interaction of the untrusted binary with the operating system. This proof constitutes a crucial step towards the use of rewritten binaries with proof-carrying code.
0510 A Proposed Security Protocol for Data Gathering Mobile Agents Raja Al-Jaljouli
School of Computer Science and Engineering
University of New South Wales
Sydney 2052, Australia
E-mail: rjaljoli@cse.unsw.edu.au
This paper addresses the security of the data which mobile agents gather as they traverse the Internet. Several cryptographic protocols have been presented in the literature asserting the security of gathered data. Their security is based on the implementation of one or more of the following security techniques: public key encryption, digital signature, message authentication code, backward chaining, one-step forward chaining, and code-result binding. Formal verification of these protocols reveals unforeseen security flaws, such as truncation or alteration of the collected data, breaching the privacy of the gathered data, sending others' data under the private key of a malicious host, and replacing the collected data with data of similar agents; so the existing protocols are not truly secure. In this paper, we present a security protocol [21] which aims to assert strong integrity, authenticity, and confidentiality of the gathered data. The proposed protocol is derived from the Multi-hops protocol [14], whose security relies on a message authentication code, a chain of encapsulated offers, and a chained hash of a random nonce. The Multi-hops protocol suffers from security flaws, e.g. an adversary might truncate/replace collected data, or sign others' data with his own private key, without being detected. The proposed protocol [21] refines the Multi-hops protocol by implementing the following security techniques: utilization of co-operating agents; scrambling the gathered offers; requesting a visited host to clear its memory of any data acquired as a result of executing the agent before the host dispatches the agent to the succeeding host; and carrying out verifications during the agent's lifecycle in addition to the verifications upon the agent's return to the initiator. The verifications concern the identity of the genuine initiator at the early execution of the agent at a visited host. The proposed protocol also implements common security techniques such as public key encryption and digital signature. The security techniques implemented in the proposed protocol rectify the security flaws revealed in the existing protocols. We prove its correctness by analyzing the security properties using STA [44, 45], a finite-state verification tool.
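The "chained hash of a random nonce" idea that these offer-collection protocols build on can be sketched generically: each host's offer is bound to its predecessor by hashing the previous chain value together with the new offer, so truncating or reordering offers breaks the chain. This is an illustrative sketch of the chaining principle only, with assumed names and SHA-256; it is not the authors' exact construction, which also uses MACs, encryption and co-operating agents.

```python
import hashlib, os

def new_chain():
    nonce = os.urandom(16)                        # secret random nonce
    return nonce, hashlib.sha256(nonce).digest()  # initial chain value

def extend(chain_value, offer: bytes):
    # Bind the new offer to everything collected so far.
    return hashlib.sha256(chain_value + offer).digest()

def verify(nonce, offers, final_value):
    v = hashlib.sha256(nonce).digest()
    for o in offers:
        v = extend(v, o)
    return v == final_value

nonce, v = new_chain()
offers = [b"host-A: $100", b"host-B: $95"]
for o in offers:
    v = extend(v, o)
print(verify(nonce, offers, v))        # True: chain intact
print(verify(nonce, offers[:1], v))    # False: truncation is detected
```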
0509 A Configuration Memory Architecture for Fast FPGA Reconfiguration Usama Malik* and Oliver Diessel**
*Architecture Group
School of Computer Science and Engineering
University of New South Wales
Sydney 2052, Australia
**Embedded, Real-time, and Operating Systems (ERTOS) Program
National ICT Australia
Email: {umalik, odiessel}@cse.unsw.edu.au
This report presents a configuration memory architecture that offers fast FPGA reconfiguration. The underlying principle behind the design is the use of fine-grained partial reconfiguration that allows significant configuration re-use while switching from one circuit to another. The proposed configuration memory works by reading on-chip configuration data into a buffer, modifying them based on the externally supplied data and writing them back to their original registers. A prototype implementation of the proposed design in a 90nm cell library indicates that the new memory adds less than 1% area to a commercially available FPGA implemented using the same library. The proposed design reduces the reconfiguration time for a wide set of benchmark circuits by 63%. However, power consumption during reconfiguration increases by a factor of 2.5 because the read-modify-write strategy results in more switching in the memory array.
0508 Implementation of a Multihoming Agent for Mobile On-board Communication Jun Yao, Yi Duan, Jianyu Pan, Kun-chan Lan, Mahbub Hassan
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {jyao713, ydua280, jpan122, klan, mahbub}@cse.unsw.edu.au
The use of multiple access links (so-called multihoming) is a common technique for improving aggregate bandwidth and overall service availability, and has long been employed by large enterprises and data centres to extract good performance and reliability from their service providers. In the context of network mobility, however, the diversity and unreliability of wireless links call for an adaptable multihomed prototype that supports mobile networks. This report details the implementation of a multihoming agent, developed on a Linux-based platform, that allows a router to access and switch between multiple access links, including WLAN and GPRS. The prototype is capable of dynamically switching users' data traffic from one network interface to another (Ethernet, WLAN or GPRS) without breaking transport-layer sessions, based on interface status and the policy in use.
0507 Selectivity Estimation of Multidimensional Queries Based on Point Synthesis Matthew Gebski, Raymond K. Wong
National ICT Australia and
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
An important problem in database systems is estimating the selectivity of multidimensional queries. While various approaches have been proposed for selectivity estimation for low dimensional spatial databases, many of these techniques have weaknesses for higher dimensional data. This paper presents a novel approach to estimate the selectivity of queries by generating points from an empirical distribution and testing these against a given query. This allows us to take small samples as a summary and generate points for selectivity estimation. Unlike histogram based approaches, our technique does not use containers to represent regions of the data. This alleviates problems that arise when container densities approach zero as dimensionality increases. Experiments show that our approach handles high levels of skew and very high dimensionality, and achieves higher accuracy than previous approaches.
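The point-synthesis idea can be sketched simply: generate synthetic points from an empirical distribution and report the fraction that fall inside the query box. The sketch below resamples coordinates from a small stored sample per dimension, which is an assumed stand-in for the paper's fitted empirical distribution; names and the query-box interface are also assumptions.

```python
import random

def estimate_selectivity(sample, query_lo, query_hi, n_synthetic=10_000):
    """Estimate the fraction of data matching a box query by point synthesis."""
    dims = len(sample[0])
    hits = 0
    for _ in range(n_synthetic):
        # Synthesise a point one coordinate at a time from the sample.
        point = [random.choice(sample)[d] for d in range(dims)]
        if all(query_lo[d] <= point[d] <= query_hi[d] for d in range(dims)):
            hits += 1
    return hits / n_synthetic

# Toy usage: a 4-point 2-D sample and a box query over [0, 2.2]^2.
sample = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0), (2.5, 0.5)]
print(estimate_selectivity(sample, (0.0, 0.0), (2.2, 2.2)))
```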
0506 Rotated Library Sort Franky Lam, Raymond K. Wong
National ICT Australia and
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
This paper investigates how to improve the worst-case runtime of Insertion Sort while keeping it in-place, incremental and adaptive. To sort an array of n elements with w bits per element, classic Insertion Sort runs in O(n^2) operations with wn bits of space. Gapped Insertion Sort has a runtime of O(n log n) using, with high probability, only (1 + ε)wn bits of space. This paper shows that Rotated Insertion Sort guarantees O(√n log n) operations per insertion and has a worst-case sorting time of O(n^1.5 log n) operations using optimal O(w) auxiliary bits. By using an extra Θ(√n log n) bits and recursively applying the same structure l times, sorting can be done in O(2^l n^(1 + 1/l)) operations. Apart from the space usage and time guarantees, the structure also has the advantage of retrieving the i-th element in constant time. This paper presents Rotated Library Sort, which combines the advantages of the above two improved approaches.
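The gapped ("library") insertion idea that the paper builds on is easy to illustrate: spreading elements with gaps makes most insertions cheap, because only the elements up to the next gap need shifting. The toy below simply re-spreads the gaps when they run out; it conveys the mechanism but deliberately does not provide the worst-case guarantees of the paper's rotated structures.

```python
GAP = None   # sentinel for an empty slot

def respread(arr):
    """Place one gap after every element."""
    out = []
    for x in arr:
        if x is not GAP:
            out.extend([x, GAP])
    return out

def insert(arr, value):
    i = 0                                  # find first element >= value
    while i < len(arr) and (arr[i] is GAP or arr[i] < value):
        i += 1
    if i == len(arr):                      # larger than everything: append
        arr.extend([value, GAP])
        return
    j = i                                  # locate the next gap to absorb shift
    while j < len(arr) and arr[j] is not GAP:
        j += 1
    if j == len(arr):                      # no gap available: re-spread
        arr[:] = respread(arr)
        insert(arr, value)
        return
    arr[i + 1:j + 1] = arr[i:j]            # shift only up to the gap
    arr[i] = value

arr = []
for v in [5, 1, 4, 2, 3]:
    insert(arr, v)
print([x for x in arr if x is not GAP])    # [1, 2, 3, 4, 5]
```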
0505 A Scheduler Architecture for INTEL's IXP2400 Network Processor Fariza Sabrina and Sanjay Jha
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia.
E-mail: {farizas, sjha}@cse.unsw.edu.au
This report describes the design and implementation of a scheduling algorithm called CBCS on network processors. CBCS is a fair and efficient resource scheduling algorithm that can fairly allocate multiple resources among contending flows. We have implemented this algorithm on Intel's IXP2400 network processor. The main objectives of implementing our scheduling algorithms on a real-world Attached Network Processor (ANP) such as the Intel IXP2400 were (1) to validate that CBCS can be implemented on a highly scalable real-world programmable node or ANP for managing CPU and bandwidth resources, and (2) to provide scheduling solutions that could be adopted by ANP vendors like Intel to build and deliver efficient building blocks (as part of their Software Development Kit (SDK) or development framework) for resource scheduling purposes. The experimental results from the IXP2400 implementation demonstrate the effectiveness and high performance of this algorithm in a real-world system.
0504 Synthesis of Distributed Systems from Knowledge-Based Specifications Ron van der Meyden
School of Computer Science and Engineering,
University of New South Wales
National ICT Australia

&

Thomas Wilke
Institut für Informatik und Praktische Mathematik
Christian-Albrechts-Universität zu Kiel
We consider the problem of synthesizing protocols in a distributed setting satisfying specifications phrased in the logic of linear time and knowledge. In general, synthesis in distributed settings is undecidable already for linear-time temporal logic specifications, but there exist special cases in which synthesis from linear-time temporal logic specifications is known to be decidable. On the basis of these results and a result on the decidability of synthesis of temporal and knowledge specifications in systems with a single agent, van der Meyden and Vardi [CONCUR 96] conjectured that synthesis of temporal and knowledge specifications would be decidable in two classes of environments: hierarchical environments, in which each agent in a linear sequence observes at least as much as the preceding agents, and broadcast environments, in which all communication is constrained to be by synchronous broadcast. We show that this conjecture is true in the case of broadcast environments, but false in the case of even a very simple type of hierarchical environment, where only two agents are involved, one of which observes every aspect of the system state and one of which observes nothing of it. Nevertheless, synthesis from linear-time logic specifications is decidable in hierarchical environments. Moreover, for specifications that are positive in the knowledge modalities, the synthesis problem can be reduced to the same problem for the logic of linear time. We use these facts to conclude the decidability in hierarchical systems of a property closely related to nondeducibility on strategies, a notion that has been studied in computer security.
0503 A Multi-Channel DS-CDMA Media Access Control Protocol for Wireless Sensor Networks Bao Hua Liu, Chun Tung Chou, Sanjay Jha
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia.
E-mail: {mliu, ctchou, sjha}@cse.unsw.edu.au
This paper proposes a novel multi-channel media access control (MAC) protocol for direct sequence code division multiple access (DS-CDMA) wireless sensor networks. Our protocol design uses a combination of DS-CDMA and frequency division to reduce channel interference, and consequently improves system capacity and network throughput. We provide a theoretical characterization of the mean multiple access interference (MAI) at a given node in relation to the number of frequency channels. We show that by using only a small number of frequency channels, the mean MAI can be reduced significantly. Through discrete event simulation, we compare our proposed system to a pure DS-CDMA system as well as to a contention-based system. Simulation results reveal that our proposed system can achieve 15-20 times the system efficiency of a contention-based system. When the same number of packets is transmitted in the network, our system consumes only 10% of the communication energy of the contention-based system.
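The intuition for why a few frequency channels help can be shown with a back-of-the-envelope model, which is an assumption for illustration and not the report's exact MAI characterization: if M interfering transmitters each pick one of F channels uniformly at random, a receiver sees on average M/F same-channel interferers, so mean MAI falls roughly linearly in 1/F.

```python
# Assumed toy model: expected number of same-channel interferers when M
# transmitters choose among F frequency channels uniformly at random.

def mean_same_channel_interferers(m_interferers, f_channels):
    return m_interferers / f_channels

for f in (1, 2, 4, 8):
    print(f, mean_same_channel_interferers(24, f))   # 24.0, 12.0, 6.0, 3.0
```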
0442 IC2: An Interval based Characteristic Concept Learner Pramod K. Singh
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: pksingh@cse.unsw.edu.au
In many real-world problems it can be argued that classification based on characteristic descriptions delivers a more correct and comprehensive concept than existing discriminative descriptions. Most classification algorithms suffer from an inability to detect instances of classes which are not present in the training set. A novel approach for characteristic concept rule learning, called IC2, is proposed in this paper. It uses the descriptions of objects of one class to learn concept rules for the classification and discrimination of unseen instances. In this approach, interval-based class characteristic rules are induced from the descriptions of the labeled objects of a class. The paper illustrates the approach and presents empirical results obtained on several data sets of continuous feature variables. The IC2 classifier is evaluated and its classification accuracy is compared with a state-of-the-art characteristic concept classification algorithm, ID3-SD.
0441 Creative Commons Speech Repository Mohammed Waleed Kadous
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: waleed@cse.unsw.edu.au
The availability of high-quality open source solutions has often led to rapid growth in a particular area; most famously, the LAMP (Linux-Apache-MySQL) platform for Web development has made it easy for small entities to set up rich web sites. Recent developments in open source speech recognition software, notably the Sphinx 4 system, mean that potentially the same could occur for speech. However, one outstanding issue is the availability of the high-quality acoustic models required for such systems to function effectively. This report outlines the possibilities for a new model of speech corpus repository, based on the principles of Open Source and the Creative Commons licenses. Rather than the usual approach of gathering corpora in a centralised location using fixed high-quality equipment, we propose that donors record audio passages using their own equipment and upload them to a central repository for processing. The data in the speech repository could be used by anyone for training acoustic models for speech recognition. In addition, such a speech repository, especially when combined with co-located computational processing facilities such as a Beowulf cluster, creates an opportunity for new applications, for example personalised acoustic models. Such a system would customise an acoustic model for a particular user, not just using that person's data, but by using data from people whose voices are similar. Keywords: speech recognition, speech corpus, open source, creative commons.
0438 The Holes Problem in Wireless Sensor Networks: A Survey Nadeem Ahmed, Salil S. Kanhere, Sanjay Jha
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {nahmed, salilk, sjha}@cse.unsw.edu.au
Several anomalies can occur in wireless sensor networks that impair their desired functionalities, i.e. sensing and communication. Different kinds of holes can form in such networks, creating geographically correlated problem areas such as coverage holes, routing holes, jamming holes, sink/black holes and worm holes. In this report we detail the different types of holes, discuss their characteristics and study their effects on the successful working of a sensor network. We present the state of the art in research addressing the holes-related problems in wireless sensor networks, and discuss the relative strengths and shortcomings of the proposed solutions for combating different kinds of holes. We conclude by highlighting future research directions.
0436 Secure Online Elections Roland Wen
School of Computer Science and Engineering
The University of New South Wales
Sydney, NSW, 2052, Australia
Email: rolandw@cse.unsw.edu.au
Advances in cryptography have contributed significantly to the Internet revolution, and one of the next major applications is secure online elections. In the wake of the debacle of the US presidential election in 2000, governments have rapidly begun to adopt online voting. However, this is somewhat premature given the current state of research. No existing election scheme is able to satisfy all the democratic principles of traditional elections whilst also being suitable for large scale voting. This report is intended to be a self-contained introduction to the field of online elections. It develops the necessary background knowledge, firstly by examining the traditional and online models, as well as the requirements and properties for election protocols. Then it presents an overview of the relevant mathematical and cryptographic preliminaries. The crux of this report is the survey of the most important election schemes found in the literature. These are categorised according to the approach on which they are based: homomorphic encryption, mix nets and blind signatures. A description and evaluation of each scheme is provided, followed by a comparison of all the protocols and a discussion of the limitations and outstanding concerns.
0435 A Comparison of Four Software Architecture Reconstruction Toolkits Meena Jha, Piyush Maheshwari
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: meenaj@cse.unsw.edu.au
This report discusses the evaluation of four software architecture reconstruction tools. This evaluation is needed because many legacy systems need to be reconstructed as requirements change for the purpose of modernization. Software reconstruction is a tool-based, iterative and interpretive process. Software architecture reconstruction tools support software engineers in the process of recovering the "as-built" architecture of an implemented system. The tools extract information about the system and aid in building and aggregating successive levels of abstraction. If the tools are successful, the end result is an architectural representation that aids in reasoning about the system. There are several commercial reconstruction tools on the market providing different capabilities and supporting specific source code languages. In this report, we evaluate four software architecture reconstruction tools on the criteria of extraction ability, abstraction ability, navigation ability, ease-of-use, views, completeness and extensibility. The tools evaluated are the Dali workbench, PBS, SWAG Kit and Bauhaus. The capabilities of these tools are evaluated by applying them to medium-size software "concepts".
0434 On Reachability and Acyclicity Yi Lu and John Potter
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {ylu,potter}@cse.unsw.edu.au
This paper presents a type system that enforces constraints on reachability via pointers or references, and restricts reference cycles to be within definable regions. Every data object lives in a fixed region, determined from its type. The motivation for our work is the desire to enforce static constraints on reference structures. Such constraints can be useful for program reasoning, for deadlock avoidance in concurrent contexts, and for runtime optimizations and memory management. For example, data invariants are easily broken by re-entrant code. By restricting cycles to statically enforceable regions, program proof rules can make stronger assumptions about reference structures. The contributions of this paper are a novel class-based region-parametric type system, with subtyping, that enforces one-way reachability between regions, and a dynamic semantics that allows us to formalize the key structural invariant: object references respect region reachability, so that object cycles occur only within regions.
0433 A Quality-Driven Systematic Approach for Architecting Distributed Software Applications Tariq Al-Naeem (1), Ian Gorton (2), Muhammed Ali Babar (1,2), Fethi Rabhi (3) and Boualem Benatallah (1)

(1) School of Computer Science & Engineering, University of New South Wales, Australia; {tariqn,malibaba,boualem}@cse.unsw.edu.au
(2) National ICT Australia Ltd; ian.gorton@nicta.com.au
(3) School of Information Systems, Technology and Management, University of New South Wales, Australia; f.rabhi@unsw.edu.au
Architecting distributed software applications is a complex design activity. It involves making decisions about a number of inter-dependent design choices that relate to a range of design concerns. Each decision requires selecting among a number of alternatives; each of which impacts differently on various quality attributes. Additionally, there are usually a number of stakeholders participating in the decision-making process with different, often conflicting, quality goals, and project constraints, such as cost and schedule. To facilitate the architectural design process, we propose a quantitative quality-driven approach that attempts to find the best possible fit between conflicting stakeholders' quality goals, competing architectural concerns, and project constraints. The approach uses optimization techniques to recommend the optimal candidate architecture. Applicability of the proposed approach is assessed using a real system.
0432 Helping Users Avoid Bugs in GUI Applications Amir Michail
University of New South Wales
Sydney, NSW, Australia, 2052
amichail@cse.unsw.edu.au

Tao Xie
University of Washington
Seattle, WA, USA, 98195
taoxie@cs.washington.edu
In this paper, we propose a method to help users avoid bugs in GUI applications. In particular, users would use the application normally and report bugs that they encounter to prevent anyone -- including themselves -- from encountering those bugs again. When a user attempts an action that has led to problems in the past, he/she will receive a warning and will be given the opportunity to abort the action thus avoiding the bug altogether and keeping the application stable. Of course, bugs should be fixed eventually by the application developers, but our approach allows application users to collaboratively help each other avoid bugs thus making the application more usable in the meantime. We demonstrate this approach using our "Stabilizer" prototype. We also include a preliminary evaluation of the Stabilizer's bug prediction.
0431 An Energy Efficient Select Optimal Neighbor Protocol for Wireless Ad Hoc Networks Bao Hua Liu
University of New South Wales
Sydney, NSW, Australia, 2052
miu@cse.unsw.edu.au

Yang Gao
University of New South Wales
Sydney, NSW, Australia, 2052
yangg@cse.unsw.edu.au

Chun Tung Chou
University of New South Wales
Sydney, NSW, Australia, 2052
ctchou@cse.unsw.edu.au

Sanjay Jha
University of New South Wales
Sydney, NSW, Australia, 2052
sjha@cse.unsw.edu.au
In this paper, we propose two location-aware select optimal neighbor (SON) algorithms that are suitable for CSMA/CA based MAC protocols for wireless ad hoc networks. Both algorithms optimize energy efficiency by reducing the effective number of neighbors, and thus reduce the transmission power as well as the overhearing power consumption at irrelevant receivers. NS-2 simulations show that our algorithms can achieve about 28% and 38% average energy savings per node compared to CSMA/CA based MAC protocols such as IEEE 802.11. When electronic energy consumption is a considerable part of overall energy consumption, SON has better energy performance than the traditional optimal pruning algorithm.
0429 Using Frequency Division to Reduce MAI in DS-CDMA Wireless Sensor Networks Bao Hua Liu
University of New South Wales
Sydney, NSW, Australia, 2052
miu@cse.unsw.edu.au

Chun Tung Chou
University of New South Wales
Sydney, NSW, Australia, 2052
ctchou@cse.unsw.edu.au

Justin Lipman,
University of New South Wales
Sydney, NSW, Australia, 2052
justinl@cse.unsw.edu.au

Sanjay Jha
University of New South Wales
Sydney, NSW, Australia, 2052
sjha@cse.unsw.edu.au
The performance of Direct Sequence Code Division Multiple Access (DS-CDMA) sensor networks is limited by Multiple Access Interference (MAI). This paper proposes using frequency division to reduce the MAI in a DS-CDMA sensor network. We provide theoretical characterization of the mean MAI at a given node and show that a small number of frequency channels can reduce the MAI significantly. In addition, we provide a comparison of our proposed system to systems which do not use frequency division or which employ contention based protocols. Our study found that, by using only a small number of frequency channels, our system has less channel contention, lower packet latency, higher packet delivery ratio and lower energy consumption.
0428 Policy-Based Exception Handling in Business Processes Rachid Hamadi and Boualem Benatallah
School of Computer Science and Engineering
The University of New South Wales
Sydney, NSW 2052, Australia
{rhamadi,boualem}@cse.unsw.edu.au
A workflow management system (WfMS) provides a central control point for defining business processes and orchestrating their execution. A major limitation of current WfMSs is their lack of support for dynamic workflow adaptation. This functionality is an important requirement for providing sufficient flexibility to cope with expected but unusual situations and failures. In this report, we propose Self-Adaptive Recovery Net (SARN), an extended Petri net model for specifying exceptional behavior in workflow systems at design time. SARN can adapt the structure of the underlying Petri net at run time to handle exceptions while keeping the Petri net design simple and easy to understand. The proposed framework also caters for high-level recovery policies that are associated either with a single task or with a set of tasks, called a recovery region. These recovery policies are generic constructs that model exceptions at design time, together with a set of primitive operations that can be used at run time to handle the occurrence of exceptions.
0424 Querying and Maintaining Succinct XML Data Franky Lam, William M. Shui, Damien K. Fisher, Raymond K. Wong
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
As XML database sizes grow, the amount of space used for storing the data and auxiliary supporting data structures becomes a major factor in query and update performance. This paper presents a new secondary storage scheme for XML data that supports all navigational operations and answers ancestor queries in near constant time. In addition to supporting fast queries, the space requirement is within a constant factor of the information theoretic minimum, while insertions and deletions can be performed in near constant time as well. As a result, the proposed structure features a small memory footprint that increases cache locality, whilst still supporting standard APIs such as DOM efficiently. As an example of the scheme's power, we further demonstrate that the structure can support efficient structural and twig joins. Both formal analysis and experimental evidence demonstrate that the proposed structure is space and time efficient.
0423 Incremental Schema Validation for XML Databases Damien K. Fisher, Raymond K. Wong
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
An important feature of any database system is the ability to perform consistency checks on the data being manipulated in the system. For XML databases, the most fundamental such check is validation with respect to a fixed data schema, such as a DTD or XML Schema. While it is straightforward to perform such validation on an entire XML document, it would be extremely inefficient to revalidate the entire database from scratch upon every modification. Hence, it is natural to ask whether efficient incremental schema validation techniques exist. In this report, we investigate the complexity bounds of such techniques for a variety of schema languages. As with previous work on this subject, we mainly study the related problem of dynamic membership in a regular language. We define a class of regular expressions for which dynamic membership can be determined in constant time, and a larger class for which dynamic membership can be determined in O(log log n) time. We also show that, in general, validation can be performed in time O(log n / log log n), and that this bound is tight for some schemas.
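The standard machinery behind dynamic membership in a regular language can be sketched concretely: keep each symbol's DFA transition function at a leaf of a segment tree, store compositions at internal nodes, and recompute O(log n) nodes per symbol update. This is an illustrative sketch of that classic idea only; the report's finer bounds come from refinements of it, and all names below are assumptions.

```python
class DynamicMembership:
    """Maintain w in L(DFA) under single-symbol updates via a segment tree."""

    def __init__(self, dfa, start, accept, text):
        self.dfa, self.start, self.accept = dfa, start, accept
        self.states = sorted({s for (s, _) in dfa})
        size = 1
        while size < max(len(text), 1):
            size *= 2                      # pad to a power of two with identity
        self.size = size
        identity = {s: s for s in self.states}
        self.tree = [identity] * (2 * size)
        for i, ch in enumerate(text):
            self.tree[size + i] = self._leaf(ch)
        for i in range(size - 1, 0, -1):
            self.tree[i] = self._compose(self.tree[2 * i], self.tree[2 * i + 1])

    def _leaf(self, ch):
        return {s: self.dfa[(s, ch)] for s in self.states}

    def _compose(self, f, g):              # apply f first, then g
        return {s: g[f[s]] for s in self.states}

    def update(self, i, ch):               # replace symbol i: O(log n) nodes
        i += self.size
        self.tree[i] = self._leaf(ch)
        i //= 2
        while i >= 1:
            self.tree[i] = self._compose(self.tree[2 * i], self.tree[2 * i + 1])
            i //= 2

    def member(self):
        return self.tree[1][self.start] in self.accept

# DFA for (ab)*: state 0 expects 'a', state 1 expects 'b', state 2 is a sink.
dfa = {(0, 'a'): 1, (0, 'b'): 2, (1, 'a'): 2, (1, 'b'): 0,
       (2, 'a'): 2, (2, 'b'): 2}
dm = DynamicMembership(dfa, start=0, accept={0}, text="abab")
print(dm.member())     # True
dm.update(1, 'a')      # text becomes "aaab"
print(dm.member())     # False
```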
0421 Emphysema Detection Using a Density Mask Mario Bou-Haidar, Mithun Prasad, Arcot Sowmya
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: mariob@cse.unsw.edu.au
This report describes an accurate approach to emphysema detection in HRCT (high-resolution computed tomography) lung images. It is based on a suite of image processing techniques applied on top of the "Density Mask", a standard approach currently used for emphysema detection in medical image analysis. The motivation behind this work lies in the fact that the Density Mask produces noisy regions of emphysema. As a result, a heuristic-based method that filters out unwanted noise from the HRCT images is useful. The experiments were based on finding the right heuristic parameters for successful detection of emphysema regions. Radiologists in the team have verified the results and concluded that the approach is capable of detecting even very subtle emphysema.
0420 Profile-Guided Partial Redundancy Elimination Using Control Speculation: a Lifetime Optimal Algorithm and an Experimental Evaluation Jingling Xue and Qiong Cai
Programming Languages and Compilers Group
School of Computer Science and Engineering
University of New South Wales
Sydney, NSW 2052, Australia
E-mail: {jxue,qiongc}@cse.unsw.edu.au
We present MCPRE, a lifetime optimal algorithm that, for the first time, performs partial redundancy elimination (PRE) by combining code motion and control speculation based on an edge profile. An edge profile approximates the actual edge frequencies with predicted edge frequencies. MCPRE is developed so that the optimality results it achieves are reasoned about with respect to a given edge profile. MCPRE is computationally optimal since the total number of dynamic computations for an expression in the transformed code is minimized. If the predicted frequencies of all flow edges are nonzero, MCPRE is also lifetime optimal since the lifetimes of introduced temporaries in the transformed code are also minimized. Otherwise (if some flow edges are zero-weighted), MCPRE yields a practical transformation (as validated by extensive experiments) from the perspective that predicted zero frequencies are (or should be) interpreted as least frequently, rather than never, executed. The computational and lifetime optimality results are rigorously proved. The algorithm works on CFGs with standard basic blocks and is conceptually simple. First, it performs two standard bit-vector data-flow analyses, availability and partial anticipability, to transform a given CFG into an s-t flow network. It then relies on a min-cut solver to find a unique minimum cut on the flow network. Finally, it performs a third standard live range analysis to avoid making unnecessary code insertions and replacements for isolated computations. We have implemented the MCPRE algorithm in gcc 3.4 and evaluated its performance against Knoop, Rüthing and Steffen's profile-independent LCM, the built-in PRE pass at gcc's O2 optimization level and above. We report and analyze our experimental results for all 22 C, C++ and FORTRAN 77 benchmarks from SPECcpu2000 on two computer architectures, Intel Xeon and UltraSPARC-III. Our results show that MCPRE is both effective (in terms of the extra redundancies eliminated and performance improvements achieved) and practical (in terms of the relatively small compilation and space overheads introduced).
0419 Mapping basic recursive structures to runtime reconfigurable hardware Hossam ElGindy And George Ferizis
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {hossam,gferizis}@cse.unsw.edu.au
Recursion is a powerful method that is used to describe many algorithms in computer science. Processing of recursion is traditionally done using a stack, which can act as a bottleneck for parallelising and pipelining different stages of recursion. In this paper we propose a method for mapping recursive algorithms, without the use of a stack structure, into hardware by pipelining the stages of recursion. The use of runtime reconfigurable hardware to minimise the amount of required hardware resources, and the related issues to be resolved, are addressed.
0418 Incremental Learning of Linear Model Trees Duncan Potts and Claude Sammut
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: duncanp@cse.unsw.edu.au and claude@cse.unsw.edu.au
A linear model tree is a decision tree with a linear functional model in each leaf. Previous model tree induction algorithms have operated on the entire training set; however, there are many situations where an incremental learner is advantageous. In this report we demonstrate that model trees can be induced incrementally using an algorithm that scales linearly with the number of examples. Two incremental node splitting rules are presented, together with incremental methods for stopping the growth of the tree and pruning. Empirical testing in four domains, ranging from a simple test function to a complex 13-dimensional flight simulator, shows that the new algorithm can learn a more accurate approximation from fewer examples than alternative incremental methods. In addition, a batch implementation of the new algorithm compares favourably with existing batch techniques at constructing model trees in these domains. Moreover, the induced models are smaller, and the learners require less prior knowledge.
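To give a feel for incremental induction with linear leaf models, the sketch below updates a leaf's linear model one example at a time with recursive least squares; it is a stand-in conveying the flavour of the approach only, omitting the report's splitting, stopping and pruning rules:

    import numpy as np

    class RLSLeaf:
        """Linear model in a model tree leaf, updated incrementally by
        recursive least squares (hypothetical class, illustration only)."""
        def __init__(self, dim, prior=1e3):
            self.w = np.zeros(dim + 1)          # weights including bias
            self.P = prior * np.eye(dim + 1)    # inverse-covariance estimate

        def update(self, x, y):
            x = np.append(x, 1.0)               # bias feature
            Px = self.P @ x
            k = Px / (1.0 + x @ Px)             # Kalman-style gain
            self.w += k * (y - x @ self.w)      # correct by prediction error
            self.P -= np.outer(k, Px)           # rank-1 downdate

        def predict(self, x):
            return np.append(x, 1.0) @ self.w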
0415 Requirements Engineering for e-Business Advantage Steven J. Bleistein, Karl Cox, and June Verner
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: stevenb@cse.unsw.edu.au
As a means of contributing to the achievement of business advantage for companies engaging in e-business, we propose a requirements engineering approach that incorporates a business strategy dimension. We employ both goal modeling and Jackson's Problem Frames approach to achieve this. Jackson's context diagrams, used to represent the business model context, are integrated with goal-models to describe the complete business strategy. We leverage the paradigm of projection in both approaches while maintaining traceability to high-level business objectives as a means of simultaneously decomposing both the optative and indicative parts of the requirements problem, from an abstract business level to concrete system requirements. We integrate use of role activity diagrams to describe business processes in detail where needed. The feasibility of our approach is shown by a case study.
0414 A Use Case Description Inspection Experiment Karl Cox, Aybüke Aurum, Ross Jeffery

School of Computer Science and Engineering
University of New South Wales
Sydney 2052
Australia

Email: karlc@cse.unsw.edu.au
Achieving higher quality software is one of the aims sought by development organizations worldwide, and establishing defect-free statements of requirements is a necessary prerequisite to this goal. In this paper we present the results of a laboratory experiment that explored the application of a checklist in the process of inspecting use case descriptions. A simple experimental design was adopted in which the control group used an ad hoc approach and the treatment group was provided with a six-point checklist. The defects identified in the experiment were classified at three levels of significance: (i) internal to the use case, (ii) specification impact, and (iii) requirements impact. It was found that the identification of requirements defects did not differ significantly between the control and treatment groups, but that more specification and internal defects were found by the groups using the checklist. In the paper we explore the implications of these findings.
0413 Support Vector Machine Experiments for Road Recognition in High Resolution Images James Lai
School of Computer Science and Engineering
The University of NSW, Australia
jlai@cse.unsw.edu.au
Arcot Sowmya
School of Computer Science and Engineering
The University of NSW, Australia
sowmya@cse.unsw.edu.au
John Trinder
School of Surveying and Spatial Information Systems
The University of NSW, Australia
j.trinder@unsw.edu.au
Support Vector Machines have received considerable attention from the pattern recognition community in recent years. They have been applied to various classical recognition problems achieving comparable or even superior results to other classifiers such as neural networks. We investigate the application of Support Vector Machines (SVMs) to the problem of road recognition from remotely sensed images using edge-based features. We present very encouraging results in our experiments, which are comparable to decision tree and neural network classifiers.
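A minimal sketch of such an experiment, assuming scikit-learn and synthetic stand-ins for the edge-based features and road/non-road labels:

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    # Synthetic placeholders: one row of edge-based features per candidate
    # edge segment; labels 1 = road, 0 = non-road.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
    clf.fit(X_train, y_train)
    print("held-out accuracy:", clf.score(X_test, y_test))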
0412 Maintaining End-system Performance under Network Overload Luke Macpherson, Gernot Heiser
School of Computer Science and Engineering
and National ICT Australia
University of New South Wales
Sydney 2052, Australia
E-mail: {lukem, gernot}@cse.unsw.edu.au
Network performance is currently outpacing the performance improvements seen by host systems, leading to a significant gap between the throughput a network interface can support and the throughput a typical end-system can actually achieve. Consequently, end-systems must be able to cope with applied loads that exceed their capacities; in particular, system performance in terms of latency, throughput, and jitter should not deteriorate under overload. This paper evaluates the use of intelligent software-based control algorithms that adjust the interrupt-holdoff time and the available DMA buffer space in order to prevent receive livelock on commodity hosts and network adaptors. We present a simple analytical model of packet latency, which allows us to analyse system performance under overload. The control algorithm has been implemented in the FreeBSD operating system. Experiments show excellent scalability under overload, comparing favourably with previous approaches. Furthermore, the implementation is less intrusive on operating system design than prior approaches with similar goals.
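The flavour of such a control algorithm can be sketched as a feedback rule that lengthens the interrupt-holdoff time as the receive queue grows and shortens it as the queue drains; the thresholds and step factor below are hypothetical, not those of the FreeBSD implementation:

    def adjust_holdoff(holdoff_us, queue_len, hi=200, lo=50,
                       step=1.5, floor_us=10, ceil_us=1000):
        """One controller step: batch more work per interrupt when the
        queue builds up (avoiding receive livelock), favour latency when
        it drains. All parameter values are illustrative."""
        if queue_len > hi:
            holdoff_us = min(holdoff_us * step, ceil_us)
        elif queue_len < lo:
            holdoff_us = max(holdoff_us / step, floor_us)
        return holdoff_us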
0411 Parallelized FTP: An Effective Approach for Solving the Huge Download Delay Problem over the Internet Shaleeza Sohail and Sanjay Jha
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {sohails,sjha}@cse.unsw.edu.au
The file download process over the Internet is usually slow and unpredictable. We have designed and implemented a distributed and coordinated file transfer protocol for Internet applications: a centralized server distributes the download process across multiple file servers based on QoS parameters such as available bandwidth and delay. In addition, we monitor the FTP flows to detect slow servers and congested links and adjust the file distributions accordingly. Early experimentation suggests that our method can reduce the download time by more than 50% for large files. A further advantage of our technique is that it requires no modifications to existing FTP implementations.
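The distribution step amounts to splitting one file into byte ranges proportional to each server's measured capacity. A sketch under that simplifying assumption (the actual model also weighs delay, and the split is re-adjusted as monitoring detects slow servers):

    def plan_ranges(file_size, server_bw):
        """Split [0, file_size) into inclusive byte ranges proportional
        to each mirror's bandwidth estimate (names are hypothetical)."""
        total = sum(server_bw.values())
        ranges, start = {}, 0
        servers = list(server_bw.items())
        for i, (server, bw) in enumerate(servers):
            last = i == len(servers) - 1
            size = file_size - start if last else round(file_size * bw / total)
            ranges[server] = (start, start + size - 1)
            start += size
        return ranges

    print(plan_ranges(10_000_000, {"mirror-a": 8.0, "mirror-b": 2.0}))
    # {'mirror-a': (0, 7999999), 'mirror-b': (8000000, 9999999)}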
0410 Requirements Engineering for Business Advantage: the Strategy Dimension of e-Business Systems Steven J. Bleistein, Karl Cox, and June Verner
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: stevenb@cse.unsw.edu.au
As a means of contributing to the achievement of business advantage for companies engaging in e-business, we propose a requirements engineering approach for e-business systems that incorporates a business strategy dimension. We employ both goal modeling and Jackson's Problem Frames approach to achieve this. Jackson's context diagrams, used to represent the business model context, are integrated with goal models to describe the complete business strategy. As a means of simultaneously decomposing both the optative and indicative parts of the requirements problem, from an abstract business level to concrete system requirements, we leverage the paradigm of projection in both approaches while maintaining traceability to high-level business objectives. A proof-of-concept case study from the literature shows the feasibility of our approach.
0409 Dynamic Path Restoration Algorithms For On-Demand Survivable Network Connections William Lau and Sanjay Jha
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {wlau,sjha}@cse.unsw.edu.au
Restoration strategies that use offline pre-planning assume that the traffic matrix is static and known prior to network capacity assignment. These strategies require recalculation of the capacity requirement for any change to the traffic matrix, and thus are not suitable for highly dynamic traffic environments. This paper addresses the dynamic online capacity assignment problem, where each source-destination restoration request is computed sequentially with no prior knowledge of requests yet to arrive. We focus on the state-dependent approach, which allows different backup paths to be used when the network suffers different failure scenarios. The problem is first formulated as a new Integer Programming (IP) problem, and new heuristics are proposed that trade off bandwidth efficiency against computation time. Results show that the proposed polynomial-time algorithm performs competitively with the IP solution in terms of bandwidth efficiency, and better than existing heuristic algorithms of the same restoration strategy. The computation time of the polynomial-time algorithm is significantly shorter than that of the IP solution, making it suitable for large-scale on-demand applications. Further, a comprehensive performance comparison is made between online algorithms of the state-dependent and state-independent strategies; the results can serve as a guide for choosing between the two. State-dependent algorithms are shown to provide better trade-offs between network efficiency and computation time than state-independent algorithms.
0408 Multicast Resilience with Quality of Service Guarantees William Lau and Sanjay Jha
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {wlau,sjha}@cse.unsw.edu.au

Suman Banerjee
Department of Computer Sciences
University of Wisconsin-Madison
Madison, WI 53706, USA
suman@cs.wisc.edu
This paper defines new algorithms for providing bandwidth-guaranteed multicast support to applications that require resilience in the presence of link failures in the network. Our techniques are applicable to networks capable of native network-layer multicast as well as networks enhanced to support infrastructure-based overlay multicast. For the efficient multicast restoration necessary for such a construction, we base our techniques on online computation and dynamic routing. We define new Integer Linear Programming (ILP) solutions to the problem and, from experience with the ILP solutions, derive heuristics that form a new polynomial-time algorithm. Results from our experiments show that our proposed mechanisms can significantly improve the bandwidth efficiency (by 55%) and request acceptance rate (by a factor of 1.5) over alternative mechanisms. The benefits are comparable for infrastructure-based overlay multicast scenarios. Our results also indicate that the proposed heuristic solution performs within 12% of the ILP formulation with respect to the different metrics.
0406 NightOwl: Self-Localisation by Matching Edges Raymond Sheh
Department of Computing
Curtin University of Technology
Perth 6102 Australia
and
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: shehrk@cs.curtin.edu.au

Bernhard Hengst
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
and
National ICT Australia
University of New South Wales
Sydney 2052 Australia
E-mail: bernhardh@cse.unsw.edu.au
A mobile robot must know where it is to act appropriately. This paper describes an algorithm that allows a robot to accurately localise itself locally using a vision sensor and a map of its environment. The basic idea of the algorithm, called NightOwl, is to match the projected camera image with a map of the environment in a local area in order to find the most likely position and orientation of the camera platform.
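The matching step can be pictured as scoring candidate camera poses by how many observed edge points, once rotated and translated into the map frame, land on map edge cells; the sketch below is an illustrative reconstruction, not the NightOwl implementation:

    import numpy as np

    def pose_score(edge_pts, edge_map, pose):
        """Score one candidate pose (x, y, theta): project the observed
        edge points into the map frame and count hits on edge cells."""
        x, y, th = pose
        c, s = np.cos(th), np.sin(th)
        pts = edge_pts @ np.array([[c, -s], [s, c]]).T + np.array([x, y])
        ij = np.round(pts).astype(int)
        ok = ((ij[:, 0] >= 0) & (ij[:, 0] < edge_map.shape[0]) &
              (ij[:, 1] >= 0) & (ij[:, 1] < edge_map.shape[1]))
        return edge_map[ij[ok, 0], ij[ok, 1]].sum()

    def localise(edge_pts, edge_map, candidate_poses):
        # Most likely pose = best edge agreement over the local search area.
        return max(candidate_poses,
                   key=lambda p: pose_score(edge_pts, edge_map, p))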
0405 Present Issues & Challenges in Survivable WDM Optical Mesh Networks Amitava Mukherjee
School of Computer Science & Engineering
University of New South Wales, Sydney 2052, Australia
E-mail: amitavam@cse.unsw.edu.au

Asidhara Lahiri
IBM Global Services
Salt Lake, Calcutta 700 091, India
E-mail:asidhara.lahiri@in.ibm.com

Debashis Saha
MIS & Computer Science Group, Indian Institute of Management (IIM) Calcutta,
Joka, Kolkata 700 104, India
E-mail: ds@iimcal.ac.in
The design of survivable optical networks exploits restoration and/or protection schemes in the WDM and IP layers. In this paper, we discuss the different restoration and protection techniques available at the IP and WDM layers. Upon network failure, a restoration scheme dynamically looks for backup paths over spare capacity in the network, whereas a protection scheme reserves dedicated backup paths and wavelengths in advance. The former is commonly available at higher layers (e.g., the IP layer); the latter is commonly used at the transport (e.g., WDM) layer. WDM predefined protection is broadly divided into link-based and path-based protection, and predesigned protection schemes are so far the most studied for WDM networks. Because of multichannel traffic, the design algorithms used in a WDM network are more complex than those used in non-WDM systems. The survivability schemes available at the network layer, such as IP (IP/MPLS), can recover from multiple faults and operate at small traffic granularity; a primary concern for this approach is the slow convergence and response time of IP link failure detection and routing algorithms, which renders them unsuitable for critical or premium services. This paper discusses recent work on restoration and/or protection schemes in the WDM and IP layers and a few future research issues.
0404 Approaches for Radio Resource Management in Mobile Wireless Networks: Current Status and Future Issues Amitava Mukherjee
School of Computer Science & Engineering
University of New South Wales, Sydney 2052, Australia
E-mail: amitavam@cse.unsw.edu.au


Debashis Saha
MIS & Computer Science Group, Indian Institute of Management (IIM) Calcutta, Joka, Kolkata 700 104, India
E-mail: ds@iimcal.ac.in
The rapid growth of the wireless mobile community, coupled with its demands for high-speed, wideband, multimedia services, stands in clear contrast to the limited radio spectrum allocated in international agreements. Radio resource management (RRM) therefore remains a key challenge in the efficient engineering of mobile wireless networks. In this report, we present an overview of the current status of RRM policies and outline the key issues in RRM for next-generation mobile wireless networks.
0403 Design Recovery of Real-Time Graphical Applications using Video Kim Cuong Pham, Tran Quan Pham, Amir Michail
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {kcph007,quanpt,amichail}@cse.unsw.edu.au
In a previous paper, we introduced an approach to design recovery that takes advantage of the interactive and graphical nature of the majority of today's applications. This earlier work is applicable only to interactive graphical applications written in an event-driven programming style with alternation between user-initiated events and application responses. While productivity applications such as word processors and spreadsheets are of this form, real-time graphical applications such as flight simulators and games are not, since the application proceeds even while the user is idle. In this paper, we propose a design recovery method for real-time graphical applications that uses video to link lower-level code events with their higher-level graphical manifestations. We demonstrate by example how the more easily understood video can shed light on the harder to understand implementation of a real-time graphical application.
0402 Case Study to Monitor Cane Toads in Kakadu National Park Saurabh Shukla, Nirupama Bulusu, Sanjay Jha
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: sshu495@cse.unsw.edu.au, nbulusu@cse.unsw.edu.au,
sjha@cse.unsw.edu.au
Recent advances in wireless communication have promoted large-scale research in the development and deployment of sensor networks. Networked sensors that coordinate among themselves to perform large tasks are expected to revolutionize information gathering and processing in the near future. This thesis addresses the problem of large-scale sensor deployment, using the application of monitoring cane toads in Kakadu National Park as a case study. Cane toads were accidentally introduced to Australia in 1935; their uncanny ability to survive in diverse climates and the lack of natural predators in the Australian ecosystem have allowed their unhindered growth for the last 68 years. This application is of tremendous importance to Australia because cane toads are endangering native species and the ecosystem. A study of deployment requirements is important because it influences the network and system architecture, and consequently the design considerations for higher-layer protocols and algorithms. Previously proposed deployment work (especially in the sensor networks context) tries to achieve a single objective (e.g. maximizing sensor coverage in a given area, or maintaining radio connectivity). Deployment has not really been studied from a higher-level application perspective where many objectives must be satisfied simultaneously; this work bridges that gap. Our thesis is that deployment is really a multi-variate problem, and we provide a novel framework for studying deployment by integrating application, economic, and networking/technology objectives. Specifically, the contributions of the thesis are: (a) a framework in which the deployment problem is reduced to zone division (division of the deployment area into zones), zone classification (classification of zones based on deployment priorities), and in-zone deployment (strategies for deploying nodes within a zone to meet the bandwidth and coverage requirements); (b) the observation that it is hard to get the initial deployment right due to uncertainty, with a Bayesian framework used to address uncertainty in domain knowledge and to drive an adaptive learning algorithm; and (c) a discussion of evaluation strategies. Working through a deployment strategy for cane toad monitoring yields a hierarchical hybrid network of possibly many mutually disconnected clusters, which is counter-intuitive to the large-scale "flat" network models commonly assumed. Although our study is in the context of a single specific application, we hope its insights will be useful to designers and researchers in the area of sensor networks. The final aim is to assist ecologists and biologists in their pursuit of limiting the growth of toads in the region; the sensor network thus designed will be used to monitor and track the presence of cane toads in Kakadu National Park.
0401 A CDMA-Based, Self-Organizing, Location-Aware Media Access Control Protocol Bao Hua (Michael) Liu
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: mliu@cse.unsw.edu.au

Nirupama Bulusu
National ICT Australia Ltd.
E-mail: nbulusu@cse.unsw.edu.au

Huan Pham
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: huanp@cse.unsw.edu.au

Sanjay Jha
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: sjha@cse.unsw.edu.au
In this paper, we propose CSMAC, a novel CDMA-based, self-organizing, location-aware media access control (MAC) protocol for sensor networks. We argue that no single MAC protocol is suitable for all sensor network applications, which cover a broad range of application domains from wildlife tracking to real-time battlefield surveillance. Previously proposed MAC protocols for sensor networks such as S-MAC primarily prioritize energy-efficiency over latency. Our protocol design balances the considerations of energy-efficiency, latency, accuracy, and fault-tolerance in sensor networks. CSMAC uses Code Division Multiple Access to reduce channel interference and consequently message latency in the network. It exploits location awareness to improve energy-efficiency by employing two special algorithms in the network formation process --- Turn Off Redundant Node (TORN) and Select Minimum Neighbor (SMN). ns-2 simulations show that in a 10-hop network topology, CSMAC can achieve up to 74% lower mean latency than S-MAC, while consuming 41% lower mean energy per node.
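Of the two formation algorithms, TORN is the easier to picture: a node may power down if its sensing disc is already covered by its neighbours. Below is a coverage check sketched by sampling points in the disc; this is purely illustrative and not the report's actual criterion:

    import math

    def is_redundant(node, neighbours, sense_r, samples=200):
        """Approximate TORN-style test (hypothetical): is every sampled
        point of this node's sensing disc within some neighbour's disc?"""
        for k in range(samples):
            ang = 2 * math.pi * k / samples
            rad = sense_r * ((k % 10) + 1) / 10.0        # spiral of sample points
            p = (node[0] + rad * math.cos(ang), node[1] + rad * math.sin(ang))
            if not any(math.dist(p, nb) <= sense_r for nb in neighbours):
                return False                              # a gap: node must stay on
        return True                                       # fully covered: may turn off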
0337 Present Scenarios and Future Challenges in Pervasive Middleware Amitava Mukherjee
School of Computer Science & Engineering
University of New South Wales, Sydney 2052, Australia
E-mail: amitavam@cse.unsw.edu.au


Debashis Saha
MIS & Computer Science Group, Indian Institute of Management (IIM) Calcutta, Joka, Kolkata 700 104, India
E-mail: ds@iimcal.ac.in
In order to run applications on pervasive devices, pervasive middleware has to support context-awareness, as pervasive applications need to adapt to variations in the context of execution (such as network bandwidth, battery power and screen size), physical changes of location, changes of technological artifacts (devices), changes in the hardware resources of artifacts, and so on. Recent research efforts have primarily focused on designing new mobile middleware systems capable of supporting the requirements imposed by mobility. Beyond the mobility constraint, however, pervasive middleware must operate under the above-mentioned conditions of radical change, ranging from physical components (such as network heterogeneity) to functional components (from heterogeneous devices to context-based applications). A few contemporary research efforts have indeed addressed parts of these requirements, but a qualitative gap between intended requirements and practical achievements remains. In this article, we discuss some recent mobile/pervasive middleware systems, focusing on the research issues and challenges ahead in bridging that gap. In particular, we highlight the key characteristics of pervasive middleware in supporting context awareness and service discovery, smartness and adaptation, heterogeneity and integration, and intelligent interfacing.
0336 Exploring the Issues of Boundary Definition in the Application of COSMIC-FFP to Embedded Systems Jacky Keung, Suryaningsih, Ross Jeffery
National ICT Australia &
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {jkeung, rossj}@cse.unsw.edu.au
Software sizing plays an essential role in software management and in providing input for estimation and benchmarking. Despite the claimed and growing popularity of function points as a size measure, they are not widely accepted in all software domains. The most popular technique, Function Point Analysis, has become the “de-facto” standard in the business application environment, yet when it is applied to non-MIS systems many researchers have criticized the counts as misleading and not reflective of the size of the systems. The COSMIC Full Function Point technique aims to overcome these shortcomings. This paper presents a single-case study in a telecommunication company examining the applicability of the COSMIC Full Function Point technique in the domain of embedded telephone switching systems (a type of real-time system). The study found that there is very limited experience in this area, that the current counting convention is inadequate in areas such as peer-to-peer sizing, and that the field is still evolving. Due to uncertainty and ambiguity in the measurement process, the counter’s subjectivity plays an important role in function point counting.
0335 Planning an Empirical Experiment To Evaluate The Effects Of Pair Work On The Design Phase Of The Software Lifecycle H. Al-Kilidar, R. Jeffery, C. Kutay
School of Computer Science and Engineering
University of New South Wales, NSW 2052,
Australia
E-mail: {hiyama| rossj| ckutay}@cse.unsw.edu.au

A. Aurum
School of Information Systems, Technology and Management
University of New South Wales, NSW 2052,
Australia.
E-mail: aybuke@unsw.edu.au
This report presents the details of an empirical experiment designed to evaluate the effects of pair work on the design phase of the software development lifecycle. The experiment investigates the effects of pair work on the quality of design products and whether the pair work approach to design is more efficient or cost effective than an individual work approach. The aims are to compare the quality of the design products produced by pair designers and individual designers, and to compare the efficiency and cost effectiveness of the pair and individual work approaches in the design process. In addition, the experiment studies the partners' expectations and practices during the pair work experience. The experimental hypotheses, design, inputs, outputs, and evaluation measures are described.
0331 An Anycast Service for Hybrid Sensor/Actuator Networks Wen Hu
School of Computer Science and Engineering
The University of NSW, Australia
wenh@cse.unsw.edu.au
Nirupama Bulusu
National ICT Australia Limited
nbulusu@cse.unsw.edu.au
Sanjay Jha
School of Computer Science and Engineering
The University of NSW, Australia
sjha@cse.unsw.edu.au
Patrick Senac
ENSICA-LAAS/CNRS
senac@ensica.fr
This paper investigates an anycast communication service for a hybrid sensor/actuator network, consisting of both resource-rich and resource-impoverished devices. The key idea is to exploit the capabilities of resource-rich devices (called micro-servers) to reduce the communication burden on smaller, energy, bandwidth and memory constrained sensor nodes. The goal is to deliver sensor data to the nearest micro-server, which can (i) store it, (ii) forward it to other micro-servers using out-of-band communication, or (iii) perform the desired actuation. We motivate, propose, evaluate and analyse a reverse tree-based anycast mechanism tailored to deal with the unique event dynamics in sensor networks. Our approach is to construct an anycast tree rooted at each potential event source, which micro-servers can dynamically join and leave. Our anycast mechanism is self-organizing, distributed, robust, scalable, routing-protocol independent and incurs very little overhead. Simulations using ns-2 show that our anycast mechanism, when added to Directed Diffusion, can reduce the network's energy consumption by more than 50%, can reduce both the mean end-to-end latency of the transmission and the mean number of transmissions by more than 50%, and achieves a 99% data delivery rate for low and moderate micro-server mobility rates.
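The per-node state of such a reverse tree can be pictured as each sensor choosing, among its neighbours' advertisements, the shortest route to any micro-server currently joined to the tree; micro-server joins and leaves simply change the advertised distances. A sketch with illustrative message shapes:

    import math

    def update_anycast(neighbours, adverts):
        """adverts maps a neighbour to its advertised hop count to the
        nearest joined micro-server (shape is hypothetical). Returns the
        next hop for sensor data and this node's own advertised distance."""
        best, best_h = None, math.inf
        for n in neighbours:
            h = adverts.get(n, math.inf)
            if h + 1 < best_h:
                best, best_h = n, h + 1
        return best, best_h

    print(update_anycast(["n1", "n2"], {"n1": 3, "n2": 1}))  # -> ('n2', 2)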
0328 State Transition Model to Characterize TCP Window Control Behavior in Wired/Wireless Internetworks Debashis Saha
MIS & Computer Science Group, Indian Institute of Management (IIM) Calcutta, Joka, Kolkata 700 104, India
E-mail: ds@iimcal.ac.in

Amitava Mukherjee
School of Computer Science & Engineering
University of New South Wales, Sydney 2052, Australia
E-mail: amitavam@cse.unsw.edu.au

Sanjay K Jha
School of Computer Science & Engineering,
University of New South Wales, Sydney 2052, Australia
E-mail: sjha@cse.unsw.edu.au
TCP was designed to work well in networks with low channel error rates. Wireless networks, on the other hand, are characterized by frequent transmission losses. As a result, when TCP is used in wired/wireless internetworks, losses due to channel errors are mistaken for congestion losses and the sending rate is unnecessarily reduced in an attempt to relieve the congestion, resulting in degraded performance. There have been several studies modelling the behavior of TCP in such environments, typically under last-hop wireless scenarios. The consensus is that TCP needs some form of indication to segregate wireless loss from congestion loss and behave accordingly in its window control. However, it is not an easy task to detect the type of loss from TCP behavior, as shown in this report with the help of state transition models. In order to further extend the model for more accuracy in capturing the exact TCP window control, we plan to carry out a series of simulation studies for a synthetic heterogeneous environment with multiple TCP/UDP flows, keeping the state diagram in mind.
0325 Automated Interface Synthesis Vijay D'Silva, Arcot Sowmya
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {vijayd,sowmya}@cse.unsw.edu.au

S. Ramesh
Visiting Professor
NICTA Sydney Node
(May-August 2003)

(permanent)
Department of Computer Science and Engineering
Indian Institute of Technology, Powai
Bombay 400 076
E-mail: ramesh@cse.iitb.ac.in
System-on-Chip (SoC) design methodologies rely heavily on reuse of intellectual property (IP) blocks. IP reuse is a labour intensive and time consuming process as IP blocks often have different communication interfaces. We present a framework which automates the generation of HDL descriptions of interfaces between mismatched IP communication protocols. We significantly improve and extend existing work by formalising the problem and providing a solution which addresses data mismatches, pipelining and differences in clock speeds. Importantly, the use of a formal framework enables us to generate solutions which are provably correct. The developed algorithms have been implemented and the tool used to synthesise wrappers and bridges for many SoC protocols. In particular we present a case study of the application of our algorithm to a specific design obtained from industry.
0323 Failure-Oriented Path Restoration Algorithm for Survivable Networks William Lau and Sanjay Jha
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {wlau,sjha}@cse.unsw.edu.au
Connection-oriented networks such as MPLS and GMPLS offer network providers the mechanisms to deliver a high level of service quality to their clients. One critical factor in determining the service quality is the rate of availability, or what some call the up-time, of the connection. A common approach for provisioning high availability with shorter restoration times is to use pre-calculated backup paths that come into service when the normal service paths fail. The challenge in this approach is to allocate the minimal total spare capacity required by the backup paths. One restoration strategy that aims to minimize spare capacity is based on failure-oriented reconfiguration (FORC), where a backup path is calculated for each possible failure scenario that affects the working service path. Linear and integer programming formulations can be made to find optimal solutions but do not run in polynomial time. An existing heuristic algorithm was proposed to reduce the computation time, but it also does not run in polynomial time. In this paper, a new polynomial-time approximation algorithm called Service Path Local Optimization (SPLO) is proposed. SPLO is shown to perform better than the existing approximations for FORC. SPLO is designed for online computation, where only one request is computed at any one time and the decision making does not depend on future requests. The polynomial-time and online nature of the algorithm make SPLO suitable for use in real-time on-demand path request applications. Further, the potential for SPLO as an algorithm in traffic engineering applications is investigated by looking at the performance impact when source-destination-based traffic aggregation is applied; the results show that the spare capacity requirement for SPLO degrades by only up to 5%. This paper also introduces a new concept called path intermix, where the service path's allocated bandwidth can be used by the backup paths protecting that particular service path. The results show that path intermix reduces the lengths of backup paths and can reduce spare capacity by up to 4% for single node failures.
0322 A Theory of Proximity Relations Jane Brennan, Eric Martin
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: jbrennan|emartin@cse.unsw.edu.au
Next to orientation and connectivity, proximity is one of the key topological properties of many spatial relations. The aim of the work presented in this report is to provide a formalism that can qualitatively account for absolute binary proximity relations, taking into consideration common-sense spatial knowledge. The theory of nearness presented here is based on the concepts of influence areas of spatial objects and distances between these objects abstracted into a pseudo-metric space. This theory goes beyond existing models and influence area approaches, by generalising them and providing a formalisation of nearness notions. The most general nearness notions are justified against a set of experimental results obtained from studies conducted by Worboys (2001) in the domain of environmental spaces. The symmetric notion of nearness, which we found to be an adequate representation for most cases, is elaborated on in more detail. Its implications are investigated in the context of a navigational model. There are however cases where nearness is not symmetric. Therefore a brief discussion on the asymmetric aspect of nearness is given and its implications are investigated in the context of a natural language model.
0321 A Survey on the Interaction Between Caching, Translation and Protection Adam Wiggins
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: awiggins@cse.unsw.edu.au
Fine-grained hardware protection could deliver significant benefits to software, enabling the implementation of strongly encapsulated light-weight objects, but only if it can be done without slowing down the processor. In this survey we explore the interaction between the processor's caches and virtual memory in traditional as well as research architectures. We find that while caching and translation mechanisms have received much attention in the literature, hardware protection mechanisms have remained largely neglected, with none of the explored architectures providing truly scalable support for context-sensitive, fine-grained protection. Based on the insights gained from the survey we outline an approach which facilitates the construction of simple, yet fast, low-power fine-grained protection mechanisms for processor cores.
0320 Skipping Strategies for Efficient Structural Joins Franky Lam, William M. Shui, Damien K. Fisher, Raymond K. Wong
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: wong@cse.unsw.edu.au
The structural join is considered a core operation in processing and optimizing XML queries, and various techniques have been proposed for efficiently finding structural relationships between a list of potential ancestors and a list of potential descendants. However, previous work that performs well usually relies on external index structures such as B-trees, which increase both storage and memory overheads. This paper presents a novel algorithm for efficiently processing structural joins that does not require any such data structures, and hence can be easily implemented and incorporated into any existing system. Experiments show that our method significantly outperforms previous algorithms.
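The skipping idea can be illustrated with interval (start, end) region labels and a binary search that jumps over descendants that cannot fall inside the current ancestor; this sketches the general technique, not the paper's exact algorithm:

    from bisect import bisect_left

    def structural_join(ancestors, descendants):
        """Both input lists hold (start, end) labels sorted by start; a
        descendant matches an ancestor when its interval nests inside."""
        out = []
        starts = [d[0] for d in descendants]
        for a_start, a_end in ancestors:
            i = bisect_left(starts, a_start)      # skip hopeless descendants
            while i < len(descendants) and descendants[i][0] < a_end:
                if descendants[i][1] < a_end:     # fully nested: a match
                    out.append(((a_start, a_end), descendants[i]))
                i += 1
        return out

    print(structural_join([(1, 10), (12, 20)],
                          [(2, 3), (5, 8), (13, 14), (25, 26)]))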
0319 Peering and Querying e-Catalog Communities Boualem Benatallah (1), Mohand-Said Hacid (2), Hye-young Paik (1)
Christophe Rey (3) and Farouk Toumani (3)

(1) CSE, University of New South Wales, Australia,
{boualem,hpaik}@cse.unsw.edu.au
(2) LIRIS, University Lyon I, France,
mshacid@liris.univ-lyon1.fr
(3) LIMOS, ISIMA, University Blaise Pascal, France,
{rey,ftoumani}@isima.fr
An increasing number of organisations are jumping hastily onto the online retailing bandwagon and moving their operations to the Web, and a huge number of e-catalogs (i.e., information and product portals) are now readily available. Unfortunately, because e-catalogs are often autonomous and heterogeneous, effectively integrating and querying them is a delicate and time-consuming task. More importantly, the number of e-catalogs to be integrated and queried may be large and continuously changing. Consequently, conventional approaches, in which developing an integrated e-catalog requires understanding each of the underlying catalogs, are inappropriate. Instead, a divide-and-conquer approach should be adopted, whereby e-catalogs serving similar customer needs are grouped together and semantic peer relationships among these groups are defined to facilitate distributed, dynamic and scalable integration of e-catalogs. In this paper, we use the concept of e-catalog communities, essentially containers of related e-catalogs, and the peer relationships among them to facilitate the querying of a potentially large number of dynamic e-catalogs. We propose a flexible query matching algorithm that exploits both community descriptions and peer relationships to find the e-catalogs that best match a user query, where the query is formulated using the description of a given community.
0318 Intertac Software Architecture Cat Kutay
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: ckutay@cse.unsw.edu.au
This paper describes the development of a groupware system from requirements gathered by researching the activities of software engineering students developing specification reports in groups, for an imaginary but realistic client. The groupware was developed to enable these groups to meet more often in non-collocated sessions. The requirements developed for the basic application software are presented here, together with the architecture and interface. The groupware is designed as a constructivist Collaborative Learning Environment (CLE), so the first aim is to provide a flexible and unstructured learning environment in which students can construct their own meaning; on top of this, agents can be placed to provide assistance and feedback that improve aspects of this learning. This paper covers the first part of that process: developing the environment with a component-based architecture into which agents can readily be integrated. A brief summary of the agent support is also provided, with a plan for future verification of the final system once complete.
0317 Fast Ordering for Changing XML Data Damien K. Fisher, Franky Lam, William M. Shui, Raymond K. Wong
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: wong@cse.unsw.edu.au
With the increasing popularity of XML, there arises the need for managing and querying information in this form. Several query languages, such as XQuery, have been proposed which return their results in document order. However, most recent efforts focused on query optimization have either disregarded order or proposed a static labelling scheme in which update issues are not addressed. Based on the preliminary results from our previous work, this paper presents a fast method to maintain document ordering for changing XML data. Analysis of our method shows that it is more efficient and scalable than our previously proposed method as well as other related work, especially under various scenarios of updates.
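Order-maintenance schemes of this kind boil down to giving nodes comparable keys with room left for insertions, so that the document-order test is a plain comparison. A deliberately simplified sketch follows; real schemes relabel locally when the gaps run out, which is precisely the update cost this paper targets:

    def label_between(lo, hi):
        """Key for a new node inserted between two siblings (midpoint
        labelling; illustrative only)."""
        return (lo + hi) / 2.0

    a, b = 1.0, 2.0
    c = label_between(a, b)   # insert a node between a and b
    assert a < c < b          # document order is now plain key order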
0316 Efficient Ordering for XML Data Damien K. Fisher, Franky Lam, William M. Shui, Raymond K. Wong
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: wong@cse.unsw.edu.au
With the increasing popularity of XML, there arises the need for managing and querying information in this form. Several query languages, such as XQuery, have been proposed which return their results in document order. However, most recent efforts focused on query optimization have disregarded order. This paper presents a simple yet elegant method to maintain document ordering for XML data. Analysis of our method shows that it is indeed efficient and scalable, even for changing data.
0315 On Clustering Schemes for XML Databases Damien K. Fisher, William M. Shui, Franky Lam, Raymond K. Wong
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: wong@cse.unsw.edu.au
Although clustering problems are in general NP-hard, much research effort has been devoted to them in the areas of OODB and RDBMS. With the increasing popularity of XML, researchers have been focusing on various aspects of XML data management, including query processing and optimization; however, clustering issues have been disregarded in that work. This paper provides a preliminary study of data clustering for optimizing XML databases. Different clustering schemes are compared through a set of extensive experiments.
0314 Adaptive Change Management for Semi-structured Data Raymond K. Wong, Nicole Lam
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: wong@cse.unsw.edu.au
This paper presents an efficient content-based version management system for managing XML documents. Our proposed system uses complete deltas for the logical representation of document versions. This logical representation is coupled with an efficient storage policy for version retrieval and insertion, including the conditional storage of complete document versions (depending on the proportion of the document that was changed). Based on performance measurements from experiments, an adaptive scheme based on non-linear regression is proposed. Furthermore, we define a mapping between forward and backward deltas in order to improve the performance of the system, in terms of both space and time.
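The forward-to-backward delta mapping can be illustrated with a toy operation format: reverse the operation sequence and invert each edit, which works because a complete delta records enough context (path and old value) to undo every operation. The operation shapes here are hypothetical:

    def invert_delta(delta):
        """Turn a forward delta (old -> new version) into a backward
        delta (new -> old). Each op is (kind, path, old, new)."""
        inverse = {"insert": "delete", "delete": "insert", "update": "update"}
        return [(inverse[kind], path, new, old)
                for kind, path, old, new in reversed(delta)]

    fwd = [("update", "/doc/title", "v1", "v2"),
           ("insert", "/doc/sec[2]", None, "<p/>")]
    print(invert_delta(fwd))
    # [('delete', '/doc/sec[2]', '<p/>', None),
    #  ('update', '/doc/title', 'v2', 'v1')]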
0313 On Structural Inference for XML Data Raymond K. Wong, Jason Sankey
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: wong@cse.unsw.edu.au
Semistructured data presents many challenges, mainly due to its lack of a strict schema. These challenges are further magnified when large amounts of data are gathered from heterogeneous sources. We address this by investigating and developing methods to automatically infer structural information from example data. Using XML as a reference format, we approach the schema generation problem by applying inductive inference theory. In doing so, we review and extend results relating to the search spaces of grammatical inferences. We then adapt a method from computational linguistics for evaluating the result of an inference process. Further, we combine several inference algorithms, including both new techniques introduced by us and those from previous work. Comprehensive experimentation reveals our new hybrid method, based upon recently developed optimisation techniques, to be the most effective.
0312 Efficient Query Relaxation for Semistructured Data Michael Barg, Raymond K. Wong
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: wong@cse.unsw.edu.au
Semistructured data, such as XML, allows authors to structure a document in a way which accurately captures the semantics of the data. This, however, poses a substantial barrier to casual and non-expert users who wish to query such data, as it is the data's structure which forms the basis of all XML query languages. Without an accurate understanding of this structure, users are unable to issue meaningful queries. This problem is compounded when one realizes that data adhering to different schema are likely to be contained within the same data warehouse or federated database. This paper describes a mechanism for meaningfully querying such data with no prior knowledge of its structure. Our system returns approximate answers to such a query over semistructured data, and can return useful results even if a specific value cannot be matched. We discuss a number of novel query processing and optimization techniques which enable us to perform our query relaxation in near linear time. Experiments show that our mechanism is very fast and scales well.
0311 An Efficient WordNet-Based Summarizer for Large Text Documents Raymond K. Wong, Chit Sia
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: wong@cse.unsw.edu.au
The current information overload problem calls for automatic text summarization. This paper presents an efficient sentence-based extraction summarizer for this purpose. Lexical chains were used as a basis, and knowledge resources such as WordNet and a sentence boundary disambiguation tool were integrated into the system for better performance. Three different summary extraction heuristics were used and compared. An intrinsic evaluation compared our summarizer and a commercial product against human-written abstracts; the results are encouraging, with our system's summaries agreeing more closely with human judgement than the commercial system's. The algorithm demonstrates linear runtime behaviour, which suggests not only that our system scales well but also that it can handle longer documents.
0310 Update Synchronization for Mobile XML Data Franky Lam, Nicole Lam, Raymond K. Wong
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: wong@cse.unsw.edu.au
These days, many handheld applications receive data from a primary database server and operate in an intermittently connected environment, maintaining data consistency with data sources through synchronization. In certain applications, such as sales force automation, it is highly desirable for updates on the data source to be reflected at the handheld applications immediately. This paper proposes an efficient method to synchronize XML data on multiple mobile devices. Each device retrieves and caches a local copy of data from the database source based on a regular path expression; these local copies may overlap or be disjoint with each other. An efficient mechanism is proposed to find all the disjoint copies and so avoid unnecessary synchronizations. Each update to the data source is then checked to identify all handheld applications affected by it. Communication costs can be further reduced by eliminating the forwarding of unnecessary operations to groups of mobile clients.
0309 "Variable Resolution Hierarchical RL" Bernhard Hengst The contribution of this paper is to introduce heuristics, that go beyond safe state abstraction in hierarchical reinforcement learning, to approximate a decomposed value function. Additional improvements in time and space complexity for learning and execution may outweigh achieving less than hierarchically optimal performance and deliver anytime decision making during execution. Heuristics are discussed in relation to HEXQ, a MDP partitioning that generates a hierarchy of abstract models using safe state abstraction. The approximation methods are illustrated empirically.
0308 Safe State Abstraction and Discounting in Hierarchical Reinforcement Learning Bernhard Hengst The great benefit in state abstraction for hierarchical reinforcement learning (HRL) is the potential improvement in computational complexity with significant compaction of the value function. Safe state aggregation of reusable sub-task states is not possible in general for a decomposed MDP using one decomposed discounted cumulative reward function. This severely limits the effectiveness of HRL, particularly for infinite horizon problems. This paper makes two related and novel contributions: (1) the introduction of an additional supporting decomposed discount function allowing state abstraction in the face of discounting and (2) modifications to adapt HRL to solve infinite horizon problems in which the recursively optimal policy may require a sub-task to persist.
0307 Itanium Page Tables and TLB Matthew Chapman, Ian Wienand, Gernot Heiser
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {matthewc, ianw, gernot}@cse.unsw.edu.au
The Itanium architecture offers considerable flexibility in managing the TLB. Besides features found in many architectures, such as TLB tags and superpages, it supports two quite unusual features. One is the choice of two hardware-walked page table formats, a linear array and a hashed page table. The other is an unusual TLB tagging scheme which, among other things, allows a single TLB entry to map a page into several address spaces, thus reducing the consumption of TLB entries in the presence of sharing. Only one page table format, the linear array, is presently supported in Linux. However, this format supports neither the use of arbitrarily mixed page sizes nor the sharing of TLB entries. We have implemented the hashed page table format in Linux and found that this change has negligible performance impact, which should pave the way for exploring an implementation of superpage support. We have also implemented sharing of TLB entries, and found that in normal Linux workloads the effect is somewhere between negligible and a moderate performance increase. We could, however, demonstrate that there are scenarios where TLB sharing can produce significant performance gains.
0306 Design and Performance Analysis of the CBCSWFQ Packet Scheduling Algorithm Fariza Sabrina and Sanjay Jha
School of Computer Science and Engineering
The University of New South Wales, NSW 2052, Australia
Email: {farizas,sjha}@cse.unsw.edu.au
In active and programmable networks, packet processing can be accomplished in the router within the data path. For efficient resource allocation in such networks, packet scheduling schemes should consider multiple resources such as CPU and memory, in addition to bandwidth, to improve overall performance. The dynamic nature of network load and the inherent unpredictability of the processing times of active packets pose a significant challenge for CPU scheduling: unlike bandwidth scheduling, prior estimation of a packet's CPU requirements is very difficult, since it is platform dependent and also depends on the processing load at the time of execution, operating system scheduling, and so on. This paper presents a new composite scheduling algorithm called CBCSWFQ, which is based on Weighted Fair Queuing (WFQ) and is designed to schedule both bandwidth and CPU resources adaptively, fairly and efficiently. CBCSWFQ uses an adaptive prediction technique to estimate the processing requirements of active flows efficiently and accurately. Through simulation and analysis, we show the improved performance of our scheduling algorithm in achieving better delay guarantees compared to WFQ applied separately to CPU and bandwidth scheduling.
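Two ingredients of such a composite scheduler can be sketched: an adaptive per-flow estimate of packet processing cost, and a WFQ-style virtual finish time that consumes either bytes (bandwidth) or predicted CPU time. The EWMA predictor below is a simple stand-in for the report's adaptive prediction technique:

    class CpuEstimator:
        """Per-flow estimate of packet processing cost, refined from
        measurements (an EWMA stand-in; parameters are hypothetical)."""
        def __init__(self, alpha=0.25, initial_us=50.0):
            self.alpha, self.est = alpha, initial_us

        def predict(self):
            return self.est

        def observe(self, measured_us):
            self.est += self.alpha * (measured_us - self.est)

    def finish_time(virtual_time, last_finish, cost, weight):
        """WFQ-style virtual finish time; `cost` is packet bytes for the
        bandwidth scheduler or predicted CPU microseconds for the CPU
        scheduler, so one rule serves both resources."""
        return max(virtual_time, last_finish) + cost / weight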
0305 An Efficient Resource Management Framework for Programmable and Active Networks Fariza Sabrina and Sanjay Jha
School of Computer Science and Engineering
The University of New South Wales, NSW 2052, Australia
Email: {farizas,sjha}@cse.unsw.edu.au
This report presents a framework for resource management in highly dynamic active and programmable networks. The goal is to allocate and manage node resources efficiently while ensuring effective utilization of the network and supporting load balancing. The framework supports the co-existence of active and non-active nodes and proposes a novel Directory Service (DS) architecture that can be used to discover suitable active nodes in the Internet, select the best end-to-end network path, and reserve resources along the selected path. Intra-node and inter-node resource management are facilitated through the DS, while within an active node the framework implements a composite scheduling algorithm that schedules CPU and bandwidth resources to resolve the combined resource scheduling problem. In addition, a flexible active node database system is introduced to resolve the challenging problem of determining the CPU requirements of incoming packets. Through simulation we show the improved performance of our scheduling algorithm in achieving overall fairness in allocating active node resources.
0304 A Formal Approach to Interface Synthesis for SoC Design Vijay D'Silva
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: vijayd@cse.unsw.edu.au
Systems-on-Chip (SoC) design methodologies rely increasingly on reuse of intellectual property (IP) blocks. IP reuse is a labour intensive and time consuming process as IP blocks often have different communication interfaces. We present a framework to generate a synthesizable VHDL description of an interface between two mismatching IP communication protocols. We improve and extend previously published work by formalising the problem and by explicitly handling data width and type mismatching and multiple data transfers. At present, simpler cases of pipelining are handled as well. We have implemented our technique and demonstrate it by generating an interface between the CoreConnect Processor Local Bus from IBM and the AMBA System Bus from ARM.
0303 Towards Untrusted Device Drivers Ben Leslie, Gernot Heiser
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {benjl, gernot}@cse.unsw.edu.au
Device drivers are well known to be one of the prime sources of unreliability in today's computer systems. We argue that this need not be so, as drivers can be run as user-level tasks, allowing them to be encapsulated by hardware protection. In contrast to prior work on user-level drivers, we show that on present hardware it is possible to prevent DMA from undermining this encapsulation, and that this can be done without unreasonably impacting driver performance.
0302 Invented Predicates to Reduce Knowledge Acquisition Effort Hendra Suryanto and Paul Compton

Artificial Intelligence Department
School of Computer Science and Engineering
The University of New South Wales, Sydney 2052, Australia
The aim of this study was to develop machine learning techniques that speed up knowledge acquisition from an expert. As the expert provides knowledge, the system generalizes from it in order to reduce the need for later knowledge acquisition; this generalization is completely hidden from the expert. We have developed such a learning technique based on Duce's intra-construction and absorption operators (Muggleton, 1990) and applied it to Ripple Down Rules (RDR) incremental knowledge acquisition (Compton & Jansen, 1990). Preliminary evaluation shows that knowledge acquisition can be reduced by up to 50%.
0301 Parallelized FTP Shaleeza Sohail, Sanjay Jha and Hossam ElGindy
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {sohails,sjha,elgindyh}@cse.unsw.edu.au
The parallelized FTP (P-FTP) approach attempts to solve the problem of slow downloads of large multimedia files while optimizing the utilization of mirror servers. The approach presented in this paper downloads a single file from multiple mirror servers simultaneously, with each mirror server transferring a portion of the file. The P-FTP server calculates the optimal division of the file for efficient transfer. The dynamic monitoring ability of P-FTP keeps the file transfer process at the optimized level no matter how abruptly network and mirror server characteristics change.
0216 Avoiding Useless Packet Transmission for Multimedia over IP Networks: The Case of Multiple Multimedia Flows Jim Wu and Mahbub Hassan
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {jimw,mahbub}@cse.unsw.edu.au
In this paper, we investigate the useless packet transmission (UPT) avoidance problem in the presence of multiple multimedia flows. We propose a management module, called Unintelligible Flow Management (UFM), to enhance UPT avoidance (UPTA) in networks with multiple multimedia flows, together with two different management policies for UFM: Random Select (RS) and Least Bandwidth Select (LBS). We demonstrate the incorporation of RS/LBS into WFQ, and evaluate the effectiveness of both RS and LBS under various network scenarios (e.g. single/multiple congested links, homogeneous/heterogeneous video applications). Simulation results show that UFM can significantly improve TCP throughput and the average video intelligibility index, as compared with plain WFQ. Our simulation results also suggest that RS and LBS have similar performance with homogeneous multimedia applications. With heterogeneous multimedia applications, however, LBS yields better performance, in terms of the total number of video flows recovered and the average intelligibility index of video applications.
0215 Avoiding Useless Packet Transmission for Multimedia over IP Networks: The Case of Multiple Congested Links Jim Wu and Mahbub Hassan
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {jimw,mahbub}@cse.unsw.edu.au
In this paper, we investigate the UPT avoidance problem with multiple congested links. We propose three different UPTA enforcement schemes: basic UPTA (B-UPTA), Partial UPTA (P-UPTA) and Centralised UPTA (C-UPTA). The challenge of UPT avoidance with multiple congested links is to determine the global fairshare and to enforce UPTA based on it. We propose the Bottleneck Fairshare Discovery (BFD) protocol to address this issue; BFD is a feedback mechanism that assists UPTA in networks with multiple congested links. We describe the implementation of BFD with UPTA, taking WFQ as an example. Our simulation study shows that B-UPTA fails to detect UPT in some situations. P-UPTA eventually detects UPT, but bandwidth may be wasted on upstream links before UPT is detected. C-UPTA avoids UPT in all situations, as it always drops useless packets at the network edge. Simulation results suggest that, with C-UPTA, the achieved TCP throughput improvement is very close to the theoretical maximum. In the paper, we also analyse the performance of C-UPTA quantitatively, in terms of TCP throughput, file download time, impact on video intelligibility, and impact on fairness. Our simulation results reveal that, for all six scenarios, TCP throughput is significantly improved (by a factor of up to 50%), and as a result file download times (for various file sizes) are greatly reduced (by more than 30%). On the other hand, incorporation of C-UPTA into WFQ has no significant impact on the intelligibility of the MPEG-2 video (a difference of less than 3%). For all six scenarios, C-UPTA maintains fairness comparable to WFQ, confirming that UPTA does not have any adverse impact on the fairness performance of fair algorithms.
0214 A Constructive Proof of the Turing Completeness of Circal Jérémie Detrey
ENS Lyon
46, allée d'Italie
69364 Lyon Cedex 07, France
e-mail: jdetrey@ens-lyon.fr

Oliver Diessel
School of Computer Science and Engineering
University of New South Wales
Sydney, NSW 2052, Australia
e-mail: odiessel@cse.unsw.edu.au
This paper gives a proof of the Turing completeness of the Circal process algebra by exhibiting a universal program capable of mapping any Turing machine description into Circal specifications that effectively simulate the behaviour of the given machine.
0213 SCCircal: a Static Compiler Mapping XCircal to Virtex FPGAs Jérémie Detrey
ENS Lyon
46, allée d'Italie
69364 Lyon Cedex 07, France
e-mail: jdetrey@ens-lyon.fr

Oliver Diessel
School of Computer Science and Engineering
University of New South Wales
Sydney, NSW 2052, Australia
e-mail: odiessel@cse.unsw.edu.au
This paper describes the new version of SCCircal, a static compiler for XCircal targeting the Xilinx Virtex architecture. This compiler, written in Java, is now capable of providing a real FPGA implementation for almost any Circal process specification; in particular, it supports hierarchy, abstraction and relabelling. This paper also introduces the notion of a process interface, provided to help the development of further extensions of this compiler.
0210 A Theory of Compositional Concurrent Objects Xiaogang Zhang and John Potter
School of Computer Science and Engineering
University of New South Wales, Australia
E-mail: {xzhang,potter}@cse.unsw.edu.au
This paper presents a theory of composition for concurrent object systems, based on object modelling in the \kappa-calculus. The behaviour of a concurrent object can be modelled as the composition of a process representing the functional behaviour of the object, with no constraint on its concurrent interactions or synchronisation, and a process representing concurrency constraints that reduce the allowable concurrency and avoid exceptional states. With this model, we use the \kappa-calculus, a process algebra with polars, to study the theory of composition of concurrent behaviours: we investigate when and how concurrent behaviours can (or should) be composed with and separated from functional behaviours or other concurrent behaviours, and identify relevant patterns and properties of concurrent behaviours. Some generic properties of behaviour composition, such as the Identity Law and the Associative Law, are proven in this study. Keywords: object models, \kappa-calculus, \pi-calculus, concurrency constraints, concurrency controls, composition, synchronisation
0209 A Fast and Versatile Path Index for Querying Semi-Structured Data Michael Barg and Raymond K. Wong
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
Email: {mbarg, wong}@cse.unsw.edu.au
The richness of semi-structured data allows data of varied and inconsistent structures to be stored in a single database. Such data can be represented as a graph, and queries can be constructed using path expressions, which describe traversals through the graph. Instead of providing optimal performance for a limited range of path expressions, we propose a mechanism which is shown to have consistent and high performance for path expressions of any complexity, including those with descendant operators (path wildcards). We further detail mechanisms which employ our index to perform more complex processing, such as evaluating both path expressions containing links and entire (sub) queries containing path based predicates. Performance is shown to be independent of the number of terms in the path expression, even where these contain wildcards. Experiments show that our index is faster than conventional methods by up to two orders of magnitude for certain query types, is small, and scales well.
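The core idea, that a query containing descendant operators can be answered by matching stored root-to-node label paths rather than traversing the graph, can be sketched as follows (a hypothetical toy rendering in Python, not the report's index structure):

    # Hypothetical sketch of a path index for semi-structured data: every
    # node is keyed by its root-to-node label path, and a query such as
    # "book//author" is matched against those stored paths, so lookup cost
    # does not grow with the number of wildcard steps.
    import re

    class PathIndex:
        def __init__(self):
            self.paths = {}                   # label path -> node ids

        def add(self, label_path, node_id):
            self.paths.setdefault(label_path, []).append(node_id)

        def query(self, expr):
            # '//' (descendant operator) matches any run of intermediate labels
            pattern = re.compile("^" + expr.replace("//", "/(?:[^/]+/)*") + "$")
            return [n for p, ids in self.paths.items()
                    if pattern.match(p) for n in ids]

    idx = PathIndex()
    idx.add("book/chapter/section/author", 1)
    idx.add("book/author", 2)
    print(idx.query("book//author"))          # -> [1, 2]
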
0208 Design and Implementation of a Virtual Quality of Service MAC Layer (VQML) for Wireless LANs Mahbub Hassan, Kenneth Lee, Mohammad Rezvan
School of Computer Science and Engineering
The University of New South Wales
Sydney 2052, Australia
Wireless LANs are becoming increasingly popular. While the technology provides wireless connectivity, it offers minimal or no quality of service (QoS) to multimedia applications. We propose a virtual QoS MAC layer (VQML) between the MAC and networking layers to provide QoS. The proposed VQML architecture has been implemented on a Linux platform and tested in an experimental wireless network test-bed in the Network Research Laboratory at UNSW.
0207 Active Protocol Label Switching (APLS) William Lau and Sanjay Jha
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {wlau,sjha}@cse.unsw.edu.au
Modern layer 3 networking technologies have mainly been designed for performance and for network providers. This report proposes a new network architecture called Active Protocol Label Switching (APLS) that combines the performance of current label switching technology with novel concepts that cultivate service provisioning. Novel features such as the Virtual Label Space, the APLS micro-instruction architecture, and micro-policy based forwarding provide a more powerful network model, facilitate better network-level service engineering, and give tremendous flexibility to both network and service providers. The thrust of our study is to construct an APLS test-bed using open hardware and software and later use this test-bed for experimenting with the various features and options available in APLS. This report also describes our prototype implementation of APLS under Linux.
0206 The Survey of Bandwidth Broker Shaleeza Sohail and Sanjay Jha
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {sohails,sjha}@cse.unsw.edu.au
Given present network management research trends, it can safely be stated that in the near future enterprise networks and ISPs will need a network management entity to dynamically manage QoS networks. DiffServ is one of the emerging network architectures, and it introduces the bandwidth broker as its logical resource, network and policy management module. Due to the complex and extensive functionality expected of the bandwidth broker, it presents a very large number of partly explored research areas. This survey is an effort to briefly discuss some of the developments in the ongoing process of defining and implementing a functional bandwidth broker.
0205 The Responsive Bisimulations in the \kappa-calculus Xiaogang Zhang and John Potter
School of Computer Science and Engineering
University of New South Wales, Australia
E-mail: {xzhang,potter}@cse.unsw.edu.au
This paper presents responsive bisimulation in the \kappa-calculus. It is part of ongoing work that attempts to model concurrent object systems using process algebra. The behaviour of an object can be described as the composition of a process representing the basic functionality of the object and separate processes controlling the concurrent behaviour of that object. While familiar bisimulations usually fail to capture the behavioural equivalence between object components, the responsive bisimulation proposed by the authors in an earlier paper, under which delaying a message locally and delaying it remotely have the same effect as long as potential interference by competing receptors is avoided, is able to do so. With this bisimulation, an equivalence between the \pi-calculus expression (\nu n)(m.\bar{n}|k.n.P) and k.m.P can be achieved. However, in the earlier paper the responsive bisimulation was described in the polar \pi-calculus, which adds a few features for modelling concurrent objects while maintaining a syntactical simplicity similar to that of the ordinary \pi-calculus, but which still cannot express general behaviours of concurrent objects efficiently. The \kappa-calculus, in which locks are included as primitives, is on the other hand more expressive and flexible in modelling compositional concurrent objects. By presenting responsive bisimulation in the \kappa-calculus, this paper forms an improved base for studies on both the theory of behaviour composition and the semantics of compositional concurrent OO programming languages.
0204 A Constraint Description Calculus for Compositional Concurrent Objects Xiaogang Zhang and John Potter
School of Computer Science and Engineering
University of New South Wales, Australia
E-mail: {xzhang,potter}@cse.unsw.edu.au
This report presents the \kappa-calculus, a mobile-process algebra with locks as primitives. The Guarded Conditional Exclusive Choice "\otimes", together with a selective locking/unlocking mechanism, is used in the \kappa-calculus as the only combinator for input-guarded processes. For input-guarded terms, therefore, the standard mutually exclusive choice "+" of CCS or the \pi-calculus and the parallel composition "|" become two extreme cases of the unified combinator "\otimes". The \kappa-calculus provides a simpler, clearer and more composable description of method exclusion in the modelling of concurrent objects, while preserving the power of the \pi-calculus in modelling the mobility of concurrent objects. A concurrent object may be modelled in the \kappa-calculus as either a single object process or the composition of a functional object process and a set of control object processes. A single object process modelled in the \kappa-calculus has the generic form \Lambda\circ[G\ll\tilde{M}\gg], where \Lambda records the status of the locks, G describes the method exclusion and \tilde{M} is a set of processes, each of which represents the functional behaviour of a method body. The \kappa-calculus provides a straightforward model that separates aspects such as object functionality, method exclusion, and locking schemas and states at a high level of abstraction, and provides a semantics for a compositional concurrent object-oriented programming language.
0203 The Responsive Bisimulations in the polar \pi-calculus Xiaogang Zhang and John Potter
School of Computer Science and Engineering
University of New South Wales, Australia
E-mail: {xzhang,potter}@cse.unsw.edu.au
Ongoing work attempts to model concurrent object systems using process algebra. The behaviour of an object can be described as the composition of a process representing the basic functionality of the object and separate processes controlling the concurrent behaviour of that object. However, familiar bisimulations, including the weak barbed equivalence, are too strong to capture the behavioural equivalence between object components. This paper proposes the responsive bisimulation, an even weaker bisimulation relation which considers that delaying an incoming message locally has the same effect as delaying it externally, as long as potential interference by competing receptors is avoided. With this bisimulation, an equivalence between the \pi-calculus expression (\nu n)(m.\bar{n}|k.n.P) and k.m.P can then be achieved. The responsive bisimulation is a congruence for the family of processes which model objects.
0202 Avoiding Useless Packet Transmission for Multimedia over IP Networks: The Case of Single Congested Link Jim Wu and Mahbub Hassan
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {jimw,mahbub}@cse.unsw.edu.au
When the packet loss rate exceeds a given threshold, received audio and video become unintelligible. A congested router transmitting multimedia packets, while inflicting a packet loss rate beyond that threshold, is effectively transmitting useless packets. Useless packet transmission wastes router bandwidth when it is needed most. We propose an algorithm to avoid transmission of useless multimedia packets and to allocate the recovered bandwidth to competing TCP flows. We show that the proposed algorithm can be easily implemented in the well-known WFQ and CSFQ fair packet queueing and discarding algorithms. Simulation of a 15-second MPEG-2 video clip over a congested network shows that the proposed algorithm effectively eliminates useless packet transmission and, as a result, significantly improves the throughput and file download times of concurrent TCP connections. For the simulated network, file download time is reduced by 55% for typical HTML files, 36% for typical image files, and up to 30% for typical video files. A peak-signal-to-noise-ratio (PSNR) based analysis shows that the overall intelligibility of the received video is no worse than that received without the proposed useless packet transmission avoidance algorithm. Our fairness analysis confirms that incorporating our algorithm into the fair algorithms (WFQ and CSFQ) has no adverse effect on their fairness performance.
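The decision rule at the heart of this approach can be sketched as follows (a minimal, hypothetical rendering; the names and the example threshold are illustrative, not taken from the report):

    # Hypothetical sketch of the useless-packet-transmission check: if a
    # multimedia flow's fair share of the link implies a loss rate beyond
    # the intelligibility threshold, all of its packets are dropped and
    # the recovered bandwidth goes to competing TCP flows.
    def is_useless(flow_rate, fair_share, loss_threshold):
        """flow_rate, fair_share in bytes/sec; loss_threshold in [0, 1]."""
        if flow_rate <= fair_share:
            return False                    # no loss at all
        loss_rate = 1.0 - fair_share / flow_rate
        return loss_rate > loss_threshold   # beyond threshold: useless

    # A 4 Mb/s video squeezed into a 2 Mb/s fair share loses 50% of its
    # packets; with a 20% intelligibility threshold it is dropped entirely.
    print(is_useless(4e6, 2e6, 0.20))       # -> True
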
0201 Design and Implementation of the L4 Microkernel for Alpha Multiprocessors Daniel Potts, Simon Winwood, Gernot Heiser
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {danielp,sjw,gernot}@cse.unsw.edu.au
This report gives an overview of the techniques used in the multiprocessor implementation of the L4 microkernel on the Alpha processor family. The implementation is designed to be scalable to a large number of processors, which is supported by keeping kernel data processor-local as much as possible, and minimising the use of spinlocks and inter-processor interrupts.
0112 An Efficient IP Matching Tool using Forced Simulation Partha Roop
Department of EEE
University of Auckland, New Zealand
p.roop@auckland.ac.nz

A. Sowmya
School of CSE
University of New South Wales, Australia
sowmya@cse.unsw.edu.au

S. Ramesh
Department of CSE
Indian Institute of Technology, Bombay, India
ramesh@cse.iitb.ac.in

Haifeng Guo
Department of CS
State University of New York
Stony Brook, NY 11794-4400
haifeng@cs.sunysb.edu
Automatic IP (Intellectual Property) matching is key to the reuse of IP cores. This report presents an efficient IP matching algorithm which can check whether a given programmable IP can be {\em adapted} to match a given specification. When such adaptation is possible, the algorithm also generates a device driver (an interface) to adapt the IP. Though simulation-, refinement- and bisimulation-based algorithms exist, they cannot be used to check the adaptability of an IP, which is the essence of reuse. The IP matching algorithm is based on a formal verification technique called {\em forced simulation}. A forced-simulation-based matching algorithm is implemented using a logic programming environment, which provides distinct advantages for encoding such an algorithm. The prototype tool, MatchMaker, has been used to reuse several programmable IPs, achieving on average a 12-fold speedup and a 64% reduction in code size in comparison to a previously published algorithm.
0111 Towards Patterns of Web Services Composition Boualem Benatallah
School of Computer Science
University of New South Wales, Sydney NSW 2052 Australia
Email: boualem@cse.unsw.edu.au
The ability to efficiently and effectively share services on the Web is a critical step towards the development of the on-line economy. Virtually every organisation needs to interact with manifold other organisations in order to request their services. Reciprocally, an organisation providing a service is often required to interact with a large and dynamic set of service requestors. The lack of high-level abstractions and functionalities for Web service integration has triggered a considerable amount of research and development effort. This has resulted in a number of products, standards, frameworks and prototypes addressing sometimes overlapping, sometimes complementary aspects of service integration. In this report we summarise some of the challenges and recent developments in the area of Web service integration, and abstract some of them in the form of software design patterns. Specifically, we present patterns for bilateral service-based interactions, for multilateral service composition, and for the execution of composite services in both centralised and fully distributed environments. The report also shows how these patterns map into a variety of implementation technologies, including object-based approaches (e.g. CORBA and EJB), EAI and ERP suites, cross-enterprise workflows, EDI and XML-based B2B frameworks.
0110 A Dynamically-Balanced Walking Biped Graham Mann*, Bruce Armstrong*, Phil Preston**, Barry Drake**

* School of Information Technology
Murdoch University
South Street Murdoch 6050 Australia

** School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia

E-mail: g.mann@murdoch.edu.au b.armstrong@murdoch.edu.au
philp@cse.unsw.edu.au bdrake@cse.unsw.edu.au
This report describes the mechanical, electronic and software design of a 10-DOF bipedal robot constructed to study the control, parameterisation and automatic expansion of the stability envelope of a complex real-time behaviour, namely dynamically-balanced two-legged walking. The machine is physically complete and demonstrates reasonable reliability in movement control, including dynamically-balanced standing. High-level reinforcement learning code is being developed to extend this to walking. The machine offers a challenging problem domain to the flourishing machine learning community and represents a shift in emphasis, away from learning algorithms that work on simplified, preprocessed, artificial and static data sets towards learning heuristics which deal with noisy, real-time data collected from sensors on a dynamic, real-world system.
0109 Let's Study Whole-Program Cache Behaviour Analytically Xavier Vera and Jingling Xue

Xavier Vera
Institutionen för Datateknik
Mälardalens Högskola
Västerås, Sweden
E-mail: xavier.vera@mdh.se

Jingling Xue
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: jxue@cse.unsw.edu.au
Based on a new characterisation of data reuse across multiple loop nests, we present a method, an implementation and experimental results for analysing the cache behaviour of whole programs with regular computations. Validation against cache simulation using real codes confirms the efficiency and accuracy of our method. The largest program we have analysed, Applu from SPECfp95, has 3868 lines, 16 subroutines and 2565 references. Assuming a 32KB cache with a 32B line size, our method obtains the miss ratio with an absolute error of about 0.8% in about 128 seconds, while the simulator used runs for nearly 5 hours on a 933MHz Pentium III PC. Our method can be used to guide compiler locality optimisations and to improve cache simulation performance.
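For contrast with the analytical approach, a trace-driven simulator of the kind used for validation can be as simple as the following sketch (a direct-mapped cache with the abstract's 32KB/32B parameters; the trace and code are illustrative only):

    # Minimal direct-mapped cache simulator of the kind the analytical
    # method is validated against; the analytical model computes the same
    # miss ratio without replaying the address trace.
    def miss_ratio(trace, cache_bytes=32 * 1024, line_bytes=32):
        n_lines = cache_bytes // line_bytes
        tags = [None] * n_lines
        misses = 0
        for addr in trace:
            line = addr // line_bytes
            slot = line % n_lines
            if tags[slot] != line:          # cold or conflict miss
                tags[slot] = line
                misses += 1
        return misses / len(trace)

    # Example: a stride-4 sweep over a 64 KB array, repeated twice.
    trace = [i * 4 for i in range(16 * 1024)] * 2
    print(miss_ratio(trace))
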
0108 Self-Coordinated and Self-Traced Composite Services with Dynamic Provider Selection B. Benatallah(*), M. Dumas(**), M.-C. Fauvet(*) and H. Paik(*)

(*) School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia

E-mail: mcfauvet@cse.unsw.edu.au

(**) Cooperative Information Systems Research Centre
Queensland University of Technology
GPO Box 2434, Brisbane QLD 4001
The growth of Internet technologies has unleashed a wave of innovations that are having tremendous impact on the way organisations interact with their partners and customers. It has undoubtedly opened new ways of automating Business-to-Business (B2B) collaboration. Unfortunately, as electronic commerce applications are typically autonomous and heterogeneous, connecting and coordinating them in order to build inter-organisational services is a difficult task. To date, the development of integrated B2B services is largely ad-hoc, time-consuming and requires an enormous amount of low-level programming. This approach is not only tedious, but also hardly scalable, because of the volatility of the Internet and the dynamic nature of business alliances. In this paper, we consider the efficient composition and execution of B2B services. Specifically, we present a framework through which services can be declaratively composed, and the resulting composite services can be executed in a \textit{decentralised} way within a dynamic environment. The underlying execution model supports the incremental collection of the execution trace of each composite service instance. These traces are particularly useful for customer feedback, and for detecting malfunctions in the constitution of a composite service.
0107 Analysing Cache Memory Behaviour for Programs with IF Statements Xavier Vera and Jingling Xue

Xavier Vera
Institutionen för Datateknik
Mälardalens Högskola
Västerås, Sweden
E-mail: xavier.vera@mdh.se

Jingling Xue
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: jxue@cse.unsw.edu.au
Cache memories are widely used to hide the increasing speed gap between main memory and processors. Several methods have been proposed to analyse cache behaviour in order to increase performance. Many of these methods are based on trace-driven simulators, which are quite slow and do not provide all the information needed by compilers. Analytical methods have been developed to overcome these problems; unfortunately, one of their main drawbacks is that they cannot analyse codes with IF statements. We propose an analytical method that analyses perfectly nested loops with IF statements. Applying compiler techniques such as loop sinking allows us to analyse imperfectly nested loops as well. We have analysed different benchmarks, including SPECfp, the Perfect Suite, the Livermore kernels and Linpack. Our analysis shows that we can analyse 17% more loop nests, obtaining very accurate results.
0106 Code Search based on CVS Comments: A Preliminary Evaluation Annie Chen, Yun Ki Lee, Andrew Y. Yao, Amir Michail


School of Computer Science and Engineering,
The University of New South Wales, Sydney 2052, Australia,
{anniec,s2251001,andrewy,amichail}@cse.unsw.edu.au
We have built a tool, CVSSearch, that searches for fragments of source code by using CVS comments. (CVS is a version control system that is widely used in the open source community.) Our search tool takes advantage of the fact that a CVS comment typically describes the lines of code involved in the commit, and that this description will typically hold for many future versions. This paper provides a preliminary evaluation of this technique, conducted with 74 students at the University of New South Wales. Among our findings, CVS comments do provide a valuable source of information for code search that complements --- but does not replace --- tools that simply search the source code itself (e.g., grep).
0105 A Platform for Portable and Embedded Systems Research Adam Wiggins
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: awiggins@cse.unsw.edu.au
The PLEB project is a student-run project aimed at stimulating portable and embedded systems research within the school. This report outlines the project's activities and some of the experience gained in developing the first hardware platforms. The report also sketches the details of the second-generation PLEB hardware platforms and the project's future direction.
0104 L4 Reference Manual --- Alpha 21x64 Daniel Potts, Simon Winwood and Gernot Heiser

School of Computer Science and Engineering,
The University of New South Wales, Sydney 2052, Australia,
{danielp,sjw,gernot}@cse.unsw.edu.au
This document describes release 2.0 of the L4 microkernel for the Alpha microprocessor family. The kernel ABI is mostly compatible with the MIPS R4x00 version, but provides full multiprocessor support.
0103 A Component Architecture for System Extensibility Antony Edwards and Gernot Heiser

School of Computer Science and Engineering,
The University of New South Wales, Sydney 2052, Australia,
{antonye,gernot}@cse.unsw.edu.au
Component-based programming has shown itself to be a natural way of constructing extensible software. Well-defined interfaces, encapsulation, late binding and polymorphism promote extensibility, yet despite this synergy, components have not been widely employed at the systems level. This is primarily due to the failure of existing component technologies to provide the protection and performance required of systems software. This thesis presents the design, implementation and performance of a component model for system extensions that allows users to create and customise system services. Effective access control is a crucial feature of any system; in an extensible system, where potentially any user can create and modify system services, access control is even more critical. Despite the increasing importance of access control due to extensibility and increased connectivity, the protection mechanisms provided by existing component systems remain primitive and ad hoc. The thesis therefore also presents the design, implementation and performance of a complete access control model for extensible systems.
0101 CVSSearch: Searching through Source Code using CVS Comments Annie Chen, Eric Chou, Joshua Wong, Andrew Y. Yao, Qing Zhang,
Shao Zhang, Amir Michail

School of Computer Science and Engineering,
The University of New South Wales, Sydney 2052, Australia,
{anniec,tzuchunc,joshuaw,andrewy,qzha132,shaoz,amichail}@cse.unsw.edu.au
CVSSearch is a tool that searches for fragments of source code by using CVS comments. CVS is a version control system that is widely used in the open source community. Our search tool takes advantage of the fact that a CVS comment typically describes the lines of code involved in the commit and this description will typically hold for many future versions. In other words, CVSSearch allows one to better search the most recent version of the code by looking at previous versions to better understand the current version.
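The mapping that makes this work, from each line of code to the comments of the commits that touched it, can be sketched as follows (a hypothetical toy rendering; the function names and data layout are illustrative, not CVSSearch's implementation):

    # Hypothetical sketch of the CVSSearch idea: associate each line of a
    # file with the comments of the commits that touched it, then search
    # those comments instead of (or alongside) the code itself.
    from collections import defaultdict

    def build_line_index(commits):
        """commits: iterable of (comment, file, line_numbers) triples,
        e.g. parsed from 'cvs log' output and diffs.
        Returns a map from (file, line) to the list of commit comments."""
        index = defaultdict(list)
        for comment, path, lines in commits:
            for line_no in lines:
                index[(path, line_no)].append(comment)
        return index

    def search(index, query):
        """Return (file, line) pairs whose commit comments mention query."""
        q = query.lower()
        return sorted({loc for loc, comments in index.items()
                       if any(q in c.lower() for c in comments)})

    idx = build_line_index([
        ("fix off-by-one in scrollbar drag", "scrollbar.cpp", [120, 121]),
        ("add keyboard shortcuts", "menu.cpp", [42]),
    ])
    print(search(idx, "scrollbar"))   # -> matching (file, line) pairs
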
0007 Jigsaw: the unsupervised construction of spatial representations Mark W. Peters and Barry Drake
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: markpeters@cse.unsw.edu.au bdrake@cse.unsw.edu.au
A fundamental assumption in machine vision is that the spatial arrangement of pixels is given. In challenging this assumption we have utilised a general relationship that exists between space and behaviour. This relationship presents itself as spatial redundancy, which other researchers have considered problematic. We present a mathematical model and empirical investigations into this relationship and develop an algorithm, JIGSAW, which uses it to build spatial representations. The philosophy underpinning JIGSAW takes signal behaviour, rather than position, as primary. JIGSAW is an unsupervised learning algorithm that is efficient in time and space and that makes minimal assumptions about its operating domain. This algorithm offers engineering potential, opportunities in the understanding of biological vision, and a contribution to the wider field of cognitive science.
0006 EPDL: A Logic for Causal Reasoning Dongmo Zhang and Norman Foo
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: dongmo@cse.unsw.edu.au, norman@cse.unsw.edu.au
The contribution of this paper is twofold. First, we present an extended system $EPDL$ of propositional dynamic logic that allows a proposition as a modality, in order to represent and specify indirect effects of actions and causal propagation. An axiomatic deductive system is given which is sound and complete with respect to the corresponding semantics. The resulting system provides a unified treatment of direct and indirect effects of actions. Second, we reduce $EPDL$ to a multimodal logic by deleting the action component, in order to obtain an axiomatised logical system for causal propagation. A characterisation theorem of the logic is given, and properties of causal reasoning with the logic are discussed.
0005 Specifications for End to End IP Rate Control (Version 1.0) Abdul Aziz Mustafa, Mahbub Hassan and Sanjay Jha
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {amustafa,mahbub,sjha}@cse.unsw.edu.au
Currently no network-level flow control exists in IP-based networks. In a recent paper [Adcom2000], we proposed a network-level flow control architecture called End-to-End IP Rate Control. The motivation behind IP Rate Control is to provide a new network service that gives users fast access to any unused network resources (buffer space, link bandwidth). This report details the specifications of the IP Rate Control architecture, which can be used to implement the service on a given networking platform.
0004 A Formal Approach to Component Based Development of Embedded Systems Partha S. Roop, A. Sowmya
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {proop,sowmya}@cse.unsw.edu.au
S. Ramesh
Department of Computer Science and Engineering
Indian Institute of Technology
Bombay, India - 400 076
E-mail: ramesh@cse.iitb.ernet.in
Component reuse techniques have been the recent focus of research as they are seen as the next generation techniques to handle increasing system complexities. However, there are several unresolved issues to be addressed and prominent among them is the issue of component matching. As the number of reusable components in a component database grows, the task of manually matching a component to the user requirements will be infeasible. Automating this matching can help in rapid system prototyping, improve quality and reduce cost. In addition, if the matching algorithm is sound, this approach can also reduce precious validation effort. In this paper, we propose an algorithm for automatic matching of a design function to a device from a component database. The distinguishing feature of the algorithm is that when successful, it generates an interface which can automatically adapt the device to behave as the function. The algorithm is based on a new simulation relation called forced simulation which is shown to be a necessary and sufficient condition for component matching to be possible for a given pair of function and device. We demonstrate the application of the algorithm by reusing two system level Intel chips.
0003 The Logical Validation of Mathematical Diagrammatic Proofs Christina L. Jenkin
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: tina@alum.rpi.edu
Diagrams have been used for problem solving for thousands of years, but have only recently had a resurgence into mainstream science, with applications in cognitive science, artificial intelligence, computer science, physics, mathematics, and other disciplines. Diagrammatic reasoning has been defined as ``the understanding of concepts and ideas by the use of diagrams and imagery, as opposed to linguistic or algebraic representations.'' This paper aims to introduce the reader to diagrammatic reasoning, specifically in the area of diagrammatic proofs, and to logically validate the soundness of the construction steps in a diagrammatic proof, in the hope of helping to develop a theoretical basis for computing directly with diagrammatic representations. This is accomplished through an analysis of diagrammatic proofs of geometric theorems and a study of some problematic proofs in this area. In addition, we show a proof of the equivalence of the two current solutions to the problem of generalization, and a link between diagrammatic proofs and traditional theories of computation, such as fixed points, invariants, and continuations. In essence, this paper intends to advance the understanding of what is involved in diagrammatic proofs, why they work, and why they sometimes do not work, as well as to show that diagrams alone can be regarded as legitimate (or even desirable) proofs in the area of geometric theorems. We hope this will open new opportunities for study in the justification, and in later work the automation, of diagrammatic proofs.
0001 Measure for Vector Approximation Benjamin Briedis
The School of Computer Science & Engineering
The University of New South Wales
UNSW Sydney 2052
Australia
E-mail: bbriedis@cse.unsw.edu.au
The creation of vector approximations for use in similarity searching (also known as retrieval of the k nearest neighbours) is examined. A measure is derived that is suitable for judging the quality of a set of vector approximations. This measure is used in the modification of a technique used in similarity searching known as the VA-file. The modified VA-file is evaluated, and a clear improvement in performance is demonstrated.
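The VA-file pattern of search, filter on compact approximations, then refine on the survivors, can be sketched as follows (a hypothetical illustration; a crude cell-distance proxy stands in for the VA-file's true lower and upper distance bounds, and all names are assumptions):

    # Hypothetical sketch of VA-file-style k-NN search: vectors are
    # quantised to a few bits per dimension; a query first scans the small
    # approximations to discard most vectors, then computes exact
    # distances only for the surviving candidates.
    import numpy as np

    def quantise(vectors, bits=4):
        """Quantise each dimension onto a uniform grid of 2**bits cells."""
        lo, hi = vectors.min(axis=0), vectors.max(axis=0)
        cells = (1 << bits) - 1
        return np.round((vectors - lo) / (hi - lo + 1e-12) * cells), lo, hi

    def knn(query, vectors, k=3, bits=4):
        approx, lo, hi = quantise(vectors, bits)
        q = (query - lo) / (hi - lo + 1e-12) * ((1 << bits) - 1)
        proxy = np.linalg.norm(approx - q, axis=1)    # filtering pass
        cand = np.argsort(proxy)[:4 * k]              # keep a safety margin
        exact = np.linalg.norm(vectors[cand] - query, axis=1)  # refinement
        return cand[np.argsort(exact)[:k]]

    rng = np.random.default_rng(0)
    data = rng.random((10_000, 16))
    print(knn(rng.random(16), data))
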
9908 An approach to formalising relationships between speaker-relative and absolute spatial reference systems Jane Brennan, William Wilson
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {jbrennan,billw}@cse.unsw.edu.au
There has been extensive empirical research in cognitive linguistics exploring the different reference systems used to describe spatial situations across different cultures. It has been suggested that speaker-relative and absolute spatial reference systems are interchangeable. This report discusses speaker-relative and absolute spatial reference systems in terms of left-right systems and cardinal directions, and proposes an approach to formalising some of the relationships between the two. We plan to extend this limited formalisation in a way that could enable automatic interchange between the discussed systems of reference within the (human-computer) interface of Geographic Information Systems. Natural language interfaces to navigation systems could be a very interesting application of such an extended formalisation. Keywords: Spatial Reasoning, Cognitive Structure of Spatial Knowledge, Spatial Reference Systems
9907 "Boosting" Stumps from Positive Only Data Andrew R. Mitchell
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: andrewm@cse.unsw.edu.au
Most current learning algorithms require both positive and negative data. This is also the case for many of the recent ensemble learning techniques. Applications of boosting, for example, rely on both positive and negative data to produce a hypothesis with high predictive accuracy. In this technical report, a learning methodology is presented that does not rely on negative examples. A learning method in this framework is described which shows remarkable similarities to boosting stumps. This is all the more surprising because learning from positive data has traditionally turned out to be very difficult. Empirical results show that this technique successfully boosts stumps from positive data by paying only a small price in accuracy compared to learners that have access to both positive and negative data. Some theoretical justification of the results is also provided.
9906 Fast Address-Space Switching on the StrongARM SA-1100 Processor Adam Wiggins, Gernot Heiser
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {awiggins,gernot}@cse.unsw.edu.au
The StrongARM SA-1100 is a high-speed low-power processor aimed at embedded and portable applications. Its architecture features virtual caches and TLBs which are not tagged by an address-space identifier. Consequently, context switches on that processor are potentially very expensive, as they may require complete flushes of TLBs and caches. This report presents the design of an address-space management technique for the StrongARM which minimises TLB and cache flushes and thus context switching costs. The basic idea is to implement the top-level of the (hardware-walked) page-table as a cache for page directory entries for different address spaces. This allows switching address spaces with minimal overhead as long as the working sets do not overlap. For small (<=32MB) address spaces further improvements are possible by making use of the StrongARM's re-mapping facility. Our technique is discussed in the context of the L4 microkernel in which it will be implemented.
9905 Splice-2 Comparative Evaluation: Electricity Pricing Michael Harries
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: mbh@cse.unsw.edu.au
Splice-2 is a machine learning method designed for batch learning in domains with hidden changes in context. This report characterises the performance of Splice-2 on a real world dataset in comparison with C4.5, an on-line learner (emulated by C4.5), and an unsupervised learning system. Two experiments are reported, both using electricity prices from the Australian state of New South Wales as the classification domain. The first experiment uses permutations of the dataset to explore the difficulties of using C4.5 to either identify hidden contexts, or to achieve the same accuracy as Splice-2 on this domain. The dataset permutations are also used to characterise some strengths and weaknesses of Splice-2. The second experiment uses permutations of an extended dataset from the same domain to examine, for both C4.5 and Splice-2, the effects of adding additional known attributes. We find that C4.5 cannot induce the hidden contexts found by Splice-2. Further, Splice-2 generally provides more accurate results than C4.5. The exceptions occur when the order of the dataset is destroyed, or where new attributes are very similar to time. The best results from C4.5 are due to additional work on the part of the data analyst. One of the promises of Splice-2 is to reduce the level of additional work required of the analyst in such domains. We also find that a state-of-the-art conceptual clustering method does not identify the hidden context.
9904 The Temporal Calculus of Conditional Objects and Conditional Events Jerzy Tyszkiewicz
Institute of Informatics
University of Warsaw
E-mail: jty@cse.unsw.edu.au

Arthur Ramer, Achim Hoffmann
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {ramer, achim}@cse.unsw.edu.au
We consider the problem of defining conditional objects (a|b), which would allow one to regard the conditional probability Pr(a|b) as the probability of a well-defined event, rather than as a shorthand for Pr(ab)/Pr(b). The next issue is to define boolean combinations of conditional objects, and possibly also an operator of further conditioning. These questions have been investigated at least since the times of George Boole, leading to a number of formalisms proposed for conditional objects, mostly of a syntactical, proof-theoretic vein. We propose a unifying, semantical approach, in which conditional events are (projections of) Markov chains, definable in the three-valued extension of the past tense fragment of propositional linear time logic (TL), or, equivalently, by three-valued counter-free Moore machines. Thus our conditional objects are indeed stochastic processes, one of the central notions of modern probability theory. Our model precisely fulfills early ideas of de Finetti and, moreover, as we show in a separate paper, all the previously proposed algebras of conditional events can be isomorphically embedded in our model.
9903 Forced Simulation and Lock-Step Interface: A Formal Approach to Automatic Component Matching Partha S. Roop, A. Sowmya
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {proop,sowmya}@cse.unsw.edu.au
S. Ramesh
Department of Computer Science and Engineering
Indian Institute of Technology
Bombay, India - 400 076
E-mail: ramesh@cse.iitb.ernet.in
Component-based synthesis of embedded systems will lead to the reuse of a vast library of hardware and software components and also facilitate rapid prototyping. However, it is still little practised, a primary reason being the lack of a systematic attempt at developing automatic component identification algorithms. The main task of such an algorithm is to map a design function to a device from a library of system-level components. In this paper, we propose a novel notion of simulation called forced simulation to formalize the correspondence between a function and a device. What distinguishes forced simulation from other techniques is the idea of forcing via an external interface, which can be automatically synthesized and is useful for adapting the system-level component to the given design functionality. We propose a new component matching algorithm based on forced simulation, together with a technique for the automatic generation of the interface. Finally, a proof of soundness of the approach is presented, based on reducing the synchronous parallel composition of the interface and the device to Milner's weak bisimulation.
9902 Data Spread: A Novel Authentication And Security Technique John Zic
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: johnz@serg.cse.unsw.edu.au
This paper describes an authentication and security protocol called data spread, for use on the Internet. The protocol applies address space diversity to outgoing messages and, when combined with reasonable (but not necessarily strong) encryption techniques, offers fast, secure and authenticable information exchange between communicating entities.
9901 Forced Simulation: A Formal Approach to Component-Based Synthesis Partha S. Roop, A. Sowmya
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {proop,sowmya}@cse.unsw.edu.au
S. Ramesh
Department of Computer Science and Engineering
Indian Institute of Technology
Bombay, India - 400 076
E-mail: ramesh@cse.iitb.ernet.in
Embedded systems are application-specific digital systems which are normally designed using a microprocessor along with a set of programmable hardware and software components. Component-based synthesis of these systems will lead to the reuse of a vast library of hardware and software components and also facilitate rapid prototyping. However, component-based synthesis is still little practised, a primary reason being the lack of any systematic attempt at developing automatic component identification algorithms. In [mitra96] an algorithm to map a design function to a device from a library of system-level components was proposed; however, it was not based on a formal setting, and no proof of correctness was presented. In this paper, we propose a novel notion of simulation called forced simulation to formalize the correspondence between a function and a device. What distinguishes forced simulation from other techniques is the idea of forcing via an external interface, which can be automatically synthesized and is useful for adapting the system-level component to the given design functionality. We propose two different types of forced simulation, depending on the handling of internal events.
9808 Statistical Properties of Simple Types Małgorzata Moczurad, Marek Zaionc
Computer Science Department
Jagiellonian University
Nawojki 11, 30-072 Krakow, Poland
E-mail: {madry,zaionc}@softlab.ii.uj.edu.pl

Jerzy Tyszkiewicz
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
On leave from:
Institute of Informatics
University of Warsaw
Banacha 2, 02-097 Warszawa, Poland
E-mail: jty@mimuw.edu.pl
We consider types and typed lambda calculus over a finite number of ground types. We investigate the fraction of inhabited types of length n among all types of length n, and find the limit of that fraction as $n \to \infty$. The answer to this question is equivalent to finding the ``density'' of inhabited types in the set of all types, or the so-called asymptotic probability of finding an inhabited type in the set of all types. Under the Curry-Howard isomorphism this means finding the density or asymptotic probability of provable intuitionistic propositional formulas in the set of all formulas. For types with one ground type (formulas with one propositional variable) we prove that the limit exists and is equal to 1/2 + \sqrt{5}/10, which is approximately 72%. This means that a random type (formula) of large size is inhabited (a tautology) with probability of about 72%. We also prove that for every finite number k of ground-type variables, the density of inhabited types is always positive and lies between (4k+1)/(2k+1)^2 and (3k+1)/(k+1)^2. Therefore the density decreases to 0 as k goes to infinity. From the lower and upper bounds presented we can deduce that at least 1/3 of classical tautologies are intuitionistic.
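Restated in display form, the abstract's figures are (LaTeX notation; the k = 1 evaluations are simple arithmetic on the stated bounds):

    % Density of inhabited types, restated from the abstract.
    \lim_{n \to \infty}
      \frac{\#\{\text{inhabited types of length } n\}}
           {\#\{\text{types of length } n\}}
      = \frac{1}{2} + \frac{\sqrt{5}}{10} \approx 0.7236
      \qquad \text{(one ground type)}

    \frac{4k+1}{(2k+1)^2} \;\le\; d_k \;\le\; \frac{3k+1}{(k+1)^2}
      \qquad \text{($k$ ground types)}

Both bounds tend to 0 as $k \to \infty$; for $k = 1$ they evaluate to $5/9 \approx 0.556$ and $1$, consistent with the exact limit above.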
9806 A General Architecture for Supervised Classification of Multivariate Time Series Mohammed Waleed Kadous
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: waleed@cse.unsw.edu.au
Supervised classification has been one of the most active areas of machine learning research. However, the domains where it has been applied are relatively limited. In particular, much of the work has focused on classification in static domains, where the attributes of the training examples are assumed not to change over time. In many domains, attributes are not static; in fact, it is the way they vary temporally that can make classification possible. Examples of such domains include speech recognition, event recognition from sensors in robotics, and analysis of electrocardiographs. So far, researchers tackling these domains have used ad-hoc techniques for converting the problem to a standard classification task. This fails to take into account both the special problems and the special heuristics applicable to temporal data. This paper proposes a general architecture for classification of multivariate time series. Training proceeds in five steps: extraction of events from the data; training based on parametrised event primitives; clustering of the events in their parameter space to create synthetic events; event attribution of the training data; and finally building a classifier with a conventional learner. Recognition takes two steps: selective searching for synthetic events within the test instance (usually only a small subset of the synthetic events generated in training need to be searched for), then feeding the result through the classifier created in the training stage. An example implementation of this general architecture is presented. Some preliminary results of its application to the recognition of signs from Australian Sign Language are also discussed. Keywords: machine learning, classification, temporal classification, gesture recognition, time series.
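The shape of the architecture can be captured in a short skeleton (hypothetical; the stage functions are pluggable stand-ins, not the paper's concrete choices):

    # Hypothetical skeleton of the architecture: five training steps and
    # two recognition steps; `extract`, `cluster`, `attribute` and
    # `learner` are stand-ins for the paper's concrete components.
    def train(series_set, labels, extract, cluster, attribute, learner):
        # Steps 1-2: extract events, parametrised by event primitives
        # (the primitive fitting is assumed to live inside `extract`).
        events = [e for series in series_set for e in extract(series)]
        # Step 3: cluster events in parameter space -> synthetic events.
        synthetic = cluster(events)
        # Step 4: event attribution of the training data.
        features = [attribute(series, synthetic) for series in series_set]
        # Step 5: build a classifier with a conventional learner.
        return learner(features, labels), synthetic

    def recognise(series, classifier, synthetic, attribute):
        # Selective event search, then feed through the trained classifier.
        return classifier.predict([attribute(series, synthetic)])
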
9804 Page Tables for 64-Bit Computer Systems Kevin Elphinstone, Gernot Heiser
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
E-mail: {kevine,gernot}@cse.unsw.edu.au

Jochen Liedtke
IBM TJ Watson Research Center
30 Saw Mill River Rd
Hawthorne NY, 10532, USA
E-mail: jochen@us.ibm.com
Most modern wide-address computer architectures do not prescribe a page table format, but instead feature a software-loaded TLB, which gives the operating system complete flexibility in the implementation of page tables. Such flexibility is necessary, as to date no single page table format has been established to perform best under all loads. With the recent trend towards kernelised operating systems, which rely heavily on mapping operations for fast data movement across address spaces, demands on page tables become more varied, and hence less easy to satisfy with a single structure. This paper examines the issue of page tables suitable for 64-bit systems, particularly systems based on microkernels. We have implemented a number of candidate page table structures in a fast microkernel and have instrumented the kernel's TLB miss handlers. We have then measured the kernel's performance under a variety of benchmarks, simulating loads imposed by traditional compact address spaces (typical of UNIX systems) as well as the sparse address spaces typical of microkernel-based systems. The results show that guarded page tables, together with a software TLB cache, do not perform significantly worse than any of the other structures, and clearly outperform the other structures where the address space is used very sparsely.
9801 L4 User Manual Alan Au, Gernot Heiser
Department of Computer Systems
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia



E-mail: {alanau,gernot}@cse.unsw.edu.au
This document is a user manual for the L4 micro-kernel. It gives an introduction to the main concepts and features of L4, and explains their use through a number of examples. The manual is generally platform-independent; however, the examples are based on the C interface for L4/MIPS, and actual system call C bindings and data formats differ slightly on other platforms. This document supplements, rather than replaces, the L4 Reference Manual, and anyone intending to write applications on top of L4 should obtain the L4 Reference Manual for their particular platform.
9709 L4 Reference Manual --- MIPS R4x00, Version 1.0, Kernel Version 70 Kevin Elphinstone, Gernot Heiser
Department of Computer Systems
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
{kevine,gernot}@cse.unsw.edu.au

and

Jochen Liedtke
IBM T. J. Watson Research Center
30 Saw Mill River Road
Hawthorne, NY 10532, USA
jochen@watson.ibm.com
This document describes the MIPS R4x00 implementation of the L4 microkernel. Specifically it describes Version 70 of the kernel, which is the first version made available outside UNSW. The manual is based on the L4/x86 reference manual Version 2.0 by Jochen Liedtke, and the implementation is mostly compatible with that Intel version. Differences and present implementation limitations are pointed out by "implementation notes."
9708 Extracting Hidden Context Michael Harries, Claude Sammut
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia

Kim Horn
RMB Australia Limited
Level 5 Underwood House
37-47 Pitt Street
Sydney 2000 Australia


E-mail: mbh@cse.unsw.edu.au
Concept drift due to hidden changes in context complicates learning in many domains, including financial prediction, medical diagnosis, and network performance. Existing machine learning approaches to this problem use an incremental, on-line learning paradigm. Batch, off-line learners tend to be ineffective in domains with hidden changes in context, as they assume that the training set is homogeneous. An off-line, meta-learning approach to the identification of hidden context is presented. The new approach uses an existing batch learner and the process of \emph{contextual} clustering to identify stable hidden contexts and the associated context-specific, locally stable concepts. The approach is broadly applicable to the extraction of context reflected in time and spatial attributes. Several algorithms for the approach are presented and evaluated, and a successful application of the approach to a complex control task is also presented.
9707 Performance Evaluation for Parallel Systems: A Survey Lei Hu and Ian Gorton
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia



E-mail: {lei, iango}@cse.unsw.edu.au
Performance is often a key factor in determining the success of a parallel software system. Performance evaluation techniques can be classified into three categories: measurement, analytical modeling, and simulation, each of which comes in several varieties. For example, measurement may be software-, hardware-, or hybrid-based; simulation may be discrete-event, trace/execution-driven, or Monte Carlo; and analytical modeling may use queueing networks, Petri nets, etc. This paper systematically reviews these techniques and surveys work done in each category. Also addressed are other issues related to performance evaluation, including how to select metrics and techniques well suited to a particular development stage, how to construct a good model, and how to perform workload characterization. We also present fundamental laws and scalability analysis techniques. While many of the techniques discussed are common to both sequential and parallel system performance evaluation, our focus is on parallel systems.
9705 Resource Management in the Mungi Single-Address-Space Operating System Gernot Heiser, Fondy Lam, Stephen Russell
Department of Computer Systems
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia

E-mail: G.Heiser@unsw.edu.au
We present the accounting system used for backing store management in the Mungi single-address-space operating system. The model is designed such that all accounting can be done asynchronously to operations on storage objects, and hence without slowing down such operations. It is based on bank accounts from which rent is collected for the storage occupied by objects. Rent automatically increases as available storage runs low, forcing users to release unneeded storage. Bank accounts receive income, with a taxation system being used to prevent excessive buildup of funds on underutilised accounts. The accounting system is mostly implemented at user level, with minimal support from the kernel. As a consequence, the accounting model can be changed without modifying the Mungi kernel.
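The rent mechanism lends itself to a compact sketch (hypothetical names and rate function; the report's abstract does not specify a formula):

    # Hypothetical sketch of Mungi-style storage accounting: rent per byte
    # rises as free storage shrinks, and is collected from bank accounts
    # periodically and asynchronously, so operations on storage objects
    # are never slowed down by accounting.
    def rent_rate(base_rate, capacity, used):
        """Rent per byte per period, increasing as storage fills up."""
        utilisation = used / capacity
        return base_rate / max(1.0 - utilisation, 0.01)  # soars when nearly full

    def collect_rent(accounts, holdings, base_rate, capacity, used):
        """accounts: account -> balance; holdings: account -> bytes held.
        Run by a user-level daemon, off the object-operation path."""
        rate = rent_rate(base_rate, capacity, used)
        for acct, size in holdings.items():
            accounts[acct] -= rate * size   # exhausted accounts must shed storage
        return accounts
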
9704 Implementation and Performance of the Mungi Single-Address-Space Operating System Gernot Heiser, Kevin Elphinstone, Jerry Vochteloo, Stephen Russell
Department of Computer Systems
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia

Jochen Liedtke
IBM T. J. Watson Research Center
30 Saw Mill River Road
Hawthorne, NY 10532, USA



E-mail: {gernot,jerry,kevine,smr}@cse.unsw.edu.au
jochen@watson.ibm.com
Single-address-space operating systems (SASOS) are an attractive model for making the best use of the wide address space provided by the latest generations of microprocessors. SASOS remove the address space borders which make data sharing between processes difficult and expensive in traditional operating systems. This offers the potential of significant performance advantages for applications where sharing is important, such as object-oriented databases or persistent programming systems. Previously published SASOS were not able to demonstrate these performance advantages. We have built the Mungi system to show that these advantages can indeed be realized. Mungi is a very ``pure'' SASOS, featuring an unintrusive protection model based on sparse capabilities and a fast protected procedure call mechanism; it uses virtual memory as the exclusive inter-process communication mechanism, as well as for I/O. We believe the simplicity of our model makes it easy to implement efficiently on conventional architectures. Our realization of Mungi for the MIPS R4600 64-bit microprocessor, based on our implementation of the L4 microkernel, is presented. Mungi is shown to outperform a well-tuned commercial operating system in several important aspects, such as task creation and inter-process communication, and on the OO1 object-oriented database benchmark. This demonstrates clearly that the SASOS concept is viable, and that a well-designed microkernel is an excellent base on which to build high-performance operating systems.
9703 Estimator Variance in Reinforcement Learning: Theoretical Problems and Practical Solutions Mark D. Pendrith and Malcolm R.K. Ryan
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia


E-mail: {pendrith,malcolmr}@cse.unsw.edu.au
In reinforcement learning, as in many on-line search techniques, a large number of estimation parameters (e.g. Q-value estimates for 1-step Q-learning) are maintained and dynamically updated as information comes to hand during the learning process. Excessive variance of these estimators can be problematic, resulting in uneven or unstable learning, or even making effective learning impossible. Estimator variance is usually managed only indirectly, by selecting global learning algorithm parameters (e.g. lambda for TD(lambda) based methods) that are a compromise between an acceptable level of estimator perturbation and other desirable system attributes, such as reduced estimator bias. In this paper, we argue that this approach may not always be adequate, particularly for noisy and non-Markovian domains, and present a direct approach to managing estimator variance, the new ccBeta algorithm. Empirical results in an autonomous robotics domain are also presented showing improved performance using the ccBeta method.
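
The abstract does not define ccBeta's adaptation rule; the sketch below only shows where a per-estimator step size would slot into standard 1-step Q-learning. The beta table and its initial value are illustrative assumptions.

    # One-step Q-learning update. ccBeta's actual adaptation rule is not
    # given in the abstract; beta[(s, a)] is simply a per-estimator step
    # size slot that a ccBeta-style method would adjust from observed
    # error statistics, instead of using one global learning rate.
    from collections import defaultdict

    Q = defaultdict(float)
    beta = defaultdict(lambda: 0.1)   # per-(state, action) learning rate

    def q_update(s, a, r, s_next, actions, gamma=0.99):
        target = r + gamma * max(Q[(s_next, b)] for b in actions)
        error = target - Q[(s, a)]
        Q[(s, a)] += beta[(s, a)] * error   # variance-aware beta goes here
        return error
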
9702 An Analysis of non-Markov Automata Games: Implications for Reinforcement Learning Mark D. Pendrith
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia

and

Michael J. McGarity
School of Electrical Engineering
University of New South Wales
Sydney 2052 Australia



E-mail: {pendrith,mikem}@cse.unsw.edu.au
It has previously been established that for Markov learning automata games, the game equilibria are exactly the optimal strategies. In this paper, we extend the game theoretic view of reinforcement learning to consider the implications for ``group rationality'' in the more general situation of learning when the Markov property cannot be assumed. We show that for a general class of non-Markov decision processes, if actual return (Monte Carlo) credit assignment is used with undiscounted returns, the optimal observation-based policies are still guaranteed to be game equilibria when using the standard ``direct'' reinforcement learning approaches; but if either discounted rewards or a temporal differences style of credit assignment is used, this is not the case.
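
The two credit-assignment styles being contrasted can be made concrete with a short sketch; the state-value form, step sizes, and episode format here are illustrative, not the paper's setup.

    # Contrast of the two credit-assignment styles the paper compares:
    # actual-return (Monte Carlo) updates with undiscounted returns,
    # versus one-step temporal-difference updates with a discount factor.
    from collections import defaultdict

    def monte_carlo_update(V, episode, alpha=0.1):
        """episode: list of (state, reward); undiscounted actual returns."""
        G = 0.0
        for state, reward in reversed(episode):
            G += reward                       # undiscounted return
            V[state] += alpha * (G - V[state])

    def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
        # Bootstrapped, discounted target: the style that, per the paper,
        # loses the equilibrium/optimality guarantee in non-Markov settings.
        V[s] += alpha * (r + gamma * V[s_next] - V[s])

    V = defaultdict(float)
    monte_carlo_update(V, [("s0", 0.0), ("s1", 1.0)])
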
9701 The Mungi Kernel API, Release 1.0 Gernot Heiser, Jerry Vochteloo, Kevin Elphinstone, Stephen Russell
Department of Computer Systems
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia



E-mail: {gernot,jerry,kevine,smr}@cse.unsw.edu.au
This document describes release 1.0 of the application programming interface to the kernel of the Mungi single-address-space operating system. This interface will, in general, only be used by low-level software; most applications are expected to use a higher-level interface implemented as system libraries. Such libraries will be described in separate documents.
9601 Supporting Persistent Object Systems in a Single Address Space Kevin Elphinstone, Stephen Russell and Gernot Heiser
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia

and

Jochen Liedtke
German National Research Center for Information Technology
GMD SET-RS
Schlo{\ss} Birlinghoven
53757 Sankt Augustin
Germany



E-mail: {kevine,smr,gernot}@cse.unsw.edu.au
jochen.liedtke@gmd.de
Single address space systems (SASOS) provide a programming model that is well suited to supporting persistent object systems. In this paper we show that stability can be implemented in the Mungi SASOS without incurring overhead in excess of the inherent cost of shadow-paging. Our approach is based on the introduction of aliasing into the SASOS model and makes heavy use of user-level page fault handlers to allow implementation outside the kernel. We also show how the demands of database systems for control over page residency and physical I/O can be accommodated. An approach to user-level implementation of distributed shared memory (DSM) coherency models is outlined.
9505 Simulation of Large-Area Silicon Solar Cells Gernot Heiser
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia

and

Pietro P. Altermatt
Centre for Photovoltaic Systems and Devices
University of New South Wales
Sydney 2052 Australia



E-mail: G.Heiser@unsw.edu.au
pietro@vast.unsw.edu.au
Two- and three-dimensional numerical modelling has recently become an important tool for the characterisation and optimisation of high-efficiency silicon solar cells. In the past, however, such modelling could only be applied to small sections of the cells. While such limited simulation domains are sufficient for the analysis of bulk and surface properties, the analysis and optimisation of effects like the losses resulting from the resistance of the metal contact grid require a model of the full cell and need to include edge effects. In this paper, we present an approach which combines multi-dimensional device simulation with circuit simulation to produce an accurate model of a full-sized high-efficiency solar cell. We demonstrate the power of this approach by presenting the results of an investigation of the series resistance of "passivated emitter, rear locally diffused" (PERL) silicon solar cells. The insights gained in that study triggered a small design change in the contact geometry, which managed to reduce resistive losses by more than half and contributed to a new efficiency world record.
9504 Single Address Space Operating Systems Tim Wilkinson and Kevin Murray
Systems Architecture Research Centre
City University
Northampton Square
London EC1V 0HB, UK

and

Stephen Russell and Gernot Heiser
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia

and

Jochen Liedtke
German National Research Center for Information Technology
GMD SET-RS
Schlo{\ss} Birlinghoven
53757 Sankt Augustin
Germany



E-mail: {tim,kam}@sarc.city.ac.uk
{smr,gernot}@cse.unsw.edu.au
jochen.liedtke@gmd.de
Single address-space operating systems offer many advantages for modern systems design. We outline in this paper how such a system deals with the issues of memory protection, user-level naming, resource management, and translation management in a large, sparse address space, as well as fault tolerance and reliability. We also explain how a POSIX compliant interface can be supported on such a system.
9503 Guarded Page Tables on the MIPS R4600 Jochen Liedtke
GMD - German National Research Center for Information Technology
GMD SET-RS, Schloss Birlinghoven
53757 Sankt Augustin, Germany

E-mail: jochen.liedtke@gmd.de

Kevin Elphinstone
School of Computer Science and Engineering
The University of New South Wales
Sydney 2052 Australia

E-mail: kevine@vast.unsw.edu.au

Date: 23 November 1995

Communicated by Jayasooriah
The introduction of 64-bit microprocessors has increased the demands placed on virtual memory systems. The availability of large address spaces has led to a flurry of new applications and operating systems that further stress virtual memory systems. Consequently, much interest has recently focussed on translation lookaside buffer (TLB) performance and page table efficiency. Guarded page tables are a mechanism for overcoming some of the problems associated with conventional page tables. Guarded page tables are tree structured like conventional page tables. Also like conventional page tables, they have the advantages of supporting hierarchical operations and sharing of sub-trees. Unlike conventional page tables, guarded page tables implement huge, sparsely occupied address spaces efficiently. We describe guarded page tables and the associated parsing algorithm. R4600 processor-dependent micro-optimisation is undertaken and presented. R4600 TLB refill is discussed in detail, including a comparison of guarded page tables with more conventional page tables. A software second-level TLB is introduced and analysed as a way of increasing guarded page table performance.
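
The parsing idea can be sketched as follows, with node layout and field widths chosen purely for illustration: each node carries a guard that is matched against, and stripped from, the leading bits of the remaining virtual address before the next index bits are used, which is what lets sparsely occupied regions skip tree levels.

    # Sketch of guarded page table parsing (node layout and field widths
    # are illustrative, not the R4600 implementation). A guard lets one
    # node cover many levels of an otherwise mostly-empty tree.
    class Node:
        def __init__(self, guard, guard_len, index_bits, entries):
            self.guard = guard            # bit pattern to match and strip
            self.guard_len = guard_len
            self.index_bits = index_bits  # log2(number of entries)
            self.entries = entries        # child Node or leaf frame number

    def translate(node, vaddr, bits_left):
        while True:
            # Match and strip the guard from the top of the remaining address.
            top = vaddr >> (bits_left - node.guard_len)
            if top != node.guard:
                raise KeyError("page fault: guard mismatch")
            bits_left -= node.guard_len
            vaddr &= (1 << bits_left) - 1
            # Index into the node with the next few bits.
            idx = vaddr >> (bits_left - node.index_bits)
            bits_left -= node.index_bits
            vaddr &= (1 << bits_left) - 1
            entry = node.entries[idx]
            if not isinstance(entry, Node):
                return entry              # leaf: physical frame number
            node = entry

    # Example (4-bit address space for brevity): a single node with guard
    # 0b101 and two entries resolves address 0b1010 to frame 7:
    #   translate(Node(0b101, 3, 1, [7, 9]), 0b1010, 4)  ->  7
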
9502 Checkpointing and Recovery for Distributed Shared Memory Applications Jinson Ouyang and Gernot Heiser
Computer and Systems Technology Laboratory (CaST)
School of Computer Science and Engineering
The University of New South Wales
Sydney 2052 Australia


E-mail: {jinsong,gernot}@cse.unsw.edu.au
This paper proposes an approach for adding fault tolerance, based on consistent checkpointing, to distributed shared memory applications. Two different mechanisms are presented to efficiently address the issue of message losses due to either site failures or unreliable non-FIFO channels. Both guarantee a correct and efficient recovery from a consistent distributed system state following a failure. A variant of the two-phase commit protocol is employed such that the communication overhead required to take a consistent checkpoint is the same as that of systems using a one-phase commit protocol, while our protocol utilises stable storage more efficiently. A consistent checkpoint is committed when the first phase of the protocol finishes.
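
A skeleton of that commit structure, with illustrative class and method names (the paper's message handling and stable-storage details are omitted):

    # Skeleton of the commit structure described in the abstract: a
    # two-phase protocol whose checkpoint is committed as soon as the
    # first phase finishes, so the second phase adds no blocking
    # communication. Classes and method names are illustrative.
    class Participant:
        def __init__(self):
            self.committed = None     # last recoverable checkpoint
            self.tentative = None

        def save_tentative(self, state):
            self.tentative = state    # written to stable storage in reality
            return True               # acknowledgement to the coordinator

    def checkpoint(participants, states):
        # Phase 1: every participant saves a tentative checkpoint and acks.
        if all(p.save_tentative(s) for p, s in zip(participants, states)):
            # Commit point: reached at the end of the FIRST phase, so the
            # blocking cost matches a one-phase protocol.
            for p in participants:
                p.committed, p.tentative = p.tentative, None
        # Phase 2 runs asynchronously: discard superseded checkpoints.
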
9501 Medium Access Control for Synchronous Traffic in the AMNET LAN David Goodall
Computer and Systems Technology Laboratory (CaST)
School of Computer Science and Engineering
The University of New South Wales
Sydney 2052 Australia

and

Keith Burston
Manager, Communications Unit
The University of New South Wales
Sydney 2052 Australia


E-mail: castanets@vast.unsw.edu.au
This report presents the medium access control (MAC) scheme designed for supporting synchronous traffic within the AMNET LAN. Synchronous traffic is supported directly at the MAC layer via a table mechanism, and is carried by two different types of cell - synchronous cells, which carry data from a single transmitter, and shared synchronous cells, which carry data from one or more transmitters. The table mechanism provides guaranteed latency and bandwidth to synchronous cell types. The latency involved in getting real-time data from a source device to the LAN is minimised, thus simplifying devices and making internetworking more feasible. This report is intended to describe the different synchronous cell types, the table mechanism, and algorithms for allocation of bandwidth using the table.
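
A toy model of the table mechanism, with illustrative parameters: reserving evenly spaced slots in a repeating cell table is one way such a scheme bounds both the latency and the bandwidth seen by a synchronous transmitter.

    # Toy model of a MAC table for synchronous traffic (parameters are
    # illustrative): the table describes one repeating cycle of cell
    # slots, and a reservation recurs at a fixed period, which is what
    # guarantees both bandwidth and worst-case latency.
    CYCLE = 16                                  # slots per table cycle

    def allocate(table, transmitter, slots_needed):
        """Reserve evenly spaced slots for a synchronous transmitter."""
        free = [i for i, owner in enumerate(table) if owner is None]
        if len(free) < slots_needed:
            raise RuntimeError("insufficient synchronous bandwidth")
        step = len(free) // slots_needed
        for k in range(slots_needed):
            table[free[k * step]] = transmitter

    table = [None] * CYCLE
    allocate(table, "voice-1", 4)               # 4/16 of cycle bandwidth
    print(table)
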
9413 A Simple, Expressive Real-Time CCS C. Fidge
Software Verification Research Centre,
Department of Computer Science,
The University of Queensland, Queensland 4072,
Australia.
E-mail: cjf@cs.uq.oz.au

J. Zic
Software Engineering Research Group,
School of Computer Science and Engineering,
University of New South Wales, NSW 2052,
Australia.
E-mail: John.Zic@serg.cse.unsw.edu.au
We describe a new `real-time' process algebra with simple semantics but considerable expressive power. It exhibits the advantages of both the `constraint-oriented' and `marker variable' specification styles. The definition extends Milner's CCS, firstly with a simple notion of absolute time added to actions, and then with relative timing expressions which may refer to time markers.
9412 Representing Closed CCS Systems by Petri Nets Jacek Olszewski

Address: Microsoft Institute of Advanced Software Technology
65 Epping Rd, North Ryde, 2113 Australia
(on leave from SCSE, UNSW)

E-mail:jacek@cse.unsw.edu.au
This paper describes and proves a simple transformation of CCS compositions into Petri nets. Under certain conditions, additional to the CCS syntax rules, the resulting Petri nets are finite, and firing of their transitions corresponds to handshakes in CCS compositions. Such correspondence also holds between simultaneous firing of several transitions and multiple handshakes. The transformation has proved useful in a fast deadlock detection tool developed for CCS specifications.
9411 Issues in Implementing Virtual Memory Kevin Elphinstone, Stephen Russell and Gernot Heiser
School of Computer Science and Engineering
The University of NSW
Sydney 2052 Australia

E-mail: kevine@vast.unsw.edu.au

Date: 29 September 1994
Several factors are rapidly increasing the demands being placed on virtual memory implementations. Large address spaces, increasing sparseness, and novel operating systems are not well supported by traditional tree-based page tables. New approaches are needed to overcome these problems. This paper examines the advantages and disadvantages of conventional virtual address translation schemes. It then describes the performance costs caused by recent changes in hardware and operating system architectures. While there is much active research directed towards reducing these costs, it is mostly intended to provide better support for Unix style systems. Many issues are still unresolved, particularly those relating to the support of the large, sparse address spaces used by single address space operating systems.
9410 On Reinforcement Learning of Control Actions in Noisy and Non-Markovian Domains. Mark Pendrith
School of Computer Science and Engineering
The University of New South Wales
Sydney 2052 Australia

E-mail: pendrith@cse.unsw.edu.au

Date: 30 August 1994

Communicated by Claude Sammut
If reinforcement learning (RL) techniques are to be used for ``real world'' dynamic system control, the problems of noise and plant disturbance will have to be addressed. This study investigates the effects of noise/disturbance on five different RL algorithms: Watkins' Q-Learning (QL); Barto, Sutton and Anderson's Adaptive Heuristic Critic (AHC); Sammut and Law's modern variant of Michie and Chambers' BOXES algorithm; and two new algorithms developed during the course of this study. The two new algorithms, called P-Trace and Q-Trace respectively, are conceptually related to QL; both provide substantially faster learning than straight QL overall, and dramatically faster learning (by up to a factor of 200) in the special case of learning in a noisy environment for the dynamic system studied here (a pole-and-cart simulation). As well as speeding learning, both the P-Trace and Q-Trace algorithms have been designed to preserve the ``convergence with probability 1'' formal properties of standard QL, i.e. they are provably ``correct'' algorithms for Markovian domains under the same conditions for which QL is guaranteed to be correct. We present both arguments and experimental evidence that ``trace'' methods may prove to be both faster and more powerful in general than TD (Temporal Difference) methods. The potential performance improvements of trace over pure TD methods may turn out to be particularly important when learning is to occur in noisy or stochastic environments, and in the case where the domain is not well modelled by Markovian processes. A surprising result to emerge from this study is evidence for hitherto unsuspected chaotic behaviour, with respect to learning rates, exhibited by the well-studied AHC algorithm. The effect becomes more pronounced as noise increases.
9409 Tracing Kernel Activity in SunOS 4.0 David Goodall and Stephen Russell This paper describes a software tool for tracing kernel activity within the SunOS 4.0 operating system. The tool is designed as a pseudo-controller with separate pseudo-devices to capture event streams from various parts of the operating system. A high resolution interval timer is attached to the target system's SCSI port in order to provide hardware support for timestamps. Use of the timer also allows measurement of CPU usage by the instrumentation system. E-mail: disy@vast.unsw.edu.au
9407 Time Constrained Buffer Specifications in CSP+T and Timed CSP John J. Zic A finite buffer with time constraints on the rate of accepting inputs, producing outputs and message latency is specified using both Timed CSP and a new real-time specification language, CSP+T. CSP+T adds expressive power to some of the sequential aspects of CSP and allows the description of complex event timings from within a single sequential process. On the other hand, Timed CSP encourages event timing descriptions to be built up in a constraint-oriented manner with the parallel composition of several processes. Although these represent two complementary specification styles, both provide valuable insights into specification of complex event timings. E-mail: johnz@cse.unsw.edu.au
9406 A Parallel Approach to High-Speed Protocol Processing Toong Shoon Chan and Ian Gorton A rapid increase in the transmission bandwidth of optical networks has created a bottleneck in protocol processing at the end systems. This has resulted in the inability of applications and network protocols to exploit the full bandwidth of a high-speed network. This paper presents a parallel architecture that is designed to support high-speed protocol processing. The advent of the T9000 transputer and C104 router technology has provided a platform that is suitable for the construction of a highly parallel and scalable protocol processing architecture based on packet and functional parallelism. A simulation of the architecture has been implemented and has demonstrated the advantage of exploiting a parallel architecture for protocol processing. E-mail: {chants,iango}@spectrum.cs.unsw.oz.au
9405 On Aggregating Teams of Learning Machines Sanjay Jain and Arun Sharma The present paper studies the problem of when a team of learning machines can be aggregated into a single learning machine without any loss in learning power. The main results concern aggregation ratios for vacillatory identification of languages from texts. For a positive integer n, a machine is said to TxtFex_n-identify a language L just in case the machine converges to up to n grammars for L on any text for L. For such identification criteria, the aggregation ratio is derived for the n=2 case. It is shown that the collections of languages that can be TxtFex_2-identified by teams with success ratio greater than 5/6 are exactly those that can be TxtFex_2-identified by a single machine. It is also established that 5/6 is indeed the cut-off point, by showing that there are collections of languages that can be TxtFex_2-identified by a team employing 6 machines, at least 5 of which are required to be successful, but cannot be TxtFex_2-identified by any single machine. Additionally, aggregation ratios are also derived for finite identification of languages from positive data and for numerous criteria involving language learning from both positive and negative data. E-mail: arun@cse.unsw.edu.au
9404 On the Intrinsic Complexity of Language Identification Sanjay Jain and Arun Sharma A new investigation of the complexity of language identification is undertaken using the notion of reduction from recursion theory and complexity theory. The approach, referred to as the intrinsic complexity of language identification, employs notions of `weak' and `strong' reduction between learnable classes of languages. The intrinsic complexity of several classes is considered, and the results agree with the intuitive difficulty of learning these classes. Complete classes are exhibited for both reductions, and it is also established that the weak and strong reductions are distinct. An interesting result is that the self-referential class of Wiehagen, in which the minimal element of every language is a grammar for the language, and the class of pattern languages introduced by Angluin are equivalent in the strong sense. This study has been influenced by a similar treatment of function identification by Freivalds, Kinber, and Smith. E-mail: arun@cse.unsw.edu.au
9403 Motion Planning in Prototypical Corridors N. Ahmed and A. Sowmya We discuss the motion planning of a rectangular moving object in certain prototypical situations arising in a 2-D isothetic workspace. We suggest three possible motion strategies, involving rotation and translation of the moving object, for negotiating an L-shaped corridor, and give simulation results comparing the three proposed motions. E-mail: sowmya@cse.unsw.edu.au First author's affiliation: Department of Computer Science, James Cook University of North Queensland, Townsville QLD 4811, Australia
9402 HTPNET: A New Transport Protocol for High-speed Networks Toong Shoon Chan and Ian Gorton The quantum increase in transmission speed of optical fibre networks has created a bottleneck in transport protocol processing at host systems. In this paper we present a new transport protocol system, HTPNET, that is designed to overcome this protocol processing bottleneck. HTPNET is based on a highly parallel architecture and is designed to exploit the evolving characteristics of high-speed networks. HTPNET uses an out-of-band signalling system based upon transmitter-paced periodic exchanges of state information between end systems. This mechanism exhibits several attractive properties which have been demonstrated to perform efficiently in a high-speed environment with high-bandwidth-delay-product. A prototype implementation of HTPNET has been constructed from a network of T800 transputers. The results obtained from this implementation are presented: these demonstrate the advantages of exploiting a parallel architecture for protocol processing.
9401 Extending Statecharts with Temporal Logic A. Sowmya and S. Ramesh Statecharts is a behavioural specification language for specifying real-time, event-driven reactive systems. Recently, statecharts was related to a logical specification language in which safety and liveness properties can be expressed; this language provides a compositional proof system for statecharts. However, the logical specification language is flat, with no facilities to account for the structure of statecharts; further, the primitives of this language are dependent on statecharts syntax and cannot be related directly to the problem domain. This paper discusses a temporal logic-based specification language called FNLOG which addresses these problems.
9318 Designing a Video Rate Edge Detection ASIC Mehdi N. Fesharaki and Graham R. Hellestrand A method of generating video rate edge maps, and thereby segmenting an image into regions, based on the Kolmogorov-Smirnov test is presented. By applying this test and comparing cumulative distribution functions of the intensities in the neighbourhood of a given pixel, the pixel can be accurately classified as either an edge pixel on the boundary between regions, or as a pixel belonging to a particular type of region. It is shown that a custom VLSI design for this algorithm using a parallel pipelined architecture is realisable. The outline of this design is presented and the critical path modules are simulated.
9317 Non-Interleaving Semantics for CCS and Fast Deadlock Detection Jacek Olszewski This paper proposes a non-interleaving semantics for CCS, and defines a class of CCS compositions for which interleaving and non-interleaving semantics are equivalent. It also presents a fast software tool for analysis of CCS compositions under non-interleaving semantics. The tool employs Petri net techniques with multiple transition firings.
9316 Real-Time Colour Image Segmentation Mehdi N. Fesharaki and Graham R. Hellestrand This paper describes a new method for colour image segmentation. The algorithm is based on testing the homogeneity of pixels around a centre pixel using statistical inference techniques. A 5 by 5 window around each pixel is partitioned into two sub-samples in different orientations, and the cumulative distribution functions of the two sub-samples are compared with each other. The homogeneity of the two sub-samples is assessed using the Kolmogorov-Smirnov statistic. If all pixels within the window are homogeneous, the computed statistic for all partitionings must confirm homogeneity; otherwise, homogeneity is rejected. The computed statistic is also combined with the intensity uniformity of two adjacent pixels to prevent oversegmented and/or undersegmented results. Moreover, we consider how the algorithm can be effectively implemented as a real-time hardware design.
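
The window test is concrete enough to illustrate directly; the sketch below uses scipy's two-sample Kolmogorov-Smirnov test on one vertical split of a 5 by 5 window (the paper tests several orientations, and the significance threshold here is illustrative).

    # Illustration of the window homogeneity test: split a 5x5
    # neighbourhood into two sub-samples and compare their empirical
    # CDFs with a two-sample Kolmogorov-Smirnov test. Only one split
    # orientation is shown; the significance level is illustrative.
    import numpy as np
    from scipy.stats import ks_2samp

    def homogeneous(window, alpha=0.05):
        """window: 5x5 array of intensities; vertical split shown."""
        left, right = window[:, :2].ravel(), window[:, 3:].ravel()
        statistic, p_value = ks_2samp(left, right)
        return p_value >= alpha    # same distribution -> interior pixel

    flat = np.full((5, 5), 120) + np.random.randint(-2, 3, (5, 5))
    edge = np.hstack([np.full((5, 3), 40), np.full((5, 2), 200)])
    print(homogeneous(flat), homogeneous(edge))   # typically True, False
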
9315 Two-dimensional Numerical Simulations of High-efficiency Silicon Solar Cells (The text of this report will also appear in the Microelectronics Journal) Gernot Heiser, Armin G. Aberle, Stuart R. Wenham, Martin A. Green This paper reports on the first use of two-dimensional (2D) device simulation for optimising the front-finger spacing of one-sun high-efficiency silicon solar cells of practical dimensions. We examine the 2D current flow patterns in these devices under various illumination conditions, resulting in improved insight into the operating conditions of the cells. Results for the optimal spacing of the front metal fingers are presented and compared to predictions obtained from 1D simulations. We also address difficulties facing the numerical modelling of high-efficiency silicon solar cells.
9314 Mungi: A Distributed Single Address-Space Operating System (The text of this report has been accepted for ACSC-17) Gernot Heiser, Kevin Elphinstone, Stephen Russell, Jerry Vochteloo With the development of 64-bit microprocessors, it is now possible to combine local, secondary and remote storage into a large single address-space. This results in a uniform method for naming and accessing objects regardless of their location, removes the distinction between persistent and transient data, and simplifies the migration of data and processes. This paper describes the Mungi single address-space operating system. Mungi provides a distributed single level address-space, which is protected using password capabilities. The protection system performs efficiently on conventional architectures, and is simple enough that most programs do not need to be aware of its operation.
9312 Address Space Management Issues in the Mungi Operating System Kevin Elphinstone The Mungi operating system features a single 64-bit persistent address space encompassing all data in the system. This differs dramatically from current generation operating systems, in which each process has its own address space and persistent data is stored in a filesystem. This report is a preliminary investigation of address space management issues raised by adopting a single persistent address space model. Issues examined are internal and external fragmentation of the address space, reuse versus no-reuse allocation policies, and page table structures used to support the address space.
9311 Conceptual Graphs for Natural Language Representation Graham A. Mann Conceptual graphs are abstract data structures that can form the basis of a knowledge representation system. Since their introduction by Sowa in 1984, they have formed the core of a flourishing research effort, with applications in databases and artificial intelligence. The basics of conceptual graph theory are outlined, including the organisation of existential, finite, bipartite directed graphs and their relationship to first order predicate logic. A definition of the basic functions defined over such graphs follows, including the canonical formation rules and some higher order functions intended to support reasoning. With carefully designed semantic knowledge in the form of catalogues of concepts and relations, a conceptual graph processing system can support knowledge representation. The strengths and weaknesses of conceptual graphs are then evaluated against a theoretical view of knowledge representation that enumerates five roles a representation must play: provision of a theory of intelligent reasoning, ontological commitment, object surrogacy, provision of a practical computing medium, and human read/writeability. Finally, existing methods of using conceptual graphs for natural language comprehension are discussed, and a new theory of conceptual assembly is proposed. Access to an experimental conceptual graph processor written in Common Lisp is offered in Appendix A.
9309 Using CSP+T to Describe a Timing Constrained Stop-and-Wait Protocol John J. Zic This paper presents a novel description of a time-constrained stop-and-wait protocol using an extended CSP model. The timing constraints examined include the usual message transit delay, as well as message input rate limitations and message timeouts. The extended CSP model used for this example is based on associating finite time intervals with each event the process engages in. These time intervals are, in turn, functions over a set of marker events.
9308 A Comparison of Two Real-time Description Techniques John J. Zic A new real-time description language based on Hoare's CSP is proposed, and compared to the more usual Timed CSP in its effectiveness in describing a buffer with differing input and output rates and transit delay requirements. It is found that the new notation offers a concise, natural way of formulating complex timing relationships.
9306 VHDL vs Functional Hardware Description: A Comparison and Critique P. Kanthamanon, G. R. Hellestrand and M. C. Kam This paper presents a comparison of two Hardware Description Languages (HDLs), VHDL and MODAL, which employ different description styles for hardware specification. The comparison is both qualitative and quantitative, and is based on examples written in both languages. The languages are distinct in their power to describe hardware at various levels of abstraction. The results show that the functional description style, as used in MODAL, provides a more accurate description of hardware and modelling of hardware timing, without loss of behavioural descriptive power.
9305 Signal Transition Graph Constraints for Synthesis of Hazard-Free Asynchronous Circuits with Unbounded-Gate Delays Radhakrishna Nagalla and Graham Hellestrand A synthesis procedure for asynchronous control circuits from a high level specification, signal transition graph (STG), is described. In this paper, we propose some syntactic constraints on STG to guarantee hazard-free implementation. We have introduced a global persistency concept in order to establish the relationship between the persistency concept introduced by Chu [2] (which we call local persistency) and the consistent state coding (CSC). The STG syntactic constraints required to compute the "input set" of a signal are identified. We analyze all hazards under both single and multiple input change conditions and propose necessary changes to the net contraction and logic synthesis procedures. The proposed changes are guaranteed to generate hazard-free circuits with the unbounded-gate delay model, if the STG is live, safe and has consistent state coding.
9304 Marksheets: marking easier and more consistently J. Lions This report formally introduces the idea of marksheets as an aid for examination marking. A marksheet is an A4 sheet of paper that has been specially prepared to aid the recording of the marks gained by one student during an examination. Marking is a difficult and stressful process that involves understanding and interpreting the sometimes tangled outpourings and associated thought processes of students under examination. Straightforward approaches that suffice for small classes may no longer do so as class sizes grow. The object of preliminary marking is to validate and extend the marking scheme, and to ensure that all ways of answering each question have been discovered, considered and calibrated. In general, the marksheet changes whenever a new style of answer, or a new variation, is discovered. During the cycle of review and re-assessment, the marksheet goes through several iterations. When the time for final marking arrives, each script is assessed independently and all marks, comments and notes about the script are recorded on a single copy of the marksheet. The preparation of a marksheet may take considerable time and effort. Marksheets cater to a need that did not exist only a few years ago, and their production is facilitated by technology that also did not exist a few years ago.
9303 Capability-Based Protection in a Persistent Global Virtual Memory System Jerry Vochteloo, Stephen Russell, Gernot Heiser A single address-space encompassing all virtual memory of a distributed computer system is an ideal environment for a persistent system. The issue of providing effective and efficient protection of objects in such an environment has, however, not been addressed satisfactorily. We propose a system which is based on password capabilities. A system-maintained data structure called the {\em capability tree} is used for long-term storage of capabilities, and reflects the hierarchical structure of object privacy. A second system data structure, the {\em active protection domain}, allows the system to find capabilities quickly when validating memory accesses. The proposal supports inheritance of protection domains, as well as temporary extension of protection domains to support privileged procedures. Untrusted programs can be confined to run in a restricted protection domain. The protection system performs efficiently on conventional architectures, and is simple enough that most programs do not need to be aware of its operation.
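
A minimal model of password-capability validation, with illustrative data structures (in Mungi the active protection domain is a system-maintained structure derived from the capability tree, not an in-memory dictionary as sketched here):

    # Minimal model of password-capability checking (data layout is
    # illustrative): a capability names an object and carries a password,
    # and the active protection domain is the set of capabilities
    # consulted when validating a memory access.
    from collections import namedtuple

    Capability = namedtuple("Capability", "obj password rights")

    class ProtectionDomain:
        def __init__(self, capabilities):
            self._caps = {(c.obj, c.password): c.rights
                          for c in capabilities}

        def validate(self, obj, password, access):
            # Access succeeds only if a matching (object, password) pair
            # is present and grants the requested right.
            rights = self._caps.get((obj, password), frozenset())
            return access in rights

    pd = ProtectionDomain([Capability(0x1000, 0xD1CE, frozenset({"r", "w"}))])
    print(pd.validate(0x1000, 0xD1CE, "w"))   # True
    print(pd.validate(0x1000, 0xBAD, "r"))    # False: wrong password
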
9302 A Distributed Single Address-Space Operating System Supporting Persistence Gernot Heiser, Kevin Elphinstone, Stephen Russell, Graham R. Hellestrand Persistence has long been difficult to integrate into operating systems. The main problem is that pointers lose their meaning once they are taken out of their address-space. We present a distributed system which has a single address-space encompassing all virtual memory of every node in the system. This design has become possible (and practicable) with the advent of 64-bit microprocessors. In our system, every pointer retains its meaning independent of its location, even across nodes or on secondary storage. No restrictions are imposed on the use of pointers by application programs. Hence persistence is naturally and elegantly integrated into the system. Further features are uniform addressing and unlimited sharing of data, and memory protection based on password capabilities, making the system easy to use. A reliable paging protocol ensures that the impact of node crashes on other parts of the system is minimised.
9301 Computational Limits on Team Identification of Languages Sanjay Jain and Arun Sharma A team of learning machines is essentially a multiset of learning machines. A team is said to successfully identify a concept just in case each member of some nonempty subset of the team identifies the concept. Team identification of programs for computable functions from their graphs has been investigated by Smith. Pitt showed that this notion is essentially equivalent to function identification by a single probabilistic machine. The present paper introduces, motivates, and studies the more difficult subject of team identification of grammars for languages from positive data. It is shown that an analog of Pitt's result about equivalence of team function identification and probabilistic function identification does not hold for language identification, and the results in the present paper reveal a very complex structure for team language identification. It is also shown that for certain cases probabilistic language identification is strictly more powerful than team language identification. Proofs of many results in the present paper involve very sophisticated diagonalization arguments. Two very general tools are presented that yield proofs of new results from simple arithmetic manipulation of the parameters of known ones.