CPSC 5330 Multimedia Processing Jiangjiang(jane)Liu


Scene 1 (0s)

[Audio] CPSC 5330 Multimedia Processing

Scene 2 (10s)

[Audio] TOPICS: 1. INTRODUCTION 2. RELATED WORK 3. CONCLUSION AND FUTURE WORK 4. REFERENCES.

Scene 3 (24s)

[Audio] INTRODUCTION • The human visual system (HVS) perceives depth naturally in our daily lives. • We perceive depth from stereo 3D images because the HVS computes the disparity between the image pair; this ability inspired the research on stereo matching algorithms in computer vision. • Figure 1 illustrates an example of depth perception in the HVS. • The fixation point usually corresponds to the position of 3D…
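The disparity computation that stereo matching algorithms perform can be sketched with a minimal block-matching search: for each pixel in the left image, find the horizontal shift that best aligns a small window with the right image. This is an illustrative baseline only (a simple sum-of-absolute-differences cost, no smoothness term); the window size and search range are made-up parameters, not values from the text.

```python
import numpy as np

def block_match_disparity(left, right, max_disp, win=3):
    """Estimate per-pixel disparity by comparing a small window in the
    left image against horizontally shifted windows in the right image,
    using sum of absolute differences (SAD) as the matching cost."""
    h, w = left.shape
    half = win // 2
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(half, h - half):
        for x in range(half, w - half):
            patch = left[y - half:y + half + 1, x - half:x + half + 1]
            best, best_d = None, 0
            # search only shifts that keep the window inside the image
            for d in range(min(max_disp, x - half) + 1):
                cand = right[y - half:y + half + 1,
                             x - d - half:x - d + half + 1]
                cost = np.abs(patch.astype(int) - cand.astype(int)).sum()
                if best is None or cost < best:
                    best, best_d = cost, d
            disp[y, x] = best_d
    return disp
```

For a rectified pair where the right image is the left image shifted by a constant disparity, the search recovers that shift at every interior pixel.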

Scene 4 (57s)

[Audio] RELATED WORK: Multimedia Mining Using Deep Learning.

Scene 5 (1m 22s)

[Audio] RELATED WORK: We briefly review related work across the research topics of data compression, content protection, and content creation for 3D multimedia.

Scene 6 (1m 46s)

[Audio] RESEARCH CONTRIBUTIONS: We describe the research work in progress and possible future plans, and summarize the research results obtained on the following topics: • Data Compression • Content Protection • Content Creation.

Scene 7 (2m 9s)

[Audio] RELATED WORK (CONT.) METHODOLOGIES • The fast data compression algorithm is based on depth information. • Depth perception in the HVS is subject to the Panum-band constraint, so the parameter space required by the rendering process is limited. The disparity resolution of existing 3D monitors is limited as well. By exploiting these factors, we can compress the depth information efficiently without compromising the quality of the synthesized virtual-view image.
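The premise above -- that a display can only render a limited number of distinct disparity levels, so finer depth precision is invisible in the synthesized view -- can be sketched as uniform quantization of a depth map to a small set of level indices. This is a simplified illustration of the idea, not the paper's algorithm; the level count is a made-up parameter.

```python
import numpy as np

def quantize_depth(depth, levels):
    """Map a depth map to `levels` uniform level indices (stored as
    uint8). Depth values falling between two renderable disparity
    levels are indistinguishable on the display, so only the index
    needs to be kept."""
    d = np.asarray(depth, dtype=np.float64)
    lo, hi = d.min(), d.max()
    if hi == lo:
        return np.zeros(d.shape, dtype=np.uint8), lo, hi
    idx = np.round((d - lo) / (hi - lo) * (levels - 1)).astype(np.uint8)
    return idx, lo, hi

def dequantize_depth(idx, lo, hi, levels):
    """Recover an approximate depth map from the stored level indices."""
    return lo + idx.astype(np.float64) / (levels - 1) * (hi - lo)
```

The reconstruction error is bounded by half a quantization step, which is the sense in which the compression does not degrade the rendered view when the step matches the display's disparity resolution.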

Scene 8 (2m 51s)

[Audio] CONCLUSION • Multimedia mining is one of the most important and challenging research domains in computer science, and many researchers are drawn to work in it. Many challenging research problems remain open; they can be addressed by developing new algorithms, concepts, and techniques for extracting hidden knowledge from multimedia databases. This paper discussed the basic concepts, essential characteristics, architectures, models, and applications of multimedia mining, and also described emerging and open research issues in the field.

Scene 9 (3m 35s)

[Audio] REFERENCES
[1] Manjunath T. N., Ravindra S. Hegadi, Ravikumar G. K., "A Survey on Multimedia Data Mining and Its Relevance Today," IJCSNS International Journal of Computer Science and Network Security, Vol. 10, No. 11, November 2010.
[2] Sarla More, Durgesh Kumar Mishra, "Multimedia Data Mining: A Survey," Pratibha: International Journal of Science, Spirituality, Business and Technology (IJSSBT), Vol. 1, No. 1, March 2012, ISSN (Print) 2277-7261.
[3] Manjunath R., S. Balaji, "Review and Analysis of Multimedia Data Mining Tasks and Models," International Journal of Innovative Research in Computer and Communication Engineering, Vol. 2, Special Issue 2, May 2014.
[4] Bhavani Thuraisingham, "Managing and Mining Multimedia Databases," International Journal on Artificial Intelligence Tools, Vol. 13, No. 3 (2004), 739-759.

Scene 10 (5m 17s)

[Audio] Network-Integrated Multimedia Middleware (NMM) INTRODUCTION: • Besides the PC, an increasing number of multimedia devices – such as set-top boxes, PDAs, and mobile phones – already provide networking capabilities. However, today's multimedia infrastructures adopt a centralized approach: all multimedia processing takes place within a single system, and the network is used only for streaming predefined content from a server to clients. Conceptually, such approaches consist of two isolated applications, a server and a client (see Figure 1). The realization of advanced scenarios is therefore complicated and error-prone – especially since the client typically has no or only limited control of the server, and vice versa. • The Network-Integrated Multimedia Middleware (NMM) presented in this paper overcomes these limitations by enabling access to all resources within the network [1, 2]: distributed multimedia devices and software components can be transparently controlled and integrated into an application. In contrast to other available multimedia architectures, NMM is a true middleware, i.e., a distributed software layer running between distributed systems and applications.

Scene 11 (6m 42s)

[Audio] GENERAL DESIGN APPROACH • The general design approach of the NMM architecture is similar to that of other multimedia architectures, but is extended to a truly network-transparent approach, as described in the following: • Nodes, Jacks, and Flow Graphs • Messaging System • Interfaces • Distributed Flow Graphs • Distributed Synchronization • Registry Service.

Scene 12 (7m 14s)

[Audio] APPLICATION DEVELOPMENT • The main goal of NMM is to ease the development of (distributed) multimedia applications in C++ [1]. An extensive tutorial showing several "Hello World!" examples is available online [9]. In particular, we also demonstrate how easy it is to develop NMM applications employing distributed flow graphs: by simply adding a single line, a node within a flow graph can be distributed to a remote host while still being used and controlled in the same way as a local node.
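The "single line to distribute a node" idea can be illustrated with a toy flow graph. This is emphatically not NMM's real C++ API: the `Node` class, the `host` argument, and the processing functions below are invented purely to show the concept that connecting nodes and pushing data through them look identical whether a node runs locally or remotely.

```python
class Node:
    """One processing element in a flow graph (e.g. a reader, decoder,
    or sink). `host` names where the node would run; in a real
    middleware the transport to a remote host is hidden from the
    application."""
    def __init__(self, name, func, host="localhost"):
        self.name, self.func, self.host = name, func, host
        self.downstream = None

    def connect(self, other):
        # Connecting looks the same for local and remote nodes;
        # returning `other` allows chained connect() calls.
        self.downstream = other
        return other

    def push(self, data):
        # Process data and forward it along the graph.
        out = self.func(data)
        return self.downstream.push(out) if self.downstream else out

source = Node("reader", lambda d: d)
# The "single line" change: this node is marked to run on a remote host,
# but it is connected and controlled exactly like the local ones.
decoder = Node("decoder", lambda d: d.upper(), host="mediaserver")
sink = Node("sink", lambda d: "[" + d + "]")

source.connect(decoder).connect(sink)
result = source.push("audio")  # flows reader -> decoder -> sink
```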

Scene 13 (8m 21s)

[Audio] CONCLUSION • The Network-Integrated Multimedia Middleware (NMM) offers a new approach to networked multimedia by providing a fully transparent view of distributed systems, thereby removing the artificial boundaries of traditional client/server streaming applications. As a result, NMM is the only full-featured multimedia middleware available. NMM is used as a fundamental software layer within various commercial products in the areas of home entertainment, building technologies, content processing and distribution, and multimedia installations. The dual-licensing model of NMM offers commercial licenses and services for industrial partners. Still, NMM is available as open source, offering all its benefits to other open-source projects and to research efforts in academia.

Scene 14 (9m 14s)

[Audio] REFERENCES
1. M. Lohse. Network-Integrated Multimedia Middleware, Services, and Applications. VDM Verlag, 2007.
2. M. Lohse, M. Repplinger, and P. Slusallek. An Open Middleware Architecture for Network-Integrated Multimedia. In Proceedings of the Joint International Workshops on Interactive Distributed Multimedia Systems and Protocols for Multimedia Systems (IDMS/PROMS), 2002.
3. M. Lohse, M. Repplinger, and P. Slusallek. Dynamic Distributed Multimedia: Seamless Sharing and Reconfiguration of Multimedia Flow Graphs. In Proceedings of the 2nd International Conference on Mobile and Ubiquitous Multimedia (MUM), 2003.
4. M. Lohse, M. Repplinger, and P. Slusallek. Session Sharing as Middleware Service for Distributed Multimedia Applications. In Proceedings of the First International Workshop on Multimedia Interactive Protocols and Systems (MIPS), 2003.

Scene 15 (10m 55s)

[Audio] Local Manipulation of Image Layers Using Standard Image Processing Primitives INTRODUCTION: In a conventional image manipulation program, an image can be composed of a stack of layers. Each layer is independently editable. The final image is obtained by compositing and blending this stack of layers. The layer stacking order has a global scope over the image, i.e., if layer M is stacked above layer N in one part of an image, then that ordering holds for all parts of the image. Any modern image editor, such as the GNU Image Manipulation Program, or GIMP [8], allows the global reordering and manipulation of these layers. To create a complex image such as a weave pattern (see Figure 2), where many threads overlap each other in many different places in different orders, duplicate layers have to be made for each thread at each point of overlap, and the global ordering has to be rearranged to produce the weave. It is also impossible to create, for example, a cyclic ordering of layers using global ordering (see Figure 1). An elegant solution to this problem was presented in the work on local layering [10], which for the first time allowed the ordering of the layers to be done at a local level rather than at a global level. The idea was demonstrated via a standalone implementation that was not integrated into any standard image processing pipeline, which makes its use cumbersome and limited.
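The conventional global compositing described above amounts to applying the standard "over" operator bottom-to-top with one alpha mask per layer, in the same order at every pixel. A minimal grayscale sketch (the two-layer example in the test is invented for illustration):

```python
import numpy as np

def composite(layers):
    """Composite a stack of (color, alpha) grayscale layers
    bottom-to-top with the 'over' operator. The list order is applied
    identically at every pixel -- this global ordering is exactly the
    limitation that local layering removes."""
    h, w = layers[0][1].shape
    out = np.zeros((h, w), dtype=np.float64)        # accumulated color
    acc_alpha = np.zeros((h, w), dtype=np.float64)  # accumulated coverage
    for color, alpha in layers:  # bottom to top
        out = color * alpha + out * (1.0 - alpha)
        acc_alpha = alpha + acc_alpha * (1.0 - alpha)
    return out, acc_alpha
```

Wherever the top layer's mask is opaque it wins everywhere in the image, which is why a weave pattern needs duplicated layers under global ordering.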

Scene 16 (12m 31s)

[Audio] BACKGROUND • Our implementation of the list-graph data structure is based on efficient manipulation of the layer-mask pixel values associated with every image layer. Since layer masks are commonly available in all common image editors, such as Adobe Photoshop [2] and the GIMP [8], our algorithms are generic and implementable in any standard image processing pipeline. We further show that image masks can be used to store and retrieve local layer ordering if the image is saved in the native format of the editor (e.g., PSD for Adobe Photoshop, and XCF for the GIMP). For this purpose, we present a novel algorithm to reconstruct a consistent list graph from saved image mask data. LOCAL LAYERING: • With local layering, layers can be ordered just as one would order paper cut-outs, weaving and overlapping but never passing through one another. Local stacking is represented using a list graph, and layer-flipping operators are used to change the layer ordering in local regions, as presented in [10]. We present a brief explanation of this process.
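The list-graph idea can be modeled with a toy structure: each overlap region keeps its own bottom-to-top list of the layers present there, and a flip operator reorders one region without touching the others. The region names and layers below are invented, and the consistency checks between neighboring regions that the real flipping operators of [10] enforce are omitted from this sketch.

```python
class ListGraph:
    """Toy model of local layering's list graph: per-region stacking
    lists instead of one global layer order."""
    def __init__(self, regions):
        # regions: {region_name: [layers, bottom-to-top]}
        self.regions = {r: list(ls) for r, ls in regions.items()}

    def flip_up(self, region, layer):
        """Swap `layer` with the layer directly above it, in this
        region only (no-op if it is already on top here)."""
        ls = self.regions[region]
        i = ls.index(layer)
        if i + 1 < len(ls):
            ls[i], ls[i + 1] = ls[i + 1], ls[i]

# Two overlap regions of a weave: flipping "warp" above "weft" on the
# left only produces an ordering no global stack can express.
g = ListGraph({"left": ["warp", "weft"], "right": ["warp", "weft"]})
g.flip_up("left", "warp")
```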

Scene 17 (13m 43s)

[Audio] RECONSTRUCTING THE LIST GRAPH FROM MASKS • If the image editing process is interrupted for some reason, the edits have to be saved and resumed from the saved state at a later time. The original local-layering implementation given by [10] provides no easy way of doing this that can be integrated into existing image processing pipelines. In this section, we present a novel algorithm that reconstructs the list graph from the masks and layer data, thus allowing us to save the local layering edits and retrieve them at a later time. We assume that the layers are not moved between sessions. Thus, we can calculate the layers present in a list and the edges of the list graph as explained in the previous section. This data, therefore, does not change and hence is 'known'. The local layer stacking for a list is not known, since it can change across sessions. • Additional Data Structure • Algorithm.

Scene 18 (14m 46s)

[Audio] CONCLUSION • We have presented a novel implementation of local layering that is based entirely on standard image processing primitives. This makes local layering immediately available to every popular image processing pipeline that works with image masks. To illustrate our ideas, we have also presented a working open-source plug-in for the GNU Image Manipulation Program (GIMP). Our algorithms are able to handle large images efficiently, providing interactive feedback as the layer order is changed locally. Our methods also allow the local layering information to be stored using standard image processing primitives that are saved as part of the image, so that editing can be resumed at a later time. Future work will focus on allowing the layers to move, using local matting to facilitate animation as shown in [10], and extending the local layering concept to video.

Scene 19 (15m 43s)

[Audio] REFERENCES
1. Adobe. Photoshop CS 5. http://www.adobe.com/products/photoshop/, 2010.
2. Apple. Final Cut Pro. http://www.apple.com/finalcutstudio/finalcutpro/, 1999-2010.
3. P. Baudelaire and M. Gangnet. Planar maps: an interaction paradigm for graphic design. In Proceedings of CHI '89, pages 313–318. ACM, 1989.
4. T. Duff. Compositing 3-d rendered images. In Proceedings of SIGGRAPH '85, pages 41–44. ACM, 1985.

Scene 20 (16m 45s)

[Audio] Segmentation Standard for Chinese Natural Language Processing INTRODUCTION: One important feature of Chinese texts is that they are character-based, not word-based. Each Chinese character stands for one phonological syllable and in most cases represents a morpheme. The fact that Chinese writing does not mark word boundaries poses the unique question of word segmentation in Chinese computational linguistics (e.g. Sproat and Shih 1990, and Chen and Liu 1992). Since words are the linguistically significant basic elements that are entered in the lexicon and manipulated by grammar rules, no language processing can be done unless words are identified. In theoretical terms, the primacy of the concept of word can be more firmly established if its existence can be empirically supported in a language that does not mark it conventionally in texts (e.g. Bates et al. 1993, Huang et al. 1993). In computational terms, no serious Chinese language processing can be done without segmentation, and no efficient sharing of electronic resources or computational tools is possible.

Scene 21 (18m 1s)

[Audio] Segmentation Principles: A string whose structural composition is not determined by the grammatical requirements of its components, or a string whose grammatical category is other than the one predicted by its structural components, should be treated as a segmentation unit. Note that characters are the basic processing units when segmentation is involved; the two principles thus address the question of which strings of characters can be further combined to form a segmentation unit. Principles 2a) and 2b) elaborate on the semantic (independent meaning) and syntactic (fixed category) components of the definition of a segmentation unit. Because of the procedural nature of the two principles, they provide the basis for the segmentation algorithm. Since a character can be a lexical or sub-lexical element, the basic decision in segmentation is whether the relation between two characters is morphological or syntactic. For instance, with a VO sequence such as lai-dian (come-electricity, 'to strike a chord with, to mutually attract'), principle 2b) applies to predict that the string is a segmentation unit, since lai is an intransitive verb and does not take an object.
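Once a lexicon of segmentation units exists, a standard baseline for applying it is greedy forward maximum matching: at each position, take the longest character string found in the lexicon, falling back to a single character. This is a common baseline segmenter shown only to illustrate how a lexicon of units is applied; it is not part of the proposed standard, and the one-entry lexicon below is invented.

```python
def max_match(text, lexicon, max_len=4):
    """Greedy forward maximum matching over a character string.
    `lexicon` is a set of known segmentation units; unknown characters
    come out as single-character units."""
    out, i = [], 0
    while i < len(text):
        # try the longest candidate first, down to a single character
        for l in range(min(max_len, len(text) - i), 0, -1):
            if l == 1 or text[i:i + l] in lexicon:
                out.append(text[i:i + l])
                i += l
                break
    return out
```

With the lai-dian example above in the lexicon, the VO sequence is kept as one unit while the surrounding characters segment singly.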

Scene 22 (19m 15s)

[Audio] CONCLUSION • In this paper, we propose a Segmentation Standard for Chinese language processing composed of two distinct parts: a) the language- and lexicon-independent definition and principles, and b) the lexicon-dependent guidelines. The definition and principles offer the conceptual basis of segmentation and are the unifying idea behind the resolution of heuristic conflicts. The lexicon-dependent guidelines, as well as the data-dependent lexicon, allow the standard to adapt easily to linguistic and sub-language changes. REFERENCES:
 Bloomfield, L. 1933. Language. New York: Holt, Rinehart, and Winston.
 Chao, Y. R. 1968. A Grammar of Spoken Chinese. Berkeley: U. of California Press.
 Chen, C., S. Tseng, C.-R. Huang and K.-J. Chen. 1993. Some Distributional Properties of Mandarin Chinese - A Study Based on the Academia Sinica Corpus. Proc. of the 1st PACFoCoL. 81-95. Taipei.
 Chen, H. and C. Li. 1994. Recognition of Text-based Organization Names in Chinese. [in Chinese] Communications of COLIPS. 4.2.131-142.

Scene 23 (21m 23s)

[Audio] Multimedia Signal Processing for Behavioral Quantification in Neuroscience INTRODUCTION: • Understanding how brains work is perhaps the "final frontier" for science. Technical advances have played a major role in contemporary progress in neuroscience, as illustrated by the rapid growth in the use of brain imaging techniques. While not quite as prominent in the public gaze, a small technical revolution is currently taking place in the area of automated analysis of animal and human behavior. Quantification of behavior is critical to understanding brains, since behavior is the output of brain function. The scientific study of animal behavior, or ethology, is a discipline that was developed in the early and mid-twentieth century. What we are now witnessing is the growth of the field of Quantitative Ethology, aided by computational analyses of digitized recordings of behavior. Multimedia signal processing (MMSP) is central to this field, since behavior is typically digitized by making audio and/or video recordings. In this paper, we examine four case studies of Quantitative Ethology using audio and video signal processing. The field is still in its infancy and presents research challenges in MMSP of both practical and scientific interest. • Many neuroscience behavior experiments can be characterized by the sets of time-varying sensory inputs presented to the subject and the resulting behavioral and physiological output signals. The output may be used as feedback during the experiment and/or recorded for later analysis. Audio and video presentation and recording are often an integral part of an experimental paradigm, and appropriate signal processing and general analysis techniques need to be applied in order to properly interpret the results or drive the experiment in real time.

Scene 24 (23m 20s)

[Audio] AUDIO ANALYSIS: • Birdsong • Vocal Development in Human Infants VIDEO ANALYSIS: • Rodent Locomotion • Drosophila Behavior CONCLUSION: • The use of multimedia recordings and the importance of multimedia content analysis in neuroscience behavioral experiments will only increase as consumer technology improves and as researchers design more sophisticated and open-ended experimental setups. This trend will require neuroscientists to become familiar with and embrace MMSP methods for the proper analysis of that data. As the studies of song learning in the zebra finch and of vocal development in human infants have shown, detailed acoustic analysis combined with robust signal processing is key when characterizing audio recordings of song and language. • Automated techniques must segment sounds into units appropriate for the subject, and robust measures need to be used for proper categorization of sounds. There is a need to replicate the existing zebra-finch-type studies in a human context by performing large-scale recordings and bringing automated analysis techniques to bear on the database. Simultaneous video recording is already common in both areas but needs to be integrated more fully into the analysis.
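The "segment sounds into units" step mentioned above can be sketched with the simplest possible detector: short-time energy thresholding, which splits a recording into runs of high-energy frames (candidate syllables or calls). Real birdsong and infant-vocalization pipelines use far more robust features; the frame length and threshold here are made-up parameters.

```python
import numpy as np

def segment_by_energy(signal, frame_len, threshold):
    """Return (start_frame, end_frame) pairs for runs of frames whose
    mean short-time energy exceeds `threshold` -- a minimal sketch of
    segmenting an audio recording into candidate sound units."""
    n = len(signal) // frame_len
    frames = np.asarray(signal[:n * frame_len],
                        dtype=np.float64).reshape(n, frame_len)
    energy = (frames ** 2).mean(axis=1)   # energy per frame
    active = energy > threshold
    segments, start = [], None
    for i, a in enumerate(active):
        if a and start is None:
            start = i                      # segment begins
        elif not a and start is not None:
            segments.append((start, i))    # segment ends
            start = None
    if start is not None:
        segments.append((start, n))        # signal ends while active
    return segments
```

On a synthetic signal of two tone bursts separated by silence, the detector recovers the two bursts as two frame-index intervals.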

Scene 25 (24m 44s)

[Audio] REFERENCES
1. Archer, J., Tests for emotionality in rats and mice: a review. Anim Behav, 1973. 21: p. 205-235.
2. Benjamini, Y., et al., SEE Software Home Page. 2006.
3. Bolivar, V., Cook, M. and Flaherty, L., List of transgenic and knockout mice: behavioral profiles. Mamm Genome, 2000. 11: p. 260-274.
4. Chen, S., et al., Fighting fruit flies: a model system for the study of aggression. Proceedings of the National Academy of Sciences of the United States of America, 2002. 99(8): p. 5664-5668.
5. Clark, C.W., Marler, P. and Beeman, K., Quantitative Analysis of Animal Vocal Phonology - an Application to Swamp Sparrow Song. Ethology, 1987. 76(2): p. 101-115.
6. Cleveland, W.S., Robust locally weighted regression and smoothing scatterplots. J Am Stat Assoc, 1977. 74: p. 829-836.