DCSP-1: Introduction Jianfeng Feng Department of Computer Science Warwick Univ., UK Jianfeng.feng@warwick.ac.uk http://www.dcs.warwick.ac.uk/~feng/dcsp.html
Times: Monday (L) 14.00-15.00, CS 1.01; Tuesday (L) 16.00-17.00, CS 1.01; Thursday (S) 13.00-14.00, CS 1.01. Seminars start from this week.
My research: 冯建峰 http://dcs.warwick.ac.uk/~feng Using MRI to peer into your brain
My research: 冯建峰 http://dcs.warwick.ac.uk/~feng In general, our research is about dealing with big data (brain research)
Are we facing another revolution? (Timeline figure: first industrial revolution, 1785; second, 1831; third, 1970s; fourth? Brain-like AI, 2020?)
Brain-like AI: new revolution 2012: Apple Siri, Google Now. 2013: DeepMind; DeepFace (Facebook); deep-learning medicine (J Rothberg & T Xu); Google Brain (Andrew Ng). 2014-2015: Google (automatic driving on motorways); IBM (Watson); Microsoft (automatic translation); Eugene Goostman passed the Turing test. 2016: applications in smart cities, financial markets, medicine.
Learning
Big Data Big Data is to our society what petroleum is: this module is all about data, data, data. It will equip you with the skills to deal with big data. "Most practical module in our two years" - comments from students of previous years.
Big Data Deep learning has been very successful, but it deals only with static data such as faces. We will deal with dynamic data (language, video, etc.).
Announcement for Seminars DCSP seminars (covering the DCSP tutorial problems) start in Week 2. Please attend only one seminar per week.
Assignment The assignment will be issued in Week 4 and is worth 20% of the module assessment. Submission deadline: 12 noon on Thursday of Week 10 (16th March 2017). (The winner will be awarded 500 RMB.)
References Any good book on digital communications and digital signal processing; Wikipedia, the free encyclopedia; public lectures. Lecture notes are available at http://www.dcs.warwick.ac.uk/~feng/teaching/dcsp.html
Outline of today's lecture: digital vs. analog; module outline; data transmission (sampling)
Signal: video, audio, etc. -- Data
Two types of signal Information-carrying signals are divided into two broad classes: continuous x(t) vs. digital x[n].
Continuous (Analog) Signals Analog signals are continuous (electrical) signals that vary in time. Most of the time, the variations follow those of the non-electric (original) signal. The two are analogous, hence the name analog.
Example I The telephone voice signal is analog. The intensity of the voice causes electric current variations; at the receiving end, the signal is reproduced in the same proportion. It can also be digitized for communication and processing.
Example II
Digital Signals Digital signals are non-continuous: they change in individual steps and consist of pulses with discrete levels or values. Each pulse is constant, but there is an abrupt change from one digit to the next.
Example I
Example II: an image A two-dimensional signal x[n1, n2], n1, n2 ∈ Z. A point on the grid is a pixel; the grid is usually regularly spaced, and the values x[n1, n2] refer to each pixel's appearance.
Example III: Our brain, neuronal activities Vertical white bars: spikes. (Figure: a neuron with dendrite, soma, axon and synapse.) As in Prof. Rolls' talk on Monday.
Advantages I a. The ability to process a digital signal means that errors caused by random processes can be detected and corrected. b. Digital signals can also be sampled instead of continuously monitored, and multiple signals can be multiplexed together to form one signal.
Advantages II c. Advances in wideband communication channels and solid-state electronics have allowed engineers to fully realise (a) and (b), so digital communications has grown quickly. d. Digital communications is quickly edging out analog communication because of the vast demand to transmit computer data and the ability of digital communications to do so.
Module Summary I One sentence: Deal with digital signals
Module Summary II-V Data transmission: channel characteristics, signalling methods, interference and noise, synchronisation, data compression and encryption. Information sources and coding: information theory, coding of information for efficiency and error protection. Signal representation: representation of discrete-time signals in time and frequency; z-transform and Fourier representations; discrete approximation of continuous signals; sampling and quantisation; stochastic signals and noise processes. Filtering: analysis and synthesis of discrete-time filters; finite impulse response and infinite impulse response filters; frequency response of digital filters; poles and zeros; filters for correlation and detection; matched filters. Digital signal processing applications: processing of images and sound using digital techniques.
Data Transmission I: General Form
Data Transmission II: A modulator that takes the source signal and transforms it so that it is physically suitable for the transmission channel. A transmitter that actually introduces the modulated signal into the channel, usually amplifying the signal as it does so. A transmission channel that is the physical link between the communicating parties. A receiver that detects the transmitted signal on the channel and usually amplifies it (as it will have been attenuated by its journey through the channel). A demodulator that recovers the original source signal from the received signal and passes it to the sink.
Data Transmission III: Digital data is universally represented by strings of 1s and 0s. Each one or zero is referred to as a bit. Often, but not always, these bit strings are interpreted as numbers in a binary number system: thus 101001₂ = 41₁₀. The information content of a digital signal is equal to the number of bits required to represent it. Thus a signal that may vary between 0 and 7 has an information content of 3 bits.
Data Transmission IV Written as an equation, this relationship is I = log2(n) bits, where n is the number of levels a signal may take. It is important to appreciate that information is a measure of the number of different outcomes a value may take. The information rate is a measure of the speed with which information is transferred. It is measured in bits/second or b/s.
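Both facts above can be checked with a few lines of Python (the helper name `information_bits` is mine, for illustration):

```python
import math

# Binary interpretation of a bit string: 101001 in base 2 is 41 in base 10.
assert int("101001", 2) == 41

def information_bits(n_levels: int) -> float:
    """I = log2(n): bits needed to represent one of n_levels possible values."""
    return math.log2(n_levels)

# A signal that may vary between 0 and 7 has 8 levels, hence 3 bits.
print(information_bits(8))  # → 3.0
```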
Bandwidth A signal is bandlimited if it contains no energy at frequencies higher than some bandlimit, or bandwidth, B.
Examples Audio signals. An audio signal is an example of an analogue signal. It occupies a frequency range from about 200 Hz to about 15 kHz. Speech signals occupy a smaller range of frequencies; telephone speech typically occupies the range 300 Hz to 3300 Hz. The range of frequencies occupied by a signal is called its bandwidth (B = f2 - f1 ≈ f2).
Examples Television. A television signal is an analogue signal created by linearly scanning a two-dimensional image; typically the signal occupies a bandwidth of about 6 MHz. Teletext is written (or drawn) communication that is interpreted visually. Other information sources: reproducing cells, in which the daughter cells' DNA contains information from the parent cells; a disk drive; our brain.
ADC and DAC To send analogue signals over a digital communication system, or to process them on a digital computer, we need to convert analogue signals to digital ones. This is performed by an analogue-to-digital converter (ADC): the analogue signal is sampled (i.e. measured at regularly spaced instants). The converse operation is performed by a digital-to-analogue converter (DAC).
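A minimal sketch of the ADC idea in Python: sample at regularly spaced instants, then quantise each sample to a finite set of levels. The sample rate and level count here are arbitrary illustrative choices, not values from the lecture:

```python
import math

def adc(signal, duration_s, sample_rate_hz, n_levels):
    """Sample signal(t) at regularly spaced instants, then quantise each
    sample to the nearest of n_levels uniformly spaced levels in [-1, 1]."""
    n_samples = int(duration_s * sample_rate_hz)
    samples = [signal(n / sample_rate_hz) for n in range(n_samples)]
    step = 2.0 / (n_levels - 1)                     # quantisation step size
    return [round((s + 1.0) / step) * step - 1.0 for s in samples]

# A 5 Hz sine sampled at 100 Hz and quantised to 16 levels (4 bits/sample).
codes = adc(lambda t: math.sin(2 * math.pi * 5 * t), 0.1, 100, 16)
print(len(codes))  # → 10
```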
Example
Example How can we get the original signal back?
How fast do we have to sample to recover the original signal?
Nyquist-Shannon Theorem The ADC process is governed by an important law (discussed in Chapter 3): an analogue signal of bandwidth B can be completely recreated from its sampled form provided it is sampled at a rate of at least twice its bandwidth: S > 2B.
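A numerical illustration (mine, not from the slides) of why the rate must exceed 2B: at a sample rate fs, a sine at frequency f and a sine at f + fs produce exactly the same samples, so a component above fs/2 cannot be distinguished from its lower-frequency alias:

```python
import math

fs = 10                  # sample rate (Hz), chosen only for illustration
f_low, f_high = 3, 13    # f_high = f_low + fs, far above fs/2

low = [math.sin(2 * math.pi * f_low * n / fs) for n in range(20)]
high = [math.sin(2 * math.pi * f_high * n / fs) for n in range(20)]

# The 13 Hz sine aliases onto the 3 Hz sine: the sample sets coincide.
assert all(abs(a - b) < 1e-9 for a, b in zip(low, high))
```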
DCSP-2: Information Theory Jianfeng Feng Department of Computer Science Warwick Univ., UK Jianfeng.feng@warwick.ac.uk http://www.dcs.warwick.ac.uk/~feng/dcsp.html
Data Transmission
How fast do we have to sample to recover the original signal?
Nyquist-Shannon Theorem The ADC process is governed by an important law (discussed in Chapter 3): an analogue signal of bandwidth B can be completely recreated from its sampled form provided it is sampled at a rate of at least twice its bandwidth: S > 2B.
Today Analogue vs digital signals; Shannon information and coding: information theory, coding of information for efficiency and error protection.
Information and coding theory Information theory is concerned with: the description of information sources; the representation of the information from a source (coding); and the transmission of this information over a channel.
Information and coding theory The best example of how a deep mathematical theory can be successfully applied to solving engineering problems.
Information and coding theory Information theory is a discipline in applied mathematics involving the quantification of data, with the goal of enabling as much data as possible to be reliably stored on a medium and/or communicated over a channel.
Information and coding theory The measure of data, known as information entropy, is usually expressed by the average number of bits needed for storage or communication.
Information and coding theory The field is at the crossroads of mathematics, statistics, computer science, physics, neurobiology, and electrical engineering.
Information and coding theory Its impact has been crucial to the success of the Voyager missions to deep space, the invention of the CD, the feasibility of mobile phones, the development of the Internet, the study of linguistics and of human perception, the understanding of black holes, and numerous other fields.
Information and coding theory Founded in 1948 by Claude Shannon in his seminal work "A Mathematical Theory of Communication".
Information and coding theory The "bible" paper: cited more than 60,000 times.
Information and coding theory The most fundamental results of this theory are 1. Shannon's source coding theorem the number of bits needed to represent the result of an uncertain event is given by its entropy; 2. Shannon's noisy-channel coding theorem reliable communication is possible over noisy channels if the rate of communication is below a certain threshold called the channel capacity. The channel capacity can be approached by using appropriate encoding and decoding systems.
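As a concrete illustration of the channel capacity in theorem 2 (this specific example is mine, not from the slides): for a binary symmetric channel that flips each transmitted bit with probability p, the standard result is C = 1 - H(p) bits per channel use:

```python
import math

def binary_entropy(p: float) -> float:
    """H(p) = -p log2 p - (1-p) log2(1-p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p: float) -> float:
    """Capacity of a binary symmetric channel with crossover probability p."""
    return 1.0 - binary_entropy(p)

print(bsc_capacity(0.0))   # noiseless channel: full 1 bit per use
print(bsc_capacity(0.5))   # output independent of input: capacity 0
```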
Information and coding theory Consider predicting the activity of the Prime Minister tomorrow. This prediction is an information source X = {O, R} with two outcomes: he will be in his office (O), or he will run 10 miles naked in London (R).
Information and coding theory Clearly, the outcome 'in office' contains little information; it is a highly probable outcome. The outcome 'naked run', however, contains considerable information; it is a highly improbable event.
Information and coding theory An information source is a probability distribution, i.e. a set of probabilities assigned to a set of outcomes (events). This reflects the fact that the information contained in an outcome is determined not only by the outcome, but by how uncertain it is. An almost certain outcome contains little information. A measure of the information contained in an outcome was introduced by Hartley in 1927.
Information The information contained in an outcome xi of X = {x1, x2, ..., xn} is defined as I(xi) = -log2 p(xi)
Information The definition above also satisfies the requirement that the total information in independent events should add. Clearly, our Prime Minister prediction for two days, X = {OO, OR, RO, RR}, contains twice as much information as for one day. For two independent outcomes xi and xj: I(xi and xj) = -log2 P(xi and xj) = -log2 P(xi)P(xj) = -log2 P(xi) - log2 P(xj)
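Hartley's measure and its additivity can be checked numerically; the probabilities below are illustrative stand-ins for P(O) and P(R), not values given in the lecture:

```python
import math

def information(p: float) -> float:
    """I(x) = -log2 p(x): information, in bits, of an outcome with probability p."""
    return -math.log2(p)

p_office, p_run = 0.99, 0.01   # hypothetical probabilities for O and R
print(information(p_office))    # a near-certain outcome: ~0.0145 bits
print(information(p_run))       # a very unlikely outcome: ~6.64 bits

# Additivity for independent outcomes: I(x and y) = I(x) + I(y).
assert abs(information(p_run * p_run) - 2 * information(p_run)) < 1e-9
```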
Entropy The entropy H(X) defines the information content of the source X as a whole: it is the mean information provided by the source. We have H(X) = Σi P(xi) I(xi) = -Σi P(xi) log2 P(xi) A binary symmetric source (BSS) is a source with two outputs whose probabilities are p and 1-p respectively.
Entropy The Prime Minister source discussed above is a BSS. The entropy of the BSS source is H(X) = -p log2 p - (1-p) log2 (1-p)
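The BSS entropy formula can be evaluated directly; a short sketch confirming that it vanishes at the endpoints and peaks at p = 0.5:

```python
import math

def bss_entropy(p: float) -> float:
    """H(X) = -p log2 p - (1-p) log2(1-p) for a binary symmetric source."""
    if p in (0.0, 1.0):
        return 0.0              # a certain outcome carries no information
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

assert bss_entropy(0.0) == 0.0                  # certainty: zero entropy
assert bss_entropy(0.5) == 1.0                  # equally likely: 1 bit, the maximum
assert bss_entropy(0.25) == bss_entropy(0.75)   # symmetric about p = 0.5
```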
Entropy When one outcome is certain, so is the other, and the entropy is zero. As p increases, so too does the entropy, until it reaches a maximum when p = 1-p = 0.5. When p is greater than 0.5, the curve declines symmetrically back to zero, reached when p = 1.
Next week Applications of entropy in coding; minimal-length coding.
Fun
Entropy We conclude that the average information in a BSS is maximised when both outcomes are equally likely. Entropy measures the average uncertainty of the source. (The term entropy is borrowed from thermodynamics; there too it is a measure of the uncertainty or disorder of a system.) Shannon: "My greatest concern was what to call it. I thought of calling it information, but the word was overly used, so I decided to call it uncertainty. When I discussed it with John von Neumann, he had a better idea. Von Neumann told me: You should call it entropy, for two reasons. In the first place your uncertainty function has been used in statistical mechanics under that name, so it already has a name. In the second place, and more important, nobody knows what entropy really is, so in a debate you will always have the advantage."
Entropy In physics (thermodynamics): the arrow of time (Wikipedia). Entropy is the only quantity in the physical sciences that seems to imply a particular direction of progress, sometimes called an arrow of time. As time progresses, the second law of thermodynamics states that the entropy of an isolated system never decreases. Hence, from this perspective, entropy measurement can be thought of as a kind of clock.
Entropy
Entropy, IQ and creativity (Figure: 1/entropy in brain regions ITG, TPOmid, HIP, OLF, PCL, SFGMed, SOG, STG, ROL and INS, correlated with IQ and creativity.)