Ikonen, Enso (1996). Algorithms for process modelling using fuzzy neural networks: A distributed logic processor approach; Department of Process Engineering, University of Oulu; FIN-90570 Oulu, Finland.

Acta Univ. Oul. C95, 1996; Oulu, Finland; (Received 9 December, 1996).

Abstract

Process models are needed in characterising and predicting plant behaviour. Parameterised, experimental models can be justified by the reduced time and effort required in building the models, and their flexibility in real-world modelling problems. Fuzzy neural network techniques provide a way to obtain non-linear experimental process models. Parameter estimation techniques can be used to identify model parameters from measurement data. The model contents can also be presented as rules, which allows the use of human experimental knowledge in initialising model parameters, complementing missing data, and validating the identified model.

Distributed Logic Processors (DLP) are a fuzzy neural network structure. DLP's consist of Logic Processors (LP), the outputs of which are combined using a weighted average. Each logic processor approximates a logical relationship, a fuzzy function, between its inputs and outputs. These logical relationships can be identified from data by adjusting each LP's internal parameters. Two types of algorithms are considered for LP parameter estimation. The first, the Recursive Prediction Error (RPE) method, is based on the gradient of the cost criterion. The second is a guided random search method using games of learning automata. A third algorithm considers on-line modelling situations: the topology-preserving features of Self-Organising Maps (SOM) are used to collect a good set of training data, and the collected data are used in on-line estimation of the parameters of a non-linear model. All three algorithms are applied in modelling flue-gas emissions measured from an industrial Fluidized Bed Combustor (FBC).

Keywords: non-linear models, artificial intelligence, boilers

Acknowledgements

This research was carried out in the Systems Engineering Laboratory of the Department of Process Engineering, University of Oulu, Finland, in 1994 and at the Process Control Laboratory of the Ecole Nationale Supérieure d'Ingénieurs de Génie Chimique in Toulouse, France, in 1995, and was completed at the Department of Process Engineering in 1996.

I would like to express my gratitude to the supervisors of my work: Professor Urpo Kortela, University of Oulu, and Professor Kaddour Najim, Ecole Nationale Supérieure d'Ingénieurs de Génie Chimique, Toulouse. Professor Kortela has provided the ideal framework for my research; without his continuous encouragement this work would not have been possible. Professor Najim generously helped me to overcome obstacles both in practical arrangements and scientific issues, thus making my visit to Toulouse both pleasant and productive. I wish to thank my fellow-workers for most agreeable co-operation. The manuscript was reviewed by Professor Raimo Ylinen and Docent Visa Koivunen. Their valuable comments and constructive suggestions are gratefully acknowledged. I would also like to thank Ms. Hilary Ladd for correcting my English.

Financial support came from the Academy of Finland (1994); the Human Capital and Mobility programme of the Commission of the European Communities (1995); and the graduate school programme GETA of the Ministry of Education (1996). This work was also supported by the Tauno Tönning Foundation and the IVO Foundation.

I am greatly indebted to my parents and my sister for the unconditional support they have shown in the course of my research work. Most of all, I want to thank my wife, Riikka, for her tolerance and understanding.

Enso Ikonen

Oulu, 1 December, 1996

ABSTRACT

ACKNOWLEDGEMENTS

LIST OF SYMBOLS

1. INTRODUCTION

1.1 Contribution
1.2 Contents

2. EXPERIMENTAL PROCESS MODELLING

2.1 Artificial intelligence
2.2 Process modelling
- 2.2.1 First-principle models vs. Experimental models
- 2.2.2 Linear models vs. Non-linear models
2.3 Non-linear black-box models as basis function networks
- 2.3.1 Local models
- 2.3.2 Fuzzy models
- 2.3.3 Global models
- 2.3.4 Function approximation
- 2.3.5 Distributed logic processors
2.4 Parameter estimation
- 2.4.1 Bias–variance dilemma
- 2.4.2 Gradient methods
- 2.4.3 Random search methods
- 2.4.4 On-line modelling

3. FUZZY INFERENCE SYSTEMS

3.1 Fuzzification
3.2 Data base
3.3 Rule base
3.4 Decision logic
3.5 Defuzzification
3.6 Universal function approximation

4. MLP PLATFORM

4.1 Nodes
4.2 Networks
4.3 DLP's as Sugeno models

5. GRADIENT-BASED METHODS

5.1 Simple gradient descent
5.2 Recursive prediction error method
5.3 Simplified recursive prediction error method
5.4 Learning algorithm
5.5 Experiments
- 5.5.1 Linear regression
- 5.5.2 MLP neural network models
- 5.5.3 LP models
- 5.5.4 DLP models
5.6 Discussion

6. LEARNING AUTOMATA APPROACH

6.1 Learning automata
6.2 Assignment of learning automata
6.3 Learning algorithm
6.4 Experiments
6.5 Discussion

7. COMBINED NETWORK

7.1 Self-organising map
- 7.1.1 Kohonen's algorithm
- 7.1.2 Recursive least-squares method
- 7.1.3 Simplified recursive least-squares
7.2 Motivations for the combined network
7.3 Combined network
7.4 Experiments
7.5 Discussion

8. CONCLUSIONS

REFERENCES

APPENDIX I. FBC NOx EMISSION DATA

Fluidized-bed combustion
Data pre-processing

1. Introduction

A process engineer who needs to characterise plant behaviour must choose from a multitude of modelling paradigms. This thesis considers only a few of the available methods, placing emphasis upon efficient capturing and rendering of knowledge. Although some of the classical methods are capable of mimicking even complex non-linear behaviour, their interactiveness, flexibility and ease of use leave much to be desired. Artificial Intelligence (AI) helps us to understand the human mind, whereas systems engineering helps us to formulate our knowledge of the world. From the combination of these, together with the power of modern computers as an implementation platform, fields such as fuzzy neural networks have emerged. AI techniques have opened new and fascinating possibilities to support us in our everyday actions. Systems engineering provides the necessary means for developing and justifying the techniques. This thesis considers applications of AI-inspired methods for creating models for non-linear processes, characterised by time-varying, complex and poorly understood phenomena.

1.1 Contribution

Fuzzy neural networks make it possible to create models for processes exhibiting poorly known non-linear behaviour. A process model can be constructed, provided that information is available in the form of process measurements and/or verbal descriptions. A duality in forms of presentation can be achieved since the contents of a model can be presented as a mathematical equation or a collection of concepts, rules and an inference method. Numerical methods can be applied for raw computations in order to fine-tune an existing model or to build the model starting from scratch. Linguistic knowledge can be used to quickly build an initial model, to complete the measurements when no experimental data can be conveniently collected, and to assess the quality of the model.

The introduction to experimental process modelling given in this thesis is based on a literature review made by the author of this thesis. The presentation of fuzzy inference systems and their MLP implementation consists mostly of a review and summary, where the author was responsible for finding a suitable, unifying framework for the presentation. The indication of the connections between the 0-order Sugeno model and the Distributed Logic Processors (DLP) is attributable to the author.

From the point of view of practical applications, the main contribution of this thesis concerns DLP's trained with gradient-based methods (Ikonen & Najim 1996, Ikonen et al. 1996a, 1996d). It is shown that DLP's can profit from the learning methods widely used with artificial neural networks. The Recursive Prediction Error (RPE) method gives a robust method for training the fuzzy Logic Processor (LP) rule base using measured data patterns. The easily understandable if-then presentation of fuzzy systems can be exploited to clarify the contents of a non-linear process model. The DLP structure has received fairly little interest in the literature, although it has the attractive property of being able to approximate non-linear functions and simultaneously reveal logical relationships, without altering the initial concepts. The gradient-based algorithms have been considered in many other papers; the idea of implementations that benefit from parallelism and locality of computations has been emphasised from the very beginning of neural research. The author is fully responsible for the application and interpretation of the algorithms for DLP's.

From the algorithmic point of view, two original algorithms are proposed in this thesis. First, a novel algorithm using learning automata for LP training is suggested (Ikonen & Najim 1997), originally published in (Ikonen et al. 1996c). Learning automata-based learning as applied to logic processors has not been previously investigated. The principal motivation comes from the possibility of performing a global search in the parameter space. The author is responsible for the ideas and applications concerning assignment of learning automata for LP training. Second, an algorithm using the topology-preserving properties of the Self-Organising Map (SOM) is proposed for improving the quality of training data in non-linear modelling of time-varying processes (Ikonen & Kortela 1995a, 1995b). The idea of using a SOM to collect training patterns in on-line identification was first drafted in (Ikonen 1994). This algorithm provides a trade-off between forgetting based on the location of the data pattern in time and forgetting based on its location in measurement space. It provides a practical method to make available the latest information, concentrating on the true operating area of the process, yet providing explicit prototypes from the whole operating area. The theory of SOM's originates from the given references. The author is fully responsible for the underlying considerations, development and application of the combined network.

The behaviour of these algorithms is illustrated using a realistic modelling example with industrial data. All the programming and simulations considered in this thesis have been performed by the author of this thesis. Applications using measurements from the Fluidized Bed Combustor (FBC) have been considered in (Ikonen & Kortela 1995b, Ikonen et al. 1996c, 1996d, Ikonen & Najim 1996, 1997); a simplified first-principle-based dynamic model for the FBC process is presented in (Ikonen & Kortela 1994b), with special focus on the bed fuel inventory. In (Ikonen et al. 1996a, Ikonen & Heikkinen 1996), data from a pump-valve system is used. The problems addressed in the preceding papers are similar to those considered in this thesis; however, most simulations presented in this thesis are new. The results presented in this thesis are based on a three-year period of research, during which several journal and conference papers have been published. In the publications directly related to the contents of this thesis (Ikonen & Kortela 1994a, 1994b, 1995a, 1995b, Ikonen & Najim 1996, 1997, Ikonen et al. 1996a, 1996c, 1996d), the main contribution was made by the author of this thesis.

1.2 Contents

This thesis is organised in eight chapters. Chapter 2 gives a general overview of experimental process modelling based on sampled data. Commonly used model structures for non-linear modelling are presented in a unifying framework of basis function networks. A short overview of gradient-based and random search parameter estimation methods is given. Chapters 3 and 4 give the basic theory and notations on fuzzy inference systems and their network implementation.

Parameter estimation methods are considered in chapters 5 and 6. The Recursive Prediction Error (RPE) method is presented in chapter 5, and its simplifications are discussed. An application of gradient learning with fuzzy Logic Processors (LP) is given. The rule base of an LP is identified from sample patterns using the RPE method, and the Distributed Logic Processor (DLP) approach is compared to the more traditional Multi-Layer Perceptron (MLP) neural network and linear regression alternatives. Chapter 6 introduces Learning Automata (LA). The assignment of learning automata for the training of the LP connections is discussed, and a novel method is proposed to identify the rule bases of a DLP model.

Chapter 7 considers a self-organising configuration used for pre-processing the training data. Kohonen's Self-Organising Map (SOM) is introduced and the unsupervised learning algorithms are given. A combined network for adaptive non-linear modelling in real time is proposed. Motivations for its use are discussed, and simulation examples illustrate the behaviour of the suggested algorithm. Finally, conclusions on the main issues considered in this thesis are drawn in chapter 8.

8. Conclusions

Process models are used in a large variety of engineering tasks such as design, control, fault detection, and optimisation. Linear approximations are not always sufficient in describing the non-linear relationships encountered in real industrial plants. The complex process phenomena are often poorly understood, and models have to be built based on plant measurements and operators' experience. The basic criterion for assessing any model is its accuracy in describing the plant behaviour. If user-intervention is expected, the transparency of the model is of great importance. The implementational and computational issues also need to be considered as well as the adaptability of the model to the changing plant conditions. This thesis has considered a modelling approach capable of dealing with all the above requirements: the Distributed Logic Processor (DLP) approach.

A general introduction to experimental non-linear modelling was given in chapter 2. Chapters 3 and 4 laid the basis for the DLP models: fuzzy systems and Multi-Layer Perceptrons (MLP). Distributed logic processors were shown to implement a 0-order Sugeno fuzzy model with a parameterized rule base. Logic processors (LP), the basic building blocks of DLP's, were implemented on the MLP platform. The overall DLP model was also presented as a network, and DLP's were seen to provide a synthesis of fuzzy and neural paradigms: fuzzy neural networks.
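
The weighted-average inference of a 0-order Sugeno model can be illustrated with a minimal sketch. This is not the LP structure of the thesis: the Gaussian membership functions, the product used as the AND operator, and the names `gaussian_membership` and `sugeno0_output` are assumptions made only for this example.

```python
import math


def gaussian_membership(x, centre, width):
    """Degree (0..1) to which x belongs to a fuzzy set (Gaussian shape is an assumption)."""
    return math.exp(-0.5 * ((x - centre) / width) ** 2)


def sugeno0_output(inputs, rules):
    """0-order Sugeno inference: weighted average of constant rule consequents.

    rules is a list of (antecedents, consequent) pairs, where antecedents is a
    list of (centre, width) pairs, one per input, and consequent is a constant.
    """
    numerator = 0.0
    denominator = 0.0
    for antecedents, consequent in rules:
        # Firing strength of the rule: product over the inputs (assumed AND operator).
        strength = 1.0
        for x, (centre, width) in zip(inputs, antecedents):
            strength *= gaussian_membership(x, centre, width)
        numerator += strength * consequent
        denominator += strength
    # Fall back to zero when no rule fires at all.
    return numerator / denominator if denominator > 0.0 else 0.0


# Two hypothetical rules on a single input x.
rules = [
    ([(0.0, 1.0)], 10.0),  # IF x is "low"  THEN y = 10
    ([(4.0, 1.0)], 30.0),  # IF x is "high" THEN y = 30
]
y_mid = sugeno0_output([2.0], rules)  # halfway between the two set centres
```

Halfway between the two centres both rules fire equally, so the output is the plain average of the two consequents; near either centre the output approaches that rule's constant. Each consequent can be read directly as the conclusion of an if-then rule, which is the kind of transparency the text refers to.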

Parameter estimation using real data was considered in chapter 5, where methods based on gradient information were applied. The Recursive Prediction Error (RPE) method, with its extensions, is familiar from the literature of both linear systems and MLP neural networks. In this work, the RPE method was applied in estimating the parameters of a DLP model. Simplifications of the full algorithm were considered in order to decrease computational load and memory requirements. Simulations with MLP neural networks confirmed that the simplified methods worked, but at the cost of decreased performance. A study of the LP properties showed their robustness against over-parameterization and noisy data. On the other hand, DLP models were less flexible than MLP neural networks.
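
As an illustration of the recursive form of such an estimator, the following sketch implements one textbook Gauss-Newton type RPE step for a scalar output. It is not the algorithm as implemented in the thesis: the function name `rpe_step`, the forgetting factor, the initial covariance of 1000 and the test model are assumptions. The demonstration uses a model that is linear in its parameters, for which the output gradient is simply the regressor and the step coincides with recursive least squares; for an LP or MLP model the gradient would instead be computed through the network.

```python
def rpe_step(theta, P, psi, error, forgetting=1.0):
    """One recursive prediction error update for a scalar-output model.

    theta: parameter estimates, P: symmetric covariance-like matrix (list of rows),
    psi: gradient of the model output with respect to theta,
    error: measured output minus predicted output.
    """
    n = len(theta)
    P_psi = [sum(P[i][j] * psi[j] for j in range(n)) for i in range(n)]
    denom = forgetting + sum(psi[i] * P_psi[i] for i in range(n))
    # Covariance update: P <- (P - P psi psi^T P / denom) / forgetting
    P_new = [[(P[i][j] - P_psi[i] * P_psi[j] / denom) / forgetting
              for j in range(n)] for i in range(n)]
    # Parameter update: theta <- theta + P_new psi * error
    gain = [sum(P_new[i][j] * psi[j] for j in range(n)) for i in range(n)]
    theta_new = [theta[i] + gain[i] * error for i in range(n)]
    return theta_new, P_new


# Hypothetical noise-free data from y = 2 + 3 x, model y_hat = theta0 + theta1 * x.
theta = [0.0, 0.0]
P = [[1000.0, 0.0], [0.0, 1000.0]]  # large initial covariance (assumed)
for k in range(50):
    x = 0.1 * k
    y = 2.0 + 3.0 * x
    psi = [1.0, x]                   # gradient of y_hat with respect to theta
    y_hat = theta[0] + theta[1] * x
    theta, P = rpe_step(theta, P, psi, y - y_hat)
```

After the 50 samples the estimates are close to the generating values 2 and 3. The simplified variants mentioned in the text reduce the cost of storing and updating the full matrix `P`; they are not reproduced in this sketch.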

From the modelling experiments with FBC emission data, guidelines for choosing among three model structures (linear regression, MLP neural networks, DLP networks) could be drawn. These can be based on the relative importance of the model simplicity, accuracy and explanatory status, as assessed by the engineer in need of a process model. When simple models are desired, linear regression models may provide a reasonable approximation. Whereas linear models were simple and fast to work with, MLP neural networks gave the most accurate, but black-box, models. DLP modelling was computationally demanding; however, non-linearities could be captured by the structure, and an interpretable meaning could be assigned to the parameters of the network. When weight is also put on the explanatory capabilities of a non-linear model, the DLP networks provide the most suitable solution.

A novel training method was proposed in chapter 6. In the stochastic learning automata approach, the main difficulties were in structuring the search space of each automaton. Neither the number of automata nor the number of actions in each automaton should exceed reasonable limits in order for the search to be effective. With LP's, this problem was solved using the fuzzy system's internal structure. Based on a game of automata, the search is not easily fooled by the local minima of the error surface. The convergence to an optimal solution was also shown in the simulations. A drawback of this method is that only a discretised parameter space can be explored. LP's were, however, found to be well suited to discretisation, since the parameter space is naturally bounded inside a hypercube. In simulations, performances similar to the gradient-based methods were obtained, both in terms of training times and prediction accuracies. In general, the random search methods have attracted fairly little interest in the neuro-fuzzy literature, although they have some very appealing features. They are simple, transparent and easy to apply, even for complexly structured or constrained systems.
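
The idea of an automaton searching a discretised parameter range can be sketched as follows. The reinforcement scheme of the thesis is not stated in this summary, so the widely known linear reward-inaction scheme is used here as a stand-in; the function names, the candidate values, the learning rate and the toy reward rule are all assumptions made for illustration.

```python
import random


def choose_action(probabilities, rng):
    """Sample an action index according to the automaton's probability vector."""
    threshold = rng.random()
    cumulative = 0.0
    for index, p in enumerate(probabilities):
        cumulative += p
        if threshold < cumulative:
            return index
    return len(probabilities) - 1  # guard against rounding error


def reward_inaction_update(probabilities, chosen, rewarded, rate=0.1):
    """Linear reward-inaction: reinforce the chosen action on reward, else do nothing."""
    if not rewarded:
        return list(probabilities)
    return [p + rate * (1.0 - p) if i == chosen else p * (1.0 - rate)
            for i, p in enumerate(probabilities)]


# One automaton selecting a value for a single parameter bounded in [0, 1].
candidates = [0.0, 0.25, 0.5, 0.75, 1.0]   # discretised parameter values (assumed)
target = 0.75                               # hypothetical best value
probabilities = [1.0 / len(candidates)] * len(candidates)
rng = random.Random(0)
for _ in range(500):
    action = choose_action(probabilities, rng)
    # Toy environment: reward is more likely the closer the value is to the target.
    rewarded = rng.random() < 1.0 - abs(candidates[action] - target)
    probabilities = reward_inaction_update(probabilities, action, rewarded, rate=0.05)
```

In a game of automata, one such automaton would be assigned to each parameter and all of them would receive a response derived from the model's prediction error. Because every automaton keeps a probability over its whole action set, the search is not tied to the neighbourhood of the current estimate.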

In chapter 7, a hybrid structure for on-line collection of measurements was proposed. A trade-off between state-based quantisation and time-based weighting of the incoming information was implemented using a Self-Organising Map (SOM). Compared with a simple time-based weighting, a much wider training set could be collected. Yet, the time-varying spatial distribution of the data set was also reflected by the training data. The benefits of the structure were illustrated in the simulations, where MLP neural network models were identified from the FBC emission data. Compared to the moving window approach, more accurate, global non-linear models could be identified using the suggested combined network. On-line modelling is a difficult task, where a multitude of issues need to be taken into account. The combined network helps to address the problem of insufficiently rich data. Additionally, the map provided by the SOM layer can be used to reveal internal topologies of the process.
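
The basic Kohonen update on which such a structure builds can be sketched as follows. This shows only the standard self-organising step on a one-dimensional map, not the combined network itself; the Gaussian neighbourhood, the fixed learning rate and the function names are assumptions made for this example.

```python
import math


def best_matching_unit(prototypes, sample):
    """Index of the prototype closest to the sample (squared Euclidean distance)."""
    def squared_distance(prototype):
        return sum((p - s) ** 2 for p, s in zip(prototype, sample))
    return min(range(len(prototypes)), key=lambda i: squared_distance(prototypes[i]))


def som_update(prototypes, sample, rate=0.5, radius=1.0):
    """Move the winning node and its map neighbours towards the sample."""
    winner = best_matching_unit(prototypes, sample)
    updated = []
    for i, prototype in enumerate(prototypes):
        # Neighbourhood weight decays with distance along the map (Gaussian, assumed).
        h = math.exp(-((i - winner) ** 2) / (2.0 * radius ** 2))
        updated.append([p + rate * h * (s - p) for p, s in zip(prototype, sample)])
    return updated, winner


# Three nodes on a 1-D map holding hypothetical (input, output) patterns.
prototypes = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
updated, winner = som_update(prototypes, [1.2, 0.8])
```

Each new measurement mainly refreshes the node nearest to it, so nodes in the current operating area track recent data while nodes elsewhere retain older prototypes. The node contents can then serve as a compact training set for the non-linear model.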

The algorithms considered in this thesis were all tested on the same case plant. The plant was represented by a set of measurements collected from an operating industrial-sized fluidized bed thermal power plant. The considered flue gas emission modelling problem had some features typical of tasks in process industry: close-to-linear mapping, small and noisy data set, and a relatively small input dimension (after pre-processing). Application to other process modelling problems is straightforward. The off-line simulations encourage the application of the considered methods in real on-line situations. If adaptive models are not required, the discussed models can be easily implemented in most of the existing process automation systems. For training purposes, the power of a modern personal computer is quite sufficient.

Whether the engineer's ultimate aim is increasing plant productivity, tighter control or safety margins, environmental issues, or education of the plant personnel, the modern modelling methods inspired by the science of artificial intelligence should have a place in the tool box of a process engineer.
