International Journal of Advanced Research in Computer Engineering & Technology
                                                                                                                     Volume 1, Issue 1, March 2012

An Efficient Clustering Method for Atmospheric Conditions Prediction using ART Algorithm

Ankita Singh (1), Dr. Bhupesh Gour (2), Anshul Khandelwal (1), Harsha Lackhwani (1)
(1) Students of Computer Sc. & Engg., Jai Narain College of Technology, Bhopal
(2) Head of Dept. of Computer Sc. & Engg., Jai Narain College of Technology, Bhopal

Abstract- Prediction of ambient air temperature is a concern in environment, industry and agriculture. The increase in average temperature results in global warming. The aim of this research is to develop an artificial neural network based clustering method for predicting ambient atmospheric conditions in Indian cities. In this paper, we present a clustering method that classifies cities based on atmospheric conditions such as temperature, pressure and humidity. Data representing month-wise atmospheric conditions are presented to an Adaptive Resonance Theory (ART) neural network to form clusters that represent associations between two or more cities. Such associations allow the atmospheric conditions of one city to be predicted on the basis of another. The ART based clustering method shows that months of two cities which fall in the same cluster have similar atmospheric conditions.

Keywords- Atmospheric conditions, artificial neural network, Adaptive Resonance Theory, clustering

    I.   INTRODUCTION

Climate prediction is a concern in environment, industry and agriculture. Climate change is regarded as the foremost environmental problem threatening human beings. Industrial activities contribute strongly to this problem and cause the global warming that the world now faces. The weather is a continuous, data-intensive, multidimensional, dynamic and chaotic process, and these properties make weather forecasting a formidable challenge. Knowing the variability of ambient temperature is important in agriculture because extreme changes in air temperature may cause damage to plants and animals. Air temperature forecasting is also useful in estimating the probability of tornado and flood occurrence in an area. Forecasting is made difficult by the chaotic nature of the atmosphere, the massive computational power required to solve the equations that describe the atmosphere, the error involved in measuring the initial conditions, and an incomplete understanding of atmospheric processes. The use of ensembles and models helps narrow the error and pick the most likely outcome. The steps to predict the temperature are:

    •    Data collection (atmospheric pressure, temperature, wind speed and direction, humidity),
    •    Data assimilation and analysis,
    •    Numerical weather prediction,
    •    Model output post processing.

The advent of digital computers and the development of data driven artificial intelligence approaches like Artificial Neural Networks (ANN) have helped in numerical prediction of atmospheric conditions. ANNs provide a methodology for solving many types of non-linear problems that are difficult to solve by traditional techniques. Most meteorological processes exhibit temporal and spatial variability, and are further plagued by issues of non-linearity of physical processes, conflicting spatial and temporal scales and uncertainty in parameter estimates. ANNs are able to extract the relationship between the inputs and outputs of a process without the physics being explicitly provided. These properties make ANNs well suited to the problem of weather forecasting under consideration.

One artificial neural network approach well suited to this task is ART (Adaptive Resonance Theory). The ART structure is a neural network for cluster formation in an unsupervised learning domain. Adaptive resonance theory was developed to avoid the stability-plasticity dilemma in competitive network learning. The stability-plasticity dilemma addresses how a learning system can preserve its previously learned knowledge while keeping its ability to learn new patterns. ART architecture models can self-organize in real time, producing stable recognition while receiving input patterns beyond those originally stored.

This algorithm tries to fit each new input pattern into an existing class. If no matching class can be found, i.e., the distance between the new pattern and all existing classes exceeds some threshold, a new class is created containing the new pattern. The novelty in this approach is that the network is able to adapt to new incoming patterns while the previous memory is not corrupted. In most neural networks, such as the back propagation network, all patterns must be taught sequentially; the teaching of a new pattern might corrupt the weights for all previously learned patterns. By changing the structure of the network rather than the weights, ART1 overcomes this problem.

    II.   RELATED WORK

Many works have addressed temperature prediction systems. They are summarized below.



  Y. Radhika and M. Shashi present an application of Support Vector Machines (SVMs) for weather prediction. Time series data of daily maximum temperature at a location is analysed to predict the maximum temperature of the next day at that location, based on the daily maximum temperatures for a span of the previous n days, referred to as the order of the input. Performance of the system is observed over various spans of 2 to 10 days by using optimal values of the kernel.

  Mohsen Hayati studied an Artificial Neural Network based on the multilayer perceptron (MLP), which was trained and tested using ten years (1996-2006) of meteorological data. The results show that the MLP network has the minimum forecasting error and can be considered a good method to model short-term temperature forecasting (STTF) systems.

  Brian A. Smith focused on developing ANN models with reduced average prediction error by increasing the number of distinct observations used in training, adding additional input terms that describe the date of an observation, increasing the duration of prior weather data included in each observation, and re-examining the number of hidden nodes used in the network. Models were created to predict air temperature at hourly intervals from one to 12 hours ahead. Each ANN model, consisting of a network architecture and set of associated parameters, was evaluated by instantiating and training 30 networks and calculating the mean absolute error (MAE) of the resulting networks for some set of input

  Arvind Sharma briefly explains how the different connectionist paradigms could be formulated using different learning methods and then investigates whether they can provide the required level of performance, sufficiently good and robust to provide a reliable forecast model for stock market indices. Experimental results reveal that all the connectionist paradigms considered could represent the behaviour of the stock indices very accurately.

  Mike O'Neill focuses on two major practical considerations: the relationship between the amount of training data and the error rate (corresponding to the effort to collect training data to build a model with a given maximum error rate) and the transferability of models' expertise between different datasets (corresponding to the usefulness for general handwritten digit recognition).

  Henry A. Rowley eliminates the difficult task of manually selecting non-face training examples, which must be chosen to span the entire space of non-face images. Simple heuristics, such as using the fact that faces rarely overlap in images, can further improve the accuracy. Comparisons with several other state-of-the-art face detection systems are presented, showing that the system has comparable performance in terms of detection and false-positive rates.

  Dr. S. Santhosh Baboo and I. Kadar Shereef proposed a model using a BPN neural network that has the potential to capture the complex relationships between the many factors that contribute to a certain temperature. The results were compared with the actual working of the meteorological department, and these results confirm that the model had the potential for successful application to temperature forecasting.

    III.   ARTIFICIAL NEURAL NETWORK APPROACH

Adaptive Resonance Theory

A neural network approach is well suited to accomplish this aim. One such approach is ART (Adaptive Resonance Theory). The ART structure is a neural network for cluster formation in an unsupervised learning domain. In this architecture, the number of output nodes cannot be accurately determined in advance.

The adaptive resonance theory (ART) has been developed to avoid the stability-plasticity dilemma in competitive network learning. The stability-plasticity dilemma addresses how a learning system can preserve its previously learned knowledge while keeping its ability to learn new patterns. ART architecture models can self-organize in real time, producing stable recognition while receiving input patterns beyond those originally stored. An ART system consists of two subsystems, an attentional subsystem and an orienting subsystem. The stabilization of learning and activation occurs in the attentional subsystem by matching bottom-up input activation and top-down expectation. The orienting subsystem controls the attentional subsystem when a mismatch occurs in the attentional subsystem. In other words, the orienting subsystem works like a novelty detector.

The basic ART system is an unsupervised learning model. It typically consists of a comparison field and a recognition field composed of neurons, a vigilance parameter, and a reset module. The vigilance parameter has considerable influence on the system: higher vigilance produces highly detailed memories (many, fine-grained categories), while lower vigilance results in more general memories (fewer, more-general categories). The comparison field takes an input vector (a one-dimensional array of values) and transfers it to its best match in the recognition field. Its best match is the single neuron whose set of weights (weight vector) most closely matches the input vector. Each recognition field neuron outputs a negative signal (proportional to that neuron's quality of match to the input vector) to each of the other recognition field neurons and inhibits their output accordingly. In this way the recognition field exhibits lateral inhibition, allowing each neuron in it to represent a category to which input vectors are classified. After the input vector is classified, the reset module compares the strength of the recognition match to the vigilance parameter. If the vigilance threshold is met, training commences. Otherwise, if the match level does not meet the vigilance parameter, the firing recognition neuron is inhibited until a new input vector is applied; training commences only upon completion of a search procedure. In the search procedure, recognition neurons are disabled one by one by the reset function until the vigilance parameter is satisfied by a recognition match. If no committed recognition neuron's match meets the vigilance threshold, then an



uncommitted neuron is committed and adjusted towards matching the input vector.

ART1-Algorithm

The architecture of ART1 NN based clustering is given in Fig. 2. Each input vector activates a winner node in the layer F2, namely the node with the highest value of the product of the input vector and the bottom-up weight vector. The F2 layer then reads out the top-down expectation of the winning node to F1, where the expectation is normalized over the input pattern vector and compared with the vigilance parameter ρ. If the winner and the input vector match within the tolerance allowed by ρ, the ART1 algorithm sets the control gain G2 to 0 and updates the top-down weights corresponding to the winner. If a mismatch occurs, the gain controls G1 and G2 are set to 1 to disable the current node and process the input on another uncommitted node. Once the network is stabilized, the top-down weights corresponding to each node in the F2 layer represent the prototype vector for that node.

Fig 2- Architecture of our ART1 neural network based clusterer. The pattern vector PH, which represents the access patterns of the host H, is the input to the comparison layer F1. The vigilance parameter determines the degree of mismatch that is to be tolerated. The nodes at the recognition layer F2 represent the clusters formed. Once the network stabilizes, the top-down weights corresponding to each node in F2 represent the prototype vector for that node.

Phases of ART1: Processing in ART1 can be divided into four phases: (1) recognition, (2) comparison, (3) search, and (4) learning.

(1) Recognition: Initially, in the recognition or bottom-up activation phase, no input vector I is applied, which disables all recognition in F2 and makes the two control gains, G1 and G2, equal to zero. This causes all F2 elements to be set to zero, giving them an equal chance to win the subsequent recognition competition. When an input vector is applied, one or more of its components must be set to one, thereby making both G1 and G2 equal to one. Thus, the control gain G1 depends on both the input vector I and the output X2 from F2,

\[
G_1 = \begin{cases} 1 & \text{if } I \neq 0 \text{ and } X_2 = 0 \\ 0 & \text{otherwise} \end{cases}
\]

In other words, if there is an input vector I and F2 is not actively producing output, then G1 = 1. Any other combination of activity on I and F2 would inhibit the gain control from exciting units on F1. On the other hand, the output G2 of the gain control module depends only on the input vector I,

\[
G_2 = \begin{cases} 1 & \text{if } I \neq 0 \\ 0 & \text{otherwise} \end{cases}
\]

In other words, if there exists an input vector then G2 = 1 and recognition in F2 is allowed. Each node in F1 receiving a nonzero input value generates an STM pattern activity greater than zero, and the node's output is an exact duplicate of the input vector. Since both X1i and Ii are binary, their values are either 1 or 0,

\[
X_1 = I \quad \text{if } G_1 = 1
\]

Each node in F1 whose activity is beyond the threshold sends excitatory outputs to the F2 nodes. The F1 output pattern X1 is multiplied by the LTM traces W12 connecting F1 to F2. Each node in F2 sums up all its LTM gated signals,

\[
V_{2j} = \sum_i X_{1i} \, W^{12}_{ji}
\]

These connections represent the input pattern classification categories, where each weight vector stores one category. The output X2j is defined so that the element that receives the largest input is clearly enhanced. As such, the competitive network F2 works as a winner-take-all network described by

\[
X_{2j} = \begin{cases} 1 & \text{if } G_2 = 1 \text{ and } V_{2j} = \max_k \{ V_{2k} \} \\ 0 & \text{otherwise} \end{cases}
\]

The F2 unit receiving the largest F1 output is the one that best matches the input vector category, thus winning the competition. The F2 winner node fires, having its value set to one, and inhibits all other nodes in the layer, so that all other nodes are set to zero.

(2) Comparison: In the comparison or top-down template matching phase, the STM activation pattern X2 on F2 generates a top-down template on F1. This pattern is multiplied by the



LTM traces W21 connecting from F2 to F1. Each node in F1 sums up all its LTM gated signals,

\[
V_{1i} = \sum_j X_{2j} \, W^{21}_{ij}
\]

The most active recognition unit from F2 passes a one back to the comparison layer F1. Since the recognition layer is now active, G1 is inhibited and its output is set to zero. In accordance with the "2/3" rule, which states that at least two of three different input sources must be active in order to generate an excitatory output, the only comparison units that will fire are those that receive simultaneous ones from the input vector and the recognition layer. Units not receiving a top-down signal from F2 must be inactive even if they receive input from below. This is summarized as follows:

\[
X_{1i} = \begin{cases} 1 & \text{if } I_i = 1 \text{ and } V_{1i} = 1 \\ 0 & \text{otherwise} \end{cases}
\]

If there is a good match between the top-down template and the input vector, the system becomes stable and learning may occur. If there is a mismatch between the input vector and the activity coming from the recognition layer, this indicates that the pattern being returned is not the one desired and the recognition layer should be inhibited.

(3) Search: The reset layer in the orienting subsystem measures the similarity between the input vector and the recognition layer output pattern. If there is a mismatch between them, the reset layer inhibits the F2 layer activity. The orienting subsystem compares the input vector to the F1 layer output and causes a reset signal if their degree of similarity is less than the vigilance level ρ, where 0 < ρ ≤ 1. An input pattern mismatch occurs if the following inequality is true, where |·| counts the active components of a binary vector:

\[
\frac{|X_1|}{|I|} < \rho
\]

If the two patterns differ by more than the vigilance parameter allows, a reset signal is sent to disable the firing unit in the recognition layer F2. The effect of the reset is to force the output of the recognition layer back to zero, disabling it for the duration of the current classification in order to search for a better match.

The parameter ρ determines how large a mismatch is tolerated. A large vigilance parameter makes the system search for new categories in response to small differences between I and X2, so that it learns to classify input patterns into a large number of finer categories. A small vigilance parameter allows for larger differences, and more input patterns are classified into the same category.

When a mismatch occurs, the total inhibitory signal from F1 to the orienting subsystem is increased. If the inhibition is sufficient, the orienting subsystem fires and sends a reset signal. The activated signal affects the F2 nodes in a state-dependent fashion. If an F2 node is active, the signal, through a mechanism known as a gated dipole field, causes a long-lasting inhibition. When the active F2 node is suppressed, the top-down output pattern X2 and the top-down template V1 are removed and the former F1 activation pattern X1 is generated again.

The newly generated pattern X1 causes the orienting subsystem to cancel the reset signal, and bottom-up activation starts again. Since F2 nodes that have fired receive the long-lasting inhibition, a different F2 unit will win in the recognition layer and a different stored pattern is fed back to the comparison layer. If the pattern once again does not match the input, the whole process is repeated.

If no reset signal is generated this time, the match is adequate and the classification is finished. The above three stages, that is, recognition, comparison, and search, are repeated until the input pattern matches a top-down template X1. Otherwise an F2 node that has not learned any patterns yet is activated. In the latter case, the chosen F2 node becomes a newly learned input pattern recognition category.

(4) Learning: The above three stages take place very quickly relative to the time constants of the learning equations of the LTM traces between F1 and F2. Thus, we can assume that learning occurs only when the STM reset and search process ends and all STM patterns on F1 and F2 are stable. In the equations below, i indexes the F1 nodes and j indexes the F2 nodes. The LTM traces from F1 to F2 follow the equation

\[
\tau_1 \frac{dW^{12}_{ji}}{dt} =
\begin{cases}
(1 - W^{12}_{ji})\,L - W^{12}_{ji}\,(|X_1| - 1) & \text{if F1 node } i \text{ and F2 node } j \text{ are both active} \\
0 & \text{if F2 node } j \text{ is inactive} \\
-|X_1|\,W^{12}_{ji} & \text{if only F2 node } j \text{ is active}
\end{cases}
\]

where τ1 is the time constant and L is a parameter with a value greater than one. Because the time constant τ1 is sufficiently larger than the STM activation time and smaller than the input pattern presentation time, the above is a slow learning equation that converges to the fast learning equation

\[
W^{12}_{ji} =
\begin{cases}
\dfrac{L}{L - 1 + |X_1|} & \text{if F1 node } i \text{ and F2 node } j \text{ are both active} \\
0 & \text{if only F2 node } j \text{ is active} \\
\text{no change} & \text{if F2 node } j \text{ is inactive}
\end{cases}
\]

The initial values of W12 must be randomly chosen while satisfying the inequality 0 < W12ji(0) < L / (L − 1 + M), where M is the input pattern dimension, equal to the number of nodes in F1.

The LTM traces from F2 to F1 follow the equation

\[
\tau_2 \frac{dW^{21}_{ij}}{dt} = X_{2j} \left( -W^{21}_{ij} + X_{1i} \right)
\]

where τ2 is the time constant and the equation is defined to converge during a presentation of an input pattern. Thus, the fast learning equation for W21 is

\[
W^{21}_{ij} =
\begin{cases}
1 & \text{if F1 node } i \text{ and F2 node } j \text{ are both active} \\
0 & \text{if only F1 node } i \text{ is inactive}
\end{cases}
\]

The initial value of W21 must be randomly chosen to satisfy the inequality 1 ≥ W21ij(0) > C, where C is decided by the slow learning equation parameters. However, all W21ij(0) may be set to 1 in the fast learning case.

The collected data represent the month-wise atmospheric conditions under two parameters, namely temperature and pressure, for ten different cities of India: Delhi, Kolkata, Bhopal, Mumbai, Jaipur, Amritsar, Cochin, Lucknow, Bhubaneswar and Guwahati, which are geographically well separated from each other.

   Fig 1. Data collected for city New Delhi - Temperature and Pressure.

After data collection for the various cities, the data were converted into normalised form, i.e. binary values of temperature and pressure for each city.

   Fig 2 Normalized form of input data

The ART1 clustering algorithm is then applied to this binary form of the data to generate class codes for the various inputs, forming clusters that show the correlation between the months of one city and those of another.

   Fig 3 Cities categorized into clusters
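The effect of the vigilance parameter described above can be illustrated with a minimal sketch. It assumes binary input vectors stored as Python lists and computes the match ratio |X1| / |I|, with X1 taken as the component-wise AND of the input and the stored template. The function names `match_ratio` and `reset_needed` are hypothetical and are not taken from the paper.

```python
def match_ratio(input_vec, template):
    """Return |X1| / |I|, where X1 = input AND template (both binary lists)."""
    active = sum(input_vec)
    if active == 0:
        raise ValueError("input vector needs at least one active component")
    overlap = sum(i & t for i, t in zip(input_vec, template))
    return overlap / active


def reset_needed(input_vec, template, rho):
    """True when the match ratio falls below the vigilance level rho."""
    return match_ratio(input_vec, template) < rho


pattern = [1, 1, 0, 1, 0]
template = [1, 1, 0, 0, 0]   # shares 2 of the pattern's 3 active bits

print(match_ratio(pattern, template))          # about 0.667
print(reset_needed(pattern, template, 0.5))    # False: low vigilance accepts the match
print(reset_needed(pattern, template, 0.9))    # True: high vigilance forces a search
```

The same pair of vectors is accepted at ρ = 0.5 and rejected at ρ = 0.9, which is the sense in which higher vigilance yields more, finer categories.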
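The four phases and the fast learning equations above can be sketched as a short clustering routine. This is a simplified illustration under stated assumptions, not the authors' implementation: the STM dynamics and gain controls are not simulated, the search visits committed F2 nodes in order of decreasing bottom-up activation, and when every committed node fails the vigilance test a new node is committed whose top-down weights start at 1, so its template equals the input. The name `art1_cluster` and the defaults `rho=0.7` and `L=2.0` are hypothetical.

```python
def art1_cluster(patterns, rho=0.7, L=2.0):
    """Cluster binary vectors with a simplified fast-learning ART1.

    Returns (labels, prototypes), where prototypes are the learned
    top-down weight vectors, one per committed F2 node.
    """
    prototypes = []   # top-down weights W21 (binary), one list per F2 node
    bottom_up = []    # bottom-up weights W12 (real), one list per F2 node
    labels = []
    for x in patterns:
        norm_x = sum(x)
        if norm_x == 0:
            raise ValueError("each pattern needs at least one active bit")
        # Recognition: rank committed nodes by bottom-up activation V2j.
        order = sorted(
            range(len(prototypes)),
            key=lambda j: sum(w * xi for w, xi in zip(bottom_up[j], x)),
            reverse=True,
        )
        winner = None
        for j in order:
            # Comparison (2/3 rule): X1 = input AND top-down template.
            x1 = [xi & ti for xi, ti in zip(x, prototypes[j])]
            # Search: accept the node only if it passes the vigilance test.
            if sum(x1) / norm_x >= rho:
                winner = j
                break
        if winner is None:
            # No committed node matched: commit a new one (assumed all-ones W21).
            prototypes.append(list(x))
            bottom_up.append([0.0] * len(x))
            winner = len(prototypes) - 1
            x1 = list(x)
        # Learning (fast): W21 becomes X1, W12 becomes L / (L - 1 + |X1|) on active bits.
        prototypes[winner] = x1
        denom = L - 1 + sum(x1)
        bottom_up[winner] = [L / denom if bit else 0.0 for bit in x1]
        labels.append(winner)
    return labels, prototypes


labels, prototypes = art1_cluster(
    [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1]], rho=0.6
)
print(labels)       # [0, 0, 1, 1]
print(prototypes)   # [[1, 1, 0, 0], [0, 0, 0, 1]]
```

In this toy run the second pattern joins the first cluster because two of its three active bits match the stored prototype (2/3 ≥ 0.6), while the third pattern shares no bits with it and commits a new node. The returned labels play the role of the class codes mentioned in the text.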
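The paper does not state how the readings are turned into binary values, so the following is only one plausible encoding, shown as a sketch. It assumes a thermometer code: each reading is scaled to a fixed range and mapped to a run of ones whose length grows with the value, which guarantees at least one active bit as ART1 requires. The function names, the number of bits, the value ranges and the sample readings are all hypothetical and are not the paper's data.

```python
def thermometer_code(value, low, high, bits=4):
    """Encode a reading in [low, high] as a run of 1s followed by 0s."""
    if high <= low:
        raise ValueError("high must be greater than low")
    frac = (value - low) / (high - low)
    frac = min(max(frac, 0.0), 1.0)            # clamp out-of-range readings
    level = min(bits, int(frac * bits) + 1)     # 1..bits, never all zeros
    return [1 if k < level else 0 for k in range(bits)]


def encode_month(temperature, pressure, temp_range, pressure_range, bits=4):
    """Concatenate the codes of both parameters into one binary input vector."""
    return (thermometer_code(temperature, *temp_range, bits=bits)
            + thermometer_code(pressure, *pressure_range, bits=bits))


# Made-up illustrative ranges: 0-40 degrees C and 990-1020 hPa.
TEMP_RANGE = (0.0, 40.0)
PRESSURE_RANGE = (990.0, 1020.0)

print(encode_month(30.0, 1000.0, TEMP_RANGE, PRESSURE_RANGE))
# [1, 1, 1, 1, 1, 1, 0, 0]
print(encode_month(10.0, 1015.0, TEMP_RANGE, PRESSURE_RANGE))
# [1, 1, 0, 0, 1, 1, 1, 1]
```

Each month of each city then becomes one binary vector, the form ART1 expects as input. A finer code (more bits) would let the vigilance parameter separate smaller differences in the readings.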

As a final result, the two major clusters formed are encircled and are marked by a cross and a check in Figure 4. These



encircled figures show the correlation between two cities. These clusters show that there is a close association between the atmospheric conditions of the cities. This correlation helps to predict the atmospheric conditions of one city from the atmospheric conditions of other cities in the same cluster.

It also shows that months of particular cities which fall in the same category either have the same atmospheric conditions or have conditions that are going to become similar. Such associations are meaningful in the prediction of atmospheric conditions, which helps to protect against the loss of human life, cattle and crops.

    V.   CONCLUSION

Temperature warnings are important forecasts because they are used to protect life and property. Temperature forecasting is the application of science and technology to predict the state of the temperature for a future time and a given location. Forecasts are made by collecting quantitative data about the current state of the atmosphere. The Neural Networks package supports different types of training or learning algorithms. One such algorithm is Adaptive Resonance Theory (ART), based on the artificial neural network (ANN) technique. The proposed ART1 based clustering method is shown to be efficient in correlating the atmospheric conditions between two or more cities, and hence helps in the prediction of atmospheric conditions in one particular city based on the atmospheric conditions of another city of the same cluster.

The main advantage of the ART algorithm is that it can fairly form clusters of similar properties, so that the prediction of atmospheric conditions of various cities can be differentiated. The simple meaning of this term is that our model has the potential to capture the complex relationships between many factors that contribute to certain atmospheric

Probability", Berkeley, University of California Press, 1:281-297.

[3] Andreas Nürnberger and Marcin Detyniecki, "Content Based Analysis of Email Databases Using Self-Organizing Maps", Proceedings of the European Symposium on Intelligent Technologies, Hybrid Systems and their implementation on Smart Adaptive Systems EUNITE'2001, Tenerife, Spain, pp. 134-142, December, 2001.

[4] Arvind Sharma, Prof. Manish Manoria, "A Weather Forecasting System using concept of Soft Computing".

[5] Bhupesh Gour, et al., "ART Neural Network Based Clustering Method Produces Best Quality Clusters of Fingerprints in Comparison to Self Organizing Map and K-Means Clustering Algorithms", IEEE Communications Society Explore, pp. 282-286, Dec. 2008.

[6] Bhupesh Gour, et al., "Fingerprint Clustering and Its Application to Generate Class Code Using ART Neural Network", IEEE Computer Society Explore, pp. 686-690, July

[7] Brian A. Smith, Ronald W. McClendon, and Gerrit Hoogenboom, "Improving Air Temperature Prediction with Artificial Neural Networks", International Journal of Computational Intelligence 3;3 2007.

[8] D. Jiang, Y. Zhang, X. Hu, Y. Zeng, J. Tan, D. Shao. Progress in developing an ANN model for air pollution index forecast. Atmospheric Environment. 2004, 38: pp. 7055-7064.

[9] S. Palani, P. Tkalich, R. Balasubramanian, J. Palanichamy. ANN application for prediction of atmospheric nitrogen deposition to aquatic ecosystems. Marine Pollution Bulletin. 2011, in press.

[10] Gour Bhupesh, Khan Asif Ullah (2012). Atmospheric
                                                                   Condition based Clustering using ART Neural in International
                                                                   Journal of Information and Communication Technology
                                                                   Research.Bhopal, 2012.
                         V. REFERENCES
                                                                   [11] Abbas Osama Abu, ”Comparision between Data
[1] Dr. S. Santhosh Baboo and I.Kadar Shereef: “An Efficient
                                                                   Clustering Algorithm” In International Arab Journal of
Temperature Prediction System using BPN Neural Network”,
                                                                   Information Technology,2008.

[2] J. B. MacQueen (1967): "Some Methods for classification
and Analysis of Multivariate Observations, Proceedings of 5-
th Berkeley Symposium on Mathematical Statistics and
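
As an illustration of the clustering step summarized in the conclusion, the following is a minimal sketch of a fast-learning ART1-style procedure, not the authors' implementation. The function name art1_cluster, the city-month labels, and the six-bit encoding (three temperature bands followed by three pressure bands) are all hypothetical, since the paper does not specify how the binary normalisation was carried out. The parameters rho (vigilance) and L follow the symbols used in the ART1 description.

```python
def art1_cluster(patterns, rho=0.7, L=2.0):
    """Cluster binary patterns with a fast-learning ART1-style procedure.

    Returns (labels, prototypes): one cluster index per pattern and the
    top-down template of each committed F2 node.
    """
    prototypes = []  # top-down templates, one per committed F2 node
    labels = []
    for I in patterns:
        norm_I = sum(I)
        if norm_I == 0:
            labels.append(None)  # an all-zero input activates nothing
            continue

        # Recognition: rank committed nodes by bottom-up activation,
        # using the fast-learning weights L / (L - 1 + |w|).
        def activation(j):
            w = prototypes[j]
            overlap = sum(a & b for a, b in zip(I, w))
            return L * overlap / (L - 1 + sum(w))

        order = sorted(range(len(prototypes)), key=activation, reverse=True)

        chosen = None
        for j in order:
            # Comparison: match the input against the top-down template.
            match = [a & b for a, b in zip(I, prototypes[j])]
            if sum(match) / norm_I >= rho:
                prototypes[j] = match  # learning: template := I AND template
                chosen = j
                break
            # Otherwise the node is reset and the search continues.

        if chosen is None:
            # No committed node passed vigilance: commit a new node.
            prototypes.append(list(I))
            chosen = len(prototypes) - 1
        labels.append(chosen)
    return labels, prototypes


# Hypothetical binary-coded records (illustrative values, not the paper's
# data): bits are [T high, T medium, T low, P high, P medium, P low].
records = {
    "CityA-Jan": [0, 0, 1, 1, 0, 0],
    "CityB-Jan": [0, 0, 1, 1, 0, 0],
    "CityA-Jun": [1, 0, 0, 0, 0, 1],
    "CityB-Jun": [1, 0, 0, 0, 1, 0],
    "CityC-Jun": [1, 0, 0, 0, 0, 1],
}

labels, prototypes = art1_cluster(list(records.values()), rho=0.7)
for name, label in zip(records, labels):
    print(name, "-> cluster", label)
```

With rho = 0.7 this toy input yields three clusters (the two January records together, CityA-Jun with CityC-Jun, and CityB-Jun alone); lowering rho to 0.5 merges all three June records into one cluster. This is the effect of the vigilance parameter described for ART1: higher vigilance gives finer categories.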

