(Go: >> BACK << -|- >> HOME <<)

SlideShare a Scribd company logo
Do not Match, Inherit: Fitness Surrogates for
Genetics-Based Machine Learning Techniques


       Xavier Llorà1,2, Kumara Sastry2, Tian-Li Yu3, David E. Goldberg2

1 National   Center for Supercomputing Applications. University of Illinois at Urbana-Champaign
     2 Illinois   Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign
                  3 Department   of Electrical Engineering, National Taiwan University




                          Supported by AFOSR FA9550-06-1-0370, NSF at ISS-02-09199
GECCO 2007 HUMIES                                                                             1
Motivation
• Competent GBML
      – Use competent GAs to approach GBML problems
      – Take advantage of competent GA scalability
      – Provide insight about problem structure
      – χeCCS by Llorà, Sastry, Goldberg & de la Ossa (2006)

• Rule matching may thread practical applications
      – Even for small dimensional problems (MUX 20), rule matching
        may take more than 85% of the execution time in XCS
      – As dimensionality or cardinality of training sets increase, rule
        matching rules the overall execution time
      – Efficient implementation (Llora & Sastry, 2006) still require
        matching rules

GECCO 2007                     Llorà, Sastry, Yu & Goldberg                2
Motivation
• Competent GAs
      – Byproduct: Models and problem structure insight
      – Revision of the fitness relaxation for expensive fitness
             evaluations
      – Idea: Build a cheap surrogate fitness accurate enough
      – Successfully applied to GA (Sastry, Lima & Goldberg, 2006)
      – Help cut down the number of fitness evaluations

• GBML
      – Can we transfer the same ideas to GBML approaches?
      – What are the requirements needed for competent GBML to
             benefit from fitness relaxation?

GECCO 2007                         Llorà, Sastry, Yu & Goldberg      3
Outline
• Overview χeCCS
• Evaluation relaxation
• Fitness inheritance using least squares fitting
• Fitness inheritance and χeCCS
• Results
• Conclusions




GECCO 2007            Llorà, Sastry, Yu & Goldberg   4
χ-ari Extended Compact Classifier System

• No reinforcement learning is used
• A competent GA is in charge of the learning
• The idea:
      – A population of single rules
      – For each rule we compute its fitness
      – The χ-ari extended compact genetic algorithm
      – Niching to maintain different accurate rules (restricted
        tournament replacement)




GECCO 2007                  Llorà, Sastry, Yu & Goldberg           5
Maximally Accurate and General Rules
• Accuracy and generality can be compute as

                    n t + (r) + n t# (r)                        n t + (r)
             quot;(r) =                                      quot;(r) =
                             nt                                   nm
 • Fitness should combine accuracy and generality
                           f (r) = quot;(r) # $(r)%
  !                              !
 • Such measure can be either applied to rules or rule sets

             !



GECCO 2007                    Llorà, Sastry, Yu & Goldberg                  6
Maximally Accurate and General Rules




GECCO 2007         Llorà, Sastry, Yu & Goldberg   7
Extended Compact Genetic Algorithm
  • A Probabilistic model building GA (Harik, 1999)
        – Builds models of good solutions as linkage groups
  • Key idea:
        – Good probability distribution → Linkage learning

  • Key components:
        – Representation: Marginal product model (MPM)
             • Marginal distribution of a gene partition

        – Quality: Minimum description length (MDL)
             • Occam’s razor principle
             • All things being equal, simpler models are better

        – Search Method: Greedy heuristic search

GECCO 2007                               Llorà, Sastry, Yu & Goldberg   8
Marginal Product Model (MPM)
  • Partition variables into disjoint sets
  • Product of marginal distributions on a partition of
       genes
  • Gene partition maps to linkage groups
               MPM: [1, 2, 3], [4, 5, 6], … [l-2, l -1, l]


                                               ...            xl-2 xl-1 xl
                x1 x2 x3    x4 x5 x6



               {p000, p001, p00#, p010, p011, p01#, p100, p101,
               p10#, p110, p111, p11# … }(27 probabilities)
GECCO 2007                     Llorà, Sastry, Yu & Goldberg                  9
Minimum Description Length Metric
  • Hypothesis: For an optimal model
        – Model size and error is minimum

  • Model complexity, Cm
        – # of bits required to store all marginal probabilities



  • Compressed population complexity, Cp
        – Entropy of the marginal distribution over all partitions




  • MDL metric, Cc = Cm + Cp
GECCO 2007                      Llorà, Sastry, Yu & Goldberg         10
Building an Optimal MPM
  •     Assume independent genes ([1],[2],…,[l])

  •     Compute MDL metric, Cc

  •     All combinations of two subset merges
             Eg., {([1,2],[3],…,[l]), ([1,3],[2],…,[l]), ([1],[2],…,[l-1,l])}
        •

  •     Compute MDL metric for all model candidates

  •     Select the set with minimum MDL,

  •     If           , accept the model and go to step 2.

  •     Else, the current model is optimal




GECCO 2007                           Llorà, Sastry, Yu & Goldberg               11
χeCCS Models for Different Multiplexers
Building Block Size Increases
Fitness Inheritance using Least Squares
• Proposed by Sastry, Lima & Goldberg (2006)
• Surrogate is a regression using basis identified by BBs
• A simple example: [1,3] [2] [4]
• The schemas represented are
      – {0*0*, 0*1*, 0*#*, 1*0*, 1*1*, 1*#*, #*0*, #*1*, #*#*,
        *0**, *1**, *#**, ***0 , ***1 , ***#}
• Recode the individuals by




GECCO 2007                  Llorà, Sastry, Yu & Goldberg         13
Fitness Inheritance using Least Squares
• Recoding defines matrix A




• Normalize the fitness




GECCO 2007                Llorà, Sastry, Yu & Goldberg   14
Fitness Inheritance using Least Squares
• Solve using least squares




• Once solved, the fitness surrogate take the following form




GECCO 2007                Llorà, Sastry, Yu & Goldberg         15
Fitness Inheritance and χeCCS
• Two different problems




               Hidden XOR                            6-input multiplexer


GECCO 2007                  Llorà, Sastry, Yu & Goldberg                   16
Hidden XOR
• Evolved rules and model




• Surrogate accuracy




GECCO 2007             Llorà, Sastry, Yu & Goldberg   17
6-input Multiplexer
• The evolved solution and model




• The surrogate is totally off



GECCO 2007            Llorà, Sastry, Yu & Goldberg   18
6-input Multiplexer
• The key = missing basis




• χeCCS is able to solve the problem quickly, reliably,
    and accurately
• However, the model basis are not accurate enough
    to build a proper surrogate
GECCO 2007             Llorà, Sastry, Yu & Goldberg   19
Overlapping BBs using DSMGA
• Proposed by Yu, Yassine, Goldberg and Chen (2003)
• Based on organizational theory
• Main property = DSMGA model builder (DSMcluster)
    deals with overlapping building blocks
• The main issue = translate a populations or rules
    into a dependency structure matrix (DSM)
• The intuition = specific bits are the ones responsible
    for the kind of linkage we seek


GECCO 2007             Llorà, Sastry, Yu & Goldberg    20
Jumping to the Results
• DSMcluster model for the hidden XOR
      – [i0 i1 i2] [i3] [i4] [i5]


• DSMcluster model for the 6-input multiplexer
      – [i0 i1] <i2 i3 i4 i5>
      – It identifies a BB [i0 i1] of variables interacting with a
        bus <i2 i3 i4 i5>
      – Translated into χeCCS language:
                 [i0 i1 i2] [i0 i1 i3] [i0 i1 i4] [i0 i1 i5]
      – The right model which provides the right set of basis

GECCO 2007                      Llorà, Sastry, Yu & Goldberg     21
Conclusions
• The matching process is crucial and expensive
• Efficient implementations can take us far to a point
• Relaxation can get rid of the need of matching
• For some types of problems overlapping BBs are
    required
• DSMGA provides the proper machinery to identify
    the proper basis for such a surrogate




GECCO 2007              Llorà, Sastry, Yu & Goldberg     22
Do not Match, Inherit: Fitness Surrogates for
Genetics-Based Machine Learning Techniques


       Xavier Llorà1,2, Kumara Sastry2, Tian-Li Yu3, David E. Goldberg2

1 National   Center for Supercomputing Applications. University of Illinois at Urbana-Champaign
     2 Illinois   Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign
                  3 Department   of Electrical Engineering, National Taiwan University




                          Supported by AFOSR FA9550-06-1-0370, NSF at ISS-02-09199
GECCO 2007 HUMIES                                                                             23

More Related Content

Similar to Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning Techniques

Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...
Xavier Llorà
 
Structural Optimization using Genetic Algorithms - Artificial Intelligence Fu...
Structural Optimization using Genetic Algorithms - Artificial Intelligence Fu...Structural Optimization using Genetic Algorithms - Artificial Intelligence Fu...
Structural Optimization using Genetic Algorithms - Artificial Intelligence Fu...
Ahmed Gamal Abdel Gawad
 
IRJET- Optimization of Riser through Genetic Alorithm
IRJET- Optimization of Riser through Genetic AlorithmIRJET- Optimization of Riser through Genetic Alorithm
IRJET- Optimization of Riser through Genetic Alorithm
IRJET Journal
 
IRJET- Optimization of Riser through Genetic Alorithm
IRJET- Optimization of Riser through Genetic AlorithmIRJET- Optimization of Riser through Genetic Alorithm
IRJET- Optimization of Riser through Genetic Alorithm
IRJET Journal
 
Grds international conference on pure and applied science (5)
Grds international conference on pure and applied science (5)Grds international conference on pure and applied science (5)
Grds international conference on pure and applied science (5)
Global R & D Services
 
Grds international conference on pure and applied science (6)
Grds international conference on pure and applied science (6)Grds international conference on pure and applied science (6)
Grds international conference on pure and applied science (6)
Global R & D Services
 
Second Order Heuristics in ACGP
Second Order Heuristics in ACGPSecond Order Heuristics in ACGP
Second Order Heuristics in ACGP
hauschildm
 
A Genetic Algorithm Based Approach for Solving Optimal Power Flow Problem
A Genetic Algorithm Based Approach for Solving Optimal Power Flow ProblemA Genetic Algorithm Based Approach for Solving Optimal Power Flow Problem
A Genetic Algorithm Based Approach for Solving Optimal Power Flow Problem
Shubhashis Shil
 
IRJET- Comprehensive Analysis on Optimal Allocation and Sizing of Distributed...
IRJET- Comprehensive Analysis on Optimal Allocation and Sizing of Distributed...IRJET- Comprehensive Analysis on Optimal Allocation and Sizing of Distributed...
IRJET- Comprehensive Analysis on Optimal Allocation and Sizing of Distributed...
IRJET Journal
 
Cz24655657
Cz24655657Cz24655657
Cz24655657
IJERA Editor
 
Higgs Boson Challenge
Higgs Boson ChallengeHiggs Boson Challenge
Higgs Boson Challenge
Raouf KESKES
 
50120130406046
5012013040604650120130406046
50120130406046
IAEME Publication
 
SpecAugment review
SpecAugment reviewSpecAugment review
SpecAugment review
June-Woo Kim
 
Genetic Algorithms and Genetic Programming for Multiscale Modeling
Genetic Algorithms and Genetic Programming for Multiscale ModelingGenetic Algorithms and Genetic Programming for Multiscale Modeling
Genetic Algorithms and Genetic Programming for Multiscale Modeling
kknsastry
 
A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...
A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...
A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...
Vahid Taslimitehrani
 
40220140502004
4022014050200440220140502004
40220140502004
IAEME Publication
 
Recent Advances in CPLEX 12.6.1
Recent Advances in CPLEX 12.6.1Recent Advances in CPLEX 12.6.1
Recent Advances in CPLEX 12.6.1
IBM Decision Optimization
 
Using particle swarm optimization to solve test functions problems
Using particle swarm optimization to solve test functions problemsUsing particle swarm optimization to solve test functions problems
Using particle swarm optimization to solve test functions problems
riyaniaes
 
I045046066
I045046066I045046066
I045046066
IJERA Editor
 
Artificial Neural Networks_Bioinsspired_Algorithms_Nov 20.ppt
Artificial Neural Networks_Bioinsspired_Algorithms_Nov 20.pptArtificial Neural Networks_Bioinsspired_Algorithms_Nov 20.ppt
Artificial Neural Networks_Bioinsspired_Algorithms_Nov 20.ppt
Anonymous9etQKwW
 

Similar to Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning Techniques (20)

Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...
 
Structural Optimization using Genetic Algorithms - Artificial Intelligence Fu...
Structural Optimization using Genetic Algorithms - Artificial Intelligence Fu...Structural Optimization using Genetic Algorithms - Artificial Intelligence Fu...
Structural Optimization using Genetic Algorithms - Artificial Intelligence Fu...
 
IRJET- Optimization of Riser through Genetic Alorithm
IRJET- Optimization of Riser through Genetic AlorithmIRJET- Optimization of Riser through Genetic Alorithm
IRJET- Optimization of Riser through Genetic Alorithm
 
IRJET- Optimization of Riser through Genetic Alorithm
IRJET- Optimization of Riser through Genetic AlorithmIRJET- Optimization of Riser through Genetic Alorithm
IRJET- Optimization of Riser through Genetic Alorithm
 
Grds international conference on pure and applied science (5)
Grds international conference on pure and applied science (5)Grds international conference on pure and applied science (5)
Grds international conference on pure and applied science (5)
 
Grds international conference on pure and applied science (6)
Grds international conference on pure and applied science (6)Grds international conference on pure and applied science (6)
Grds international conference on pure and applied science (6)
 
Second Order Heuristics in ACGP
Second Order Heuristics in ACGPSecond Order Heuristics in ACGP
Second Order Heuristics in ACGP
 
A Genetic Algorithm Based Approach for Solving Optimal Power Flow Problem
A Genetic Algorithm Based Approach for Solving Optimal Power Flow ProblemA Genetic Algorithm Based Approach for Solving Optimal Power Flow Problem
A Genetic Algorithm Based Approach for Solving Optimal Power Flow Problem
 
IRJET- Comprehensive Analysis on Optimal Allocation and Sizing of Distributed...
IRJET- Comprehensive Analysis on Optimal Allocation and Sizing of Distributed...IRJET- Comprehensive Analysis on Optimal Allocation and Sizing of Distributed...
IRJET- Comprehensive Analysis on Optimal Allocation and Sizing of Distributed...
 
Cz24655657
Cz24655657Cz24655657
Cz24655657
 
Higgs Boson Challenge
Higgs Boson ChallengeHiggs Boson Challenge
Higgs Boson Challenge
 
50120130406046
5012013040604650120130406046
50120130406046
 
SpecAugment review
SpecAugment reviewSpecAugment review
SpecAugment review
 
Genetic Algorithms and Genetic Programming for Multiscale Modeling
Genetic Algorithms and Genetic Programming for Multiscale ModelingGenetic Algorithms and Genetic Programming for Multiscale Modeling
Genetic Algorithms and Genetic Programming for Multiscale Modeling
 
A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...
A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...
A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...
 
40220140502004
4022014050200440220140502004
40220140502004
 
Recent Advances in CPLEX 12.6.1
Recent Advances in CPLEX 12.6.1Recent Advances in CPLEX 12.6.1
Recent Advances in CPLEX 12.6.1
 
Using particle swarm optimization to solve test functions problems
Using particle swarm optimization to solve test functions problemsUsing particle swarm optimization to solve test functions problems
Using particle swarm optimization to solve test functions problems
 
I045046066
I045046066I045046066
I045046066
 
Artificial Neural Networks_Bioinsspired_Algorithms_Nov 20.ppt
Artificial Neural Networks_Bioinsspired_Algorithms_Nov 20.pptArtificial Neural Networks_Bioinsspired_Algorithms_Nov 20.ppt
Artificial Neural Networks_Bioinsspired_Algorithms_Nov 20.ppt
 

More from Xavier Llorà

Meandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewMeandre 2.0 Alpha Preview
Meandre 2.0 Alpha Preview
Xavier Llorà
 
Soaring the Clouds with Meandre
Soaring the Clouds with MeandreSoaring the Clouds with Meandre
Soaring the Clouds with Meandre
Xavier Llorà
 
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
Xavier Llorà
 
Large Scale Data Mining using Genetics-Based Machine Learning
Large Scale Data Mining using   Genetics-Based Machine LearningLarge Scale Data Mining using   Genetics-Based Machine Learning
Large Scale Data Mining using Genetics-Based Machine Learning
Xavier Llorà
 
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...
Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...
Xavier Llorà
 
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new TrendsScalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Xavier Llorà
 
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
Xavier Llorà
 
Learning Classifier Systems for Class Imbalance Problems
Learning Classifier Systems  for Class Imbalance  ProblemsLearning Classifier Systems  for Class Imbalance  Problems
Learning Classifier Systems for Class Imbalance Problems
Xavier Llorà
 
A Retrospective Look at A Retrospective Look at Classifier System ResearchCl...
A Retrospective Look at  A Retrospective Look at  Classifier System ResearchCl...A Retrospective Look at  A Retrospective Look at  Classifier System ResearchCl...
A Retrospective Look at A Retrospective Look at Classifier System ResearchCl...
Xavier Llorà
 
XCS: Current capabilities and future challenges
XCS: Current capabilities and future  challengesXCS: Current capabilities and future  challenges
XCS: Current capabilities and future challenges
Xavier Llorà
 
Negative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly DetectionNegative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly Detection
Xavier Llorà
 
Searle, Intentionality, and the Future of Classifier Systems
Searle, Intentionality, and the  Future of Classifier SystemsSearle, Intentionality, and the  Future of Classifier Systems
Searle, Intentionality, and the Future of Classifier Systems
Xavier Llorà
 
Computed Prediction: So far, so good. What now?
Computed Prediction:  So far, so good. What now?Computed Prediction:  So far, so good. What now?
Computed Prediction: So far, so good. What now?
Xavier Llorà
 
NIGEL 2006 welcome
NIGEL 2006 welcomeNIGEL 2006 welcome
NIGEL 2006 welcome
Xavier Llorà
 
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the CloudsMeandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
Xavier Llorà
 
ZigZag: The Meandring Language
ZigZag: The Meandring LanguageZigZag: The Meandring Language
ZigZag: The Meandring Language
Xavier Llorà
 
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
Xavier Llorà
 
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Xavier Llorà
 
The DISCUS project
The DISCUS projectThe DISCUS project
The DISCUS project
Xavier Llorà
 
Visualizing content in metadata stores
Visualizing content in metadata storesVisualizing content in metadata stores
Visualizing content in metadata stores
Xavier Llorà
 

More from Xavier Llorà (20)

Meandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewMeandre 2.0 Alpha Preview
Meandre 2.0 Alpha Preview
 
Soaring the Clouds with Meandre
Soaring the Clouds with MeandreSoaring the Clouds with Meandre
Soaring the Clouds with Meandre
 
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
 
Large Scale Data Mining using Genetics-Based Machine Learning
Large Scale Data Mining using   Genetics-Based Machine LearningLarge Scale Data Mining using   Genetics-Based Machine Learning
Large Scale Data Mining using Genetics-Based Machine Learning
 
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...
Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...
 
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new TrendsScalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
 
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
 
Learning Classifier Systems for Class Imbalance Problems
Learning Classifier Systems  for Class Imbalance  ProblemsLearning Classifier Systems  for Class Imbalance  Problems
Learning Classifier Systems for Class Imbalance Problems
 
A Retrospective Look at A Retrospective Look at Classifier System ResearchCl...
A Retrospective Look at  A Retrospective Look at  Classifier System ResearchCl...A Retrospective Look at  A Retrospective Look at  Classifier System ResearchCl...
A Retrospective Look at A Retrospective Look at Classifier System ResearchCl...
 
XCS: Current capabilities and future challenges
XCS: Current capabilities and future  challengesXCS: Current capabilities and future  challenges
XCS: Current capabilities and future challenges
 
Negative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly DetectionNegative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly Detection
 
Searle, Intentionality, and the Future of Classifier Systems
Searle, Intentionality, and the  Future of Classifier SystemsSearle, Intentionality, and the  Future of Classifier Systems
Searle, Intentionality, and the Future of Classifier Systems
 
Computed Prediction: So far, so good. What now?
Computed Prediction:  So far, so good. What now?Computed Prediction:  So far, so good. What now?
Computed Prediction: So far, so good. What now?
 
NIGEL 2006 welcome
NIGEL 2006 welcomeNIGEL 2006 welcome
NIGEL 2006 welcome
 
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the CloudsMeandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
 
ZigZag: The Meandring Language
ZigZag: The Meandring LanguageZigZag: The Meandring Language
ZigZag: The Meandring Language
 
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
 
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
 
The DISCUS project
The DISCUS projectThe DISCUS project
The DISCUS project
 
Visualizing content in metadata stores
Visualizing content in metadata storesVisualizing content in metadata stores
Visualizing content in metadata stores
 

Recently uploaded

Blockchain and Cyber Defense Strategies in new genre times
Blockchain and Cyber Defense Strategies in new genre timesBlockchain and Cyber Defense Strategies in new genre times
Blockchain and Cyber Defense Strategies in new genre times
anupriti
 
一比一原版(msvu毕业证书)圣文森山大学毕业证如何办理
一比一原版(msvu毕业证书)圣文森山大学毕业证如何办理一比一原版(msvu毕业证书)圣文森山大学毕业证如何办理
一比一原版(msvu毕业证书)圣文森山大学毕业证如何办理
uuuot
 
How RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptxHow RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptx
SynapseIndia
 
What's Next Web Development Trends to Watch.pdf
What's Next Web Development Trends to Watch.pdfWhat's Next Web Development Trends to Watch.pdf
What's Next Web Development Trends to Watch.pdf
SeasiaInfotech2
 
What Not to Document and Why_ (North Bay Python 2024)
What Not to Document and Why_ (North Bay Python 2024)What Not to Document and Why_ (North Bay Python 2024)
What Not to Document and Why_ (North Bay Python 2024)
Margaret Fero
 
Interaction Latency: Square's User-Centric Mobile Performance Metric
Interaction Latency: Square's User-Centric Mobile Performance MetricInteraction Latency: Square's User-Centric Mobile Performance Metric
Interaction Latency: Square's User-Centric Mobile Performance Metric
ScyllaDB
 
K2G - Insurtech Innovation EMEA Award 2024
K2G - Insurtech Innovation EMEA Award 2024K2G - Insurtech Innovation EMEA Award 2024
K2G - Insurtech Innovation EMEA Award 2024
The Digital Insurer
 
Implementations of Fused Deposition Modeling in real world
Implementations of Fused Deposition Modeling  in real worldImplementations of Fused Deposition Modeling  in real world
Implementations of Fused Deposition Modeling in real world
Emerging Tech
 
MYIR Product Brochure - A Global Provider of Embedded SOMs & Solutions
MYIR Product Brochure - A Global Provider of Embedded SOMs & SolutionsMYIR Product Brochure - A Global Provider of Embedded SOMs & Solutions
MYIR Product Brochure - A Global Provider of Embedded SOMs & Solutions
Linda Zhang
 
AC Atlassian Coimbatore Session Slides( 22/06/2024)
AC Atlassian Coimbatore Session Slides( 22/06/2024)AC Atlassian Coimbatore Session Slides( 22/06/2024)
AC Atlassian Coimbatore Session Slides( 22/06/2024)
apoorva2579
 
Coordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar SlidesCoordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar Slides
Safe Software
 
Lessons Of Binary Analysis - Christien Rioux
Lessons Of Binary Analysis - Christien RiouxLessons Of Binary Analysis - Christien Rioux
Lessons Of Binary Analysis - Christien Rioux
crioux1
 
Why do You Have to Redesign?_Redesign Challenge Day 1
Why do You Have to Redesign?_Redesign Challenge Day 1Why do You Have to Redesign?_Redesign Challenge Day 1
Why do You Have to Redesign?_Redesign Challenge Day 1
FellyciaHikmahwarani
 
How Netflix Builds High Performance Applications at Global Scale
How Netflix Builds High Performance Applications at Global ScaleHow Netflix Builds High Performance Applications at Global Scale
How Netflix Builds High Performance Applications at Global Scale
ScyllaDB
 
Running a Go App in Kubernetes: CPU Impacts
Running a Go App in Kubernetes: CPU ImpactsRunning a Go App in Kubernetes: CPU Impacts
Running a Go App in Kubernetes: CPU Impacts
ScyllaDB
 
Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...
BookNet Canada
 
The Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive ComputingThe Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive Computing
Larry Smarr
 
this resume for sadika shaikh bca student
this resume for sadika shaikh bca studentthis resume for sadika shaikh bca student
this resume for sadika shaikh bca student
SadikaShaikh7
 
Scaling Connections in PostgreSQL Postgres Bangalore(PGBLR) Meetup-2 - Mydbops
Scaling Connections in PostgreSQL Postgres Bangalore(PGBLR) Meetup-2 - MydbopsScaling Connections in PostgreSQL Postgres Bangalore(PGBLR) Meetup-2 - Mydbops
Scaling Connections in PostgreSQL Postgres Bangalore(PGBLR) Meetup-2 - Mydbops
Mydbops
 
STKI Israeli Market Study 2024 final v1
STKI Israeli Market Study 2024 final  v1STKI Israeli Market Study 2024 final  v1
STKI Israeli Market Study 2024 final v1
Dr. Jimmy Schwarzkopf
 

Recently uploaded (20)

Blockchain and Cyber Defense Strategies in new genre times
Blockchain and Cyber Defense Strategies in new genre timesBlockchain and Cyber Defense Strategies in new genre times
Blockchain and Cyber Defense Strategies in new genre times
 
一比一原版(msvu毕业证书)圣文森山大学毕业证如何办理
一比一原版(msvu毕业证书)圣文森山大学毕业证如何办理一比一原版(msvu毕业证书)圣文森山大学毕业证如何办理
一比一原版(msvu毕业证书)圣文森山大学毕业证如何办理
 
How RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptxHow RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptx
 
What's Next Web Development Trends to Watch.pdf
What's Next Web Development Trends to Watch.pdfWhat's Next Web Development Trends to Watch.pdf
What's Next Web Development Trends to Watch.pdf
 
What Not to Document and Why_ (North Bay Python 2024)
What Not to Document and Why_ (North Bay Python 2024)What Not to Document and Why_ (North Bay Python 2024)
What Not to Document and Why_ (North Bay Python 2024)
 
Interaction Latency: Square's User-Centric Mobile Performance Metric
Interaction Latency: Square's User-Centric Mobile Performance MetricInteraction Latency: Square's User-Centric Mobile Performance Metric
Interaction Latency: Square's User-Centric Mobile Performance Metric
 
K2G - Insurtech Innovation EMEA Award 2024
K2G - Insurtech Innovation EMEA Award 2024K2G - Insurtech Innovation EMEA Award 2024
K2G - Insurtech Innovation EMEA Award 2024
 
Implementations of Fused Deposition Modeling in real world
Implementations of Fused Deposition Modeling  in real worldImplementations of Fused Deposition Modeling  in real world
Implementations of Fused Deposition Modeling in real world
 
MYIR Product Brochure - A Global Provider of Embedded SOMs & Solutions
MYIR Product Brochure - A Global Provider of Embedded SOMs & SolutionsMYIR Product Brochure - A Global Provider of Embedded SOMs & Solutions
MYIR Product Brochure - A Global Provider of Embedded SOMs & Solutions
 
AC Atlassian Coimbatore Session Slides( 22/06/2024)
AC Atlassian Coimbatore Session Slides( 22/06/2024)AC Atlassian Coimbatore Session Slides( 22/06/2024)
AC Atlassian Coimbatore Session Slides( 22/06/2024)
 
Coordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar SlidesCoordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar Slides
 
Lessons Of Binary Analysis - Christien Rioux
Lessons Of Binary Analysis - Christien RiouxLessons Of Binary Analysis - Christien Rioux
Lessons Of Binary Analysis - Christien Rioux
 
Why do You Have to Redesign?_Redesign Challenge Day 1
Why do You Have to Redesign?_Redesign Challenge Day 1Why do You Have to Redesign?_Redesign Challenge Day 1
Why do You Have to Redesign?_Redesign Challenge Day 1
 
How Netflix Builds High Performance Applications at Global Scale
How Netflix Builds High Performance Applications at Global ScaleHow Netflix Builds High Performance Applications at Global Scale
How Netflix Builds High Performance Applications at Global Scale
 
Running a Go App in Kubernetes: CPU Impacts
Running a Go App in Kubernetes: CPU ImpactsRunning a Go App in Kubernetes: CPU Impacts
Running a Go App in Kubernetes: CPU Impacts
 
Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...
 
The Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive ComputingThe Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive Computing
 
this resume for sadika shaikh bca student
this resume for sadika shaikh bca studentthis resume for sadika shaikh bca student
this resume for sadika shaikh bca student
 
Scaling Connections in PostgreSQL Postgres Bangalore(PGBLR) Meetup-2 - Mydbops
Scaling Connections in PostgreSQL Postgres Bangalore(PGBLR) Meetup-2 - MydbopsScaling Connections in PostgreSQL Postgres Bangalore(PGBLR) Meetup-2 - Mydbops
Scaling Connections in PostgreSQL Postgres Bangalore(PGBLR) Meetup-2 - Mydbops
 
STKI Israeli Market Study 2024 final v1
STKI Israeli Market Study 2024 final  v1STKI Israeli Market Study 2024 final  v1
STKI Israeli Market Study 2024 final v1
 

Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning Techniques

  • 1. Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning Techniques Xavier Llorà1,2, Kumara Sastry2, Tian-Li Yu3, David E. Goldberg2 1 National Center for Supercomputing Applications. University of Illinois at Urbana-Champaign 2 Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign 3 Department of Electrical Engineering, National Taiwan University Supported by AFOSR FA9550-06-1-0370, NSF at ISS-02-09199 GECCO 2007 HUMIES 1
  • 2. Motivation • Competent GBML – Use competent GAs to approach GBML problems – Take advantage of competent GA scalability – Provide insight about problem structure – χeCCS by Llorà, Sastry, Goldberg & de la Ossa (2006) • Rule matching may thread practical applications – Even for small dimensional problems (MUX 20), rule matching may take more than 85% of the execution time in XCS – As dimensionality or cardinality of training sets increase, rule matching rules the overall execution time – Efficient implementation (Llora & Sastry, 2006) still require matching rules GECCO 2007 Llorà, Sastry, Yu & Goldberg 2
  • 3. Motivation • Competent GAs – Byproduct: Models and problem structure insight – Revision of the fitness relaxation for expensive fitness evaluations – Idea: Build a cheap surrogate fitness accurate enough – Successfully applied to GA (Sastry, Lima & Goldberg, 2006) – Help cut down the number of fitness evaluations • GBML – Can we transfer the same ideas to GBML approaches? – What are the requirements needed for competent GBML to benefit from fitness relaxation? GECCO 2007 Llorà, Sastry, Yu & Goldberg 3
  • 4. Outline • Overview χeCCS • Evaluation relaxation • Fitness inheritance using least squares fitting • Fitness inheritance and χeCCS • Results • Conclusions GECCO 2007 Llorà, Sastry, Yu & Goldberg 4
  • 5. χ-ari Extended Compact Classifier System • No reinforcement learning is used • A competent GA is in charge of the learning • The idea: – A population of single rules – For each rule we compute its fitness – The χ-ari extended compact genetic algorithm – Niching to maintain different accurate rules (restricted tournament replacement) GECCO 2007 Llorà, Sastry, Yu & Goldberg 5
  • 6. Maximally Accurate and General Rules • Accuracy and generality can be compute as n t + (r) + n t# (r) n t + (r) quot;(r) = quot;(r) = nt nm • Fitness should combine accuracy and generality f (r) = quot;(r) # $(r)% ! ! • Such measure can be either applied to rules or rule sets ! GECCO 2007 Llorà, Sastry, Yu & Goldberg 6
  • 7. Maximally Accurate and General Rules GECCO 2007 Llorà, Sastry, Yu & Goldberg 7
  • 8. Extended Compact Genetic Algorithm • A Probabilistic model building GA (Harik, 1999) – Builds models of good solutions as linkage groups • Key idea: – Good probability distribution → Linkage learning • Key components: – Representation: Marginal product model (MPM) • Marginal distribution of a gene partition – Quality: Minimum description length (MDL) • Occam’s razor principle • All things being equal, simpler models are better – Search Method: Greedy heuristic search GECCO 2007 Llorà, Sastry, Yu & Goldberg 8
  • 9. Marginal Product Model (MPM) • Partition variables into disjoint sets • Product of marginal distributions on a partition of genes • Gene partition maps to linkage groups MPM: [1, 2, 3], [4, 5, 6], … [l-2, l -1, l] ... xl-2 xl-1 xl x1 x2 x3 x4 x5 x6 {p000, p001, p00#, p010, p011, p01#, p100, p101, p10#, p110, p111, p11# … }(27 probabilities) GECCO 2007 Llorà, Sastry, Yu & Goldberg 9
  • 10. Minimum Description Length Metric • Hypothesis: For an optimal model – Model size and error is minimum • Model complexity, Cm – # of bits required to store all marginal probabilities • Compressed population complexity, Cp – Entropy of the marginal distribution over all partitions • MDL metric, Cc = Cm + Cp GECCO 2007 Llorà, Sastry, Yu & Goldberg 10
  • 11. Building an Optimal MPM • Assume independent genes ([1],[2],…,[l]) • Compute MDL metric, Cc • All combinations of two subset merges Eg., {([1,2],[3],…,[l]), ([1,3],[2],…,[l]), ([1],[2],…,[l-1,l])} • • Compute MDL metric for all model candidates • Select the set with minimum MDL, • If , accept the model and go to step 2. • Else, the current model is optimal GECCO 2007 Llorà, Sastry, Yu & Goldberg 11
  • 12. χeCCS Models for Different Multiplexers Building Block Size Increases
  • 13. Fitness Inheritance using Least Squares • Proposed by Sastry, Lima & Goldberg (2006) • Surrogate is a regression using basis identified by BBs • A simple example: [1,3] [2] [4] • The schemas represented are – {0*0*, 0*1*, 0*#*, 1*0*, 1*1*, 1*#*, #*0*, #*1*, #*#*, *0**, *1**, *#**, ***0 , ***1 , ***#} • Recode the individuals by GECCO 2007 Llorà, Sastry, Yu & Goldberg 13
  • 14. Fitness Inheritance using Least Squares • Recoding defines matrix A • Normalize the fitness GECCO 2007 Llorà, Sastry, Yu & Goldberg 14
  • 15. Fitness Inheritance using Least Squares • Solve using least squares • Once solved, the fitness surrogate take the following form GECCO 2007 Llorà, Sastry, Yu & Goldberg 15
  • 16. Fitness Inheritance and χeCCS • Two different problems Hidden XOR 6-input multiplexer GECCO 2007 Llorà, Sastry, Yu & Goldberg 16
  • 17. Hidden XOR • Evolved rules and model • Surrogate accuracy GECCO 2007 Llorà, Sastry, Yu & Goldberg 17
  • 18. 6-input Multiplexer • The evolved solution and model • The surrogate is totally off GECCO 2007 Llorà, Sastry, Yu & Goldberg 18
  • 19. 6-input Multiplexer • The key = missing basis • χeCCS is able to solve the problem quickly, reliably, and accurately • However, the model basis are not accurate enough to build a proper surrogate GECCO 2007 Llorà, Sastry, Yu & Goldberg 19
  • 20. Overlapping BBs using DSMGA • Proposed by Yu, Yassine, Goldberg and Chen (2003) • Based on organizational theory • Main property = DSMGA model builder (DSMcluster) deals with overlapping building blocks • The main issue = translate a populations or rules into a dependency structure matrix (DSM) • The intuition = specific bits are the ones responsible for the kind of linkage we seek GECCO 2007 Llorà, Sastry, Yu & Goldberg 20
  • 21. Jumping to the Results • DSMcluster model for the hidden XOR – [i0 i1 i2] [i3] [i4] [i5] • DSMcluster model for the 6-input multiplexer – [i0 i1] <i2 i3 i4 i5> – It identifies a BB [i0 i1] of variables interacting with a bus <i2 i3 i4 i5> – Translated into χeCCS language: [i0 i1 i2] [i0 i1 i3] [i0 i1 i4] [i0 i1 i5] – The right model which provides the right set of basis GECCO 2007 Llorà, Sastry, Yu & Goldberg 21
  • 22. Conclusions • The matching process is crucial and expensive • Efficient implementations can take us far to a point • Relaxation can get rid of the need of matching • For some types of problems overlapping BBs are required • DSMGA provides the proper machinery to identify the proper basis for such a surrogate GECCO 2007 Llorà, Sastry, Yu & Goldberg 22
  • 23. Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning Techniques Xavier Llorà1,2, Kumara Sastry2, Tian-Li Yu3, David E. Goldberg2 1 National Center for Supercomputing Applications. University of Illinois at Urbana-Champaign 2 Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign 3 Department of Electrical Engineering, National Taiwan University Supported by AFOSR FA9550-06-1-0370, NSF at ISS-02-09199 GECCO 2007 HUMIES 23