Optimization meets Machine Learning
Marco Lübbecke
Lehrstuhl für Operations Research, RWTH Aachen University
@mluebbecke
OR 2017 · Berlin · September 6, 2017
Optimization is Everywhere
@mluebbecke · Optimization meets Machine Learning · 2/30
Machine Learning is Everywhere
@mluebbecke · Optimization meets Machine Learning · 3/30
Literally Everyone speaks about Machine Learning
@mluebbecke · Optimization meets Machine Learning · 4/30
New INFORMS Journal on Optimization
“One of the largest opportunities of the field of optimization is to embrace data in a protagonist role and combine it with machine learning.”
(Dimitris Bertsimas, Editorial Statement)
“My vision of the future for [...] optimization”
@mluebbecke · Optimization meets Machine Learning · 5/30
What is Machine Learning?
Machine Learning
- Supervised Learning: Classification, Regression
- Unsupervised Learning: Clustering
source: www.mathworks.com
@mluebbecke · Optimization meets Machine Learning · 6/30
Supervised Learning: Classification
- data $X$, $d$ features, labels $Y$: pairs $(x_i, y_i)$
- feature map $\varphi : X \to \mathbb{R}^d$ gives $(\varphi(x_i), y_i)$
- black box “learns” $f : \mathbb{R}^d \to Y$ s.t. $\mathrm{error}(f(\varphi(x_i)), y_i)$ is “small” for $x_i \in X$
- this is an optimization problem
- validate
@mluebbecke · Optimization meets Machine Learning · 7/30
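A minimal sketch of the pipeline on this slide, to make the "learning is an optimization problem" point concrete. Everything here is an assumption for illustration: the toy data, the one-dimensional feature map `phi`, and the threshold rule standing in for the black box. The talk does not prescribe any of them.

```python
# Toy illustration: learning a classifier f is an optimization problem.
# Data, feature map, and the threshold rule are illustrative assumptions.

def phi(x):
    """Feature map phi: X -> R^d (here d = 1: the length of a string)."""
    return [float(len(x))]

def f(theta, features):
    """Candidate classifier: label 1 if the single feature exceeds theta."""
    return 1 if features[0] > theta else 0

def error(data, theta):
    """Number of misclassified pairs (x_i, y_i)."""
    return sum(f(theta, phi(x)) != y for x, y in data)

def learn(data):
    """Choose the threshold with minimum training error (brute force)."""
    candidates = sorted({phi(x)[0] for x, _ in data})
    candidates = [candidates[0] - 1.0] + candidates
    return min(candidates, key=lambda theta: error(data, theta))

train = [("owl", 0), ("dog", 0), ("bear", 0), ("muffin", 1), ("cupcake", 1)]
valid = [("cat", 0), ("croissant", 1)]  # held-out pairs for validation

theta = learn(train)
train_error = error(train, theta)
valid_error = error(valid, theta)
```

The brute-force search in `learn` is only viable because the hypothesis class has a single parameter; real black boxes solve much larger continuous or discrete optimization problems.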
Binary Classification: Dog or Muffin? Owl or Apple?
@mluebbecke · Optimization meets Machine Learning · 8/30
Supervised Learning: Regression
- “from {0, 1} to [0, 1]”
- model: $y = mx + b$
- black box: $\min \sum_i \varepsilon_i^2$ with $y_i = m \cdot x_i + b + \varepsilon_i$, $x_i \in X$
[Figure: plot with axes x and y showing the line y = mx + b]
@mluebbecke · Optimization meets Machine Learning · 9/30
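The least-squares problem on this slide has a closed-form solution for a single feature. The sketch below computes it with the standard formulas; the sample data and the function names are assumptions for illustration.

```python
# Ordinary least squares for y = m*x + b, minimizing the sum of squared residuals.
# The data below is made up for illustration.

def fit_line(xs, ys):
    """Return (m, b) minimizing sum_i (y_i - m*x_i - b)^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b

def sum_squared_residuals(xs, ys, m, b):
    """Objective value: sum of epsilon_i^2 for the line y = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.1, 4.9, 7.0]
m, b = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, m, b)
guess = sum_squared_residuals(xs, ys, 2.0, 1.0)  # a hand-picked line for comparison
```

On this data the fitted line has a smaller objective value than the hand-picked line y = 2x + 1, as the optimality of least squares requires.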
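Unsupervised Learning: Clustering
- feature map $\varphi : X \to \mathbb{R}^d$
- black box “learns” $f : \mathbb{R}^d \to Y$ s.t. all $x \in X$ with $f(\varphi(x)) = y$ are “similar”
[Figure: points in the (x1, x2) plane grouped into clusters y1, y2, y3]
@mluebbecke · Optimization meets Machine Learning · 10/30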
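One possible black box for this slide is k-means (Lloyd's algorithm), which treats points as "similar" when they are close to the same center. The slide does not name a method, so the choice of k-means, the sample points, and the initial centers are all assumptions.

```python
# k-means (Lloyd's algorithm) as one example of a clustering black box.
# Points and initial centers are made up for illustration.

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    d = len(points[0])
    return tuple(sum(p[j] for p in points) / len(points) for j in range(d))

def nearest(p, centers):
    """Index of the center closest to p."""
    return min(range(len(centers)), key=lambda c: squared_distance(p, centers[c]))

def k_means(points, centers, iterations=20):
    """Alternate between assigning points and recomputing centers."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        # Keep the old center if a cluster happens to be empty.
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    labels = [nearest(p, centers) for p in points]
    return labels, centers

points = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.1, 5.0), (0.0, 5.2)]
initial = [points[0], points[2], points[4]]
labels, centers = k_means(points, initial)
```

Here `labels` plays the role of f(φ(x)): points sharing a label form one cluster.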
Optimization naturally appears in ML
“Optimization lies at the heart of ML. Most ML problems reduce to optimization problems.” (Bennett, Parrado-Hernández, 2006)
- minimize e.g., prediction error
- continuous, convex optimization
- discrete, integer optimization
@mluebbecke · Optimization meets Machine Learning · 11/30
Example: A MIP in a black box: Classification Trees
- optimal classification trees, Bertsimas & Dunn (2017)
- use few nodes, shallow depth → formulated as MIP
- improves accuracy over classical CART method by 0.5–2%
source: www.edureka.co/blog/decision-trees
@mluebbecke · Optimization meets Machine Learning · 12/30
Many Opportunities for Discrete Optimization in ML
- within black boxes to capture combinatorial explosion
  → see also Andrea’s plenary on Friday
- feature selection
- outlier detection
- parameter tuning
- ...
@mluebbecke · Optimization meets Machine Learning · 13/30
How about the converse Direction?
- “emulating the expert”
- observe a decision maker
- learn their objective function
- online learning, Bärmann, Pokutta & Schneider (2017)
$\max \; c_{\mathrm{true}}^{T} x : x \in X(p)$, given $(p_t, x_t^*)_{t=1,\dots,T}$
@mluebbecke · Optimization meets Machine Learning · 14/30
ML may help improve Optimization Algorithms
- e.g., branching in B&B
- full strong branching gives locally perfect information
- predict the strong branching score of a variable
- features describe state of a variable
- supervised learning: regression
  → promising proof-of-concept, Marcos Alvarez, Louveaux & Wehenkel (2017)
- survey on ML in branching/searching, Lodi & Zarpellon (2017)
@mluebbecke · Optimization meets Machine Learning · 15/30
A Progress Bar for Branch-and-Bound?
- predict the runtime of branch-and-bound algorithms, Hutter, Xu, Hoos & Leyton-Brown (2014)
- CPLEX 12.1 on 1510 publicly available MIPs
[Figure: predicted vs. actual runtime]
@mluebbecke · Optimization meets Machine Learning · 16/30
A Progress Bar for Branch-and-Bound?
- very preliminary experiments with Gurobi 7.5
- predict elapsed runtime percentage, Kruber, L, Obeloer genannt Bregenhorn (2017)
@mluebbecke · Optimization meets Machine Learning · 16/30
Learning when to solve a MIP by Branch-and-Price
- our MIP solver GCG detects many potential DW reformulations
- training data: MIP + CPU time
- feature map φ: 100+ features (#conss, #vars, %constype, %vartype, #blocks, ...)
- k-NN learns binary classifier f: “run SCIP or GCG?”

                  f: SCIP    f: GCG
  faster: SCIP     69.5%      9.9%
  faster: GCG       6.9%     13.7%

Kruber, L, Parmentier (2017)
@mluebbecke · Optimization meets Machine Learning · 17/30
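A minimal sketch of a k-NN classifier answering “run SCIP or GCG?”. The two features, their scaling, and all training instances are hypothetical; the talk uses 100+ features and real CPU times, and this sketch does not reproduce that experiment.

```python
# k-nearest-neighbour vote on instance features; labels say which solver was faster.
# Features (#conss / 1000, #blocks) and all data points are hypothetical.
from collections import Counter

def k_nearest_label(train, query, k=3):
    """Majority label among the k training instances closest to query."""
    def distance(item):
        features, _ = item
        return sum((a - b) ** 2 for a, b in zip(features, query))
    neighbours = sorted(train, key=distance)[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

train = [
    ((1.0, 0.0), "SCIP"), ((1.2, 1.0), "SCIP"), ((0.8, 0.0), "SCIP"),
    ((3.0, 8.0), "GCG"), ((2.5, 10.0), "GCG"), ((3.5, 12.0), "GCG"),
]

choice_a = k_nearest_label(train, (1.1, 1.0))  # few blocks
choice_b = k_nearest_label(train, (3.0, 9.0))  # many blocks
```

With features on very different scales, k-NN distances are dominated by the largest one, so a real feature map would normally be normalized first.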
What ML Answers can we (Optimizers) expect?
- we get statistical answers → not what we are used to seeing
- we have domain/expert knowledge: e.g., pseudo-costs
- ML may give a better predictor, but no explanation
- some info can be extracted from the most influential features
  → interpretability is a huge theoretical and practical topic
@mluebbecke · Optimization meets Machine Learning · 18/30
Decision Making: Machine Learning
“Machine Learning and Artificial Intelligence delivers the most value when you need to make lots of similar decisions quickly.” (Ingo Mierswa, RapidMiner)
- simple decisions: e.g., auto correct current word
- solution: often a single score → greedy
- keep/learn habits: extrapolate from the past (!)
@mluebbecke · Optimization meets Machine Learning · 20/30
Typical Example: Predictive Maintenance
source: blog.capterra.com/should-you-invest-in-a-predictive-maintenance-strategy/
@mluebbecke · Optimization meets Machine Learning · 21/30
Exploit all Options: Prescriptive Maintenance
source: www.siemens.com/press/pool/de/pressebilder/photonews/pn200826/300dpi/pn200826-12 300dpi.jpg
@mluebbecke · Optimization meets Machine Learning · 22/30
Decision Making: Optimization
“How often can the result of an optimization model be captured in a single variable?” (Ed Rothberg, Gurobi)
- solution: not only the objective value!
- complex decisions/plans: e.g., timetables, crew schedules, ...
- global scope: models all (reasonable) interdependencies
@mluebbecke · Optimization meets Machine Learning · 23/30
Perfect Partners
“Current Standard:” Predictive then Prescriptive Analytics
ML harnesses the bigness of data (the past and present);
Optimization captures the bigness of options (the future).
@mluebbecke · Optimization meets Machine Learning · 24/30
Learning (about) optimal Solutions
- in recurring complex decision situations
- learn what good (partial) solutions look like
- this may help find good solutions faster
- learn spatio-temporal patterns to generate effective schedules, Le, Liu, Lau (2016)
@mluebbecke · Optimization meets Machine Learning · 25/30
Learning (about) Optimization Models
- ML can make sense of data
- optimization models are also “data”
  → (how) can ML help us make sense of optimization models?
- can ML learn good modeling?
@mluebbecke · Optimization meets Machine Learning · 26/30
Vision: Learning (about) Optimization Problems
- learn the semantics of a MIP model (“the problem”)
  ⇒ e.g., help the modeler find a better formulation
@mluebbecke · Optimization meets Machine Learning · 27/30
The AI Umbrella?
- OR vs. analytics discussion ✓
- OR vs. AI discussion ?
@mluebbecke · Optimization meets Machine Learning · 28/30
Why is this Relevant?
source: blogs.worldbank.org/category/tags/artificial-intelligence
- if the fourth industrial revolution is about AI, OR should be part of it
@mluebbecke · Optimization meets Machine Learning · 29/30
Optimization met Machine Learning
Marco Lübbecke
Lehrstuhl für Operations Research, RWTH Aachen University
@mluebbecke
OR 2017 · Berlin · September 6, 2017
@mluebbecke · Optimization meets Machine Learning · 30/30