Aviation Efficiency - IEEE Xplore

Engineering, Operations & Technology

Boeing Research & Technology

Predictive Analytics with Aviation Big Data
ICNS 2013, Herndon, VA

Paul Comitz, Samet Ayhan, Gary Gerberick, Johnathan Pesce, Steve Bliesner
April 24, 2013

Agenda
• Introduction
• Data Correlation
• Architecture
  – Database Modeling
  – Historical & Live Data Processing
• Optimizations
• Big Data Analytics
• Front-End Visualization
• Conclusion and Future Work

Copyright © 2009 Boeing. All rights reserved.

Air Traffic Analysis
• Tell me all the B734s arriving at SFO on July 11, 2007 from 4 pm to 7 pm
• Tell me all the air traffic in this geographic area on this day
• Tell me all the aircraft leaving IAD en route to LAX in the month of March 2008
• Many more
• Given a descriptive pattern discovered from historical data, compute the distinct air traffic within an airspace volume of interest (over LAS) on Nov 6, 2013 between 10:15 and 10:30 PM

Problem Statement
• Boeing AATM has been receiving live Aircraft Situation Display to Industry (ASDI) data and archiving it for three years
• There is no easy mechanism for performing basic analysis. The incoming data is large and compressed, and it must be correlated with other flight data before it can be analyzed
• Boeing is implementing a service that records, retrieves, analyzes and visualizes aviation big data
  – Prototype work in progress

ASDI
• Provided by the Federal Aviation Administration (FAA) to deliver flight plan and track information from the NAS to the aviation industry in real time and near real time
• Product of the Enhanced Traffic Management System (ETMS)
• A stream of data packets containing Zlib-compressed XML documents with a binary header
• Messages can be flight plan related, oceanic, or host track reports
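The packet format described above (a binary header followed by a Zlib-compressed XML document) can be sketched as follows. This is a minimal illustration, not the actual ASDI wire format: the slides do not describe the header layout, so the 4-byte length prefix, the `encode_packet` helper, and the element names are all assumptions.

```python
import struct
import zlib
import xml.etree.ElementTree as ET

# Assumption for illustration only: the header is a 4-byte big-endian
# payload length. The real ASDI header layout is not given in the slides.
HEADER = struct.Struct(">I")

def encode_packet(xml_text: str) -> bytes:
    """Build a toy packet: hypothetical header + zlib-compressed XML."""
    payload = zlib.compress(xml_text.encode("utf-8"))
    return HEADER.pack(len(payload)) + payload

def decode_packets(stream: bytes):
    """Yield the parsed XML root element of each toy packet in the stream."""
    offset = 0
    while offset < len(stream):
        (length,) = HEADER.unpack_from(stream, offset)
        offset += HEADER.size
        payload = stream[offset:offset + length]
        offset += length
        yield ET.fromstring(zlib.decompress(payload))

# Two hypothetical messages back to back, as they might appear in a stream.
stream = (encode_packet("<track acid='ABC123'/>")
          + encode_packet("<flightPlan acid='XYZ9'/>"))
tags = [root.tag for root in decode_packets(stream)]
```

A real consumer would read from the TCP socket incrementally rather than from a complete byte string, but the decompress-then-parse step would look much the same.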

Typical ASDI Data
[Tables: ASDI message counts and supporting record counts for August 8, 2009]

ASDI Correlation
• An individual ASDI message does not always carry enough information to identify the flight it belongs to.
• Centers operate independently of each other and assign their own Computer Identification (CID) codes to flight plans, so CID alone is not sufficient.
• The correlator generates a new unique FLIGHT_KEY each time it identifies a new unique flight plan.
• FLIGHT_KEY is composed of aircraft ID, flight status, departure center, aircraft type, destination, etc.
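As a rough sketch of the idea, a correlator can keep a lookup from a flight's identifying fields to its FLIGHT_KEY and mint a new key only when it sees a new combination. The `Correlator` class, the subset of fields used, and the integer key format are assumptions made for illustration; the slides do not show the actual correlator.

```python
from itertools import count

class Correlator:
    """Toy correlator: one FLIGHT_KEY per distinct flight-plan identity."""

    def __init__(self):
        self._keys = {}            # identity tuple -> FLIGHT_KEY
        self._next_key = count(1)  # hypothetical key format: a running integer

    def flight_key(self, aircraft_id, departure_center, aircraft_type, destination):
        # CID is deliberately left out of the identity, because each
        # center assigns its own CID to the same flight plan.
        identity = (aircraft_id, departure_center, aircraft_type, destination)
        if identity not in self._keys:
            self._keys[identity] = next(self._next_key)
        return self._keys[identity]

correlator = Correlator()
k1 = correlator.flight_key("UAL123", "ZOA", "B734", "LAX")
k2 = correlator.flight_key("UAL123", "ZOA", "B734", "LAX")  # same flight again
k3 = correlator.flight_key("UAL123", "ZLA", "B734", "SFO")  # different flight plan
```

Messages that cannot be matched to any identity would be tagged as uncorrelated rather than given a key.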

Architecture
• Data warehouse approach implemented
  – IBM InfoSphere Data Warehouse for data management
  – Backend DB2 database
  – IBM WebSphere Message Broker and MQ for near-real-time ASDI feed consumption, message routing and transformation
  – SPSS Modeler for predictive analytics
  – IBM Cognos BI for front-end visualization

Architecture
[Diagram. Labels: live ASDI stream over TCP/IP; WebSphere MQ; WebSphere Message Broker; DB2 XML Shredder; SURVDB database on drives C:, E: (Ext. 1), F: (Ext. 2), G: (Ext. 3) and H: (Ext. 4); IBM SPSS Modeler; SPSS Collaboration & Deployment Services; IBM Cognos; Server-1, Server-2, Server-3]

Database Modeling
• Schemas for correlated ASDI messages translated into equivalent relational schemas
  – Database tables generated from classes created from the schema definitions
  – Nine main tables and eleven supporting tables
  – Each main table contains FLIGHT_KEY

Database Modeling

Correlation Process
• To archive received ASDI data
  – Track messages must be correlated with flight plan messages
    • FLIGHT_KEY assigned
    • Uncorrelated data tagged
    • Approx. 30 minutes to correlate one day of data

Historical Data Processing
• To load correlated data
  – Uncompress and unmarshal
  – Create a list of files containing the correlated data
  – Write data to the warehouse

Live Data Processing
• Processed using IBM MQ, IBM Message Broker and a technique called XML shredding
• Message Broker compute nodes
  – Uncompress node
  – Extract correlated messages
  – Shred node adds to DB
    • Stored procedure "shreds" XML docs and adds them to tables
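XML shredding means decomposing an XML document into rows of relational tables. The sketch below shows the idea using Python's `sqlite3` and `ElementTree` in place of the DB2 stored procedure and Message Broker nodes described above; the message layout, the `shred` helper, and the column names are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical message layout; the real ASDI schema is not shown in the slides.
DOC = """
<messages>
  <track flightKey="1001" lat="36.08" lon="-115.15" time="2012-01-01T10:15:00"/>
  <track flightKey="1002" lat="36.20" lon="-115.00" time="2012-01-01T10:16:00"/>
</messages>
""".strip()

def shred(conn, xml_text):
    """Turn each <track> element into one relational row and insert it."""
    rows = [
        (int(t.get("flightKey")), float(t.get("lat")), float(t.get("lon")), t.get("time"))
        for t in ET.fromstring(xml_text).iter("track")
    ]
    conn.executemany("INSERT INTO trackinfo VALUES (?, ?, ?, ?)", rows)
    return len(rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trackinfo (flight_key INTEGER, lat REAL, lon REAL, source_time TEXT)"
)
inserted = shred(conn, DOC)
stored = conn.execute("SELECT COUNT(*) FROM trackinfo").fetchone()[0]
```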

Issues and Observations
• Initial load of one day of data: ~7 hours
• Optimizations
  – Write data in batches
  – Use a mutable data structure to create data strings
  – Deploy a higher-performance machine
  – Use load instead of insert
  – Use DB2 range-partitioned tables
  – Database tunings
• Time reduced from 7 hours to approx. 30 minutes

Optimizations
• Use a mutable data structure to create data strings
  – The original application built each SQL statement by appending elements to a Java String
  – Java Strings are immutable, so every append creates a new String and copies the old contents
  – Creating Strings took five of the seven hours
  – Java StringBuilder used instead
  – Time savings of 71.4%
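The slides describe a Java fix (StringBuilder instead of repeated String concatenation). The same pattern in Python, shown here only as an analogy, is to collect fragments in a mutable list and join them once. The `build_insert` helper and the sample rows are hypothetical, and building SQL by string assembly is shown purely to mirror the slide; parameterized statements are normally preferable.

```python
def build_insert(table, rows):
    """Assemble one multi-row INSERT statement from a list of fragments."""
    parts = [f"INSERT INTO {table} VALUES "]  # the list plays the StringBuilder role
    for i, row in enumerate(rows):
        if i:
            parts.append(", ")
        parts.append("(" + ", ".join(repr(v) for v in row) + ")")
    return "".join(parts)  # a single final copy instead of one copy per append

sql = build_insert("TRACKINFO", [(1001, "UAL123"), (1002, "SWA45")])
```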

Optimizations
• Deployed on a higher-performance machine
  – Application ported
    • from an IBM BladeCenter HS21 (4 GB of RAM and a 64-bit dual-core Xeon 5130 processor)
    • to a Dell M4500 computer (4 GB of RAM and a 64-bit quad-core Intel Core i7 processor)
  – Reduced the time to thirty minutes
• Bulk loading instead of insert
  – Application was modified to write CSV files for each table
  – An entire day's worth of data is bulk loaded
  – Reduced the time to fifteen minutes
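The CSV step above can be sketched as follows: records are grouped by destination table and written to one file per table, ready for the database's bulk loader. The `write_table_csvs` helper, the FLIGHTPLAN table name, and the sample rows are assumptions for illustration.

```python
import csv
import os
import tempfile

def write_table_csvs(records, out_dir):
    """Group (table, row) records by table and write one CSV file per table."""
    by_table = {}
    for table, row in records:
        by_table.setdefault(table, []).append(row)
    paths = {}
    for table, rows in by_table.items():
        path = os.path.join(out_dir, f"{table}.csv")
        with open(path, "w", newline="") as handle:
            csv.writer(handle).writerows(rows)
        paths[table] = path
    return paths

# Hypothetical records for two tables.
records = [
    ("TRACKINFO", (1001, 36.08, -115.15)),
    ("FLIGHTPLAN", (1001, "LAS", "LAX")),
    ("TRACKINFO", (1002, 36.20, -115.00)),
]
out_dir = tempfile.mkdtemp()
paths = write_table_csvs(records, out_dir)
with open(paths["TRACKINFO"], newline="") as handle:
    track_rows = list(csv.reader(handle))
```

Each file is then loaded in a single bulk operation instead of one INSERT per row, which is where the reported time saving comes from.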

Optimizations
• Range-partitioned tables (RPT) used
  – To limit the size of tables, the original code created multiple tables per table type
  – This forces the application to query multiple tables whenever a range spans several of them
  – With RPT, the user does not need to issue multiple queries when a range crosses a table boundary
  – Load time increased from fifteen to thirty minutes
  – The additional fifteen minutes per day of data spent on partitioning buys time savings during queries

Optimizations
• Database tunings
  – Range periods changed from a week to a month
  – Automatic table space resizing changed from 32MB to 512KB
  – Buffer pool size decreased
  – Decreased the time to twenty minutes
• Overall, total time savings of 95.2%
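The reported percentages follow directly from the load times quoted on the optimization slides. A short check, where the 120-minute figure after the StringBuilder change is inferred from the statement that String creation took five of the seven hours:

```python
# Load time for one day of data after each step, in minutes, as reported.
# The 120-minute entry is inferred (7 hours minus the 5 hours spent on Strings).
stages = [
    ("initial load", 7 * 60),
    ("StringBuilder", 2 * 60),
    ("higher-performance machine", 30),
    ("bulk load", 15),
    ("range-partitioned tables", 30),
    ("database tunings", 20),
]

def savings_percent(before, after):
    """Percentage reduction from `before` to `after`, to one decimal place."""
    return round((before - after) / before * 100, 1)

string_savings = savings_percent(stages[0][1], stages[1][1])    # 420 -> 120 minutes
overall_savings = savings_percent(stages[0][1], stages[-1][1])  # 420 -> 20 minutes
```

Both results match the slides: 71.4% for the StringBuilder change and 95.2% overall.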

Analytics Landscape
[Figure: analytics capabilities arranged by degree of complexity and competitive advantage]

Descriptive
• Standard reporting: What happened?
• Ad hoc reporting: How many, how often, where?
• Query/drill down: What exactly is the problem?
• Alerts: What actions are needed?

Predictive
• Simulation: What could happen…?
• Forecasting: What if these trends continue?
• Predictive modeling: What will happen next if ?

Prescriptive
• Optimization: How can we achieve the best outcome?
• Stochastic optimization: How can we achieve the best outcome including the effects of variability?

Based on: Competing on Analytics, Davenport and Harris, 2007. Used with permission of IBM.

Initial Analysis Activities
• Flights departing or arriving on a date
• Flights departing or arriving within a date and time range
• Flights between city pair A,B
• Flights between a list of city pairs
• Flights passing through a volume on a date (sector, center, etc. boundary)
• Flights passing through a volume within a date and time range
• Flights passing through an airspace volume in n-minute intervals
• All x-type aircraft departing or arriving on a date
• Flights departing or arriving on a date between city pair A,B
• Flights departing or arriving on a date between a list of city pairs
• Flights passing through a named fix, airway, center, or sector
• Filed flight plans for any of the above
• Actual departure and arrival times and actual track reports for any of the above

Initial SPSS Applications
• Show all tracks by call sign

Predictive / Prescriptive Analytics Use-Case
• For a given Airspace Volume of Interest (AVOI), compute distinct traffic volume at some point in the future
  – Aim to alert on congestion due to flow control areas or weather if certain thresholds are exceeded
  – Prescribe a solution (if certain thresholds are exceeded)
    • Propose alternate flight paths
  – Use a pre-built predictive model
  – SPSS Modeler performs data processing
    • Counts relevant records in the database (pattern discovery)
    • Computes traffic volume using statistical models on the descriptive pattern
    • Returns a prediction with its likelihood

Predictive / Prescriptive Analytics Use-Case
[Figure: SPSS Modeler stream, with the following node annotations]
• Pulls in the TRACKINFO table of MAIN using SQL
• Combines SOURCE_DATE and SOURCE_TIME into a timestamp that Modeler can understand
• Limits the data to database entries that fall inside the AVOI
• Defines the target and input fields needed for creating the model
• Computes which time interval each database entry falls in; the interval length is 15 minutes
• Final prediction
• Handles the creation of the model
• Produces a graph based on the model results
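The data-preparation steps annotated above (combine date and time, keep entries inside the AVOI, assign each entry to a 15-minute interval, count distinct flights) can be sketched in plain Python. This covers only the descriptive counting, not the SPSS predictive model. The bounding-box AVOI, the date and time formats, and the row layout are assumptions; only SOURCE_DATE, SOURCE_TIME and FLIGHT_KEY are names taken from the slides.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical AVOI: a latitude/longitude box; altitude limits are omitted.
AVOI = {"lat": (35.8, 36.4), "lon": (-115.5, -114.8)}

def interval_start(source_date, source_time, minutes=15):
    """Combine date and time strings, then floor to the start of the interval."""
    ts = datetime.strptime(f"{source_date} {source_time}", "%Y-%m-%d %H:%M:%S")
    return ts.replace(minute=ts.minute - ts.minute % minutes, second=0)

def distinct_traffic(rows):
    """Count distinct FLIGHT_KEYs inside the AVOI per 15-minute interval."""
    seen = defaultdict(set)
    for flight_key, date, time, lat, lon in rows:
        inside = (AVOI["lat"][0] <= lat <= AVOI["lat"][1]
                  and AVOI["lon"][0] <= lon <= AVOI["lon"][1])
        if inside:
            seen[interval_start(date, time)].add(flight_key)
    return {start: len(keys) for start, keys in seen.items()}

# Rows are (FLIGHT_KEY, SOURCE_DATE, SOURCE_TIME, latitude, longitude).
rows = [
    (1001, "2012-01-01", "22:16:05", 36.08, -115.15),
    (1001, "2012-01-01", "22:17:05", 36.10, -115.10),  # same flight, same interval
    (1002, "2012-01-01", "22:29:59", 36.20, -115.00),
    (1003, "2012-01-01", "22:31:00", 36.20, -115.00),  # falls in the next interval
    (1004, "2012-01-01", "22:20:00", 40.00, -100.00),  # outside the AVOI
]
counts = distinct_traffic(rows)
```

Counting distinct keys rather than rows matters because a single flight produces many track reports within one interval.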

Initial Cognos BI Applications
• IBM Cognos Report Studio
  – Web application for creating reports
  – Reports can be tailored by date range, aircraft ID, departure/arrival airport, etc.
  – Reports are available with links to visuals
• IBM Framework Manager
  – Used to create the data package
  – Metadata modeling tool
  – Users can define data sources and the relationships among them
• Models can be exported to a package for use with Report Studio

Flights Departing Las Vegas on Jan 1, 2012 (1 of 3)
• Report shows the departure date, departure and arrival locations, and hyperlinks to Google Maps images
• DeparturePosition and ArrivalPosition are calculated data items formatted for use with Google Maps
• Map hyperlinks are also calculated, based on the type of fix

Flights Departing Las Vegas on Jan 1, 2012 (2 of 3)
• DeparturePosition, Departure Map, ArrivalPosition and Arrival Map are calculated data items (see departure items below)
[Figure: definitions of the DepartureLatitude, DepartureLongitude, Departure Map and DeparturePosition data items]
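The slides do not show the Cognos expressions behind these calculated items. As a rough Python analogue, a position item can be a "latitude,longitude" string and a map item a URL built from it. The `position` and `map_link` helpers are hypothetical, and the URL form used here is the one Google currently documents for Maps search links, which may differ from what the 2013 report used. The dependence on the type of fix is not modeled.

```python
from urllib.parse import urlencode

def position(lat, lon):
    """Format a coordinate pair as 'lat,lon' with four decimal places."""
    return f"{lat:.4f},{lon:.4f}"

def map_link(lat, lon):
    """Build a Google Maps search URL for the position (comma is URL-encoded)."""
    query = urlencode({"api": 1, "query": position(lat, lon)})
    return f"https://www.google.com/maps/search/?{query}"

# Illustrative coordinates near Las Vegas.
departure_position = position(36.0800, -115.1522)
departure_map = map_link(36.0800, -115.1522)
```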

Flights Departing Las Vegas on Jan 1, 2012 (3 of 3)

Conclusion and Next Steps
• Current archive is 50 billion records and growing
  – Approximately 34 million elements per day
  – ~1 GB/day
• The sheer volume of raw surveillance data makes the analytics process very difficult
• The raw data runs through a series of processes before it can be used for analytics
• Next steps
  – Continue application of predictive and prescriptive analytics
  – Big data visualization

Questions and Comments

Paul Comitz
Boeing Research & Technology
Chantilly, VA 20151
(703) 465-3782 (office)

Backup Slides

Initial Approach
• Initial investigations
  – Apache Solr/Lucene
  – Data warehouse
• Evaluate Hadoop in the future

Using Solr
• Uncompress track information messages
• To use with Solr
  – Transforming track messages from their original schema for Solr required building a "key, value" list using XSLT
  – Queries are made against this list of "key, value" pairs
• Transformation process
  – One day of data: ~4.5 hours
• Once the transformation is complete, search/query performance is very good
• Geospatial queries use Solr's own query language
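The transformation described above was done with XSLT. The sketch below swaps in Python's `ElementTree` to show the same idea: flattening a nested track message into "key, value" pairs suitable for indexing. The message layout, the dotted key naming, and the `flatten` helper are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical track message; element and attribute names are illustrative.
MESSAGE = (
    '<track acid="UAL123">'
    '<position lat="36.08" lon="-115.15"/>'
    '<speed>420</speed>'
    '</track>'
)

def flatten(element, prefix=""):
    """Return (key, value) pairs for every attribute and text node."""
    path = f"{prefix}{element.tag}"
    pairs = [(f"{path}.{name}", value) for name, value in element.attrib.items()]
    text = (element.text or "").strip()
    if text:
        pairs.append((path, text))
    for child in element:
        pairs.extend(flatten(child, prefix=f"{path}."))
    return pairs

pairs = flatten(ET.fromstring(MESSAGE))
```

Each pair would become one field of the indexed document, so that queries run against flat field names rather than the original nested schema.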

Representation
• Aviation data is frequently represented in more than one form