Time Series Indexing: Implement iSAX in Python to index time series with confidence [1 ed.] 9781838821951

Build and use the most popular time series index available today with Python to search and join time series at the subse

English Pages 249 Year 2023

Table of contents:
Cover
Title Page
Copyright
Dedication
Contributors
Table of Contents
Preface
Chapter 1: An Introduction to Time Series and the Required Python Knowledge
Technical requirements
Understanding time series
Time series are everywhere
Essential definitions
Time series data mining
Comparing time series
The Euclidean distance
The Chebyshev distance
What is an index and why do we need indexing?
The Python knowledge that we are going to need
Timing Python code
An introduction to Anaconda
The required Python packages
Setting up our environment
Printing package versions
Creating sample data
Publicly available time series data
How time series are processed
Reading time series from disk
Is all data numeric?
Do all lines have the same amount of data?
Creating subsequences
Visualizing time series
Working with the Matrix Profile
Exploring the MPdist distance
Summary
Resources and useful links
Exercises
Chapter 2: Implementing SAX
Technical requirements
The required theory
Why do we need SAX?
Normalization
Visualizing normalized time series
An introduction to SAX
The cardinality parameter
The segments parameter
How to manually find the SAX representation of a subsequence
How can we divide 10 data points into 3 segments?
Reducing the cardinality of a SAX representation
Developing a Python package
The basics of Python packages
The SAX Python package
Working with the SAX package
Computing the SAX representations of the subsequences of a time series
Counting the SAX representations of a time series
The tsfresh Python package
Creating a histogram of a time series
Calculating the percentiles of a time series
Summary
Useful links
Exercises
Chapter 3: iSAX – The Required Theory
Technical requirements
Background information
Trees and binary trees
Understanding how iSAX works
The cardinality parameter
The segments parameter
The threshold parameter
Computing the normalized mean values
How big can an iSAX index get?
What happens when there is no space left for adding more subsequences to an iSAX index?
How iSAX is constructed
How iSAX is searched
Promotion strategy
Splitting nodes
Manually constructing an iSAX index
Updating the counting.py utility
Summary
Useful links
Exercises
Chapter 4: iSAX – The Implementation
Technical requirements
A quick look at the iSAX Python package
The class for storing subsequences
The class for iSAX nodes
The class for entire iSAX indexes
Explaining the missing parts
Exploring the remaining files
The tools.py file
The variables.py file
The sax.py file
Using the iSAX Python package
Reading the iSAX parameters
How to process subsequences to create an iSAX index
Creating our first iSAX index
Counting the subsequences of an iSAX index
How long does it take to create an iSAX index?
Dealing with iSAX overflows
Summary
Useful links
Exercises
Chapter 5: Joining and Comparing iSAX Indexes
Technical requirements
How the sliding window size affects the iSAX construction speed
Checking the search speed of iSAX indexes
Joining iSAX indexes
Implementing the joining of iSAX indexes
Explaining the Python code
Using the Python code
We have a long list of Euclidean distances, so what?
Saving the output
Finding iSAX nodes without a match
Writing Python tests
What are we going to test?
Comparing the number of subsequences
Checking the number of node splits
All Euclidean distances are 0
Running the tests
Summary
Useful links
Exercises
Chapter 6: Visualizing iSAX Indexes
Technical requirements
Storing an iSAX index in JSON format
Downloading the JavaScript code locally
Running the code locally
Visualizing an iSAX index
Visualizing iSAX as a tree
Trying something radical
More iSAX index visualizations
Using icicle plots
Visualizing iSAX as a Collapsible Tree
Summary
Useful links
Exercises
Chapter 7: Using iSAX to Approximate MPdist
Technical requirements
Understanding the Matrix Profile
What does the Matrix Profile compute?
Manually computing the exact Matrix Profile
Computing the Matrix Profile using iSAX
What happens if there is not a valid match?
Calculating the error
Approximate Matrix Profile implementation
Comparing the accuracy of two different parameter sets
Understanding MPdist
How to compute MPdist
Manually computing MPdist
Calculating MPdist using iSAX
Implementing the MPdist calculation in Python
Using the approximate Matrix Profile way
Using the join of two iSAX indexes
Using the Python code
Comparing the accuracy and the speed of the methods
Summary
Useful links
Exercises
Chapter 8: Conclusions and Next Steps
Concluding all that we have learned so far
Other variations of iSAX
Interesting research papers on time series
Interesting research papers on databases
Useful books
Useful books on databases
Building a strong computer science background
Books on UNIX and Linux
Books on the Python programming language
Summary
Useful links
Exercises
Index
About Packt
Other Books You May Enjoy
