Data Science & Machine Learning

This three-day workshop shows how machine learning can be applied to financial data to generate trading strategies across all asset classes. Delegates will learn data science techniques such as data selection, analysis and cleansing, and then see how the prepared data can be fed to the most effective machine learning methods. Backtesting and execution strategies are also covered, providing a complete toolbox for developing data-driven and machine-learning-driven trading strategies.

Each delegate will be provided with a PC and relevant software. The course makes extensive use of Python packages such as Pandas, Scikit-learn and LightGBM, though no prior Python experience is required.

Date: TBC
Duration: Three days (9.00am to 5.00pm)
Location: The Tower Hotel – London E1, UK
Trainer: Ernest Chan
Course fee: £2590 + VAT

Course Outline

Challenges of financial data science and machine learning

+ Data cleansing: Why even simple daily data cannot be trusted
+ Features engineering: Claims that this step is easy for deep learning are false
+ Features selection: What even experts can get wrong here
+ Machine learning: shallow + deep learning work best together
+ Avoiding data snooping and selection bias: using CPCV
+ Metalabelling: improving your proprietary strategy without telling anyone
+ Backtesting: beyond machine learning
+ Automated execution: choosing a platform

Data cleansing and features engineering

+ Checking and adjusting price and volume data in stocks and futures
+ Survivorship bias and how to find it
+ Stationarity and “fractional differentiation”
+ Sanity checks for news sentiment data
+ Sanity checks for earnings data
+ What is a security master and how to create one where none existed?
+ Aggregating and encoding categorical data into features
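
As an illustration of the "fractional differentiation" topic listed above, the following is a minimal sketch, not course material. It assumes a fixed-width window variant, in which the binomial weights of (1 - B)^d are truncated after `window` terms; the function names `frac_diff_weights` and `frac_diff` and the `window` parameter are hypothetical and chosen only for this example.

```python
def frac_diff_weights(d, size):
    """Return the first `size` weights of the expansion of (1 - B)^d.

    Uses the recursion w_0 = 1, w_k = -w_{k-1} * (d - k + 1) / k.
    """
    weights = [1.0]
    for k in range(1, size):
        weights.append(-weights[-1] * (d - k + 1) / k)
    return weights


def frac_diff(series, d, window):
    """Fractionally difference `series` using a fixed-width window.

    Assumption for this sketch: the weight sequence is simply truncated
    after `window` terms, so the first `window - 1` observations are dropped.
    """
    w = frac_diff_weights(d, window)
    out = []
    for t in range(window - 1, len(series)):
        # w[0] multiplies the current value, w[k] the value k steps back
        out.append(sum(w[k] * series[t - k] for k in range(window)))
    return out


if __name__ == "__main__":
    prices = [100.0, 101.0, 103.0, 102.0, 105.0, 107.0]
    # d = 1 with a two-term window reduces to ordinary first differences
    print(frac_diff(prices, d=1.0, window=2))
    # 0 < d < 1 removes part of the trend while keeping some memory
    print(frac_diff(prices, d=0.5, window=4))
```

With d = 1 the weights collapse to [1, -1, 0, ...], i.e. ordinary differencing. Values of d between 0 and 1 trade off stationarity against the amount of price memory retained.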

Machine learning

+ Simple features and shallow ML using logistic regression with L1 and L2 regularization
+ Deeper learning: Random forests and gradient boosted trees with Scikit-Learn and LightGBM
+ Features selection using Mean Decrease Accuracy and SHAP: be careful where you apply that!
+ Cross validation and hyperparameters optimization
+ Metrics for measuring machine learning outcomes
+ Metalabelling: what common base models to use?

Backtesting

+ Machine learning suggests, but does not determine, trading strategy
+ Various ways of using the output of ML for trading
+ Reduce data snooping bias: using CPCV
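
To give a flavour of CPCV (combinatorial purged cross-validation) as mentioned above, here is a rough sketch and not the course's implementation. It assumes samples are ordered in time and split into contiguous groups, and it simplifies purging to a fixed number of samples removed on each side of every test group; in practice the purge depends on how long each label's horizon is. The function name `cpcv_splits` and its parameters are hypothetical.

```python
from itertools import combinations


def cpcv_splits(n_samples, n_groups, n_test_groups, purge=0):
    """Yield (train_indices, test_indices) for every combination of test groups.

    Assumption for this sketch: `purge` samples on either side of each test
    group are dropped from the training set to limit information leakage.
    """
    # Contiguous, time-ordered group boundaries as [start, stop) pairs
    bounds = [
        (g * n_samples // n_groups, (g + 1) * n_samples // n_groups)
        for g in range(n_groups)
    ]
    for test_groups in combinations(range(n_groups), n_test_groups):
        test_idx = [i for g in test_groups for i in range(*bounds[g])]
        blocked = set()
        for g in test_groups:
            start, stop = bounds[g]
            # Block the test group itself plus the purge margin around it
            blocked.update(range(max(0, start - purge), min(n_samples, stop + purge)))
        train_idx = [i for i in range(n_samples) if i not in blocked]
        yield train_idx, test_idx


if __name__ == "__main__":
    # 4 groups with 2 held out at a time gives C(4, 2) = 6 train/test splits
    for train_idx, test_idx in cpcv_splits(12, n_groups=4, n_test_groups=2, purge=1):
        print("train:", train_idx, "test:", test_idx)
```

Because every combination of test groups is used, each group appears in several test sets, which yields multiple backtest paths rather than the single path of a walk-forward test.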