NHSDataDictionaRy is back on CRAN

The NHSDataDictionaRy package is now back on CRAN, and I am pleased as punch. This update contains the OpenSAFELY scraper, which gets lookup data from the website developed by Ben Goldacre’s team. Why did it disappear? I took the package down for major script and function updates. This has now…

Feature encoding methods – the Pandas way

This tutorial explores the various ways data can be encoded, using Pandas and NumPy, to prepare it for a machine learning or predictive modelling pipeline. Encoding methods There are three main methods explored here: Label encoding – encoding a value based on where the label falls in the order – could be good for rank…
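
As a rough illustration of the label encoding idea mentioned above, the sketch below maps an ordered set of labels to integer codes with an ordered pandas `Categorical`. The column name `priority` and its categories are made up for the example, and this is only one possible approach, not necessarily the one used in the tutorial.

```python
import pandas as pd

# Hypothetical example data; the column name and categories are illustrative only.
df = pd.DataFrame({"priority": ["low", "high", "medium", "low"]})

# State the label order explicitly so the integer codes reflect rank.
ordered = pd.Categorical(
    df["priority"],
    categories=["low", "medium", "high"],
    ordered=True,
)

# .codes gives each value's position in the declared category order.
df["priority_code"] = ordered.codes

print(df)
```

Declaring the categories by hand matters here: without it, pandas would order the labels alphabetically, and the codes would no longer follow the intended ranking.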

Creating Virtual Environments for Python Projects in VS Code

I had a similar problem recently, and then a request came through from a close friend (Chris Mainey) for the same purpose. I thought, “I’ll write a blog post on this.” So, what are the benefits of creating virtual environments? First of all, what are virtual environments? A virtual environment is a Python environment such that the Python interpreter, libraries and…
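
For readers who want to see the idea in miniature, here is a small sketch that creates a virtual environment with Python's standard-library `venv` module. The temporary directory and the `.venv` folder name are assumptions for the example; the post itself works through VS Code, which this sketch does not cover.

```python
import tempfile
import venv
from pathlib import Path

# Hypothetical location: a throwaway temp directory keeps the sketch self-contained.
project_dir = Path(tempfile.mkdtemp())
env_dir = project_dir / ".venv"

# with_pip=False keeps creation quick; pass with_pip=True to bootstrap pip as well.
venv.create(env_dir, with_pip=False)

# pyvenv.cfg is the marker file that identifies the folder as a virtual environment.
print("Environment created:", (env_dir / "pyvenv.cfg").exists())
```

This is roughly what `python -m venv .venv` does from a terminal, with `.venv` again being just a conventional folder name.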

ConfusionTableR package has a new function

The ConfusionTableR package has a new function. Welcome to var_impeR, which takes a trained caret model and produces a tibble and a supporting variable importance plot. How to use the new var_impeR function The following code shows how to use the new function: Training a caret model The following steps were used on the…

NHSDataDictionaRy package has arrived on CRAN

Thanks to the NHS-R community, I have had time to work on another package, following their pledge to fund more R packages. A big thanks to Mohammed Amin Mohammed and all the R community team. This package utilises all the excellent lookups provided by NHS Digital and the NHS Data Dictionary and…
