tidy.outliers version 0.2.0 released

tidymodels R

The tidy.outliers package now has five ways of scoring outliers in your data!

Bruno Testaguzza Carlin https://twosidesdata.netlify.app/
2022-03-09

What is tidy.outliers?

tidy.outliers is a pet project of mine. It started when I saw a coworker manually implement an outlier detection algorithm and wondered whether tidymodels had an easy way to do it.

I knew that scikit-learn had up to five established ways of detecting outliers, and even a great page on outlier removal!

This situation left me wondering why no one had written something similar for the incredible tidymodels ecosystem. So I decided to do it myself.

What does v0.2.0 add?

Univariate methods

I have added a way for users to pass a function to an outlier step, so you can now score outliers with your own custom univariate rules using step_outliers_univariate.
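To make the idea of a custom univariate rule concrete, here is a minimal sketch in base R. The function name score_univariate and the choice of a rescaled robust z-score are my own assumptions for illustration; the exact argument through which step_outliers_univariate accepts such a function is not shown here, so check the package documentation before wiring it into a recipe.

```r
# Hypothetical univariate rule (not part of tidy.outliers):
# score each value by its distance from the median, measured in MADs,
# then rescale so the most extreme value gets a score of 1.
score_univariate <- function(x) {
  center <- stats::median(x, na.rm = TRUE)
  spread <- stats::mad(x, na.rm = TRUE)

  # A constant column has no spread, so nothing can be called an outlier.
  if (is.na(spread) || spread == 0) {
    return(rep(0, length(x)))
  }

  z <- abs(x - center) / spread
  z / max(z, na.rm = TRUE)
}

# Example data: five ordinary values and one obvious outlier.
x <- c(10, 11, 9, 10, 12, 50)
scores <- score_univariate(x)
```

Any function that takes a numeric vector and returns one score per value would follow the same pattern; the median and MAD are used here only because they are less affected by the outliers being scored than the mean and standard deviation.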

h2o integration

The new step step_outliers_h2o.extendedIsolationForest scores outliers with h2o's extended isolation forest. You can read more about the underlying function on its h2o documentation page.

outForest method

The new step_outliers_outForest uses the main function from the outForest package. You can read more about it in that package's documentation.

Improvements to CI/CD

The package reached 100% test coverage, and it currently passes four out of five setups for a possible CRAN release. I even posted my first video covering it.

I also upgraded the package to v2 of the RStudio team's actions framework.

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/brunocarlin/carlin, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Carlin (2022, March 9). Bruno Testaguzza Carlin blog: tidy.outliers version 0.2.0 released. Retrieved from https://carlin-blog.netlify.app/posts/2022-03-09-tidyoutliers020/

BibTeX citation

@misc{carlin2022tidy.outliers,
  author = {Carlin, Bruno Testaguzza},
  title = {Bruno Testaguzza Carlin blog: tidy.outliers version 0.2.0 released},
  url = {https://carlin-blog.netlify.app/posts/2022-03-09-tidyoutliers020/},
  year = {2022}
}