It's hard to believe that it's been over six years since I joined the scikit-learn team as a maintainer. As of today, I have made 1,374 commits and reviewed 3,179 pull requests. Beyond these numbers, I am grateful for all the thoughtful discussions I have had with the community to push scikit-learn forward. Looking back through my commits, I would like to showcase some of my favorite features that I worked on:
1. Everything Trees 🌲🌲🌲
- Native categorical support in Histogram-based Gradient Boosting Trees. (gh-26411, gh-18394)
- Native missing value support in Random Forest & Trees. (gh-23595, gh-26391)
- Cost complexity pruning in Trees. (gh-12887)
 
2. DataFrame interoperability 🖼️
- Pandas and Polars DataFrame output with the set_output API. (gh-27315)
- get_feature_names_out: Mapping input feature names to output feature names. (gh-18444)
3. Preprocessing 🕰️
- TargetEncoder: Use the target to encode categorical data. (gh-25334)
- Group infrequent categories in OrdinalEncoder and OneHotEncoder. (gh-25677)
- KNN-based missing value imputation. (gh-12852)
 
4. Visualizations 📊
- HTML representation to visualize estimators in Jupyter notebooks. (gh-14180)
- Plotting API for evaluating or inspecting estimators. (gh-14357)
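
Outside a notebook, the HTML representation can still be produced as a string. The sketch below covers only that part, not the plotting API, and the pipeline is an arbitrary example.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.utils import estimator_html_repr

# An arbitrary two-step pipeline used only to have something to render.
pipe = make_pipeline(StandardScaler(), LogisticRegression())

# In Jupyter, evaluating `pipe` in a cell renders this diagram automatically.
html = estimator_html_repr(pipe)
print(len(html))
```

The resulting string can be written to an `.html` file to share the diagram outside a notebook.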
 
5. Experimental GPU support 🏎️
- Integrate Array API to run natively with PyTorch or CuPy arrays on a GPU. (gh-22554)
 
I hope you found some of these features useful or discovered some of them here 😁.